<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>CountText</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">CountText</h1><h2>Description: </h2><p>Counts various metrics on incoming text. The requested results will be recorded as attributes. The resulting flowfile will not have its content modified.</p><h3>Tags: </h3><p>count, text, line, word, character</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values.</p><table id="properties"><tr><th>Display Name</th><th>API Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Count Lines</strong></td><td>text-line-count</td><td id="default-value">true</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If enabled, will count the number of lines present in the incoming text.</td></tr><tr><td id="name"><strong>Count Non-Empty Lines</strong></td><td>text-line-nonempty-count</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If enabled, will count the number of lines that contain a non-whitespace character present in the incoming text.</td></tr><tr><td id="name"><strong>Count Words</strong></td><td>text-word-count</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If enabled, will count the number of words (alphanumeric character groups bounded by whitespace) present in the incoming text. Common logical delimiters [_-.] 
do not bound a word unless 'Split Words on Symbols' is true.</td></tr><tr><td id="name"><strong>Count Characters</strong></td><td>text-character-count</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If enabled, will count the number of characters (including whitespace and symbols, but not including newlines and carriage returns) present in the incoming text.</td></tr><tr><td id="name"><strong>Split Words on Symbols</strong></td><td>split-words-on-symbols</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If enabled, the word count will identify strings separated by common logical delimiters [ _ - . ] as independent words (e.g. split-words-on-symbols = 4 words).</td></tr><tr><td id="name"><strong>Character Encoding</strong></td><td>character-encoding</td><td id="default-value">UTF-8</td><td id="allowable-values"><ul><li>ISO-8859-1</li><li>UTF-8</li><li>UTF-16</li><li>UTF-16LE</li><li>UTF-16BE</li><li>US-ASCII</li></ul></td><td id="description">Specifies a character encoding to use.</td></tr><tr><td id="name"><strong>Call Immediate Adjustment</strong></td><td>ajust-immediately</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, the counter will be updated immediately, without regard to whether the ProcessSession is committed or rolled back; otherwise, the counter will be incremented only if and when the ProcessSession is committed.</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>The flowfile contains the original content with one or more attributes added containing the respective counts</td></tr><tr><td>failure</td><td>If the flowfile text cannot be counted for some reason, the original file will be routed to this destination and nothing will be routed 
elsewhere</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>text.line.count</td><td>The number of lines of text present in the FlowFile content</td></tr><tr><td>text.line.nonempty.count</td><td>The number of lines of text (with at least one non-whitespace character) present in the original FlowFile</td></tr><tr><td>text.word.count</td><td>The number of words present in the original FlowFile</td></tr><tr><td>text.character.count</td><td>The number of characters (given the specified character encoding) present in the original FlowFile</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.<h3>System Resource Considerations:</h3>None specified.<h3>See Also:</h3><p><a href="../org.apache.nifi.processors.standard.SplitText/index.html">SplitText</a></p></body></html>