<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><title>GetHDFSSequenceFile</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script></head><body><h1 id="nameHeader" style="display: none;">GetHDFSSequenceFile</h1><h2>Description: </h2><p>Fetch sequence files from Hadoop Distributed File System (HDFS) into FlowFiles</p><h3>Tags: </h3><p>hadoop, HCFS, HDFS, get, fetch, ingest, source, sequence file</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Display Name</th><th>API Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name">Hadoop Configuration Resources</td><td>Hadoop Configuration Resources</td><td></td><td id="allowable-values"></td><td id="description">A file, or a comma-separated list of files, containing the Hadoop file system configuration. Without this, Hadoop will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or will revert to a default configuration. To use swebhdfs, see the 'Additional Details' section of PutHDFS's documentation.<br/><br/><strong>This property expects a comma-separated list of file resources.</strong><br/><br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Kerberos Credentials Service</td><td>kerberos-credentials-service</td><td></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>KerberosCredentialsService<br/><strong>Implementation: </strong><a href="../../../nifi-kerberos-credentials-service-nar/1.19.1/org.apache.nifi.kerberos.KeytabCredentialsService/index.html">KeytabCredentialsService</a></td><td id="description">Specifies the Kerberos Credentials Controller Service that should be used for authenticating with Kerberos</td></tr><tr><td id="name">Kerberos User Service</td><td>kerberos-user-service</td><td></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>KerberosUserService<br/><strong>Implementations: </strong><a href="../../../nifi-kerberos-user-service-nar/1.19.1/org.apache.nifi.kerberos.KerberosPasswordUserService/index.html">KerberosPasswordUserService</a><br/><a href="../../../nifi-kerberos-user-service-nar/1.19.1/org.apache.nifi.kerberos.KerberosKeytabUserService/index.html">KerberosKeytabUserService</a><br/><a href="../../../nifi-kerberos-user-service-nar/1.19.1/org.apache.nifi.kerberos.KerberosTicketCacheUserService/index.html">KerberosTicketCacheUserService</a></td><td id="description">Specifies the Kerberos User Controller Service that should be used for authenticating with Kerberos</td></tr><tr><td id="name">Kerberos Principal</td><td>Kerberos Principal</td><td></td><td id="allowable-values"></td><td id="description">Kerberos principal to authenticate as.
Requires nifi.kerberos.krb5.file to be set in your nifi.properties.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Kerberos Keytab</td><td>Kerberos Keytab</td><td></td><td id="allowable-values"></td><td id="description">Kerberos keytab associated with the principal. Requires nifi.kerberos.krb5.file to be set in your nifi.properties.<br/><br/><strong>This property requires exactly one file to be provided.</strong><br/><br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Kerberos Password</td><td>Kerberos Password</td><td></td><td id="allowable-values"></td><td id="description">Kerberos password associated with the principal.<br/><strong>Sensitive Property: true</strong></td></tr><tr><td id="name">Kerberos Relogin Period</td><td>Kerberos Relogin Period</td><td id="default-value">4 hours</td><td id="allowable-values"></td><td id="description">Period of time that should pass before attempting a Kerberos relogin.
This property has been deprecated, and has no effect on processing. Relogins now occur automatically.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Additional Classpath Resources</td><td>Additional Classpath Resources</td><td></td><td id="allowable-values"></td><td id="description">A comma-separated list of paths to files and/or directories that will be added to the classpath and used for loading native libraries. When specifying a directory, all files within the directory will be added to the classpath, but further sub-directories will not be included.<br/><br/><strong>This property expects a comma-separated list of resources. Each of the resources may be of any of the following types: directory, file.</strong><br/></td></tr><tr><td id="name"><strong>Directory</strong></td><td>Directory</td><td></td><td id="allowable-values"></td><td id="description">The HDFS directory from which files should be read<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name"><strong>Recurse Subdirectories</strong></td><td>Recurse Subdirectories</td><td id="default-value">true</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Indicates whether to pull files from subdirectories of the HDFS directory</td></tr><tr><td id="name"><strong>Keep Source File</strong></td><td>Keep Source File</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Determines whether the source file is kept in HDFS after it has been successfully transferred. If true, the file is not deleted and will therefore be fetched repeatedly. This is intended for testing only.</td></tr><tr><td id="name">File Filter Regex</td><td>File Filter Regex</td><td></td><td id="allowable-values"></td><td id="description">A Java regular expression for filtering filenames; if a filter is supplied, only files whose names match that regular expression will be fetched; otherwise, all files will be fetched</td></tr><tr><td id="name"><strong>Filter Match Name Only</strong></td><td>Filter Match Name Only</td><td id="default-value">true</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, the File Filter Regex will match on just the filename; otherwise, subdirectory names will be included with the filename in the regex comparison (see the illustrative sketch following this table)</td></tr><tr><td id="name"><strong>Ignore Dotted Files</strong></td><td>Ignore Dotted Files</td><td id="default-value">true</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, files whose names begin with a dot (".") will be ignored</td></tr><tr><td id="name"><strong>Minimum File Age</strong></td><td>Minimum File Age</td><td id="default-value">0 sec</td><td id="allowable-values"></td><td id="description">The minimum age that a file must be in order to be pulled; any file younger than this amount of time (based on last modification date) will be ignored</td></tr><tr><td id="name">Maximum File Age</td><td>Maximum File Age</td><td></td><td id="allowable-values"></td><td id="description">The maximum age that a file may be in order to be pulled; any file older than this amount of time (based on last modification date) will be ignored</td></tr><tr><td id="name"><strong>Polling Interval</strong></td><td>Polling Interval</td><td id="default-value">0 sec</td><td id="allowable-values"></td><td id="description">Indicates how long to wait between performing directory listings</td></tr><tr><td id="name"><strong>Batch Size</strong></td><td>Batch Size</td><td id="default-value">100</td><td id="allowable-values"></td><td id="description">The maximum number of files to pull in each iteration, based on the run schedule.</td></tr><tr><td id="name">IO Buffer Size</td><td>IO Buffer Size</td><td></td><td id="allowable-values"></td><td id="description">Amount of memory to use to buffer file contents during IO. This overrides the Hadoop Configuration.</td></tr><tr><td id="name"><strong>Compression codec</strong></td><td>Compression codec</td><td id="default-value">NONE</td><td id="allowable-values"><ul><li>NONE <img src="../../../../../html/images/iconInfo.png" alt="No compression" title="No compression"></li><li>DEFAULT <img src="../../../../../html/images/iconInfo.png" alt="Default ZLIB compression" title="Default ZLIB compression"></li><li>BZIP <img src="../../../../../html/images/iconInfo.png" alt="BZIP compression" title="BZIP compression"></li><li>GZIP <img src="../../../../../html/images/iconInfo.png" alt="GZIP compression" title="GZIP compression"></li><li>LZ4 <img src="../../../../../html/images/iconInfo.png" alt="LZ4 compression" title="LZ4 compression"></li><li>LZO <img src="../../../../../html/images/iconInfo.png" alt="LZO compression - it assumes LD_LIBRARY_PATH has been set and jar is available" title="LZO compression - it assumes LD_LIBRARY_PATH has been set and jar is available"></li><li>SNAPPY <img src="../../../../../html/images/iconInfo.png" alt="Snappy compression" title="Snappy compression"></li><li>AUTOMATIC <img src="../../../../../html/images/iconInfo.png" alt="Will attempt to automatically detect the compression codec." title="Will attempt to automatically detect the compression codec."></li></ul></td><td id="description">No Description Provided.</td></tr><tr><td id="name"><strong>FlowFile Content</strong></td><td>FlowFile Content</td><td id="default-value">VALUE ONLY</td><td id="allowable-values"><ul><li>VALUE ONLY</li><li>KEY VALUE PAIR</li></ul></td><td id="description">Indicates whether the content is to be both the key and the value of the Sequence File, or just the value (see the illustrative sketch following this table).</td></tr></table>
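<p>To make the FlowFile Content options concrete: a Hadoop Sequence File is a series of key/value records. With VALUE ONLY, only each record's value contributes to the FlowFile content; with KEY VALUE PAIR, the key is included along with the value. The following Java sketch is illustrative only, not the processor's implementation; it assumes a readable Sequence File at the hypothetical path /tmp/data.seq whose keys and values are Writable types:</p>
<pre><code>
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SequenceFileContentSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path file = new Path("/tmp/data.seq"); // hypothetical Sequence File

        try (SequenceFile.Reader reader =
                     new SequenceFile.Reader(conf, SequenceFile.Reader.file(file))) {
            // Instantiate key/value holders of the types recorded in the file header
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);

            while (reader.next(key, value)) {
                // VALUE ONLY:     only 'value' contributes to the FlowFile content
                // KEY VALUE PAIR: 'key' is included along with 'value'
                System.out.println(key + "\t" + value);
            }
        }
    }
}
</code></pre>
<p>Similarly, a minimal sketch of how File Filter Regex and Filter Match Name Only interact, assuming a hypothetical filter of part-.*\.seq and a file found at abc/1/2/3/part-00000.seq beneath the configured Directory:</p>
<pre><code>
import java.util.regex.Pattern;

public class FileFilterSketch {
    public static void main(String[] args) {
        // Hypothetical File Filter Regex
        Pattern filter = Pattern.compile("part-.*\\.seq");

        // The same file, as it is compared under each Filter Match Name Only setting
        String nameOnly = "part-00000.seq";                      // Filter Match Name Only = true
        String withSubdirectories = "abc/1/2/3/part-00000.seq";  // Filter Match Name Only = false

        System.out.println(filter.matcher(nameOnly).matches());           // true
        System.out.println(filter.matcher(withSubdirectories).matches()); // false: subdirectory names take part in the comparison
    }
}
</code></pre>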
<h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>All files retrieved from HDFS are transferred to this relationship</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>filename</td><td>The name of the file that was read from HDFS.</td></tr><tr><td>path</td><td>The path is set to the relative path of the file's directory on HDFS. For example, if the Directory property is set to /tmp, then files picked up from /tmp will have the path attribute set to "./". If the Recurse Subdirectories property is set to true and a file is picked up from /tmp/abc/1/2/3, then the path attribute will be set to "abc/1/2/3" (see the illustrative sketch following this table).</td></tr></table>
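<p>A minimal sketch of the relative-path logic described for the path attribute above, assuming the Directory property is /tmp and a file is picked up from /tmp/abc/1/2/3. This mirrors the documented behavior using java.nio.file and is not the processor's implementation:</p>
<pre><code>
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathAttributeSketch {
    public static void main(String[] args) {
        Path rootDirectory = Paths.get("/tmp");          // the Directory property
        Path fileParent = Paths.get("/tmp/abc/1/2/3");   // directory containing the picked-up file

        // Relative path of the file's directory under the configured root
        String relative = rootDirectory.relativize(fileParent).toString();

        // Files located directly under /tmp get "./", per the description above
        String pathAttribute = relative.isEmpty() ? "./" : relative;
        System.out.println(pathAttribute); // prints: abc/1/2/3
    }
}
</code></pre>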
<h3>State management: </h3>This component does not store state.<h3>Restricted: </h3><table id="restrictions"><tr><th>Required Permission</th><th>Explanation</th></tr><tr><td>read distributed filesystem</td><td>Provides an operator the ability to retrieve any file that NiFi has access to in HDFS or the local filesystem.</td></tr><tr><td>write distributed filesystem</td><td>Provides an operator the ability to delete any file that NiFi has access to in HDFS or the local filesystem.</td></tr></table><h3>Input requirement: </h3>This component does not allow an incoming relationship.<h3>System Resource Considerations:</h3>None specified.<h3>See Also:</h3><p><a href="../org.apache.nifi.processors.hadoop.PutHDFS/index.html">PutHDFS</a></p></body></html>