<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>GetHDFSEvents</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">GetHDFSEvents</h1><h2>Description: </h2><p>This processor polls the notification events provided by the HdfsAdmin API. Because it uses the HdfsAdmin API, it must run as an HDFS super user. Currently there are six types of events (append, close, create, metadata, rename, and unlink). See the org.apache.hadoop.hdfs.inotify.Event documentation for a full explanation of each event. The processor polls for new events based on a defined duration. For each event received, a new flow file is created with the expected attributes, and the event itself is serialized to JSON and written to the flow file's content. For example, if the event type is APPEND, the flow file's content will be a JSON document describing the append event. On success, the flow files are sent to the 'success' relationship. Be careful where the generated flow files are stored: if they are written to one of the processor's watched directories, each write generates further events, producing a never-ending flow of events. Note also that this processor must consume all events and filter them itself, because the HdfsAdmin event notification API does not support filtering.</p><h3>Tags: </h3><p>hadoop, events, inotify, notifications, filesystem</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. 
The table also indicates any default values, and whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Display Name</th><th>API Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name">Hadoop Configuration Resources</td><td>Hadoop Configuration Resources</td><td></td><td id="allowable-values"></td><td id="description">A file or comma separated list of files which contains the Hadoop file system configuration. Without this, Hadoop will search the classpath for a 'core-site.xml' and 'hdfs-site.xml' file or will revert to a default configuration. To use swebhdfs, see 'Additional Details' section of PutHDFS's documentation.<br/><br/><strong>This property expects a comma-separated list of file resources.</strong><br/><br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Kerberos Credentials Service</td><td>kerberos-credentials-service</td><td></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>KerberosCredentialsService<br/><strong>Implementation: </strong><a href="../../../nifi-kerberos-credentials-service-nar/1.19.1/org.apache.nifi.kerberos.KeytabCredentialsService/index.html">KeytabCredentialsService</a></td><td id="description">Specifies the Kerberos Credentials Controller Service that should be used for authenticating with Kerberos</td></tr><tr><td id="name">Kerberos User Service</td><td>kerberos-user-service</td><td></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>KerberosUserService<br/><strong>Implementations: </strong><a href="../../../nifi-kerberos-user-service-nar/1.19.1/org.apache.nifi.kerberos.KerberosPasswordUserService/index.html">KerberosPasswordUserService</a><br/><a 
href="../../../nifi-kerberos-user-service-nar/1.19.1/org.apache.nifi.kerberos.KerberosKeytabUserService/index.html">KerberosKeytabUserService</a><br/><a href="../../../nifi-kerberos-user-service-nar/1.19.1/org.apache.nifi.kerberos.KerberosTicketCacheUserService/index.html">KerberosTicketCacheUserService</a></td><td id="description">Specifies the Kerberos User Controller Service that should be used for authenticating with Kerberos</td></tr><tr><td id="name">Kerberos Principal</td><td>Kerberos Principal</td><td></td><td id="allowable-values"></td><td id="description">Kerberos principal to authenticate as. Requires nifi.kerberos.krb5.file to be set in your nifi.properties.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Kerberos Keytab</td><td>Kerberos Keytab</td><td></td><td id="allowable-values"></td><td id="description">Kerberos keytab associated with the principal. Requires nifi.kerberos.krb5.file to be set in your nifi.properties.<br/><br/><strong>This property requires exactly one file to be provided.</strong><br/><br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Kerberos Password</td><td>Kerberos Password</td><td></td><td id="allowable-values"></td><td id="description">Kerberos password associated with the principal.<br/><strong>Sensitive Property: true</strong></td></tr><tr><td id="name">Kerberos Relogin Period</td><td>Kerberos Relogin Period</td><td id="default-value">4 hours</td><td id="allowable-values"></td><td id="description">Period of time that should pass before attempting a Kerberos relogin.
This property has been deprecated, and has no effect on processing. Relogins now occur automatically.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Additional Classpath Resources</td><td>Additional Classpath Resources</td><td></td><td id="allowable-values"></td><td id="description">A comma-separated list of paths to files and/or directories that will be added to the classpath and used for loading native libraries. When specifying a directory, all files within the directory will be added to the classpath, but further sub-directories will not be included.<br/><br/><strong>This property expects a comma-separated list of resources. Each of the resources may be of any of the following types: directory, file.</strong><br/></td></tr><tr><td id="name"><strong>Poll Duration</strong></td><td>Poll Duration</td><td id="default-value">1 second</td><td id="allowable-values"></td><td id="description">The time before the polling method returns with the next batch of events if they exist. It may exceed this amount of time by up to the time required for an RPC to the NameNode.</td></tr><tr><td id="name"><strong>HDFS Path to Watch</strong></td><td>HDFS Path to Watch</td><td></td><td id="allowable-values"></td><td id="description">The HDFS path to get event notifications for. This property accepts both Expression Language and regular expressions. This will be evaluated during the OnScheduled phase.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name"><strong>Ignore Hidden Files</strong></td><td>Ignore Hidden Files</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true and the final component of the path associated with a given event starts with a '.' 
then that event will not be processed.</td></tr><tr><td id="name"><strong>Event Types to Filter On</strong></td><td>Event Types to Filter On</td><td id="default-value">append, close, create, metadata, rename, unlink</td><td id="allowable-values"></td><td id="description">A comma-separated list of event types to process. Valid event types are: append, close, create, metadata, rename, and unlink. Case does not matter.</td></tr><tr><td id="name"><strong>IOException Retries During Event Polling</strong></td><td>IOException Retries During Event Polling</td><td id="default-value">3</td><td id="allowable-values"></td><td id="description">According to the HDFS admin API for event polling it is good to retry at least a few times. This number defines how many times the poll will be retried if it throws an IOException.</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>A flow file with updated information about a specific event will be sent to this relationship.</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>mime.type</td><td>This is always application/json.</td></tr><tr><td>hdfs.inotify.event.type</td><td>This will specify the specific HDFS notification event type. Currently there are six types of events (append, close, create, metadata, rename, and unlink).</td></tr><tr><td>hdfs.inotify.event.path</td><td>The specific path that the event is tied to.</td></tr></table><h3>State management: </h3><table id="stateful"><tr><th>Scope</th><th>Description</th></tr><tr><td>CLUSTER</td><td>The last used transaction id is stored. 
This is used </td></tr></table><h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component does not allow an incoming relationship.<h3>System Resource Considerations:</h3>None specified.<h3>See Also:</h3><p><a href="../org.apache.nifi.processors.hadoop.GetHDFS/index.html">GetHDFS</a>, <a href="../org.apache.nifi.processors.hadoop.FetchHDFS/index.html">FetchHDFS</a>, <a href="../org.apache.nifi.processors.hadoop.PutHDFS/index.html">PutHDFS</a>, <a href="../org.apache.nifi.processors.hadoop.ListHDFS/index.html">ListHDFS</a></p></body></html>