<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>GetHBase</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">GetHBase</h1><h2>Description: </h2><p>This Processor polls HBase for any records in the specified table. The processor keeps track of the timestamp of the cells that it receives, so that as new records are pushed to HBase, they will automatically be pulled. Each record is output in JSON format, as {"row": "&lt;row key&gt;", "cells": { "&lt;column 1 family&gt;:&lt;column 1 qualifier&gt;": "&lt;cell 1 value&gt;", "&lt;column 2 family&gt;:&lt;column 2 qualifier&gt;": "&lt;cell 2 value&gt;", ... }}. For each record received, a Provenance RECEIVE event is emitted with the format hbase://&lt;table name&gt;/&lt;row key&gt;, where &lt;row key&gt; is the UTF-8 encoded value of the row's key.</p><h3>Tags: </h3><p>hbase, get, ingest</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. 
The table also indicates any default values, and whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Display Name</th><th>API Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>HBase Client Service</strong></td><td>HBase Client Service</td><td></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>HBaseClientService<br/><strong>Implementations: </strong><a href="../../../nifi-hbase_2-client-service-nar/1.19.1/org.apache.nifi.hbase.HBase_2_ClientService/index.html">HBase_2_ClientService</a><br/><a href="../../../nifi-hbase_1_1_2-client-service-nar/1.19.1/org.apache.nifi.hbase.HBase_1_1_2_ClientService/index.html">HBase_1_1_2_ClientService</a></td><td id="description">Specifies the Controller Service to use for accessing HBase.</td></tr><tr><td id="name">Distributed Cache Service</td><td>Distributed Cache Service</td><td></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>DistributedMapCacheClient<br/><strong>Implementations: </strong><a href="../../../nifi-distributed-cache-services-nar/1.19.1/org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService/index.html">DistributedMapCacheClientService</a><br/><a href="../../../nifi-redis-nar/1.19.1/org.apache.nifi.redis.service.RedisDistributedMapCacheClientService/index.html">RedisDistributedMapCacheClientService</a><br/><a href="../../../nifi-hbase_1_1_2-client-service-nar/1.19.1/org.apache.nifi.hbase.HBase_1_1_2_ClientMapCacheService/index.html">HBase_1_1_2_ClientMapCacheService</a><br/><a href="../../../nifi-hazelcast-services-nar/1.19.1/org.apache.nifi.hazelcast.services.cacheclient.HazelcastMapCacheClient/index.html">HazelcastMapCacheClient</a><br/><a 
href="../../../nifi-hbase_2-client-service-nar/1.19.1/org.apache.nifi.hbase.HBase_2_ClientMapCacheService/index.html">HBase_2_ClientMapCacheService</a><br/><a href="../../../nifi-cassandra-services-nar/1.19.1/org.apache.nifi.controller.cassandra.CassandraDistributedMapCache/index.html">CassandraDistributedMapCache</a><br/><a href="../../../nifi-couchbase-nar/1.19.1/org.apache.nifi.couchbase.CouchbaseMapCacheClient/index.html">CouchbaseMapCacheClient</a></td><td id="description">Specifies the Controller Service that should be used to maintain state about what has been pulled from HBase so that if a new node begins pulling data, it won't duplicate all of the work that has been done.</td></tr><tr><td id="name"><strong>Table Name</strong></td><td>Table Name</td><td></td><td id="allowable-values"></td><td id="description">The name of the HBase Table to pull data from<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Columns</td><td>Columns</td><td></td><td id="allowable-values"></td><td id="description">A comma-separated list of "&lt;colFamily&gt;:&lt;colQualifier&gt;" pairs to return when scanning. To return all columns for a given family, omit the qualifier, for example "&lt;colFamily1&gt;,&lt;colFamily2&gt;".<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Authorizations</td><td>hbase-fetch-row-authorizations</td><td></td><td id="allowable-values"></td><td id="description">The list of authorizations to pass to the scanner. This will be ignored if cell visibility labels are not in use.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Filter Expression</td><td>Filter Expression</td><td></td><td id="allowable-values"></td><td id="description">An HBase filter expression that will be applied to the scan. 
This property cannot be used together with the Columns property.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name"><strong>Initial Time Range</strong></td><td>Initial Time Range</td><td id="default-value">None</td><td id="allowable-values"><ul><li>None</li><li>Current Time</li></ul></td><td id="description">The time range to use on the first scan of a table. None pulls the entire table on the first scan; Current Time pulls only entries from that point forward.</td></tr><tr><td id="name"><strong>Character Set</strong></td><td>Character Set</td><td id="default-value">UTF-8</td><td id="allowable-values"></td><td id="description">Specifies which character set is used to encode the data in HBase<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>All FlowFiles are routed to this relationship</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>hbase.table</td><td>The name of the HBase table that the data was pulled from</td></tr><tr><td>mime.type</td><td>Set to application/json to indicate that output is JSON</td></tr></table><h3>State management: </h3><table id="stateful"><tr><th>Scope</th><th>Description</th></tr><tr><td>CLUSTER</td><td>After performing a fetch from HBase, stores the timestamp of the last-modified cell that was found. In addition, it stores the ID of the row(s) and the value of each cell that has that timestamp as its modification date. 
This is stored across the cluster and allows the next fetch to avoid duplicating data, even if this Processor is run on Primary Node only and the Primary Node changes.</td></tr></table><h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component does not allow an incoming relationship.<h3>System Resource Considerations:</h3>None specified.</body></html>