<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>PutBigQueryBatch</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">PutBigQueryBatch</h1><h2>Deprecation notice: </h2><p>This processor is deprecated and may be removed in future releases.</p><p>Please consider using one of the following alternatives: <a href="../org.apache.nifi.processors.gcp.bigquery.PutBigQuery/index.html">PutBigQuery</a></p><h2>Description: </h2><p>Please be aware this processor is deprecated and may be removed in the near future. Use PutBigQuery instead. Batch loads FlowFile content into a Google BigQuery table.</p><h3>Tags: </h3><p>google, google cloud, bq, bigquery</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. 
The table also indicates any default values, and whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Display Name</th><th>API Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name">Project ID</td><td>gcp-project-id</td><td></td><td id="allowable-values"></td><td id="description">Google Cloud Project ID<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name"><strong>GCP Credentials Provider Service</strong></td><td>GCP Credentials Provider Service</td><td></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>GCPCredentialsService<br/><strong>Implementation: </strong><a href="../org.apache.nifi.processors.gcp.credentials.service.GCPCredentialsControllerService/index.html">GCPCredentialsControllerService</a></td><td id="description">The Controller Service used to obtain Google Cloud Platform credentials.</td></tr><tr><td id="name"><strong>Number of retries</strong></td><td>gcp-retry-count</td><td id="default-value">6</td><td id="allowable-values"></td><td id="description">How many retry attempts should be made before routing to the failure relationship.</td></tr><tr><td id="name">Proxy host</td><td>gcp-proxy-host</td><td></td><td id="allowable-values"></td><td id="description">IP or hostname of the proxy to be used.
You might need to set the following properties in bootstrap for https proxy usage:
-Djdk.http.auth.tunneling.disabledSchemes=
-Djdk.http.auth.proxying.disabledSchemes=<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Proxy port</td><td>gcp-proxy-port</td><td></td><td id="allowable-values"></td><td id="description">Proxy port number<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">HTTP Proxy Username</td><td>gcp-proxy-user-name</td><td></td><td id="allowable-values"></td><td id="description">HTTP Proxy Username<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">HTTP Proxy Password</td><td>gcp-proxy-user-password</td><td></td><td id="allowable-values"></td><td id="description">HTTP Proxy Password<br/><strong>Sensitive Property: true</strong><br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Proxy Configuration Service</td><td>proxy-configuration-service</td><td></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>ProxyConfigurationService<br/><strong>Implementation: </strong><a href="../../../nifi-proxy-configuration-nar/1.19.0/org.apache.nifi.proxy.StandardProxyConfigurationService/index.html">StandardProxyConfigurationService</a></td><td id="description">Specifies the Proxy Configuration Controller Service to proxy network requests. If set, it supersedes proxy settings configured per component. 
Supported proxies: HTTP + AuthN</td></tr><tr><td id="name"><strong>Dataset</strong></td><td>bq.dataset</td><td id="default-value">${bq.dataset}</td><td id="allowable-values"></td><td id="description">BigQuery dataset name (Note - The dataset must exist in GCP)<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>Table Name</strong></td><td>bq.table.name</td><td id="default-value">${bq.table.name}</td><td id="allowable-values"></td><td id="description">BigQuery table name<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>Ignore Unknown Values</strong></td><td>bq.load.ignore_unknown</td><td id="default-value">false</td><td id="allowable-values"></td><td id="description">Sets whether BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. 
By default unknown values are not allowed.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name">Table Schema</td><td>bq.table.schema</td><td></td><td id="allowable-values"></td><td id="description">BigQuery schema in JSON format<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>Read Timeout</strong></td><td>bq.readtimeout</td><td id="default-value">5 minutes</td><td id="allowable-values"></td><td id="description">Load Job Time Out<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>Load file type</strong></td><td>bq.load.type</td><td></td><td id="allowable-values"></td><td id="description">Data type of the file to be loaded. Possible values: AVRO, NEWLINE_DELIMITED_JSON, CSV.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>Create Disposition</strong></td><td>bq.load.create_disposition</td><td id="default-value">CREATE_IF_NEEDED</td><td id="allowable-values"><ul><li>CREATE_IF_NEEDED <img src="../../../../../html/images/iconInfo.png" alt="Configures the job to create the table if it does not exist." title="Configures the job to create the table if it does not exist."></img></li><li>CREATE_NEVER <img src="../../../../../html/images/iconInfo.png" alt="Configures the job to fail with a not-found error if the table does not exist." 
title="Configures the job to fail with a not-found error if the table does not exist."></img></li></ul></td><td id="description">Sets whether the job is allowed to create new tables</td></tr><tr><td id="name"><strong>Write Disposition</strong></td><td>bq.load.write_disposition</td><td id="default-value">WRITE_EMPTY</td><td id="allowable-values"><ul><li>WRITE_EMPTY <img src="../../../../../html/images/iconInfo.png" alt="Configures the job to fail with a duplicate error if the table already exists." title="Configures the job to fail with a duplicate error if the table already exists."></img></li><li>WRITE_APPEND <img src="../../../../../html/images/iconInfo.png" alt="Configures the job to append data to the table if it already exists." title="Configures the job to append data to the table if it already exists."></img></li><li>WRITE_TRUNCATE <img src="../../../../../html/images/iconInfo.png" alt="Configures the job to overwrite the table data if table already exists." title="Configures the job to overwrite the table data if table already exists."></img></li></ul></td><td id="description">Sets the action that should occur if the destination table already exists.</td></tr><tr><td id="name"><strong>Max Bad Records</strong></td><td>bq.load.max_badrecords</td><td id="default-value">0</td><td id="allowable-values"></td><td id="description">Sets the maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. By default no bad record is ignored.</td></tr><tr><td id="name"><strong>CSV Input - Allow Jagged Rows</strong></td><td>bq.csv.allow.jagged.rows</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Set whether BigQuery should accept rows that are missing trailing optional columns. If true, BigQuery treats missing trailing columns as null values. 
If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. By default, rows with missing trailing columns are considered bad records.</td></tr><tr><td id="name"><strong>CSV Input - Allow Quoted New Lines</strong></td><td>bq.csv.allow.quoted.new.lines</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Sets whether BigQuery should allow quoted data sections that contain newline characters in a CSV file. By default quoted newlines are not allowed.</td></tr><tr><td id="name"><strong>CSV Input - Character Set</strong></td><td>bq.csv.charset</td><td id="default-value">UTF-8</td><td id="allowable-values"><ul><li>UTF-8</li><li>ISO-8859-1</li></ul></td><td id="description">Sets the character encoding of the data.</td></tr><tr><td id="name"><strong>CSV Input - Field Delimiter</strong></td><td>bq.csv.delimiter</td><td id="default-value">,</td><td id="allowable-values"></td><td id="description">Sets the separator for fields in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. BigQuery also supports the escape sequence "\t" to specify a tab separator. The default value is a comma (',').<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>CSV Input - Quote</strong></td><td>bq.csv.quote</td><td id="default-value">"</td><td id="allowable-values"></td><td id="description">Sets the value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double-quote ('"'). 
If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the Allow Quoted New Lines property to true.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>CSV Input - Skip Leading Rows</strong></td><td>bq.csv.skip.leading.rows</td><td id="default-value">0</td><td id="allowable-values"></td><td id="description">Sets the number of rows at the top of a CSV file that BigQuery will skip when reading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>Avro Input - Use Logical Types</strong></td><td>bq.avro.use.logical.types</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If format is set to Avro and if this option is set to true, you can interpret logical types into their corresponding types (such as TIMESTAMP) instead of only using their raw types (such as INTEGER).</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>FlowFiles are routed to this relationship after a successful Google BigQuery operation.</td></tr><tr><td>failure</td><td>FlowFiles are routed to this relationship if the Google BigQuery operation fails.</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>bq.job.stat.creation_time</td><td>Time the load job was created</td></tr><tr><td>bq.job.stat.end_time</td><td>Time the load job ended</td></tr><tr><td>bq.job.stat.start_time</td><td>Time the load job 
started</td></tr><tr><td>bq.job.link</td><td>API Link to load job</td></tr><tr><td>bq.job.id</td><td>ID of the BigQuery job</td></tr><tr><td>bq.error.message</td><td>Load job error message</td></tr><tr><td>bq.error.reason</td><td>Load job error reason</td></tr><tr><td>bq.error.location</td><td>Load job error location</td></tr><tr><td>bq.records.count</td><td>Number of records successfully inserted</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.<h3>System Resource Considerations:</h3>None specified.<h3>See Also:</h3><p><a href="../org.apache.nifi.processors.gcp.storage.PutGCSObject/index.html">PutGCSObject</a>, <a href="../org.apache.nifi.processors.gcp.storage.DeleteGCSObject/index.html">DeleteGCSObject</a></p></body></html>