| //// |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| http://www.apache.org/licenses/LICENSE-2.0 |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| //// |
| :documentationPath: /pipeline/transforms/ |
| :language: en_US |
| :description: The Metadata Injection transform inserts data from various sources into a pipeline at runtime. |
| |
| :openvar: ${ |
| :closevar: } |
| |
| = image:transforms/icons/GenericTransform.svg[Metadata Injection transform Icon, role="image-doc-icon"] Metadata Injection |
| |
| [%noheader,cols="3a,1a", role="table-no-borders" ] |
| |=== |
| | |
| == Description |
| |
| The Metadata Injection transform inserts data from various sources into a pipeline at runtime. |
| |
| | |
| == Supported Engines |
| [%noheader,cols="2,1a",frame=none, role="table-supported-engines"] |
| !=== |
| !Hop Engine! image:check_mark.svg[Supported, 24] |
| !Spark! image:cross.svg[Not Supported, 24] |
| !Flink! image:cross.svg[Not Supported, 24] |
| !Dataflow! image:cross.svg[Not Supported, 24] |
| !=== |
| |=== |
| |
| == Usage |
| |
| The Metadata Injection transform inserts data from various sources into a pipeline at runtime. |
| |
| This is typically used to automate repetitive ETL tasks. |
| |
| For example, you might have a simple pipeline to load transaction data values from a supplier, filter specific values, and output them to a file. |
| If you have more than one supplier, you would need to run this simple pipeline for each supplier. |
| Yet, with metadata injection, you can expand this simple repetitive pipeline by inserting metadata from another pipeline that contains the ETL Metadata Injection transform. |
| This transform coordinates the data values from the various inputs through the metadata you define. |
| This process reduces the need for you to adjust and run the repetitive pipeline for each specific input. |
| |
| The repetitive pipeline is known as the template pipeline. |
| The template pipeline is called by the ETL Metadata Injection transform. |
| You will create a pipeline to prepare what common values you want to use as metadata and inject these specific values through the ETL Metadata Injection transform. |
| |
| We recommend the following basic procedure for using this transform to inject metadata: |
| |
| 1. Optimize your data for injection, such as preparing folder structures and inputs. |
| |
| 2. Develop pipelines for the repetitive process (the template pipeline), for metadata injection through the ETL Metadata Injection transform, and for handling multiple inputs. |
| |
| |
| The metadata is injected into the template pipeline through any transform that supports metadata injection. |
| |
| == Options |
| |
| === General |
| |
| [options="header"] |
| |=== |
| |Option|Description |
| |Transform name|Name of the transform. |
| |Pipeline|Specify your template pipeline by entering in its path. |
| Click Browse to display and enter the path details using the Virtual File System Browser. |
| |
| If you select a pipeline that has the same root path as the current pipeline, the variable {openvar}Internal.Transform.Current.Directory{closevar} will automatically be inserted in place of the common root path. |
| For example, if the current pipeline's path is /home/admin/pipeline.hpl and you select a pipeline in the folder /home/admin/path/sub.hpl then the path will automatically be converted to {openvar}Internal.Transform.Current.Directory{closevar}/path/sub.hpl. |
| |=== |
| |
| The ETL Metadata Injection transform features the two tabs with fields. |
| Each tab is described below. |
| |
| === Inject Metadata Tab |
| |
| [options="header"] |
| |=== |
| |Option|Description |
| |Target injection transform key| Lists the available fields in each transform of the template pipeline that can be injected with metadata. |
| |Target description|Describes how the target fields relate to their target transforms. |
| |Source transform|Lists the transform associated with the fields to be injected into the target fields as metadata. |
| |Source field|Lists the fields to be injected into the target fields as metadata. |
| |=== |
| |
| To specify the source field as metadata to be injected, perform the following transforms: |
| |
| 1. In the Target injection transform key column, double-click the field for which you want to specify a source field. |
| The Source field dialog box opens. |
| |
| 2. Select a source field and click OK. |
| |
| 3. Optionally, select Use constant value to specify a constant value for the injected metadata through one of the following actions: |
| - Manually entering a value. |
| - Using an internal variable to set the value ({openvar}Internal.transform.Unique.Count{closevar} for example). |
| - Using a combination of manually specified values and parameter values ({openvar}FILE_PREFIX{closevar}_{openvar}FILE_DATE{closevar}.txt for example). |
| |
| When specifying constant values for grouped lists of values like fields or filenames please note that there isn't a good solution for that. Best practice is to use a xref:pipeline/transforms/datagrid.adoc[Data Grid] transform to inject a complete set of constant values. You can map those in this metadata injection transform. It will do its best to accommodate you by allowing you to inject a single row in the group with the specified constant value. |
| |
| ==== Injecting Metadata into the ETL Metadata Injection transform |
| |
| For injecting metadata into the ETL Metadata Injection transform itself, the following exceptions apply: |
| |
| |
| - To inject a method for how to specify a field (such as by FILENAME), set a PIPELINE_SPECIFICATION_METHOD constant to the field of an input transform. |
| You can then map the field as a source to the PIPELINE_SPECIFICATION_METHOD constant in the ETL Metadata Injection transform. |
| |
| - The target field for the ETL Metadata Injection transform inserting the metadata into the original injection is defined by [GROUP NAME].[FIELD NAME]. |
| For example, if the GROUP NAME is 'OUTPUT_FIELDS' and the FIELD NAME is 'OUTPUT_FIELDNAME', you would set the target field to 'OUTPUT_FIELDS.OUTPUT_FIELDNAME'. |
| |
| === Options Tab |
| |
| [options="header"] |
| |=== |
| |Option|Description |
| |transform to read from (optional)|Optionally, select a transform in your template pipeline to pass data directly to a transform following the ETL Metadata Injection transform in your current pipeline. |
| |Field name|If transform to read from is selected, enter the name of the field passed directly from the transform in the template pipeline. |
| |Type|If transform to read from is selected, select the type of the field passed directly from the transform in the template pipeline. |
| |Length|If transform to read from is selected, enter the length of the field passed directly from the transform in the template pipeline. |
| |Precision|If transform to read from is selected, enter the precision of the field passed directly from the transform in the template pipeline. |
| |Optional target file (hpl after injection)|For initial pipeline development or debugging, specify an optional file for creating and saving a pipeline of your template after metadata injection occurs. |
| The resulting pipeline will be your template pipeline with the metadata already injected as constant values. |
| |Streaming source transform|Select a source transform in your current pipeline to directly pass data to the Streaming target transform in the template pipeline. |
| |Streaming target transform|Select the target transform in your template pipeline to receive data directly from the Streaming source transform. |
| |Run resulting pipeline|Select to inject metadata and run the template pipeline. |
| If this option is not selected, metadata injection occurs, but the template pipeline does not run. |
| |=== |
| |