
 ////
 Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership.  The ASF licenses this file
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License.  You may obtain a copy of the License at
   http://www.apache.org/licenses/LICENSE-2.0
 Unless required by applicable law or agreed to in writing,
 software distributed under the License is distributed on an
 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 ////
 :documentationPath: /plugins/transforms/
 :language: en_US
 :page-alternativeEditUrl: https://github.com/apache/incubator-hop/edit/master/plugins/transforms/mergerows/src/main/doc/mergerows.adoc
 = Merge rows (diff)

 == Description

 The Merge rows (diff) transform compares two streams of rows and merges them into a single output stream. This transform is useful for comparing data collected at two different times. For example, the source system of your data warehouse might not contain a timestamp of the last data update. You could use this transform to compare the two data streams and merge the dates and timestamps in the rows.

 Based on keys for comparison, this transform merges reference rows (previous data) with compare rows (new data) and creates merged output rows. A flag in the row indicates how the values were compared and merged. Flag values include:

 * **identical**: The key was found in both rows, and the compared values are identical.

 * **changed**: The key was found in both rows, but one or more compared values are different.

 * **new**: The key was not found in the reference rows.

 * **deleted**: The key was not found in the compare rows.

 If the row's flag is **identical** or **deleted**, the merged output rows are based on the reference rows.

 For **new** or **changed** rows, the merged output rows are based on the compare rows.
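
The flag rules above can be sketched in a few lines of Python. This is only an illustration of the documented semantics, not how the transform is implemented: the function name `merge_rows_diff`, the default flag field name `flagfield`, and the sample rows are all assumptions made for the example, and rows are matched through an in-memory dictionary for simplicity.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


def merge_rows_diff(
    reference: Sequence[Row],
    compare: Sequence[Row],
    keys: Sequence[str],
    values: Sequence[str],
    flag_field: str = "flagfield",  # hypothetical default name
) -> List[Row]:
    """Flag each key as identical, changed, new, or deleted."""

    def key_of(row: Row) -> tuple:
        return tuple(row[k] for k in keys)

    ref_by_key = {key_of(r): r for r in reference}
    cmp_by_key = {key_of(r): r for r in compare}

    merged: List[Row] = []
    # Walk the union of keys in sorted order so the output is deterministic.
    for key in sorted(ref_by_key.keys() | cmp_by_key.keys()):
        ref = ref_by_key.get(key)
        new = cmp_by_key.get(key)
        if ref is None:
            flag, base = "new", new          # key missing from reference rows
        elif new is None:
            flag, base = "deleted", ref      # key missing from compare rows
        elif all(ref[v] == new[v] for v in values):
            flag, base = "identical", ref    # output based on the reference row
        else:
            flag, base = "changed", new      # output based on the compare row
        merged.append({**base, flag_field: flag})
    return merged


# Sample data (made up for this example).
reference_rows = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Cy"},
]
compare_rows = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bobby"},
    {"id": 4, "name": "Dee"},
]

merged = merge_rows_diff(reference_rows, compare_rows, keys=["id"], values=["name"])
for row in merged:
    print(row)
```

In this sample, key 2 is flagged `changed` and carries the compare row's name, while key 3 is flagged `deleted` and carries the reference row's name, matching the rules stated above.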

 You can also send values from the merged and flagged rows to a subsequent transform in your pipeline, such as the Switch-Case transform or the Synchronize after merge transform. In the subsequent transform, you can use the flag field generated by **Merge rows (diff)** to control updates/inserts/deletes on a target table.
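
As a rough sketch of what such a downstream transform does with the flag, the following Python snippet applies flagged rows to a target held in a plain dictionary. The function name `apply_flagged_rows`, the flag field name `flagfield`, and the sample data are assumptions for illustration only; a real pipeline would issue inserts, updates, and deletes against a database table instead.

```python
from typing import Dict, Iterable

Row = Dict[str, object]


def apply_flagged_rows(
    target: Dict[object, Row],
    rows: Iterable[Row],
    key: str,
    flag_field: str = "flagfield",  # hypothetical flag field name
) -> Dict[str, int]:
    """Apply flagged rows to an in-memory 'table' keyed by a single field."""
    counts = {"new": 0, "changed": 0, "deleted": 0, "identical": 0}
    for row in rows:
        flag = row[flag_field]
        # Drop the flag before writing the row to the target.
        data = {k: v for k, v in row.items() if k != flag_field}
        if flag in ("new", "changed"):
            target[row[key]] = data        # insert or update
        elif flag == "deleted":
            target.pop(row[key], None)     # delete if present
        elif flag != "identical":
            raise ValueError(f"unexpected flag value: {flag!r}")
        counts[flag] += 1                  # identical rows are counted, not written
    return counts


# Sample target and flagged rows (made up for this example).
target_table = {
    1: {"id": 1, "name": "Ann"},
    2: {"id": 2, "name": "Bob"},
    3: {"id": 3, "name": "Cy"},
}
flagged_rows = [
    {"id": 1, "name": "Ann", "flagfield": "identical"},
    {"id": 2, "name": "Bobby", "flagfield": "changed"},
    {"id": 3, "name": "Cy", "flagfield": "deleted"},
    {"id": 4, "name": "Dee", "flagfield": "new"},
]

counts = apply_flagged_rows(target_table, flagged_rows, key="id")
print(counts)
print(target_table)
```

Rows flagged `identical` need no write at all, which is why the flag is useful for limiting work on the target table.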

 == Options

 [width="90%", options="header"]
 |===
 |Option|Description
 |Transform name|Name of the transform.
 |Reference rows origin|Specify the transform that supplies the reference rows: the stream with the original rows, that is, the rows you want to compare the new rows to.
 |Compare rows origin|Specify the transform that supplies the compare rows: the stream with the new rows.
 |Flag fieldname|Specify the name of the flag field on the output stream.
 |Keys to match|Specify the fields containing the keys on which to match; click Get key fields to insert all of the fields originating from the reference rows transform.
 |Values to compare|Specify the fields containing the values to compare; click Get value fields to insert all of the fields from the originating value rows transform. Key fields do not need to be specified here.
 |===

 == Metadata Injection Support

 All fields of this transform support metadata injection. You can use this transform with ETL Metadata Injection to pass metadata to your pipeline at runtime.