<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>MergeContent</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">MergeContent</h1><h2>Description: </h2><p>Merges a Group of FlowFiles together based on a user-defined strategy and packages them into a single FlowFile. It is recommended that the Processor be configured with only a single incoming connection, as a Group of FlowFiles will not be created from FlowFiles in different connections. This processor updates the mime.type attribute as appropriate.</p><p><a href="additionalDetails.html">Additional Details...</a></p><h3>Tags: </h3><p>merge, content, correlation, tar, zip, stream, concatenation, archive, flowfile-stream, flowfile-stream-v3</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Display Name</th><th>API Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Merge Strategy</strong></td><td>Merge Strategy</td><td id="default-value">Bin-Packing Algorithm</td><td id="allowable-values"><ul><li>Bin-Packing Algorithm <img src="../../../../../html/images/iconInfo.png" alt="Generates 'bins' of FlowFiles and fills each bin as full as possible. FlowFiles are placed into a bin based on their size and optionally their attributes (if the &lt;Correlation Attribute&gt; property is set)" title="Generates 'bins' of FlowFiles and fills each bin as full as possible. 
FlowFiles are placed into a bin based on their size and optionally their attributes (if the &lt;Correlation Attribute&gt; property is set)"></img></li><li>Defragment <img src="../../../../../html/images/iconInfo.png" alt="Combines fragments that are associated by attributes back into a single cohesive FlowFile. If using this strategy, all FlowFiles must have the attributes &lt;fragment.identifier&gt;, &lt;fragment.count&gt;, and &lt;fragment.index&gt; or alternatively (for backward compatibility purposes) &lt;segment.identifier&gt;, &lt;segment.count&gt;, and &lt;segment.index&gt;. All FlowFiles with the same value for &quot;fragment.identifier&quot; will be grouped together. All FlowFiles in this group must have the same value for the &quot;fragment.count&quot; attribute. All FlowFiles in this group must have a unique value for the &quot;fragment.index&quot; attribute between 0 and the value of the &quot;fragment.count&quot; attribute." title="Combines fragments that are associated by attributes back into a single cohesive FlowFile. If using this strategy, all FlowFiles must have the attributes &lt;fragment.identifier&gt;, &lt;fragment.count&gt;, and &lt;fragment.index&gt; or alternatively (for backward compatibility purposes) &lt;segment.identifier&gt;, &lt;segment.count&gt;, and &lt;segment.index&gt;. All FlowFiles with the same value for &quot;fragment.identifier&quot; will be grouped together. All FlowFiles in this group must have the same value for the &quot;fragment.count&quot; attribute. All FlowFiles in this group must have a unique value for the &quot;fragment.index&quot; attribute between 0 and the value of the &quot;fragment.count&quot; attribute."></img></li></ul></td><td id="description">Specifies the algorithm used to merge content. The 'Defragment' algorithm combines fragments that are associated by attributes back into a single cohesive FlowFile. 
The 'Bin-Packing Algorithm' generates a FlowFile populated by arbitrarily chosen FlowFiles</td></tr><tr><td id="name"><strong>Merge Format</strong></td><td>Merge Format</td><td id="default-value">Binary Concatenation</td><td id="allowable-values"><ul><li>TAR <img src="../../../../../html/images/iconInfo.png" alt="A bin of FlowFiles will be combined into a single TAR file. The FlowFiles' &lt;path&gt; attribute will be used to create a directory in the TAR file if the &lt;Keep Paths&gt; property is set to true; otherwise, all FlowFiles will be added at the root of the TAR file. If a FlowFile has an attribute named &lt;tar.permissions&gt; that is 3 characters, each between 0-7, that attribute will be used as the TAR entry's 'mode'." title="A bin of FlowFiles will be combined into a single TAR file. The FlowFiles' &lt;path&gt; attribute will be used to create a directory in the TAR file if the &lt;Keep Paths&gt; property is set to true; otherwise, all FlowFiles will be added at the root of the TAR file. If a FlowFile has an attribute named &lt;tar.permissions&gt; that is 3 characters, each between 0-7, that attribute will be used as the TAR entry's 'mode'."></img></li><li>ZIP <img src="../../../../../html/images/iconInfo.png" alt="A bin of FlowFiles will be combined into a single ZIP file. The FlowFiles' &lt;path&gt; attribute will be used to create a directory in the ZIP file if the &lt;Keep Paths&gt; property is set to true; otherwise, all FlowFiles will be added at the root of the ZIP file. The &lt;Compression Level&gt; property indicates the ZIP compression to use." title="A bin of FlowFiles will be combined into a single ZIP file. The FlowFiles' &lt;path&gt; attribute will be used to create a directory in the ZIP file if the &lt;Keep Paths&gt; property is set to true; otherwise, all FlowFiles will be added at the root of the ZIP file. 
The &lt;Compression Level&gt; property indicates the ZIP compression to use."></img></li><li>FlowFile Stream, v3 <img src="../../../../../html/images/iconInfo.png" alt="A bin of FlowFiles will be combined into a single Version 3 FlowFile Stream" title="A bin of FlowFiles will be combined into a single Version 3 FlowFile Stream"></img></li><li>FlowFile Stream, v2 <img src="../../../../../html/images/iconInfo.png" alt="A bin of FlowFiles will be combined into a single Version 2 FlowFile Stream" title="A bin of FlowFiles will be combined into a single Version 2 FlowFile Stream"></img></li><li>FlowFile Tar, v1 <img src="../../../../../html/images/iconInfo.png" alt="A bin of FlowFiles will be combined into a single Version 1 FlowFile Package" title="A bin of FlowFiles will be combined into a single Version 1 FlowFile Package"></img></li><li>Binary Concatenation <img src="../../../../../html/images/iconInfo.png" alt="The contents of all FlowFiles will be concatenated together into a single FlowFile" title="The contents of all FlowFiles will be concatenated together into a single FlowFile"></img></li><li>Avro <img src="../../../../../html/images/iconInfo.png" alt="The Avro contents of all FlowFiles will be concatenated together into a single FlowFile" title="The Avro contents of all FlowFiles will be concatenated together into a single FlowFile"></img></li></ul></td><td id="description">Determines the format that will be used to merge the content.</td></tr><tr><td id="name"><strong>Attribute Strategy</strong></td><td>Attribute Strategy</td><td id="default-value">Keep Only Common Attributes</td><td id="allowable-values"><ul><li>Keep Only Common Attributes <img src="../../../../../html/images/iconInfo.png" alt="Any attribute that is not the same on all FlowFiles in a bin will be dropped. Those that are the same across all FlowFiles will be retained." title="Any attribute that is not the same on all FlowFiles in a bin will be dropped. 
Those that are the same across all FlowFiles will be retained."></img></li><li>Keep All Unique Attributes <img src="../../../../../html/images/iconInfo.png" alt="Any attribute that has the same value for all FlowFiles in a bin, or has no value for a FlowFile, will be kept. For example, if a bin consists of 3 FlowFiles and 2 of them have a value of 'hello' for the 'greeting' attribute and the third FlowFile has no 'greeting' attribute then the outbound FlowFile will get a 'greeting' attribute with the value 'hello'." title="Any attribute that has the same value for all FlowFiles in a bin, or has no value for a FlowFile, will be kept. For example, if a bin consists of 3 FlowFiles and 2 of them have a value of 'hello' for the 'greeting' attribute and the third FlowFile has no 'greeting' attribute then the outbound FlowFile will get a 'greeting' attribute with the value 'hello'."></img></li></ul></td><td id="description">Determines which FlowFile attributes should be added to the bundle. If 'Keep All Unique Attributes' is selected, any attribute on any FlowFile that gets bundled will be kept unless its value conflicts with the value from another FlowFile. If 'Keep Only Common Attributes' is selected, only the attributes that exist on all FlowFiles in the bundle, with the same value, will be preserved.</td></tr><tr><td id="name">Correlation Attribute Name</td><td>Correlation Attribute Name</td><td></td><td id="allowable-values"></td><td id="description">If specified, like FlowFiles will be binned together, where 'like FlowFiles' means FlowFiles that have the same value for this Attribute. 
If not specified, FlowFiles are bundled by the order in which they are pulled from the queue.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong><br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Strategy] Property has a value of "Bin-Packing Algorithm".</strong></td></tr><tr><td id="name"><strong>Metadata Strategy</strong></td><td>mergecontent-metadata-strategy</td><td id="default-value">Do Not Merge Uncommon Metadata</td><td id="allowable-values"><ul><li>Use First Metadata <img src="../../../../../html/images/iconInfo.png" alt="For any input format that supports metadata (Avro, e.g.), the metadata for the first FlowFile in the bin will be set on the output FlowFile." title="For any input format that supports metadata (Avro, e.g.), the metadata for the first FlowFile in the bin will be set on the output FlowFile."></img></li><li>Keep Only Common Metadata <img src="../../../../../html/images/iconInfo.png" alt="For any input format that supports metadata (Avro, e.g.), any FlowFile whose metadata values match those of the first FlowFile, any additional metadata will be dropped but the FlowFile will be merged. Any FlowFile whose metadata values do not match those of the first FlowFile in the bin will not be merged." title="For any input format that supports metadata (Avro, e.g.), any FlowFile whose metadata values match those of the first FlowFile, any additional metadata will be dropped but the FlowFile will be merged. Any FlowFile whose metadata values do not match those of the first FlowFile in the bin will not be merged."></img></li><li>Do Not Merge Uncommon Metadata <img src="../../../../../html/images/iconInfo.png" alt="For any input format that supports metadata (Avro, e.g.), any FlowFile whose metadata values do not match those of the first FlowFile in the bin will not be merged." 
title="For any input format that supports metadata (Avro, e.g.), any FlowFile whose metadata values do not match those of the first FlowFile in the bin will not be merged."></img></li><li>Ignore Metadata <img src="../../../../../html/images/iconInfo.png" alt="Ignores (does not transfer, compare, etc.) any metadata from a FlowFile whose content supports embedded metadata." title="Ignores (does not transfer, compare, etc.) any metadata from a FlowFile whose content supports embedded metadata."></img></li></ul></td><td id="description">For FlowFiles whose input format supports metadata (Avro, e.g.), this property determines which metadata should be added to the bundle. If 'Use First Metadata' is selected, the metadata keys/values from the first FlowFile to be bundled will be used. If 'Keep Only Common Metadata' is selected, only the metadata that exists on all FlowFiles in the bundle, with the same value, will be preserved. If 'Ignore Metadata' is selected, no metadata is transferred to the outgoing bundled FlowFile. 
If 'Do Not Merge Uncommon Metadata' is selected, any FlowFile whose metadata values do not match those of the first bundled FlowFile will not be merged.<br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Format] Property has a value of "Avro".</strong></td></tr><tr><td id="name"><strong>Minimum Number of Entries</strong></td><td>Minimum Number of Entries</td><td id="default-value">1</td><td id="allowable-values"></td><td id="description">The minimum number of files to include in a bundle<br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Strategy] Property has a value of "Bin-Packing Algorithm".</strong></td></tr><tr><td id="name"><strong>Maximum Number of Entries</strong></td><td>Maximum Number of Entries</td><td id="default-value">1000</td><td id="allowable-values"></td><td id="description">The maximum number of files to include in a bundle<br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Strategy] Property has a value of "Bin-Packing Algorithm".</strong></td></tr><tr><td id="name"><strong>Minimum Group Size</strong></td><td>Minimum Group Size</td><td id="default-value">0 B</td><td id="allowable-values"></td><td id="description">The minimum size for the bundle<br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Strategy] Property has a value of "Bin-Packing Algorithm".</strong></td></tr><tr><td id="name">Maximum Group Size</td><td>Maximum Group Size</td><td></td><td id="allowable-values"></td><td id="description">The maximum size for the bundle. If not specified, there is no maximum.<br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Strategy] Property has a value of "Bin-Packing Algorithm".</strong></td></tr><tr><td id="name">Max Bin Age</td><td>Max Bin Age</td><td></td><td id="allowable-values"></td><td id="description">The maximum age of a Bin that will trigger a Bin to be complete. 
Expected format is &lt;duration&gt; &lt;time unit&gt; where &lt;duration&gt; is a positive integer and time unit is one of seconds, minutes, hours</td></tr><tr><td id="name"><strong>Maximum number of Bins</strong></td><td>Maximum number of Bins</td><td id="default-value">5</td><td id="allowable-values"></td><td id="description">Specifies the maximum number of bins that can be held in memory at any one time</td></tr><tr><td id="name"><strong>Delimiter Strategy</strong></td><td>Delimiter Strategy</td><td id="default-value">Do Not Use Delimiters</td><td id="allowable-values"><ul><li>Do Not Use Delimiters <img src="../../../../../html/images/iconInfo.png" alt="No Header, Footer, or Demarcator will be used" title="No Header, Footer, or Demarcator will be used"></img></li><li>Filename <img src="../../../../../html/images/iconInfo.png" alt="The values of Header, Footer, and Demarcator will be retrieved from the contents of a file" title="The values of Header, Footer, and Demarcator will be retrieved from the contents of a file"></img></li><li>Text <img src="../../../../../html/images/iconInfo.png" alt="The values of Header, Footer, and Demarcator will be specified as property values" title="The values of Header, Footer, and Demarcator will be specified as property values"></img></li></ul></td><td id="description">Determines if Header, Footer, and Demarcator should point to files containing the respective content, or if the values of the properties should be used as the content.<br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Format] Property has a value of "Binary Concatenation".</strong></td></tr><tr><td id="name">Header</td><td>Header File</td><td></td><td id="allowable-values"></td><td id="description">Filename or text specifying the header to use. If not specified, no header is supplied.<br/><br/><strong>This property requires exactly one resource to be provided. 
That resource may be any of the following types: text, file.</strong><br/><br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong><br/><br/><strong>This Property is only considered if all of the following conditions are met:</strong><ul><li>The [Delimiter Strategy] Property is set to one of the following values: [Filename], [Text]</li><li>The [Merge Format] Property has a value of "Binary Concatenation".</li></ul></td></tr><tr><td id="name">Footer</td><td>Footer File</td><td></td><td id="allowable-values"></td><td id="description">Filename or text specifying the footer to use. If not specified, no footer is supplied.<br/><br/><strong>This property requires exactly one resource to be provided. That resource may be any of the following types: text, file.</strong><br/><br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong><br/><br/><strong>This Property is only considered if all of the following conditions are met:</strong><ul><li>The [Delimiter Strategy] Property is set to one of the following values: [Filename], [Text]</li><li>The [Merge Format] Property has a value of "Binary Concatenation".</li></ul></td></tr><tr><td id="name">Demarcator</td><td>Demarcator File</td><td></td><td id="allowable-values"></td><td id="description">Filename or text specifying the demarcator to use. If not specified, no demarcator is supplied.<br/><br/><strong>This property requires exactly one resource to be provided. 
That resource may be any of the following types: text, file.</strong><br/><br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong><br/><br/><strong>This Property is only considered if all of the following conditions are met:</strong><ul><li>The [Delimiter Strategy] Property is set to one of the following values: [Filename], [Text]</li><li>The [Merge Format] Property has a value of "Binary Concatenation".</li></ul></td></tr><tr><td id="name"><strong>Compression Level</strong></td><td>Compression Level</td><td id="default-value">1</td><td id="allowable-values"><ul><li>0</li><li>1</li><li>2</li><li>3</li><li>4</li><li>5</li><li>6</li><li>7</li><li>8</li><li>9</li></ul></td><td id="description">Specifies the compression level to use when using the Zip Merge Format; if not using the Zip Merge Format, this value is ignored<br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Format] Property has a value of "ZIP".</strong></td></tr><tr><td id="name"><strong>Keep Path</strong></td><td>Keep Path</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If using the Zip or Tar Merge Format, specifies whether or not the FlowFiles' paths should be included in their entry names.<br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Format] Property is set to one of the following values: [ZIP], [TAR]</strong></td></tr><tr><td id="name">Tar Modified Time</td><td>Tar Modified Time</td><td id="default-value">${file.lastModifiedTime}</td><td id="allowable-values"></td><td id="description">If using the Tar Merge Format, specifies if the Tar entry should store the modified timestamp either by expression (e.g. 
${file.lastModifiedTime}) or static value, both of which must match the ISO8601 format 'yyyy-MM-dd'T'HH:mm:ssZ'.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong><br/><br/><strong>This Property is only considered if </strong><strong>the [Merge Format] Property has a value of "TAR".</strong></td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>failure</td><td>If the bundle cannot be created, all FlowFiles that would have been used to create the bundle will be transferred to failure</td></tr><tr><td>original</td><td>The FlowFiles that were used to create the bundle</td></tr><tr><td>merged</td><td>The FlowFile containing the merged content</td></tr></table><h3>Reads Attributes: </h3><table id="reads-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>fragment.identifier</td><td>Applicable only if the &lt;Merge Strategy&gt; property is set to Defragment. All FlowFiles with the same value for this attribute will be bundled together.</td></tr><tr><td>fragment.index</td><td>Applicable only if the &lt;Merge Strategy&gt; property is set to Defragment. This attribute indicates the order in which the fragments should be assembled. This attribute must be present on all FlowFiles when using the Defragment Merge Strategy and must be a unique (i.e., unique across all FlowFiles that have the same value for the "fragment.identifier" attribute) integer between 0 and the value of the fragment.count attribute. If two or more FlowFiles have the same value for the "fragment.identifier" attribute and the same value for the "fragment.index" attribute, the first FlowFile processed will be accepted and subsequent FlowFiles will not be accepted into the Bin.</td></tr><tr><td>fragment.count</td><td>Applicable only if the &lt;Merge Strategy&gt; property is set to Defragment. 
This attribute must be present on all FlowFiles with the same value for the fragment.identifier attribute. All FlowFiles in the same bundle must have the same value for this attribute. The value of this attribute indicates how many FlowFiles should be expected in the given bundle.</td></tr><tr><td>segment.original.filename</td><td>Applicable only if the &lt;Merge Strategy&gt; property is set to Defragment. This attribute must be present on all FlowFiles with the same value for the fragment.identifier attribute. All FlowFiles in the same bundle must have the same value for this attribute. The value of this attribute will be used for the filename of the completed merged FlowFile.</td></tr><tr><td>tar.permissions</td><td>Applicable only if the &lt;Merge Format&gt; property is set to TAR. The value of this attribute must be 3 characters; each character must be in the range 0 to 7 (inclusive) and indicates the file permissions that should be used for the FlowFile's TAR entry. If this attribute is missing or has an invalid value, the default value of 644 will be used.</td></tr></table><h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>filename</td><td>When more than 1 file is merged, the filename comes from the segment.original.filename attribute. If that attribute does not exist in the source FlowFiles, then the filename is set to the number of nanoseconds matching system time. Then a filename extension may be applied: if Merge Format is TAR, then the filename will be appended with .tar; if Merge Format is ZIP, then the filename will be appended with .zip; if Merge Format is FlowFileStream, then the filename will be appended with .pkg.</td></tr><tr><td>merge.count</td><td>The number of FlowFiles that were merged into this bundle</td></tr><tr><td>merge.bin.age</td><td>The age of the bin, in milliseconds, when it was merged and output. 
Effectively this is the greatest amount of time that any FlowFile in this bundle remained waiting in this processor before it was output</td></tr><tr><td>merge.uuid</td><td>UUID of the merged FlowFile, which will be added to the attributes of the original FlowFiles.</td></tr><tr><td>merge.reason</td><td>This processor allows for several thresholds to be configured for merging FlowFiles. This attribute indicates which of the Thresholds resulted in the FlowFiles being merged. For an explanation of each of the possible values and their meanings, see the Processor's Usage / documentation and see the 'Additional Details' page.</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.<h3>System Resource Considerations:</h3><table id="system-resource-considerations"><tr><th>Resource</th><th>Description</th></tr><tr><td>MEMORY</td><td>While content is not stored in memory, the FlowFiles' attributes are. The configuration of MergeContent (maximum bin size, maximum group size, maximum bin age, max number of entries) will influence how much memory is used. If merging together many small FlowFiles, a two-stage approach may be necessary in order to avoid excessive use of memory.</td></tr></table><h3>See Also:</h3><p><a href="../org.apache.nifi.processors.standard.SegmentContent/index.html">SegmentContent</a>, <a href="../org.apache.nifi.processors.standard.MergeRecord/index.html">MergeRecord</a></p></body></html>