 <!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at
       http://www.apache.org/licenses/LICENSE-2.0
   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
 -->
 # MiNiFi - C++ Expression Language

 Apache NiFi - MiNiFi - C++ supports a subset of the [Apache NiFi Expression
 Language](https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html)
 (EL).  EL is a tiny DSL enabling processor property values to be computed
 dynamically according to contextual information such as FlowFile attributes.
 Dynamic values may be manipulated by a number of functions supported by EL,
 including boolean logic, string manipulation, encoding/decoding, searching,
 mathematical operators, date manipulation, type coercion, and more.
 Processors/properties supporting EL are marked in the [processors
 documentation](PROCESSORS.md).

 ## Overview

 All data in Apache NiFi is represented by an abstraction called a FlowFile. A
 FlowFile comprises two major pieces: content and attributes. The content
 portion of the FlowFile represents the data on which to operate. For instance,
 if a file is picked up from a local file system using the GetFile Processor,
 the contents of the file will become the contents of the FlowFile.

 The attributes portion of the FlowFile represents information about the data
 itself, or metadata. Attributes are key-value pairs that represent what is
 known about the data as well as information that is useful for routing and
 processing the data appropriately. Keeping with the example of a file that is
 picked up from a local file system, the FlowFile would have an attribute called
 `filename` that reflected the name of the file on the file system.
 Additionally, the FlowFile will have a `path` attribute that reflects the
 directory on the file system that this file lived in. The FlowFile will also
 have an attribute named `uuid`, which is a unique identifier for this FlowFile.
 For a complete listing of the core attributes, see the FlowFile section of
 the Apache NiFi Developer’s Guide.

 However, placing these attributes on a FlowFile does not provide much benefit if
 the user is unable to make use of them. The NiFi Expression Language provides
 the ability to reference these attributes, compare them to other values, and
 manipulate their values.

 ## Structure of a NiFi Expression

 The NiFi Expression Language always begins with the start delimiter `${` and
 ends with the end delimiter `}`. Between the start and end delimiters is the
 text of the Expression itself. In its most basic form, the Expression can
 consist of just an attribute name. For example, `${filename}` will return the
 value of the `filename` attribute.

 In a slightly more complex example, we can instead return a manipulation of
 this value. We can, for example, return an all upper-case version of the
 filename by calling the `toUpper` function: `${filename:toUpper()}`. In this
 case, we reference the `filename` attribute and then manipulate this value by
 using the `toUpper` function. A function call consists of 5 elements. First,
 there is a function call delimiter `:`. Second is the name of the function, in
 this case `toUpper`. Next is an open parenthesis (`(`), followed by the
 function arguments.  The arguments necessary are dependent upon which function
 is being called. In this example, we are using the `toUpper` function, which
 does not have any arguments, so this element is omitted. Finally, the closing
 parenthesis (`)`) indicates the end of the function call. There are many
 different functions that are supported by the Expression Language to achieve
 many different goals. Some functions provide String (text) manipulation, such
 as the `toUpper` function. Others, such as the `equals` and `matches` functions,
 provide comparison functionality. Functions also exist for manipulating dates
 and times and for performing mathematical operations. Each of these functions
 is described below, in the Functions section, with an explanation of what the
 function does, the arguments that it requires, and the type of information that
 it returns.

 When we perform a function call on an attribute, as above, we refer to the
 attribute as the subject of the function, as the attribute is the entity on
 which the function is operating. We can then chain together multiple function
 calls, where the return value of the first function becomes the subject of the
 second function and its return value becomes the subject of the third function
 and so on. Continuing with our example, we can chain together multiple
 functions by using an expression similar to
 `${filename:toUpper():equals('HELLO.TXT')}`.  There is no limit to the number
 of functions that can be chained together.
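
The chaining model can be illustrated with a minimal Python sketch. This is not MiNiFi code: the `FUNCTIONS` table and the `evaluate_chain` helper are hypothetical names used only to show how each return value becomes the subject of the next call.

```python
# Illustrative model of EL function chaining; not the MiNiFi implementation.
FUNCTIONS = {
    "toUpper": lambda subject: subject.upper(),
    "toLower": lambda subject: subject.lower(),
    "equals": lambda subject, other: subject == other,
}


def evaluate_chain(attributes, attribute_name, calls):
    """Apply each (function name, arguments) pair to the running subject."""
    subject = attributes.get(attribute_name)
    for name, args in calls:
        # The return value of one call becomes the subject of the next.
        subject = FUNCTIONS[name](subject, *args)
    return subject


# Models ${filename:toUpper():equals('HELLO.TXT')}
result = evaluate_chain(
    {"filename": "hello.txt"},
    "filename",
    [("toUpper", ()), ("equals", ("HELLO.TXT",))],
)
```

Here `result` is a boolean, because the last function in the chain is a comparison.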

 Any FlowFile attribute can be referenced using the Expression Language.
 However, if the attribute name contains a special character, the attribute
 name must be escaped by quoting it. The following characters are each
 considered special characters:

 - `$` (dollar sign)
 - `|` (pipe)
 - `{` (open brace)
 - `}` (close brace)
 - `(` (open parenthesis)
 - `)` (close parenthesis)
 - `[` (open bracket)
 - `]` (close bracket)
 - `,` (comma)
 - `:` (colon)
 - `;` (semicolon)
 - `/` (forward slash)
 - `*` (asterisk)
 - `'` (single quote)
 - ` ` (space)
 - `\t` (tab)
 - `\r` (carriage return)
 - `\n` (new-line)

 Additionally, a digit is considered a special character if it is the first
 character of the attribute name. If any of these special characters is present
 in an attribute name, the name must be quoted using either single or double
 quotes. The Expression Language allows single quotes and double quotes to be
 used interchangeably. For example, either of the following can be used to
 escape an attribute named `my attribute`: `${"my attribute"}` or
 `${'my attribute'}`.

 In this example, the value to be returned is the value of the "my attribute"
 attribute, if it exists. If that attribute does not exist, the Expression Language
 will then look for a System Environment Variable named "my attribute." Finally,
 if none of these exists, the Expression Language will return a null value.
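
The quoting rule and the lookup order described above can be sketched in Python. The helper names (`needs_quoting`, `reference`, `resolve`) are hypothetical and exist only for illustration. The sketch assumes attributes are held in a plain dictionary, and it does not model names that themselves contain quote characters.

```python
import os

# The special characters listed above; any of them forces quoting.
SPECIAL_CHARACTERS = set("$|{}()[],:;/*' \t\r\n")


def needs_quoting(attribute_name):
    """Return True if the name must be quoted inside an Expression."""
    if attribute_name[:1].isdigit():
        return True  # a leading digit also counts as special
    return any(ch in SPECIAL_CHARACTERS for ch in attribute_name)


def reference(attribute_name):
    """Build a ${...} reference, quoting the name only when required."""
    if needs_quoting(attribute_name):
        return '${"' + attribute_name + '"}'
    return "${" + attribute_name + "}"


def resolve(attribute_name, attributes, environ=None):
    """Look up a FlowFile attribute, then an environment variable, else None."""
    if environ is None:
        environ = os.environ
    if attribute_name in attributes:
        return attributes[attribute_name]
    # Fall back to a system environment variable of the same name.
    return environ.get(attribute_name)
```

For example, `reference("my attribute")` yields `${"my attribute"}`, and `resolve("my attribute", {}, {})` yields `None`, mirroring the null result described above.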

 There also exist some functions that expect to have no subject. These functions
 are invoked simply by calling the function at the beginning of the Expression,
 such as `${hostname()}`. These functions can then be chained together, as well.
 For example, `${hostname():toUpper()}`. Attempting to evaluate the function with
 a subject will result in an error. In the Functions section below, these
 functions will clearly indicate in their descriptions that they do not require
 a subject.

 Often, we will need to compare the values of two different attributes to
 each other. We are able to accomplish this by using embedded Expressions. We
 can, for example, check if the filename attribute is the same as the uuid
 attribute: `${filename:equals( ${uuid} )}`. Notice here, also, that we have a
 space between the opening parenthesis for the equals method and the embedded
 Expression. This is not necessary and does not affect how the Expression is
 evaluated in any way. Rather, it is intended to make the Expression easier to
 read. White space is ignored by the Expression Language between delimiters.
 Therefore, we can use the Expression `${ filename : equals(${ uuid}) }` or
 `${filename:equals(${uuid})}` and both Expressions mean the same thing. We
 cannot, however, use `${file name:equals(${uuid})}`, because this results in
 file and name being interpreted as different tokens, rather than a single
 token, filename.

 ## Supported Features

 ### String Manipulation

 - [`toUpper`](#toupper)
 - [`toLower`](#tolower)
 - [`substring`](#substring)
 - [`substringBefore`](#substringbefore)
 - [`substringBeforeLast`](#substringbeforelast)
 - [`substringAfter`](#substringafter)
 - [`substringAfterLast`](#substringafterlast)
 - [`replace`](#replace)
 - [`replaceFirst`](#replacefirst)
 - [`replaceAll`](#replaceall)
 - [`replaceNull`](#replacenull)
 - [`replaceEmpty`](#replaceempty)

 ### Mathematical Operations and Numeric Manipulation

 - [`plus`](#plus)
 - [`minus`](#minus)
 - [`multiply`](#multiply)
 - [`divide`](#divide)
 - [`mod`](#mod)
 - [`toRadix`](#toradix)
 - [`fromRadix`](#fromradix)
 - [`random`](#random)

 ## Planned Features

 ### String Manipulation

 - `trim`
 - `getDelimitedField`
 - `append`
 - `prepend`
 - `length`

 ### Boolean Logic

 - `isNull`
 - `notNull`
 - `isEmpty`
 - `equals`
 - `equalsIgnoreCase`
 - `gt`
 - `ge`
 - `lt`
 - `le`
 - `and`
 - `or`
 - `not`
 - `ifElse`

 ### Encode/Decode Functions

 - `escapeJson`
 - `escapeXml`
 - `escapeCsv`
 - `escapeHtml3`
 - `escapeHtml4`
 - `unescapeJson`
 - `unescapeXml`
 - `unescapeCsv`
 - `unescapeHtml3`
 - `unescapeHtml4`
 - `urlEncode`
 - `urlDecode`
 - `base64Encode`
 - `base64Decode`

 ### Date Manipulation

 - `format`
 - `toDate`
 - `now`

 ### Subjectless Functions

 - `ip`
 - `hostname`
 - `UUID`
 - `nextInt`
 - `literal`
 - `getStateValue`

 ### Evaluating Multiple Attributes

 - `anyAttribute`
 - `allAttributes`
 - `anyMatchingAttribute`
 - `allMatchingAttributes`
 - `anyDelineatedValue`
 - `allDelineatedValues`
 - `join`
 - `count`

 ## Unsupported Features

 The following EL features are currently not supported, and no support is
 planned due to language/environment (Java vs. C++) differences:

 ### Mathematical Operations and Numeric Manipulation

 - `math`

 ## String Manipulation

 Each of the following functions manipulates a String in some way.

 ### toUpper

 **Description**: This function converts the Subject into an all upper-case
 String. Said another way, it replaces any lowercase letter with the uppercase
 equivalent.

 **Subject Type**: String

 **Arguments**: No arguments

 **Return Type**: String

 **Examples**: If the `filename` attribute is `abc123.txt`, then the Expression
 `${filename:toUpper()}` will return `ABC123.TXT`

 ### toLower

 **Description**: This function converts the Subject into an all lower-case
 String. Said another way, it replaces any uppercase letter with the lowercase
 equivalent.

 **Subject Type**: String

 **Arguments**: No arguments

 **Return Type**: String

 **Examples**: If the `filename` attribute is `ABC123.TXT`, then the Expression
 `${filename:toLower()}` will return `abc123.txt`

 ### substring

 **Description**: Returns a portion of the Subject, given a starting index and
 an optional ending index. If the ending index is not supplied, it will return
 the portion of the Subject starting at the given 'start index' and ending at
 the end of the Subject value.

 The starting index and ending index are zero-based. That is, the first
 character is referenced by using the value 0, not 1.

 If either the starting index or the ending index is not a number, this
 function call will result in an error.

 If the starting index is larger than the ending index, this function call will
 result in an error.

 If the starting index or the ending index is greater than the length of the
 Subject or has a value less than 0, this function call will result in an error.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | starting index | The 0-based index of the first character to capture (inclusive) |
 | ending index | The 0-based index of the last character to capture (exclusive) |

 **Return Type**: String

 **Examples**:

 If we have an attribute named `filename` with the value `a brand new
 filename.txt`, then the following Expressions will result in the following
 values:

 | Expression | Value |
 | - | - |
 | `${filename:substring(0,1)}` | a |
 | `${filename:substring(2)}` | brand new filename.txt |
 | `${filename:substring(12)}` | filename.txt |
 | `${filename:substring( ${filename:length():minus(2)} )}` | xt |
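
The index rules above can be modeled with a short Python sketch. The `substring` helper below is a hypothetical stand-in, not the MiNiFi implementation. It raises `ValueError` where the text says the call results in an error, which is an assumption about how errors would surface.

```python
def substring(subject, start, end=None):
    """Model of EL substring: zero-based, start inclusive, end exclusive."""
    if end is None:
        # No ending index: run to the end of the Subject.
        end = len(subject)
    if not (0 <= start <= end <= len(subject)):
        raise ValueError("index out of range or start greater than end")
    return subject[start:end]


filename = "a brand new filename.txt"
first = substring(filename, 0, 1)
tail = substring(filename, len(filename) - 2)
```

With the `filename` value from the table, `first` is `a` and `tail` is `xt`.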

 ### substringBefore

 **Description**: Returns a portion of the Subject, starting with the first
 character of the Subject and ending with the character immediately before the
 first occurrence of the argument. If the argument is not present in the
 Subject, the entire Subject will be returned.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | value | The String to search for in the Subject |

 **Return Type**: String

 **Examples**: If the `filename` attribute has the value `a brand new
 filename.txt`, then the following Expressions will result in the following
 values:

 | Expression | Value |
 | - | - |
 | `${filename:substringBefore('.')}` | a brand new filename |
 | `${filename:substringBefore(' ')}` | a |
 | `${filename:substringBefore(' n')}` | a brand |
 | `${filename:substringBefore('missing')}` | a brand new filename.txt |

 ### substringBeforeLast

 **Description**: Returns a portion of the Subject, starting with the first
 character of the Subject and ending with the character immediately before the
 last occurrence of the argument. If the argument is not present in the Subject,
 the entire Subject will be returned.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | value | The String to search for in the Subject |

 **Return Type**: String

 **Examples**: If the `filename` attribute has the value `a brand new
 filename.txt`, then the following Expressions will result in the following
 values:

 | Expression | Value |
 | - | - |
 | `${filename:substringBeforeLast('.')}` | a brand new filename |
 | `${filename:substringBeforeLast(' ')}` | a brand new |
 | `${filename:substringBeforeLast(' n')}` | a brand |
 | `${filename:substringBeforeLast('missing')}` | a brand new filename.txt |
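
As a rough model, `substringBefore` and `substringBeforeLast` differ only in whether the first or the last occurrence of the argument is located. The Python helpers below are hypothetical. Behavior for an empty search string is not described above and is not modeled.

```python
def substring_before(subject, value):
    """Text before the first occurrence of value, or the whole subject."""
    index = subject.find(value)
    return subject if index == -1 else subject[:index]


def substring_before_last(subject, value):
    """Text before the last occurrence of value, or the whole subject."""
    index = subject.rfind(value)
    return subject if index == -1 else subject[:index]


filename = "a brand new filename.txt"
before_space = substring_before(filename, " ")
before_last_space = substring_before_last(filename, " ")
```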

 ### substringAfter

 **Description**: Returns a portion of the Subject, starting with the character
 immediately after the first occurrence of the argument and extending to the end
 of the Subject. If the argument is not present in the Subject, the entire
 Subject will be returned.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | value | The String to search for in the Subject |

 **Return Type**: String

 **Examples**: If the `filename` attribute has the value `a brand new
 filename.txt`, then the following Expressions will result in the following
 values:

 | Expression | Value |
 | - | - |
 | `${filename:substringAfter('.')}` | txt |
 | `${filename:substringAfter(' ')}` | brand new filename.txt |
 | `${filename:substringAfter(' n')}` | ew filename.txt |
 | `${filename:substringAfter('missing')}` | a brand new filename.txt |

 ### substringAfterLast

 **Description**: Returns a portion of the Subject, starting with the character
 immediately after the last occurrence of the argument and extending to the end
 of the Subject. If the argument is not present in the Subject, the entire
 Subject will be returned.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | value | The String to search for in the Subject |

 **Return Type**: String

 **Examples**: If the `filename` attribute has the value `a brand new
 filename.txt`, then the following Expressions will result in the following
 values:

 | Expression | Value |
 | - | - |
 | `${filename:substringAfterLast('.')}` | txt |
 | `${filename:substringAfterLast(' ')}` | filename.txt |
 | `${filename:substringAfterLast(' n')}` | ew filename.txt |
 | `${filename:substringAfterLast('missing')}` | a brand new filename.txt |
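
The two "after" variants can be sketched the same way: locate the first or last occurrence of the argument, then skip past it. The helper names are hypothetical, and an empty search string is not modeled.

```python
def substring_after(subject, value):
    """Text after the first occurrence of value, or the whole subject."""
    index = subject.find(value)
    return subject if index == -1 else subject[index + len(value):]


def substring_after_last(subject, value):
    """Text after the last occurrence of value, or the whole subject."""
    index = subject.rfind(value)
    return subject if index == -1 else subject[index + len(value):]


filename = "a brand new filename.txt"
after_space = substring_after(filename, " ")
after_last_space = substring_after_last(filename, " ")
```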

 ### replace

 **Description**: Replaces all occurrences of one literal String within the Subject
 with another String.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Search String | The String to find within the Subject |
 | Replacement | The value to replace Search String with |

 **Return Type**: String

 **Examples**: If the `filename` attribute has the value `a brand new
 filename.txt`, then the following Expressions will provide the following
 results:

 | Expression | Value |
 | - | - |
 | `${filename:replace('.', '_')}` | a brand new filename_txt |
 | `${filename:replace(' ', '.')}` | a.brand.new.filename.txt |
 | `${filename:replace('XYZ', 'ZZZ')}` | a brand new filename.txt |
 | `${filename:replace('filename', 'book')}` | a brand new book.txt |

 ### replaceFirst

 **Description**: Replaces the first occurrence of one literal String or regular
 expression within the Subject with another String.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Search String | The String (literal or regular expression pattern) to find within the Subject |
 | Replacement | The value to replace Search String with |

 **Return Type**: String

 **Examples**: If the `filename` attribute has the value `a brand new
 filename.txt`, then the following Expressions will provide the following
 results:

 | Expression | Value |
 | - | - |
 | `${filename:replaceFirst('a', 'the')}` | the brand new filename.txt |
 | `${filename:replaceFirst('[br]', 'g')}` | a grand new filename.txt |
 | `${filename:replaceFirst('XYZ', 'ZZZ')}` | a brand new filename.txt |
 | `${filename:replaceFirst('\w{8}', 'book')}` | a brand new book.txt |

 ### replaceAll

 **Description**: The replaceAll function takes two String arguments: a literal
 String or Regular Expression (NiFi uses the Java Pattern syntax), and a
 replacement string. The return value is the result of substituting the
 replacement string for all patterns within the Subject that match the Regular
 Expression.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Regex | The Regular Expression (in Java syntax) to match in the Subject |
 | Replacement | The value to use for replacing matches in the Subject. If the regular expression argument uses Capturing Groups, back references are allowed in the replacement. |

 **Return Type**: String

 **Examples**: If the `filename` attribute has the value `a brand new
 filename.txt`, then the following Expressions will provide the following
 results:

 | Expression | Value |
 | - | - |
 | `${filename:replaceAll('\..*', '')}` | a brand new filename |
 | `${filename:replaceAll('a brand (new)', '$1')}` | new filename.txt |
 | `${filename:replaceAll('XYZ', 'ZZZ')}` | a brand new filename.txt |
 | `${filename:replaceAll('brand (new)', 'somewhat $1')}` | a somewhat new filename.txt |
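
The regular-expression behavior of `replaceFirst` and `replaceAll`, including back references, can be approximated with Python's `re` module. This is only a sketch: Python's regex dialect is not identical to the Java Pattern syntax mentioned above, and the helper names are hypothetical. The `$1` style of back reference is translated to Python's `\g<1>` form before substitution.

```python
import re


def _to_python_replacement(replacement):
    # Translate Java-style $1 back references into Python's \g<1> form.
    return re.sub(r"\$(\d+)", r"\\g<\1>", replacement)


def replace_first(subject, regex, replacement):
    """Replace only the first match of regex in subject."""
    return re.sub(regex, _to_python_replacement(replacement), subject, count=1)


def replace_all(subject, regex, replacement):
    """Replace every match of regex in subject."""
    return re.sub(regex, _to_python_replacement(replacement), subject)


filename = "a brand new filename.txt"
first_only = replace_first(filename, r"[br]", "g")
with_group = replace_all(filename, r"brand (new)", "somewhat $1")
```

With the `filename` value from the tables, `first_only` is `a grand new filename.txt` and `with_group` is `a somewhat new filename.txt`.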

 ### replaceNull

 **Description**: The replaceNull function returns the argument if the Subject is
 null. Otherwise, returns the Subject.

 **Subject Type**: Any

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Replacement | The value to return if the Subject is null. |

 **Return Type**: Type of Subject if Subject is not null; else, type of Argument

 **Examples**: If the attribute `filename` has the value `a brand new filename.txt` and the attribute `hello` does not exist, then the Expression `${filename:replaceNull('abc')}` will return `a brand new filename.txt`, while `${hello:replaceNull('abc')}` will return `abc`.

 ### replaceEmpty

 **Description**: The replaceEmpty function returns the argument if the Subject is
 null or if the Subject consists only of white space (new line, carriage return,
 tab, space). Otherwise, returns the Subject.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Replacement | The value to return if the Subject is null or empty. |

 **Return Type**: String

 **Examples**: If the attribute `filename` has the value `a brand new
 filename.txt` and the attribute `hello` has the value ` `, then the Expression
 `${filename:replaceEmpty('abc')}` will return `a brand new filename.txt`, while
 `${hello:replaceEmpty('abc')}` will return `abc`.
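
The difference between `replaceNull` and `replaceEmpty` can be shown with a small Python sketch, assuming a missing attribute is represented as `None`. The helper names are hypothetical.

```python
def replace_null(subject, replacement):
    """Return replacement only when the subject is missing (None)."""
    return replacement if subject is None else subject


def replace_empty(subject, replacement):
    """Return replacement when the subject is None or only white space."""
    if subject is None or subject.strip(" \t\r\n") == "":
        return replacement
    return subject


attributes = {"filename": "a brand new filename.txt", "hello": " "}
null_case = replace_null(attributes.get("missing"), "abc")
blank_case = replace_empty(attributes.get("hello"), "abc")
```

Note that `replace_null` leaves a white-space-only value untouched, while `replace_empty` replaces it.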

 ## Mathematical Operations and Numeric Manipulation

 For those functions that support Decimal and Number (whole number) types, the
 return value type depends on the input types. If either the subject or argument
 is a Decimal then the result will be a Decimal. If both values are Numbers
 then the result will be a Number. This includes Divide. This is to preserve
 backwards compatibility and to not force rounding errors.
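
The type rule can be sketched in Python by modeling a Number as `int` and a Decimal as `float`. Both mappings are assumptions made for illustration: Python integers are unbounded, unlike a fixed-width whole number. Rounding toward zero for whole-number division is also an assumption, as the text above does not state the rounding direction.

```python
def plus(subject, operand):
    """int + int stays int; any float operand makes the result a float."""
    return subject + operand


def divide(subject, operand):
    """Whole-number division for two ints, floating-point division otherwise."""
    if isinstance(subject, int) and isinstance(operand, int):
        # Assumed: truncate toward zero rather than toward negative infinity.
        quotient = abs(subject) // abs(operand)
        return quotient if (subject < 0) == (operand < 0) else -quotient
    return subject / operand


whole = divide(100, 12)
fractional = divide(100, 12.0)
```

Here `whole` is the Number `8`, while `fractional` is a Decimal close to `8.33`.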

 ### plus

 **Description**: Adds a numeric value to the Subject. If either the argument or the
 Subject cannot be coerced into a Number, returns null.

 **Subject Type**: Number or Decimal

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Operand | The value to add to the Subject |

 **Return Type**: Number or Decimal (depending on input types)

 **Examples**: If the `fileSize` attribute has a value of 100, then the
 Expression `${fileSize:plus(1000)}` will return the value 1100.

 ### minus

 **Description**: Subtracts a numeric value from the Subject.

 **Subject Type**: Number or Decimal

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Operand | The value to subtract from the Subject |

 **Return Type**: Number or Decimal (depending on input types)

 **Examples**: If the `fileSize` attribute has a value of 100, then the
 Expression `${fileSize:minus(100)}` will return the value 0.

 ### multiply

 **Description**: Multiplies a numeric value by the Subject and returns the product.

 **Subject Type**: Number or Decimal

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Operand | The value to multiply the Subject by |

 **Return Type**: Number or Decimal (depending on input types)

 **Examples**: If the `fileSize` attribute has a value of 100, then the
 Expression `${fileSize:multiply(1024)}` will return the value 102400.

 ### divide

 **Description**: Divides the Subject by a numeric value and returns the result.

 **Subject Type**: Number or Decimal

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Operand | The value to divide the Subject by |

 **Return Type**: Number or Decimal (depending on input types)

 **Examples**: If the `fileSize` attribute has a value of 100, then the
 Expression `${fileSize:divide(12)}` will return the value 8.

 ### mod

 **Description**: Performs a modular division of the Subject by the argument. That
 is, this function will divide the Subject by the value of the argument and
 return not the quotient but rather the remainder.

 **Subject Type**: Number or Decimal

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Operand | The value to divide the Subject by |

 **Return Type**: Number or Decimal (depending on input types)

 **Examples**: If the `fileSize` attribute has a value of 100, then the
 Expression `${fileSize:mod(12)}` will return the value 4.

 ### toRadix

 **Description**: Converts the Subject from a Base 10 number to a
 different Radix (or number base). An optional second argument can be used to
 indicate the minimum number of characters to be used. If the converted value
 has fewer than this number of characters, the number will be padded with
 leading zeroes.

 If a decimal is passed as the subject, it will first be converted to a whole
 number and then processed.

 **Subject Type**: Number

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Desired Base | A Number between 2 and 36 (inclusive) |
 | Padding | Optional argument that specifies the minimum number of characters in the converted output |

 **Return Type**: String

 **Examples**: If the `fileSize` attribute has a value of 1024, then the
 following Expressions will yield the following results:

 | Expression | Value |
 | - | - |
 | `${fileSize:toRadix(10)}` | 1024 |
 | `${fileSize:toRadix(10, 1)}` | 1024 |
 | `${fileSize:toRadix(10, 8)}` | 00001024 |
 | `${fileSize:toRadix(16)}` | 400 |
 | `${fileSize:toRadix(16, 8)}` | 00000400 |
 | `${fileSize:toRadix(2)}` | 10000000000 |
 | `${fileSize:toRadix(2, 16)}` | 0000010000000000 |
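
The conversion and padding rules can be sketched in Python by repeated division. The `to_radix` helper is hypothetical. Lowercase output digits and the placement of a minus sign ahead of the padding are assumptions, as neither is specified above.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_radix(subject, base, padding=0):
    """Convert a base 10 number to the given base, left-padded with zeroes."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    value = int(subject)  # a decimal subject is first converted to a whole number
    negative = value < 0
    value = abs(value)
    digits = ""
    while True:
        value, remainder = divmod(value, base)
        digits = DIGITS[remainder] + digits
        if value == 0:
            break
    digits = digits.rjust(padding, "0")
    return "-" + digits if negative else digits


hex_padded = to_radix(1024, 16, 8)
binary = to_radix(1024, 2)
```

With a `fileSize` of 1024, `hex_padded` is `00000400` and `binary` is `10000000000`, matching the table.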

 ### fromRadix

 **Description**: Converts the Subject from a specified Radix (or
 number base) to a base ten whole number. The subject will be converted as is,
 without interpretation, and all characters must be valid for the base being
 converted from. For example, converting "0xFF" from hex will not work due to "x"
 being an invalid hex character.

 If a decimal is passed as the subject, it will first be converted to a whole
 number and then processed.

 **Subject Type**: String

 **Arguments**:

 | Argument | Description |
 | - | - |
 | Subject Base | A Number between 2 and 36 (inclusive) |

 **Return Type**: Number

 **Examples**: If the `fileSize` attribute has a value of 1234A, then the
 following Expressions will yield the following results:

 | Expression | Value |
 | - | - |
 | `${fileSize:fromRadix(11)}` | 17720 |
 | `${fileSize:fromRadix(16)}` | 74570 |
 | `${fileSize:fromRadix(20)}` | 177290 |
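
A strict digit-by-digit conversion makes the "no interpretation" rule concrete. The Python sketch below uses a hypothetical `from_radix` helper and deliberately avoids Python's built-in `int(text, base)`, which would accept a `0x` prefix. Accepting both upper-case and lower-case digits, and rejecting an empty subject, are assumptions.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def from_radix(subject, base):
    """Convert text in the given base to a whole number, rejecting bad digits."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if not subject:
        raise ValueError("empty subject")
    value = 0
    for ch in subject.lower():
        digit = DIGITS.find(ch)
        if digit == -1 or digit >= base:
            raise ValueError("invalid digit for base %d: %r" % (base, ch))
        value = value * base + digit
    return value


from_base_11 = from_radix("1234A", 11)
from_base_16 = from_radix("1234A", 16)
```

Calling `from_radix("0xFF", 16)` raises `ValueError` in this sketch, because `x` is not a valid hexadecimal digit.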

 ### random

 **Description**: Returns a random whole number (0 to 2^63 - 1) using an insecure
 random number generator.

 **Subject Type**: No subject

 **Arguments**: No arguments

 **Return Type**: Number

 **Examples**: `${random():mod(10):plus(1)}` returns a random number between 1 and 10 inclusive.
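
The arithmetic behind that example can be checked with a short Python sketch. The `el_random` helper is hypothetical and uses Python's own non-cryptographic generator. It models only the documented range, not MiNiFi's generator.

```python
import random


def el_random():
    """Return a whole number in the range 0 to 2^63 - 1 (inclusive)."""
    return random.randrange(2**63)


# Models ${random():mod(10):plus(1)}: mod(10) gives 0..9, plus(1) shifts to 1..10.
value = el_random() % 10 + 1
```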