blob: abc288ec174e50cc32180ee994a3a096cfe3a86d [file] [log] [blame]
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Split
> Split transform plugin
## Description
A string cutting function is defined, which is used to split the specified field in the Sql plugin.
:::tip
This transform both supported by engine Spark and Flink.
:::
## Options
<Tabs
groupId="engine-type"
defaultValue="spark"
values={[
{label: 'Spark', value: 'spark'},
{label: 'Flink', value: 'flink'},
]}>
<TabItem value="spark">
| name | type | required | default value |
| -------------- | ------ | -------- | ------------- |
| separator | string | no | " " |
| fields | array | yes | - |
| source_field | string | no | raw_message |
| target_field | string | no | *root* |
| common-options | string | no | - |
### separator [string]
Separator, the input string is separated according to the separator. The default separator is a space `(" ")` .
Note: If you use some special characters in the separator, you need to escape it. e.g. "\\|"
### source_field [string]
The source field of the string before being split, if not configured, the default is `raw_message`
### target_field [string]
`target_field` can specify the location where multiple split fields are added to the Event. If it is not configured, the default is `_root_` , that is, all split fields will be added to the top level of the Event. If a specific field is specified, the divided field will be added to the next level of this field.
</TabItem>
<TabItem value="flink">
| name | type | required | default value |
| -------------- | ------ | -------- | ------------- |
| separator | string | no | , |
| fields | array | yes | - |
| common-options | string | no | - |
### separator [string]
The specified delimiter, the default is `,`
</TabItem>
</Tabs>
### fields [list]
In the split field name list, specify the field names of each character string after splitting in order. If the length of the `fields` is greater than the length of the separation result, the extra fields are assigned null characters.
### common options [string]
Transform plugin common parameters, please refer to [Transform Plugin](common-options.mdx) for details
## Examples
<Tabs
groupId="engine-type"
defaultValue="spark"
values={[
{label: 'Spark', value: 'spark'},
{label: 'Flink', value: 'flink'},
]}>
<TabItem value="spark">
Split the `message` field in the source data according to `&`, you can use `field1` or `field2` as the key to get the corresponding value
```bash
split {
source_field = "message"
separator = "&"
fields = ["field1", "field2"]
}
```
Split the `message` field in the source data according to `,` , the split field is `info` , you can use `info.field1` or `info.field2` as the key to get the corresponding value
```bash
split {
source_field = "message"
target_field = "info"
separator = ","
fields = ["field1", "field2"]
}
```
</TabItem>
<TabItem value="flink">
</TabItem>
</Tabs>
Use `Split` as udf in sql.
```bash
# This just created a udf called split
Split{
separator = "#"
fields = ["name","age"]
}
# Use the split function (confirm that the fake table exists)
sql {
sql = "select * from (select raw_message,split(raw_message) as info_row from fake) t1"
}
```