docs/en/connector/source/Fake.mdx - seatunnel-web - Git at Google

 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';

 # Fake

 > Fake source connector

 ## Description

 `Fake` is mainly used to conveniently generate user-specified data, which is used as input for functional verification, testing, and performance testing of seatunnel.

 :::note

 Engine Supported and plugin name

 * [x] Spark: Fake, FakeStream
 * [x] Flink: FakeSource, FakeSourceStream
     * Flink `Fake Source` is mainly used to automatically generate data. The data has only two columns. The first column is of `String type` and the content is a random one from `["Gary", "Ricky Huo", "Kid Xiong"]` . The second column is of `Int type` , which is the current 13-digit timestamp is used as input for functional verification and testing of `seatunnel` .

 :::

 ## Options

 <Tabs
     groupId="engine-type"
     defaultValue="spark"
     values={[
         {label: 'Spark', value: 'spark'},
         {label: 'Flink', value: 'flink'},
     ]}>
 <TabItem value="spark">

 :::note

 These options is for Spark:`FakeStream`, and Spark:`Fake` do not have any options

 :::

 | name           | type   | required | default value |
 | -------------- | ------ | -------- | ------------- |
 | content        | array  | no       | -             |
 | rate           | number | yes      | -             |
 | common-options | string | yes      | -             |

 ### content [array]

 List of test data strings

 ### rate [number]

 Number of test cases generated per second

 </TabItem>
 <TabItem value="flink">

 | name               | type                 | required | default value |
 |--------------------|----------------------|----------|---------------|
 | parallelism        | `Int`                | no       | -             |
 | common-options     | `string`             | no       | -             |
 | mock_data_schema   | list [column_config] | no       | see details.  |
 | mock_data_size     | int                  | no       | 300           |
 | mock_data_interval | int (second)         | no       | 1             |

 ### parallelism [`Int`]

 The parallelism of an individual operator, for Fake Source Stream

 </TabItem>
 </Tabs>

 ### common options [string]

 Source plugin common parameters, please refer to [Source Plugin](common-options.mdx) for details

 ### mock_data_schema Option  [list[column_config]]

 Config mock data's schema. Each is column_config option.

 When mock_data_schema is not defined. Data will generate with schema like this:
 ```bash
 mock_data_schema = [
   {
     name = "name",
     type = "string",
     mock_config = {
       string_seed = ["Gary", "Ricky Huo", "Kid Xiong"]
       size_range = [1,1]
     }
   }
   {
     name = "age",
     type = "int",
     mock_config = {
       int_range = [1, 100]
     }
   }
 ]
 ```

 column_config option type.

 | name        | type        | required | default value | support values                                                                                                                                                                                                                                      |
 |-------------|-------------|----------|---------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | name        | string      | yes      | string        | -                                                                                                                                                                                                                                                   |
 | type        | string      | yes      | string        | int,integer,byte,boolean,char,<br/>character,short,long,float,double,<br/>date,timestamp,decimal,bigdecimal,<br/>bigint,int[],byte[],<br/>boolean[],char[],character[],short[],<br/>long[],float[],double[],string[],<br/>binary,varchar |
 | mock_config | mock_config | no       | -             | -                                                                                                                                                                                                                                                   |

 mock_config Option

 | name          | type                  | required | default value | sample                                   |
 |---------------|-----------------------|----------|---------------|------------------------------------------|
 | byte_range    | list[byte] [size=2]   | no       | -             | [0,127]                                  |
 | boolean_seed  | list[boolean]         | no       | -             | [true, true, false]                      |
 | char_seed     | list[char] [size=2]   | no       | -             | ['a','b','c']                            |
 | date_range    | list[string] [size=2] | no       | -             | ["1970-01-01", "2100-12-31"]             |
 | decimal_scale | int                   | no       | -             | 2                                        |
 | double_range  | list[double] [size=2] | no       | -             | [0.0, 10000.0]                           |
 | float_range   | list[flout] [size=2]  | no       | -             | [0.0, 10000.0]                           |
 | int_range     | list[int] [size=2]    | no       | -             | [0, 100]                                 |
 | long_range    | list[long] [size=2]   | no       | -             | [0, 100000]                              |
 | number_regex  | string                | no       | -             | "[1-9]{1}\\d?"                           |
 | time_range    | list[int] [size=6]    | no       | -             | [0,24,0,60,0,60]                         |
 | size_range    | list[int] [size=2]    | no       | -             | [6,10]                                   |
 | string_regex  | string                | no       | -             | "[a-z0-9]{5}\\@\\w{3}\\.[a-z]{3}"        |
 | string_seed   | list[string]          | no       | -             | ["Gary", "Ricky Huo", "Kid Xiong"]       |

 ### mock_data_size Option [int]

 Config mock data size.

 ### mock_data_interval Option [int]

 Config the data can mock with interval, The unit is SECOND.

 ## Examples

 <Tabs
     groupId="engine-type"
     defaultValue="spark"
     values={[
         {label: 'Spark', value: 'spark'},
         {label: 'Flink', value: 'flink'},
     ]}>
 <TabItem value="spark">

 ### Fake

 ```bash
 Fake {
     result_table_name = "my_dataset"
 }
 ```

 ### FakeStream

 ```bash
 fakeStream {
     content = ['name=ricky&age=23', 'name=gary&age=28']
     rate = 5
 }
 ```

 The generated data is as follows, randomly extract the string from the `content` list

 ```bash
 +-----------------+
 |raw_message      |
 +-----------------+
 |name=gary&age=28 |
 |name=ricky&age=23|
 +-----------------+
 ```

 </TabItem>
 <TabItem value="flink">

 ### FakeSourceStream


 ```bash
 source {
     FakeSourceStream {
         result_table_name = "fake"
         field_name = "name,age"
     }
 }
 ```

 ### FakeSource

 ```bash
 source {
     FakeSource {
         result_table_name = "fake"
         field_name = "name,age"
         mock_data_size = 100 // will generate 100 rows mock data.
     }
 }
 ```

 </TabItem>
 </Tabs>
	import Tabs from '@theme/Tabs';
	import TabItem from '@theme/TabItem';

	# Fake

	> Fake source connector

	## Description

	`Fake` is mainly used to conveniently generate user-specified data, which is used as input for functional verification, testing, and performance testing of seatunnel.

	:::note

	Engine Supported and plugin name

	* [x] Spark: Fake, FakeStream
	* [x] Flink: FakeSource, FakeSourceStream
	* Flink `Fake Source` is mainly used to automatically generate data. The data has only two columns. The first column is of `String type` and the content is a random one from `["Gary", "Ricky Huo", "Kid Xiong"]` . The second column is of `Int type` , which is the current 13-digit timestamp is used as input for functional verification and testing of `seatunnel` .

	:::

	## Options

	<Tabs
	groupId="engine-type"
	defaultValue="spark"
	values={[
	{label: 'Spark', value: 'spark'},
	{label: 'Flink', value: 'flink'},
	]}>
	<TabItem value="spark">

	:::note

	These options is for Spark:`FakeStream`, and Spark:`Fake` do not have any options

	:::

	\| name \| type \| required \| default value \|
	\| -------------- \| ------ \| -------- \| ------------- \|
	\| content \| array \| no \| - \|
	\| rate \| number \| yes \| - \|
	\| common-options \| string \| yes \| - \|

	### content [array]

	List of test data strings

	### rate [number]

	Number of test cases generated per second

	</TabItem>
	<TabItem value="flink">

	\| name \| type \| required \| default value \|
	\|--------------------\|----------------------\|----------\|---------------\|
	\| parallelism \| `Int` \| no \| - \|
	\| common-options \| `string` \| no \| - \|
	\| mock_data_schema \| list [column_config] \| no \| see details. \|
	\| mock_data_size \| int \| no \| 300 \|
	\| mock_data_interval \| int (second) \| no \| 1 \|

	### parallelism [`Int`]

	The parallelism of an individual operator, for Fake Source Stream

	</TabItem>
	</Tabs>

	### common options [string]

	Source plugin common parameters, please refer to [Source Plugin](common-options.mdx) for details

	### mock_data_schema Option [list[column_config]]

	Config mock data's schema. Each is column_config option.

	When mock_data_schema is not defined. Data will generate with schema like this:
	```bash
	mock_data_schema = [
	{
	name = "name",
	type = "string",
	mock_config = {
	string_seed = ["Gary", "Ricky Huo", "Kid Xiong"]
	size_range = [1,1]
	}
	}
	{
	name = "age",
	type = "int",
	mock_config = {
	int_range = [1, 100]
	}
	}
	]
	```

	column_config option type.

	\| name \| type \| required \| default value \| support values \|
	\|-------------\|-------------\|----------\|---------------\|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\|
	\| name \| string \| yes \| string \| - \|
	\| type \| string \| yes \| string \| int,integer,byte,boolean,char,<br/>character,short,long,float,double,<br/>date,timestamp,decimal,bigdecimal,<br/>bigint,int[],byte[],<br/>boolean[],char[],character[],short[],<br/>long[],float[],double[],string[],<br/>binary,varchar \|
	\| mock_config \| mock_config \| no \| - \| - \|

	mock_config Option

	\| name \| type \| required \| default value \| sample \|
	\|---------------\|-----------------------\|----------\|---------------\|------------------------------------------\|
	\| byte_range \| list[byte] [size=2] \| no \| - \| [0,127] \|
	\| boolean_seed \| list[boolean] \| no \| - \| [true, true, false] \|
	\| char_seed \| list[char] [size=2] \| no \| - \| ['a','b','c'] \|
	\| date_range \| list[string] [size=2] \| no \| - \| ["1970-01-01", "2100-12-31"] \|
	\| decimal_scale \| int \| no \| - \| 2 \|
	\| double_range \| list[double] [size=2] \| no \| - \| [0.0, 10000.0] \|
	\| float_range \| list[flout] [size=2] \| no \| - \| [0.0, 10000.0] \|
	\| int_range \| list[int] [size=2] \| no \| - \| [0, 100] \|
	\| long_range \| list[long] [size=2] \| no \| - \| [0, 100000] \|
	\| number_regex \| string \| no \| - \| "[1-9]{1}\\d?" \|
	\| time_range \| list[int] [size=6] \| no \| - \| [0,24,0,60,0,60] \|
	\| size_range \| list[int] [size=2] \| no \| - \| [6,10] \|
	\| string_regex \| string \| no \| - \| "[a-z0-9]{5}\\@\\w{3}\\.[a-z]{3}" \|
	\| string_seed \| list[string] \| no \| - \| ["Gary", "Ricky Huo", "Kid Xiong"] \|

	### mock_data_size Option [int]

	Config mock data size.

	### mock_data_interval Option [int]

	Config the data can mock with interval, The unit is SECOND.

	## Examples

	<Tabs
	groupId="engine-type"
	defaultValue="spark"
	values={[
	{label: 'Spark', value: 'spark'},
	{label: 'Flink', value: 'flink'},
	]}>
	<TabItem value="spark">

	### Fake

	```bash
	Fake {
	result_table_name = "my_dataset"
	}
	```

	### FakeStream

	```bash
	fakeStream {
	content = ['name=ricky&age=23', 'name=gary&age=28']
	rate = 5
	}
	```

	The generated data is as follows, randomly extract the string from the `content` list

	```bash
	+-----------------+
	\|raw_message \|
	+-----------------+
	\|name=gary&age=28 \|
	\|name=ricky&age=23\|
	+-----------------+
	```

	</TabItem>
	<TabItem value="flink">

	### FakeSourceStream



	```bash
	source {
	FakeSourceStream {
	result_table_name = "fake"
	field_name = "name,age"
	}
	}
	```

	### FakeSource

	```bash
	source {
	FakeSource {
	result_table_name = "fake"
	field_name = "name,age"
	mock_data_size = 100 // will generate 100 rows mock data.
	}
	}
	```

	</TabItem>
	</Tabs>