Users want to define common variables when writing code and have them replaced with concrete values at execution time. For example, a user who runs the same sql in batches every day needs to specify the partition time of the previous day. Writing that logic in sql itself is cumbersome; if the system provides a run_date variable, the task becomes much more convenient.
During the execution of a Linkis task, custom variables are handled in Entrance, mainly by an Entrance interceptor that runs before the task is submitted and executed. The interceptor identifies the variables used in the code and the variables that have been defined, completes the code replacement using the initial values of the custom variables passed in with the task, and produces the final executable code.
The overall structure of custom variables is as follows. After the task is submitted, it passes through the variable replacement interceptor. First, all variables and expressions used in the code are parsed. They are then replaced with the initial values of the system and user-defined variables. Finally, the resulting code is submitted to EngineConn for execution. The underlying engine therefore only ever receives code in which the variables have already been replaced.
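The replacement step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Entrance interceptor itself: the function name `replace_variables` and the regular expression are assumptions, and the sketch handles plain `${name}` references only, not expressions such as `${f*2}`.

```python
import re

# Matches ${name}; expressions such as ${f*2} are deliberately left out of this sketch.
VAR_PATTERN = re.compile(r"\$\{\s*([A-Za-z_]\w*)\s*\}")


def replace_variables(code, values):
    """Replace every ${name} in code with its value; unknown names are left untouched."""

    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)

    return VAR_PATTERN.sub(substitute, code)


# Hypothetical task code and variable value, for illustration only.
code = "select * from logs where ds = '${run_date}'"
final_code = replace_variables(code, {"run_date": "20180129"})
print(final_code)
```

The engine would then receive only `final_code`, which no longer contains any variable reference.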
The variable types supported by Linkis are divided into custom variables and system built-in variables. Built-in variables are predefined by Linkis and can be used directly. Different variable types support different calculations: strings support +, integers and decimals support +, -, *, and /, and dates support + and -.
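The per-type calculation rules can be illustrated with a small Python sketch. Python's `str`, `int`/`float`, and `date` stand in for the Linkis variable types here, the function name `apply_operation` is hypothetical, and treating the operand of a date calculation as a number of days is an assumption of this sketch.

```python
import operator
from datetime import date, timedelta

NUMERIC_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def apply_operation(value, op, operand):
    """Apply one operator to a variable value, restricted by the value's type."""
    if isinstance(value, str):
        # Strings: only + (concatenation).
        if op != "+":
            raise ValueError(f"strings support only +, got {op!r}")
        return value + str(operand)
    if isinstance(value, date):
        # Dates: only + and -; the operand is assumed to be a number of days.
        if op not in ("+", "-"):
            raise ValueError(f"dates support only + and -, got {op!r}")
        delta = timedelta(days=int(operand))
        return value + delta if op == "+" else value - delta
    if isinstance(value, (int, float)):
        # Integers and decimals: + - * /.
        if op not in NUMERIC_OPS:
            raise ValueError(f"unsupported numeric operator {op!r}")
        return NUMERIC_OPS[op](value, operand)
    raise TypeError(f"unsupported variable type: {type(value).__name__}")


print(apply_operation("part_", "+", "a"))
print(apply_operation(10, "/", 4))
print(apply_operation(date(2018, 1, 29), "-", 1))
```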
The currently supported built-in variables are as follows:
| variable name | variable type | variable meaning | variable value example |
|---|---|---|---|
| run_date | String | Data statistics date. Can be set by the user; defaults to the day before the current date. For example, when yesterday's data is processed today, the value is yesterday's date. Format: yyyyMMdd | 20180129 |
| run_date_std | String | Data statistics date in standard date format. When yesterday's data is processed today, the value is yesterday's date. Format: yyyy-MM-dd | 2018-01-29 |
| run_today | String | The day after run_date (the data statistics date). Format: yyyyMMdd | 20211210 |
| run_today_std | String | The day after run_date (the data statistics date), in standard format. Format: yyyy-MM-dd | 2021-12-10 |
| run_mon | String | Month of the data statistics date. Format: yyyyMM | 202112 |
| run_mon_std | String | Month of the data statistics date, in standard format. Format: yyyy-MM | 2021-12 |
| run_month_begin | String | First day of the month containing the data statistics date. Format: yyyyMMdd | 20180101 |
| run_month_begin_std | String | First day of the month containing the data statistics date, in standard date format. Format: yyyy-MM-dd | 2018-01-01 |
| run_month_now_begin | String | First day of the month containing run_today. Format: yyyyMMdd | 20211201 |
| run_month_now_begin_std | String | First day of the month containing run_today, in standard format. Format: yyyy-MM-dd | 2021-12-01 |
| run_month_end | String | Last day of the month containing the data statistics date. Format: yyyyMMdd | 20180131 |
| run_month_end_std | String | Last day of the month containing the data statistics date, in standard date format. Format: yyyy-MM-dd | 2018-01-31 |
| run_month_now_end | String | Last day of the month containing run_today. Format: yyyyMMdd | 20211231 |
| run_month_now_end_std | String | Last day of the month containing run_today, in standard date format. Format: yyyy-MM-dd | 2021-12-31 |
| run_quarter_begin | String | First day of the quarter containing the data statistics date. Format: yyyyMMdd | 20210401 |
| run_quarter_end | String | Last day of the quarter containing the data statistics date. Format: yyyyMMdd | 20210630 |
| run_half_year_begin | String | First day of the half year containing the data statistics date. Format: yyyyMMdd | 20210101 |
| run_half_year_end | String | Last day of the half year containing the data statistics date. Format: yyyyMMdd | 20210630 |
| run_year_begin | String | First day of the year containing the data statistics date. Format: yyyyMMdd | 20210101 |
| run_year_end | String | Last day of the year containing the data statistics date. Format: yyyyMMdd | 20211231 |
| run_quarter_begin_std | String | First day of the quarter containing the data statistics date, in standard format. Format: yyyy-MM-dd | 2021-10-01 |
| run_quarter_end_std | String | Last day of the quarter containing the data statistics date, in standard format. Format: yyyy-MM-dd | 2021-12-31 |
| run_half_year_begin_std | String | First day of the half year containing the data statistics date, in standard format. Format: yyyy-MM-dd | 2021-07-01 |
| run_half_year_end_std | String | Last day of the half year containing the data statistics date, in standard format. Format: yyyy-MM-dd | 2021-12-31 |
| run_year_begin_std | String | First day of the year containing the data statistics date, in standard format. Format: yyyy-MM-dd | 2021-01-01 |
| run_year_end_std | String | Last day of the year containing the data statistics date, in standard format. Format: yyyy-MM-dd | 2021-12-31 |
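To make the relationships in the table concrete, the following Python sketch derives a subset of the built-in variables from a given run_date. It is not how Linkis computes them internally; the function name `builtin_variables` is hypothetical, and the half-year and run_month_now variables are omitted for brevity.

```python
import calendar
from datetime import date, timedelta


def builtin_variables(run_date):
    """Derive a subset of the built-in variables from run_date (a datetime.date)."""
    month_end_day = calendar.monthrange(run_date.year, run_date.month)[1]
    quarter_start_month = 3 * ((run_date.month - 1) // 3) + 1
    quarter_end_month = quarter_start_month + 2
    quarter_end_day = calendar.monthrange(run_date.year, quarter_end_month)[1]

    days = {
        "run_date": run_date,
        "run_today": run_date + timedelta(days=1),
        "run_month_begin": run_date.replace(day=1),
        "run_month_end": run_date.replace(day=month_end_day),
        "run_quarter_begin": date(run_date.year, quarter_start_month, 1),
        "run_quarter_end": date(run_date.year, quarter_end_month, quarter_end_day),
        "run_year_begin": date(run_date.year, 1, 1),
        "run_year_end": date(run_date.year, 12, 31),
    }

    result = {}
    for name, day in days.items():
        result[name] = day.strftime("%Y%m%d")            # compact format
        result[name + "_std"] = day.strftime("%Y-%m-%d")  # standard format
    result["run_mon"] = run_date.strftime("%Y%m")
    result["run_mon_std"] = run_date.strftime("%Y-%m")
    return result


# The default run_date is the day before the current date.
default_run_date = date.today() - timedelta(days=1)

values = builtin_variables(date(2018, 1, 29))
print(values["run_date"], values["run_month_begin"], values["run_month_end_std"])
```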
details:
What are custom variables? They are user variables that are defined first and then used. Custom variables currently support string, integer, and floating-point values. Strings support +, and integers and floating-point numbers support +, -, *, and /. Custom variables do not conflict with the set variable syntax that SparkSQL and HQL support natively, but the two must not use the same name. Custom variables are defined and used as follows:
## Defined in the code, before the task code

For sql scripts, define a variable as follows: --@set f=20.1

For python and shell scripts, define a variable as follows: #@set f=20.1

Note: only one variable can be defined per line.

To use a variable, reference it directly in the code as ${varName expression}, for example ${f*2}.
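As an illustration of the definition syntax, the sketch below collects the `@set` lines from a script. The helper name `parse_definitions` and the regular expression are assumptions made for this example; values are kept as strings, and the way Linkis itself parses and types them is not shown here.

```python
import re

# Comment prefix that introduces a definition, per script type.
SET_PREFIX = {"sql": "--@set", "python": "#@set", "shell": "#@set"}


def parse_definitions(code, run_type):
    """Collect name=value pairs from the @set lines of a script, one per line."""
    prefix = re.escape(SET_PREFIX[run_type])
    pattern = re.compile(rf"^\s*{prefix}\s+(\w+)\s*=\s*(.+?)\s*$")
    variables = {}
    for line in code.splitlines():
        match = pattern.match(line)
        if match:
            variables[match.group(1)] = match.group(2)
    return variables


sql_script = "--@set f=20.1\nselect ${f*2};"
shell_script = "#@set name=linkis\necho ${name}"

print(parse_definitions(sql_script, "sql"))
print(parse_definitions(shell_script, "shell"))
```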
Custom variables in Linkis also have a scope. Priority runs from highest to lowest: variables defined in the script, then variables defined in the task parameters, then built-in variables such as run_date. Task parameters are defined as follows:
## restful
{
  "executionContent": {"code": "select \"${f-1}\";", "runType": "sql"},
  "params": {
    "variable": {"f": "20.1"},
    "configuration": {
      "runtime": {
        "linkis.openlookeng.url": "http://127.0.0.1:9090"
      }
    }
  },
  "source": {"scriptPath": "file:///mnt/bdp/hadoop/1.sql"},
  "labels": {
    "engineType": "spark-2.4.3",
    "userCreator": "hadoop-IDE"
  }
}
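A request body of this shape can also be assembled programmatically, which ensures that keys such as f are serialized as quoted strings and the result is valid JSON. The sketch below only builds and serializes the body; the helper name `build_submit_body` is hypothetical, the source and runtime configuration fields are left out, and which fields a given deployment requires is not covered here.

```python
import json


def build_submit_body(code, run_type, variables, labels):
    """Assemble a submit body with custom variables under params.variable."""
    return {
        "executionContent": {"code": code, "runType": run_type},
        "params": {"variable": variables, "configuration": {}},
        "labels": labels,
    }


body = build_submit_body(
    code='select "${f-1}";',
    run_type="sql",
    variables={"f": "20.1"},
    labels={"engineType": "spark-2.4.3", "userCreator": "hadoop-IDE"},
)
payload = json.dumps(body, indent=2)
print(payload)
```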
## java SDK
JobSubmitAction.builder()
    .addExecuteCode(code)
    .setStartupParams(startupMap)
    .setUser(user) // submit user
    .addExecuteUser(user) // execute user
    .setLabels(labels)
    .setVariableMap(varMap) // custom variables passed as task parameters
    .build();
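The priority order stated earlier (script definitions over task parameters over built-in variables) can be sketched as a layered lookup. This is a minimal Python illustration, not Linkis code; the function name `resolve_variables` and the sample values are hypothetical.

```python
from collections import ChainMap


def resolve_variables(script_vars, task_vars, builtin_vars):
    """Merge the three sources; earlier mappings take priority over later ones."""
    # ChainMap searches its mappings from left to right and returns the first hit.
    return dict(ChainMap(script_vars, task_vars, builtin_vars))


resolved = resolve_variables(
    script_vars={"f": "1"},                           # from @set lines in the script
    task_vars={"f": "2", "run_date": "20200101"},     # from params.variable or setVariableMap
    builtin_vars={"run_date": "20180129"},            # predefined by the system
)
print(resolved)
```

Here f takes the script value and run_date takes the task parameter value, because each is found in a higher-priority source before the lookup reaches the lower one.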