This document explains where manual modification is required when adapting Linkis to Apache, CDH, HDP and other Hadoop distribution versions.
Enter the root directory of the project and execute the following commands in sequence:

```bash
mvn -N install
mvn clean install -Dmaven.test.skip=true
```
`linkis-dist -> package -> db -> linkis-dml.sql`
Switch the corresponding engine labels to the versions you need. If the versions you use match the official defaults, this step can be skipped.
For example:

```sql
-- variable:
SET @SPARK_LABEL="spark-2.4.3";
SET @HIVE_LABEL="hive-2.3.3";
SET @PYTHON_LABEL="python-python2";
SET @PIPELINE_LABEL="pipeline-1";
SET @JDBC_LABEL="jdbc-4";
SET @PRESTO_LABEL="presto-0.234";
SET @IO_FILE_LABEL="io_file-1.0";
SET @OPENLOOKENG_LABEL="openlookeng-1.5.0";
```
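If you prefer to script this substitution instead of editing linkis-dml.sql by hand, here is a minimal sketch; the helper name and sample values are illustrative, not part of Linkis:

```python
import re

def set_engine_label(sql_text: str, engine: str, version: str) -> str:
    """Rewrite e.g. SET @SPARK_LABEL="spark-2.4.3"; to the desired version."""
    pattern = rf'(SET @{engine.upper()}_LABEL=")[^"]*(";)'
    return re.sub(pattern, rf'\g<1>{engine.lower()}-{version}\g<2>', sql_text)

# Hypothetical usage against linkis-dist/package/db/linkis-dml.sql contents
sql = 'SET @SPARK_LABEL="spark-2.4.3"; SET @HIVE_LABEL="hive-2.3.3";'
sql = set_engine_label(sql, "spark", "3.0.1")
sql = set_engine_label(sql, "hive", "3.1.2")
print(sql)
```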
| engine | version |
|---|---|
| hadoop | 2.7.2 |
| hive | 2.3.3 |
| spark | 2.4.3 |
| flink | 1.12.2 |
| engine | version |
|---|---|
| hadoop | 3.1.1 |
| hive | 3.1.2 |
| spark | 3.0.1 |
| flink | 1.13.2 |
For Linkis version < 1.3.2
```xml
<hadoop.version>3.1.1</hadoop.version>
<scala.version>2.12.10</scala.version>
<scala.binary.version>2.12</scala.binary.version>

<!-- replace hadoop-hdfs with hadoop-hdfs-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>
```
For Linkis version >= 1.3.2, we only need to set scala.version and scala.binary.version if necessary
```xml
<scala.version>2.12.10</scala.version>
<scala.binary.version>2.12</scala.binary.version>
```
This is because we can compile directly with the hadoop-3.3 or hadoop-2.7 profile. Profile hadoop-3.3 can be used for any Hadoop 3.x (the default Hadoop 3.x version is 3.3.1), and profile hadoop-2.7 can be used for any Hadoop 2.x (the default Hadoop 2.x version is 2.7.2). Other Hadoop versions can be specified with -Dhadoop.version=xxx:
```bash
mvn -N install
mvn clean install -Phadoop-3.3 -Dmaven.test.skip=true
mvn clean install -Phadoop-3.3 -Dhadoop.version=3.1.1 -Dmaven.test.skip=true
```
For Linkis version < 1.3.2
```xml
<!-- Notice <version>${hadoop.version}</version> here; adjust according to whether you encounter errors -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>
```
For Linkis version >= 1.3.2, the linkis-hadoop-common module does not need to be changed.
```xml
<hive.version>3.1.2</hive.version>
```
For Linkis version < 1.3.2
```xml
<spark.version>3.0.1</spark.version>
```
For Linkis version >= 1.3.2
We can compile directly with the spark-3.2 or spark-2.4-hadoop-3.3 profile; if we need to build against Hadoop 3, the hadoop-3.3 profile is also needed. The default Spark 3.x version is 3.2.1. When compiling with spark-3.2, the Scala version defaults to 2.12.15, so we do not need to set the Scala version in the Linkis project pom file (mentioned in 5.1.1). If Spark 2.x is used with Hadoop 3, the profile `spark-2.4-hadoop-3.3` needs to be activated for compatibility reasons:
```bash
mvn -N install
mvn clean install -Pspark-3.2 -Phadoop-3.3 -Dmaven.test.skip=true
mvn clean install -Pspark-2.4-hadoop-3.3 -Phadoop-3.3 -Dmaven.test.skip=true
```
```xml
<flink.version>1.13.2</flink.version>
```
Because some classes were changed between Flink 1.12.2 and 1.13.2, Flink itself needs to be recompiled with adjustments. Select Scala 2.12 when compiling Flink.
:::caution temporary plan
Note that the following operations are all performed in the Flink project.

Because some classes were adjusted between Flink 1.12.2 and 1.13.2, Flink needs to be compiled with those adjustments. Scala 2.12 is selected for compiling Flink (the Scala version should match the one actually used).

Flink compilation reference command:

```bash
mvn clean install -DskipTests -P scala-2.12 -Dfast -T 4 -Dmaven.compile.fork=true
```
:::
Note that the following classes need to be copied from version 1.12.2 to version 1.13.2:

```
org.apache.flink.table.client.config.entries.DeploymentEntry
org.apache.flink.table.client.config.entries.ExecutionEntry
org.apache.flink.table.client.gateway.local.CollectBatchTableSink
org.apache.flink.table.client.gateway.local.CollectStreamTableSink
```
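The copy step can be sketched as follows; `src_root`/`dst_root` are assumed paths to the 1.12.2 and 1.13.2 source trees (e.g. under flink-table/flink-sql-client/src/main/java), and the helper names are hypothetical:

```python
import shutil
from pathlib import Path

CLASSES = [
    "org.apache.flink.table.client.config.entries.DeploymentEntry",
    "org.apache.flink.table.client.config.entries.ExecutionEntry",
    "org.apache.flink.table.client.gateway.local.CollectBatchTableSink",
    "org.apache.flink.table.client.gateway.local.CollectStreamTableSink",
]

def class_to_path(fqcn: str) -> str:
    """Map a fully qualified class name to its .java source path."""
    return fqcn.replace(".", "/") + ".java"

def copy_classes(src_root: str, dst_root: str) -> None:
    """Copy each listed class file from the old tree into the new one."""
    for fqcn in CLASSES:
        rel = class_to_path(fqcn)
        dst = Path(dst_root) / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(Path(src_root) / rel, dst)
```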
org.apache.linkis.manager.label.conf.LabelCommonConfig file adjustment
```java
public static final CommonVars<String> SPARK_ENGINE_VERSION =
    CommonVars.apply("wds.linkis.spark.engine.version", "3.0.1");

public static final CommonVars<String> HIVE_ENGINE_VERSION =
    CommonVars.apply("wds.linkis.hive.engine.version", "3.1.2");
```
org.apache.linkis.governance.common.conf.GovernanceCommonConf file adjustment
```scala
val SPARK_ENGINE_VERSION = CommonVars("wds.linkis.spark.engine.version", "3.0.1")
val HIVE_ENGINE_VERSION = CommonVars("wds.linkis.hive.engine.version", "3.1.2")
```
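The defaults set in these two classes should agree with the engine labels configured in linkis-dml.sql earlier. Assuming the label format `<engine>-<version>`, a quick consistency check might look like this (the helper and sample values are illustrative):

```python
import re

def labels_from_dml(sql_text: str) -> dict:
    """Extract engine -> version from SET @X_LABEL="engine-version"; lines."""
    out = {}
    for m in re.finditer(r'SET @\w+_LABEL="([a-z_]+)-([^"]+)";', sql_text):
        out[m.group(1)] = m.group(2)
    return out

# Hypothetical check: dml labels vs. the defaults hard-coded in the classes above
dml = 'SET @SPARK_LABEL="spark-3.0.1"; SET @HIVE_LABEL="hive-3.1.2";'
configured = {"spark": "3.0.1", "hive": "3.1.2"}
assert labels_from_dml(dml) == configured
```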
| engine | version |
|---|---|
| hadoop | 3.1.1 |
| hive | 3.1.0 |
| spark | 2.3.2 |
| json4s.version | 3.2.11 |
For Linkis version < 1.3.2
```xml
<hadoop.version>3.1.1</hadoop.version>
<json4s.version>3.2.11</json4s.version>

<!-- replace hadoop-hdfs with hadoop-hdfs-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>
```
For Linkis version >= 1.3.2, we only need to set json4s.version if necessary
```xml
<json4s.version>3.2.11</json4s.version>
```
This is because we can compile directly with the hadoop-3.3 or hadoop-2.7 profile. Profile hadoop-3.3 can be used for any Hadoop 3.x (the default Hadoop 3.x version is 3.3.1), and profile hadoop-2.7 can be used for any Hadoop 2.x (the default Hadoop 2.x version is 2.7.2). Other Hadoop versions can be specified with -Dhadoop.version=xxx:
```bash
mvn -N install
mvn clean install -Phadoop-3.3 -Dmaven.test.skip=true
mvn clean install -Phadoop-3.3 -Dhadoop.version=3.1.1 -Dmaven.test.skip=true
```
```xml
<hive.version>3.1.0</hive.version>
```
For Linkis version < 1.3.2
```xml
<spark.version>2.3.2</spark.version>
```
For Linkis version >= 1.3.2
We can compile directly with the spark-3.2 profile; if we need to build against Hadoop 3, the hadoop-3.3 profile is also needed. The default Spark 3.x version is 3.2.1. When compiling with spark-3.2, the Scala version defaults to 2.12.15, so we do not need to set the Scala version in the Linkis project pom file (mentioned in 5.1.1). If Spark 2.x is used with Hadoop 3, the profile `spark-2.4-hadoop-3.3` needs to be activated for compatibility reasons:
```bash
mvn -N install
mvn clean install -Pspark-3.2 -Phadoop-3.3 -Dmaven.test.skip=true
mvn clean install -Pspark-2.4-hadoop-3.3 -Phadoop-3.3 -Dmaven.test.skip=true
```
org.apache.linkis.manager.label.conf.LabelCommonConfig file adjustment
```java
public static final CommonVars<String> SPARK_ENGINE_VERSION =
    CommonVars.apply("wds.linkis.spark.engine.version", "2.3.2");

public static final CommonVars<String> HIVE_ENGINE_VERSION =
    CommonVars.apply("wds.linkis.hive.engine.version", "3.1.0");
```
org.apache.linkis.governance.common.conf.GovernanceCommonConf file adjustment
```scala
val SPARK_ENGINE_VERSION = CommonVars("wds.linkis.spark.engine.version", "2.3.2")
val HIVE_ENGINE_VERSION = CommonVars("wds.linkis.hive.engine.version", "3.1.0")
```
```xml
<mirrors>
    <!-- mirror
     | Specifies a repository mirror site to use instead of a given repository. The repository that
     | this mirror serves has an ID that matches the mirrorOf element of this mirror. IDs are used
     | for inheritance and direct lookup purposes, and must be unique across the set of mirrors.
     |
    <mirror>
        <id>mirrorId</id>
        <mirrorOf>repositoryId</mirrorOf>
        <name>Human Readable Name for this Mirror.</name>
        <url>http://my.repository.com/repo/path</url>
    </mirror>
     -->
    <mirror>
        <id>nexus-aliyun</id>
        <mirrorOf>*,!cloudera</mirrorOf>
        <name>Nexus aliyun</name>
        <url>http://maven.aliyun.com/nexus/content/groups/public</url>
    </mirror>
    <mirror>
        <id>aliyunmaven</id>
        <mirrorOf>*,!cloudera</mirrorOf>
        <name>Alibaba Cloud Public Warehouse</name>
        <url>https://maven.aliyun.com/repository/public</url>
    </mirror>
    <mirror>
        <id>aliyunmaven</id>
        <mirrorOf>*,!cloudera</mirrorOf>
        <name>spring-plugin</name>
        <url>https://maven.aliyun.com/repository/spring-plugin</url>
    </mirror>
    <mirror>
        <id>maven-default-http-blocker</id>
        <mirrorOf>external:http:*</mirrorOf>
        <name>Pseudo repository to mirror external repositories initially using HTTP.</name>
        <url>http://0.0.0.0/</url>
        <blocked>true</blocked>
    </mirror>
</mirrors>
```
```xml
<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        <releases>
            <enabled>true</enabled>
        </releases>
    </repository>
    <!-- To prevent cloudera from not being found, add the Alibaba source -->
    <repository>
        <id>aliyun</id>
        <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
        <releases>
            <enabled>true</enabled>
        </releases>
    </repository>
</repositories>
```
| engine | version |
|---|---|
| hadoop | 2.6.0-cdh5.12.1 |
| zookeeper | 3.4.5-cdh5.12.1 |
| hive | 1.1.0-cdh5.12.1 |
| spark | 2.3.4 |
| flink | 1.12.4 |
| python | python3 |
```xml
<hadoop.version>2.6.0-cdh5.12.1</hadoop.version>
<zookeeper.version>3.4.5-cdh5.12.1</zookeeper.version>
<scala.version>2.11.8</scala.version>
```
```xml
<!-- update -->
<hive.version>1.1.0-cdh5.12.1</hive.version>
<!-- add -->
<package.hive.version>1.1.0_cdh5.12.1</package.hive.version>
```
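If you want to script this pom edit, here is a rough sketch; the helper is hypothetical, and it assumes the packaged directory name simply swaps `-cdh` for `_cdh` as shown above:

```python
import re

def set_hive_versions(pom_text: str, hive_version: str) -> str:
    """Update <hive.version> and add a <package.hive.version> next to it.

    The packaged directory name uses '_cdh' instead of '-cdh'.
    """
    package_version = hive_version.replace("-cdh", "_cdh")
    return re.sub(
        r"<hive\.version>[^<]*</hive\.version>",
        f"<hive.version>{hive_version}</hive.version>"
        f"\n    <package.hive.version>{package_version}</package.hive.version>",
        pom_text,
    )

# Hypothetical usage on a pom.xml <properties> fragment
pom = "<properties>\n    <hive.version>1.2.1</hive.version>\n</properties>"
print(set_hive_versions(pom, "1.1.0-cdh5.12.1"))
```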
```xml
<outputDirectory>/dist/v${package.hive.version}/lib</outputDirectory>
<outputDirectory>dist/v${package.hive.version}/conf</outputDirectory>
<outputDirectory>plugin/${package.hive.version}</outputDirectory>
```
Update the CustomerDelimitedJSONSerDe file:

```java
/* the hive version is too low, so these cases need to be commented out
case INTERVAL_YEAR_MONTH:
  {
    wc = ((HiveIntervalYearMonthObjectInspector) oi).getPrimitiveWritableObject(o);
    binaryData = Base64.encodeBase64(String.valueOf(wc).getBytes());
    break;
  }
case INTERVAL_DAY_TIME:
  {
    wc = ((HiveIntervalDayTimeObjectInspector) oi).getPrimitiveWritableObject(o);
    binaryData = Base64.encodeBase64(String.valueOf(wc).getBytes());
    break;
  }
*/
```
```xml
<flink.version>1.12.4</flink.version>
```
```xml
<spark.version>2.3.4</spark.version>
```
```xml
<python.version>python3</python.version>
```
org.apache.linkis.manager.label.conf.LabelCommonConfig file adjustment
```java
public static final CommonVars<String> SPARK_ENGINE_VERSION =
    CommonVars.apply("wds.linkis.spark.engine.version", "2.3.4");

public static final CommonVars<String> HIVE_ENGINE_VERSION =
    CommonVars.apply("wds.linkis.hive.engine.version", "1.1.0");

public static final CommonVars<String> PYTHON_ENGINE_VERSION =
    CommonVars.apply("wds.linkis.python.engine.version", "python3");
```
org.apache.linkis.governance.common.conf.GovernanceCommonConf file adjustment
```scala
val SPARK_ENGINE_VERSION = CommonVars("wds.linkis.spark.engine.version", "2.3.4")
val HIVE_ENGINE_VERSION = CommonVars("wds.linkis.hive.engine.version", "1.1.0")
val PYTHON_ENGINE_VERSION = CommonVars("wds.linkis.python.engine.version", "python3")
```
| engine | version |
|---|---|
| hadoop | 3.0.0-cdh6.3.2 |
| hive | 2.1.1-cdh6.3.2 |
| spark | 3.0.0 |
```xml
<hadoop.version>3.0.0-cdh6.3.2</hadoop.version>
<scala.version>2.12.10</scala.version>
```
```xml
<!-- replace hadoop-hdfs with hadoop-hdfs-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs-client</artifactId>
</dependency>
```
```xml
<!-- update -->
<hive.version>2.1.1-cdh6.3.2</hive.version>
<!-- add -->
<package.hive.version>2.1.1_cdh6.3.2</package.hive.version>
```
Update the distribution.xml file under assembly:
```xml
<outputDirectory>/dist/v${package.hive.version}/lib</outputDirectory>
<outputDirectory>dist/v${package.hive.version}/conf</outputDirectory>
<outputDirectory>plugin/${package.hive.version}</outputDirectory>
```
```xml
<spark.version>3.0.0</spark.version>
```
org.apache.linkis.manager.label.conf.LabelCommonConfig file adjustment
```java
public static final CommonVars<String> SPARK_ENGINE_VERSION =
    CommonVars.apply("wds.linkis.spark.engine.version", "3.0.0");

public static final CommonVars<String> HIVE_ENGINE_VERSION =
    CommonVars.apply("wds.linkis.hive.engine.version", "2.1.1_cdh6.3.2");
```
org.apache.linkis.governance.common.conf.GovernanceCommonConf file adjustment
```scala
val SPARK_ENGINE_VERSION = CommonVars("wds.linkis.spark.engine.version", "3.0.0")
val HIVE_ENGINE_VERSION = CommonVars("wds.linkis.hive.engine.version", "2.1.1_cdh6.3.2")
```