Spark configuration

| Module Name (Service Name) | Parameter Name | Default Value | Description | Used |
| --- | --- | --- | --- | --- |
| spark | linkis.spark.yarn.cluster.jars | hdfs:///spark/cluster | spark.yarn.cluster.jars | |
| spark | linkis.spark.etl.support.hudi | false | spark.etl.support.hudi | |
| spark | linkis.bgservice.store.prefix | hdfs:///tmp/bdp-ide/ | bgservice.store.prefix | |
| spark | linkis.bgservice.store.suffix | | bgservice.store.suffix | |
| spark | wds.linkis.dolphin.decimal.precision | 32 | dolphin.decimal.precision | |
| spark | wds.linkis.dolphin.decimal.scale | 10 | dolphin.decimal.scale | |
| spark | wds.linkis.park.extension.max.pool | 2 | extension.max.pool | |
| spark | wds.linkis.process.threadpool.max | 100 | process.threadpool.max | |
| spark | wds.linkis.engine.spark.session.hook | | spark.session.hook | |
| spark | wds.linkis.engine.spark.spark-loop.init.time | 120s | spark.spark-loop.init.time | |
| spark | wds.linkis.engine.spark.language-repl.init.time | 30s | spark.language-repl.init.time | |
| spark | wds.linkis.spark.sparksubmit.path | spark-submit | spark.sparksubmit.path | |
| spark | wds.linkis.spark.output.line.limit | 10 | spark.output.line.limit | |
| spark | wds.linkis.spark.useHiveContext | true | spark.useHiveContext | |
| spark | wds.linkis.enginemanager.core.jar | | enginemanager.core.jar | |
| spark | wds.linkis.ecp.spark.default.jar | linkis-engineconn-core-1.2.0.jar | spark.default.jar | |
| spark | wds.linkis.dws.ujes.spark.extension.timeout | 3000L | spark.extension.timeout | |
| spark | wds.linkis.engine.spark.fraction.length | 30 | spark.fraction.length | |
| spark | wds.linkis.show.df.max.res | | show.df.max.res | |
| spark | wds.linkis.mdq.application.name | linkis-ps-datasource | mdq.application.name | |
| spark | wds.linkis.dolphin.limit.len | 5000 | dolphin.limit.len | |
| spark | wds.linkis.spark.engine.is.viewfs.env | true | spark.engine.is.viewfs.env | |
| spark | wds.linkis.spark.engineconn.fatal.log | error writing class;OutOfMemoryError | spark.engineconn.fatal.log | |
| spark | wds.linkis.spark.engine.scala.replace_package_header.enable | true | spark.engine.scala.replace_package_header.enable | |
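As an illustrative sketch of how these defaults can be overridden, the snippet below writes two of the keys from the table above into a properties file. The file path here is a stand-in for demonstration; in a real deployment the target would be the Spark engine plugin's properties file (or the management console), so adjust the path to your install.

```shell
# Sketch only: override two of the table's defaults in a properties file.
# /tmp path is illustrative; a real deployment uses the engine plugin's conf.
PROPS=/tmp/linkis-engineconn-demo.properties
cat > "$PROPS" <<'EOF'
# raise the result-set line limit from the default of 10
wds.linkis.spark.output.line.limit=200
# skip HiveContext creation when Hive is not deployed
wds.linkis.spark.useHiveContext=false
EOF
cat "$PROPS"
```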

To use Spark yarn-cluster mode, set the label `"engingeConnRuntimeMode": "yarnCluster"` and upload the Spark dependencies to the path configured by `linkis.spark.yarn.cluster.jars` (default: `hdfs:///spark/cluster`). The Spark dependencies include both jars and configuration files, for example `/appcom/Install/linkis/lib/linkis-engineconn-plugins/spark/dist/3.2.1/lib/*.jar` and `/appcom/Install/linkis/conf/*`.
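The upload step above can be sketched with standard HDFS shell commands. The local paths are the examples from the text and the HDFS path is the documented default; adjust both to your deployment.

```shell
# Sketch: upload Spark engine dependencies for yarn-cluster mode.
CLUSTER_JARS=hdfs:///spark/cluster   # default of linkis.spark.yarn.cluster.jars
hdfs dfs -mkdir -p "$CLUSTER_JARS"
# engine plugin jars
hdfs dfs -put /appcom/Install/linkis/lib/linkis-engineconn-plugins/spark/dist/3.2.1/lib/*.jar "$CLUSTER_JARS"
# Linkis configuration files
hdfs dfs -put /appcom/Install/linkis/conf/* "$CLUSTER_JARS"
```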

Precaution for using yarnCluster: if the Eureka URL uses `127.0.0.1`, it must be changed to the real hostname; for example, `127.0.0.1:20303/eureka/` should be changed to `wds001:20303/eureka/`.
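The hostname substitution can be done with `sed`. The snippet below demonstrates it on a throwaway file; the property key and file path are illustrative assumptions, since the Eureka URL's location varies by deployment (search your `conf` directory for `127.0.0.1:20303` to find the real file).

```shell
# Demo of the substitution on a scratch file (key/path are illustrative).
CONF=/tmp/linkis-eureka-demo.properties
printf 'wds.linkis.eureka.defaultZone=http://127.0.0.1:20303/eureka/\n' > "$CONF"
# replace the loopback host with the real hostname (wds001 as in the example)
sed -i 's#127\.0\.0\.1:20303#wds001:20303#g' "$CONF"
cat "$CONF"
```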

The spark-excel package may cause class conflicts, so it needs to be downloaded separately and placed in the Spark engine's lib directory:

```shell
wget https://repo1.maven.org/maven2/com/crealytics/spark-excel-2.12.17-3.2.2_2.12/3.2.2_0.18.1/spark-excel-2.12.17-3.2.2_2.12-3.2.2_0.18.1.jar
cp spark-excel-2.12.17-3.2.2_2.12-3.2.2_0.18.1.jar {LINKIS_HOME}/lib/linkis-engineconn-plugins/spark/dist/3.2.1/lib
```

Native rocketmq-spark does not support Spark 3, so its source code needs to be modified; a pre-modified jar can be downloaded directly from https://github.com/ChengJie1053/spark3-rocketmq-connector-jar