Usage

SedonaSQL supports many parameters. To change their values,

  1. Set it through SparkConf:
      import org.apache.spark.sql.SparkSession
      val sparkSession = SparkSession.builder().
        config("spark.serializer", "org.apache.spark.serializer.KryoSerializer").
        config("spark.kryo.registrator", "org.apache.sedona.core.serde.SedonaKryoRegistrator").
        config("sedona.global.index", "true").
        master("local[*]").appName("mySedonaSQLdemo").getOrCreate()
  2. Check your current SedonaSQL configuration:
      import org.apache.sedona.core.utils.SedonaConf
      val sedonaConf = new SedonaConf(sparkSession.conf)
      println(sedonaConf)
  3. Change Sedona parameters at runtime:
      sparkSession.conf.set("sedona.global.index", "false")

You can also add the spark prefix to the parameter name, for example:

sparkSession.conf.set("spark.sedona.global.index","false")

The advantage is that any parameter set with the spark prefix is honored by Spark itself, which means you can set these parameters beforehand via spark-defaults.conf or Spark on Kubernetes configuration.
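Why the prefix matters can be seen in how Spark gathers configuration from outside the application: a default SparkConf copies JVM system properties only when their names start with spark., and spark-submit similarly ignores --conf keys without that prefix. The following is a minimal sketch, assuming Spark is on the classpath; the object name PrefixPickupSketch and the helper pickedUp are hypothetical and exist only for illustration.

```scala
import org.apache.spark.SparkConf

object PrefixPickupSketch {
  // Hypothetical helper: sets `key` as a JVM system property (as a -D option
  // would), then reports whether a freshly created SparkConf contains it.
  def pickedUp(key: String): Boolean = {
    System.setProperty(key, "false")
    // A default SparkConf loads only system properties starting with "spark.".
    new SparkConf().contains(key)
  }

  def main(args: Array[String]): Unit = {
    println(pickedUp("spark.sedona.global.index"))
    println(pickedUp("sedona.global.index"))
  }
}
```

The first key should be visible to Spark and the second should not, which is why only the spark.sedona form is suitable for spark-defaults.conf.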

If you set the same parameter through both the sedona and spark.sedona prefixes, the value set through the sedona prefix overrides the one set through the spark.sedona prefix.
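The precedence rule can be pictured with a small lookup function. This is only an illustration in plain Scala, not Sedona's actual implementation; the object PrefixPrecedenceSketch and the helper resolve are hypothetical names.

```scala
object PrefixPrecedenceSketch {
  // Hypothetical helper: looks up a parameter by its name without any prefix
  // (e.g. "global.index"). The "sedona." key is consulted first, and the
  // "spark.sedona." key is used only as a fallback.
  def resolve(settings: Map[String, String], name: String): Option[String] =
    settings.get(s"sedona.$name").orElse(settings.get(s"spark.sedona.$name"))

  def main(args: Array[String]): Unit = {
    val both = Map(
      "sedona.global.index" -> "false",
      "spark.sedona.global.index" -> "true"
    )
    println(resolve(both, "global.index")) // prints Some(false)

    val sparkOnly = Map("spark.sedona.global.index" -> "true")
    println(resolve(sparkOnly, "global.index")) // prints Some(true)
  }
}
```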

General Parameters

| Parameter | Description | Default | Possible Values |
|---|---|---|---|
| sedona.global.index | Use a spatial index (currently supported only in SQL range joins and SQL distance joins) | true | true, false |
| sedona.global.indextype | Spatial index type; only valid when sedona.global.index is true | rtree | rtree, quadtree |
| spark.sedona.enableParserExtension | Enable the parser extension to parse the GEOMETRY data type in SQL DDL statements | true | true, false |

Join Parameters

| Parameter | Description | Default | Possible Values |
|---|---|---|---|
| sedona.join.autoBroadcastJoinThreshold | Maximum size in bytes for a table that will be broadcast to all worker nodes when performing a join. Set to -1 to disable automatic broadcasting. | Same as spark.sql.autoBroadcastJoinThreshold | Any integer with a byte suffix, e.g. 10MB or 512KB |
| sedona.join.gridtype | Spatial partitioning grid type for join query | kdbtree | quadtree, kdbtree |
| spark.sedona.join.knn.includeTieBreakers | KNN join will include all ties in the result, possibly returning more than k results | false | true, false |
| sedona.join.indexbuildside | (Advanced) The side which Sedona builds spatial indices on | left | left, right |
| sedona.join.numpartition | (Advanced) Number of partitions for both sides in a join query | -1 (use existing partitions) | Any integer |
| sedona.join.spatitionside | (Advanced) The dominant side in spatial partitioning stage | left | left, right |
| sedona.join.optimizationmode | (Advanced) When Sedona should optimize spatial join SQL queries | nonequi | all (always optimize, even equi-joins), none (disable optimization), nonequi (optimize non-equi-joins only) |
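As a usage sketch for the join parameters, the snippet below applies a few of them to an existing session with the same conf.set call shown in the Usage section. It assumes Spark is on the classpath; the object JoinParamsSketch and the helper applyJoinSettings are hypothetical names, and the chosen values are examples rather than recommendations.

```scala
import org.apache.spark.sql.SparkSession

object JoinParamsSketch {
  // Hypothetical helper: sets a few join parameters from the table above.
  def applyJoinSettings(spark: SparkSession): Unit = {
    // Partition both join sides with a quadtree grid instead of the default kdbtree.
    spark.conf.set("sedona.join.gridtype", "quadtree")
    // Broadcast tables up to 10MB; the value uses a byte suffix.
    spark.conf.set("sedona.join.autoBroadcastJoinThreshold", "10MB")
    // Let KNN joins return all ties, possibly more than k rows per query geometry.
    spark.conf.set("spark.sedona.join.knn.includeTieBreakers", "true")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("joinParamsSketch")
      .getOrCreate()

    applyJoinSettings(spark)
    println(spark.conf.get("sedona.join.gridtype"))

    spark.stop()
  }
}
```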

CRS Transformation Parameters

| Parameter | Description | Default | Possible Values | Since |
|---|---|---|---|---|
| spark.sedona.crs.geotools | Controls which library is used for CRS transformations in ST_Transform | raster | none (proj4sedona for all), raster (proj4sedona for vector, GeoTools for raster), all (GeoTools for all, legacy) | v1.9.0 |
| spark.sedona.crs.url.base | Base URL of a CRS definition server for resolving authority codes (e.g., EPSG) via HTTP. When set, ST_Transform will consult this URL provider before the built-in definitions. | (empty, disabled) | e.g. https://crs.example.com | v1.9.0 |
| spark.sedona.crs.url.pathTemplate | URL path template appended to spark.sedona.crs.url.base. Placeholders {authority} and {code} are replaced at runtime. | /{authority}/{code}.json | e.g. /epsg/{code}.json | v1.9.0 |
| spark.sedona.crs.url.format | The CRS definition format returned by the URL provider | projjson | projjson, proj, wkt1 (OGC WKT1), wkt2 (ISO 19162 WKT2) | v1.9.0 |
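To make the interaction between spark.sedona.crs.url.base and spark.sedona.crs.url.pathTemplate concrete, here is a plain Scala sketch of the placeholder substitution. It is not Sedona's implementation: the object CrsUrlSketch and the helper buildUrl are hypothetical, and the sketch assumes the authority and code are inserted exactly as given, whereas Sedona may normalize them differently.

```scala
object CrsUrlSketch {
  // Hypothetical helper: appends the path template to the base URL and
  // substitutes the {authority} and {code} placeholders literally.
  def buildUrl(base: String, pathTemplate: String, authority: String, code: String): String =
    base.stripSuffix("/") +
      pathTemplate.replace("{authority}", authority).replace("{code}", code)

  def main(args: Array[String]): Unit = {
    // Default template from the table above.
    println(buildUrl("https://crs.example.com", "/{authority}/{code}.json", "epsg", "4326"))
    // prints https://crs.example.com/epsg/4326.json

    // Template that hard-codes the authority and uses only {code}.
    println(buildUrl("https://crs.example.com", "/epsg/{code}.json", "epsg", "3857"))
    // prints https://crs.example.com/epsg/3857.json
  }
}
```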