<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# CRS Transformation
Sedona provides coordinate reference system (CRS) transformation through the `ST_Transform` function. Since v1.9.0, Sedona uses the proj4sedona library, a pure Java implementation that supports multiple CRS input formats and grid-based transformations.
## Supported CRS Formats
Sedona supports the following formats for specifying source and target coordinate reference systems:
### Authority Code
The most common way to specify a CRS is using an authority code in the format `AUTHORITY:CODE`. Sedona uses [spatialreference.org](https://spatialreference.org/projjson_index.json) as an open-source CRS database, which supports multiple authorities:
| Authority | Description | Example |
|-----------|-------------|---------|
| EPSG | European Petroleum Survey Group | `EPSG:4326`, `EPSG:3857` |
| ESRI | Esri coordinate systems | `ESRI:102008`, `ESRI:54012` |
| IAU | International Astronomical Union (planetary CRS) | `IAU:30100` |
| SR-ORG | User-contributed definitions | `SR-ORG:6864` |
```sql
-- Transform from WGS84 (EPSG:4326) to Web Mercator (EPSG:3857)
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'EPSG:4326',
'EPSG:3857'
) AS transformed_point
```
Output:
```
POINT (-13627665.271218014 4548257.702387721)
```
```sql
-- Transform using ESRI authority code (North America Albers Equal Area Conic)
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'EPSG:4326',
'ESRI:102008'
) AS transformed_point
```
```sql
-- Transform from WGS84 to UTM Zone 10N (EPSG:32610)
SELECT ST_Transform(
ST_GeomFromText('POLYGON((-122.5 37.5, -122.5 38.0, -122.0 38.0, -122.0 37.5, -122.5 37.5))'),
'EPSG:4326',
'EPSG:32610'
) AS transformed_polygon
```
You can browse available CRS codes at [spatialreference.org](https://spatialreference.org/projjson_index.json) or [EPSG.io](https://epsg.io/).
### WKT1 (OGC Well-Known Text)
WKT1 is the OGC Well-Known Text format for CRS definitions. It starts with `PROJCS[...]` for projected CRS or `GEOGCS[...]` for geographic CRS.
```sql
-- Transform using WKT1 format for target CRS
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'EPSG:4326',
'PROJCS["WGS 84 / Pseudo-Mercator",
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433]],
PROJECTION["Mercator_1SP"],
PARAMETER["central_meridian",0],
PARAMETER["scale_factor",1],
PARAMETER["false_easting",0],
PARAMETER["false_northing",0],
UNIT["metre",1],
AUTHORITY["EPSG","3857"]]'
) AS transformed_point
```
### WKT2 (ISO 19162:2019)
WKT2 is the modern ISO 19162:2019 standard format. It starts with `PROJCRS[...]` for projected CRS or `GEOGCRS[...]` for geographic CRS.
```sql
-- Transform using WKT2 format for target CRS
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'EPSG:4326',
'PROJCRS["WGS 84 / UTM zone 10N",
BASEGEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563]]],
CONVERSION["UTM zone 10N",
METHOD["Transverse Mercator"],
PARAMETER["Latitude of natural origin",0],
PARAMETER["Longitude of natural origin",-123],
PARAMETER["Scale factor at natural origin",0.9996],
PARAMETER["False easting",500000],
PARAMETER["False northing",0]],
CS[Cartesian,2],
AXIS["easting",east],
AXIS["northing",north],
UNIT["metre",1],
ID["EPSG",32610]]'
) AS transformed_point
```
### PROJ String
PROJ strings provide a compact way to define CRS using projection parameters. They start with `+proj=`.
```sql
-- Transform using PROJ string for UTM Zone 10N
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'+proj=longlat +datum=WGS84 +no_defs',
'+proj=utm +zone=10 +datum=WGS84 +units=m +no_defs'
) AS transformed_point
```
```sql
-- Transform using PROJ string for Lambert Conformal Conic
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'EPSG:4326',
'+proj=lcc +lat_1=33 +lat_2=45 +lat_0=39 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs'
) AS transformed_point
```
### PROJJSON
PROJJSON is a JSON representation of CRS, useful when working with JSON-based workflows.
```sql
-- Transform using PROJJSON for target CRS
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'EPSG:4326',
'{
"type": "ProjectedCRS",
"name": "WGS 84 / UTM zone 10N",
"base_crs": {
"name": "WGS 84",
"datum": {
"type": "GeodeticReferenceFrame",
"name": "World Geodetic System 1984",
"ellipsoid": {
"name": "WGS 84",
"semi_major_axis": 6378137,
"inverse_flattening": 298.257223563
}
},
"coordinate_system": {
"subtype": "ellipsoidal",
"axis": [
{"name": "Longitude", "abbreviation": "lon", "direction": "east", "unit": "degree"},
{"name": "Latitude", "abbreviation": "lat", "direction": "north", "unit": "degree"}
]
}
},
"conversion": {
"name": "UTM zone 10N",
"method": {"name": "Transverse Mercator"},
"parameters": [
{"name": "Latitude of natural origin", "value": 0, "unit": "degree"},
{"name": "Longitude of natural origin", "value": -123, "unit": "degree"},
{"name": "Scale factor at natural origin", "value": 0.9996},
{"name": "False easting", "value": 500000, "unit": "metre"},
{"name": "False northing", "value": 0, "unit": "metre"}
]
},
"coordinate_system": {
"subtype": "Cartesian",
"axis": [
{"name": "Easting", "abbreviation": "E", "direction": "east", "unit": "metre"},
{"name": "Northing", "abbreviation": "N", "direction": "north", "unit": "metre"}
]
},
"id": {"authority": "EPSG", "code": 32610}
}'
) AS transformed_point
```
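Each format above has a recognizable leading token, so a string can be classified before it is passed to `ST_Transform`. The following is a minimal sketch of such a pre-check. The function name `guess_crs_format` is hypothetical, and this is not how Sedona itself detects formats; it only mirrors the prefixes described in this section.

```python
import json


def guess_crs_format(crs: str) -> str:
    """Classify a CRS string by its leading token (illustrative heuristic only)."""
    text = crs.strip()
    upper = text.upper()
    if text.startswith("{"):
        json.loads(text)  # raises ValueError if the text is not valid JSON
        return "projjson"
    if upper.startswith("+PROJ="):
        return "proj"
    if upper.startswith(("PROJCRS[", "GEOGCRS[")):
        return "wkt2"
    if upper.startswith(("PROJCS[", "GEOGCS[")):
        return "wkt1"
    # Anything left is expected to look like AUTHORITY:CODE, e.g. EPSG:4326.
    authority, sep, code = text.partition(":")
    if sep and authority and code and " " not in text:
        return "authority_code"
    raise ValueError(f"Unrecognized CRS string: {text[:40]!r}")


print(guess_crs_format("EPSG:4326"))
print(guess_crs_format("+proj=utm +zone=10 +datum=WGS84 +units=m +no_defs"))
```

A check like this can catch obviously malformed input early, but it does not validate that the definition itself is complete or correct.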
## URL CRS Provider
Since v1.9.0, Sedona supports resolving CRS definitions from a remote HTTP server. This is useful when you need custom or internal CRS definitions that are not included in the built-in database, or when you want to use your own CRS definition service.
When configured, the URL provider is consulted **before** the built-in CRS database. If the URL provider returns a valid CRS definition, it is used directly. If the URL returns a 404 or an error, Sedona falls back to the built-in definitions.
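The lookup order described above can be sketched in plain Python. This is an illustration of the documented behavior, not Sedona's implementation: `resolve_crs`, `BUILT_IN`, and the `fetch_remote` callable are hypothetical names, a return value of `None` stands in for an HTTP 404, and `OSError` stands in for any network error.

```python
from typing import Callable, Dict, Optional

# Stand-in for the built-in CRS database (hypothetical contents).
BUILT_IN: Dict[str, str] = {
    "EPSG:4326": "+proj=longlat +datum=WGS84 +no_defs",
}


def resolve_crs(
    code: str,
    fetch_remote: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Try the URL provider first, then fall back to built-in definitions."""
    if fetch_remote is not None:
        try:
            definition = fetch_remote(code)  # None models an HTTP 404
        except OSError:
            definition = None  # network failure: fall through to built-ins
        if definition:
            return definition
    if code in BUILT_IN:
        return BUILT_IN[code]
    raise LookupError(f"No CRS definition found for {code}")


# A dict's .get method serves as a fake remote provider here.
remote = {"MYORG:1001": "+proj=merc +lon_0=0 +datum=WGS84 +units=m +no_defs"}
print(resolve_crs("MYORG:1001", remote.get))  # served by the URL provider
print(resolve_crs("EPSG:4326", remote.get))   # falls back to the built-in entry
```

Because the remote source is consulted first, a definition hosted under a code that also exists in the built-in database would take precedence over the built-in one.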
### Hosting CRS definitions
You can host your custom CRS definitions on any HTTP-accessible location. Two common approaches:
- **GitHub repository**: Store CRS definition files in a public GitHub repo and use the raw content URL. This is the easiest way to get started because no server infrastructure is required.
- **Public S3 bucket**: Upload CRS definition files to an Amazon S3 bucket with public read access and use the S3 static website URL or a CloudFront distribution URL.
Each file should contain a single CRS definition in the format you specify via `spark.sedona.crs.url.format` (PROJJSON, PROJ string, WKT1, or WKT2).
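As a rough sketch of preparing such files, the helper below writes one definition per file into an `<authority>/<code><extension>` layout that can then be pushed to a repository or uploaded to a bucket. The function name `write_crs_files` and the extension mapping are assumptions made for illustration (only `.json` and `.proj` appear elsewhere on this page); whatever extension you choose has to match your `spark.sedona.crs.url.pathTemplate`.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical extension per format; adjust to match your path template.
EXTENSIONS = {"projjson": ".json", "proj": ".proj", "wkt1": ".wkt", "wkt2": ".wkt"}


def write_crs_files(root: Path, definitions: Dict[str, str], fmt: str = "proj") -> List[Path]:
    """Write one CRS definition per file under root/<authority>/<code><ext>."""
    written = []
    for crs, definition in sorted(definitions.items()):
        authority, _, code = crs.partition(":")
        target = Path(root) / authority.lower() / f"{code}{EXTENSIONS[fmt]}"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(definition.strip(), encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__":
    files = write_crs_files(
        Path("crs-definitions"),
        {"EPSG:990001": "+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 "
                        "+x_0=0 +y_0=0 +k=1 +units=m +no_defs"},
    )
    print([f.as_posix() for f in files])
```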
### Configuration
Set the following Spark configuration properties when creating your Sedona session:
```python
from sedona.spark import SedonaContext

config = (
SedonaContext.builder()
.config("spark.sedona.crs.url.base", "https://crs.example.com")
.config("spark.sedona.crs.url.pathTemplate", "/{authority}/{code}.json")
.config("spark.sedona.crs.url.format", "projjson")
.getOrCreate()
)
sedona = SedonaContext.create(config)
```
With the default path template, resolving `EPSG:4326` will fetch:
```
https://crs.example.com/epsg/4326.json
```
Only `spark.sedona.crs.url.base` is required. The other two properties have sensible defaults (`/{authority}/{code}.json` and `projjson`).
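To make the URL construction concrete, here is a small sketch that expands a path template for an `AUTHORITY:CODE` string. The helper `build_crs_url` is hypothetical and only mirrors the behavior documented on this page (the `{authority}` placeholder is lowercased, as noted in the custom authority example). Stripping a trailing slash from the base URL is an assumption of this sketch, not documented behavior.

```python
DEFAULT_TEMPLATE = "/{authority}/{code}.json"


def build_crs_url(base: str, crs: str, path_template: str = DEFAULT_TEMPLATE) -> str:
    """Expand a path template for an AUTHORITY:CODE string and append it to base."""
    authority, sep, code = crs.partition(":")
    if not (sep and authority and code):
        raise ValueError(f"Expected AUTHORITY:CODE, got {crs!r}")
    path = path_template.replace("{authority}", authority.lower()).replace("{code}", code)
    # Assumption: avoid a double slash when the base ends with "/".
    return base.rstrip("/") + path


print(build_crs_url("https://crs.example.com", "EPSG:4326"))
# https://crs.example.com/epsg/4326.json
```

Running a helper like this against your own base URL and template is a quick way to confirm that the files you host sit at the paths Sedona will request.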
### Supported response formats
| Format value | Description | Content example |
|-------------|-------------|----------------|
| `projjson` | PROJJSON (default) | `{"type": "GeographicCRS", ...}` |
| `proj` | PROJ string | `+proj=longlat +datum=WGS84 +no_defs` |
| `wkt1` | OGC WKT1 | `GEOGCS["WGS 84", ...]` |
| `wkt2` | ISO 19162 WKT2 | `GEOGCRS["WGS 84", ...]` |
### Example: GitHub repository
Suppose you have a GitHub repo `myorg/crs-definitions` with the following structure:
```
crs-definitions/
epsg/
990001.proj
990002.proj
```
where `epsg/990001.proj` contains a PROJ string like:
```
+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +no_defs
```
Point Sedona to the raw GitHub content URL:
```python
from sedona.spark import SedonaContext

config = (
SedonaContext.builder()
.config(
"spark.sedona.crs.url.base",
"https://raw.githubusercontent.com/myorg/crs-definitions/main",
)
.config("spark.sedona.crs.url.pathTemplate", "/epsg/{code}.proj")
.config("spark.sedona.crs.url.format", "proj")
.getOrCreate()
)
sedona = SedonaContext.create(config)
# Resolves EPSG:990001 from:
# https://raw.githubusercontent.com/myorg/crs-definitions/main/epsg/990001.proj
sedona.sql("""
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'EPSG:4326',
'EPSG:990001'
) AS transformed_point
""").show()
```
### Example: self-hosted CRS server
```python
from sedona.spark import SedonaContext

config = (
SedonaContext.builder()
.config("spark.sedona.crs.url.base", "https://crs.mycompany.com")
.config("spark.sedona.crs.url.pathTemplate", "/epsg/{code}.proj")
.config("spark.sedona.crs.url.format", "proj")
.getOrCreate()
)
sedona = SedonaContext.create(config)
# Now ST_Transform will try https://crs.mycompany.com/epsg/3857.proj
# before falling back to built-in definitions
sedona.sql("""
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'EPSG:4326',
'EPSG:3857'
) AS transformed_point
""").show()
```
### Example: custom authority codes
The URL provider is especially useful for custom or internal authority codes that are not in any public database. With the default path template `/{authority}/{code}.json`, the `{authority}` placeholder is replaced by the authority name from the CRS string (lowercased):
```python
from sedona.spark import SedonaContext

config = (
SedonaContext.builder()
.config("spark.sedona.crs.url.base", "https://crs.mycompany.com")
.config("spark.sedona.crs.url.format", "proj")
.getOrCreate()
)
sedona = SedonaContext.create(config)
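# Resolves MYORG:1001 from:
# https://crs.mycompany.com/myorg/1001.json
# The ".json" suffix comes from the default path template; the response body
# is parsed according to spark.sedona.crs.url.format ("proj" here).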
sedona.sql("""
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'EPSG:4326',
'MYORG:1001'
) AS transformed_point
""").show()
```
### Example: using geometry SRID with URL provider
If the geometry already has an SRID set (e.g., via `ST_SetSRID`), you can omit the source CRS parameter. The source CRS is derived from the geometry's SRID as an EPSG code:
```python
from sedona.spark import SedonaContext

config = (
SedonaContext.builder()
.config("spark.sedona.crs.url.base", "https://crs.mycompany.com")
.config("spark.sedona.crs.url.format", "proj")
.getOrCreate()
)
sedona = SedonaContext.create(config)
# The source CRS is taken from the geometry's SRID (4326 → EPSG:4326).
# Only the target CRS string is needed.
sedona.sql("""
SELECT ST_Transform(
ST_SetSRID(ST_GeomFromText('POINT(-122.4194 37.7749)'), 4326),
'EPSG:3857'
) AS transformed_point
""").show()
```
### Disabling the URL provider
To keep the URL provider disabled, omit `spark.sedona.crs.url.base` or leave it as an empty string (the default). Note that once a URL provider has been registered in an executor JVM, it remains active for the lifetime of that JVM, so clearing the property in a running application does not turn it off.
See also: [Configuration parameters](Parameter.md#crs-transformation) for the full list of URL CRS provider settings.
## Grid File Support
Grid files enable high-accuracy datum transformations, such as NAD27 to NAD83 or OSGB36 to ETRS89. Sedona supports loading grid files from multiple sources.
### Grid File Sources
Grid files can be specified using the `+nadgrids` parameter in PROJ strings:
| Source | Format | Example |
|--------|--------|---------|
| Local file | Absolute path | `+nadgrids=/path/to/grid.gsb` |
| PROJ CDN | `@` prefix | `+nadgrids=@us_noaa_conus.tif` |
| HTTPS URL | Full URL | `+nadgrids=https://cdn.proj.org/us_noaa_conus.tif` |
When using the `@` prefix, grid files are automatically fetched from [PROJ CDN](https://cdn.proj.org/).
### Optional vs Mandatory Grids
- **`@` prefix (optional)**: The transformation continues without the grid if it's unavailable. Use this when the grid improves accuracy but isn't required.
- **No prefix (mandatory)**: An error is thrown if the grid file cannot be found.
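The distinction between optional and mandatory grids can be checked before running a job by inspecting the PROJ string. The sketch below is illustrative only: `parse_nadgrids` and `GridRef` are hypothetical names, and splitting on commas follows the upstream PROJ convention for listing several grids, which this page does not confirm for Sedona.

```python
from typing import List, NamedTuple


class GridRef(NamedTuple):
    name: str       # grid file name, path, or URL without the "@" prefix
    optional: bool  # True when the reference was prefixed with "@"


def parse_nadgrids(proj_string: str) -> List[GridRef]:
    """Extract grid references from the +nadgrids parameter of a PROJ string."""
    prefix = "+nadgrids="
    for token in proj_string.split():
        if token.startswith(prefix):
            value = token[len(prefix):]
            return [
                GridRef(name=item.lstrip("@"), optional=item.startswith("@"))
                for item in value.split(",")
                if item
            ]
    return []  # no +nadgrids parameter present


for grid in parse_nadgrids(
    "+proj=longlat +datum=NAD27 +no_defs +nadgrids=@us_noaa_conus.tif"
):
    kind = "optional" if grid.optional else "mandatory"
    print(f"{grid.name}: {kind}")
```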
### SQL Examples with Grid Files
```sql
-- Transform NAD27 to NAD83 using PROJ CDN grid (optional)
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'+proj=longlat +datum=NAD27 +no_defs +nadgrids=@us_noaa_conus.tif',
'EPSG:4269'
) AS transformed_point
```
```sql
-- Transform using mandatory grid file (error if not found)
SELECT ST_Transform(
ST_GeomFromText('POINT(-122.4194 37.7749)'),
'+proj=longlat +datum=NAD27 +no_defs +nadgrids=us_noaa_conus.tif',
'EPSG:4269'
) AS transformed_point
```
```sql
-- Transform OSGB36 to ETRS89 using UK grid
SELECT ST_Transform(
ST_GeomFromText('POINT(-0.1276 51.5074)'),
'+proj=longlat +datum=OSGB36 +nadgrids=@uk_os_OSTN15_NTv2_OSGBtoETRS.gsb +no_defs',
'EPSG:4258'
) AS transformed_point
```
## Coordinate Order
Sedona expects geometries to be in **longitude/latitude (lon/lat)** order. If your data is in lat/lon order, use `ST_FlipCoordinates` to swap the coordinates before transformation.
```sql
-- If your data is in lat/lon order, flip first
SELECT ST_Transform(
ST_FlipCoordinates(ST_GeomFromText('POINT(37.7749 -122.4194)')),
'EPSG:4326',
'EPSG:3857'
) AS transformed_point
```
Sedona also normalizes the axis order of the CRS definitions themselves: even when a definition declares latitude first (as the official EPSG:4326 definition does), the source and target CRS are treated as lon/lat internally.
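When it is unclear which order a dataset uses, a simple range check can sometimes settle it for geographic coordinates in degrees. The function below is a hypothetical helper, not part of Sedona: it reports lat/lon order only when the second values cannot be latitudes, and returns `False` for ambiguous data where both values fit within [-90, 90].

```python
from typing import Iterable, Tuple


def likely_lat_lon_order(coords: Iterable[Tuple[float, float]]) -> bool:
    """Return True when the pairs can only be valid if read as (lat, lon).

    Latitude must lie in [-90, 90] and longitude in [-180, 180], so a second
    value outside [-90, 90] cannot be a latitude.
    """
    pairs = list(coords)
    first_fits_lat = all(abs(a) <= 90 for a, _ in pairs)
    second_exceeds_lat = any(abs(b) > 90 for _, b in pairs)
    second_fits_lon = all(abs(b) <= 180 for _, b in pairs)
    return first_fits_lat and second_exceeds_lat and second_fits_lon


print(likely_lat_lon_order([(37.7749, -122.4194)]))   # True: flip before transforming
print(likely_lat_lon_order([(-122.4194, 37.7749)]))   # False: already lon/lat
```

A `True` result suggests applying `ST_FlipCoordinates` first; a `False` result is not proof that the data is in lon/lat order, only that the range check found no evidence against it.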
## Using Geometry SRID
If the geometry already has an SRID set, you can omit the source CRS parameter:
```sql
-- Set SRID on geometry and transform using only target CRS
SELECT ST_Transform(
ST_SetSRID(ST_GeomFromText('POINT(-122.4194 37.7749)'), 4326),
'EPSG:3857'
) AS transformed_point
```
## See Also
- [ST_Transform](Spatial-Reference-System/ST_Transform.md) - Function reference
- [ST_SetSRID](Spatial-Reference-System/ST_SetSRID.md) - Set the SRID of a geometry
- [ST_SRID](Spatial-Reference-System/ST_SRID.md) - Get the SRID of a geometry
- [ST_FlipCoordinates](Geometry-Editors/ST_FlipCoordinates.md) - Swap X and Y coordinates