---
layout: global
title: Geospatial (Geometry/Geography) Types
displayTitle: Geospatial (Geometry/Geography) Types
license: |
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0
---
Spark SQL supports GEOMETRY and GEOGRAPHY types for spatial data, as defined in the Open Geospatial Consortium (OGC) Simple Feature Access specification. At runtime, values are represented as Well-Known Binary (WKB) and are associated with a Spatial Reference Identifier (SRID) that defines the coordinate system. How values are persisted is determined by each data source.
| Type | Coordinate system | Typical use and notes |
|---|---|---|
| GEOMETRY | Cartesian (planar) | Projected or local coordinates; planar calculations. Represents points, lines, polygons in a flat coordinate system. Suitable for Web Mercator (SRID 3857), UTM, or local grids (e.g. engineering/CAD). Default SRID in Spark is 4326. |
| GEOGRAPHY | Geographic (latitude/longitude) | Earth-based data; distances and areas on the sphere/ellipsoid. Coordinates in longitude and latitude (degrees). Edge interpolation is always SPHERICAL. Default SRID is 4326 (WGS 84). |
Choose GEOMETRY when:
Choose GEOGRAPHY when:
Using the wrong type can give misleading results: for example, the shortest path between London and New York on a sphere crosses Canada, whereas a planar GEOMETRY may suggest a path that does not.
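To make the difference concrete, the following is a minimal Python sketch (standard library only, not Spark code) that compares the midpoint of a great-circle path with the midpoint of a straight line drawn in longitude/latitude space. The city coordinates are approximate and the helper names are hypothetical; the sketch assumes a perfect sphere rather than the WGS 84 ellipsoid.

```python
import math

def to_unit_vector(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to a 3D unit vector on the sphere."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def great_circle_midpoint(p, q):
    """Midpoint of the shortest path on a sphere between two (lon, lat) points."""
    a, b = to_unit_vector(*p), to_unit_vector(*q)
    sx, sy, sz = a[0] + b[0], a[1] + b[1], a[2] + b[2]
    norm = math.sqrt(sx * sx + sy * sy + sz * sz)
    lat = math.degrees(math.asin(sz / norm))
    lon = math.degrees(math.atan2(sy, sx))
    return lon, lat

def planar_midpoint(p, q):
    """Midpoint when longitude/latitude are treated as flat x/y coordinates."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

# Approximate (longitude, latitude) values, assumed for illustration only.
london = (-0.1276, 51.5074)
new_york = (-74.0060, 40.7128)

print("spherical midpoint:", great_circle_midpoint(london, new_york))
print("planar midpoint:   ", planar_midpoint(london, new_york))
```

The spherical midpoint lies several degrees of latitude north of the planar one, which is why the two types can disagree about where a path runs.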
In SQL you must specify the type with an SRID or ANY:
- `GEOMETRY(srid)`, e.g. `GEOMETRY(4326)`, `GEOMETRY(3857)`
- `GEOGRAPHY(srid)`, e.g. `GEOGRAPHY(4326)`
- `GEOMETRY(ANY)`
- `GEOGRAPHY(ANY)`

Unparameterized `GEOMETRY` or `GEOGRAPHY` (without `(srid)` or `(ANY)`) is not supported in SQL.
```sql
-- Fixed SRID: all values must use the given SRID (e.g. WGS 84)
CREATE TABLE points (
  id BIGINT,
  pt GEOMETRY(4326)
);

CREATE TABLE locations (
  id BIGINT,
  loc GEOGRAPHY(4326)
);

-- Mixed SRID: each row can have a different SRID
CREATE TABLE mixed_geoms (
  id BIGINT,
  geom GEOMETRY(ANY)
);
```
Values are created from Well-Known Binary (WKB) using built-in functions. WKB is a standard binary encoding for spatial shapes (points, lines, polygons, etc.). See Well-known binary for the format.
From WKB (binary):
- `ST_GeomFromWKB(wkb)`: returns GEOMETRY with default SRID 0.
- `ST_GeomFromWKB(wkb, srid)`: returns GEOMETRY with the given SRID.
- `ST_GeogFromWKB(wkb)`: returns GEOGRAPHY with SRID 4326.

Example (point in WKB, then use in a table):
```sql
-- Point (1, 2) in WKB (little-endian point, 2D)
SELECT ST_GeomFromWKB(X'0101000000000000000000F03F0000000000000040');
SELECT ST_GeomFromWKB(X'0101000000000000000000F03F0000000000000040', 4326);
SELECT ST_GeogFromWKB(X'0101000000000000000000F03F0000000000000040');

INSERT INTO points (id, pt)
VALUES (1, ST_GeomFromWKB(X'0101000000000000000000F03F0000000000000040', 4326));
```
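The hex literal used above can be derived by hand. As an illustration, this minimal Python sketch builds the little-endian 2D point WKB with the standard `struct` module; `point_to_wkb` is a hypothetical helper and is not part of Spark.

```python
import struct

def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # "<"  : little-endian, no padding
    # "B"  : byte-order flag (1 = little-endian)
    # "I"  : geometry type as a 4-byte unsigned integer (1 = Point)
    # "dd" : x and y as 8-byte IEEE 754 doubles
    return struct.pack("<BIdd", 1, 1, x, y)

wkb_hex = point_to_wkb(1.0, 2.0).hex().upper()
print(wkb_hex)
```

The resulting string can be pasted into a SQL binary literal of the form `X'...'`.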
When parsing WKB, Spark applies the following rules. Violations result in a parse error.
- For Point values, NaN coordinates denote an empty point (`POINT EMPTY` in Well-Known Text). LineString and Polygon (and points inside them) do not allow NaN in coordinate values.
- For GEOGRAPHY (e.g. `ST_GeogFromWKB`), longitude must be in [-180, 180] (inclusive) and latitude in [-90, 90] (inclusive). GEOMETRY does not enforce these bounds.

Spark SQL provides scalar functions for working with GEOMETRY and GEOGRAPHY values. They are grouped under `st_funcs` in the Built-in Functions API.
| Function | Description |
|---|---|
| `ST_AsBinary(geo)` | Returns the GEOMETRY or GEOGRAPHY value as WKB (BINARY). |
| `ST_GeomFromWKB(wkb)` | Parses WKB and returns a GEOMETRY with default SRID 0. |
| `ST_GeomFromWKB(wkb, srid)` | Parses WKB and returns a GEOMETRY with the given SRID. |
| `ST_GeogFromWKB(wkb)` | Parses WKB and returns a GEOGRAPHY with SRID 4326. |
| `ST_Srid(geo)` | Returns the SRID of the GEOMETRY or GEOGRAPHY value (NULL if input is NULL). |
| `ST_SetSrid(geo, srid)` | Returns a new GEOMETRY or GEOGRAPHY with the given SRID. |
Examples:
```sql
SELECT hex(ST_AsBinary(ST_GeogFromWKB(X'0101000000000000000000F03F0000000000000040')));
-- 0101000000000000000000F03F0000000000000040

SELECT ST_Srid(ST_GeogFromWKB(X'0101000000000000000000F03F0000000000000040'));
-- 4326

SELECT ST_Srid(ST_SetSrid(ST_GeomFromWKB(X'0101000000000000000000F03F0000000000000040'), 3857));
-- 3857
```
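A client that receives the BINARY output of `ST_AsBinary` may want to read the coordinates back. The following is a minimal Python sketch for 2D points only, using the standard `struct` module; `wkb_point_to_xy` is a hypothetical helper, not a Spark or library API, and it does not handle other geometry types or extended WKB variants.

```python
import struct

def wkb_point_to_xy(wkb):
    """Decode a 2D point from WKB and return (x, y)."""
    byte_order = wkb[0]
    if byte_order not in (0, 1):
        raise ValueError(f"unknown WKB byte-order flag: {byte_order}")
    # 1 = little-endian, 0 = big-endian
    endian = "<" if byte_order == 1 else ">"
    # Geometry type is a 4-byte unsigned integer that follows the flag.
    (geom_type,) = struct.unpack_from(endian + "I", wkb, 1)
    if geom_type != 1:
        raise ValueError(f"expected a Point (type 1), got type {geom_type}")
    # Two 8-byte doubles follow the 5-byte header.
    x, y = struct.unpack_from(endian + "dd", wkb, 5)
    return x, y

point = wkb_point_to_xy(bytes.fromhex("0101000000000000000000F03F0000000000000040"))
print(point)
```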
- Fixed SRID (e.g. `GEOMETRY(4326)`): all values must use the column's SRID (use `ST_SetSrid` to set the value's SRID to match the column).
- Mixed SRID (`GEOMETRY(ANY)` or `GEOGRAPHY(ANY)`): Values can have different SRIDs. Only valid SRIDs are allowed.

For the full list of supported data types and API usage in Scala, Java, Python, and SQL, see Data Types.