blob: fa716942a7e04233c506fe29e92e82a30e6e885e [file] [log] [blame]
{
"cells": [
{
"cell_type": "markdown",
"id": "8239e8eb-724c-44df-b651-c2dd02f1d96c",
"metadata": {},
"source": [
"<!---\n",
" Licensed to the Apache Software Foundation (ASF) under one\n",
" or more contributor license agreements. See the NOTICE file\n",
" distributed with this work for additional information\n",
" regarding copyright ownership. The ASF licenses this file\n",
" to you under the Apache License, Version 2.0 (the\n",
" \"License\"); you may not use this file except in compliance\n",
" with the License. You may obtain a copy of the License at\n",
"\n",
" http://www.apache.org/licenses/LICENSE-2.0\n",
"\n",
" Unless required by applicable law or agreed to in writing,\n",
" software distributed under the License is distributed on an\n",
" \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
" KIND, either express or implied. See the License for the\n",
" specific language governing permissions and limitations\n",
" under the License.\n",
"-->\n",
"\n",
"# SedonaDB Overture Examples\n",
"\n",
"> Note: Before running this notebook, ensure that you have installed SedonaDB: `pip install \"apache-sedona[db]\"`\n",
"\n",
"This notebook demonstrates how to query and analyze the [Overture Maps](https://overturemaps.org/) dataset using SedonaDB. See [this page](https://docs.overturemaps.org/release-calendar/) to get the latest version of the Overture data.\n",
"\n",
"The notebook explains how to:\n",
"\n",
"* Load Overture data for the `buildings` and `divisions` themes directly from S3.\n",
"* Perform spatial queries to find features within a specific geographic area.\n",
"* Optimize subsequent query performance by caching a subset of data in memory."
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "c5e580ff",
"metadata": {
"collapsed": true,
"jupyter": {
"outputs_hidden": true
},
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: lonboard in /opt/miniconda3/lib/python3.12/site-packages (0.12.1)\n",
"Requirement already satisfied: anywidget~=0.9.0 in /opt/miniconda3/lib/python3.12/site-packages (from lonboard) (0.9.18)\n",
"Requirement already satisfied: arro3-compute>=0.4.1 in /opt/miniconda3/lib/python3.12/site-packages (from lonboard) (0.6.3)\n",
"Requirement already satisfied: arro3-core>=0.4.1 in /opt/miniconda3/lib/python3.12/site-packages (from lonboard) (0.6.3)\n",
"Requirement already satisfied: arro3-io>=0.4.1 in /opt/miniconda3/lib/python3.12/site-packages (from lonboard) (0.6.3)\n",
"Requirement already satisfied: geoarrow-rust-core>=0.5.2 in /opt/miniconda3/lib/python3.12/site-packages (from lonboard) (0.5.2)\n",
"Requirement already satisfied: ipywidgets>=7.6.0 in /opt/miniconda3/lib/python3.12/site-packages (from lonboard) (8.1.7)\n",
"Requirement already satisfied: numpy>=1.14 in /opt/miniconda3/lib/python3.12/site-packages (from lonboard) (2.3.3)\n",
"Requirement already satisfied: pyproj>=3.3 in /opt/miniconda3/lib/python3.12/site-packages (from lonboard) (3.7.2)\n",
"Requirement already satisfied: traitlets>=5.7.1 in /opt/miniconda3/lib/python3.12/site-packages (from lonboard) (5.14.3)\n",
"Requirement already satisfied: psygnal>=0.8.1 in /opt/miniconda3/lib/python3.12/site-packages (from anywidget~=0.9.0->lonboard) (0.14.1)\n",
"Requirement already satisfied: typing-extensions>=4.2.0 in /opt/miniconda3/lib/python3.12/site-packages (from anywidget~=0.9.0->lonboard) (4.15.0)\n",
"Requirement already satisfied: comm>=0.1.3 in /opt/miniconda3/lib/python3.12/site-packages (from ipywidgets>=7.6.0->lonboard) (0.2.3)\n",
"Requirement already satisfied: ipython>=6.1.0 in /opt/miniconda3/lib/python3.12/site-packages (from ipywidgets>=7.6.0->lonboard) (9.5.0)\n",
"Requirement already satisfied: widgetsnbextension~=4.0.14 in /opt/miniconda3/lib/python3.12/site-packages (from ipywidgets>=7.6.0->lonboard) (4.0.14)\n",
"Requirement already satisfied: jupyterlab_widgets~=3.0.15 in /opt/miniconda3/lib/python3.12/site-packages (from ipywidgets>=7.6.0->lonboard) (3.0.15)\n",
"Requirement already satisfied: certifi in /opt/miniconda3/lib/python3.12/site-packages (from pyproj>=3.3->lonboard) (2025.8.3)\n",
"Requirement already satisfied: decorator in /opt/miniconda3/lib/python3.12/site-packages (from ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (5.2.1)\n",
"Requirement already satisfied: ipython-pygments-lexers in /opt/miniconda3/lib/python3.12/site-packages (from ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (1.1.1)\n",
"Requirement already satisfied: jedi>=0.16 in /opt/miniconda3/lib/python3.12/site-packages (from ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (0.19.2)\n",
"Requirement already satisfied: matplotlib-inline in /opt/miniconda3/lib/python3.12/site-packages (from ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (0.1.7)\n",
"Requirement already satisfied: pexpect>4.3 in /opt/miniconda3/lib/python3.12/site-packages (from ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (4.9.0)\n",
"Requirement already satisfied: prompt_toolkit<3.1.0,>=3.0.41 in /opt/miniconda3/lib/python3.12/site-packages (from ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (3.0.52)\n",
"Requirement already satisfied: pygments>=2.4.0 in /opt/miniconda3/lib/python3.12/site-packages (from ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (2.19.2)\n",
"Requirement already satisfied: stack_data in /opt/miniconda3/lib/python3.12/site-packages (from ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (0.6.3)\n",
"Requirement already satisfied: parso<0.9.0,>=0.8.4 in /opt/miniconda3/lib/python3.12/site-packages (from jedi>=0.16->ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (0.8.5)\n",
"Requirement already satisfied: ptyprocess>=0.5 in /opt/miniconda3/lib/python3.12/site-packages (from pexpect>4.3->ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (0.7.0)\n",
"Requirement already satisfied: wcwidth in /opt/miniconda3/lib/python3.12/site-packages (from prompt_toolkit<3.1.0,>=3.0.41->ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (0.2.14)\n",
"Requirement already satisfied: executing>=1.2.0 in /opt/miniconda3/lib/python3.12/site-packages (from stack_data->ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (2.2.1)\n",
"Requirement already satisfied: asttokens>=2.1.0 in /opt/miniconda3/lib/python3.12/site-packages (from stack_data->ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (3.0.0)\n",
"Requirement already satisfied: pure-eval in /opt/miniconda3/lib/python3.12/site-packages (from stack_data->ipython>=6.1.0->ipywidgets>=7.6.0->lonboard) (0.2.3)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install lonboard"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "6d6fa0ab-b4ed-4e60-b099-a1af0998b486",
"metadata": {},
"outputs": [],
"source": [
"import sedona.db\n",
"import os\n",
"\n",
"os.environ[\"AWS_SKIP_SIGNATURE\"] = \"true\"\n",
"os.environ[\"AWS_DEFAULT_REGION\"] = \"us-west-2\"\n",
"\n",
"sd = sedona.db.connect()"
]
},
{
"cell_type": "markdown",
"id": "4f44adfb-2973-4a65-b4f2-d24b28700b79",
"metadata": {},
"source": [
"## Overture buildings table"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "52855769-4872-472a-9c42-afced3d85ca8",
"metadata": {},
"outputs": [],
"source": [
"df = sd.read_parquet(\n",
" \"s3://overturemaps-us-west-2/release/2025-11-19.0/theme=buildings/type=building/\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b45b5e5c-64ed-49ba-a8aa-9f2292f617c6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌──────────────────────────────────────┬─────────────────────────────────────────┬───┬─────────────┐\n",
"│ id ┆ geometry ┆ … ┆ roof_height │\n",
"│ utf8 ┆ geometry ┆ ┆ float64 │\n",
"╞══════════════════════════════════════╪═════════════════════════════════════════╪═══╪═════════════╡\n",
"│ 85b47da4-1b8d-4132-ac6c-d8dc14fab4b8 ┆ POLYGON((-6.4292972 54.8290034,-6.4291… ┆ … ┆ │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ ec12e345-d44d-4e40-8e08-e1e6e68d4d17 ┆ POLYGON((-6.430836 54.8299412,-6.43095… ┆ … ┆ │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 285f9ff9-2d6d-409c-b214-74992c8d7e7d ┆ POLYGON((-6.4311579 54.8300247,-6.4313… ┆ … ┆ │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ abedfc7c-e5fd-4a29-931e-da77b610d02d ┆ POLYGON((-6.4321833 54.8294427,-6.4322… ┆ … ┆ │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ a203a2c6-e130-4979-a7d5-8a059c6f31fd ┆ POLYGON((-6.4300627 54.829276,-6.43006… ┆ … ┆ │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 1d14caf6-b12d-486e-87dd-feef82fba9a7 ┆ POLYGON((-6.4301786 54.8281533,-6.4299… ┆ … ┆ │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 4b1e67cf-7355-439b-9a31-46a50f3ee227 ┆ POLYGON((-6.4298614 54.8278977,-6.4299… ┆ … ┆ │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 06de994e-efd4-4a1c-8a20-b4e883904cb2 ┆ POLYGON((-6.4296383 54.827599,-6.42956… ┆ … ┆ │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ ea0b2ea6-7c52-4395-9baa-bc023c7d3166 ┆ POLYGON((-6.4296844 54.8277379,-6.4296… ┆ … ┆ │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 49f022ef-5574-4613-ae54-af139666fde3 ┆ POLYGON((-6.4296843 54.8278169,-6.4296… ┆ … ┆ │\n",
"└──────────────────────────────────────┴─────────────────────────────────────────┴───┴─────────────┘\n"
]
}
],
"source": [
"df.limit(10).show()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e37a023e-0e80-403a-a65b-b5a190004f72",
"metadata": {},
"outputs": [],
"source": [
"df.to_view(\"buildings\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "ebfe4776-e08f-4f38-97fc-fca8ec6fc364",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌────────────┐\n",
"│ count(*) │\n",
"│ int64 │\n",
"╞════════════╡\n",
"│ 2541497985 │\n",
"└────────────┘\n"
]
}
],
"source": [
"# the buildings table is large and contains billions of rows\n",
"sd.sql(\"\"\"\n",
"SELECT\n",
" COUNT(*)\n",
"FROM\n",
" buildings\n",
"\"\"\").show()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "b73f670d-0d10-4a7a-bfc7-e2abe5d9edd2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"SedonaSchema with 24 fields:\n",
" id: utf8<Utf8View>\n",
" geometry: geometry<WkbView(ogc:crs84)>\n",
" bbox: struct<Struct(xmin Float32, xmax Float32, ymin Float32, ymax Float32)>\n",
" version: int32<Int32>\n",
" sources: list<List(Field { name: \"element\", data_type: Struct([Field { name: \"property\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"dataset\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"license\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"record_id\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"update_time\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"confidence\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"between\", data_type: List(Field { name: \"element\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} })>\n",
" level: int32<Int32>\n",
" subtype: utf8<Utf8View>\n",
" class: utf8<Utf8View>\n",
" height: float64<Float64>\n",
" names: struct<Struct(primary Utf8, common Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"value\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false), rules List(Field { name: \"element\", data_type: Struct([Field { name: \"variant\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"language\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"perspectives\", data_type: Struct([Field { name: \"mode\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"countries\", data_type: List(Field { name: \"element\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"value\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"between\", data_type: List(Field { name: \"element\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"side\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }))>\n",
" has_parts: boolean<Boolean>\n",
" is_underground: boolean<Boolean>\n",
" num_floors: int32<Int32>\n",
" num_floors_underground: int32<Int32>\n",
" min_height: float64<Float64>\n",
" min_floor: int32<Int32>\n",
" facade_color: utf8<Utf8View>\n",
" facade_material: utf8<Utf8View>\n",
" roof_material: utf8<Utf8View>\n",
" roof_shape: utf8<Utf8View>\n",
" roof_direction: float64<Float64>\n",
" roof_orientation: utf8<Utf8View>\n",
" roof_color: utf8<Utf8View>\n",
" roof_height: float64<Float64>"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# check out the schema of the buildings table to see what it contains\n",
"df.schema"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "68d1b68c-dd26-45c2-944f-61138b212943",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌─────────────────────────┬────────────────────┬────────────┬────────────┬─────────────────────────┐\n",
"│ id ┆ height ┆ num_floors ┆ roof_shape ┆ centroid │\n",
"│ utf8 ┆ float64 ┆ int32 ┆ utf8 ┆ geometry │\n",
"╞═════════════════════════╪════════════════════╪════════════╪════════════╪═════════════════════════╡\n",
"│ aa8e3a73-c72c-4f1a-b6e… ┆ 20.38205909729004 ┆ ┆ ┆ POINT(-74.187673580307… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ efe7616b-7f7e-464c-9ce… ┆ 26.18361473083496 ┆ ┆ ┆ POINT(-74.189040982134… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ b3f734a1-325b-4e8c-b1d… ┆ 27.025876998901367 ┆ ┆ ┆ POINT(-74.2558161 40.8… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 45d88655-e2f4-4a08-926… ┆ 25.485210418701172 ┆ ┆ ┆ POINT(-74.182252194444… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 31e8353c-7d5b-4b20-94e… ┆ 21.294815063476562 ┆ ┆ ┆ POINT(-74.197113787905… │\n",
"└─────────────────────────┴────────────────────┴────────────┴────────────┴─────────────────────────┘\n"
]
}
],
"source": [
"# find all the buildings in New York City that are taller than 20 meters\n",
"nyc_bbox_wkt = (\n",
" \"POLYGON((-74.2591 40.4774, -74.2591 40.9176, -73.7004 40.9176, \"\n",
" \"-73.7004 40.4774, -74.2591 40.4774))\"\n",
")\n",
"sd.sql(f\"\"\"\n",
"SELECT\n",
" id,\n",
" height,\n",
" num_floors,\n",
" roof_shape,\n",
" ST_Centroid(geometry) as centroid\n",
"FROM\n",
" buildings\n",
"WHERE\n",
" is_underground = FALSE\n",
" AND height IS NOT NULL\n",
" AND height > 20\n",
" AND ST_Intersects(\n",
" geometry,\n",
" ST_SetSRID(ST_GeomFromText('{nyc_bbox_wkt}'), 4326)\n",
" )\n",
"LIMIT 5;\n",
"\"\"\").show()"
]
},
{
"cell_type": "markdown",
"id": "e07fcdc1-962b-4dce-90cb-bf715432e299",
"metadata": {},
"source": [
"## Overture divisions table"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d9f122d3-4d90-46b0-ab9a-259a71cc423b",
"metadata": {},
"outputs": [],
"source": [
"df = sd.read_parquet(\n",
" \"s3://overturemaps-us-west-2/release/2025-11-19.0/theme=divisions/type=division_area/\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "19a75b19-0b56-4167-b3f1-73a171ecc480",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌─────────────────┬────────────────┬────────────────┬───┬────────────────┬────────┬────────────────┐\n",
"│ id ┆ geometry ┆ bbox ┆ … ┆ is_territorial ┆ region ┆ division_id │\n",
"│ utf8 ┆ geometry ┆ struct ┆ ┆ boolean ┆ utf8 ┆ utf8 │\n",
"╞═════════════════╪════════════════╪════════════════╪═══╪════════════════╪════════╪════════════════╡\n",
"│ 3665c36d-d3a9-… ┆ POLYGON((12.5… ┆ {xmin: 12.455… ┆ … ┆ true ┆ IT-34 ┆ f05aa29f-151f… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 18a69439-a1da-… ┆ POLYGON((12.5… ┆ {xmin: 12.596… ┆ … ┆ true ┆ IT-36 ┆ ae00d58c-6e67… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 7d0f6d37-bb55-… ┆ POLYGON((12.6… ┆ {xmin: 12.567… ┆ … ┆ true ┆ IT-36 ┆ bdfc82ca-5f23… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 3f480ff6-6361-… ┆ POLYGON((12.5… ┆ {xmin: 12.549… ┆ … ┆ true ┆ IT-36 ┆ 1c750104-4470… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 31c3ab5e-eb6f-… ┆ POLYGON((12.6… ┆ {xmin: 12.612… ┆ … ┆ true ┆ IT-34 ┆ d90804ee-19a4… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 308517e6-64b4-… ┆ POLYGON((12.5… ┆ {xmin: 12.589… ┆ … ┆ true ┆ IT-34 ┆ aabd71e9-4d98… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 646e5b1f-b76a-… ┆ POLYGON((12.5… ┆ {xmin: 12.485… ┆ … ┆ true ┆ IT-34 ┆ 502c1c4e-fc19… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ f2809a49-1082-… ┆ POLYGON((12.5… ┆ {xmin: 12.538… ┆ … ┆ true ┆ IT-34 ┆ 8b446eed-00ad… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 72b27245-c7fd-… ┆ POLYGON((12.5… ┆ {xmin: 12.501… ┆ … ┆ true ┆ IT-34 ┆ 1d535e1f-d19e… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ 815855d9-05d0-… ┆ POLYGON((12.4… ┆ {xmin: 12.371… ┆ … ┆ true ┆ IT-34 ┆ 5aa91354-9e8c… │\n",
"└─────────────────┴────────────────┴────────────────┴───┴────────────────┴────────┴────────────────┘\n"
]
}
],
"source": [
"# inspect a few rows of the data\n",
"df.show(10)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "03b951de-3397-4fcf-9baf-50e139a38dd4",
"metadata": {},
"outputs": [],
"source": [
"df.to_view(\"division_area\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "9c6bd69d-9407-432a-bdc8-d60976237a3a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌──────────┐\n",
"│ count(*) │\n",
"│ int64 │\n",
"╞══════════╡\n",
"│ 1052542 │\n",
"└──────────┘\n"
]
}
],
"source": [
"sd.sql(\"\"\"\n",
"SELECT\n",
" COUNT(*)\n",
"FROM division_area\n",
"\"\"\").show()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "75a6d0ed-9767-4d36-a77a-4afb7952fbe4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"SedonaSchema with 13 fields:\n",
" id: utf8<Utf8View>\n",
" geometry: geometry<WkbView(ogc:crs84)>\n",
" bbox: struct<Struct(xmin Float32, xmax Float32, ymin Float32, ymax Float32)>\n",
" country: utf8<Utf8View>\n",
" version: int32<Int32>\n",
" sources: list<List(Field { name: \"element\", data_type: Struct([Field { name: \"property\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"dataset\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"license\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"record_id\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"update_time\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"confidence\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"between\", data_type: List(Field { name: \"element\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} })>\n",
" subtype: utf8<Utf8View>\n",
" class: utf8<Utf8View>\n",
" names: struct<Struct(primary Utf8, common Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"value\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false), rules List(Field { name: \"element\", data_type: Struct([Field { name: \"variant\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"language\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"perspectives\", data_type: Struct([Field { name: \"mode\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"countries\", data_type: List(Field { name: \"element\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"value\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"between\", data_type: List(Field { name: \"element\", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: \"side\", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }))>\n",
" is_land: boolean<Boolean>\n",
" is_territorial: boolean<Boolean>\n",
" region: utf8<Utf8View>\n",
" division_id: utf8<Utf8View>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.schema"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "f1f7158c-ef2b-4377-9bee-180309ddd553",
"metadata": {},
"outputs": [],
"source": [
"# get all the divisions in Nova Scotia and save them in memory with to_memtable()\n",
"nova_scotia_bbox_wkt = (\n",
" \"POLYGON((-66.5 43.4, -66.5 47.1, -59.8 47.1, -59.8 43.4, -66.5 43.4))\"\n",
")\n",
"ns = sd.sql(f\"\"\"\n",
"SELECT\n",
" country, region, names, geometry\n",
"FROM division_area\n",
"WHERE\n",
" ST_Intersects(\n",
" geometry,\n",
" ST_SetSRID(ST_GeomFromText('{nova_scotia_bbox_wkt}'), 4326)\n",
" )\n",
"\"\"\").to_memtable()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "27e6909d-06fa-438b-88e0-d300fd2fb1ec",
"metadata": {},
"outputs": [],
"source": [
"ns.to_view(\"ns_divisions\")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "2dec92d8-a374-4021-990a-e50f5769516e",
"metadata": {},
"outputs": [],
"source": [
"df = sd.sql(\"\"\"\n",
"SELECT UNNEST(names), geometry\n",
"FROM ns_divisions\n",
"WHERE region = 'CA-NS'\n",
"\"\"\")"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "7f39a283-0eee-4f72-a30a-8dd9fa1aaa69",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌────────────────────────┬────────────────────────┬────────────────────────┬───────────────────────┐\n",
"│ __unnest_placeholder(n ┆ __unnest_placeholder(n ┆ __unnest_placeholder(n ┆ geometry │\n",
"│ s_divisions.names).pr… ┆ s_divisions.names).co… ┆ s_divisions.names).ru… ┆ geometry │\n",
"╞════════════════════════╪════════════════════════╪════════════════════════╪═══════════════════════╡\n",
"│ Apple River ┆ ┆ ┆ POLYGON((-64.7260681… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
"│ Allen Hill ┆ ┆ ┆ POLYGON((-64.6956656… │\n",
"└────────────────────────┴────────────────────────┴────────────────────────┴───────────────────────┘\n",
"CPU times: user 1.25 ms, sys: 805 μs, total: 2.05 ms\n",
"Wall time: 1.42 ms\n"
]
}
],
"source": [
"%%time\n",
"# this executes quickly because the Nova Scotia data was persisted in memory with `to_memtable()`\n",
"df.show(2)"
]
},
{
"cell_type": "markdown",
"id": "fc1d2023-c83a-4010-808b-212161b1b577",
"metadata": {},
"source": [
"## Visualize the results with lonboard"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "f78583fd-a73a-4169-9c45-74d8026bb5fb",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "02a229b2c09f4acf8ae4daedcb8af8ae",
"version_major": 2,
"version_minor": 1
},
"text/plain": [
"Map(basemap_style=<CartoBasemap.DarkMatter: 'https://basemaps.cartocdn.com/gl/dark-matter-gl-style/style.json'…"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import lonboard\n",
"\n",
"lonboard.viz(df)"
]
},
{
"cell_type": "markdown",
"id": "0f668f51-462e-4e18-b774-c047140b224a",
"metadata": {},
"source": [
"![Lonboard NS](image/lonboard_ns.png)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}