This is a template project that shows how to use Sedona for spatial data mining.
Spatial co-location means that two or more species (types of spatial objects) are often found in each other's neighborhood. Ripley's K function is commonly used to test for co-location. It is usually evaluated multiple times, at different distances, and the results form a two-dimensional curve for inspection.
For example, in Africa, lions co-locate with zebras.
We use Ripley's K function to analyze multivariate spatial patterns.
Here are some materials on how to use Ripley's K function and its transformation, the L function.
Single type K function:
Multivariate K function and L function:
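For orientation, the common textbook forms of these functions are sketched below. This is a simplified version without edge correction, and the symbols are assumptions introduced here rather than taken from the materials above: A is the area of the study region, n is the number of points, d_ij is the distance between points i and j, and 1(·) is the indicator function. The normalization used by the template project's code may differ.

```latex
% Single-type K function and its L transformation
K(r) = \frac{A}{n(n-1)} \sum_{i=1}^{n} \sum_{j \neq i} \mathbf{1}(d_{ij} \le r),
\qquad
L(r) = \sqrt{\frac{K(r)}{\pi}}

% Multivariate (cross-type) K and L functions for two point sets of sizes n_1 and n_2
K_{12}(r) = \frac{A}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} \mathbf{1}(d_{ij} \le r),
\qquad
L_{12}(r) = \sqrt{\frac{K_{12}(r)}{\pi}}
```

Under complete spatial randomness, K(r) = πr² and therefore L(r) = r, so values of L(r) − r above zero suggest clustering (or, in the multivariate case, co-location) at distance r.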
A data scientist at NYC Taxi Company suspects that taxi pickup points are co-located with area landmarks such as airports, museums, hospitals, and colleges. In other words, many taxi trips start from area landmarks. He wants a quantitative metric to measure the degree of co-location.
Dataset
Code
Run the code in “ScalaExample”. You will obtain a visualized co-location map and the results of 10 iterations of Ripley's L function.
You can download more NYC taxi trip data to obtain a more detailed analysis.
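To give a feel for what the iterations of the L function compute, here is a minimal plain-Python sketch. It is not the Sedona code in “ScalaExample”: it uses a naive pairwise distance count with no edge correction, and the data, the names `cross_k`, `cross_l`, `pickups`, `landmarks`, and the distance settings are all hypothetical. It evaluates the cross-type L function at 10 increasing distances on synthetic points that are deliberately clustered around landmarks.

```python
import math
import random


def cross_k(points_a, points_b, r, area):
    """Naive cross-type Ripley's K (no edge correction):
    area / (n_a * n_b) * number of (a, b) pairs within distance r."""
    pairs = sum(
        1
        for (ax, ay) in points_a
        for (bx, by) in points_b
        if math.hypot(ax - bx, ay - by) <= r
    )
    return area * pairs / (len(points_a) * len(points_b))


def cross_l(points_a, points_b, r, area):
    """L transformation of K: sqrt(K / pi). Equals r under spatial randomness."""
    return math.sqrt(cross_k(points_a, points_b, r, area) / math.pi)


random.seed(42)

# Hypothetical data in a unit square (area = 1.0).
landmarks = [(random.random(), random.random()) for _ in range(20)]

# Each synthetic pickup is jittered around a randomly chosen landmark,
# so the two point sets are co-located by construction.
pickups = []
for _ in range(500):
    lx, ly = random.choice(landmarks)
    pickups.append((lx + random.gauss(0, 0.01), ly + random.gauss(0, 0.01)))

area = 1.0
max_distance = 0.1   # assumed largest distance of interest
iterations = 10      # one L value per distance step

curve = []
for i in range(1, iterations + 1):
    r = max_distance * i / iterations
    l_value = cross_l(pickups, landmarks, r, area)
    curve.append((r, l_value, l_value - r))
    print(f"distance={r:.3f}  L={l_value:.4f}  L-distance={l_value - r:.4f}")
```

Plotting L − distance against distance gives the two-dimensional curve used for inspection; values consistently above zero indicate that the two point sets are closer together than spatial randomness would predict.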
Result
Visualized co-location map, using the subset in the template project. The output image is in the root folder:
Visualized co-location map, using all 1.3 billion taxi trip pickup points:
Ripley's L function result:
Conclusion: New York City taxi trip pickup points co-locate with New York City area landmarks.