SessionContext

SessionContext is the entry point into DataFusion from Java. It owns the catalog of registered tables and the query planner.

Lifecycle

try (SessionContext ctx = new SessionContext()) {
    // register tables, build queries...
}

SessionContext is AutoCloseable. Closing it releases the underlying native context. Use try-with-resources so the native side is freed even on exception.
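
The ordering guarantee can be checked without the native library. The following is a minimal sketch, assuming a hypothetical stand-in class named FakeContext that only records calls; it is not part of the DataFusion bindings and merely illustrates standard Java try-with-resources semantics, where close() runs before any catch block.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {
    // Hypothetical stand-in for SessionContext: it only records what happens to it.
    static final class FakeContext implements AutoCloseable {
        private final List<String> log;

        FakeContext(List<String> log) {
            this.log = log;
            log.add("open");
        }

        // Simulates a query that fails after the context has been created.
        void query() {
            log.add("query");
            throw new IllegalStateException("query failed");
        }

        @Override
        public void close() {
            log.add("close");
        }
    }

    // Runs a failing query inside try-with-resources and returns the event log.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (FakeContext ctx = new FakeContext(log)) {
            ctx.query();
        } catch (IllegalStateException e) {
            // The resource has already been closed by the time this block runs.
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [open, query, close, caught]
    }
}
```

With a real SessionContext the same shape applies: the native context is released as control leaves the try block, whether normally or through an exception.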

Threading

A SessionContext is not thread-safe. Do not share one across threads without external synchronization. The simplest pattern is one context per thread.
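
One way to keep a context confined to a single thread is to create and close it inside each task. The sketch below is a variation on the one-context-per-thread pattern (one context per task), using a hypothetical FakeContext stand-in and a hypothetical OPEN counter rather than the real SessionContext, so it runs without the native library. The names runTasks, describe, and OPEN are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class PerTaskContextSketch {
    // Counts contexts that are currently open, to show that none leak.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for SessionContext; assumed not to be thread-safe.
    static final class FakeContext implements AutoCloseable {
        FakeContext() {
            OPEN.incrementAndGet();
        }

        String describe(int taskId) {
            return "task-" + taskId;
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Submits `tasks` jobs to a pool of `threads` workers; each job owns its context.
    static List<String> runTasks(int tasks, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int taskId = i;
                futures.add(pool.submit(() -> {
                    // The context is created, used, and closed on the worker thread,
                    // so it is never visible to any other thread.
                    try (FakeContext ctx = new FakeContext()) {
                        return ctx.describe(taskId);
                    }
                }));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTasks(4, 2));
        System.out.println("open contexts: " + OPEN.get());
    }
}
```

Creating a context per task trades some setup cost for simplicity. If context creation turns out to be expensive, a long-lived context per worker thread is an alternative, provided each one is closed when its thread finishes.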

Configuration

SessionContext.builder() exposes a fluent builder for overriding DataFusion defaults: batch size, target partitions, statistics collection, information schema, memory pool size, and the spill directory. See the SessionContextBuilder Javadoc for the full list.

try (SessionContext ctx = SessionContext.builder()
        .batchSize(4096)
        .targetPartitions(8)
        .build()) {
    // ...
}

Spark-compatible functions

withSparkFunctions() registers Apache Spark–compatible functions and expression planners (from the datafusion-spark crate) on the context:

try (SessionContext ctx = SessionContext.builder()
        .withSparkFunctions()
        .build();
     DataFrame df = ctx.sql("SELECT crc32('Spark')")) {
    // ...
}

When enabled, Spark-compatible functions override any DataFusion built-in of the same name. This option requires the native library to be built with the spark Cargo feature, which is enabled in the default build; otherwise build() throws an exception explaining that the feature is missing.