SessionContext is the entry point into DataFusion from Java. It owns the catalog of registered tables and the query planner.
    try (SessionContext ctx = new SessionContext()) {
        // register tables, build queries...
    }
SessionContext is AutoCloseable. Closing it releases the underlying native context. Use try-with-resources so the native side is freed even on exception.
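The close-on-exception behaviour comes from Java's try-with-resources itself, so it can be sketched without the native library. In the example below, `FakeNativeContext` is a hypothetical stand-in written for illustration, not the real SessionContext; it only records whether `close()` ran.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a native-backed context; NOT the real SessionContext. */
final class FakeNativeContext implements AutoCloseable {
    private final AtomicBoolean closed;

    FakeNativeContext(AtomicBoolean closed) {
        this.closed = closed;
    }

    /** Simulates a query that fails part-way through. */
    void query() {
        throw new IllegalStateException("simulated query failure");
    }

    @Override
    public void close() {
        // A real native-backed class would release its native handle here.
        closed.set(true);
    }
}

final class CloseOnExceptionDemo {
    /** Returns true if the context was closed even though the body threw. */
    static boolean closedAfterFailure() {
        AtomicBoolean closed = new AtomicBoolean(false);
        try (FakeNativeContext ctx = new FakeNativeContext(closed)) {
            ctx.query(); // throws
        } catch (IllegalStateException e) {
            // Resources are closed before the catch block runs.
        }
        return closed.get();
    }

    public static void main(String[] args) {
        System.out.println("closed after failure: " + closedAfterFailure());
    }
}
```

Running `main` prints `closed after failure: true`. The same guarantee applies to any AutoCloseable declared in the try header, which is why the examples on this page declare the context there rather than calling `close()` by hand.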
A SessionContext is not thread-safe. Do not share one across threads without external synchronization. The simplest pattern is one context per thread.
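One way to keep a context confined to a single thread is to create, use, and close it inside the task that needs it. The sketch below assumes a thread pool and uses `StubContext`, a hypothetical stand-in for SessionContext that throws if it is touched from a thread other than the one that created it; none of these names come from the library.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for SessionContext that checks thread confinement. */
final class StubContext implements AutoCloseable {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();

    final int id = NEXT_ID.incrementAndGet();
    private final Thread owner = Thread.currentThread();

    /** Pretends to run a query; fails if called from a foreign thread. */
    int run() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("context used from a foreign thread");
        }
        return id;
    }

    @Override
    public void close() {
        // Nothing to release in the stub.
    }
}

final class ContextPerThreadDemo {
    /** Runs the given number of tasks and returns how many distinct contexts were used. */
    static int distinctContexts(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(() -> {
                    // Each task owns its context; it never escapes this thread.
                    try (StubContext ctx = new StubContext()) {
                        return ctx.run();
                    }
                }));
            }
            Set<Integer> ids = new HashSet<>();
            for (Future<Integer> f : futures) {
                try {
                    ids.add(f.get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
            return ids.size();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("distinct contexts: " + distinctContexts(8));
    }
}
```

With eight tasks this reports eight distinct contexts, since no context is shared or reused. Creating a context per task trades some setup cost for simplicity; if that cost matters, a per-thread context or a shared context guarded by a lock are the alternatives the paragraph above allows.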
SessionContext.builder() exposes a fluent builder for overriding DataFusion defaults: batch size, target partitions, statistics collection, information schema, memory pool size, and the spill directory. See the SessionContextBuilder Javadoc for the full list.
    try (SessionContext ctx = SessionContext.builder()
            .batchSize(4096)
            .targetPartitions(8)
            .build()) {
        // ...
    }
withSparkFunctions() registers Apache Spark–compatible functions and expression planners (from the datafusion-spark crate) on the context:
    try (SessionContext ctx = SessionContext.builder()
            .withSparkFunctions()
            .build();
         DataFrame df = ctx.sql("SELECT crc32('Spark')")) {
        // ...
    }
When enabled, Spark-compatible functions override any DataFusion built-in of the same name. This requires the native library to be built with the spark Cargo feature, which is enabled in the default build; otherwise, build() throws an exception explaining that the feature is missing.