
Apache DataFusion Java

Java bindings for Apache DataFusion. Queries run in native Rust and results return to the JVM as Apache Arrow batches via the Arrow C Data Interface.

Early development: no releases yet, API will change. Bug reports and contributions welcome.

Quickstart

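import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.ipc.ArrowReader;
import org.apache.datafusion.DataFrame;
import org.apache.datafusion.SessionContext;

public class Quickstart {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the allocator and the session on exit.
        try (var allocator = new RootAllocator();
             var ctx = new SessionContext()) {

            // Register a Parquet file under the table name "orders".
            ctx.registerParquet("orders", "/path/to/orders.parquet");

            // The DataFrame and the reader are closed before the session.
            try (DataFrame df = ctx.sql(
                    "SELECT o_orderpriority, COUNT(*) AS n " +
                    "FROM orders GROUP BY o_orderpriority");
                 ArrowReader reader = df.collect(allocator)) {
                // loadNextBatch() returns false once the result is exhausted.
                while (reader.loadNextBatch()) {
                    var batch = reader.getVectorSchemaRoot();
                    // ...
                }
            }
        }
    }
}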

SessionContext and DataFrame are AutoCloseable and not thread-safe.
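
Because these objects are not thread-safe, one common way to share them in a multi-threaded application is thread confinement: a single worker thread creates, uses, and closes the resource, and other threads submit work to it. The sketch below illustrates the pattern with standard-library classes only. FakeContext and ConfinedContext are hypothetical names introduced for illustration; FakeContext stands in for a real SessionContext and is not part of this project's API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class ConfinedContext implements AutoCloseable {

    /** Hypothetical stand-in for a non-thread-safe, AutoCloseable resource. */
    static final class FakeContext implements AutoCloseable {
        // Remember the creating thread so misuse is detected in this demo.
        private final Thread owner = Thread.currentThread();

        String query(String sql) {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("accessed from a non-owner thread");
            }
            return "result of: " + sql;
        }

        @Override
        public void close() {
            // Nothing to release in the stand-in.
        }
    }

    // One worker thread performs every operation on the resource.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final FakeContext ctx;

    public ConfinedContext() throws Exception {
        // Create the resource on the worker thread so that thread owns it.
        Callable<FakeContext> create = FakeContext::new;
        this.ctx = worker.submit(create).get();
    }

    /** Runs the query on the worker thread and blocks until it completes. */
    public String query(String sql) throws Exception {
        Callable<String> task = () -> ctx.query(sql);
        return worker.submit(task).get();
    }

    @Override
    public void close() throws Exception {
        try {
            // Close the resource on the thread that created it.
            Runnable release = ctx::close;
            worker.submit(release).get();
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        try (ConfinedContext confined = new ConfinedContext()) {
            System.out.println(confined.query("SELECT 1")); // prints "result of: SELECT 1"
        }
    }
}
```

Callers on any thread can invoke query(), but the wrapped object is only ever touched by the worker. A failure inside a task surfaces to the caller as an ExecutionException from Future.get().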

Documentation

The full documentation lives under docs/source/ and is built with Sphinx (see docs/README.md for the build steps):

  • User guide — installation, the DataFrame and SQL APIs, Parquet ingestion.
  • Contributor guide — build, test, code style, and how to bump the DataFusion version.

Requirements

JDK 17+. Building from source: see docs/source/contributor-guide/development.md.
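
An application that embeds the bindings may want to fail fast when started on an older runtime. The following is a minimal sketch of such a check using the standard Runtime.version() API; the class name JdkCheck and its methods are hypothetical and not part of this project.

```java
public final class JdkCheck {

    // Minimum feature release stated in the requirements above.
    static final int MIN_FEATURE_VERSION = 17;

    /** Returns true if the given feature release meets the minimum. */
    static boolean isSupported(int featureVersion) {
        return featureVersion >= MIN_FEATURE_VERSION;
    }

    /** Throws if the running JVM is older than the minimum. */
    static void requireSupportedJdk() {
        int feature = Runtime.version().feature();
        if (!isSupported(feature)) {
            throw new IllegalStateException(
                    "JDK " + MIN_FEATURE_VERSION + "+ required, found " + feature);
        }
    }

    public static void main(String[] args) {
        requireSupportedJdk();
        System.out.println("JDK feature version: " + Runtime.version().feature());
    }
}
```

Runtime.version().feature() is available from Java 10 onward, so the check itself must be compiled for a target that the older runtime can still load.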

Contributing

Open an issue to discuss non-trivial changes before sending a PR. See the contributor guide.

License

Apache License 2.0. See LICENSE.txt and NOTICE.txt.