site/_docs/arrow_adapter.md

---
layout: docs
title: Arrow adapter
permalink: /docs/arrow_adapter.html
---

Note: Arrow Adapter is an experimental feature; changes in public API and usage are expected.

Overview

Calcite's adapter for Apache Arrow lets you query data in Arrow format using SQL.

It can read files in Arrow's Feather format (which generally have a .arrow suffix) in the same way that the File Adapter can read .csv files.

A simple example

Let's start with a simple example. First, we need a [model definition]({{ site.baseurl }}/docs/model.html), as follows.

{% highlight json %}
{
  "version": "1.0",
  "defaultSchema": "ARROW",
  "schemas": [
    {
      "name": "ARROW",
      "type": "custom",
      "factory": "org.apache.calcite.adapter.arrow.ArrowSchemaFactory",
      "operand": {
        "directory": "arrow"
      }
    }
  ]
}
{% endhighlight %}

The model file is stored as arrow/src/test/resources/arrow-model.json, so you can connect via sqlline as follows:

{% highlight bash %}
$ ./sqlline
sqlline> !connect jdbc:calcite:model=arrow/src/test/resources/arrow-model.json admin admin
sqlline> select * from arrow.test;
+----------+----------+------------+
| fieldOne | fieldTwo | fieldThree |
+----------+----------+------------+
|        1 | abc      | 1.2        |
|        2 | def      | 3.4        |
|        3 | xyz      | 5.6        |
|        4 | abcd     | 1.22       |
|        5 | defg     | 3.45       |
|        6 | xyza     | 5.67       |
+----------+----------+------------+
6 rows selected
{% endhighlight %}

The arrow directory contains a file called test.arrow, and so it shows up as a table called test.
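
The same query can also be issued from Java through plain JDBC. The following is a minimal sketch, not part of the adapter itself: it assumes that Calcite's core and Arrow adapter jars are on the classpath and that the program runs from the root of the Calcite source tree, so that the relative model path resolves. The class name `ArrowQueryExample` and the helper `jdbcUrl` are hypothetical names chosen for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

public class ArrowQueryExample {
  /** Builds a JDBC URL of the same form as the sqlline !connect line above. */
  static String jdbcUrl(String modelPath) {
    return "jdbc:calcite:model=" + modelPath;
  }

  public static void main(String[] args) throws SQLException {
    // Assumption: the working directory is the root of the Calcite source tree.
    String url = jdbcUrl("arrow/src/test/resources/arrow-model.json");

    // try-with-resources closes the result set, statement and connection in turn.
    try (Connection connection = DriverManager.getConnection(url, "admin", "admin");
         Statement statement = connection.createStatement();
         ResultSet resultSet = statement.executeQuery("select * from arrow.test")) {
      ResultSetMetaData metaData = resultSet.getMetaData();
      int columnCount = metaData.getColumnCount();
      while (resultSet.next()) {
        // Print each row as "label=value" pairs separated by commas.
        StringBuilder row = new StringBuilder();
        for (int i = 1; i <= columnCount; i++) {
          if (i > 1) {
            row.append(", ");
          }
          row.append(metaData.getColumnLabel(i)).append('=').append(resultSet.getObject(i));
        }
        System.out.println(row);
      }
    }
  }
}
```

If the adapter jars are missing from the classpath, `DriverManager.getConnection` fails with an `SQLException`, which is the first thing to check if this sketch does not run.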