Note: The Arrow adapter is an experimental feature; its public API and usage are expected to change.
Calcite's adapter for Apache Arrow lets you query data in Arrow format using SQL.
It can read files in Arrow's Feather format (which generally have a `.arrow` suffix) in the same way that the File Adapter can read `.csv` files.
Let's start with a simple example. First, we need a [model definition]({{ site.baseurl }}/docs/model.html), as follows.

{% highlight json %}
{
  "version": "1.0",
  "defaultSchema": "ARROW",
  "schemas": [
    {
      "name": "ARROW",
      "type": "custom",
      "factory": "org.apache.calcite.adapter.arrow.ArrowSchemaFactory",
      "operand": {
        "directory": "arrow"
      }
    }
  ]
}
{% endhighlight %}

The model file is stored as `arrow/src/test/resources/arrow-model.json`, so you can connect via `sqlline` as follows:

{% highlight bash %}
$ ./sqlline
sqlline> !connect jdbc:calcite:model=arrow/src/test/resources/arrow-model.json admin admin
sqlline> select * from arrow.test;
+----------+----------+------------+
| fieldOne | fieldTwo | fieldThree |
+----------+----------+------------+
| 1        | abc      | 1.2        |
| 2        | def      | 3.4        |
| 3        | xyz      | 5.6        |
| 4        | abcd     | 1.22       |
| 5        | defg     | 3.45       |
| 6        | xyza     | 5.67       |
+----------+----------+------------+
6 rows selected
{% endhighlight %}

The `arrow` directory contains a file called `test.arrow`, and so it shows up as a table called `test`.
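
The same query can also be issued from Java over plain JDBC. The following is a minimal sketch, not part of the adapter itself: the class name `ArrowQuery` and the helper `jdbcUrl` are hypothetical, and it assumes that Calcite and the Arrow adapter are on the classpath and that the program runs from the directory containing the model path used above.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class ArrowQuery {
  // Hypothetical helper: builds the same JDBC URL that was passed to
  // sqlline's !connect command.
  static String jdbcUrl(String modelPath) {
    return "jdbc:calcite:model=" + modelPath;
  }

  public static void main(String[] args) throws SQLException {
    String url = jdbcUrl("arrow/src/test/resources/arrow-model.json");

    // Same credentials as in the sqlline session.
    Properties info = new Properties();
    info.setProperty("user", "admin");
    info.setProperty("password", "admin");

    // Assumption: the Calcite JDBC driver is on the classpath; without it,
    // DriverManager throws "No suitable driver".
    try (Connection connection = DriverManager.getConnection(url, info);
         Statement statement = connection.createStatement();
         ResultSet rs = statement.executeQuery("select * from arrow.test")) {
      int columnCount = rs.getMetaData().getColumnCount();
      while (rs.next()) {
        // Print each row as a comma-separated line.
        StringBuilder row = new StringBuilder();
        for (int i = 1; i <= columnCount; i++) {
          if (i > 1) {
            row.append(", ");
          }
          row.append(rs.getObject(i));
        }
        System.out.println(row);
      }
    }
  }
}
```

Because the model file sets `defaultSchema` to `ARROW`, the schema-qualified name `arrow.test` is used here only to mirror the sqlline example.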