Here's some miscellaneous documentation about using Calcite and its various adapters.
To build from a source distribution, the prerequisites are maven (3.2.1 or later) and Java (JDK 1.7 or later, 1.8 preferred) on your path.
Unpack the source distribution `.tar.gz` or `.zip` file, `cd` to the root directory of the unpacked source, then build using maven:
{% highlight bash %}
$ tar xvfz calcite-1.6.0-source.tar.gz
$ cd calcite-1.6.0
$ mvn install
{% endhighlight %}
Running tests describes how to run more or fewer tests.
To build from git, the prerequisites are git, maven (3.2.1 or later) and Java (JDK 1.7 or later, 1.8 preferred) on your path.
Create a local copy of the GitHub repository, `cd` to its root directory, then build using maven:
{% highlight bash %}
$ git clone git://github.com/apache/calcite.git
$ cd calcite
$ mvn install
{% endhighlight %}
Running tests describes how to run more or fewer tests.
The test suite will run by default when you build, unless you specify `-DskipTests`:
{% highlight bash %}
# Clean, then build without running tests
$ mvn clean
$ mvn -DskipTests install

# Run the tests
$ mvn test
{% endhighlight %}
There are other options that control which tests are run, and in what environment, as follows.

* `-Dcalcite.test.db=DB` (where `DB` is `h2`, `hsqldb`, `mysql`, or `postgresql`) allows you to change the JDBC data source for the test suite. Calcite's test suite requires a JDBC data source populated with the foodmart data set.
  * `hsqldb`, the default, uses an in-memory hsqldb database.
  * `mysql` and `postgresql` might be somewhat faster than hsqldb, but you need to populate them (i.e. provision a VM).
* `-Dcalcite.debug` prints extra debugging information to stdout.
* `-Dcalcite.test.slow` enables tests that take longer to execute. For example, there are tests that create virtual TPC-H and TPC-DS schemas in-memory and run tests from those benchmarks.
* `-Dcalcite.test.splunk` enables tests that run against Splunk. Splunk must be installed and running.

For testing Calcite's external adapters, a test virtual machine should be used. The VM includes H2, HSQLDB, MySQL, MongoDB, and PostgreSQL.
The test VM requires 5GiB of disk space and takes 30 minutes to build.

Note: you can use calcite-test-dataset to populate your own database; however, we recommend using the test VM so that the test environment can be reproduced.
Install dependencies: Vagrant and VirtualBox
Clone https://github.com/vlsi/calcite-test-dataset.git at the same level as the calcite repository. For instance:

{% highlight bash %}
code
  +-- calcite
  +-- calcite-test-dataset
{% endhighlight %}

Note: integration tests search for `../calcite-test-dataset` or `../../calcite-test-dataset`. You can specify the full path via the `calcite.test.dataset` system property.
Then build the VM:

{% highlight bash %}
cd calcite-test-dataset && mvn install
{% endhighlight %}
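The dataset lookup described in the note above can be pictured with a short shell sketch. It is only an illustration, not Calcite's actual test code: the function name `find_test_dataset` is hypothetical, and both the search order and the idea that relative paths are resolved against the current directory are assumptions based on the note.

```shell
# Print the absolute path of calcite-test-dataset, or return 1 if not found.
# $1 (optional): explicit path, standing in for -Dcalcite.test.dataset.
# Assumption: an explicit path wins, then ../ and ../../ are tried in order.
find_test_dataset() {
  local override="${1:-}" candidate
  for candidate in "$override" ../calcite-test-dataset ../../calcite-test-dataset; do
    if [ -n "$candidate" ] && [ -d "$candidate" ]; then
      # Resolve to an absolute path without changing the caller's directory.
      (cd "$candidate" && pwd)
      return 0
    fi
  done
  echo "calcite-test-dataset not found" >&2
  return 1
}
```

Run from the root of the calcite checkout, `find_test_dataset` would print the sibling directory; passing a path as the first argument mimics setting the system property.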
The test VM is provisioned by Vagrant, so the regular Vagrant commands `vagrant up` and `vagrant halt` should be used to start and stop the VM. The connection strings for different databases are listed in the calcite-test-dataset readme.

Note: the test VM should be started before you launch integration tests. Calcite itself does not start/stop the VM.
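Because Calcite does not manage the VM, a wrapper can bracket a test run with `vagrant up` and `vagrant halt`. The following is a minimal sketch under stated assumptions: the function name `run_integration_tests` is hypothetical, it is meant to be called from the root of the calcite checkout, and `calcite-test-dataset` is taken to be a sibling directory unless a path is passed.

```shell
# Start the test VM, run the integration tests, then halt the VM.
# $1 (optional): path to the calcite-test-dataset checkout
#                (assumed default: ../calcite-test-dataset).
run_integration_tests() {
  local dataset_dir="${1:-../calcite-test-dataset}" status=0
  # Give up early if the VM cannot be started.
  (cd "$dataset_dir" && vagrant up) || return 1
  # Remember the maven exit status so the VM is halted even when tests fail.
  mvn verify -Pit || status=$?
  (cd "$dataset_dir" && vagrant halt)
  return "$status"
}
```

The function returns maven's exit status, so it can be used in scripts that need to know whether the integration tests passed.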
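Command line:

* Regular unit tests (no external data required): `mvn test` or `mvn install`.
* All tests, including integration tests: `mvn verify -Pit`. `it` stands for “integration-test”. `mvn install -Pit` works as well.
* Only the integration tests, skipping unit tests: `mvn -Dtest=foo -DfailIfNoTests=false -Pit verify`
* Only the MongoDB tests: `cd mongodb; mvn verify -Pit`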
From within an IDE:

* MongoDB tests: run `MongoAdapterIT.java` as usual (no additional properties are required).
* MySQL tests: run `JdbcTest` and `JdbcAdapterTest` with setting `-Dcalcite.test.db=mysql`.
* PostgreSQL tests: run `JdbcTest` and `JdbcAdapterTest` with setting `-Dcalcite.test.db=postgresql`.
Tests with external data are executed at maven's `integration-test` phase. We do not currently use `pre-integration-test`/`post-integration-test`, although we could use them in future. The verification of build pass/failure is performed at the `verify` phase. Integration tests should be named `...IT.java`, so they are not picked up on unit test execution.
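As a quick check of the naming convention, the following sketch lists the source files that would count as integration tests. The function name `list_integration_tests` is hypothetical; it matches only on the `IT.java` suffix and knows nothing about how maven is configured.

```shell
# List files named ...IT.java under a directory (default: current directory).
list_integration_tests() {
  find "${1:-.}" -type f -name '*IT.java' | sort
}
```

For example, `list_integration_tests mongodb` would show the integration tests of one module only.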
See the [developers guide]({{ site.baseurl }}/develop/#contributing).
See the [developers guide]({{ site.baseurl }}/develop/#getting-started).
To enable tracing, add the following flags to the java command line:
-Dcalcite.debug=true -Djava.util.logging.config.file=core/src/test/resources/logging.properties
The first flag causes Calcite to print the Java code it generates (to execute queries) to stdout. It is especially useful if you are debugging mysterious problems like this:
Exception in thread "main" java.lang.ClassCastException: Integer cannot be cast to Long at Baz$1$1.current(Unknown Source)
The second flag specifies a config file for the java.util.logging framework. Put the following into core/src/test/resources/logging.properties:
{% highlight properties %}
handlers= java.util.logging.ConsoleHandler
.level= INFO
org.apache.calcite.plan.RelOptPlanner.level=FINER
java.util.logging.ConsoleHandler.level=ALL
{% endhighlight %}
The line `org.apache.calcite.plan.RelOptPlanner.level=FINER` tells the planner to produce fairly verbose output. You can modify the file to enable other loggers, or to change levels. For instance, if you change `FINER` to `FINEST` the planner will give you an account of the planning process so detailed that it might fill up your hard drive.
See the tutorial.
First, download and install Calcite, and install MongoDB.
Note: you can use MongoDB from the integration test virtual machine described above.
Import MongoDB's zipcode data set into MongoDB:
{% highlight bash %}
$ curl -o /tmp/zips.json http://media.mongodb.org/zips.json
$ mongoimport --db test --collection zips --file /tmp/zips.json
Tue Jun  4 16:24:14.190 check 9 29470
Tue Jun  4 16:24:14.469 imported 29470 objects
{% endhighlight %}
Log into MongoDB to check it's there:
{% highlight bash %}
$ mongo
MongoDB shell version: 2.4.3
connecting to: test
> db.zips.find().limit(3)
{ "city" : "ACMAR", "loc" : [ -86.51557, 33.584132 ], "pop" : 6055, "state" : "AL", "_id" : "35004" }
{ "city" : "ADAMSVILLE", "loc" : [ -86.959727, 33.588437 ], "pop" : 10616, "state" : "AL", "_id" : "35005" }
{ "city" : "ADGER", "loc" : [ -87.167455, 33.434277 ], "pop" : 3205, "state" : "AL", "_id" : "35006" }
> exit
bye
{% endhighlight %}
Connect using the [mongo-zips-model.json]({{ site.sourceRoot }}/mongodb/src/test/resources/mongo-zips-model.json) Calcite model:
{% highlight bash %}
$ ./sqlline
sqlline> !connect jdbc:calcite:model=mongodb/target/test-classes/mongo-zips-model.json admin admin
Connecting to jdbc:calcite:model=mongodb/target/test-classes/mongo-zips-model.json
Connected to: Calcite (version 1.x.x)
Driver: Calcite JDBC Driver (version 1.x.x)
Autocommit status: true
Transaction isolation: TRANSACTION_REPEATABLE_READ
sqlline> !tables
+------------+--------------+-----------------+---------------+
| TABLE_CAT  | TABLE_SCHEM  |   TABLE_NAME    |  TABLE_TYPE   |
+------------+--------------+-----------------+---------------+
| null       | mongo_raw    | zips            | TABLE         |
| null       | mongo_raw    | system.indexes  | TABLE         |
| null       | mongo        | ZIPS            | VIEW          |
| null       | metadata     | COLUMNS         | SYSTEM_TABLE  |
| null       | metadata     | TABLES          | SYSTEM_TABLE  |
+------------+--------------+-----------------+---------------+
sqlline> select count(*) from zips;
+---------+
| EXPR$0  |
+---------+
| 29467   |
+---------+
1 row selected (0.746 seconds)
sqlline> !quit
Closing: org.apache.calcite.jdbc.FactoryJdbc41$CalciteConnectionJdbc41
$
{% endhighlight %}
To run the test suite and sample queries against Splunk, load Splunk's `tutorialdata.zip` data set as described in the Splunk tutorial.

(This step is optional, but it provides some interesting data for the sample queries. It is also necessary if you intend to run the test suite, using `-Dcalcite.test.splunk=true`.)
New adapters can be created by implementing `CalcitePrepare.Context`:
{% highlight java %}
import org.apache.calcite.adapter.java.JavaTypeFactory;
import org.apache.calcite.jdbc.CalcitePrepare;
import org.apache.calcite.jdbc.CalciteSchema;

public class AdapterContext implements CalcitePrepare.Context {
  @Override
  public JavaTypeFactory getTypeFactory() {
    // adapter implementation
    return typeFactory;
  }

  @Override
  public CalciteSchema getRootSchema() {
    // adapter implementation
    return rootSchema;
  }
}
{% endhighlight %}
The example below shows how a SQL query can be submitted to `CalcitePrepare` with a custom context (`AdapterContext` in this case). Calcite prepares and implements the query execution, using the resources provided by the `Context`. `CalcitePrepare.PrepareResult` provides access to the underlying enumerable and methods for enumeration. The enumerable itself can naturally be some adapter-specific implementation.
{% highlight java %}
import org.apache.calcite.jdbc.CalcitePrepare;
import org.apache.calcite.prepare.CalcitePrepareImpl;
import org.junit.Test;

public class AdapterContextTest {
  @Test
  public void testSelectAllFromTable() {
    AdapterContext ctx = new AdapterContext();
    String sql = "SELECT * FROM TABLENAME";
    Class elementType = Object[].class;
    CalcitePrepare.PrepareResult prepared =
        new CalcitePrepareImpl().prepareSql(ctx, sql, null, elementType, -1);
    Object enumerable = prepared.getExecutable();
    // etc.
  }
}
{% endhighlight %}
The following sections might be of interest if you are adding features to particular parts of the code base. You don't need to understand these topics if you are just building from source and running tests.
When Calcite compares types (instances of `RelDataType`), it requires them to be the same object. If there are two distinct type instances that refer to the same Java type, Calcite may fail to recognize that they match. It is recommended to use a single instance of `JavaTypeFactory` within the calcite context.

Calcite's Avatica Server component supports RPC serialization using Protocol Buffers. In the context of Avatica, Protocol Buffers can generate a collection of messages defined by a schema. The library itself can parse old serialized messages using a new schema. This is highly desirable in an environment where the client and server are not guaranteed to have the same version of objects.
Typically, the code generated by the Protocol Buffers library needs to be re-generated only when the schema changes, not on every build.
First, install Protobuf 3.0:
{% highlight bash %}
$ wget https://github.com/google/protobuf/releases/download/v3.0.0-beta-1/protobuf-java-3.0.0-beta-1.tar.gz
$ tar xf protobuf-java-3.0.0-beta-1.tar.gz && cd protobuf-3.0.0-beta-1
$ ./configure
$ make
$ sudo make install
{% endhighlight %}
Then, re-generate the compiled code:
{% highlight bash %}
$ cd avatica
$ ./src/main/scripts/generate-protobuf.sh
{% endhighlight %}
The following sections are of interest to Calcite committers and in particular release managers.
Follow instructions here to create a key pair. (On Mac OS X, I did `brew install gpg` and `gpg --gen-key`.)

Add your public key to the `KEYS` file by following instructions in the `KEYS` file.
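Before you start:

* Make sure the build and tests succeed with `-Dcalcite.test.db=hsqldb` (the default).

{% highlight bash %}
# Set the passphrase variable without putting it into shell history
read -s GPG_PASSPHRASE

# List any junk files in the sandbox (dry run), then clean the build
git clean -xn
mvn clean

mvn -Papache-release -Dgpg.passphrase="${GPG_PASSPHRASE}" install
{% endhighlight %}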
When the dry-run has succeeded, change `install` to `deploy`.
Before you start:

* Check that `README` and `site/_docs/howto.md` have the correct version number.
* Set `version.major` and `version.minor` in `pom.xml`.
* Trigger a Coverity scan by merging the latest code into the `julianhyde/coverity_scan` branch, and when it completes, make sure that there are no important issues.

Create a release branch named after the release, e.g. `branch-1.1`, and push it to Apache.
{% highlight bash %}
$ git checkout -b branch-X.Y
$ git push -u origin branch-X.Y
{% endhighlight %}
We will use the branch for the entire release process. Meanwhile, we do not allow commits to the master branch. After the release is final, we can use `git merge --ff-only` to append the changes on the release branch onto the master branch. (Apache does not allow reverts to the master branch, which makes it difficult to clean up the kind of messy commits that inevitably happen while you are trying to finalize a release.)
Now, set up your environment and do a dry run. The dry run will not commit any changes back to git and gives you the opportunity to verify that the release process will complete as expected.
If any of the steps fail, clean up (see below), fix the problem, and start again from the top.
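{% highlight bash %}
# Set the passphrase variable without putting it into shell history
read -s GPG_PASSPHRASE

# List any junk files in the sandbox (dry run), then clean the build
git clean -xn
mvn clean

# Do a dry run of the release:prepare step
mvn -DdryRun=true -DskipTests -DreleaseVersion=X.Y.Z -DdevelopmentVersion=X.Y.Z+1-SNAPSHOT -Papache-release -Darguments="-Dgpg.passphrase=${GPG_PASSPHRASE}" release:prepare 2>&1 | tee /tmp/prepare-dry.log
{% endhighlight %}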
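In the command above, `X.Y.Z+1-SNAPSHOT` can be read as the release version with its last component incremented, so that `1.6.0` would be followed by `1.6.1-SNAPSHOT`. That reading is an assumption; adjust it if the project bumps the minor version instead. A small helper along those lines, with the hypothetical name `next_dev_version`, might look like this:

```shell
# Print the development version that follows a release version X.Y.Z.
# Assumes three numeric components without leading zeros.
next_dev_version() {
  local release="$1" major minor patch rest part
  # Split on dots; anything beyond three components ends up in "rest".
  IFS=. read -r major minor patch rest <<EOF
$release
EOF
  for part in "$major" "$minor" "$patch"; do
    case "$part" in
      ''|*[!0-9]*) echo "expected X.Y.Z, got '$release'" >&2; return 1 ;;
    esac
  done
  if [ -n "$rest" ]; then
    echo "expected X.Y.Z, got '$release'" >&2
    return 1
  fi
  echo "$major.$minor.$((patch + 1))-SNAPSHOT"
}
```

The result could then be passed on the command line, for example `-DdevelopmentVersion="$(next_dev_version 1.6.0)"`.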
Check the artifacts:

* The `target` directory should contain, among others, 8 files whose names start with `apache-calcite-`.
* In the source distros `.tar.gz` and `.zip` (currently there is no binary distro), check that all files belong to a directory called `apache-calcite-X.Y.Z-src`.
* That directory must contain the files `NOTICE`, `LICENSE`, `README`, `README.md`.
  * Check that the version in `README` is correct.
* In each .jar (for example `core/target/calcite-core-X.Y.Z.jar` and `mongodb/target/calcite-mongodb-X.Y.Z-sources.jar`), check that the `META-INF` directory contains `DEPENDENCIES`, `LICENSE`, `NOTICE` and `git.properties`.
* In each .jar, check that `org-apache-calcite-jdbc.properties` is present and does not contain un-substituted `${...}` variables.

Now, remove the `-DdryRun` flag and run the release for real.
{% highlight bash %}
# Prepare sets the version numbers, creates a tag, and pushes it to git
mvn -DdryRun=false -DskipTests -DreleaseVersion=X.Y.Z -DdevelopmentVersion=X.Y.Z+1-SNAPSHOT -Papache-release -Darguments="-Dgpg.passphrase=${GPG_PASSPHRASE}" release:prepare 2>&1 | tee /tmp/prepare.log

# Perform checks out the tagged version, builds, and deploys to the staging repository
mvn -DskipTests -Papache-release -Darguments="-Dgpg.passphrase=${GPG_PASSPHRASE}" release:perform 2>&1 | tee /tmp/perform.log
{% endhighlight %}
Verify the staged artifacts in the Nexus repository:

* Under `Build Promotion`, click `Staging Repositories`.
* In the `Staging Repositories` tab there should be a line with profile `org.apache.calcite`.
Upload the artifacts via subversion to a staging area, https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-X.Y.Z-rcN:
{% highlight bash %}
# Create a subversion workspace, if you haven't already
mkdir -p ~/dist/dev
pushd ~/dist/dev
svn co https://dist.apache.org/repos/dist/dev/calcite
popd

# Move the files into a directory
cd target
mkdir ~/dist/dev/calcite/apache-calcite-X.Y.Z-rcN
mv apache-calcite-* ~/dist/dev/calcite/apache-calcite-X.Y.Z-rcN

# Check in
cd ~/dist/dev/calcite
svn add apache-calcite-X.Y.Z-rcN
svn ci
{% endhighlight %}
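To clean up after a failed release attempt:

{% highlight bash %}
# Check whether the release tag already exists (e.g. from a failed attempt)
git tag

# If the tag exists, delete it locally and remotely
git tag -d apache-calcite-X.Y.Z
git push origin :refs/tags/apache-calcite-X.Y.Z

# Remove files generated by the release plugin
mvn release:clean

# Check whether there are modified files and, if so, go back to the original git commit
git status
git reset --hard HEAD
{% endhighlight %}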
To validate a release candidate, check its signing key and hashes:

{% highlight bash %}
# Check that the signing key is available (replace "key" with the key id)
gpg --recv-keys key

# Fetch the KEYS file
curl -O https://dist.apache.org/repos/dist/release/calcite/KEYS

# Check hashes, creating .md5 and .sha1 files where they are missing
function checkHash() {
  cd "$1"
  for i in *.{zip,pom,gz}; do
    if [ ! -f "$i" ]; then
      continue
    fi
    if [ -f "$i.md5" ]; then
      if [ "$(cat "$i.md5")" = "$(md5 -q "$i")" ]; then
        echo "$i.md5 present and correct"
      else
        echo "$i.md5 does not match"
      fi
    else
      md5 -q "$i" > "$i.md5"
      echo "$i.md5 created"
    fi
    if [ -f "$i.sha1" ]; then
      if [ "$(cat "$i.sha1")" = "$(sha1 -q "$i")" ]; then
        echo "$i.sha1 present and correct"
      else
        echo "$i.sha1 does not match"
      fi
    else
      sha1 -q "$i" > "$i.sha1"
      echo "$i.sha1 created"
    fi
  done
}
checkHash apache-calcite-X.Y.Z-rcN
{% endhighlight %}
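The `checkHash` function above relies on the `md5 -q` and `sha1 -q` commands. On systems that ship GNU coreutils instead, a similar check can be sketched with `md5sum` and `sha1sum`. This is an illustrative variant, not part of the release scripts: the name `verify_hash` is hypothetical, and unlike `checkHash` it only verifies an existing checksum file and never creates one.

```shell
# Verify FILE against FILE.md5 or FILE.sha1.
# Usage: verify_hash FILE ALGO   (ALGO is md5 or sha1)
# Returns 0 on match, 1 on mismatch, 2 on a missing file or bad algorithm.
verify_hash() {
  local file="$1" algo="$2" expected actual
  case "$algo" in
    md5|sha1) ;;
    *) echo "unsupported algorithm: $algo" >&2; return 2 ;;
  esac
  if [ ! -f "$file.$algo" ]; then
    echo "$file.$algo missing"
    return 2
  fi
  # Use the first field only, so both a bare hash and the
  # "hash  filename" layout are accepted.
  expected="$(awk '{ print $1; exit }' "$file.$algo")"
  actual="$("${algo}sum" "$file" | awk '{ print $1 }')"
  if [ "$expected" = "$actual" ]; then
    echo "$file.$algo present and correct"
  else
    echo "$file.$algo does not match"
    return 1
  fi
}
```

Inside a release candidate directory this could be applied to each artifact, for example `for f in *.zip *.gz; do verify_hash "$f" sha1; done`.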
Release vote on dev list
{% highlight text %}
To: dev@calcite.apache.org
Subject: [VOTE] Release apache-calcite-X.Y.Z (release candidate N)

Hi all,

I have created a build for Apache Calcite X.Y.Z, release candidate N.

Thanks to everyone who has contributed to this release.
You can read the release notes here:
https://github.com/apache/calcite/blob/XXXX/site/_docs/history.md

The commit to be voted upon:
http://git-wip-us.apache.org/repos/asf/calcite/commit/NNNNNN

Its hash is XXXX.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/calcite/apache-calcite-X.Y.Z-rcN/

The hashes of the artifacts are as follows:
src.tar.gz.md5 XXXX
src.tar.gz.sha1 XXXX
src.zip.md5 XXXX
src.zip.sha1 XXXX

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachecalcite-NNNN

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/jhyde.asc

Please vote on releasing this package as Apache Calcite X.Y.Z.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Calcite X.Y.Z
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Here is my vote:

+1 (binding)

Julian
{% endhighlight %}
After the vote finishes, send out the result:
{% highlight text %}
Subject: [RESULT] [VOTE] Release apache-calcite-X.Y.Z (release candidate N)
To: dev@calcite.apache.org

Thanks to everyone who has tested the release candidate and given
their comments and votes.

The tally is as follows.

N binding +1s:

N non-binding +1s:

No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache Calcite X.Y.Z has passed.

Thanks everyone. We'll now roll the release out to the mirrors.

There was some feedback during voting. I shall open a separate
thread to discuss.

Julian
{% endhighlight %}
Use the Apache URL shortener to generate shortened URLs for the vote proposal and result emails. Examples: s.apache.org/calcite-1.2-vote and s.apache.org/calcite-1.2-result.
After a successful release vote, we need to push the release out to mirrors and perform some other tasks.
In JIRA, search for all issues resolved in this release, and do a bulk update changing their status to “Closed”, with a change comment “Resolved in release X.Y.Z (YYYY-MM-DD)” (fill in release number and date appropriately). Uncheck “Send mail for this update”.
Promote the staged nexus artifacts.
Check the artifacts into svn.
{% highlight bash %}
# Get the release candidate
mkdir -p ~/dist/dev
cd ~/dist/dev
svn co https://dist.apache.org/repos/dist/dev/calcite

# Copy the artifacts. Note that the copy does not have the '-rcN' suffix.
mkdir -p ~/dist/release
cd ~/dist/release
svn co https://dist.apache.org/repos/dist/release/calcite
cd calcite
cp -rp ../../dev/calcite/apache-calcite-X.Y.Z-rcN apache-calcite-X.Y.Z
svn add apache-calcite-X.Y.Z

# Check in
svn ci
{% endhighlight %}
Svnpubsub will publish to https://dist.apache.org/repos/dist/release/calcite and propagate to http://www.apache.org/dyn/closer.cgi/calcite within 24 hours.
If there are now more than 2 releases, clear out the oldest ones:
{% highlight bash %}
cd ~/dist/release/calcite
svn rm apache-calcite-X.Y.Z
svn ci
{% endhighlight %}
The old releases will remain available in the release archive.
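To decide which directories to remove, the release directories can be sorted by version number. The sketch below is one possible way to do that, assuming a `sort` that supports `-V` (version sort, as in GNU coreutils). The function name `releases_to_remove` is hypothetical, and its output should be reviewed before anything is passed to `svn rm`.

```shell
# Read release directory names on stdin, one per line, and print
# all of them except the two newest (by version order).
releases_to_remove() {
  sort -V | awk '
    { line[NR] = $0 }
    END { for (i = 1; i <= NR - 2; i++) print line[i] }
  '
}
```

A possible use is `ls -d apache-calcite-* | releases_to_remove` from within `~/dist/release/calcite`.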
Add a release note by copying [site/_posts/2015-11-10-release-1.5.0.md]({{ site.sourceRoot }}/site/_posts/2015-11-10-release-1.5.0.md), publish the site, and check that it appears in the contents in news.
{: #publish-the-web-site}
See instructions in [site/README.md]({{ site.sourceRoot }}/site/README.md).