
---
layout: dev
title: Develop JDBC Data Source
categories: development
permalink: /development/datasource_sdk.html
---

Available since Apache Kylin v2.6.0

Data source SDK

Since v2.6.0, Apache Kylin provides a new data source framework, the Data source SDK. It offers APIs that help developers handle dialect differences and implement a new data source easily.

How to develop

Configuration to implement a new data source

The Data source SDK provides a conversion framework and ships a predefined configuration file, default.xml, for the ANSI SQL dialect.

No coding is required: developers only need to create a new configuration file, {dialect}.xml, for the new data source dialect.

Structure of the configuration:

Root node:

<DATASOURCE_DEF NAME="kylin" ID="mysql" DIALECT="mysql"/>

The value of ID is normally the same as the name of the configuration file.
The value of DIALECT mainly determines the quote characters used for database identifiers.
For example, MySQL uses `` and Microsoft SQL Server uses [].
The mapping between Kylin DIALECT and Apache Calcite dialect is as follows:

Property node: Define the properties of the dialect.

Function node: Developers can define how a function is implemented in the target data source dialect.
For example, suppose we want to use Greenplum as a data source. Greenplum does not support functions such as TIMESTAMPDIFF, so we can define the following in greenplum.xml:

<FUNCTION_DEF ID="64" EXPRESSION="(CAST($1 AS DATE) - CAST($0 AS DATE))"/>

Contrast this with the configuration in default.xml:

<FUNCTION_DEF ID="64" EXPRESSION="TIMESTAMPDIFF(day, $0, $1)"/>

The Data source SDK converts a function from the default dialect to the target dialect by matching entries that share the same function ID.
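The ID-based lookup and the $0, $1 placeholders can be illustrated with a minimal Java sketch. It is not Kylin's implementation: the class `FunctionTemplateSketch`, its `render` method, and the in-memory maps are hypothetical names introduced for illustration, and the two templates are copied from the default.xml and greenplum.xml examples above.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not Kylin code: render a FUNCTION_DEF expression
// for a given function ID by substituting $0, $1, ... with argument SQL.
class FunctionTemplateSketch {
    // Templates taken from the examples in this page, keyed by function ID.
    static final Map<String, String> DEFAULT_DIALECT =
            Map.of("64", "TIMESTAMPDIFF(day, $0, $1)");
    static final Map<String, String> GREENPLUM =
            Map.of("64", "(CAST($1 AS DATE) - CAST($0 AS DATE))");

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$(\\d+)");

    static String render(Map<String, String> dialect, String functionId, String... args) {
        String template = dialect.get(functionId);
        if (template == null) {
            throw new IllegalArgumentException("No FUNCTION_DEF with ID " + functionId);
        }
        // Single pass over the template so substituted text is never re-scanned.
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder sql = new StringBuilder();
        while (m.find()) {
            int index = Integer.parseInt(m.group(1));
            if (index >= args.length) {
                throw new IllegalArgumentException("Missing argument $" + index);
            }
            m.appendReplacement(sql, Matcher.quoteReplacement(args[index]));
        }
        m.appendTail(sql);
        return sql.toString();
    }

    public static void main(String[] args) {
        // Same function ID, two dialects, two different SQL expressions.
        System.out.println(render(DEFAULT_DIALECT, "64", "start_ts", "end_ts"));
        System.out.println(render(GREENPLUM, "64", "start_ts", "end_ts"));
    }
}
```

With the arguments `start_ts` and `end_ts`, the default template yields `TIMESTAMPDIFF(day, start_ts, end_ts)` and the Greenplum template yields `(CAST(end_ts AS DATE) - CAST(start_ts AS DATE))`.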

Type node: Developers can define how a type is expressed in the target data source dialect. Taking Greenplum as an example again, Greenplum supports BIGINT instead of LONG, so we can define the following in greenplum.xml:

<TYPE_DEF ID="Long" EXPRESSION="BIGINT"/>

Contrast this with the configuration in default.xml:

<TYPE_DEF ID="Long" EXPRESSION="LONG"/>

The Data source SDK converts a type from the default dialect to the target dialect by matching entries that share the same type ID.
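The type conversion can be pictured as a join on the type ID, as in the following Java sketch. The class `TypeMappingSketch` and its `convert` method are hypothetical, and leaving a type unchanged when no matching ID is found is an assumption of this sketch, not documented SDK behavior.

```java
import java.util.Map;

// Hypothetical sketch, not Kylin code: translate a type name from the
// default dialect to a target dialect through the shared TYPE_DEF ID.
class TypeMappingSketch {
    // Entries taken from the default.xml and greenplum.xml examples above.
    static final Map<String, String> DEFAULT_TYPES = Map.of("Long", "LONG");
    static final Map<String, String> GREENPLUM_TYPES = Map.of("Long", "BIGINT");

    static String convert(String defaultTypeName) {
        for (Map.Entry<String, String> entry : DEFAULT_TYPES.entrySet()) {
            if (entry.getValue().equalsIgnoreCase(defaultTypeName)) {
                // Found the ID in the default dialect; look up the same ID in the target.
                return GREENPLUM_TYPES.getOrDefault(entry.getKey(), defaultTypeName);
            }
        }
        // Assumption: a type without a matching ID is passed through unchanged.
        return defaultTypeName;
    }

    public static void main(String[] args) {
        System.out.println(convert("LONG"));
        System.out.println(convert("VARCHAR"));
    }
}
```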

Adaptor

Adaptor provides a list of APIs, for example for getting metadata and data from the data source. The Data source SDK provides a default implementation; developers can create a new class that extends it and supply their own implementation.
{% highlight Groff markup %}
org.apache.kylin.sdk.datasource.adaptor.DefaultAdaptor
{% endhighlight %}

Adaptor also reserves a function, fixSql(String sql).
If the SQL still does not fit the target dialect after it has passed through the conversion framework, developers can implement this function to apply final fixes to the SQL.
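As an illustration of such a final fix, the sketch below shows the body that a fixSql(String sql) implementation might have. It assumes a hypothetical target database that rejects trailing semicolons; that rule is invented for the example. The class `FixSqlSketch` is standalone so that it compiles without Kylin on the classpath; in a real adaptor the method would live in a class extending DefaultAdaptor, whose constructor and other members are not shown here.

```java
// Hypothetical sketch: a last-step SQL fix of the kind fixSql(String sql)
// is reserved for. The "no trailing semicolon" rule is an invented example.
class FixSqlSketch {
    // In a real adaptor this would be the fixSql method of a class that
    // extends DefaultAdaptor.
    public String fixSql(String sql) {
        if (sql == null) {
            return null;
        }
        String fixed = sql.trim();
        // Strip any trailing semicolons left after dialect conversion.
        while (fixed.endsWith(";")) {
            fixed = fixed.substring(0, fixed.length() - 1).trim();
        }
        return fixed;
    }

    public static void main(String[] args) {
        System.out.println(new FixSqlSketch().fixSql("SELECT 1 ; "));
    }
}
```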

How to enable data source for Kylin

Some new configurations:
{% highlight Groff markup %}
kylin.query.pushdown.runner-class-name=org.apache.kylin.query.pushdown.PushdownRunnerSDKImpl
kylin.source.default=16
kylin.source.jdbc.dialect={Dialect}
kylin.source.jdbc.adaptor={Class name of Adaptor}
kylin.source.jdbc.user={JDBC Connection Username}
kylin.source.jdbc.pass={JDBC Connection Password}
kylin.source.jdbc.connection-url={JDBC Connection String}
kylin.source.jdbc.driver={JDBC Driver Class Name}
{% endhighlight %}

Take MySQL as an example:
{% highlight Groff markup %}
kylin.query.pushdown.runner-class-name=org.apache.kylin.query.pushdown.PushdownRunnerSDKImpl
kylin.source.default=16
kylin.source.jdbc.dialect=mysql
kylin.source.jdbc.adaptor=org.apache.kylin.sdk.datasource.adaptor.MysqlAdaptor
kylin.source.jdbc.user={MYSQL_USERNAME}
kylin.source.jdbc.pass={MYSQL_PASSWORD}
kylin.source.jdbc.connection-url=jdbc:mysql://{HOST_URL}:3306/{DATABASE_NAME}
kylin.source.jdbc.driver=com.mysql.jdbc.Driver
{% endhighlight %}
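A quick way to catch a forgotten key is to parse the properties text and compare it with the keys listed above. The Java sketch below does this with `java.util.Properties`. The class `JdbcSourceConfigCheck` and the method `missingKeys` are hypothetical helpers, not part of Kylin, and treating every listed key as mandatory is an assumption of the sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Kylin: report which of the keys listed
// in this page are absent or empty in a properties fragment.
class JdbcSourceConfigCheck {
    // Assumption: every key listed above is treated as mandatory.
    static final List<String> EXPECTED_KEYS = List.of(
            "kylin.query.pushdown.runner-class-name",
            "kylin.source.default",
            "kylin.source.jdbc.dialect",
            "kylin.source.jdbc.adaptor",
            "kylin.source.jdbc.user",
            "kylin.source.jdbc.pass",
            "kylin.source.jdbc.connection-url",
            "kylin.source.jdbc.driver");

    static List<String> missingKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String fragment = "kylin.source.default=16\n"
                + "kylin.source.jdbc.dialect=mysql\n";
        // Prints the keys that still need to be filled in.
        System.out.println(missingKeys(fragment));
    }
}
```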

Put the configuration file {dialect}.xml under the directory $KYLIN_HOME/conf/datasource. Create a jar file for the new Adaptor and put it under the directory $KYLIN_HOME/ext.

The other configurations are identical to those of the former JDBC connection; please refer to setup_jdbc_datasource.