{% include JB/setup %}
JDBC interpreter lets you create a JDBC connection to any data sources seamlessly.
Inserts, Updates, and Upserts are applied immediately after running each statement.
By now, it has been tested with:
If you are using other databases not in the above list, please feel free to share your use case. It would be helpful to improve the functionality of JDBC interpreter.
First, click + Create
button at the top-right corner in the interpreter setting page.
Fill Interpreter name
field with whatever you want to use as the alias(e.g. mysql, mysql2, hive, redshift, and etc..). Please note that this alias will be used as %interpreter_name
to call the interpreter in the paragraph. Then select jdbc
as an Interpreter group
.
The default driver of JDBC interpreter is set as PostgreSQL
. It means Zeppelin includes PostgreSQL
driver jar in itself. So you don't need to add any dependencies(e.g. the artifact name or path for PostgreSQL
driver jar) for PostgreSQL
connection. The JDBC interpreter properties are defined by default like below.
If you want to connect other databases such as Mysql
, Redshift
and Hive
, you need to edit the property values. You can also use Credential for JDBC authentication. If default.user
and default.password
properties are deleted(using X button) for database connection in the interpreter setting page, the JDBC interpreter will get the account information from Credential.
The below example is for Mysql
connection.
The last step is Dependency Setting. Since Zeppelin only includes PostgreSQL
driver jar by default, you need to add each driver‘s maven coordinates or JDBC driver’s jar file path for the other databases.
That's it. You can find more JDBC connection setting examples(Mysql, MariaDB, Redshift, Apache Hive, Apache Phoenix, and Apache Tajo) in this section.
There are more JDBC interpreter properties you can specify like below.
You can also add more properties by using this method. For example, if a connection needs a schema parameter, it would have to add the property as follows:
To bind the interpreters created in the interpreter setting page, click the gear icon at the top-right corner.
Select(blue) or deselect(white) the interpreter buttons depending on your use cases. If you need to use more than one interpreter in the notebook, activate several buttons. Don't forget to click Save
button, or you will face Interpreter *** is not found
error.
To test whether your databases and Zeppelin are successfully connected or not, type %jdbc_interpreter_name
(e.g. %mysql
) at the top of the paragraph and run show databases
.
%jdbc_interpreter_name show databases
If the paragraph is FINISHED
without any errors, a new paragraph will be automatically added after the previous one with %jdbc_interpreter_name
. So you don‘t need to type this prefix in every paragraphs’ header.
You can leverage Zeppelin Dynamic Form inside your queries. You can use both the text input
and select form
parametrization features.
%jdbc_interpreter_name SELECT name, country, performer FROM demo.performers WHERE name='${performer=Sheryl Crow|Doof|Fanfarlo|Los Paranoia}'
You can set precode for each data source. Code runs once while opening the connection.
An example settings of interpreter for the two data sources, each of which has its precode parameter.
Test of execution precode for each data source.
%jdbc show search_path
Returns value of search_path
which is set in the default.precode.
%mysql select @v
Returns value of v
which is set in the mysql.precode.
Here are some examples you can refer to. Including the below connectors, you can connect every databases as long as it can be configured with it's JDBC driver.
Maven Repository: org.postgresql:postgresql
Maven Repository: mysql:mysql-connector-java
Maven Repository: org.mariadb.jdbc:mariadb-java-client
Maven Repository: com.amazonaws:aws-java-sdk-redshift
Apache Hive 1 JDBC Driver Docs Apache Hive 2 JDBC Driver Docs
Maven Repository : org.apache.hive:hive-jdbc
When Zeppelin server is running with authentication enabled, then the interpreter can utilize Hive's user proxy feature i.e. send extra parameter for creating and running a session (“hive.server2.proxy.user=”: “${loggedInUser}”). This is particularly useful when multiple users are sharing a notebook.
To enable this set following:
zeppelin.jdbc.auth.type
as SIMPLE
or KERBEROS
(if required) in the interpreter setting.${prefix}.proxy.user.property
as hive.server2.proxy.user
See User Impersonation in interpreter for more information.
Phoenix supports thick
and thin
connection types:
Use the appropriate default.driver
, default.url
, and the dependency artifact for your connection type.
Maven Repository: org.apache.phoenix:phoenix-core
Before Adding one of the below dependencies, check the Phoenix version first.
Maven Repository: org.apache.phoenix:phoenix-queryserver-client
Maven Repository: org.apache.tajo:tajo-jdbc
The JDBC interpreter also supports interpolation of ZeppelinContext
objects into the paragraph text. The following example shows one use of this facility:
####In Scala cell:
z.put("country_code", "KR") // ...
####In later JDBC cell:
%jdbc_interpreter_name select * from patents_list where priority_country = '{country_code}' and filing_date like '2015-%'
Object interpolation is disabled by default, and can be enabled for all instances of the JDBC interpreter by setting the value of the property zeppelin.jdbc.interpolation
to true
(see More Properties above). More details of this feature can be found in the Spark interpreter documentation under Zeppelin-Context
If you find a bug using JDBC interpreter, please create a JIRA ticket.