| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> |
| <concept id="impala_jdbc"> |
| |
| <title id="jdbc">Configuring Impala to Work with JDBC</title> |
| |
| <prolog> |
| <metadata> |
| <data name="Category" value="Impala"/> |
| <data name="Category" value="JDBC"/> |
| <data name="Category" value="Java"/> |
| <data name="Category" value="SQL"/> |
| <data name="Category" value="Querying"/> |
| <data name="Category" value="Configuring"/> |
| <data name="Category" value="Starting and Stopping"/> |
| <data name="Category" value="Developers"/> |
| </metadata> |
| </prolog> |
| |
| <conbody> |
| |
| <p> |
| <indexterm audience="hidden">JDBC</indexterm> |
| Impala supports the standard JDBC interface, allowing access from commercial Business |
| Intelligence tools and custom software written in Java or other programming languages. The |
| JDBC driver allows you to access Impala from a Java program that you write, or a Business |
| Intelligence or similar tool that uses JDBC to communicate with various database products. |
| </p> |
| |
| <p> |
| Setting up a JDBC connection to Impala involves the following steps: |
| </p> |
| |
| <ul> |
| <li> |
| Verifying the communication port where the Impala daemons in your cluster are listening |
| for incoming JDBC requests. |
| </li> |
| |
| <li> |
| Installing the JDBC driver on every system that runs the JDBC-enabled application. |
| </li> |
| |
| <li> |
| Specifying a connection string for the JDBC application to access one of the servers |
| running the <cmdname>impalad</cmdname> daemon, with the appropriate security settings. |
| </li> |
| </ul> |
| |
| <p outputclass="toc inpage"/> |
| |
| </conbody> |
| |
| <concept id="jdbc_port"> |
| |
| <title>Configuring the JDBC Port</title> |
| |
| <conbody> |
| |
| <p> |
| The following are the default ports that Impala server accepts JDBC connections through: |
| <simpletable frame="all" |
| relcolwidth="1.0* 1.03* 2.38*" id="simpletable_tr2_gnt_43b"> |
| |
| <strow> |
| |
| <stentry><b>Protocol</b> |
| |
| </stentry> |
| |
| <stentry><b>Default Port</b> |
| |
| </stentry> |
| |
| <stentry><b>Flag to Specify an Alternate Port</b> |
| |
| </stentry> |
| |
| </strow> |
| |
| <strow> |
| |
| <stentry>HTTP</stentry> |
| |
| <stentry>28000</stentry> |
| |
| <stentry><codeph>‑‑hs2_http_port</codeph> |
| |
| </stentry> |
| |
| </strow> |
| |
| <strow> |
| |
| <stentry>Binary TCP</stentry> |
| |
| <stentry>21050</stentry> |
| |
| <stentry><codeph>‑‑hs2_port</codeph> |
| |
| </stentry> |
| |
| </strow> |
| |
| </simpletable> |
| </p> |
| |
| <p> |
| Make sure the port for the protocol you are using is available for communication with |
| clients, for example, that it is not blocked by firewall software. |
| </p> |
| |
| <p> |
| If your JDBC client software connects to a different port, specify that alternative port |
| number with the flag in the above table when starting the <codeph>impalad</codeph>. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="jdbc_driver_choice"> |
| |
| <title>Choosing the JDBC Driver</title> |
| |
| <prolog> |
| <metadata> |
| <data name="Category" value="Planning"/> |
| </metadata> |
| </prolog> |
| |
| <conbody> |
| |
| <p> |
| In Impala 2.0 and later, you can use the Hive 0.13 JDBC driver. If you are already using |
| JDBC applications with an earlier Impala release, you should update your JDBC driver, |
| because the Hive 0.12 driver that was formerly the only choice is not compatible with |
| Impala 2.0 and later. |
| </p> |
| |
| <p> |
| The Hive JDBC driver provides a substantial speed increase for JDBC applications with |
| Impala 2.0 and higher, for queries that return large result sets. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="jdbc_setup"> |
| |
| <title>Enabling Impala JDBC Support on Client Systems</title> |
| |
| <prolog> |
| <metadata> |
| <data name="Category" value="Installing"/> |
| </metadata> |
| </prolog> |
| |
| <conbody> |
| |
| <section id="install_hive_driver"> |
| |
| <title>Using the Hive JDBC Driver</title> |
| |
| <p> |
| You install the Hive JDBC driver (<codeph>hive-jdbc</codeph> package) through the |
| Linux package manager, on hosts within the cluster. The driver consists of several |
| Java JAR files. The same driver can be used by Impala and Hive. |
| </p> |
| |
| <p> |
| To get the JAR files, install the Hive JDBC driver on each host in the cluster that |
| will run JDBC applications. |
| <!-- TODO: Find a URL to point to for instructions and downloads --> |
| </p> |
| |
| <note> |
| The latest JDBC driver, corresponding to Hive 0.13, provides substantial performance |
| improvements for Impala queries that return large result sets. Impala 2.0 and later |
| are compatible with the Hive 0.13 driver. If you already have an older JDBC driver |
| installed, and are running Impala 2.0 or higher, consider upgrading to the latest Hive |
| JDBC driver for best performance with JDBC applications. |
| </note> |
| |
| <p> |
| If you are using JDBC-enabled applications on hosts outside the cluster, you cannot |
| use the the same install procedure on the hosts. Install the JDBC driver on at least |
| one cluster host using the preceding procedure. Then download the JAR files to each |
| client machine that will use JDBC with Impala: |
| </p> |
| |
| <codeblock>commons-logging-X.X.X.jar |
| hadoop-common.jar |
| hive-common-X.XX.X.jar |
| hive-jdbc-X.XX.X.jar |
| hive-metastore-X.XX.X.jar |
| hive-service-X.XX.X.jar |
| httpclient-X.X.X.jar |
| httpcore-X.X.X.jar |
| libfb303-X.X.X.jar |
| libthrift-X.X.X.jar |
| log4j-X.X.XX.jar |
| slf4j-api-X.X.X.jar |
| slf4j-logXjXX-X.X.X.jar |
| </codeblock> |
| |
| <p> |
| <b>To enable JDBC support for Impala on the system where you run the JDBC |
| application:</b> |
| </p> |
| |
| <ol> |
| <li> |
| Download the JAR files listed above to each client machine. |
| <note> |
| For Maven users, see <xref keyref="Impala-JDBC-Example">this sample github |
| page</xref> for an example of the dependencies you could add to a |
| <codeph>pom</codeph> file instead of downloading the individual JARs. |
| </note> |
| </li> |
| |
| <li> |
| Store the JAR files in a location of your choosing, ideally a directory already |
| referenced in your <codeph>CLASSPATH</codeph> setting. For example: |
| <ul> |
| <li> |
| On Linux, you might use a location such as <codeph>/opt/jars/</codeph>. |
| </li> |
| |
| <li> |
| On Windows, you might use a subdirectory underneath <filepath>C:\Program |
| Files</filepath>. |
| </li> |
| </ul> |
| </li> |
| |
| <li> |
| To successfully load the Impala JDBC driver, client programs must be able to locate |
| the associated JAR files. This often means setting the <codeph>CLASSPATH</codeph> |
| for the client process to include the JARs. Consult the documentation for your JDBC |
| client for more details on how to install new JDBC drivers, but some examples of how |
| to set <codeph>CLASSPATH</codeph> variables include: |
| <ul> |
| <li> |
| On Linux, if you extracted the JARs to <codeph>/opt/jars/</codeph>, you might |
| issue the following command to prepend the JAR files path to an existing |
| classpath: |
| <codeblock>export CLASSPATH=/opt/jars/*.jar:$CLASSPATH</codeblock> |
| </li> |
| |
| <li> |
| On Windows, use the <b>System Properties</b> control panel item to modify the |
| <b>Environment Variables</b> for your system. Modify the environment variables |
| to include the path to which you extracted the files. |
| <note> |
| If the existing <codeph>CLASSPATH</codeph> on your client machine refers to |
| some older version of the Hive JARs, ensure that the new JARs are the first |
| ones listed. Either put the new JAR files earlier in the listings, or delete |
| the other references to Hive JAR files. |
| </note> |
| </li> |
| </ul> |
| </li> |
| </ol> |
| |
| </section> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="jdbc_connect"> |
| |
| <title>Establishing JDBC Connections</title> |
| |
| <conbody> |
| |
| <p> |
| The JDBC driver class depends on which driver you select. |
| </p> |
| |
| <note conref="../shared/impala_common.xml#common/proxy_jdbc_caveat"/> |
| |
| <section id="class_hive_driver"> |
| |
| <title>Using the Hive JDBC Driver</title> |
| |
| <p> |
| For example, with the Hive JDBC driver, the class name is |
| <codeph>org.apache.hive.jdbc.HiveDriver</codeph>. Once you have configured Impala to |
| work with JDBC, you can establish connections between the two. To do so for a cluster |
| that does not use Kerberos authentication, use a connection string of the form |
| <codeph>jdbc:hive2://<varname>host</varname>:<varname>port</varname>/;auth=noSasl</codeph>. |
| <!-- |
| Include the <codeph>auth=noSasl</codeph> argument |
| only when connecting to a non-Kerberos cluster; if Kerberos is enabled, omit the <codeph>auth</codeph> argument. |
| --> |
| For example, you might use: |
| </p> |
| |
| <codeblock>jdbc:hive2://myhost.example.com:21050/;auth=noSasl</codeblock> |
| |
| <p> |
| To connect to an instance of Impala that requires Kerberos authentication, use a |
| connection string of the form |
| <codeph>jdbc:hive2://<varname>host</varname>:<varname>port</varname>/;principal=<varname>principal_name</varname></codeph>. |
| The principal must be the same user principal you used when starting Impala. For |
| example, you might use: |
| </p> |
| |
| <codeblock>jdbc:hive2://myhost.example.com:21050/;principal=impala/myhost.example.com@H2.EXAMPLE.COM</codeblock> |
| |
| <p> |
| To connect to an instance of Impala that requires LDAP authentication, use a |
| connection string of the form |
| <codeph>jdbc:hive2://<varname>host</varname>:<varname>port</varname>/<varname>db_name</varname>;user=<varname>ldap_userid</varname>;password=<varname>ldap_password</varname></codeph>. |
| For example, you might use: |
| </p> |
| |
| <codeblock>jdbc:hive2://myhost.example.com:21050/test_db;user=fred;password=xyz123</codeblock> |
| |
| <p> |
| To connect to an instance of Impala over HTTP, specify the HTTP port, 28000 by |
| default, and <codeph>transportMode=http</codeph> in the connection string. For |
| example: |
| <codeblock>jdbc:hive2://myhost.example.com:28000/;transportMode=http</codeblock> |
| </p> |
| |
| <note> |
| <p conref="../shared/impala_common.xml#common/hive_jdbc_ssl_kerberos_caveat"/> |
| </note> |
| |
| </section> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept rev="2.3.0" id="jdbc_odbc_notes"> |
| |
| <title>Notes about JDBC and ODBC Interaction with Impala SQL Features</title> |
| |
| <conbody> |
| |
| <p> |
| Most Impala SQL features work equivalently through the <cmdname>impala-shell</cmdname> |
| interpreter of the JDBC or ODBC APIs. The following are some exceptions to keep in mind |
| when switching between the interactive shell and applications using the APIs: |
| </p> |
| |
| <ul> |
| <li> |
| <p conref="../shared/impala_common.xml#common/complex_types_blurb"/> |
| <ul> |
| <li> |
| <p> |
| Queries involving the complex types (<codeph>ARRAY</codeph>, |
| <codeph>STRUCT</codeph>, and <codeph>MAP</codeph>) require notation that might |
| not be available in all levels of JDBC and ODBC drivers. If you have trouble |
| querying such a table due to the driver level or inability to edit the queries |
| used by the application, you can create a view that exposes a <q>flattened</q> |
| version of the complex columns and point the application at the view. See |
| <xref href="impala_complex_types.xml#complex_types"/> for details. |
| </p> |
| </li> |
| |
| <li> |
| <p> |
| The complex types available in <keyword keyref="impala23_full"/> and higher are |
| supported by the JDBC <codeph>getColumns()</codeph> API. Both |
| <codeph>MAP</codeph> and <codeph>ARRAY</codeph> are reported as the JDBC SQL |
| Type <codeph>ARRAY</codeph>, because this is the closest matching Java SQL type. |
| This behavior is consistent with Hive. <codeph>STRUCT</codeph> types are |
| reported as the JDBC SQL Type <codeph>STRUCT</codeph>. |
| </p> |
| |
| <p> |
| To be consistent with Hive's behavior, the TYPE_NAME field is populated with the |
| primitive type name for scalar types, and with the full <codeph>toSql()</codeph> |
| for complex types. The resulting type names are somewhat inconsistent, because |
| nested types are printed differently than top-level types. For example, the |
| following list shows how <codeph>toSQL()</codeph> for Impala types are |
| translated to <codeph>TYPE_NAME</codeph> values: |
| <codeblock><![CDATA[DECIMAL(10,10) becomes DECIMAL |
| CHAR(10) becomes CHAR |
| VARCHAR(10) becomes VARCHAR |
| ARRAY<DECIMAL(10,10)> becomes ARRAY<DECIMAL(10,10)> |
| ARRAY<CHAR(10)> becomes ARRAY<CHAR(10)> |
| ARRAY<VARCHAR(10)> becomes ARRAY<VARCHAR(10)> |
| ]]> |
| </codeblock> |
| </p> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="jdbc_kudu"> |
| |
| <title>Kudu Considerations for DML Statements</title> |
| |
| <conbody> |
| |
| <p> |
| Currently, Impala <codeph>INSERT</codeph>, <codeph>UPDATE</codeph>, or other DML |
| statements issued through the JDBC interface against a Kudu table do not return JDBC |
| error codes for conditions such as duplicate primary key columns. Therefore, for |
| applications that issue a high volume of DML statements, prefer to use the Kudu Java API |
| directly rather than a JDBC application. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| </concept> |