| //// |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| //// |
| |
| Compatibility Notes |
| ------------------- |
| |
| Sqoop uses JDBC to connect to databases and adheres to |
| published standards as much as possible. For databases which do not |
| support standards-compliant SQL, Sqoop uses alternate codepaths to |
| provide functionality. In general, Sqoop is believed to be compatible |
| with a large number of databases, but it is tested with only a few. |
| |
| Nonetheless, several database-specific decisions were made in the |
| implementation of Sqoop, and some databases offer additional settings |
| which are extensions to the standard. |
| |
| This section describes the databases tested with Sqoop, any |
| exceptions in Sqoop's handling of each database relative to the |
| norm, and any database-specific settings available in Sqoop. |
| |
| Supported Databases |
| ~~~~~~~~~~~~~~~~~~~ |
| |
| While JDBC is a compatibility layer that allows a program to access |
| many different databases through a common API, slight differences in |
| the SQL language spoken by each database may mean that Sqoop can't use |
| every database out of the box, or that some databases may be used in |
| an inefficient manner. |
| |
| When you provide a connect string to Sqoop, it inspects the protocol scheme to |
| determine appropriate vendor-specific logic to use. If Sqoop knows about |
| a given database, it will work automatically. If not, you may need to |
| specify the driver class to load via +\--driver+. This will use a generic |
| code path which will use standard SQL to access the database. Sqoop provides |
| some databases with faster, non-JDBC-based access mechanisms. These can be |
| enabled by specfying the +\--direct+ parameter. |
| |
| Sqoop includes vendor-specific support for the following databases: |
| |
| [grid="all"] |
| `-----------`--------`--------------------`--------------------- |
| Database version +\--direct+ support? connect string matches |
| ---------------------------------------------------------------- |
| HSQLDB 1.8.0+ No +jdbc:hsqldb:*//+ |
| MySQL 5.0+ Yes +jdbc:mysql://+ |
| Oracle 10.2.0+ Yes +jdbc:oracle:*//+ |
| PostgreSQL 8.3+ Yes (import only) +jdbc:postgresql://+ |
| CUBRID 9.2+ NO +jdbc:cubrid:*+ |
| ---------------------------------------------------------------- |
| |
| Sqoop may work with older versions of the databases listed, but we have |
| only tested it with the versions specified above. |
| |
| Even if Sqoop supports a database internally, you may still need to |
| install the database vendor's JDBC driver in your +$SQOOP_HOME/lib+ |
| path on your client. Sqoop can load classes from any jars in |
| +$SQOOP_HOME/lib+ on the client and will use them as part of any |
| MapReduce jobs it runs; unlike older versions, you no longer need to |
| install JDBC jars in the Hadoop library path on your servers. |
| |
| MySQL |
| ~~~~~ |
| |
| JDBC Driver: http://www.mysql.com/downloads/connector/j/[MySQL |
| Connector/J] |
| |
| MySQL v5.0 and above offers very thorough coverage by Sqoop. Sqoop |
| has been tested with +mysql-connector-java-5.1.13-bin.jar+. |
| |
| zeroDateTimeBehavior |
| ^^^^^^^^^^^^^^^^^^^^ |
| |
| MySQL allows values of +\'0000-00-00\'+ for +DATE+ columns, which is a |
| non-standard extension to SQL. When communicated via JDBC, these |
| values are handled in one of three different ways: |
| |
| - Convert to +NULL+. |
| - Throw an exception in the client. |
| - Round to the nearest legal date (+\'0001-01-01\'+). |
| |
| You specify the behavior by using the +zeroDateTimeBehavior+ |
| property of the connect string. If a +zeroDateTimeBehavior+ property |
| is not specified, Sqoop uses the +convertToNull+ behavior. |
| |
| You can override this behavior. For example: |
| |
| ---- |
| $ sqoop import --table foo \ |
| --connect jdbc:mysql://db.example.com/someDb?zeroDateTimeBehavior=round |
| ---- |
| |
| +UNSIGNED+ columns |
| ^^^^^^^^^^^^^^^^^^ |
| |
| Columns with type +UNSIGNED+ in MySQL can hold values between 0 and |
| 2^32 (+4294967295+), but the database will report the data type to Sqoop |
| as +INTEGER+, which will can hold values between +-2147483648+ and |
| +\+2147483647+. Sqoop cannot currently import +UNSIGNED+ values above |
| +2147483647+. |
| |
| +BLOB+ and +CLOB+ columns |
| ^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Sqoop's direct mode does not support imports of +BLOB+, +CLOB+, or |
| +LONGVARBINARY+ columns. Use JDBC-based imports for these |
| columns; do not supply the +\--direct+ argument to the import tool. |
| |
| Importing views in direct mode |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Sqoop is currently not supporting import from view in direct mode. Use |
| JDBC based (non direct) mode in case that you need to import view (simply |
| omit +--direct+ parameter). |
| |
| |
| PostgreSQL |
| ~~~~~~~~~~ |
| |
| Sqoop supports JDBC-based connector for PostgreSQL: http://jdbc.postgresql.org/ |
| |
| The connector has been tested using JDBC driver version "9.1-903 JDBC 4" with |
| PostgreSQL server 9.1. |
| |
| Importing views in direct mode |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Sqoop is currently not supporting import from view in direct mode. Use |
| JDBC based (non direct) mode in case that you need to import view (simply |
| omit +--direct+ parameter). |
| |
| Oracle |
| ~~~~~~ |
| |
| JDBC Driver: |
| http://www.oracle.com/technology/software/tech/java/sqlj_jdbc/htdocs/jdbc_112010.html[Oracle |
| JDBC Thin Driver] - Sqoop is compatible with +ojdbc6.jar+. |
| |
| Sqoop has been tested with Oracle 10.2.0 Express Edition. Oracle is |
| notable in its different approach to SQL from the ANSI standard, and |
| its non-standard JDBC driver. Therefore, several features work |
| differently. |
| |
| Dates and Times |
| ^^^^^^^^^^^^^^^ |
| |
| Oracle JDBC represents +DATE+ and +TIME+ SQL types as +TIMESTAMP+ |
| values. Any +DATE+ columns in an Oracle database will be imported as a |
| +TIMESTAMP+ in Sqoop, and Sqoop-generated code will store these values |
| in +java.sql.Timestamp+ fields. |
| |
| When exporting data back to a database, Sqoop parses text fields as |
| +TIMESTAMP+ types (with the form +yyyy-mm-dd HH:MM:SS.ffffffff+) even |
| if you expect these fields to be formatted with the JDBC date escape |
| format of +yyyy-mm-dd+. Dates exported to Oracle should be formatted |
| as full timestamps. |
| |
| Oracle also includes the additional date/time types +TIMESTAMP WITH |
| TIMEZONE+ and +TIMESTAMP WITH LOCAL TIMEZONE+. To support these types, |
| the user's session timezone must be specified. By default, Sqoop will |
| specify the timezone +"GMT"+ to Oracle. You can override this setting |
| by specifying a Hadoop property +oracle.sessionTimeZone+ on the |
| command-line when running a Sqoop job. For example: |
| |
| ---- |
| $ sqoop import -D oracle.sessionTimeZone=America/Los_Angeles \ |
| --connect jdbc:oracle:thin:@//db.example.com/foo --table bar |
| ---- |
| |
| Note that Hadoop parameters (+-D ...+) are _generic arguments_ and |
| must appear before the tool-specific arguments (+\--connect+, |
| +\--table+, and so on). |
| |
| Legal values for the session timezone string are enumerated at |
| http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14225/applocaledata.htm#i637736[]. |
| |
| |
| include::hive-notes.txt[] |
| |
| CUBRID |
| ~~~~~~ |
| |
| Sqoop supports JDBC-based connector for Cubrid: http://www.cubrid.org/?mid=downloads&item=jdbc_driver |
| |
| The connector has been tested using JDBC driver version "JDBC-9.2.0.0155-cubrid.jar" with Cubrid 9.2. |