| |
| //// |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| //// |
| |
| |
| Importing Data Into HBase |
| ^^^^^^^^^^^^^^^^^^^^^^^^^ |
| |
| Sqoop supports additional import targets beyond HDFS and Hive. Sqoop |
| can also import records into a table in HBase. |
| |
| By specifying +\--hbase-table+, you instruct Sqoop to import |
| to a table in HBase rather than a directory in HDFS. Sqoop will |
| import data to the table specified as the argument to +\--hbase-table+. |
| Each row of the input table will be transformed into an HBase |
| +Put+ operation to a row of the output table. The key for each row is |
| taken from a column of the input. By default Sqoop will use the split-by |
| column as the row key column. If that is not specified, it will try to |
| identify the primary key column, if any, of the source table. You can |
| manually specify the row key column with +\--hbase-row-key+. Each output |
| column will be placed in the same column family, which must be specified |
| with +\--column-family+. |
| |
| NOTE: This function is incompatible with direct import (parameter |
| +\--direct+). |
| |
| If the input table has composite key, the +\--hbase-row-key+ must be |
| in the form of a comma-separated list of composite key attributes. |
| In this case, the row key for HBase row will be generated by combining |
| values of composite key attributes using underscore as a separator. |
| NOTE: Sqoop import for a table with composite key will work only if |
| parameter +\--hbase-row-key+ has been specified. |
| |
| If the target table and column family do not exist, the Sqoop job will |
| exit with an error. You should create the target table and column family |
| before running an import. If you specify +\--hbase-create-table+, Sqoop |
| will create the target table and column family if they do not exist, |
| using the default parameters from your HBase configuration. |
| |
| Sqoop currently serializes all values to HBase by converting each field |
| to its string representation (as if you were importing to HDFS in text |
| mode), and then inserts the UTF-8 bytes of this string in the target |
| cell. Sqoop will skip all rows containing null values in all columns |
| except the row key column. |
| |
| By default Sqoop will retain the previously imported value for columns |
| updated to null during incremental imports. This can be changed to |
| delete all previous versions of the column by using |
| +\--hbase-null-incremental-mode delete+. |
| |
| To decrease the load on hbase, Sqoop can do bulk loading as opposed to |
| direct writes. To use bulk loading, enable it using +\--hbase-bulkload+. |