| sqoop-import(1) |
| =============== |
| |
| NAME |
| ---- |
| sqoop-import - Import a table from an RDBMS to HDFS |
| |
| SYNOPSIS |
| -------- |
| 'sqoop-import' <generic-options> <tool-options> |
| |
| 'sqoop import' <generic-options> <tool-options> |
| |
| |
| DESCRIPTION |
| ----------- |
| |
| include::../user/import-purpose.txt[] |
| |
| OPTIONS |
| ------- |
| |
| The +--connect+ and +--table+ options are required. |
| |
| include::common-args.txt[] |
| |
| include::import-args.txt[] |
| |
| include::output-args.txt[] |
| |
| include::input-args.txt[] |
| |
| include::hive-args.txt[] |
| |
| include::hbase-args.txt[] |
| |
| include::codegen-args.txt[] |
| |
| |
| Incremental import options |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Sqoop can be configured to import only "new" data from the source database |
| using the arguments in this section. While any import job can make use of |
| these arguments, they are most powerful when used to initialize a recurring |
| job with +sqoop job --create ...+. After executing a saved job, the last |
| observed value of the check column is updated in the saved job. |
| |
| --incremental (mode):: |
| Specifies that this is an incremental import. Determines how Sqoop should |
| discover new data. "mode" may be +append+, in which case new rows are |
| expected to be added with increasing id values, or +lastmodified+, in which |
| case new data is discovered by comparing a timestamp column with the |
| timestamp at which the last import was performed. |
| |
| --check-column (col):: |
| Specifies a column whose value should be compared to the last imported |
| id or the last import timestamp to determine rows to import. |
| |
| --last-value (value):: |
| Specifies the most recent id imported, or the timestamp of the most recent |
| id. This argument is unnecessary for an initial import. |
| |
| |
| Database-specific options |
| ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Additional arguments may be passed to the database manager |
| after a lone '--' on the command-line. |
| |
| In MySQL direct mode, additional arguments are passed directly to |
| mysqldump. |
| |
| Note: When using MySQL direct mode, the MySQL bulk utilities |
| +mysqldump+ and +mysqlimport+ should be available on the task nodes and |
| present in the shell path of the task process. |
| |
| Note: When using PostgreSQL direct mode, the PostgreSQL client utility |
| +psql+ should be available on the task nodes and present in the shell path |
| of the task process. |
| |
| ENVIRONMENT |
| ----------- |
| |
| See 'sqoop(1)' |
| |
| //// |
| Copyright 2011 The Apache Software Foundation |
| |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| //// |
| |