| Title: Apache Accumulo Export/Import Example |
| Notice: Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| . |
| http://www.apache.org/licenses/LICENSE-2.0 |
| . |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| |
| Accumulo provides a mechanism to export and import tables. This README shows |
| how to use this feature. |
| |
| The shell session below shows creating a table, inserting data, and exporting |
| the table. A table must be offline to export it, and it should remain offline |
| for the duration of the distcp. An easy way to take a table offline without |
| interuppting access to it is to clone it and take the clone offline. |
| |
| root@test15> createtable table1 |
| root@test15 table1> insert a cf1 cq1 v1 |
| root@test15 table1> insert h cf1 cq1 v2 |
| root@test15 table1> insert z cf1 cq1 v3 |
| root@test15 table1> insert z cf1 cq2 v4 |
| root@test15 table1> addsplits -t table1 b r |
| root@test15 table1> scan |
| a cf1:cq1 [] v1 |
| h cf1:cq1 [] v2 |
| z cf1:cq1 [] v3 |
| z cf1:cq2 [] v4 |
| root@test15> config -t table1 -s table.split.threshold=100M |
| root@test15 table1> clonetable table1 table1_exp |
| root@test15 table1> offline table1_exp |
| root@test15 table1> exporttable -t table1_exp /tmp/table1_export |
| root@test15 table1> quit |
| |
| After executing the export command, a few files are created in the hdfs dir. |
| One of the files is a list of files to distcp as shown below. |
| |
| $ hadoop fs -ls /tmp/table1_export |
| Found 2 items |
| -rw-r--r-- 3 user supergroup 162 2012-07-25 09:56 /tmp/table1_export/distcp.txt |
| -rw-r--r-- 3 user supergroup 821 2012-07-25 09:56 /tmp/table1_export/exportMetadata.zip |
| $ hadoop fs -cat /tmp/table1_export/distcp.txt |
| hdfs://n1.example.com:6093/accumulo/tables/3/default_tablet/F0000000.rf |
| hdfs://n1.example.com:6093/tmp/table1_export/exportMetadata.zip |
| |
| Before the table can be imported, it must be copied using distcp. After the |
| discp completed, the cloned table may be deleted. |
| |
| $ hadoop distcp -f /tmp/table1_export/distcp.txt /tmp/table1_export_dest |
| |
| The Accumulo shell session below shows importing the table and inspecting it. |
| The data, splits, config, and logical time information for the table were |
| preserved. |
| |
| root@test15> importtable table1_copy /tmp/table1_export_dest |
| root@test15> table table1_copy |
| root@test15 table1_copy> scan |
| a cf1:cq1 [] v1 |
| h cf1:cq1 [] v2 |
| z cf1:cq1 [] v3 |
| z cf1:cq2 [] v4 |
| root@test15 table1_copy> getsplits -t table1_copy |
| b |
| r |
| root@test15> config -t table1_copy -f split |
| ---------+--------------------------+------------------------------------------- |
| SCOPE | NAME | VALUE |
| ---------+--------------------------+------------------------------------------- |
| default | table.split.threshold .. | 1G |
| table | @override ........... | 100M |
| ---------+--------------------------+------------------------------------------- |
| root@test15> tables -l |
| accumulo.metadata => !0 |
| accumulo.root => +r |
| table1_copy => 5 |
| trace => 1 |
| root@test15 table1_copy> scan -t accumulo.metadata -b 5 -c srv:time |
| 5;b srv:time [] M1343224500467 |
| 5;r srv:time [] M1343224500467 |
| 5< srv:time [] M1343224500467 |
| |
| |