| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Copyright 2002-2004 The Apache Software Foundation |
| |
| Licensed under the Apache License, Version 2.0 (the "License"); |
| you may not use this file except in compliance with the License. |
| You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <document xmlns="http://maven.apache.org/XDOC/2.0" |
| xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" |
| xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd"> |
| <head> |
| <title>DistCp</title> |
| </head> |
| <body> |
| <section name="Overview"> |
| <p> |
| DistCp (distributed copy) is a tool for large inter-cluster and |
| intra-cluster copying. It uses MapReduce to effect its distribution, error |
| handling and recovery, and reporting. It expands a list of files and |
| directories into input for map tasks, each of which copies a partition |
| of the files specified in the source list. |
| </p> |
| <p> |
| The legacy implementation of DistCp had its share of quirks and |
| drawbacks in usage, extensibility, and performance. The DistCp refactor |
| was undertaken to fix these shortcomings and to allow DistCp to be used |
| and extended programmatically. New paradigms have been introduced to |
| improve runtime and setup performance, while retaining the legacy |
| behaviour as the default. |
| </p> |
| <p> |
| This document describes the design of the new DistCp, its new |
| features, their optimal use, and any deviations from the legacy |
| implementation. |
| </p> |
| </section> |
| </body> |
| </document> |