| ~~ Licensed under the Apache License, Version 2.0 (the "License"); |
| ~~ you may not use this file except in compliance with the License. |
| ~~ You may obtain a copy of the License at |
| ~~ |
| ~~ http://www.apache.org/licenses/LICENSE-2.0 |
| ~~ |
| ~~ Unless required by applicable law or agreed to in writing, software |
| ~~ distributed under the License is distributed on an "AS IS" BASIS, |
| ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| ~~ See the License for the specific language governing permissions and |
| ~~ limitations under the License. See accompanying LICENSE file. |
| |
| --- |
| Apache Hadoop ${project.version} |
| --- |
| --- |
| ${maven.build.timestamp} |
| |
| Apache Hadoop ${project.version} |
| |
| Apache Hadoop ${project.version} consists of significant |
| improvements over the previous stable release (hadoop-1.x). |
| |
| Here is a short overview of the improvments to both HDFS and MapReduce. |
| |
| * {HDFS Federation} |
| |
| In order to scale the name service horizontally, federation uses multiple |
| independent Namenodes/Namespaces. The Namenodes are federated, that is, the |
| Namenodes are independent and don't require coordination with each other. |
| The datanodes are used as common storage for blocks by all the Namenodes. |
| Each datanode registers with all the Namenodes in the cluster. Datanodes |
| send periodic heartbeats and block reports and handles commands from the |
| Namenodes. |
| |
| More details are available in the |
| {{{./hadoop-project-dist/hadoop-hdfs/Federation.html}HDFS Federation}} |
| document. |
| |
| * {MapReduce NextGen aka YARN aka MRv2} |
| |
| The new architecture introduced in hadoop-0.23, divides the two major |
| functions of the JobTracker: resource management and job life-cycle management |
| into separate components. |
| |
| The new ResourceManager manages the global assignment of compute resources to |
| applications and the per-application ApplicationMaster manages the |
| application’s scheduling and coordination. |
| |
| An application is either a single job in the sense of classic MapReduce jobs |
| or a DAG of such jobs. |
| |
| The ResourceManager and per-machine NodeManager daemon, which manages the |
| user processes on that machine, form the computation fabric. |
| |
| The per-application ApplicationMaster is, in effect, a framework specific |
| library and is tasked with negotiating resources from the ResourceManager and |
| working with the NodeManager(s) to execute and monitor the tasks. |
| |
| More details are available in the |
| {{{./hadoop-yarn/hadoop-yarn-site/YARN.html}YARN}} |
| document. |
| |
| Getting Started |
| |
| The Hadoop documentation includes the information you need to get started using |
| Hadoop. Begin with the |
| {{{./hadoop-project-dist/hadoop-common/SingleCluster.html}Single Node Setup}} which |
| shows you how to set up a single-node Hadoop installation. Then move on to the |
| {{{./hadoop-project-dist/hadoop-common/ClusterSetup.html}Cluster Setup}} to learn how |
| to set up a multi-node Hadoop installation. |
| |
| |