| ~~ Licensed to the Apache Software Foundation (ASF) under one or more |
| ~~ contributor license agreements. See the NOTICE file distributed with |
| ~~ this work for additional information regarding copyright ownership. |
| ~~ The ASF licenses this file to You under the Apache License, Version 2.0 |
| ~~ (the "License"); you may not use this file except in compliance with |
| ~~ the License. You may obtain a copy of the License at |
| ~~ |
| ~~ http://www.apache.org/licenses/LICENSE-2.0 |
| ~~ |
| ~~ Unless required by applicable law or agreed to in writing, software |
| ~~ distributed under the License is distributed on an "AS IS" BASIS, |
| ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| ~~ See the License for the specific language governing permissions and |
| ~~ limitations under the License. |
| ~~ |
| Introduction |
| |
| The Apache Tez project aims to build an application framework for processing data as a complex directed acyclic graph (DAG) of tasks. It is currently built atop {{{http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html}Apache Hadoop YARN}}. |
| |
| The two main design themes for Tez are: |
| |
| * <<Empowering end users by providing:>> |
| |
| * Expressive dataflow definition APIs |
| |
| * A flexible Input-Processor-Output runtime model |
| |
| * A data-type-agnostic design |
| |
| * Simplified deployment |
| |
| |
| * <<Execution Performance>> |
| |
| * Performance gains over MapReduce |
| |
| * Optimal resource management |
| |
| * Plan reconfiguration at runtime |
| |
| * Dynamic physical data flow decisions |
| |
| [] |
| |
| By allowing projects like Apache Hive and Apache Pig to run a complex DAG of tasks, Tez can process in a single Tez job data that previously required multiple MapReduce jobs, as shown below. |
| |
| [./images/PigHiveQueryOnMR.png] Flow for a Hive or Pig Query on MapReduce |
| |
| [./images/PigHiveQueryOnTez.png] Flow for a Hive or Pig Query on Tez |
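
The idea of expressing a whole query as one DAG of tasks can be sketched in a few lines. The snippet below is only an illustration in plain Python using the standard-library <<<graphlib>>> module (Python 3.9+); it is not the Tez API, and the stage names and the helper <<<parallel_waves>>> are hypothetical. It groups the stages of a single graph into waves whose members have no unmet dependencies and could therefore run concurrently.

```python
from graphlib import TopologicalSorter

# Hypothetical query stages; each key maps to the set of stages it depends on.
# This is an illustration only and does not use any Tez classes.
query_dag = {
    "scan_orders": set(),
    "scan_customers": set(),
    "join": {"scan_orders", "scan_customers"},
    "aggregate": {"join"},
    "sort": {"aggregate"},
}


def execution_order(dag):
    """Return one valid order in which all stages of the DAG can run."""
    return list(TopologicalSorter(dag).static_order())


def parallel_waves(dag):
    """Group stages into waves; stages in the same wave have no unmet dependencies."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the result stable
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so dependents become ready
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(query_dag), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

In this sketch the two scans fall into the same wave, and the join, aggregate, and sort stages follow one after another, all within a single graph rather than a chain of separately submitted jobs.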
| |
| Disclaimer |
| |
| Apache Tez is an effort currently undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC. |
| |