commit	1fc00352ccae70d265649621aca22a5904ba771f
author	Raghav Aggarwal <raghavaggarwal03.ra@gmail.com>	Mon Jan 26 15:06:15 2026 +0530
committer	GitHub <noreply@github.com>	Mon Jan 26 10:36:15 2026 +0100
tree	30cedc7eb20f74ba745a84b2ad9dd66b0246b4c0
parent	ffd346b93ae325ace48e6c5c6697d5ca5a686dcf

TEZ-4654: Migrate from commons-lang2.x to commons-lang3.x (#441) (Raghav Aggarwal reviewed by Laszlo Bodor)

- Upgrade to commons-lang-3.19.0
- org.apache.commons.lang.ArrayUtils => org.apache.commons.lang3.ArrayUtils
- org.apache.commons.lang.RandomStringUtils => org.apache.commons.lang3.RandomStringUtils
- org.apache.commons.lang.StringEscapeUtils => org.apache.commons.lang3.StringEscapeUtils
- org.apache.commons.lang.StringUtils => org.apache.commons.lang3.StringUtils
- org.apache.commons.lang.SystemUtils => org.apache.commons.lang3.SystemUtils
- org.apache.commons.lang.exception.ExceptionUtils => org.apache.commons.lang3.exception.ExceptionUtils
- org.apache.commons.lang.mutable.MutableInt => org.apache.commons.lang3.mutable.MutableInt
- org.apache.commons.lang.NotImplementedException => org.apache.commons.lang3.NotImplementedException

README.md

Apache Tez

Apache Tez is a generic data-processing pipeline engine, envisioned as a low-level engine for higher-level abstractions such as Apache Hadoop MapReduce, Apache Pig, and Apache Hive.

At its heart, Tez is very simple and has just two components:

The data-processing pipeline engine, into which one can plug input, processing, and output implementations to perform arbitrary data processing. Every ‘task’ in Tez has the following:

Input to consume key/value pairs from.
Processor to process them.
Output to collect the processed key/value pairs.
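
The three parts of a task can be sketched in plain Java as follows. This is only an illustration of the input/processor/output split described above: the interfaces `KeyValueInput`, `KeyValueProcessor`, and `KeyValueOutput` and the class `TaskSketch` are hypothetical names invented for this example and are not the Tez runtime API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical interfaces for illustration only; these are NOT the Tez runtime API.
interface KeyValueInput<K, V> {
    Iterator<Map.Entry<K, V>> read();
}

interface KeyValueOutput<K, V> {
    void write(K key, V value);
}

interface KeyValueProcessor<KI, VI, KO, VO> {
    void process(KeyValueInput<KI, VI> input, KeyValueOutput<KO, VO> output);
}

final class TaskSketch {
    // Runs one "task": (index, word) pairs in, (UPPERCASED word, length) pairs out.
    static List<String> runUppercaseTask(List<String> words) {
        // Input: exposes the words as key/value pairs keyed by position.
        KeyValueInput<Integer, String> input = () -> {
            List<Map.Entry<Integer, String>> pairs = new ArrayList<>();
            for (int i = 0; i < words.size(); i++) {
                pairs.add(Map.entry(i, words.get(i)));
            }
            return pairs.iterator();
        };

        // Output: collects the processed pairs as "key=value" strings.
        List<String> collected = new ArrayList<>();
        KeyValueOutput<String, Integer> output = (k, v) -> collected.add(k + "=" + v);

        // Processor: consumes every input pair and emits one output pair.
        KeyValueProcessor<Integer, String, String, Integer> processor = (in, out) -> {
            Iterator<Map.Entry<Integer, String>> it = in.read();
            while (it.hasNext()) {
                String word = it.next().getValue();
                out.write(word.toUpperCase(Locale.ROOT), word.length());
            }
        };

        processor.process(input, output);
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(runUppercaseTask(List.of("tez", "dag"))); // prints [TEZ=3, DAG=3]
    }
}
```

Keeping the three roles behind separate interfaces means any one of them can be swapped without touching the other two, which is the pluggability the description above refers to.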

A master for the data-processing application, with which one can assemble the arbitrary data-processing ‘tasks’ described above into a task-DAG to process data as desired. The generic master is implemented as an Apache Hadoop YARN ApplicationMaster.
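
As a rough illustration of what assembling tasks into a task-DAG involves, the sketch below records producer/consumer edges between named tasks and derives an order in which every task runs after the tasks it depends on. `TaskDagSketch` and its methods are hypothetical names for this example; it is not the Tez DAG API and does not model a YARN ApplicationMaster, and the ordering uses Kahn's topological sort as an assumed, simplified stand-in for real scheduling.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified task-DAG; not the Tez DAG API.
final class TaskDagSketch {
    // For each task, the tasks that consume its output.
    private final Map<String, List<String>> downstream = new LinkedHashMap<>();
    // For each task, the number of producers it waits on.
    private final Map<String, Integer> pendingInputs = new LinkedHashMap<>();

    void addTask(String name) {
        downstream.putIfAbsent(name, new ArrayList<>());
        pendingInputs.putIfAbsent(name, 0);
    }

    // Declares that `to` consumes the output of `from`.
    void addEdge(String from, String to) {
        addTask(from);
        addTask(to);
        downstream.get(from).add(to);
        pendingInputs.merge(to, 1, Integer::sum);
    }

    // Returns an order in which every task follows all of its producers.
    List<String> executionOrder() {
        Map<String, Integer> remaining = new LinkedHashMap<>(pendingInputs);
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String task = ready.remove();
            order.add(task);
            for (String next : downstream.get(task)) {
                // A consumer becomes ready once its last producer has run.
                if (remaining.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != downstream.size()) {
            throw new IllegalStateException("cycle detected: the task graph is not a DAG");
        }
        return order;
    }

    public static void main(String[] args) {
        TaskDagSketch dag = new TaskDagSketch();
        dag.addEdge("tokenize", "count");
        dag.addEdge("count", "sort");
        System.out.println(dag.executionOrder()); // prints [tokenize, count, sort]
    }
}
```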