Apache Tez Change Log
=====================
Release 0.5.4: 2015-06-17
ALL CHANGES:
TEZ-2548. TezClient submitDAG can hang if the AM is in the process of shutting down.
TEZ-2533. AM deadlock on shutdown
TEZ-2537. mapreduce.map.env and mapreduce.reduce.env need to fall back to mapred.child.env for compatibility
TEZ-2304. InvalidStateTransitonException TA_SCHEDULE at START_WAIT during recovery
TEZ-2488. Tez AM crashes if a submitted DAG is configured to use invalid resource sizes.
TEZ-2080. Localclient should be using tezconf in init instead of yarnconf.
TEZ-2369. Add a few unit tests for RootInputInitializerManager. Backport a findbugs warning fix from master.
TEZ-2379. org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: T_ATTEMPT_KILLED at KILLED.
TEZ-2397. Translation of LocalResources via Tez plan serialization can be lossy.
TEZ-2221. VertexGroup name should be unique
TEZ-1521. VertexDataMovementEventsGeneratedEvent may be logged twice in recovery log
TEZ-1560. Invalid state machine handling for V_SOURCE_VERTEX_RECOVERED in recovery.
TEZ-2348. EOF exception during UnorderedKVReader.next().
TEZ-2305. MR compatibility sleep job fails with IOException: Undefined job output-path
TEZ-2303. ConcurrentModificationException while processing recovery.
TEZ-2334. ContainerManagementProtocolProxy modifies IPC timeout conf without making a copy.
TEZ-2317. Event processing backlog can result in task failures for short tasks
TEZ-2289. ATSHistoryLoggingService can generate ArrayOutOfBoundsException.
TEZ-2257. Fix potential NPEs in TaskReporter.
TEZ-2192. Relocalization does not check for source.
TEZ-2224. EventQueue empty doesn't mean events are consumed in RecoveryService
TEZ-2240. Fix toUpperCase/toLowerCase to use Locale.ENGLISH.
TEZ-2238. TestContainerReuse flaky
TEZ-2217. The min-held-containers being released prematurely
TEZ-2214. FetcherOrderedGrouped can get stuck indefinitely when MergeManager misses memToDiskMerging
TEZ-1923. FetcherOrderedGrouped gets into infinite loop due to memory pressure
TEZ-2219. Should verify the input_name/output_name to be unique per vertex
TEZ-2186. OOM with a simple scatter gather job with re-use
TEZ-2220. TestTezJobs compile failure in branch 0.5.
TEZ-2199. updateLocalResourcesForInputSplits assumes wrongly that split data is on same FS as the default FS.
TEZ-2162. org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat is not recognized
TEZ-2193. Check returned value from EdgeManagerPlugin before using it
TEZ-2133. Secured Impersonation: Failed to delete tez scratch data dir
TEZ-2058. Flaky test: TestTezJobs::testInvalidQueueSubmission.
TEZ-2037. Should log TaskAttemptFinishedEvent if taskattempt is recovered to KILLED.
TEZ-2071. TestAMRecovery should set test names for test DAGs.
TEZ-1928. Tez local mode hang in Pig tez local mode.
TEZ-1893. Verify invalid -1 parallelism in DAG.verify().
TEZ-900. Confusing message for incorrect queue for some tez examples.
TEZ-2036. OneToOneEdgeManager should enforce that source and destination tasks have same number
TEZ-1895. Vertex reRunning should decrease successfulMembers of VertexGroupInfo.
TEZ-2020. For 1-1 edge vertex configured event may be sent incorrectly
TEZ-2015. VertexImpl.doneReconfiguringVertex() should check other criteria before sending notification
TEZ-2011. InputReadyVertexManager not resilient to updates in parallelism
TEZ-1934. TestAMRecovery may fail because the execution order is not deterministic.
TEZ-1642. TestAMRecovery sometimes fails.
TEZ-1931. Publish tez version info to Timeline.
TEZ-1942. Number of tasks shown in Tez UI with auto-reduce parallelism is misleading.
TEZ-1962. Fix a thread leak in LocalMode.
TEZ-1924. Tez AM does not register with AM with full FQDN causing jobs to fail in some environments.
TEZ-1878. Task-specific log level override not working in certain conditions.
TEZ-1775. Allow setting log level per logger.
TEZ-1851. FileSystem counters do not differentiate between different FileSystems.
TEZ-1852. Get examples to work in LocalMode.
TEZ-1861. Fix failing test: TestOnFileSortedOutput.
TEZ-1836. Provide better error messages when tez.runtime.io.sort.mb, spill percentage is incorrectly configured.
TEZ-1800. Integer overflow in ExternalSorter.getInitialMemoryRequirement()
TEZ-1949. Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges.
Release 0.5.3: 2014-12-10
ALL CHANGES:
TEZ-1758. TezClient should provide YARN diagnostics when the AM crashes
TEZ-1742. Improve response time of internal preemption
TEZ-1745. TestATSHistoryLoggingService::testATSHistoryLoggingServiceShutdown can be flaky.
TEZ-1747. Increase test timeout for TestSecureShuffle.
TEZ-1746. Flaky test in TestVertexImpl and TestExceptionPropagation.
TEZ-1749. Increase test timeout for TestLocalMode.testMultipleClientsWithSession
TEZ-1750. Add a DAGScheduler which schedules tasks only when sources have been scheduled.
TEZ-1761. TestRecoveryParser::testGetLastInProgressDAG fails in similar manner to TEZ-1686.
TEZ-1770. Handle ConnectExceptions correctly when establishing connections to an NM which may be down.
TEZ-1774. AppLaunched event for Timeline does not have start time set.
TEZ-1780. tez-api is missing jersey dependencies.
TEZ-1796. Use of DeprecationDelta broke build against 2.2 Hadoop.
TEZ-1818. Problem loading tez-api-version-info.properties in case current context classloader is not pointing to Tez jars.
TEZ-1808. Job can fail since name of intermediate files can be too long in specific situation.
Release 0.5.2: 2014-11-07
INCOMPATIBLE CHANGES
TEZ-1666. UserPayload should be null if the payload is not specified.
0.5.1 client cannot talk to 0.5.2 AMs (TEZ-1666 and TEZ-1664).
context.getUserPayload can now return null, apps may need to add defensive code.
TEZ-1699. Vertex.setParallelism should throw an exception for invalid invocations
TEZ-1700. Replace containerId from TaskLocationHint with [TaskIndex+Vertex] based affinity
ALL CHANGES:
TEZ-1620. Wait for application finish before stopping MiniTezCluster
TEZ-1621. Should report error to AM before shutting down TezChild
TEZ-1634. Fix compressed IFile shuffle errors
TEZ-1648. Update website after 0.5.1
TEZ-1614. Use setFromConfiguration() in SortMergeJoinExample to demonstrate the usage
TEZ-1641. Add debug logs in VertexManager to help debugging custom VertexManagerPlugins
TEZ-1645. Add support for specifying additional local resources via config.
TEZ-1646. Add support for augmenting classpath via configs.
TEZ-1647. Issue with caching of events in VertexManager::onRootVertexInitialized.
TEZ-1470. Recovery fails due to TaskAttemptFinishedEvent being recorded multiple times for the same task.
TEZ-1649. ShuffleVertexManager auto reduce parallelism can cause jobs to hang indefinitely.
TEZ-1566. Reduce log verbosity.
TEZ-1083. Enable IFile RLE for DefaultSorter.
TEZ-1637. Improved shuffle error handling across NM restarts.
TEZ-1479. Disambiguate (refactor) between ShuffleInputEventHandlers and Fetchers.
TEZ-1665. DAGScheduler should provide a priority range instead of an exact priority
TEZ-1632. NPE at TestPreemption.testPreemptionWithoutSession
TEZ-1674. Rename configuration parameters related to counters / memory scaling.
TEZ-1176. Set parallelism should end up sending an update to ATS if numTasks are updated at run-time.
TEZ-1658. Additional data generation to Timeline for UI.
TEZ-1676. Fix failing test in TestHistoryEventTimelineConversion.
TEZ-1673. Update the default value for allowed node failures, num events per heartbeat and counter update interval.
TEZ-1462. Remove unnecessary SuppressWarnings.
TEZ-1633. Fixed expected values in TestTaskRecovery.testRecovery_OneTAStarted.
TEZ-1669. yarn-swimlanes.sh throws error post TEZ-1556.
TEZ-1682. Tez AM hangs at times when there are task failures.
TEZ-1683. Do ugi::getGroups only when necessary when checking ACLs.
TEZ-1584. Restore counters from DAGFinishedEvent when DAG is completed.
TEZ-1525. BroadcastLoadGen testcase.
TEZ-1686. TestRecoveryParser.testGetLastCompletedDAG fails sometimes
TEZ-1667. Add a system test for InitializerEvents.
TEZ-1668. InputInitializers should be able to register for Vertex state updates in the constructor.
TEZ-1656. Grouping of splits should maintain the original ordering of splits within a group
TEZ-1396. Grouping should generate consistent groups when given the same set of splits
TEZ-1210. TezClientUtils.localizeDagPlanAsText() needs to be fixed for session mode
TEZ-1629. Replace ThreadPool's default RejectedExecutionHandler in ContainerLauncherImpl to avoid abort when the AM shuts down.
TEZ-1643. DAGAppMaster kills DAG & shuts down, when RM is restarted.
TEZ-1684. upgrade mockito to latest release.
TEZ-1567. Avoid blacklisting nodes when the disable blacklisting threshold is about to be hit.
TEZ-1267. Exception handling for VertexManager.
TEZ-1688. Add applicationId as a primary filter for all Timeline data for easier export.
TEZ-1141. DAGStatus.Progress should include number of failed and killed attempts.
TEZ-1424. Fixes to DAG text representation in debug mode.
TEZ-1590. Fetchers should not report failures after the Processor on the task completes.
TEZ-1542. Fix a Local Mode crash on concurrentModificationException.
TEZ-1638. Fix Missing type parametrization in runtime Input/Output configs.
TEZ-1596. Secure Shuffle utils is extremely expensive for fast queries.
TEZ-1712. SSL context gets destroyed too soon after downloading data from one of the vertices.
TEZ-1710. Add support for cluster default AM/task launch opts.
TEZ-1713. tez.lib.uris should not require the paths specified to be fully qualified.
TEZ-1715. Fix use of import java.util.* in MultiMRInput.
TEZ-1664. Add checks to ensure that the client and AM are compatible.
TEZ-1689. Exception handling for EdgeManagerPlugin.
TEZ-1701. ATS fixes to flush all history events and also using batching.
TEZ-792. Default staging path should have user name.
TEZ-1689. addendum - fix unit test failure.
TEZ-1666. UserPayload should be null if the payload is not specified.
0.5.1 client cannot talk to 0.5.2 AMs (TEZ-1666 and TEZ-1664).
context.getUserPayload can now return null, apps may need to add defensive code.
TEZ-1699. Vertex.setParallelism should throw an exception for invalid invocations
TEZ-1700. Replace containerId from TaskLocationHint with [TaskIndex+Vertex] based affinity
TEZ-1716. Additional ATS data for UI.
TEZ-1722. DAG should be related to Application Id in ATS data.
TEZ-1711. Don't cache outputSpecList in VertexImpl.getOutputSpecList(taskIndex)
TEZ-1703. Exception handling for InputInitializer.
TEZ-1698. Cut down on ResourceCalculatorProcessTree overheads in Tez.
TEZ-1703. addendum - fix flaky test.
TEZ-1725. Fix nanosecond to millis conversion in TezMxBeanResourceCalculator.
TEZ-1726. Build broken against Hadoop-2.6.0 due to change in NodeReport.
TEZ-1579. MR examples should be setting mapreduce.framework.name to yarn-tez.
TEZ-1731. OnDiskMerger can end up clobbering files across tasks with LocalDiskFetch enabled.
TEZ-1735. Allow setting basic info per DAG for Tez UI.
TEZ-1728. Remove local host name from Fetcher thread name.
TEZ-1547. Make use of state change notifier in VertexManagerPlugins and fix TEZ-1494 without latency penalty
Release 0.5.1: 2014-10-02
INCOMPATIBLE CHANGES
TEZ-1488. Rename HashComparator to ProxyComparator and implement in TezBytesComparator
TEZ-1578. Remove TeraSort from Tez codebase.
TEZ-1499. Add SortMergeJoinExample to tez-examples
TEZ-1539. Change InputInitializerEvent semantics to SEND_ONCE_ON_TASK_SUCCESS
TEZ-1571. Add create method for DataSinkDescriptor.
ALL CHANGES
TEZ-1544. Link to release artifacts for 0.5.0 does not point to a specific link for 0.5.0.
TEZ-1559. Add system tests for AM recovery.
TEZ-850. Recovery unit tests.
TEZ-853. Support counters recovery.
TEZ-1345. Add checks to guarantee all init events are written to recovery to consider vertex initialized.
TEZ-1575. MRRSleepJob does not pick MR settings for container size and java opts.
TEZ-1488. Rename HashComparator to ProxyComparator and implement in TezBytesComparator
TEZ-1578. Remove TeraSort from Tez codebase.
TEZ-1569. Add tests for preemption
TEZ-1580. Change TestOrderedWordCount to optionally use MR configs.
TEZ-1524. Resolve user group information only if ACLs are enabled.
TEZ-1581. GroupByOrderByMRRTest no longer functional.
TEZ-1157. Optimize broadcast shuffle to download data only once per host.
TEZ-1607. support mr envs in mrrsleep and testorderedwordcount
TEZ-1499. Add SortMergeJoinExample to tez-examples
TEZ-1613. Decrease running time for TestAMRecovery
TEZ-1240. Add system test for propagation of diagnostics for errors
TEZ-1618. LocalTaskSchedulerService.getTotalResources() and getAvailableResources() can get negative if JVM memory is larger than 2GB
TEZ-1611. Change DataSource/Sink to be able to supply URIs for credentials
TEZ-1592. Vertex should wait for all initializers to finish before moving to INITED state
TEZ-1612. ShuffleVertexManager's EdgeManager should not hard code source num tasks
TEZ-1555. TestTezClientUtils.validateSetTezJarLocalResourcesDefinedButEmpty failing on Windows
TEZ-1609. Add hostname to logIdentifiers of fetchers for easy debugging
TEZ-1494. DAG hangs waiting for ShuffleManager.getNextInput()
TEZ-1515. Remove usage of ResourceBundles in Counters.
TEZ-1527. Fix indentation of Vertex status in DAGClient output.
TEZ-1536. Fix spelling typo "configurartion" in TezClientUtils.
TEZ-1310. Update website documentation framework
TEZ-1447. Provide a mechanism for InputInitializers to know about Vertex state changes.
TEZ-1362. Remove DAG_COMPLETED in DAGEventType.
TEZ-1519. TezTaskRunner should not initialize TezConfiguration in TezChild.
TEZ-1534. Make client side configs available to AM and tasks.
TEZ-1574. Support additional formats for the tez deployed archive
TEZ-1563. TezClient.submitDAGSession alters DAG local resources regardless of DAG submission
TEZ-1585. Memory leak in tez session mode.
TEZ-1533. Request Events more often if a complete set of events is received by a task.
TEZ-1587. Some tez-examples fail in local mode.
TEZ-1597. ImmediateStartVertexManager should handle corner case of vertex having zero tasks.
TEZ-1495. ATS integration for TezClient
TEZ-1553. Multiple failures in testing path-related tests in TestTezCommonUtils for Windows
TEZ-1598. DAGClientTimelineImpl uses ReflectiveOperationException (which has JDK 1.7 dependency)
TEZ-1599. TezClient.preWarm() is not enabled
TEZ-1550. TestEnvironmentUpdateUtils.testMultipleUpdateEnvironment fails on Windows
TEZ-1554. Failing tests in TestMRHelpers related to environment on Windows
TEZ-978. Enhance auto parallelism tuning for queries having empty outputs or data skewness
TEZ-1433. Invalid credentials can be used when a DAG is submitted to a session which has timed out
TEZ-1624. Flaky tests in TestContainerReuse due to race condition in DelayedContainerManager thread
Release 0.5.0: 2014-09-03
INCOMPATIBLE CHANGES
TEZ-1038. Move TaskLocationHint outside of VertexLocationHint.
TEZ-960. VertexManagerPluginContext::getTotalAVailableResource() changed to VertexManagerPluginContext::getTotalAvailableResource()
TEZ-1025. Rename tez.am.max.task.attempts to tez.am.task.max.failed.attempts
TEZ-1018. VertexManagerPluginContext should enable assigning locality to scheduled tasks
TEZ-1169. Allow numPhysicalInputs to be specified for RootInputs.
TEZ-1131. Simplify EdgeManager APIs
TEZ-1127. Add TEZ_TASK_JAVA_OPTS and TEZ_ENV configs to specify values from config
TEZ-692. Unify job submission in either TezClient or TezSession
TEZ-1130. Replace confusing names on Vertex API
TEZ-1213. Fix parameter naming in TezJobConfig.
- Details at https://issues.apache.org/jira/browse/TEZ-1213?focusedCommentId=14039381&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14039381
TEZ-1080, TEZ-1272, TEZ-1279, TEZ-1266. Change YARNRunner to use EdgeConfigs.
- Removes separation of runtime configs into input/output configs. Also refactors public methods used for this conversion.
TEZ-696. Remove implicit copying of processor payload to input and output
TEZ-1269. TaskScheduler prematurely releases containers
TEZ-857. Split Input/Output interfaces into user/framework components.
TEZ-866. Add a TezMergedInputContext for MergedInputs
TEZ-1137. Move TezJobConfig to runtime-library and rename to TezRuntimeConfiguration
TEZ-1134. InputInitializer and OutputCommitter implicitly use payloads of the input and output
TEZ-1312. rename vertex.addInput/Output() to vertex.addDataSource/Sink()
TEZ-1311. get sharedobjectregistry from the context instead of a static
TEZ-1300. Change deploy mechanism for Tez to be based on a tarball which includes Hadoop libs.
TEZ-1278. TezClient#waitTillReady() should not swallow interrupts
TEZ-1058. Replace user land interfaces with abstract classes
TEZ-1303. Change Inputs, Outputs, InputInitializer, OutputCommitter, VertexManagerPlugin, EdgeManager to require constructors for creation, and remove the initialize methods.
TEZ-1133. Remove some unused methods from MRHelpers.
TEZ-1346. Change Processor to require context constructors for creation, and remove the requirement of the initialize method requiring the context.
TEZ-1041. Use VertexLocationHint consistently everywhere in the API
TEZ-1057. Replace interfaces with abstract classes for Processor/Input/Output classes
TEZ-1351. MROutput needs a flush method to ensure data is materialized for FileOutputCommitter
TEZ-1317. Simplify MRinput/MROutput configuration
TEZ-1379. Allow EdgeConfigurers to accept Configuration for Comparators. Change the way partitioner, comparator, combiner confs are set (from Hadoop Configuration to Map). Rename specific Input/Output classes from *Configuration to *Configurer.
TEZ-1382. Change ObjectRegistry API to allow for future extensions
TEZ-1386. TezGroupedSplitsInputFormat should not need to be setup to enable grouping.
TEZ-1394. Create example code for OrderedWordCount
TEZ-1372. Fix preWarm to work after recent API changes
TEZ-1237. Consolidate naming of API classes
TEZ-1407. Move MRInput related methods out of MRHelpers and consolidate.
TEZ-1194. Make TezUserPayload user facing for payload specification
TEZ-1347. Consolidate MRHelpers.
TEZ-1072. Consolidate monitoring APIs in DAGClient
TEZ-1410. DAGClient#waitForCompletion() methods should not swallow interrupts
TEZ-1416. tez-api project javadoc/annotations review and clean up
TEZ-1425. Move constants to TezConstants
TEZ-1388. mvn site is slow and generates errors
TEZ-1432. Rename property to cancel delegation tokens on app completion (tez.am.am.complete.cancel.delegation.tokens)
TEZ-1320. Remove getApplicationId from DAGClient
TEZ-1427. Change remaining classes that are using byte[] to UserPayload
TEZ-1418. Provide Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH
TEZ-1438. Annotate and add javadoc for tez-runtime-library and tez-mapreduce
TEZ-1055. Rename tez-mapreduce-examples to tez-examples
TEZ-1132. Consistent naming of Input and Outputs
TEZ-1423. Ability to pass custom properties to keySerializer for OnFileUnorderedPartitionedKVOutput
TEZ-1426. Create configuration helpers for ShuffleVertexManager and TezGrouping code
TEZ-1390. Replace byte[] with ByteBuffer as the type of user payload in the API
TEZ-1417. Rename Configurer* to Config/ConfigBuilder
TEZ-1450. Documentation of TezConfiguration
TEZ-1231. Clean up TezRuntimeConfiguration
TEZ-1246. Replace constructors with create() methods for DAG, Vertex, Edge etc in the API
TEZ-1455. Replace deprecated junit.framework.Assert with org.junit.Assert
TEZ-1465. Update and document IntersectExample. Change name to JoinExample
TEZ-1449. Change user payloads to work with a byte buffer
TEZ-1469. AM/Session LRs are not shipped to vertices in new API use-case
TEZ-1472. Separate method calls for creating InputDataInformationEvent with serialized/unserialized payloads
TEZ-1485. Disable node blacklisting and ATS in AM for local mode
TEZ-1463. Remove dependency on private class org.apache.hadoop.util.StringUtils
TEZ-1476. DAGClient waitForCompletion output is confusing
TEZ-1490. dagid reported is incorrect in TezClient.java
TEZ-1500. DAG should be created via a create method
TEZ-1509. Set a useful default value for java opts
ALL CHANGES
TEZ-1516. Log transfer rates for broadcast fetch. (sseth)
TEZ-1511. MROutputConfigBuilder sets OutputFormat as String class if OutputFormat is not provided (bikas)
TEZ-1509. Set a useful default value for java opts (bikas)
TEZ-1517. Avoid sending routed events via the AsyncDispatcher. (sseth)
TEZ-1510. Add missed file. TezConfiguration should not add tez-site.xml as a default resource. (hitesh)
TEZ-1510. Addendum patch. TezConfiguration should not add tez-site.xml as a default resource. (hitesh)
TEZ-1510. TezConfiguration should not add tez-site.xml as a default resource. (hitesh)
TEZ-1501. Add a test dag to generate load on the getTask RPC. (sseth)
TEZ-1481. Flaky test : org.apache.tez.dag.api.client.TestDAGClientHandler.testDAGClientHandler (Contributed by Alexander Pivovarov)
TEZ-1512. VertexImpl.getTask(int) can be CPU intensive when lots of tasks are present in the vertex
TEZ-1492. IFile RLE not kicking in due to bug in BufferUtils.compare()
TEZ-1496. Multi MR inputs can not be configured without accessing internal proto structures (Siddharth Seth via bikas)
TEZ-1493. Tez examples sometimes fail in cases where AM recovery kicks in. (Jeff Zhang via hitesh)
TEZ-1038. Move TaskLocationHint outside of VertexLocationHint. (Alexander Pivovarov via hitesh)
TEZ-1475. Fix HDFS commands in INSTALL.txt (bikas)
TEZ-1500. DAG should be created via a create method (Siddharth Seth via bikas)
TEZ-1430. Javadoc generation should not generate docs for classes annotated as private. (Jonathan Eagles via hitesh)
TEZ-1498. Usage info is not printed when wrong number of arguments is provided for JoinExample. (Jeff Zhang via hitesh)
TEZ-1486. Event routing should not throw an exception if the EdgePlugin does not generate a routing table in cases where the destination vertex has a parallelism of 0. (sseth)
TEZ-1490. dagid reported is incorrect in TezClient.java (jeagles)
TEZ-1476. DAGClient waitForCompletion output is confusing (jeagles)
TEZ-1471. Additional supplement for TEZ local mode document. Contributed by Chen He.
TEZ-1360. Provide vertex parallelism to each vertex task. Contributed by Gopal V, Johannes Zillmann and Rajesh Balamohan.
TEZ-1463. Remove dependency on private class org.apache.hadoop.util.StringUtils (Alexander Pivovarov via jeagles)
TEZ-1448. Make WeightedScalingMemoryDistributor as the default memory distributor (Rajesh Balamohan)
TEZ-1485. Disable node blacklisting and ATS in AM for local mode (jeagles)
TEZ-1446. Move the fetch code for local disk fetch from data movement event handlers to fetcher. Contributed by Prakash Ramachandran.
TEZ-1487. Switch master to 0.6.0-SNAPSHOT. (hitesh)
TEZ-1474. detect missing native libraries for compression at the beginning of a task rather than at the end. Contributed by Prakash Ramachandran.
TEZ-1436. Fix javadoc warnings (Jonathan Eagles via bikas)
TEZ-1472. Separate method calls for creating InputDataInformationEvent with serialized/unserialized payloads (Siddharth Seth via bikas)
TEZ-1469. AM/Session LRs are not shipped to vertices in new API use-case (bikas)
TEZ-1464 Addendum. Update INSTALL.txt (bikas)
TEZ-1464. Update INSTALL.txt (bikas)
TEZ-1449. Change user payloads to work with a byte buffer (Siddharth Seth via bikas)
TEZ-1325. RecoveryParser can find incorrect last DAG ID. (Jeff Zhang via hitesh)
TEZ-1466. Fix JDK8 builds of Tez (gopalv)
TEZ-1251. Fix website to not display latest snapshot version in header. (Alexander Pivovarov via hitesh)
TEZ-1465. Update and document IntersectExample. Change name to JoinExample (bikas)
TEZ-1458. org.apache.tez.common.security.Groups does not compile against hadoop-2.2.0 anymore. (hitesh)
TEZ-1455. Replace deprecated junit.framework.Assert with org.junit.Assert (Alexander Pivovarov via jeagles)
TEZ-1454. Remove unused imports (Alexander Pivovarov via bikas)
TEZ-1415. Merge various Util classes in Tez (Alexander Pivovarov via bikas)
TEZ-1461. Add public key to KEYS (bikas)
TEZ-1452. Add license and notice to jars (bikas)
TEZ-1456. Fix typo in TestIFile.testWithRLEMarker (Contributed by Alexander Pivovarov)
TEZ-1246. Replace constructors with create() methods for DAG, Vertex, Edge etc in the API. (sseth)
TEZ-1453. Fix rat check for 0.5 (bikas)
TEZ-1231. Clean up TezRuntimeConfiguration (bikas)
TEZ-1450. Documentation of TezConfiguration (bikas)
TEZ-1417. Rename *Configurer to ConfigBuilder/Config. (sseth)
TEZ-1349. Add documentation for LocalMode usage. (sseth)
TEZ-1390. Replace byte[] with ByteBuffer as the type of user payload in the API. Contributed by Tsuyoshi OZAWA.
TEZ-1395. Fix failure in IFile handling of compressed data. (Rajesh Balamohan via hitesh)
TEZ-1445. Add more logging to catch shutdown handler race conditions. (hitesh)
TEZ-1426. Create configuration helpers for ShuffleVertexManager and TezGrouping code (Rajesh Balamohan via bikas)
TEZ-1439. IntersectDataGen/Example/Validate should move back to tez-examples. (hitesh)
TEZ-1423. Ability to pass custom properties to keySerializer for OnFileUnorderedPartitionedKVOutput (Siddharth Seth via bikas)
TEZ-1132. Consistent naming of Input and Outputs (bikas)
TEZ-1400. Reducers stuck when enabling auto-reduce parallelism
TEZ-1055 addendum. Rename tez-mapreduce-examples to tez-examples (Hitesh Shah via bikas)
TEZ-1055. Rename tez-mapreduce-examples to tez-examples (Hitesh Shah via bikas)
TEZ-1438. Annotate and add javadoc for tez-runtime-library and tez-mapreduce. (bikas via hitesh)
TEZ-1411. Address initial feedback on swimlanes (gopalv)
TEZ-1418. Provide Default value for TEZ_AM_LAUNCH_ENV and TEZ_TASK_LAUNCH (Subroto Sanyal via bikas)
TEZ-1065 addendum-1 to fix broken test (bikas)
TEZ-1435. Fix unused imports. (hitesh)
TEZ-1434. Make only wait apis in TezClient to throw InterruptedException. (hitesh)
TEZ-1427. Change remaining classes that are using byte[] to UserPayload. (sseth)
TEZ-1429. Avoid sysexit in the DAGAM in case of local mode. (sseth)
TEZ-1338. Support submission of multiple applications with LocalRunner from within the same JVM. (sseth)
TEZ-1334. Annotate all non public classes in tez-runtime-library with @private. (hitesh)
TEZ-1320. Remove getApplicationId from DAGClient (Jonathan Eagles via bikas)
TEZ-1065 addendum to fix broken test (bikas)
TEZ-1431. Fix use of synchronized for certain functions in TezClient. (hitesh)
TEZ-1065. DAGStatus.getVertexStatus and other vertex related API's should maintain vertex order (Jeff Zhang via bikas)
TEZ-1432. TEZ_AM_CANCEL_DELEGATION_TOKEN is named incorrectly. (sseth)
TEZ-1388. mvn site is slow and generates errors (jeagles)
TEZ-1409. Change MRInputConfigurer, MROutputConfigurer to accept complete configurations. (sseth)
TEZ-1425. Move constants to TezConstants (bikas)
TEZ-1416. tez-api project javadoc/annotations review and clean up (bikas)
TEZ-671. Support View/Modify ACLs for DAGs. (hitesh)
TEZ-1413. Fix build for TestTezClientUtils.testLocalResourceVisibility (Prakash Ramachandran via bikas)
TEZ-1422. Use NetUtils to create the bind address for the client, which allows clients to setup static address resolution. Contributed by Johannes Zillmann.
TEZ-1410. DAGClient#waitForCompletion() methods should not swallow interrupts. Contributed by Johannes Zillmann.
TEZ-1419. Release link broken on website for 0.4.1 release. (hitesh)
TEZ-1072. Consolidate monitoring APIs in DAGClient (jeagles)
TEZ-1420. Remove unused classes - LocalClientProtocolProviderTez, LocalJobRunnerMetricsTez, LocalJobRunnerTez. (sseth)
TEZ-1330. Create a default dist target which contains jars. (sseth)
TEZ-1414. Disable TestTezClientUtils.testLocalResourceVisibility to make builds pass (bikas)
TEZ-1347. Consolidate MRHelpers. (sseth)
TEZ-1402. MRoutput configurer should allow disabling the committer (bikas)
TEZ-817. TEZ_LIB_URI are always uploaded as public Local Resource (Prakash Ramachandran via bikas)
TEZ-1404. groupCommitInProgress in RecoveryTransition of DAGImpl is not set correctly. (Jeff Zhang via hitesh)
TEZ-1024. Fix determination of failed attempts in recovery. (Jeff Zhang via hitesh)
TEZ-1403. oah.mapred.Partitioner is not configured by JobConf. Contributed by Navis.
TEZ-1399. Add an example to show session usage (bikas)
TEZ-1194. Make TezUserPayload user facing for payload specification (Tsuyoshi Ozawa and bikas)
TEZ-1393. user.dir should not be reset in LocalMode. (sseth)
TEZ-1216. Clean up the staging directory when the application completes. (hitesh) This closes #3
TEZ-1407. Move MRInput related methods out of MRHelpers and consolidate. (sseth)
TEZ-1205. Remove profiling keyword from APIs/configs
TEZ-1405. TestSecureShuffle is slow (Rajesh Balamohan via bikas)
TEZ-1237. Consolidate naming of API classes (bikas)
TEZ-1391. Setup IGNORE_LIB_URIS correctly for Local Mode. (sseth)
TEZ-1318 addendum. Simplify Vertex Constructor (bikas)
TEZ-1318. Simplify Vertex Constructor (bikas)
TEZ-1372. Remove author tag from previous commit (bikas)
TEZ-1372. Fix preWarm to work after recent API changes (bikas)
TEZ-1394. Create example code for OrderedWordCount (bikas)
TEZ-1392. Fix MRRSleepJob failure. (sseth)
TEZ-1386. TezGroupedSplitsInputFormat should not need to be setup to enable grouping. (sseth)
TEZ-1385. Disk Direct fails for MapOutput when trying to use OnDiskMerger. Contributed by Prakash Ramachandran.
TEZ-1382. Change ObjectRegistry API to allow for future extensions (bikas)
TEZ-1332. Swimlane diagrams from tez AM logs (gopalv)
TEZ-1368. TestSecureShuffle failing
TEZ-1379. Allow EdgeConfigurers to accept Configuration for Comparators. Change the way partitioner, comparator, combiner confs are set (from Hadoop Configuration to Map). Rename specific Input/Output classes from *Configuration to *Configurer. (sseth)
TEZ-1317. Simplify MRinput/MROutput configuration (bikas)
TEZ-1351. MROutput needs a flush method to ensure data is materialized for FileOutputCommitter (bikas)
TEZ-1368 is fixed. (hitesh)
TEZ-1057. Replace interfaces with abstract classes for Processor/Input/Output classes (bikas)
TEZ-1041. Use VertexLocationHint consistently everywhere in the API (bikas)
TEZ-1343. Bypass the Fetcher and read directly from the local filesystem if source task ran on the same host. Contributed by Prakash Ramachandran.
TEZ-1365. Local mode should ignore tez.lib.uris, and set a config for the AM to be aware of session mode. (sseth)
TEZ-1355. Read host and shuffle meta-information for events only if data is generated by Outputs. Contributed by Jonathan Eagles.
TEZ-1342. Fix a bug which caused tez.am.client.am.port-range to not take effect. Contributed by Jeff Zhang.
TEZ-870. Change LocalContainerLauncher to handle multiple threads, improve error reporting and inform other components about container completion. (sseth)
TEZ-1352. HADOOP_CONF_DIR should be in the classpath for containers. (sseth)
TEZ-1238. Display more clear diagnostics info on client side on task failures. (Jeff Zhang via hitesh)
TEZ-1354. Fix NPE in FilterByWordOutputProcessor. Contributed by Jonathan Eagles.
TEZ-1322. OrderedWordCount broken in master branch. (hitesh)
TEZ-1341. IFile append() has string concats leading to memory pressure
TEZ-1346. Change Processor to require context constructors for creation, and remove the requirement of the initialize method requiring the context. (sseth)
TEZ-1064. Restore dagName Set for duplicate detection in recovered AMs. (Jeff Zhang via hitesh)
TEZ-1133 as incompatible.
TEZ-1133. Remove some unused methods from MRHelpers. Contributed by Chen He.
TEZ-1303. Change Inputs, Outputs, InputInitializer, OutputCommitter, VertexManagerPlugin, EdgeManager to require constructors for creation, and remove the initialize methods. (sseth)
TEZ-1276. Remove unnecessary TaskAttemptEventType TA_FAIL_REQUEST. (Jeff Zhang via hitesh)
TEZ-1326. AMStartedEvent should not be recovery event. (Jeff Zhang via hitesh)
TEZ-1333. Flaky test: TestOnFileSortedOutput fails in jenkins server with OOM
TEZ-717. Client changes for local mode DAG submission. Contributed by Jonathan Eagles.
TEZ-707. Add a LocalContainerLauncher. Contributed by Chen He.
TEZ-1058. Replace user land interfaces with abstract classes (bikas)
TEZ-1324. OnFileSortedOutput: send host/port/pathComponent details only when one of the partitions has data
TEZ-1328. Move EnvironmentUpdateUtils to tez-common
TEZ-1278. TezClient#waitTillReady() should not swallow interrupts. Contributed by Johannes Zillmann.
TEZ-1305. Log job tracking url (rohini)
TEZ-1257. Error on empty partition when using OnFileUnorderedKVOutput and ShuffledMergedInput
TEZ-1300. Change default tez classpath to not include hadoop jars from the cluster. (sseth)
TEZ-1321. Remove methods annotated as @Private from TezClient and DAGClient. (sseth)
TEZ-1288. Create FastTezSerialization as an optional feature (rajesh)
TEZ-1304. Abstract out client interactions with YARN. Contributed by Jonathan Eagles.
TEZ-1306. Remove unused ValuesIterator. Contributed by Jonathan Eagles.
TEZ-1127 addendum for changing tez.am.java.opts. Add TEZ_TASK_JAVA_OPTS and TEZ_ENV configs to specify values from config (bikas)
TEZ-1311. get sharedobjectregistry from the context instead of a static (bikas)
TEZ-1312. rename vertex.addInput/Output() to vertex.addDataSource/Sink() (Chen He via bikas)
TEZ-1134. InputInitializer and OutputCommitter implicitly use payloads of the input and output (bikas)
TEZ-1309. Use hflush instead of hsync in recovery log. (hitesh)
TEZ-1137. Move TezJobConfig to runtime-library and rename to TezRuntimeConfiguration (bikas)
TEZ-1247. Allow DAG.verify() to be called multiple times (Jeff Zhang via bikas)
TEZ-1299. Get rid of unnecessary setter override in EntityDescriptors. (sseth)
TEZ-866. Add a TezMergedInputContext for MergedInputs (bikas)
TEZ-1296. commons-math3 dependency (bikas)
TEZ-1301. Fix title of pages in docs. (hitesh)
TEZ-811. Addendum. Update page title. (hitesh)
TEZ-811. Update docs on how to contribute to Tez. (hitesh)
TEZ-1242. Icon. Logos for Tez (Harshad P Dhavale via bikas)
TEZ-1298. Add parameterized constructor capabilities in ReflectionUtils. Contributed by Jonathan Eagles.
TEZ-1242. Logos for Tez (Harshad P Dhavale via bikas)
TEZ-1295. Modify the tez-dist-full build target to include hadoop libraries. Also makes the tez direct dependencies explicit in the poms. (sseth)
TEZ-857. Split Input/Output interfaces into user/framework components. (sseth)
TEZ-1290. Make graduation related changes. (hitesh)
TEZ-1269. TaskScheduler prematurely releases containers (bikas)
TEZ-1119. Support display of user payloads in Tez UI. (hitesh)
TEZ-1089. Change CompositeDataMovementEvent endIndex to a count of number of events. Contributed by Chen He.
TEZ-696. Remove implicit copying of processor payload to input and output (bikas)
TEZ-1285. Add Utility for Modifying Environment Variables. Contributed by Jonathan Eagles and Oleg Zhurakousky.
TEZ-1287. TestJavaProfilerOptions is missing apache license header. Contributed by Jonathan Eagles.
TEZ-1260. Allow KeyValueWriter to support writing list of values
TEZ-1244. Fix typo in RootInputDataInformationEvent javadoc. Contributed by Chen He.
TEZ-1266. Create *EdgeConfigurer.createDefaultCustomEdge() and force setting partitioners where applicable. (sseth)
TEZ-1279. Rename *EdgeConfiguration to *EdgeConfigurer. (sseth)
TEZ-1272. Change YARNRunner to make use of EdgeConfigurations. (sseth)
TEZ-1130. Replace confusing names on Vertex API (bikas)
TEZ-1228. Define a memory & merge optimized vertex-intermediate file format for Tez
TEZ-1076. Allow events to be sent to InputInitializers. (sseth)
TEZ-657. Tez should process the Container exit status - specifically when the RM preempts a container (bikas)
TEZ-1262. Change Tez examples to use Edge configs. (sseth)
TEZ-1241. Consistent getter for staging dir (kamrul)
TEZ-1131 addendum for missing fix. Simplify EdgeManager APIs (bikas)
TEZ-1118. Tez with container reuse reports negative CPU usage. Contributed by Robert Grandl.
TEZ-1131 (bikas)
TEZ-1080. Add specific Configuration APIs for non MR based Inputs / Outputs. (sseth)
TEZ-225. Tests for DAGClient (Jeff Zhang via bikas)
TEZ-1258. Remove unused class JobStateInternal. Contributed by Jeff Zhang.
TEZ-1253. Remove unused VertexEventTypes. Contributed by Jeff Zhang.
TEZ-1170 addendum to remove unnecessary transitions. Simplify Vertex Initializing transition (bikas)
TEZ-692. Unify job submission in either TezClient or TezSession (bikas)
TEZ-1163. Tez Auto Reducer-parallelism throws Divide-by-Zero
TEZ-699. Have sensible defaults for java opts. (hitesh)
TEZ-387. Move DAGClientHandler into its own class (Jeff Zhang via bikas)
TEZ-1234. Replace Interfaces with Abstract classes for VertexManagerPlugin and EdgeManager. (hitesh)
TEZ-1218. Make TaskScheduler an Abstract class instead of an Interface. Contributed by Jeff Zhang.
TEZ-1127. Add TEZ_TASK_JAVA_OPTS and TEZ_ENV configs to specify values from config
TEZ-1214. Diagnostics of Vertex is missing when constructing TimelineEntity. (Jeff Zhang via hitesh)
TEZ-106. TaskImpl does not hold any diagnostic information that can be emitted to history. (Jeff Zhang via hitesh)
TEZ-1219. Addendum. Fix roles. (hitesh)
TEZ-1219. Update team list to match incubator status page. (hitesh)
TEZ-1213. Fix parameter naming in TezJobConfig. (sseth)
TEZ-1168. Add MultiMRInput, which can process multiple splits, and returns individual readers for each of these. (sseth)
TEZ-1106. Tez framework should use a unique subdir for staging data. (Mohammad Kamrul Islam via hitesh)
TEZ-1042. Stop re-routing stdout, stderr for tasks and AM. (sseth)
TEZ-1208. Log time taken to connect/getInputStream to a http source in fetcher. Contributed by Rajesh Balamohan.
TEZ-1172. Allow multiple Root Inputs to be specified per Vertex. (sseth)
TEZ-1170. Simplify Vertex Initializing transition (bikas)
TEZ-1193. Allow 'tez.lib.uris' to be overridden (Oleg Zhurakousky via bikas)
TEZ-1032. Allow specifying tasks/vertices to be profiled. (Rajesh Balamohan via hitesh)
TEZ-1131. Simplify EdgeManager APIs
TEZ-1169 addendum to modify the incompatible change list in CHANGES.txt
TEZ-1169. Allow numPhysicalInputs to be specified for RootInputs. (sseth)
TEZ-1192. Fix loop termination in TezChild. Contributed by Oleg Zhurakousky.
TEZ-1178. Prevent duplicate ObjectRegistryImpl inits in TezChild. (gopalv)
TEZ-1199. EdgeVertexName in EventMetaData can be null. (hitesh)
TEZ-1196. FaultToleranceTestRunner should allow passing generic options from cli (Karam Singh via tassapola)
TEZ-1162. The simple history text files now use ^A\n as their line endings.
TEZ-1164. Only events for tasks should be buffered in Initializing state (bikas)
TEZ-1171. Vertex remains in INITED state if all source vertices start while the vertex was in INITIALIZING state (bikas)
TEZ-373. Create UserPayload class for internal code (Tsuyoshi OZAWA via bikas)
TEZ-1151. Vertex should stay in initializing state until custom vertex manager sets the parallelism (bikas)
TEZ-1145. Vertices should not start if they have uninitialized custom edges (bikas)
TEZ-1143 (addendum). 1-1 source split event should be handled in Vertex.RUNNING and Vertex.INITED state (bikas)
TEZ-1116. Refactor YarnTezDAGChild to be testable and usable for LocalMode. (sseth)
TEZ-1143. 1-1 source split event should be handled in Vertex.RUNNING state (bikas)
TEZ-800. One-one edge with parallelism -1 fails if source vertex parallelism is not -1 as well (bikas)
TEZ-1154. tez-mapreduce-examples should depend on yarn-api. (hitesh)
TEZ-1090. Micro optimization - Remove duplicate updateProcessTree() in TaskCounterUpdater. (Rajesh Balamohan via hitesh)
TEZ-1027. orderedwordcount needs to respect tez.staging-dir property. (Rekha Joshi via hitesh)
TEZ-1150. Replace String EdgeId with Edge in the Vertex (bikas)
TEZ-1066. Generate events to integrate with YARN timeline server. (hitesh)
TEZ-1140. TestSecureShuffle leaves behind test data dirs. (hitesh)
TEZ-1039. Add Container locality to TaskScheduler (bikas)
TEZ-1139. Add a test for IntersectDataGen and IntersectValidate. (sseth)
TEZ-1126. Add a data generator and validator for the intersect example. (sseth)
TEZ-1114. Fix encrypted shuffle. Contributed by Rajesh Balamohan.
TEZ-1128. OnFileUnorderedPartitionedKVOutput does not handle partitioning correctly with the MRPartitioner. (sseth)
TEZ-1121. Clean up avro dependencies. (hitesh)
TEZ-1111. TestMRHelpers fails if HADOOP_COMMON_HOME is defined in the shell env. (Mohammad Kamrul Islam via hitesh)
TEZ-1099. Minor documentation improvement and Eclipse direct import friendliness. Contributed by Thiruvalluvan M. G.
TEZ-1112. MROutput committer should be initialized from initialized OutputFormat. Contributed by Rohini Palaniswamy.
TEZ-1102. Abstract out connection management logic in shuffle code. Contributed by Rajesh Balamohan.
TEZ-1088. Flaky Test: TestFaultTolerance.testInputFailureCausesRerunAttemptWithinMaxAttemptSuccess (Tassapol Athiapinya via bikas)
TEZ-1105. Fix docs to ensure users are aware of adding "*" for HADOOP_CLASSPATH. (hitesh)
TEZ-1091. Respect keepAlive when shutting down Fetchers. Contributed by Rajesh Balamohan.
TEZ-1093. Add an example for OnFileUnorderedPartitionedOutput. (sseth)
TEZ-661. Add an implementation for a non sorted, partitioned, key-value output. (sseth)
TEZ-1002. Generate Container Stop history events. Contributed by Gopal V.
TEZ-1085. Leave env values unchanged if they aren't set on the client. Contributed by Rohini Palaniswamy.
TEZ-1082. Fix the mechanism used by the Fetcher to check for an open connection when draining the error stream. Contributed by Rajesh Balamohan
TEZ-886. Add @Nullable annotation at API level (Tsuyoshi OZAWA via bikas)
TEZ-802. Determination of Task Placement for 1-1 Edges (bikas)
TEZ-1079. Make tez example jobs use the ToolRunner framework (Devaraj K via bikas)
TEZ-1087. ShuffleManager fails with IllegalStateException (Cheolsoo Park via bikas)
TEZ-1074. Reduce the frequency at which counters are sent from the task to the AM to reduce AM CPU usage. Contributed by Rajesh Balamohan.
TEZ-1062. Create SimpleProcessor for processors that only need to implement the run method (Mohammad Kamrul Islam via bikas)
TEZ-1073. RLE fast-forward merge for IFile (gopalv)
TEZ-1023. Tez runtime configuration changes by users may not get propagated to jobs. Contributed by Rajesh Balamohan.
TEZ-698. Make it easy to create and configure MRInput/MROutput and other inputs/outputs (bikas)
TEZ-1018. VertexManagerPluginContext should enable assigning locality to scheduled tasks (bikas)
TEZ-1077. Add unit tests for SortedMergedGroupedInput. (sseth)
TEZ-1003. Add a MergedInput to combine multiple ShuffledMergedInputs. Contributed by Rohini Palaniswamy.
TEZ-873. Expose InputSplit via MRInputLegacy, and underlying splits via TezGroupedSplits. Contributed by Mohammad Kamrul Islam.
TEZ-737. DAG name should be unique within a Tez Session. (Mohammad Kamrul Islam via hitesh)
TEZ-919. Fix shutdown handling for Shuffle. (sseth)
TEZ-988. Enable KeepAlive in Tez Fetcher (Rajesh Balamohan via bikas)
TEZ-695. Create Abstract class for Input/Processor/Output (Mohammad Kamrul Islam via bikas)
Revert "TEZ-695. Create Abstract class for Input/Processor/Output (Mohammad Kamrul Islam via bikas)"
TEZ-695. Create Abstract class for Input/Processor/Output (Mohammad Kamrul Islam via bikas)
TEZ-708. Add a LocalTaskScheduler for use in Local mode. Contributed by Jonathan Eagles.
TEZ-700. Helper APIs to monitor a DAG to completion (Mohammad Kamrul Islam via bikas)
TEZ-1053. Refactor: Pass TaskLocationHint directly to the Scheduling logic (bikas)
TEZ-1049. Refactor - LocationHint need not be passed into TaskAttemptImpl's constructor (bikas)
TEZ-480. Create InputReady VertexManager (bikas)
TEZ-1007. MRHelpers.addLog4jSystemProperties() duplicates code from TezClientUtils.addLog4jSystemProperties(). (Thomas Jungblut via hitesh)
TEZ-1025. Rename tez.am.max.task.attempts to tez.am.task.max.failed.attempts (bikas)
TEZ-960. Addendum - updated CHANGES.txt for incompatible change. (hitesh)
TEZ-37. TaskScheduler.addTaskRequest() should handle duplicate tasks (bikas)
TEZ-960. Typos in MRJobConfig. (Chen He via hitesh)
Release 0.4.0-incubating: 2014-04-05
ALL CHANGES
TEZ-932 addendum. Add missing license to file.
TEZ-1001 addendum. Remove checked in jar. (sseth)
TEZ-1001. Change unit test for AM relocalization to generate a jar, and remove previously checked in test jar. (sseth)
TEZ-1000. Add a placeholder partitioned unsorted output. (sseth)
TEZ-932. Add a weighted scaling initial memory allocator. (sseth)
TEZ-989. Allow addition of resources to the AM for a running session. (sseth)
TEZ-991. Fix a bug which could cause getReader to occasionally hang on ShuffledMergedInput. (sseth)
TEZ-990. Add support for a credentials file read by TezClient. (sseth)
TEZ-983. Support a helper function to extract required additional tokens from a file. (hitesh)
TEZ-981. Port MAPREDUCE-3685. (sseth)
TEZ-980. Data recovered flag file should be a namenode only OP. (hitesh)
TEZ-973. Abort additional attempts if recovery fails. (hitesh)
TEZ-976. WordCount example does not handle -D<param> args. (Tassapol Athiapinya via hitesh)
TEZ-977. Flaky test: TestAMNodeMap - add more debug logging. (hitesh)
TEZ-8649. Fix BufferOverflow in PipelinedSorter by ensuring the last block is big enough for one k,v pair (gopalv)
TEZ-972. Shuffle Phase - optimize memory usage of empty partition data in DataMovementEvent. Contributed by Rajesh Balamohan.
TEZ-879. Fix potential synchronization issues in runtime components. (sseth)
TEZ-876. Fix graphviz dag generation for Hive. (hitesh)
TEZ-951. Port MAPREDUCE-5028. Contributed by Gopal V and Siddharth Seth.
TEZ-950. Port MAPREDUCE-5462. Contributed by Gopal V and Siddharth Seth.
TEZ-971. Change Shuffle to report errors early rather than waiting for access before reporting them. (sseth)
TEZ-970. Consolidate Shuffle payload to have a single means of indicating absence of a partition. (sseth)
TEZ-969. Fix a bug which could cause the Merger to get stuck when merging 0 segments, in case of container re-use. (sseth)
TEZ-968. Fix flush mechanism when max unflushed events count set to -1. (hitesh)
TEZ-948. Log counters at the end of Task execution. (sseth)
TEZ-964. Flaky test: TestAMNodeMap.testNodeSelfBlacklist. (hitesh)
TEZ-966. Tez AM has invalid state transition error when datanode is bad. (hitesh)
TEZ-949. Handle Session Tokens for Recovery. (hitesh)
TEZ-955. Tez should close inputs after calling processor's close. (hitesh)
TEZ-953. Port MAPREDUCE-5493. (sseth)
TEZ-958. Increase sleeps in TestContainerReuse to fix flaky tests. (hitesh)
TEZ-952. Port MAPREDUCE-5209 and MAPREDUCE-5251. (sseth)
TEZ-956. Handle zero task vertices correctly on Recovery. (hitesh)
TEZ-938. Avoid fetching empty partitions when the OnFileSortedOutput, ShuffledMergedInput pair is used. Contributed by Rajesh Balamohan.
TEZ-934. Add configuration fields for Tez Local Mode (part of TEZ-684). Contributed by Chen He.
TEZ-947. Fix OrderedWordCount job progress reporting to work across AM attempts. (hitesh)
TEZ-944. Tez Job gets "Could not load native gpl library" Error. (hitesh)
TEZ-851. Handle failure to persist events to HDFS. (hitesh)
TEZ-940. Fix a memory leak in TaskSchedulerAppCallbackWrapper. Contributed by Gopal V and Siddharth Seth.
TEZ-939. Fix build break caused by changes in YARN-1824. (sseth)
TEZ-903. Make max maps per fetcher configurable in ShuffleScheduler. Contributed by Rajesh Balamohan.
TEZ-942. Mrrsleep job with only maps fails with 'Illegal output to map'. (hitesh)
TEZ-936. Remove unused event types from AMSchedulerEventType. (sseth)
TEZ-933 addendum. Fix a unit test to work correctly. (sseth)
TEZ-933. Change Vertex.getNumTasks to return the initially configured number of tasks if initialization has not completed. (sseth)
TEZ-935. Fix function names for Vertex Stats. (hitesh)
TEZ-930. Addendum patch. Provide additional aggregated task stats at the vertex level. (hitesh)
TEZ-930. Provide additional aggregated task stats at the vertex level. (hitesh)
TEZ-931. DAGHistoryEvent should not be allowed to be sent to Dispatcher/EventHandlers. (hitesh)
TEZ-904. Committer recovery events should be out-of-band. (hitesh)
TEZ-813. Remove unused imports across project. (Jonathan Eagles via hitesh)
TEZ-920. Increase defaults for number of counters, ensure AM uses this. (sseth)
TEZ-928. NPE in last app attempt caused by registering for an RM unregister. (hitesh)
TEZ-676. Tez job fails on client side if nodemanager running AM is lost. (Tsuyoshi Ozawa and hitesh via hitesh)
TEZ-918. Fix a bug which could cause shuffle to hang if there are intermittent fetch failures. (sseth)
TEZ-910. Allow ShuffledUnorderedKVInput to work for cases other than broadcast. (sseth)
TEZ-911. Re-factor BroadcastShuffle related code to be independent of Broadcast. (sseth)
TEZ-901. Improvements to Counters generated by runtime components. (sseth)
TEZ-902. SortedMergedInput Fetcher can hang on retrying a bad input (bikas)
Addendum to TEZ-804. Remove comments to add test since test has been added. Some new logs added. (bikas)
TEZ-847. Support basic AM recovery. (hitesh)
TEZ-915. TaskScheduler can get hung when all headroom is used and it cannot utilize existing new containers (bikas)
TEZ-894. Tez should have a way to know build manifest. (Ashish Singh via hitesh)
TEZ-893. Terasort gives ArrayIndexOutOfBound Exception for 'hadoop jar <jar> terasort'. (hitesh)
TEZ-884. Add parameter checking for context related user APIs (Tsuyoshi Ozawa via bikas)
TEZ-906. Finalize Tez 0.3.0 release. (hitesh)
TEZ-535. Redirect AM logs to different files when running multiple DAGs in the same AM. Contributed by Mohammad Kamrul Islam.
TEZ-887. Allow counters to be separated at a per Input/Output level. (sseth)
TEZ-898. Handle inputReady and initial memory request in case of 0 physical inputs. (sseth)
TEZ-896. Fix AM splits to work on secure clusters when using the mapred API. (sseth)
TEZ-715. Auto Reduce Parallelism can rarely trigger NPE in AM at DAGAppMaster.handle(DAGAppMaster.java:1268) (bikas)
TEZ-891. TestTaskScheduler does not handle mockApp being called on different thread (bikas)
TEZ-889. Fixes a bug in MRInputSplitDistributor (caused by TEZ-880). (sseth)
TEZ-883. Fix unit tests to use comma separated values in tests having output verification at more than 1 tasks. (Tassapol Athiapinya via bikas)
TEZ-888. TestAMNodeMap.testNodeSelfBlacklist failing intermittently (bikas)
TEZ-865. TezTaskContext.getDAGName() does not return DAG name (Tsuyoshi OZAWA via bikas)
TEZ-885. TestTaskScheduler intermittent failures (bikas)
TEZ-882. Update versions in master for next release. (hitesh)
Release 0.3.0-incubating: 2014-02-26
INCOMPATIBLE CHANGES
TEZ-720. Inconsistency between VertexState and VertexStatus.State.
TEZ-41. Get VertexCommitter from API and remove MRVertexOutputCommitter.
TEZ-650. MRHelpers.createMRInputPayloadWithGrouping() methods should not
take an MRSplitsProto argument.
TEZ-827. Separate initialize and start operations on Inputs/Outputs.
TEZ-668. Allow Processors to control Input/Output start.
TEZ-837. Remove numTasks from VertexLocationHints.
ALL CHANGES
TEZ-889. Fixes a bug in MRInputSplitDistributor (caused by TEZ-880). (sseth)
TEZ-881. Update Notice and any other copyright tags for year 2014. (hitesh)
TEZ-874. Fetcher inputstream is not buffered. (Rajesh Balamohan via hitesh)
TEZ-880. Support sending deserialized data in RootInputDataInformationEvents. (sseth)
TEZ-779. Make Tez grouping splits logic possible outside InputFormat (bikas)
TEZ-844. Processors should have a mechanism to know when an Input is ready for consumption. (sseth)
TEZ-878. doAssignAll() in TaskScheduler ignores delayedContainers being out of sync with heldContainers (bikas)
TEZ-824. Failure of 2 tasks in a vertex having input failures (Tassapol Athiapinya via bikas)
Addendum patch for TEZ-769. Change Vertex.setParallelism() to accept a set of EdgeManagerDescriptors. (hitesh)
TEZ-769. Change Vertex.setParallelism() to accept a set of EdgeManagerDescriptors. (hitesh)
TEZ-863 Addendum. Queue events for relevant inputs until the Input has been started. Fixes a potential NPE in case of no auto start. (Contributed by Rajesh Balamohan)
TEZ-289. DAGAppMasterShutdownHook should not report KILLED when exception brings down AM. (Tsuyoshi Ozawa via hitesh)
TEZ-835. Fix a bug which could cause a divide by zero if a very small sort buffer is configured. (sseth)
TEZ-842. Add built-in verification of expected execution pattern into TestFaultTolerance (Tassapol Athiapinya via bikas)
TEZ-863. Queue events for relevant inputs until the Input has been started. Fixes a potential NPE in case of no auto start. (sseth)
TEZ-788. Clean up dist tarballs. (Contributed by Jonathan Eagles)
TEZ-619. Failing test: TestTaskScheduler.testTaskSchedulerWithReuse (Jonathan Eagles via bikas)
TEZ-845. Handle un-blacklisting of nodes (bikas)
TEZ-847. Visualize tez statemachines. (Min Zhou via hitesh)
TEZ-823. Add DAGs with a vertex connecting with 2 downstream/upstream vertices and unit tests for fault tolerance on these DAGs (Tassapol Athiapinya via bikas)
TEZ-787. Revert Guava dependency to 11.0.2. (sseth)
TEZ-801. Support routing of event to multiple destination physical inputs (bikas)
TEZ-854. Fix non-recovery code path for sessions. (hitesh)
TEZ-837. Remove numTasks from VertexLocationHints. (sseth)
TEZ-668. Allow Processors to control Input / Output start. (sseth)
TEZ-843. Fix failing unit test post TEZ-756. (sseth)
TEZ-816. Add unit test for cascading input failure (Tassapol Athiapinya via bikas)
TEZ-825. Fix incorrect inheritance method calls in ThreeLevels and SixLevels failing dags. (Tassapol Athiapinya via bikas)
TEZ-756 Addendum. Fix a unit test failure. (sseth)
TEZ-840. Default value of DataMovementEventPayloadProto.data_generated should be true. (sseth)
TEZ-756. Allow VertexManagerPlugins to configure RootInputEvents without access to Tez internal structures. (sseth)
TEZ-804. Handle node loss/bad nodes (bikas)
TEZ-836. Add ConcatenatedKeyValueInput for vertex groups (Gunther Hagleitner via bikas)
TEZ-755. Change VertexManagerPlugin.initialize and context to be consistent with the rest of the context objects (bikas)
TEZ-833. Have the Tez task framework set Framework counters instead of MR Processors setting them. (sseth)
TEZ-826. Remove wordcountmrrtest example. (sseth)
TEZ-815. Split initialize and start implementations for the various Inputs and Outputs. (sseth)
TEZ-637. [MR Support] Add all required info into JobConf for MR related I/O/P (bikas)
TEZ-596. Change MRHelpers.serializeConf* methods to use Protobuf for serialization. Contributed by Mohammad Kamrul Islam
TEZ-832. Fix a race in MemoryDistributor. (sseth)
TEZ-827. Separate initialize and start operations on Inputs/Outputs. (sseth)
TEZ-831. Reduce line length of state machines (bikas)
TEZ-796. AM Hangs & does not kill containers when map-task fails (bikas)
TEZ-782. Scale I/O mem requirements if misconfigured. (sseth)
TEZ-819. YARNRunner should not put -m/-r in output name when using mapred API (bikas)
TEZ-773. Fix configuration of tez jars location in MiniTezCluster (bikas)
TEZ-799. Generate data to be used for recovery. (hitesh)
TEZ-812. exclude slf4j dependencies from hadoop. (Giridharan Kesavan via hitesh)
TEZ-810. Add Gunther H. to the website. (Gunther Hagleitner via hitesh)
TEZ-807. Build broken due to NPE (Patch by Gunther Hagleitner, reviewed by Siddharth Seth)
TEZ-777. Obtain tokens for LocalResources specified in the DAGPlan (Patch by Gunther Hagleitner, reviewed by Siddharth Seth)
TEZ-781. Add unit test for fault tolerance (input failure causes re-run of previous task under allowed maximum failed attempt) (Tassapol Athiapinya via bikas)
TEZ-745. Rename TEZ_AM_ABORT_ALL_OUTPUTS_ON_DAG_FAILURE in TezConfiguration to TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS (Jonathan Eagles via bikas)
TEZ-783. Add standard DAGs using failing processors/inputs for test purpose (Tassapol Athiapinya via bikas)
TEZ-798. Change DAG.addURIsForCredentials to accept a Collection instead of a List. (Gunther Hagleitner via sseth)
TEZ-797. Add documentation for some of the Tez config parameters. (sseth)
TEZ-766. Support an api to pre-warm containers for a session. (hitesh)
TEZ-774. Fix name resolution for local addresses in test (bikas)
TEZ-761. Replace MRJobConfig parameter references in the tez-dag module with Tez equivalents. (Mohammad Kamrul Islam via sseth)
TEZ-775. Fix usage of env vars (bikas)
TEZ-791. Remove ShuffleHandler and related classes from tez-library. (sseth)
TEZ-790. Addendum patch for TEZ-784: failing config constants renaming is incomplete. (Tassapol Athiapinya via bikas)
TEZ-718. Remove some unused classes - JobEndNotifier and Speculation. (Mohammad Kamrul Islam via sseth)
TEZ-784. Add TestDriver to allow cmd line submission of tests to a cluster (bikas)
TEZ-771. Dist build broken after TEZ-749. (Jonathan Eagles via hitesh)
TEZ-678. Support for union operations via VertexGroup abstraction (bikas)
TEZ-646. Introduce a CompositeDataMovementEvent to avoid multiple copies of the same payload in the AM. (sseth)
TEZ-752 addendum. Javadoc modifications for DAG.addURIsForCredentials. (sseth)
TEZ-768. Consolidate TokenCache, Master and related code. (sseth)
TEZ-752. Add an API to DAG to accept a list of URIs for which tokens are needed. (sseth)
TEZ-748. Test Tez Fault Tolerance (bikas)
TEZ-749. Maintain order of vertices as specified by the user (Jonathan Eagles via bikas)
TEZ-674. TezClient should obtain Tokens for the staging directory it uses. (sseth)
TEZ-665. Fix javadoc warnings. (Jonathan Eagles via hitesh)
TEZ-765. Allow tez.runtime.sort.threads > 1 to turn on PipelinedSorter (gopalv).
TEZ-763. Tez doesn't compile with Non-resolvable parent error. (Jonathan Eagles via hitesh)
TEZ-395. Allow credentials to be specified on a per DAG basis. (Contributed by Michael Weng)
TEZ-739. Tez should include incubating keyword as part of the version string
TEZ-724. Allow configuration of CUSTOM edges on the DAG API. (sseth)
Addendum to TEZ-650. MRHelpers.createMRInputPayloadWithGrouping() methods should not take an MRSplitsProto argument (bikas)
TEZ-650. MRHelpers.createMRInputPayloadWithGrouping() methods should not take an MRSplitsProto argument (Mohammad Kamrul Islam via bikas)
TEZ-624. Fix MROutput to support multiple outputs to the same output location (bikas)
TEZ-722. Tez trunk doesn't compile against latest branch-2. (Jonathan Eagles via hitesh)
TEZ-728. Semantics of output commit (bikas)
TEZ-738. Hive query fails with Invalid event: TA_CONTAINER_PREEMPTED at SUCCEEDED (Hitesh Shah via bikas)
TEZ-140. Tez/hive task failure on large DAG with Invalid event: TA_SCHEDULE at KILLED (bikas)
TEZ-734. Fix the AppMaster to work in the context of the App Submitter's UGI. (sseth)
TEZ-735. Add timeout to tests in TestDAGImpl. (hitesh)
TEZ-732. OrderedWordCount not working after TEZ-582 (bikas)
TEZ-731. Fix a NullPointerException in ContainerTask.toString (sseth)
TEZ-729. Missing dependency on netty in tez-runtime module. (hitesh)
TEZ-721. Junit dependencies need to be specified in submodule pom files
TEZ-727. Fix private resource localization failures on secure clusters. (sseth)
TEZ-720. Inconsistency between VertexState and VertexStatus.State. (hitesh)
TEZ-726. TestVertexImpl and TestDAGImpl failing after TEZ-582 (bikas)
TEZ-725. Allow profiling of specific containers. (sseth)
TEZ-723. Fix missing mocks in TestVertexImpl post TEZ-688. (sseth)
TEZ-582. Refactor and abstract out VertexManager to enable users to plugin their own logic (bikas)
TEZ-688. Make use of DAG specific credentials in Session mode. (sseth)
TEZ-686. Add utility to visualize DAGs. (hitesh)
TEZ-716. Remove some unnecessary classes. (sseth)
TEZ-713. Fix typo in compression property name. (sseth)
TEZ-41. Get VertexCommitter from API and remove MRVertexOutputCommitter. (hitesh)
TEZ-685. Add archive link to mail list page. (hitesh)
TEZ-364. Make VertexScheduler an abstract class. Rename to VertexManager. (bikas)
TEZ-687. Allow re-localization of resources for a running container. (sseth)
TEZ-689. Write wordcount in Tez API (bikas)
TEZ-683. Diamond shape DAG fail. (hitesh)
TEZ-682. TezGroupedSplits fails with empty (zero length) file (bikas)
TEZ-681. Grouping generates incorrect splits if multiple DNs run on a single node (bikas)
Addendum to TEZ-675. Pre-empted taskAttempt gets marked as FAILED instead of KILLED (bikas)
TEZ-672. Fix Tez specific examples to work on secure clusters. (sseth)
TEZ-675. Pre-empted taskAttempt gets marked as FAILED instead of KILLED
TEZ-533. Exception thrown by a VertexCommitter kills the AM instead of just the DAG. (hitesh)
TEZ-664. Add ability to generate source and javadoc jars. (hitesh)
TEZ-606. Fix Tez to work on kerberos secured clusters. (sseth)
TEZ-667. BroadcastShuffleManager should not report errors after it has been closed. (sseth)
TEZ-666. Fix MRRSleep to respect command line parameters. (sseth)
TEZ-644. Failing unit test TestTaskScheduler. (hitesh)
TEZ-663. Log DAG diagnostics when OrderedWordCount fails. (hitesh)
TEZ-662. Fix YarnTezDAGChild to log exception on exit. (hitesh)
TEZ-653. Adding maven-compiler-plugin to pom.xml to force -source option to JVM. (Tsuyoshi Ozawa via hitesh)
TEZ-660. Successful TaskAttempts belonging to LeafVertices should not be marked as KILLED in case of NodeFailure. (sseth)
TEZ-638. Bring pipelined sorter up-to-date. (gopalv)
TEZ-658. YarnTezDagChild exits with a non 0 exit code even if a task succeeds. (sseth)
TEZ-659. Fix init thread pool in LogicalIOProcessorRuntimeTask for tasks with no inputs and outputs. (hitesh)
TEZ-656. Update site to match INSTALL.txt. (hitesh)
TEZ-655. Update docs for 0.2.0 release. (hitesh)
TEZ-654. Incorrect target index calculation after auto-reduce parallelism (bikas)
TEZ-652. Make OrderedWordCount do grouping in the AM (bikas)
TEZ-645. Re-use ID instances in the AM, intern vertex names etc where possible. (sseth)
TEZ-647. Add support for configurable max app attempts for Tez applications (bikas)
TEZ-648. Fix Notice file. (hitesh)
TEZ-642. Fix poms for release. (hitesh)
TEZ-643. Change getProgress APIs to return some form of progress and 1.0f once the map or reduce phase complete. (sseth)
Release 0.2.0-incubating: 2013-11-30
First version.