Apache Tez Change Log
=====================
Release 0.4.0-incubating: 2014-04-02
ALL CHANGES
TEZ-932 addendum. Add missing license to file.
TEZ-1001 addendum. Remove checked in jar. (sseth)
TEZ-1001. Change unit test for AM relocalization to generate a jar, and remove previously checked in test jar. (sseth)
TEZ-1000. Add a placeholder partitioned unsorted output. (sseth)
TEZ-932. Add a weighted scaling initial memory allocator. (sseth)
TEZ-989. Allow addition of resources to the AM for a running session. (sseth)
TEZ-991. Fix a bug which could cause getReader to occasionally hang on ShuffledMergedInput. (sseth)
TEZ-990. Add support for a credentials file read by TezClient. (sseth)
TEZ-983. Support a helper function to extract required additional tokens from a file. (hitesh)
TEZ-981. Port MAPREDUCE-3685. (sseth)
TEZ-980. Data recovered flag file should be a namenode only OP. (hitesh)
TEZ-973. Abort additional attempts if recovery fails. (hitesh)
TEZ-976. WordCount example does not handle -D<param> args. (Tassapol Athiapinya via hitesh)
TEZ-977. Flaky test: TestAMNodeMap - add more debug logging. (hitesh)
TEZ-8649. Fix BufferOverflow in PipelinedSorter by ensuring the last block is big enough for one k,v pair (gopalv)
TEZ-972. Shuffle Phase - optimize memory usage of empty partition data in DataMovementEvent. Contributed by Rajesh Balamohan.
TEZ-879. Fix potential synchronization issues in runtime components. (sseth)
TEZ-876. Fix graphviz dag generation for Hive. (hitesh)
TEZ-951. Port MAPREDUCE-5028. Contributed by Gopal V and Siddharth Seth.
TEZ-950. Port MAPREDUCE-5462. Contributed by Gopal V and Siddharth Seth.
TEZ-971. Change Shuffle to report errors early rather than waiting for access before reporting them. (sseth)
TEZ-970. Consolidate Shuffle payload to have a single means of indicating absence of a partition. (sseth)
TEZ-969. Fix a bug which could cause the Merger to get stuck when merging 0 segments, in case of container re-use. (sseth)
TEZ-968. Fix flush mechanism when max unflushed events count set to -1. (hitesh)
TEZ-948. Log counters at the end of Task execution. (sseth)
TEZ-964. Flaky test: TestAMNodeMap.testNodeSelfBlacklist. (hitesh)
TEZ-966. Tez AM has invalid state transition error when datanode is bad. (hitesh)
TEZ-949. Handle Session Tokens for Recovery. (hitesh)
TEZ-955. Tez should close inputs after calling processor's close. (hitesh)
TEZ-953. Port MAPREDUCE-5493. (sseth)
TEZ-958. Increase sleeps in TestContainerReuse to fix flaky tests. (hitesh)
TEZ-952. Port MAPREDUCE-5209 and MAPREDUCE-5251. (sseth)
TEZ-956. Handle zero task vertices correctly on Recovery. (hitesh)
TEZ-938. Avoid fetching empty partitions when the OnFileSortedOutput, ShuffledMergedInput pair is used. Contributed by Rajesh Balamohan.
TEZ-934. Add configuration fields for Tez Local Mode (part of TEZ-684). Contributed by Chen He.
TEZ-947. Fix OrderedWordCount job progress reporting to work across AM attempts. (hitesh)
TEZ-944. Tez Job gets "Could not load native gpl library" Error. (hitesh)
TEZ-851. Handle failure to persist events to HDFS. (hitesh)
TEZ-940. Fix a memory leak in TaskSchedulerAppCallbackWrapper. Contributed by Gopal V and Siddharth Seth.
TEZ-939. Fix build break caused by changes in YARN-1824. (sseth)
TEZ-903. Make max maps per fetcher configurable in ShuffleScheduler. Contributed by Rajesh Balamohan.
TEZ-942. Mrrsleep job with only maps fails with 'Illegal output to map'. (hitesh)
TEZ-936. Remove unused event types from AMSchedulerEventType. (sseth)
TEZ-933 addendum. Fix a unit test to work correctly. (sseth)
TEZ-933. Change Vertex.getNumTasks to return the initially configured number of tasks if initialization has not completed. (sseth)
TEZ-935. Fix function names for Vertex Stats. (hitesh)
TEZ-930. Addendum patch. Provide additional aggregated task stats at the vertex level. (hitesh)
TEZ-930. Provide additional aggregated task stats at the vertex level. (hitesh)
TEZ-931. DAGHistoryEvent should not be allowed to be sent to Dispatcher/EventHandlers. (hitesh)
TEZ-904. Committer recovery events should be out-of-band. (hitesh)
TEZ-813. Remove unused imports across project. (Jonathan Eagles via hitesh)
TEZ-920. Increase defaults for number of counters, ensure AM uses this. (sseth)
TEZ-928. NPE in last app attempt caused by registering for an RM unregister. (hitesh)
TEZ-676. Tez job fails on client side if nodemanager running AM is lost. (Tsuyoshi Ozawa and hitesh via hitesh)
TEZ-918. Fix a bug which could cause shuffle to hang if there are intermittent fetch failures. (sseth)
TEZ-910. Allow ShuffledUnorderedKVInput to work for cases other than broadcast. (sseth)
TEZ-911. Re-factor BroadcastShuffle related code to be independent of Broadcast. (sseth)
TEZ-901. Improvements to Counters generated by runtime components. (sseth)
TEZ-902. SortedMergedInput Fetcher can hang on retrying a bad input (bikas)
Addendum to TEZ-804. Remove comments to add test since test has been added. Some new logs added. (bikas)
TEZ-847. Support basic AM recovery. (hitesh)
TEZ-915. TaskScheduler can get hung when all headroom is used and it cannot utilize existing new containers (bikas)
TEZ-894. Tez should have a way to know build manifest. (Ashish Singh via hitesh)
TEZ-893. Terasort gives ArrayIndexOutOfBound Exception for 'hadoop jar <jar> terasort'. (hitesh)
TEZ-884. Add parameter checking for context related user APIs (Tsuyoshi Ozawa via bikas)
TEZ-906. Finalize Tez 0.3.0 release. (hitesh)
TEZ-535. Redirect AM logs to different files when running multiple DAGs in the same AM. Contributed by Mohammad Kamrul Islam.
TEZ-887. Allow counters to be separated at a per Input/Output level. (sseth)
TEZ-898. Handle inputReady and initial memory request in case of 0 physical inputs. (sseth)
TEZ-896. Fix AM splits to work on secure clusters when using the mapred API. (sseth)
TEZ-715. Auto Reduce Parallelism can rarely trigger NPE in AM at DAGAppMaster.handle(DAGAppMaster.java:1268) (bikas)
TEZ-891. TestTaskScheduler does not handle mockApp being called on different thread (bikas)
TEZ-889. Fixes a bug in MRInputSplitDistributor (caused by TEZ-880). (sseth)
TEZ-883. Fix unit tests to use comma separated values in tests having output verification at more than 1 tasks. (Tassapol Athiapinya via bikas)
TEZ-888. TestAMNodeMap.testNodeSelfBlacklist failing intermittently (bikas)
TEZ-865. TezTaskContext.getDAGName() does not return DAG name (Tsuyoshi OZAWA via bikas)
TEZ-885. TestTaskScheduler intermittent failures (bikas)
TEZ-882. Update versions in master for next release. (hitesh)
Release 0.3.0-incubating: 2014-02-26
INCOMPATIBLE CHANGES
TEZ-720. Inconsistency between VertexState and VertexStatus.State.
TEZ-41. Get VertexCommitter from API and remove MRVertexOutputCommitter.
TEZ-650. MRHelpers.createMRInputPayloadWithGrouping() methods should not take an MRSplitsProto argument.
TEZ-827. Separate initialize and start operations on Inputs/Outputs.
TEZ-668. Allow Processors to control Input/Output start.
TEZ-837. Remove numTasks from VertexLocationHints.
ALL CHANGES
TEZ-889. Fixes a bug in MRInputSplitDistributor (caused by TEZ-880). (sseth)
TEZ-881. Update Notice and any other copyright tags for year 2014. (hitesh)
TEZ-874. Fetcher inputstream is not buffered. (Rajesh Balamohan via hitesh)
TEZ-880. Support sending deserialized data in RootInputDataInformationEvents. (sseth)
TEZ-779. Make Tez grouping splits logic possible outside InputFormat (bikas)
TEZ-844. Processors should have a mechanism to know when an Input is ready for consumption. (sseth)
TEZ-878. doAssignAll() in TaskScheduler ignores delayedContainers being out of sync with heldContainers (bikas)
TEZ-824. Failure of 2 tasks in a vertex having input failures (Tassapol Athiapinya via bikas)
Addendum patch for TEZ-769. Change Vertex.setParallelism() to accept a set of EdgeManagerDescriptors. (hitesh)
TEZ-769. Change Vertex.setParallelism() to accept a set of EdgeManagerDescriptors. (hitesh)
TEZ-863 Addendum. Queue events for relevant inputs until the Input has been started. Fixes a potential NPE in case of no auto start. (Contributed by Rajesh Balamohan)
TEZ-289. DAGAppMasterShutdownHook should not report KILLED when exception brings down AM. (Tsuyoshi Ozawa via hitesh)
TEZ-835. Fix a bug which could cause a divide by zero if a very small sort buffer is configured. (sseth)
TEZ-842. Add built-in verification of expected execution pattern into TestFaultTolerance (Tassapol Athiapinya via bikas)
TEZ-863. Queue events for relevant inputs until the Input has been started. Fixes a potential NPE in case of no auto start. (sseth)
TEZ-788. Clean up dist tarballs. (Contributed by Jonathan Eagles)
TEZ-619. Failing test: TestTaskScheduler.testTaskSchedulerWithReuse (Jonathan Eagles via bikas)
TEZ-845. Handle un-blacklisting of nodes (bikas)
TEZ-847. Visualize tez statemachines. (Min Zhou via hitesh)
TEZ-823. Add DAGs with a vertex connecting with 2 downstream/upstream vertices and unit tests for fault tolerance on these DAGs (Tassapol Athiapinya via bikas)
TEZ-787. Revert Guava dependency to 11.0.2. (sseth)
TEZ-801. Support routing of event to multiple destination physical inputs (bikas)
TEZ-854. Fix non-recovery code path for sessions. (hitesh)
TEZ-837. Remove numTasks from VertexLocationHints. (sseth)
TEZ-668. Allow Processors to control Input / Output start. (sseth)
TEZ-843. Fix failing unit test post TEZ-756. (sseth)
TEZ-816. Add unit test for cascading input failure (Tassapol Athiapinya via bikas)
TEZ-825. Fix incorrect inheritance method calls in ThreeLevels and SixLevels failing dags. (Tassapol Athiapinya via bikas)
TEZ-756 Addendum. Fix a unit test failure. (sseth)
TEZ-840. Default value of DataMovementEventPayloadProto.data_generated should be true. (sseth)
TEZ-756. Allow VertexManagerPlugins to configure RootInputEvents without access to Tez internal structures. (sseth)
TEZ-804. Handle node loss/bad nodes (bikas)
TEZ-836. Add ConcatenatedKeyValueInput for vertex groups (Gunther Hagleitner via bikas)
TEZ-755. Change VertexManagerPlugin.initialize and context to be consistent with the rest of the context objects (bikas)
TEZ-833. Have the Tez task framework set Framework counters instead of MR Processors setting them. (sseth)
TEZ-826. Remove wordcountmrrtest example. (sseth)
TEZ-815. Split initialize and start implementations for the various Inputs and Outputs. (sseth)
TEZ-637. [MR Support] Add all required info into JobConf for MR related I/O/P (bikas)
TEZ-596. Change MRHelpers.serializeConf* methods to use Protobuf for serialization. Contributed by Mohammad Kamrul Islam
TEZ-832. Fix a race in MemoryDistributor. (sseth)
TEZ-827. Separate initialize and start operations on Inputs/Outputs. (sseth)
TEZ-831. Reduce line length of state machines (bikas)
TEZ-796. AM Hangs & does not kill containers when map-task fails (bikas)
TEZ-782. Scale I/O mem requirements if misconfigured. (sseth)
TEZ-819. YARNRunner should not put -m/-r in output name when using mapred API (bikas)
TEZ-773. Fix configuration of tez jars location in MiniTezCluster (bikas)
TEZ-799. Generate data to be used for recovery. (hitesh)
TEZ-812. exclude slf4j dependencies from hadoop. (Giridharan Kesavan via hitesh)
TEZ-810. Add Gunther H. to the website. (Gunther Hagleitner via hitesh)
TEZ-807. Build broken due to NPE (Patch by Gunther Hagleitner, reviewed by Siddharth Seth)
TEZ-777. Obtain tokens for LocalResources specified in the DAGPlan (Patch by Gunther Hagleitner, reviewed by Siddharth Seth)
TEZ-781. Add unit test for fault tolerance (input failure causes re-run of previous task under allowed maximum failed attempt) (Tassapol Athiapinya via bikas)
TEZ-745. Rename TEZ_AM_ABORT_ALL_OUTPUTS_ON_DAG_FAILURE in TezConfiguration to TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS (Jonathan Eagles via bikas)
TEZ-783. Add standard DAGs using failing processors/inputs for test purpose (Tassapol Athiapinya via bikas)
TEZ-773. Fix configuration of tez jars location in MiniTezCluster (bikas)
TEZ-798. Change DAG.addURIsForCredentials to accept a Collection instead of a List. (Gunther Hagleitner via sseth)
TEZ-797. Add documentation for some of the Tez config parameters. (sseth)
TEZ-766. Support an api to pre-warm containers for a session. (hitesh)
TEZ-774. Fix name resolution for local addresses in test (bikas)
TEZ-761. Replace MRJobConfig parameter references in the tez-dag module with Tez equivalents. (Mohammad Kamrul Islam via sseth)
TEZ-775. Fix usage of env vars (bikas)
TEZ-773. Fix configuration of tez jars location in MiniTezCluster (bikas)
TEZ-791. Remove ShuffleHandler and related classes from tez-library. (sseth)
TEZ-790. Addendum patch for TEZ-784: failing config constants renaming is incomplete. (Tassapol Athiapinya via bikas)
TEZ-718. Remove some unused classes - JobEndNotifier and Speculation. (Mohammad Kamrul Islam via sseth)
TEZ-784. Add TestDriver to allow cmd line submission of tests to a cluster (bikas)
TEZ-771. Dist build broken after TEZ-749. (Jonathan Eagles via hitesh)
TEZ-678. Support for union operations via VertexGroup abstraction (bikas)
TEZ-646. Introduce a CompositeDataMovementEvent to avoid multiple copies of the same payload in the AM. (sseth)
TEZ-752 addendum. Javadoc modifications for DAG.addURIsForCredentials. (sseth)
TEZ-768. Consolidate TokenCache, Master and related code. (sseth)
TEZ-752. Add an API to DAG to accept a list of URIs for which tokens are needed. (sseth)
TEZ-748. Test Tez Fault Tolerance (bikas)
TEZ-749. Maintain order of vertices as specified by the user (Jonathan Eagles via bikas)
TEZ-674. TezClient should obtain Tokens for the staging directory it uses. (sseth)
TEZ-665. Fix javadoc warnings. (Jonathan Eagles via hitesh)
TEZ-765. Allow tez.runtime.sort.threads > 1 to turn on PipelinedSorter (gopalv).
TEZ-763. Tez doesn't compile with Non-resolvable parent error. (Jonathan Eagles via hitesh)
TEZ-395. Allow credentials to be specified on a per DAG basis. (Contributed by Michael Weng)
TEZ-739. Tez should include incubating keyword as part of the version string.
TEZ-724. Allow configuration of CUSTOM edges on the DAG API. (sseth)
Addendum to TEZ-650. MRHelpers.createMRInputPayloadWithGrouping() methods should not take an MRSplitsProto argument (bikas)
TEZ-650. MRHelpers.createMRInputPayloadWithGrouping() methods should not take an MRSplitsProto argument (Mohammad Kamrul Islam via bikas)
TEZ-624. Fix MROutput to support multiple outputs to the same output location (bikas)
TEZ-722. Tez trunk doesn't compile against latest branch-2. (Jonathan Eagles via hitesh)
TEZ-728. Semantics of output commit (bikas)
TEZ-738. Hive query fails with Invalid event: TA_CONTAINER_PREEMPTED at SUCCEEDED (Hitesh Shah via bikas)
TEZ-140. Tez/hive task failure on large DAG with Invalid event: TA_SCHEDULE at KILLED (bikas)
TEZ-734. Fix the AppMaster to work in the context of the App Submitter's UGI. (sseth)
TEZ-735. Add timeout to tests in TestDAGImpl. (hitesh)
TEZ-732. OrderedWordCount not working after TEZ-582 (bikas)
TEZ-731. Fix a NullPointerException in ContainerTask.toString (sseth)
TEZ-729. Missing dependency on netty in tez-runtime module. (hitesh)
TEZ-721. JUnit dependencies need to be specified in submodule pom files.
TEZ-727. Fix private resource localization failures on secure clusters. (sseth)
TEZ-720. Inconsistency between VertexState and VertexStatus.State. (hitesh)
TEZ-726. TestVertexImpl and TestDAGImpl failing after TEZ-582 (bikas)
TEZ-725. Allow profiling of specific containers. (sseth)
TEZ-723. Fix missing mocks in TestVertexImpl post TEZ-688. (sseth)
TEZ-582. Refactor and abstract out VertexManager to enable users to plugin their own logic (bikas)
TEZ-688. Make use of DAG specific credentials in Session mode. (sseth)
TEZ-686. Add utility to visualize DAGs. (hitesh)
TEZ-716. Remove some unnecessary classes. (sseth)
TEZ-713. Fix typo in compression property name. (sseth)
TEZ-41. Get VertexCommitter from API and remove MRVertexOutputCommitter. (hitesh)
TEZ-685. Add archive link to mail list page. (hitesh)
TEZ-364. Make VertexScheduler an abstract class. Rename to VertexManager. (bikas)
TEZ-687. Allow re-localization of resources for a running container. (sseth)
TEZ-689. Write wordcount in Tez API (bikas)
TEZ-683. Diamond shape DAG fails. (hitesh)
TEZ-682. TezGroupedSplits fails with empty (zero length) file (bikas)
TEZ-681. Grouping generates incorrect splits if multiple DNs run on a single node (bikas)
Addendum to TEZ-675. Pre-empted taskAttempt gets marked as FAILED instead of KILLED (bikas)
TEZ-672. Fix Tez specific examples to work on secure clusters. (sseth)
TEZ-675. Pre-empted taskAttempt gets marked as FAILED instead of KILLED
TEZ-533. Exception thrown by a VertexCommitter kills the AM instead of just the DAG. (hitesh)
TEZ-664. Add ability to generate source and javadoc jars. (hitesh)
TEZ-606. Fix Tez to work on kerberos secured clusters. (sseth)
TEZ-667. BroadcastShuffleManager should not report errors after it has been closed. (sseth)
TEZ-666. Fix MRRSleep to respect command line parameters. (sseth)
TEZ-644. Failing unit test TestTaskScheduler. (hitesh)
TEZ-663. Log DAG diagnostics when OrderedWordCount fails. (hitesh)
TEZ-662. Fix YarnTezDAGChild to log exception on exit. (hitesh)
TEZ-653. Adding maven-compiler-plugin to pom.xml to force -source option to JVM. (Tsuyoshi Ozawa via hitesh)
TEZ-660. Successful TaskAttempts belonging to LeafVertices should not be marked as KILLED in case of NodeFailure. (sseth)
TEZ-638. Bring pipelined sorter up-to-date. (gopalv)
TEZ-658. YarnTezDagChild exits with a non 0 exit code even if a task succeeds. (sseth)
TEZ-659. Fix init thread pool in LogicalIOProcessorRuntimeTask for tasks with no inputs and outputs. (hitesh)
TEZ-656. Update site to match INSTALL.txt. (hitesh)
TEZ-655. Update docs for 0.2.0 release. (hitesh)
TEZ-654. Incorrect target index calculation after auto-reduce parallelism (bikas)
TEZ-652. Make OrderedWordCount do grouping in the AM (bikas)
TEZ-645. Re-use ID instances in the AM, intern vertex names etc where possible. (sseth)
TEZ-647. Add support for configurable max app attempts for Tez applications (bikas)
TEZ-648. Fix Notice file. (hitesh)
TEZ-642. Fix poms for release. (hitesh)
TEZ-643. Change getProgress APIs to return some form of progress and 1.0f once the map or reduce phase completes. (sseth)
Release 0.2.0-incubating: 2013-11-30
First version.