| Apache Tez Change Log |
| ===================== |
| |
| Release 0.4.0-incubating: 2014-04-02 |
| |
| ALL CHANGES |
| |
| TEZ-932 addendum. Add missing license to file. |
| TEZ-1001 addendum. Remove checked in jar. (sseth) |
| TEZ-1001. Change unit test for AM relocalization to generate a jar, and remove previously checked in test jar. (sseth) |
| TEZ-1000. Add a placeholder partitioned unsorted output. (sseth) |
| TEZ-932. Add a weighted scaling initial memory allocator. (sseth) |
| TEZ-989. Allow addition of resources to the AM for a running session. (sseth) |
| TEZ-991. Fix a bug which could cause getReader to occasionally hang on ShuffledMergedInput. (sseth) |
| TEZ-990. Add support for a credentials file read by TezClient. (sseth) |
| TEZ-983. Support a helper function to extract required additional tokens from a file. (hitesh) |
| TEZ-981. Port MAPREDUCE-3685. (sseth) |
| TEZ-980. Data recovered flag file should be a namenode only OP. (hitesh) |
| TEZ-973. Abort additional attempts if recovery fails. (hitesh) |
| TEZ-976. WordCount example does not handle -D<param> args. (Tassapol Athiapinya via hitesh) |
| TEZ-977. Flaky test: TestAMNodeMap - add more debug logging. (hitesh) |
| TEZ-8649. Fix BufferOverflow in PipelinedSorter by ensuring the last block is big enough for one k,v pair (gopalv) |
| TEZ-972. Shuffle Phase - optimize memory usage of empty partition data in DataMovementEvent. Contributed by Rajesh Balamohan. |
| TEZ-879. Fix potential synchronization issues in runtime components. (sseth) |
| TEZ-876. Fix graphviz dag generation for Hive. (hitesh) |
| TEZ-951. Port MAPREDUCE-5028. Contributed by Gopal V and Siddharth Seth. |
| TEZ-950. Port MAPREDUCE-5462. Contributed by Gopal V and Siddharth Seth. |
| TEZ-971. Change Shuffle to report errors early rather than waiting for access before reporting them. (sseth) |
| TEZ-970. Consolidate Shuffle payload to have a single means of indicating absence of a partition. (sseth) |
| TEZ-969. Fix a bug which could cause the Merger to get stuck when merging 0 segments, in case of container re-use. (sseth) |
| TEZ-968. Fix flush mechanism when max unflushed events count set to -1. (hitesh) |
| TEZ-948. Log counters at the end of Task execution. (sseth) |
| TEZ-964. Flaky test: TestAMNodeMap.testNodeSelfBlacklist. (hitesh) |
| TEZ-966. Tez AM has invalid state transition error when datanode is bad. (hitesh) |
| TEZ-949. Handle Session Tokens for Recovery. (hitesh) |
| TEZ-955. Tez should close inputs after calling processor's close. (hitesh) |
| TEZ-953. Port MAPREDUCE-5493. (sseth) |
| TEZ-958. Increase sleeps in TestContainerReuse to fix flaky tests. (hitesh) |
| TEZ-952. Port MAPREDUCE-5209 and MAPREDUCE-5251. (sseth) |
| TEZ-956. Handle zero task vertices correctly on Recovery. (hitesh) |
| TEZ-938. Avoid fetching empty partitions when the OnFileSortedOutput, ShuffledMergedInput pair is used. Contributed by Rajesh Balamohan. |
| TEZ-934. Add configuration fields for Tez Local Mode (part of TEZ-684). Contributed by Chen He. |
| TEZ-947. Fix OrderedWordCount job progress reporting to work across AM attempts. (hitesh) |
| TEZ-944. Tez Job gets "Could not load native gpl library" Error. (hitesh) |
| TEZ-851. Handle failure to persist events to HDFS. (hitesh) |
| TEZ-940. Fix a memory leak in TaskSchedulerAppCallbackWrapper. Contributed by Gopal V and Siddharth Seth. |
| TEZ-939. Fix build break caused by changes in YARN-1824. (sseth) |
| TEZ-903. Make max maps per fetcher configurable in ShuffleScheduler. Contributed by Rajesh Balamohan. |
| TEZ-942. Mrrsleep job with only maps fails with 'Illegal output to map'. (hitesh) |
| TEZ-936. Remove unused event types from AMSchedulerEventType. (sseth) |
| TEZ-933 addendum. Fix a unit test to work correctly. (sseth) |
| TEZ-933. Change Vertex.getNumTasks to return the initially configured number of tasks if initialization has not completed. (sseth) depending on when it's called. (sseth) |
| TEZ-935. Fix function names for Vertex Stats. (hitesh) |
| TEZ-930. Addendum patch. Provide additional aggregated task stats at the vertex level. (hitesh) |
| TEZ-930. Provide additional aggregated task stats at the vertex level. (hitesh) |
| TEZ-931. DAGHistoryEvent should not be allowed to be sent to Dispatcher/EventHandlers. (hitesh) |
| TEZ-904. Committer recovery events should be out-of-band. (hitesh) |
| TEZ-813. Remove unused imports across project. (Jonathan Eagles via hitesh) |
| TEZ-920. Increase defaults for number of counters, ensure AM uses this. (sseth) |
| TEZ-928. NPE in last app attempt caused by registering for an RM unregister. (hitesh) |
| TEZ-676. Tez job fails on client side if nodemanager running AM is lost. (Tsuyoshi Ozawa and hitesh via hitesh) |
| TEZ-918. Fix a bug which could cause shuffle to hang if there are intermittent fetch failures. (sseth) |
| TEZ-910. Allow ShuffledUnorderedKVInput to work for cases other than broadcast. (sseth) |
| TEZ-911. Re-factor BroadcastShuffle related code to be independent of Braodcast. (sseth) |
| TEZ-901. Improvements to Counters generated by runtime components. (sseth) |
| TEZ-902. SortedMergedInput Fetcher can hang on retrying a bad input (bikas) |
| Addendum to TEZ-804. Remove comments to add test since test has been added. Some new logs added. (bikas) |
| TEZ-847. Support basic AM recovery. (hitesh) |
| TEZ-915. TaskScheduler can get hung when all headroom is used and it cannot utilize existing new containers (bikas) |
| TEZ-894. Tez should have a way to know build manifest. (Ashish Singh via hitesh) |
| TEZ-893. Terasort gives ArrayIndexOutOfBound Exception for 'hadoop jar <jar> terasort'. (hitesh) |
| TEZ-884. Add parameter checking for context related user API's (Tsuyoshi Ozawa via bikas) |
| TEZ-906. Finalize Tez 0.3.0 release. (hitesh) |
| TEZ-535. Redirect AM logs to different files when running multiple DAGs in the same AM. Contributed by Mohammad Kamrul Islam. |
| TEZ-887. Allow counters to be separated at a per Input/Output level. (sseth) |
| TEZ-898. Handle inputReady and initial memory request in case of 0 physical inputs. (sseth) |
| TEZ-896. Fix AM splits to work on secure clusters when using the mapred API. (sseth) |
| TEZ-715. Auto Reduce Parallelism can rarely trigger NPE in AM at DAGAppMaster.handle(DAGAppMaster.java:1268) (bikas) |
| TEZ-891. TestTaskScheduler does not handle mockApp being called on different thread (bikas) |
| TEZ-889. Fixes a bug in MRInputSplitDistributor (caused by TEZ-880). (sseth) |
| TEZ-883. Fix unit tests to use comma separated values in tests having output verification at more than 1 tasks. (Tassapol Athiapinya via bikas) |
| TEZ-888. TestAMNodeMap.testNodeSelfBlacklist failing intermittently (bikas) |
| TEZ-865. TezTaskContext.getDAGName() does not return DAG name (Tsuyoshi OZAWA via bikas) |
| TEZ-885. TestTaskScheduler intermittent failures (bikas) |
| TEZ-882. Update versions in master for next release. (hitesh) |
| |
| |
| Release 0.3.0-incubating: 2014-02-26 |
| |
| INCOMPATIBLE CHANGES |
| |
| TEZ-720. Inconsistency between VertexState and VertexStatus.State. |
| TEZ-41. Get VertexCommitter from API and remove MRVertexOutputCommitter. |
| TEZ-650. MRHelpers.createMRInputPayloadWithGrouping() methods should not |
| take an MRSplitsProto argument. |
| TEZ-827. Separate initialize and start operations on Inputs/Outputs. |
| TEZ-668. Allow Processors to control Input/Output start. |
| TEZ-837. Remove numTasks from VertexLocationHints. |
| |
| ALL CHANGES |
| |
| TEZ-889. Fixes a bug in MRInputSplitDistributor (caused by TEZ-880). (sseth) |
| TEZ-881. Update Notice and any other copyright tags for year 2014. (hitesh) |
| TEZ-874. Fetcher inputstream is not buffered. (Rajesh Balamohan via hitesh) |
| TEZ-880. Support sending deserialized data in RootInputDataInformationEvents. (sseth) |
| TEZ-779. Make Tez grouping splits logic possible outside InputFormat (bikas) |
| TEZ-844. Processors should have a mechanism to know when an Input is ready for consumption. (sseth) |
| TEZ-878. doAssignAll() in TaskScheduler ignores delayedContainers being out of sync with heldContainers (bikas) |
| TEZ-824. Failure of 2 tasks in a vertex having input failures (Tassapol Athiapinya via bikas) |
| Addendum patch for TEZ-769. Change Vertex.setParallelism() to accept a set of EdgeManagerDescriptors. (hitesh) |
| TEZ-769. Change Vertex.setParallelism() to accept a set of EdgeManagerDescriptors. (hitesh) |
| TEZ-863 Addendum. Queue events for relevant inputs untill the Input has been started. Fixes a potential NPE in case of no auto start. (Contributed by Rajesh Balamohan) |
| TEZ-289. DAGAppMasterShutdownHook should not report KILLED when exception brings down AM. (Tsuyoshi Ozawa via hitesh) |
| TEZ-835. Fix a bug which could cause a divide by zero if a very small sort buffer is configured. (sseth) |
| TEZ-842. Add built-in verification of expected execution pattern into TestFaultTolerance (Tassapol Athiapinya bikas) |
| TEZ-863. Queue events for relevant inputs untill the Input has been started. Fixes a potential NPE in case of no auto start. (sseth) |
| TEZ-788. Clean up dist tarballs. (Contributed by Jonathan Eagles) |
| TEZ-619. Failing test: TestTaskScheduler.testTaskSchedulerWithReuse (Jonathan Eagles via bikas) |
| TEZ-845. Handle un-blacklisting of nodes (bikas) |
| TEZ-847. Visualize tez statemachines. (Min Zhou via hitesh) |
| TEZ-823. Add DAGs with a vertex connecting with 2 downstream/upstream vertices and unit tests for fault tolerance on these DAGs (Tassapol Athiapinya via bikas) |
| TEZ-787. Revert Guava dependency to 11.0.2. (sseth) |
| TEZ-801. Support routing of event to multiple destination physical inputs (bikas) |
| TEZ-854. Fix non-recovery code path for sessions. (hitesh) |
| TEZ-837. Remove numTasks from VertexLocationHints. (sseth) |
| TEZ-668. Allow Processors to control Input / Output start. (sseth) |
| TEZ-843. Fix failing unit test post TEZ-756. (sseth) |
| TEZ-816. Add unit test for cascading input failure (Tassapol Athiapinya via bikas) |
| TEZ-825. Fix incorrect inheritance method calls in ThreeLevels and SixLevels failing dags. (Tassapol Athiapinya via bikas) |
| TEZ-756 Addendum. Fix a unit test failure. (sseth) |
| TEZ-840. Default value of DataMovementEventPayloadProto.data_generated should be true. (sseth) |
| TEZ-756. Allow VertexManagerPlugins to configure RootInputEvents without access to Tez internal structures. (sseth) |
| TEZ-804. Handle node loss/bad nodes (bikas) |
| TEZ-836. Add ConcatenatedKeyValueInput for vertex groups (Gunther Hagleitner via bikas) |
| TEZ-755. Change VertexManagerPlugin.initialize and context to be consistent with the rest of the context objects (bikas) |
| TEZ-833. Have the Tez task framework set Framework counters instead of MR Processors setting them. (sseth) |
| TEZ-826. Remove wordcountmrrtest example. (sseth) |
| TEZ-815. Split initialize and start implementations for the various Inputs and Outputs. (sseth) |
| TEZ-637. [MR Support] Add all required info into JobConf for MR related I/O/P (bikas) |
| TEZ-596. Change MRHelpers.serializeConf* methods to use Protobuf for serialization. Contributed by Mohammad Kamrul Islam |
| TEZ-832. Fix a race in MemoryDistributor. (sseth) |
| TEZ-827. Separate initialize and start operations on Inputs/Outputs. (sseth) |
| TEZ-831. Reduce line length of state machines (bikas) |
| TEZ-796. AM Hangs & does not kill containers when map-task fails (bikas) |
| TEZ-782. Scale I/O mem requirements if misconfigured. (sseth) |
| TEZ-819. YARNRunner should not put -m/-r in output name when using mapred API (bikas) |
| TEZ-773. Fix configuration of tez jars location in MiniTezCluster (bikas) |
| TEZ-799. Generate data to be used for recovery. (hitesh) |
| TEZ-812. exclude slf4j dependencies from hadoop. (Giridharan Kesavan via hitesh) |
| TEZ-810. Add Gunther H. to the website. (Gunther Hagleitner via hitesh) |
| TEZ-807. Build broken due to NPE (Patch by Gunther Hagleitner, reviewed by Siddharth Seth) |
| TEZ-777. Obtain tokens for LocalResources specified in the DAGPlan (Patch by Gunther Hagleitner, reviewed by Siddharth Seth) |
| TEZ-781. Add unit test for fault tolerance (input failure causes re-run of previous task under allowed maximum failed attempt) (Tassapol Athiapinya via bikas) |
| TEZ-745. Rename TEZ_AM_ABORT_ALL_OUTPUTS_ON_DAG_FAILURE in TezConfiguration to TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS (Jonathan Eagles via bikas) |
| TEZ-783. Add standard DAGs using failing processors/inputs for test purpose (Tassapol Athiapinya via bikas) |
| TEZ-773. Fix configuration of tez jars location in MiniTezCluster (bikas) |
| TEZ-798. Change DAG.addURIsForCredentials to accept a Collection instead of a List. (Gunther Hagleitner via sseth) |
| TEZ-797. Add documentation for some of the Tez config parameters. (sseth) |
| TEZ-766. Support an api to pre-warm containers for a session. (hitesh) |
| TEZ-774. Fix name resolution for local addresses in test (bikas) |
| TEZ-761. Replace MRJobConfig parameter references in the tez-dag module with Tez equivalents. (Mohammad Kamrul Islam via sseth) |
| TEZ-775. Fix usage of env vars (bikas) |
| TEZ-773. Fix configuration of tez jars location in MiniTezCluster (bikas) |
| TEZ-791. Remove ShuffleHandler and related classes from tez-library. (sseth) |
| TEZ-790. Addendum patch for TEZ-784: failing config constants renaming is incomplete. (Tassapol Athiapinya via bikas) |
| TEZ-718. Remove some unused classes - JobEndNotifier and Speculation. (Mohammad Kamrul Islam via sseth) |
| TEZ-784. Add TestDriver to allow cmd line submission of tests to a cluster (bikas) |
| TEZ-771. Dist build broken after TEZ-749. (Jonathan Eagles via hitesh) |
| TEZ-678. Support for union operations via VertexGroup abstraction (bikas) |
| TEZ-646. Introduce a CompositeDataMovementEvent to avoid multiple copies of the same payload in the AM. (sseth) |
| TEZ-752 addendum. Javadoc modifications for DAG.addURIsForCredentials. (sseth) |
| TEZ-768. Consolidate TokenCache, Master and related code. (sseth) |
| TEZ-752. Add an API to DAG to accept a list of URIs for which tokens are needed. (sseth) |
| TEZ-748. Test Tez Fault Tolerance (bikas) |
| TEZ-749. Maintain order of vertices as specified by the user (Jonathan Eagles via bikas) |
| TEZ-674. TezClient should obtain Tokens for the staging directory it uses. (sseth) |
| TEZ-665. Fix javadoc warnings. (Jonathan Eagles via hitesh) |
| TEZ-765. Allow tez.runtime.sort.threads > 1 to turn on PipelinedSorter (gopalv). |
| TEZ-763. Tez doesn't compile with Non-resolvable parent error. (Jonathan Eagles via hitesh) |
| TEZ-395. Allow credentials to be specified on a per DAG basis. (Contributed by Michael Weng) |
| tez-739 tez should include incubating keyword as part of the version string |
| TEZ-724. Allow configuration of CUSTOM edges on the DAG API. (sseth) |
| Addendum to TEZ-650. MRHelpers.createMRInputPayloadWithGrouping() methods should not take an MRSplitsProto argument (bikas) |
| TEZ-650. MRHelpers.createMRInputPayloadWithGrouping() methods should not take an MRSplitsProto argument (Mohammad Kamrul Islam via bikas) |
| TEZ-624. Fix MROutput to support multiple outputs to the same output location (bikas) |
| TEZ-722. Tez trunk doesn't compile against latest branch-2. (Jonathan Eagles via hitesh) |
| TEZ-728. Semantics of output commit (bikas) |
| TEZ-738. Hive query fails with Invalid event: TA_CONTAINER_PREEMPTED at SUCCEEDED (Hitesh Shah via bikas) |
| TEZ-140. Tez/hive task failure on large DAG with Invalid event: TA_SCHEDULE at KILLED (bikas) |
| TEZ-734. Fix the AppMaster to work in the context of the App Submitter's UGI. (sseth) |
| TEZ-735. Add timeout to tests in TestDAGImpl. (hitesh) |
| TEZ-732. OrderedWordCount not working after TEZ-582 (bikas) |
| TEZ-731. Fix a NullPointerException in ContainerTask.toString (sseth) |
| TEZ-729. Missing dependency on netty in tez-runtime module. (hitesh) |
| TEZ-721 Junit dependencies need to be specified in submodule pom files |
| TEZ-727. Fix private resource localization failures on secure clusters. (sseth) |
| TEZ-720. Inconsistency between VertexState and VertexStatus.State. (hitesh) |
| TEZ-726. TestVertexImpl and TestDAGImpl failing after TEZ-582 (bikas) |
| TEZ-725. Allow profiling of specific containers. (sseth) |
| TEZ-723. Fix missing mocks in TestVertexImpl post TEZ-688. (sseth) |
| TEZ-582. Refactor and abstract out VertexManager to enable users to plugin their own logic (bikas) |
| TEZ-688. Make use of DAG specific credentials in Session mode. (sseth) |
| TEZ-686. Add utility to visualize DAGs. (hitesh) |
| TEZ-716. Remove some unnecessary classes. (sseth) |
| TEZ-713. Fix typo in compression property name. (sseth) |
| TEZ-41. Get VertexCommitter from API and remove MRVertexOutputCommitter. (hitesh) |
| TEZ-685. Add archive link to mail list page. (hitesh) |
| TEZ-364. Make VertexScheduler and abstract class. Rename to VertexManager. (bikas) |
| TEZ-687. Allow re-localization of resources for a running container. (sseth) |
| TEZ-689. Write wordcount in Tez API (bikas) |
| TEZ-683. Diamond shape DAG fail. (hitesh) |
| TEZ-682. TezGroupedSplits fails with empty (zero length) file (bikas) |
| TEZ-681. Grouping generates incorrect splits if multiple DNs run on a single node (bikas) |
| Addendum to TEZ-675. Pre-empted taskAttempt gets marked as FAILED instead of KILLED (bikas) |
| TEZ-672. Fix Tez specific examples to work on secure clusters. (sseth) |
| TEZ-675. Pre-empted taskAttempt gets marked as FAILED instead of KILLED |
| TEZ-533. Exception thrown by a VertexCommitter kills the AM instead of just the DAG. (hitesh) |
| TEZ-664. Add ability to generate source and javadoc jars. (hitesh) |
| TEZ-606. Fix Tez to work on kerberos secured clusters. (sseth) |
| TEZ-667. BroadcastShuffleManager should not report errors after it has been closed. (sseth) |
| TEZ-666. Fix MRRSleep to respect command line parameters. (sseth) |
| TEZ-644. Failing unit test TestTaskScheduler. (hitesh) |
| TEZ-663. Log DAG diagnostics when OrderedWordCount fails. (hitesh) |
| TEZ-662. Fix YarnTezDAGChild to log exception on exit. (hitesh) |
| TEZ-653. Adding maven-compiler-plugin to pom.xml to force -source option to JVM. (Tsuyoshi Ozawa via hitesh) |
| TEZ-660. Successful TaskAttempts belonging to LeafVertices should not be marked as KILLED in case of NodeFailure. (sseth) case |
| TEZ-638. Bring pipelined sorter up-to-date. (gopalv) |
| TEZ-658. YarnTezDagChild exits with a non 0 exit code even if a task succeeds. (sseth) |
| TEZ-659. Fix init thread pool in LogicalIOProcessorRuntimeTask for tasks with no inputs and outputs. (hitesh) |
| TEZ-656. Update site to match INSTALL.txt. (hitesh) |
| TEZ-655. Update docs for 0.2.0 release. (hitesh) |
| TEZ-654. Incorrect target index calculation after auto-reduce parallelism (bikas) |
| TEZ-652. Make OrderedWordCount do grouping in the AM (bikas) |
| TEZ-645. Re-use ID instances in the AM, intern vertex names etc where possible. (sseth) |
| TEZ-647. Add support configurable max app attempts for Tez applications (bikas) |
| TEZ-648. Fix Notice file. (hitesh) |
| TEZ-642. Fix poms for release. (hitesh) |
| TEZ-643. Change getProgress APIs to return some form of progress and 1.0f once the map or reduce phase complete. (sseth) |
| |
| Release 0.2.0-incubating: 2013-11-30 |
| |
| First version. |