Apache Hadoop Changelog

Release 0.13.0 - 2007-06-08

INCOMPATIBLE CHANGES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| --- | --- | --- | --- | --- | --- |

IMPORTANT ISSUES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| --- | --- | --- | --- | --- | --- |

NEW FEATURES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| --- | --- | --- | --- | --- | --- |
| HADOOP-1251 | A method to get the InputSplit from a Mapper | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-1247 | Make Hadoop Abacus work with Hadoop Streaming | Major | . | Runping Qi | Runping Qi |
| HADOOP-1217 | Specify a junit test timeout in build.xml files | Minor | build | Nigel Daley | Nigel Daley |
| HADOOP-1216 | Hadoop should support reduce none option | Major | . | Runping Qi | Runping Qi |
| HADOOP-1120 | Contribute some code helping implement map/reduce apps for joining data from multiple sources | Major | . | Runping Qi | Runping Qi |
| HADOOP-1111 | Job completion notification to a job configured URL | Major | . | Alejandro Abdelnur |  |
| HADOOP-702 | DFS Upgrade Proposal | Major | . | Konstantin Shvachko | Konstantin Shvachko |
| HADOOP-485 | allow a different comparator for grouping keys in calls to reduce | Major | . | Owen O'Malley | Tahir Hashmi |

IMPROVEMENTS:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| --- | --- | --- | --- | --- | --- |
| HADOOP-1326 | Return the RunningJob from JobClient.runJob | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-1324 | FSError encountered by one running task should not be fatal to other tasks on that node | Major | . | Devaraj Das | Arun C Murthy |
| HADOOP-1315 | Hadoop Streaming code clean up | Major | . | Runping Qi | Runping Qi |
| HADOOP-1308 | Tighten generic Class restrictions in JobConf.java | Minor | . | Michael Bieniosek | Michael Bieniosek |
| HADOOP-1304 | MAX_TASK_FAILURES should be configurable | Major | . | Christian Kunz | Devaraj Das |
| HADOOP-1290 | Move Hadoop Abacus to hadoop.mapred.lib | Major | . | Runping Qi |  |
| HADOOP-1284 | clean up the protocol between stream mapper/reducer and the framework | Major | . | Runping Qi | Runping Qi |
| HADOOP-1276 | TaskTracker expiry interval is not configurable | Major | . | Alejandro Abdelnur | Arun C Murthy |
| HADOOP-1270 | Randomize the fetch of map outputs | Major | . | Arun C Murthy | Arun C Murthy |
| HADOOP-1263 | retry logic when dfs exist or open fails temporarily, e.g because of timeout | Major | . | Christian Kunz | Hairong Kuang |
| HADOOP-1260 | need code review guidelines | Major | build | Nigel Daley | Nigel Daley |
| HADOOP-1250 | Remove the MustangFile class from streaming and promote the chmod into FileUtils | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-1214 | the first step for streaming clean up | Major | . | Runping Qi | Runping Qi |
| HADOOP-1213 | When RPC call fails then log call message detail | Minor | ipc | Nigel Daley | Doug Cutting |
| HADOOP-1194 | map output should not do block level compression | Major | . | Runping Qi | Arun C Murthy |
| HADOOP-1190 | Fix unchecked warnings | Major | . | Tom White | Tom White |
| HADOOP-1167 | InMemoryFileSystem uses synchronizedtMaps with maps that are locked anyways | Minor | fs | Owen O'Malley | Owen O'Malley |
| HADOOP-1166 | Pull the NullOutputFormat into the lib package | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-1165 | Code for toString in code generated by Record I/O Compiler can be generic | Minor | record | Milind Bhandarkar | Milind Bhandarkar |
| HADOOP-1161 | need improved release process | Major | build | Doug Cutting | Doug Cutting |
| HADOOP-1148 | re-indent all code | Minor | . | Doug Cutting | Doug Cutting |
| HADOOP-1144 | Hadoop should allow a configurable percentage of failed map tasks before declaring a job failed. | Major | . | Christian Kunz | Arun C Murthy |
| HADOOP-1133 | Tools to analyze and debug namenode on a production cluster | Major | . | dhruba borthakur | dhruba borthakur |
| HADOOP-1131 | Add a closeAll() static method to FileSystem | Minor | . | Philippe Gassmann |  |
| HADOOP-1127 | Speculative Execution and output of Reduce tasks | Major | . | Arun C Murthy | Arun C Murthy |
| HADOOP-1116 | Add maxmemory=“256m” in the junit call of build-contrib.xml | Major | build | Philippe Gassmann |  |
| HADOOP-1101 | Add more statistics in the web-ui to do with tasks | Major | . | Devaraj Das | Devaraj Das |
| HADOOP-1094 | Optimize readFields and write methods in record I/O | Major | record | Milind Bhandarkar | Milind Bhandarkar |
| HADOOP-1068 | Improve error message for 0 datanode case | Major | . | Owen O'Malley | dhruba borthakur |
| HADOOP-988 | Namenode should use single map for block to its meta data. | Major | . | Raghu Angadi | Raghu Angadi |
| HADOOP-978 | AlreadyBeingCreatedException detail message could contain more useful info | Minor | . | Nigel Daley | Konstantin Shvachko |
| HADOOP-971 | DFS Scalabilty: Improve name node performance by adding a hostname to datanodes map | Major | . | Hairong Kuang | Hairong Kuang |
| HADOOP-968 | Reduce shuffle and merge should be done a child JVM | Major | . | Owen O'Malley | Devaraj Das |
| HADOOP-819 | LineRecordWriter should not always insert tab char between key and value | Major | . | Runping Qi | Runping Qi |

BUG FIXES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| --- | --- | --- | --- | --- | --- |
| HADOOP-1452 | map output transfers of more than 2^31 bytes output are failing | Blocker | . | Christian Kunz | Owen O'Malley |
| HADOOP-1435 | FileSystem.globPaths should not create a Path from an empty string | Blocker | fs | Hairong Kuang | Hairong Kuang |
| HADOOP-1431 | Map tasks can't timeout for failing to call progress | Blocker | . | Owen O'Malley | Arun C Murthy |
| HADOOP-1427 | Typo in GzipCodec.createInputStream - bufferSize | Blocker | io | Espen Amble Kolstad | Espen Amble Kolstad |
| HADOOP-1411 | AlreadyBeingCreatedException from task retries | Blocker | . | Nigel Daley | Hairong Kuang |
| HADOOP-1407 | Failed tasks not killing job | Blocker | . | Nigel Daley | Arun C Murthy |
| HADOOP-1388 | Possible Null Pointer Dereference in taskdetails.jsp | Major | . | Devaraj Das | Devaraj Das |
| HADOOP-1386 | The constructor of Path should not take an empty string as a parameter | Blocker | . | Hairong Kuang | Hairong Kuang |
| HADOOP-1385 | MD5Hash has a bad hash function | Major | io | Owen O'Malley | Owen O'Malley |
| HADOOP-1369 | Inconsistent synchronization of TaskTracker fields | Blocker | . | Nigel Daley | Owen O'Malley |
| HADOOP-1368 | Inconsistent synchronization of 3 fields in JobInProgress.java | Blocker | . | Nigel Daley | Owen O'Malley |
| HADOOP-1363 | waitForCompletion() calls Thread.sleep() with a lock held | Blocker | . | Nigel Daley | Owen O'Malley |
| HADOOP-1361 | seek calls in 3 io classes ignore result of skipBytes(int) | Blocker | io | Nigel Daley | Hairong Kuang |
| HADOOP-1358 | seek call ignores result of skipBytes(int) | Blocker | . | Nigel Daley | Hairong Kuang |
| HADOOP-1356 | ValueHistogram.addNextValue(Object) ignores return value of String.substring(int, int) | Blocker | . | Nigel Daley | Runping Qi |
| HADOOP-1354 | Null pointer dereference of paths in FsShell.dus(String) | Blocker | fs | Nigel Daley | Hairong Kuang |
| HADOOP-1353 | Null pointer dereference of nodeInfo in FSNamesystem.removeDatanode(DatanodeID) | Blocker | . | Nigel Daley | dhruba borthakur |
| HADOOP-1350 | Shuffle started taking a very long time after the HADOOP-1176 fix | Blocker | . | Devaraj Das | Devaraj Das |
| HADOOP-1345 | Checksum object does not get restored to the old state in retries when handle ChecksumException | Blocker | . | Hairong Kuang | Hairong Kuang |
| HADOOP-1332 | Sporadic unit test failures (TestMiniMRClasspath, TestMiniMRLocalFS, TestMiniMRDFSCaching) | Blocker | . | Nigel Daley | Arun C Murthy |
| HADOOP-1322 | Tasktracker blacklist leads to hung jobs in single-node cluster | Critical | . | Arun C Murthy | Arun C Murthy |
| HADOOP-1312 | heartbeat monitor thread goes away | Blocker | . | dhruba borthakur | dhruba borthakur |
| HADOOP-1310 | Fix unchecked warnings in aggregate code | Major | . | Tom White | Tom White |
| HADOOP-1299 | Once RPC.stopClient has been called, RPC can not be used again | Minor | ipc | stack | stack |
| HADOOP-1297 | datanode sending block reports to namenode once every second | Major | . | dhruba borthakur | dhruba borthakur |
| HADOOP-1294 | Fix unchecked warnings in main Hadoop code under Java 6. | Major | test | Tom White | Tom White |
| HADOOP-1293 | stderr from streaming skipped after first 20 lines. | Minor | . | Koji Noguchi | Koji Noguchi |
| HADOOP-1279 | list of completed jobs purges jobs based on submission not on completion age | Major | . | Alejandro Abdelnur | Arun C Murthy |
| HADOOP-1278 | Fix the per-job tasktracker ‘blacklist’ | Major | . | Arun C Murthy | Arun C Murthy |
| HADOOP-1275 | job notification property in hadoop-default.xml is misspelled | Trivial | . | Alejandro Abdelnur |  |
| HADOOP-1272 | Extract InnerClasses from FSNamesystem into separate classes | Major | . | dhruba borthakur | dhruba borthakur |
| HADOOP-1271 | The StreamBaseRecordReader is unable to log record data that's not UTF-8 | Minor | . | Gautam Kowshik | Arun C Murthy |
| HADOOP-1262 | file corruption detected because dfs client does not use replica blocks for checksum file | Major | . | dhruba borthakur | Hairong Kuang |
| HADOOP-1258 | TestCheckpoint test case doesn't wait for MiniDFSCluster to be active | Trivial | test | Nigel Daley | Nigel Daley |
| HADOOP-1256 | Dfs image loading and edits loading creates multiple instances of DatanodeDescriptor for the same datanode | Major | . | Hairong Kuang | Hairong Kuang |
| HADOOP-1255 | Name-node falls into infinite loop trying to remove a dead node. | Blocker | . | Konstantin Shvachko | Hairong Kuang |
| HADOOP-1253 | ConcurrentModificationException and NPE in JobControl | Minor | . | Johan Oskarsson | Johan Oskarsson |
| HADOOP-1252 | Disk problems should be handled better by the MR framework | Major | . | Devaraj Das | Devaraj Das |
| HADOOP-1244 | stop-dfs.sh incorrectly specifies slaves file for stopping datanode | Minor | . | Michael Bieniosek | dhruba borthakur |
| HADOOP-1243 | ClientProtocol.versionID should be 11 | Major | . | Konstantin Shvachko | dhruba borthakur |
| HADOOP-1242 | dfs upgrade/downgrade problems | Blocker | . | Owen O'Malley | Konstantin Shvachko |
| HADOOP-1241 | Null PointerException in processReport when namenode is restarted | Major | . | dhruba borthakur | dhruba borthakur |
| HADOOP-1239 | Classes in src/test/testjar need package name | Trivial | test | Jim Kellerman | Jim Kellerman |
| HADOOP-1238 | maps_running metric is only updated at the end of the task | Minor | metrics | Michael Bieniosek | David Bowen |
| HADOOP-1224 | “Browse the filesystem” link pointing to a dead data-node | Major | . | Konstantin Shvachko | Enis Soztutar |
| HADOOP-1219 | Spurious progress messages should be discarded after a task is done | Major | . | Devaraj Das | Devaraj Das |
| HADOOP-1218 | In TaskTracker the access to RunningJob object is not synchronized in one place | Major | . | Devaraj Das | Devaraj Das |
| HADOOP-1211 | Remove deprecated constructor and unused static members in DataNode class | Major | . | Konstantin Shvachko | Konstantin Shvachko |
| HADOOP-1205 | The open method of FSNamesystem should be synchronized | Blocker | . | Hairong Kuang | Hairong Kuang |
| HADOOP-1204 | Re-factor InputFormat/RecordReader related classes | Major | . | Runping Qi | Runping Qi |
| HADOOP-1203 | UpgradeUtilities should use MiniDFSCluster to start and stop NameNode/DataNodes | Major | test | Nigel Daley | Nigel Daley |
| HADOOP-1200 | Datanode should periodically do a disk check | Blocker | . | Hairong Kuang | Hairong Kuang |
| HADOOP-1198 | ipc.client.timeout of 2000ms for test cases seems too small; causes too many timeouts and leads to hung test cases | Major | test | Arun C Murthy | Arun C Murthy |
| HADOOP-1189 | Still seeing some unexpected ‘No space left on device’ exceptions | Major | . | Raghu Angadi | Raghu Angadi |
| HADOOP-1187 | DFS Scalability: avoid scanning entire list of datanodes in getAdditionalBlocks | Major | . | dhruba borthakur | dhruba borthakur |
| HADOOP-1184 | Decommission fails if a block that needs replication has only one replica | Major | . | dhruba borthakur |  |
| HADOOP-1178 | NullPointer Exception in org.apache.hadoop.dfs.NameNode.isDir on namenode restart | Major | . | dhruba borthakur | dhruba borthakur |
| HADOOP-1176 | Reduce hang on huge map output | Major | . | Hairong Kuang | Arun C Murthy |
| HADOOP-1170 | Very high CPU usage on data nodes because of FSDataset.checkDataDir() on every connect | Major | . | Igor Bolotin |  |
| HADOOP-1169 | CopyFiles skips src files of s3 urls | Minor | util | stack |  |
| HADOOP-1164 | TestReplicationPolicy doesn't use port 0 for the NameNode | Major | test | Owen O'Malley | Owen O'Malley |
| HADOOP-1163 | Ganglia metrics reporting is misconfigured | Minor | metrics | Michael Bieniosek |  |
| HADOOP-1160 | DistributedFileSystem doesn't close the RawDistributedFileSystem on close. | Blocker | . | Owen O'Malley | Hairong Kuang |
| HADOOP-1156 | NullPointerException in MiniDFSCluster | Major | . | Konstantin Shvachko | Hairong Kuang |
| HADOOP-1154 | streaming hang. (PipeMapRed$MROutputThread gone) | Major | . | Koji Noguchi | Koji Noguchi |
| HADOOP-1153 | DataNode and FSNamesystem don't shutdown cleanly | Major | . | Nigel Daley | Konstantin Shvachko |
| HADOOP-1152 | Reduce task hang failing in MapOutputCopier.copyOutput | Major | . | Koji Noguchi | Tahir Hashmi |
| HADOOP-1151 | streaming PipeMapRed prints system info to stderr | Trivial | . | Koji Noguchi | Koji Noguchi |
| HADOOP-1149 | DFS Scalability: high cpu usage in addStoredBlock | Major | . | dhruba borthakur | Raghu Angadi |
| HADOOP-1146 | “Reduce input records” counter name is misleading | Major | . | David Bowen | David Bowen |
| HADOOP-1137 | StatusHttpServer assumes that resources for /static are in files | Major | . | Benjamin Reed |  |
| HADOOP-1136 | exception in UnderReplicatedBlocks:add when ther are more replicas of a block than required | Major | . | dhruba borthakur | Hairong Kuang |
| HADOOP-1122 | Divide-by-zero exception in chooseTarget | Major | . | dhruba borthakur | dhruba borthakur |
| HADOOP-1114 | bin/hadoop script clobbers CLASSPATH | Minor | scripts | Michael Bieniosek | Doug Cutting |
| HADOOP-1110 | JobTracker WebUI “Map input records” a little off. | Trivial | . | Koji Noguchi | David Bowen |
| HADOOP-1093 | NNBench generates millions of NotReplicatedYetException in Namenode log | Major | . | Nigel Daley | dhruba borthakur |
| HADOOP-1090 | In SortValidator, the check for whether a file belongs to sort-input or sort-output dir is weak | Major | . | Devaraj Das | Arun C Murthy |
| HADOOP-1085 | Remove ‘port rolling’ from Mini{DFS\|MR}Cluster | Major | test | Arun C Murthy | Arun C Murthy |
| HADOOP-1081 | JAVA_PLATFORM with spaces (i.e. Mac OS X-ppc-32) breaks bin/hadoop script | Major | scripts | Andrzej Bialecki |  |
| HADOOP-1073 | DFS Scalability: high CPU usage in choosing replication targets and file open | Major | . | dhruba borthakur | Hairong Kuang |
| HADOOP-1071 | RPC$VersionMismatch exception is not fatal to JobTracker | Major | . | Nigel Daley | Tahir Hashmi |
| HADOOP-1064 | dfsclient logging messages should have appropriate log levels | Major | . | dhruba borthakur | dhruba borthakur |
| HADOOP-1063 | MiniDFSCluster exists a race condition that lead to data node resources are not properly released | Major | test | Hairong Kuang | Hairong Kuang |
| HADOOP-1061 | S3 listSubPaths bug | Critical | fs | Mike Smith |  |
| HADOOP-1050 | Do not count lost tasktracker against the job | Major | . | Arun C Murthy | Arun C Murthy |
| HADOOP-1047 | TestReplication fails because DFS does not guarantee all the replicas are placed when a file is closed | Major | . | Hairong Kuang | Hairong Kuang |
| HADOOP-1011 | ConcurrentModificationException in JobHistory | Major | . | Nigel Daley | Tahir Hashmi |
| HADOOP-1001 | the output of the map is not type checked against the specified types | Major | . | Owen O'Malley | Tahir Hashmi |
| HADOOP-672 | dfs shell enhancements | Minor | . | Yoram Arnon | dhruba borthakur |

TESTS:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| --- | --- | --- | --- | --- | --- |

SUB-TASKS:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| --- | --- | --- | --- | --- | --- |

OTHER:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| --- | --- | --- | --- | --- | --- |