Apache Hadoop Changelog

Release 0.5.0 - 2006-08-04

INCOMPATIBLE CHANGES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
|:---- |:---- |:---- |:---- |:---- |:---- |

IMPORTANT ISSUES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
|:---- |:---- |:---- |:---- |:---- |:---- |

NEW FEATURES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
|:---- |:---- |:---- |:---- |:---- |:---- |
| HADOOP-425 | a python word count example that runs under jython | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-412 | provide an input format that fetches a subset of sequence file records | Major | . | Hairong Kuang | Hairong Kuang |
| HADOOP-386 | Periodically move blocks from full nodes to those with space | Major | . | Johan Oskarsson | Johan Oskarsson |
| HADOOP-381 | keeping files for tasks that match regex on task id | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-369 | Added ability to copy all part-files into one output file | Trivial | . | Johan Oskarsson | Johan Oskarsson |
| HADOOP-359 | add optional compression of map outputs | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-347 | Implement HDFS content browsing interface | Major | . | Devaraj Das | Devaraj Das |
| HADOOP-342 | Design/Implement a tool to support archival and analysis of logfiles. | Major | . | Arun C Murthy |  |
| HADOOP-339 | making improvements to the jobclients to get information on currently running jobs and the jobqueue | Minor | . | Mahadev konar | Mahadev konar |

IMPROVEMENTS:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
|:---- |:---- |:---- |:---- |:---- |:---- |
| HADOOP-410 | Using HashMap instead of TreeMap for some maps in Namenode yields 17% performance improvement | Major | . | Milind Bhandarkar | Milind Bhandarkar |
| HADOOP-409 | expose JobConf properties as environment variables | Major | . | Michel Tourn |  |
| HADOOP-396 | Writable DatanodeID | Major | . | Konstantin Shvachko | Konstantin Shvachko |
| HADOOP-395 | infoPort field should be a DatanodeID member | Major | . | Konstantin Shvachko | Devaraj Das |
| HADOOP-394 | MiniDFSCluster shutdown order | Minor | . | Konstantin Shvachko | Konstantin Shvachko |
| HADOOP-392 | Improve the UI for DFS content browsing | Major | . | Devaraj Das |  |
| HADOOP-361 | junit with pure-Java hadoopStreaming combiner; remove CRLF in some files | Major | . | Michel Tourn |  |
| HADOOP-356 | Build and test hadoopStreaming nightly | Major | . | Michel Tourn |  |
| HADOOP-355 | hadoopStreaming: fix APIs, -reduce NONE, StreamSequenceRecordReader | Major | . | Michel Tourn |  |
| HADOOP-345 | JobConf access to name-values | Major | . | Michel Tourn | Michel Tourn |
| HADOOP-341 | Enhance distcp to handle *http* as a ‘source protocol’. | Major | util | Arun C Murthy | Arun C Murthy |
| HADOOP-340 | Using wildcards in config pathnames | Minor | conf | Johan Oskarsson | Doug Cutting |
| HADOOP-335 | factor out the namespace image/transaction log writing | Major | . | Owen O'Malley | Konstantin Shvachko |
| HADOOP-321 | DatanodeInfo refactoring | Major | . | Konstantin Shvachko | Konstantin Shvachko |
| HADOOP-302 | class Text (replacement for class UTF8) was: HADOOP-136 | Major | io | Michel Tourn | Hairong Kuang |
| HADOOP-260 | the start up scripts should take a command line parameter --config making it easy to run multiple hadoop installation on same machines | Minor | . | Mahadev konar | Milind Bhandarkar |
| HADOOP-252 | add versioning to RPC | Major | ipc | Yoram Arnon |  |
| HADOOP-237 | Standard set of Performance Metrics for Hadoop | Major | metrics | Milind Bhandarkar | Milind Bhandarkar |

BUG FIXES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
|:---- |:---- |:---- |:---- |:---- |:---- |
| HADOOP-415 | DFSNodesStatus() should sort data nodes. | Major | . | Konstantin Shvachko |  |
| HADOOP-404 | Regression tests are not working. | Major | . | Mahadev konar |  |
| HADOOP-393 | The validateUTF function of class Text throws MalformedInputException for valid UTF8 code containing ascii chars | Major | io | Hairong Kuang | Hairong Kuang |
| HADOOP-391 | test-contrib with spaces in classpath (Windows) | Major | . | Michel Tourn |  |
| HADOOP-389 | MiniMapReduce tests get stuck because of some timing issues with initialization of tasktrackers. | Major | . | Mahadev konar | Mahadev konar |
| HADOOP-388 | the hadoop-daemons.sh fails with “no such file or directory” when used from a relative path | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-387 | LocalJobRunner assigns duplicate mapid's | Major | . | Sami Siren |  |
| HADOOP-385 | rcc does not generate correct Java code for the field of a record type | Major | . | Hairong Kuang | Milind Bhandarkar |
| HADOOP-384 | improved error messages for file checksum errors | Minor | fs | Owen O'Malley | Owen O'Malley |
| HADOOP-383 | unit tests fail on windows | Major | . | Owen O'Malley | Michel Tourn |
| HADOOP-380 | The reduce tasks poll for mapoutputs in a loop | Major | . | Mahadev konar | Mahadev konar |
| HADOOP-377 | Configuration does not handle URL | Major | . | Jean-Baptiste Quenot |  |
| HADOOP-376 | Datanode does not scan for an open http port | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-375 | Introduce a way for datanodes to register their HTTP info ports with the NameNode | Major | . | Devaraj Das | Devaraj Das |
| HADOOP-368 | DistributedFSCheck should cleanup, seek, and report missing files. | Minor | fs | Konstantin Shvachko | Konstantin Shvachko |
| HADOOP-365 | datanode crashes on startup with ClassCastException | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-364 | rpc versioning broke out-of-order server launches | Major | ipc | Owen O'Malley | Owen O'Malley |
| HADOOP-362 | tasks can get lost when reporting task completion to the JobTracker has an error | Major | . | Devaraj Das | Owen O'Malley |
| HADOOP-360 | hadoop-daemon starts but does not stop servers under cygWin | Major | . | Konstantin Shvachko |  |
| HADOOP-358 | NPE in Path.equals | Major | fs | Frédéric Bertin | Doug Cutting |
| HADOOP-354 | All daemons should have public methods to start and stop them | Major | . | Barry Kaplan |  |
| HADOOP-352 | Portability of hadoop shell scripts for deployment | Major | . | Jean-Baptiste Quenot |  |
| HADOOP-350 | In standalone mode, ‘org.apache.commons.cli cannot be resolved’ | Minor | . | stack |  |
| HADOOP-344 | TaskTracker passes incorrect file path to DF under cygwin | Major | . | Konstantin Shvachko | Konstantin Shvachko |
| HADOOP-327 | ToolBase calls System.exit | Major | util | Owen O'Malley | Hairong Kuang |
| HADOOP-313 | A stand alone driver for individual tasks | Major | . | Michel Tourn | Michel Tourn |
| HADOOP-226 | DFSShell problems. Incorrect block replication detection in fsck. | Major | . | Konstantin Shvachko |  |

TESTS:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
|:---- |:---- |:---- |:---- |:---- |:---- |
| HADOOP-418 | hadoopStreaming test jobconf -> env.var. mapping | Major | . | Michel Tourn |  |
| HADOOP-411 | junit test for HADOOP-59: support generic command-line options | Major | . | Hairong Kuang | Hairong Kuang |

SUB-TASKS:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
|:---- |:---- |:---- |:---- |:---- |:---- |

OTHER:

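| JIRA | Summary | Priority | Component | Reporter | Contributor |
|:---- |:---- |:---- |:---- |:---- |:---- |
| HADOOP-351 | Remove Jetty dependency | Major | ipc | Barry Kaplan | Devaraj Das |
| HADOOP-307 | Many small jobs benchmark for MapReduce | Minor | . | Sanjay Dahiya | Sanjay Dahiya |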