Apache Hadoop Changelog

Release 0.20.205.0 - 2011-10-06

INCOMPATIBLE CHANGES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| HDFS-2202 | Changes to balancer bandwidth should not require datanode restart. | Major | balancer & mover, datanode | Eric Payne | Eric Payne |
| HDFS-1554 | Append 0.20: New semantics for recoverLease | Major | . | Hairong Kuang | Hairong Kuang |
| HDFS-630 | In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block. | Major | hdfs-client, namenode | Ruyue Ma | Cosmin Lehene |

IMPORTANT ISSUES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |

NEW FEATURES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| HADOOP-7594 | Support HTTP REST in HttpServer | Major | . | Tsz Wo Nicholas Sze | Tsz Wo Nicholas Sze |
| HADOOP-7119 | add Kerberos HTTP SPNEGO authentication support to Hadoop JT/NN/DN/TT web-consoles | Major | security | Alejandro Abdelnur | Alejandro Abdelnur |
| HADOOP-6889 | Make RPC to have an option to timeout | Major | ipc | Hairong Kuang | John George |
| HDFS-1520 | HDFS 20 append: Lightweight NameNode operation to trigger lease recovery | Major | namenode | Hairong Kuang | Hairong Kuang |
| HDFS-200 | In HDFS, sync() not yet guarantees data available to the new readers | Blocker | . | Tsz Wo Nicholas Sze | dhruba borthakur |
| MAPREDUCE-2777 | Backport MAPREDUCE-220 to Hadoop 20 security branch | Major | . | Jonathan Eagles | Amar Kamat |

IMPROVEMENTS:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| HADOOP-7720 | improve the hadoop-setup-conf.sh to read in the hbase user and setup the configs | Major | conf | Arpit Gupta | Arpit Gupta |
| HADOOP-7707 | improve config generator to allow users to specify proxy user, turn append on or off, turn webhdfs on or off | Major | conf | Arpit Gupta | Arpit Gupta |
| HADOOP-7655 | provide a small validation script that smoke tests the installed cluster | Major | . | Arpit Gupta | Arpit Gupta |
| HADOOP-7472 | RPC client should deal with the IP address changes | Minor | ipc | Kihwal Lee | Kihwal Lee |
| HADOOP-7432 | Back-port HADOOP-7110 to 0.20-security | Major | . | Sherry Chen | Sherry Chen |
| HADOOP-7343 | backport HADOOP-7008 and HADOOP-7042 to branch-0.20-security | Minor | test | Thomas Graves | Thomas Graves |
| HADOOP-7314 | Add support for throwing UnknownHostException when a host doesn't resolve | Major | . | Jeffrey Naisbitt | Jeffrey Naisbitt |
| HDFS-1555 | HDFS 20 append: Disallow pipeline recovery if a file is already being lease recovered | Major | . | Hairong Kuang | Hairong Kuang |
| HDFS-1211 | 0.20 append: Block receiver should not log “rewind” packets at INFO level | Minor | datanode | Todd Lipcon | Todd Lipcon |
| HDFS-1210 | DFSClient should log exception when block recovery fails | Trivial | hdfs-client | Todd Lipcon | Todd Lipcon |
| HDFS-1054 | Remove unnecessary sleep after failure in nextBlockOutputStream | Major | hdfs-client | Todd Lipcon | Todd Lipcon |
| HDFS-895 | Allow hflush/sync to occur in parallel with new writes to the file | Major | hdfs-client | dhruba borthakur | Todd Lipcon |
| HDFS-826 | Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline | Major | hdfs-client | dhruba borthakur | dhruba borthakur |
| MAPREDUCE-2981 | Backport trunk fairscheduler to 0.20-security branch | Major | contrib/fair-share | Matei Zaharia | Matei Zaharia |
| MAPREDUCE-2729 | Reducers are always counted having “pending tasks” even if they can't be scheduled yet because not enough of their mappers have completed | Major | . | Sherry Chen | Sherry Chen |
| MAPREDUCE-2494 | Make the distributed cache delete entries using LRU priority | Major | distributed-cache | Robert Joseph Evans | Robert Joseph Evans |

BUG FIXES:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| HADOOP-7724 | hadoop-setup-conf.sh should put proxy user info into the core-site.xml | Major | . | Giridharan Kesavan | Arpit Gupta |
| HADOOP-7721 | dfs.web.authentication.kerberos.principal expects the full hostname and does not replace _HOST with the hostname | Major | . | Arpit Gupta | Jitendra Nath Pandey |
| HADOOP-7715 | see log4j Error when running mr jobs and certain dfs calls | Major | conf | Arpit Gupta | Eric Yang |
| HADOOP-7711 | hadoop-env.sh generated from templates has duplicate info | Major | conf | Arpit Gupta | Arpit Gupta |
| HADOOP-7708 | config generator does not update the properties file if one exists already | Critical | conf | Arpit Gupta | Eric Yang |
| HADOOP-7691 | hadoop deb pkg should take a diff group id | Major | . | Giridharan Kesavan | Eric Yang |
| HADOOP-7684 | jobhistory server and secondarynamenode should have init.d script | Major | scripts | Eric Yang | Eric Yang |
| HADOOP-7683 | hdfs-site.xml template has properties that are not used in 20 | Minor | . | Arpit Gupta | Arpit Gupta |
| HADOOP-7681 | log4j.properties is missing properties for security audit and hdfs audit should be changed to info | Minor | conf | Arpit Gupta | Arpit Gupta |
| HADOOP-7679 | log4j.properties templates does not define mapred.jobsummary.logger | Major | conf | Ramya Sunil | Ramya Sunil |
| HADOOP-7674 | TestKerberosName fails in 20 branch. | Major | . | Jitendra Nath Pandey | Jitendra Nath Pandey |
| HADOOP-7658 | to fix hadoop config template | Major | . | Giridharan Kesavan | Eric Yang |
| HADOOP-7649 | TestMapredGroupMappingServiceRefresh and TestRefreshUserMappings fail after HADOOP-7625 | Blocker | security, test | Kihwal Lee | Jitendra Nath Pandey |
| HADOOP-7645 | HTTP auth tests requiring Kerberos infrastructure are not disabled on branch-0.20-security | Blocker | security | Aaron T. Myers | Jitendra Nath Pandey |
| HADOOP-7644 | Fix the delegation token tests to use the new style renewers | Blocker | security | Owen O'Malley | Owen O'Malley |
| HADOOP-7637 | Fair scheduler configuration file is not bundled in RPM | Major | build | Eric Yang | Eric Yang |
| HADOOP-7633 | log4j.properties should be added to the hadoop conf on deploy | Major | conf | Arpit Gupta | Eric Yang |
| HADOOP-7631 | In mapred-site.xml, stream.tmpdir is mapped to ${mapred.temp.dir} which is undeclared. | Major | conf | Ramya Sunil | Eric Yang |
| HADOOP-7630 | hadoop-metrics2.properties should have a property *.period set to a default value for metrics | Major | conf | Arpit Gupta | Eric Yang |
| HADOOP-7626 | Allow overwrite of HADOOP_CLASSPATH and HADOOP_OPTS | Major | scripts | Eric Yang | Eric Yang |
| HADOOP-7625 | TestDelegationToken is failing in 205 | Major | . | Owen O'Malley | Owen O'Malley |
| HADOOP-7615 | Binary layout does not put share/hadoop/contrib/*.jar into the class path | Major | scripts | Eric Yang | Eric Yang |
| HADOOP-7610 | /etc/profile.d does not exist on Debian | Major | scripts | Eric Yang | Eric Yang |
| HADOOP-7603 | Set default hdfs, mapred uid, and hadoop group gid for RPM packages | Major | . | Eric Yang | Eric Yang |
| HADOOP-7602 | wordcount, sort etc on har files fails with NPE | Major | . | John George | John George |
| HADOOP-7599 | Improve hadoop setup conf script to setup secure Hadoop cluster | Major | scripts | Eric Yang | Eric Yang |
| HADOOP-7596 | Enable jsvc to work with Hadoop RPM package | Major | build | Eric Yang | Eric Yang |
| HADOOP-7539 | merge hadoop archive goodness from trunk to .20 | Major | . | John George | John George |
| HADOOP-7400 | HdfsProxyTests fails when the -Dtest.build.dir and -Dbuild.test is set | Major | build | Giridharan Kesavan | Giridharan Kesavan |
| HADOOP-6833 | IPC leaks call parameters when exceptions thrown | Blocker | . | Todd Lipcon | Todd Lipcon |
| HADOOP-6722 | NetUtils.connect should check that it hasn't connected a socket to itself | Major | util | Todd Lipcon | Todd Lipcon |
| HDFS-2411 | with webhdfs enabled in secure mode the auth to local mappings are not being respected. | Major | webhdfs | Arpit Gupta | Jitendra Nath Pandey |
| HDFS-2408 | DFSClient#getNumCurrentReplicas is package private in 205 but public in branch-0.20-append | Blocker | hdfs-client | stack | stack |
| HDFS-2405 | hadoop dfs command with webhdfs fails on secure hadoop | Critical | webhdfs | Arpit Gupta | Jitendra Nath Pandey |
| HDFS-2392 | Dist with hftp is failing again | Critical | namenode | Rajit Saha | Daryn Sharp |
| HDFS-2375 | TestFileAppend4 fails in 0.20.205 branch | Blocker | hdfs-client | Suresh Srinivas | Suresh Srinivas |
| HDFS-2373 | Commands using webhdfs and hftp print unnecessary debug information on the console with security enabled | Major | webhdfs | Arpit Gupta | Arpit Gupta |
| HDFS-2368 | defaults created for web keytab and principal, these properties should not have defaults | Major | . | Arpit Gupta | Tsz Wo Nicholas Sze |
| HDFS-2361 | hftp is broken | Critical | namenode | Rajit Saha | Jitendra Nath Pandey |
| HDFS-2359 | NPE found in Datanode log while Disk failed during different HDFS operation | Major | datanode | Rajit Saha | Jonathan Eagles |
| HDFS-2358 | NPE when the default filesystem's uri has no authority | Major | namenode | Rajit Saha | Daryn Sharp |
| HDFS-2342 | TestSleepJob and TestHdfsProxy broken after HDFS-2284 | Blocker | build | Kihwal Lee | Tsz Wo Nicholas Sze |
| HDFS-2333 | HDFS-2284 introduced 2 findbugs warnings on trunk | Major | . | Ivan Kelly | Tsz Wo Nicholas Sze |
| HDFS-2331 | Hdfs compilation fails | Major | hdfs-client | Abhijit Suresh Shingate | Abhijit Suresh Shingate |
| HDFS-2328 | hftp throws NPE if security is not enabled on remote cluster | Critical | . | Daryn Sharp | Owen O'Malley |
| HDFS-2325 | Fuse-DFS fails to build on Hadoop 20.203.0 | Blocker | fuse-dfs, libhdfs | Charles Earl | Kihwal Lee |
| HDFS-2320 | Make merged protocol changes from 0.20-append to 0.20-security compatible with previous releases. | Major | datanode, hdfs-client, namenode | Suresh Srinivas | Suresh Srinivas |
| HDFS-2309 | TestRenameWhileOpen fails in branch-0.20-security | Major | . | Jitendra Nath Pandey | Jitendra Nath Pandey |
| HDFS-2300 | TestFileAppend4 and TestMultiThreadedSync fail on 20.append and 20-security. | Major | . | Jitendra Nath Pandey | Jitendra Nath Pandey |
| HDFS-2259 | DN web-UI doesn't work with paths that contain html | Minor | datanode | Eli Collins | Eli Collins |
| HDFS-2190 | NN fails to start if it encounters an empty or malformed fstime file | Major | namenode | Aaron T. Myers | Aaron T. Myers |
| HDFS-2117 | DiskChecker#mkdirsWithExistsAndPermissionCheck may return true even when the dir is not created | Minor | datanode | Eli Collins | Eli Collins |
| HDFS-2053 | Bug in INodeDirectory#computeContentSummary warning | Minor | namenode | Michael Noll | Michael Noll |
| HDFS-1836 | Thousand of CLOSE_WAIT socket | Major | hdfs-client | Dennis Cheung | Bharath Mundlapudi |
| HDFS-1779 | After NameNode restart, Clients can not read partial files even after client invokes Sync. | Major | datanode, namenode | Uma Maheswara Rao G | Uma Maheswara Rao G |
| HDFS-1346 | DFSClient receives out of order packet ack | Major | datanode, hdfs-client | Hairong Kuang | Hairong Kuang |
| HDFS-1260 | 0.20: Block lost when multiple DNs trying to recover it to different genstamps | Critical | . | Todd Lipcon | Todd Lipcon |
| HDFS-1218 | 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization | Critical | datanode | Todd Lipcon | Todd Lipcon |
| HDFS-1207 | 0.20-append: stallReplicationWork should be volatile | Major | namenode | Todd Lipcon | Todd Lipcon |
| HDFS-1204 | 0.20: Lease expiration should recover single files, not entire lease holder | Major | . | Todd Lipcon | sam rash |
| HDFS-1202 | DataBlockScanner throws NPE when updated before initialized | Major | datanode | Todd Lipcon | Todd Lipcon |
| HDFS-1197 | Blocks are considered “complete” prematurely after commitBlockSynchronization or DN restart | Major | datanode, hdfs-client, namenode | Todd Lipcon | Todd Lipcon |
| HDFS-1186 | 0.20: DNs should interrupt writers at start of recovery | Blocker | datanode | Todd Lipcon | Todd Lipcon |
| HDFS-1164 | TestHdfsProxy is failing | Major | contrib/hdfsproxy | Eli Collins | Todd Lipcon |
| HDFS-1141 | completeFile does not check lease ownership | Blocker | namenode | Todd Lipcon | Todd Lipcon |
| HDFS-1118 | DFSOutputStream socket leak when cannot connect to DataNode | Major | . | Zheng Shao | Zheng Shao |
| HDFS-988 | saveNamespace race can corrupt the edits log | Blocker | namenode | dhruba borthakur | Eli Collins |
| HDFS-724 | Pipeline close hangs if one of the datanode is not responsive. | Blocker | datanode, hdfs-client | Tsz Wo Nicholas Sze | Hairong Kuang |
| HDFS-606 | ConcurrentModificationException in invalidateCorruptReplicas() | Major | namenode | Konstantin Shvachko | Konstantin Shvachko |
| HDFS-142 | In 0.20, move blocks being written into a blocksBeingWritten directory | Blocker | . | Raghu Angadi | dhruba borthakur |
| MAPREDUCE-3112 | Calling hadoop cli inside mapreduce job leads to errors | Major | contrib/streaming | Eric Yang | Eric Yang |
| MAPREDUCE-3081 | Change the name format for hadoop core and vaidya jar to be hadoop-{core/vaidya}-{version}.jar in vaidya.sh | Major | contrib/vaidya | vitthal (Suhas) Gogate | |
| MAPREDUCE-3076 | TestSleepJob fails | Blocker | test | Arun C Murthy | Arun C Murthy |
| MAPREDUCE-2915 | LinuxTaskController does not work when JniBasedUnixGroupsNetgroupMapping or JniBasedUnixGroupsMapping is enabled | Major | task-controller | Kihwal Lee | Kihwal Lee |
| MAPREDUCE-2852 | Jira for YDH bug 2854624 | Major | tasktracker | Eli Collins | Kihwal Lee |
| MAPREDUCE-2764 | Fix renewal of dfs delegation tokens | Major | . | Daryn Sharp | Owen O'Malley |
| MAPREDUCE-2705 | tasks localized and launched serially by TaskLauncher - causing other tasks to be delayed | Major | tasktracker | Thomas Graves | Thomas Graves |
| MAPREDUCE-2650 | back-port MAPREDUCE-2238 to 0.20-security | Major | . | Sherry Chen | Sherry Chen |
| MAPREDUCE-2610 | Inconsistent API JobClient.getQueueAclsForCurrentUser | Major | client | Joep Rottinghuis | Joep Rottinghuis |
| MAPREDUCE-2549 | Potential resource leaks in HadoopServer.java, RunOnHadoopWizard.java and Environment.java | Major | contrib/eclipse-plugin, contrib/streaming | Devaraj K | Devaraj K |
| MAPREDUCE-2489 | Jobsplits with random hostnames can make the queue unusable | Major | jobtracker | Jeffrey Naisbitt | Jeffrey Naisbitt |
| MAPREDUCE-2324 | Job should fail if a reduce task can't be scheduled anywhere | Major | . | Todd Lipcon | Robert Joseph Evans |
| MAPREDUCE-2187 | map tasks timeout during sorting | Major | . | Gianmarco De Francisci Morales | Anupam Seth |

TESTS:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| HDFS-1252 | TestDFSConcurrentFileOperations broken in 0.20-append | Major | test | Todd Lipcon | Todd Lipcon |
| HDFS-1242 | 0.20 append: Add test for appendFile() race solved in HDFS-142 | Major | . | Todd Lipcon | Todd Lipcon |

SUB-TASKS:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| HDFS-2404 | webhdfs liststatus json response is not correct | Major | webhdfs | Arpit Gupta | Suresh Srinivas |
| HDFS-2403 | The renewer in NamenodeWebHdfsMethods.generateDelegationToken(..) is not used | Major | . | Tsz Wo Nicholas Sze | Tsz Wo Nicholas Sze |
| HDFS-2395 | webhdfs api's should return a root element in the json response | Critical | webhdfs | Arpit Gupta | Tsz Wo Nicholas Sze |
| HDFS-2385 | Support delegation token renewal in webhdfs | Major | webhdfs | Tsz Wo Nicholas Sze | Tsz Wo Nicholas Sze |
| HDFS-2366 | webhdfs throws a npe when ugi is null from getDelegationToken | Major | webhdfs | Arpit Gupta | Tsz Wo Nicholas Sze |
| HDFS-2356 | webhdfs: support case insensitive query parameter names | Major | webhdfs | Tsz Wo Nicholas Sze | Tsz Wo Nicholas Sze |
| HDFS-2348 | Support getContentSummary and getFileChecksum in webhdfs | Major | webhdfs | Tsz Wo Nicholas Sze | Tsz Wo Nicholas Sze |
| HDFS-2340 | Support getFileBlockLocations and getDelegationToken in webhdfs | Major | webhdfs | Tsz Wo Nicholas Sze | Tsz Wo Nicholas Sze |
| HDFS-2338 | Configuration option to enable/disable webhdfs. | Major | webhdfs | Jitendra Nath Pandey | Jitendra Nath Pandey |
| HDFS-2318 | Provide authentication to webhdfs using SPNEGO | Major | webhdfs | Tsz Wo Nicholas Sze | Tsz Wo Nicholas Sze |
| HDFS-2317 | Read access to HDFS using HTTP REST | Major | . | Tsz Wo Nicholas Sze | Tsz Wo Nicholas Sze |
| HDFS-2284 | Write Http access to HDFS | Major | . | Sanjay Radia | Tsz Wo Nicholas Sze |
| HDFS-1057 | Concurrent readers hit ChecksumExceptions if following a writer to very end of file | Blocker | datanode | Todd Lipcon | sam rash |
| HDFS-561 | Fix write pipeline READ_TIMEOUT | Major | datanode, hdfs-client | Kan Zhang | Kan Zhang |
| MAPREDUCE-2928 | MR-2413 improvements | Major | tasktracker | Eli Collins | Eli Collins |
| MAPREDUCE-2780 | Standardize the value of token service | Major | . | Daryn Sharp | Daryn Sharp |

OTHER:

| JIRA | Summary | Priority | Component | Reporter | Contributor |
| HDFS-1795 | Port 0.20-append changes onto 0.20-security-203 | Major | . | Andrew Purtell | |