Hadoop Change Log
Release 1.0.4 - 2012.10.02
NEW FEATURES
IMPROVEMENTS
HADOOP-7154. Should set MALLOC_ARENA_MAX in hadoop-env.sh
(todd via mattf)
MAPREDUCE-4399. Change the Jetty response buffer size to improve
shuffle performance. (Luke Lu via suresh)
BUG FIXES
HDFS-3652. FSEditLog failure removes the wrong edit stream when storage
dirs have same name. (todd)
Release 1.0.3 - 2012.05.07
NEW FEATURES
IMPROVEMENTS
MAPREDUCE-4017. Add jobname to jobsummary log (tgraves and Koji Noguchi
via bobby)
BUG FIXES
HADOOP-6924. Adds a directory to the list of directories to search for
the libjvm.so file. The new directory is found by running a 'find' command
and the first output is taken. This was done to handle the build of Hadoop
with IBM's JDK. (Stephen Watt, Guillermo Cabrera and ddas)
HADOOP-6941. Adds support for building Hadoop with IBM's JDK
(Stephen Watt, Eli and ddas)
HADOOP-8188. Fixes the build process to do with jsvc, with IBM's JDK
as the underlying jdk. (ddas)
HDFS-3127. Do not throw exceptions when FSImage.restoreStorageDirs()
fails. (Brandon Li via szetszwo)
MAPREDUCE-3377. Ensure OutputCommitter.checkOutputSpecs is called prior to
copying job.xml. (Jane Chen via acmurthy)
HADOOP-5528. Ensure BinaryPartitioner is present in mapred libs. (Klaas
Bosteels via acmurthy)
HADOOP-6963. In FileUtil.getDU(..), neither include the size of directories
nor follow symbolic links. (Ravi Prakash via szetszwo)
HADOOP-8251. Fix SecurityUtil.fetchServiceTicket after HADOOP-6941. (todd)
HADOOP-8293. Fix the Makefile.am for the native library to include the
JNI path. (omalley)
MAPREDUCE-4154. streaming MR job succeeds even if the streaming command
fails. (Devaraj Das via tgraves)
HDFS-119. Fix a bug in logSync(), which causes NameNode to block forever.
(shv)
HADOOP-8294. IPC Connection becomes unusable even if server address
was temporarily unresolvable. Backport of HADOOP-7428. (Kihwal Lee via
mattf)
HDFS-3310. Make sure that we abort when no edit log directories are left.
(Colin Patrick McCabe via eli)
MAPREDUCE-4207. Remove System.out.println() in FileInputFormat
(Kihwal Lee via harsh)
HDFS-3265. PowerPc Build error. (Kumar Ravi via mattf)
HDFS-1041. DFSClient.getFileChecksum(..) should retry if connection to
the first datanode fails. (szetszwo)
HADOOP-8338. Fix renew and cancel of RPC HDFS delegation tokens. (omalley)
HADOOP-8346. Makes oid changes to make SPNEGO work. Was broken due
to fixes introduced by the IBM JDK compatibility patch. (ddas)
HADOOP-8352. Regenerate configure scripts for the c++ compilation.
(omalley)
HDFS-3061. Cached directory size in INodeDirectory can get permanently
out of sync with computed size, causing quota issues; port of HDFS-1487.
(Kihwal Lee via mattf)
HADOOP-7381. FindBugs OutOfMemoryError. (Joep Rottinghuis via mattf)
HADOOP-8151. Error handling in snappy decompressor throws invalid
exceptions. (Matt Foley)
HDFS-3374. hdfs' TestDelegationToken fails intermittently with a race
condition. (Owen O'Malley via mattf)
MAPREDUCE-3857. Grep example ignores mapred.job.queue.name.
(Jonathan Eagles via mattf)
MAPREDUCE-1238. mapred metrics shows negative count of waiting maps and
reduces (tgraves via bobby)
MAPREDUCE-4003. log.index (No such file or directory) AND Task process
exit with nonzero status of 126. (Koji Noguchi via tgraves)
MAPREDUCE-4012. Hadoop Job setup error leaves no useful info to users
(when LinuxTaskController is used) (tgraves)
HADOOP-8027. Visiting /jmx on the daemon web interfaces may print unnecessary
error in logs (Aaron Myers and Hitesh Shah)
Release 1.0.2 - 2012.03.24
NEW FEATURES
HADOOP-7206. Support Snappy compression. (Issei Yoshida and
Alejandro Abdelnur via vinodkv).
HDFS-2701. Cleanup FS* processIOError methods. (eli)
HDFS-2978. The NameNode should expose name dir statuses via JMX. (atm)
IMPROVEMENTS
MAPREDUCE-3773. Add queue metrics with buckets for job run times. (omalley
via acmurthy)
HADOOP-1722. Allow hadoop streaming to handle non-utf8 byte array. (Klaas
Bosteels and Matthias Lehmann via acmurthy)
HADOOP-5450. Add support for application-specific typecodes to typed
bytes. (Klaas Bosteels via acmurthy)
HADOOP-8090. rename hadoop 64 bit rpm/deb package name. (Giridharan Kesavan
via mattf)
BUG FIXES
HADOOP-8050. Deadlock in metrics. (Kihwal Lee via mattf)
MAPREDUCE-3824. Distributed caches are not removed properly. (Thomas Graves
via mattf)
MAPREDUCE-3583. Change pid to String and stime to BigInteger in order to
avoid NumberFormatException caused by overflow. (Zhihong Yu via szetszwo)
HDFS-3006. In WebHDFS, when the return body is empty, set the Content-Type
to application/octet-stream instead of application/json. (szetszwo)
MAPREDUCE-764. Fix TypedBytesInput.readRaw to preserve custom type codes.
(Klaas Bosteels via acmurthy)
HDFS-2703. removedStorageDirs is not updated everywhere we remove
a storage dir. (eli)
HDFS-2702. A single failed name dir can cause the NN to exit. (eli)
HDFS-3075. Backport HADOOP-4885: Try to restore failed name-node storage
directories at checkpoint time. (Brandon Li via szetszwo)
HDFS-3101. Cannot read empty file using WebHDFS. (szetszwo)
MAPREDUCE-3851. Allow more aggressive action on detection of the jetty
issue (tgraves via bobby)
HADOOP-8088. User-group mapping cache incorrectly does negative caching on
transient failures (Kihwal Lee via bobby)
HADOOP-8132. 64bit secure datanodes do not start as the jsvc path is wrong
(Arpit Gupta via mattf)
HADOOP-8201. create the configure script for native compilation as part of
the build (Giri Kesavan via mattf)
Release 1.0.1 - 2012.02.14
NEW FEATURES
IMPROVEMENTS
MAPREDUCE-3607. Port missing new API mapreduce lib classes to
1.x. (tomwhite)
HADOOP-7987. Support setting the run-as user in unsecure mode. (jitendra)
HADOOP-7988. Upper case in hostname part of the principals doesn't
work with kerberos. (jitendra)
HDFS-2814. NamenodeMXBean does not account for svn revision in the version
information. (Hitesh Shah via jitendra)
HADOOP-7470. Move up to Jackson 1.8.8. (Enis Soztutar via szetszwo)
HDFS-2379. Allow block reports to proceed without holding FSDataset lock.
(todd via suresh)
HADOOP-8009. Create hadoop-client and hadoop-minicluster artifacts for
downstream projects. (Alejandro Abdelnur via mattf)
MAPREDUCE-3184. Add a thread to the TaskTracker which monitors for
spinning Jetty selector threads, and shuts down the daemon when one is
detected. (todd)
BUG FIXES
HADOOP-7960. Port HADOOP-5203 to branch-1, build version comparison is too
restrictive. (mattf)
HADOOP-7964. Deadlock in NetUtils and SecurityUtil class initialization.
(Daryn Sharp via suresh)
HADOOP-8010. hadoop-config.sh errors when HADOOP_HOME_WARN_SUPPRESS is set
to true and HADOOP_HOME is present. (Roman Shaposhnik via mattf)
HADOOP-8052. Hadoop Metrics2 should emit Float.MAX_VALUE (instead of
Double.MAX_VALUE) to avoid making Ganglia's gmetad core. (Varun Kapoor
via mattf)
MAPREDUCE-3343. TaskTracker Out of Memory because of distributed cache.
(Zhao Yunjiong).
HADOOP-8037. Binary tarball does not preserve platform info for
native builds, and RPMs fail to provide needed symlinks for
libhadoop.so. (Matt Foley)
Release 1.0.0 - 2011.12.15
NEW FEATURES
HDFS-2316. [umbrella] WebHDFS: a complete FileSystem implementation for
accessing HDFS over HTTP (szetszwo)
HDFS-2539. Support doAs and GETHOMEDIRECTORY in WebHDFS.
(szetszwo)
IMPROVEMENTS
HDFS-2427. Change the default permission in WebHDFS to 755 and add range
check/validation for all parameters. (szetszwo)
HDFS-2501. Add version prefix and root methods to WebHDFS. (szetszwo)
HADOOP-7728. Enable task memory management to be configurable in hadoop
config setup script. (ramya)
HDFS-2450. Filesystem supports path with both short names and FQDN.
(Daryn Sharp via suresh)
HDFS-617. Support for non-recursive create() in HDFS.
(Kan Zhang via jitendra)
HADOOP-6840. Support non-recursive create() in FileSystem &
SequenceFile.Writer. (Nicolas Spiegelberg via jitendra)
HADOOP-6886. LocalFileSystem Needs createNonRecursive API.
(Nicolas Spiegelberg via jitendra)
HADOOP-5124. A few optimizations to FsNamesystem#RecentInvalidateSets.
(Hairong Kuang via jitendra)
HADOOP-7664. Remove warnings when overriding final parameter configuration
if the override value is same as the final parameter value.
(Ravi Prakash via szetszwo)
HADOOP-7816. Allow HADOOP_HOME deprecated warning suppression based
on config specified in hadoop-env.sh (Dave Thompson via suresh)
MAPREDUCE-3169. Create a new MiniMRCluster equivalent which only provides
client APIs cross MR1 and MR2. (Ahmed via tucu)
HDFS-2552. Add Forrest doc for WebHDFS REST API. (szetszwo)
HDFS-2246. Shortcut local client reads to a Datanode's files directly.
(Andrew Purtell, Suresh, Jitendra)
HADOOP-7804. Enable hadoop config generator to set configurations to enable
short circuit read. (Arpit Gupta via jitendra)
HDFS-2604. Add a log message to show if WebHDFS is enabled and a
configuration section in the forrest doc. (szetszwo)
HADOOP-7923. Update doc versions from 0.20 to 1.0, and automate the
updating of version numbers in the doc system. (szetszwo via mattf)
BUG FIXES
HDFS-2673. While the Namenode is processing the blocksBeingWrittenReport,
it logs an incorrect block count. (Uma Maheswara via mattf)
MAPREDUCE-3319. Hadoop example "multifilewc" broken in 0.20.205.0.
(Subroto Sanyal via mattf)
HDFS-2589. Remove unnecessary hftp token fetch and renewal thread.
(Daryn Sharp via mattf)
MAPREDUCE-3475. JT can't renew its own tokens. (Daryn Sharp via mattf)
HADOOP-7869. HADOOP_HOME warning happens all of the time (Owen O'Malley
via mattf)
HADOOP-7815. Fixed configuring map memory mb in hadoop-setup-conf.sh.
(Ramya Sunil)
HDFS-2346. TestHost2NodesMap & TestReplicasMap will fail depending
upon execution order of test methods. (Laxman and Uma Maheswara
Rao via Matt Foley)
MAPREDUCE-3374. src/c++/task-controller/configure is not set executable in
the tarball and that prevents task-controller from rebuilding.
(Roman Shaposhnik via Matt Foley)
HDFS-1943. Fail to start datanode while start-dfs.sh is executed by root
user. (Wei Yongjun's patch updated by Matt Foley)
HADOOP-7784. Fixed jsvc packaging. (Eric Yang)
HADOOP-7740. Fixed security audit logger configuration.
(Arpit Gupta via Eric Yang)
HADOOP-7765. Clean packaging working directory for Debian packaging.
(Eric Yang)
HDFS-2441. Remove the Content-Type set by HttpServer.QuotingInputFilter in
WebHDFS responses. (szetszwo)
HDFS-2428. Convert com.sun.jersey.api.ParamException$QueryParamException
to IllegalArgumentException and respond with HTTP BAD_REQUEST in WebHDFS.
(szetszwo)
HDFS-2424. Added a root element "HdfsFileStatuses" for the response
of WebHDFS listStatus. (szetszwo)
HDFS-2439. Fix NullPointerException in WebHDFS when opening a non-existing
file or creating a file without specifying the replication parameter.
(szetszwo)
HDFS-2453. Fix http response code for partial content in WebHDFS, added
getDefaultBlockSize() and getDefaultReplication() in WebHdfsFileSystem
and cleared content type in ExceptionHandler. (szetszwo)
HDFS-2416. Distcp with a WebHDFS uri on a secure cluster fails. (jitendra)
HDFS-2494. Close the streams and DFSClient in DatanodeWebHdfsMethods.
(Uma Maheswara Rao G via szetszwo)
HDFS-2432. WebHDFS: response FORBIDDEN when setReplication on non-files;
clear umask before creating a file; throw IllegalArgumentException if
setOwner with both owner and group empty; throw FileNotFoundException if
getFileStatus on non-existing files; fix bugs in getBlockLocations; and
changed getFileChecksum json response root to "FileChecksum". (szetszwo)
HDFS-2065. Add null checks in DFSClient.getFileChecksum(..). (Uma
Maheswara Rao G via szetszwo)
HDFS-2527. WebHDFS: remove the use of "Range" header in Open; use ugi
username if renewer parameter is null in GetDelegationToken; response OK
when setting replication for non-files; rename GETFILEBLOCKLOCATIONS to
GET_BLOCK_LOCATIONS and state that it is a private unstable API; replace
isDirectory and isSymlink with enum {FILE, DIRECTORY, SYMLINK} in
HdfsFileStatus JSON object. (szetszwo)
HDFS-2528. WebHDFS: set delegation kind to WEBHDFS and add a HDFS token
when http requests are redirected to datanode. (szetszwo)
HDFS-2540. WebHDFS: change "Expect: 100-continue" to two-step write; change
"HdfsFileStatus" and "localName" respectively to "FileStatus" and
"pathSuffix" in JSON response. (szetszwo)
HDFS-1257. Race condition on FSNamesystem#recentInvalidateSets introduced
by HADOOP-5124. (Eric Payne via jitendra)
HDFS-611. Heartbeat times from Datanodes increase when there are plenty of
blocks to delete. (Zheng Shao via jitendra)
HADOOP-7853. multiple javax security configurations cause conflicts.
(Daryn via jitendra)
HDFS-2590. Fix the missing links in the WebHDFS forrest doc. (szetszwo)
HADOOP-7854. UGI getCurrentUser is not synchronized. (Daryn Sharp
via jitendra)
HADOOP-7865. Test Failures in 1.0 hdfs/common. (jitendra)
MAPREDUCE-3480. Disable TestJvmReuse in branch-1. (jitendra)
HADOOP-7855. fix to remove datanode dir creation and attribute
setup from hadoop-conf-setup.sh (gkesavan)
HADOOP-7461. Fix to add jackson dependency to hadoop pom. (gkesavan)
Release 0.20.205.0 - 2011.10.06
NEW FEATURES
HDFS-2202. Add a new DFSAdmin command to set balancer bandwidth of
datanodes without restarting. (Eric Payne via szetszwo)
HDFS-200. Support append and sync for hadoop 0.20 branch. (dhruba)
HDFS-826. Allow a mechanism for an application to detect that
datanode(s) have died in the write pipeline. (dhruba)
HDFS-142. Blocks that are being written by a client are stored in the
blocksBeingWritten directory.
(Dhruba Borthakur, Nicolas Spiegelberg, Todd Lipcon via dhruba)
HDFS-630. Client can exclude specific nodes in the write pipeline.
(Nicolas Spiegelberg via dhruba)
HDFS-895. Allow hflush/sync to occur in parallel with new writes to
the file. (Todd Lipcon via hairong)
HDFS-1520. Lightweight NameNode operation recoverLease to trigger
lease recovery. (Hairong Kuang via dhruba)
MAPREDUCE-2764. Allow JobTracker to renew and cancel arbitrary token types,
including delegation tokens obtained via hftp. (omalley)
HADOOP-7119. Add Kerberos HTTP SPNEGO authentication support to
Hadoop JT/NN/DN/TT web-consoles backport from Trunk (sanjay)
HDFS-2284. Add a new FileSystem, webhdfs://, for supporting write Http
access to HDFS. (szetszwo)
HDFS-2317. Support read access to HDFS in WebHDFS. (szetszwo)
HDFS-2338. Add configuration option to enable/disable WebHDFS.
(jitendra via szetszwo)
HDFS-2318. Provide authentication to WebHDFS using SPNEGO and delegation
tokens. (szetszwo)
HDFS-2340. Support getFileBlockLocations and getDelegationToken in WebHDFS.
(szetszwo)
HDFS-2348. Support getContentSummary and getFileChecksum in WebHDFS.
(szetszwo)
HDFS-2385. Support renew and cancel delegation tokens in WebHDFS.
(szetszwo)
MAPREDUCE-2777. Backport of MAPREDUCE-220 and MAPREDUCE-2469. Includes
adding cumulative CPU usage and total heap usage to task counters. (amarrk)
BUG FIXES
HDFS-2404. WebHDFS liststatus json response is not correct.
(Suresh Srinivas via mattf)
HDFS-2358. NPE when the default filesystem's uri has no authority.
(Daryn Sharp via mattf)
MAPREDUCE-3112. Calling hadoop cli inside mapreduce job leads to errors.
(Eric Yang via mattf)
HADOOP-7691. Hadoop deb pkg group id. (Eric Yang via mattf)
HADOOP-7685. Resolve issues with hadoop-common file hadoop-setup-conf.sh.
(Eric Yang and Devaraj K, via mattf)
HADOOP-7684. jobhistory server and secondarynamenode should have
init.d script for rpm and deb. (Eric Yang via mattf)
HADOOP-7683. remove hdfs-site.xml template has properties that are not used
in 0.20-security. (Arpit Gupta via mattf)
HADOOP-7603. Set default hdfs, mapred uid, and hadoop group gid for RPM
packages. (Eric Yang via mattf)
HADOOP-7681. log4j.properties is missing properties for security audit and
hdfs audit should be changed to info. (Arpit Gupta via mattf)
HADOOP-7679. log4j.properties templates must define
mapred.jobsummary.logger (Ramya Sunil via mattf)
HDFS-2325. Fuse-DFS fails to build on Hadoop 20.203.0
(Kihwal Lee via mattf)
HDFS-2342. add Jersey libraries to ivy.xml files in contrib, to fix
TestSleepJob and TestHdfsProxy. (Tsz Wo (Nicholas), SZE via mattf)
MAPREDUCE-2324. Removed usage of broken
ResourceEstimator.getEstimatedReduceInputSize to check against usable
disk-space on TaskTracker. (Robert Evans via acmurthy)
MAPREDUCE-2729. Ensure jobs with reduces which can't be launched due to
slow-start do not count for user-limits. (Sherry Chen via acmurthy)
HADOOP-6833. IPC leaks call parameters when exceptions thrown.
(Todd Lipcon via eli)
HADOOP-7400. Fix HdfsProxyTests failures when -Dtest.build.dir
and -Dbuild.test are set to a dir other than the build dir (gkesavan)
MAPREDUCE-2650. back-port MAPREDUCE-2238 to 0.20-security.
(Sherry Chen via mahadev)
HDFS-2053. Bug in INodeDirectory#computeContentSummary warning
(Michael Noll via eli)
HDFS-2117. DiskChecker#mkdirsWithExistsAndPermissionCheck may
return true even when the dir is not created. (eli)
MAPREDUCE-2489. Jobsplits with random hostnames can make the
queue unusable. (Jeffrey Naisbitt via mahadev)
HDFS-2190. NN fails to start if it encounters an empty or malformed fstime
file. (atm)
HDFS-2259. DN web-UI doesn't work with paths that contain html. (eli)
HDFS-561. Fix write pipeline READ_TIMEOUT.
(Todd Lipcon via dhruba)
HDFS-606. Fix ConcurrentModificationException in invalidateCorruptReplicas.
(Todd Lipcon via dhruba)
HDFS-1118. Fix socket leak on DFSClient.
(Zheng Shao via dhruba)
HDFS-988. Fix bug where saveNamespace can corrupt edits log.
(Nicolas Spiegelberg via dhruba)
HDFS-1054. remove sleep before retry for allocating a block.
(Todd Lipcon via dhruba)
HDFS-1207. FSNamesystem.stallReplicationWork should be volatile.
(Todd Lipcon via dhruba)
HDFS-1141. completeFile does not check lease ownership.
(Todd Lipcon via dhruba)
HDFS-1204. Lease expiration should recover single files,
not entire lease holder (Sam Rash via dhruba)
HDFS-1346. DFSClient receives out of order packet ack. (hairong)
HDFS-1057. Concurrent readers hit ChecksumExceptions if following
a writer to very end of file (Sam Rash via dhruba)
HDFS-724. Use a bidirectional heartbeat to detect stuck
pipeline. (hairong)
HDFS-1555. Disallow pipeline recovery if a file is already being
lease recovered. (hairong)
HDFS-1554. New semantics for recoverLease. (hairong)
HADOOP-7596. Makes packaging of 64-bit jsvc possible. Has other
bug fixes to do with packaging. (Eric Yang via ddas)
HDFS-2309. TestRenameWhileOpen fails. (jitendra)
HDFS-2300. TestFileAppend4 and TestMultiThreadedSync failure. (jitendra)
HDFS-1122. client block verification may result in blocks in
DataBlockScanner prematurely. (Sam Rash via jitendra)
HADOOP-6722. NetUtils.connect should check that it hasn't connected a
socket to itself. (Todd Lipcon via suresh)
HDFS-1779. After NameNode restart, clients cannot read partial files
even after the client invokes sync. (Uma Maheswara Rao G via jitendra)
HDFS-1197. Blocks are considered "complete" prematurely after
commitBlockSynchronization or DN restart. (Todd Lipcon via jitendra)
HDFS-1218. Blocks recovered on startup should be treated with lower
priority during block synchronization. (Todd Lipcon via suresh)
HDFS-1186. DNs should interrupt writers at start of recovery.
(Todd Lipcon via suresh)
HDFS-1252. Fix TestDFSConcurrentFileOperations.
(Todd Lipcon via suresh).
HDFS-1260. Block lost when multiple DNs trying to recover it to different
genstamps. (Todd Lipcon via jitendra)
HADOOP-7626. Bugfix for a config generator (Eric Yang via ddas)
MAPREDUCE-2549. Fix resource leaks in Eclipse plugin. (Devaraj K via
acmurthy)
HDFS-2328. HFTP throws NPE if security is enabled locally, but not
remotely. (omalley)
HADOOP-7602. wordcount, sort etc on har files fails with NPE.
(John George via jitendra)
HADOOP-7625. Fix TestDelegationToken by having DFSClient set the service
correctly and having the test cases use the common jar. (omalley)
HADOOP-7644. Fix TestDelegationTokenRenewal and TestDelegationTokenFetcher
to use and test the new style renewers. (omalley)
HADOOP-7637. Fix to include FairScheduler configuration file in
RPM. (Eric Yang via ddas)
HADOOP-7633. Adds log4j.properties to the hadoop-conf dir on
deploy (Eric Yang via ddas)
HADOOP-7631. Fixes a config problem to do with running streaming jobs
(Eric Yang via ddas)
HADOOP-7630. Fixes hadoop-metrics2.properties to have a property
*.period set to a default value. (Eric Yang via ddas)
HADOOP-7615. Fixes to have contrib jars in the HADOOP_CLASSPATH
for the binary layout case. (Eric Yang via ddas)
HADOOP-7661. FileSystem.getCanonicalServiceName throws NPE for any
file system uri that doesn't have an authority. (jitendra)
HADOOP-7649. TestMapredGroupMappingServiceRefresh and
TestRefreshUserMappings fail after HADOOP-7625. (jitendra)
HADOOP-7658. Fix hadoop config template for secured and unsecured
installation (Eric Yang via gkesavan)
MAPREDUCE-3076. Annotate o.a.h.mapreduce.TestSleepJob with @Ignore since it
is not a junit test. (acmurthy via szetszwo)
HDFS-2331. Fix WebHdfsFileSystem compilation problems for a bug in JDK
version < 1.6.0_26. (Abhijit Suresh Shingate via szetszwo)
HADOOP-7645. HTTP auth tests requiring Kerberos infrastructure are not
disabled on branch-0.20-security. (jitendra)
HADOOP-7674. TestKerberosName fails in 20 branch. (jitendra)
HDFS-2333. Change DFSOutputStream back to package private, otherwise,
there are two SC_START_IN_CTOR findbugs warnings. (szetszwo)
HADOOP-7676. Enable hbase to run as hdfs user (gkesavan)
HDFS-2359. Fix NullPointerException in DataBlockScanner.
(Jonathan Eagles via suresh)
MAPREDUCE-3081. Fix for vaidya.sh to work with the new layout
(Suhas via gkesavan)
HDFS-2366. Initialize WebHdfsFileSystem.ugi in object construction.
(szetszwo)
HDFS-2361. hftp is broken. Fixed username checks in JspHelper. (jitendra)
HDFS-2375. Fix TestFileAppend4 failure. (suresh)
HDFS-2373. Commands using WebHDFS and hftp print unnecessary debug
info on the console with security enabled. (Arpit Gupta via suresh)
HADOOP-7610. Fix for hadoop debian package (Eric Yang via gkesavan)
HADOOP-7715. Removed unnecessary security logger configuration. (Eric Yang)
HADOOP-7711. Fixed recursive sourcing of HADOOP_OPTS environment
variables (Arpit Gupta via Eric Yang)
HDFS-2392. Dist with hftp is failing again. (Daryn Sharp via jitendra)
HDFS-2408. DFSClient#getNumCurrentReplicas is package private in 205 but
public in branch-0.20-append (stack via atm)
HADOOP-7721. dfs.web.authentication.kerberos.principal expects the full
hostname and does not replace _HOST with the hostname. (jitendra)
HDFS-2403. NamenodeWebHdfsMethods.generateDelegationToken(..) does not use
the renewer parameter. (szetszwo)
HADOOP-7724. Fixed hadoop-setup-conf.sh to put proxy user in
core-site.xml. (Arpit Gupta via Eric Yang)
HDFS-2411. With WebHDFS enabled in secure mode the auth to local mappings
are not being respected. (jitendra)
IMPROVEMENTS
MAPREDUCE-2928. MR-2413 improvements (Eli Collins via mattf)
HADOOP-7655. provide a small validation script that smoke tests
the installed cluster. (Arpit Gupta via mattf)
MAPREDUCE-2187. Reporter sends progress during sort/merge. (Anupam Seth via
acmurthy)
MAPREDUCE-2705. Implements launch of multiple tasks concurrently.
(Thomas Graves via ddas)
HADOOP-7343. Make the number of warnings accepted by test-patch
configurable to limit false positives. (Thomas Graves via cdouglas)
HDFS-1836. Thousands of CLOSE_WAIT sockets. Contributed by Todd Lipcon,
ported to security branch by Bharath Mundlapudi. (via mattf)
HADOOP-7432. Back-port HADOOP-7110 to 0.20-security: Implement chmod
in NativeIO library. (Sherry Chen via mattf)
HADOOP-7314. Add support for throwing UnknownHostException when a host
doesn't resolve. Needed for MAPREDUCE-2489. (Jeffrey Naisbitt via mattf)
MAPREDUCE-2494. Make the distributed cache delete entries using LRU
priority (Robert Joseph Evans via mahadev)
HADOOP-6889. Make RPC to have an option to timeout - backport to
0.20-security. Unit tests updated to 17/Aug/2011 version.
(John George and Ravi Prakash via mattf)
MAPREDUCE-2780. Use a utility method to set service in token.
(Daryn Sharp via jitendra)
HADOOP-7472. RPC client should deal with IP address change.
(Kihwal Lee via suresh)
MAPREDUCE-2489. Jobsplits with random hostnames can make the queue unusable
(Jeffrey Naisbitt via mahadev)
MAPREDUCE-2852. Jira for YDH bug 2854624. (Kihwal Lee via eli)
HDFS-1210. DFSClient should log exception when block recovery fails.
(Todd Lipcon via dhruba)
HDFS-1211. Block receiver should not log "rewind" packets at INFO level.
(Todd Lipcon)
HDFS-1164. TestHdfsProxy is failing. (Todd Lipcon)
HDFS-1202. DataBlockScanner throws NPE when updated before initialized.
(Todd Lipcon)
HADOOP-7539. merge hadoop archive goodness from trunk to .20 (John George
via mahadev)
HADOOP-7594. Support HTTP REST in HttpServer. (szetszwo)
HDFS-1242. Add test for appendFile() race solved in HDFS-142.
(Todd Lipcon via jitendra)
HDFS-2320. Make 0.20-append protocol changes compatible with
0.20-security. (suresh)
MAPREDUCE-2610. Make QueueAclsInfo public. (Joep Rottinghuis via acmurthy)
MAPREDUCE-2915. Ensure LTC passes java.library.path. (Kihwal Lee via
acmurthy)
HADOOP-7599. Script improvements to setup a secure Hadoop cluster
(Eric Yang via ddas)
MAPREDUCE-2981. Backport FairScheduler from trunk. (Matei Zaharia via
acmurthy)
MAPREDUCE-1734. Undeprecate old API in branch-0.20-security. (Todd Lipcon
via acmurthy)
HDFS-2356. Support case insensitive query parameter names in WebHDFS.
(szetszwo)
HADOOP-7510. Add configurable option to use original hostname in token
instead of IP to allow server IP change. (Daryn Sharp via suresh)
HDFS-2368. Move SPNEGO conf properties from hdfs-default.xml to
hdfs-site.xml. (szetszwo)
HADOOP-7710. Added hadoop-setup-application.sh for creating
application directory (Arpit Gupta via Eric Yang)
HADOOP-7708. Fixed hadoop-setup-conf.sh to handle config file
consistently. (Eric Yang)
HADOOP-7707. Added toggle for dfs.support.append, WebHDFS and hadoop proxy
user to setup config script. (Arpit Gupta via Eric Yang)
HADOOP-7720. Added parameter for HBase user to setup config script.
(Arpit Gupta via Eric Yang)
HDFS-2395. Add a root element in the JSON responses of WebHDFS.
(szetszwo)
Release 0.20.204.0 - 2011.08.25
NEW FEATURES
HADOOP-6255. Create RPM and Debian packages for common. Changes deployment
layout to be consistent across the binary tgz, rpm, and deb. Adds setup
scripts for easy one node cluster configuration and user creation.
(Eric Yang via omalley)
HADOOP-7324. Ganglia plugins for metrics v2. (Priyo Mustafi via llu)
BUG FIXES
MAPREDUCE-2804. Fixed a race condition in setting up the log directories
for tasks that are starting at the same time. (omalley)
MAPREDUCE-2846. Fixed a race condition in writing the log index file that
caused tasks to fail. (omalley)
MAPREDUCE-2651. Fix race condition in Linux task controller for
job log directory creation. (Bharath Mundlapudi via llu)
MAPREDUCE-2621. TestCapacityScheduler fails with "Queue "q1" does not
exist". (Sherry Chen via mahadev)
HADOOP-7475. Fix hadoop-setup-single-node.sh to reflect new layout. (eyang
via omalley)
HADOOP-7045. TestDU fails on systems with local file systems with
extended attributes. (eli)
MAPREDUCE-2495. exit() the TaskTracker when the distributed cache cleanup
thread dies. (Robert Joseph Evans via cdouglas)
HDFS-1878. TestHDFSServerPorts unit test failure - race condition
in FSNamesystem.close() causes NullPointerException without serious
consequence. (mattf)
MAPREDUCE-2452. Moves the cancellation of delegation tokens to a separate
thread. (ddas)
MAPREDUCE-2555. Avoid spurious logging from completedtasks. (Thomas Graves
via cdouglas)
MAPREDUCE-2451. Log the details from health check script at the
JobTracker. (Thomas Graves via cdouglas)
MAPREDUCE-2535. Fix NPE in JobClient caused by retirement. (Robert Joseph
Evans via cdouglas)
MAPREDUCE-2456. Log the reduce taskID and associated TaskTrackers with
failed fetch notifications in the JobTracker log.
(Jeffrey Naisbitt via cdouglas)
HDFS-2044. TestQueueProcessingStatistics failing automatic test due to
timing issues. (mattf)
HADOOP-7248. Update eclipse target to generate .classpath from ivy config.
(Thomas Graves and Tom White via cdouglas)
MAPREDUCE-2558. Add queue-level metrics 0.20-security branch (test fixes)
(Jeffrey Naisbitt via mahadev)
HADOOP-7364. TestMiniMRDFSCaching fails if test.build.dir is set to
something other than build/test. (Thomas Graves via mahadev)
HADOOP-7277. Add generation of run configurations to eclipse target.
(Jeffrey Naisbitt and Philip Zeyliger via cdouglas)
HADOOP-7373. Fix {start,stop}-{dfs,mapred} and hadoop-daemons.sh from
trying to use the wrong bin directory. (omalley)
HADOOP-7274. Fix typos in IOUtils. (Jonathan Eagles via cdouglas)
HADOOP-7369. Fix permissions in tarball for sbin/* and libexec/* (omalley)
MAPREDUCE-2479. Move distributed cache cleanup to a background task,
backporting MAPREDUCE-1568. (Robert Joseph Evans via cdouglas)
HADOOP-7356. Fix bin/hadoop scripts (eyang via omalley)
HADOOP-7272. Remove unnecessary security related info logs. (suresh)
MAPREDUCE-2514. Fix typo in TaskTracker ReinitTrackerAction log message.
(Jonathan Eagles via cdouglas)
HDFS-1906. Remove logging exception stack trace in client logs when one of
the datanode targets to read from is not reachable. (suresh)
MAPREDUCE-2490. Add logging to graylist and blacklist activity to aid
diagnosis of related issues. (Jonathan Eagles via cdouglas)
MAPREDUCE-2447. Fix Child.java to set Task.jvmContext sooner to avoid
corner cases in error handling. (Siddharth Seth via acmurthy)
MAPREDUCE-2429. Validate JVM in TaskUmbilicalProtocol. (Siddharth Seth via
acmurthy)
MAPREDUCE-2418. Show job errors in JobHistory page. (Siddharth Seth via
acmurthy)
HDFS-1592. At Startup, Valid volumes required in FSDataset doesn't
handle consistently with volumes tolerated. (Bharath Mundlapudi)
HDFS-1598. Directory listing on hftp:// does not show
.*.crc files. (szetszwo)
HDFS-1750. ListPathsServlet should not use HdfsFileStatus.getLocalName()
to get file name since it may return an empty string. (szetszwo)
HDFS-1758. Make Web UI JSP pages thread safe. (Tanping via suresh)
HDFS-1773. Do not show decommissioned datanodes, which are not in both
include and exclude lists, on web and JMX interfaces.
(Tanping Wang via szetszwo)
MAPREDUCE-2409. Distinguish distributed cache artifacts localized as
files, archives. (Siddharth Seth via cdouglas)
MAPREDUCE-118. Fix Job.getJobID() to get the new ID as soon as it's
assigned. (Amareshwari Sriramadasu and Dick King via cdouglas)
MAPREDUCE-2411. Force an exception when the queue has an invalid name or
its ACLs are misconfigured. (Dick King via cdouglas)
HDFS-1258. Clearing namespace quota on "/" corrupts fs image.
(Aaron T. Myers via szetszwo)
HDFS-1189. Quota counts missed between clear quota and set quota.
(John George via szetszwo)
HDFS-1692. In secure mode, Datanode process doesn't exit when disks
fail. (bharathm via boryas)
MAPREDUCE-2420. JobTracker should be able to renew delegation token
over HTTP (boryas)
MAPREDUCE-2443. Fix TaskAspect for TaskUmbilicalProtocol.ping(..).
(Siddharth Seth via szetszwo)
HDFS-1842. Handle editlog opcode conflict with 0.20.203 during upgrade,
by throwing an error to indicate the editlog needs to be empty.
(suresh)
HDFS-1377. Quota bug for partial blocks allows quotas to be violated. (eli)
HDFS-2057. Wait time to terminate the threads causes unit tests to
take longer time. (Bharath Mundlapudi via suresh)
HDFS-2218. Disable TestHdfsProxy.testHdfsProxyInterface in automated test
suite for 0.20-security-204 release. (Matt Foley)
IMPROVEMENTS
HADOOP-7144. Expose JMX metrics via JSON servlet. (Robert Joseph Evans via
cdouglas)
MAPREDUCE-2524. Port reduce failure reporting semantics from trunk, to
fail faulty maps more aggressively. (Thomas Graves via cdouglas)
MAPREDUCE-2529. Add support for regex-based shuffle metric counting
exceptions. (Thomas Graves via cdouglas)
HADOOP-7398. Add mechanism to suppress warnings about use of HADOOP_HOME.
(omalley)
HDFS-2023. Backport of NPE for File.list and File.listFiles.
Merged ports of HADOOP-7322, HDFS-1934, HADOOP-7342, and HDFS-2019.
(Bharath Mundlapudi via mattf)
MAPREDUCE-2415. Distribute the user task logs on to multiple disks.
(Bharath Mundlapudi via omalley)
MAPREDUCE-2413. TaskTracker should handle disk failures by reinitializing
itself. (Ravi Gummadi and Jagane Sundar via omalley)
HDFS-1541. Not marking datanodes dead when namenode in safemode.
(hairong)
HDFS-1767. Namenode ignores non-initial block report from datanodes
when in safemode during startup. (Matt Foley via suresh)
MAPREDUCE-1251. c++ utils doesn't compile. (Eli Collins via shv)
HADOOP-7459. Remove jdk-1.6.0 dependency check from rpm. (omalley)
HADOOP-7330. Fix MetricsSourceAdapter to use the value instead of the
object. (Luke Lu via omalley)
Release 0.20.203.0 - 2011-5-11
MAPREDUCE-1280. Update Eclipse plugin to the new eclipse.jdt API.
(Alex Kozlov via szetszwo)
HADOOP-7259. Contrib modules should include the build.properties from
the enclosing hadoop directory. (omalley)
HADOOP-7253. Update the default configuration to fix security audit log
and metrics2 property configuration warnings. (omalley)
HADOOP-7247. Update documentation to match current jar names. (omalley)
HADOOP-7246. Update the log4j configuration to match the EventCounter
package. (Luke Lu via omalley)
HADOOP-7143. Restore HadoopArchives. (Joep Rottinghuis via omalley)
MAPREDUCE-2316. Updated CapacityScheduler documentation. (acmurthy)
HADOOP-7243. Fix contrib unit tests missing dependencies. (omalley)
HADOOP-7190. Add metrics v1 back for backwards compatibility. (omalley)
MAPREDUCE-2360. Remove stripping of scheme, authority from submit dir in
support of viewfs. (cdouglas)
MAPREDUCE-2359. Use correct file system to access distributed cache objects.
(Krishna Ramachandran)
MAPREDUCE-2361. "Fix Distributed Cache is not adding files to class paths
correctly" - Drop the host/scheme/fragment from URI (cdouglas)
MAPREDUCE-2362. Fix unit-test failures: TestBadRecords (NPE due to
rearranged MapTask code) and TestTaskTrackerMemoryManager
(need hostname in output-string pattern). (Greg Roelofs, Krishna
Ramachandran)
HDFS-1729. Add statistics logging for better visibility into
startup time costs. (Matt Foley)
MAPREDUCE-2363. When a queue is built without any access rights we
explain the problem. (Richard King)
MAPREDUCE-1563. TaskDiagnosticInfo may be missed sometimes. (Krishna
Ramachandran)
MAPREDUCE-2364. Don't hold the rjob lock while localizing resources. (ddas
via omalley)
MAPREDUCE-2365. New counters for FileInputFormat (BYTES_READ) and
FileOutputFormat (BYTES_WRITTEN).
New counter MAP_OUTPUT_MATERIALIZED_BYTES for compressed MapOutputSize.
(Siddharth Seth)
HADOOP-7040. Change DiskErrorException to IOException (boryas)
HADOOP-7104. Remove unnecessary DNS reverse lookups from RPC layer
(kzhang)
MAPREDUCE-2366. Fix a problem where the task browser UI can't retrieve the
stdxxx printouts of streaming jobs that abend in the unix code, in
the common case where the containing job doesn't reuse JVMs.
(Richard King)
HADOOP-6977. Herriot daemon clients should vend statistics (cos)
HADOOP-6971. Clover build doesn't generate per-test coverage (cos)
HADOOP-6879. Provide SSH based (Jsch) remote execution API for system
tests. (cos)
HADOOP-7215. RPC clients must use network interface corresponding to
the host in the client's kerberos principal key. (suresh)
HADOOP-7232. Fix Javadoc warnings. (omalley)
HADOOP-7258. The Gzip codec should not return null decompressors. (omalley)
Release 0.20.202.0 - unreleased
MAPREDUCE-2355. Add a configuration knob
mapreduce.tasktracker.outofband.heartbeat.damper that limits out of band
heartbeats (acmurthy)
MAPREDUCE-2356. Fix a race-condition that corrupted a task's state on the
JobTracker. (Luke Lu)
MAPREDUCE-2357. Always propagate IOExceptions that are thrown by
non-FileInputFormat. (Luke Lu)
HADOOP-7163. RPC handles SocketTimeOutException during SASL negotiation.
(ddas)
MAPREDUCE-2358. MapReduce assumes the default FileSystem is HDFS.
(Krishna Ramachandran)
MAPREDUCE-1904. Reducing locking contention in TaskTracker's
MapOutputServlet LocalDirAllocator. (Rajesh Balamohan via acmurthy)
HDFS-1626. Make BLOCK_INVALIDATE_LIMIT configurable. (szetszwo)
HDFS-1584. Adds a check for whether relogin is needed to
getDelegationToken in HftpFileSystem. (Kan Zhang via ddas)
HADOOP-7115. Reduces the number of calls to getpwuid_r and
getpwgid_r, by implementing a cache in NativeIO. (ddas)
HADOOP-6882. An XSS security exploit in jetty-6.1.14. jetty upgraded to
6.1.26. (ddas)
MAPREDUCE-2278. Fixes a memory leak in the TaskTracker. (cdouglas)
HDFS-1353 redux. Modulate original 1353 to not bump RPC version.
(jhoman)
MAPREDUCE-2082. Race condition in writing the jobtoken password file when
launching pipes jobs (jitendra and ddas)
HADOOP-6978. Fixes task log servlet vulnerabilities via symlinks.
(Todd Lipcon and Devaraj Das)
MAPREDUCE-2178. Write task initialization to avoid race
conditions leading to privilege escalation and resource leakage by
performing more actions as the user. (Owen O'Malley, Devaraj Das,
Chris Douglas via cdouglas)
HDFS-1364. HFTP client should support relogin from keytab
HADOOP-6907. Make RPC client use per-proxy configuration.
(Kan Zhang via ddas)
MAPREDUCE-2055. Fix JobTracker to decouple job retirement from copy of
job-history file to HDFS and enhance RetiredJobInfo to carry aggregated
job-counters to prevent a disk roundtrip on job-completion to fetch
counters for the JobClient. (Krishna Ramachandran via acmurthy)
HDFS-1353. Remove most of getBlockLocation optimization (jghoman)
MAPREDUCE-2023. TestDFSIO read test may not read specified bytes. (htang)
HDFS-1340. A null delegation token is appended to the url if security is
disabled when browsing filesystem. (boryas)
HDFS-1352. Fix jsvc.location. (jghoman)
HADOOP-6860. 'compile-fault-inject' should never be called directly. (cos)
MAPREDUCE-2005. TestDelegationTokenRenewal fails (boryas)
MAPREDUCE-2000. Rumen is not able to extract counters for Job history logs
from Hadoop 0.20. (htang)
MAPREDUCE-1961. ConcurrentModificationException when shutting down Gridmix.
(htang)
HADOOP-6899. RawLocalFileSystem set working directory does
not work for relative names. (suresh)
HDFS-495. New clients should be able to take over files lease if the old
client died. (shv)
HADOOP-6728. Re-design and overhaul of the Metrics framework. (Luke Lu via
acmurthy)
MAPREDUCE-1966. Change blacklisting of tasktrackers on task failures to be
a simple graylist to fingerpoint bad tasktrackers. (Greg Roelofs via
acmurthy)
HADOOP-6864. Add ability to get netgroups (as returned by getent
netgroup command) using native code (JNI) instead of forking. (Erik Steffl)
HDFS-1318. HDFS Namenode and Datanode WebUI information needs to be
accessible programmatically for scripts. (Tanping Wang via suresh)
HDFS-1315. Add fsck event to audit log and remove other audit log events
corresponding to FSCK listStatus and open calls. (suresh)
MAPREDUCE-1941. Provides access to JobHistory file (raw) with job user/acl
permission. (Srikanth Sundarrajan via ddas)
MAPREDUCE-291. Optionally a separate daemon should serve JobHistory.
(Srikanth Sundarrajan via ddas)
MAPREDUCE-1936. Make Gridmix3 more customizable (sync changes from trunk).
(htang)
HADOOP-5981. Fix variable substitution during parsing of child environment
variables. (Krishna Ramachandran via acmurthy)
MAPREDUCE-339. Greedily schedule failed tasks to cause early job failure.
(cdouglas)
MAPREDUCE-1872. Hardened CapacityScheduler to have comprehensive, coherent
limits on tasks/jobs for jobs/users/queues. Also, added the ability to
refresh queue definitions without the need to restart the JobTracker.
(acmurthy)
HDFS-1161. Make DN minimum valid volumes configurable. (shv)
HDFS-457. Reintroduce volume failure tolerance for DataNodes. (shv)
HDFS-1307. Add start time, end time and total time taken for FSCK
to FSCK report. (suresh)
MAPREDUCE-1207. Sanitize user environment of map/reduce tasks and allow
admins to set environment and java options. (Krishna Ramachandran via
acmurthy)
HDFS-1298. Add support in HDFS for new statistics added in FileSystem
to track the file system operations (suresh)
HDFS-1301. TestHDFSProxy needs to use server side conf for ProxyUser
stuff. (boryas)
HADOOP-6859. Introduce additional statistics to FileSystem to track
file system operations (suresh)
HADOOP-6818. Provides a JNI implementation of Unix Group resolution. The
config hadoop.security.group.mapping should be set to
org.apache.hadoop.security.JniBasedUnixGroupsMapping to enable this
implementation. (ddas)
MAPREDUCE-1938. Introduces a configuration for putting user classes before
the system classes during job submission and in task launches. Two things
need to be done in order to use this feature -
(1) mapreduce.user.classpath.first : this should be set to true in the
jobconf, and, (2) HADOOP_USER_CLASSPATH_FIRST : this is relevant for job
submissions done using bin/hadoop shell script. HADOOP_USER_CLASSPATH_FIRST
should be defined in the environment with some non-empty value
(like "true"), and then bin/hadoop should be executed. (ddas)
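As a rough illustration of the ordering that MAPREDUCE-1938 describes, the sketch below builds a classpath with the user entries placed either before or after the system entries. The class ClasspathOrder and its methods are hypothetical names used only for this note and are not Hadoop code; only the environment variable name and the "non-empty value" rule are taken from the entry above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClasspathOrder {
    /** True when the variable is defined with some non-empty value. */
    public static boolean userFirst(String envValue) {
        return envValue != null && !envValue.isEmpty();
    }

    /** Joins the entries; user entries lead when userFirst is true. */
    public static String build(List<String> system, List<String> user, boolean userFirst) {
        List<String> all = new ArrayList<String>();
        if (userFirst) {
            all.addAll(user);
            all.addAll(system);
        } else {
            all.addAll(system);
            all.addAll(user);
        }
        return String.join(File.pathSeparator, all);
    }

    public static void main(String[] args) {
        // Jar names are placeholders chosen for the example.
        boolean first = userFirst(System.getenv("HADOOP_USER_CLASSPATH_FIRST"));
        System.out.println(build(Arrays.asList("hadoop-core.jar"),
                                 Arrays.asList("my-job.jar"), first));
    }
}
```

With the variable unset or empty, the system entries come first; with any non-empty value such as "true", the user entries come first.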
HADOOP-6669. Respect compression configuration when creating DefaultCodec
compressors. (Koji Noguchi via cdouglas)
HADOOP-6855. Add support for netgroups, as returned by command
getent netgroup. (Erik Steffl)
HDFS-599. Allow NameNode to have a separate port for service requests from
client requests. (Dmytro Molkov via hairong)
HDFS-132. Fix namenode to not report files deleted metrics for deletions
done while replaying edits during startup. (shv)
MAPREDUCE-1521. Protection against incorrectly configured reduces
(mahadev)
MAPREDUCE-1936. Make Gridmix3 more customizable. (htang)
MAPREDUCE-517. Enhance the CapacityScheduler to assign multiple tasks
per-heartbeat. (acmurthy)
MAPREDUCE-323. Re-factor layout of JobHistory files on HDFS to improve
operability. (Dick King via acmurthy)
MAPREDUCE-1921. Ensure exceptions during reading of input data in map
tasks are augmented by information about actual input file which caused
the exception. (Krishna Ramachandran via acmurthy)
MAPREDUCE-1118. Enhance the JobTracker web-ui to ensure tabular columns
are sortable, also added a /scheduler servlet to CapacityScheduler for
enhanced UI for queue information. (Krishna Ramachandran via acmurthy)
HADOOP-5913. Add support for starting/stopping queues. (cdouglas)
HADOOP-6835. Add decode support for concatenated gzip files. (Greg Roelofs)
HDFS-1158. Revert HDFS-457. (shv)
MAPREDUCE-1699. Ensure JobHistory isn't disabled for any reason. (Krishna
Ramachandran via acmurthy)
MAPREDUCE-1682. Fix speculative execution to ensure tasks are not
scheduled after job failure. (acmurthy)
MAPREDUCE-1914. Ensure unique sub-directories for artifacts in the
DistributedCache are cleaned up. (Dick King via acmurthy)
HADOOP-6713. Multiple RPC Reader Threads (Bharathm)
HDFS-1250. Namenode should reject block reports and block received
requests from dead datanodes (suresh)
MAPREDUCE-1863. [Rumen] Null failedMapAttemptCDFs in job traces generated
by Rumen. (htang)
MAPREDUCE-1309. Rumen refactoring. (htang)
HDFS-1114. Implement LightWeightGSet for BlocksMap in order to reduce
NameNode memory footprint. (szetszwo)
MAPREDUCE-572. Fixes DistributedCache.checkURIs to throw error if link is
missing for uri in cache archives. (amareshwari)
MAPREDUCE-787. Fix JobSubmitter to honor user given symlink in the path.
(amareshwari)
HADOOP-6815. refreshSuperUserGroupsConfiguration should use
server side configuration for the refresh. (boryas)
MAPREDUCE-1868. Add a read and connection timeout to JobClient while
pulling tasklogs. (Krishna Ramachandran via acmurthy)
HDFS-1119. Introduce a GSet interface to BlocksMap. (szetszwo)
MAPREDUCE-1778. Ensure failure to setup CompletedJobStatusStore is not
silently ignored by the JobTracker. (Krishna Ramachandran via acmurthy)
MAPREDUCE-1538. Add a limit on the number of artifacts in the
DistributedCache to ensure we cleanup aggressively. (Dick King via
acmurthy)
MAPREDUCE-1850. Add information about the host from which a job is
submitted. (Krishna Ramachandran via acmurthy)
HDFS-1110. Reuses objects for commonly used file names in namenode to
reduce the heap usage. (suresh)
HADOOP-6810. Extract a subset of tests for smoke (DOA) validation. (cos)
HADOOP-6642. Remove debug stmt left from original patch. (cdouglas)
HADOOP-6808. Add comments on how to setup File/Ganglia Context for
kerberos metrics (Erik Steffl)
HDFS-1061. INodeFile memory optimization. (bharathm)
HDFS-1109. HFTP supports filenames that contain the character "+".
(Dmytro Molkov via dhruba, backported by szetszwo)
HDFS-1085. Check file length and bytes read when reading a file through
hftp in order to detect failure. (szetszwo)
HDFS-1311. Running tests with 'testcase' causes triple execution of the
same test case (cos)
HDFS-1150.FIX. Verify datanodes' identities to clients in secure clusters.
Update to patch to improve handling of jsvc source in build.xml (jghoman)
HADOOP-6752. Remote cluster control functionality needs JavaDocs
improvement. (Balaji Rajagopalan via cos)
MAPREDUCE-1288. Fixes TrackerDistributedCacheManager to take into account
the owner of the localized file in the mapping from cache URIs to
CacheStatus objects. (ddas)
MAPREDUCE-1900. Fixes an FS leak that I missed in the earlier patch.
(ddas)
MAPREDUCE-1900. Makes JobTracker/TaskTracker close filesystems, created
on behalf of users, when they are no longer needed. (ddas)
HADOOP-6832. Add a static user plugin for web auth for external users.
(omalley)
HDFS-1007. Fixes a bug in SecurityUtil.buildDTServiceName to do
with handling of null hostname. (omalley)
HDFS-1007. Makes long running servers using hftp work. Also has some
refactoring in the MR code to do with handling of delegation tokens.
(omalley & ddas)
HDFS-1178. The NameNode servlets should not use RPC to connect to the
NameNode. (omalley)
MAPREDUCE-1807. Re-factor TestQueueManager. (Richard King via acmurthy)
HDFS-1150. Fixes the earlier patch to do logging in the right directory
and also adds facility for monitoring processes (via -Dprocname in the
command line). (Jakob Homan via ddas)
HADOOP-6781. security audit log shouldn't have exception in it. (boryas)
HADOOP-6776. Fixes the javadoc in UGI.createProxyUser. (ddas)
HDFS-1150. building jsvc from source tar. source tar is also checked in.
(jitendra)
HDFS-1150. Bugfix in the hadoop shell script. (ddas)
HDFS-1153. The navigation to /dfsnodelist.jsp with invalid input
parameters produces NPE and HTTP 500 error (rphulari)
MAPREDUCE-1664. Bugfix to enable queue administrators of a queue to
view job details of jobs submitted to that queue even though they
are not part of acl-view-job.
HDFS-1150. Bugfix to add more knobs to secure datanode starter.
HDFS-1157. Modifications introduced by HDFS-1150 are breaking aspect's
bindings (cos)
HDFS-1130. Adds a configuration dfs.cluster.administrators for
controlling access to the default servlets in hdfs. (ddas)
HADOOP-6706.FIX. Relogin behavior for RPC clients could be improved
(boryas)
HDFS-1150. Verify datanodes' identities to clients in secure clusters.
(jghoman)
MAPREDUCE-1442. Fixed regex in job-history related to parsing Counter
values. (Luke Lu via acmurthy)
HADOOP-6760. WebServer shouldn't increase port number in case of negative
port setting caused by Jetty's race. (cos)
HDFS-1146. Javadoc for getDelegationTokenSecretManager in FSNamesystem.
(jitendra)
HADOOP-6706. Fix on top of the earlier patch. Closes the connection
on a SASL connection failure, and retries again with a new
connection. (ddas)
MAPREDUCE-1716. Fix on top of earlier patch for logs truncation a.k.a
MAPREDUCE-1100. Addresses log truncation issues when binary data is
written to log files and adds a header to a truncated log file to
inform users of the truncation.
HDFS-1383. Improve the error messages when using hftp://.
MAPREDUCE-1744. Fixed DistributedCache apis to take a user-supplied
FileSystem to allow for better proxy behaviour for Oozie. (Richard King)
MAPREDUCE-1733. Authentication between pipes processes and java
counterparts. (jitendra)
MAPREDUCE-1664. Bugfix on top of the previous patch. (ddas)
HDFS-1136. FileChecksumServlets.RedirectServlet doesn't carry forward
the delegation token (boryas)
HADOOP-6756. Change value of FS_DEFAULT_NAME_KEY from fs.defaultFS
to fs.default.name which is a correct name for 0.20 (steffl)
HADOOP-6756. Document (javadoc comments) and cleanup configuration
keys in CommonConfigurationKeys.java (steffl)
MAPREDUCE-1759. Exception message for unauthorized user doing killJob,
killTask, setJobPriority needs to be improved. (gravi via vinodkv)
HADOOP-6715. AccessControlList.toString() returns empty string when
we set acl to "*". (gravi via vinodkv)
HADOOP-6757. NullPointerException for hadoop clients launched from
streaming tasks. (amarrk via vinodkv)
HADOOP-6631. FileUtil.fullyDelete() should continue to delete other files
despite failure at any level. (vinodkv)
MAPREDUCE-1317. NPE in setHostName in Rumen. (rksingh)
MAPREDUCE-1754. Replace mapred.permissions.supergroup with an acl:
mapreduce.cluster.administrators, and HADOOP-6748: Remove
hadoop.cluster.administrators. Contributed by Amareshwari Sriramadasu.
HADOOP-6701. Incorrect exit codes for "dfs -chown", "dfs -chgrp"
(rphulari)
HADOOP-6640. FileSystem.get() does RPC retries within a static
synchronized block. (hairong)
HDFS-1006. Removes unnecessary logins from the previous patch. (ddas)
HADOOP-6745. adding some java doc to Server.RpcMetrics, UGI (boryas)
MAPREDUCE-1707. TaskRunner can get NPE in getting ugi from TaskTracker.
(vinodkv)
HDFS-1104. Fsck triggers full GC on NameNode. (hairong)
HADOOP-6332. Large-scale Automated Test Framework (sharad, Sreekanth
Ramakrishnan, et al. via cos)
HADOOP-6526. Additional fix for test context on top of existing one. (cos)
HADOOP-6710. Symbolic umask for file creation is not conformant with posix.
(suresh)
HADOOP-6693. Added metrics to track kerberos login success and failure.
(suresh)
MAPREDUCE-1711. Gridmix should provide an option to submit jobs to the same
queues as specified in the trace. (rksing via htang)
MAPREDUCE-1687. Stress submission policy does not always stress the
cluster. (htang)
MAPREDUCE-1641. Bug-fix to ensure command line options such as
-files/-archives are checked for duplicate artifacts in the
DistributedCache. (Amareshwari Sreeramadasu via acmurthy)
MAPREDUCE-1641. Fix DistributedCache to ensure same files cannot be put in
both the archives and files sections. (Richard King via acmurthy)
HADOOP-6670. Fixes a testcase issue introduced by the earlier commit
of the HADOOP-6670 patch. (ddas)
MAPREDUCE-1718. Fixes a problem to do with correctly constructing
service name for the delegation token lookup in HftpFileSystem
(borya via ddas)
HADOOP-6674. Fixes the earlier patch to handle pings correctly (ddas).
MAPREDUCE-1664. Job Acls affect when Queue Acls are set.
(Ravi Gummadi via vinodkv)
HADOOP-6718. Fixes a problem to do with clients not closing RPC
connections on a SASL failure. (ddas)
MAPREDUCE-1397. NullPointerException observed during task failures.
(Amareshwari Sriramadasu via vinodkv)
HADOOP-6670. Use the UserGroupInformation's Subject as the criteria for
equals and hashCode. (omalley)
HADOOP-6716. System won't start in non-secure mode when krb5.conf
(edu.mit.kerberos on Mac) is not present. (boryas)
MAPREDUCE-1607. Task controller may not set permissions for a
task cleanup attempt's log directory. (Amareshwari Sreeramadasu via
vinodkv)
MAPREDUCE-1533. JobTracker performance enhancements. (Amar Kamat via
vinodkv)
MAPREDUCE-1701. AccessControlException while renewing a delegation token
is not correctly handled in the JobTracker. (boryas)
HDFS-481. Incremental patch to fix broken unit test in contrib/hdfsproxy
HADOOP-6706. Fixes a bug in the earlier version of the same patch (ddas)
HDFS-1096. Allow dfsadmin/mradmin refresh of superuser proxy group
mappings. (boryas)
HDFS-1012. Support for cluster specific path entries in ldap for hdfsproxy
(Srikanth Sundarrajan via Nicholas)
HDFS-1011. Improve Logging in HDFSProxy to include cluster name associated
with the request (Srikanth Sundarrajan via Nicholas)
HDFS-1010. Retrieve group information from UnixUserGroupInformation
instead of LdapEntry (Srikanth Sundarrajan via Nicholas)
HDFS-481. Bug fix - hdfsproxy: Stack overflow + Race conditions
(Srikanth Sundarrajan via Nicholas)
MAPREDUCE-1657. After task logs directory is deleted, tasklog servlet
displays wrong error message about job ACLs. (Ravi Gummadi via vinodkv)
MAPREDUCE-1692. Remove TestStreamedMerge from the streaming tests.
(Amareshwari Sriramadasu and Sreekanth Ramakrishnan via vinodkv)
HDFS-1081. Performance regression in
DistributedFileSystem::getFileBlockLocations in secure systems (jhoman)
MAPREDUCE-1656. JobStory should provide queue info. (htang)
MAPREDUCE-1317. Reducing memory consumption of rumen objects. (htang)
MAPREDUCE-1317. Reverting the patch since it caused build failures. (htang)
MAPREDUCE-1683. Fixed jobtracker web-ui to correctly display heap-usage.
(acmurthy)
HADOOP-6706. Fixes exception handling for saslConnect. The ideal
solution is to use the Refreshable interface but as Owen noted in
HADOOP-6656, it doesn't seem to work as expected. (ddas)
MAPREDUCE-1617. TestBadRecords failed once in our test runs. (Amar
Kamat via vinodkv).
MAPREDUCE-587. Stream test TestStreamingExitStatus fails with Out of
Memory. (Amar Kamat via vinodkv).
HDFS-1096. Reverting the patch since it caused build failures. (ddas)
MAPREDUCE-1317. Reducing memory consumption of rumen objects. (htang)
MAPREDUCE-1680. Add a metric to track number of heartbeats processed by the
JobTracker. (Richard King via acmurthy)
MAPREDUCE-1683. Removes JNI calls to get jvm current/max heap usage in
ClusterStatus by default. (acmurthy)
HADOOP-6687. user object in the subject in UGI should be reused in case
of a relogin. (jitendra)
HADOOP-5647. TestJobHistory fails if /tmp/_logs is not writable.
Testcase should not depend on /tmp. (Ravi Gummadi via vinodkv)
MAPREDUCE-181. Bug fix for Secure job submission. (Ravi Gummadi via
vinodkv)
MAPREDUCE-1635. ResourceEstimator does not work after MAPREDUCE-842.
(Amareshwari Sriramadasu via vinodkv)
MAPREDUCE-1526. Cache the job related information while submitting the
job. (rksingh)
HADOOP-6674. Turn off SASL checksums for RPCs. (jitendra via omalley)
HADOOP-5958. Replace fork of DF with library call. (cdouglas via omalley)
HDFS-999. Secondary namenode should login using kerberos if security
is configured. Bugfix to original patch. (jhoman)
MAPREDUCE-1594. Support for SleepJobs in Gridmix (rksingh)
HDFS-1007. Fix. ServiceName for delegation token for Hftp has hftp
port and not RPC port.
MAPREDUCE-1376. Support for varied user submissions in Gridmix (rksingh)
HDFS-1080. SecondaryNameNode image transfer should use the defined
http address rather than local ip address (jhoman)
HADOOP-6661. User document for UserGroupInformation.doAs for secure
impersonation. (jitendra)
MAPREDUCE-1624. Documents the job credentials and associated details
to do with delegation tokens (ddas)
HDFS-1036. Documentation for fetchdt for forrest (boryas)
HDFS-1039. New patch on top of previous patch. Gets namenode address
from conf. (jitendra)
HADOOP-6656. Renew Kerberos TGT when 80% of the renew lifetime has been
used up. (omalley)
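A minimal sketch of the arithmetic behind the 80% rule in HADOOP-6656, assuming the renewal point is measured from the ticket's start time toward its end time. The class TgtRenewWindow and its method refreshTime are hypothetical names for illustration and are not taken from the Hadoop source.

```java
public class TgtRenewWindow {
    /** Fraction of the ticket lifetime after which renewal is attempted (from the entry above). */
    static final double RENEW_WINDOW = 0.80;

    /** Returns the time, in ms since the epoch, at which 80% of the lifetime has elapsed. */
    public static long refreshTime(long startMillis, long endMillis) {
        return startMillis + (long) ((endMillis - startMillis) * RENEW_WINDOW);
    }

    public static void main(String[] args) {
        long start = 0L;
        long end = 10L * 60 * 60 * 1000; // a ticket valid for 10 hours
        System.out.println(refreshTime(start, end)); // prints 28800000, i.e. 8 hours in
    }
}
```

Renewing before the ticket expires leaves a margin (here the last 20% of the lifetime) in which a failed renewal attempt can still be retried.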
HADOOP-6653. Protect against NPE in setupSaslConnection when real user is
null. (omalley)
HADOOP-6649. An error in the previous committed patch. (jitendra)
HADOOP-6652. ShellBasedUnixGroupsMapping shouldn't have a cache.
(ddas)
HADOOP-6649. login object in UGI should be inside the subject
(jitendra)
HADOOP-6637. Benchmark overhead of RPC session establishment
(shv via jitendra)
HADOOP-6648. Credentials must ignore null tokens that can be generated
when using HFTP to talk to insecure clusters. (omalley)
HADOOP-6632. Fix on JobTracker to reuse filesystem handles if possible.
(ddas)
HADOOP-6647. balancer fails with "is not authorized for protocol
interface NamenodeProtocol" in secure environment (boryas)
MAPREDUCE-1612. job conf file is not accessible from job history
web page. (Ravi Gummadi via vinodkv)
MAPREDUCE-1611. Refresh nodes and refresh queues don't work with
service authorization enabled. (Amar Kamat via vinodkv)
HADOOP-6644. util.Shell getGROUPS_FOR_USER_COMMAND method
name - should use common naming convention (boryas)
MAPREDUCE-1609. Fixes a problem with localization of job log
directories when tasktracker is re-initialized that can result
in failed tasks. (Amareshwari Sriramadasu via yhemanth)
MAPREDUCE-1610. Update forrest documentation for directory
structure of localized files. (Ravi Gummadi via yhemanth)
MAPREDUCE-1532. Fixes a javadoc and an exception message in JobInProgress
when the authenticated user is different from the user in conf. (ddas)
MAPREDUCE-1417. Update forrest documentation for private
and public distributed cache files. (Ravi Gummadi via yhemanth)
HADOOP-6634. AccessControlList uses full-principal names to verify acls
causing queue-acls to fail (vinodkv)
HADOOP-6642. Fix javac, javadoc, findbugs warnings. (chrisdo via acmurthy)
HDFS-1044. Cannot submit mapreduce job from secure client to
unsecure server. (boryas)
HADOOP-6638. Try to relogin in case of a failed RPC connection
(expired tgt) only in case the subject is loginUser or
proxyUgi.realUser. (boryas)
HADOOP-6632. Support for using different Kerberos keys for different
instances of Hadoop services. (jitendra)
HADOOP-6526. Need mapping from long principal names to local OS
user names. (jitendra)
MAPREDUCE-1604. Update Forrest documentation for job authorization
ACLs. (Amareshwari Sriramadasu via yhemanth)
HDFS-1045. In secure clusters, re-login is necessary for https
clients before opening connections (jhoman)
HADOOP-6603. Addition to original patch to be explicit
about new method not being for general use. (jhoman)
MAPREDUCE-1543. Add audit log messages for job and queue
access control checks. (Amar Kamat via yhemanth)
MAPREDUCE-1606. Fixed occasional timeout in TestJobACL. (Ravi Gummadi via
acmurthy)
HADOOP-6633. normalize property names for JT/NN kerberos principal
names in configuration. (boryas)
HADOOP-6613. Changes the RPC server so that version is checked first
on an incoming connection. (Kan Zhang via ddas)
HADOOP-5592. Fix typo in Streaming doc in reference to GzipCodec.
(Corinne Chandel via tomwhite)
MAPREDUCE-813. Updates Streaming and M/R tutorial documents.
(Corinne Chandel via ddas)
MAPREDUCE-927. Cleanup of task-logs should happen in TaskTracker instead
of the Child. (Amareshwari Sriramadasu via vinodkv)
HDFS-1039. Service should be set in the token in JspHelper.getUGI.
(jitendra)
MAPREDUCE-1599. MRBench reuses jobConf and credentials therein.
(jitendra)
MAPREDUCE-1522. FileInputFormat may use the default FileSystem for the
input path. (Tsz Wo (Nicholas), SZE via cdouglas)
HDFS-1036. In DelegationTokenFetch pass Configuration object so
getDefaultUri will work correctly.
HDFS-1038. In nn_browsedfscontent.jsp fetch delegation token only if
security is enabled. (jitendra)
HDFS-1036. in DelegationTokenFetch dfs.getURI returns no port (boryas)
HADOOP-6598. Verbose logging from the Group class (one more case)
(boryas)
HADOOP-6627. "Bad Connection to FS" message in FSShell should print
message from the exception (boryas)
HDFS-1033. In secure clusters, NN and SNN should verify the remote
principal during image and edits transfer (jhoman)
HDFS-1005. Fixes a bug to do with calling the cross-realm API in Fsck
client. (ddas)
MAPREDUCE-1422. Fix cleanup of localized job directory to work if files
with non-deletable permissions are created within it.
(Amar Kamat via yhemanth)
HDFS-1007. Fixes bugs to do with 20S cluster talking to 20 over
hftp (borya)
MAPREDUCE-1566. Fixes bugs in the earlier patch. (ddas)
HDFS-992. A bug in backport for HDFS-992. (jitendra)
HADOOP-6598. Remove verbose logging from the Groups class. (borya)
HADOOP-6620. NPE if renewer is passed as null in getDelegationToken.
(jitendra)
HDFS-1023. Second Update to original patch to fix username (jhoman)
MAPREDUCE-1435. Add test cases to already committed patch for this
jira, synchronizing changes with trunk. (yhemanth)
HADOOP-6612. Protocols RefreshUserToGroupMappingsProtocol and
RefreshAuthorizationPolicyProtocol authorization settings thru
KerberosInfo (boryas)
MAPREDUCE-1566. Bugfix for tests on top of the earlier patch. (ddas)
MAPREDUCE-1566. Mechanism to import tokens and secrets from a file into
the submitted job. (omalley)
HADOOP-6603. Provide workaround for issue with Kerberos not
resolving cross-realm principal. (kan via jhoman)
HDFS-1023. Update to original patch to fix username (jhoman)
HDFS-814. Add an api to get the visible length of a
DFSDataInputStream. (hairong)
HDFS-1023. Allow http server to start as regular user if https
principal is not defined. (jhoman)
HDFS-1022. Merge all three test specs files (common, hdfs, mapred)
into one. (steffl)
HDFS-101. DFS write pipeline: DFSClient sometimes does not detect
second datanode failure. (hairong)
HDFS-1015. Intermittent failure in TestSecurityTokenEditLog. (jitendra)
MAPREDUCE-1550. A bugfix on top of what was committed earlier (ddas).
MAPREDUCE-1155. DISABLING THE TestStreamingExitStatus temporarily. (ddas)
HDFS-1020. Changes the check for renewer from short name to long name
in the cancel/renew delegation token methods. (jitendra via ddas)
HDFS-1019. Fixes values of delegation token parameters in
hdfs-default.xml. (jitendra via ddas)
MAPREDUCE-1430. Fixes a backport issue with the earlier patch. (ddas)
MAPREDUCE-1559. Fixes a problem in DelegationTokenRenewal class to
do with using the right credentials when talking to the NameNode. (ddas)
MAPREDUCE-1550. Fixes a problem to do with creating a filesystem using
the user's UGI in the JobHistory browsing. (ddas)
HADOOP-6609. Fix UTF8 to use a thread local DataOutputBuffer instead of
a static that was causing a deadlock in RPC. (omalley)
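The pattern named in HADOOP-6609, replacing one shared static buffer with a buffer per thread, can be sketched as follows. This is an illustration under stated assumptions, not the Hadoop UTF8 class: PerThreadBuffer is a hypothetical name, java.io.ByteArrayOutputStream stands in for DataOutputBuffer, and standard UTF-8 stands in for Hadoop's own encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class PerThreadBuffer {
    // One scratch buffer per thread; a single static buffer would be shared by all threads.
    private static final ThreadLocal<ByteArrayOutputStream> BUFFER =
        new ThreadLocal<ByteArrayOutputStream>() {
            @Override
            protected ByteArrayOutputStream initialValue() {
                return new ByteArrayOutputStream();
            }
        };

    /** Encodes a string using the calling thread's own scratch buffer. */
    public static byte[] encode(String s) {
        ByteArrayOutputStream out = BUFFER.get();
        out.reset(); // reuse the buffer without reallocating
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        return out.toByteArray();
    }

    /** Returns the calling thread's buffer, to show that it is reused on that thread. */
    public static ByteArrayOutputStream currentBuffer() {
        return BUFFER.get();
    }
}
```

Because each thread only ever touches its own buffer, no lock around the buffer is needed, which removes the shared resource that threads could otherwise contend on.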
HADOOP-6584. Fix javadoc warnings introduced by original HADOOP-6584
patch (jhoman)
HDFS-1017. browsedfs jsp should call JspHelper.getUGI rather than using
createRemoteUser(). (jhoman)
MAPREDUCE-899. Modified LinuxTaskController to check that task-controller
has right permissions and ownership before performing any actions.
(Amareshwari Sriramadasu via yhemanth)
HDFS-204. Revive number of files listed metrics. (hairong)
HADOOP-6569. FsShell#cat should avoid calling unnecessary getFileStatus
before opening a file to read. (hairong)
HDFS-1014. Error in reading delegation tokens from edit logs. (jitendra)
HDFS-458. Add under-10-min tests from 0.22 to 0.20.1xx, only the tests
that already exist in 0.20.1xx (steffl)
MAPREDUCE-1155. Just pulls out the TestStreamingExitStatus part of the
patch from jira (that went to 0.22). (ddas)
HADOOP-6600. Fix for branch backport only. Comparing of user should use
equals. (boryas).
HDFS-1006. Fixes NameNode and SecondaryNameNode to use kerberizedSSL for
the http communication. (Jakob Homan via ddas)
HDFS-1007. Fixes a bug on top of the earlier patch. (ddas)
HDFS-1005. Fsck security. Makes it work over kerberized SSL (boryas and
jhoman)
HDFS-1007. Makes HFTP and Distcp use kerberized SSL. (ddas)
MAPREDUCE-1455. Fixes a testcase in the earlier patch.
(Ravi Gummadi via ddas)
HDFS-992. Refactors block access token implementation to conform to the
generic Token interface. (Kan Zhang via ddas)
HADOOP-6584. Adds KrbSSL connector for jetty. (Jakob Homan via ddas)
HADOOP-6589. Add a framework for better error messages when rpc connections
fail to authenticate. (Kan Zhang via omalley)
HADOOP-6600,HDFS-1003,MAPREDUCE-1539. Mechanism for authorization check
for inter-server protocols. (boryas)
HADOOP-6580,HDFS-993,MAPREDUCE-1516. UGI should contain authentication
method.
Namenode and JT should issue a delegation token only for kerberos
authenticated clients. (jitendra)
HDFS-984,HADOOP-6573,MAPREDUCE-1537. Delegation Tokens should be persisted
in Namenode, and corresponding changes in common and mr. (jitendra)
HDFS-994. Provide methods for obtaining delegation token from Namenode for
hftp and other uses. Incorporates HADOOP-6594: Update hdfs script to
provide fetchdt tool. (jitendra)
HADOOP-6586. Log authentication and authorization failures and successes
(boryas)
HDFS-991. Allow use of delegation tokens to authenticate to the
HDFS servlets. (omalley)
HADOOP-1849. Add undocumented configuration parameter for per handler
call queue size in IPC Server. (shv)
HADOOP-6599. Split existing RpcMetrics with summary in RpcMetrics and
details information in RpcDetailedMetrics. (suresh)
HDFS-985. HDFS should issue multiple RPCs for listing a large directory.
(hairong)
HDFS-1000. Updates libhdfs to use the new UGI. (ddas)
MAPREDUCE-1532. Ensures all filesystem operations at the client are done
as the job submitter. Also, changes the renewal to maintain list of tokens
to renew. (ddas)
HADOOP-6596. Add a version field to the serialization of the
AbstractDelegationTokenIdentifier. (omalley)
HADOOP-5561. Add javadoc.maxmemory to build.xml to allow larger memory.
(jkhoman via omalley)
HADOOP-6579. Add a mechanism for encoding and decoding Tokens into
url-safe strings. (omalley)
MAPREDUCE-1354. Make incremental changes in jobtracker for
improving scalability (acmurthy)
HDFS-999. Secondary namenode should login using kerberos if security
is configured. (boryas)
MAPREDUCE-1466. Added a private configuration variable
mapreduce.input.num.files, to store number of input files
being processed by M/R job. (Arun Murthy via yhemanth)
MAPREDUCE-1403. Save file-sizes of each of the artifacts in
DistributedCache in the JobConf (Arun Murthy via yhemanth)
HADOOP-6543. Fixes a compilation problem in the original commit. (ddas)
MAPREDUCE-1520. Moves a call to setWorkingDirectory in Child to within
a doAs block. (Amareshwari Sriramadasu via ddas)
HADOOP-6543. Allows secure clients to talk to unsecure clusters.
(Kan Zhang via ddas)
MAPREDUCE-1505. Delays construction of the job client until it is really
required. (Arun C Murthy via ddas)
HADOOP-6549. TestDoAsEffectiveUser should use ip address of the host
for superuser ip check. (jitendra)
HDFS-464. Fix memory leaks in libhdfs. (Christian Kunz via suresh)
HDFS-946. NameNode should not return full path name when listing a
directory or getting the status of a file. (hairong)
MAPREDUCE-1398. Fix TaskLauncher to stop waiting for slots on a TIP
that is killed / failed. (Amareshwari Sriramadasu via yhemanth)
MAPREDUCE-1476. Fix the M/R framework to not call commit for special
tasks like job setup/cleanup and task cleanup.
(Amareshwari Sriramadasu via yhemanth)
HADOOP-6467. Performance improvement for liststatus on directories in
hadoop archives. (mahadev)
HADOOP-6558. archive does not work with distcp -update. (nicholas via
mahadev)
HADOOP-6583. Captures authentication and authorization metrics. (ddas)
MAPREDUCE-1316. Fixes a memory leak of TaskInProgress instances in
the jobtracker. (Amar Kamat via yhemanth)
MAPREDUCE-670. Creates ant target for 10 mins patch test build.
(Jothi Padmanabhan via gkesavan)
MAPREDUCE-1430. JobTracker should be able to renew delegation tokens
for the jobs. (boryas)
HADOOP-6551, HDFS-986, MAPREDUCE-1503. Change API for tokens to throw
exceptions instead of returning booleans. (omalley)
HADOOP-6545. Changes the Key for the FileSystem to be UGI. (ddas)
HADOOP-6572. Makes sure that SASL encryption and push to responder queue
for the RPC response happens atomically. (Kan Zhang via ddas)
HDFS-965. Split the HDFS TestDelegationToken into two tests, one of which
tests proxy users and the other normal users. (jitendra via omalley)
HADOOP-6560. HarFileSystem throws NPE for har://hdfs-/foo (nicholas via
mahadev)
MAPREDUCE-686. Move TestSpeculativeExecution.Fake* into a separate class
so that it can be used by other tests. (Jothi Padmanabhan via sharad)
MAPREDUCE-181. Fixes an issue in the use of the right config. (ddas)
MAPREDUCE-1026. Fixes a bug in the backport. (ddas)
HADOOP-6559. Makes the RPC client automatically re-login when the SASL
connection setup fails. This is applicable to only keytab based logins.
(ddas)
HADOOP-2141. Backport changes made in the original JIRA to aid
fast unit tests in Map/Reduce. (Amar Kamat via yhemanth)
HADOOP-6382. Import the mavenizable pom file structure and adjust
the build targets and bin scripts. (gkesvan via ltucker)
MAPREDUCE-1425. archive throws OutOfMemoryError (mahadev)
MAPREDUCE-1399. The archive command shows a null error message. (nicholas)
HADOOP-6552. Puts renewTGT=true and useTicketCache=true for the keytab
kerberos options. (ddas)
MAPREDUCE-1433. Adds delegation token for MapReduce (ddas)
HADOOP-4359. Fixes a bug in the earlier backport. (ddas)
HADOOP-6547, HDFS-949, MAPREDUCE-1470. Move Delegation token into Common
so that we can use it for MapReduce also. It is a combined patch for
common, hdfs and mr. (jitendra)
HADOOP-6510,HDFS-935,MAPREDUCE-1464. Support for doAs to allow
authenticated superuser to impersonate proxy users. It is a combined
patch with compatible fixes in HDFS and MR. (jitendra)
MAPREDUCE-1435. Fixes the way symlinks are handled when cleaning up
work directory files. (Ravi Gummadi via yhemanth)
MAPREDUCE-6419. Fixes a bug in the backported patch. (ddas)
MAPREDUCE-1457. Fixes JobTracker to get the FileSystem object within
getStagingAreaDir within a privileged block. Fixes Child.java to use the
appropriate UGIs while getting the TaskUmbilicalProtocol proxy and while
executing the task. Contributed by Jakob Homan. (ddas)
MAPREDUCE-1440. Replace the long user name in MapReduce with the local
name. (ddas)
HADOOP-6419. Adds SASL based authentication to RPC. Also includes the
MAPREDUCE-1335 and HDFS-933 patches. Contributed by Kan Zhang.
(ddas)
HADOOP-6538. Sets hadoop.security.authentication to simple by default.
(ddas)
HDFS-938. Replace calls to UGI.getUserName() with
UGI.getShortUserName(). (boryas)
HADOOP-6544. Fix ivy settings to include JSON jackson.codehaus.org
libs for .20 (boryas)
HDFS-907. Add tests for getBlockLocations and totalLoad metrics. (rphulari)
HADOOP-6204. Implementing aspects development and fault injection
framework for Hadoop (cos)
MAPREDUCE-1432. Adds hooks in the jobtracker and tasktracker
for loading the tokens in the user's ugi. This is required for
the copying of files from the hdfs. (Devaraj Das via boryas)
MAPREDUCE-1383. Automates fetching of delegation tokens in File*Formats
Distributed Cache and Distcp. Also, provides a config
mapreduce.job.hdfs-servers that the jobs can populate with a comma
separated list of namenodes. The job client automatically fetches
delegation tokens from those namenodes.
HADOOP-6337. Update FilterInitializer class to be more visible
and take a conf for further development. (jhoman)
HADOOP-6520. UGI should load tokens from the environment. (jitendra)
HADOOP-6517, HADOOP-6518. Ability to add/get tokens from
UserGroupInformation & Kerberos login in UGI should honor KRB5CCNAME
(jitendra)
HADOOP-6299. Reimplement the UserGroupInformation to use the OS
specific and Kerberos JAAS login. (jhoman, ddas, oom)
HADOOP-6524. Contrib tests are failing Clover'ed build. (cos)
MAPREDUCE-842. Fixing a bug in the earlier version of the patch
related to improper localization of the job token file.
(Ravi Gummadi via yhemanth)
HDFS-919. Create test to validate the BlocksVerified metric (Gary Murry
via cos)
MAPREDUCE-1186. Modified code in distributed cache to set
permissions only on required set of localized paths.
(Amareshwari Sriramadasu via yhemanth)
HDFS-899. Delegation Token Implementation. (Jitendra Nath Pandey)
MAPREDUCE-896. Enhance tasktracker to cleanup files that might have
been created by user tasks with non-writable permissions.
(Ravi Gummadi via yhemanth)
HADOOP-5879. Read compression level and strategy from Configuration for
gzip compression. (He Yongqiang via cdouglas)
HADOOP-6161. Add get/setEnum methods to Configuration. (cdouglas)
HADOOP-6382 Mavenize the build.xml targets and update the bin scripts
in preparation for publishing POM files (giri kesavan via ltucker)
HDFS-737. Add full path name of the file to the block information and
summary of total number of files, blocks, live and deadnodes to
metasave output. (Jitendra Nath Pandey via suresh)
HADOOP-6577. Add hidden configuration option "ipc.server.max.response.size"
to change the default 1 MB, the maximum size when large IPC handler
response buffer is reset. (suresh)
HADOOP-6521. Fix backward compatibility issue with umask when applications
use deprecated param dfs.umask in configuration or use
FsPermission.setUMask(). (suresh)
MAPREDUCE-433. Use more reliable counters in TestReduceFetch.
(Christopher Douglas via ddas)
MAPREDUCE-744. Introduces the notion of a public distributed cache.
(ddas)
MAPREDUCE-1140. Fix DistributedCache to not decrement reference counts
for unreferenced files in error conditions.
(Amareshwari Sriramadasu via yhemanth)
MAPREDUCE-1284. Fix fts_open() call in task-controller that was failing
LinuxTaskController unit tests. (Ravi Gummadi via yhemanth)
MAPREDUCE-1098. Fixed the distributed-cache to not do i/o while
holding a global lock.
(Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1338. Introduces the notion of token cache using which
tokens and secrets can be sent by the Job client to the JobTracker.
(Boris Shkolnik)
HADOOP-6495. Identifier should be serialized after the password is created
in the Token constructor. (Jitendra Nath Pandey)
HADOOP-6506. Failing tests prevent the rest of test targets from
execution. (cos)
HADOOP-5457. Fix to continue to run builds even if contrib test fails.
(gkesavan)
MAPREDUCE-856. Setup secure permissions for distributed cache files.
(Vinod Kumar Vavilapalli via yhemanth)
MAPREDUCE-871. Fix ownership of Job/Task local files to have correct
group ownership according to the egid of the tasktracker.
(Vinod Kumar Vavilapalli via yhemanth)
MAPREDUCE-476. Extend DistributedCache to work locally (LocalJobRunner).
(Philip Zeyliger via tomwhite)
MAPREDUCE-711. Removed Distributed Cache from Common, to move it under
Map/Reduce. (Vinod Kumar Vavilapalli via yhemanth)
MAPREDUCE-478. Allow map and reduce jvm parameters, environment
variables and ulimit to be set separately. (acmurthy)
MAPREDUCE-842. Setup secure permissions for localized job files,
intermediate outputs and log files on tasktrackers.
(Vinod Kumar Vavilapalli via yhemanth)
MAPREDUCE-408. Fixes an assertion problem in TestKillSubProcesses.
(Ravi Gummadi via ddas)
HADOOP-4041. IsolationRunner does not work as documented.
(Philip Zeyliger via tomwhite)
MAPREDUCE-181. Changes the job submission process to be secure.
(Devaraj Das)
HADOOP-5737. Fixes a problem in the way the JobTracker used to talk to
other daemons like the NameNode to get the job's files. Also adds APIs
in the JobTracker to get the FileSystem objects as per the JobTracker's
configuration. (Amar Kamat via ddas)
HADOOP-5771. Implements unit tests for LinuxTaskController.
(Sreekanth Ramakrishnan and Vinod Kumar Vavilapalli via yhemanth)
HADOOP-4656, HDFS-685, MAPREDUCE-1083. Use the user-to-groups mapping
service in the NameNode and JobTracker. Combined patch for these 3 jiras
otherwise tests fail. (Jitendra Nath Pandey)
MAPREDUCE-1250. Refactor job token to use a common token interface.
(Jitendra Nath Pandey)
MAPREDUCE-1026. Shuffle should be secure. (Jitendra Nath Pandey)
HADOOP-4268. Permission checking in fsck. (Jitendra Nath Pandey)
HADOOP-6415. Adding a common token interface for both job token and
delegation token. (Jitendra Nath Pandey)
HADOOP-6367, HDFS-764. Moving Access Token implementation from Common to
HDFS. These two jiras must be committed together otherwise build will
fail. (Jitendra Nath Pandey)
HDFS-409. Add more access token tests
(Jitendra Nath Pandey)
HADOOP-6132. RPC client opens an extra connection for VersionedProtocol.
(Jitendra Nath Pandey)
HDFS-445. pread() fails when cached block locations are no longer valid.
(Jitendra Nath Pandey)
HDFS-195. Need to handle access token expiration when re-establishing the
pipeline for dfs write. (Jitendra Nath Pandey)
HADOOP-6176. Adding a couple private methods to AccessTokenHandler
for testing purposes. (Jitendra Nath Pandey)
HADOOP-5824. remove OP_READ_METADATA functionality from Datanode.
(Jitendra Nath Pandey)
HADOOP-4359. Access Token: Support for data access authorization
checking on DataNodes. (Jitendra Nath Pandey)
MAPREDUCE-1372. Fixed a ConcurrentModificationException in jobtracker.
(Arun C Murthy via yhemanth)
MAPREDUCE-1316. Fix jobs' retirement from the JobTracker to prevent memory
leaks via stale references. (Amar Kamat via acmurthy)
MAPREDUCE-1342. Fixed deadlock in global blacklisting of tasktrackers.
(Amareshwari Sriramadasu via acmurthy)
HADOOP-6460. Reinitializes buffers used for serializing responses in ipc
server on exceeding maximum response size to free up Java heap. (suresh)
MAPREDUCE-1100. Truncate user logs to prevent TaskTrackers' disks from
filling up. (Vinod Kumar Vavilapalli via acmurthy)
MAPREDUCE-1143. Fix running task counters to be updated correctly
when speculative attempts are running for a TIP.
(Rahul Kumar Singh via yhemanth)
HADOOP-6151, 6281, 6285, 6441. Add HTML quoting of the parameters to all
of the servlets to prevent XSS attacks. (omalley)
MAPREDUCE-896. Fix bug in earlier implementation to prevent
spurious logging in tasktracker logs for absent file paths.
(Ravi Gummadi via yhemanth)
MAPREDUCE-676. Fix Hadoop Vaidya to ensure it works for map-only jobs.
(Suhas Gogate via acmurthy)
HADOOP-5582. Fix Hadoop Vaidya to use new Counters in
org.apache.hadoop.mapreduce package. (Suhas Gogate via acmurthy)
HDFS-595. umask settings in configuration may now use octal or
symbolic instead of decimal. Update HDFS tests as such. (jghoman)
MAPREDUCE-1068. Added a verbose error message when user specifies an
incorrect -file parameter. (Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1171. Allow the read-error notification in shuffle to be
configurable. (Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-353. Allow shuffle read and connection timeouts to be
configurable. (Amareshwari Sriramadasu via acmurthy)
HDFS-781. Namenode metrics PendingDeletionBlocks is not decremented.
(suresh)
MAPREDUCE-1185. Redirect running job url to history url if job is already
retired. (Amareshwari Sriramadasu and Sharad Agarwal via sharad)
MAPREDUCE-754. Fix NPE in expiry thread when a TT is lost. (Amar Kamat
via sharad)
MAPREDUCE-896. Modify permissions for local files on tasktracker before
deletion so they can be deleted cleanly. (Ravi Gummadi via yhemanth)
MAPREDUCE-1124. Import Gridmix3 and Rumen. (cdouglas)
MAPREDUCE-1063. Document gridmix benchmark. (cdouglas)
HDFS-758. Changes to report status of decommissioning on the namenode web
UI. (jitendra)
HADOOP-6234. Add new option dfs.umaskmode to set umask in configuration
to use octal or symbolic instead of decimal. (Jakob Homan via suresh)
MAPREDUCE-1147. Add map output counters to new API. (Amar Kamat via
cdouglas)
MAPREDUCE-1182. Fix overflow in reduce causing allocations to exceed the
configured threshold. (cdouglas)
HADOOP-4933. Fixes a ConcurrentModificationException problem that shows up
when the history viewer is accessed concurrently.
(Amar Kamat via ddas)
MAPREDUCE-1140. Fix DistributedCache to not decrement reference counts for
unreferenced files in error conditions.
(Amareshwari Sriramadasu via yhemanth)
HADOOP-6203. FsShell rm/rmr error message indicates exceeding Trash quota
and suggests using -skipTrash, when moving to trash fails.
(Boris Shkolnik via suresh)
HADOOP-5675. Do not launch a job if DistCp has no work to do. (Tsz Wo
(Nicholas), SZE via cdouglas)
HDFS-457. Better handling of volume failure in Data Node storage,
This fix is a port from hdfs-0.22 to common-0.20 by Boris Shkolnik.
Contributed by Erik Steffl
HDFS-625. Fix NullPointerException thrown from ListPathServlet.
Contributed by Suresh Srinivas.
HADOOP-6343. Log unexpected throwable object caught in RPC.
Contributed by Jitendra Nath Pandey
MAPREDUCE-1186. Fixed DistributedCache to do a recursive chmod on just the
per-cache directory, not all of mapred.local.dir.
(Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1231. Add an option to distcp to ignore checksums when used with
the upgrade option.
(Jothi Padmanabhan via yhemanth)
MAPREDUCE-1219. Fixed JobTracker to not collect per-job metrics, thus
easing load on it. (Amareshwari Sriramadasu via acmurthy)
HDFS-761. Fix failure to process rename operation from edits log due to
quota verification. (suresh)
MAPREDUCE-1196. Fix FileOutputCommitter to use the deprecated cleanupJob
api correctly. (acmurthy)
HADOOP-6344. rm and rmr immediately delete files rather than sending
to trash, despite trash being enabled, if a user is over-quota. (jhoman)
MAPREDUCE-1160. Reduce verbosity of log lines in some Map/Reduce classes
to avoid filling up jobtracker logs on a busy cluster.
(Ravi Gummadi and Hong Tang via yhemanth)
HDFS-587. Add ability to run HDFS with MR test on non-default queue,
also updated junit dependency from junit-3.8.1 to junit-4.5 (to make
it possible to use Configured and Tool to process command line to
be able to specify a queue). Contributed by Erik Steffl.
MAPREDUCE-1158. Fix JT running maps and running reduces metrics.
(sharad)
MAPREDUCE-947. Fix bug in earlier implementation that was
causing unit tests to fail.
(Ravi Gummadi via yhemanth)
MAPREDUCE-1062. Fix MRReliabilityTest to work with retired jobs
(Contributed by Sreekanth Ramakrishnan)
MAPREDUCE-1090. Modified log statement in TaskMemoryManagerThread to
include task attempt id. (yhemanth)
MAPREDUCE-1098. Fixed the distributed-cache to not do i/o while
holding a global lock. (Amareshwari Sriramadasu via acmurthy)
MAPREDUCE-1048. Add occupied/reserved slot usage summary on
jobtracker UI. (Amareshwari Sriramadasu via sharad)
MAPREDUCE-1103. Added more metrics to Jobtracker. (sharad)
MAPREDUCE-947. Added commitJob and abortJob apis to OutputCommitter.
Enhanced FileOutputCommitter to create a _SUCCESS file for successful
jobs. (Amar Kamat & Jothi Padmanabhan via acmurthy)
MAPREDUCE-1105. Remove max limit configuration in capacity scheduler in
favor of max capacity percentage thus allowing the limit to go over
queue capacity. (Rahul Kumar Singh via yhemanth)
MAPREDUCE-1086. Setup Hadoop logging environment for tasks to point to
task related parameters. (Ravi Gummadi via yhemanth)
MAPREDUCE-739. Allow relative paths to be created inside archives.
(mahadev)
HADOOP-6097. Multiple bugs w/ Hadoop archives (mahadev)
HADOOP-6231. Allow caching of filesystem instances to be disabled on a
per-instance basis (ben slusky via mahadev)
MAPREDUCE-826. harchive doesn't use ToolRunner / harchive returns 0 even
if the job fails with exception (koji via mahadev)
HDFS-686. NullPointerException is thrown while merging edit log and
image. (hairong)
HDFS-709. Fix TestDFSShell failure due to rename bug introduced by
HDFS-677. (suresh)
HDFS-677. Rename failure when both source and destination quota exceeds
results in deletion of source. (suresh)
HADOOP-6284. Add a new parameter, HADOOP_JAVA_PLATFORM_OPTS, to
hadoop-config.sh so that it allows setting java command options for
JAVA_PLATFORM. (Koji Noguchi via szetszwo)
MAPREDUCE-732. Removed spurious log statements in the node
blacklisting logic. (Sreekanth Ramakrishnan via yhemanth)
MAPREDUCE-144. Includes dump of the process tree in task diagnostics when
a task is killed due to exceeding memory limits.
(Vinod Kumar Vavilapalli via yhemanth)
MAPREDUCE-979. Fixed JobConf APIs related to memory parameters to
return values of new configuration variables when deprecated
variables are disabled. (Sreekanth Ramakrishnan via yhemanth)
MAPREDUCE-277. Makes job history counters available on the job history
viewers. (Jothi Padmanabhan via ddas)
HADOOP-5625. Add operation duration to clienttrace. (Lei Xu
via cdouglas)
HADOOP-5222. Add offset to datanode clienttrace. (Lei Xu via cdouglas)
HADOOP-6218. Adds a feature where TFile can be split by Record
Sequence number. Contributed by Hong Tang and Raghu Angadi.
MAPREDUCE-1088. Changed permissions on JobHistory files on local disk to
0744. Contributed by Arun C. Murthy.
HADOOP-6304. Use java.io.File.set{Readable|Writable|Executable} where
possible in RawLocalFileSystem. Contributed by Arun C. Murthy.
MAPREDUCE-270. Fix the tasktracker to optionally send an out-of-band
heartbeat on task-completion for better job-latency. Contributed by
Arun C. Murthy
Configuration changes:
add mapreduce.tasktracker.outofband.heartbeat
MAPREDUCE-1030. Fix capacity-scheduler to assign a map and a reduce task
per-heartbeat. Contributed by Rahul K Singh.
MAPREDUCE-1028. Fixed number of slots occupied by cleanup tasks to one
irrespective of slot size for the job. Contributed by Ravi Gummadi.
MAPREDUCE-964. Fixed start and finish times of TaskStatus to be
consistent, thereby fixing inconsistencies in metering tasks.
Contributed by Sreekanth Ramakrishnan.
HADOOP-5976. Add a new command, classpath, to the hadoop
script. Contributed by Owen O'Malley and Gary Murry
HADOOP-5784. Makes the number of heartbeats that should arrive
a second at the JobTracker configurable. Contributed by
Amareshwari Sriramadasu.
MAPREDUCE-945. Modifies MRBench and TestMapRed to use
ToolRunner so that options such as queue name can be
passed via command line. Contributed by Sreekanth Ramakrishnan.
HADOOP-5420 Correct bug in earlier implementation
by Arun C. Murthy
HADOOP-5363 Add support for proxying connections to multiple
clusters with different versions to hdfsproxy. Contributed
by Zhiyong Zhang
HADOOP-5780. Improve per block message printed by -metaSave
in HDFS. (Raghu Angadi)
HADOOP-6227. Fix Configuration to allow final parameters to be set
to null and prevent them from being overridden. Contributed by
Amareshwari Sriramadasu.
MAPREDUCE-430 Added patch supplied by Amar Kamat to allow roll forward
on branch to include externally committed patch.
MAPREDUCE-768. Provide an option to dump jobtracker configuration in
JSON format to standard output. Contributed by V.V.Chaitanya
MAPREDUCE-834 Correct an issue created by merging this issue with
patch attached to external Jira.
HADOOP-6184 Provide an API to dump Configuration in a JSON format.
Contributed by V.V.Chaitanya Krishna.
MAPREDUCE-745 Patch added for this issue to allow branch-0.20 to
merge cleanly.
MAPREDUCE-478 Allow map and reduce jvm parameters, environment
variables and ulimit to be set separately.
MAPREDUCE-682 Removes reservations on tasktrackers which are blacklisted.
Contributed by Sreekanth Ramakrishnan.
HADOOP-5420 Support killing of process groups in LinuxTaskController
binary
HADOOP-5488 Removes the pidfile management for the Task JVM from the
framework and instead passes the PID back and forth between the
TaskTracker and the Task processes. Contributed by Ravi Gummadi.
MAPREDUCE-467 Provide ability to collect statistics about total tasks and
succeeded tasks in different time windows.
MAPREDUCE-817. Add a cache for retired jobs with minimal job
info and provide a way to access history file url
MAPREDUCE-814. Provide a way to configure completed job history
files to be on HDFS.
MAPREDUCE-838 Fixes a problem in the way commit of task outputs
happens. The bug was that even if commit failed, the task would be
declared as successful. Contributed by Amareshwari Sriramadasu.
MAPREDUCE-809 Fix job-summary logs to correctly record final status of
FAILED and KILLED jobs.
MAPREDUCE-740 Log a job-summary at the end of a job, while
allowing it to be configured to use a custom appender if desired.
MAPREDUCE-771 Fixes a bug which delays normal jobs in favor of
high-ram jobs.
HADOOP-5420 Support setsid based kill in LinuxTaskController.
MAPREDUCE-733 Fixes a bug that when a task tracker is killed,
it throws exception. Instead it should catch it and process it and
allow the rest of the flow to go through
MAPREDUCE-734 Fixes a bug which prevented hi ram jobs from being
removed from the scheduler queue.
MAPREDUCE-693 Fixes a bug that when a job is submitted and the
JT is restarted (before job files have been written) and the job
is killed after recovery, the conf files fail to be moved to the
"done" subdirectory.
MAPREDUCE-722 Fixes a bug where more slots are getting reserved
for HiRAM job tasks than required.
MAPREDUCE-683 TestJobTrackerRestart failed because of stale
filemanager cache (which was created once per jvm). This patch makes
sure that the filemanager is inited upon every JobHistory.init()
and hence upon every restart. Note that this won't happen in production
as upon a restart the new jobtracker will start in a new jvm and
hence a new cache will be created.
MAPREDUCE-709 Fixes a bug where node health check script does
not display the correct message on timeout.
MAPREDUCE-708 Fixes a bug where node health check script does
not refresh the "reason for blacklisting".
MAPREDUCE-522 Rewrote TestQueueCapacities to make it simpler
and avoid timeout errors.
MAPREDUCE-532 Provided ability in the capacity scheduler to
limit the number of slots that can be concurrently used per queue
at any given time.
MAPREDUCE-211 Provides ability to run a health check script on
the tasktracker nodes and blacklist nodes if they are unhealthy.
Contributed by Sreekanth Ramakrishnan.
MAPREDUCE-516 Remove .orig file included by mistake.
MAPREDUCE-416 Moves the history file to a "done" folder whenever
a job completes.
HADOOP-5980 Previously, tasks spawned off by LinuxTaskController
didn't get LD_LIBRARY_PATH in their environment. The tasks will now
get same LD_LIBRARY_PATH value as when spawned off by
DefaultTaskController.
HADOOP-5981 This issue completes the feature mentioned in
HADOOP-2838. HADOOP-2838 provided a way to set env variables in
child process. This issue provides a way to inherit tt's env variables
and append or reset it. So now X=$X:y will inherit X (if there) and
append y to it.
HADOOP-5419 This issue is to provide an improvement on the
existing M/R framework to let users know which queues they have
access to, and for what operations. One use case for this would be
that currently there is no easy way to know if the user has access
to submit jobs to a queue, until it fails with an access control
exception.
HADOOP-5643 Added the functionality to refresh the jobtracker's node
list via command line (bin/hadoop mradmin -refreshNodes). The command
should be run as the jobtracker owner (jobtracker process owner)
or from a super group (mapred.permissions.supergroup).
HADOOP-2838 Now the users can set environment variables using
mapred.child.env. They can do the following:
X=Y : Set X to Y
X=$X:Y : Append Y to X (which should be taken from the tasktracker)
HADOOP-5818. Revert the renaming from FSNamesystem.checkSuperuserPrivilege
to checkAccess by HADOOP-5643. (Amar Kamat via szetszwo)
HADOOP-5801. Fixes the problem: If the hosts file is changed across restart
then it should be refreshed upon recovery so that the excluded hosts are
lost and the maps are re-executed. (Amar Kamat via ddas)
HADOOP-5643. Adds a way to decommission TaskTrackers
while the JobTracker is running. (Amar Kamat via ddas)
HADOOP-5419. Provide a facility to query the Queue ACLs for the
current user. (Rahul Kumar Singh via yhemanth)
HADOOP-5733. Add map/reduce slot capacity and blacklisted capacity to
JobTracker metrics. (Sreekanth Ramakrishnan via cdouglas)
HADOOP-5738. Split "waiting_tasks" JobTracker metric into waiting maps and
waiting reduces. (Sreekanth Ramakrishnan via cdouglas)
HADOOP-4842. Streaming now allows specifying a command for the combiner.
(Amareshwari Sriramadasu via ddas)
HADOOP-4490. Provide ability to run tasks as job owners.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5442. Paginate jobhistory display and added some search
capabilities. (Amar Kamat via acmurthy)
HADOOP-3327. Improves handling of READ_TIMEOUT during map output copying.
(Amareshwari Sriramadasu via ddas)
HADOOP-5113. Fixed logcondense to remove files for usernames
beginning with characters specified in the -l option.
(Peeyush Bishnoi via yhemanth)
HADOOP-2898. Provide an option to specify a port range for
Hadoop services provisioned by HOD.
(Peeyush Bishnoi via yhemanth)
HADOOP-4930. Implement a Linux native executable that can be used to
launch tasks as users. (Sreekanth Ramakrishnan via yhemanth)
Release 0.20.3 - Unreleased
IMPROVEMENTS
BUG FIXES
HDFS-955. New implementation of saveNamespace() to avoid loss of edits
when name-node fails during saving. (shv)
Release 0.20.2 - 2010-2-16
BUG FIXES
MAPREDUCE-112. Add counters for reduce input, output records to the new API.
(Jothi Padmanabhan via cdouglas)
HADOOP-6498. IPC client bug may cause rpc call hang. (Ruyue Ma and hairong
via hairong)
HDFS-927. DFSInputStream retries too many times for new block locations
(Todd Lipcon via Stack)
HDFS-793. DataNode should first receive the whole packet ack message
before it constructs and sends its own ack message for the packet.
(hairong)
HDFS-723. Fix deadlock in DFSClient#DFSOutputStream. (hairong)
HDFS-732. DFSClient.DFSOutputStream.close() should throw an exception if
the stream cannot be closed successfully. (szetszwo)
IMPROVEMENTS
HDFS-187. Initialize secondary namenode http address in TestStartup.
(Todd Lipcon via szetszwo)
HDFS-185. Disallow chown, chgrp, chmod, setQuota, and setSpaceQuota when
name-node is in safemode. (Ravi Phulari via shv)
HADOOP-5611. Fix C++ libraries to build on Debian Lenny. (Todd Lipcon
via tomwhite)
HADOOP-5612. Some c++ scripts are not chmodded before ant execution.
(Todd Lipcon via tomwhite)
HDFS-579. Fix DfsTask to follow the semantics of 0.19, regarding non-zero
return values as failures. (Christian Kunz via cdouglas)
HDFS-596. Fix memory leak in hdfsFreeFileInfo() for libhdfs.
(Zhang Bingjun via dhruba)
MAPREDUCE-1070. Prevent a deadlock in the fair scheduler servlet.
(Todd Lipcon via cdouglas)
HADOOP-5623. Ensure streaming status messages aren't overwritten. (Rick
Cox & Ravi Gummadi via tomwhite)
MAPREDUCE-1163. Remove unused, hard-coded paths from libhdfs. (Allen
Wittenauer via cdouglas)
HADOOP-6315. Avoid incorrect use of BuiltInflater/BuiltInDeflater in
GzipCodec. (Aaron Kimball via cdouglas)
HADOOP-6269. Fix threading issue with defaultResource in Configuration.
(Sreekanth Ramakrishnan via cdouglas)
HADOOP-5759. Fix for IllegalArgumentException when CombineFileInputFormat
is used as job InputFormat. (Amareshwari Sriramadasu via zshao)
Release 0.20.1 - 2009-09-01
INCOMPATIBLE CHANGES
HADOOP-5726. Remove pre-emption from capacity scheduler code base.
(Rahul Kumar Singh via yhemanth)
HADOOP-5881. Simplify memory monitoring and scheduling related
configuration. (Vinod Kumar Vavilapalli via yhemanth)
NEW FEATURES
HADOOP-6080. Introduce -skipTrash option to rm and rmr.
(Jakob Homan via shv)
HADOOP-3315. Add a new, binary file format, TFile. (Hong Tang via cdouglas)
IMPROVEMENTS
HADOOP-5711. Change Namenode file close log to info. (szetszwo)
HADOOP-5736. Update the capacity scheduler documentation for features
like memory based scheduling, job initialization and removal of pre-emption.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-4674. Fix fs help messages for -test, -text, -tail, -stat
and -touchz options. (Ravi Phulari via szetszwo)
HADOOP-4372. Improves the way history filenames are obtained and manipulated.
(Amar Kamat via ddas)
HADOOP-5897. Add name-node metrics to capture java heap usage.
(Suresh Srinivas via shv)
HDFS-438. Improve help message for space quota command. (Raghu Angadi)
MAPREDUCE-767. Remove the dependence on the CLI 2.0 snapshot.
(Amar Kamat via ddas)
OPTIMIZATIONS
BUG FIXES
HADOOP-5691. Makes org.apache.hadoop.mapreduce.Reducer concrete class
instead of abstract. (Amareshwari Sriramadasu via sharad)
HADOOP-5646. Fixes a problem in TestQueueCapacities.
(Vinod Kumar Vavilapalli via ddas)
HADOOP-5655. TestMRServerPorts fails on java.net.BindException. (Devaraj
Das via hairong)
HADOOP-5654. TestReplicationPolicy.<init> fails on java.net.BindException.
(hairong)
HADOOP-5688. Fix HftpFileSystem checksum path construction. (Tsz Wo
(Nicholas) Sze via cdouglas)
HADOOP-5213. Fix NullPointerException caused when bzip2 compression
was used and the user closed an output stream without writing any data.
(Zheng Shao via dhruba)
HADOOP-5718. Remove the check for the default queue in capacity scheduler.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5719. Remove jobs that failed initialization from the waiting queue
in the capacity scheduler. (Sreekanth Ramakrishnan via yhemanth)
HADOOP-4744. Attaching another fix to the jetty port issue. The TaskTracker
kills itself if it ever discovers that the port to which jetty is actually
bound is invalid (-1). (ddas)
HADOOP-5349. Fixes a problem in LocalDirAllocator to check for the return
path value that is returned for the case where the file we want to write
is of an unknown size. (Vinod Kumar Vavilapalli via ddas)
HADOOP-5636. Prevents a job from going to RUNNING state after it has been
KILLED (this used to happen when the SetupTask would come back with a
success after the job has been killed). (Amar Kamat via ddas)
HADOOP-5641. Fix a NullPointerException in capacity scheduler's memory
based scheduling code when jobs get retired. (yhemanth)
HADOOP-5828. Use absolute path for mapred.local.dir of JobTracker in
MiniMRCluster. (yhemanth)
HADOOP-4981. Fix capacity scheduler to schedule speculative tasks
correctly in the presence of High RAM jobs.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5210. Solves a problem in the progress report of the reduce task.
(Ravi Gummadi via ddas)
HADOOP-5850. Fixes a problem to do with not being able to run jobs with
0 maps/reduces. (Vinod K V via ddas)
HADOOP-5728. Fixed FSEditLog.printStatistics IndexOutOfBoundsException.
(Wang Xu via johan)
HADOOP-4626. Correct the API links in hdfs forrest doc so that they
point to the same version of hadoop. (szetszwo)
HADOOP-5883. Fixed tasktracker memory monitoring to account for
momentary spurts in memory usage due to java's fork() model.
(yhemanth)
HADOOP-5539. Fixes a problem to do with not preserving intermediate
output compression for merged data.
(Jothi Padmanabhan and Billy Pearson via ddas)
HADOOP-5932. Fixes a problem in capacity scheduler in computing
available memory on a tasktracker.
(Vinod Kumar Vavilapalli via yhemanth)
HADOOP-5648. Fixes a build issue in not being able to generate gridmix.jar
in hadoop binary tarball. (Giridharan Kesavan via gkesavan)
HADOOP-5908. Fixes a problem to do with ArithmeticException in the
JobTracker when there are jobs with 0 maps. (Amar Kamat via ddas)
HADOOP-5924. Fixes a corner case problem to do with job recovery with
empty history files. Also, after a JT restart, sends KillTaskAction to
tasks that report back but the corresponding job hasn't been initialized
yet. (Amar Kamat via ddas)
HADOOP-5882. Fixes a reducer progress update problem for new mapreduce
api. (Amareshwari Sriramadasu via sharad)
HADOOP-5746. Fixes a corner case problem in Streaming, where if an
exception happens in MROutputThread after the last call to the map/reduce
method, the exception goes undetected. (Amar Kamat via ddas)
HADOOP-5884. Fixes accounting in capacity scheduler so that high RAM jobs
take more slots. (Vinod Kumar Vavilapalli via yhemanth)
HADOOP-5937. Correct a safemode message in FSNamesystem. (Ravi Phulari
via szetszwo)
HADOOP-5869. Fix bug in assignment of setup / cleanup task that was
causing TestQueueCapacities to fail.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5921. Fixes a problem in the JobTracker where it sometimes failed
to come up due to a system file creation on JobTracker's system-dir
failing. This problem would sometimes show up only when the FS for the
system-dir (usually HDFS) is started at nearly the same time as the
JobTracker. (Amar Kamat via ddas)
HADOOP-5920. Fixes a testcase failure for TestJobHistory.
(Amar Kamat via ddas)
HDFS-26. Better error message to users when commands fail because of
lack of quota. Allow quota to be set even if the limit is lower than
current consumption. (Boris Shkolnik via rangadi)
MAPREDUCE-2. Fixes a bug in KeyFieldBasedPartitioner in handling empty
keys. (Amar Kamat via sharad)
MAPREDUCE-130. Delete the jobconf copy from the log directory of the
JobTracker when the job is retired. (Amar Kamat via sharad)
MAPREDUCE-657. Fix hardcoded filesystem problem in CompletedJobStatusStore.
(Amar Kamat via sharad)
MAPREDUCE-179. Update progress in new RecordReaders. (cdouglas)
MAPREDUCE-124. Fix a bug in failure handling of abort task of
OutputCommitter. (Amareshwari Sriramadasu via sharad)
HADOOP-6139. Fix the FsShell help messages for rm and rmr. (Jakob Homan
via szetszwo)
HADOOP-6141. Fix a few bugs in 0.20 test-patch.sh. (Hong Tang via
szetszwo)
HADOOP-6145. Fix FsShell rm/rmr error messages when there is a FNFE.
(Jakob Homan via szetszwo)
MAPREDUCE-565. Fix partitioner to work with new API. (Owen O'Malley via
cdouglas)
MAPREDUCE-465. Fix a bug in MultithreadedMapRunner. (Amareshwari
Sriramadasu via sharad)
MAPREDUCE-18. Puts some checks to detect cases where jetty serves up
incorrect output during shuffle. (Ravi Gummadi via ddas)
MAPREDUCE-735. Fixes a problem in the KeyFieldHelper to do with
the end index for some inputs (Amar Kamat via ddas)
HADOOP-6150. Users should be able to instantiate comparator using TFile
API. (Hong Tang via rangadi)
MAPREDUCE-383. Fix a bug in Pipes combiner due to bytes count not
getting reset after the spill. (Christian Kunz via sharad)
MAPREDUCE-40. Keep memory management backwards compatible for job
configuration parameters and limits. (Rahul Kumar Singh via yhemanth)
MAPREDUCE-796. Fixes a ClassCastException in an exception log in
MultithreadedMapRunner. (Amar Kamat via ddas)
MAPREDUCE-838. Fixes a problem in the way commit of task outputs
happens. The bug was that even if commit failed, the task would
be declared as successful. (Amareshwari Sriramadasu via ddas)
MAPREDUCE-805. Fixes some deadlocks in the JobTracker due to the fact
the JobTracker lock hierarchy wasn't maintained in some JobInProgress
method calls. (Amar Kamat via ddas)
HDFS-167. Fix a bug in DFSClient that caused infinite retries on write.
(Bill Zeller via szetszwo)
HDFS-527. Remove unnecessary DFSClient constructors. (szetszwo)
MAPREDUCE-832. Reduce number of warning messages printed when
deprecated memory variables are used. (Rahul Kumar Singh via yhemanth)
MAPREDUCE-745. Fixes a testcase problem to do with generation of JobTracker
IDs. (Amar Kamat via ddas)
MAPREDUCE-834. Enables memory management on tasktrackers when old
memory management parameters are used in configuration.
(Sreekanth Ramakrishnan via yhemanth)
MAPREDUCE-818. Fixes Counters#getGroup API. (Amareshwari Sriramadasu
via sharad)
MAPREDUCE-807. Handles the AccessControlException during the deletion of
mapred.system.dir in the JobTracker. The JobTracker will bail out if it
encounters such an exception. (Amar Kamat via ddas)
HADOOP-6213. Remove commons dependency on commons-cli2. (Amar Kamat via
sharad)
MAPREDUCE-430. Fix a bug related to task getting stuck in case of
OOM error. (Amar Kamat via ddas)
HADOOP-6215. Fix GenericOptionsParser to deal with -D with '=' in the
value. (Amar Kamat via sharad)
MAPREDUCE-421. Fix Pipes to use returned system exit code.
(Christian Kunz via omalley)
HDFS-525. The SimpleDateFormat object in ListPathsServlet is not thread
safe. (Suresh Srinivas and cdouglas)
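A minimal sketch of one common way to avoid sharing a SimpleDateFormat
between threads, as the HDFS-525 entry describes: give each thread its own
instance through a ThreadLocal. The class name, the date pattern, and the UTC
time zone are assumptions chosen for illustration; this is not necessarily
the change that was committed for that issue.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Hadoop.
final class ThreadSafeDateFormatSketch {
    // SimpleDateFormat keeps mutable state, so every thread gets a private copy.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> {
            // Pattern and time zone are illustrative assumptions.
            SimpleDateFormat df =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ", Locale.ROOT);
            df.setTimeZone(TimeZone.getTimeZone("UTC"));
            return df;
        });

    // Formats a date using the calling thread's own formatter.
    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L)));
    }
}
```

The alternative of synchronizing on one shared formatter also works, but it
serializes all callers on a single lock.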
MAPREDUCE-911. Fix a bug in TestTaskFail related to speculative
execution. (Amareshwari Sriramadasu via sharad)
MAPREDUCE-687. Fix an assertion in TestMiniMRMapRedDebugScript.
(Amareshwari Sriramadasu via sharad)
MAPREDUCE-924. Fixes the TestPipes testcase to use Tool.
(Amareshwari Sriramadasu via sharad)
Release 0.20.0 - 2009-04-15
INCOMPATIBLE CHANGES
HADOOP-4210. Fix findbugs warnings for equals implementations of mapred ID
classes. Removed public, static ID::read and ID::forName; made ID an
abstract class. (Suresh Srinivas via cdouglas)
HADOOP-4253. Fix various warnings generated by findbugs.
Following deprecated methods in RawLocalFileSystem are removed:
public String getName()
public void lock(Path p, boolean shared)
public void release(Path p)
(Suresh Srinivas via johan)
HADOOP-4618. Move http server from FSNamesystem into NameNode.
FSNamesystem.getNameNodeInfoPort() is removed.
FSNamesystem.getDFSNameNodeMachine() and FSNamesystem.getDFSNameNodePort()
replaced by FSNamesystem.getDFSNameNodeAddress().
NameNode(bindAddress, conf) is removed.
(shv)
HADOOP-4567. GetFileBlockLocations returns the NetworkTopology
information of the machines where the blocks reside. (dhruba)
HADOOP-4435. The JobTracker WebUI displays the amount of heap memory
in use. (dhruba)
HADOOP-4628. Move Hive into a standalone subproject. (omalley)
HADOOP-4188. Removes task's dependency on concrete filesystems.
(Sharad Agarwal via ddas)
HADOOP-1650. Upgrade to Jetty 6. (cdouglas)
HADOOP-3986. Remove static Configuration from JobClient. (Amareshwari
Sriramadasu via cdouglas)
JobClient::setCommandLineConfig is removed
JobClient::getCommandLineConfig is removed
JobShell, TestJobShell classes are removed
HADOOP-4422. S3 file systems should not create bucket.
(David Phillips via tomwhite)
HADOOP-4035. Support memory based scheduling in capacity scheduler.
(Vinod Kumar Vavilapalli via yhemanth)
HADOOP-3497. Fix bug in overly restrictive file globbing with a
PathFilter. (tomwhite)
HADOOP-4445. Replace running task counts with running task
percentage in capacity scheduler UI. (Sreekanth Ramakrishnan via
yhemanth)
HADOOP-4631. Splits the configuration into three parts - one for core,
one for mapred and the last one for HDFS. (Sharad Agarwal via cdouglas)
HADOOP-3344. Fix libhdfs build to use autoconf and build the same
architecture (32 vs 64 bit) of the JVM running Ant. The libraries for
pipes, utils, and libhdfs are now all in c++/<os_osarch_jvmdatamodel>/lib.
(Giridharan Kesavan via nigel)
HADOOP-4874. Remove LZO codec because of licensing issues. (omalley)
HADOOP-4970. The full path name of a file is preserved inside Trash.
(Prasad Chakka via dhruba)
HADOOP-4103. NameNode keeps a count of missing blocks. It warns on
WebUI if there are such blocks. '-report' and '-metaSave' have extra
info to track such blocks. (Raghu Angadi)
HADOOP-4783. Change permissions on history files on the jobtracker
to be only group readable instead of world readable.
(Amareshwari Sriramadasu via yhemanth)
HADOOP-5531. Removed Chukwa from Hadoop 0.20.0. (nigel)
NEW FEATURES
HADOOP-4575. Add a proxy service for relaying HsftpFileSystem requests.
Includes client authentication via user certificates and config-based
access control. (Kan Zhang via cdouglas)
HADOOP-4661. Add DistCh, a new tool for distributed ch{mod,own,grp}.
(szetszwo)
HADOOP-4709. Add several new features and bug fixes to Chukwa.
Added Hadoop Infrastructure Care Center (UI for visualizing data collected
by Chukwa)
Added FileAdaptor for streaming small file in one chunk
Added compression to archive and demux output
Added unit tests and validation for agent, collector, and demux map
reduce job
Added database loader for loading demux output (sequence file) to jdbc
connected database
Added algorithm to distribute collector load more evenly
(Jerome Boulon, Eric Yang, Andy Konwinski, Ariel Rabkin via cdouglas)
HADOOP-4179. Add Vaidya tool to analyze map/reduce job logs for performance
problems. (Suhas Gogate via omalley)
HADOOP-4029. Add NameNode storage information to the dfshealth page and
move DataNode information to a separated page. (Boris Shkolnik via
szetszwo)
HADOOP-4348. Add service-level authorization for Hadoop. (acmurthy)
HADOOP-4826. Introduce admin command saveNamespace. (shv)
HADOOP-3063. BloomMapFile - fail-fast version of MapFile for sparsely
populated key space (Andrzej Bialecki via stack)
HADOOP-1230. Add new map/reduce API and deprecate the old one. Generally,
the old code should work without problem. The new api is in
org.apache.hadoop.mapreduce and the old classes in org.apache.hadoop.mapred
are deprecated. Differences in the new API:
1. All of the methods take Context objects that allow us to add new
methods without breaking compatibility.
2. Mapper and Reducer now have a "run" method that is called once and
contains the control loop for the task, which lets applications
replace it.
3. Mapper and Reducer by default are Identity Mapper and Reducer.
4. The FileOutputFormats use part-r-00000 for the output of reduce 0 and
part-m-00000 for the output of map 0.
5. The reduce grouping comparator now uses the raw compare instead of
object compare.
6. The number of maps in FileInputFormat is controlled by min and max
split size rather than min size and the desired number of maps.
(omalley)
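Points 4 and 6 of the HADOOP-1230 entry can be sketched in a few lines. The
class and method names below are hypothetical and the code does not call any
Hadoop API. The split-size rule shown (clamp the block size between the
configured minimum and maximum) is an assumption about how the min and max
split sizes interact, offered only as an illustration.

```java
import java.util.Locale;

// Hypothetical illustration, not part of Hadoop.
final class NewApiSketch {
    // Output file name for a task: "m" for a map task, "r" for a reduce task,
    // followed by the zero-padded partition number.
    static String partName(boolean isMap, int partition) {
        return String.format(Locale.ROOT, "part-%s-%05d", isMap ? "m" : "r", partition);
    }

    // Assumed rule: the split size is the block size clamped to [minSize, maxSize].
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        System.out.println(partName(false, 0));
        System.out.println(partName(true, 0));
        System.out.println(computeSplitSize(64L << 20, 1L, Long.MAX_VALUE));
    }
}
```

Under this rule, lowering the maximum below the block size yields more,
smaller splits, while raising the minimum above it yields fewer, larger ones.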
HADOOP-3305. Use Ivy to manage dependencies. (Giridharan Kesavan
and Steve Loughran via cutting)
IMPROVEMENTS
HADOOP-4565. Added CombineFileInputFormat to use data locality information
to create splits. (dhruba via zshao)
HADOOP-4749. Added a new counter REDUCE_INPUT_BYTES. (Yongqiang He via
zshao)
HADOOP-4234. Fix KFS "glue" layer to allow applications to interface
with multiple KFS metaservers. (Sriram Rao via lohit)
HADOOP-4245. Update to latest version of KFS "glue" library jar.
(Sriram Rao via lohit)
HADOOP-4244. Change test-patch.sh to check Eclipse classpath whether or
not it is run by Hudson. (szetszwo)
HADOOP-3180. Add name of missing class to WritableName.getClass
IOException. (Pete Wyckoff via omalley)
HADOOP-4178. Make the capacity scheduler's default values configurable.
(Sreekanth Ramakrishnan via omalley)
HADOOP-4262. Generate better error message when client exception has null
message. (stevel via omalley)
HADOOP-4226. Refactor and document LineReader to make it more readily
understandable. (Yuri Pradkin via cdouglas)
HADOOP-4238. When listing jobs, if scheduling information isn't available
print NA instead of empty output. (Sreekanth Ramakrishnan via johan)
HADOOP-4284. Support filters that apply to all requests, or global filters,
to HttpServer. (Kan Zhang via cdouglas)
HADOOP-4276. Improve the hashing functions and deserialization of the
mapred ID classes. (omalley)
HADOOP-4485. Add a compile-native ant task, as a shorthand. (enis)
HADOOP-4454. Allow # comments in slaves file. (Rama Ramasamy via omalley)
HADOOP-3461. Remove hdfs.StringBytesWritable. (szetszwo)
HADOOP-4437. Use Halton sequence instead of java.util.Random in
PiEstimator. (szetszwo)
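For readers unfamiliar with the Halton sequence named in HADOOP-4437, the
sketch below generates its points with the radical-inverse construction and
uses bases 2 and 3 to estimate pi from a quarter circle. The class name, the
quarter-circle setup, and the sample count are assumptions for illustration;
this is not the PiEstimator code.

```java
// Hypothetical illustration, not the Hadoop PiEstimator.
final class HaltonPiSketch {
    // Radical inverse of index in the given base: the index-th Halton value in [0, 1).
    static double halton(int index, int base) {
        double result = 0.0;
        double fraction = 1.0 / base;
        int i = index;
        while (i > 0) {
            result += fraction * (i % base); // next digit, scaled to its place value
            i /= base;
            fraction /= base;
        }
        return result;
    }

    // Estimates pi as 4 * (points inside the unit quarter circle) / (all points).
    static double estimatePi(int samples) {
        int inside = 0;
        for (int i = 1; i <= samples; i++) {
            double x = halton(i, 2);
            double y = halton(i, 3);
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(100000));
    }
}
```

Unlike java.util.Random, the sequence is deterministic and covers the unit
square evenly, so the estimate tends to converge with fewer samples.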
HADOOP-4572. Change INode and its sub-classes to package private.
(szetszwo)
HADOOP-4187. Does a runtime lookup for JobConf/JobConfigurable, and if
found, invokes the appropriate configure method. (Sharad Agarwal via ddas)
HADOOP-4453. Improve ssl configuration and handling in HsftpFileSystem,
particularly when used with DistCp. (Kan Zhang via cdouglas)
HADOOP-4583. Several code optimizations in HDFS. (Suresh Srinivas via
szetszwo)
HADOOP-3923. Remove org.apache.hadoop.mapred.StatusHttpServer. (szetszwo)
HADOOP-4622. Explicitly specify interpreter for non-native
pipes binaries. (Fredrik Hedberg via johan)
HADOOP-4505. Add a unit test to test faulty setup task and cleanup
task killing the job. (Amareshwari Sriramadasu via johan)
HADOOP-4608. Don't print a stack trace when the example driver gets an
unknown program to run. (Edward Yoon via omalley)
HADOOP-4645. Package HdfsProxy contrib project without the extra level
of directories. (Kan Zhang via omalley)
HADOOP-4126. Allow access to HDFS web UI on EC2 (tomwhite via omalley)
HADOOP-4612. Removes RunJar's dependency on JobClient.
(Sharad Agarwal via ddas)
HADOOP-4185. Adds setVerifyChecksum() method to FileSystem.
(Sharad Agarwal via ddas)
HADOOP-4523. Prevent too many tasks scheduled on a node from bringing
it down by monitoring for cumulative memory usage across tasks.
(Vinod Kumar Vavilapalli via yhemanth)
HADOOP-4640. Adds an input format that can split lzo compressed
text files. (johan)
HADOOP-4666. Launch reduces only after a few maps have run in the
Fair Scheduler. (Matei Zaharia via johan)
HADOOP-4339. Remove redundant calls from FileSystem/FsShell when
generating/processing ContentSummary. (David Phillips via cdouglas)
HADOOP-2774. Add counters tracking records spilled to disk in MapTask and
ReduceTask. (Ravi Gummadi via cdouglas)
HADOOP-4513. Initialize jobs asynchronously in the capacity scheduler.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-4649. Improve abstraction for spill indices. (cdouglas)
HADOOP-3770. Add gridmix2, an iteration on the gridmix benchmark. (Runping
Qi via cdouglas)
HADOOP-4708. Add support for dfsadmin commands in TestCLI. (Boris Shkolnik
via cdouglas)
HADOOP-4758. Add a splitter for metrics contexts to support more than one
type of collector. (cdouglas)
HADOOP-4722. Add tests for dfsadmin quota error messages. (Boris Shkolnik
via cdouglas)
HADOOP-4690. fuse-dfs - create source file/function + utils + config +
main source files. (pete wyckoff via mahadev)
HADOOP-3750. Fix and enforce module dependencies. (Sharad Agarwal via
tomwhite)
HADOOP-4747. Speed up FsShell::ls by removing redundant calls to the
filesystem. (David Phillips via cdouglas)
HADOOP-4305. Improves the blacklisting strategy, whereby, tasktrackers
that are blacklisted are not given tasks to run from other jobs, subject
to the following conditions (all must be met):
1) The TaskTracker has been blacklisted by at least 4 jobs (configurable)
2) The TaskTracker has been blacklisted 50% more times than the
average (configurable)
3) The cluster has less than 50% trackers blacklisted
Once in 24 hours, a TaskTracker blacklisted for all jobs is given a chance.
Restarting the TaskTracker moves it out of the blacklist.
(Amareshwari Sriramadasu via ddas)
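The three conditions in the HADOOP-4305 entry combine into a single predicate,
sketched below. The class, method, and parameter names are hypothetical, the
thresholds are hard-coded to the defaults quoted in the entry, and "50% more
than the average" is read here as strictly greater than 1.5 times the average,
which is an interpretation rather than a confirmed detail.

```java
// Hypothetical illustration, not the JobTracker implementation.
final class BlacklistSketch {
    static final int MIN_BLACKLISTING_JOBS = 4;       // condition 1 (configurable)
    static final double ABOVE_AVERAGE_FACTOR = 1.5;   // condition 2 (configurable)
    static final double MAX_CLUSTER_FRACTION = 0.5;   // condition 3

    // Returns true only when all three conditions from the entry hold.
    static boolean blacklistAcrossJobs(int timesBlacklisted,
                                       double averageTimesBlacklisted,
                                       int blacklistedTrackers,
                                       int totalTrackers) {
        boolean enoughJobs = timesBlacklisted >= MIN_BLACKLISTING_JOBS;
        boolean aboveAverage =
            timesBlacklisted > ABOVE_AVERAGE_FACTOR * averageTimesBlacklisted;
        boolean clusterHasRoom =
            blacklistedTrackers < MAX_CLUSTER_FRACTION * totalTrackers;
        return enoughJobs && aboveAverage && clusterHasRoom;
    }

    public static void main(String[] args) {
        System.out.println(blacklistAcrossJobs(4, 2.0, 1, 10));
    }
}
```

The third condition acts as a safety valve: when half the cluster is already
blacklisted, the cause is more likely a bad job than bad nodes.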
HADOOP-4688. Modify the MiniMRDFSSort unit test to spill multiple times,
exercising the map-side merge code. (cdouglas)
HADOOP-4737. Adds the KILLED notification when jobs get killed.
(Amareshwari Sriramadasu via ddas)
HADOOP-4728. Add a test exercising different namenode configurations.
(Boris Shkolnik via cdouglas)
HADOOP-4807. Adds JobClient commands to get the active/blacklisted tracker
names. Also adds commands to display running/completed task attempt IDs.
(ddas)
HADOOP-4699. Remove checksum validation from map output servlet. (cdouglas)
HADOOP-4838. Added a registry to automate metrics and mbeans management.
(Sanjay Radia via acmurthy)
HADOOP-3136. Fixed the default scheduler to assign multiple tasks to each
tasktracker per heartbeat, when feasible. To ensure locality isn't hurt
too badly, the scheduler will not assign more than one off-switch task per
heartbeat. The heartbeat interval is also halved since the task-tracker is
fixed to no longer send out heartbeats on each task completion. A
slow-start for scheduling reduces is introduced to ensure that reduces
aren't started till sufficient number of maps are done, else reduces of
jobs whose maps aren't scheduled might swamp the cluster.
Configuration changes to mapred-default.xml:
add mapred.reduce.slowstart.completed.maps
(acmurthy)
HADOOP-4545. Add example and test case of secondary sort for the reduce.
(omalley)
HADOOP-4753. Refactor gridmix2 to reduce code duplication. (cdouglas)
HADOOP-4909. Fix Javadoc and make some of the API more consistent in their
use of the JobContext instead of Configuration. (omalley)
HADOOP-4830. Add end-to-end test cases for testing queue capacities.
(Vinod Kumar Vavilapalli via yhemanth)
HADOOP-4980. Improve code layout of capacity scheduler to make it
easier to fix some blocker bugs. (Vivek Ratan via yhemanth)
HADOOP-4916. Make user/location of Chukwa installation configurable by an
external properties file. (Eric Yang via cdouglas)
HADOOP-4950. Make the CompressorStream, DecompressorStream,
BlockCompressorStream, and BlockDecompressorStream public to facilitate
non-Hadoop codecs. (omalley)
HADOOP-4843. Collect job history and configuration in Chukwa. (Eric Yang
via cdouglas)
HADOOP-5030. Build Chukwa RPM to install into configured directory. (Eric
Yang via cdouglas)
HADOOP-4828. Updates documents to do with configuration (HADOOP-4631).
(Sharad Agarwal via ddas)
HADOOP-4939. Adds a test that would inject random failures for tasks in
large jobs and would also inject TaskTracker failures. (ddas)
HADOOP-4920. Stop storing Forrest output in Subversion. (cutting)
HADOOP-4944. A configuration file can include other configuration
files. (Rama Ramasamy via dhruba)
HADOOP-4804. Provide Forrest documentation for the Fair Scheduler.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-5248. A testcase that checks for the existence of job directory
after the job completes. Fails if it exists. (ddas)
HADOOP-4664. Introduces multiple job initialization threads, where the
number of threads are configurable via mapred.jobinit.threads.
(Matei Zaharia and Jothi Padmanabhan via ddas)
HADOOP-4191. Adds a testcase for JobHistory. (Ravi Gummadi via ddas)
HADOOP-5466. Change documentation CSS style for headers and code. (Corinne
Chandel via szetszwo)
HADOOP-5275. Add ivy directory and files to built tar.
(Giridharan Kesavan via nigel)
HADOOP-5468. Add sub-menus to forrest documentation and make some minor
edits. (Corinne Chandel via szetszwo)
HADOOP-5437. Fix TestMiniMRDFSSort to properly test jvm-reuse. (omalley)
HADOOP-5521. Removes dependency of TestJobInProgress on RESTART_COUNT
JobHistory tag. (Ravi Gummadi via ddas)
HADOOP-5714. Add a metric for NameNode getFileInfo operation. (Jakob Homan
via szetszwo)
OPTIMIZATIONS
HADOOP-3293. Fixes FileInputFormat to provide locations for splits
based on the rack/host that has the most bytes.
(Jothi Padmanabhan via ddas)
HADOOP-4683. Fixes Reduce shuffle scheduler to invoke
getMapCompletionEvents in a separate thread. (Jothi Padmanabhan
via ddas)
BUG FIXES
HADOOP-5379. CBZip2InputStream to throw IOException on data crc error.
(Rodrigo Schmidt via zshao)
HADOOP-5326. Fixes CBZip2OutputStream data corruption problem.
(Rodrigo Schmidt via zshao)
HADOOP-4204. Fix findbugs warnings related to unused variables, naive
Number subclass instantiation, Map iteration, and badly scoped inner
classes. (Suresh Srinivas via cdouglas)
HADOOP-4207. Update derby jar file to release 10.4.2.
(Prasad Chakka via dhruba)
HADOOP-4325. SocketInputStream.read() should return -1 in case EOF.
(Raghu Angadi)
HADOOP-4408. FsAction functions need not create new objects. (cdouglas)
HADOOP-4440. TestJobInProgressListener tests for jobs killed in queued
state (Amar Kamat via ddas)
HADOOP-4346. Implement blocking connect so that Hadoop is not affected
by selector problem with JDK default implementation. (Raghu Angadi)
HADOOP-4388. If there are invalid blocks in the transfer list, Datanode
should handle them and keep transferring the remaining blocks. (Suresh
Srinivas via szetszwo)
HADOOP-4587. Fix a typo in Mapper javadoc. (Koji Noguchi via szetszwo)
HADOOP-4530. In fsck, HttpServletResponse sendError fails with
IllegalStateException. (hairong)
HADOOP-4377. Fix a race condition in directory creation in
NativeS3FileSystem. (David Phillips via cdouglas)
HADOOP-4621. Fix javadoc warnings caused by duplicate jars. (Kan Zhang via
cdouglas)
HADOOP-4566. Deploy new hive code to support more types.
(Zheng Shao via dhruba)
HADOOP-4571. Add chukwa conf files to svn:ignore list. (Eric Yang via
szetszwo)
HADOOP-4589. Correct PiEstimator output messages and improve the code
readability. (szetszwo)
HADOOP-4650. Correct a mismatch between the default value of
local.cache.size in the config and the source. (Jeff Hammerbacher via
cdouglas)
HADOOP-4606. Fix cygpath error if the log directory does not exist.
(szetszwo via omalley)
HADOOP-4141. Fix bug in ScriptBasedMapping causing potential infinite
loop on misconfigured hadoop-site. (Aaron Kimball via tomwhite)
HADOOP-4691. Correct a link in the javadoc of IndexedSortable. (szetszwo)
HADOOP-4598. '-setrep' command skips under-replicated blocks. (hairong)
HADOOP-4429. Set defaults for user, group in UnixUserGroupInformation so
login fails more predictably when misconfigured. (Alex Loddengaard via
cdouglas)
HADOOP-4676. Fix broken URL in blacklisted tasktrackers page. (Amareshwari
Sriramadasu via cdouglas)
HADOOP-3422. Ganglia counter metrics are all reported with the metric
name "value", so the counter values can not be seen. (Jason Attributor
and Brian Bockelman via stack)
HADOOP-4704. Fix javadoc typos "the the". (szetszwo)
HADOOP-4677. Fix semantics of FileSystem::getBlockLocations to return
meaningful values. (Hong Tang via cdouglas)
HADOOP-4669. Use correct operator when evaluating whether access time is
enabled (Dhruba Borthakur via cdouglas)
HADOOP-4732. Pass connection and read timeouts in the correct order when
setting up fetch in reduce. (Amareshwari Sriramadasu via cdouglas)
HADOOP-4558. Fix capacity reclamation in capacity scheduler.
(Amar Kamat via yhemanth)
HADOOP-4770. Fix rungridmix_2 script to work with RunJar. (cdouglas)
HADOOP-4738. When using git, the saveVersion script will use only the
commit hash for the version and not the message, which requires escaping.
(cdouglas)
HADOOP-4576. Show pending job count instead of task count in the UI per
queue in capacity scheduler. (Sreekanth Ramakrishnan via yhemanth)
HADOOP-4623. Maintain running tasks even if speculative execution is off.
(Amar Kamat via yhemanth)
HADOOP-4786. Fix compilation error in
TestTrackerBlacklistAcrossJobs. (yhemanth)
HADOOP-4785. Fixes the JobTracker heartbeat to not make two calls to
System.currentTimeMillis(). (Amareshwari Sriramadasu via ddas)
HADOOP-4792. Add generated Chukwa configuration files to version control
ignore lists. (cdouglas)
HADOOP-4796. Fix Chukwa test configuration, remove unused components. (Eric
Yang via cdouglas)
HADOOP-4708. Add binaries missed in the initial checkin for Chukwa. (Eric
Yang via cdouglas)
HADOOP-4805. Remove black list collector from Chukwa Agent HTTP Sender.
(Eric Yang via cdouglas)
HADOOP-4837. Move HADOOP_CONF_DIR configuration to chukwa-env.sh (Jerome
Boulon via cdouglas)
HADOOP-4825. Use ps instead of jps for querying process status in Chukwa.
(Eric Yang via cdouglas)
HADOOP-4844. Fixed javadoc for
org.apache.hadoop.fs.permission.AccessControlException to document that
it's deprecated in favour of
org.apache.hadoop.security.AccessControlException. (acmurthy)
HADOOP-4706. Close the underlying output stream in
IFileOutputStream::close. (Jothi Padmanabhan via cdouglas)
HADOOP-4855. Fixed command-specific help messages for refreshServiceAcl in
DFSAdmin and MRAdmin. (acmurthy)
HADOOP-4820. Remove unused method FSNamesystem::deleteInSafeMode. (Suresh
Srinivas via cdouglas)
HADOOP-4698. Lower io.sort.mb to 10 in the tests and raise the junit memory
limit to 512m from 256m. (Nigel Daley via cdouglas)
HADOOP-4860. Split TestFileTailingAdapters into three separate tests to
avoid contention. (Eric Yang via cdouglas)
HADOOP-3921. Fixed clover (code coverage) target to work with JDK 6.
(tomwhite via nigel)
HADOOP-4845. Modify the reduce input byte counter to record only the
compressed size and add a human-readable label. (Yongqiang He via cdouglas)
HADOOP-4458. Add a test creating symlinks in the working directory.
(Amareshwari Sriramadasu via cdouglas)
HADOOP-4879. Fix org.apache.hadoop.mapred.Counters to correctly define
Object.equals rather than depend on contentEquals api. (omalley via
acmurthy)
HADOOP-4791. Fix rpm build process for Chukwa. (Eric Yang via cdouglas)
HADOOP-4771. Correct initialization of the file count for directories
with quotas. (Ruyue Ma via shv)
HADOOP-4878. Fix eclipse plugin classpath file to point to ivy's resolved
lib directory and added the same to test-patch.sh. (Giridharan Kesavan via
acmurthy)
HADOOP-4774. Fix default values of some capacity scheduler configuration
items which would otherwise not work on a fresh checkout.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-4876. Fix capacity scheduler reclamation by updating count of
pending tasks correctly. (Sreekanth Ramakrishnan via yhemanth)
HADOOP-4849. Documentation for Service Level Authorization implemented in
HADOOP-4348. (acmurthy)
HADOOP-4827. Replace Consolidator with Aggregator macros in Chukwa (Eric
Yang via cdouglas)
HADOOP-4894. Correctly parse ps output in Chukwa jettyCollector.sh. (Ari
Rabkin via cdouglas)
HADOOP-4892. Close fds out of Chukwa ExecPlugin. (Ari Rabkin via cdouglas)
HADOOP-4889. Fix permissions in RPM packaging. (Eric Yang via cdouglas)
HADOOP-4869. Fixes the TT-JT heartbeat to have an explicit flag for
restart apart from the initialContact flag that there was earlier.
(Amareshwari Sriramadasu via ddas)
HADOOP-4716. Fixes ReduceTask.java to clear out the mapping between
hosts and MapOutputLocation upon a JT restart (Amar Kamat via ddas)
HADOOP-4880. Removes an unnecessary testcase from TestJobTrackerRestart.
(Amar Kamat via ddas)
HADOOP-4924. Fixes a race condition in TaskTracker re-init. (ddas)
HADOOP-4854. Read reclaim capacity interval from capacity scheduler
configuration. (Sreekanth Ramakrishnan via yhemanth)
HADOOP-4896. HDFS Fsck does not load HDFS configuration. (Raghu Angadi)
HADOOP-4956. Creates TaskStatus for failed tasks with an empty Counters
object instead of null. (ddas)
HADOOP-4979. Fix capacity scheduler to block cluster for failed high
RAM requirements across task types. (Vivek Ratan via yhemanth)
HADOOP-4949. Fix native compilation. (Chris Douglas via acmurthy)
HADOOP-4787. Fixes the testcase TestTrackerBlacklistAcrossJobs which was
earlier failing randomly. (Amareshwari Sriramadasu via ddas)
HADOOP-4914. Add description fields to Chukwa init.d scripts (Eric Yang via
cdouglas)
HADOOP-4884. Make tool tip date format match standard HICC format. (Eric
Yang via cdouglas)
HADOOP-4925. Make Chukwa sender properties configurable. (Ari Rabkin via
cdouglas)
HADOOP-4947. Make Chukwa command parsing more forgiving of whitespace. (Ari
Rabkin via cdouglas)
HADOOP-5026. Make chukwa/bin scripts executable in repository. (Andy
Konwinski via cdouglas)
HADOOP-4977. Fix a deadlock between the reclaimCapacity and assignTasks
in capacity scheduler. (Vivek Ratan via yhemanth)
HADOOP-4988. Fix reclaim capacity to work even when there are queues with
no capacity. (Vivek Ratan via yhemanth)
HADOOP-5065. Remove generic parameters from argument to
setIn/OutputFormatClass so that it works with SequenceIn/OutputFormat.
(cdouglas via omalley)
HADOOP-4818. Pass user config to instrumentation API. (Eric Yang via
cdouglas)
HADOOP-4993. Fix Chukwa agent configuration and startup to make it both
more modular and testable. (Ari Rabkin via cdouglas)
HADOOP-5048. Fix capacity scheduler to correctly cleanup jobs that are
killed after initialization, but before running.
(Sreekanth Ramakrishnan via yhemanth)
HADOOP-4671. Mark loop control variables shared between threads as
volatile. (cdouglas)
HADOOP-5079. HashFunction inadvertently destroys some randomness
(Jonathan Ellis via stack)
HADOOP-4999. A failure to write to FsEditsLog results in
IndexOutOfBounds exception. (Boris Shkolnik via rangadi)
HADOOP-5139. Catch IllegalArgumentException during metrics registration
in RPC. (Hairong Kuang via szetszwo)
HADOOP-5085. Copying a file to local with Crc throws an exception.
(hairong)
HADOOP-4759. Removes temporary output directory for failed and
killed tasks by launching special CLEANUP tasks for the same.
(Amareshwari Sriramadasu via ddas)
HADOOP-5211. Fix check for job completion in TestSetupAndCleanupFailure.
(enis)
HADOOP-5254. The Configuration class should be able to work with XML
parsers that do not support xmlinclude. (Steve Loughran via dhruba)
HADOOP-4692. Namenode in infinite loop for replicating/deleting corrupt
blocks. (hairong)
HADOOP-5255. Fix use of Math.abs to avoid overflow. (Jonathan Ellis via
cdouglas)
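The overflow behind HADOOP-5255 is a general Java pitfall: Math.abs(Integer.MIN_VALUE) returns Integer.MIN_VALUE, which is still negative. The sketch below uses hypothetical names, and masking the sign bit is one common remedy rather than necessarily the one applied in the patch:

```java
// Hypothetical illustration only; not code from the Hadoop patch.
class NonNegativeBucketSketch {
    // Unsafe: for hash == Integer.MIN_VALUE, Math.abs returns a negative value,
    // so the resulting bucket index is negative.
    static int bucketWithAbs(int hash, int buckets) {
        return Math.abs(hash) % buckets;
    }

    // Safe: clearing the sign bit always yields a value in [0, Integer.MAX_VALUE].
    static int bucketWithMask(int hash, int buckets) {
        return (hash & Integer.MAX_VALUE) % buckets;
    }

    public static void main(String[] args) {
        System.out.println(bucketWithAbs(Integer.MIN_VALUE, 10));  // negative index
        System.out.println(bucketWithMask(Integer.MIN_VALUE, 10)); // non-negative index
    }
}
```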
HADOOP-5269. Fixes a problem to do with tasktracker holding on to
FAILED_UNCLEAN or KILLED_UNCLEAN tasks forever. (Amareshwari Sriramadasu
via ddas)
HADOOP-5214. Fixes a ConcurrentModificationException while the Fairshare
Scheduler accesses the tasktrackers stored by the JobTracker.
(Rahul Kumar Singh via yhemanth)
HADOOP-5233. Addresses the three issues - Race condition in updating
status, NPE in TaskTracker task localization when the conf file is missing
(HADOOP-5234) and NPE in handling KillTaskAction of a cleanup task
(HADOOP-5235). (Amareshwari Sriramadasu via ddas)
HADOOP-5247. Introduces a broadcast of KillJobAction to all trackers when
a job finishes. This fixes a bunch of problems to do with NPE when a
completed job is not in memory and a tasktracker comes to the jobtracker
with a status report of a task belonging to that job. (Amar Kamat via ddas)
HADOOP-5282. Fixed job history logs for task attempts that are
failed by the JobTracker, say due to lost task trackers. (Amar
Kamat via yhemanth)
HADOOP-4963. Fixes a logging problem to do with getting the location of
map output file. (Amareshwari Sriramadasu via ddas)
HADOOP-5292. Fix NPE in KFS::getBlockLocations. (Sriram Rao via lohit)
HADOOP-5241. Fixes a bug in disk-space resource estimation. Makes
the estimation formula linear where blowUp =
Total-Output/Total-Input. (Sharad Agarwal via ddas)
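A rough sketch of a linear estimate built on the blowUp ratio named in HADOOP-5241. It assumes that the estimate for a task is its input size multiplied by blowUp; the class, method and parameter names are hypothetical and the actual estimator may differ:

```java
// Hypothetical illustration only; not code from the Hadoop patch.
class OutputSizeEstimateSketch {
    // blowUp = totalOutput / totalInput over the data seen so far (assumption:
    // both totals are byte counts); the estimate scales linearly with taskInput.
    static long estimateOutputBytes(long totalInput, long totalOutput, long taskInput) {
        if (totalInput <= 0) {
            return 0L; // nothing observed yet, so there is no ratio to apply
        }
        double blowUp = (double) totalOutput / totalInput;
        return Math.round(taskInput * blowUp);
    }
}
```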
HADOOP-5142. Fix MapWritable#putAll to store key/value classes.
(Doğacan Güney via enis)
HADOOP-4744. Workaround for jetty6 returning -1 when getLocalPort
is invoked on the connector. The workaround patch retries a few
times before failing. (Jothi Padmanabhan via yhemanth)
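The retry idea described in HADOOP-4744 can be sketched generically as follows. The port source is modelled as a plain IntSupplier instead of a Jetty connector, and the names, attempt count and sleep interval are all assumptions made for illustration:

```java
import java.util.function.IntSupplier;

// Hypothetical illustration only; not code from the Hadoop patch.
class PortRetrySketch {
    // Polls portSource until it reports a non-negative port or attempts run out.
    static int getPortWithRetry(IntSupplier portSource, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int port = portSource.getAsInt();
            if (port >= 0) {
                return port;
            }
            try {
                Thread.sleep(sleepMillis); // brief pause before asking again
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break; // stop retrying if the caller was interrupted
            }
        }
        throw new IllegalStateException("no valid port after " + maxAttempts + " attempts");
    }
}
```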
HADOOP-5280. Adds a check to prevent a task state transition from
FAILED to any of UNASSIGNED, RUNNING, COMMIT_PENDING or
SUCCEEDED. (ddas)
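The rule stated in HADOOP-5280 can be pictured as a small guard like the one below. The enum and method are hypothetical stand-ins that model only the transitions named in the entry; every other transition is left unrestricted in this sketch:

```java
// Hypothetical illustration only; not code from the Hadoop patch.
class TaskStateGuardSketch {
    enum State { UNASSIGNED, RUNNING, COMMIT_PENDING, SUCCEEDED, FAILED, KILLED }

    // Rejects a move from FAILED back to any of the four states named in the entry.
    static boolean isAllowed(State from, State to) {
        if (from == State.FAILED) {
            switch (to) {
                case UNASSIGNED:
                case RUNNING:
                case COMMIT_PENDING:
                case SUCCEEDED:
                    return false;
                default:
                    return true;
            }
        }
        return true; // transitions from other states are not modelled here
    }
}
```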
HADOOP-5272. Fixes a problem to do with detecting whether an
attempt is the first attempt of a Task. This affects JobTracker
restart. (Amar Kamat via ddas)
HADOOP-5306. Fixes a problem to do with logging/parsing the http port of a
lost tracker. Affects JobTracker restart. (Amar Kamat via ddas)
HADOOP-5111. Fix Job::set* methods to work with generics. (cdouglas)
HADOOP-5274. Fix gridmix2 dependency on wordcount example. (cdouglas)
HADOOP-5145. Balancer sometimes runs out of memory after running
days or weeks. (hairong)
HADOOP-5338. Fix jobtracker restart to clear task completion
events cached by tasktrackers forcing them to fetch all events
afresh, thus avoiding missed task completion events on the
tasktrackers. (Amar Kamat via yhemanth)
HADOOP-4695. Change TestGlobalFilter so that it allows a web page to be
filtered more than once for a single access. (Kan Zhang via szetszwo)
HADOOP-5298. Change TestServletFilter so that it allows a web page to be
filtered more than once for a single access. (szetszwo)
HADOOP-5432. Disable ssl during unit tests in hdfsproxy, as it is unused
and causes failures. (cdouglas)
HADOOP-5416. Correct the shell command "fs -test" forrest doc description.
(Ravi Phulari via szetszwo)
HADOOP-5327. Fixed job tracker to remove files from system directory on
ACL check failures and also check ACLs on restart.
(Amar Kamat via yhemanth)
HADOOP-5395. Change the exception message when a job is submitted to an
invalid queue. (Rahul Kumar Singh via yhemanth)
HADOOP-5276. Fixes a problem to do with updating the start time of
a task when the tracker that ran the task is lost. (Amar Kamat via
ddas)
HADOOP-5278. Fixes a problem to do with logging the finish time of
a task during recovery (after a JobTracker restart). (Amar Kamat
via ddas)
HADOOP-5490. Fixes a synchronization problem in the
EagerTaskInitializationListener class. (Jothi Padmanabhan via
ddas)
HADOOP-5493. The shuffle copier threads return the codecs back to
the pool when the shuffle completes. (Jothi Padmanabhan via ddas)
HADOOP-5505. Fix JspHelper initialization in the context of
MiniDFSCluster. (Raghu Angadi)
HADOOP-5414. Fixes IO exception while executing hadoop fs -touchz
fileName by making sure that lease renewal thread exits before dfs
client exits. (hairong)
HADOOP-5103. FileInputFormat now reuses the clusterMap network
topology object and that brings down the log messages in the
JobClient to do with NetworkTopology.add significantly. (Jothi
Padmanabhan via ddas)
HADOOP-5483. Fixes a problem in the Directory Cleanup Thread due to which
TestMiniMRWithDFS sometimes used to fail. (ddas)
HADOOP-5281. Prevent sharing incompatible ZlibCompressor instances between
GzipCodec and DefaultCodec. (cdouglas)
HADOOP-5463. Balancer throws "Not a host:port pair" unless port is
specified in fs.default.name. (Stuart White via hairong)
HADOOP-5514. Fix JobTracker metrics and add metrics for waiting, failed
tasks. (cdouglas)
HADOOP-5516. Fix NullPointerException in TaskMemoryManagerThread
that comes when monitored processes disappear when the thread is
running. (Vinod Kumar Vavilapalli via yhemanth)
HADOOP-5382. Support combiners in the new context object API. (omalley)
HADOOP-5471. Fixes a problem to do with updating the log.index file in the
case where a cleanup task is run. (Amareshwari Sriramadasu via ddas)
HADOOP-5534. Fixed a deadlock in Fair scheduler's servlet.
(Rahul Kumar Singh via yhemanth)
HADOOP-5328. Fixes a problem in the renaming of job history files during
job recovery. (Amar Kamat via ddas)
HADOOP-5417. Don't ignore InterruptedExceptions that happen when calling
into rpc. (omalley)
HADOOP-5320. Add a close() in TestMapReduceLocal. (Jothi Padmanabhan
via szetszwo)
HADOOP-5520. Fix a typo in disk quota help message. (Ravi Phulari
via szetszwo)
HADOOP-5519. Remove claims from mapred-default.xml that prime numbers
of tasks are helpful. (Owen O'Malley via szetszwo)
HADOOP-5484. TestRecoveryManager fails with FileAlreadyExistsException.
(Amar Kamat via hairong)
HADOOP-5564. Limit the JVM heap size in the java command for initializing
JAVA_PLATFORM. (Suresh Srinivas via szetszwo)
HADOOP-5565. Add API for failing/finalized jobs to the JT metrics
instrumentation. (Jerome Boulon via cdouglas)
HADOOP-5390. Remove duplicate jars from tarball, src from binary tarball
added by hdfsproxy. (Zhiyong Zhang via cdouglas)
HADOOP-5066. Building binary tarball should not build docs/javadocs, copy
src, or run jdiff. (Giridharan Kesavan via cdouglas)
HADOOP-5459. Fix undetected CRC errors where intermediate output is closed
before it has been completely consumed. (cdouglas)
HADOOP-5571. Remove widening primitive conversion in TupleWritable mask
manipulation. (Jingkei Ly via cdouglas)
HADOOP-5588. Remove an unnecessary call to listStatus(..) in
FileSystem.globStatusInternal(..). (Hairong Kuang via szetszwo)
HADOOP-5473. Solves a race condition in killing a task - the state is KILLED
if there is a user request pending to kill the task and the TT reported
the state as SUCCESS. (Amareshwari Sriramadasu via ddas)
HADOOP-5576. Fix LocalRunner to work with the new context object API in
mapreduce. (Tom White via omalley)
HADOOP-4374. Installs a shutdown hook in the Task JVM so that log.index is
updated before the JVM exits. Also makes the update to log.index atomic.
(Ravi Gummadi via ddas)
HADOOP-5577. Add a verbose flag to mapreduce.Job.waitForCompletion to get
the running job's information printed to the user's stdout as it runs.
(omalley)
HADOOP-5607. Fix NPE in TestCapacityScheduler. (cdouglas)
HADOOP-5605. All the replicas incorrectly got marked as corrupt. (hairong)
HADOOP-5337. JobTracker, upon restart, now waits for the TaskTrackers to
join back before scheduling new tasks. This fixes race conditions associated
with greedy scheduling as was the case earlier. (Amar Kamat via ddas)
HADOOP-5227. Fix distcp so -update and -delete can be meaningfully
combined. (Tsz Wo (Nicholas), SZE via cdouglas)
HADOOP-5305. Increase number of files and print debug messages in
TestCopyFiles. (szetszwo)
HADOOP-5548. Add synchronization for JobTracker methods in RecoveryManager.
(Amareshwari Sriramadasu via sharad)
HADOOP-3810. NameNode seems unstable on a cluster with little space left.
(hairong)
HADOOP-5068. Fix NPE in TestCapacityScheduler. (Vinod Kumar Vavilapalli
via szetszwo)
HADOOP-5585. Clear FileSystem statistics between tasks when jvm-reuse
is enabled. (omalley)
HADOOP-5394. JobTracker might schedule 2 attempts of the same task
with the same attempt id across restarts. (Amar Kamat via sharad)
HADOOP-5645. After HADOOP-4920 we need a place to checkin
releasenotes.html. (nigel)
Release 0.19.2 - Unreleased
BUG FIXES
HADOOP-5154. Fixes a deadlock in the fairshare scheduler.
(Matei Zaharia via yhemanth)
HADOOP-5146. Fixes a race condition that causes LocalDirAllocator to miss
files. (Devaraj Das via yhemanth)
HADOOP-4638. Fixes job recovery to not crash the job tracker for problems
with a single job file. (Amar Kamat via yhemanth)
HADOOP-5384. Fix a problem that DataNodeCluster creates blocks with
generationStamp == 1. (szetszwo)
HADOOP-5376. Fixes the code handling lost tasktrackers to set the task state
to KILLED_UNCLEAN only for relevant type of tasks.
(Amareshwari Sriramadasu via yhemanth)
HADOOP-5285. Fixes the issues - (1) obtainTaskCleanupTask checks whether job is
inited before trying to lock the JobInProgress (2) Moves the CleanupQueue class
outside the TaskTracker and makes it a generic class that is used by the
JobTracker also for deleting the paths on the job's output fs. (3) Moves the
references to completedJobStore outside the block where the JobTracker is locked.
(ddas)
HADOOP-5392. Fixes a problem to do with JT crashing during recovery when
the job files are garbled. (Amar Kamat via ddas)
HADOOP-5332. Appending to files is not allowed (by default) unless
dfs.support.append is set to true. (dhruba)
HADOOP-5333. libhdfs supports appending to files. (dhruba)
HADOOP-3998. Fix dfsclient exception when JVM is shutdown. (dhruba)
HADOOP-5440. Fixes a problem to do with removing a taskId from the list
of taskIds that the TaskTracker's TaskMemoryManager manages.
(Amareshwari Sriramadasu via ddas)
HADOOP-5446. Restore TaskTracker metrics. (cdouglas)
HADOOP-5449. Fixes the history cleaner thread.
(Amareshwari Sriramadasu via ddas)
HADOOP-5479. NameNode should not send empty block replication request to
DataNode. (hairong)
HADOOP-5259. Job with output hdfs:/user/<username>/outputpath (no
authority) fails with Wrong FS. (Doug Cutting via hairong)
HADOOP-5522. Documents the setup/cleanup tasks in the mapred tutorial.
(Amareshwari Sriramadasu via ddas)
HADOOP-5549. ReplicationMonitor should schedule both replication and
deletion work in one iteration. (hairong)
HADOOP-5554. DataNodeCluster and CreateEditsLog should create blocks with
the same generation stamp value. (hairong via szetszwo)
HADOOP-5231. Clones the TaskStatus before passing it to the JobInProgress.
(Amareshwari Sriramadasu via ddas)
HADOOP-4719. Fix documentation of 'ls' format for FsShell. (Ravi Phulari
via cdouglas)
HADOOP-5374. Fixes a NPE problem in getTasksToSave method.
(Amareshwari Sriramadasu via ddas)
HADOOP-4780. Cache the size of directories in DistributedCache, avoiding
long delays in recalculating it. (He Yongqiang via cdouglas)
HADOOP-5551. Prevent directory destruction on file create.
(Brian Bockelman via shv)
HADOOP-5671. Fix FNF exceptions when copying from old versions of
HftpFileSystem. (Tsz Wo (Nicholas), SZE via cdouglas)
HADOOP-5579. Set errno correctly in libhdfs for permission, quota, and FNF
conditions. (Brian Bockelman via cdouglas)
HADOOP-5816. Fixes a problem in the KeyFieldBasedComparator to do with
ArrayIndexOutOfBounds exception. (He Yongqiang via ddas)
HADOOP-5951. Add Apache license header to StorageInfo.java. (Suresh
Srinivas via szetszwo)
Release 0.19.1 - 2009-02-23
IMPROVEMENTS
HADOOP-4739. Fix spelling and grammar, improve phrasing of some sections in
mapred tutorial. (Vivek Ratan via cdouglas)
HADOOP-3894. DFSClient logging improvements. (Steve Loughran via shv)
HADOOP-5126. Remove empty file BlocksWithLocations.java (shv)
HADOOP-5127. Remove public methods in FSDirectory. (Jakob Homan via shv)
BUG FIXES
HADOOP-4697. Fix getBlockLocations in KosmosFileSystem to handle multiple
blocks correctly. (Sriram Rao via cdouglas)
HADOOP-4420. Add null checks for job, caused by invalid job IDs.
(Aaron Kimball via tomwhite)
HADOOP-4632. Fix TestJobHistoryVersion to use test.build.dir instead of the
current working directory for scratch space. (Amar Kamat via cdouglas)
HADOOP-4508. Fix FSDataOutputStream.getPos() for append. (dhruba via
szetszwo)
HADOOP-4727. Fix a group checking bug in fill_stat_structure(...) in
fuse-dfs. (Brian Bockelman via szetszwo)
HADOOP-4836. Correct typos in mapred related documentation. (Jordà Polo
via szetszwo)
HADOOP-4821. Usage description in the Quotas guide documentations are
incorrect. (Boris Shkolnik via hairong)
HADOOP-4847. Moves the loading of OutputCommitter to the Task.
(Amareshwari Sriramadasu via ddas)
HADOOP-4966. Marks completed setup tasks for removal.
(Amareshwari Sriramadasu via ddas)
HADOOP-4982. TestFsck should run in Eclipse. (shv)
HADOOP-5008. TestReplication#testPendingReplicationRetry leaves an opened
fd unclosed. (hairong)
HADOOP-4906. Fix TaskTracker OOM by keeping a shallow copy of JobConf in
TaskTracker.TaskInProgress. (Sharad Agarwal via acmurthy)
HADOOP-4918. Fix bzip2 compression to work with Sequence Files.
(Zheng Shao via dhruba).
HADOOP-4965. TestFileAppend3 should close FileSystem. (shv)
HADOOP-4967. Fixes a race condition in the JvmManager to do with killing
tasks. (ddas)
HADOOP-5009. DataNode#shutdown sometimes leaves data block scanner
verification log unclosed. (hairong)
HADOOP-5086. Use the appropriate FileSystem for trash URIs. (cdouglas)
HADOOP-4955. Make DBOutputFormat use column names from setOutput().
(Kevin Peterson via enis)
HADOOP-4862. Minor : HADOOP-3678 did not remove all the cases of
spurious IOExceptions logged by DataNode. (Raghu Angadi)
HADOOP-5034. NameNode should send both replication and deletion requests
to DataNode in one reply to a heartbeat. (hairong)
HADOOP-5156. TestHeartbeatHandling uses MiniDFSCluster.getNamesystem()
which does not exist in branch 0.19 and 0.20. (hairong)
HADOOP-5161. Accepted sockets do not get placed in
DataXceiverServer#childSockets. (hairong)
HADOOP-5193. Correct calculation of edits modification time. (shv)
HADOOP-4494. Allow libhdfs to append to files.
(Pete Wyckoff via dhruba)
HADOOP-5166. Fix JobTracker restart to work when ACLs are configured
for the JobTracker. (Amar Kamat via yhemanth).
HADOOP-5067. Fixes TaskInProgress.java to keep track of count of failed and
killed tasks correctly. (Amareshwari Sriramadasu via ddas)
HADOOP-4760. HDFS streams should not throw exceptions when closed twice.
(enis)
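The behavior requested in HADOOP-4760 matches the contract of java.io.Closeable, under which closing an already closed stream has no effect. A minimal sketch with a hypothetical class, not the HDFS stream implementation:

```java
import java.io.Closeable;

// Hypothetical illustration only; not code from the Hadoop patch.
class IdempotentCloseSketch implements Closeable {
    private boolean closed = false;
    private int releases = 0; // counts how often resources were actually released

    @Override
    public synchronized void close() {
        if (closed) {
            return; // second and later calls are no-ops instead of errors
        }
        closed = true;
        releases++;
    }

    synchronized int releaseCount() {
        return releases;
    }
}
```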
Release 0.19.0 - 2008-11-18
INCOMPATIBLE CHANGES
HADOOP-3595. Remove deprecated methods for mapred.combine.once
functionality, which was necessary to provide backwards-compatible
combiner semantics for 0.18. (cdouglas via omalley)
HADOOP-3667. Remove the following deprecated methods from JobConf:
addInputPath(Path)
getInputPaths()
getMapOutputCompressionType()
getOutputPath()
getSystemDir()
setInputPath(Path)
setMapOutputCompressionType(CompressionType style)
setOutputPath(Path)
(Amareshwari Sriramadasu via omalley)
HADOOP-3652. Remove deprecated class OutputFormatBase.
(Amareshwari Sriramadasu via cdouglas)
HADOOP-2885. Break the hadoop.dfs package into separate packages under
hadoop.hdfs that reflect whether they are client, server, protocol,
etc. DistributedFileSystem and DFSClient have moved and are now
considered package private. (Sanjay Radia via omalley)
HADOOP-2325. Require Java 6. (cutting)
HADOOP-372. Add support for multiple input paths with a different
InputFormat and Mapper for each path. (Chris Smith via tomwhite)
HADOOP-1700. Support appending to file in HDFS. (dhruba)
HADOOP-3792. Make FsShell -test consistent with unix semantics, returning
zero for true and non-zero for false. (Ben Slusky via cdouglas)
HADOOP-3664. Remove the deprecated method InputFormat.validateInput,
which is no longer needed. (tomwhite via omalley)
HADOOP-3549. Give more meaningful errno's in libhdfs. In particular,
EACCES is returned for permission problems. (Ben Slusky via omalley)
HADOOP-4036. ResourceStatus was added to TaskTrackerStatus by HADOOP-3759,
so increment the InterTrackerProtocol version. (Hemanth Yamijala via
omalley)
HADOOP-3150. Moves task promotion to tasks. Defines a new interface for
committing output files. Moves job setup to jobclient, and moves jobcleanup
to a separate task. (Amareshwari Sriramadasu via ddas)
HADOOP-3446. Keep map outputs in memory during the reduce. Remove
fs.inmemory.size.mb and replace with properties defining in memory map
output retention during the shuffle and reduce relative to maximum heap
usage. (cdouglas)
HADOOP-3245. Adds the feature for supporting JobTracker restart. Running
jobs can be recovered from the history file. The history file format has
been modified to support recovery. The task attempt ID now has the
JobTracker start time to distinguish attempts of the same TIP across
restarts. (Amar Ramesh Kamat via ddas)
HADOOP-4007. Remove DFSFileInfo - FileStatus is sufficient.
(Sanjay Radia via hairong)
HADOOP-3722. Fixed Hadoop Streaming and Hadoop Pipes to use the Tool
interface and GenericOptionsParser. (Enis Soztutar via acmurthy)
HADOOP-2816. Cluster summary at name node web reports the space
utilization as:
Configured Capacity: capacity of all the data directories - Reserved space
Present Capacity: Space available for dfs, i.e. remaining+used space
DFS Used%: DFS used space/Present Capacity
(Suresh Srinivas via hairong)
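The three figures defined in HADOOP-2816 are simple arithmetic. The worked sketch below uses hypothetical method names and sample numbers to make the definitions concrete; it is not the NameNode's reporting code:

```java
// Hypothetical illustration only; not code from the Hadoop patch.
class ClusterSummarySketch {
    // Configured Capacity: capacity of all the data directories minus reserved space.
    static long configuredCapacity(long dataDirCapacity, long reserved) {
        return dataDirCapacity - reserved;
    }

    // Present Capacity: space available for DFS, i.e. remaining plus used space.
    static long presentCapacity(long remaining, long dfsUsed) {
        return remaining + dfsUsed;
    }

    // DFS Used%: DFS used space as a percentage of Present Capacity.
    static double dfsUsedPercent(long dfsUsed, long presentCapacity) {
        return presentCapacity == 0 ? 0.0 : 100.0 * dfsUsed / presentCapacity;
    }
}
```

For example, with 1000 units of raw capacity, 100 reserved, 600 remaining and 200 used by DFS, Configured Capacity is 900, Present Capacity is 800 and DFS Used% is 25.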
HADOOP-3938. Disk space quotas for HDFS. This is similar to namespace
quotas in 0.18. (rangadi)
HADOOP-4293. Make Configuration Writable and remove unreleased
WritableJobConf. Configuration.write is renamed to writeXml. (omalley)
HADOOP-4281. Change dfsadmin to report available disk space in a format
consistent with the web interface as defined in HADOOP-2816. (Suresh
Srinivas via cdouglas)
HADOOP-4430. Further change the cluster summary at name node web that was
changed in HADOOP-2816:
Non DFS Used - This indicates the disk space taken by non-DFS files from
the Configured Capacity
DFS Used % - DFS Used % of Configured Capacity
DFS Remaining % - Remaining % of Configured Capacity available for DFS use
DFS command line report reflects the same change. Config parameter
dfs.datanode.du.pct is no longer used and is removed from the
hadoop-default.xml. (Suresh Srinivas via hairong)
HADOOP-4116. Balancer should provide better resource management. (hairong)
HADOOP-4599. BlocksMap and BlockInfo made package private. (shv)
NEW FEATURES
HADOOP-3341. Allow streaming jobs to specify the field separator for map
and reduce input and output. The new configuration values are:
stream.map.input.field.separator
stream.map.output.field.separator
stream.reduce.input.field.separator
stream.reduce.output.field.separator
All of them default to "\t". (Zheng Shao via omalley)
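To illustrate what a field separator does to a record, the sketch below splits a line into a key and a value at the first occurrence of the separator. That splitting rule, like the class and method names, is an assumption made for illustration and is not taken from the streaming source:

```java
// Hypothetical illustration only; not code from the Hadoop patch.
class FieldSeparatorSketch {
    // Matches the default value stated in the entry for all four properties.
    static final String DEFAULT_SEPARATOR = "\t";

    // Returns {key, value}; the value is empty when the separator is absent.
    static String[] splitKeyValue(String line, String separator) {
        int idx = line.indexOf(separator);
        if (idx < 0) {
            return new String[] { line, "" };
        }
        return new String[] {
            line.substring(0, idx),
            line.substring(idx + separator.length())
        };
    }
}
```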
HADOOP-3479. Defines the configuration file for the resource manager in
Hadoop. You can configure various parameters related to scheduling, such
as queues and queue properties here. The properties for a queue follow a
naming convention, such as hadoop.rm.queue.queue-name.property-name.
(Hemanth Yamijala via ddas)
HADOOP-3149. Adds a way in which map/reduce tasks can create multiple
outputs. (Alejandro Abdelnur via ddas)
HADOOP-3714. Add a new contrib, bash-tab-completion, which enables
bash tab completion for the bin/hadoop script. See the README file
in the contrib directory for the installation. (Chris Smith via enis)
HADOOP-3730. Adds a new JobConf constructor that disables loading
default configurations. (Alejandro Abdelnur via ddas)
HADOOP-3772. Add a new Hadoop Instrumentation api for the JobTracker and
the TaskTracker, refactor Hadoop Metrics as an implementation of the api.
(Ari Rabkin via acmurthy)
HADOOP-2302. Provides a comparator for numerical sorting of key fields.
(ddas)
HADOOP-153. Provides a way to skip bad records. (Sharad Agarwal via ddas)
HADOOP-657. Free disk space should be modelled and used by the scheduler
to make scheduling decisions. (Ari Rabkin via omalley)
HADOOP-3719. Initial checkin of Chukwa, which is a data collection and
analysis framework. (Jerome Boulon, Andy Konwinski, Ari Rabkin,
and Eric Yang)
HADOOP-3873. Add -filelimit and -sizelimit options to distcp to cap the
number of files/bytes copied in a particular run to support incremental
updates and mirroring. (TszWo (Nicholas), SZE via cdouglas)
HADOOP-3585. FailMon package for hardware failure monitoring and
analysis of anomalies. (Ioannis Koltsidas via dhruba)
HADOOP-1480. Add counters to the C++ Pipes API. (acmurthy via omalley)
HADOOP-3854. Add support for pluggable servlet filters in the HttpServers.
(Tsz Wo (Nicholas) Sze via omalley)
HADOOP-3759. Provides ability to run memory intensive jobs without
affecting other running tasks on the nodes. (Hemanth Yamijala via ddas)
HADOOP-3746. Add a fair share scheduler. (Matei Zaharia via omalley)
HADOOP-3754. Add a thrift interface to access HDFS. (dhruba via omalley)
HADOOP-3828. Provides a way to write skipped records to DFS.
(Sharad Agarwal via ddas)
HADOOP-3948. Separate name-node edits and fsimage directories.
(Lohit Vijayarenu via shv)
HADOOP-3939. Add an option to DistCp to delete files at the destination
not present at the source. (Tsz Wo (Nicholas) Sze via cdouglas)
HADOOP-3601. Add a new contrib module for Hive, which is a sql-like
query processing tool that uses map/reduce. (Ashish Thusoo via omalley)
HADOOP-3866. Added sort and multi-job updates in the JobTracker web ui.
(Craig Weisenfluh via omalley)
HADOOP-3698. Add access control to control who is allowed to submit or
modify jobs in the JobTracker. (Hemanth Yamijala via omalley)
HADOOP-1869. Support access times for HDFS files. (dhruba)
HADOOP-3941. Extend FileSystem API to return file-checksums.
(szetszwo)
HADOOP-3581. Prevents memory intensive user tasks from taking down
nodes. (Vinod K V via ddas)
HADOOP-3970. Provides a way to recover counters written to JobHistory.
(Amar Kamat via ddas)
HADOOP-3702. Adds ChainMapper and ChainReducer classes that allow composing
chains of Maps and Reduces in a single Map/Reduce job, something like
MAP+ / REDUCE MAP*. (Alejandro Abdelnur via ddas)
HADOOP-3445. Add capacity scheduler that provides guaranteed capacities to
queues as a percentage of the cluster. (Vivek Ratan via omalley)
HADOOP-3992. Add a synthetic load generation facility to the test
directory. (hairong via szetszwo)
HADOOP-3981. Implement a distributed file checksum algorithm in HDFS
and change DistCp to use file checksum for comparing src and dst files
(szetszwo)
HADOOP-3829. Narrow down skipped records based on user acceptable value.
(Sharad Agarwal via ddas)
HADOOP-3930. Add common interfaces for the pluggable schedulers and the
cli & gui clients. (Sreekanth Ramakrishnan via omalley)
HADOOP-4176. Implement getFileChecksum(Path) in HftpFileSystem. (szetszwo)
HADOOP-249. Reuse JVMs across Map-Reduce Tasks.
Configuration changes to hadoop-default.xml:
add mapred.job.reuse.jvm.num.tasks
(Devaraj Das via acmurthy)
HADOOP-4070. Provide a mechanism in Hive for registering UDFs from the
query language. (tomwhite)
HADOOP-2536. Implement JDBC-based database input and output formats to
allow Map-Reduce applications to work with databases. (Fredrik Hedberg and
Enis Soztutar via acmurthy)
HADOOP-3019. A new library to support total order partitions.
(cdouglas via omalley)
HADOOP-3924. Added a 'KILLED' job status. (Subramaniam Krishnan via
acmurthy)
IMPROVEMENTS
HADOOP-4205. hive: metastore and ql to use the refactored SerDe library.
(zshao)
HADOOP-4106. libhdfs: add time, permission and user attribute support
(part 2). (Pete Wyckoff through zshao)
HADOOP-4104. libhdfs: add time, permission and user attribute support.
(Pete Wyckoff through zshao)
HADOOP-3908. libhdfs: better error message if libhdfs.so doesn't exist.
(Pete Wyckoff through zshao)
HADOOP-3732. Delay initialization of datanode block verification till
the verification thread is started. (rangadi)
HADOOP-1627. Various small improvements to 'dfsadmin -report' output.
(rangadi)
HADOOP-3577. Tools to inject blocks into name node and simulated
data nodes for testing. (Sanjay Radia via hairong)
HADOOP-2664. Add a lzop compatible codec, so that files compressed by lzop
may be processed by map/reduce. (cdouglas via omalley)
HADOOP-3655. Add additional ant properties to control junit. (Steve
Loughran via omalley)
HADOOP-3543. Update the copyright year to 2008. (cdouglas via omalley)
HADOOP-3587. Add a unit test for the contrib/data_join framework.
(cdouglas)
HADOOP-3402. Add terasort example program (omalley)
HADOOP-3660. Add replication factor for injecting blocks in simulated
datanodes. (Sanjay Radia via cdouglas)
HADOOP-3684. Add a cloning function to the contrib/data_join framework
permitting users to define a more efficient method for cloning values from
the reduce than serialization/deserialization. (Runping Qi via cdouglas)
HADOOP-3478. Improves the handling of map output fetching. Now the
randomization is by the hosts (and not the map outputs themselves).
(Jothi Padmanabhan via ddas)
HADOOP-3617. Removed redundant checks of accounting space in MapTask and
makes the spill thread persistent so as to avoid creating a new one for
each spill. (Chris Douglas via acmurthy)
HADOOP-3412. Factor the scheduler out of the JobTracker and make
it pluggable. (Tom White and Brice Arnould via omalley)
HADOOP-3756. Minor. Remove unused dfs.client.buffer.dir from
hadoop-default.xml. (rangadi)
HADOOP-3747. Adds counter support for MultipleOutputs.
(Alejandro Abdelnur via ddas)
HADOOP-3169. LeaseChecker daemon should not be started in DFSClient
constructor. (TszWo (Nicholas), SZE via hairong)
HADOOP-3824. Move base functionality of StatusHttpServer to a core