blob: 552e0f6ccc3c1ff68b2516ae4614529a007022ab [file] [log] [blame]
Hadoop HDFS Change Log
Trunk (unreleased changes)
INCOMPATIBLE CHANGES
NEW FEATURES
HDFS-503. This patch implements an optional layer over HDFS that
implements offline erasure-coding. It can be used to reduce the
total storage requirements of HDFS. (dhruba)
HDFS-654. Add support new atomic rename functionality in HDFS for
supporting rename in FileContext. (suresh)
HDFS-222. Support for concatenating of files into a single file
without copying. (Boris Shkolnik via hairong)
IMPROVEMENTS
HDFS-703. Replace current fault injection implementation with one
from (cos)
HDFS-754. Reduce ivy console output to ovservable level (cos)
HDFS-832. HDFS side of HADOOP-6222. (cos)
HDFS-840. Change tests to use FileContext test helper introduced in
HADOOP-6394. (Jitendra Nath Pandey via suresh)
HDFS-685. Use the user-to-groups mapping service in the NameNode. (boryas, acmurthy)
HDFS-755. Read multiple checksum chunks at once in DFSInputStream.
(Todd Lipcon via tomwhite)
HDFS-786. Implement getContentSummary in HftpFileSystem.
(Tsz Wo (Nicholas), SZE via cdouglas)
OPTIMIZATIONS
BUG FIXES
HDFS-695. RaidNode should read in configuration from hdfs-site.xml.
(dhruba)
HDFS-726. Eclipse .classpath template has outdated jar files and is
missing some new ones. (cos)
HDFS-750. Fix build failure due to TestRename. (suresh)
HDFS-712. Move libhdfs from mapreduce subproject to hdfs subproject.
(Eli Collins via dhruba)
HDFS-757. Enable Unit test for HDFS Raid. (dhruba)
HDFS-611. Prevent DataNode heartbeat times from increasing even when
the DataNode has many blocks to delete. (Zheng Shao via dhruba)
HDFS-751. Fix TestCrcCorruption to pick up the correct datablocks to
corrupt. (dhruba)
HDFS-763. Fix slightly misleading report from DataBlockScanner
about corrupted scans. (dhruba)
HDFS-727. bug setting block size hdfsOpenFile (Eli Collins via cos)
HDFS-756. libhdfs unit tests do not run. (Eli Collins via cos)
HDFS-783. libhdfs tests brakes code coverage runs with Clover (cos)
HDFS-785. Add Apache license to several namenode unit tests.
(Ravi Phulari via jghoman)
HDFS-802. Update Eclipse configuration to match changes to Ivy
configuration (Edwin Chan via cos)
HDFS-423. Unbreak FUSE build and fuse_dfs_wrapper.sh (Eli Collins via cos)
HDFS-825. Build fails to pull latest hadoop-core-* artifacts (cos)
HDFS-94. The Heap Size printed in the NameNode WebUI is accurate.
(Dmytro Molkov via dhruba)
HDFS-767. An improved retry policy when the DFSClient is unable to fetch a
block from the datanode. (Ning Zhang via dhruba)
HDFS-187. Initialize secondary namenode http address in TestStartup.
(Todd Lipcon via szetszwo)
Release 0.21.0 - Unreleased
INCOMPATIBLE CHANGES
HDFS-538. Per the contract elucidated in HADOOP-6201, throw
FileNotFoundException from FileSystem::listStatus rather than returning
null. (Jakob Homan via cdouglas)
HDFS-602. DistributedFileSystem mkdirs throws FileAlreadyExistsException
instead of FileNotFoundException. (Boris Shkolnik via suresh)
HDFS-544. Add a "rbw" subdir to DataNode data directory. (hairong)
HDFS-576. Block report includes under-construction replicas. (shv)
HDFS-636. SafeMode counts complete blocks only. (shv)
HDFS-644. Lease recovery, concurrency support. (shv)
HDFS-570. Get last block length from a data-node when opening a file
being written to. (Tsz Wo (Nicholas), SZE via shv)
HDFS-657. Remove unused legacy data-node protocol methods. (shv)
HDFS-658. Block recovery for primary data-node. (shv)
HDFS-660. Remove deprecated methods from InterDatanodeProtocol. (shv)
HDFS-512. Block.equals() and compareTo() compare blocks based
only on block Ids, ignoring generation stamps. (shv)
NEW FEATURES
HDFS-436. Introduce AspectJ framework for HDFS code and tests.
(Konstantin Boudnik via szetszwo)
HDFS-447. Add LDAP lookup to hdfsproxy. (Zhiyong Zhang via cdouglas)
HDFS-459. Introduce Job History Log Analyzer. (shv)
HDFS-461. Tool to analyze file size distribution in HDFS. (shv)
HDFS-492. Add two JSON JSP pages to the Namenode for providing corrupt
blocks/replicas information. (Bill Zeller via szetszwo)
HDFS-578. Add support for new FileSystem method for clients to get server
defaults. (Kan Zhang via suresh)
HDFS-595. umask settings in configuration may now use octal or symbolic
instead of decimal. (Jakob Homan via suresh)
HADOOP-6234. Updated hadoop-core and test jars to propagate new option
dfs.umaskmode in configuration. (Jakob Homan via suresh)
HDFS-235. Add support for byte ranges in HftpFileSystem to serve
range of bytes from a file. (Bill Zeller via suresh)
HDFS-385. Add support for an experimental API that allows a module external
to HDFS to specify how HDFS blocks should be placed. (dhruba)
HADOOP-4952. Update hadoop-core and test jars to propagate new FileContext
file system application interface. (Sanjay Radia via suresh).
HDFS-567. Add block forensics contrib tool to print history of corrupt and
missing blocks from the HDFS logs.
(Bill Zeller, Jitendra Nath Pandey via suresh).
HDFS-610. Support o.a.h.fs.FileContext. (Sanjay Radia via szetszwo)
HDFS-536. Support hflush at DFSClient. (hairong)
HDFS-517. Introduce BlockInfoUnderConstruction to reflect block replica
states while writing. (shv)
HDFS-565. Introduce block committing logic during new block allocation
and file close. (shv)
HDFS-537. DataNode exposes a replica's meta info to BlockReceiver for the
support of dfs writes/hflush. It also updates a replica's bytes received,
bytes on disk, and bytes acked after receiving a packet. (hairong)
HDFS-585. Datanode should serve up to visible length of a replica for read
requests. (szetszwo)
HDFS-604. Block report processing for append. (shv)
HDFS-619. Support replica recovery initialization in datanode for the new
append design. (szetszwo)
HDFS-592. Allow clients to fetch a new generation stamp from NameNode for
pipeline recovery. (hairong)
HDFS-624. Support a new algorithm for pipeline recovery and pipeline setup
for append. (hairong)
HDFS-627. Support replica update in data-node.
(Tsz Wo (Nicholas), SZE and Hairong Kuang via shv)
HDFS-642. Support pipeline close and close error recovery. (hairong)
HDFS-631. Rename configuration keys towards API standardization and
backward compatibility. (Jitendra Nath Pandey via suresh)
HDFS-669. Add unit tests framework (Mockito) (cos, Eli Collins)
HDFS-731. Support new Syncable interface in HDFS. (hairong)
HDFS-702. Add HDFS implementation of AbstractFileSystem.
(Sanjay Radio via suresh)
HDFS-758. Add decommissioning status page to Namenode Web UI.
(Jitendra Nath Pandey via suresh)
HDFS-814. Add an api to get the visible length of a DFSDataInputStream.
(szetszwo)
IMPROVEMENTS
HDFS-381. Remove blocks from DataNode maps when corresponding file
is deleted. (Suresh Srinivas via rangadi)
HDFS-377. Separate codes which implement DataTransferProtocol.
(szetszwo)
HDFS-396. NameNode image and edits directories are specified as URIs.
(Luca Telloli via rangadi)
HDFS-444. Allow to change probability levels dynamically in the fault
injection framework. (Konstantin Boudnik via szetszwo)
HDFS-352. Documentation for saveNamespace command. (Ravi Phulari via shv)
HADOOP-6106. Updated hadoop-core and test jars from hudson trunk
build #12. (Giridharan Kesavan)
HDFS-204. Add a new metrics FilesInGetListingOps to the Namenode.
(Jitendra Nath Pandey via szetszwo)
HDFS-278. HDFS Outputstream close does not hang forever. (dhruba)
HDFS-443. Add a new metrics numExpiredHeartbeats to the Namenode.
(Jitendra Nath Pandey via szetszwo)
HDFS-475. Add new ant targets for fault injection jars and tests.
(Konstantin Boudnik via szetszwo)
HDFS-458. Create a new ant target, run-commit-test. (Jakob Homan
via szetszwo)
HDFS-493. Change build.xml so that the fault-injected tests are executed
only by the run-test-*-fault-inject targets. (Konstantin Boudnik via
szetszwo)
HDFS-446. Improvements to Offline Image Viewer. (Jakob Homan via shv)
HADOOP-6160. Fix releaseaudit target to run on specific directories.
(gkesavan)
HDFS-501. Use enum to define the constants in DataTransferProtocol.
(szetszwo)
HDFS-508. Factor out BlockInfo from BlocksMap. (shv)
HDFS-510. Rename DatanodeBlockInfo to be ReplicaInfo.
(Jakob Homan & Hairong Kuang via shv)
HDFS-500. Deprecate NameNode methods deprecated in NameNodeProtocol.
(Jakob Homan via shv)
HDFS-514. Change DFSClient.namenode from public to private. (Bill Zeller
via szetszwo)
HDFS-496. Use PureJavaCrc32 in HDFS. (Todd Lipcon via szetszwo)
HDFS-511. Remove redundant block searches in BlockManager. (shv)
HDFS-504. Update the modification time of a file when the file
is closed. (Chun Zhang via dhruba)
HDFS-498. Add development guide and documentation for the fault injection
framework. (Konstantin Boudnik via szetszwo)
HDFS-524. Further DataTransferProtocol code refactoring. (szetszwo)
HDFS-529. Use BlockInfo instead of Block to avoid redundant block searches
in BlockManager. (shv)
HDFS-530. Refactor TestFileAppend* to remove code duplication.
(Konstantin Boudnik via szetszwo)
HDFS-451. Add fault injection tests for DataTransferProtocol. (szetszwo)
HDFS-409. Add more access token tests. (Kan Zhang via szetszwo)
HDFS-546. DatanodeDescriptor iterates blocks as BlockInfo. (shv)
HDFS-457. Do not shutdown datanode if some, but not all, volumes fail.
(Boris Shkolnik via szetszwo)
HDFS-548. TestFsck takes nearly 10 minutes to run. (hairong)
HDFS-539. Refactor fault injeciton pipeline test util for future reuse.
(Konstantin Boudnik via szetszwo)
HDFS-552. Change TestFiDataTransferProtocol to junit 4 and add a few new
tests. (szetszwo)
HDFS-563. Simplify the codes in FSNamesystem.getBlockLocations(..).
(szetszwo)
HDFS-581. Introduce an iterator over blocks in the block report array.(shv)
HDFS-549. Add a new target, run-with-fault-inject-testcaseonly, which
allows an execution of non-FI tests in FI-enable environment. (Konstantin
Boudnik via szetszwo)
HDFS-173. Namenode will not block until a large directory deletion
completes. It allows other operations when the deletion is in progress.
(suresh)
HDFS-551. Create new functional test for a block report. (Konstantin
Boudnik via hairong)
HDFS-288. Redundant computation in hashCode() implementation.
(szetszwo via tomwhite)
HDFS-412. Hadoop JMX usage makes Nagios monitoring impossible.
(Brian Bockelman via tomwhite)
HDFS-472. Update hdfsproxy documentation. Adds a setup guide and design
document. (Zhiyong Zhang via cdouglas)
HDFS-617. Support non-recursive create(). (Kan Zhang via szetszwo)
HDFS-618. Support non-recursive mkdir(). (Kan Zhang via szetszwo)
HDFS-574. Split the documentation between the subprojects.
(Corinne Chandel via omalley)
HDFS-598. Eclipse launch task for HDFS. (Eli Collins via tomwhite)
HDFS-641. Move all of the components that depend on map/reduce to
map/reduce. (omalley)
HDFS-509. Redesign DataNode volumeMap to include all types of Replicas.
(hairong)
HDFS-562. Add a test for NameNode.getBlockLocations(..) to check read from
un-closed file. (szetszwo)
HDFS-543. Break FSDatasetInterface#writToBlock() into writeToRemporary,
writeToRBW, ad append. (hairong)
HDFS-603. Add a new interface, Replica, which is going to replace the use
of Block in datanode. (szetszwo)
HDFS-589. Change block write protocol to support pipeline recovery.
(hairong)
HDFS-652. Replace BlockInfo.isUnderConstruction() with isComplete() (shv)
HDFS-648. Change some methods in AppendTestUtil to public. (Konstantin
Boudnik via szetszwo)
HDFS-662. Unnecessary info message from DFSClient. (hairong)
HDFS-518. Create new tests for Append's hflush. (Konstantin Boudnik
via szetszwo)
HDFS-688. Add configuration resources to DFSAdmin. (shv)
HDFS-29. Validate the consistency of the lengths of replica and its file
in replica recovery. (szetszwo)
HDFS-680. Add new access method to a copy of a block's replica. (shv)
HDFS-704. Unify build property names to facilitate cross-projects
modifications (cos)
HDFS-705. Create an adapter to access some of package-private methods of
DataNode from tests (cos)
HDFS-710. Add actions with constraints to the pipeline fault injection
tests and change SleepAction to support uniform random sleeping over an
interval. (szetszwo)
HDFS-713. Need to properly check the type of the test class from an aspect
(cos)
HDFS-716. Define a pointcut for pipeline close and add a few fault
injection tests to simulate out of memory problem. (szetszwo)
HDFS-719. Add 6 fault injection tests for pipeline close to simulate slow
datanodes and disk errors. (szetszwo)
HDFS-616. Create functional tests for new design of the block report. (cos)
HDFS-584. Fail the fault-inject build if any advices are mis-bound. (cos)
HDFS-730. Add 4 fault injection tests to simulate non-responsive datanode
and out-of-memory problem for pipeline close ack. (szetszwo)
HDFS-728. Create a comprehensive functional test for append. (hairong)
HDFS-736. commitBlockSynchronization() updates block GS and length
in-place. (shv)
HADOOP-5107. Use Maven ant tasks to publish the subproject jars.
(Giridharan Kesavan via omalley)
HDFS-521. Create new tests for pipeline (cos)
HDFS-764. Places the Block Access token implementation in hdfs project.
(Kan Zhang via ddas)
HDFS-787. Upgrade some libraries to be consistent with common and
mapreduce. (omalley)
HDFS-519. Create new tests for lease recovery (cos)
HDFS-804. New unit tests for concurrent lease recovery (cos)
HDFS-813. Enable the append test in TestReadWhileWriting. (szetszwo)
BUG FIXES
HDFS-76. Better error message to users when commands fail because of
lack of quota. Allow quota to be set even if the limit is lower than
current consumption. (Boris Shkolnik via rangadi)
HADOOP-4687. HDFS is split from Hadoop Core. It is a subproject under
Hadoop (Owen O'Malley)
HADOOP-6096. Fix Eclipse project and classpath files following project
split. (tomwhite)
HDFS-195. Handle expired tokens when write pipeline is reestablished.
(Kan Zhang via rangadi)
HDFS-181. Validate src path in FSNamesystem.getFileInfo(..). (Todd
Lipcon via szetszwo)
HDFS-441. Remove TestFTPFileSystem. (szetszwo)
HDFS-440. Fix javadoc broken links in DFSClient. (szetszwo)
HDFS-480. Fix a typo in the jar name in build.xml.
(Konstantin Shvachko via gkesavan)
HDFS-438. Check for NULL before invoking GenericArgumentParser in
DataNode. (Raghu Angadi)
HDFS-415. BlockReceiver hangs in case of certain runtime exceptions.
(Konstantin Boudnik via rangadi)
HDFS-462. loadFSImage should close edits file. (Jakob Homan via shv)
HDFS-489. Update TestHDFSCLI for the -skipTrash option in rm. (Jakob Homan
via szetszwo)
HDFS-445. pread() does not pick up changes to block locations.
(Kan Zhang via rangadi)
HDFS-463. CreateEditLog utility broken after HDFS-396 (URI for
FSImage). (Suresh Srinivas via rangadi)
HDFS-484. Fix bin-package and package target to package jar files.
(gkesavan)
HDFS-490. Eliminate the deprecated warnings introduced by H-5438.
(He Yongqiang via szetszwo)
HDFS-119. Fix a bug in logSync(), which causes NameNode block forever.
(Suresh Srinivas via shv)
HDFS-534. Include avro in ivy. (szetszwo)
HDFS-532. Allow applications to know that a read request failed
because block is missing. (dhruba)
HDFS-561. Fix write pipeline READ_TIMEOUT in DataTransferProtocol.
(Kan Zhang via szetszwo)
HDFS-553. BlockSender reports wrong failed position in ChecksumException.
(hairong)
HDFS-568. Set mapred.job.tracker.retire.jobs to false in
src/test/mapred-site.xml for mapreduce tests to run. (Amareshwari
Sriramadasu via szetszwo)
HDFS-15. All replicas end up on 1 rack. (Jitendra Nath Pandey via hairong)
HDFS-586. TestBlocksWithNotEnoughRacks sometimes fails.
(Jitendra Nath Pandey via hairong)
HADOOP-6243. Fixed a NullPointerException in handling deprecated keys.
(Sreekanth Ramakrishnan via yhemanth)
HDFS-605. Do not run fault injection tests in the run-test-hdfs-with-mr
target. (Konstantin Boudnik via szetszwo)
HDFS-606. Fix ConcurrentModificationException in invalidateCorruptReplicas()
(shv)
HDFS-601. TestBlockReport obtains data directories directly from
MiniHDFSCluster. (Konstantin Boudnik via shv)
HDFS-614. TestDatanodeBlockScanner obtains data directories directly from
MiniHDFSCluster. (shv)
HDFS-612. Remove the use of org.mortbay.log.Log in FSDataset. (szetszwo)
HDFS-622. checkMinReplication should count live nodes only. (shv)
HDFS-629. Remove ReplicationTargetChooser.java along with fixing
import warnings generated by Eclipse. (dhruba)
HDFS-637. DataNode sends a Success ack when block write fails. (hairong)
HDFS-640. Fixed TestHDFSFileContextMainOperations.java build failure. (suresh)
HDFS-547. TestHDFSFileSystemContract#testOutputStreamClosedTwice
sometimes fails with CloseByInterruptException. (hairong)
HDFS-588. Fix TestFiDataTransferProtocol and TestAppend2 failures. (shv)
HDFS-550. DataNode restarts may introduce corrupt/duplicated/lost replicas
when handling detached replicas. (hairong)
HDFS-659. If the the last block is not complete, update its length with
one of its replica's length stored in datanode. (szetszwo)
HDFS-649. Check null pointers for DataTransferTest. (Konstantin Boudnik
via szetszwo)
HDFS-661. DataNode upgrade fails on non-existant current directory.
(hairong)
HDFS-597. Mofication introduced by HDFS-537 breakes an advice binding in
FSDatasetAspects. (Konstantin Boudnik via szetszwo)
HDFS-665. TestFileAppend2 sometimes hangs. (hairong)
HDFS-676. Fix NPE in FSDataset.updateReplicaUnderRecovery() (shv)
HDFS-673. BlockReceiver#PacketResponder should not remove a packet from
the ack queue before its ack is sent. (hairong)
HDFS-682. Fix bugs in TestBlockUnderConstruction. (szetszwo)
HDFS-668. TestFileAppend3#TC7 sometimes hangs. (hairong)
HDFS-679. Appending to a partial chunk incorrectly assumes the
first packet fills up the partial chunk. (hairong)
HDFS-722. Fix callCreateBlockWriteStream pointcut in FSDatasetAspects.
(szetszwo)
HDFS-690. TestAppend2#testComplexAppend failed on "Too many open files".
(hairong)
HDFS-725. Support the build error fix for HADOOP-6327. (Sanjay Radia via
szetszwo)
HDFS-625. Fix NullPointerException thrown from ListPathServlet. (suresh)
HDFS-735. TestReadWhileWriting has wrong line termination symbols (cos)
HDFS-691. Fix an overflow error in DFSClient.DFSInputStream.available().
(szetszwo)
HDFS-733. TestBlockReport fails intermittently. (cos)
HDFS-774. Intermittent race condition in TestFiPipelines (cos)
HDFS-741. TestHFlush test doesn't seek() past previously written part of
the file (cos, szetszwo)
HDFS-706. Intermittent failures in TestFiHFlush (cos)
HDFS-646. Fix test-patch failure by adding test-contrib ant target.
(gkesavan)
HDFS-791. Build is broken after HDFS-787 patch has been applied (cos)
HDFS-792. TestHDFSCLI is failing. (Todd Lipcon via cos)
HDFS-781. Namenode metrics PendingDeletionBlocks is not decremented.
(Suresh)
HDFS-192. Fix TestBackupNode failures. (shv)
HDFS-797. TestHDFSCLI much slower after HDFS-265 merge. (Todd Lipcon via cos)
HDFS-824. Stop lease checker in TestReadWhileWriting. (szetszwo)
HDFS-823. CheckPointer should use addInternalServlet for image-fetching
servlet (jghoman)
HDFS-456. Fix URI generation for windows file paths. (shv)
HDFS-812. FSNamesystem#internalReleaseLease throws NullPointerException on
a single-block file's lease recovery. (cos)
HDFS-724. Pipeline hangs if one of the block receiver is not responsive.
(hairong)
HDFS-564. Adding pipeline tests 17-35. (hairong)
HDFS-849. TestFiDataTransferProtocol2#pipeline_Fi_18 sometimes fails.
(hairong)
HDFS-762. Balancer causes Null Pointer Exception.
(Cristian Ivascu via dhruba)
HDFS-868. Fix link to Hadoop Upgrade Wiki. (Chris A. Mattmann via shv)
Release 0.20.2 - Unreleased
IMPROVEMENTS
HDFS-127. Reset failure count in DFSClient for each block acquiring
operation. (Igor Bolotin via szetszwo)
HDFS-737. Add full path name of the file to the block information and
summary of total number of files, blocks, live and deadnodes to
metasave output. (Jitendra Nath Pandey via suresh)
BUG FIXES
HDFS-686. NullPointerException is thrown while merging edit log and image.
(hairong)
HDFS-677. Rename failure when both source and destination quota exceeds
results in deletion of source. (suresh)
HDFS-709. Fix TestDFSShell failure due to rename bug introduced by
HDFS-677. (suresh)
HDFS-579. Fix DfsTask to follow the semantics of 0.19, regarding non-zero
return values as failures. (Christian Kunz via cdouglas)
HDFS-723. Fix deadlock in DFSClient#DFSOutputStream. (hairong)
HDFS-596. Fix memory leak in hdfsFreeFileInfo() for libhdfs.
(Zhang Bingjun via dhruba)
HDFS-793. Data node should receive the whole packet ack message before it
constructs and sends its own ack message for the packet. (hairong)
HDFS-185. Disallow chown, chgrp, chmod, setQuota, and setSpaceQuota when
name-node is in safemode. (Ravi Phulari via shv)
HDFS-101. DFS write pipeline: DFSClient sometimes does not detect second
datanode failure. (hairong)
Release 0.20.1 - 2009-09-01
IMPROVEMENTS
HDFS-438. Improve help message for space quota command. (Raghu Angadi)
BUG FIXES
HDFS-167. Fix a bug in DFSClient that caused infinite retries on write.
(Bill Zeller via szetszwo)
HDFS-527. Remove/deprecate unnecessary DFSClient constructors. (szetszwo)
HDFS-525. The SimpleDateFormat object in ListPathsServlet is not thread
safe. (Suresh Srinivas and cdouglas)
HDFS-761. Fix failure to process rename operation from edits log due to
quota verification. (suresh)