| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
| <html><head> |
| <meta http-equiv="content-type" content="text/html; charset=UTF-8"> |
| <title>Hadoop 0.18.3 Release Notes</title></head><body> |
| <font face="sans-serif"> |
| |
| <h1>Hadoop 0.18.3 Release Notes</h1> |
| Hadoop 0.18.3 fixes several problems that may lead to data loss |
| from the file system. Important changes were made to lease recovery and the management of |
| block replicas. The bug fixes are listed below. |
| <ul> |
| <h2>Changes Since Hadoop 0.18.2</h2> |
| <ul> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4823'>HADOOP-4823</a>] - Should not use java.util.NavigableMap in 0.18</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4824'>HADOOP-4824</a>] - Should not use File.setWritable(..) in 0.18</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-1980'>HADOOP-1980</a>] - 'dfsadmin -safemode enter' should prevent the namenode from leaving safemode automatically after startup</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-3121'>HADOOP-3121</a>] - dfs -lsr fail with "Could not get listing "</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-3883'>HADOOP-3883</a>] - TestFileCreation fails once in a while</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4061'>HADOOP-4061</a>] - Large number of decommission freezes the Namenode</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4257'>HADOOP-4257</a>] - TestLeaseRecovery2.testBlockSynchronization failing.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4499'>HADOOP-4499</a>] - DFSClient should invoke checksumOk only once.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4542'>HADOOP-4542</a>] - Fault in TestDistributedUpgrade</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4556'>HADOOP-4556</a>] - Block went missing</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4597'>HADOOP-4597</a>] - Under-replicated blocks are not calculated if the name-node is forced out of safe-mode.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4610'>HADOOP-4610</a>] - Always calculate mis-replicated blocks when safe-mode is turned off.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4613'>HADOOP-4613</a>] - browseBlock.jsp does not generate "genstamp" property.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4614'>HADOOP-4614</a>] - "Too many open files" error while processing a large gzip file</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4616'>HADOOP-4616</a>] - assertion makes fuse-dfs exit when reading incomplete data</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4620'>HADOOP-4620</a>] - Streaming mapper never completes if the mapper does not write to stdout</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4635'>HADOOP-4635</a>] - Memory leak ?</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4643'>HADOOP-4643</a>] - NameNode should exclude excessive replicas when counting live replicas for a block</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4647'>HADOOP-4647</a>] - NamenodeFsck creates a new DFSClient but never closes it</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4654'>HADOOP-4654</a>] - remove temporary output directory of failed tasks</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4659'>HADOOP-4659</a>] - Root cause of connection failure is being lost to code that uses it for delaying startup</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4679'>HADOOP-4679</a>] - Datanode prints tons of log messages: Waiting for threadgroup to exit, active threads is XX</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4702'>HADOOP-4702</a>] - Failed block replication leaves an incomplete block in receiver's tmp data directory</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4703'>HADOOP-4703</a>] - DataNode.createInterDataNodeProtocolProxy should not wait for proxy forever while recovering lease</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4713'>HADOOP-4713</a>] - librecordio does not scale to large records</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4714'>HADOOP-4714</a>] - map tasks timing out during merge phase</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4717'>HADOOP-4717</a>] - Removal of default port# in NameNode.getUri() cause a map/reduce job failed to prompt temporary output</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4726'>HADOOP-4726</a>] - documentation typos: "the the"</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4734'>HADOOP-4734</a>] - Some lease recovery codes in 0.19 or trunk should also be committed in 0.18.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4742'>HADOOP-4742</a>] - Mistake delete replica in hadoop 0.18.1</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4746'>HADOOP-4746</a>] - Job output directory should be normalized</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4778'>HADOOP-4778</a>] - Check for zero size block meta file when updating a block.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4795'>HADOOP-4795</a>] - Lease monitor may get into an infinite loop</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4797'>HADOOP-4797</a>] - RPC Server can leave a lot of direct buffers </li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4806'>HADOOP-4806</a>] - HDFS rename does not work correctly if src contains Java regular expression special characters</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4810'>HADOOP-4810</a>] - Data lost at cluster startup time</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4822'>HADOOP-4822</a>] - 0.18 cannot be compiled in Java 5.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4840'>HADOOP-4840</a>] - TestNodeCount sometimes fails with NullPointerException</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4904'>HADOOP-4904</a>] - Deadlock while leaving safe mode.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4910'>HADOOP-4910</a>] - NameNode should exclude corrupt replicas when choosing excessive replicas to delete</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4919'>HADOOP-4919</a>] - [HOD] Provide execute access to JT history directory path for group</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4924'>HADOOP-4924</a>] - Race condition in re-init of TaskTracker</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4935'>HADOOP-4935</a>] - Manual leaving of safe mode may lead to data loss</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4951'>HADOOP-4951</a>] - Lease monitor does not own the LeaseManager lock in changing leases.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4961'>HADOOP-4961</a>] - ConcurrentModificationException in lease recovery of empty files.</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4971'>HADOOP-4971</a>] - Block report times from datanodes could converge to same time. </li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4983'>HADOOP-4983</a>] - Job counters sometimes go down as tasks run without task failures</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-4997'>HADOOP-4997</a>] - workaround for tmp file handling on DataNodes in 0.18 (HADOOP-4663)</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-5077'>HADOOP-5077</a>] - JavaDoc errors in 0.18.3</li> |
| <li>[<a href='https://issues.apache.org/jira/browse/HADOOP-3780'>HADOOP-3780</a>] - JobTracker should synchronously resolve the tasktracker's network location when the tracker registers</li> |
| </ul> |
| </ul> |
| |
| <h1>Hadoop 0.18.2 Release Notes</h1> |
| The bug fixes are listed below. |
| <ul> |
| <h2>Changes Since Hadoop 0.18.1</h2> |
| <ul> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-2421">HADOOP-2421</a>] - Release JDiff report of changes between different versions of Hadoop.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3217">HADOOP-3217</a>] - [HOD] Be less aggressive when querying job status from resource manager.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3614">HADOOP-3614</a>] - TestLeaseRecovery fails when run with assertions enabled.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3786">HADOOP-3786</a>] - Changes in HOD documentation.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3914">HADOOP-3914</a>] - checksumOk implementation in DFSClient can break applications.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4116">HADOOP-4116</a>] - Balancer should provide better resource management.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4228">HADOOP-4228</a>] - dfs datanode metrics, bytes_read, bytes_written overflows due to incorrect type used.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4271">HADOOP-4271</a>] - Bug in FSInputChecker makes it possible to read from an invalid buffer.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4277">HADOOP-4277</a>] - Checksum verification is disabled for LocalFS.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4292">HADOOP-4292</a>] - append() does not work for LocalFileSystem.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4314">HADOOP-4314</a>] - TestReplication fails quite often.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4326">HADOOP-4326</a>] - ChecksumFileSystem does not override all create(...) methods.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4340">HADOOP-4340</a>] - "hadoop jar" always returns exit code 0 (success) to the shell when jar throws a fatal exception.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4351">HADOOP-4351</a>] - ArrayIndexOutOfBoundsException during fsck.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4369">HADOOP-4369</a>] - Metric Averages are not averages.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4395">HADOOP-4395</a>] - Reloading FSImage and FSEditLog may erase user and group information.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4399">HADOOP-4399</a>] - fuse-dfs per FD context is not thread safe and can cause segfaults and corruptions.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4403">HADOOP-4403</a>] - TestLeaseRecovery.testBlockSynchronization failed on trunk.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4407">HADOOP-4407</a>] - HADOOP-4395 should use a Java 1.5 API for 0.18.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4467">HADOOP-4467</a>] - SerializationFactory should use current context ClassLoader.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4469">HADOOP-4469</a>] - ant jar file not being included in tar distribution.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4483">HADOOP-4483</a>] - getBlockArray in DatanodeDescriptor does not honor passed in maxblocks value.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4517">HADOOP-4517</a>] - unstable dfs when running jobs on 0.18.1.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4526">HADOOP-4526</a>] - fsck failing with NullPointerException (return value 0).</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4533">HADOOP-4533</a>] - HDFS client of hadoop 0.18.1 and HDFS server 0.18.2 (0.18 branch) not compatible.</li> |
| </ul> |
| </ul> |
| |
| <h1>Hadoop 0.18.1 Release Notes</h1> |
| The bug fixes are listed below. |
| <ul> |
| <h2>Changes Since Hadoop 0.18.0</h2> |
| <ul> |
| <li><a name="changes">[</a><a href="https://issues.apache.org/jira/browse/HADOOP-4040">HADOOP-4040</a>] - Remove the hardcoded ipc.client.connection.maxidletime setting from the TaskTracker.Child.main().</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3934">HADOOP-3934</a>] - Update log4j from 1.2.13 to 1.2.15.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3995">HADOOP-3995</a>] - renameTo(src, dst) does not restore src name in case of quota failure.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4046">HADOOP-4046</a>] - WritableComparator's constructor should be protected instead of private.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3821">HADOOP-3821</a>] |
| - SequenceFile's Reader.decompressorPool or Writer.decompressorPool |
| gets into an inconsistent state when calling close() more than once.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-3940">HADOOP-3940</a>] - Reduce often attempts in memory merge with no work.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4161">HADOOP-4161</a>] - [HOD] Uncaught exceptions can potentially hang hod-client.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4060">HADOOP-4060</a>] - [HOD] Make HOD to roll log files on the client.</li> |
| <li>[<a href="https://issues.apache.org/jira/browse/HADOOP-4145">HADOOP-4145</a>] - [HOD] Support an accounting plugin script for HOD.</li> |
| </ul> |
| </ul> |
| |
| |
| <h1>Hadoop 0.18.0 Release Notes</h1> |
| These release notes include new developer and user facing incompatibilities, features, and major improvements. |
| The table below is sorted by Component. |
| <ul> |
| <h2>Changes Since Hadoop 0.17.2</h2> |
| <ul> |
| <table width="100%" border="1" cellpadding="4"> |
| <tbody><tr> |
| <td><b>Issue</b></td> |
| <td><b>Component</b></td> |
| <td><b>Notes</b></td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3355">HADOOP-3355</a></td> |
| <td>conf</td> |
| <td>Added support for hexadecimal values in |
| Configuration</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-1702">HADOOP-1702</a></td> |
| <td>dfs</td> |
| <td>Reduced buffer copies as data is written to HDFS. |
| The order of sending data bytes and control information has changed, but this |
| will not be observed by client applications.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2065">HADOOP-2065</a></td> |
| <td>dfs</td> |
| <td>Added "corrupt" flag to LocatedBlock to |
| indicate that all replicas of the block are thought to be corrupt.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2585">HADOOP-2585</a></td> |
| <td>dfs</td> |
| <td>Improved management of replicas of the name space |
| image. If all replicas on the Name Node are lost, the latest check point can |
| be loaded from the secondary Name Node. Use parameter |
| "-importCheckpoint" and specify the location with "fs.checkpoint.dir". |
| The directory structure on the secondary Name Node has changed to match the |
| primary Name Node.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2656">HADOOP-2656</a></td> |
| <td>dfs</td> |
| <td>Associated a generation stamp with each block. On |
| data nodes, the generation stamp is stored as part of the file name of the |
| block's meta-data file.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2703">HADOOP-2703</a></td> |
| <td>dfs</td> |
| <td>Changed fsck to ignore files opened for writing. |
| Introduced new option "-openforwrite" to explicitly show open |
| files.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2797">HADOOP-2797</a></td> |
| <td>dfs</td> |
| <td>Withdrew the upgrade-to-CRC facility. HDFS will no |
| longer support upgrades from versions without CRCs for block data. Users |
| upgrading from version 0.13 or earlier must first upgrade to an intermediate |
| (0.14, 0.15, 0.16, 0.17) version before doing upgrade to version 0.18 or |
| later.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2865">HADOOP-2865</a></td> |
| <td>dfs</td> |
| <td>Changed the output of the "fs -ls" command |
| to more closely match familiar Linux format. Additional changes were made by |
| HADOOP-3459. Applications that parse the command output should be reviewed.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3035">HADOOP-3035</a></td> |
| <td>dfs</td> |
| <td>Changed protocol for transferring blocks between |
| data nodes to report corrupt blocks to data node for re-replication from a |
| good replica.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3113">HADOOP-3113</a></td> |
| <td>dfs</td> |
| <td>Added sync() method to FSDataOutputStream to reliably |
| persist data in HDFS. InterDatanodeProtocol is used to implement this feature.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3164">HADOOP-3164</a></td> |
| <td>dfs</td> |
| <td>Changed data node to use FileChannel.transferTo() to |
| transfer block data. <br> |
| </td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3177">HADOOP-3177</a></td> |
| <td>dfs</td> |
| <td>Added a new public interface Syncable which declares |
| the sync() operation. FSDataOutputStream implements Syncable. If the |
| wrappedStream in FSDataOutputStream is Syncable, calling |
| FSDataOutputStream.sync() is equivalent to calling wrappedStream.sync(). Otherwise, |
| FSDataOutputStream.sync() is a no-op. Both DistributedFileSystem and |
| LocalFileSystem support the sync() operation.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3187">HADOOP-3187</a></td> |
| <td>dfs</td> |
| <td>Introduced directory quota as hard limits on the |
| number of names in the tree rooted at that directory. An administrator may |
| set quotas on individual directories explicitly. Newly created directories |
| have no associated quota. File/directory creations fail if the quota would |
| be exceeded. An attempt to set a quota fails if the directory would be in |
| violation of the new quota.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3193">HADOOP-3193</a></td> |
| <td>dfs</td> |
| <td>Added reporter to FSNamesystem stateChangeLog, and a |
| new metric to track the number of corrupted replicas.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3232">HADOOP-3232</a></td> |
| <td>dfs</td> |
| <td>Changed 'du' command to run in a separate thread so |
| that it does not block user.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3310">HADOOP-3310</a></td> |
| <td>dfs</td> |
| <td>Implemented Lease Recovery to sync the last block of |
| a file. Added ClientDatanodeProtocol for client-triggered block recovery. |
| Changed DatanodeProtocol to support block synchronization. Changed |
| InterDatanodeProtocol to support block update.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3317">HADOOP-3317</a></td> |
| <td>dfs</td> |
| <td>Changed the default port for "hdfs:" URIs |
| to be 8020, so that one may simply use URIs of the form |
| "hdfs://example.com/dir/file".</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3329">HADOOP-3329</a></td> |
| <td>dfs</td> |
| <td>Changed format of file system image to not store |
| locations of last block.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3336">HADOOP-3336</a></td> |
| <td>dfs</td> |
| <td>Added a log4j appender that emits events from |
| FSNamesystem for audit logging</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3339">HADOOP-3339</a></td> |
| <td>dfs</td> |
| <td>Improved failure handling of last Data Node in write |
| pipeline. <br> |
| </td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3390">HADOOP-3390</a></td> |
| <td>dfs</td> |
| <td>Removed deprecated |
| ClientProtocol.abandonFileInProgress().</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3452">HADOOP-3452</a></td> |
| <td>dfs</td> |
| <td>Changed exit status of fsck to report whether the |
| file system is healthy or corrupt.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3459">HADOOP-3459</a></td> |
| <td>dfs</td> |
| <td>Changed the output of the "fs -ls" command |
| to more closely match familiar Linux format. Applications that parse the |
| command output should be reviewed.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3486">HADOOP-3486</a></td> |
| <td>dfs</td> |
| <td>Changed the default value of |
| dfs.blockreport.initialDelay to be 0 seconds.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3677">HADOOP-3677</a></td> |
| <td>dfs</td> |
| <td>Simplified generation stamp upgrade by making it a |
| local upgrade on datanodes. Deleted distributed upgrade.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2188">HADOOP-2188</a></td> |
| <td>dfs <br> |
| ipc</td> |
| <td>Replaced timeouts with pings to check that client |
| connection is alive. Removed the property ipc.client.timeout from the default |
| Hadoop configuration. Removed the metric RpcOpsDiscardedOPsNum. <br> |
| </td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3283">HADOOP-3283</a></td> |
| <td>dfs <br> |
| ipc</td> |
| <td>Added an IPC server in DataNode and a new IPC |
| protocol InterDatanodeProtocol. Added conf properties |
| dfs.datanode.ipc.address and dfs.datanode.handler.count with defaults |
| "0.0.0.0:50020" and 3, respectively. <br> |
| Changed the serialization in DatanodeRegistration |
| and DatanodeInfo, and therefore, updated the versionID in ClientProtocol, |
| DatanodeProtocol, NamenodeProtocol.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3058">HADOOP-3058</a></td> |
| <td>dfs <br> |
| metrics</td> |
| <td>Added FSNamesystem status metrics.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3683">HADOOP-3683</a></td> |
| <td>dfs <br> |
| metrics</td> |
| <td>Changed FileListed to getNumGetListingOps and added |
| CreateFileOps, DeleteFileOps and AddBlockOps metrics.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3265">HADOOP-3265</a></td> |
| <td>fs</td> |
| <td>Removed deprecated API getFileCacheHints</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3307">HADOOP-3307</a></td> |
| <td>fs</td> |
| <td>Introduced archive feature to Hadoop. A Map/Reduce |
| job can be run to create an archive with indexes. A FileSystem abstraction is |
| provided over the archive.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-930">HADOOP-930</a></td> |
| <td>fs</td> |
| <td>Added support for reading and writing native S3 |
| files. Native S3 files are referenced using s3n URIs. See |
| http://wiki.apache.org/hadoop/AmazonS3 for more details.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3095">HADOOP-3095</a></td> |
| <td>fs <br> |
| fs/s3</td> |
| <td>Added overloaded method |
| getFileBlockLocations(FileStatus, long, long). This is an incompatible change |
| for FileSystem implementations which override getFileBlockLocations(Path, |
| long, long). They should have the signature of this method changed to |
| getFileBlockLocations(FileStatus, long, long) to work correctly.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-4">HADOOP-4</a></td> |
| <td>fuse-dfs</td> |
| <td>Introduced FUSE module for HDFS. Module allows mount |
| of HDFS as a Unix filesystem, and optionally the export of that mount point |
| to other machines. Writes are disabled. rmdir, mv, mkdir, rm are supported, |
| but not cp, touch, and the like. Usage information is attached to the Jira |
| record. <br> |
| <br> |
| </td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3184">HADOOP-3184</a></td> |
| <td>hod</td> |
| <td>Modified HOD to handle master (NameNode or |
| JobTracker) failures on bad nodes by trying to bring them up on another node |
| in the ring. Introduced new property ringmaster.max-master-failures to |
| specify the maximum number of times a master is allowed to fail.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3266">HADOOP-3266</a></td> |
| <td>hod</td> |
| <td>Moved HOD change items from CHANGES.txt to a new |
| file src/contrib/hod/CHANGES.txt.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3376">HADOOP-3376</a></td> |
| <td>hod</td> |
| <td>Modified HOD client to look for specific messages |
| related to resource limit overruns and take appropriate actions - such as |
| either failing to allocate the cluster, or issuing a warning to the user. A |
| tool is provided, specific to Maui and Torque, that will set these specific |
| messages.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3464">HADOOP-3464</a></td> |
| <td>hod</td> |
| <td>Implemented a mechanism to transfer HOD errors that |
| occur on compute nodes to the submit node running the HOD client, so users |
| have good feedback on why an allocation failed.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3483">HADOOP-3483</a></td> |
| <td>hod</td> |
| <td>Modified HOD to create a cluster directory if one |
| does not exist and to auto-deallocate a cluster while reallocating it, if it |
| is already dead.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3564">HADOOP-3564</a></td> |
| <td>hod</td> |
| <td>Modified HOD to generate the dfs.datanode.ipc.address |
| parameter in the hadoop-site.xml of datanodes that it launches.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3610">HADOOP-3610</a></td> |
| <td>hod</td> |
| <td>Modified HOD to automatically create a cluster |
| directory if the one specified with the script command does not exist.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3703">HADOOP-3703</a></td> |
| <td>hod</td> |
| <td>Modified logcondense.py to use the new format of |
| hadoop dfs -lsr output. This version of logcondense would not work with |
| previous versions of Hadoop and hence is incompatible.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3061">HADOOP-3061</a></td> |
| <td>io</td> |
| <td>Introduced ByteWritable and DoubleWritable |
| (implementing WritableComparable) implementations for Byte and Double.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3299">HADOOP-3299</a></td> |
| <td>io <br> |
| mapred</td> |
| <td>Changed the TextInputFormat and KeyValueTextInputFormat |
| classes to initialize the compressionCodecs member variable before |
| dereferencing it.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2909">HADOOP-2909</a></td> |
| <td>ipc</td> |
| <td>Removed property ipc.client.maxidletime from the |
| default configuration. The allowed idle time is twice |
| ipc.client.connection.maxidletime. <br> |
| </td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3569">HADOOP-3569</a></td> |
| <td>KFS</td> |
| <td>Fixed KFS to have read() read and return 1 byte |
| instead of 4.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-1915">HADOOP-1915</a></td> |
| <td>mapred</td> |
| <td>Provided a new method to update counters. |
| "incrCounter(String group, String counter, long amount)"</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2019">HADOOP-2019</a></td> |
| <td>mapred</td> |
| <td>Added support for .tar, .tgz and .tar.gz files in |
| DistributedCache. File sizes are limited to 2GB.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2095">HADOOP-2095</a></td> |
| <td>mapred</td> |
| <td>Reduced in-memory copies of keys and values as they |
| flow through the Map-Reduce framework. Changed the storage of intermediate |
| map outputs to use new IFile instead of SequenceFile for better compression.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2132">HADOOP-2132</a></td> |
| <td>mapred</td> |
| <td>Changed "job -kill" to only allow a job |
| that is in the RUNNING or PREP state to be killed.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2181">HADOOP-2181</a></td> |
| <td>mapred</td> |
| <td>Added logging for input splits in job tracker log |
| and job history log. Added web UI for viewing input splits in the job UI and |
| history UI.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-236">HADOOP-236</a></td> |
| <td>mapred</td> |
| <td>Changed connection protocol between job tracker and task |
| tracker so that task tracker will not connect to a job tracker with a |
| different build version.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2427">HADOOP-2427</a></td> |
| <td>mapred</td> |
| <td>The current working directory of a task, i.e. |
| ${mapred.local.dir}/taskTracker/jobcache/&lt;jobid&gt;/&lt;task_dir&gt;/work |
| is cleaned up as soon as the task is finished.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-2867">HADOOP-2867</a></td> |
| <td>mapred</td> |
| <td>Added task's cwd to its LD_LIBRARY_PATH.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3135">HADOOP-3135</a></td> |
| <td>mapred</td> |
| <td>Changed job submission protocol to not allow |
| submission if the client's value of mapred.system.dir does not match the job |
| tracker's. Deprecated JobConf.getSystemDir(); use JobClient.getSystemDir().</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3221">HADOOP-3221</a></td> |
| <td>mapred</td> |
| <td>Added org.apache.hadoop.mapred.lib.NLineInputFormat, |
| which splits N lines of input as one split. N can be specified by |
| configuration property "mapred.line.input.format.linespermap", |
| which defaults to 1.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3226">HADOOP-3226</a></td> |
| <td>mapred</td> |
| <td>Changed policy for running combiner. The combiner |
| may be run multiple times as the map's output is sorted and merged. |
| Additionally, it may be run on the reduce side as data is merged. The old |
| semantics are available in Hadoop 0.18 if the user calls: <br> |
| job.setCombineOnlyOnce(true); <br> |
| </td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3326">HADOOP-3326</a></td> |
| <td>mapred</td> |
| <td>Changed fetchOutputs() so that LocalFSMerger and |
| InMemFSMergeThread threads are spawned only once. The thread gets notified |
| when something is ready for merge. The merge happens when thresholds are met.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3366">HADOOP-3366</a></td> |
| <td>mapred</td> |
| <td>Improved shuffle so that all fetched map outputs are |
| kept in memory before being merged. The shuffle is stalled so that the |
| in-memory merge can execute and free up memory for the shuffle.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3405">HADOOP-3405</a></td> |
| <td>mapred</td> |
| <td>Refactored previously public classes MapTaskStatus, |
| ReduceTaskStatus, JobSubmissionProtocol, CompletedJobStatusStore to be |
| package local.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3417">HADOOP-3417</a></td> |
| <td>mapred</td> |
| <td>Removed the public class |
| org.apache.hadoop.mapred.JobShell. <br> |
| Command line options -libjars, -files and -archives are moved to |
| GenericCommands. Thus applications have to implement |
| org.apache.hadoop.util.Tool to use the options.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3427">HADOOP-3427</a></td> |
| <td>mapred</td> |
| <td>Changed shuffle scheduler policy to wait for |
| notifications from shuffle threads before scheduling more.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3460">HADOOP-3460</a></td> |
| <td>mapred</td> |
| <td>Created SequenceFileAsBinaryOutputFormat to write |
| raw bytes as keys and values to a SequenceFile.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3512">HADOOP-3512</a></td> |
| <td>mapred</td> |
| <td>Separated Distcp, Logalyzer and Archiver into a |
| tools jar.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3565">HADOOP-3565</a></td> |
| <td>mapred</td> |
| <td>Changed the Java serialization framework, which is |
| not enabled by default, so that each serialized object is independent of |
| previously serialized objects.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3598">HADOOP-3598</a></td> |
| <td>mapred</td> |
| <td>Changed Map-Reduce framework to no longer create |
| temporary task output directories, ${mapred.out.dir}/_temporary/_${taskid}, |
| for staging outputs if staging outputs isn't necessary.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-544">HADOOP-544</a></td> |
| <td>mapred</td> |
| <td>Introduced new classes JobID, TaskID and |
| TaskAttemptID, which should be used instead of their string counterparts. |
| Deprecated functions in JobClient, TaskReport, RunningJob, jobcontrol.Job and |
| TaskCompletionEvent that use string arguments. Applications can use |
| xxxID.toString() and xxxID.forName() methods to convert/restore objects |
| to/from strings. <br> |
| </td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3230">HADOOP-3230</a></td> |
| <td>scripts</td> |
| <td>Added command line tool "job -counter |
| &lt;job-id&gt; &lt;group-name&gt; &lt;counter-name&gt;" to access |
| counters.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-1328">HADOOP-1328</a></td> |
| <td>streaming</td> |
| <td>Introduced a way for a streaming process to update |
| global counters and status by emitting information on the stderr stream. Use |
| "reporter:counter:&lt;group&gt;,&lt;counter&gt;,&lt;amount&gt;" to |
| update a counter. Use "reporter:status:&lt;message&gt;" to update |
| status.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3429">HADOOP-3429</a></td> |
| <td>streaming</td> |
| <td>Increased the size of the buffer used in the |
| communication between the Java task and the Streaming process to 128KB.</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3379">HADOOP-3379</a></td> |
| <td>streaming <br> |
| documentation</td> |
| <td>Set default value for configuration property |
| "stream.non.zero.exit.status.is.failure" to be "true".</td> |
| </tr> |
| <tr> |
| <td><a href="https://issues.apache.org/jira/browse/HADOOP-3246">HADOOP-3246</a></td> |
| <td>util</td> |
| <td>Introduced an FTPFileSystem backed by Apache Commons |
| FTPClient to directly store data into HDFS.</td> |
| </tr> |
| </tbody></table> |
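<p>The HADOOP-3226 entry in the table above means a combiner may now run zero, one or several times over the same data, so its result should not depend on how often it runs. The following is a minimal Python sketch of that property, not Hadoop code: the functions <code>combine_sum</code> and <code>combine_mean</code> and the sample records are hypothetical and exist only for illustration.</p>

```python
from collections import defaultdict


def combine_sum(pairs):
    """Hypothetical combiner: sum the values of each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def combine_mean(pairs):
    """Hypothetical combiner that is NOT safe to re-run: it averages the values."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted((key, sum(values) / len(values)) for key, values in grouped.items())


# Illustrative map output for a single key.
map_output = [("a", 1), ("a", 2), ("a", 6)]

# Combiner applied once over all records.
once_sum = combine_sum(map_output)
# Combiner applied to the first two records, then again when merged with the rest.
twice_sum = combine_sum(combine_sum(map_output[:2]) + map_output[2:])

once_mean = combine_mean(map_output)
twice_mean = combine_mean(combine_mean(map_output[:2]) + map_output[2:])

print("sum :", once_sum, twice_sum)    # identical results
print("mean:", once_mean, twice_mean)  # results differ
```

<p>Summing gives the same answer however the records are grouped, while averaging partial averages does not. Jobs whose combiner behaves like the second function are the ones that would need the old single-run semantics.</p>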
| </ul> |
| </ul> |
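<p>The stderr protocol described for HADOOP-1328 can be sketched as a small streaming mapper. This is an illustrative Python example under stated assumptions: the helper names <code>report_counter</code>, <code>report_status</code> and <code>run_mapper</code>, as well as the counter group "WordCount" and counter "EmptyLines", are made up for the sketch. Only the two "reporter:" line formats come from the release note.</p>

```python
import sys


def report_counter(group, counter, amount, stream=sys.stderr):
    """Write one counter update line: reporter:counter:GROUP,COUNTER,AMOUNT."""
    stream.write("reporter:counter:%s,%s,%d\n" % (group, counter, amount))


def report_status(message, stream=sys.stderr):
    """Write one status update line: reporter:status:MESSAGE."""
    stream.write("reporter:status:%s\n" % message)


def run_mapper(lines, out=sys.stdout, err=sys.stderr):
    """Emit one tab-separated (word, 1) record per word; count empty lines."""
    for line in lines:
        words = line.split()
        if not words:
            # Hypothetical counter group and name, chosen for this sketch.
            report_counter("WordCount", "EmptyLines", 1, err)
            continue
        for word in words:
            out.write("%s\t1\n" % word)
    report_status("mapper finished", err)

# In a real streaming script the entry point would be: run_mapper(sys.stdin)
```

<p>With that entry point added, a script like this could be supplied to a streaming job through the -mapper option. Records go to stdout as usual, and only the "reporter:" lines on stderr are meant to update counters and status.</p>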
| </font> |
| </body></html> |