| Hadoop Change Log |
| |
| |
| Trunk (unreleased changes) |
| |
| 1. HADOOP-477. Extend contrib/streaming to scan the PATH environment |
| variable when resolving executable program names. |
| (Dhruba Borthakur via cutting) |
| |
| 2. HADOOP-583. In DFSClient, reduce the log level of re-connect |
| attempts from 'info' to 'debug', so they are not normally shown. |
| (Konstantin Shvachko via cutting) |
| |
| 3. HADOOP-498. Re-implement DFS integrity checker to run server-side, |
| for much improved performance. (Milind Bhandarkar via cutting) |
| |
| 4. HADOOP-586. Use the jar name for otherwise un-named jobs. |
| (Sanjay Dahiya via cutting) |
| |
| 5. HADOOP-514. Make DFS heartbeat interval configurable. |
| (Milind Bhandarkar via cutting) |
| |
| 6. HADOOP-588. Fix logging and accounting of failed tasks. |
| (Sanjay Dahiya via cutting) |
| |
| 7. HADOOP-462. Improve command line parsing in DFSShell, so that |
| incorrect numbers of arguments result in informative errors rather |
| than ArrayIndexOutOfBoundsException. (Dhruba Borthakur via cutting) |
| |
| 8. HADOOP-561. Fix DFS so that one replica of each block is written |
| locally, if possible. This was the intent, but there was a bug. |
| (Dhruba Borthakur via cutting) |
| |
| 9. HADOOP-610. Fix TaskTracker to survive more exceptions, keeping |
| tasks from becoming lost. (omalley via cutting) |
| |
| 10. HADOOP-625. Add a servlet to all http daemons that displays a |
| stack dump, useful for debugging. (omalley via cutting) |
| |
| 11. HADOOP-554. Fix DFSShell to return -1 for errors. |
| (Dhruba Borthakur via cutting) |
| |
| 12. HADOOP-626. Correct the documentation in the NNBench example |
| code, and also remove a mistaken call there. |
| (Nigel Daley via cutting) |
| |
| 13. HADOOP-634. Add missing license to many files. |
| (Nigel Daley via cutting) |
| |
| 14. HADOOP-627. Fix some synchronization problems in MiniMRCluster |
| that sometimes caused unit tests to fail. (Nigel Daley via cutting) |
| |
| 15. HADOOP-563. Improve the NameNode's lease policy so that leases |
| are held for one hour without renewal (instead of one minute). |
| However another attempt to create the same file will still succeed |
| if the lease has not been renewed within a minute. This prevents |
| communication or scheduling problems from causing a write to fail |
| for up to an hour, barring some other process trying to create the |
| same file. (Dhruba Borthakur via cutting) |
| |
| 16. HADOOP-635. In DFSShell, permit specification of multiple files |
| as the source for file copy and move commands. |
| (Dhruba Borthakur via cutting) |
| |
| 17. HADOOP-641. Change NameNode to request a fresh block report from |
| a re-discovered DataNode, so that no-longer-needed replications |
| are stopped promptly. (Konstantin Shvachko via cutting) |
| |
| 18. HADOOP-642. Change IPC client to specify an explicit connect |
| timeout. (Konstantin Shvachko via cutting) |
| |
| 19. HADOOP-638. Fix an unsynchronized access to TaskTracker's |
| internal state. (Nigel Daley via cutting) |
| |
| 20. HADOOP-624. Fix servlet path to stop a Jetty warning on startup. |
| (omalley via cutting) |
| |
| 21. HADOOP-578. Failed tasks are no longer placed at the end of the |
| task queue. This was originally done to work around other |
| problems that have now been fixed. Re-executing failed tasks |
| sooner causes buggy jobs to fail faster. (Sanjay Dahiya via cutting) |
| |
| 22. HADOOP-658. Update source file headers per Apache policy. (cutting) |
| |
| 23. HADOOP-636. Add MapFile & ArrayFile constructors which accept a |
| Progressable, and pass it down to SequenceFile. This permits |
| reduce tasks which use MapFile to still report progress while |
| writing blocks to the filesystem. (cutting) |
| |
| 24. HADOOP-576. Enable contrib/streaming to use the file cache. Also |
| extend the cache to permit symbolic links to cached items, rather |
| than local file copies. (Mahadev Konar via cutting) |
| |
| 25. HADOOP-482. Fix unit tests to work when a cluster is running on |
| the same machine, removing port conflicts. (Wendy Chien via cutting) |
| |
| 26. HADOOP-90. Permit dfs.name.dir to list multiple directories, |
| where namenode data is to be replicated. (Milind Bhandarkar via cutting) |
| |
| 27. HADOOP-651. Fix DFSCk to correctly pass parameters to the servlet |
| on the namenode. (Milind Bhandarkar via cutting) |
| |
| 28. HADOOP-553. Change main() routines of DataNode and NameNode to |
| log exceptions rather than letting the JVM print them to standard |
| error. Also, change the hadoop-daemon.sh script to rotate |
| standard i/o log files. (Raghu Angadi via cutting) |
| |
| 29. HADOOP-399. Fix javadoc warnings. (Nigel Daley via cutting) |
| |
| 30. HADOOP-599. Fix web ui and command line to correctly report DFS |
| filesystem size statistics. Also improve web layout. |
| (Raghu Angadi via cutting) |
| |
| 31. HADOOP-660. Permit specification of junit test output format. |
| (Nigel Daley via cutting) |
| |
| 32. HADOOP-663. Fix a few unit test issues. (Mahadev Konar via cutting) |
| |
| 33. HADOOP-664. Cause entire build to fail if libhdfs tests fail. |
| (Nigel Daley via cutting) |
| |
| 34. HADOOP-633. Keep jobtracker from dying when job initialization |
| throws exceptions. Also improve exception handling in a few other |
| places and add more informative thread names. |
| (omalley via cutting) |
| |
| 35. HADOOP-669. Fix a problem introduced by HADOOP-90 that can cause |
| DFS to lose files. (Milind Bhandarkar via cutting) |
| |
| 36. HADOOP-373. Consistently check the value returned by |
| FileSystem.mkdirs(). (Wendy Chien via cutting) |
| |
| 37. HADOOP-670. Code cleanups in some DFS internals: use generic |
| types, replace Vector with ArrayList, etc. |
| (Konstantin Shvachko via cutting) |
| |
| 38. HADOOP-647. Permit map outputs to use a different compression |
| type than the job output. (omalley via cutting) |
| |
| 39. HADOOP-671. Fix file cache to check for pre-existence before |
| creating. (Mahadev Konar via cutting) |
| |
| 40. HADOOP-665. Extend many DFSShell commands to accept multiple |
| arguments. Now commands like "ls", "rm", etc. will operate on |
| multiple files. (Dhruba Borthakur via cutting) |
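
Entry 15 above (HADOOP-563) describes two time limits on a lease: one hour before an unrenewed lease is dropped, and one minute before another client's create of the same file may succeed. The following is a minimal sketch of that decision only. The class name, method names, and the strict ">" comparisons are assumptions made for illustration, not the NameNode's actual code.

```java
/** Hypothetical sketch of the two-limit lease policy described in HADOOP-563. */
class LeasePolicySketch {
    // Durations taken from the change log entry: one minute and one hour.
    static final long SOFT_LIMIT_MS = 60_000L;
    static final long HARD_LIMIT_MS = 60L * 60_000L;

    /** True once the lease has gone unrenewed for longer than the one-hour limit. */
    static boolean expiredHard(long lastRenewalMs, long nowMs) {
        return nowMs - lastRenewalMs > HARD_LIMIT_MS;
    }

    /** True if another client's attempt to create the same file may succeed. */
    static boolean otherClientMayCreate(long lastRenewalMs, long nowMs) {
        return nowMs - lastRenewalMs > SOFT_LIMIT_MS;
    }

    public static void main(String[] args) {
        long lastRenewal = 0L;
        long twoMinutesLater = 2 * 60_000L;
        // After two minutes without renewal the writer still holds the lease,
        // but a competing create of the same file would be allowed.
        System.out.println("hard expired: " + expiredHard(lastRenewal, twoMinutesLater));
        System.out.println("other create allowed: "
                + otherClientMayCreate(lastRenewal, twoMinutesLater));
    }
}
```

Under these assumptions, a writer that stalls for a few minutes keeps its lease unless some other process tries to create the same file in the meantime.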
| |
| |
| Release 0.7.2 - 2006-10-18 |
| |
| 1. HADOOP-607. Fix a bug where classes included in job jars were not |
| found by tasks. (Mahadev Konar via cutting) |
| |
| 2. HADOOP-609. Add a unit test that checks that classes in job jars |
| can be found by tasks. Also modify unit tests to specify multiple |
| local directories. (Mahadev Konar via cutting) |
| |
| |
| Release 0.7.1 - 2006-10-11 |
| |
| 1. HADOOP-593. Fix a NullPointerException in the JobTracker. |
| (omalley via cutting) |
| |
| 2. HADOOP-592. Fix a NullPointerException in the IPC Server. Also |
| consistently log when stale calls are discarded. (omalley via cutting) |
| |
| 3. HADOOP-594. Increase the DFS safe-mode threshold from .95 to |
| .999, so that nearly all blocks must be reported before filesystem |
| modifications are permitted. (Konstantin Shvachko via cutting) |
| |
| 4. HADOOP-598. Fix tasks to retry when reporting completion, so that |
| a single RPC timeout won't fail a task. (omalley via cutting) |
| |
| 5. HADOOP-597. Fix TaskTracker to not discard map outputs for errors |
| in transmitting them to reduce nodes. (omalley via cutting) |
| |
| |
| Release 0.7.0 - 2006-10-06 |
| |
| 1. HADOOP-243. Fix rounding in the display of task and job progress |
| so that things are not shown to be 100% complete until they are in |
| fact finished. (omalley via cutting) |
| |
| 2. HADOOP-438. Limit the length of absolute paths in DFS, since the |
| file format used to store pathnames has some limitations. |
| (Wendy Chien via cutting) |
| |
| 3. HADOOP-530. Improve error messages in SequenceFile when keys or |
| values are of the wrong type. (Hairong Kuang via cutting) |
| |
| 4. HADOOP-288. Add a file caching system and use it in MapReduce to |
| cache job jar files on slave nodes. (Mahadev Konar via cutting) |
| |
| 5. HADOOP-533. Fix unit test to not modify conf directory. |
| (Hairong Kuang via cutting) |
| |
| 6. HADOOP-527. Permit specification of the local address that various |
| Hadoop daemons should bind to. (Philippe Gassmann via cutting) |
| |
| 7. HADOOP-542. Updates to contrib/streaming: reformatted source code, |
| on-the-fly merge sort, a fix for HADOOP-540, etc. |
| (Michel Tourn via cutting) |
| |
| 8. HADOOP-545. Remove an unused config file parameter. |
| (Philippe Gassmann via cutting) |
| |
| 9. HADOOP-548. Add an Ant property "test.output" to build.xml that |
| causes test output to be logged to the console. (omalley via cutting) |
| |
| 10. HADOOP-261. Record an error message when map output is lost. |
| (omalley via cutting) |
| |
| 11. HADOOP-293. Report the full list of task error messages in the |
| web ui, not just the most recent. (omalley via cutting) |
| |
| 12. HADOOP-551. Restore JobClient's console printouts to only include |
| a maximum of one update per one percent of progress. |
| (omalley via cutting) |
| |
| 13. HADOOP-306. Add a "safe" mode to DFS. The name node enters this |
| when less than a specified percentage of file data is complete. |
| Currently safe mode is only used on startup, but eventually it |
| will also be entered when datanodes disconnect and file data |
| becomes incomplete. While in safe mode no filesystem |
| modifications are permitted and block replication is inhibited. |
| (Konstantin Shvachko via cutting) |
| |
| 14. HADOOP-431. Change 'dfs -rm' to not operate recursively and add a |
| new command, 'dfs -rmr' which operates recursively. |
| (Sameer Paranjpye via cutting) |
| |
| 15. HADOOP-263. Include timestamps for job transitions. The web |
| interface now displays the start and end times of tasks and the |
| start times of sorting and reducing for reduce tasks. Also, |
| extend ObjectWritable to handle enums, so that they can be passed |
| as RPC parameters. (Sanjay Dahiya via cutting) |
| |
| 16. HADOOP-556. Contrib/streaming: send keep-alive reports to task |
| tracker every 10 seconds rather than every 100 records, to avoid |
| task timeouts. (Michel Tourn via cutting) |
| |
| 17. HADOOP-547. Fix reduce tasks to ping tasktracker while copying |
| data, rather than only between copies, avoiding task timeouts. |
| (Sanjay Dahiya via cutting) |
| |
| 18. HADOOP-537. Fix src/c++/libhdfs build process to create files in |
| build/, no longer modifying the source tree. |
| (Arun C Murthy via cutting) |
| |
| 19. HADOOP-487. Throw a more informative exception for unknown RPC |
| hosts. (Sameer Paranjpye via cutting) |
| |
| 20. HADOOP-559. Add file name globbing (pattern matching) support to |
| the FileSystem API, and use it in DFSShell ('bin/hadoop dfs') |
| commands. (Hairong Kuang via cutting) |
| |
| 21. HADOOP-508. Fix a bug in FSDataInputStream. Incorrect data was |
| returned after seeking to a random location. |
| (Milind Bhandarkar via cutting) |
| |
| 22. HADOOP-560. Add a "killed" task state. This can be used to |
| distinguish kills from other failures. Task state has also been |
| converted to use an enum type instead of an int, uncovering a bug |
| elsewhere. The web interface is also updated to display killed |
| tasks. (omalley via cutting) |
| |
| 23. HADOOP-423. Normalize Paths containing directories named "." and |
| "..", using the standard, unix interpretation. Also add checks in |
| DFS, prohibiting the use of "." or ".." as directory or file |
| names. (Wendy Chien via cutting) |
| |
| 24. HADOOP-513. Replace map output handling with a servlet, rather |
| than a JSP page. This fixes an issue where |
| IllegalStateExceptions were logged, sets content-length |
| correctly, and better handles some errors. (omalley via cutting) |
| |
| 25. HADOOP-552. Improved error checking when copying map output files |
| to reduce nodes. (omalley via cutting) |
| |
| 26. HADOOP-566. Fix scripts to work correctly when accessed through |
| relative symbolic links. (Lee Faris via cutting) |
| |
| 27. HADOOP-519. Add positioned read methods to FSInputStream. These |
| permit one to read from a stream without moving its position, and |
| can hence be performed by multiple threads at once on a single |
| stream. Implement an optimized version for DFS and local FS. |
| (Milind Bhandarkar via cutting) |
| |
| 28. HADOOP-522. Permit block compression with MapFile and SetFile. |
| Since these formats are always sorted, block compression can |
| provide a big advantage. (cutting) |
| |
| 29. HADOOP-567. Record version and revision information in builds. A |
| package manifest is added to the generated jar file containing |
| version information, and a VersionInfo utility is added that |
| includes further information, including the build date and user, |
| and the subversion revision and repository. A 'bin/hadoop |
| version' command is added to show this information, and it is also |
| added to various web interfaces. (omalley via cutting) |
| |
| 30. HADOOP-568. Fix so that errors while initializing tasks on a |
| tasktracker correctly report the task as failed to the jobtracker, |
| so that it will be rescheduled. (omalley via cutting) |
| |
| 31. HADOOP-550. Disable automatic UTF-8 validation in Text. This |
| permits, e.g., TextInputFormat to again operate on non-UTF-8 data. |
| (Hairong and Mahadev via cutting) |
| |
| 32. HADOOP-343. Fix mapred copying so that a failed tasktracker |
| doesn't cause other copies to slow. (Sameer Paranjpye via cutting) |
| |
| 33. HADOOP-239. Add a persistent job history mechanism, so that basic |
| job statistics are not lost after 24 hours and/or when the |
| jobtracker is restarted. (Sanjay Dahiya via cutting) |
| |
| 34. HADOOP-506. Ignore heartbeats from stale task trackers. |
| (Sanjay Dahiya via cutting) |
| |
| 35. HADOOP-255. Discard stale, queued IPC calls. Do not process |
| calls whose clients will likely time out before they receive a |
| response. When the queue is full, new calls are now received and |
| queued, and the oldest calls are discarded, so that, when servers |
| get bogged down, they no longer develop a backlog on the socket. |
| This should improve some DFS namenode failure modes. |
| (omalley via cutting) |
| |
| 36. HADOOP-581. Fix datanode to not reset itself on communications |
| errors with the namenode. If a request to the namenode fails, the |
| datanode should retry, not restart. This reduces the load on the |
| namenode, since restarts cause a resend of the block report. |
| (omalley via cutting) |
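
Entry 35 above (HADOOP-255) combines two rules: a full queue accepts the new call and discards the oldest one, and calls that have waited too long are skipped rather than processed. The sketch below shows one way those two rules could fit together. The class, its fields, and the age threshold are hypothetical and are not taken from the IPC server's source.

```java
import java.util.ArrayDeque;

/** Hypothetical sketch of a call queue that drops the oldest and skips stale calls. */
class StaleCallQueueSketch {
    /** A queued call, identified by an id and the time it was received. */
    static final class Call {
        final int id;
        final long receivedMs;
        Call(int id, long receivedMs) {
            this.id = id;
            this.receivedMs = receivedMs;
        }
    }

    private final ArrayDeque<Call> queue = new ArrayDeque<>();
    private final int capacity;   // assumed to be at least 1
    private final long maxAgeMs;  // assumed stand-in for the client timeout

    StaleCallQueueSketch(int capacity, long maxAgeMs) {
        this.capacity = capacity;
        this.maxAgeMs = maxAgeMs;
    }

    /** Queue a new call. If the queue is full, discard and return the oldest call. */
    Call offer(Call call) {
        Call dropped = null;
        if (queue.size() >= capacity) {
            dropped = queue.pollFirst();
        }
        queue.addLast(call);
        return dropped;
    }

    /** Return the next call that is still fresh at nowMs, discarding stale ones. */
    Call next(long nowMs) {
        Call call;
        while ((call = queue.pollFirst()) != null) {
            if (nowMs - call.receivedMs <= maxAgeMs) {
                return call;
            }
            // Stale: the client has probably timed out already, so skip it.
        }
        return null;
    }

    public static void main(String[] args) {
        StaleCallQueueSketch q = new StaleCallQueueSketch(2, 1000L);
        q.offer(new Call(1, 0L));
        q.offer(new Call(2, 500L));
        Call dropped = q.offer(new Call(3, 900L)); // queue full: call 1 is discarded
        System.out.println("dropped call id: " + dropped.id);
        System.out.println("next fresh call id: " + q.next(1600L).id);
    }
}
```

Accepting new calls while shedding old ones keeps the backlog inside the server, where it can be trimmed, instead of on the socket.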
| |
| |
| Release 0.6.2 - 2006-09-18 |
| |
| 1. HADOOP-532. Fix a bug reading value-compressed sequence files, |
| where an exception was thrown reporting that the full value had not |
| been read. (omalley via cutting) |
| |
| 2. HADOOP-534. Change the default value class in JobConf to be Text |
| instead of the now-deprecated UTF8. This fixes the Grep example |
| program, which was updated to use Text, but relies on this |
| default. (Hairong Kuang via cutting) |
| |
| |
| Release 0.6.1 - 2006-09-13 |
| |
| 1. HADOOP-520. Fix a bug in libhdfs, where write failures were not |
| correctly returning error codes. (Arun C Murthy via cutting) |
| |
| 2. HADOOP-523. Fix a NullPointerException when TextInputFormat is |
| explicitly specified. Also add a test case for this. |
| (omalley via cutting) |
| |
| 3. HADOOP-521. Fix another NullPointerException finding the |
| ClassLoader when using libhdfs. (omalley via cutting) |
| |
| 4. HADOOP-526. Fix a NullPointerException when attempting to start |
| two datanodes in the same directory. (Milind Bhandarkar via cutting) |
| |
| 5. HADOOP-529. Fix a NullPointerException when opening |
| value-compressed sequence files generated by pre-0.6.0 Hadoop. |
| (omalley via cutting) |
| |
| |
| Release 0.6.0 - 2006-09-08 |
| |
| 1. HADOOP-427. Replace some uses of DatanodeDescriptor in the DFS |
| web UI code with DatanodeInfo, the preferred public class. |
| (Devaraj Das via cutting) |
| |
| 2. HADOOP-426. Fix streaming contrib module to work correctly on |
| Solaris. This was causing nightly builds to fail. |
| (Michel Tourn via cutting) |
| |
| 3. HADOOP-400. Improvements to task assignment. Tasks are no longer |
| re-run on nodes where they have failed (unless no other node is |
| available). Also, tasks are better load-balanced among nodes. |
| (omalley via cutting) |
| |
| 4. HADOOP-324. Fix datanode to not exit when a disk is full, but |
| rather simply to fail writes. (Wendy Chien via cutting) |
| |
| 5. HADOOP-434. Change smallJobsBenchmark to use standard Hadoop |
| scripts. (Sanjay Dahiya via cutting) |
| |
| 6. HADOOP-453. Fix a bug in Text.setCapacity(). (siren via cutting) |
| |
| 7. HADOOP-450. Change so that input types are determined by the |
| RecordReader rather than specified directly in the JobConf. This |
| facilitates jobs with a variety of input types. |
| |
| WARNING: This contains incompatible API changes! The RecordReader |
| interface has two new methods that all user-defined InputFormats |
| must now define. Also, the values returned by TextInputFormat are |
| no longer of class UTF8, but now of class Text. |
| |
| 8. HADOOP-436. Fix an error-handling bug in the web ui. |
| (Devaraj Das via cutting) |
| |
| 9. HADOOP-455. Fix a bug in Text, where DEL was not permitted. |
| (Hairong Kuang via cutting) |
| |
| 10. HADOOP-456. Change the DFS namenode to keep a persistent record |
| of the set of known datanodes. This will be used to implement a |
| "safe mode" where filesystem changes are prohibited when a |
| critical percentage of the datanodes are unavailable. |
| (Konstantin Shvachko via cutting) |
| |
| 11. HADOOP-322. Add a job control utility. This permits one to |
| specify job interdependencies. Each job is submitted only after |
| the jobs it depends on have successfully completed. |
| (Runping Qi via cutting) |
| |
| 12. HADOOP-176. Fix a bug in IntWritable.Comparator. |
| (Dick King via cutting) |
| |
| 13. HADOOP-421. Replace uses of String in recordio package with Text |
| class, for improved handling of UTF-8 data. |
| (Milind Bhandarkar via cutting) |
| |
| 14. HADOOP-464. Improved error message when job jar not found. |
| (Michel Tourn via cutting) |
| |
| 15. HADOOP-469. Fix /bin/bash specifics that have crept into our |
| /bin/sh scripts since HADOOP-352. |
| (Jean-Baptiste Quenot via cutting) |
| |
| 16. HADOOP-468. Add HADOOP_NICENESS environment variable to set |
| scheduling priority for daemons. (Vetle Roeim via cutting) |
| |
| 17. HADOOP-473. Fix TextInputFormat to correctly handle more EOL |
| formats. Things now work correctly with CR, LF or CRLF. |
| (Dennis Kubes & James White via cutting) |
| |
| 18. HADOOP-461. Make Java 1.5 an explicit requirement. (cutting) |
| |
| 19. HADOOP-54. Add block compression to SequenceFile. One may now |
| specify that blocks of keys and values are compressed together, |
| improving compression for small keys and values. |
| SequenceFile.Writer's constructor is now deprecated and replaced |
| with a factory method. (Arun C Murthy via cutting) |
| |
| 20. HADOOP-281. Prohibit DFS files that are also directories. |
| (Wendy Chien via cutting) |
| |
| 21. HADOOP-486. Add the job username to JobStatus instances returned |
| by JobClient. (Mahadev Konar via cutting) |
| |
| 22. HADOOP-437. contrib/streaming: Add support for gzipped inputs. |
| (Michel Tourn via cutting) |
| |
| 23. HADOOP-463. Add variable expansion to config files. |
| Configuration property values may now contain variable |
| expressions. A variable is referenced with the syntax |
| '${variable}'. Variable values are found first in the |
| configuration, and then in Java system properties. The default |
| configuration is modified so that temporary directories are now |
| under ${hadoop.tmp.dir}, which is, by default, |
| /tmp/hadoop-${user.name}. (Michel Tourn via cutting) |
| |
| 24. HADOOP-419. Fix a NullPointerException finding the ClassLoader |
| when using libhdfs. (omalley via cutting) |
| |
| 25. HADOOP-460. Fix contrib/smallJobsBenchmark to use Text instead of |
| UTF8. (Sanjay Dahiya via cutting) |
| |
| 26. HADOOP-196. Fix Configuration(Configuration) constructor to work |
| correctly. (Sami Siren via cutting) |
| |
| 27. HADOOP-501. Fix Configuration.toString() to handle URL resources. |
| (Thomas Friol via cutting) |
| |
| 28. HADOOP-499. Reduce the use of Strings in contrib/streaming, |
| replacing them with Text for better performance. |
| (Hairong Kuang via cutting) |
| |
| 29. HADOOP-64. Manage multiple volumes with a single DataNode. |
| Previously DataNode would create a separate daemon per configured |
| volume, each with its own connection to the NameNode. Now all |
| volumes are handled by a single DataNode daemon, reducing the load |
| on the NameNode. (Milind Bhandarkar via cutting) |
| |
| 30. HADOOP-424. Fix MapReduce so that jobs which generate zero splits |
| do not fail. (Frédéric Bertin via cutting) |
| |
| 31. HADOOP-408. Adjust some timeouts and remove some others so that |
| unit tests run faster. (cutting) |
| |
| 32. HADOOP-507. Fix an IllegalAccessException in DFS. |
| (omalley via cutting) |
| |
| 33. HADOOP-320. Fix so that checksum files are correctly copied when |
| the destination of a file copy is a directory. |
| (Hairong Kuang via cutting) |
| |
| 34. HADOOP-286. In DFSClient, avoid pinging the NameNode with |
| renewLease() calls when no files are being written. |
| (Konstantin Shvachko via cutting) |
| |
| 35. HADOOP-312. Close idle IPC connections. All IPC connections were |
| cached forever. Now, after a connection has been idle for more |
| than a configurable amount of time (one second by default), the |
| connection is closed, conserving resources on both client and |
| server. (Devaraj Das via cutting) |
| |
| 36. HADOOP-497. Permit the specification of the network interface and |
| nameserver to be used when determining the local hostname |
| advertised by datanodes and tasktrackers. |
| (Lorenzo Thione via cutting) |
| |
| 37. HADOOP-441. Add a compression codec API and extend SequenceFile |
| to use it. This will permit the use of alternate compression |
| codecs in SequenceFile. (Arun C Murthy via cutting) |
| |
| 38. HADOOP-483. Improvements to libhdfs build and documentation. |
| (Arun C Murthy via cutting) |
| |
| 39. HADOOP-458. Fix a memory corruption bug in libhdfs. |
| (Arun C Murthy via cutting) |
| |
| 40. HADOOP-517. Fix a contrib/streaming bug in end-of-line detection. |
| (Hairong Kuang via cutting) |
| |
| 41. HADOOP-474. Add CompressionCodecFactory, and use it in |
| TextInputFormat and TextOutputFormat. Compressed input files are |
| automatically decompressed when they have the correct extension. |
| Output files will, when output compression is specified, be |
| generated with an appropriate extension. Also add a gzip codec and |
| fix problems with UTF8 text inputs. (omalley via cutting) |
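
Entry 23 above (HADOOP-463) states the lookup order for '${variable}' references: the configuration first, then Java system properties. The sketch below illustrates that order with a plain Map standing in for the configuration. The class name, the regular expression, the substitution limit, and the behaviour on an unresolved variable are assumptions for illustration and are not Hadoop's Configuration API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of '${variable}' expansion with a two-step lookup. */
class VariableExpansionSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}$\\s]+)\\}");
    // Assumed guard so that self-referencing values cannot loop forever.
    private static final int MAX_SUBSTITUTIONS = 20;

    /** Expand variables in value, looking in conf first, then in system properties. */
    static String expand(String value, Map<String, String> conf) {
        String current = value;
        for (int i = 0; i < MAX_SUBSTITUTIONS; i++) {
            Matcher m = VAR.matcher(current);
            if (!m.find()) {
                return current; // nothing left to expand
            }
            String name = m.group(1);
            String replacement = conf.get(name);
            if (replacement == null) {
                replacement = System.getProperty(name);
            }
            if (replacement == null) {
                return current; // stop at the first unresolved variable
            }
            current = current.substring(0, m.start()) + replacement
                    + current.substring(m.end());
        }
        throw new IllegalStateException("Too many substitutions in: " + value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Default quoted in the entry; user.name comes from Java system properties.
        conf.put("hadoop.tmp.dir", "/tmp/hadoop-${user.name}");
        // The "/dfs/name" suffix is only an illustrative value.
        System.out.println(expand("${hadoop.tmp.dir}/dfs/name", conf));
    }
}
```

Because replacements are re-scanned, a value such as ${hadoop.tmp.dir} that itself contains ${user.name} resolves in two passes.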
| |
| |
| Release 0.5.0 - 2006-08-04 |
| |
| 1. HADOOP-352. Fix shell scripts to use /bin/sh instead of |
| /bin/bash, for better portability. |
| (Jean-Baptiste Quenot via cutting) |
| |
| 2. HADOOP-313. Permit task state to be saved so that single tasks |
| may be manually re-executed when debugging. (omalley via cutting) |
| |
| 3. HADOOP-339. Add method to JobClient API listing jobs that are |
| not yet complete, i.e., that are queued or running. |
| (Mahadev Konar via cutting) |
| |
| 4. HADOOP-355. Updates to the streaming contrib module, including |
| API fixes, making reduce optional, and adding an input type for |
| StreamSequenceRecordReader. (Michel Tourn via cutting) |
| |
| 5. HADOOP-358. Fix an NPE bug in Path.equals(). |
| (Frédéric Bertin via cutting) |
| |
| 6. HADOOP-327. Fix ToolBase to not call System.exit() when |
| exceptions are thrown. (Hairong Kuang via cutting) |
| |
| 7. HADOOP-359. Permit map output to be compressed. |
| (omalley via cutting) |
| |
| 8. HADOOP-341. Permit input URI to CopyFiles to use the HTTP |
| protocol. This lets one, e.g., more easily copy log files into |
| DFS. (Arun C Murthy via cutting) |
| |
| 9. HADOOP-361. Remove unix dependencies from streaming contrib |
| module tests, making them pure java. (Michel Tourn via cutting) |
| |
| 10. HADOOP-354. Make public methods to stop DFS daemons. |
| (Barry Kaplan via cutting) |
| |
| 11. HADOOP-252. Add versioning to RPC protocols. |
| (Milind Bhandarkar via cutting) |
| |
| 12. HADOOP-356. Add contrib to "compile" and "test" build targets, so |
| that this code is better maintained. (Michel Tourn via cutting) |
| |
| 13. HADOOP-307. Add smallJobsBenchmark contrib module. This runs |
| lots of small jobs, in order to determine per-task overheads. |
| (Sanjay Dahiya via cutting) |
| |
| 14. HADOOP-342. Add a tool for log analysis: Logalyzer. |
| (Arun C Murthy via cutting) |
| |
| 15. HADOOP-347. Add web-based browsing of DFS content. The namenode |
| redirects browsing requests to datanodes. Content requests are |
| redirected to datanodes where the data is local when possible. |
| (Devaraj Das via cutting) |
| |
| 16. HADOOP-351. Make Hadoop IPC kernel independent of Jetty. |
| (Devaraj Das via cutting) |
| |
| 17. HADOOP-237. Add metric reporting to DFS and MapReduce. With only |
| minor configuration changes, one can now monitor many Hadoop |
| system statistics using Ganglia or other monitoring systems. |
| (Milind Bhandarkar via cutting) |
| |
| 18. HADOOP-376. Fix datanode's HTTP server to scan for a free port. |
| (omalley via cutting) |
| |
| 19. HADOOP-260. Add --config option to shell scripts, specifying an |
| alternate configuration directory. (Milind Bhandarkar via cutting) |
| |
| 20. HADOOP-381. Permit developers to save the temporary files for |
| tasks whose names match a regular expression, to facilitate |
| debugging. (omalley via cutting) |
| |
| 21. HADOOP-344. Fix some Windows-related problems with DF. |
| (Konstantin Shvachko via cutting) |
| |
| 22. HADOOP-380. Fix reduce tasks to poll less frequently for map |
| outputs. (Mahadev Konar via cutting) |
| |
| 23. HADOOP-321. Refactor DatanodeInfo, in preparation for |
| HADOOP-306. (Konstantin Shvachko & omalley via cutting) |
| |
| 24. HADOOP-385. Fix some bugs in record io code generation. |
| (Milind Bhandarkar via cutting) |
| |
| 25. HADOOP-302. Add new Text class to replace UTF8, removing |
| limitations of that class. Also refactor utility methods for |
| writing zero-compressed integers (VInts and VLongs). |
| (Hairong Kuang via cutting) |
| |
| 26. HADOOP-335. Refactor DFS namespace/transaction logging in |
| namenode. (Konstantin Shvachko via cutting) |
| |
| 27. HADOOP-375. Fix handling of the datanode HTTP daemon's port so |
| that multiple datanodes can be run on a single host. |
| (Devaraj Das via cutting) |
| |
| 28. HADOOP-386. When removing excess DFS block replicas, remove those |
| on nodes with the least free space first. |
| (Johan Oskarson via cutting) |
| |
| 29. HADOOP-389. Fix intermittent failures of mapreduce unit tests. |
| Also fix some build dependencies. |
| (Mahadev & Konstantin via cutting) |
| |
| 30. HADOOP-362. Fix a problem where jobs hang when status messages |
| are received out-of-order. (omalley via cutting) |
| |
| 31. HADOOP-394. Change order of DFS shutdown in unit tests to |
| minimize errors logged. (Konstantin Shvachko via cutting) |
| |
| 32. HADOOP-396. Make DatanodeID implement Writable. |
| (Konstantin Shvachko via cutting) |
| |
| 33. HADOOP-377. Permit one to add URL resources to a Configuration. |
| (Jean-Baptiste Quenot via cutting) |
| |
| 34. HADOOP-345. Permit iteration over Configuration key/value pairs. |
| (Michel Tourn via cutting) |
| |
| 35. HADOOP-409. Streaming contrib module: make configuration |
| properties available to commands as environment variables. |
| (Michel Tourn via cutting) |
| |
| 36. HADOOP-369. Add -getmerge option to dfs command that appends all |
| files in a directory into a single local file. |
| (Johan Oskarson via cutting) |
| |
| 37. HADOOP-410. Replace some TreeMaps with HashMaps in DFS, for |
| a 17% performance improvement. (Milind Bhandarkar via cutting) |
| |
| 38. HADOOP-411. Add unit tests for command line parser. |
| (Hairong Kuang via cutting) |
| |
| 39. HADOOP-412. Add MapReduce input formats that support filtering |
| of SequenceFile data, including sampling and regex matching. |
| Also, move JobConf.newInstance() to a new utility class. |
| (Hairong Kuang via cutting) |
| |
| 40. HADOOP-226. Fix fsck command to properly consider replication |
| counts, now that these can vary per file. (Bryan Pendleton via cutting) |
| |
| 41. HADOOP-425. Add a Python MapReduce example, using Jython. |
| (omalley via cutting) |
| |
| |
| Release 0.4.0 - 2006-06-28 |
| |
| 1. HADOOP-298. Improved progress reports for CopyFiles utility, the |
| distributed file copier. (omalley via cutting) |
| |
| 2. HADOOP-299. Fix the task tracker, permitting multiple jobs to |
| more easily execute at the same time. (omalley via cutting) |
| |
| 3. HADOOP-250. Add an HTTP user interface to the namenode, running |
| on port 50070. (Devaraj Das via cutting) |
| |
| 4. HADOOP-123. Add MapReduce unit tests that run a jobtracker and |
| tasktracker, greatly increasing code coverage. |
| (Milind Bhandarkar via cutting) |
| |
| 5. HADOOP-271. Add links from jobtracker's web ui to tasktracker's |
| web ui. Also attempt to log a thread dump of child processes |
| before they're killed. (omalley via cutting) |
| |
| 6. HADOOP-210. Change RPC server to use a selector instead of a |
| thread per connection. This should make it easier to scale to |
| larger clusters. Note that this incompatibly changes the RPC |
| protocol: clients and servers must both be upgraded to the new |
| version to ensure correct operation. (Devaraj Das via cutting) |
| |
| 7. HADOOP-311. Change DFS client to retry failed reads, so that a |
| single read failure will not alone cause failure of a task. |
| (omalley via cutting) |
| |
| 8. HADOOP-314. Remove the "append" phase when reducing. Map output |
| files are now directly passed to the sorter, without first |
| appending them into a single file. Now, the first third of reduce |
| progress is "copy" (transferring map output to reduce nodes), the |
| middle third is "sort" (sorting map output) and the last third is |
| "reduce" (generating output). Long-term, the "sort" phase will |
| also be removed. (omalley via cutting) |
| |
| 9. HADOOP-316. Fix a potential deadlock in the jobtracker. |
| (omalley via cutting) |
| |
| 10. HADOOP-319. Fix FileSystem.close() to remove the FileSystem |
| instance from the cache. (Hairong Kuang via cutting) |
| |
| 11. HADOOP-135. Fix potential deadlock in JobTracker by acquiring |
| locks in a consistent order. (omalley via cutting) |
| |
| 12. HADOOP-278. Check for existence of input directories before |
| starting MapReduce jobs, making it easier to debug this common |
| error. (omalley via cutting) |
| |
| 13. HADOOP-304. Improve error message for |
| UnregisteredDatanodeException to include expected node name. |
| (Konstantin Shvachko via cutting) |
| |
| 14. HADOOP-305. Fix TaskTracker to ask for new tasks as soon as a |
| task is finished, rather than waiting for the next heartbeat. |
| This improves performance when tasks are short. |
| (Mahadev Konar via cutting) |
| |
| 15. HADOOP-59. Add support for generic command line options. One may |
| now specify the filesystem (-fs), the MapReduce jobtracker (-jt), |
| a config file (-conf) or any configuration property (-D). The |
| "dfs", "fsck", "job", and "distcp" commands currently support |
| this, with more to be added. (Hairong Kuang via cutting) |
| |
| 16. HADOOP-296. Permit specification of the amount of reserved space |
| on a DFS datanode. One may specify both the percentage free and |
| the number of bytes. (Johan Oskarson via cutting) |
| |
| 17. HADOOP-325. Fix a problem initializing RPC parameter classes, and |
| remove the workaround used to initialize classes. |
| (omalley via cutting) |
| |
| 18. HADOOP-328. Add an option to the "distcp" command to ignore read |
| errors while copying. (omalley via cutting) |
| |
| 19. HADOOP-27. Don't allocate tasks to trackers whose local free |
| space is too low. (Johan Oskarson via cutting) |
| |
| 20. HADOOP-318. Keep slow DFS output from causing task timeouts. |
| This incompatibly changes some public interfaces, adding a |
| parameter to OutputFormat.getRecordWriter() and the new method |
| Reporter.progress(), but it makes lots of tasks succeed that were |
| previously failing. (Milind Bhandarkar via cutting) |
| |
| |
| Release 0.3.2 - 2006-06-09 |
| |
| 1. HADOOP-275. Update the streaming contrib module to use log4j for |
| its logging. (Michel Tourn via cutting) |
| |
| 2. HADOOP-279. Provide defaults for log4j logging parameters, so |
| that things still work reasonably when Hadoop-specific system |
| properties are not provided. (omalley via cutting) |
| |
| 3. HADOOP-280. Fix a typo in AllTestDriver which caused the wrong |
| test to be run when "DistributedFSCheck" was specified. |
| (Konstantin Shvachko via cutting) |
| |
| 4. HADOOP-240. DFS's mkdirs() implementation no longer logs a warning |
| when the directory already exists. (Hairong Kuang via cutting) |
| |
| 5. HADOOP-285. Fix DFS datanodes to be able to re-join the cluster |
| after the connection to the namenode is lost. (omalley via cutting) |
| |
| 6. HADOOP-277. Fix a race condition when creating directories. |
| (Sameer Paranjpye via cutting) |
| |
| 7. HADOOP-289. Improved exception handling in DFS datanode. |
| (Konstantin Shvachko via cutting) |
| |
| 8. HADOOP-292. Fix client-side logging to go to standard error |
| rather than standard output, so that it can be distinguished from |
| application output. (omalley via cutting) |
| |
| 9. HADOOP-294. Fixed bug where conditions for retrying after errors |
| in the DFS client were reversed. (omalley via cutting) |
| |
| |
| Release 0.3.1 - 2006-06-05 |
| |
| 1. HADOOP-272. Fix a bug in bin/hadoop setting log |
| parameters. (omalley & cutting) |
| |
| 2. HADOOP-274. Change applications to log to standard output rather |
| than to a rolling log file like daemons. (omalley via cutting) |
| |
| 3. HADOOP-262. Fix reduce tasks to report progress while they're |
| waiting for map outputs, so that they do not time out. |
| (Mahadev Konar via cutting) |
| |
| 4. HADOOP-245 and HADOOP-246. Improvements to record io package. |
| (Mahadev Konar via cutting) |
| |
| 5. HADOOP-276. Add logging config files to jar file so that they're |
| always found. (omalley via cutting) |
| |
| |
| Release 0.3.0 - 2006-06-02 |
| |
| 1. HADOOP-208. Enhance MapReduce web interface, adding new pages |
| for failed tasks and tasktrackers. (omalley via cutting) |
| |
| 2. HADOOP-204. Tweaks to metrics package. (David Bowen via cutting) |
| |
| 3. HADOOP-209. Add a MapReduce-based file copier. This will |
| copy files within or between file systems in parallel. |
| (Milind Bhandarkar via cutting) |
| |
| 4. HADOOP-146. Fix DFS to check when randomly generating a new block |
| id that no existing blocks already have that id. |
| (Milind Bhandarkar via cutting) |
| |
| 5. HADOOP-180. Make a daemon thread that does the actual task |
| cleanups, so that the main offerService thread in the TaskTracker |
| doesn't get stuck and miss its heartbeat window. This was killing |
| many task trackers as big jobs finished (300+ tasks / node). |
| (omalley via cutting) |
| |
| 6. HADOOP-200. Avoid transmitting entire list of map task names to |
| reduce tasks. Instead just transmit the number of map tasks and |
| henceforth refer to them by number when collecting map output. |
| (omalley via cutting) |
| |
| 7. HADOOP-219. Fix a NullPointerException when handling a checksum |
| exception under SequenceFile.Sorter.sort(). (cutting & stack) |
| |
| 8. HADOOP-212. Permit alteration of the file block size in DFS. The |
| default block size for new files may now be specified in the |
| configuration with the dfs.block.size property. The block size |
| may also be specified when files are opened. |
| (omalley via cutting) |
| |
| 9. HADOOP-218. Avoid accessing configuration while looping through |
| tasks in JobTracker. (Mahadev Konar via cutting) |
| |
| 10. HADOOP-161. Add hashCode() method to DFS's Block. |
| (Milind Bhandarkar via cutting) |
| |
| 11. HADOOP-115. Map output types may now be specified. These are also |
| used as reduce input types, thus permitting reduce input types to |
| differ from reduce output types. (Runping Qi via cutting) |
| |
| 12. HADOOP-216. Add task progress to task status page. |
| (Bryan Pendelton via cutting) |
| |
| 13. HADOOP-233. Add web server to task tracker that shows running |
| tasks and logs. Also add log access to job tracker web interface. |
| (omalley via cutting) |
| |
| 14. HADOOP-205. Incorporate pending tasks into tasktracker load |
| calculations. (Mahadev Konar via cutting) |
| |
| 15. HADOOP-247. Fix sort progress to better handle exceptions. |
| (Mahadev Konar via cutting) |
| |
| 16. HADOOP-195. Improve performance of the transfer of map outputs to |
| reduce nodes by performing multiple transfers in parallel, each on |
| a separate socket. (Sameer Paranjpye via cutting) |
| |
| 17. HADOOP-251. Fix task processes to be tolerant of failed progress |
| reports to their parent process. (omalley via cutting) |
| |
| 18. HADOOP-325. Improve the FileNotFound exceptions thrown by |
| LocalFileSystem to include the name of the file. |
| (Benjamin Reed via cutting) |
| |
| 19. HADOOP-254. Use HTTP to transfer map output data to reduce |
| nodes. This, together with HADOOP-195, greatly improves the |
| performance of these transfers. (omalley via cutting) |
| |
| 20. HADOOP-163. Cause datanodes that are unable to either read or |
| write data to exit, so that the namenode will no longer target |
| them for new blocks and will replicate their data on other nodes. |
| (Hairong Kuang via cutting) |
| |
| 21. HADOOP-222. Add a -setrep option to the dfs commands that alters |
| file replication levels. (Johan Oskarson via cutting) |
| |
| 22. HADOOP-75. In DFS, only check for a complete file when the file |
| is closed, rather than as each block is written. |
| (Milind Bhandarkar via cutting) |
| |
| 23. HADOOP-124. Change DFS so that datanodes are identified by a |
| persistent ID rather than by host and port. This solves a number |
| of filesystem integrity problems, when, e.g., datanodes are |
| restarted. (Konstantin Shvachko via cutting) |
| |
| 24. HADOOP-256. Add a C API for DFS. (Arun C Murthy via cutting) |
| |
| 25. HADOOP-211. Switch to use the Jakarta Commons logging internally, |
| configured to use log4j by default. (Arun C Murthy and cutting) |
| |
| 26. HADOOP-265. Tasktracker now fails to start if it does not have a |
| writable local directory for temporary files. In this case, it |
| logs a message to the JobTracker and exits. (Hairong Kuang via cutting) |
| |
| 27. HADOOP-270. Fix potential deadlock in datanode shutdown. |
| (Hairong Kuang via cutting) |
| |
| |
| Release 0.2.1 - 2006-05-12 |
| |
| 1. HADOOP-199. Fix reduce progress (broken by HADOOP-182). |
| (omalley via cutting) |
| |
| 2. HADOOP-201. Fix 'bin/hadoop dfs -report'. (cutting) |
| |
| 3. HADOOP-207. Fix JDK 1.4 incompatibility introduced by HADOOP-96. |
| System.getenv() does not work in JDK 1.4. (Hairong Kuang via cutting) |
| |
| |
| Release 0.2.0 - 2006-05-05 |
| |
| 1. Fix HADOOP-126. 'bin/hadoop dfs -cp' now correctly copies .crc |
| files. (Konstantin Shvachko via cutting) |
| |
| 2. Fix HADOOP-51. Change DFS to support per-file replication counts. |
| (Konstantin Shvachko via cutting) |
| |
| 3. Fix HADOOP-131. Add scripts to start/stop dfs and mapred daemons. |
| Use these in start/stop-all scripts. (Chris Mattmann via cutting) |
| |
| 4. Stop using ssh options by default that are not yet in widely used |
| versions of ssh. Folks can still enable their use by uncommenting |
| a line in conf/hadoop-env.sh. (cutting) |
| |
| 5. Fix HADOOP-92. Show information about all attempts to run each |
| task in the web ui. (Mahadev Konar via cutting) |
| |
| 6. Fix HADOOP-128. Improved DFS error handling. (Owen O'Malley via cutting) |
| |
| 7. Fix HADOOP-129. Replace uses of java.io.File with new class named |
| Path. This fixes bugs where java.io.File methods were called |
| directly when FileSystem methods were desired, and reduces the |
| likelihood of such bugs in the future. It also makes the handling |
| of pathnames more consistent between local and dfs FileSystems and |
| between Windows and Unix. java.io.File-based methods are still |
| available for back-compatibility, but are deprecated and will be |
| removed once 0.2 is released. (cutting) |
| |
| 8. Change dfs.data.dir and mapred.local.dir to be comma-separated |
| lists of directories, rather than space-separated. This fixes |
| several bugs on Windows. (cutting) |
| |
| 9. Fix HADOOP-144. Use mapred task id for dfs client id, to |
| facilitate debugging. (omalley via cutting) |
| |
| 10. Fix HADOOP-143. Do not line-wrap stack-traces in web ui. |
| (omalley via cutting) |
| |
| 11. Fix HADOOP-118. In DFS, improve clean up of abandoned file |
| creations. (omalley via cutting) |
| |
| 12. Fix HADOOP-138. Stop multiple tasks in a single heartbeat, rather |
| than one per heartbeat. (Stefan via cutting) |
| |
| 13. Fix HADOOP-139. Remove a potential deadlock in |
| LocalFileSystem.lock(). (Igor Bolotin via cutting) |
| |
| 14. Fix HADOOP-134. Don't hang jobs when the tasktracker is |
| misconfigured to use an un-writable local directory. (omalley via cutting) |
| |
| 15. Fix HADOOP-115. Correct an error message. (Stack via cutting) |
| |
| 16. Fix HADOOP-133. Retry pings from child to parent, in case of |
| (local) communication problems. Also log exit status, so that one |
| can distinguish patricide from other deaths. (omalley via cutting) |
| |
| 17. Fix HADOOP-142. Avoid re-running a task on a host where it has |
| previously failed. (omalley via cutting) |
| |
| 18. Fix HADOOP-148. Maintain a task failure count for each |
| tasktracker and display it in the web ui. (omalley via cutting) |
| |
| 19. Fix HADOOP-151. Close a potential socket leak, where new IPC |
| connection pools were created per configuration instance that RPCs |
| use. Now a global RPC connection pool is used again, as |
| originally intended. (cutting) |
| |
| 20. Fix HADOOP-69. Don't throw a NullPointerException when getting |
| hints for non-existing file split. (Bryan Pendelton via cutting) |
| |
| 21. Fix HADOOP-157. When a task that writes dfs files (e.g., a reduce |
| task) failed and was retried, it would fail again and again, |
| eventually failing the job. The problem was that dfs did not yet |
| know that the failed task had abandoned the files, and would not |
| yet let another task create files with the same names. Dfs now |
| retries when creating a file long enough for locks on abandoned |
| files to expire. (omalley via cutting) |
| |
| 22. Fix HADOOP-150. Improved task names that include job |
| names. (omalley via cutting) |
| |
| 23. Fix HADOOP-162. Fix ConcurrentModificationException when |
| releasing file locks. (omalley via cutting) |
| |
| 24. Fix HADOOP-132. Initial check-in of new Metrics API, including |
| implementations for writing metric data to a file and for sending |
| it to Ganglia. (David Bowen via cutting) |
| |
| 25. Fix HADOOP-160. Remove some unneeded synchronization around |
| time-consuming operations in the TaskTracker. (omalley via cutting) |
| |
| 26. Fix HADOOP-166. RPCs failed when passed subclasses of a declared |
| parameter type. This is fixed by changing ObjectWritable to store |
| both the declared type and the instance type for Writables. Note |
| that this incompatibly changes the format of ObjectWritable and |
| will render unreadable any ObjectWritables stored in files. |
| Nutch only uses ObjectWritable in intermediate files, so this |
| should not be a problem for Nutch. (Stefan & cutting) |
| |
| 27. Fix HADOOP-168. MapReduce RPC protocol methods should all declare |
| IOException, so that timeouts are handled appropriately. |
| (omalley via cutting) |
| |
| 28. Fix HADOOP-169. Don't fail a reduce task if a call to the |
| jobtracker to locate map outputs fails. (omalley via cutting) |
| |
| 29. Fix HADOOP-170. Permit FileSystem clients to examine and modify |
| the replication count of individual files. Also fix a few |
| replication-related bugs. (Konstantin Shvachko via cutting) |
| |
| 30. Permit specification of higher replication levels for job |
| submission files (job.xml and job.jar). This helps with large |
| clusters, since these files are read by every node. (cutting) |
| |
| 31. HADOOP-173. Optimize allocation of tasks with local data. (cutting) |
| |
| 32. HADOOP-167. Reduce the number of Configuration and JobConf |
| instances created. (omalley via cutting) |
| |
| 33. NUTCH-256. Change FileSystem#createNewFile() to create a .crc |
| file. The lack of a .crc file was causing warnings. (cutting) |
| |
| 34. HADOOP-174. Change JobClient to not abort job until it has failed |
| to contact the job tracker for five attempts, not just one as |
| before. (omalley via cutting) |
| |
| 35. HADOOP-177. Change MapReduce web interface to page through tasks. |
| Previously, when jobs had more than a few thousand tasks they |
| could crash web browsers. (Mahadev Konar via cutting) |
| |
| 36. HADOOP-178. In DFS, piggyback blockwork requests from datanodes |
| on heartbeat responses from namenode. This reduces the volume of |
| RPC traffic. Also move startup delay in blockwork from datanode |
| to namenode. This fixes a problem where restarting the namenode |
| triggered a lot of unneeded replication. (Hairong Kuang via cutting) |
| |
| 37. HADOOP-183. If the DFS namenode is restarted with different |
| minimum and/or maximum replication counts, existing files' |
| replication counts are now automatically adjusted to be within the |
| newly configured bounds. (Hairong Kuang via cutting) |
| |
| 38. HADOOP-186. Better error handling in TaskTracker's top-level |
| loop. Also improve calculation of time to send next heartbeat. |
| (omalley via cutting) |
| |
| 39. HADOOP-187. Add two MapReduce examples/benchmarks. One creates |
| files containing random data. The second sorts the output of the |
| first. (omalley via cutting) |
| |
| 40. HADOOP-185. Fix so that, when a task tracker times out making the |
| RPC asking for a new task to run, the job tracker does not think |
| that it is actually running the task returned. (omalley via cutting) |
| |
| 41. HADOOP-190. If a child process hangs after it has reported |
| completion, its output should not be lost. (Stack via cutting) |
| |
| 42. HADOOP-184. Re-structure some test code to better support testing |
| on a cluster. (Mahadev Konar via cutting) |
| |
| 43. HADOOP-191. Add streaming package, Hadoop's first contrib module. |
| This permits folks to easily submit MapReduce jobs whose map and |
| reduce functions are implemented by shell commands. Use |
| 'bin/hadoop jar build/hadoop-streaming.jar' to get details. |
| (Michel Tourn via cutting) |
| |
| 44. HADOOP-189. Fix MapReduce in standalone configuration to |
| correctly handle job jar files that contain a lib directory with |
| nested jar files. (cutting) |
| |
| 45. HADOOP-65. Initial version of record I/O framework that enables |
| the specification of record types and generates marshalling code |
| in both Java and C++. Generated Java code implements |
| WritableComparable, but is not yet otherwise used by |
| Hadoop. (Milind Bhandarkar via cutting) |
| |
| 46. HADOOP-193. Add a MapReduce-based FileSystem benchmark. |
| (Konstantin Shvachko via cutting) |
| |
| 47. HADOOP-194. Add a MapReduce-based FileSystem checker. This reads |
| every block in every file in the filesystem. (Konstantin Shvachko |
| via cutting) |
| |
| 48. HADOOP-182. Fix so that lost task trackers do not change the |
| status of reduce tasks or completed jobs. Also fixes the progress |
| meter so that failed tasks are subtracted. (omalley via cutting) |
| |
| 49. HADOOP-96. Logging improvements. Log files are now separate from |
| standard output and standard error files. Logs are now rolled. |
| Logging of all DFS state changes can be enabled, to facilitate |
| debugging. (Hairong Kuang via cutting) |
| |
| |
| Release 0.1.1 - 2006-04-08 |
| |
| 1. Added CHANGES.txt, logging all significant changes to Hadoop. (cutting) |
| |
| 2. Fix MapReduceBase.close() to throw IOException, as declared in the |
| Closeable interface. This permits subclasses which override this |
| method to throw that exception. (cutting) |
| |
| 3. Fix HADOOP-117. Pathnames were mistakenly transposed in |
| JobConf.getLocalFile() causing many mapred temporary files to not |
| be removed. (Raghavendra Prabhu via cutting) |
| |
| 4. Fix HADOOP-116. Clean up job submission files when jobs complete. |
| (cutting) |
| |
| 5. Fix HADOOP-125. Fix handling of absolute paths on Windows. (cutting) |
| |
| |
| Release 0.1.0 - 2006-04-01 |
| |
| 1. The first release of Hadoop. |
| |