branch-0.23.1/hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_editsviewer.xml - hadoop - Git at Google

 <?xml version="1.0"?>
 <!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
 -->

 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">

 <document>

   <header>
     <title>Offline Edits Viewer Guide</title>
     <authors>
       <person name="Erik Steffl" email="steffl@yahoo-inc.com"/>
     </authors>
   </header>

   <body>

     <section>

       <title>Overview</title>

       <p>
         Offline Edits Viewer is a tool to parse the Edits log file. The
         current processors are mostly useful for conversion between
         different formats, including XML which is human readable and
         easier to edit than native binary format.
       </p>

       <p>
         The tool can parse the edits formats -18 (roughly Hadoop 0.19)
         and later. The tool operates on files only, it does not need
         Hadoop cluster to be running.
       </p>

       <p>Input formats supported:</p>
       <ol>
         <li><strong>binary</strong>: native binary format that Hadoop uses internally</li>
         <li>
           <strong>xml</strong>: XML format, as produced by
           <strong>xml</strong> processor, used if filename has xml
           (case insensitive) extension
         </li>
       </ol>

       <p>
         The Offline Edits Viewer provides several output processors
         (unless stated otherwise the output of the processor can be
         converted back to original edits file):
       </p>
       <ol>
         <li><strong>binary</strong>: native binary format that Hadoop uses internally</li>
         <li><strong>xml</strong>: XML format</li>
         <li><strong>stats</strong>: prints out statistics, this cannot be converted back to Edits file</li>
       </ol>

     </section> <!-- Overview -->

     <section>

       <title>Usage</title>

       <p><code>bash$ bin/hdfs oev -i edits -o edits.xml</code></p>

       <table>
         <tr><th>Flag</th><th>Description</th></tr>
         <tr>
           <td><code>[-i|--inputFile] &lt;input file&gt;</code></td>
           <td>
             Specify the input edits log file to process. Xml (case
             insensitive) extension means XML format otherwise binary
             format is assumed. Required.
           </td>
         </tr>
         <tr>
           <td><code>[-o|--outputFile] &lt;output file&gt;</code></td>
           <td>
             Specify the output filename, if the specified output processor
             generates one. If the specified file already exists, it is
             silently overwritten. Required.
           </td>
         </tr>
         <tr>
           <td><code>[-p|--processor] &lt;processor&gt;</code></td>
           <td>
             Specify the image processor to apply against the image
             file. Currently valid options are <strong>binary</strong>,
             <strong>xml</strong> (default) and <strong>stats</strong>.
           </td>
         </tr>
         <tr>
           <td><code>[-v|--verbose]-</code></td>
           <td>
             Print the input and output filenames and pipe output of
             processor to console as well as specified file. On extremely
             large files, this may increase processing time by an order
             of magnitude.
           </td>
         </tr>
         <tr>
           <td><code>[-h|--help]</code></td>
           <td>
             Display the tool usage and help information and exit.
           </td>
         </tr>
       </table>

     </section> <!-- Usage -->

     <section>

       <title>Case study: Hadoop cluster recovery</title>

       <p>
         In case there is some problem with hadoop cluster and the edits
         file is corrupted it is possible to save at least part of the
         edits file that is correct. This can be done by converting the
         binary edits to XML, edit it manually and then convert it back
         to binary. The most common problem is that the edits file is
         missing the closing record (record that has opCode -1). This
         should be recognized by the tool and the XML format should be
         properly closed.
       </p>

       <p>
         If there is no closing record in the XML file you can add one
         after last correct record. Anything after the record with opCode
         -1 is ignored.
       </p>

       <p>Example of a closing record (with opCode -1):</p>
 <source>
 &lt;RECORD&gt;
   &lt;OPCODE&gt;-1&lt;/OPCODE&gt;
   &lt;DATA&gt;
   &lt;/DATA&gt;
 &lt;/RECORD&gt;
 </source>

     </section> <!-- Case study: Hadoop cluster recovery -->

   </body>

 </document>
	<?xml version="1.0"?>
	<!--
	Licensed to the Apache Software Foundation (ASF) under one or more
	contributor license agreements. See the NOTICE file distributed with
	this work for additional information regarding copyright ownership.
	The ASF licenses this file to You under the Apache License, Version 2.0
	(the "License"); you may not use this file except in compliance with
	the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	-->

	<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">

	<document>

	<header>
	<title>Offline Edits Viewer Guide</title>
	<authors>
	<person name="Erik Steffl" email="steffl@yahoo-inc.com"/>
	</authors>
	</header>

	<body>

	<section>

	<title>Overview</title>

	<p>
	Offline Edits Viewer is a tool to parse the Edits log file. The
	current processors are mostly useful for conversion between
	different formats, including XML which is human readable and
	easier to edit than native binary format.
	</p>

	<p>
	The tool can parse the edits formats -18 (roughly Hadoop 0.19)
	and later. The tool operates on files only, it does not need
	Hadoop cluster to be running.
	</p>

	<p>Input formats supported:</p>
	<ol>
	<li><strong>binary</strong>: native binary format that Hadoop uses internally</li>
	<li>
	<strong>xml</strong>: XML format, as produced by
	<strong>xml</strong> processor, used if filename has xml
	(case insensitive) extension
	</li>
	</ol>

	<p>
	The Offline Edits Viewer provides several output processors
	(unless stated otherwise the output of the processor can be
	converted back to original edits file):
	</p>
	<ol>
	<li><strong>binary</strong>: native binary format that Hadoop uses internally</li>
	<li><strong>xml</strong>: XML format</li>
	<li><strong>stats</strong>: prints out statistics, this cannot be converted back to Edits file</li>
	</ol>

	</section> <!-- Overview -->

	<section>

	<title>Usage</title>

	<p><code>bash$ bin/hdfs oev -i edits -o edits.xml</code></p>

	<table>
	<tr><th>Flag</th><th>Description</th></tr>
	<tr>
	<td><code>[-i\|--inputFile] <input file></code></td>
	<td>
	Specify the input edits log file to process. Xml (case
	insensitive) extension means XML format otherwise binary
	format is assumed. Required.
	</td>
	</tr>
	<tr>
	<td><code>[-o\|--outputFile] <output file></code></td>
	<td>
	Specify the output filename, if the specified output processor
	generates one. If the specified file already exists, it is
	silently overwritten. Required.
	</td>
	</tr>
	<tr>
	<td><code>[-p\|--processor] <processor></code></td>
	<td>
	Specify the image processor to apply against the image
	file. Currently valid options are <strong>binary</strong>,
	<strong>xml</strong> (default) and <strong>stats</strong>.
	</td>
	</tr>
	<tr>
	<td><code>[-v\|--verbose]-</code></td>
	<td>
	Print the input and output filenames and pipe output of
	processor to console as well as specified file. On extremely
	large files, this may increase processing time by an order
	of magnitude.
	</td>
	</tr>
	<tr>
	<td><code>[-h\|--help]</code></td>
	<td>
	Display the tool usage and help information and exit.
	</td>
	</tr>
	</table>

	</section> <!-- Usage -->

	<section>

	<title>Case study: Hadoop cluster recovery</title>

	<p>
	In case there is some problem with hadoop cluster and the edits
	file is corrupted it is possible to save at least part of the
	edits file that is correct. This can be done by converting the
	binary edits to XML, edit it manually and then convert it back
	to binary. The most common problem is that the edits file is
	missing the closing record (record that has opCode -1). This
	should be recognized by the tool and the XML format should be
	properly closed.
	</p>

	<p>
	If there is no closing record in the XML file you can add one
	after last correct record. Anything after the record with opCode
	-1 is ignored.
	</p>

	<p>Example of a closing record (with opCode -1):</p>
	<source>
	<RECORD>
	<OPCODE>-1</OPCODE>
	<DATA>
	</DATA>
	</RECORD>
	</source>

	</section> <!-- Case study: Hadoop cluster recovery -->

	</body>

	</document>