| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <document xmlns="http://maven.apache.org/XDOC/2.0" |
| xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" |
| xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd"> |
| |
| <properties> |
| <title>HFDS Snapshots</title> |
| </properties> |
| |
| <body> |
| |
| <h1>HDFS Snapshots</h1> |
| <macro name="toc"> |
| <param name="section" value="0"/> |
| <param name="fromDepth" value="0"/> |
| <param name="toDepth" value="4"/> |
| </macro> |
| |
| <section name="Overview" id="Overview"> |
| <p> |
| HDFS Snapshots are read-only point-in-time copies of the file system. |
| Snapshots can be taken on a subtree of the file system or the entire file system. |
| Some common use cases of snapshots are data backup, protection against user errors |
| and disaster recovery. |
| </p> |
| |
| <p> |
| The implementation of HDFS Snapshots is efficient: |
| </p> |
| <ul> |
| <li>Snapshot creation is instantaneous: |
| the cost is <em>O(1)</em> excluding the inode lookup time.</li> |
| <li>Additional memory is used only when modifications are made relative to a snapshot: |
| memory usage is <em>O(M)</em>, |
| where <em>M</em> is the number of modified files/directories.</li> |
| <li>Blocks in datanodes are not copied: |
| the snapshot files record the block list and the file size. |
| There is no data copying.</li> |
| <li>Snapshots do not adversely affect regular HDFS operations: |
| modifications are recorded in reverse chronological order |
| so that the current data can be accessed directly. |
| The snapshot data is computed by subtracting the modifications |
| from the current data.</li> |
| </ul> |
| |
| <subsection name="Snapshottable Directories" id="SnapshottableDirectories"> |
| <p> |
| Snapshots can be taken on any directory once the directory has been set as |
| <em>snapshottable</em>. |
| A snapshottable directory is able to accommodate 65,536 simultaneous snapshots. |
| There is no limit on the number of snapshottable directories. |
| Administrators may set any directory to be snapshottable. |
| If there are snapshots in a snapshottable directory, |
| the directory can be neither deleted nor renamed |
| before all the snapshots are deleted. |
| </p> |
| <!-- |
| <p> |
| Nested snapshottable directories are currently not allowed. |
| In other words, a directory cannot be set to snapshottable |
| if one of its ancestors is a snapshottable directory. |
| </p> |
| --> |
| </subsection> |
| |
| <subsection name="Snapshot Paths" id="SnapshotPaths"> |
| <p> |
| For a snapshottable directory, |
| the path component <em>".snapshot"</em> is used for accessing its snapshots. |
| Suppose <code>/foo</code> is a snapshottable directory, |
| <code>/foo/bar</code> is a file/directory in <code>/foo</code>, |
| and <code>/foo</code> has a snapshot <code>s0</code>. |
| Then, the path <source>/foo/.snapshot/s0/bar</source> |
| refers to the snapshot copy of <code>/foo/bar</code>. |
| The usual API and CLI can work with the ".snapshot" paths. |
| The following are some examples. |
| </p> |
| <ul> |
| <li>Listing all the snapshots under a snapshottable directory: |
| <source>hdfs dfs -ls /foo/.snapshot</source></li> |
| <li>Listing the files in snapshot <code>s0</code>: |
| <source>hdfs dfs -ls /foo/.snapshot/s0</source></li> |
| <li>Copying a file from snapshot <code>s0</code>: |
| <source>hdfs dfs -cp /foo/.snapshot/s0/bar /tmp</source></li> |
| </ul> |
| <p> |
| The name ".snapshot" is now a reserved file name in HDFS |
| so that users cannot create a file/directory with ".snapshot" as the name. |
| If ".snapshot" is used in a previous version of HDFS, it must be renamed before upgrade; |
| otherwise, upgrade will fail. |
| </p> |
| </subsection> |
| </section> |
| |
| <section name="Snapshot Operations" id="SnapshotOperations"> |
| <subsection name="Administrator Operations" id="AdministratorOperations"> |
| <p> |
| The operations described in this section require superuser privilege. |
| </p> |
| |
| <h4>Allow Snapshots</h4> |
| <p> |
| Allowing snapshots of a directory to be created. |
| If the operation completes successfully, the directory becomes snapshottable. |
| </p> |
| <ul> |
| <li>Command: |
| <source>hdfs dfsadmin -allowSnapshot <path></source></li> |
| <li>Arguments:<table> |
| <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> |
| </table></li> |
| </ul> |
| <p> |
| See also the corresponding Java API |
| <code>void allowSnapshot(Path path)</code> in <code>HdfsAdmin</code>. |
| </p> |
| |
| <h4>Disallow Snapshots</h4> |
| <p> |
| Disallowing snapshots of a directory to be created. |
| All snapshots of the directory must be deleted before disallowing snapshots. |
| </p> |
| <ul> |
| <li>Command: |
| <source>hdfs dfsadmin -disallowSnapshot <path></source></li> |
| <li>Arguments:<table> |
| <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> |
| </table></li> |
| </ul> |
| <p> |
| See also the corresponding Java API |
| <code>void disallowSnapshot(Path path)</code> in <code>HdfsAdmin</code>. |
| </p> |
| </subsection> |
| |
| <subsection name="User Operations" id="UserOperations"> |
| <p> |
| The section describes user operations. |
| Note that HDFS superuser can perform all the operations |
| without satisfying the permission requirement in the individual operations. |
| </p> |
| |
| <h4>Create Snapshots</h4> |
| <p> |
| Create a snapshot of a snapshottable directory. |
| This operation requires owner privilege of the snapshottable directory. |
| </p> |
| <ul> |
| <li>Command: |
| <source>hdfs dfs -createSnapshot <path> [<snapshotName>]</source></li> |
| <li>Arguments:<table> |
| <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> |
| <tr><td>snapshotName</td><td> |
| The snapshot name, which is an optional argument. |
| When it is omitted, a default name is generated using a timestamp with the format |
| <code>"'s'yyyyMMdd-HHmmss.SSS"</code>, e.g. "s20130412-151029.033". |
| </td></tr> |
| </table></li> |
| </ul> |
| <p> |
| See also the corresponding Java API |
| <code>Path createSnapshot(Path path)</code> and |
| <code>Path createSnapshot(Path path, String snapshotName)</code> |
| in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>. |
| The snapshot path is returned in these methods. |
| </p> |
| |
| <h4>Delete Snapshots</h4> |
| <p> |
| Delete a snapshot of from a snapshottable directory. |
| This operation requires owner privilege of the snapshottable directory. |
| </p> |
| <ul> |
| <li>Command: |
| <source>hdfs dfs -deleteSnapshot <path> <snapshotName></source></li> |
| <li>Arguments:<table> |
| <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> |
| <tr><td>snapshotName</td><td>The snapshot name.</td></tr> |
| </table></li> |
| </ul> |
| <p> |
| See also the corresponding Java API |
| <code>void deleteSnapshot(Path path, String snapshotName)</code> |
| in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>. |
| </p> |
| |
| <h4>Rename Snapshots</h4> |
| <p> |
| Rename a snapshot. |
| This operation requires owner privilege of the snapshottable directory. |
| </p> |
| <ul> |
| <li>Command: |
| <source>hdfs dfs -renameSnapshot <path> <oldName> <newName></source></li> |
| <li>Arguments:<table> |
| <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> |
| <tr><td>oldName</td><td>The old snapshot name.</td></tr> |
| <tr><td>newName</td><td>The new snapshot name.</td></tr> |
| </table></li> |
| </ul> |
| <p> |
| See also the corresponding Java API |
| <code>void renameSnapshot(Path path, String oldName, String newName)</code> |
| in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>. |
| </p> |
| |
| <h4>Get Snapshottable Directory Listing</h4> |
| <p> |
| Get all the snapshottable directories where the current user has permission to take snapshtos. |
| </p> |
| <ul> |
| <li>Command: |
| <source>hdfs lsSnapshottableDir</source></li> |
| <li>Arguments: none</li> |
| </ul> |
| <p> |
| See also the corresponding Java API |
| <code>SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing()</code> |
| in <code>DistributedFileSystem</code>. |
| </p> |
| |
| <h4>Get Snapshots Difference Report</h4> |
| <p> |
| Get the differences between two snapshots. |
| This operation requires read access privilege for all files/directories in both snapshots. |
| </p> |
| <ul> |
| <li>Command: |
| <source>hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot></source></li> |
| <li>Arguments:<table> |
| <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> |
| <tr><td>fromSnapshot</td><td>The name of the starting snapshot.</td></tr> |
| <tr><td>toSnapshot</td><td>The name of the ending snapshot.</td></tr> |
| </table></li> |
| </ul> |
| <p> |
| See also the corresponding Java API |
| <code>SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot)</code> |
| in <code>DistributedFileSystem</code>. |
| </p> |
| |
| </subsection> |
| </section> |
| |
| </body> |
| </document> |