blob: 317fe7d46ccf0b64a6d7c1c12ecd0db7d43bea78 [file] [log] [blame]
~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.
---
HDFS Quotas Guide
---
---
${maven.build.timestamp}
HDFS Quotas Guide
\[ {{{./index.html}Go Back}} \]
%{toc|section=1|fromDepth=0}
* Overview
The Hadoop Distributed File System (HDFS) allows the administrator to
set quotas for the number of names used and the amount of space used
for individual directories. Name quotas and space quotas operate
independently, but the administration and implementation of the two
types of quotas are closely parallel.
* Name Quotas
The name quota is a hard limit on the number of file and directory
names in the tree rooted at that directory. File and directory
creations fail if the quota would be exceeded. Quotas stick with
renamed directories; the rename operation fails if operation would
result in a quota violation. The attempt to set a quota will still
succeed even if the directory would be in violation of the new quota. A
newly created directory has no associated quota. The largest quota is
Long.Max_Value. A quota of one forces a directory to remain empty.
(Yes, a directory counts against its own quota!)
Quotas are persistent with the fsimage. When starting, if the fsimage
is immediately in violation of a quota (perhaps the fsimage was
surreptitiously modified), a warning is printed for each of such
violations. Setting or removing a quota creates a journal entry.
* Space Quotas
The space quota is a hard limit on the number of bytes used by files in
the tree rooted at that directory. Block allocations fail if the quota
would not allow a full block to be written. Each replica of a block
counts against the quota. Quotas stick with renamed directories; the
rename operation fails if the operation would result in a quota
violation. A newly created directory has no associated quota. The
largest quota is <<<Long.Max_Value>>>. A quota of zero still permits files
to be created, but no blocks can be added to the files. Directories don't
use host file system space and don't count against the space quota. The
host file system space used to save the file meta data is not counted
against the quota. Quotas are charged at the intended replication
factor for the file; changing the replication factor for a file will
credit or debit quotas.
Quotas are persistent with the fsimage. When starting, if the fsimage
is immediately in violation of a quota (perhaps the fsimage was
surreptitiously modified), a warning is printed for each of such
violations. Setting or removing a quota creates a journal entry.
* Administrative Commands
Quotas are managed by a set of commands available only to the
administrator.
* <<<dfsadmin -setQuota <N> <directory>...<directory> >>>
Set the name quota to be N for each directory. Best effort for each
directory, with faults reported if N is not a positive long
integer, the directory does not exist or it is a file, or the
directory would immediately exceed the new quota.
* <<<dfsadmin -clrQuota <directory>...<directory> >>>
Remove any name quota for each directory. Best effort for each
directory, with faults reported if the directory does not exist or
it is a file. It is not a fault if the directory has no quota.
* <<<dfsadmin -setSpaceQuota <N> <directory>...<directory> >>>
Set the space quota to be N bytes for each directory. This is a
hard limit on total size of all the files under the directory tree.
The space quota takes replication also into account, i.e. one GB of
data with replication of 3 consumes 3GB of quota. N can also be
specified with a binary prefix for convenience, for e.g. 50g for 50
gigabytes and 2t for 2 terabytes etc. Best effort for each
directory, with faults reported if N is neither zero nor a positive
integer, the directory does not exist or it is a file, or the
directory would immediately exceed the new quota.
* <<<dfsadmin -clrSpaceQuota <directory>...<director> >>>
Remove any space quota for each directory. Best effort for each
directory, with faults reported if the directory does not exist or
it is a file. It is not a fault if the directory has no quota.
* Reporting Command
An an extension to the count command of the HDFS shell reports quota
values and the current count of names and bytes in use.
* <<<fs -count -q <directory>...<directory> >>>
With the -q option, also report the name quota value set for each
directory, the available name quota remaining, the space quota
value set, and the available space quota remaining. If the
directory does not have a quota set, the reported values are <<<none>>>
and <<<inf>>>.