WebHDFS REST API

Document Conventions

Monospaced      Used for commands, HTTP requests and responses, and code blocks.
<Monospaced>    User entered values.
[Monospaced]    Optional values. When the value is not specified, the default value is used.
Italics         Important phrases and words.

Introduction

The HTTP REST API supports the complete FileSystem/FileContext interface for HDFS. The operations and the corresponding FileSystem/FileContext methods are shown in the next section. The Section HTTP Query Parameter Dictionary specifies the parameter details such as the defaults and the valid values.

Operations

FileSystem URIs vs HTTP URLs

The FileSystem scheme of WebHDFS is “webhdfs://”. A WebHDFS FileSystem URI has the following format.

  webhdfs://<HOST>:<HTTP_PORT>/<PATH>

The above WebHDFS URI corresponds to the below HDFS URI.

  hdfs://<HOST>:<RPC_PORT>/<PATH>

In the REST API, the prefix “/webhdfs/v1” is inserted in the path and a query is appended at the end. Therefore, the corresponding HTTP URL has the following format.

  http://<HOST>:<HTTP_PORT>/webhdfs/v1/<PATH>?op=...

Note that if WebHDFS is secured with SSL, then the scheme should be “swebhdfs://”.

  swebhdfs://<HOST>:<HTTP_PORT>/<PATH>

HDFS Configuration Options

Below are the HDFS configuration options for WebHDFS.

dfs.webhdfs.enabled
    Enable/disable WebHDFS in Namenodes and Datanodes.

dfs.web.authentication.kerberos.principal
    The HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint. The HTTP Kerberos principal MUST start with ‘HTTP/’ per the Kerberos HTTP SPNEGO specification. A value of “*” will use all HTTP principals found in the keytab.

dfs.web.authentication.kerberos.keytab
    The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.

dfs.webhdfs.socket.connect-timeout
    How long to wait for a connection to be established before failing. Specified as a time duration, i.e. a numerical value followed by a units symbol, e.g. 2m for two minutes. Defaults to 60s.

dfs.webhdfs.socket.read-timeout
    How long to wait for data to arrive before failing. Defaults to 60s.
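
A minimal way to check these settings on a running cluster, assuming shell access to a cluster node and a NameNode HTTP address of <HOST>:<HTTP_PORT> (placeholders):

    # Print the effective value of dfs.webhdfs.enabled from the local client configuration.
    hdfs getconf -confKey dfs.webhdfs.enabled

    # If WebHDFS is enabled, a status call on the root path returns a FileStatus JSON object.
    curl -i "http://<HOST>:<HTTP_PORT>/webhdfs/v1/?op=GETFILESTATUS"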

Authentication

When security is off, the authenticated user is the username specified in the user.name query parameter. If the user.name parameter is not set, the server may either set the authenticated user to a default web user, if there is any, or return an error response.

When security is on, authentication is performed by either Hadoop delegation token or Kerberos SPNEGO. If a token is set in the delegation query parameter, the authenticated user is the user encoded in the token. If the delegation parameter is not set, the user is authenticated by Kerberos SPNEGO.

Below are examples using the curl command tool.

  1. Authentication when security is off:

    curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?[user.name=<USER>&]op=..."
    
  2. Authentication using Kerberos SPNEGO when security is on:

    curl -i --negotiate -u : "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=..."
    
  3. Authentication using Hadoop delegation token when security is on:

    curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?delegation=<TOKEN>&op=..."
    

See also: Authentication for Hadoop HTTP web-consoles

Additionally, WebHDFS supports OAuth2 on the client side. The Namenode and Datanodes do not currently support clients using OAuth2 but other backends that implement the WebHDFS REST interface may.

WebHDFS supports two types of OAuth2 code grants (user-provided refresh and access tokens, or a user-provided credential) by default and provides a pluggable mechanism for implementing other OAuth2 authentications per the OAuth2 RFC, or custom authentications. When using either of the provided code grant mechanisms, the WebHDFS client will refresh the access token as necessary.

OAuth2 should only be enabled for clients not running with Kerberos SPNEGO.

OAuth2 code grant mechanisms and the corresponding values of dfs.webhdfs.oauth2.access.token.provider that implement each code grant:

Authorization Code Grant
    The user provides an initial access token and refresh token, which are then used to authenticate WebHDFS requests and obtain replacement access tokens, respectively.
    Provider: org.apache.hadoop.hdfs.web.oauth2.ConfRefreshTokenBasedAccessTokenProvider

Client Credentials Grant
    The user provides a credential which is used to obtain access tokens, which are then used to authenticate WebHDFS requests.
    Provider: org.apache.hadoop.hdfs.web.oauth2.ConfCredentialBasedAccessTokenProvider

The following properties control OAuth2 authentication.

dfs.webhdfs.oauth2.enabled
    Boolean to enable/disable OAuth2 authentication.

dfs.webhdfs.oauth2.access.token.provider
    Class name of an implementation of org.apache.hadoop.hdfs.web.oauth2.AccessTokenProvider. Two are provided with the code, as described above, or the user may specify a user-provided implementation. The default value for this configuration key is the ConfCredentialBasedAccessTokenProvider implementation.

dfs.webhdfs.oauth2.client.id
    Client id used to obtain the access token with either credential or refresh token.

dfs.webhdfs.oauth2.refresh.url
    URL against which to post for obtaining the bearer token with either credential or refresh token.

dfs.webhdfs.oauth2.access.token
    (Required if using ConfRefreshTokenBasedAccessTokenProvider.) Initial access token with which to authenticate.

dfs.webhdfs.oauth2.refresh.token
    (Required if using ConfRefreshTokenBasedAccessTokenProvider.) Initial refresh token to use to obtain new access tokens.

dfs.webhdfs.oauth2.refresh.token.expires.ms.since.epoch
    (Required if using ConfRefreshTokenBasedAccessTokenProvider.) Access token expiration measured in milliseconds since Jan 1, 1970. Note this is a different value than provided by OAuth providers and has been munged as described in the interface to be suitable for a client application.

dfs.webhdfs.oauth2.credential
    (Required if using ConfCredentialBasedAccessTokenProvider.) Credential used to obtain initial and subsequent access tokens.
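
Because these are client-side settings, they can also be supplied per invocation via generic -D options. The following is a hedged sketch, not a verified recipe: the client id, refresh URL, credential, and gateway address are placeholders, and the target must be a WebHDFS-compatible backend that actually accepts OAuth2.

    hadoop fs \
      -D dfs.webhdfs.oauth2.enabled=true \
      -D dfs.webhdfs.oauth2.access.token.provider=org.apache.hadoop.hdfs.web.oauth2.ConfCredentialBasedAccessTokenProvider \
      -D dfs.webhdfs.oauth2.client.id=<CLIENT_ID> \
      -D dfs.webhdfs.oauth2.refresh.url=<REFRESH_URL> \
      -D dfs.webhdfs.oauth2.credential=<CREDENTIAL> \
      -ls "webhdfs://<GATEWAY_HOST>:<HTTP_PORT>/<PATH>"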

Proxy Users

When the proxy user feature is enabled, a proxy user P may submit a request on behalf of another user U. The username of U must be specified in the doas query parameter unless a delegation token is presented in authentication. In that case, the information of both users P and U must be encoded in the delegation token.

  1. A proxy request when security is off:

    curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?[user.name=<USER>&]doas=<USER>&op=..."
    
  2. A proxy request using Kerberos SPNEGO when security is on:

    curl -i --negotiate -u : "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?doas=<USER>&op=..."
    
  3. A proxy request using Hadoop delegation token when security is on:

    curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?delegation=<TOKEN>&op=..."
    

Cross-Site Request Forgery Prevention

WebHDFS supports an optional, configurable mechanism for cross-site request forgery (CSRF) prevention. When enabled, WebHDFS HTTP requests to the NameNode or DataNode must include a custom HTTP header. Configuration properties allow adjusting which specific HTTP methods are protected and the name of the HTTP header. The value sent in the header is not relevant. Only the presence of a header by that name is required.

Enabling CSRF prevention also sets up the WebHdfsFileSystem class to send the required header. This ensures that CLI commands like hdfs dfs and hadoop distcp continue to work correctly when used with webhdfs: URIs.

Enabling CSRF prevention also sets up the NameNode web UI to send the required header. After enabling CSRF prevention and restarting the NameNode, existing users of the NameNode web UI need to refresh the browser to reload the page and find the new configuration.

The following properties control CSRF prevention.

dfs.webhdfs.rest-csrf.enabled
    If true, then enables WebHDFS protection against cross-site request forgery (CSRF). The WebHDFS client also uses this property to determine whether or not it needs to send the custom CSRF prevention header in its HTTP requests.
    Default: false

dfs.webhdfs.rest-csrf.custom-header
    The name of a custom header that HTTP requests must send when protection against cross-site request forgery (CSRF) is enabled for WebHDFS by setting dfs.webhdfs.rest-csrf.enabled to true. The WebHDFS client also uses this property to determine whether or not it needs to send the custom CSRF prevention header in its HTTP requests.
    Default: X-XSRF-HEADER

dfs.webhdfs.rest-csrf.methods-to-ignore
    A comma-separated list of HTTP methods that do not require HTTP requests to include a custom header when protection against cross-site request forgery (CSRF) is enabled for WebHDFS by setting dfs.webhdfs.rest-csrf.enabled to true. The WebHDFS client also uses this property to determine whether or not it needs to send the custom CSRF prevention header in its HTTP requests.
    Default: GET,OPTIONS,HEAD,TRACE

dfs.webhdfs.rest-csrf.browser-useragents-regex
    A comma-separated list of regular expressions used to match against an HTTP request's User-Agent header when protection against cross-site request forgery (CSRF) is enabled for WebHDFS by setting dfs.webhdfs.rest-csrf.enabled to true. If the incoming User-Agent matches any of these regular expressions, then the request is considered to be sent by a browser, and therefore CSRF prevention is enforced. If the request's User-Agent does not match any of these regular expressions, then the request is considered to be sent by something other than a browser, such as scripted automation. In this case, CSRF is not a potential attack vector, so the prevention is not enforced. This helps achieve backwards-compatibility with existing automation that has not been updated to send the CSRF prevention header.
    Default: ^Mozilla.*,^Opera.*

The following is an example curl call that uses the -H option to include the custom header in the request.

    curl -i -L -X PUT -H 'X-XSRF-HEADER: ""' 'http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE'

WebHDFS Retry Policy

WebHDFS supports an optional, configurable retry policy for resilient copies of large files that could time out, or copies between HA clusters that could fail over during the copy.

The following properties control WebHDFS retry and failover policy.

dfs.http.client.retry.policy.enabled
    If “true”, the retry policy of the WebHDFS client is enabled. If “false”, the retry policy is turned off.
    Default: false

dfs.http.client.retry.policy.spec
    Specify a policy of multiple linear random retries for the WebHDFS client, e.g. given pairs of number of retries and sleep time (n0, t0), (n1, t1), ..., the first n0 retries sleep t0 milliseconds on average, the following n1 retries sleep t1 milliseconds on average, and so on.
    Default: 10000,6,60000,10

dfs.http.client.failover.max.attempts
    Specify the max number of failover attempts for the WebHDFS client in case of network exceptions.
    Default: 15

dfs.http.client.retry.max.attempts
    Specify the max number of retry attempts for the WebHDFS client; if the difference between retried attempts and failovered attempts is larger than the max number of retry attempts, there will be no more retries.
    Default: 10

dfs.http.client.failover.sleep.base.millis
    Specify the base amount of time in milliseconds upon which the exponentially increased sleep time between retries or failovers is calculated for the WebHDFS client.
    Default: 500

dfs.http.client.failover.sleep.max.millis
    Specify the upper bound of sleep time in milliseconds between retries or failovers for the WebHDFS client.
    Default: 15000
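
Since these are also client-side settings, they can be passed per job with generic -D options. A hedged sketch of a distcp over WebHDFS with the retry policy enabled (cluster addresses and paths are placeholders):

    hadoop distcp \
      -D dfs.http.client.retry.policy.enabled=true \
      -D dfs.http.client.retry.max.attempts=10 \
      -D dfs.http.client.failover.max.attempts=15 \
      "webhdfs://<SOURCE_HOST>:<HTTP_PORT>/<SRC_PATH>" \
      "hdfs://<DEST_HOST>:<RPC_PORT>/<DST_PATH>"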

File and Directory Operations

Create and Write to a File

  • Step 1: Submit a HTTP PUT request without automatically following redirects and without sending the file data.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE
                          [&overwrite=<true |false>][&blocksize=<LONG>][&replication=<SHORT>]
                          [&permission=<OCTAL>][&buffersize=<INT>][&noredirect=<true|false>]"
    

    Usually the request is redirected to a datanode where the file data is to be written.

      HTTP/1.1 307 TEMPORARY_REDIRECT
      Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE...
      Content-Length: 0
    

    However, if you do not want to be automatically redirected, you can set the noredirect flag.

      HTTP/1.1 200 OK
      Content-Type: application/json
      {"Location":"http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE..."}
    
  • Step 2: Submit another HTTP PUT request using the URL in the Location header (or the returned response in case you specified noredirect) with the file data to be written.

      curl -i -X PUT -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE..."
    

    The client receives a 201 Created response with zero content length and the WebHDFS URI of the file in the Location header:

      HTTP/1.1 201 Created
      Location: webhdfs://<HOST>:<PORT>/<PATH>
      Content-Length: 0
    

If no permission is specified, the newly created file will be assigned the default 755 permission. The same applies to new directories. No umask mode is applied on the server side (so the “fs.permissions.umask-mode” configuration set on the Namenode side has no effect).

Note that the reason for the two-step create/append is to prevent clients from sending out data before the redirect. This issue is addressed by the “Expect: 100-continue” header in HTTP/1.1; see RFC 2616, Section 8.2.3. Unfortunately, there are software library bugs (e.g. Jetty 6 HTTP server and Java 6 HTTP client), which do not correctly implement “Expect: 100-continue”. The two-step create/append is a temporary workaround for the software library bugs.
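
For scripting, the two steps can be chained by capturing the Location header from step 1 and sending the data to it in step 2. A minimal sketch (placeholders throughout; the header parsing assumes a plain HTTP/1.1 307 response):

    # Step 1: ask the NameNode where to write; keep only the Location header.
    LOC=$(curl -s -D - -o /dev/null -X PUT \
          "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE" \
          | grep -i '^Location:' | awk '{print $2}' | tr -d '\r')

    # Step 2: send the file data to the datanode URL returned above.
    curl -i -X PUT -T <LOCAL_FILE> "$LOC"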

See also: overwrite, blocksize, replication, permission, buffersize, FileSystem.create

Append to a File

  • Step 1: Submit a HTTP POST request without automatically following redirects and without sending the file data.

      curl -i -X POST "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=APPEND[&buffersize=<INT>][&noredirect=<true|false>]"
    

    Usually the request is redirected to a datanode where the file data is to be appended:

      HTTP/1.1 307 TEMPORARY_REDIRECT
      Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=APPEND...
      Content-Length: 0
    

    However, if you do not want to be automatically redirected, you can set the noredirect flag.

      HTTP/1.1 200 OK
      Content-Type: application/json
      {"Location":"http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=APPEND..."}
    
  • Step 2: Submit another HTTP POST request using the URL in the Location header (or the returned response in case you specified noredirect) with the file data to be appended.

      curl -i -X POST -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=APPEND..."
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See the note in the previous section for the description of why this operation requires two steps.

See also: buffersize, FileSystem.append

Concat File(s)

  • Submit a HTTP POST request.

      curl -i -X POST "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CONCAT&sources=<PATHS>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: sources, FileSystem.concat

Open and Read a File

  • Submit a HTTP GET request with automatically following redirects.

      curl -i -L "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN
                          [&offset=<LONG>][&length=<LONG>][&buffersize=<INT>][&noredirect=<true|false>]"
    

    Usually the request is redirected to a datanode where the file data can be read:

      HTTP/1.1 307 TEMPORARY_REDIRECT
      Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=OPEN...
      Content-Length: 0
    

    However if you do not want to be automatically redirected, you can set the noredirect flag.

      HTTP/1.1 200 OK
      Content-Type: application/json
      {"Location":"http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=OPEN..."}
    

    The client follows the redirect to the datanode and receives the file data:

      HTTP/1.1 200 OK
      Content-Type: application/octet-stream
      Content-Length: 22
    
      Hello, webhdfs user!
    

See also: offset, length, buffersize, FileSystem.open

Make a Directory

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=MKDIRS[&permission=<OCTAL>]"
    

    The client receives a response with a boolean JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"boolean": true}
    

If no permission is specified, the newly created directory will be assigned the default 755 permission. The same applies to new files. No umask mode is applied on the server side (so the “fs.permissions.umask-mode” configuration set on the Namenode side has no effect).

See also: permission, FileSystem.mkdirs

Create a Symbolic Link

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATESYMLINK
                                    &destination=<PATH>[&createParent=<true |false>]"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: destination, createParent, FileSystem.createSymlink

Rename a File/Directory

  • Submit a HTTP PUT request.

      curl -i -X PUT "<HOST>:<PORT>/webhdfs/v1/<PATH>?op=RENAME&destination=<PATH>"
    

    The client receives a response with a boolean JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"boolean": true}
    

See also: destination, FileSystem.rename

Delete a File/Directory

  • Submit a HTTP DELETE request.

      curl -i -X DELETE "http://<host>:<port>/webhdfs/v1/<path>?op=DELETE
                                    [&recursive=<true |false>]"
    

    The client receives a response with a boolean JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"boolean": true}
    

See also: recursive, FileSystem.delete

Truncate a File

  • Submit a HTTP POST request.

      curl -i -X POST "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=TRUNCATE&newlength=<LONG>"
    

    The client receives a response with a boolean JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"boolean": true}
    

See also: newlength, FileSystem.truncate

Status of a File/Directory

  • Submit a HTTP GET request.

      curl -i  "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETFILESTATUS"
    

    The client receives a response with a FileStatus JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
        "FileStatus":
        {
          "accessTime"      : 0,
          "blockSize"       : 0,
          "group"           : "supergroup",
          "length"          : 0,             //in bytes, zero for directories
          "modificationTime": 1320173277227,
          "owner"           : "webuser",
          "pathSuffix"      : "",
          "permission"      : "777",
          "replication"     : 0,
          "type"            : "DIRECTORY"    //enum {FILE, DIRECTORY, SYMLINK}
        }
      }
    

See also: FileSystem.getFileStatus

List a Directory

  • Submit a HTTP GET request.

      curl -i  "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS"
    

    The client receives a response with a FileStatuses JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Content-Length: 427
    
      {
        "FileStatuses":
        {
          "FileStatus":
          [
            {
              "accessTime"      : 1320171722771,
              "blockSize"       : 33554432,
              "group"           : "supergroup",
              "length"          : 24930,
              "modificationTime": 1320171722771,
              "owner"           : "webuser",
              "pathSuffix"      : "a.patch",
              "permission"      : "644",
              "replication"     : 1,
              "type"            : "FILE"
            },
            {
              "accessTime"      : 0,
              "blockSize"       : 0,
              "group"           : "supergroup",
              "length"          : 0,
              "modificationTime": 1320895981256,
              "owner"           : "username",
              "pathSuffix"      : "bar",
              "permission"      : "711",
              "replication"     : 0,
              "type"            : "DIRECTORY"
            },
            ...
          ]
        }
      }
    

See also: FileSystem.listStatus

Iteratively List a Directory

  • Submit a HTTP GET request.

      curl -i  "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS_BATCH&startAfter=<CHILD>"
    

    The client receives a response with a DirectoryListing JSON object, which contains a FileStatuses JSON object, as well as iteration information:

      HTTP/1.1 200 OK
      Cache-Control: no-cache
      Expires: Thu, 08 Sep 2016 03:40:38 GMT
      Date: Thu, 08 Sep 2016 03:40:38 GMT
      Pragma: no-cache
      Expires: Thu, 08 Sep 2016 03:40:38 GMT
      Date: Thu, 08 Sep 2016 03:40:38 GMT
      Pragma: no-cache
      Content-Type: application/json
      X-FRAME-OPTIONS: SAMEORIGIN
      Transfer-Encoding: chunked
      Server: Jetty(6.1.26)
    
      {
          "DirectoryListing": {
              "partialListing": {
                  "FileStatuses": {
                      "FileStatus": [
                          {
                              "accessTime": 0,
                              "blockSize": 0,
                              "childrenNum": 0,
                              "fileId": 16387,
                              "group": "supergroup",
                              "length": 0,
                              "modificationTime": 1473305882563,
                              "owner": "andrew",
                              "pathSuffix": "bardir",
                              "permission": "755",
                              "replication": 0,
                              "storagePolicy": 0,
                              "type": "DIRECTORY"
                          },
                          {
                              "accessTime": 1473305896945,
                              "blockSize": 1024,
                              "childrenNum": 0,
                              "fileId": 16388,
                              "group": "supergroup",
                              "length": 0,
                              "modificationTime": 1473305896965,
                              "owner": "andrew",
                              "pathSuffix": "bazfile",
                              "permission": "644",
                              "replication": 3,
                              "storagePolicy": 0,
                              "type": "FILE"
                          }
                      ]
                  }
              },
              "remainingEntries": 2
          }
      }
    

If remainingEntries is non-zero, there are additional entries in the directory. To query the next batch, set the startAfter parameter to the pathSuffix of the last item returned in the current batch. For example:

    curl -i  "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS_BATCH&startAfter=bazfile"

This returns the next batch of directory entries:

    HTTP/1.1 200 OK
    Cache-Control: no-cache
    Expires: Thu, 08 Sep 2016 03:43:20 GMT
    Date: Thu, 08 Sep 2016 03:43:20 GMT
    Pragma: no-cache
    Expires: Thu, 08 Sep 2016 03:43:20 GMT
    Date: Thu, 08 Sep 2016 03:43:20 GMT
    Pragma: no-cache
    Content-Type: application/json
    X-FRAME-OPTIONS: SAMEORIGIN
    Transfer-Encoding: chunked
    Server: Jetty(6.1.26)

    {
        "DirectoryListing": {
            "partialListing": {
                "FileStatuses": {
                    "FileStatus": [
                        {
                            "accessTime": 0,
                            "blockSize": 0,
                            "childrenNum": 0,
                            "fileId": 16386,
                            "group": "supergroup",
                            "length": 0,
                            "modificationTime": 1473305878951,
                            "owner": "andrew",
                            "pathSuffix": "foodir",
                            "permission": "755",
                            "replication": 0,
                            "storagePolicy": 0,
                            "type": "DIRECTORY"
                        },
                        {
                            "accessTime": 1473305902864,
                            "blockSize": 1024,
                            "childrenNum": 0,
                            "fileId": 16389,
                            "group": "supergroup",
                            "length": 0,
                            "modificationTime": 1473305902878,
                            "owner": "andrew",
                            "pathSuffix": "quxfile",
                            "permission": "644",
                            "replication": 3,
                            "storagePolicy": 0,
                            "type": "FILE"
                        }
                    ]
                }
            },
            "remainingEntries": 0
        }
    }

Batch size is controlled by the dfs.ls.limit option on the NameNode.
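
Putting it together, a client can walk an arbitrarily large directory by looping until remainingEntries reaches zero. A hedged sketch (host and path are placeholders; jq is assumed to be available for JSON parsing):

    BASE="http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS_BATCH"
    START=""
    while : ; do
      RESP=$(curl -s "$BASE$START")
      # Print the entry names in this batch.
      echo "$RESP" | jq -r '.DirectoryListing.partialListing.FileStatuses.FileStatus[].pathSuffix'
      # Stop when the NameNode reports no remaining entries.
      [ "$(echo "$RESP" | jq '.DirectoryListing.remainingEntries')" -eq 0 ] && break
      # Otherwise continue after the last entry returned in this batch.
      LAST=$(echo "$RESP" | jq -r '.DirectoryListing.partialListing.FileStatuses.FileStatus[-1].pathSuffix')
      START="&startAfter=$LAST"
    done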

See also: FileSystem.listStatusIterator

Other File System Operations

Get Content Summary of a Directory

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETCONTENTSUMMARY"
    

    The client receives a response with a ContentSummary JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
        "ContentSummary":
        {
          "directoryCount": 2,
          "fileCount"     : 1,
          "length"        : 24930,
          "quota"         : -1,
          "spaceConsumed" : 24930,
          "spaceQuota"    : -1,
          "typeQuota":
          {
            "ARCHIVE":
            {
              "consumed": 500,
              "quota": 10000
            },
            "DISK":
            {
              "consumed": 500,
              "quota": 10000
            },
            "SSD":
            {
              "consumed": 500,
              "quota": 10000
            }
          }
        }
      }
    

See also: FileSystem.getContentSummary

Get Quota Usage of a Directory

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETQUOTAUSAGE"
    

    The client receives a response with a QuotaUsage JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
        "QuotaUsage":
        {
          "fileAndDirectoryCount": 1,
          "quota"         : 100,
          "spaceConsumed" : 24930,
          "spaceQuota"    : 100000,
          "typeQuota":
          {
            "ARCHIVE":
            {
              "consumed": 500,
              "quota": 10000
            },
            "DISK":
            {
              "consumed": 500,
              "quota": 10000
            },
            "SSD":
            {
              "consumed": 500,
              "quota": 10000
            }
          }
        }
      }
    

See also: FileSystem.getQuotaUsage

Get File Checksum

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETFILECHECKSUM"
    

    Usually the request is redirected to a datanode:

      HTTP/1.1 307 TEMPORARY_REDIRECT
      Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=GETFILECHECKSUM...
      Content-Length: 0
    

    However, if you do not want to be automatically redirected, you can set the noredirect flag.

      HTTP/1.1 200 OK
      Content-Type: application/json
      {"Location":"http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=GETFILECHECKSUM..."}
    

    The client follows the redirect to the datanode and receives a FileChecksum JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
        "FileChecksum":
        {
          "algorithm": "MD5-of-1MD5-of-512CRC32",
          "bytes"    : "eadb10de24aa315748930df6e185c0d ...",
          "length"   : 28
        }
      }
    

See also: FileSystem.getFileChecksum

Get Home Directory

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/?op=GETHOMEDIRECTORY"
    

    The client receives a response with a Path JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"Path": "/user/username"}
    

See also: FileSystem.getHomeDirectory

Get Trash Root

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETTRASHROOT"
    

    The client receives a response with a Path JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"Path": "/user/username/.Trash"}
    

    If the path is within an encryption zone and the user has permission to access the path, the client receives a response like this:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"Path": "/PATH/.Trash/username"}
    

See also: FileSystem.getTrashRoot

For more details about the trash root in an encryption zone, please refer to the Transparent Encryption Guide.

Set Permission

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETPERMISSION
                                    [&permission=<OCTAL>]"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: permission, FileSystem.setPermission

Set Owner

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETOWNER
                                    [&owner=<USER>][&group=<GROUP>]"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: owner, group, FileSystem.setOwner

Set Replication Factor

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETREPLICATION
                                    [&replication=<SHORT>]"
    

    The client receives a response with a boolean JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"boolean": true}
    

See also: replication, FileSystem.setReplication

Set Access or Modification Time

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETTIMES
                                    [&modificationtime=<TIME>][&accesstime=<TIME>]"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: modificationtime, accesstime, FileSystem.setTimes

Modify ACL Entries

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=MODIFYACLENTRIES
                                    &aclspec=<ACLSPEC>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.modifyAclEntries

Remove ACL Entries

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=REMOVEACLENTRIES
                                    &aclspec=<ACLSPEC>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.removeAclEntries

Remove Default ACL

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=REMOVEDEFAULTACL"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.removeDefaultAcl

Remove ACL

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=REMOVEACL"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.removeAcl

Set ACL

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETACL
                                    &aclspec=<ACLSPEC>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.setAcl

Get ACL Status

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETACLSTATUS"
    

    The client receives a response with a AclStatus JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
          "AclStatus": {
              "entries": [
                  "user:carla:rw-", 
                  "group::r-x"
              ], 
              "group": "supergroup", 
              "owner": "hadoop", 
              "permission":"775",
              "stickyBit": false
          }
      }
    

See also: FileSystem.getAclStatus

Check access

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CHECKACCESS
                                    &fsaction=<FSACTION>
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.access

Storage Policy Operations

Get all Storage Policies

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1?op=GETALLSTORAGEPOLICY"
    

    The client receives a response with a BlockStoragePolicies JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
          "BlockStoragePolicies": {
              "BlockStoragePolicy": [
                 {
                     "copyOnCreateFile": false,
                     "creationFallbacks": [],
                     "id": 2,
                     "name": "COLD",
                     "replicationFallbacks": [],
                     "storageTypes": ["ARCHIVE"]
                 },
                 {
                     "copyOnCreateFile": false,
                     "creationFallbacks": ["DISK","ARCHIVE"],
                     "id": 5,
                     "name": "WARM",
                     "replicationFallbacks": ["DISK","ARCHIVE"],
                     "storageTypes": ["DISK","ARCHIVE"]
                 },
                 {
                     "copyOnCreateFile": false,
                     "creationFallbacks": [],
                     "id": 7,
                     "name": "HOT",
                     "replicationFallbacks": ["ARCHIVE"],
                     "storageTypes": ["DISK"]
                 },
                 {
                     "copyOnCreateFile": false,
                     "creationFallbacks": ["SSD","DISK"],
                     "id": 10,"name": "ONE_SSD",
                     "replicationFallbacks": ["SSD","DISK"],
                     "storageTypes": ["SSD","DISK"]
                 },
                 {
                     "copyOnCreateFile": false,
                     "creationFallbacks": ["DISK"],
                     "id": 12,
                     "name": "ALL_SSD",
                     "replicationFallbacks": ["DISK"],
                     "storageTypes": ["SSD"]
                 },
                 {
                     "copyOnCreateFile": true,
                     "creationFallbacks": ["DISK"],
                     "id": 15,
                     "name": "LAZY_PERSIST",
                     "replicationFallbacks": ["DISK"],
                     "storageTypes": ["RAM_DISK","DISK"]
                 }
             ]
         }
      }
    

See also: FileSystem.getAllStoragePolicies

Set Storage Policy

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETSTORAGEPOLICY
                                    &storagepolicy=<policy>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.setStoragePolicy

Unset Storage Policy

  • Submit a HTTP POST request.

      curl -i -X POST "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=UNSETSTORAGEPOLICY"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.unsetStoragePolicy

Get Storage Policy

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETSTORAGEPOLICY"
    

    The client receives a response with a BlockStoragePolicy JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
          "BlockStoragePolicy": {
              "copyOnCreateFile": false,
              "creationFallbacks": [],
              "id":7,
              "name":"HOT",
              "replicationFallbacks":["ARCHIVE"],
              "storageTypes":["DISK"]
          }
      }
    

See also: FileSystem.getStoragePolicy

Get File Block Locations

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETFILEBLOCKLOCATIONS
    

    The client receives a response with a BlockLocations JSON Object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
        "BlockLocations" :
        {
          "BlockLocation":
          [
            {
              "cachedHosts" : [],
              "corrupt" : false,
              "hosts" : ["host"],
              "length" : 134217728,                             // length of this block
              "names" : ["host:ip"],
              "offset" : 0,                                     // offset of the block in the file
              "storageIds" : ["storageid"],
              "storageTypes" : ["DISK"],                        // enum {RAM_DISK, SSD, DISK, ARCHIVE}
              "topologyPaths" : ["/default-rack/hostname:ip"]
            }, {
              "cachedHosts" : [],
              "corrupt" : false,
              "hosts" : ["host"],
              "length" : 62599364,
              "names" : ["host:ip"],
              "offset" : 134217728,
              "storageIds" : ["storageid"],
              "storageTypes" : ["DISK"],
              "topologyPaths" : ["/default-rack/hostname:ip"]
            },
            ...
          ]
        }
      }
    

See also: offset, length, FileSystem.getFileBlockLocations

Extended Attributes(XAttrs) Operations

Set XAttr

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETXATTR
                                    &xattr.name=<XATTRNAME>&xattr.value=<XATTRVALUE>
                                    &flag=<FLAG>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.setXAttr

Remove XAttr

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=REMOVEXATTR
                                    &xattr.name=<XATTRNAME>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.removeXAttr

Get an XAttr

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETXATTRS
                                    &xattr.name=<XATTRNAME>&encoding=<ENCODING>"
    

    The client receives a response with a XAttrs JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
          "XAttrs": [
              {
                  "name":"XATTRNAME",
                  "value":"XATTRVALUE"
              }
          ]
      }
    

See also: FileSystem.getXAttr

Get multiple XAttrs

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETXATTRS
                                    &xattr.name=<XATTRNAME1>&xattr.name=<XATTRNAME2>
                                    &encoding=<ENCODING>"
    

    The client receives a response with a XAttrs JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
          "XAttrs": [
              {
                  "name":"XATTRNAME1",
                  "value":"XATTRVALUE1"
              },
              {
                  "name":"XATTRNAME2",
                  "value":"XATTRVALUE2"
              }
          ]
      }
    

See also: FileSystem.getXAttrs

Get all XAttrs

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETXATTRS
                                    &encoding=<ENCODING>"
    

    The client receives a response with a XAttrs JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
          "XAttrs": [
              {
                  "name":"XATTRNAME1",
                  "value":"XATTRVALUE1"
              },
              {
                  "name":"XATTRNAME2",
                  "value":"XATTRVALUE2"
              },
              {
                  "name":"XATTRNAME3",
                  "value":"XATTRVALUE3"
              }
          ]
      }
    

See also: FileSystem.getXAttrs

List all XAttrs

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTXATTRS"
    

    The client receives a response with a XAttrNames JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
          "XAttrNames":"[\"XATTRNAME1\",\"XATTRNAME2\",\"XATTRNAME3\"]"
      }
    

See also: FileSystem.listXAttrs

Snapshot Operations

Allow Snapshot

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=ALLOWSNAPSHOT"
    

    The client receives a response with zero content length on success:

      HTTP/1.1 200 OK
      Content-Length: 0
    

Disallow Snapshot

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=DISALLOWSNAPSHOT"
    

    The client receives a response with zero content length on success:

      HTTP/1.1 200 OK
      Content-Length: 0
    

Create Snapshot

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATESNAPSHOT[&snapshotname=<SNAPSHOTNAME>]"
    

    The client receives a response with a Path JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"Path": "/user/username/.snapshot/s1"}
    

See also: FileSystem.createSnapshot

Delete Snapshot

  • Submit a HTTP DELETE request.

      curl -i -X DELETE "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=DELETESNAPSHOT&snapshotname=<SNAPSHOTNAME>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.deleteSnapshot

Rename Snapshot

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=RENAMESNAPSHOT
                         &oldsnapshotname=<SNAPSHOTNAME>&snapshotname=<SNAPSHOTNAME>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: FileSystem.renameSnapshot

Delegation Token Operations

Get Delegation Token

  • Submit a HTTP GET request.

      curl -i "http://<HOST>:<PORT>/webhdfs/v1/?op=GETDELEGATIONTOKEN&renewer=<USER>&service=<SERVICE>&kind=<KIND>"
    

    The client receives a response with a Token JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {
        "Token":
        {
          "urlString": "JQAIaG9y..."
        }
      }
    

See also: renewer, FileSystem.getDelegationToken, kind, service
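
An end-to-end sketch of fetching a token over Kerberos SPNEGO and reusing it on a later call (host, port, path, and renewer are placeholders; jq is assumed to be available):

    # Obtain a delegation token while Kerberos credentials are available.
    TOKEN=$(curl -s --negotiate -u : \
            "http://<HOST>:<PORT>/webhdfs/v1/?op=GETDELEGATIONTOKEN&renewer=<USER>" \
            | jq -r '.Token.urlString')

    # Use the token instead of SPNEGO on subsequent requests.
    curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS&delegation=$TOKEN"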

Renew Delegation Token

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/?op=RENEWDELEGATIONTOKEN&token=<TOKEN>"
    

    The client receives a response with a long JSON object:

      HTTP/1.1 200 OK
      Content-Type: application/json
      Transfer-Encoding: chunked
    
      {"long": 1320962673997}           //the new expiration time
    

See also: token, DelegationTokenAuthenticator.renewDelegationToken

Cancel Delegation Token

  • Submit a HTTP PUT request.

      curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/?op=CANCELDELEGATIONTOKEN&token=<TOKEN>"
    

    The client receives a response with zero content length:

      HTTP/1.1 200 OK
      Content-Length: 0
    

See also: token, DelegationTokenAuthenticator.cancelDelegationToken

Error Responses

When an operation fails, the server may throw an exception. The JSON schema of error responses is defined in RemoteException JSON Schema. The table below shows the mapping from exceptions to HTTP response codes.

HTTP Response Codes

Exceptions                       HTTP Response Codes
IllegalArgumentException         400 Bad Request
UnsupportedOperationException    400 Bad Request
SecurityException                401 Unauthorized
IOException                      403 Forbidden
FileNotFoundException            404 Not Found
RuntimeException                 500 Internal Server Error
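
For scripted clients, the exception name and message of a failing request can be pulled straight out of the RemoteException body. A small sketch (host, port, and path are placeholders; jq assumed):

    curl -s "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETFILESTATUS" \
      | jq -r '.RemoteException | "\(.exception): \(.message)"'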

Below are examples of exception responses.

Illegal Argument Exception

HTTP/1.1 400 Bad Request
Content-Type: application/json
Transfer-Encoding: chunked

{
  "RemoteException":
  {
    "exception"    : "IllegalArgumentException",
    "javaClassName": "java.lang.IllegalArgumentException",
    "message"      : "Invalid value for webhdfs parameter \"permission\": ..."
  }
}

Security Exception

HTTP/1.1 401 Unauthorized
Content-Type: application/json
Transfer-Encoding: chunked

{
  "RemoteException":
  {
    "exception"    : "SecurityException",
    "javaClassName": "java.lang.SecurityException",
    "message"      : "Failed to obtain user group information: ..."
  }
}

Access Control Exception

HTTP/1.1 403 Forbidden
Content-Type: application/json
Transfer-Encoding: chunked

{
  "RemoteException":
  {
    "exception"    : "AccessControlException",
    "javaClassName": "org.apache.hadoop.security.AccessControlException",
    "message"      : "Permission denied: ..."
  }
}

File Not Found Exception

HTTP/1.1 404 Not Found
Content-Type: application/json
Transfer-Encoding: chunked

{
  "RemoteException":
  {
    "exception"    : "FileNotFoundException",
    "javaClassName": "java.io.FileNotFoundException",
    "message"      : "File does not exist: /foo/a.patch"
  }
}

JSON Schemas

All operations, except for OPEN, either return a zero-length response or a JSON response. For OPEN, the response is an octet-stream. The JSON schemas are shown below. See draft-zyp-json-schema-03 for the syntax definitions of the JSON schemas.

Note that the default value of additionalProperties is an empty schema which allows any value for additional properties. Therefore, all WebHDFS JSON responses allow any additional property. However, if additional properties are included in the responses, they are considered as optional properties in order to maintain compatibility.

ACL Status JSON Schema

{
  "name"      : "AclStatus",
  "properties":
  {
    "AclStatus":
    {
      "type"      : "object",
      "properties":
      {
        "entries":
        {
          "type": "array",
          "items":
          {
            "description": "ACL entry.",
            "type": "string"
          }
        },
        "group":
        {
          "description": "The group owner.",
          "type"       : "string",
          "required"   : true
        },
        "owner":
        {
          "description": "The user who is the owner.",
          "type"       : "string",
          "required"   : true
        },
        "stickyBit":
        {
          "description": "True if the sticky bit is on.",
          "type"       : "boolean",
          "required"   : true
        }
      }
    }
  }
}

XAttrs JSON Schema

{
  "name"      : "XAttrs",
  "properties":
  {
    "XAttrs":
    {
      "type"      : "array",
      "items":
      {
        "type"    : "object",
        "properties":
        {
          "name":
          {
            "description": "XAttr name.",
            "type"       : "string",
            "required"   : true
          },
          "value":
          {
            "description": "XAttr value.",
            "type"       : "string"
          }
        }
      }
    }
  }
}

XAttrNames JSON Schema

{
  "name"      : "XAttrNames",
  "properties":
  {
    "XAttrNames":
    {
      "description": "XAttr names.",
      "type"       : "string",
      "required"   : true
    }
  }
}

Boolean JSON Schema

{
  "name"      : "boolean",
  "properties":
  {
    "boolean":
    {
      "description": "A boolean value",
      "type"       : "boolean",
      "required"   : true
    }
  }
}

See also: MKDIRS, RENAME, DELETE, SETREPLICATION

ContentSummary JSON Schema

{
  "name"      : "ContentSummary",
  "properties":
  {
    "ContentSummary":
    {
      "type"      : "object",
      "properties":
      {
        "directoryCount":
        {
          "description": "The number of directories.",
          "type"       : "integer",
          "required"   : true
        },
        "fileCount":
        {
          "description": "The number of files.",
          "type"       : "integer",
          "required"   : true
        },
        "length":
        {
          "description": "The number of bytes used by the content.",
          "type"       : "integer",
          "required"   : true
        },
        "quota":
        {
          "description": "The namespace quota of this directory.",
          "type"       : "integer",
          "required"   : true
        },
        "spaceConsumed":
        {
          "description": "The disk space consumed by the content.",
          "type"       : "integer",
          "required"   : true
        },
        "spaceQuota":
        {
          "description": "The disk space quota.",
          "type"       : "integer",
          "required"   : true
        },
        "typeQuota":
        {
          "type"      : "object",
          "properties":
          {
            "ARCHIVE":
            {
              "type"      : "object",
              "properties":
              {
                "consumed":
                {
                  "description": "The storage type space consumed.",
                  "type"       : "integer",
                  "required"   : true
                },
                "quota":
                {
                  "description": "The storage type quota.",
                  "type"       : "integer",
                  "required"   : true
                }
              }
            },
            "DISK":
            {
              "type"      : "object",
              "properties":
              {
                "consumed":
                {
                  "description": "The storage type space consumed.",
                  "type"       : "integer",
                  "required"   : true
                },
                "quota":
                {
                  "description": "The storage type quota.",
                  "type"       : "integer",
                  "required"   : true
                }
              }
            },
            "SSD":
            {
              "type"      : "object",
              "properties":
              {
                "consumed":
                {
                  "description": "The storage type space consumed.",
                  "type"       : "integer",
                  "required"   : true
                },
                "quota":
                {
                  "description": "The storage type quota.",
                  "type"       : "integer",
                  "required"   : true
                }
              }
            }
          }
        }
      }
    }
  }
}

See also: GETCONTENTSUMMARY

QuotaUsage JSON Schema

{
  "name"      : "QuotaUsage",
  "properties":
  {
    "QuotaUsage":
    {
      "type"      : "object",
      "properties":
      {
        "fileAndDirectoryCount":
        {
          "description": "The number of files and directories.",
          "type"       : "integer",
          "required"   : true
        },
        "quota":
        {
          "description": "The namespace quota of this directory.",
          "type"       : "integer",
          "required"   : true
        },
        "spaceConsumed":
        {
          "description": "The disk space consumed by the content.",
          "type"       : "integer",
          "required"   : true
        },
        "spaceQuota":
        {
          "description": "The disk space quota.",
          "type"       : "integer",
          "required"   : true
        },
        "typeQuota":
        {
          "type"      : "object",
          "properties":
          {
            "ARCHIVE":
            {
              "type"      : "object",
              "properties":
              {
                "consumed":
                {
                  "description": "The storage type space consumed.",
                  "type"       : "integer",
                  "required"   : true
                },
                "quota":
                {
                  "description": "The storage type quota.",
                  "type"       : "integer",
                  "required"   : true
                }
              }
            },
            "DISK":
            {
              "type"      : "object",
              "properties":
              {
                "consumed":
                {
                  "description": "The storage type space consumed.",
                  "type"       : "integer",
                  "required"   : true
                },
                "quota":
                {
                  "description": "The storage type quota.",
                  "type"       : "integer",
                  "required"   : true
                }
              }
            },
            "SSD":
            {
              "type"      : "object",
              "properties":
              {
                "consumed":
                {
                  "description": "The storage type space consumed.",
                  "type"       : "integer",
                  "required"   : true
                },
                "quota":
                {
                  "description": "The storage type quota.",
                  "type"       : "integer",
                  "required"   : true
                }
              }
            }
          }
        }
      }
    }
  }
}

See also: GETQUOTAUSAGE

FileChecksum JSON Schema

{
  "name"      : "FileChecksum",
  "properties":
  {
    "FileChecksum":
    {
      "type"      : "object",
      "properties":
      {
        "algorithm":
        {
          "description": "The name of the checksum algorithm.",
          "type"       : "string",
          "required"   : true
        },
        "bytes":
        {
          "description": "The byte sequence of the checksum in hexadecimal.",
          "type"       : "string",
          "required"   : true
        },
        "length":
        {
          "description": "The length of the bytes (not the length of the string).",
          "type"       : "integer",
          "required"   : true
        }
      }
    }
  }
}

FileStatus JSON Schema

{
  "name"      : "FileStatus",
  "properties":
  {
    "FileStatus": fileStatusProperties      //See FileStatus Properties
  }
}

See also: FileStatus Properties, GETFILESTATUS, FileStatus

FileStatus Properties

JavaScript syntax is used to define fileStatusProperties so that it can be referred to in both the FileStatus and FileStatuses JSON schemas.

var fileStatusProperties =
{
  "type"      : "object",
  "properties":
  {
    "accessTime":
    {
      "description": "The access time.",
      "type"       : "integer",
      "required"   : true
    },
    "blockSize":
    {
      "description": "The block size of a file.",
      "type"       : "integer",
      "required"   : true
    },
    "group":
    {
      "description": "The group owner.",
      "type"       : "string",
      "required"   : true
    },
    "length":
    {
      "description": "The number of bytes in a file.",
      "type"       : "integer",
      "required"   : true
    },
    "modificationTime":
    {
      "description": "The modification time.",
      "type"       : "integer",
      "required"   : true
    },
    "owner":
    {
      "description": "The user who is the owner.",
      "type"       : "string",
      "required"   : true
    },
    "pathSuffix":
    {
      "description": "The path suffix.",
      "type"       : "string",
      "required"   : true
    },
    "permission":
    {
      "description": "The permission represented as a octal string.",
      "type"       : "string",
      "required"   : true
    },
    "replication":
    {
      "description": "The number of replication of a file.",
      "type"       : "integer",
      "required"   : true
    },
   "symlink":                                         //an optional property
    {
      "description": "The link target of a symlink.",
      "type"       : "string"
    },
   "type":
    {
      "description": "The type of the path object.",
      "enum"       : ["FILE", "DIRECTORY", "SYMLINK"],
      "required"   : true
    }
  }
};
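
As an illustration only (the values below are made up), a FileStatus object conforming to fileStatusProperties looks like this:

  {
    "FileStatus":
    {
      "accessTime"      : 1320171722771,
      "blockSize"       : 33554432,
      "group"           : "supergroup",
      "length"          : 24930,
      "modificationTime": 1320171722771,
      "owner"           : "webuser",
      "pathSuffix"      : "",
      "permission"      : "644",
      "replication"     : 1,
      "type"            : "FILE"
    }
  }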

FileStatuses JSON Schema

A FileStatuses JSON object represents an array of FileStatus JSON objects.

{
  "name"      : "FileStatuses",
  "properties":
  {
    "FileStatuses":
    {
      "type"      : "object",
      "properties":
      {
        "FileStatus":
        {
          "description": "An array of FileStatus",
          "type"       : "array",
          "items"      : fileStatusProperties      //See FileStatus Properties
        }
      }
    }
  }
}

See also: FileStatus Properties, LISTSTATUS, FileStatus

DirectoryListing JSON Schema

A DirectoryListing JSON object represents a batch of directory entries while iteratively listing a directory. It contains a FileStatuses JSON object as well as iteration information.

{
  "name"      : "DirectoryListing",
  "properties":
  {
    "DirectoryListing":
    {
      "type"      : "object",
      "properties":
      {
        "partialListing":
        {
          "description": "A partial directory listing",
          "type"       : "object", // A FileStatuses object
          "required"   : true
        },
        "remainingEntries":
        {
          "description": "Number of remaining entries",
          "type"       : "integer",
          "required"   : true
        }
      }
    }
  }
}

See also: FileStatuses JSON Schema, LISTSTATUS_BATCH, FileStatus
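
To sketch the iteration (parameter names as defined in the HTTP Query Parameter Dictionary; the path and entry name are placeholders): the first batch is requested without startAfter, and each subsequent batch passes the pathSuffix of the last entry received. Iteration stops when remainingEntries is 0.

  # first batch
  curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS_BATCH"

  # next batch, continuing after the last entry of the previous response
  curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=LISTSTATUS_BATCH&startAfter=<CHILD>"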

Long JSON Schema

{
  "name"      : "long",
  "properties":
  {
    "long":
    {
      "description": "A long integer value",
      "type"       : "integer",
      "required"   : true
    }
  }
}

See also: RENEWDELEGATIONTOKEN

Path JSON Schema

{
  "name"      : "Path",
  "properties":
  {
    "Path":
    {
      "description": "The string representation a Path.",
      "type"       : "string",
      "required"   : true
    }
  }
}

See also: GETHOMEDIRECTORY, Path

RemoteException JSON Schema

{
  "name"      : "RemoteException",
  "properties":
  {
    "RemoteException":
    {
      "type"      : "object",
      "properties":
      {
        "exception":
        {
          "description": "Name of the exception",
          "type"       : "string",
          "required"   : true
        },
        "message":
        {
          "description": "Exception message",
          "type"       : "string",
          "required"   : true
        },
        "javaClassName":                                     //an optional property
        {
          "description": "Java class name of the exception",
          "type"       : "string"
        }
      }
    }
  }
}

See also: Error Responses
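
As an illustration (the exception name, class and message below are made up), an error response body has the following shape:

  {
    "RemoteException":
    {
      "exception"    : "FileNotFoundException",
      "javaClassName": "java.io.FileNotFoundException",
      "message"      : "File does not exist: /foo/bar"
    }
  }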

Token JSON Schema

{
  "name"      : "Token",
  "properties":
  {
    "Token": tokenProperties      //See Token Properties
  }
}

See also: Token Properties, GETDELEGATIONTOKEN, the note in Delegation.

Token Properties

JavaScript syntax is used to define tokenProperties so that it can be referred to in the Token JSON schema.

var tokenProperties =
{
  "type"      : "object",
  "properties":
  {
    "urlString":
    {
      "description": "A delegation token encoded as a URL safe string.",
      "type"       : "string",
      "required"   : true
    }
  }
};

See also: Token JSON Schema, the note in Delegation.

BlockStoragePolicy JSON Schema

{
  "name"      : "BlockStoragePolicy",
  "properties":
  {
    "BlockStoragePolicy": blockStoragePolicyProperties      //See BlockStoragePolicy Properties
  }
}

See also: BlockStoragePolicy Properties, GETSTORAGEPOLICY

BlockStoragePolicy Properties

JavaScript syntax is used to define blockStoragePolicyProperties so that it can be referred to in both the BlockStoragePolicy and BlockStoragePolicies JSON schemas.

var blockStoragePolicyProperties =
{
  "type"      : "object",
  "properties":
  {
    "id":
    {
      "description": "Policy ID.",
      "type"       : "integer",
      "required"   : true
    },
    "name":
    {
      "description": "Policy name.",
      "type"       : "string",
      "required"   : true
    },
    "storageTypes":
    {
      "description": "An array of storage types for block placement.",
      "type"       : "array",
      "required"   : true
      "items"      :
      {
        "type": "string"
      }
    },
    "replicationFallbacks":
    {
      "description": "An array of fallback storage types for replication.",
      "type"       : "array",
      "required"   : true
      "items"      :
      {
        "type": "string"
      }
    },
    "creationFallbacks":
    {
      "description": "An array of fallback storage types for file creation.",
      "type"       : "array",
      "required"   : true
      "items"      :
      {
       "type": "string"
      }
    },
    "copyOnCreateFile":
    {
      "description": "If set then the policy cannot be changed after file creation.",
      "type"       : "boolean",
      "required"   : true
    }
  }
};

BlockStoragePolicies JSON Schema

A BlockStoragePolicies JSON object represents an array of BlockStoragePolicy JSON objects.

{
  "name"      : "BlockStoragePolicies",
  "properties":
  {
    "BlockStoragePolicies":
    {
      "type"      : "object",
      "properties":
      {
        "BlockStoragePolicy":
        {
          "description": "An array of BlockStoragePolicy",
          "type"       : "array",
          "items"      : blockStoragePolicyProperties      //See BlockStoragePolicy Properties
        }
      }
    }
  }
}

BlockLocations JSON Schema

A BlockLocations JSON object represents an array of BlockLocation JSON objects.

{
  "name"      : "BlockLocations",
  "properties":
  {
    "BlockLocations":
    {
      "type"      : "object",
      "properties":
      {
        "BlockLocation":
        {
          "description": "An array of BlockLocation",
          "type"       : "array",
          "items"      : blockLocationProperties      //See BlockLocation Properties
        }
      }
    }
  }
}

See also: BlockLocation Properties, GETFILEBLOCKLOCATIONS, BlockLocation

BlockLocation JSON Schema

{
  "name"      : "BlockLocation",
  "properties":
  {
    "BlockLocation": blockLocationProperties      //See BlockLocation Properties
  }
}

See also: BlockLocation Properties, GETFILEBLOCKLOCATIONS, BlockLocation

BlockLocation Properties

JavaScript syntax is used to define blockLocationProperties so that it can be referred to in both the BlockLocation and BlockLocations JSON schemas.

var blockLocationProperties =
{
  "type"      : "object",
  "properties":
  {
    "cachedHosts":
    {
      "description": "Datanode hostnames with a cached replica",
      "type"       : "array",
      "required"   : "true",
      "items"      :
      {
        "description": "A datanode hostname",
        "type"       : "string"
      }
    },
    "corrupt":
    {
      "description": "True if the block is corrupted",
      "type"       : "boolean",
      "required"   : "true"
    },
    "hosts":
    {
      "description": "Datanode hostnames store the block",
      "type"       : "array",
      "required"   : "true",
      "items"      :
      {
        "description": "A datanode hostname",
        "type"       : "string"
      }
    },
    "length":
    {
      "description": "Length of the block",
      "type"       : "integer",
      "required"   : "true"
    },
    "names":
    {
      "description": "Datanode IP:xferPort for accessing the block",
      "type"       : "array",
      "required"   : "true",
      "items"      :
      {
        "description": "DatanodeIP:xferPort",
        "type"       : "string"
      }
    },
    "offset":
    {
      "description": "Offset of the block in the file",
      "type"       : "integer",
      "required"   : "true"
    },
    "storageIds":
    {
      "description": "Storage ID of each replica",
      "type"       : "array",
      "required"   : "true",
      "items"      :
      {
        "description": "Storage ID",
        "type"       : "string"
      }
    },
    "storageTypes":
    {
      "description": "Storage type of each replica",
      "type"       : "array",
      "required"   : "true",
      "items"      :
      {
        "description": "Storage type",
        "enum"       : ["RAM_DISK", "SSD", "DISK", "ARCHIVE"]
      }
    },
    "topologyPaths":
    {
      "description": "Datanode addresses in network topology",
      "type"       : "array",
      "required"   : "true",
      "items"      :
      {
        "description": "/rack/host:ip",
        "type"       : "string"
      }
    }
  }
};
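
For example (a sketch; offset and length are optional and shown here only to restrict the range of the file being queried):

  curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETFILEBLOCKLOCATIONS&offset=0&length=1048576"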

HTTP Query Parameter Dictionary

ACL Spec

Name: aclspec
Description: The ACL spec included in ACL modification operations.
Type: String
Default Value: <empty>
Valid Values: See Permissions and HDFS.
Syntax: See Permissions and HDFS.

XAttr Name

Name: xattr.name
Description: The XAttr name of a file/directory.
Type: String
Default Value: <empty>
Valid Values: Any string prefixed with user./trusted./system./security..
Syntax: Any string prefixed with user./trusted./system./security..

XAttr Value

Name: xattr.value
Description: The XAttr value of a file/directory.
Type: String
Default Value: <empty>
Valid Values: An encoded value.
Syntax: Enclosed in double quotes or prefixed with 0x or 0s.

See also: Extended Attributes
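
For example, setting an xattr with a quoted text value versus a hex-encoded value might look like this (a sketch; the attribute names and values are made up, and the double quotes are percent-encoded in the URL):

  curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETXATTR&xattr.name=user.attr1&xattr.value=%22abc%22&flag=CREATE"
  curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETXATTR&xattr.name=user.attr2&xattr.value=0x616263&flag=CREATE"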

XAttr set flag

Name: flag
Description: The XAttr set flag.
Type: String
Default Value: <empty>
Valid Values: CREATE, REPLACE
Syntax: CREATE, REPLACE

See also: Extended Attributes

XAttr value encoding

Name: encoding
Description: The XAttr value encoding.
Type: String
Default Value: <empty>
Valid Values: text
Syntax: text

See also: Extended Attributes

Access Time

Name: accesstime
Description: The access time of a file/directory.
Type: long
Default Value: -1 (means keeping it unchanged)
Valid Values: -1 or a timestamp
Syntax: Any integer.

See also: SETTIMES

Block Size

Name: blocksize
Description: The block size of a file.
Type: long
Default Value: Specified in the configuration.
Valid Values: > 0
Syntax: Any integer.

See also: CREATE

Buffer Size

Name: buffersize
Description: The size of the buffer used in transferring data.
Type: int
Default Value: Specified in the configuration.
Valid Values: > 0
Syntax: Any integer.

See also: CREATE, APPEND, OPEN

Create Flag

Name: createflag
Description: Enum of possible flags to process while creating a file.
Type: enumerated strings
Default Value: <empty>
Valid Values: Legal combinations of create, overwrite, append and sync_block
Syntax: See note below

The following combinations are not valid:

  • append,create
  • create,append,overwrite

See also: CREATE
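
For instance, a create request that also asks for sync-on-block could pass the flag combination directly (a sketch; whether a given combination is accepted is governed by the rules above):

  curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE&createflag=create,sync_block"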

Create Parent

Name: createparent
Description: If the parent directories do not exist, should they be created?
Type: boolean
Default Value: true
Valid Values: true, false
Syntax: true

See also: CREATESYMLINK

Delegation

Name: delegation
Description: The delegation token used for authentication.
Type: String
Default Value: <empty>
Valid Values: An encoded token.
Syntax: See the note below.

Note that delegation tokens are encoded as a URL safe string; see encodeToUrlString() and decodeFromUrlString(String) in org.apache.hadoop.security.token.Token for the details of the encoding.

See also: Authentication

Destination

Name: destination
Description: The destination path.
Type: Path
Default Value: <empty> (an invalid path)
Valid Values: An absolute FileSystem path without scheme and authority.
Syntax: Any path.

See also: CREATESYMLINK, RENAME

Do As

Name: doas
Description: Allows a proxy user to act on behalf of another user.
Type: String
Default Value: null
Valid Values: Any valid username.
Syntax: Any string.

See also: Proxy Users

Fs Action

Name: fsaction
Description: The file system action (read/write/execute) to check access for.
Type: String
Default Value: null (an invalid value)
Valid Values: Strings matching the regex pattern "[r-][w-][x-]"
Syntax: "[r-][w-][x-]"

See also: CHECKACCESS
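
For example, checking whether the caller has read and execute access (a sketch; the fsaction string follows the syntax above):

  curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CHECKACCESS&fsaction=r-x"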

Group

Name: group
Description: The name of a group.
Type: String
Default Value: <empty> (means keeping it unchanged)
Valid Values: Any valid group name.
Syntax: Any string.

See also: SETOWNER

Length

Name: length
Description: The number of bytes to be processed.
Type: long
Default Value: null (means the entire file)
Valid Values: >= 0 or null
Syntax: Any integer.

See also: OPEN

Modification Time

Name: modificationtime
Description: The modification time of a file/directory.
Type: long
Default Value: -1 (means keeping it unchanged)
Valid Values: -1 or a timestamp
Syntax: Any integer.

See also: SETTIMES

New Length

Name: newlength
Description: The size the file is to be truncated to.
Type: long
Valid Values: >= 0
Syntax: Any long.

Offset

Name: offset
Description: The starting byte position.
Type: long
Default Value: 0
Valid Values: >= 0
Syntax: Any integer.

See also: OPEN
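
For example, reading a 4 KB slice starting at byte 1024 (a sketch; -L lets curl follow the redirect to the datanode):

  curl -i -L "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN&offset=1024&length=4096"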

Old Snapshot Name

Name: oldsnapshotname
Description: The old name of the snapshot to be renamed.
Type: String
Default Value: null
Valid Values: An existing snapshot name.
Syntax: Any string.

See also: RENAMESNAPSHOT

Op

Name: op
Description: The name of the operation to be executed.
Type: enum
Default Value: null (an invalid value)
Valid Values: Any valid operation name.
Syntax: Any string.

See also: Operations

Overwrite

Name: overwrite
Description: If a file already exists, should it be overwritten?
Type: boolean
Default Value: false
Valid Values: true
Syntax: true

See also: CREATE

Owner

Name: owner
Description: The username who is the owner of a file/directory.
Type: String
Default Value: <empty> (means keeping it unchanged)
Valid Values: Any valid username.
Syntax: Any string.

See also: SETOWNER

Permission

Name: permission
Description: The permission of a file/directory.
Type: Octal
Default Value: 755
Valid Values: 0 - 1777
Syntax: Any radix-8 integer (leading zeros may be omitted.)

See also: CREATE, MKDIRS, SETPERMISSION
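
For example, setting a file to owner read/write and group/other read (a sketch; the octal value follows the syntax above):

  curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=SETPERMISSION&permission=644"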

Recursive

Name: recursive
Description: Should the operation act on the content in the subdirectories?
Type: boolean
Default Value: false
Valid Values: true
Syntax: true

See also: RENAME

Renewer

Name: renewer
Description: The username of the renewer of a delegation token.
Type: String
Default Value: <empty> (means the current user)
Valid Values: Any valid username.
Syntax: Any string.

See also: GETDELEGATIONTOKEN

Replication

Name: replication
Description: The number of replicas of a file.
Type: short
Default Value: Specified in the configuration.
Valid Values: > 0
Syntax: Any integer.

See also: CREATE, SETREPLICATION

Snapshot Name

Name: snapshotname
Description: The name of the snapshot to be created/deleted, or the new name for a snapshot rename.
Type: String
Default Value: null
Valid Values: Any valid snapshot name.
Syntax: Any string.

See also: CREATESNAPSHOT, DELETESNAPSHOT, RENAMESNAPSHOT

Sources

Name: sources
Description: A list of source paths.
Type: String
Default Value: <empty>
Valid Values: A list of comma-separated absolute FileSystem paths without scheme and authority.
Syntax: Any string.

See also: CONCAT

Token

Name: token
Description: The delegation token used for the operation.
Type: String
Default Value: <empty>
Valid Values: An encoded token.
Syntax: See the note in Delegation.

See also: RENEWDELEGATIONTOKEN, CANCELDELEGATIONTOKEN

Token Kind

Name: kind
Description: The kind of the delegation token requested.
Type: String
Default Value: <empty> (the server sets the default kind for the service)
Valid Values: A string that represents the token kind, e.g. “HDFS_DELEGATION_TOKEN” or “WEBHDFS delegation”
Syntax: Any string.

See also: GETDELEGATIONTOKEN

Token Service

Name: service
Description: The name of the service where the token is supposed to be used, e.g. the ip:port of the namenode.
Type: String
Default Value: <empty>
Valid Values: ip:port in string format, or the logical name of the service.
Syntax: Any string.

See also: GETDELEGATIONTOKEN

Username

Name: user.name
Description: The authenticated user; see Authentication.
Type: String
Default Value: null
Valid Values: Any valid username.
Syntax: Any string.

See also: Authentication

NoRedirect

Name: noredirect
Description: Whether the response should return an HTTP 307 redirect or HTTP 200 OK. See Create and Write to a File.
Type: boolean
Default Value: false
Valid Values: true
Syntax: true

See also: Create and Write to a File
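
For example (a sketch): with noredirect=true the namenode answers the first step of the two-step create with 200 OK and a JSON body carrying the datanode URL, instead of a 307 redirect, so the client can issue the second step itself.

  curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE&noredirect=true"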

Storage Policy

Name: storagepolicy
Description: The name of the storage policy.
Type: String
Default Value: <empty>
Valid Values: Any valid storage policy name; see GETALLSTORAGEPOLICY.
Syntax: Any string.

See also: SETSTORAGEPOLICY

Start After

Name: startAfter
Description: The last item returned in the liststatus batch.
Type: String
Default Value: <empty>
Valid Values: Any valid file/directory name.
Syntax: Any string.

See also: LISTSTATUS_BATCH