| |
| ~~ Licensed under the Apache License, Version 2.0 (the "License"); |
| ~~ you may not use this file except in compliance with the License. |
| ~~ You may obtain a copy of the License at |
| ~~ |
| ~~ http://www.apache.org/licenses/LICENSE-2.0 |
| ~~ |
| ~~ Unless required by applicable law or agreed to in writing, software |
| ~~ distributed under the License is distributed on an "AS IS" BASIS, |
| ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| ~~ See the License for the specific language governing permissions and |
| ~~ limitations under the License. See accompanying LICENSE file. |
| |
| --- |
| Hadoop Distributed File System-${project.version} - HDFS NFS Gateway |
| --- |
| --- |
| ${maven.build.timestamp} |
| |
| HDFS NFS Gateway |
| |
| %{toc|section=1|fromDepth=0} |
| |
| * {Overview} |
| |
| The NFS Gateway supports NFSv3 and allows HDFS to be mounted as part of the client's local file system. |
| Currently NFS Gateway supports and enables the following usage patterns: |
| |
| * Users can browse the HDFS file system through their local file system |
| on NFSv3 client compatible operating systems. |
| |
| * Users can download files from the the HDFS file system on to their |
| local file system. |
| |
| * Users can upload files from their local file system directly to the |
| HDFS file system. |
| |
| * Users can stream data directly to HDFS through the mount point. File |
| append is supported but random write is not supported. |
| |
| The NFS gateway machine needs the same thing to run an HDFS client like Hadoop JAR files, HADOOP_CONF directory. |
| The NFS gateway can be on the same host as DataNode, NameNode, or any HDFS client. |
| |
| |
| * {Configuration} |
| |
| The user running the NFS-gateway must be able to proxy all the users using the NFS mounts. |
| For instance, if user 'nfsserver' is running the gateway, and users belonging to the groups 'nfs-users1' |
| and 'nfs-users2' use the NFS mounts, then in core-site.xml of the namenode, the following must be set |
| (NOTE: replace 'nfsserver' with the user name starting the gateway in your cluster): |
| |
| ---- |
| <property> |
| <name>hadoop.proxyuser.nfsserver.groups</name> |
| <value>nfs-users1,nfs-users2</value> |
| <description> |
| The 'nfsserver' user is allowed to proxy all members of the 'nfs-users1' and |
| 'nfs-users2' groups. Set this to '*' to allow nfsserver user to proxy any group. |
| </description> |
| </property> |
| ---- |
| |
| ---- |
| <property> |
| <name>hadoop.proxyuser.nfsserver.hosts</name> |
| <value>nfs-client-host1.com</value> |
| <description> |
| This is the host where the nfs gateway is running. Set this to '*' to allow |
| requests from any hosts to be proxied. |
| </description> |
| </property> |
| ---- |
| |
| The above are the only required configuration for the NFS gateway in non-secure mode. For Kerberized |
| hadoop clusters, the following configurations need to be added to hdfs-site.xml: |
| |
| ---- |
| <property> |
| <name>nfs.keytab.file</name> |
| <value>/etc/hadoop/conf/nfsserver.keytab</value> <!-- path to the nfs gateway keytab --> |
| </property> |
| ---- |
| |
| ---- |
| <property> |
| <name>nfs.kerberos.principal</name> |
| <value>nfsserver/_HOST@YOUR-REALM.COM</value> |
| </property> |
| ---- |
| |
| The AIX NFS client has a {{{https://issues.apache.org/jira/browse/HDFS-6549}few known issues}} |
| that prevent it from working correctly by default with the HDFS NFS |
| Gateway. If you want to be able to access the HDFS NFS Gateway from AIX, you |
| should set the following configuration setting to enable work-arounds for these |
| issues: |
| |
| ---- |
| <property> |
| <name>nfs.aix.compatibility.mode.enabled</name> |
| <value>true</value> |
| </property> |
| ---- |
| |
| Note that regular, non-AIX clients should NOT enable AIX compatibility mode. |
| The work-arounds implemented by AIX compatibility mode effectively disable |
| safeguards to ensure that listing of directory contents via NFS returns |
| consistent results, and that all data sent to the NFS server can be assured to |
| have been committed. |
| |
| It's strongly recommended for the users to update a few configuration properties based on their use |
| cases. All the related configuration properties can be added or updated in hdfs-site.xml. |
| |
| * If the client mounts the export with access time update allowed, make sure the following |
| property is not disabled in the configuration file. Only NameNode needs to restart after |
| this property is changed. On some Unix systems, the user can disable access time update |
| by mounting the export with "noatime". If the export is mounted with "noatime", the user |
| doesn't need to change the following property and thus no need to restart namenode. |
| |
| ---- |
| <property> |
| <name>dfs.namenode.accesstime.precision</name> |
| <value>3600000</value> |
| <description>The access time for HDFS file is precise upto this value. |
| The default value is 1 hour. Setting a value of 0 disables |
| access times for HDFS. |
| </description> |
| </property> |
| ---- |
| |
| * Users are expected to update the file dump directory. NFS client often |
| reorders writes. Sequential writes can arrive at the NFS gateway at random |
| order. This directory is used to temporarily save out-of-order writes |
| before writing to HDFS. For each file, the out-of-order writes are dumped after |
| they are accumulated to exceed certain threshold (e.g., 1MB) in memory. |
| One needs to make sure the directory has enough |
| space. For example, if the application uploads 10 files with each having |
| 100MB, it is recommended for this directory to have roughly 1GB space in case if a |
| worst-case write reorder happens to every file. Only NFS gateway needs to restart after |
| this property is updated. |
| |
| ---- |
| <property> |
| <name>nfs.dump.dir</name> |
| <value>/tmp/.hdfs-nfs</value> |
| </property> |
| ---- |
| |
| * For optimal performance, it is recommended that rtmax be updated to |
| 1MB. However, note that this 1MB is a per client allocation, and not |
| from a shared memory pool, and therefore a larger value may adversely |
| affect small reads, consuming a lot of memory. The maximum value of |
| this property is 1MB. |
| |
| ---- |
| <property> |
| <name>nfs.rtmax</name> |
| <value>1048576</value> |
| <description>This is the maximum size in bytes of a READ request |
| supported by the NFS gateway. If you change this, make sure you |
| also update the nfs mount's rsize(add rsize= # of bytes to the |
| mount directive). |
| </description> |
| </property> |
| ---- |
| |
| ---- |
| <property> |
| <name>nfs.wtmax</name> |
| <value>65536</value> |
| <description>This is the maximum size in bytes of a WRITE request |
| supported by the NFS gateway. If you change this, make sure you |
| also update the nfs mount's wsize(add wsize= # of bytes to the |
| mount directive). |
| </description> |
| </property> |
| ---- |
| |
| * By default, the export can be mounted by any client. To better control the access, |
| users can update the following property. The value string contains machine name and |
| access privilege, separated by whitespace |
| characters. The machine name format can be a single host, a Java regular expression, or an IPv4 address. The |
| access privilege uses rw or ro to specify read/write or read-only access of the machines to exports. If the access |
| privilege is not provided, the default is read-only. Entries are separated by ";". |
| For example: "192.168.0.0/22 rw ; host.*\.example\.com ; host1.test.org ro;". Only the NFS gateway needs to restart after |
| this property is updated. |
| |
| ---- |
| <property> |
| <name>nfs.exports.allowed.hosts</name> |
| <value>* rw</value> |
| </property> |
| ---- |
| |
| * Customize log settings. To get NFS debug trace, users can edit the log4j.property file |
| to add the following. Note, debug trace, especially for ONCRPC, can be very verbose. |
| |
| To change logging level: |
| |
| ----------------------------------------------- |
| log4j.logger.org.apache.hadoop.hdfs.nfs=DEBUG |
| ----------------------------------------------- |
| |
| To get more details of ONCRPC requests: |
| |
| ----------------------------------------------- |
| log4j.logger.org.apache.hadoop.oncrpc=DEBUG |
| ----------------------------------------------- |
| |
| |
| * {Start and stop NFS gateway service} |
| |
| Three daemons are required to provide NFS service: rpcbind (or portmap), mountd and nfsd. |
| The NFS gateway process has both nfsd and mountd. It shares the HDFS root "/" as the |
| only export. It is recommended to use the portmap included in NFS gateway package. Even |
| though NFS gateway works with portmap/rpcbind provide by most Linux distributions, the |
| package included portmap is needed on some Linux systems such as REHL6.2 due to an |
| {{{https://bugzilla.redhat.com/show_bug.cgi?id=731542}rpcbind bug}}. More detailed discussions can |
| be found in {{{https://issues.apache.org/jira/browse/HDFS-4763}HDFS-4763}}. |
| |
| [[1]] Stop nfs/rpcbind/portmap services provided by the platform (commands can be different on various Unix platforms): |
| |
| ------------------------- |
| service nfs stop |
| |
| service rpcbind stop |
| ------------------------- |
| |
| |
| [[2]] Start package included portmap (needs root privileges): |
| |
| ------------------------- |
| hadoop portmap |
| |
| OR |
| |
| hadoop-daemon.sh start portmap |
| ------------------------- |
| |
| [[3]] Start mountd and nfsd. |
| |
| No root privileges are required for this command. However, ensure that the user starting |
| the Hadoop cluster and the user starting the NFS gateway are same. |
| |
| ------------------------- |
| hadoop nfs3 |
| |
| OR |
| |
| hadoop-daemon.sh start nfs3 |
| ------------------------- |
| |
| Note, if the hadoop-daemon.sh script starts the NFS gateway, its log can be found in the hadoop log folder. |
| |
| |
| [[4]] Stop NFS gateway services. |
| |
| ------------------------- |
| hadoop-daemon.sh stop nfs3 |
| |
| hadoop-daemon.sh stop portmap |
| ------------------------- |
| |
| Optionally, you can forgo running the Hadoop-provided portmap daemon and |
| instead use the system portmap daemon on all operating systems if you start the |
| NFS Gateway as root. This will allow the HDFS NFS Gateway to work around the |
| aforementioned bug and still register using the system portmap daemon. To do |
| so, just start the NFS gateway daemon as you normally would, but make sure to |
| do so as the "root" user, and also set the "HADOOP_PRIVILEGED_NFS_USER" |
| environment variable to an unprivileged user. In this mode the NFS Gateway will |
| start as root to perform its initial registration with the system portmap, and |
| then will drop privileges back to the user specified by the |
| HADOOP_PRIVILEGED_NFS_USER afterward and for the rest of the duration of the |
| lifetime of the NFS Gateway process. Note that if you choose this route, you |
| should skip steps 1 and 2 above. |
| |
| |
| * {Verify validity of NFS related services} |
| |
| [[1]] Execute the following command to verify if all the services are up and running: |
| |
| ------------------------- |
| rpcinfo -p $nfs_server_ip |
| ------------------------- |
| |
| You should see output similar to the following: |
| |
| ------------------------- |
| program vers proto port |
| |
| 100005 1 tcp 4242 mountd |
| |
| 100005 2 udp 4242 mountd |
| |
| 100005 2 tcp 4242 mountd |
| |
| 100000 2 tcp 111 portmapper |
| |
| 100000 2 udp 111 portmapper |
| |
| 100005 3 udp 4242 mountd |
| |
| 100005 1 udp 4242 mountd |
| |
| 100003 3 tcp 2049 nfs |
| |
| 100005 3 tcp 4242 mountd |
| ------------------------- |
| |
| [[2]] Verify if the HDFS namespace is exported and can be mounted. |
| |
| ------------------------- |
| showmount -e $nfs_server_ip |
| ------------------------- |
| |
| You should see output similar to the following: |
| |
| ------------------------- |
| Exports list on $nfs_server_ip : |
| |
| / (everyone) |
| ------------------------- |
| |
| |
| * {Mount the export “/”} |
| |
| Currently NFS v3 only uses TCP as the transportation protocol. |
| NLM is not supported so mount option "nolock" is needed. It's recommended to use |
| hard mount. This is because, even after the client sends all data to |
| NFS gateway, it may take NFS gateway some extra time to transfer data to HDFS |
| when writes were reorderd by NFS client Kernel. |
| |
| If soft mount has to be used, the user should give it a relatively |
| long timeout (at least no less than the default timeout on the host) . |
| |
| The users can mount the HDFS namespace as shown below: |
| |
| ------------------------------------------------------------------- |
| mount -t nfs -o vers=3,proto=tcp,nolock,noacl $server:/ $mount_point |
| ------------------------------------------------------------------- |
| |
| Then the users can access HDFS as part of the local file system except that, |
| hard link and random write are not supported yet. |
| |
| * {Allow mounts from unprivileged clients} |
| |
| In environments where root access on client machines is not generally |
| available, some measure of security can be obtained by ensuring that only NFS |
| clients originating from privileged ports can connect to the NFS server. This |
| feature is referred to as "port monitoring." This feature is not enabled by default |
| in the HDFS NFS Gateway, but can be optionally enabled by setting the |
| following config in hdfs-site.xml on the NFS Gateway machine: |
| |
| ------------------------------------------------------------------- |
| <property> |
| <name>nfs.port.monitoring.disabled</name> |
| <value>false</value> |
| </property> |
| ------------------------------------------------------------------- |
| |
| * {User authentication and mapping} |
| |
| NFS gateway in this release uses AUTH_UNIX style authentication. When the user on NFS client |
| accesses the mount point, NFS client passes the UID to NFS gateway. |
| NFS gateway does a lookup to find user name from the UID, and then passes the |
| username to the HDFS along with the HDFS requests. |
| For example, if the NFS client has current user as "admin", when the user accesses |
| the mounted directory, NFS gateway will access HDFS as user "admin". To access HDFS |
| as the user "hdfs", one needs to switch the current user to "hdfs" on the client system |
| when accessing the mounted directory. |
| |
| The system administrator must ensure that the user on NFS client host has the same |
| name and UID as that on the NFS gateway host. This is usually not a problem if |
| the same user management system (e.g., LDAP/NIS) is used to create and deploy users on |
| HDFS nodes and NFS client node. In case the user account is created manually on different hosts, one might need to |
| modify UID (e.g., do "usermod -u 123 myusername") on either NFS client or NFS gateway host |
| in order to make it the same on both sides. More technical details of RPC AUTH_UNIX can be found |
| in {{{http://tools.ietf.org/html/rfc1057}RPC specification}}. |
| |
| Optionally, the system administrator can configure a custom static mapping |
| file in the event one wishes to access the HDFS NFS Gateway from a system with |
| a completely disparate set of UIDs/GIDs. By default this file is located at |
| "/etc/nfs.map", but a custom location can be configured by setting the |
| "nfs.static.mapping.file" property to the path of the static mapping file. |
| The format of the static mapping file is similar to what is described in the |
| exports(5) manual page, but roughly it is: |
| |
| ------------------------- |
| # Mapping for clients accessing the NFS gateway |
| uid 10 100 # Map the remote UID 10 the local UID 100 |
| gid 11 101 # Map the remote GID 11 to the local GID 101 |
| ------------------------- |