| ~~ Licensed under the Apache License, Version 2.0 (the "License"); |
| ~~ you may not use this file except in compliance with the License. |
| ~~ You may obtain a copy of the License at |
| ~~ |
| ~~ http://www.apache.org/licenses/LICENSE-2.0 |
| ~~ |
| ~~ Unless required by applicable law or agreed to in writing, software |
| ~~ distributed under the License is distributed on an "AS IS" BASIS, |
| ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| ~~ See the License for the specific language governing permissions and |
| ~~ limitations under the License. See accompanying LICENSE file. |
| |
| --- |
| Service Level Authorization Guide |
| --- |
| --- |
| ${maven.build.timestamp} |
| |
| Service Level Authorization Guide |
| |
| %{toc|section=1|fromDepth=0} |
| |
| * Purpose |
| |
| This document describes how to configure and manage Service Level |
| Authorization for Hadoop. |
| |
| * Prerequisites |
| |
| Make sure Hadoop is installed, configured and set up correctly. For more |
| information see: |
| * Single Node Setup for first-time users. |
| * Cluster Setup for large, distributed clusters. |
| |
| * Overview |
| |
| Service Level Authorization is the initial authorization mechanism to |
| ensure clients connecting to a particular Hadoop service have the |
| necessary, pre-configured, permissions and are authorized to access the |
| given service. For example, a MapReduce cluster can use this mechanism |
| to allow a configured list of users/groups to submit jobs. |
| |
| The <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> configuration file is used to |
| define the access control lists for various Hadoop services. |
| |
| Service Level Authorization is performed before other access control |
| checks, such as file-permission checks and access control on job |
| queues. |
| |
| * Configuration |
| |
| This section describes how to configure service-level authorization via |
| the configuration file <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>>. |
| |
| ** Enable Service Level Authorization |
| |
| By default, service-level authorization is disabled for Hadoop. To |
| enable it, set the configuration property |
| <<<hadoop.security.authorization>>> to <<<true>>> in |
| <<<${HADOOP_CONF_DIR}/core-site.xml>>>. |
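| |
| As a minimal sketch, the corresponding entry in |
| <<<${HADOOP_CONF_DIR}/core-site.xml>>> might look like the following. It |
| assumes the usual Hadoop property layout and would sit inside the |
| file's existing configuration element: |
| |
| ---- |
| <property> |
| <name>hadoop.security.authorization</name> |
| <value>true</value> |
| </property> |
| ---- |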
| |
| ** Hadoop Services and Configuration Properties |
| |
| This section lists the various Hadoop services and their configuration |
| knobs: |
| |
| *-------------------------------------+--------------------------------------+ |
| || Property || Service |
| *-------------------------------------+--------------------------------------+ |
| security.client.protocol.acl | ACL for ClientProtocol, which is used by user code via the DistributedFileSystem. |
| *-------------------------------------+--------------------------------------+ |
| security.client.datanode.protocol.acl | ACL for ClientDatanodeProtocol, the client-to-datanode protocol for block recovery. |
| *-------------------------------------+--------------------------------------+ |
| security.datanode.protocol.acl | ACL for DatanodeProtocol, which is used by datanodes to communicate with the namenode. |
| *-------------------------------------+--------------------------------------+ |
| security.inter.datanode.protocol.acl | ACL for InterDatanodeProtocol, the inter-datanode protocol for updating generation timestamp. |
| *-------------------------------------+--------------------------------------+ |
| security.namenode.protocol.acl | ACL for NamenodeProtocol, the protocol used by the secondary namenode to communicate with the namenode. |
| *-------------------------------------+--------------------------------------+ |
| security.inter.tracker.protocol.acl | ACL for InterTrackerProtocol, used by the tasktrackers to communicate with the jobtracker. |
| *-------------------------------------+--------------------------------------+ |
| security.job.submission.protocol.acl | ACL for JobSubmissionProtocol, used by job clients to communicate with the jobtracker for job submission, querying job status etc. |
| *-------------------------------------+--------------------------------------+ |
| security.task.umbilical.protocol.acl | ACL for TaskUmbilicalProtocol, used by the map and reduce tasks to communicate with the parent tasktracker. |
| *-------------------------------------+--------------------------------------+ |
| security.refresh.policy.protocol.acl | ACL for RefreshAuthorizationPolicyProtocol, used by the dfsadmin and mradmin commands to refresh the security policy in effect. |
| *-------------------------------------+--------------------------------------+ |
| security.ha.service.protocol.acl | ACL for the HAService protocol, used by HAAdmin to manage the active and standby states of the namenode. |
| *-------------------------------------+--------------------------------------+ |
| |
| ** Access Control Lists |
| |
| <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> defines an access control list for |
| each Hadoop service. Every access control list has a simple format: |
| |
| The list of users and the list of groups are both comma-separated lists |
| of names. The two lists are separated by a space. |
| |
| Example: <<<user1,user2 group1,group2>>>. |
| |
| Add a blank at the beginning of the line if only a list of groups is to |
| be provided. Equivalently, a comma-separated list of users followed by |
| a space or nothing implies only the given set of users. |
| |
| A special value of <<<*>>> implies that all users are allowed to access the |
| service. |
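| |
| The forms described above can be sketched as the following value |
| elements. This is only an illustration: <<<user1>>>, <<<user2>>>, <<<group1>>> |
| and <<<group2>>> are placeholder names, and each value would appear inside |
| a property for one of the ACLs listed in the table above: |
| |
| ---- |
| <!-- Users and groups: the two lists are separated by one space --> |
| <value>user1,user2 group1,group2</value> |
| |
| <!-- Groups only: note the blank before the first group name --> |
| <value> group1,group2</value> |
| |
| <!-- Users only: nothing (or a single space) after the user list --> |
| <value>user1,user2</value> |
| |
| <!-- All users --> |
| <value>*</value> |
| ---- |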
| |
| ** Refreshing Service Level Authorization Configuration |
| |
| The service-level authorization configuration for the NameNode and |
| JobTracker can be changed without restarting either of the Hadoop |
| master daemons. The cluster administrator can change |
| <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> on the master nodes and instruct |
| the NameNode and JobTracker to reload their respective configurations |
| via the <<<-refreshServiceAcl>>> switch of the <<<dfsadmin>>> and <<<mradmin>>> |
| commands, respectively. |
| |
| Refresh the service-level authorization configuration for the NameNode: |
| |
| ---- |
| $ bin/hadoop dfsadmin -refreshServiceAcl |
| ---- |
| |
| Refresh the service-level authorization configuration for the |
| JobTracker: |
| |
| ---- |
| $ bin/hadoop mradmin -refreshServiceAcl |
| ---- |
| |
| The <<<security.refresh.policy.protocol.acl>>> property in |
| <<<${HADOOP_CONF_DIR}/hadoop-policy.xml>>> can be used to restrict the |
| ability to refresh the service-level authorization configuration to |
| certain users/groups. |
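| |
| For instance, a restriction of this kind might be sketched as follows. |
| The user <<<hadoopadmin>>> and the group <<<admins>>> are hypothetical names |
| chosen for illustration and are not defined anywhere in Hadoop: |
| |
| ---- |
| <property> |
| <name>security.refresh.policy.protocol.acl</name> |
| <!-- hypothetical user "hadoopadmin" and hypothetical group "admins" --> |
| <value>hadoopadmin admins</value> |
| </property> |
| ---- |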
| |
| ** Examples |
| |
| Allow only users <<<alice>>>, <<<bob>>> and users in the <<<mapreduce>>> group to submit |
| jobs to the MapReduce cluster: |
| |
| ---- |
| <property> |
| <name>security.job.submission.protocol.acl</name> |
| <value>alice,bob mapreduce</value> |
| </property> |
| ---- |
| |
| Allow only DataNodes running as users who belong to the group |
| <<<datanodes>>> to communicate with the NameNode. Because only a group is |
| listed, the value begins with a blank: |
| |
| ---- |
| <property> |
| <name>security.datanode.protocol.acl</name> |
| <value> datanodes</value> |
| </property> |
| ---- |
| |
| Allow any user to talk to the HDFS cluster as a DFSClient: |
| |
| ---- |
| <property> |
| <name>security.client.protocol.acl</name> |
| <value>*</value> |
| </property> |
| ---- |