| <?xml version="1.0"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd"> |
| |
| <document> |
| |
| <header> |
| <title>Service Level Authorization Guide</title> |
| </header> |
| |
| <body> |
| |
| <section> |
| <title>Purpose</title> |
| |
| <p>This document describes how to configure and manage <em>Service Level |
| Authorization</em> for Hadoop.</p> |
| </section> |
| |
| <section> |
| <title>Prerequisites</title> |
| |
| <p>Make sure Hadoop is installed, configured and setup correctly. For more information see: </p> |
| <ul> |
| <li> |
| <a href="single_node_setup.html">Single Node Setup</a> for first-time users. |
| </li> |
| <li> |
| <a href="cluster_setup.html">Cluster Setup</a> for large, distributed clusters. |
| </li> |
| </ul> |
| </section> |
| |
| <section> |
| <title>Overview</title> |
| |
| <p>Service Level Authorization is the initial authorization mechanism to |
| ensure clients connecting to a particular Hadoop <em>service</em> have the |
| necessary, pre-configured, permissions and are authorized to access the given |
| service. For example, a MapReduce cluster can use this mechanism to allow a |
| configured list of users/groups to submit jobs.</p> |
| |
| <p>The <code>${HADOOP_CONF_DIR}/hadoop-policy.xml</code> configuration file |
| is used to define the access control lists for various Hadoop services.</p> |
| |
| <p>Service Level Authorization is performed much before to other access |
| control checks such as file-permission checks, access control on job queues |
| etc.</p> |
| </section> |
| |
| <section> |
| <title>Configuration</title> |
| |
| <p>This section describes how to configure service-level authorization |
| via the configuration file <code>{HADOOP_CONF_DIR}/hadoop-policy.xml</code>. |
| </p> |
| |
| <section> |
| <title>Enable Service Level Authorization</title> |
| |
| <p>By default, service-level authorization is disabled for Hadoop. To |
| enable it set the configuration property |
| <code>hadoop.security.authorization</code> to <strong>true</strong> |
| in <code>${HADOOP_CONF_DIR}/core-site.xml</code>.</p> |
| </section> |
| |
| <section> |
| <title>Hadoop Services and Configuration Properties</title> |
| |
| <p>This section lists the various Hadoop services and their configuration |
| knobs:</p> |
| |
| <table> |
| <tr> |
| <th>Property</th> |
| <th>Service</th> |
| </tr> |
| <tr> |
| <td><code>security.client.protocol.acl</code></td> |
| <td>ACL for ClientProtocol, which is used by user code via the |
| DistributedFileSystem.</td> |
| </tr> |
| <tr> |
| <td><code>security.client.datanode.protocol.acl</code></td> |
| <td>ACL for ClientDatanodeProtocol, the client-to-datanode protocol |
| for block recovery.</td> |
| </tr> |
| <tr> |
| <td><code>security.datanode.protocol.acl</code></td> |
| <td>ACL for DatanodeProtocol, which is used by datanodes to |
| communicate with the namenode.</td> |
| </tr> |
| <tr> |
| <td><code>security.inter.datanode.protocol.acl</code></td> |
| <td>ACL for InterDatanodeProtocol, the inter-datanode protocol |
| for updating generation timestamp.</td> |
| </tr> |
| <tr> |
| <td><code>security.namenode.protocol.acl</code></td> |
| <td>ACL for NamenodeProtocol, the protocol used by the secondary |
| namenode to communicate with the namenode.</td> |
| </tr> |
| <tr> |
| <td><code>security.inter.tracker.protocol.acl</code></td> |
| <td>ACL for InterTrackerProtocol, used by the tasktrackers to |
| communicate with the jobtracker.</td> |
| </tr> |
| <tr> |
| <td><code>security.job.submission.protocol.acl</code></td> |
| <td>ACL for JobSubmissionProtocol, used by job clients to |
| communciate with the jobtracker for job submission, querying job status |
| etc.</td> |
| </tr> |
| <tr> |
| <td><code>security.task.umbilical.protocol.acl</code></td> |
| <td>ACL for TaskUmbilicalProtocol, used by the map and reduce |
| tasks to communicate with the parent tasktracker.</td> |
| </tr> |
| <tr> |
| <td><code>security.refresh.policy.protocol.acl</code></td> |
| <td>ACL for RefreshAuthorizationPolicyProtocol, used by the |
| dfsadmin and mradmin commands to refresh the security policy in-effect. |
| </td> |
| </tr> |
| </table> |
| </section> |
| |
| <section> |
| <title>Access Control Lists</title> |
| |
| <p><code>${HADOOP_CONF_DIR}/hadoop-policy.xml</code> defines an access |
| control list for each Hadoop service. Every access control list has a |
| simple format:</p> |
| |
| <p>The list of users and groups are both comma separated list of names. |
| The two lists are separated by a space.</p> |
| |
| <p>Example: <code>user1,user2 group1,group2</code>.</p> |
| |
| <p>Add a blank at the beginning of the line if only a list of groups |
| is to be provided, equivalently a comman-separated list of users followed |
| by a space or nothing implies only a set of given users.</p> |
| |
| <p>A special value of <strong>*</strong> implies that all users are |
| allowed to access the service.</p> |
| </section> |
| |
| <section> |
| <title>Refreshing Service Level Authorization Configuration</title> |
| |
| <p>The service-level authorization configuration for the NameNode and |
| JobTracker can be changed without restarting either of the Hadoop master |
| daemons. The cluster administrator can change |
| <code>${HADOOP_CONF_DIR}/hadoop-policy.xml</code> on the master nodes and |
| instruct the NameNode and JobTracker to reload their respective |
| configurations via the <em>-refreshServiceAcl</em> switch to |
| <em>dfsadmin</em> and <em>mradmin</em> commands respectively.</p> |
| |
| <p>Refresh the service-level authorization configuration for the |
| NameNode:</p> |
| <p> |
| <code>$ bin/hadoop dfsadmin -refreshServiceAcl</code> |
| </p> |
| |
| <p>Refresh the service-level authorization configuration for the |
| JobTracker:</p> |
| <p> |
| <code>$ bin/hadoop mradmin -refreshServiceAcl</code> |
| </p> |
| |
| <p>Of course, one can use the |
| <code>security.refresh.policy.protocol.acl</code> property in |
| <code>${HADOOP_CONF_DIR}/hadoop-policy.xml</code> to restrict access to |
| the ability to refresh the service-level authorization configuration to |
| certain users/groups.</p> |
| |
| </section> |
| |
| <section> |
| <title>Examples</title> |
| |
| <p>Allow only users <code>alice</code>, <code>bob</code> and users in the |
| <code>mapreduce</code> group to submit jobs to the MapReduce cluster:</p> |
| |
| <source> |
| <property> |
| <name>security.job.submission.protocol.acl</name> |
| <value>alice,bob mapreduce</value> |
| </property> |
| </source> |
| |
| <p></p><p>Allow only DataNodes running as the users who belong to the |
| group <code>datanodes</code> to communicate with the NameNode:</p> |
| |
| <source> |
| <property> |
| <name>security.datanode.protocol.acl</name> |
| <value>datanodes</value> |
| </property> |
| </source> |
| |
| <p></p><p>Allow any user to talk to the HDFS cluster as a DFSClient:</p> |
| |
| <source> |
| <property> |
| <name>security.client.protocol.acl</name> |
| <value>*</value> |
| </property> |
| </source> |
| |
| </section> |
| </section> |
| |
| </body> |
| |
| </document> |