| <noautolink> |
| |
| [[index][::Go back to Oozie Documentation Index::]] |
| |
| ---+!! Action Authentication |
| |
| %TOC% |
| |
| ---++ Background |
| |
| A secure cluster requires that actions have been authenticated (typically via Kerberos). However, due to the way that Oozie runs |
| actions, Kerberos credentials are not easily made available to actions launched by Oozie. For many action types, this is not a |
| problem because they are self contained (beyond core Hadoop components). For example, a Pig action typically only talks to |
| MapReduce and HDFS. However, some actions require talking to external services (e.g. HCatalog, HBase Region Server, Hive Server 2) |
| and in these cases, the actions require some extra configuration in Oozie to authenticate. To be clear, this extra configuration |
| is only required if an action will be talking to these types of external services; running a typical MapReduce, Pig, Hive, etc |
| action will not require any of this. |
| |
| For these situations, Oozie will have to use its Kerberos credentials to obtain "delegation tokens" (think of it like a cookie) on |
| behalf of the user from the service in question. The details of what this means is beyond the scope of this documentation, but |
| basically, Oozie needs some extra configuration in the workflow so that it can obtain this delegation token. |
| |
| ---++ Oozie Server Configuration |
| |
| The code to obtain delegation tokens is pluggable so that it is easy to add support for different services by simply subclassing |
| org.apache.oozie.action.hadoop.Credentials to retrieve a delegation token from the service and add it to the Configuration. |
| |
| Out of the box, Oozie already comes with support for some credential types |
| (see [[DG_ActionAuthentication#Built-in_Credentials_Implementations][Built-in Credentials Implementations]]). |
| The credential classes that Oozie should load are specified by the following property in oozie-site.xml. The lefthand side of the |
| equals sign is the type for the credential type, while the righthand side is the class. |
| |
| <verbatim> |
| <property> |
| <name>oozie.credentials.credentialclasses</name> |
| <value> |
| hcat=org.apache.oozie.action.hadoop.HCatCredentials, |
| hbase=org.apache.oozie.action.hadoop.HbaseCredentials, |
| hive2=org.apache.oozie.action.hadoop.Hive2Credentials |
| </value> |
| </property> |
| </verbatim> |
| |
| ---++ Workflow Changes |
| |
| The user should add a =credentials= section to the top of their workflow that contains 1 or more =credential= sections. Each of |
| these =credential= sections contains a name for the credential, the type for the credential, and any configuration properties |
| needed by that type of credential for obtaining a delegation token. The =credentials= section is available in workflow schema |
| version 0.3 and later. |
| |
| For example, the following workflow is configured to obtain an HCatalog delegation token, which is given to a Pig action so that the |
| Pig action can talk to a secure HCatalog: |
| |
| <verbatim> |
| <workflow-app xmlns='uri:oozie:workflow:0.4' name='pig-wf'> |
| <credentials> |
| <credential name='my-hcat-creds' type='hcat'> |
| <property> |
| <name>hcat.metastore.uri</name> |
| <value>HCAT_URI</value> |
| </property> |
| <property> |
| <name>hcat.metastore.principal</name> |
| <value>HCAT_PRINCIPAL</value> |
| </property> |
| </credential> |
| </credentials> |
| ... |
| <action name='pig' cred='my-hcat-creds'> |
| <pig> |
| <job-tracker>JT</job-tracker> |
| <name-node>NN</name-node> |
| <configuration> |
| <property> |
| <name>TESTING</name> |
| <value>${start}</value> |
| </property> |
| </configuration> |
| </pig> |
| </action> |
| ... |
| </workflow-app> |
| </verbatim> |
| |
| The type of the =credential= is "hcat", which is the type name we gave for the HCatCredentials class in oozie-site.xml. We gave |
| the =credential= a name, "my-hcat-creds", which can be whatever you want; we then specify cred='my-hcat-creds' in the Pig action, |
| so that Oozie will include these credentials with the action. You can include multiple credentials with an action by specifying |
| a comma-separated list of =credential= names. And finally, the HCatCredentials required two properties (the metastore URI and |
| principal), which we also specified. |
| |
| Adding the =credentials= section to a workflow and referencing it in an action will make Oozie always try to obtain that delegation |
| token. Ordinarily, this would mean that you cannot re-use this workflow in a non-secure cluster without editing it because trying |
| to obtain the delegation token will likely fail. However, you can tell Oozie to ignore the =credentials= for a workflow by setting |
| the job-level property =oozie.credentials.skip= to =true=; this will allow you to use the same workflow.xml in a secure and |
| non-secure cluster by simply changing the job-level property at runtime. If omitted or set to =false=, Oozie will handle |
| the =credentials= section normally. In addition, you can also set this property at the action-level or server-level to skip getting |
| credentials for just that action or for all workflows, respectively. The order of priority is this: |
| |
| 1. =oozie.credentials.skip= in the =configuration= section of an action, if set |
| 1. =oozie.credentials.skip= in the job.properties for a workflow, if set |
| 1. =oozie.credentials.skip= in oozie-site.xml for all workflows, if set |
| 1. (don't skip) |
| |
| ---++ Built-in Credentials Implementations |
| |
| Oozie currently comes with the following Credentials implementations: |
| |
| 1. HCatalog and Hive Metastore: =org.apache.oozie.action.hadoop.HCatCredentials= |
| 1. HBase: =org.apache.oozie.action.hadoop.HBaseCredentials= |
| 1. Hive Server 2: =org.apache.oozie.action.hadoop.Hive2Credentials= |
| |
| HCatCredentials requires these two properties: |
| |
| 1. =hcat.metastore.principal= or hive.metastore.kerberos.principal |
| 1. =hcat.metastore.uri= or hive.metastore.uris |
| |
| *Note:* The HCatalog Metastore and Hive Metastore are one and the same and so the "hcat" type credential can also be used to talk |
| to a secure Hive Metastore, though the property names would still start with "hcat.". |
| |
| HBase does not require any additional properties since the hbase-site.xml on the Oozie server provides necessary information to the |
| obtain delegation token; though properties can be overwritten here if desired. |
| |
| Hive2Credentials requires these two properties: |
| |
| 1. =hive2.server.principal= |
| 1. =hive2.jdbc.url= |
| |
| [[index][::Go back to Oozie Documentation Index::]] |
| |
| </noautolink> |