| .. Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| .. http://www.apache.org/licenses/LICENSE-2.0 |
| |
| .. Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| |
| Kerberos |
| -------- |
| |
| Airflow has initial support for Kerberos. This means that Airflow can renew Kerberos |
| tickets for itself and store them in the ticket cache. Hooks and DAGs can use the ticket |
| to authenticate against Kerberized services. |
| |
| .. contents:: |
| :depth: 1 |
| :local: |
| |
| Limitations |
| ''''''''''' |
| |
| Please note that, at this time, not all hooks have been adjusted to make use of this functionality. |
| Kerberos is also not integrated into the web interface, so for now you will have to rely on |
| network-level security to keep your service secure. |
| |
| Celery integration has not been tried and tested yet. However, if you generate a keytab for every |
| host and launch a ticket renewer next to every worker, it will most likely work. |
| |
| Enabling Kerberos |
| ''''''''''''''''' |
| |
| Airflow |
| ^^^^^^^ |
| |
| To enable Kerberos, you will need to generate a (service) keytab. |
| |
| .. code-block:: bash |
| |
| # in the kadmin.local or kadmin shell, create the airflow principal |
| kadmin: addprinc -randkey airflow/fully.qualified.domain.name@YOUR-REALM.COM |
| |
| # Create the airflow keytab file that will contain the airflow principal |
| # (note: the -norandkey option is only accepted by kadmin.local) |
| kadmin: xst -norandkey -k airflow.keytab airflow/fully.qualified.domain.name |
| |
| Now store this file in a location where the airflow user can read it (chmod 600), then add the following to |
| your ``airflow.cfg``: |
| |
| .. code-block:: ini |
| |
| [core] |
| security = kerberos |
| |
| [kerberos] |
| keytab = /etc/airflow/airflow.keytab |
| reinit_frequency = 3600 |
| principal = airflow |
| |
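Before starting the renewer, it can help to sanity-check these settings. The following is a minimal sketch and not part of Airflow: ``check_kerberos_config`` is a hypothetical helper that parses the configuration text with Python's standard ``configparser`` and reports a missing keytab or one that group or other users can access, on the assumption that the keytab should be private to the airflow user, as the ``chmod 600`` above suggests.

```python
import configparser
import os
import stat


def check_kerberos_config(cfg_text: str) -> list:
    """Return a list of problems found in the Kerberos-related settings.

    Hypothetical helper for illustration only; Airflow does not ship it.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    problems = []

    # The [core] section must select Kerberos as the security mechanism.
    if parser.get("core", "security", fallback="") != "kerberos":
        problems.append("[core] security is not set to kerberos")

    # The keytab must exist and must not be accessible by group or others.
    keytab = parser.get("kerberos", "keytab", fallback="")
    if not os.path.isfile(keytab):
        problems.append(f"keytab not found: {keytab}")
    elif stat.S_IMODE(os.stat(keytab).st_mode) & 0o077:
        problems.append(f"keytab is accessible by group or others: {keytab}")
    return problems


EXAMPLE_CFG = """
[core]
security = kerberos

[kerberos]
keytab = /etc/airflow/airflow.keytab
reinit_frequency = 3600
principal = airflow
"""

for problem in check_kerberos_config(EXAMPLE_CFG):
    print(problem)
```

An empty result only means that these two checks passed; it does not prove that the keytab contains the configured principal.
| |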
| Launch the ticket renewer with: |
| |
| .. code-block:: bash |
| |
| # run ticket renewer |
| airflow kerberos |
| |
| Hadoop |
| ^^^^^^ |
| |
| If you want to use impersonation, it needs to be enabled in the ``core-site.xml`` of your Hadoop configuration. |
| |
| .. code-block:: xml |
| |
| <property> |
| <name>hadoop.proxyuser.airflow.groups</name> |
| <value>*</value> |
| </property> |
| |
| <property> |
| <name>hadoop.proxyuser.airflow.users</name> |
| <value>*</value> |
| </property> |
| |
| <property> |
| <name>hadoop.proxyuser.airflow.hosts</name> |
| <value>*</value> |
| </property> |
| |
| Of course, if you need to tighten your security, replace the asterisk with something more appropriate. |
| |
| Using Kerberos authentication |
| ''''''''''''''''''''''''''''' |
| |
| The Hive hook has been updated to take advantage of Kerberos authentication. To allow your DAGs to |
| use it, simply update the extra field of the connection with, for example: |
| |
| .. code-block:: json |
| |
| { "use_beeline": true, "principal": "hive/_HOST@EXAMPLE.COM"} |
| |
| Adjust the principal to your settings. The ``_HOST`` part will be replaced by the fully qualified domain name of |
| the server. |
| |
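The ``_HOST`` substitution described above can be illustrated with a short sketch. This is not Airflow's actual code: ``replace_host_pattern`` is a hypothetical function, and both the fallback to the local machine's name via ``socket.getfqdn()`` and the lowercasing of the host name are assumptions made for the example.

```python
import socket
from typing import Optional


def replace_host_pattern(principal: str, hostname: Optional[str] = None) -> str:
    """Fill the ``_HOST`` placeholder of a principal with a host name.

    Hypothetical helper for illustration; if no host name is given,
    the local machine's fully qualified domain name is used (assumption).
    """
    fqdn = (hostname or socket.getfqdn()).lower()
    return principal.replace("_HOST", fqdn)


# Prints: hive/hs2.example.com@EXAMPLE.COM
print(replace_host_pattern("hive/_HOST@EXAMPLE.COM", "hs2.example.com"))
```

A principal without the placeholder is returned unchanged, so the same connection setting works for fixed and templated principals alike.
| |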
| You can specify whether the connection should use the DAG owner or the user given in the login |
| section of the connection. For the login user, specify the following as extra: |
| |
| .. code-block:: json |
| |
| { "use_beeline": true, "principal": "hive/_HOST@EXAMPLE.COM", "proxy_user": "login"} |
| |
| For the DAG owner use: |
| |
| .. code-block:: json |
| |
| { "use_beeline": true, "principal": "hive/_HOST@EXAMPLE.COM", "proxy_user": "owner"} |
| |
| and in your DAG, when initializing the HiveOperator, specify: |
| |
| .. code-block:: python |
| |
| run_as_owner=True |
| |
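The extra values shown above differ only in the optional ``proxy_user`` key, so they can be generated rather than typed by hand. The sketch below assumes a hypothetical helper, ``build_hive_extra``, that is not part of Airflow; it only uses the keys and values that appear in the examples above.

```python
import json
from typing import Optional


def build_hive_extra(principal: str, proxy_user: Optional[str] = None) -> str:
    """Build the JSON extra string for a Kerberized Hive connection.

    Hypothetical helper for illustration; proxy_user may be "login",
    "owner", or omitted, matching the three variants in the text.
    """
    if proxy_user not in (None, "login", "owner"):
        raise ValueError("proxy_user must be 'login', 'owner', or omitted")
    extra = {"use_beeline": True, "principal": principal}
    if proxy_user is not None:
        extra["proxy_user"] = proxy_user
    return json.dumps(extra)


# Prints: {"use_beeline": true, "principal": "hive/_HOST@EXAMPLE.COM", "proxy_user": "owner"}
print(build_hive_extra("hive/_HOST@EXAMPLE.COM", proxy_user="owner"))
```

The resulting string can be pasted into the extra field of the connection; rejecting unknown ``proxy_user`` values early avoids a silent typo in the connection settings.
| |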
| To use Kerberos authentication, you must install Airflow with the ``kerberos`` extras group: |
| |
| .. code-block:: bash |
| |
| pip install 'apache-airflow[kerberos]' |
| |
| You can read about some production aspects of Kerberos deployment at :ref:`production-deployment:kerberos`. |