| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> |
| <concept rev="1.1" id="authorization"> |
| |
| <title>Enabling Sentry Authorization for Impala</title> |
| |
| <prolog> |
| <metadata> |
| <data name="Category" value="Security"/> |
| <data name="Category" value="Sentry"/> |
| <data name="Category" value="Impala"/> |
| <data name="Category" value="Configuring"/> |
| <data name="Category" value="Starting and Stopping"/> |
| <data name="Category" value="Users"/> |
| <data name="Category" value="Groups"/> |
| <data name="Category" value="Administrators"/> |
| </metadata> |
| </prolog> |
| |
| <conbody id="sentry"> |
| |
| <p> |
| Authorization determines which users are allowed to access which resources, and what |
| operations they are allowed to perform. In Impala 1.1 and higher, you use Apache Sentry |
| for authorization. Sentry adds a fine-grained authorization framework for Hadoop. By |
| default (when authorization is not enabled), Impala does all read and write operations |
| with the privileges of the <codeph>impala</codeph> user, which is suitable for a |
| development/test environment but not for a secure production environment. When |
| authorization is enabled, Impala uses the OS user ID of the user who runs |
| <cmdname>impala-shell</cmdname> or other client program, and associates various privileges |
| with each user. |
| </p> |
| |
| <note> |
| Sentry is typically used in conjunction with Kerberos authentication, which defines which |
| hosts are allowed to connect to each server. Using the combination of Sentry and Kerberos |
| prevents malicious users from being able to connect by creating a named account on an |
| untrusted machine. See <xref href="impala_kerberos.xml#kerberos"/> for details about |
| Kerberos authentication. |
| </note> |
| |
| <p audience="PDF" outputclass="toc inpage"> |
| See the following sections for details about using the Impala authorization features: |
| </p> |
| |
| </conbody> |
| |
| <concept id="sentry_priv_model"> |
| |
| <title>The Sentry Privilege Model</title> |
| |
| <conbody> |
| |
| <p> |
| Privileges can be granted on different objects in the schema. Any privilege that can be |
| granted is associated with a level in the object hierarchy. If a privilege is granted on |
| a parent object in the hierarchy, the child object automatically inherits it. This is |
| the same privilege model as Hive and other database systems. |
| </p> |
| |
| <p> |
| The objects in the Impala schema hierarchy are: |
| </p> |
| |
| <codeblock>Server |
| URI |
| Database |
| Table |
| Column |
| </codeblock> |
| |
| <p rev="2.3.0 collevelauth"> |
| The table-level privileges apply to views as well. Anywhere you specify a table name, |
| you can specify a view name instead. |
| </p> |
| |
| <p rev="2.3.0 collevelauth"> |
| In <keyword keyref="impala23_full"/> and higher, you can specify privileges for |
| individual columns. |
| </p> |
| |
| <p conref="../shared/impala_common.xml#common/sentry_privileges_objects"/> |
| |
| <p> |
| Originally, privileges were encoded in a policy file, stored in HDFS. This mode of |
| operation is still an option, but the emphasis of privilege management is moving towards |
| being SQL-based. The mode of operation with <codeph>GRANT</codeph> and |
| <codeph>REVOKE</codeph> statements instead of the policy file requires that a special |
| Sentry service be enabled; this service stores, retrieves, and manipulates privilege |
| information stored inside the metastore database. |
| </p> |
| |
| <note> |
| <p> |
| Although this document refers to the <codeph>ALL</codeph> privilege, currently if you |
| use the policy file mode, you do not use the actual keyword <codeph>ALL</codeph> in |
| the policy file. When you code role entries in the policy file: |
| </p> |
| <ul> |
| <li> |
| To specify the <codeph>ALL</codeph> privilege for a server, use a role like |
| <codeph>server=<varname>server_name</varname></codeph>. |
| </li> |
| |
| <li> |
| To specify the <codeph>ALL</codeph> privilege for a database, use a role like |
| <codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname></codeph>. |
| </li> |
| |
| <li> |
| To specify the <codeph>ALL</codeph> privilege for a table, use a role like |
| <codeph>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=*</codeph>. |
| </li> |
| </ul> |
| </note> |
| |
| <p> |
| If you change privileges in Sentry, e.g. adding a user, removing a user, modifying |
| privileges, you must clear the Impala Catalog server cache by running the |
| <codeph>INVALIDATE METADATA</codeph> statement. <codeph>INVALIDATE METADATA</codeph> is |
| not required if you make the changes to privileges within Impala. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="secure_startup"> |
| |
| <title>Starting the impalad Daemon with Sentry Authorization Enabled</title> |
| |
| <prolog> |
| <metadata> |
| <data name="Category" value="Starting and Stopping"/> |
| </metadata> |
| </prolog> |
| |
| <conbody> |
| |
| <p> |
| To run the <cmdname>impalad</cmdname> daemon with authorization enabled, you add one or |
| more options to the <codeph>IMPALA_SERVER_ARGS</codeph> declaration in the |
| <filepath>/etc/default/impala</filepath> configuration file: |
| </p> |
| |
| <ul> |
| <li> |
| <codeph>-server_name</codeph>: Turns on Sentry authorization for Impala. The |
| authorization rules refer to a symbolic server name, and you specify the same name to |
| use as the argument to the <codeph>-server_name</codeph> option for all |
| <cmdname>impalad</cmdname> nodes in the cluster. |
| <p> |
| Starting in Impala 1.4.0 and higher, if you specify just |
| <codeph>-server_name</codeph> without <codeph>-authorization_policy_file</codeph>, |
| Impala uses the Sentry service for authorization. |
| </p> |
| </li> |
| |
| <li> |
| <codeph>-sentry_config</codeph>: Specifies the local path to the |
| <codeph>sentry-site.xml</codeph> configuration file. This setting is required to |
| enable authorization. |
| </li> |
| |
| <li> |
| <codeph>-authorization_policy_file</codeph>: Specifies the HDFS path to the policy |
| file that defines the privileges on schema objects. Prior to Impala 1.4.0, or if you |
| want to continue storing privilege rules in the policy file, specify the |
| <codeph>-authorization_policy_file</codeph> option to make Impala read privilege |
| information from a policy file, rather than from the metastore database. |
| </li> |
| </ul> |
| |
| <p rev="1.4.0"> |
| For example, you might adapt your <filepath>/etc/default/impala</filepath> configuration |
| to contain lines like the following. To use the Sentry service rather than the policy |
| file: |
| </p> |
| |
| <codeblock rev="1.4.0">IMPALA_SERVER_ARGS=" \ |
| -server_name=server1 \ |
| ... |
| </codeblock> |
| |
| <p> |
| Or to use the policy file, as in releases prior to Impala 1.4: |
| </p> |
| |
| <codeblock>IMPALA_SERVER_ARGS=" \ |
| -authorization_policy_file=/user/hive/warehouse/auth-policy.ini \ |
| -server_name=server1 \ |
| ... |
| </codeblock> |
| |
| <p> |
| The preceding examples set up a symbolic name of <codeph>server1</codeph> to refer to |
| the current instance of Impala. Specify the symbolic name for the |
| <codeph>sentry.hive.server</codeph> property in the <filepath>sentry-site.xml</filepath> |
| configuration file for Hive, as well as in the <codeph>-server_name</codeph> option for |
| <cmdname>impalad</cmdname>. |
| </p> |
| |
| <p> |
| Now restart the <cmdname>impalad</cmdname> daemons on all the nodes. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="sentry_service"> |
| |
| <title>Using Impala with the Sentry Service</title> |
| |
| <conbody> |
| |
| <p> |
| When you use the Sentry service, you set up privileges through the |
| <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements in either Impala or Hive. |
| Then both components use those same privileges automatically. (Impala added the |
| <codeph>GRANT</codeph> and <codeph>REVOKE</codeph> statements in |
| <keyword keyref="impala20_full" |
| />.) |
| </p> |
| |
| <p> |
| For information about using the Impala <codeph>GRANT</codeph> and |
| <codeph>REVOKE</codeph> statements, see <xref |
| href="impala_grant.xml#grant"/> |
| and <xref |
| href="impala_revoke.xml#revoke"/>. |
| </p> |
| |
| </conbody> |
| |
| <concept id="changing_privileges"> |
| |
| <title>Changing Privileges</title> |
| |
| <conbody> |
| |
| <p> |
| If you make a change to privileges in Sentry from outside of Impala, e.g. adding a |
| user, removing a user, modifying privileges, there are two options to propagate the |
| change: |
| </p> |
| |
| <ul> |
| <li> |
| Use the <codeph>catalogd</codeph> flag, |
| <codeph>--sentry_catalog_polling_frequency_s</codeph> to specify how often to do a |
| Sentry refresh. The flag is set to 60 seconds by default. |
| </li> |
| |
| <li> |
| Run the <codeph>INVALIDATE METADATA</codeph> statement to force a Sentry refresh. |
| <codeph>INVALIDATE METADATA</codeph> forces a Sentry refresh regardless of the |
| <codeph>--sentry_catalog_polling_fequency_s</codeph> flag. |
| </li> |
| </ul> |
| |
| <p> |
| If you make a change to privileges within Impala, <codeph>INVALIDATE METADATA</codeph> |
| is not required. |
| </p> |
| |
| <note type="warning"> |
| As <codeph>INVALIDATE METADATA</codeph> is an expensive operation, you should use it |
| judiciously. |
| </note> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="granting_on_uri"> |
| |
| <title>Granting Privileges on URI</title> |
| |
| <conbody> |
| |
| <p> |
| URIs represent the file paths you specify as part of statements such as <codeph>CREATE |
| EXTERNAL TABLE</codeph> and <codeph>LOAD DATA</codeph>. Typically, you specify what |
| look like UNIX paths, but these locations can also be prefixed with |
| <codeph>hdfs://</codeph> to make clear that they are really URIs. To set privileges |
| for a URI, specify the name of a directory, and the privilege applies to all the files |
| in that directory and any directories underneath it. |
| </p> |
| |
| <p> |
| URIs must start with <codeph>hdfs://</codeph>, <codeph>s3a://</codeph>, |
| <codeph>adl://</codeph>, or <codeph>file://</codeph>. If a URI starts with an absolute |
| path, the path will be appended to the default filesystem prefix. For example, if you |
| specify: |
| <codeblock> |
| GRANT ALL ON URI '/tmp'; |
| </codeblock> |
| The above statement effectively becomes the following where the default filesystem is |
| HDFS. |
| <codeblock> |
| GRANT ALL ON URI 'hdfs://localhost:20500/tmp'; |
| </codeblock> |
| </p> |
| |
| <p> |
| When defining URIs for HDFS, you must also specify the NameNode. For example: |
| <codeblock>GRANT ALL ON URI file:///path/to/dir TO <role> |
| GRANT ALL ON URI hdfs://namenode:port/path/to/dir TO <role></codeblock> |
| <note type="warning"> |
| Because the NameNode host and port must be specified, it is strongly recommended |
| that you use High Availability (HA). This ensures that the URI will remain constant |
| even if the NameNode changes. For example: |
| <codeblock>GRANT ALL ON URI hdfs://ha-nn-uri/path/to/dir TO <role></codeblock> |
| </note> |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="concept_k45_lbm_f2b"> |
| |
| <title>Examples of Setting up Authorization for Security Scenarios</title> |
| |
| <conbody> |
| |
| <p> |
| The following examples show how to set up authorization to deal with various |
| scenarios. |
| </p> |
| |
| <example> |
| |
| <title>A User with No Privileges</title> |
| |
| <p> |
| If a user has no privileges at all, that user cannot access any schema objects in |
| the system. The error messages do not disclose the names or existence of objects |
| that the user is not authorized to read. |
| </p> |
| |
| <p> |
| This is the experience you want a user to have if they somehow log into a system |
| where they are not an authorized Impala user. Or in a real deployment, a user might |
| have no privileges because they are not a member of any of the authorized groups. |
| </p> |
| |
| </example> |
| |
| <example> |
| |
| <title>Examples of Privileges for Administrative Users</title> |
| |
| <p> |
| In this example, the SQL statements grant the <codeph>entire_server</codeph> role |
| all privileges on both the databases and URIs within the server. |
| </p> |
| |
| <codeblock>CREATE ROLE entire_server; |
| GRANT ROLE entire_server TO GROUP admin_group; |
| GRANT ALL ON SERVER server1 TO ROLE entire_server; |
| </codeblock> |
| |
| </example> |
| |
| <example> |
| |
| <title>A User with Privileges for Specific Databases and Tables</title> |
| |
| <p> |
| If a user has privileges for specific tables in specific databases, the user can |
| access those things but nothing else. They can see the tables and their parent |
| databases in the output of <codeph>SHOW TABLES</codeph> and <codeph>SHOW |
| DATABASES</codeph>, <codeph>USE</codeph> the appropriate databases, and perform the |
| relevant actions (<codeph>SELECT</codeph> and/or <codeph>INSERT</codeph>) based on |
| the table privileges. To actually create a table requires the <codeph>ALL</codeph> |
| privilege at the database level, so you might define separate roles for the user |
| that sets up a schema and other users or applications that perform day-to-day |
| operations on the tables. |
| </p> |
| |
| <codeblock> |
| CREATE ROLE one_database; |
| GRANT ROLE one_database TO GROUP admin_group; |
| GRANT ALL ON DATABASE db1 TO ROLE one_database; |
| |
| CREATE ROLE instructor; |
| GRANT ROLE instructor TO GROUP trainers; |
| GRANT ALL ON TABLE db1.lesson TO ROLE instructor; |
| |
| # This particular course is all about queries, so the students can SELECT but not INSERT or CREATE/DROP. |
| CREATE ROLE student; |
| GRANT ROLE student TO GROUP visitors; |
| GRANT SELECT ON TABLE db1.training TO ROLE student;</codeblock> |
| |
| </example> |
| |
| <example> |
| |
| <title>Privileges for Working with External Data Files</title> |
| |
| <p> |
| When data is being inserted through the <codeph>LOAD DATA</codeph> statement, or is |
| referenced from an HDFS location outside the normal Impala database directories, the |
| user also needs appropriate permissions on the URIs corresponding to those HDFS |
| locations. |
| </p> |
| |
| <p> |
| In this example: |
| </p> |
| |
| <ul> |
| <li> |
| The <codeph>external_table</codeph> role can insert into and query the Impala |
| table, <codeph>external_table.sample</codeph>. |
| </li> |
| |
| <li> |
| The <codeph>staging_dir</codeph> role can specify the HDFS path |
| <filepath>/user/impala-user/external_data</filepath> with the <codeph>LOAD |
| DATA</codeph> statement. When Impala queries or loads data files, it operates on |
| all the files in that directory, not just a single file, so any Impala |
| <codeph>LOCATION</codeph> parameters refer to a directory rather than an |
| individual file. |
| </li> |
| </ul> |
| |
| <codeblock>CREATE ROLE external_table; |
| GRANT ROLE external_table TO GROUP impala_users; |
| GRANT ALL ON TABLE external_table.sample TO ROLE external_table; |
| |
| CREATE ROLE staging_dir; |
| GRANT ROLE staging TO GROUP impala_users; |
| GRANT ALL ON URI 'hdfs://127.0.0.1:8020/user/impala-user/external_data' TO ROLE staging_dir;</codeblock> |
| |
| </example> |
| |
| <example> |
| |
| <title>Separating Administrator Responsibility from Read and Write Privileges</title> |
| |
| <p> |
| To create a database, you need the full privilege on that database while day-to-day |
| operations on tables within that database can be performed with lower levels of |
| privilege on specific table. Thus, you might set up separate roles for each database |
| or application: an administrative one that could create or drop the database, and a |
| user-level one that can access only the relevant tables. |
| </p> |
| |
| <p> |
| In this example, the responsibilities are divided between users in 3 different |
| groups: |
| </p> |
| |
| <ul> |
| <li> |
| Members of the <codeph>supergroup</codeph> group have the |
| <codeph>training_sysadmin</codeph> role and so can set up a database named |
| <codeph>training</codeph>. |
| </li> |
| |
| <li> |
| Members of the <codeph>impala_users</codeph> group have the |
| <codeph>instructor</codeph> role and so can create, insert into, and query any |
| tables in the <codeph>training</codeph> database, but cannot create or drop the |
| database itself. |
| </li> |
| |
| <li> |
| Members of the <codeph>visitor</codeph> group have the <codeph>student</codeph> |
| role and so can query those tables in the <codeph>training</codeph> database. |
| </li> |
| </ul> |
| |
| <codeblock>CREATE ROLE training_sysadmin; |
| GRANT ROLE training_sysadmin TO GROUP supergroup; |
| GRANT ALL ON DATABASE training1 TO ROLE training_sysadmin; |
| |
| CREATE ROLE instructor; |
| GRANT ROLE instructor TO GROUP impala_users; |
| GRANT ALL ON TABLE training1.course1 TO ROLE instructor; |
| |
| CREATE ROLE visitor; |
| GRANT ROLE student TO GROUP visitor; |
| GRANT SELECT ON TABLE training1.course1 TO ROLE student;</codeblock> |
| |
| </example> |
| |
| </conbody> |
| |
| </concept> |
| |
| </concept> |
| |
| <concept id="security_policy_file"> |
| |
| <title>Using Impala with the Sentry Policy File</title> |
| |
| <conbody> |
| |
| <p> |
| The policy file is a file that you put in a designated location in HDFS, and is read |
| during the startup of the <cmdname>impalad</cmdname> daemon when you specify both the |
| <codeph>-server_name</codeph> and <codeph>-authorization_policy_file</codeph> startup |
| options. It controls which objects (databases, tables, and HDFS directory paths) can be |
| accessed by the user who connects to <cmdname>impalad</cmdname>, and what operations |
| that user can perform on the objects. |
| </p> |
| |
| <note rev="1.4.0"> |
| The policy-file based authorization was deprecated in <keyword keyref="impala26"/>. We |
| recommend managing privileges through SQL statements as described in |
| <xref |
| href="impala_authorization.xml#sentry_service"/>. If you are still using |
| policy files, plan to migrate to the new approach some time in the future. |
| </note> |
| |
| <p> |
| The location of the policy file is listed in the <filepath>auth-site.xml</filepath> |
| configuration file. |
| </p> |
| |
| <p> |
| When authorization is enabled, Impala uses the policy file as a <i>whitelist</i>, |
| representing every privilege available to any user on any object. That is, only |
| operations specified for the appropriate combination of object, role, group, and user |
| are allowed. All other operations are not allowed. If a group or role is defined |
| multiple times in the policy file, the last definition takes precedence. |
| </p> |
| |
| <p> |
| To understand the notion of whitelisting, set up a minimal policy file that does not |
| provide any privileges for any object. When you connect to an Impala node where this |
| policy file is in effect, you get no results for <codeph>SHOW DATABASES</codeph>, and an |
| error when you issue any <codeph>SHOW TABLES</codeph>, <codeph>USE |
| <varname>database_name</varname></codeph>, <codeph>DESCRIBE |
| <varname>table_name</varname></codeph>, <codeph>SELECT</codeph>, and or other statements |
| that expect to access databases or tables, even if the corresponding databases and |
| tables exist. |
| </p> |
| |
| <p> |
| The contents of the policy file are cached, to avoid a performance penalty for each |
| query. The policy file is re-checked by each <cmdname>impalad</cmdname> node every 5 |
| minutes. When you make a non-time-sensitive change such as adding new privileges or new |
| users, you can let the change take effect automatically a few minutes later. If you |
| remove or reduce privileges, and want the change to take effect immediately, restart the |
| <cmdname>impalad</cmdname> daemon on all nodes, again specifying the |
| <codeph>-server_name</codeph> and <codeph>-authorization_policy_file</codeph> options so |
| that the rules from the updated policy file are applied. |
| </p> |
| |
| </conbody> |
| |
| <concept id="security_policy_file_details"> |
| |
| <title>Policy File Format</title> |
| |
| <conbody> |
| |
| <p> |
| The policy file uses the familiar <codeph>.ini</codeph> format, divided into the major |
| sections <codeph>[groups]</codeph> and <codeph>[roles]</codeph>. |
| </p> |
| |
| <p> |
| There is also an optional <codeph>[databases]</codeph> section, which allows you to |
| specify a specific policy file for a particular database, as explained in |
| <xref href="#security_multiple_policy_files" |
| />. |
| </p> |
| |
| <p> |
| Another optional section, <codeph>[users]</codeph>, allows you to override the |
| OS-level mapping of users to groups; that is an advanced technique primarily for |
| testing and debugging, and is beyond the scope of this document. |
| </p> |
| |
| <p> |
| In the <codeph>[groups]</codeph> section, you define various categories of users and |
| select which roles are associated with each category. The group and usernames |
| correspond to Linux groups and users on the server where the |
| <cmdname>impalad</cmdname> daemon runs. |
| </p> |
| |
| <p> |
| The group and usernames in the <codeph>[groups]</codeph> section correspond to Hadoop |
| groups and users on the server where the <cmdname>impalad</cmdname> daemon runs. When |
| you access Impala through the <cmdname>impalad</cmdname> interpreter, for purposes of |
| authorization, the user is the logged-in Linux user and the groups are the Linux |
| groups that user is a member of. When you access Impala through the ODBC or JDBC |
| interfaces, the user and password specified through the connection string are used as |
| login credentials for the Linux server, and authorization is based on that username |
| and the associated Linux group membership. |
| </p> |
| |
| <p> |
| In the <codeph>[roles]</codeph> section, you a set of roles. For each role, you |
| specify precisely the set of privileges is available. That is, which objects users |
| with that role can access, and what operations they can perform on those objects. This |
| is the lowest-level category of security information; the other sections in the policy |
| file map the privileges to higher-level divisions of groups and users. In the |
| <codeph>[groups]</codeph> section, you specify which roles are associated with which |
| groups. The group and usernames correspond to Linux groups and users on the server |
| where the <cmdname>impalad</cmdname> daemon runs. The privileges are specified using |
| patterns like: |
| <codeblock>server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=SELECT |
| server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=t<varname>able_name</varname>->action=CREATE |
| server=<varname>server_name</varname>->db=<varname>database_name</varname>->table=<varname>table_name</varname>->action=ALL |
| </codeblock> |
| For the <varname>server_name</varname> value, substitute the same symbolic name you |
| specify with the <cmdname>impalad</cmdname> <codeph>-server_name</codeph> option. You |
| can use <codeph>*</codeph> wildcard characters at each level of the privilege |
| specification to allow access to all such objects. For example: |
| <codeblock>server=impala-host.example.com->db=default->table=t1->action=SELECT |
| server=impala-host.example.com->db=*->table=*->action=CREATE |
| server=impala-host.example.com->db=*->table=audit_log->action=SELECT |
| server=impala-host.example.com->db=default->table=t1->action=* |
| </codeblock> |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="security_multiple_policy_files"> |
| |
| <title>Using Multiple Policy Files for Different Databases</title> |
| |
| <conbody> |
| |
| <p> |
| For an Impala cluster with many databases being accessed by many users and |
| applications, it might be cumbersome to update the security policy file for each |
| privilege change or each new database, table, or view. You can allow security to be |
| managed separately for individual databases, by setting up a separate policy file for |
| each database: |
| </p> |
| |
| <ul> |
| <li> |
| Add the optional <codeph>[databases]</codeph> section to the main policy file. |
| </li> |
| |
| <li> |
| Add entries in the <codeph>[databases]</codeph> section for each database that has |
| its own policy file. |
| </li> |
| |
| <li> |
| For each listed database, specify the HDFS path of the appropriate policy file. |
| </li> |
| </ul> |
| |
| <p> |
| For example: |
| </p> |
| |
| <codeblock>[databases] |
| # Defines the location of the per-DB policy files for the 'customers' and 'sales' databases. |
| customers = hdfs://ha-nn-uri/etc/access/customers.ini |
| sales = hdfs://ha-nn-uri/etc/access/sales.ini |
| </codeblock> |
| |
| <p> |
| To enable URIs in per-DB policy files, the Java configuration option |
| <codeph>sentry.allow.uri.db.policyfile</codeph> must be set to <codeph>true</codeph>. |
| For example: |
| </p> |
| |
| <codeblock>JAVA_TOOL_OPTIONS="-Dsentry.allow.uri.db.policyfile=true" |
| </codeblock> |
| |
| <note type="important"> |
| Enabling URIs in per-DB policy files introduces a security risk by allowing the owner |
| of the db-level policy file to grant himself/herself load privileges to anything the |
| <codeph>impala</codeph> user has read permissions for in HDFS (including data in other |
| databases controlled by different db-level policy files). |
| </note> |
| |
| </conbody> |
| |
| </concept> |
| |
| </concept> |
| |
| <concept id="security_schema"> |
| |
| <title>Setting Up Schema Objects for a Secure Impala Deployment</title> |
| |
| <conbody> |
| |
| <p> |
| In your role definitions, you must specify privileges at the level of individual |
| databases and tables, or all databases or all tables within a database. To simplify the |
| structure of these rules, plan ahead of time how to name your schema objects so that |
| data with different authorization requirements is divided into separate databases. |
| </p> |
| |
| <p> |
| If you are adding security on top of an existing Impala deployment, you can rename |
| tables or even move them between databases using the <codeph>ALTER TABLE</codeph> |
| statement. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="sentry_debug"> |
| |
| <title><ph conref="../shared/impala_common.xml#common/title_sentry_debug" |
| /></title> |
| |
| <conbody> |
| |
| <p conref="../shared/impala_common.xml#common/sentry_debug"/> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="sec_ex_default"> |
| |
| <title>The DEFAULT Database in a Secure Deployment</title> |
| |
| <conbody> |
| |
| <p> |
| Because of the extra emphasis on granular access controls in a secure deployment, you |
| should move any important or sensitive information out of the <codeph>DEFAULT</codeph> |
| database into a named database whose privileges are specified in the policy file. |
| Sometimes you might need to give privileges on the <codeph>DEFAULT</codeph> database for |
| administrative reasons; for example, as a place you can reliably specify with a |
| <codeph>USE</codeph> statement when preparing to drop a database. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| </concept> |