| // Licensed to the Apache Software Foundation (ASF) under one or more |
| // contributor license agreements. See the NOTICE file distributed with |
| // this work for additional information regarding copyright ownership. |
| // The ASF licenses this file to You under the Apache License, Version 2.0 |
| // (the "License"); you may not use this file except in compliance with |
| // the License. You may obtain a copy of the License at |
| // |
| // http://www.apache.org/licenses/LICENSE-2.0 |
| // |
| // Unless required by applicable law or agreed to in writing, software |
| // distributed under the License is distributed on an "AS IS" BASIS, |
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| // See the License for the specific language governing permissions and |
| // limitations under the License. |
| |
| == Security |
| |
| Accumulo extends the BigTable data model to implement a security mechanism |
| known as cell-level security. Every key-value pair has its own security label, stored |
| under the column visibility element of the key, which is used to determine whether |
| a given user meets the security requirements to read the value. This enables data of |
| various security levels to be stored within the same row, and users of varying |
| degrees of access to query the same table, while preserving data confidentiality. |
| |
| === Security Label Expressions |
| |
| When mutations are applied, users can specify a security label for each value. This is |
| done as the Mutation is created by passing a ColumnVisibility object to the put() |
| method: |
| |
| [source,java] |
| ---- |
| Text rowID = new Text("row1"); |
| Text colFam = new Text("myColFam"); |
| Text colQual = new Text("myColQual"); |
| ColumnVisibility colVis = new ColumnVisibility("public"); |
| long timestamp = System.currentTimeMillis(); |
| |
| Value value = new Value("myValue"); |
| |
| Mutation mutation = new Mutation(rowID); |
| mutation.put(colFam, colQual, colVis, timestamp, value); |
| ---- |
| |
| === Security Label Expression Syntax |
| |
| Security labels consist of a set of user-defined tokens that are required to read the |
| value the label is associated with. The set of tokens required can be specified using |
| syntax that supports logical AND +&+ and OR +|+ combinations of terms, as |
| well as nesting groups +()+ of terms together. |
| |
| Each term is comprised of one to many alpha-numeric characters, hyphens, underscores or |
| periods. Optionally, each term may be wrapped in quotation marks |
| which removes the restriction on valid characters. In quoted terms, quotation marks |
| and backslash characters can be used as characters in the term by escaping them |
| with a backslash. |
| |
| For example, suppose within our organization we want to label our data values with |
| security labels defined in terms of user roles. We might have tokens such as: |
| |
| admin |
| audit |
| system |
| |
| These can be specified alone or combined using logical operators: |
| |
| ---- |
| // Users must have admin privileges |
| admin |
| |
| // Users must have admin and audit privileges |
| admin&audit |
| |
| // Users with either admin or audit privileges |
| admin|audit |
| |
| // Users must have audit and one or both of admin or system |
| (admin|system)&audit |
| ---- |
| |
| When both +|+ and +&+ operators are used, parentheses must be used to specify |
| precedence of the operators. |
| |
| === Authorization |
| |
| When clients attempt to read data from Accumulo, any security labels present are |
| examined against the set of authorizations passed by the client code when the |
| Scanner or BatchScanner are created. If the authorizations are determined to be |
| insufficient to satisfy the security label, the value is suppressed from the set of |
| results sent back to the client. |
| |
| Authorizations are specified as a comma-separated list of tokens the user possesses: |
| |
| [source,java] |
| ---- |
| // user possesses both admin and system level access |
| Authorization auths = new Authorization("admin","system"); |
| |
| Scanner s = connector.createScanner("table", auths); |
| ---- |
| |
| === User Authorizations |
| |
| Each Accumulo user has a set of associated security labels. To manipulate |
| these in the shell while using the default authorizor, use the setuaths and getauths commands. |
| These may also be modified for the default authorizor using the java security operations API. |
| |
| When a user creates a scanner a set of Authorizations is passed. If the |
| authorizations passed to the scanner are not a subset of the users |
| authorizations, then an exception will be thrown. |
| |
| To prevent users from writing data they can not read, add the visibility |
| constraint to a table. Use the -evc option in the createtable shell command to |
| enable this constraint. For existing tables use the following shell command to |
| enable the visibility constraint. Ensure the constraint number does not |
| conflict with any existing constraints. |
| |
| config -t table -s table.constraint.1=org.apache.accumulo.core.security.VisibilityConstraint |
| |
| Any user with the alter table permission can add or remove this constraint. |
| This constraint is not applied to bulk imported data, if this a concern then |
| disable the bulk import permission. |
| |
| === Pluggable Security |
| |
| New in 1.5 of Accumulo is a pluggable security mechanism. It can be broken into three actions -- |
| authentication, authorization, and permission handling. By default all of these are handled in |
| Zookeeper, which is how things were handled in Accumulo 1.4 and before. It is worth noting at this |
| point, that it is a new feature in 1.5 and may be adjusted in future releases without the standard |
| deprecation cycle. |
| |
| Authentication simply handles the ability for a user to verify their integrity. A combination of |
| principal and authentication token are used to verify a user is who they say they are. An |
| authentication token should be constructed, either directly through its constructor, but it is |
| advised to use the +init(Property)+ method to populate an authentication token. It is expected that a |
| user knows what the appropriate token to use for their system is. The default token is |
| +PasswordToken+. |
| |
| Once a user is authenticated by the Authenticator, the user has access to the other actions within |
| Accumulo. All actions in Accumulo are ACLed, and this ACL check is handled by the Permission |
| Handler. This is what manages all of the permissions, which are divided in system and per table |
| level. From there, if a user is doing an action which requires authorizations, the Authorizor is |
| queried to determine what authorizations the user has. |
| |
| This setup allows a variety of different mechanisms to be used for handling different aspects of |
| Accumulo's security. A system like Kerberos can be used for authentication, then a system like LDAP |
| could be used to determine if a user has a specific permission, and then it may default back to the |
| default ZookeeperAuthorizor to determine what Authorizations a user is ultimately allowed to use. |
| This is a pluggable system so custom components can be created depending on your need. |
| |
| === Secure Authorizations Handling |
| |
| For applications serving many users, it is not expected that an Accumulo user |
| will be created for each application user. In this case an Accumulo user with |
| all authorizations needed by any of the applications users must be created. To |
| service queries, the application should create a scanner with the application |
| user's authorizations. These authorizations could be obtained from a trusted 3rd |
| party. |
| |
| Often production systems will integrate with Public-Key Infrastructure (PKI) and |
| designate client code within the query layer to negotiate with PKI servers in order |
| to authenticate users and retrieve their authorization tokens (credentials). This |
| requires users to specify only the information necessary to authenticate themselves |
| to the system. Once user identity is established, their credentials can be accessed by |
| the client code and passed to Accumulo outside of the reach of the user. |
| |
| === Query Services Layer |
| |
| Since the primary method of interaction with Accumulo is through the Java API, |
| production environments often call for the implementation of a Query layer. This |
| can be done using web services in containers such as Apache Tomcat, but is not a |
| requirement. The Query Services Layer provides a mechanism for providing a |
| platform on which user facing applications can be built. This allows the application |
| designers to isolate potentially complex query logic, and enables a convenient point |
| at which to perform essential security functions. |
| |
| Several production environments choose to implement authentication at this layer, |
| where users identifiers are used to retrieve their access credentials which are then |
| cached within the query layer and presented to Accumulo through the |
| Authorizations mechanism. |
| |
| Typically, the query services layer sits between Accumulo and user workstations. |