In Mesos, the authorization subsystem allows the operator to configure the actions that certain principals are allowed to perform. For example, the operator can use authorization to ensure that principal foo
can only register frameworks subscribed to role bar
, and no other principals can register frameworks subscribed to any roles.
A reference implementation local authorizer provides basic security for most use cases. This authorizer is configured using Access Control Lists (ACLs). Alternative implementations could express their authorization rules in different ways. The local authorizer is used if the --authorizers
flag is not specified (or manually set to the default value local
) and ACLs are specified via the --acls
flag.
This document is divided into two main sections. The first section explores the concepts necessary to successfully configure the local authorizer. The second briefly discusses how to implement a custom authorizer; this section is not directed at operators but at engineers who wish to build their own authorizer back end.
When the agent‘s --authenticate_http_executors
flag is set, HTTP executors are required to authenticate with the HTTP executor API. When they do so, a simple implicit authorization rule is applied. In plain language, the rule states that executors can only perform actions on themselves. More specifically, an executor’s authenticated principal must contain claims with keys fid
, eid
, and cid
, with values equal to the currently-running executor‘s framework ID, executor ID, and container ID, respectively. By default, an authentication token containing these claims is injected into the executor’s environment (see the authentication documentation for more information).
Similarly, when the agent‘s --authenticate_http_readwrite
flag is set, HTTP executor’s are required to authenticate with the HTTP operator API when making calls such as LAUNCH_NESTED_CONTAINER
. In this case, executor authorization is performed via the loaded authorizer module, if present. The default Mesos local authorizer applies a simple implicit authorization rule, requiring that the executor‘s principal contain a claim with key cid
and a value equal to the currently-running executor’s container ID.
A principal identifies an entity (i.e., a framework or an operator) that interacts with Mesos. A role, on the other hand, is used to associate resources with frameworks in various ways. A useful analogy can be made with user management in the Unix world: principals correspond to usernames, while roles approximately correspond to groups. For more information about roles, see the roles documentation.
In a real-world organization, principals and roles might be used to represent various individuals or groups; for example, principals could correspond to people responsible for particular frameworks, while roles could correspond to departments within the organization which run frameworks on the cluster. To illustrate this point, consider a company that wants to allocate datacenter resources amongst multiple departments, one of which is the accounting department. Here is a possible scenario in which the accounting department launches a Mesos framework and then attempts to destroy a persistent volume:
principal
and secret
. Here, let the framework principal be payroll-framework
; this principal represents the trusted identity of the framework.FrameworkInfo
object containing a principal
and roles
; in this case, it will use a single role named accounting
. The principal in this message must be payroll-framework
, to match the one used by the framework for authentication.RegisterFramework
ACL which authorizes the principal payroll-framework
to register with the accounting
role. It does find such an ACL, the framework registers successfully. Now that the framework is subscribed to the accounting
role, any weights, reservations, persistent volumes, or quota associated with the accounting department's role will apply when allocating resources to this role within the framework. This allows operators to control the resource consumption of this department.ACCEPT
call containing an offer operation which will DESTROY
the persistent volume.DestroyVolume
ACL which asserts that the principal payroll-framework
can destroy volumes created by a creator_principal
of NONE
; in other words, this framework cannot destroy persistent volumes, so the operation will be refused.When authorizing an action, the local authorizer proceeds through a list of relevant rules until it finds one that can either grant or deny permission to the subject making the request. These rules are configured with Access Control Lists (ACLs) in the case of the local authorizer. The ACLs are defined with a JSON-based language via the --acls
flag.
Each ACL consist of an array of JSON objects. Each of these objects has two entries. The first, principals
, is common to all actions and describes the subjects which wish to perform the given action. The second entry varies among actions and describes the object on which the action will be executed. Both entries are specified with the same type of JSON object, known as Entity
. The local authorizer works by comparing Entity
objects, so understanding them is key to writing good ACLs.
An Entity
is essentially a container which can either hold a particular value or specify the special types ANY
or NONE
.
A global field which affects all ACLs can be set. This field is called permissive
and it defines the behavior when no ACL applies to the request made. If set to true
(which is the default) it will allow by default all non-matching requests, if set to false
it will reject all non-matching requests.
Note that when setting permissive
to false
a number of standard operations (e.g., run_tasks
or register_frameworks
) will require ACLs in order to work. There are two ways to disallow unauthorized uses on specific operations:
Leave permissive
set to true
and disallow ANY
principal to perform actions to all objects except the ones explicitly allowed. Consider the example below for details.
Set permissive
to false
but allow ANY
principal to perform the action on ANY
object. This needs to be done for all actions which should work without being checked against ACLs. A template doing this for all actions can be found in acls_template.json.
More information about the structure of the ACLs can be found in their definition inside the Mesos source code.
ACLs are compared in the order that they are specified. In other words, if an ACL allows some action and a later ACL forbids it, the action is allowed; likewise, if the ACL forbidding the action appears earlier than the one allowing the action, the action is forbidden. If no ACLs match a request, the request is authorized if the ACLs are permissive (which is the default behavior). If permissive
is explicitly set to false, all non-matching requests are declined.
Currently, the local authorizer configuration format supports the following entries, each representing an authorizable action:
The get_endpoints
action covers:
/files/debug
/logging/toggle
/metrics/snapshot
/slave(id)/containers
/slave(id)/containerizer/debug
/slave(id)/monitor/statistics
Consider for example the following ACL: Only principal foo
can register frameworks subscribed to the analytics
role. All principals can register frameworks subscribing to any other roles (including the principal foo
since permissive is the default behavior).
{ "register_frameworks": [ { "principals": { "values": ["foo"] }, "roles": { "values": ["analytics"] } }, { "principals": { "type": "NONE" }, "roles": { "values": ["analytics"] } } ] }
Principal foo
can register frameworks subscribed to the analytics
and ads
roles and no other role. Any other principal (or framework without a principal) can register frameworks subscribed to any roles.
{ "register_frameworks": [ { "principals": { "values": ["foo"] }, "roles": { "values": ["analytics", "ads"] } }, { "principals": { "values": ["foo"] }, "roles": { "type": "NONE" } } ] }
Only principal foo
and no one else can register frameworks subscribed to the analytics
role. Any other principal (or framework without a principal) can register frameworks subscribed to any other roles.
{ "register_frameworks": [ { "principals": { "values": ["foo"] }, "roles": { "values": ["analytics"] } }, { "principals": { "type": "NONE" }, "roles": { "values": ["analytics"] } } ] }
Principal foo
can register frameworks subscribed to the analytics
role and no other roles. No other principal can register frameworks subscribed to any roles, including *
.
{ "permissive": false, "register_frameworks": [ { "principals": { "values": ["foo"] }, "roles": { "values": ["analytics"] } } ] }
In the following example permissive
is set to false
; hence, principals can only run tasks as operating system users guest
or bar
, but not as any other user.
{ "permissive": false, "run_tasks": [ { "principals": { "type": "ANY" }, "users": { "values": ["guest", "bar"] } } ] }
Principals foo
and bar
can run tasks as the agent operating system user alice
and no other user. No other principal can run tasks.
{ "permissive": false, "run_tasks": [ { "principals": { "values": ["foo", "bar"] }, "users": { "values": ["alice"] } } ] }
Principal foo
can run tasks only as the agent operating system user guest
and no other user. Any other principal (or framework without a principal) can run tasks as any user.
{ "run_tasks": [ { "principals": { "values": ["foo"] }, "users": { "values": ["guest"] } }, { "principals": { "values": ["foo"] }, "users": { "type": "NONE" } } ] }
No principal can run tasks as the agent operating system user root
. Any principal (or framework without a principal) can run tasks as any other user.
{ "run_tasks": [ { "principals": { "type": "NONE" }, "users": { "values": ["root"] } } ] }
The order in which the rules are defined is important. In the following example, the ACLs effectively forbid anyone from tearing down frameworks even though the intention clearly is to allow only admin
to shut them down:
{ "teardown_frameworks": [ { "principals": { "type": "NONE" }, "framework_principals": { "type": "ANY" } }, { "principals": { "type": "admin" }, "framework_principals": { "type": "ANY" } } ] }
The previous ACL can be fixed as follows:
{ "teardown_frameworks": [ { "principals": { "type": "admin" }, "framework_principals": { "type": "ANY" } }, { "principals": { "type": "NONE" }, "framework_principals": { "type": "ANY" } } ] }
The ops
principal can teardown any framework using the /teardown HTTP endpoint. No other principal can teardown any frameworks.
{ "permissive": false, "teardown_frameworks": [ { "principals": { "values": ["ops"] }, "framework_principals": { "type": "ANY" } } ] }
The principal foo
can reserve resources for any role, and no other principal can reserve resources.
{ "permissive": false, "reserve_resources": [ { "principals": { "values": ["foo"] }, "roles": { "type": "ANY" } } ] }
The principal foo
cannot reserve resources, and any other principal (or framework without a principal) can reserve resources for any role.
{ "reserve_resources": [ { "principals": { "values": ["foo"] }, "roles": { "type": "NONE" } } ] }
The principal foo
can reserve resources only for roles prod
and dev
, and no other principal (or framework without a principal) can reserve resources for any role.
{ "permissive": false, "reserve_resources": [ { "principals": { "values": ["foo"] }, "roles": { "values": ["prod", "dev"] } } ] }
The principal foo
can unreserve resources reserved by itself and by the principal bar
. The principal bar
, however, can only unreserve its own resources. No other principal can unreserve resources.
{ "permissive": false, "unreserve_resources": [ { "principals": { "values": ["foo"] }, "reserver_principals": { "values": ["foo", "bar"] } }, { "principals": { "values": ["bar"] }, "reserver_principals": { "values": ["bar"] } } ] }
The principal foo
can create persistent volumes for any role, and no other principal can create persistent volumes.
{ "permissive": false, "create_volumes": [ { "principals": { "values": ["foo"] }, "roles": { "type": "ANY" } } ] }
The principal foo
cannot create persistent volumes for any role, and any other principal can create persistent volumes for any role.
{ "create_volumes": [ { "principals": { "values": ["foo"] }, "roles": { "type": "NONE" } } ] }
The principal foo
can create persistent volumes only for roles prod
and dev
, and no other principal can create persistent volumes for any role.
{ "permissive": false, "create_volumes": [ { "principals": { "values": ["foo"] }, "roles": { "values": ["prod", "dev"] } } ] }
The principal foo
can destroy volumes created by itself and by the principal bar
. The principal bar
, however, can only destroy its own volumes. No other principal can destroy volumes.
{ "permissive": false, "destroy_volumes": [ { "principals": { "values": ["foo"] }, "creator_principals": { "values": ["foo", "bar"] } }, { "principals": { "values": ["bar"] }, "creator_principals": { "values": ["bar"] } } ] }
The principal ops
can query quota status for any role. The principal foo
, however, can only query quota status for foo-role
. No other principal can query quota status.
{ "permissive": false, "get_quotas": [ { "principals": { "values": ["ops"] }, "roles": { "type": "ANY" } }, { "principals": { "values": ["foo"] }, "roles": { "values": ["foo-role"] } } ] }
The principal ops
can update quota information (set or remove) for any role. The principal foo
, however, can only update quota for foo-role
. No other principal can update quota.
{ "permissive": false, "update_quotas": [ { "principals": { "values": ["ops"] }, "roles": { "type": "ANY" } }, { "principals": { "values": ["foo"] }, "roles": { "values": ["foo-role"] } } ] }
The principal ops
can reach all HTTP endpoints using the GET method. The principal foo
, however, can only use the HTTP GET on the /logging/toggle
and /monitor/statistics
endpoints. No other principals can use GET on any endpoints.
{ "permissive": false, "get_endpoints": [ { "principals": { "values": ["ops"] }, "paths": { "type": "ANY" } }, { "principals": { "values": ["foo"] }, "paths": { "values": ["/logging/toggle", "/monitor/statistics"] } } ] }
In case you plan to implement your own authorizer module, the authorization interface consists of three parts:
First, the authorization::Request
protobuf message represents a request to be authorized. It follows the Subject-Verb-Object pattern, where a subject ---commonly a principal---attempts to perform an action on a given object.
Second, the Future<bool> mesos::Authorizer::authorized(const mesos::authorization::Request& request)
interface defines the entry point for authorizer modules (and the local authorizer). A call to authorized()
returns a future that indicates the result of the (asynchronous) authorization operation. If the future is set to true, the request was authorized successfully; if it was set to false, the request was rejected. A failed future indicates that the request could not be processed at the moment and it can be retried later.
The authorization::Request
message is defined in authorizer.proto:
message Request { optional Subject subject = 1; optional Action action = 2; optional Object object = 3; } message Subject { optional string value = 1; } message Object { optional string value = 1; optional FrameworkInfo framework_info = 2; optional Task task = 3; optional TaskInfo task_info = 4; optional ExecutorInfo executor_info = 5; optional MachineID machine_id = 11; }
Subject
or Object
are optional fields; if they are not set they will only match an ACL with ANY or NONE in the corresponding location. This allows users to construct the following requests: Can everybody perform action A on object O?, or Can principal Z execute action X on all objects?.
Object
has several optional fields of which, depending on the action, one or more fields must be set (e.g., the view_executors
action expects the executor_info
and framework_info
to be set).
The action
field of the Request
message is an enum. It is kept optional--- even though a valid action is necessary for every request---to allow for backwards compatibility when adding new fields (see MESOS-4997 for details).
Third, the ObjectApprover
interface. In order to support efficient authorization of large objects and multiple objects a user can request an ObjectApprover
via Future<shared_ptr<const ObjectApprover>> getApprover(const authorization::Subject& subject, const authorization::Action& action)
. The resulting ObjectApprover
provides Try<bool> approved(const ObjectApprover::Object& object)
to synchronously check whether objects are authorized. The ObjectApprover::Object
follows the structure of the Request::Object
above.
struct Object { const std::string* value; const FrameworkInfo* framework_info; const Task* task; const TaskInfo* task_info; const ExecutorInfo* executor_info; const MachineID* machine_id; };
As the fields take pointer to each entity the ObjectApprover::Object
does not require the entity to be copied.
Authorizer must ensure that ObjectApprover
s returned by getApprover(...)
method are valid throughout their whole lifetime. This is relied upon by parts of Mesos code (Scheduler API, Operator API events and so on) that have a need to frequently authorize a limited number of long-lived authorization subjects. This code on the Mesos side, on its part, must ensure that it does not store ObjectApprover
for authorization subjects that it no longer uses (i.e. that it does not leak ObjectApprover
s).
NOTE: As the ObjectApprover
is run synchronously in a different actor process ObjectApprover.approved()
call must not block!