docs/app-framework-development-guide.md - mesos - Git at Google

 ---
 title: Apache Mesos - Framework Development Guide
 layout: documentation
 ---

 # Framework Development Guide

 In this document we refer to Mesos applications as "frameworks".

 See one of the example framework schedulers in `MESOS_HOME/src/examples/` to
 get an idea of what a Mesos framework scheduler and executor in the language
 of your choice looks like. [RENDLER](https://github.com/mesosphere/RENDLER)
 provides example framework implementations in C++, Go, Haskell, Java, Python
 and Scala.

 ## Create your Framework Scheduler

 You can write a framework scheduler in C, C++, Java/Scala, or Python. Your
 framework scheduler should inherit from the `Scheduler` class (see API below).
 Your scheduler should create a SchedulerDriver (which will mediate communication
 between your scheduler and the Mesos master) and then call
 `SchedulerDriver.run()`.

 ### Scheduler API

 Callback interface to be implemented by framework schedulers.

 Declared in `MESOS_HOME/include/mesos/scheduler.hpp`

 ~~~{.cpp}
 /*
  * Invoked when the scheduler successfully registers with a Mesos
  * master. A unique ID (generated by the master) used for
  * distinguishing this framework from others and `MasterInfo`
  * with the ip and port of the current master are provided as arguments.
  */
 virtual void registered(
     SchedulerDriver* driver,
     const FrameworkID& frameworkId,
     const MasterInfo& masterInfo);

 /*
  * Invoked when the scheduler re-registers with a newly elected Mesos master.
  * This is only called when the scheduler has previously been registered.
  * `MasterInfo` containing the updated information about the elected master
  * is provided as an argument.
  */
 virtual void reregistered(
     SchedulerDriver* driver,
     const MasterInfo& masterInfo);

 /*
  * Invoked when the scheduler becomes "disconnected" from the master
  * (e.g., the master fails and another is taking over).
  */
 virtual void disconnected(SchedulerDriver* driver);

 /*
  * Invoked when resources have been offered to this framework. A
  * single offer will only contain resources from a single slave.
  * Resources associated with an offer will not be re-offered to
  * _this_ framework until either (a) this framework has rejected
  * those resources (see SchedulerDriver::launchTasks) or (b) those
  * resources have been rescinded (see Scheduler::offerRescinded).
  * Note that resources may be concurrently offered to more than one
  * framework at a time (depending on the allocator being used). In
  * that case, the first framework to launch tasks using those
  * resources will be able to use them while the other frameworks
  * will have those resources rescinded (or if a framework has
  * already launched tasks with those resources then those tasks will
  * fail with a TASK_LOST status and a message saying as much).
  */
 virtual void resourceOffers(
     SchedulerDriver* driver,
     const std::vector<Offer>& offers);

 /*
  * Invoked when an offer is no longer valid (e.g., the slave was
  * lost or another framework used resources in the offer). If for
  * whatever reason an offer is never rescinded (e.g., dropped
  * message, failing over framework, etc.), a framework that attempts
  * to launch tasks using an invalid offer will receive TASK_LOST
  * status updates for those tasks (see Scheduler::resourceOffers).
  */
 virtual void offerRescinded(SchedulerDriver* driver, const OfferID& offerId);

 /*
  * Invoked when the status of a task has changed (e.g., a slave is
  * lost and so the task is lost, a task finishes and an executor
  * sends a status update saying so, etc). If implicit
  * acknowledgements are being used, then returning from this
  * callback _acknowledges_ receipt of this status update! If for
  * whatever reason the scheduler aborts during this callback (or
  * the process exits) another status update will be delivered (note,
  * however, that this is currently not true if the slave sending the
  * status update is lost/fails during that time). If explicit
  * acknowledgements are in use, the scheduler must acknowledge this
  * status on the driver.
  */
 virtual void statusUpdate(SchedulerDriver* driver, const TaskStatus& status);

 /*
  * Invoked when an executor sends a message. These messages are best
  * effort; do not expect a framework message to be retransmitted in
  * any reliable fashion.
  */
 virtual void frameworkMessage(
     SchedulerDriver* driver,
     const ExecutorID& executorId,
     const SlaveID& slaveId,
     const std::string& data);

 /*
  * Invoked when a slave has been determined unreachable (e.g.,
  * machine failure, network partition). Most frameworks will need to
  * reschedule any tasks launched on this slave on a new slave.
  *
  * NOTE: This callback is not reliably delivered. If a host or
  * network failure causes messages between the master and the
  * scheduler to be dropped, this callback may not be invoked.
  */
 virtual void slaveLost(SchedulerDriver* driver, const SlaveID& slaveId);

 /*
  * Invoked when an executor has exited/terminated. Note that any
  * tasks running will have TASK_LOST status updates automagically
  * generated.
  *
  * NOTE: This callback is not reliably delivered. If a host or
  * network failure causes messages between the master and the
  * scheduler to be dropped, this callback may not be invoked.
  */
 virtual void executorLost(
     SchedulerDriver* driver,
     const ExecutorID& executorId,
     const SlaveID& slaveId,
     int status);

 /*
  * Invoked when there is an unrecoverable error in the scheduler or
  * scheduler driver. The driver will be aborted BEFORE invoking this
  * callback.
  */
 virtual void error(SchedulerDriver* driver, const std::string& message);
 ~~~

 ### Scheduler Driver API

 The Scheduler Driver is responsible for managing the scheduler's lifecycle
 (e.g., start, stop, or wait to finish) and interacting with Mesos Master
 (e.g., launch tasks, kill tasks, etc.).

 Note that this interface is usually not implemented by a framework itself,
 but it describes the possible calls a framework scheduler can make to
 interact with the Mesos Master.

 Please note that usage of this interface requires an instantiated
 MesosSchedulerDiver.
 See `src/examples/test_framework.cpp` for an example of using the
 MesosSchedulerDriver.

 Declared in `MESOS_HOME/include/mesos/scheduler.hpp`

 ~~~{.cpp}
 // Starts the scheduler driver. This needs to be called before any
 // other driver calls are made.
 virtual Status start();

 // Stops the scheduler driver. If the 'failover' flag is set to
 // false then it is expected that this framework will never
 // reconnect to Mesos. So Mesos will unregister the framework and
 // shutdown all its tasks and executors. If 'failover' is true, all
 // executors and tasks will remain running (for some framework
 // specific failover timeout) allowing the scheduler to reconnect
 // (possibly in the same process, or from a different process, for
 // example, on a different machine).
 virtual Status stop(bool failover = false);

 // Aborts the driver so that no more callbacks can be made to the
 // scheduler. The semantics of abort and stop have deliberately been
 // separated so that code can detect an aborted driver (i.e., via
 // the return status of SchedulerDriver::join, see below), and
 // instantiate and start another driver if desired (from within the
 // same process). Note that 'stop()' is not automatically called
 // inside 'abort()'.
 virtual Status abort();

 // Waits for the driver to be stopped or aborted, possibly
 // _blocking_ the current thread indefinitely. The return status of
 // this function can be used to determine if the driver was aborted
 // (see mesos.proto for a description of Status).
 virtual Status join();

 // Starts and immediately joins (i.e., blocks on) the driver.
 virtual Status run();

 // Requests resources from Mesos (see mesos.proto for a description
 // of Request and how, for example, to request resources from
 // specific slaves). Any resources available are offered to the
 // framework via Scheduler::resourceOffers callback, asynchronously.
 virtual Status requestResources(const std::vector<Request>& requests);

 // Launches the given set of tasks. Any remaining resources (i.e.,
 // those that are not used by the launched tasks or their executors)
 // will be considered declined. Note that this includes resources
 // used by tasks that the framework attempted to launch but failed
 // (with `TASK_ERROR`) due to a malformed task description. The
 // specified filters are applied on all unused resources (see
 // mesos.proto for a description of Filters). Available resources
 // are aggregated when multiple offers are provided. Note that all
 // offers must belong to the same slave. Invoking this function with
 // an empty collection of tasks declines offers in their entirety
 // (see Scheduler::declineOffer).
 virtual Status launchTasks(
     const std::vector<OfferID>& offerIds,
     const std::vector<TaskInfo>& tasks,
     const Filters& filters = Filters());

 // Kills the specified task. Note that attempting to kill a task is
 // currently not reliable. If, for example, a scheduler fails over
 // while it was attempting to kill a task it will need to retry in
 // the future. Likewise, if unregistered / disconnected, the request
 // will be dropped (these semantics may be changed in the future).
 virtual Status killTask(const TaskID& taskId);

 // Accepts the given offers and performs a sequence of operations on
 // those accepted offers. See Offer.Operation in mesos.proto for the
 // set of available operations. Any remaining resources (i.e., those
 // that are not used by the launched tasks or their executors) will
 // be considered declined. Note that this includes resources used by
 // tasks that the framework attempted to launch but failed (with
 // `TASK_ERROR`) due to a malformed task description. The specified
 // filters are applied on all unused resources (see mesos.proto for
 // a description of Filters). Available resources are aggregated
 // when multiple offers are provided. Note that all offers must
 // belong to the same slave.
 virtual Status acceptOffers(
     const std::vector<OfferID>& offerIds,
     const std::vector<Offer::Operation>& operations,
     const Filters& filters = Filters());

 // Declines an offer in its entirety and applies the specified
 // filters on the resources (see mesos.proto for a description of
 // Filters). Note that this can be done at any time, it is not
 // necessary to do this within the Scheduler::resourceOffers
 // callback.
 virtual Status declineOffer(
     const OfferID& offerId,
     const Filters& filters = Filters());

 // Removes all filters previously set by the framework (via
 // launchTasks()). This enables the framework to receive offers from
 // those filtered slaves.
 virtual Status reviveOffers();

 // Inform Mesos master to stop sending offers to the framework. The
 // scheduler should call reviveOffers() to resume getting offers.
 virtual Status suppressOffers();

 // Acknowledges the status update. This should only be called
 // once the status update is processed durably by the scheduler.
 // Not that explicit acknowledgements must be requested via the
 // constructor argument, otherwise a call to this method will
 // cause the driver to crash.
 virtual Status acknowledgeStatusUpdate(const TaskStatus& status);

 // Sends a message from the framework to one of its executors. These
 // messages are best effort; do not expect a framework message to be
 // retransmitted in any reliable fashion.
 virtual Status sendFrameworkMessage(
     const ExecutorID& executorId,
     const SlaveID& slaveId,
     const std::string& data);

 // Allows the framework to query the status for non-terminal tasks.
 // This causes the master to send back the latest task status for
 // each task in 'statuses', if possible. Tasks that are no longer
 // known will result in a TASK_LOST update. If statuses is empty,
 // then the master will send the latest status for each task
 // currently known.
 virtual Status reconcileTasks(const std::vector<TaskStatus>& statuses);
 ~~~

 ### Handling Failures
 How to build Mesos frameworks that remain available in the face of failures is
 discussed in a [separate document](high-availability-framework-guide.md).

 ## Working with Executors

 ### Using the Mesos Command Executor

 Mesos provides a simple executor that can execute shell commands and Docker
 containers on behalf of the framework scheduler; enough functionality for a
 wide variety of framework requirements.

 Any scheduler can make use of the Mesos command executor by filling in the
 optional `CommandInfo` member of the `TaskInfo` protobuf message.

 ~~~{.proto}
 message TaskInfo {
   ...
   optional CommandInfo command = 7;
   ...
 }
 ~~~

 The Mesos slave will fill in the rest of the `ExecutorInfo` for you when tasks
 are specified this way.

 Note that the agent will derive an `ExecutorInfo` from the `TaskInfo` and
 additionally copy fields (e.g., `Labels`) from `TaskInfo` into the new
 `ExecutorInfo`. This `ExecutorInfo` is only visible on the agent.

 ### Creating a custom Framework Executor

 If your framework has special requirements, you might want to provide your own
 Executor implementation. For example, you may not want a 1:1 relationship
 between tasks and processes.

 Your framework executor must inherit from the Executor class. It must override
 the launchTask() method. You can use the $MESOS_HOME environment variable inside
 of your executor to determine where Mesos is running from.

 #### Executor API

 Declared in `MESOS_HOME/include/mesos/executor.hpp`

 ~~~{.cpp}
 /*
  * Invoked once the executor driver has been able to successfully
  * connect with Mesos. In particular, a scheduler can pass some
  * data to it's executors through the `FrameworkInfo.ExecutorInfo`'s
  * data field.
  */
 virtual void registered(
     ExecutorDriver* driver,
     const ExecutorInfo& executorInfo,
     const FrameworkInfo& frameworkInfo,
     const SlaveInfo& slaveInfo);

 /*
  * Invoked when the executor re-registers with a restarted slave.
  */
 virtual void reregistered(ExecutorDriver* driver, const SlaveInfo& slaveInfo);

 /*
  * Invoked when the executor becomes "disconnected" from the slave
  * (e.g., the slave is being restarted due to an upgrade).
  */
 virtual void disconnected(ExecutorDriver* driver);

 /*
  * Invoked when a task has been launched on this executor (initiated
  * via Scheduler::launchTasks). Note that this task can be realized
  * with a thread, a process, or some simple computation, however, no
  * other callbacks will be invoked on this executor until this
  * callback has returned.
  */
 virtual void launchTask(ExecutorDriver* driver, const TaskInfo& task);

 /*
  * Invoked when a task running within this executor has been killed
  * (via SchedulerDriver::killTask). Note that no status update will
  * be sent on behalf of the executor, the executor is responsible
  * for creating a new TaskStatus (i.e., with TASK_KILLED) and
  * invoking ExecutorDriver::sendStatusUpdate.
  */
 virtual void killTask(ExecutorDriver* driver, const TaskID& taskId);

 /*
  * Invoked when a framework message has arrived for this
  * executor. These messages are best effort; do not expect a
  * framework message to be retransmitted in any reliable fashion.
  */
 virtual void frameworkMessage(ExecutorDriver* driver, const std::string& data);

 /*
  * Invoked when the executor should terminate all of it's currently
  * running tasks. Note that after a Mesos has determined that an
  * executor has terminated any tasks that the executor did not send
  * terminal status updates for (e.g., TASK_KILLED, TASK_FINISHED,
  * TASK_FAILED, etc) a TASK_LOST status update will be created.
  */
 virtual void shutdown(ExecutorDriver* driver);

 /*
  * Invoked when a fatal error has occurred with the executor and/or
  * executor driver. The driver will be aborted BEFORE invoking this
  * callback.
  */
 virtual void error(ExecutorDriver* driver, const std::string& message);
 ~~~

 #### Install your custom Framework Executor

 After creating your custom executor, you need to make it available to all slaves
 in the cluster.

 One way to distribute your framework executor is to let the
 [Mesos fetcher](fetcher.md) download it on-demand when your scheduler launches
 tasks on that slave. `ExecutorInfo` is a Protocol Buffer Message class (defined
 in `include/mesos/mesos.proto`), and it contains a field of type `CommandInfo`.
 `CommandInfo` allows schedulers to specify, among other things, a number of
 resources as URIs. These resources are fetched to a sandbox directory on the
 slave before attempting to execute the `ExecutorInfo` command. Several URI
 schemes are supported, including HTTP, FTP, HDFS, and S3 (e.g. see
 src/examples/java/TestFramework.java for an example of this).

 Alternatively, you can pass the `frameworks_home` configuration option
 (defaults to: `MESOS_HOME/frameworks`) to your `mesos-slave` daemons when you
 launch them to specify where your framework executors are stored (e.g. on an
 NFS mount that is available to all slaves), then use a relative path in
 `CommandInfo.uris`, and the slave will prepend the value of `frameworks_home`
 to the relative path provided.

 Once you are sure that your executors are available to the mesos-slaves, you
 should be able to run your scheduler, which will register with the Mesos master,
 and start receiving resource offers!


 ## Labels

 `Labels` can be found in the `FrameworkInfo`, `TaskInfo`, `DiscoveryInfo` and
 `TaskStatus` messages; framework and module writers can use Labels to tag and
 pass unstructured information around Mesos. Labels are free-form key-value pairs
 supplied by the framework scheduler or label decorator hooks. Below is the
 protobuf definitions of labels:

 ~~~{.proto}
   optional Labels labels = 11;
 ~~~

 ~~~{.proto}
 /**
  * Collection of labels.
  */
 message Labels {
     repeated Label labels = 1;
 }

 /**
  * Key, value pair used to store free form user-data.
  */
 message Label {
   required string key = 1;
   optional string value = 2;
 }
 ~~~

 Labels are not interpreted by Mesos itself, but will be made available over
 master and slave state endpoints. Further more, the executor and scheduler can
 introspect labels on the `TaskInfo` and `TaskStatus` programmatically.
 Below is an example of how two label pairs (`"environment": "prod"` and
 `"bananas": "apples"`) can be fetched from the master state endpoint.


 ~~~{.sh}
 $ curl http://master/state.json
 ...
 {
   "executor_id": "default",
   "framework_id": "20150312-120017-16777343-5050-39028-0000",
   "id": "3",
   "labels": [
     {
       "key": "environment",
       "value": "prod"
     },
     {
       "key": "bananas",
       "value": "apples"
     }
   ],
   "name": "Task 3",
   "slave_id": "20150312-115625-16777343-5050-38751-S0",
   "state": "TASK_FINISHED",
   ...
 },
 ~~~

 ## Service discovery

 When your framework registers an executor or launches a task, it can provide
 additional information for service discovery. This information is stored by
 the Mesos master along with other imporant information such as the slave
 currently running the task. A service discovery system can programmatically
 retrieve this information in order to set up DNS entries, configure proxies,
 or update any consistent store used for service discovery in a Mesos cluster
 that runs multiple frameworks and multiple tasks.

 The optional `DiscoveryInfo` message for `TaskInfo` and `ExecutorInfo` is
 declared in  `MESOS_HOME/include/mesos/mesos.proto`

 ~~~{.proto}
 message DiscoveryInfo {
   enum Visibility {
     FRAMEWORK = 0;
     CLUSTER = 1;
     EXTERNAL = 2;
   }

   required Visibility visibility = 1;
   optional string name = 2;
   optional string environment = 3;
   optional string location = 4;
   optional string version = 5;
   optional Ports ports = 6;
   optional Labels labels = 7;
 }
 ~~~

 `Visibility` is the key parameter that instructs the service discovery system
 whether a service should be discoverable. We currently differentiate between
 three cases:

  - a task should not be discoverable for anyone but its framework.
  - a task should be discoverable for all frameworks running on the Mesos cluster
    but not externally.
  - a task should be made discoverable broadly.

 Many service discovery systems provide additional features that manage the
 visibility of services (e.g., ACLs in proxy based systems, security extensions
 to DNS, VLAN or subnet selection). It is not the intended use of the visibility
 field to manage such features. When a service discovery system retrieves the
 task or executor information from the master, it can decide how to handle tasks
 without `DiscoveryInfo`. For instance, tasks may be made non discoverable to
 other frameworks (equivalent to `visibility=FRAMEWORK`) or discoverable to all
 frameworks (equivalent to `visibility=CLUSTER`).

 The `name` field is a string that that provides the service discovery system
 with the name under which the task is discoverable. The typical use of the name
 field will be to provide a valid hostname. If name is not provided, it is up to
 the service discovery system to create a name for the task based on the name
 field in `taskInfo` or other information.

 The `environment`, `location`, and `version` fields provide first class support
 for common attributes used to differentiate between similar services in large
 deployments. The `environment` may receive values such as `PROD/QA/DEV`, the
 `location` field may receive values like `EAST-US/WEST-US/EUROPE/AMEA`, and the
 `version` field may receive values like v2.0/v0.9. The exact use of these fields
 is up to the service discovery system.

 The `ports` field allows the framework to identify the ports a task listens to
 and explicitly name the functionality they represent and the layer-4 protocol
 they use (TCP, UDP, or other). For example, a Cassandra task will define ports
 like `"7000,Cluster,TCP"`, `"7001,SSL,TCP"`, `"9160,Thrift,TCP"`,
 `"9042,Native,TCP"`, and `"7199,JMX,TCP"`. It is up to the service discovery
 system to use these names and protocol in appropriate ways, potentially
 combining them with the `name` field in `DiscoveryInfo`.

 The `labels` field allows a framework to pass arbitrary labels to the service
 discovery system in the form of key/value pairs. Note that anything passed
 through this field is not guaranteed to be supported moving forward.
 Nevertheless, this field provides extensibility. Common uses of this field will
 allow us to identify use cases that require first class support.