The latest Mesos release, 0.19.0 is now available for download. This new version includes the following features and improvements:
Full release notes are available on JIRA.
Mesos 0.19.0 introduces the “Registrar”: the master now persists the list of registered slaves in a durable replicated manner. The previous lack of durable state was an intentional design decision that simplified failover and allowed masters to be run and migrated with ease. However, the stateless design had issues:
Persisting the list of registered slaves allows failed over masters to detect slaves that do not reregister, and notify frameworks accordingly. It also allows us to prevent rogue slaves from reregistering; terminating the rogue tasks in the process.
The state is persisted using the replicated log (available since 0.9.0).
As alluded to during the containerization / isolation refactor in 0.18.0, the ExternalContainerizer has landed in this release. This provides alpha level support for custom containerization.
Developers can implement their own external containerizers to provide support for custom container technologies. Initial Docker support is now available through some community driven external containerizers: Docker Containerizer for Mesos by Tom Arnfeld and Deimos by Jason Dusek. Please reach out on the mailing lists with questions!
Previously, Mesos components had to use custom metrics code and custom HTTP endpoints for exposing metrics. This made it difficult to expose additional system metrics and often required having an endpoint for each libprocess Process (Actor) for which metrics were desired. Having metrics spread across endpoints was operationally complex.
We needed a consistent, simple, and global way to expose metrics, which led to the creation of a metrics library within libprocess. All metrics are now exposed via /metrics/snapshot. The /stats.json endpoint remains for backwards compatibility.
For backwards compatibility, the “Registrar” will be enabled in a phased manner. By default, the “Registrar” is write-only in 0.19.0 and will be read/write in 0.20.0.
If running in high-availability mode with ZooKeeper, operators must now specify the --work_dir
for the master, along with the --quorum
size of the ensemble of masters. This means adding or removing masters must be done carefully! The best practice is to only ever add or remove a single master at a time and to allow a small amount of time for the replicated log to catch up on the new master. Maintenance documentation will be added to reflect this.
Please refer to the upgrades document, which details how to perform an upgrade from 0.18.x.
Thanks to the Registrar, reconciliation primitives can now be provided to ensure that the state of tasks between Mesos and frameworks is kept consistent. This will remove the need for frameworks to implement out-of-band task reconciliation to inspect the state of slaves. Reconciliation work is being tracked at MESOS-1407.
The addition of state through the Registrar opens up a rich set of possible features that were previously not possible due to the lack of persistent state in the master. These include:
We encourage you to try out this release, and let us know what you think and if you hit any issues on the user mailing list. You can also get in touch with us via @ApacheMesos or via mailing lists and IRC.
Thanks to the 32 contributors who made 0.19.0 possible:
Ashutosh Jain, Adam B, Alexandra Sava, Anton Lindström, Archana kumari, Benjamin Hindman, Benjamin Mahler, Bernardo Gomez Palacio, Bernd Mathiske, Charlie Carson, Chengwei Yang, Chi Zhang, Dave Lester, Dominic Hamon, Ian Downes, Isabel Jimenez, Jake Farrell, Jameel, Al-Aziz, Jiang Yan Xu, Jie Yu, Nikita Vetoshkin, Niklas Q. Nielsen, Ritwik Yadav, Sam Taha, Steven Phung, Till Toenshoff, Timothy St. Clair, Tobi Knaup, Tom Arnfeld, Tom Galloway, Vinod Kone, Vinson Lee