commit | 6be17200b8084ad3524e7d450c411765b3214c0f | |
---|---|---|
author | Benjamin Mahler <bmahler@apache.org> | Tue Sep 01 14:58:28 2020 -0400 |
committer | Benjamin Mahler <bmahler@apache.org> | Thu Sep 10 13:09:42 2020 -0400 |
tree | afdf927f70931da6e92107f429c8dfe44e225a2e | |
parent | b0ee625ce1c07177e3e12ffa56ce3d76969dfdbe | |
Fixed a CHECK failure in master during agent removal.

Per MESOS-9609, it's possible for the master to encounter a CHECK failure during agent removal in the following situation:

1. Given a framework with checkpoint == false, with only executor(s) (no tasks) running on an agent.
2. When this agent disconnects from the master, Master::removeFramework(Slave*, Framework*) removes the tasks and executors. However, when there are no tasks, this function will accidentally insert an entry into Master::Slave::tasks! (Due to the [] operator usage)
3. Now if the framework is removed, we have an entry in Slave::tasks, for which there is no corresponding framework.
4. When the agent is removed, we have a CHECK failure given we can't find the framework.

This fixes the issue by avoiding the accidental insertion.

Review: https://reviews.apache.org/r/72831
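
The accidental insertion described in the commit message comes from a general property of C++ map types: `operator[]` default-constructs and inserts a value when the key is absent, whereas `find()` only reads. The following is a minimal sketch of that behavior, not the actual Mesos patch. `TaskMap`, `countTasksInserting`, `countTasksLookupOnly`, and `demonstrate` are hypothetical names, and `std::unordered_map` is used here as a stand-in for the container behind `Master::Slave::tasks`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for Master::Slave::tasks:
// framework ID -> (task ID -> task state).
using TaskMap =
  std::unordered_map<std::string, std::unordered_map<std::string, int>>;

// Problematic pattern: operator[] inserts an empty inner map when
// `frameworkId` is not already a key, so a read has a side effect.
std::size_t countTasksInserting(
    TaskMap& tasks, const std::string& frameworkId)
{
  return tasks[frameworkId].size();
}

// Safer pattern: find() never modifies the map.
std::size_t countTasksLookupOnly(
    const TaskMap& tasks, const std::string& frameworkId)
{
  auto it = tasks.find(frameworkId);
  return it == tasks.end() ? 0 : it->second.size();
}

// Walks through both patterns for a framework that has no tasks.
void demonstrate()
{
  TaskMap tasks;

  assert(countTasksLookupOnly(tasks, "framework-1") == 0);
  assert(tasks.empty());  // The lookup left the map untouched.

  assert(countTasksInserting(tasks, "framework-1") == 0);
  assert(tasks.size() == 1);  // A stale, empty entry now exists.
}
```

Both functions report zero tasks, but only the `operator[]` version leaves behind an entry keyed by a framework that may later be removed, which is the kind of orphaned entry that steps 3 and 4 describe.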
Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, Jenkins, Spark, Aurora, and other frameworks on a dynamically shared pool of nodes.
Visit us at mesos.apache.org.
Documentation is available in the docs/ directory. Additionally, a rendered HTML version can be found on the Mesos website's Documentation page.
Installation instructions are included on the Getting Started page.
Apache Mesos is licensed under the Apache License, Version 2.0.
For additional information, see the LICENSE and NOTICE files.