Release Notes - Mesos - Version 1.10.1 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-9609] - Master check failure when marking agent unreachable.
* [MESOS-10126] - Docker volume isolator needs to clean up the `info` struct regardless of the result of the unmount operation
* [MESOS-10169] - Reintroduce image fetch deduplication while keeping it possible to destroy UCR containers in PROVISIONING state.
Release Notes - Mesos - Version 1.10.0
--------------------------------------------
This release contains the following highlights:
* Container resource bursting is now supported on Linux. Frameworks are
now able to specify CPU and memory limits for tasks (separately from
resource requests) and also the level of isolation they desire when
launching task groups - CPU and memory may be isolated at the executor
container level, or the task container level (MESOS-10001).
* Executors can now use a Unix domain socket to connect to an agent, instead
of connecting via TCP (MESOS-10034).
* Existing reservations can now be modified via the RESERVE_RESOURCES
master API call (MESOS-9981).
* Performance of read-only V1 operator API calls has been improved by
introducing direct serialization into JSON/protobuf and extending the
batching mechanism to parallel processing of these calls by the master
(similarly to the `/state` endpoint). This brings V1 operator API performance
on par with older HTTP endpoints (MESOS-10026, MESOS-9497).
* **Breaking change** for authorizer modules: authorizers are now required
to implement a method for returning `ObjectApprover`s that remain valid
throughout their entire lifetime. For framework and operator API subscriber
principals, the set of `ObjectApprover`s is now requested from the authorizer
only once per subscription (MESOS-10056, MESOS-10057).
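The authorizer change above can be pictured with a small, purely illustrative
sketch. It is not the Mesos C++ authorizer interface: `CountingAuthorizer`,
`get_approver` and `Subscription` are hypothetical names chosen for this
example. It only shows the pattern described, where an approver is requested
once when a subscription starts and is then reused for every event, so it has
to stay valid for as long as the subscription holds it.

```python
class CountingAuthorizer:
    """Hypothetical authorizer that counts how often approvers are requested."""

    def __init__(self, allowed_roles):
        self.allowed_roles = set(allowed_roles)
        self.requests = 0

    def get_approver(self, principal):
        # In this sketch every principal gets the same role-based approver.
        self.requests += 1
        allowed = self.allowed_roles
        # The returned callable must stay usable for as long as the caller
        # keeps it; it is never re-requested by the subscription below.
        return lambda role: role in allowed


class Subscription:
    """Hypothetical event-stream subscription holding one approver."""

    def __init__(self, authorizer, principal):
        # Requested exactly once, when the subscription is created.
        self.approver = authorizer.get_approver(principal)

    def filter_events(self, events):
        # Every outgoing event is checked against the cached approver.
        return [e for e in events if self.approver(e["role"])]


authorizer = CountingAuthorizer(allowed_roles=["dev"])
sub = Subscription(authorizer, principal="framework-a")
events = [
    {"id": 1, "role": "dev"},
    {"id": 2, "role": "prod"},
    {"id": 3, "role": "dev"},
]
visible = sub.filter_events(events)
print([e["id"] for e in visible])  # [1, 3]
print(authorizer.requests)         # 1
```

The practical consequence for module authors is that an approver can no longer
assume it will be discarded after a single request.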
Additional API Changes:
* Quota can now be set on the default `*` role.
* Quota consumption metrics are now exposed by the allocator.
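As a rough sketch of what setting quota on the default `*` role might look
like, the snippet below builds the JSON body of an `UPDATE_QUOTA` operator
call. The field names (`update_quota`, `quota_configs`, `limits`, `force`) are
recalled from the operator API documentation and should be treated as
assumptions to verify against the protobuf definitions; the resource amounts
and the helper name `update_quota_call` are made up for illustration. Nothing
is sent to a master here.

```python
import json


def update_quota_call(role, limits, force=False):
    """Build an UPDATE_QUOTA call body (assumed field names, see lead-in)."""
    return {
        "type": "UPDATE_QUOTA",
        "update_quota": {
            "force": force,
            "quota_configs": [
                {
                    "role": role,
                    # Each limit is a scalar wrapped as {"value": <number>}.
                    "limits": {
                        name: {"value": amount}
                        for name, amount in limits.items()
                    },
                }
            ],
        },
    }


# Illustrative amounts only: 10 CPUs and 2048 MB of memory for the "*" role.
call = update_quota_call("*", {"cpus": 10, "mem": 2048})
body = json.dumps(call, indent=2)
print(body)
```

Such a body would typically be POSTed to the master's `/api/v1` endpoint with
a JSON content type.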
Unresolved Critical Issues:
* [MESOS-10066] - mesos-docker-executor process dies when agent stops. Recovery fails when agent returns
* [MESOS-10011] - Operation feedback with stale agent ID crashes the master
* [MESOS-9967] - Authorization header is missing when using a default registry
* [MESOS-9609] - Master check failure when marking agent unreachable
* [MESOS-9579] - ExecutorHttpApiTest.HeartbeatCalls is flaky.
* [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`
* [MESOS-9500] - spark submit with docker image on mesos cluster fails.
* [MESOS-9426] - ZK master detection can become forever pending.
* [MESOS-9393] - Fetcher crashes extracting archives with non-ASCII filenames.
* [MESOS-9365] - Windows - GET_CONTAINERS API call causes the Mesos agent to fail
* [MESOS-9355] - Persistence volume does not unmount correctly with wrong artifact URI
* [MESOS-9352] - Data in persistent volume deleted accidentally when using Docker container and Persistent volume
* [MESOS-9053] - Network ports isolator can falsely trigger while destroying containers.
* [MESOS-9006] - The agent's GET_AGENT leaks resource information when using authorization
* [MESOS-8840] - `cpu.cfs_quota_us` may be accidentally set for command task using docker during agent recovery.
* [MESOS-8803] - Libprocess deadlocks in a test.
* [MESOS-8679] - If the first KILL stuck in the default executor, all other KILLs will be ignored.
* [MESOS-8608] - RmdirContinueOnErrorTest.RemoveWithContinueOnError fails.
* [MESOS-8257] - Unified Containerizer "leaks" a target container mount path to the host FS when the target resolves to an absolute path
* [MESOS-8256] - Libprocess can silently deadlock due to worker thread exhaustion.
* [MESOS-8096] - Enqueueing events in MockHTTPScheduler can lead to segfaults.
* [MESOS-8038] - Launching GPU task sporadically fails.
* [MESOS-7971] - PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test is flaky
* [MESOS-7911] - Non-checkpointing framework's tasks should not be marked LOST when agent disconnects.
* [MESOS-7748] - Slow subscribers of streaming APIs can lead to Mesos OOMing.
* [MESOS-7721] - Master's agent removal rate limit also applies to agent unreachability.
* [MESOS-7566] - Master crash due to failed check in DRFSorter::remove
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die/killed
* [MESOS-6285] - Agents may OOM during recovery if there are too many tasks or executors
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
All Resolved Issues:
** Bug
* [MESOS-621] - `HierarchicalAllocatorProcess::removeSlave` doesn't properly handle framework allocations/resources
* [MESOS-4996] - 'containerizer->update' will always fail after killing a docker container.
* [MESOS-7217] - CgroupsIsolatorTest.ROOT_CGROUPS_CFS_EnableCfs is flaky.
* [MESOS-7639] - Oversubscription could crash the master due to CHECK failure in the allocator
* [MESOS-8537] - Default executor doesn't wait for status updates to be ack'd before shutting down
* [MESOS-8877] - Docker container's resources will be wrongly enlarged in cgroups after agent recovery
* [MESOS-9337] - Hook manager implementation is missing mutex acquisition in several places.
* [MESOS-9847] - Docker executor doesn't wait for status updates to be ack'd before shutting down.
* [MESOS-9889] - Master CPU high due to unexpected foreachkey behaviour in Master::__reregisterSlave.
* [MESOS-9958] - New CLI is not included in distribution tarball
* [MESOS-9965] - Agent should not send `TASK_GONE_BY_OPERATOR` if the framework is not partition aware.
* [MESOS-9968] - WWWAuthenticate header parsing fails when commas are in (quoted) realm
* [MESOS-9971] - 'dist' and 'distcheck' cmake targets are implemented as shell scripts, so fail on Windows/MSVC.
* [MESOS-9975] - Sorter may leak clients allocations.
* [MESOS-9978] - Nvml isolator cannot be disabled which makes it impossible to exclude non-free code
* [MESOS-9980] - HierarchicalAllocatorTest.MaintenanceInverseOffers is flaky
* [MESOS-10007] - Command executor can miss exit status for short-lived commands due to double-reaping.
* [MESOS-10008] - Very large quota values can crash master.
* [MESOS-10015] - updateAllocation() can stall the allocator with a huge number of reservations on an agent.
* [MESOS-10018] - Duplicate tasks if agent partitioned during maintenance down
* [MESOS-10023] - Allocator method dispatches can be reordered (relative to scheduler API calls which triggered them).
* [MESOS-10041] - Libprocess SSL verification can leak memory
* [MESOS-10083] - Authorizing invalid operation can result in declined authorization.
* [MESOS-10084] - Detecting whether executor is generated for command task should work when the launcher_dir changes
* [MESOS-10090] - Mesos build on Windows appears to be broken.
* [MESOS-10092] - Cannot pull image from docker registry which does not reply with 'scope'/'service' in WWW-Authenticate header
* [MESOS-10094] - Master's agent draining VLOG prints incorrect task counts.
* [MESOS-10096] - Reactivating a draining agent leaves the agent in draining state.
* [MESOS-10097] - After HTTP framework disconnects, heartbeater idle-loops instead of being deleted.
* [MESOS-10098] - Mesos agent fails to start on outdated systemd.
* [MESOS-10100] - Recently introduced PathTest.Relative and PathTest.PathIteration fail on windows.
* [MESOS-10102] - MasterAPITest.ReservationUpdate is flaky
* [MESOS-10103] - MSVC build can segfault when composing authorization Action for updating reservation.
* [MESOS-10107] - containeriser: failed to remove cgroup - EBUSY
* [MESOS-10109] - After failover, master crashes on re-adding an agent with maintenance schedule set.
* [MESOS-10110] - Libprocess ignores most protobuf (de)serialisation failure cases.
* [MESOS-10111] - Failed check in libevent_ssl_socket.cpp: 'self->bev' Must be non NULL
* [MESOS-10113] - OpenSSLSocketImpl with 'support_downgrade' waits for incoming bytes before accepting new connection.
* [MESOS-10114] - OpenSSLSocketImpl with 'support_downgrade' can silently stop accepting sockets.
* [MESOS-10116] - Attempt to reactivate disconnected agent crashes the master
* [MESOS-10118] - Agent incorrectly handles draining when empty
* [MESOS-10120] - Authorization for /logging/toggle and /metrics/snapshot is skipped on Windows.
* [MESOS-10123] - Windows overlapped IO discard handling can drop data.
* [MESOS-10124] - OpenSSLSocketImpl on Windows with 'support_downgrade' is incorrectly polling for read readiness.
* [MESOS-10125] - Web UI roles tree files are missing from automake install.
* [MESOS-10128] - Performance regression in HierarchicalAllocations_BENCHMARK_Test.PersistentVolumes
** Epic
* [MESOS-9981] - Introduce a Mesos API to update reservations
* [MESOS-10001] - Resource Limits and Requests
* [MESOS-10034] - Agent/executor domain socket communication
** Improvement
* [MESOS-7245] - Add a Windows segfault handler for stacktraces
* [MESOS-9123] - Expose quota consumption metrics.
* [MESOS-9497] - Parallel reads for expensive master v1 read-only calls.
* [MESOS-9914] - Refactor `MesosTest::StartSlave` in favour of builder style interface
* [MESOS-9948] - master::Slave::hasExecutor occupies 37% of a 150 second perf sample.
* [MESOS-9964] - Support destroying UCR containers in provisioning state
* [MESOS-9972] - Update Names for TLS-related environment variables in libprocess.
* [MESOS-10016] - Add a benchmark for HierarchicalAllocatorProcess::updateAllocation()
* [MESOS-10017] - Log all reverse DNS lookup failures in 'legacy' TLS (SSL) hostname validation scheme.
* [MESOS-10026] - Improve v1 operator API read performance.
* [MESOS-10056] - Perform synchronous authorization for scheduler calls.
* [MESOS-10057] - Perform synchronous authorization for outgoing events on event stream.
* [MESOS-10095] - Agent draining logging makes it hard to tell which tasks did not terminate.
* [MESOS-10112] - Log peer address during TLS handshake failures.
** Wish
* [MESOS-9630] - Consider moving linter setup to pre-commit
** Task
* [MESOS-3938] - Consider allowing setting quotas for the default '*' role.
* [MESOS-6084] - Deprecate and remove the included MPI framework
* [MESOS-8503] - Improve UI when displaying frameworks with many roles.
* [MESOS-9843] - Implement tests for the `containerizer/debug` endpoint.
* [MESOS-9949] - Track allocated/offered in the allocator's role tree.
* [MESOS-9974] - Remove support/mesos-style.py transition script
* [MESOS-9982] - Add a 'source' field to operator API ReserveResources protobuf
* [MESOS-9983] - Intermediate rejection of Reserve operations with source set
* [MESOS-9984] - Provide a function to compute a common "reservation ancestor" between two 'Resources'
* [MESOS-9985] - Update validation of 'ReserveResources' for 'source'
* [MESOS-9986] - Update 'getConsumedResources' and 'getResourceConversions' for 'source' in reservations
* [MESOS-9987] - Update 'Master::Http::_reserve' to also require 'source' resources
* [MESOS-9988] - Add 'source' field to scheduler reservation API
* [MESOS-9989] - Update 'Master::Http::_reserve' to pass 'source' into generated operation
* [MESOS-9990] - Consolidate 'Master::authorizeReserveResources' overloads
* [MESOS-9991] - Update 'Master::authorizeReserveResources' for re-reservations
* [MESOS-9992] - Add end-to-end test exercising re-reservation operator API
* [MESOS-9993] - Update operator API documentation for re-reservations
* [MESOS-10002] - Design doc for container bursting
* [MESOS-10009] - Implement glue code for the Windows event loop and OpenSSL's basic I/O abstraction
* [MESOS-10010] - Implement an SSL socket for Windows, using OpenSSL directly
* [MESOS-10033] - Design per-task cgroup isolation
* [MESOS-10035] - Implement `enable_http_executor_domain_sockets` agent flag
* [MESOS-10036] - Implement agent code to create a domain socket on startup
* [MESOS-10037] - Create code to bind-mount domain sockets into mesos-type executor containers
* [MESOS-10038] - Implement agent code to listen on a domain socket
* [MESOS-10039] - Let the default executor connect through a domain socket when available
* [MESOS-10043] - Add resource limits into the protobuf message `TaskInfo`
* [MESOS-10044] - Add a new capability `TASK_RESOURCE_LIMITS` into Mesos agent
* [MESOS-10045] - Validate task's resources limits and the `share_cgroups` field
* [MESOS-10046] - Launch executor container with resource limits
* [MESOS-10047] - Update the CPU subsystem in the cgroup isolator to set container's CPU resource limits
* [MESOS-10048] - Update the memory subsystem in the cgroup isolator to set container's memory resource limits and `oom_score_adj`
* [MESOS-10049] - Add a new reason in `TaskStatus::Reason` for the case that a task is OOM-killed due to exceeding its memory request
* [MESOS-10050] - Update the `update()` method of containerizer to handle container resource limits
* [MESOS-10051] - Update the `LaunchContainer` agent API to support container resource limits
* [MESOS-10053] - Update Docker executor to set Docker container's resource limits and `oom_score_adj`
* [MESOS-10054] - Update Docker containerizer to set Docker container's resource limits and `oom_score_adj`
* [MESOS-10055] - Update Mesos UI to display the resource limits of tasks
* [MESOS-10061] - Implement chmod() support for stout
* [MESOS-10062] - Implement relative path computation for stout
* [MESOS-10063] - Update default executor to call `LAUNCH_CONTAINER` to launch nested containers
* [MESOS-10064] - Accommodate the "Infinity" value in JSON
* [MESOS-10065] - Update the `update()` method of isolator interface to handle container resource limits
* [MESOS-10067] - Update the `update()` method of cgroups subsystem interface to handle container resource limits
* [MESOS-10073] - Implement SSL downgrade on the native SSL socket
* [MESOS-10074] - Adapt design for executor domain sockets for agent restarts
* [MESOS-10075] - Add the `share_cgroups` field into the protobuf message `LinuxInfo`
* [MESOS-10076] - Cgroups isolator: create nested cgroups
* [MESOS-10077] - Cgroups isolator: allow updating and isolating resources for nested cgroups
* [MESOS-10079] - Cgroups isolator: recover nested cgroups
* [MESOS-10086] - Add support for systemd socket activation for mesos domain sockets
* [MESOS-10087] - Update master & agent's HTTP endpoints for showing resource limits
* [MESOS-10115] - Add documentation for task resource limits
* [MESOS-10117] - Update the `usage()` method of containerizer to set resource limits in the `ResourceStatistics` protobuf message
** Documentation
* [MESOS-9938] - Standalone container documentation
* [MESOS-9979] - Add docs for FrameworkInfo updates and the UPDATE_FRAMEWORK call.
Release Notes - Mesos - Version 1.9.1 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-9964] - Support destroying UCR containers in provisioning state.
* [MESOS-9965] - Agent should not send `TASK_GONE_BY_OPERATOR` if the framework is not partition aware.
* [MESOS-9966] - Agent crashes when trying to destroy orphaned nested container if root container is orphaned as well.
* [MESOS-9968] - WWWAuthenticate header parsing fails when commas are in (quoted) realm
* [MESOS-9972] - Update Names for TLS-related environment variables in libprocess.
* [MESOS-10007] - Command executor can miss exit status for short-lived commands due to double-reaping.
* [MESOS-10008] - Very large quota values can crash master.
* [MESOS-10015] - updateAllocation() can stall the allocator with a huge number of reservations on an agent.
* [MESOS-10041] - Libprocess SSL verification can leak memory.
* [MESOS-10094] - Master's agent draining VLOG prints incorrect task counts.
* [MESOS-10096] - Reactivating a draining agent leaves the agent in draining state.
* [MESOS-10118] - Agent incorrectly handles draining when empty.
** Improvement
* [MESOS-9889] - Master CPU high due to unexpected foreachkey behaviour in Master::__reregisterSlave.
* [MESOS-9948] - master::Slave::hasExecutor occupies 37% of a 150 second perf sample.
* [MESOS-10017] - Log all reverse DNS lookup failures in 'legacy' TLS (SSL) hostname validation scheme.
* [MESOS-10095] - Agent draining logging makes it hard to tell which tasks did not terminate.
* [MESOS-10112] - Log peer address during TLS handshake failures.
Release Notes - Mesos - Version 1.9.0
-------------------------------------
This release contains the following highlights:
* Maintenance:
  * Added new APIs to support automatic node draining via operator APIs.
    This serves as an alternative to framework-assisted draining using
    maintenance primitives. (MESOS-9753)
* Resource Management:
  * Support for quota limits has been added. The existing quota guarantees
    are deprecated in favor of using limits (and in the future, priorities).
* Security:
  * A new libprocess flag `--hostname_validation_scheme` has been added.
    This allows users to enable a new RFC 6125-compliant hostname verification
    scheme based on primitives provided by OpenSSL. This will also improve
    performance by getting rid of all reverse DNS lookups. (MESOS-9784)
  * The use of anonymous cipher suites is now disallowed when TLS certificate
    verification is enabled. (MESOS-9810)
* Containerization:
  * A new `--docker_ignore_runtime` flag has been added. This causes the agent
    to ignore any runtime configuration present in Docker images. (MESOS-9760)
  * A new Linux isolator, no-new-privileges, has been added to support
    enabling the no_new_privs process control flag. (MESOS-9770)
  * The Mesos containerizer now masks sensitive paths in `/proc` for
    containers that do not share the host's PID namespace. (MESOS-9771)
  * The Mesos containerizer now supports configurable IPC namespace and
    /dev/shm. A container can be configured to have a private IPC namespace
    and /dev/shm or share them from its parent, and the size of its private
    /dev/shm is also configurable. (MESOS-9795)
  * The Mesos containerizer now includes ephemeral overlayfs storage in the
    task disk quota as well as sandbox storage. (MESOS-9900)
  * A new `/containerizer/debug` HTTP endpoint has been added. This endpoint
    exposes debug information for the Mesos containerizer. At the moment, it
    returns a list of pending operations related to Isolators and Launchers.
    (MESOS-9756)
Additional API Changes:
* Mesos components will now forego TLS certificate validation for incoming
connections, unless `LIBPROCESS_SSL_REQUIRE_CERT` is set to true.
* The `Socket::connect(const Address&)` member function will now abort the
program when called on a `LibeventSSLSocket`. Instead, the new overload
`Socket::connect(const Address&, const TLSClientConfig&)` must be used.
NOTE: This new overload is only available when libprocess is compiled
with `--enable-ssl`.
Unresolved Critical Issues:
* [MESOS-9889] - Master CPU high due to unexpected foreachkey behaviour in Master::__reregisterSlave
* [MESOS-9697] - Release RPMs are not uploaded to bintray
* [MESOS-9579] - ExecutorHttpApiTest.HeartbeatCalls is flaky.
* [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable `MESOS_SANDBOX`
* [MESOS-9520] - IOTest.Read hangs on Windows
* [MESOS-9500] - spark submit with docker image on mesos cluster fails.
* [MESOS-9426] - ZK master detection can become forever pending.
* [MESOS-9393] - Fetcher crashes extracting archives with non-ASCII filenames.
* [MESOS-9365] - Windows - GET_CONTAINERS API call causes the Mesos agent to fail
* [MESOS-9355] - Persistence volume does not unmount correctly with wrong artifact URI
* [MESOS-9352] - Data in persistent volume deleted accidentally when using Docker container and Persistent volume
* [MESOS-9053] - Network ports isolator can falsely trigger while destroying containers.
* [MESOS-9006] - The agent's GET_AGENT leaks resource information when using authorization
* [MESOS-8877] - Docker container's resources will be wrongly enlarged in cgroups after agent recovery
* [MESOS-8840] - `cpu.cfs_quota_us` may be accidentally set for command task using docker during agent recovery.
* [MESOS-8803] - Libprocess deadlocks in a test.
* [MESOS-8679] - If the first KILL stuck in the default executor, all other KILLs will be ignored.
* [MESOS-8608] - RmdirContinueOnErrorTest.RemoveWithContinueOnError fails.
* [MESOS-8257] - Unified Containerizer "leaks" a target container mount path to the host FS when the target resolves to an absolute path
* [MESOS-8256] - Libprocess can silently deadlock due to worker thread exhaustion.
* [MESOS-8096] - Enqueueing events in MockHTTPScheduler can lead to segfaults.
* [MESOS-8038] - Launching GPU task sporadically fails.
* [MESOS-7971] - PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test is flaky
* [MESOS-7911] - Non-checkpointing framework's tasks should not be marked LOST when agent disconnects.
* [MESOS-7748] - Slow subscribers of streaming APIs can lead to Mesos OOMing.
* [MESOS-7721] - Master's agent removal rate limit also applies to agent unreachability.
* [MESOS-7566] - Master crash due to failed check in DRFSorter::remove
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die/killed
* [MESOS-6285] - Agents may OOM during recovery if there are too many tasks or executors
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
All Resolved Issues:
** Bug
* [MESOS-2842] - Master crashes when framework changes principal on re-registration
* [MESOS-5804] - ExamplesTest.DynamicReservationFramework is flaky
* [MESOS-6382] - Add option to enable parallel test runner for cmake builds
* [MESOS-6605] - configure looks for wrong header file for elfio
* [MESOS-8968] - Wire `UPDATE_QUOTA` call.
* [MESOS-9353] - libprocess triggers deprecation warnings when built against openssl 1.1.
* [MESOS-9395] - Check failure on `StorageLocalResourceProviderProcess::applyCreateDisk`.
* [MESOS-9482] - Resource provider manager can crash on invalid data from resource providers
* [MESOS-9560] - ContentType/AgentAPITest.MarkResourceProviderGone/1 is flaky
* [MESOS-9594] - Test `StorageLocalResourceProviderTest.RetryRpcWithExponentialBackoff` is flaky.
* [MESOS-9609] - Master check failure when marking agent unreachable
* [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers.
* [MESOS-9667] - Check failure when executor for task using resource provider resources subscribes before agent is registered
* [MESOS-9698] - DroppedOperationStatusUpdate test is flaky
* [MESOS-9707] - Calling link::lo() may cause runtime error
* [MESOS-9711] - Avoid shutting down executors registering before a required resource provider.
* [MESOS-9712] - StorageLocalResourceProviderTest.CsiPluginRpcMetrics is flaky.
* [MESOS-9719] - Test `AgentFailoverHTTPExecutorUsingResourceProviderResources` is flaky.
* [MESOS-9727] - Heartbeat calls from executor to agent are reported as errors
* [MESOS-9733] - Random sorter generates non-uniform result for hierarchical roles.
* [MESOS-9750] - Agent V1 GET_STATE response may report a complete executor's tasks as non-terminal after a graceful agent shutdown
* [MESOS-9765] - Test `ROOT_CreateDestroyPersistentMountVolumeWithReboot` is flaky.
* [MESOS-9766] - /__processes__ endpoint can hang.
* [MESOS-9779] - `UPDATE_RESOURCE_PROVIDER_CONFIG` agent call returns 404 ambiguously.
* [MESOS-9782] - Random sorter fails to clear removed clients.
* [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers.
* [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
* [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`.
* [MESOS-9808] - libprocess can deadlock on termination (cleanup() vs use() + terminate())
* [MESOS-9811] - Don't use reverse DNS for hostname validation
* [MESOS-9831] - Master should not report disconnected resource providers.
* [MESOS-9835] - `QuotaRoleAllocateNonQuotaResource` is failing.
* [MESOS-9836] - Docker containerizer overwrites `/mesos/slave` cgroups.
* [MESOS-9852] - Slow memory growth in master due to deferred deletion of offer filters and timers.
* [MESOS-9854] - /roles endpoint should return both guarantees and limits.
* [MESOS-9856] - REVIVE call with specified role(s) clears filters for all roles of a framework.
* [MESOS-9861] - Make PushGauges support floating point stats.
* [MESOS-9870] - Simultaneous adding/removal of a role from framework's roles and its suppressed roles crashes the master.
* [MESOS-9875] - Mesos did not respond correctly when operations should fail
* [MESOS-9881] - StorageLocalResourceProviderTest.RetryOperationStatusUpdateAfterRecovery is flaky.
* [MESOS-9882] - Mesos.UpdateFrameworkV0Test.SuppressedRoles is flaky.
* [MESOS-9886] - RoleTest.RolesEndpointContainsConsumedQuota is flaky.
* [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor.
* [MESOS-9888] - /roles and GET_ROLES do not expose roles with only static reservations
* [MESOS-9890] - /roles and GET_ROLES do not always expose parent roles.
* [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret from runtime directory when the container is destroyed
* [MESOS-9894] - Mesos failed to build due to fatal error C1083 on Windows using MSVC.
* [MESOS-9895] - SlaveTest.DrainingAgentRejectLaunch is flaky
* [MESOS-9901] - jsonify uses non-standard mapping for protobuf map fields.
* [MESOS-9902] - Mesos failed to build due to error C2280 on windows with MSVC
* [MESOS-9906] - Libprocess tests hang on arm
* [MESOS-9909] - Mesos agent crashes after recovery when a nested container joins a CNI network
* [MESOS-9922] - MasterQuotaTest.RescindOffersEnforcingLimits is flaky
* [MESOS-9925] - Default executor takes a couple of seconds to start and subscribe Mesos agent
* [MESOS-9930] - DRF sorter may omit clients in sorting after removing an inactive leaf node.
* [MESOS-9934] - Master does not handle returning unreachable agents as draining/deactivated
* [MESOS-9935] - The agent crashes after the disk du isolator supporting rootfs checks.
* [MESOS-9952] - ExampleTest.DiskFullFramework is slow
* [MESOS-9956] - CSI plugins reporting duplicated volumes will crash the agent.
** Epic
* [MESOS-9534] - CSI Spec v1.0 Support.
* [MESOS-9756] - Introduce a container debug endpoint.
* [MESOS-9784] - Client side SSL certificate verification in Libprocess.
* [MESOS-9795] - Support configurable /dev/shm and IPC namespace.
** Improvement
* [MESOS-7258] - Provide scheduler calls to subscribe to additional roles and unsubscribe from roles.
* [MESOS-8456] - Allocator should allow roles to burst above guarantees but below limits.
* [MESOS-8789] - /roles and webui roles table should display distinct offered and allocated resources.
* [MESOS-9254] - Make SLRP be able to update its volumes and storage pools.
* [MESOS-9545] - Marking an unreachable agent as gone should transition the tasks to terminal state
* [MESOS-9618] - Display quota consumption in the webui.
* [MESOS-9640] - Add authorization support for `UPDATE_QUOTA` call.
* [MESOS-9668] - Add authorization support for the new `GET_QUOTA` call.
* [MESOS-9669] - Deprecate v0 quota calls.
* [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
* [MESOS-9701] - Allocator's roles map should track reservations.
* [MESOS-9724] - Flatten the weighted shuffling in the random sorter.
* [MESOS-9758] - Take ports out of the GET_ROLES endpoints.
* [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
* [MESOS-9760] - Decouple Docker runtime isolator manifest configuration from image provider
* [MESOS-9769] - Add direct containerized support for filesystem operations.
* [MESOS-9770] - Add no-new-privileges isolator.
* [MESOS-9771] - Mask sensitive procfs paths.
* [MESOS-9778] - Randomized the agents in the second allocation stage.
* [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
* [MESOS-9791] - Libprocess does not support server only SSL certificate verification.
* [MESOS-9799] - Adopt container file operations in secrets volumes.
* [MESOS-9802] - Remove quota role sorter in the allocator.
* [MESOS-9805] - Run cgroup subsystems before moving the target PID.
* [MESOS-9806] - Address allocator performance regression due to the addition of quota limits.
* [MESOS-9807] - Introduce a `struct Quota` wrapper.
* [MESOS-9812] - Add achievability validation for update quota call.
* [MESOS-9820] - Add `updateQuota()` method to the allocator.
* [MESOS-9833] - Introduce an agent flag for the default `/dev/shm` size
* [MESOS-9876] - Use geteuid to determine subprocess' user when launching task.
* [MESOS-9878] - Enable libprocess users to pass a custom SSL context when using Socket
* [MESOS-9900] - Include overlayfs upperdir in disk quota accounting.
* [MESOS-9908] - Introduce a new agent flag and support docker volume chown to task user.
* [MESOS-9917] - Store a role tree in the allocator.
* [MESOS-9932] - Removal of a role from the suppression list should be equivalent to REVIVE.
** Task
* [MESOS-8486] - Webui should display role limits.
* [MESOS-9485] - Unit test for master operation authorization.
* [MESOS-9565] - Unit tests for creating and destroying persistent volumes in SLRP.
* [MESOS-9598] - Update GET `/quota` to return both guarantees and limits.
* [MESOS-9599] - Update `GET_QUOTA` to return both guarantees and limits.
* [MESOS-9600] - Deprecate `SET_QUOTA` and `REMOVE_QUOTA` calls in favor of `UPDATE_QUOTA`.
* [MESOS-9601] - Persist `QuotaConfig`s in the registry.
* [MESOS-9602] - Provide backward compatibility for old quota configurations.
* [MESOS-9603] - Add quota limits metrics.
* [MESOS-9627] - Test CSI v1 in SLRP unit tests.
* [MESOS-9699] - Pull in glog 0.4.0
* [MESOS-9710] - Add tests to ensure random sorter performs correct weighted sorting.
* [MESOS-9715] - Support specifying output file name for curl fetcher plugin
* [MESOS-9754] - Design doc for agent draining
* [MESOS-9757] - Design doc for container debug endpoint.
* [MESOS-9775] - Design doc for UCR shared memory.
* [MESOS-9788] - Configurable IPC namespace and shared memory in `namespaces/ipc` isolator
* [MESOS-9793] - Implement UPDATE_FRAMEWORK call in V0 API for C++/Java
* [MESOS-9809] - Use OpenSSL built-in functions for hostname validation
* [MESOS-9810] - Reject certificate-less ciphers when certificate verification is enabled
* [MESOS-9814] - Implement DrainAgent master/operator call with associated registry actions
* [MESOS-9816] - Add draining state information to master state endpoints
* [MESOS-9817] - Add minimum master capability for draining and deactivation states
* [MESOS-9818] - Implement minimal agent-side draining handler
* [MESOS-9821] - Agent kills all tasks when draining
* [MESOS-9822] - Agent recovery code for task draining
* [MESOS-9823] - Agent should modify status updates while draining
* [MESOS-9825] - Introduce an agent flag to disallow sharing the IPC namespace from the host.
* [MESOS-9826] - Set up `/dev/shm` in `filesystem/linux` isolator only when `namespaces/ipc` isolator is not enabled
* [MESOS-9827] - Introduce the configurable shm protobuf API.
* [MESOS-9828] - Document the IPC namespace and shm on UCR.
* [MESOS-9829] - Implement the container debug endpoint on slave/http.cpp
* [MESOS-9837] - Implement `FutureTracker` class along with helper functions.
* [MESOS-9839] - Implement `IsolatorTracker` class.
* [MESOS-9840] - Implement `LauncherTracker` class.
* [MESOS-9841] - Integrate `IsolatorTracker` and `LinuxLauncher` with Mesos containerizer.
* [MESOS-9842] - Implement tests for the `FutureTracker` class and for its helper functions.
* [MESOS-9845] - Add docs for automatic agent draining
* [MESOS-9846] - Update UI for agent draining
* [MESOS-9849] - Add support for per-role REVIVE / SUPPRESS to V0 scheduler driver.
* [MESOS-9853] - Update Docker executor to allow kill policy overrides
* [MESOS-9860] - Agent should erase DrainInfo when draining complete
* [MESOS-9862] - Agent should fail task launches while draining
* [MESOS-9871] - Expose quota consumption in /roles endpoint.
* [MESOS-9874] - Add environment variable `MESOS_ALLOCATION_ROLE` to the task/container.
* [MESOS-9892] - Test various agent state transitions involving agent draining
* [MESOS-9907] - Retain agent draining start time in master
** Documentation
* [MESOS-9427] - Revisit quota documentation.
Release Notes - Mesos - Version 1.8.2 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers.
* [MESOS-9836] - Docker containerizer overwrites `/mesos/slave` cgroups.
* [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct.
* [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor.
* [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret from runtime directory when the container is destroyed.
* [MESOS-9925] - Default executor takes a couple of seconds to start and subscribe Mesos agent.
* [MESOS-9964] - Support destroying UCR containers in provisioning state.
* [MESOS-9966] - Agent crashes when trying to destroy orphaned nested container if root container is orphaned as well.
* [MESOS-9968] - WWWAuthenticate header parsing fails when commas are in (quoted) realm
* [MESOS-10007] - Command executor can miss exit status for short-lived commands due to double-reaping.
* [MESOS-10015] - updateAllocation() can stall the allocator with a huge number of reservations on an agent.
** Improvement
* [MESOS-9889] - Master CPU high due to unexpected foreachkey behaviour in Master::__reregisterSlave.
* [MESOS-9948] - master::Slave::hasExecutor occupies 37% of a 150 second perf sample.
* [MESOS-10017] - Log all reverse DNS lookup failures in 'legacy' TLS (SSL) hostname validation scheme.
Release Notes - Mesos - Version 1.8.1
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-9395] - Check failure on `StorageLocalResourceProviderProcess::applyCreateDisk`.
* [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers.
* [MESOS-9730] - Executors cannot reconnect with agents using TLS1.3
* [MESOS-9750] - Agent V1 GET_STATE response may report a complete executor's tasks as non-terminal after a graceful agent shutdown.
* [MESOS-9766] - /__processes__ endpoint can hang.
* [MESOS-9779] - `UPDATE_RESOURCE_PROVIDER_CONFIG` agent call returns 404 ambiguously.
* [MESOS-9782] - Random sorter fails to clear removed clients.
* [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
* [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`.
* [MESOS-9831] - Master should not report disconnected resource providers.
* [MESOS-9852] - Slow memory growth in master due to deferred deletion of offer filters and timers.
* [MESOS-9856] - REVIVE call with specified role(s) clears filters for all roles of a framework.
* [MESOS-9870] - Simultaneous adding/removal of a role from framework's roles and its suppressed roles crashes the master.
** Improvement
* [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
* [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
* [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
Release Notes - Mesos - Version 1.8.0
-------------------------------------
This release contains the following highlights:
* Performance Improvements:
* Frameworks can now specify the minimum resource quantities needed
in an offer, which acts as an override of the global
`--min_allocatable_resources` master flag. Updating schedulers to
specify this field improves multi-scheduler scalability as it
reduces the number of offers declined for having insufficient
resource quantities. Note that this feature currently requires that
the scheduler re-subscribes each time it wants to mutate the
minimum resource quantity offer filter information; see MESOS-7258.
* The batching mechanism used for requests to the master's `/state`
endpoint was extended to other read-only master endpoints like
`/state-summary`, `/frameworks`, `/roles`, etc. (see MESOS-9158).
In addition, responses for multiple concurrent requests to read-only master
endpoints are now only computed once in cases where it can be guaranteed
that all responses would be equal (see MESOS-9224).
This should significantly increase master responsiveness under
heavy load.
* Allocator cycle time is significantly decreased (around 40% for a
small size cluster and up to 70% for larger clusters) when quota is
used. This greatly narrows the allocator performance gap between
quota and non-quota usage scenarios.
* CLI:
* The new Mesos CLI now offers the `task` subcommand. The first
command, `attach`, allows you to attach your terminal to a running
task launched with a tty. The second command, `exec`, launches a
new nested container inside a running task. To build the CLI,
use the flag `--enable-new-cli` with Autotools and
`-DENABLE_NEW_CLI=1` with CMake on macOS or Linux.
* Operation Feedback:
* V1 schedulers can now receive operation feedback for operations on agent
default resources, i.e. normal cpu, memory, and disk. This means that the
v1 scheduler API's operation feedback feature can now be used for all
non-task-launch operations (any offer operations except for LAUNCH and
LAUNCH_GROUP) on any type of resources.
* The experimental operation feedback API for v1 schedulers made a breaking
change: the RECONCILE_OPERATIONS call no longer returns a 200 OK response
with a body containing the full reconciliation results. Instead, a
successful request now returns 202 Accepted, and a series of operation
status updates are sent on the scheduler's event stream to satisfy the
reconciliation request. This is similar to the way in which the master
replies to requests for task status reconciliation.
* Containerization:
* [MESOS-9029] - New `linux/seccomp` isolator: Containers launched
by Mesos containerizer can be sandboxed by enabling filtering of
system calls using a configurable policy.
* [MESOS-9675] - Support pulling docker images with docker manifest
V2 Schema2 on Mesos Containerizer.
* [MESOS-9133] - Support custom port range option to the `network/ports`
isolator. Added the `--container_ports_isolated_range` flag to the
`network/ports` isolator. This allows the operator to specify a custom
port range to be protected by the isolator.
* [MESOS-5158] - Support XFS quota for persistent volumes. Added
persistent volume support to the `disk/xfs` isolator.
* [MESOS-9009] - Support an option to create non-existing host
paths for host path volume in Mesos Containerizer. Added a new
agent flag `--host_path_volume_force_creation` for the
`volume/host_path` isolator.
* Container Storage Interface (CSI):
* **Experimental** Added support for the new CSI v1 API. Operators can deploy
plugins that are compatible with either CSI v0 or v1 to create persistent
volumes through storage local resource providers, and Mesos will
automatically detect which CSI versions are supported by the plugins.
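As a rough illustration of the read-only request de-duplication mentioned
under Performance Improvements above (MESOS-9224), the following Python sketch
computes one response per distinct (endpoint, principal) pair within a batch
and reuses it for identical requests. This is not the master's implementation:
`serve_batch`, `render`, and the tuple layout are hypothetical names chosen for
the example, and keying on the principal is an assumption taken from the
ticket title.

```python
def serve_batch(requests, compute):
    """Serve a batch of read-only requests, computing each distinct response once.

    `requests` is a list of (endpoint, principal) tuples (hypothetical layout).
    `compute` is a callable that builds the response for one such pair.
    """
    cache = {}
    responses = []
    for endpoint, principal in requests:
        key = (endpoint, principal)
        if key not in cache:
            # First request of this kind in the batch: do the expensive work.
            cache[key] = compute(endpoint, principal)
        # Identical requests in the same batch share the cached response.
        responses.append(cache[key])
    return responses


calls = []  # records how often the expensive computation actually runs


def render(endpoint, principal):
    # Stand-in for serializing master state; hypothetical, for illustration only.
    calls.append((endpoint, principal))
    return f"{endpoint} as seen by {principal}"


batch = [
    ("/state", "alice"),
    ("/state", "alice"),
    ("/roles", "alice"),
    ("/state", "bob"),
]
responses = serve_batch(batch, render)
```

Here four requests are answered with three computations, because the two
`/state` requests from the same principal are guaranteed to yield equal
responses within the batch.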
Additional API Changes:
* [MESOS-9540] - Improved the experimental `DESTROY_DISK` operations so
frameworks can now deprovision any unwanted pre-provisioned CSI volume
directly, if they are authorized to perform `DESTROY_RAW_DISK` actions.
Unresolved Critical Issues:
* [MESOS-9697] - Release RPMs are not uploaded to bintray
* [MESOS-9672] - Docker containerizer should ignore pids of executors that do not pass the connection check.
* [MESOS-9654] - `PUBLISH_RESOURCES` should fail if the resource version changes.
* [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers.
* [MESOS-9609] - Master check failure when marking agent unreachable
* [MESOS-9579] - ExecutorHttpApiTest.HeartbeatCalls is flaky.
* [MESOS-9560] - ContentType/AgentAPITest.MarkResourceProviderGone/1 is flaky
* [MESOS-9536] - Nested container launched with non-root user may not be able to write to its sandbox via the environment variable
* [MESOS-9520] - IOTest.Read hangs on Windows
* [MESOS-9500] - spark submit with docker image on mesos cluster fails.
* [MESOS-9426] - ZK master detection can become forever pending.
* [MESOS-9393] - Fetcher crashes extracting archives with non-ASCII filenames.
* [MESOS-9365] - Windows - GET_CONTAINERS API call causes the Mesos agent to fail
* [MESOS-9355] - Persistence volume does not unmount correctly with wrong artifact URI
* [MESOS-9352] - Data in persistent volume deleted accidentally when using Docker container and Persistent volume
* [MESOS-9306] - Mesos containerizer can get stuck during cgroup cleanup
* [MESOS-9180] - tasks get stuck in TASK_KILLING on the default executor
* [MESOS-9053] - Network ports isolator can falsely trigger while destroying containers.
* [MESOS-9006] - The agent's GET_AGENT leaks resource information when using authorization
* [MESOS-8946] - CURL 7.58 causes Mesos to fail decoding raw responses.
* [MESOS-8840] - `cpu.cfs_quota_us` may be accidentally set for command task using docker during agent recovery.
* [MESOS-8803] - Libprocess deadlocks in a test.
* [MESOS-8769] - Agent crashes when CNI config not defined
* [MESOS-8679] - If the first KILL stuck in the default executor, all other KILLs will be ignored.
* [MESOS-8608] - RmdirContinueOnErrorTest.RemoveWithContinueOnError fails.
* [MESOS-8257] - Unified Containerizer "leaks" a target container mount path to the host FS when the target resolves to an absolute path
* [MESOS-8256] - Libprocess can silently deadlock due to worker thread exhaustion.
* [MESOS-8096] - Enqueueing events in MockHTTPScheduler can lead to segfaults.
* [MESOS-8038] - Launching GPU task sporadically fails.
* [MESOS-7971] - PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test is flaky
* [MESOS-7911] - Non-checkpointing framework's tasks should not be marked LOST when agent disconnects.
* [MESOS-7748] - Slow subscribers of streaming APIs can lead to Mesos OOMing.
* [MESOS-7721] - Master's agent removal rate limit also applies to agent unreachability.
* [MESOS-7566] - Master crash due to failed check in DRFSorter::remove
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die/killed
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-5754] - CommandInfo.user not honored in docker containerizer
* [MESOS-2842] - Master crashes when framework changes principal on re-registration
All Resolved Issues:
** Bug
* [MESOS-5048] - MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
* [MESOS-5189] - SSLTest.ProtocolMismatch is slow
* [MESOS-6874] - Agent silently ignores FS isolation when protobuf is malformed
* [MESOS-6949] - SchedulerTest.MasterFailover is flaky
* [MESOS-6990] - PartitionTest.TaskCompletedOnPartitionedAgent is flaky.
* [MESOS-7042] - Send SIGKILL after SIGTERM to IOSwitchboard after container termination.
* [MESOS-7076] - libprocess tests fail when using libevent 2.1.8
* [MESOS-7474] - Mesos fetcher cache doesn't retry when missed.
* [MESOS-7564] - Introduce a heartbeat mechanism for v1 HTTP executor <-> agent communication.
* [MESOS-7883] - Quota heuristic check not accounting for mount volumes
* [MESOS-8156] - Add a socketpair helper to the stout net API
* [MESOS-8343] - SchedulerHttpApiTest.UpdatePidToHttpScheduler is flaky.
* [MESOS-8467] - Destroyed executors might be used after `Slave::publishResource()`.
* [MESOS-8470] - CHECK failure in DRFSorter due to invalid framework id.
* [MESOS-8545] - AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky.
* [MESOS-8547] - Mount devpts with compatible defaults.
* [MESOS-8568] - Command checks should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`
* [MESOS-8782] - Transition operations to OPERATION_GONE_BY_OPERATOR when marking an agent gone.
* [MESOS-8783] - Transition pending operations to OPERATION_UNREACHABLE when an agent is removed.
* [MESOS-8797] - Check failed in the default executor while running `MesosContainerizer/DefaultExecutorTest.TaskUsesExecutor/0` test.
* [MESOS-8835] - mesos-tests takes a long time to execute no tests
* [MESOS-8872] - OperationReconciliationTest.AgentPendingOperationAfterMasterFailover is flaky.
* [MESOS-8887] - Unreachable tasks are not GC'ed when unreachable agent is GC'ed.
* [MESOS-8907] - Docker image fetcher fails with HTTP/2.
* [MESOS-8978] - Command executor calling setsid breaks the tty support.
* [MESOS-9056] - mesos-style.py messaging is poor
* [MESOS-9074] - Pylint is too noisy when using mesos-style.py
* [MESOS-9079] - Test MasterTestPrePostReservationRefinement.LaunchGroup is flaky.
* [MESOS-9089] - Test `PartitionTest.PartitionAwareTaskCompletedOnPartitionedAgent` is flaky.
* [MESOS-9112] - mesos-style reports violations on a clean checkout
* [MESOS-9124] - Agent reconfiguration can cause master to REVIVE on scheduler's behalf
* [MESOS-9130] - Test `StorageLocalResourceProviderTest.ROOT_ContainerTerminationMetric` is flaky.
* [MESOS-9131] - Health checks launching nested containers while a container is being destroyed lead to unkillable tasks.
* [MESOS-9143] - MasterQuotaTest.RemoveSingleQuota is flaky.
* [MESOS-9168] - Libprocess' http client does not encode the outgoing query.
* [MESOS-9172] - Fetcher deadlock with duplicated URIs.
* [MESOS-9179] - ./support/python3/mesos-gtest-runner.py --help crashes
* [MESOS-9186] - Failed to build Mesos with Python 3.7 and new CLI enabled
* [MESOS-9187] - Add allocator benchmark to allow multiple framework/agent profiles.
* [MESOS-9190] - Test `StorageLocalResourceProviderTest.ROOT_CreateDestroyDiskRecovery` is flaky.
* [MESOS-9193] - Mesos build fail with Clang 3.5.
* [MESOS-9210] - Mesos v1 scheduler library does not properly handle SUBSCRIBE retries
* [MESOS-9212] - Disable SIGCHLD handling in libev.
* [MESOS-9214] - Stout.FsTest.Used fails on macOS
* [MESOS-9217] - LongLivedDefaultExecutorRestart is flaky.
* [MESOS-9222] - Linking libevent should be avoided.
* [MESOS-9225] - Github's mesos/modules does not build.
* [MESOS-9228] - SLRP does not clean up plugin containers after it is removed.
* [MESOS-9231] - `docker inspect` may return an unexpected result to Docker executor due to a race condition.
* [MESOS-9232] - verify-reviews.py broken after enabling python3 support scripts
* [MESOS-9240] - CSI protobuf build fails when dependency tracking is disabled.
* [MESOS-9253] - Reviewbot is failing when posting a review
* [MESOS-9266] - Whenever our packaging tasks trigger errors we run into permission problems.
* [MESOS-9274] - v1 JAVA scheduler library can drop TEARDOWN upon destruction.
* [MESOS-9279] - Docker Containerizer 'usage' call might be expensive if mount table is big.
* [MESOS-9281] - SLRP gets a stale checkpoint after system crash.
* [MESOS-9283] - Docker containerizer actor can get backlogged with large number of containers.
* [MESOS-9293] - If a framework loses operation information it cannot reconcile to acknowledge updates.
* [MESOS-9295] - Nested container launch could fail if the agent upgrade with new cgroup subsystems.
* [MESOS-9300] - XFS isolator can mislabel project IDs on persistence volumes.
* [MESOS-9302] - Mesos fails to build on Fedora 28
* [MESOS-9308] - URI disk profile adaptor could deadlock.
* [MESOS-9316] - FsTest.Used is flaky
* [MESOS-9317] - Some master endpoints do not handle failed authorization properly.
* [MESOS-9319] - Move root filesystem creation to the `filesystem/linux` isolator.
* [MESOS-9324] - Resource fragmentation: frameworks may be starved of port resources in the presence of large number frameworks with quota.
* [MESOS-9331] - Some library functions ignore failures from ::close which should probably be handled.
* [MESOS-9334] - Container stuck at ISOLATING state due to libevent poll never returns.
* [MESOS-9350] - CLI build step is broken with CMake due to missing file.
* [MESOS-9354] - Automatically remount read-only bind mounts.
* [MESOS-9357] - FetcherTest.DuplicateFileURI fails on macos
* [MESOS-9358] - Test `SlaveRecoveryTest.AgentReconfigurationWithRunningTask` is flaky.
* [MESOS-9362] - Test `CgroupsIsolatorTest.ROOT_CGROUPS_CreateRecursively` is flaky.
* [MESOS-9366] - Test `HealthCheckTest.HealthyTaskNonShell` can hang.
* [MESOS-9367] - GetContainers call crashes when using XFS disk isolation.
* [MESOS-9370] - Unable to build new Mesos CLI with PyInstaller and Python 3.7.
* [MESOS-9382] - mesos-gtest-runner doesn't work on systems without ulimit binary
* [MESOS-9390] - Warnings in AdaptedOperation prevent clang build
* [MESOS-9397] - PosixRLimitsIsolatorTest.UnsetLimits is broken on macOS 10.14.2 beta3.
* [MESOS-9398] - post-reviews.py fails to update an existing chain.
* [MESOS-9411] - Validation of JWT tokens using HS256 hashing algorithm is not thread safe.
* [MESOS-9417] - User mesosphere made lots of incorrect ticket updates
* [MESOS-9418] - Add support for the `Discard` blkio operation type.
* [MESOS-9419] - Executor to framework message crashes master if framework has not re-registered.
* [MESOS-9434] - Completed framework update streams may retry forever
* [MESOS-9459] - Reviewbot is not verifying reviews that need verification
* [MESOS-9462] - Devices in a container are inaccessible due to `nodev` on `/var/run`.
* [MESOS-9469] - Mesos does not validate framework-supplied FrameworkIDs
* [MESOS-9474] - Master does not respect authorization result for `CREATE_DISK` and `DESTROY_DISK`.
* [MESOS-9479] - SLRP does not set RP ID in produced OperationStatus.
* [MESOS-9480] - Master may skip processing authorization results for `LAUNCH_GROUP`.
* [MESOS-9492] - Persist CNI working directory across reboot.
* [MESOS-9495] - Test `MasterTest.CreateVolumesV1AuthorizationFailure` is flaky.
* [MESOS-9501] - Mesos executor fails to terminate and gets stuck after agent host reboot.
* [MESOS-9502] - IOswitchboard cleanup could get stuck due to FD leak from a race.
* [MESOS-9505] - `make check` failed with linking errors when c-ares is installed.
* [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
* [MESOS-9508] - Official 1.7.0 tarball can't be built on Ubuntu 16.04 LTS.
* [MESOS-9514] - Reviewboard bot fails on verify-reviews.py.
* [MESOS-9517] - SLRP should treat gRPC timeouts as non-terminal errors, instead of reporting OPERATION_FAILED.
* [MESOS-9518] - CNI_NETNS should not be set for orphan containers that do not have network namespace.
* [MESOS-9519] - Unable to build Mesos with CMake on Ubuntu 14.04.
* [MESOS-9521] - MasterAPITest.OperationUpdatesUponAgentGone is flaky
* [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true
* [MESOS-9531] - chown error handling is incorrect in createSandboxDirectory.
* [MESOS-9532] - ResourceOffersTest.ResourceOfferWithMultipleSlaves is flaky.
* [MESOS-9533] - CniIsolatorTest.ROOT_CleanupAfterReboot is flaky.
* [MESOS-9537] - SLRP sends inconsistent status updates for dropped operations.
* [MESOS-9542] - Hierarchical allocator check failure when an operation on a shutdown framework finishes
* [MESOS-9544] - SLRP does not clean up destroyed persistent volumes.
* [MESOS-9549] - nvidia/cuda 10 does not work on GPU isolator.
* [MESOS-9554] - Allocator might skip allocations because a single framework is incapable of receiving certain resources.
* [MESOS-9555] - Allocator CHECK failure: reservationScalarQuantities.contains(role).
* [MESOS-9557] - Operations are leaked in Framework struct when agents are removed
* [MESOS-9559] - OPERATION_UNREACHABLE and OPERATION_GONE_BY_OPERATOR updates don't include the agent/RP IDs
* [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace
* [MESOS-9568] - SLRP does not clean up mount directories for destroyed MOUNT disks.
* [MESOS-9573] - Agent should not try to recover operation status update streams that haven't been created yet.
* [MESOS-9574] - Operation status update streams are not properly garbage collected.
* [MESOS-9582] - Reviewbot jenkins jobs stops validating any reviews as soon as it sees a patch which does not apply
* [MESOS-9590] - Mesos CI sometimes, incorrectly, overwrites already-pushed mesos master nightly images with new images built from non-master branches.
* [MESOS-9592] - Mesos Websitebot is flaky
* [MESOS-9597] - Status update streams for operations affecting agent default resources should be stored under "meta/slaves/<slave_id>/operations/"
* [MESOS-9605] - mesos/mesos-centos nightly docker image has to include the SHA of the build.
* [MESOS-9607] - Removing a resource provider with consumers breaks resource publishing.
* [MESOS-9610] - Fetcher vulnerability - escaping from sandbox
* [MESOS-9612] - Resource provider manager assumes all operations are triggered by frameworks
* [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
* [MESOS-9621] - Mesos failed to build due to error LNK2019 on Windows using MSVC.
* [MESOS-9629] - Pylint reports cyclic dependencies in cli_new
* [MESOS-9635] - OperationReconciliationTest.AgentPendingOperationAfterMasterFailover is flaky again (3x) due to orphan operations
* [MESOS-9637] - Impossible to CREATE a volume on resource provider resources over the operator API
* [MESOS-9661] - Agent crashes when SLRP recovers dropped operations.
* [MESOS-9667] - Check failure when executor for task using resource provider resources subscribes before agent is registered.
* [MESOS-9688] - Quota is not enforced properly when subroles have reservations.
* [MESOS-9691] - Quota headroom calculation is off when subroles are involved.
* [MESOS-9692] - Quota may be under allocated for disk resources.
* [MESOS-9696] - Test MasterQuotaTest.AvailableResourcesSingleDisconnectedAgent is flaky
* [MESOS-9707] - Calling link::lo() may cause runtime error
* [MESOS-9711] - Avoid shutting down executors registering before a required resource provider.
* [MESOS-9712] - StorageLocalResourceProviderTest.CsiPluginRpcMetrics is flaky.
* [MESOS-9727] - Heartbeat calls from executor to agent are reported as errors.
* [MESOS-9729] - Unpublishing a volume that is failed to publish crashes the agent with CSI v1.
* [MESOS-9733] - Random sorter generates non-uniform result for hierarchical roles.
* [MESOS-9740] - Invalid protobuf unions in ExecutorInfo::ContainerInfo will prevent agents from reregistering with 1.8+ masters
** Epic
* [MESOS-8054] - Feedback for operations
* [MESOS-8345] - Improve master responsiveness while serving state information.
* [MESOS-9029] - Seccomp syscall filtering in Mesos containerizer
* [MESOS-9211] - Make the new Mesos CLI production ready
* [MESOS-9675] - Docker Manifest V2 Schema2 Support.
** Story
* [MESOS-907] - Add Kerberos Authentication support
** Improvement
* [MESOS-4036] - Install instructions for CentOS 6.6 lead to errors running `perf`.
* [MESOS-4599] - ReviewBot should re-verify a review chain if any of the reviews is updated
* [MESOS-5158] - Provide XFS quota support for persistent volumes.
* [MESOS-6765] - Make the Resources wrapper "copy-on-write" to improve performance.
* [MESOS-6934] - Support pulling Docker images with V2 Schema 2 image manifest
* [MESOS-7124] - Replace monadic type get() functions with operator*
* [MESOS-7947] - Add GC capability to nested containers
* [MESOS-8025] - Update the master field in the new CLI config to accept a URL instead of an <ip:port>
* [MESOS-8206] - Add the pip-requirements from other modules to the pylint virtual environment
* [MESOS-8380] - Update WebUI to show local resource providers.
* [MESOS-8403] - Add agent HTTP API operator call to mark local resource providers as gone
* [MESOS-8880] - Add minimum capabilities in the master.
* [MESOS-8999] - Add default bodies for libprocess HTTP error responses.
* [MESOS-9133] - Make the range of ports protected by the network/ports isolator configurable.
* [MESOS-9158] - Parallel serving of state-related read-only requests in the Master.
* [MESOS-9194] - Extend request batching to '/roles' endpoint
* [MESOS-9223] - Storage local provider does not sufficiently handle container launch failures or errors
* [MESOS-9224] - De-duplicate read-only requests to master based on principal.
* [MESOS-9239] - Improve sorting performance in the DRF sorter.
* [MESOS-9249] - Avoid dirtying the DRF sorter when allocating resources.
* [MESOS-9255] - Use consistent "totals" across role / framework DRF.
* [MESOS-9258] - Prevent subscribers to the master's event stream from leaking connections
* [MESOS-9275] - Allow optional `profile` to be specified in `CREATE_DISK` offer operation.
* [MESOS-9292] - Rejected quotas request error messages should specify which resources were overcommitted.
* [MESOS-9301] - Add flag to disable per-framework metrics.
* [MESOS-9305] - Create cgroup recursively to work around systemd deleting cgroups_root.
* [MESOS-9315] - Adding support for implicit allocation of mandatory custom resources in Mesos
* [MESOS-9321] - Add an optional `vendor` field in `Resource.DiskInfo.Source`.
* [MESOS-9340] - Log all socket errors in libprocess.
* [MESOS-9384] - Resource providers reported by master should reflect connected resource providers
* [MESOS-9406] - Allow for optionally unbundled leveldb from CMake builds.
* [MESOS-9486] - Set up `object.value` for `CREATE_DISK` and `DESTROY_DISK` authorizations.
* [MESOS-9504] - Use ResourceQuantities in the allocator and sorter to improve performance.
* [MESOS-9510] - Disallowed nan, inf and so on in `Value::Scalar`.
* [MESOS-9516] - Extend `min_allocatable_resources` flag to cover non-scalar resources.
* [MESOS-9523] - Add per-framework allocatable resources matcher/filter.
* [MESOS-9540] - Support `DESTROY_DISK` on preprovisioned CSI volumes.
* [MESOS-9608] - Refactor and Improve `class ResourceQuantity`.
* [MESOS-9613] - Support seccomp `unconfined` option for whitelisting.
* [MESOS-9628] - Consider running tox as part of test suite, not as part of style checking
* [MESOS-9642] - Avoid reading host mount table when allocating a gid in GIDManager.
* [MESOS-9643] - Make setting volume ownership asynchronous in volume gid manager
* [MESOS-9655] - Improving SLRP tests for preprovisioned volumes.
* [MESOS-9704] - Support docker manifest v2s2 config GC.
** Task
* [MESOS-4509] - Remove deprecated .json endpoints.
* [MESOS-5827] - Add example framework for using inverse offers
* [MESOS-6551] - Add attach/exec commands to the Mesos CLI
* [MESOS-6630] - Add some benchmark test for quota allocation
* [MESOS-6840] - Tests for quota capacity heuristic.
* [MESOS-8241] - Add metrics for offer operation feedback
* [MESOS-8528] - Design Doc for Storage External Resource Provider (SERP) support.
* [MESOS-8770] - Use Python3 for Mesos support scripts
* [MESOS-8810] - Grant non-root task user the permissions to access the SANDBOX_PATH volume of PARENT type
* [MESOS-8813] - Support multiple tasks with different users can access a persistent volume.
* [MESOS-8957] - Install Python 3 on Mesos CI instances
* [MESOS-8975] - Problem and solution overview for the slow API issue.
* [MESOS-9009] - Support for creation non-existing host paths in a whitelist as source paths
* [MESOS-9032] - Update build scripts to support `seccomp-isolator` flag and `libseccomp` library
* [MESOS-9033] - Add Seccomp-related protobufs
* [MESOS-9034] - Implement a wrapper class for `libseccomp` API
* [MESOS-9035] - Implement `linux/seccomp` isolator
* [MESOS-9099] - Add allocator quota tests regarding reserve/unreserve already allocated resources.
* [MESOS-9105] - Implement Docker Seccomp profile parser.
* [MESOS-9106] - Add seccomp filter into containerizer launcher.
* [MESOS-9229] - Install Python3 on ubuntu-16.04-arm docker image
* [MESOS-9265] - Analyse and pinpoint libprocess SSL failures when using libevent 2.1.8.
* [MESOS-9270] - Get rid of dependency on `net-tools` in network/cni isolator.
* [MESOS-9278] - Add an operation status update manager to the agent
* [MESOS-9318] - Consider providing better operation status updates while an RP is recovering
* [MESOS-9333] - Document usage and build of new Mesos CLI
* [MESOS-9356] - Make agent atomically checkpoint operations and resources
* [MESOS-9392] - Implement tests for Seccomp parser
* [MESOS-9396] - Use the built CLI binary when running new CLI integration tests in CI
* [MESOS-9399] - Update 'mesos task list' to only list running tasks
* [MESOS-9409] - Implement Seccomp isolator tests
* [MESOS-9471] - Master should track operations on agent default resources.
* [MESOS-9472] - Unblock operation feedback on agent default resources.
* [MESOS-9473] - Add end to end tests for operations on agent default resources.
* [MESOS-9477] - Documentation for operation feedback
* [MESOS-9525] - Agent capability for operation feedback on default resources
* [MESOS-9535] - Master should clean up operations from downgraded agents
* [MESOS-9538] - Agent `ReconcileOperations` handler should handle operation affecting default resources
* [MESOS-9578] - Document per framework minimal allocatable resources in framework development guides
* [MESOS-9596] - Add a new `UPDATE_QUOTA` operator call.
* [MESOS-9604] - Clean up `QuotaRequest` and `QuotaInfo`.
* [MESOS-9615] - Example framework for feedback on agent default resources
* [MESOS-9620] - Add metrics for volume gid manager
* [MESOS-9622] - Refactor SLRP with a CSI volume manager.
* [MESOS-9623] - Implement CSI volume manager with CSI v1.
* [MESOS-9624] - Bundle CSI spec v1.0 in Mesos.
* [MESOS-9625] - Make `DiskProfileAdaptor` agnostic to CSI spec version.
* [MESOS-9626] - Make SLRP pick the appropriate CSI versions for plugins.
* [MESOS-9632] - Refactor SLRP with a CSI service manager.
* [MESOS-9639] - Make CSI plugin RPC metrics agnostic to CSI versions.
* [MESOS-9648] - Make operation reconciliation send asynchronous updates
* [MESOS-9651] - Design for docker registry v2 schema2 basic support.
* [MESOS-9676] - Add prettyjws support for docker v2 s1 manifest.
* [MESOS-9694] - Refactor UCR docker store to construct 'Image' protobuf at Puller.
** Documentation
* [MESOS-9036] - Document `linux/seccomp` isolator
Release Notes - Mesos - Version 1.7.3 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-8467] - Destroyed executors might be used after `Slave::publishResource()`.
* [MESOS-8537] - Default executor doesn't wait for status updates to be ack'd before shutting down.
* [MESOS-9124] - Agent reconfiguration can cause master to unsuppress on scheduler's behalf.
* [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
* [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
* [MESOS-9549] - nvidia/cuda 10 does not work on GPU isolator.
* [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
* [MESOS-9568] - SLRP does not clean up mount directories for destroyed MOUNT disks.
* [MESOS-9581] - Mesos package naming appears to be nondeterministic.
* [MESOS-9590] - Mesos CI sometimes, incorrectly, overwrites already-pushed mesos master nightly images with new images built from non-master branches.
* [MESOS-9607] - Removing a resource provider with consumers breaks resource publishing.
* [MESOS-9610] - Fetcher vulnerability - escaping from sandbox.
* [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers.
* [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
* [MESOS-9661] - Agent crashes when SLRP recovers dropped operations.
* [MESOS-9692] - Quota may be under allocated for disk resources.
* [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
* [MESOS-9707] - Calling link::lo() may cause runtime error
* [MESOS-9750] - Agent V1 GET_STATE response may report a complete executor's tasks as non-terminal after a graceful agent shutdown.
* [MESOS-9766] - /__processes__ endpoint can hang.
* [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers.
* [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
* [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
* [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`.
* [MESOS-9836] - Docker containerizer overwrites `/mesos/slave` cgroups.
* [MESOS-9847] - Docker executor doesn't wait for status updates to be ack'd before shutting down.
* [MESOS-9852] - Slow memory growth in master due to deferred deletion of offer filters and timers.
* [MESOS-9856] - REVIVE call with specified role(s) clears filters for all roles of a framework.
* [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct.
* [MESOS-9870] - Simultaneous adding/removal of a role from framework's roles and its suppressed roles crashes the master.
* [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor.
* [MESOS-9889] - Master CPU high due to unexpected foreachkey behaviour in Master::__reregisterSlave.
* [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret from runtime directory when the container is destroyed.
* [MESOS-9925] - Default executor takes a couple of seconds to start and subscribe Mesos agent.
* [MESOS-9964] - Support destroying UCR containers in provisioning state.
* [MESOS-9966] - Agent crashes when trying to destroy orphaned nested container if root container is orphaned as well.
* [MESOS-9968] - WWWAuthenticate header parsing fails when commas are in (quoted) realm.
* [MESOS-10007] - Command executor can miss exit status for short-lived commands due to double-reaping.
* [MESOS-10015] - updateAllocation() can stall the allocator with a huge number of reservations on an agent.
* [MESOS-10018] - Duplicate tasks if agent partitioned during maintenance down.
* [MESOS-10084] - Detecting whether executor is generated for command task should work when the launcher_dir changes.
* [MESOS-10092] - Cannot pull image from docker registry which does not reply with 'scope'/'service' in WWW-Authenticate header.
** Improvement
* [MESOS-8880] - Add minimum capabilities in the master.
* [MESOS-9159] - Support Foreign URLs in docker registry puller.
* [MESOS-9540] - Support `DESTROY_DISK` on preprovisioned CSI volumes.
* [MESOS-9545] - Marking an unreachable agent as gone should transition the tasks to terminal state.
* [MESOS-9675] - Docker Manifest V2 Schema2 Support.
* [MESOS-9704] - Support docker manifest v2s2 config GC.
* [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
* [MESOS-9948] - master::Slave::hasExecutor occupies 37% of a 150 second perf sample.
* [MESOS-10017] - Log all reverse DNS lookup failures in 'legacy' TLS (SSL) hostname validation scheme.
Release Notes - Mesos - Version 1.7.2
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-8887] - Unreachable tasks are not GC'ed when unreachable agent is GC'ed.
* [MESOS-9210] - Mesos v1 scheduler library does not properly handle SUBSCRIBE retries.
* [MESOS-9517] - SLRP should treat gRPC timeouts as non-terminal errors, instead of reporting OPERATION_FAILED.
* [MESOS-9531] - chown error handling is incorrect in createSandboxDirectory.
* [MESOS-9532] - ResourceOffersTest.ResourceOfferWithMultipleSlaves is flaky.
* [MESOS-9533] - CniIsolatorTest.ROOT_CleanupAfterReboot is flaky.
* [MESOS-9537] - SLRP sends inconsistent status updates for dropped operations.
* [MESOS-9544] - SLRP does not clean up destroyed persistent volumes.
* [MESOS-9554] - Allocator might skip allocations because a single framework is incapable of receiving certain resources.
* [MESOS-9555] - Allocator CHECK failure: reservationScalarQuantities.contains(role).
** Improvement
* [MESOS-9340] - Log all socket errors in libprocess.
Release Notes - Mesos - Version 1.7.1
-------------------------------------
* This is a bug fix release. It also includes performance and API
improvements:
* **Allocator**: Improved allocation cycle time substantially
(see MESOS-9239 and MESOS-9249). These reduce the allocation
cycle time in some benchmarks by 80%.
* **Scheduler API**: Improved the experimental `CREATE_DISK` and
`DESTROY_DISK` operations for CSI volume recovery (see MESOS-9275
and MESOS-9321). Storage local resource providers now return disk
resources with the `source.vendor` field set, so frameworks need to
upgrade the `Resource` protobuf definitions.
* **Scheduler API**: Offer operation feedback now includes agent
IDs and resource provider IDs (see MESOS-9293).
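As a rough reading of the allocator figure above, a percentage reduction in cycle time can be converted into a speedup factor. The sketch below is plain arithmetic, not Mesos code; the function name `speedup_from_time_reduction` is hypothetical, and it assumes the benchmark workload is otherwise unchanged.

```python
def speedup_from_time_reduction(reduction):
    """Convert a fractional reduction in cycle time into a speedup factor.

    A cycle that used to take T now takes T * (1 - reduction), so the same
    work completes 1 / (1 - reduction) times faster.
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


# An 80% reduction in cycle time means cycles run roughly five times faster.
print(round(speedup_from_time_reduction(0.80), 2))
```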
** Bug
* [MESOS-7042] - Send SIGKILL after SIGTERM to IOSwitchboard after container termination.
* [MESOS-7474] - Mesos fetcher cache doesn't retry when missed.
* [MESOS-8545] - AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky.
* [MESOS-8907] - Docker image fetcher fails with HTTP/2.
* [MESOS-8978] - Command executor calling setsid breaks the tty support.
* [MESOS-9131] - Health checks launching nested containers while a container is being destroyed lead to unkillable tasks.
* [MESOS-9152] - Close all file descriptors except whitelist_fds in posix/subprocess.
* [MESOS-9154] - MasterTest.TaskStateMetrics is flaky
* [MESOS-9164] - Subprocess should unset CLOEXEC on whitelisted file descriptors.
* [MESOS-9228] - SLRP does not clean up plugin containers after it is removed.
* [MESOS-9231] - `docker inspect` may return an unexpected result to Docker executor due to a race condition.
* [MESOS-9266] - Whenever our packaging tasks trigger errors we run into permission problems.
* [MESOS-9267] - Mesos agent crashes when CNI network is not configured but used.
* [MESOS-9274] - v1 JAVA scheduler library can drop TEARDOWN upon destruction.
* [MESOS-9279] - Docker Containerizer 'usage' call might be expensive if mount table is big.
* [MESOS-9281] - SLRP gets a stale checkpoint after system crash.
* [MESOS-9283] - Docker containerizer actor can get backlogged with large number of containers.
* [MESOS-9293] - If a framework loses operation information it cannot reconcile to acknowledge updates.
* [MESOS-9295] - Nested container launch could fail if the agent is upgraded with new cgroup subsystems.
* [MESOS-9308] - URI disk profile adaptor could deadlock.
* [MESOS-9317] - Some master endpoints do not handle failed authorization properly.
* [MESOS-9324] - Resource fragmentation: frameworks may be starved of port resources in the presence of a large number of frameworks with quota.
* [MESOS-9332] - Nested container should run as the same user of its parent container by default.
* [MESOS-9334] - Container stuck at ISOLATING state due to libevent poll never returning.
* [MESOS-9362] - Test `CgroupsIsolatorTest.ROOT_CGROUPS_CreateRecursively` is flaky.
* [MESOS-9411] - Validation of JWT tokens using HS256 hashing algorithm is not thread safe.
* [MESOS-9418] - Add support for the `Discard` blkio operation type.
* [MESOS-9419] - Executor to framework message crashes master if framework has not re-registered.
* [MESOS-9474] - Master does not respect authorization result for `CREATE_DISK` and `DESTROY_DISK`.
* [MESOS-9479] - SLRP does not set RP ID in produced OperationStatus.
* [MESOS-9480] - Master may skip processing authorization results for `LAUNCH_GROUP`.
* [MESOS-9492] - Persist CNI working directory across reboot.
* [MESOS-9501] - Mesos executor fails to terminate and gets stuck after agent host reboot.
* [MESOS-9502] - IOswitchboard cleanup could get stuck due to FD leak from a race.
* [MESOS-9505] - `make check` failed with linking errors when c-ares is installed.
* [MESOS-9508] - Official 1.7.0 tarball can't be built on Ubuntu 16.04 LTS.
* [MESOS-9518] - CNI_NETNS should not be set for orphan containers that do not have network namespace.
* [MESOS-9519] - Unable to build Mesos with CMake on Ubuntu 14.04.
** Improvement
* [MESOS-6765] - Make the Resources wrapper "copy-on-write" to improve performance.
* [MESOS-9239] - Improve sorting performance in the DRF sorter.
* [MESOS-9249] - Avoid dirtying the DRF sorter when allocating resources.
* [MESOS-9255] - Use consistent "totals" across role / framework DRF.
* [MESOS-9275] - Allow optional `profile` to be specified in `CREATE_DISK` offer operation.
* [MESOS-9305] - Create cgroup recursively to work around systemd deleting cgroups_root.
* [MESOS-9321] - Add an optional `vendor` field in `Resource.DiskInfo.Source`.
* [MESOS-9325] - Optimize `Resources::filter` operation.
* [MESOS-9486] - Set up `object.value` for `CREATE_DISK` and `DESTROY_DISK` authorizations.
* [MESOS-9510] - Disallowed nan, inf and so on in `Value::Scalar`.
* [MESOS-9516] - Extend `min_allocatable_resources` flag to cover non-scalar resources.
Release Notes - Mesos - Version 1.7.0
-------------------------------------
This release contains the following highlights:
* Performance Improvements:
* **Master `/state` endpoint:** Adopted RapidJSON and reduced
copying for a ~130% throughput improvement due to a ~55%
decrease in latency (MESOS-9092). Also, added parallel
processing of `/state` requests to reduce master backlogging
/ interference under high request load (MESOS-9122).
* **Allocator:** Improved allocator cycle time significantly
(MESOS-9087). This, together with the reduced master
backlogging from `/state` improvements, reduces the
end-to-end offer cycling time between Mesos and schedulers.
* **Agent `/containers` endpoint:** Fixed a performance issue
that caused high latency / CPU consumption when there are
many containers on the agent (MESOS-8418).
* **Agent container launching performance improvements**:
The expensive `cgroups::verify()` calls were removed which
provides a significant improvement to container launch /
destroy throughput (MESOS-9081).
* Containerization:
* [MESOS-8794] - **Experimental** Supported docker image tarball
fetching from HDFS through the `--docker_registry` agent flag.
* [MESOS-7691] - Added a new option `cgroups/all` to the agent
flag `--isolation`. This allows the cgroups isolator to
automatically load all the locally enabled cgroups subsystems.
If this option is specified in the agent flag `--isolation`
along with other cgroups related options
(e.g., `cgroups/cpu`), those options will simply be ignored.
* [MESOS-7947] - Added a new `--gc_non_executor_container_sandboxes`
option which tells the agent to garbage collect sandboxes created
via the LAUNCH_NESTED_CONTAINER API. The same flag will apply to
standalone container sandboxes in the future.
* [MESOS-8327] - Added container-specific cgroups mounts under
`/sys/fs/cgroup` to containers with an image that are launched by
the Mesos containerizer.
* [MESOS-5647] - Expose network statistics for containers on
CNI network in the `network/cni` isolator.
* [MESOS-8792] - Added a new `linux/devices` isolator that
automatically populates containers with devices that have
been whitelisted with the `--allowed_devices` agent flag.
* [MESOS-8340] - Added a new `--enforce_container_ports`
option to toggle ports resource enforcement by the
`network/ports` isolator.
* [MESOS-6451] - Add timer and percentile metrics for docker
pull latency distribution.
* Windows:
* [MESOS-8668] - Added support to libprocess for the Windows
Thread Pool API, replacing libevent with the native Windows
event and thread pool library. This can be enabled with
`-DENABLE_LIBWINIO=ON` during CMake configuration. By
utilizing I/O Completion Ports, this enables non-blocking
asynchronous I/O on Windows for sockets, pipes, and files.
* Multi-Framework Workloads:
* [MESOS-8842] - **Experimental** Added per-framework metrics
to the master. These new metrics provide detailed information
about the behavior of each framework and can help with
scalability testing, debugging, and fine grained monitoring.
Please refer to docs/monitoring.md for more details.
* [MESOS-8238] - Documentation was added in the framework
development guide to provide recommendations on how schedulers
can behave co-operatively in a multi-framework setting, as
well as how to operationally configure Mesos in such a setting.
* [MESOS-8936] - A new weighted random sorter was added as an
alternative to the existing DRF sorter. This allows users
that don't need DRF behavior to opt out.
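To illustrate the general idea behind a weighted random sorter such as the one mentioned in the last item above, here is a minimal sketch of a weighted random ordering. It is not the Mesos implementation: the function name `weighted_random_order`, the role names, and the weights are all hypothetical, and the draw-without-replacement scheme is an assumption about how such an ordering could be produced.

```python
import random


def weighted_random_order(weights, rng=random):
    """Return client names in a random order biased by weight.

    `weights` maps a (hypothetical) client name to a positive weight.
    Clients are drawn one at a time without replacement; each draw picks
    one of the remaining clients with probability proportional to its weight.
    """
    remaining = dict(weights)
    order = []
    while remaining:
        names = list(remaining)
        pick = rng.choices(names, weights=[remaining[n] for n in names])[0]
        order.append(pick)
        del remaining[pick]
    return order


# Hypothetical roles; "web" is twice as likely as the others to come first.
roles = {"analytics": 1.0, "web": 2.0, "batch": 1.0}
print(weighted_random_order(roles, random.Random(42)))
```

Unlike a fair-share ordering, this ordering does not depend on how many resources each client already holds; only the configured weights influence it.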
Additional API Changes:
* [MESOS-9066] - Introduced `CREATE_DISK` and `DESTROY_DISK` offer
operations to replace `CREATE_VOLUME`, `CREATE_BLOCK`,
`DESTROY_VOLUME` and `DESTROY_BLOCK`.
* Container logger module interface has been changed. The `prepare()` method
now takes `ContainerID` and `ContainerConfig` instead.
* `Isolator::recover` interface has been changed to take an `std::vector`
instead of `std::list`.
* JSON endpoints now use rapidjson to provide a performance improvement.
This means that a client whose JSON deserializer does not conform to
the ECMA-404 spec for JSON may break. As an example,
Mesos would previously serialize '/' as '\/', but the spec does not
require the escaping and rapidjson does not escape '/'.
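A client can check whether its deserializer is affected by the '/' escaping change described above by confirming that it decodes both spellings to the same string. The sketch below uses Python's standard `json` module purely as an example of a conforming parser; the sample path is made up for illustration.

```python
import json

# Both spellings are valid JSON for the same one-character string "/".
escaped = '"\\/"'    # the JSON text "\/" (escaped solidus)
unescaped = '"/"'    # the JSON text "/"  (no escaping)

# A conforming parser accepts both and yields identical values.
assert json.loads(escaped) == json.loads(unescaped) == "/"

# Python's serializer also leaves '/' unescaped when writing JSON.
print(json.dumps("/mesos/slave"))
```

A client that compares raw response bytes, or that parses responses with ad hoc string handling instead of a JSON parser, is the kind most likely to notice the change.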
Changes to Dependencies:
* [MESOS-8395] - Made gRPC a requirement for Mesos builds. The `--enable-grpc`
Autotools option and the `-DENABLE_GRPC=ON` CMake option are now removed.
* [MESOS-8064] - Mesos now requires libarchive to programmatically decode
.zip, .tar, .gzip, and other common file compression schemes. Version 3.3.2
is bundled in Mesos.
* [MESOS-9092] - Adopt rapidjson for improved json serialization performance.
Version 1.1.0 is bundled in Mesos.
Unresolved Critical Issues:
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode()
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-5995] - Protobuf JSON deserialisation does not accept numbers formatted as strings
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-7076] - libprocess tests fail when using libevent 2.1.8
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die/killed
* [MESOS-7566] - Master crash due to failed check in DRFSorter::remove
* [MESOS-7622] - Agent can crash if a HTTP executor tries to retry subscription in running state.
* [MESOS-7721] - Master's agent removal rate limit also applies to agent unreachability.
* [MESOS-7748] - Slow subscribers of streaming APIs can lead to Mesos OOMing.
* [MESOS-7911] - Non-checkpointing framework's tasks should not be marked LOST when agent disconnects.
* [MESOS-7991] - fatal, check failed !framework->recovered()
* [MESOS-8038] - Launching GPU task sporadically fails.
* [MESOS-8137] - Mesos agent can hang during startup.
* [MESOS-8256] - Libprocess can silently deadlock due to worker thread exhaustion.
* [MESOS-8257] - Unified Containerizer "leaks" a target container mount path to the host FS when the target resolves to an absolute path
* [MESOS-8522] - `prepareMounts` in Mesos containerizer is flaky.
* [MESOS-8623] - Crashed framework brings down the whole Mesos cluster
* [MESOS-8679] - If the first KILL is stuck in the default executor, all other KILLs will be ignored.
* [MESOS-8703] - Mesos master can't reconnect to zookeeper
* [MESOS-8731] - mesos master APIs become latent
* [MESOS-8769] - Agent crashes when CNI config not defined
* [MESOS-8803] - Libprocess deadlocks in a test.
* [MESOS-8840] - `cpu.cfs_quota_us` may be accidentally set for command task using docker during agent recovery.
* [MESOS-8927] - Default executor cannot kill tasks if `LAUNCH_NESTED_CONTAINER` is stuck.
* [MESOS-9006] - The agent's GET_AGENT leaks resource information when using authorization
* [MESOS-9022] - Race condition in task updates could cause missing event in streaming
* [MESOS-9049] - Agent GC could unmount a dangling persistent volume multiple times.
* [MESOS-9053] - Network ports isolator can falsely trigger while destroying containers.
* [MESOS-9109] - Windows agent uses reserved character :(colon) for file name and crashes when attempting to remove link
* [MESOS-9131] - Health checks launching nested containers while a container is being destroyed lead to unkillable tasks
* [MESOS-9157] - cannot pull docker image from dockerhub
* [MESOS-9169] - docker image fetching fails
All Resolved Issues:
** Bug
* [MESOS-2199] - Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
* [MESOS-3202] - Avoid role/framework offer starvation in DRF allocator.
* [MESOS-3475] - TestContainerizer should not modify global environment variables.
* [MESOS-3790] - ZooKeeper connection should retry on EAI_NONAME
* [MESOS-5371] - Implement `fcntl.hpp`
* [MESOS-5904] - Process routes implementation seems to drop routes on Windows.
* [MESOS-6092] - Docker containerizer launch command may access a "Container" struct after it has been destroyed
* [MESOS-6622] - NvidiaGpuTest.ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage is flaky
* [MESOS-6823] - bool/UserContainerLoggerTest.ROOT_LOGROTATE_RotateWithSwitchUserTrueOrFalse/0 is flaky
* [MESOS-6985] - os::getenv() can segfault
* [MESOS-7032] - Mesos fails NvidiaGpuTest.ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage
* [MESOS-7168] - Agent should validate that the nested container ID does not exceed certain length.
* [MESOS-7220] - 'EXPECT_SOME' and other asserts don't work with 'Try's that have a custom error state.
* [MESOS-7342] - Port Docker tests
* [MESOS-7397] - apply-reviews.py silently fails when using chain mode.
* [MESOS-7658] - apply-reviews.py fails with Unicode characters
* [MESOS-7966] - check for maintenance on agent causes fatal error
* [MESOS-8128] - Make os::pipe file descriptors O_CLOEXEC.
* [MESOS-8134] - SlaveTest.ContainersEndpoint is flaky due to getenv crash.
* [MESOS-8429] - Clean up endpoint socket if the container daemon is destroyed while waiting.
* [MESOS-8499] - Change docker health check image to the new nanoserver one
* [MESOS-8567] - Test UriDiskProfileTest.FetchFromHTTP is flaky.
* [MESOS-8568] - Command checks should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`
* [MESOS-8613] - Test `MasterAllocatorTest/*.TaskFinished` is flaky.
* [MESOS-8626] - The 'allocatable' check in the allocator is problematic with multi-role frameworks
* [MESOS-8686] - Mesos build failed with /permissive- + MSVC on windows
* [MESOS-8687] - Check failure in `ProcessBase::_consume()`.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data
* [MESOS-8838] - Consider validating that resubscribing resource providers do not change their name or type
* [MESOS-8857] - Fix subprocess(flags) logic on Windows to handle arguments with quotes
* [MESOS-8871] - Agent may fail to recover if the agent dies before image store cache checkpointed.
* [MESOS-8873] - StorageLocalResourceProviderTest.ROOT_ZeroSizedDisk is flaky.
* [MESOS-8875] - `leveldb::PosixEnv::DeleteFile()` can segfault.
* [MESOS-8884] - Flaky `DockerContainerizerTest.ROOT_DOCKER_MaxCompletionTime`.
* [MESOS-8892] - MasterSlaveReconciliationTest.ReconcileDroppedOperation is flaky
* [MESOS-8897] - ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky
* [MESOS-8906] - `UriDiskProfileAdaptor` fails to update profile selectors.
* [MESOS-8913] - Resource provider manager registry leaks file descriptors into executors.
* [MESOS-8917] - Agent leaking file descriptors into forked processes
* [MESOS-8921] - Autotools don't work with newer OpenJDK versions
* [MESOS-8932] - Quota guarantee metric does not handle removal correctly.
* [MESOS-8935] - Quota limit "chopping" can lead to cpu-only and memory-only offers.
* [MESOS-8945] - Master check failure due to CHECK_SOME(providerId).
* [MESOS-8952] - process::await/collect n^2 performance issue
* [MESOS-8954] - python3/post-reviews.py errors due to TypeError.
* [MESOS-8958] - LinuxDevicesIsolatorTest.ROOT_PopulateWhitelistedDevices fails on some boxes.
* [MESOS-8963] - Executor crash trying to print container ID.
* [MESOS-8970] - Tests relying on metrics segfault on some Linux distros.
* [MESOS-8977] - BuildBot uses Docker with AUFS that has a max file length limit of 242 characters
* [MESOS-8979] - python3/push-commits.py fails due to TypeError
* [MESOS-8980] - mesos-slave can deadlock with docker pull
* [MESOS-8985] - Posting to the operator api with 'accept recordio' header can crash the agent
* [MESOS-8987] - Master asks agent to shutdown upon auth errors.
* [MESOS-9000] - Operator API event stream can miss task status updates.
* [MESOS-9007] - XFS disk isolator doesn't clean up project ID from symlinks
* [MESOS-9008] - Fetcher fails to extract some archives containing hardlinks
* [MESOS-9010] - `UPDATE_STATE` can race with `UPDATE_OPERATION_STATUS` for a resource provider.
* [MESOS-9014] - MasterAPITest.SubscribersReceiveHealthUpdates is flaky
* [MESOS-9025] - The container which joins CNI network and has checkpoint enabled will be mistakenly destroyed by agent
* [MESOS-9027] - GPU Isolator still depends on cgroups/devices agent flag given cgroups/all is supported.
* [MESOS-9037] - DefaultExecutorTest.SigkillExecutor is flaky
* [MESOS-9038] - Archiver utility extracts links within subdirectories incorrectly
* [MESOS-9039] - CNI isolator recovery should wait until unknown orphan cleanup is done
* [MESOS-9051] - Move agent call validation into common validation library.
* [MESOS-9065] - Apply the `override` keyword globally.
* [MESOS-9073] - Tox doesn't run in the support virtualenv when using Python 3 mesos-style.py
* [MESOS-9075] - Virtualenv management in support directory is buggy.
* [MESOS-9094] - On macOS libprocess_tests fail to link when compiling with gRPC
* [MESOS-9114] - cmake build is broken on macos
* [MESOS-9115] - Stout depends on missing rapidjson headers.
* [MESOS-9116] - Launch nested container session fails due to incorrect detection of `mnt` namespace of command executor's task.
* [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable"
* [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
* [MESOS-9137] - GRPC build fails to pass compiler flags
* [MESOS-9142] - CNI detach might fail due to missing network config file.
* [MESOS-9144] - Master authentication handling leads to request amplification.
* [MESOS-9145] - Master has a fragile burned-in 5s authentication timeout.
* [MESOS-9146] - Agent has a fragile burned-in 5s authentication timeout.
* [MESOS-9147] - Agent and scheduler driver authentication retry backoff time could overflow.
* [MESOS-9149] - Failed to build gRPC on Linux without OpenSSL.
* [MESOS-9151] - Container stuck at ISOLATING due to FD leak
* [MESOS-9156] - StorageLocalResourceProviderProcess can deadlock
* [MESOS-9160] - Failed to compile gRPC when the build path contains symlinks.
* [MESOS-9163] - `UriDiskProfileAdaptor` should not update profiles when a poll returns a non-OK HTTP status.
* [MESOS-9170] - Zookeeper doesn't compile with newer gcc due to format error
* [MESOS-9171] - Mesos agent crashes in CNI isolator when usage is queried
* [MESOS-9177] - Mesos master segfaults when responding to /state requests.
* [MESOS-9185] - An attempt to remove or destroy container in composing containerizer leads to segfault.
* [MESOS-9193] - Mesos build fails with Clang 3.5.
* [MESOS-9196] - Removing rootfs mounts may fail with EBUSY.
** Epic
* [MESOS-8564] - Port libprocess-tests suites to Windows
* [MESOS-8668] - Transition libprocess on Windows to use the Thread Pool API
* [MESOS-8705] - Composing containerizer improvements
* [MESOS-8842] - Per Framework Metrics on Master
* [MESOS-8916] - Allocation logic cleanup.
* [MESOS-9013] - Support container Cgroup FS mount.
** Improvement
* [MESOS-6451] - Add timer and percentile for docker pull latency distribution.
* [MESOS-7691] - Support local enabled cgroups subsystems automatically.
* [MESOS-7947] - Add GC capability to nested containers
* [MESOS-8064] - Add capability so mesos can programmatically decode .zip, .tar, .gzip, and other common file compression schemes
* [MESOS-8106] - Docker fetcher plugin unsupported scheme failure message is not accurate.
* [MESOS-8340] - Add a no-enforce option to the `network/ports` isolator.
* [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads
* [MESOS-8680] - Rename variable names in slave.hpp to be more explicit.
* [MESOS-8788] - Add alg RS256 support for JWT generator and validator in libprocess
* [MESOS-8792] - Automatically create whitelisted devices.
* [MESOS-8798] - Build the "unsecure" gRPC libraries to remove SSL dependency.
* [MESOS-8829] - Get rid of extra `containerizer->wait()` calls in tests.
* [MESOS-8908] - Add -fno-omit-frame-pointer to improve debugging and profiling.
* [MESOS-8911] - Add framework metrics benchmark test.
* [MESOS-8919] - Per Framework SUBSCRIBE metrics.
* [MESOS-8920] - Support per-container container logger configuration.
* [MESOS-8924] - Refactor the libprocess gRPC wrapper.
* [MESOS-8955] - Manage Python2 and 3 in build steps
* [MESOS-8986] - `slave.available()` in the allocator is expensive and drags down allocation performance.
* [MESOS-8989] - Add a better benchmark for range type resources.
* [MESOS-8998] - Allow for unbundled libevent in CMake builds to work around 2.1.x SSL issues.
* [MESOS-9015] - Allow resources to be removed when updating the sorter.
* [MESOS-9055] - Make gRPC call deadline configurable.
* [MESOS-9067] - Improve performance of json parsing by avoiding conversion cost.
* [MESOS-9081] - cgroups::verify is expensive and is done implicitly during cgroups operations.
* [MESOS-9086] - Optimize range subtraction operation.
* [MESOS-9092] - Adopt rapidjson for improved json serialization performance.
* [MESOS-9104] - Refactor capability related logic in the allocator.
* [MESOS-9110] - Add move support to the Resources / Resource_ wrappers.
* [MESOS-9122] - Batch '/state' requests in the Master actor.
* [MESOS-9129] - Port mapper CNI plugin should use '-n' option with 'iptables --list'
* [MESOS-9213] - Avoid double copying of master->framework messages when incrementing metrics.
** Task
* [MESOS-2633] - Move implementations of Framework struct functions out of master.hpp.
* [MESOS-3442] - Port path_tests to Windows
* [MESOS-3444] - Port sendfile_tests
* [MESOS-5647] - Expose network statistics for containers on CNI network in the `network/cni` isolator.
* [MESOS-5814] - Port libprocess http_tests.cpp
* [MESOS-5817] - Port libprocess process_tests.cpp
* [MESOS-5941] - RemoteLink tests fail on Windows
* [MESOS-7329] - Authorize offer operations for converting disk resources
* [MESOS-7527] - Enable ProcessTest.THREADSAFE_Http2 on Windows.
* [MESOS-8314] - Add authorization to display of resource provider information in API calls and endpoints
* [MESOS-8327] - Add container-specific CGroup FS mounts under /sys/fs/cgroup/* to Mesos containers
* [MESOS-8383] - Add metrics for operations in Storage Local Resource Provider (SLRP).
* [MESOS-8395] - Made gRPC a requirement for Mesos builds.
* [MESOS-8473] - Authorize `GET_OPERATIONS` calls.
* [MESOS-8670] - Implement `process::io::read/write` using Thread Pool API
* [MESOS-8671] - Add EventLoop implementation using Thread Pool API
* [MESOS-8672] - Replace libprocess `PollSocketImpl` with IOCP and Thread Pool API
* [MESOS-8674] - Fix os::pipe to work in overlapped mode
* [MESOS-8681] - Clean up os::sendfile on Windows
* [MESOS-8712] - Remove `destroyed` promise from `Container` struct
* [MESOS-8713] - Synchronize result of `wait` and `destroy` composing c'zer methods
* [MESOS-8714] - Cleanup `containers_` hashmap once container exits
* [MESOS-8732] - Use composing containerizer in some agent tests.
* [MESOS-8734] - Restore `WaitAfterDestroy` test to check termination status of a terminated nested container.
* [MESOS-8736] - Implement a test which ensures that `wait` and `destroy` return the same result for a terminated nested container.
* [MESOS-8737] - Update composing containerizer tests.
* [MESOS-8774] - Authenticate and authorize calls to the resource provider manager's API
* [MESOS-8794] - Support docker image tarball hdfs based fetching.
* [MESOS-8814] - Mount the volume based on `Volume.mode`.
* [MESOS-8825] - Remove storage pools associated with missing profiles.
* [MESOS-8837] - Add test of resource provider manager recovery
* [MESOS-8843] - Per Framework CALL metrics
* [MESOS-8844] - Per Framework EVENT metrics
* [MESOS-8845] - Per Framework Operation metrics
* [MESOS-8846] - Per Framework state metrics
* [MESOS-8847] - Per Framework task state metrics
* [MESOS-8848] - Per Framework Offer metrics
* [MESOS-8849] - Per Framework resource allocation metrics
* [MESOS-8903] - Update the Python CLI to use Python 3
* [MESOS-8912] - Per Framework terminal task state metrics
* [MESOS-8931] - Add os::shell back to Windows
* [MESOS-8934] - Update python.m4 to support Python 3
* [MESOS-8936] - Implement a Random Sorter for offer allocations.
* [MESOS-8940] - Per Framework Offer metrics with a specific resource type
* [MESOS-8942] - Master streaming API does not send (health) check updates for tasks.
* [MESOS-8943] - Add metrics about CSI calls.
* [MESOS-8961] - Output of tasks gets corrupted if task defines the same environment variables as the executor container
* [MESOS-8990] - Build failure of the google-test dependency on Windows using MSVC.
* [MESOS-8995] - Add SLRP unit tests for missing profiles.
* [MESOS-8997] - Consider dropping PATH disk support for CSI volumes.
* [MESOS-9002] - GCC 8.1 build failure in os::Fork::Tree.
* [MESOS-9043] - Move check validators to the common validation library.
* [MESOS-9066] - Changing `CREATE_VOLUME` and `CREATE_BLOCK` to `CREATE_DISK`.
* [MESOS-9068] - Add a metrics benchmark in libprocess.
* [MESOS-9070] - Support systemd and freezer cgroup subsystems bind mount for container with rootfs.
* [MESOS-9148] - Make cgroups destroy timeout configurable for Mesos containerizer
** Documentation
* [MESOS-8740] - Update description of a Containerizer interface.
* [MESOS-9020] - Seccomp design doc
Release Notes - Mesos - Version 1.6.3 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-9124] - Agent reconfiguration can cause master to unsuppress on scheduler's behalf.
* [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
* [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
* [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
* [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers.
* [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
* [MESOS-9692] - Quota may be under allocated for disk resources.
* [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
* [MESOS-9707] - Calling link::lo() may cause runtime error
* [MESOS-9766] - /__processes__ endpoint can hang.
* [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
* [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
* [MESOS-9836] - Docker containerizer overwrites `/mesos/slave` cgroups.
* [MESOS-9852] - Slow memory growth in master due to deferred deletion of offer filters and timers.
* [MESOS-9856] - REVIVE call with specified role(s) clears filters for all roles of a framework.
* [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct.
* [MESOS-9870] - Simultaneous adding/removal of a role from framework's roles and its suppressed roles crashes the master.
* [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor.
* [MESOS-9889] - Master CPU high due to unexpected foreachkey behaviour in Master::__reregisterSlave.
* [MESOS-9893] - `volume/secret` isolator should clean up the stored secret from runtime directory when the container is destroyed.
* [MESOS-10007] - Command executor can miss exit status for short-lived commands due to double-reaping.
** Improvement
* [MESOS-8880] - Add minimum capabilities in the master.
* [MESOS-9159] - Support Foreign URLs in docker registry puller.
* [MESOS-9675] - Docker Manifest V2 Schema2 Support.
* [MESOS-9704] - Support docker manifest v2s2 config GC.
* [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
* [MESOS-9948] - master::Slave::hasExecutor occupies 37% of a 150 second perf sample.
* [MESOS-10017] - Log all reverse DNS lookup failures in 'legacy' TLS (SSL) hostname validation scheme.
Release Notes - Mesos - Version 1.6.2
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-7042] - Send SIGKILL after SIGTERM to IOSwitchboard after container termination.
* [MESOS-7474] - Mesos fetcher cache doesn't retry when missed.
* [MESOS-8128] - Make os::pipe file descriptors O_CLOEXEC.
* [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads.
* [MESOS-8545] - AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky.
* [MESOS-8568] - Command checks should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`
* [MESOS-8887] - Unreachable tasks are not GC'ed when unreachable agent is GC'ed.
* [MESOS-8907] - Docker image fetcher fails with HTTP/2.
* [MESOS-8917] - Agent leaking file descriptors into forked processes
* [MESOS-8921] - Autotools don't work with newer OpenJDK versions
* [MESOS-8978] - Command executor calling setsid breaks the tty support.
* [MESOS-9116] - Launch nested container session fails due to incorrect detection of `mnt` namespace of command executor's task.
* [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable"
* [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
* [MESOS-9131] - Health checks launching nested containers while a container is being destroyed lead to unkillable tasks.
* [MESOS-9142] - CNI detach might fail due to missing network config file.
* [MESOS-9144] - Master authentication handling leads to request amplification.
* [MESOS-9145] - Master has a fragile burned-in 5s authentication timeout.
* [MESOS-9146] - Agent has a fragile burned-in 5s authentication timeout.
* [MESOS-9147] - Agent and scheduler driver authentication retry backoff time could overflow.
* [MESOS-9151] - Container stuck at ISOLATING due to FD leak.
* [MESOS-9152] - Close all file descriptors except whitelist_fds in posix/subprocess.
* [MESOS-9164] - Subprocess should unset CLOEXEC on whitelisted file descriptors.
* [MESOS-9170] - Zookeeper doesn't compile with newer gcc due to format error.
* [MESOS-9196] - Removing rootfs mounts may fail with EBUSY.
* [MESOS-9210] - Mesos v1 scheduler library does not properly handle SUBSCRIBE retries.
* [MESOS-9231] - `docker inspect` may return an unexpected result to Docker executor due to a race condition.
* [MESOS-9267] - Mesos agent crashes when CNI network is not configured but used.
* [MESOS-9274] - v1 JAVA scheduler library can drop TEARDOWN upon destruction.
* [MESOS-9279] - Docker Containerizer 'usage' call might be expensive if mount table is big.
* [MESOS-9283] - Docker containerizer actor can get backlogged with large number of containers.
* [MESOS-9308] - URI disk profile adaptor could deadlock.
* [MESOS-9317] - Some master endpoints do not handle failed authorization properly.
* [MESOS-9324] - Resource fragmentation: frameworks may be starved of port resources in the presence of a large number of frameworks with quota.
* [MESOS-9332] - Nested container should run as the same user as its parent container by default.
* [MESOS-9334] - Container stuck at ISOLATING state due to libevent poll never returning.
* [MESOS-9411] - Validation of JWT tokens using HS256 hashing algorithm is not thread safe.
* [MESOS-9418] - Add support for the `Discard` blkio operation type.
* [MESOS-9419] - Executor to framework message crashes master if framework has not re-registered.
* [MESOS-9480] - Master may skip processing authorization results for `LAUNCH_GROUP`.
* [MESOS-9492] - Persist CNI working directory across reboot.
* [MESOS-9501] - Mesos executor fails to terminate and gets stuck after agent host reboot.
* [MESOS-9502] - IOswitchboard cleanup could get stuck due to FD leak from a race.
* [MESOS-9518] - CNI_NETNS should not be set for orphan containers that do not have network namespace.
* [MESOS-9531] - chown error handling is incorrect in createSandboxDirectory.
* [MESOS-9532] - ResourceOffersTest.ResourceOfferWithMultipleSlaves is flaky.
* [MESOS-9533] - CniIsolatorTest.ROOT_CleanupAfterReboot is flaky.
* [MESOS-9555] - Allocator CHECK failure: reservationScalarQuantities.contains(role).
** Improvement
* [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads
* [MESOS-9305] - Create cgroup recursively to work around systemd deleting cgroups_root.
* [MESOS-9340] - Log all socket errors in libprocess.
* [MESOS-9501] - Mesos executor fails to terminate and gets stuck after agent host reboot.
* [MESOS-9502] - IOswitchboard cleanup could get stuck due to FD leak from a race.
* [MESOS-9510] - Disallowed nan, inf and so on in `Value::Scalar`.
* [MESOS-9516] - Extend `min_allocatable_resources` flag to cover non-scalar resources.
Release Notes - Mesos - Version 1.6.1
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-3790] - ZooKeeper connection should retry on `EAI_NONAME`.
* [MESOS-8106] - Docker fetcher plugin unsupported scheme failure message is not accurate.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data
* [MESOS-8871] - Agent may fail to recover if the agent dies before image store cache checkpointed.
* [MESOS-8904] - Master crash when removing quota.
* [MESOS-8906] - `UriDiskProfileAdaptor` fails to update profile selectors.
* [MESOS-8935] - Quota limit "chopping" can lead to cpu-only and memory-only offers.
* [MESOS-8936] - Implement a Random Sorter for offer allocations.
* [MESOS-8942] - Master streaming API does not send (health) check updates for tasks.
* [MESOS-8945] - Master check failure due to CHECK_SOME(providerId).
* [MESOS-8947] - Improve the container preparing logging in IOSwitchboard and volume/secret isolator.
* [MESOS-8952] - process::await/collect n^2 performance issue.
* [MESOS-8963] - Executor crash trying to print container ID.
* [MESOS-8980] - mesos-slave can deadlock with docker pull.
* [MESOS-8986] - `slave.available()` in the allocator is expensive and drags down allocation performance.
* [MESOS-8987] - Master asks agent to shutdown upon auth errors.
* [MESOS-9002] - GCC 8.1 build failure in os::Fork::Tree.
* [MESOS-9024] - Mesos master segfaults with stack overflow under load.
* [MESOS-9025] - The container which joins CNI network and has checkpoint enabled will be mistakenly destroyed by agent.
* [MESOS-9049] - Agent GC could unmount a dangling persistent volume multiple times.
** Improvement
* [MESOS-8934] - Update python.m4 to support Python 3.
Release Notes - Mesos - Version 1.6.0
-------------------------------------
This release contains the following new features:
* [MESOS-4965] - **Experimental** Persistent volumes can now be resized
through new offer operations and the V1 operator API.
* [MESOS-6575] - Added a new `--xfs_kill_containers` flag to the
Mesos agent. This causes the `disk/xfs` isolator to terminate
containers that exceed their disk quota.
* [MESOS-7944] - **Experimental** Added a new `MemoryProfiler` class to
libprocess to aid in debugging memory issues.
* [MESOS-8054] - **Experimental** Schedulers can now receive feedback about
offer operations which operate on resources managed by resource providers.
In the future, this feature will be extended to operations on agent default
resources.
* [MESOS-8534] - **Experimental** A nested container is now allowed
to join a different CNI network than its parent container.
* [MESOS-8572] - Improvements to the Docker containerizer and executor
to more gracefully handle situations in which the Docker CLI is
unresponsive.
* [MESOS-8607] - The `mesos-execute` tool has been ported to Windows.
* [MESOS-8649] - **Experimental** Support for Container Storage Interface
(CSI) version 0.2 in Mesos.
* [MESOS-8659] - The Windows build now links the C runtime library
dynamically instead of statically. This requires the Visual Studio
redistributable to be available at runtime.
* [MESOS-8682] - The use of the C runtime library's POSIX wrappers on
Windows has been deprecated in favor of the native Windows APIs.
* [MESOS-8725] - Added a new `max_completion_time` field to `TaskInfo`.
Tasks which do not complete by the end of the specified duration will
fail with a new reason `REASON_MAX_COMPLETION_TIME_REACHED`.
* [MESOS-8801] - **Experimental** On Linux, Mesos can now be
configured to use the jemalloc allocator by default via the
`--enable-jemalloc-allocator` configuration option.
* Agents now support the `--fetcher_stall_timeout` flag which allows container
image and artifact fetchers to abort after the timeout when downloads stall.
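As an illustration of the `max_completion_time` field (MESOS-8725), the
following Python sketch builds the JSON form of a v1 `TaskInfo`. It assumes
the field is a `DurationInfo` message carrying an integer `nanoseconds`
value. The helper `task_with_deadline`, the task name, and the IDs are
hypothetical, and resources are omitted for brevity.

```python
import json

NANOS_PER_SECOND = 1_000_000_000


def task_with_deadline(name, task_id, agent_id, command, max_seconds):
    """Return a dict shaped like a v1 TaskInfo with a completion deadline.

    Assumption: `max_completion_time` is a DurationInfo, i.e. an object
    with a single integer `nanoseconds` field.
    """
    return {
        "name": name,
        "task_id": {"value": task_id},
        "agent_id": {"value": agent_id},
        "command": {"value": command},
        # Deadline after which the task is expected to fail with
        # REASON_MAX_COMPLETION_TIME_REACHED.
        "max_completion_time": {
            "nanoseconds": int(max_seconds * NANOS_PER_SECOND)
        },
    }


# Hypothetical task: the command sleeps longer than the 30 second limit.
task = task_with_deadline("sleep-task", "task-1", "agent-1", "sleep 600", 30)
print(json.dumps(task, indent=2))
```

In this sketch the command outlives the 30 second limit, so per the note
above such a task would be expected to fail rather than run to completion.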
Deprecations/Removals:
* Support for CSI v0.1 is deprecated in favor of CSI v0.2.
Additional API Changes:
* [MESOS-8306] - Authorization of resource reservation has been updated
to allow the restriction of which agents can statically reserve
resources for which roles.
* [MESOS-8332] - Container sandbox permissions have been changed from
0755 to 0750.
* [MESOS-8388] - Local resource provider resources are now included in
the responses to the GET_AGENTS and GET_RESOURCE_PROVIDER calls.
* [MESOS-8534] - Nested containers within a task group can now specify
separate network namespaces.
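To make the sandbox permission change (MESOS-8332) concrete, the Python
sketch below renders both modes and checks whether users outside the owning
user and group keep any access. The helper names `describe` and
`others_can_access` are illustrative only and are not part of Mesos.

```python
import stat

OLD_MODE = 0o755  # sandbox mode before this release
NEW_MODE = 0o750  # sandbox mode from this release on


def describe(mode):
    """Render a directory mode the way `ls -l` would."""
    return stat.filemode(stat.S_IFDIR | mode)


def others_can_access(mode):
    """True if users outside the owner and group have any permission bit."""
    return bool(mode & stat.S_IRWXO)


for label, mode in (("old", OLD_MODE), ("new", NEW_MODE)):
    print(label, describe(mode), "others:", others_can_access(mode))
```

Tools that read sandboxes as an unrelated user may therefore need to run as
the sandbox owner or as a member of its group.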
Changes to Dependencies:
* Upgraded minimum required gRPC library to version 1.10+ for gRPC-enabled builds.
Unresolved Critical Issues:
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode()
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration
* [MESOS-3533] - Unable to find and run URIs files
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-5995] - Protobuf JSON deserialisation does not accept numbers formatted as strings
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die/killed
* [MESOS-7566] - Master crash due to failed check in DRFSorter::remove
* [MESOS-7622] - Agent can crash if a HTTP executor tries to retry subscription in running state.
* [MESOS-7721] - Master's agent removal rate limit also applies to agent unreachability.
* [MESOS-7748] - Slow subscribers of streaming APIs can lead to Mesos OOMing.
* [MESOS-7911] - Non-checkpointing framework's tasks should not be marked LOST when agent disconnects.
* [MESOS-7966] - check for maintenance on agent causes fatal error
* [MESOS-7991] - fatal, check failed !framework->recovered()
* [MESOS-8137] - Mesos agent can hang during startup.
* [MESOS-8256] - Libprocess can silently deadlock due to worker thread exhaustion.
* [MESOS-8257] - Unified Containerizer "leaks" a target container mount path to the host FS when the target resolves to an absolute path
* [MESOS-8522] - `prepareMounts` in Mesos containerizer is flaky.
* [MESOS-8623] - Crashed framework brings down the whole Mesos cluster
* [MESOS-8679] - If the first KILL is stuck in the default executor, all other KILLs will be ignored.
* [MESOS-8703] - Mesos master can't reconnect to zookeeper
* [MESOS-8731] - mesos master APIs become latent
* [MESOS-8769] - Agent crashes when CNI config not defined
* [MESOS-8803] - Libprocess deadlocks in a test.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data
* [MESOS-8840] - `cpu.cfs_quota_us` may be accidentally set for command task using docker during agent recovery.
Feature Graduations:
* [MESOS-4828] - XFS disk quota isolator.
* [MESOS-6906] - Introduce a general non-interpreting task check.
All Experimental Features:
* [MESOS-3094] - Mesos on Windows.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4355] - Implement isolator for Docker volume.
* [MESOS-4965] - Persistent volume resizing.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-5788] - Added JAVA API adapter for seamless transition to new scheduler API.
* [MESOS-5931] - Support auto backend in Mesos Containerizer.
* [MESOS-6014] - Added port mapping CNI plugin.
* [MESOS-7944] - Libprocess `MemoryProfiler`.
* [MESOS-8054] - Offer operation feedback.
* [MESOS-8534] - Separate CNI networks for nested containers.
* [MESOS-8649] - Support for Container Storage Interface version 0.2.
* [MESOS-8801] - Linux support for jemalloc.
All Resolved Issues:
** Bug
* [MESOS-1720] - Slave should send exited executor message when the executor is never launched.
* [MESOS-3915] - Upgrade vendored Boost
* [MESOS-4420] - Support read host physical link speed from virtio driver
* [MESOS-5333] - GET /master/maintenance/schedule/ produces 404.
* [MESOS-5820] - Port master to Windows
* [MESOS-5882] - `os::cloexec` does not exist on Windows
* [MESOS-5940] - `setPaths` doesn't work on Windows
* [MESOS-6555] - Namespace 'mnt' is not supported
* [MESOS-6713] - Port `slave_recovery_tests.cpp`
* [MESOS-6715] - Port `uri_fetcher_tests.cpp`
* [MESOS-6822] - CNI reports confusing error message for failed interface setup.
* [MESOS-6973] - Fix BOOST random generator initialization on Windows
* [MESOS-7028] - NetSocketTest.EOFBeforeRecv is flaky.
* [MESOS-7342] - Port Docker tests
* [MESOS-7506] - Multiple tests leave orphan containers.
* [MESOS-7604] - SlaveTest.ExecutorReregistrationTimeoutFlag aborts on Windows
* [MESOS-7699] - "stdlib.h: No such file or directory" when building with GCC 6 (Debian stable freshly released)
* [MESOS-7742] - Race conditions in IOSwitchboard: listening on unix socket and premature closing of the connection.
* [MESOS-7803] - fs::list drops path components on Windows
* [MESOS-7944] - Implement jemalloc memory profiling support for Mesos
* [MESOS-7979] - reviewboard's GUESS_FIELDS setting leads to redundant information in commit messages
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused
* [MESOS-8140] - Executors should clear their auth tokens
* [MESOS-8232] - SlaveTest.RegisteredAgentReregisterAfterFailover is flaky.
* [MESOS-8258] - Mesos.DockerContainerizerTest.ROOT_DOCKER_SlaveRecoveryTaskContainer is flaky.
* [MESOS-8305] - DefaultExecutorTest.ROOT_MultiTaskgroupSharePidNamespace is flaky.
* [MESOS-8308] - CommandExecutorCheckTest.CommandCheckTimeout is flaky on Windows
* [MESOS-8334] - PartitionedSlaveReregistrationMasterFailover is flaky.
* [MESOS-8336] - MasterTest.RegistryUpdateAfterReconfiguration is flaky
* [MESOS-8348] - Enable function sections in the build.
* [MESOS-8350] - Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration
* [MESOS-8404] - Improve image puller error messages.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8413] - Zookeeper configuration passwords are shown in clear text
* [MESOS-8416] - CHECK failure if trying to recover nested containers but the framework checkpointing is not enabled.
* [MESOS-8440] - `network/ports` isolator kills legitimate tasks on recovery.
* [MESOS-8444] - GC failure causes agent to fail to detach virtual paths for the executor's sandbox
* [MESOS-8446] - Agent fails to detach `virtualLatestPath` for the executor's sandbox during recovery
* [MESOS-8447] - Incomplete output of apply-reviews.py --dry-run
* [MESOS-8453] - ExecutorAuthorizationTest.RunTaskGroup segfaults.
* [MESOS-8463] - Test MasterAllocatorTest/1.SingleFramework is flaky
* [MESOS-8468] - `LAUNCH_GROUP` failure tears down the default executor.
* [MESOS-8474] - Test StorageLocalResourceProviderTest.ROOT_ConvertPreExistingVolume is flaky
* [MESOS-8477] - Make clean fails without Python artifacts.
* [MESOS-8480] - Mesos returns high resource usage when killing a Docker task.
* [MESOS-8482] - Signed/Unsigned comparisons in tests
* [MESOS-8483] - ExampleTests PythonFramework fails with sigabort.
* [MESOS-8484] - stout test NumifyTest.HexNumberTest fails.
* [MESOS-8485] - MasterTest.RegistryGcByCount is flaky
* [MESOS-8489] - LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags is flaky
* [MESOS-8490] - UpdateSlaveMessageWithPendingOffers is flaky.
* [MESOS-8497] - Docker parameter `name` does not work with Docker Containerizer.
* [MESOS-8508] - Missing map header when compiling against unbundled protobuf
* [MESOS-8510] - URI disk profile adaptor does not consider plugin type for a profile.
* [MESOS-8512] - Fetcher doesn't log its stdout/stderr properly to the log file
* [MESOS-8513] - Noisy "transport endpoint is not connected" logs on closing sockets.
* [MESOS-8519] - Fix recovery of job object isolated tasks
* [MESOS-8530] - Default executor tasks can get stuck in KILLING state
* [MESOS-8536] - Pending offer operations on resource provider resources not properly accounted for in allocator
* [MESOS-8545] - AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky.
* [MESOS-8546] - PythonFramework test fails with cache write failure.
* [MESOS-8548] - Test StorageLocalResourceProviderTest.ROOT_Metrics is flaky
* [MESOS-8550] - Bug in `Master::detected()` leads to coredump in `MasterZooKeeperTest.MasterInfoAddress`.
* [MESOS-8552] - CGROUPS_ROOT_PidNamespaceForward and CGROUPS_ROOT_PidNamespaceBackward tests fail
* [MESOS-8563] - Windows executors cannot re-register
* [MESOS-8565] - Persistent volumes are not visible in Mesos UI when launching a pod using default executor.
* [MESOS-8577] - Destroy nested container if `LAUNCH_NESTED_CONTAINER_SESSION` fails
* [MESOS-8578] - UpgradeTest.UpgradeAgentIntoHierarchicalRoleForNonHierarchicalRole is flaky.
* [MESOS-8585] - Agent crashes when starting a task with an unknown user.
* [MESOS-8586] - apply-reviews.py silently does nothing when a review was submitted already.
* [MESOS-8594] - Mesos master stack overflow in libprocess socket send loop.
* [MESOS-8598] - Allow empty resource provider selector in `UriDiskProfileAdaptor`.
* [MESOS-8601] - Master crashes during slave reregistration after failover.
* [MESOS-8604] - Quota headroom tracking may be incorrect in the presence of hierarchical reservation.
* [MESOS-8605] - Terminal task status update will not send if 'docker inspect' is hung
* [MESOS-8610] - NsTest.SupportedNamespaces fails on CentOS7
* [MESOS-8611] - SlaveTest.RemoveExecutorUponFailedLaunch is flaky.
* [MESOS-8617] - Tests using default executor occasionally fail.
* [MESOS-8618] - ReconciliationTest.ReconcileStatusUpdateTaskState is flaky.
* [MESOS-8619] - Docker on Windows uses USERPROFILE instead of HOME for credentials
* [MESOS-8620] - Containers stuck in FETCHING possibly due to unresponsive server.
* [MESOS-8624] - Valid tasks may be explicitly dropped by agent due to race conditions
* [MESOS-8631] - Agent should be able to start a task with every CPU on a Windows machine
* [MESOS-8641] - Event stream could send heartbeat before subscribed
* [MESOS-8642] - balloon-executor is hard to run as unprivileged user
* [MESOS-8643] - `os::system` and `os::spawn` returns -1 on valid windows commands
* [MESOS-8644] - W* macros wrong on Windows.
* [MESOS-8646] - Agent should be able to resolve file names on open files.
* [MESOS-8647] - Enable resource provider agent capability by default
* [MESOS-8651] - Potential memory leaks in the `volume/sandbox_path` isolator
* [MESOS-8654] - The `/proc/sys` mount point in Mesos containers should also include `nosuid,noexec,nodev` mount options.
* [MESOS-8659] - Fix warning `cl : Command line warning D9025 : overriding '/MTd' with '/MDd'`
* [MESOS-8664] - Perf sampler doesn't handle extra fields and nameless counters
* [MESOS-8691] - Forward CXX_FLAGS to C++ projects and C_FLAGS to C projects in CMake
* [MESOS-8711] - SlaveTest.ChangeDomain is disabled.
* [MESOS-8719] - Mesos configured with `--enable-grpc` doesn't compile on non-Linux builds
* [MESOS-8724] - G++ Warning about libc system macros `major` and `minor` prevents Mesos build
* [MESOS-8733] - OversubscriptionTest.ForwardUpdateSlaveMessage is flaky
* [MESOS-8741] - `Add` to sequence will not run if it races with sequence destruction
* [MESOS-8742] - Agent resource provider config API calls should be idempotent.
* [MESOS-8749] - CSI proto is always included in the build when using CMake
* [MESOS-8761] - Default linker fails to link tests on FreeBSD
* [MESOS-8781] - Mesos master shouldn't silently drop operations
* [MESOS-8784] - OPERATION_DROPPED operation status updates should include the operation/framework IDs
* [MESOS-8787] - RP-related API should be experimental.
* [MESOS-8804] - Fix Ninja Release builds on Windows
* [MESOS-8818] - VolumeSandboxPathIsolatorTest.SharedParentTypeVolume fails on macOS
* [MESOS-8834] - Indirect recursion between `send` and `_send` in libprocess may cause stack overflow.
* [MESOS-8865] - Suspicious enum value comparisons in scheduler Java bindings
* [MESOS-8866] - CMake builds are missing byproduct declaration for jemalloc.
* [MESOS-8868] - Some 'FsTest' test cases fail on macOS
* [MESOS-8870] - Master does not correctly reconcile dropped operations after agent failover
* [MESOS-8874] - ResourceProviderManagerHttpApiTest.ResubscribeResourceProvider is flaky.
* [MESOS-8876] - Normal exit of Docker container using rexray volume results in TASK_FAILED.
* [MESOS-8881] - Enable epoll backend in libevent integration.
* [MESOS-8885] - Disable libevent debug mode.
** Improvement
* [MESOS-2922] - Add move constructors / assignment to Future.
* [MESOS-3022] - export additional metrics from scheduler driver
* [MESOS-4965] - Support resizing of an existing persistent volume
* [MESOS-5362] - Add authentication to example frameworks
* [MESOS-6128] - Make "re-register" vs. "reregister" consistent in the master
* [MESOS-7016] - Make default AWAIT_* duration configurable
* [MESOS-7643] - The order of isolators provided in '--isolation' flag is not preserved and instead sorted alphabetically
* [MESOS-7656] - Update the JSON <=> protobuf message conversion for map support
* [MESOS-7881] - Building gRPC with CMake
* [MESOS-7990] - Support systemd named hierarchy (name=systemd) for Mesos Containerizer.
* [MESOS-8033] - Use more idiomatic CMake for compiler features
* [MESOS-8240] - Add an option to build the new CLI and run unit tests.
* [MESOS-8306] - Restrict which agents can statically reserve resources for which roles
* [MESOS-8332] - Narrow the container sandbox permissions.
* [MESOS-8357] - Example frameworks have an inconsistent UX.
* [MESOS-8361] - Example frameworks to support launching mesos-local.
* [MESOS-8389] - Notion of "removable" task in master code is inaccurate.
* [MESOS-8390] - Notion of "transitioning" agents in the master is now inaccurate.
* [MESOS-8402] - Resource provider manager should persist resource provider information
* [MESOS-8426] - Speed up SLRP tests
* [MESOS-8427] - Clean up residual CSI endpoints for SLRP tests.
* [MESOS-8434] - Cleanup Authorization logic in master and agent
* [MESOS-8454] - Add a download link for master and agent logs in WebUI
* [MESOS-8471] - Allow revocable_resources capability for mesos-execute
* [MESOS-8488] - Docker bug can cause unkillable tasks.
* [MESOS-8506] - Add test coverage for `Resources::find` on revocable resources
* [MESOS-8556] - Boost emits warning repeatedly
* [MESOS-8573] - Container stuck in PULLING when Docker daemon hangs
* [MESOS-8574] - Docker executor makes no progress when 'docker inspect' hangs
* [MESOS-8575] - Improve discard handling for 'Docker::stop' and 'Docker::pull'.
* [MESOS-8576] - Improve discard handling of 'Docker::inspect()'
* [MESOS-8591] - Add infra to test a hung Docker daemon
* [MESOS-8599] - Build with Ninja on Windows
* [MESOS-8607] - Port mesos-execute to Windows
* [MESOS-8609] - Create a metric to indicate how long agent takes to recover executors
* [MESOS-8640] - Validate `DockerInfo` exists when container's type is `DOCKER`
* [MESOS-8656] - Improve stout JSON -> protobuf message conversion to handle more valid JSONs
* [MESOS-8658] - CMake build should use same compiler warnings as Autotools
* [MESOS-8702] - Replace the manual parsing in Mesos code with the native protobuf map support
* [MESOS-8725] - Support max_duration for tasks
* [MESOS-8728] - Don't print full usage for invocation errors
* [MESOS-8772] - Add slave recovery test for default executor.
* [MESOS-8793] - Add more logging to agent recovery path.
* [MESOS-8801] - Add jemalloc as optional third-party memory allocator
* [MESOS-8851] - Introduce a push-based gauge.
** Task
* [MESOS-3441] - Port os_tests to Windows
* [MESOS-3445] - Port signals_tests to Windows
* [MESOS-3644] - Implement stout/os/windows/signals.hpp
* [MESOS-4176] - Support CMake build on FreeBSD
* [MESOS-5726] - Benchmark the v1 Operator API
* [MESOS-5850] - Add a test that runs the 'mesos-local' binary
* [MESOS-6575] - Change `disk/xfs` isolator to terminate executor when it exceeds quota
* [MESOS-7558] - Add resource provider validation
* [MESOS-8184] - Implement master's AcknowledgeOfferOperationMessage handler.
* [MESOS-8189] - Master's OperationStatusUpdate handler should forward updates to the framework when OfferOperationID is set.
* [MESOS-8190] - Update the master to accept OfferOperationIDs from frameworks.
* [MESOS-8191] - Implement ReconcileOfferOperations handler in the master
* [MESOS-8192] - Update the scheduler library to support request/response API calls.
* [MESOS-8275] - Remove use of ::_stat on Windows
* [MESOS-8284] - Add a ns::supported convenience API.
* [MESOS-8362] - Verify end-to-end operation status update retry after RP failover
* [MESOS-8363] - Verify that the master acknowledges operation status updates correctly
* [MESOS-8373] - Test reconciliation after operation is dropped en route to agent
* [MESOS-8382] - Master should bookkeep local resource providers.
* [MESOS-8388] - Show LRP resources in master and agent endpoints.
* [MESOS-8407] - Add SLRP unit tests for profile updates and corner cases.
* [MESOS-8408] - Add an SLRP test for CSI plugin restart.
* [MESOS-8409] - Add an SLRP test for agent registered with a new ID.
* [MESOS-8415] - Add an SLRP test for agent reboot.
* [MESOS-8420] - Test that operation status updates are retried after being dropped en-route to the master.
* [MESOS-8424] - Test that operations are correctly reported following a master failover
* [MESOS-8442] - Source tree contains generated endpoint documentation
* [MESOS-8445] - Test that `UPDATE_STATE` of a resource provider doesn't have unwanted side-effects in master or agent
* [MESOS-8462] - Unit test for `Slave::detachFile` on removed frameworks.
* [MESOS-8492] - Checkpoint profiles in storage local resource provider.
* [MESOS-8527] - Add metrics about number of subscribed LRPs on the agent.
* [MESOS-8534] - Allow nested containers in TaskGroups to have separate network namespaces
* [MESOS-8539] - Add metrics about CSI plugin terminations.
* [MESOS-8551] - Port libprocess HTTPTest.QueryEncodeDecode
* [MESOS-8569] - Allow newline characters when decoding base64 strings in stout.
* [MESOS-8650] - Bump CSI bundle to v0.2.
* [MESOS-8653] - Make the CSI client to support CSI v0.2.
* [MESOS-8657] - Build CSI proto in CMake.
* [MESOS-8673] - Fix os::open to use HANDLEs
* [MESOS-8675] - Remove FD_CRT from WindowsFD
* [MESOS-8676] - Fix os::read and os::write to use HANDLES
* [MESOS-8678] - Bump gRPC bundle to 1.10.0.
* [MESOS-8683] - Remove _close from Windows close.hpp
* [MESOS-8684] - Replace _dup with DuplicateHandle on Windows
* [MESOS-8685] - Replace _lseek with SetFilePointer
* [MESOS-8692] - Replace _chsize_s with SetEndOfFile on Windows
* [MESOS-8697] - Make gRPC-related tests cross-platform.
* [MESOS-8698] - Enable storage local resource provider in CMake.
* [MESOS-8706] - Unify return type of `wait` and `destroy` containerizer methods
* [MESOS-8710] - Update tests after changing return type of `wait` method
* [MESOS-8717] - Support CSI v0.2 in SLRP.
* [MESOS-8735] - Implement recovery for resource provider manager registrar
* [MESOS-8747] - Support resizing persistent volume through operator API
* [MESOS-8748] - Create ACL for grow and shrink volume
* [MESOS-8750] - Check failed: !slaves.registered.contains(task->slave_id)
* [MESOS-8777] - Support `STAGE_UNSTAGE_VOLUME` CSI capability in SLRP
* [MESOS-8819] - mesos.pom file hardcodes developers
* [MESOS-8833] - Port libprocess subprocess_tests.cpp
** Documentation
* [MESOS-8291] - Add documentation about fault domains
Release Notes - Mesos - Version 1.5.4 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-9124] - Agent reconfiguration can cause master to unsuppress on scheduler's behalf.
* [MESOS-9418] - Add support for the `Discard` blkio operation type.
* [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
* [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers.
* [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
* [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
* [MESOS-9707] - Calling link::lo() may cause runtime error
* [MESOS-9766] - /__processes__ endpoint can hang.
* [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
* [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
* [MESOS-9852] - Slow memory growth in master due to deferred deletion of offer filters and timers.
* [MESOS-9856] - REVIVE call with specified role(s) clears filters for all roles of a framework.
* [MESOS-9870] - Simultaneous adding/removal of a role from framework's roles and its suppressed roles crashes the master.
* [MESOS-10007] - Command executor can miss exit status for short-lived commands due to double-reaping.
** Improvement
* [MESOS-9159] - Support Foreign URLs in docker registry puller.
* [MESOS-9675] - Docker Manifest V2 Schema2 Support.
* [MESOS-9704] - Support docker manifest v2s2 config GC.
* [MESOS-9948] - master::Slave::hasExecutor occupies 37% of a 150 second perf sample.
* [MESOS-10017] - Log all reverse DNS lookup failures in 'legacy' TLS (SSL) hostname validation scheme.
Release Notes - Mesos - Version 1.5.3
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-7474] - Mesos fetcher cache doesn't retry when missed.
* [MESOS-8887] - Unreachable tasks are not GC'ed when unreachable agent is GC'ed.
* [MESOS-8907] - Docker image fetcher fails with HTTP/2.
* [MESOS-9210] - Mesos v1 scheduler library does not properly handle SUBSCRIBE retries
* [MESOS-9317] - Some master endpoints do not handle failed authorization properly.
* [MESOS-9332] - Nested container should run as the same user as its parent container by default.
* [MESOS-9334] - Container stuck at ISOLATING state because libevent poll never returns.
* [MESOS-9362] - Test `CgroupsIsolatorTest.ROOT_CGROUPS_CreateRecursively` is flaky.
* [MESOS-9411] - Validation of JWT tokens using HS256 hashing algorithm is not thread safe.
* [MESOS-9492] - Persist CNI working directory across reboot.
* [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
* [MESOS-9518] - CNI_NETNS should not be set for orphan containers that do not have network namespace.
* [MESOS-9532] - ResourceOffersTest.ResourceOfferWithMultipleSlaves is flaky.
* [MESOS-9533] - CniIsolatorTest.ROOT_CleanupAfterReboot is flaky.
* [MESOS-9555] - Allocator CHECK failure: reservationScalarQuantities.contains(role).
* [MESOS-9581] - Mesos package naming appears to be nondeterministic.
** Improvement
* [MESOS-9305] - Create cgroup recursively to work around systemd deleting cgroups_root.
Release Notes - Mesos - Version 1.5.2
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-3790] - ZooKeeper connection should retry on `EAI_NONAME`.
* [MESOS-7474] - Mesos fetcher cache doesn't retry when missed.
* [MESOS-8128] - Make os::pipe file descriptors O_CLOEXEC.
* [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads.
* [MESOS-8545] - AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky.
* [MESOS-8568] - Command checks should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`.
* [MESOS-8620] - Containers stuck in FETCHING possibly due to unresponsive server.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data.
* [MESOS-8871] - Agent may fail to recover if the agent dies before the image store cache is checkpointed.
* [MESOS-8904] - Master crash when removing quota.
* [MESOS-8906] - `UriDiskProfileAdaptor` fails to update profile selectors.
* [MESOS-8907] - Docker image fetcher fails with HTTP/2.
* [MESOS-8917] - Agent leaking file descriptors into forked processes.
* [MESOS-8921] - Autotools don't work with newer OpenJDK versions.
* [MESOS-8935] - Quota limit "chopping" can lead to cpu-only and memory-only offers.
* [MESOS-8936] - Implement a Random Sorter for offer allocations.
* [MESOS-8942] - Master streaming API does not send (health) check updates for tasks.
* [MESOS-8945] - Master check failure due to CHECK_SOME(providerId).
* [MESOS-8947] - Improve the container preparing logging in IOSwitchboard and volume/secret isolator.
* [MESOS-8952] - process::await/collect n^2 performance issue.
* [MESOS-8963] - Executor crash trying to print container ID.
* [MESOS-8978] - Command executor calling setsid breaks the tty support.
* [MESOS-8980] - mesos-slave can deadlock with docker pull.
* [MESOS-8986] - `slave.available()` in the allocator is expensive and drags down allocation performance.
* [MESOS-8987] - Master asks agent to shutdown upon auth errors.
* [MESOS-9024] - Mesos master segfaults with stack overflow under load.
* [MESOS-9049] - Agent GC could unmount a dangling persistent volume multiple times.
* [MESOS-9116] - Launch nested container session fails due to incorrect detection of `mnt` namespace of command executor's task.
* [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable".
* [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
* [MESOS-9131] - Health checks launching nested containers while a container is being destroyed lead to unkillable tasks.
* [MESOS-9142] - CNI detach might fail due to missing network config file.
* [MESOS-9144] - Master authentication handling leads to request amplification.
* [MESOS-9145] - Master has a fragile burned-in 5s authentication timeout.
* [MESOS-9146] - Agent has a fragile burned-in 5s authentication timeout.
* [MESOS-9147] - Agent and scheduler driver authentication retry backoff time could overflow.
* [MESOS-9151] - Container stuck at ISOLATING due to FD leak.
* [MESOS-9170] - Zookeeper doesn't compile with newer gcc due to format error.
* [MESOS-9196] - Removing rootfs mounts may fail with EBUSY.
* [MESOS-9231] - `docker inspect` may return an unexpected result to Docker executor due to a race condition.
* [MESOS-9267] - Mesos agent crashes when CNI network is not configured but used.
* [MESOS-9279] - Docker Containerizer 'usage' call might be expensive if mount table is big.
* [MESOS-9283] - Docker containerizer actor can get backlogged with large number of containers.
* [MESOS-9305] - Create cgroup recursively to work around systemd deleting cgroups_root.
* [MESOS-9308] - URI disk profile adaptor could deadlock.
* [MESOS-9317] - Some master endpoints do not handle failed authorization properly.
* [MESOS-9332] - Nested container should run as the same user as its parent container by default.
* [MESOS-9334] - Container stuck at ISOLATING state because libevent poll never returns.
* [MESOS-9411] - Validation of JWT tokens using HS256 hashing algorithm is not thread safe.
* [MESOS-9419] - Executor to framework message crashes master if framework has not re-registered.
* [MESOS-9480] - Master may skip processing authorization results for `LAUNCH_GROUP`.
* [MESOS-9492] - Persist CNI working directory across reboot.
* [MESOS-9501] - Mesos executor fails to terminate and gets stuck after agent host reboot.
* [MESOS-9502] - IOswitchboard cleanup could get stuck due to FD leak from a race.
** Improvement
* [MESOS-9510] - Disallowed nan, inf and so on in `Value::Scalar`.
* [MESOS-9516] - Extend `min_allocatable_resources` flag to cover non-scalar resources.
* [MESOS-9518] - CNI_NETNS should not be set for orphan containers that do not have network namespace.
Release Notes - Mesos - Version 1.5.1
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-1720] - Slave should send exited executor message when the executor is never launched.
* [MESOS-7742] - Race conditions in IOSwitchboard: listening on unix socket and premature closing of the connection.
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8416] - CHECK failure if trying to recover nested containers but the framework checkpointing is not enabled.
* [MESOS-8468] - `LAUNCH_GROUP` failure tears down the default executor.
* [MESOS-8488] - Docker bug can cause unkillable tasks.
* [MESOS-8510] - URI disk profile adaptor does not consider plugin type for a profile.
* [MESOS-8536] - Pending offer operations on resource provider resources not properly accounted for in allocator.
* [MESOS-8550] - Bug in `Master::detected()` leads to coredump in `MasterZooKeeperTest.MasterInfoAddress`.
* [MESOS-8552] - CGROUPS_ROOT_PidNamespaceForward and CGROUPS_ROOT_PidNamespaceBackward tests fail.
* [MESOS-8565] - Persistent volumes are not visible in Mesos UI when launching a pod using default executor.
* [MESOS-8569] - Allow newline characters when decoding base64 strings in stout.
* [MESOS-8574] - Docker executor makes no progress when 'docker inspect' hangs.
* [MESOS-8575] - Improve discard handling for 'Docker::stop' and 'Docker::pull'.
* [MESOS-8576] - Improve discard handling of 'Docker::inspect()'.
* [MESOS-8577] - Destroy nested container if `LAUNCH_NESTED_CONTAINER_SESSION` fails.
* [MESOS-8594] - Mesos master stack overflow in libprocess socket send loop.
* [MESOS-8598] - Allow empty resource provider selector in `UriDiskProfileAdaptor`.
* [MESOS-8601] - Master crashes during slave reregistration after failover.
* [MESOS-8604] - Quota headroom tracking may be incorrect in the presence of hierarchical reservation.
* [MESOS-8605] - Terminal task status update will not send if 'docker inspect' is hung.
* [MESOS-8619] - Docker on Windows uses `USERPROFILE` instead of `HOME` for credentials.
* [MESOS-8624] - Valid tasks may be explicitly dropped by agent due to race conditions.
* [MESOS-8631] - Agent should be able to start a task with every CPU on a Windows machine.
* [MESOS-8641] - Event stream could send heartbeat before subscribed.
* [MESOS-8646] - Agent should be able to resolve file names on open files.
* [MESOS-8651] - Potential memory leaks in the `volume/sandbox_path` isolator.
* [MESOS-8741] - `Add` to sequence will not run if it races with sequence destruction.
* [MESOS-8742] - Agent resource provider config API calls should be idempotent.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8787] - RP-related API should be experimental.
* [MESOS-8876] - Normal exit of Docker container using rexray volume results in TASK_FAILED.
* [MESOS-8881] - Enable epoll backend in libevent integration.
* [MESOS-8885] - Disable libevent debug mode.
Release Notes - Mesos - Version 1.5.0
-------------------------------------
This release contains the following new features:
* [MESOS-1739] - **Experimental** Agents now support the
`--reconfiguration_policy` flag which allows them to recover
the agent ID and running tasks after configuration changes.
See docs/agent-recovery.md for more details.
* [MESOS-4945] - **Experimental** Agents can now automatically
garbage collect unused Docker image layers used by the Mesos
Containerizer.
* [MESOS-7289, MESOS-7235] - **Experimental** Support for the
Container Storage Interface (CSI) to simplify storage management
in Mesos, and allow third-party vendors to plug into Mesos very
easily.
* [MESOS-7302] - Support launching standalone containers on the
agent using MesosContainerizer without a master or framework
running.
* [MESOS-7749] - **Experimental** Support for gRPC client in Mesos.
gRPC is bundled in Mesos and a gRPC client API is built into
libprocess.
* [MESOS-7973] - **Experimental** A non-leading replica is now allowed
to catch up on missing log positions in the replicated log. This opens
the door for implementing hot standby (by offloading some reading
from a leader to standbys) and fast failover time (by keeping
in-memory storage represented by the log "hot").
* Several improvements and fixes to the enforcement of quota
guarantees have been made:
* [MESOS-4527]: Previously a role could "game" the quota system
by amassing reservations that it leaves unused. This is now
prevented by accounting for reservations when allocating
resources.
* [MESOS-7099]: Resources are now allocated in a fine-grained
manner to prevent roles from exceeding their quota.
* [MESOS-8293]: There was a bug where a role may not receive its
reservation when it does not have quota. This has been fixed.
* [MESOS-8339]: When a role has more reservations than quota,
there was a bug previously where an insufficient amount of
quota headroom was held. This has been fixed.
* [MESOS-8352]: When allocating to a role with quota, we
previously included all other resources on the agent that the
role does not have quota for. This made it possible to violate
the quota guarantees of a different role. This has been fixed
by taking into account the headroom that is needed when
allocating the resources.
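The reservation accounting described for MESOS-4527 can be illustrated
with a minimal sketch. This is a simplified single-resource model, not the
allocator's actual code: the `RoleUsage` struct and both helper functions
are hypothetical names introduced here, and the assumption is that unused
reservations are charged against quota in full.

```cpp
#include <algorithm>
#include <cassert>

// Simplified model of one role for a single scalar resource (e.g. CPUs).
// Hypothetical type for illustration; not part of Mesos.
struct RoleUsage
{
  double quota;                // quota guarantee of the role
  double reserved;             // reservations held by the role, used or not
  double allocatedUnreserved;  // unreserved resources currently allocated
};

// Amount charged against the quota. Counting `reserved` in full is the
// assumption that models MESOS-4527: unused reservations still count.
inline double chargedAgainstQuota(const RoleUsage& role)
{
  assert(role.quota >= 0 && role.reserved >= 0 &&
         role.allocatedUnreserved >= 0);
  return role.reserved + role.allocatedUnreserved;
}

// Unreserved resources the role may still receive toward its quota.
inline double remainingQuota(const RoleUsage& role)
{
  return std::max(0.0, role.quota - chargedAgainstQuota(role));
}
```

Under this model a role with a quota of 10, reservations of 8 and 1 unit
of unreserved allocation has 1 unit left toward its quota, even if none of
its reservations are in use.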
Deprecations/Removals:
* [MESOS-7305] - Some nested container agent APIs `****_NESTED_CONTAINER`
are deprecated in favor of the new, more generally named agent APIs
`****_CONTAINER`.
* Agent flag `--executor_secret_key` has been deprecated. Operators
should use `--jwt_secret_key` instead.
Additional API Changes:
* [MESOS-6406, MESOS-7215, MESOS-8337] When an agent is partitioned, the
master now tracks all noncompleted tasks regardless of partition-awareness,
so that when the agent reregisters it can recover all of them and send their
latest statuses to the scheduler. NOTE: The master now sends updates for
tasks recovered from partitioned agents upon reregistration so the scheduler
can get them before reconciliation. We also fixed the buggy semantics that
exposed terminal unacknowledged tasks as "completed" when partitioned in the
HTTP endpoints and the operator API; they are now shown as "unreachable". We
plan to further improve the API on this in MESOS-8405.
* [MESOS-7550] The fields `Resource.disk.source.path.root` and
`Resource.disk.source.mount.root` can now be set to relative paths
to an agent's work directory.
* [MESOS-7660] `Filter::refuse_seconds` is now capped to 31536000
seconds (365 days).
* [MESOS-7941] Built-in executors will now send a TASK_STARTING
status update when a task is starting.
* [MESOS-7973] A new `catchup` method has been added to the
`Log.Reader` interface (including Java binding).
* [MESOS-8040] Return nested/standalone containers in `GET_CONTAINERS`
API call.
* [MESOS-8165] Master will now send TASK_GONE status for unknown
tasks of PARTITION_AWARE frameworks belonging to registered agents
during explicit reconciliation.
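As an illustration of the `Filter::refuse_seconds` cap noted above
(MESOS-7660), the sketch below checks the arithmetic behind the 31536000
figure and shows how a requested value would be capped. The helper
`effectiveRefuseSeconds` and the constant name are hypothetical; the sketch
only models the cap and is not the master's or allocator's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Cap stated in the release notes: 365 days expressed in seconds.
constexpr double kMaxRefuseSeconds = 365.0 * 24 * 60 * 60;
static_assert(365 * 24 * 60 * 60 == 31536000,
              "365 days is 31536000 seconds");

// Hypothetical helper (not a Mesos function): returns the filter duration
// a scheduler could expect to take effect for a requested value.
inline double effectiveRefuseSeconds(double requestedSeconds)
{
  assert(!std::isnan(requestedSeconds));
  return std::min(requestedSeconds, kMaxRefuseSeconds);
}
```

A scheduler that passes a very large value to mean "refuse forever" should
therefore expect the filter to last at most 365 days.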
Changes to Dependencies:
* Upgraded minimum required Protobuf library to version 3+.
Feature Graduations:
* [MESOS-4791] - v1 Operator API is now considered stable. The performance has
been improved so that when using protobuf it is faster than v0, and when
using JSON it is slightly slower than v0.
* [MESOS-5116] - Add support for accounting only mode in XFS isolator.
* [MESOS-5275, MESOS-7476, MESOS-7477, MESOS-7671] - Add file-based and
protobuf-based capabilities support for mesos containerizer. This
includes the support for effective and bounding capabilities.
* [MESOS-6077] - Added a default (task group) executor.
* [MESOS-6402] - rlimit support for Mesos containerizer.
* [MESOS-6460] - Container Attach/Exec.
* [MESOS-6758] - Support docker registry that requires basic auth.
* [MESOS-7088] - Support private registry credential per container.
* [MESOS-7418] - Add support for file-based secrets.
Unresolved Critical Issues:
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode().
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration.
* [MESOS-3533] - Unable to find and run URIs files.
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string.
* [MESOS-4996] - 'containerizer->update' will always fail after killing a docker container.
* [MESOS-5352] - Docker volume isolator cleanup can be blocked by first cleanup failure.
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-5995] - Protobuf JSON deserialisation does not accept numbers formatted as strings.
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-6804] - Running 'tty' inside a debug container that has a tty reports "Not a tty".
* [MESOS-6986] - abort in DRFSorter::add.
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die or are killed.
* [MESOS-7566] - Master crash due to failed check in DRFSorter::remove.
* [MESOS-7622] - Agent can crash if a HTTP executor tries to retry subscription in running state.
* [MESOS-7721] - Master's agent removal rate limit also applies to agent unreachability.
* [MESOS-7748] - Slow subscribers of streaming APIs can lead to Mesos OOMing.
* [MESOS-7911] - Non-checkpointing framework's tasks should not be marked LOST when agent disconnects.
* [MESOS-7966] - check for maintenance on agent causes fatal error.
* [MESOS-7991] - fatal, check failed !framework->recovered().
* [MESOS-8038] - Launching GPU task sporadically fails.
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused.
* [MESOS-8137] - Mesos agent can hang during startup.
* [MESOS-8256] - Libprocess can silently deadlock due to worker thread exhaustion.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8468] - `LAUNCH_GROUP` failure tears down the default executor.
All Resolved Issues:
** Bug
* [MESOS-1216] - Attributes comparator operator should allow multiple attributes of same name and type.
* [MESOS-3576] - Audit CMake linking flags.
* [MESOS-5455] - Transition away from temporary build variables.
* [MESOS-5462] - Re-organize isolator hierarchy.
* [MESOS-5656] - Incomplete modelling of 3rdparty dependencies in cmake build.
* [MESOS-5881] - Semantics of `os::symlink` differ across POSIX and Windows.
* [MESOS-5905] - Zookeeper tests do not work on CMake builds as directory structure changed.
* [MESOS-6086] - PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove is flaky.
* [MESOS-6187] - "double free or corruption" with Java 8.
* [MESOS-6345] - ExamplesTest.PersistentVolumeFramework failing due to double free corruption on Ubuntu 14.04.
* [MESOS-6406] - Send latest status for partition-aware tasks when agent reregisters.
* [MESOS-6428] - Mesos containerizer helper function signalSafeWriteStatus is not AS-Safe.
* [MESOS-6616] - Error: dereferencing type-punned pointer will break strict-aliasing rules.
* [MESOS-6671] - External 3rdparty deps are not built with the configured compiler in cmake build.
* [MESOS-6690] - Wire up resource control API to Windows Job objects API.
* [MESOS-6697] - Port `authentication_tests.cpp`.
* [MESOS-6703] - Port `credentials_tests.cpp`.
* [MESOS-6705] - Port `fetcher_tests.cpp`.
* [MESOS-6708] - Port `group_tests.cpp`.
* [MESOS-6735] - `os::realpath` semantics differ between Windows and POSIX.
* [MESOS-6784] - IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky.
* [MESOS-6790] - Wrong task started time in webui.
* [MESOS-6794] - Properly model header dependencies of cmake build components.
* [MESOS-6816] - Allows frameworks to overwrite system environment variables.
* [MESOS-6942] - CMake build with `-DENABLE_LIBEVENT=ON` requires system-installed `openssl`.
* [MESOS-6949] - SchedulerTest.MasterFailover is flaky.
* [MESOS-7007] - filesystem/shared and --default_container_info broken since 1.1.
* [MESOS-7099] - Quota can be exceeded due to coarse-grained offer technique.
* [MESOS-7130] - port_mapping isolator: executor hangs when running on EC2.
* [MESOS-7160] - Parsing of perf version segfaults.
* [MESOS-7215] - Race condition on re-registration of non-partition-aware frameworks.
* [MESOS-7223] - Linux filesystem isolator cannot mount host volume /dev/log.
* [MESOS-7296] - CMake 2.8.10 does not support TIMESTAMP.
* [MESOS-7312] - Update Resource proto for storage resource providers.
* [MESOS-7425] - ImageAlpine/ProvisionerDockerTest.ROOT_INTERNET_CURL_SimpleCommand/3 is flaky in some OS.
* [MESOS-7440] - Various DefaultExecutorCheckTest* tests flaky on ASF CI.
* [MESOS-7500] - Command checks via agent lead to flaky tests.
* [MESOS-7504] - Parent's mount namespace cannot be determined when launching a nested container.
* [MESOS-7509] - CniIsolatorPortMapperTest.ROOT_INTERNET_CURL_PortMapper fails on some Linux distros.
* [MESOS-7511] - CniIsolatorTest.ROOT_DynamicAddDelofCniConfig is flaky.
* [MESOS-7519] - OversubscriptionTest.RescindRevocableOfferWithIncreasedRevocable is flaky.
* [MESOS-7541] - Cannot compile without pre-compiled headers on Windows.
* [MESOS-7586] - Make use of cout/cerr and glog consistent.
* [MESOS-7589] - CommandExecutorCheckTest.CommandCheckDeliveredAndReconciled is flaky.
* [MESOS-7660] - HierarchicalAllocator uses the default filter instead of a very long one.
* [MESOS-7661] - Libprocess timers with long durations trigger immediately.
* [MESOS-7704] - Remove use of #pragma comment (lib, "IPHLPAPI.lib").
* [MESOS-7726] - MasterTest.IgnoreOldAgentReregistration test is flaky.
* [MESOS-7729] - ExamplesTest.DynamicReservationFramework is flaky.
* [MESOS-7741] - SlaveRecoveryTest/0.MultipleSlaves has double free corruption.
* [MESOS-7781] - Windows API GetVersionExW was declared deprecated.
* [MESOS-7784] - MasterTestPrePostReservationRefinement.CreateAndDestroyVolumesV1 is flaky.
* [MESOS-7791] - subprocess' childMain using ABORT when encountering user errors.
* [MESOS-7811] - libprocess-tests depend on gtest but it is not set up.
* [MESOS-7828] - Current approach to parse protobuf enum from JSON does not support upgrades.
* [MESOS-7835] - CMake build does not support Marathon.
* [MESOS-7851] - Master stores old resource format in the registry.
* [MESOS-7867] - Master doesn't handle scheduler driver downgrade from HTTP based to PID based.
* [MESOS-7873] - Expose `ExecutorInfo.ContainerInfo.NetworkInfo` in Mesos `state` endpoint.
* [MESOS-7877] - Audit test code for undefined behavior in accessing container elements.
* [MESOS-7917] - Docker statistics not reported on Windows.
* [MESOS-7921] - ProcessManager::resume sometimes crashes accessing EventQueue.
* [MESOS-7923] - Make args optional in mesos port mapper plugin.
* [MESOS-7927] - The composing containerizer leaks memory in some scenarios.
* [MESOS-7929] - `Metrics()` hangs on second call on Windows.
* [MESOS-7945] - MasterAPITest.EventAuthorizationFiltering is flaky.
* [MESOS-7963] - Task groups can lose the container limitation status.
* [MESOS-7964] - Heavy-duty GC makes the agent unresponsive.
* [MESOS-7968] - Handle `/proc/self/ns/pid_for_children` when parsing available namespace.
* [MESOS-7969] - Handle cgroups v2 hierarchy when parsing /proc/self/cgroups.
* [MESOS-7972] - SlaveTest.HTTPSchedulerSlaveRestart test is flaky.
* [MESOS-7975] - The command/default/docker executor can incorrectly send a TASK_FINISHED update even when the task is killed.
* [MESOS-7978] - Lint javascript files to enable linting.
* [MESOS-7980] - Stout fails to compile with libc >= 2.26.
* [MESOS-7988] - Mesos attempts to open handle for the system idle process.
* [MESOS-7993] - Fix Windows header orderings.
* [MESOS-7996] - ContentType/SchedulerTest.NoOffersWithAllRolesSuppressed is flaky.
* [MESOS-7997] - ContentType/MasterAPITest.CreateAndDestroyVolumes is flaky.
* [MESOS-7998] - PersistentVolumeEndpointsTest.UnreserveVolumeResources is flaky.
* [MESOS-8000] - DefaultExecutorCniTest.ROOT_VerifyContainerIP is flaky.
* [MESOS-8001] - PersistentVolumeEndpointsTest.NoAuthentication is flaky.
* [MESOS-8003] - PersistentVolumeEndpointsTest.SlavesEndpointFullResources is flaky.
* [MESOS-8010] - AfterTest.Loop is flaky.
* [MESOS-8027] - os::open doesn't always atomically apply O_CLOEXEC.
* [MESOS-8035] - Correct mesos-tests CMake build dependencies.
* [MESOS-8039] - A broken connection during LaunchNestedContainer call might result in the nested container not being cleaned up.
* [MESOS-8046] - MasterTestPrePostReservationRefinement.ReserveAndUnreserveResourcesV1 is flaky.
* [MESOS-8048] - ReservationEndpointsTest.GoodReserveAndUnreserveACL is flaky.
* [MESOS-8051] - Killing TASK_GROUP fails to kill some tasks.
* [MESOS-8052] - "protoc" not found when running "make -j4 check" directly in stout.
* [MESOS-8057] - Apply security patches to AngularJS and JQuery in the Mesos UI.
* [MESOS-8058] - Agent and master can race when updating agent state.
* [MESOS-8066] - Pylint reports errors in apply-reviews.py on Ubuntu 14.04.
* [MESOS-8070] - Bundled GRPC build does not build on Debian 8.
* [MESOS-8076] - PersistentVolumeTest.SharedPersistentVolumeRescindOnDestroy is flaky.
* [MESOS-8080] - The default executor does not propagate missing task exit status correctly.
* [MESOS-8082] - updateAvailable races with a periodic allocation and leads to flaky tests.
* [MESOS-8084] - Double free corruption in tests due to parallel manipulation of signal and control handlers.
* [MESOS-8085] - No point in deallocate() for a framework for maintenance if it is deactivated.
* [MESOS-8090] - Mesos 1.4.0 crashes with 1.3.x agent with oversubscription.
* [MESOS-8093] - Some tests miss subscribed event because expectation is set after event fires.
* [MESOS-8095] - ResourceProviderRegistrarTest.AgentRegistrar is flaky.
* [MESOS-8116] - Fix off-by-one error in Windows long path support.
* [MESOS-8119] - ROOT_DOCKER_DockerHealthyTask segfaults in debian 8.
* [MESOS-8121] - Unified Containerizer Auto backend should check xfs ftype for overlayfs backend.
* [MESOS-8123] - GPU tests are failing due to TASK_STARTING.
* [MESOS-8135] - Masters can lose track of tasks' executor IDs.
* [MESOS-8136] - Update XFS isolator tests to handle TASK_STARTING.
* [MESOS-8157] - Review #62775 broke the build.
* [MESOS-8159] - ns::clone uses an async signal unsafe stack.
* [MESOS-8165] - TASK_UNKNOWN status is ambiguous.
* [MESOS-8169] - Incorrect master validation forces executor IDs to be globally unique.
* [MESOS-8171] - Using a failoverTimeout of 0 with Mesos native scheduler client can result in infinite subscribe loop.
* [MESOS-8173] - Improve fetcher exit status message.
* [MESOS-8178] - UnreachableAgentReregisterAfterFailover is flaky.
* [MESOS-8179] - Scheduler library has incorrect assumptions about connections.
* [MESOS-8180] - Port mesos-fetcher to Windows.
* [MESOS-8200] - Suppressed roles are not honoured for v1 scheduler subscribe requests.
* [MESOS-8217] - Don't run linters on every commit.
* [MESOS-8220] - Can't build with Visual Studio 15.5.
* [MESOS-8223] - Master crashes when suppressed on subscribe is enabled.
* [MESOS-8225] - Port os::which to Windows.
* [MESOS-8237] - Strip (Offer|Resource).allocation_info for non-MULTI_ROLE schedulers.
* [MESOS-8245] - SlaveRecoveryTest/0.ReconnectExecutor is flaky.
* [MESOS-8249] - Support image prune in mesos containerizer and provisioner.
* [MESOS-8263] - ResourceProviderManagerHttpApiTest.ConvertResources is flaky.
* [MESOS-8267] - NestedMesosContainerizerTest.ROOT_CGROUPS_RecoverLauncherOrphans is flaky.
* [MESOS-8272] - Fall back to bind mounting container devices.
* [MESOS-8279] - Persistent volumes are not visible in Mesos UI using default executor on Linux.
* [MESOS-8280] - Mesos Containerizer GC should set 'layers' after checkpointing layer ids in provisioner.
* [MESOS-8282] - Take pending offer operations into account when calculating framework allocated resources.
* [MESOS-8288] - SlaveTest.IgnoreV0ExecutorIfItReregistersWithoutReconnect is flaky.
* [MESOS-8289] - ReservationTest.MasterFailover is flaky when run with `RESOURCE_PROVIDER` capability.
* [MESOS-8293] - Reservation may not be allocated when the role has no quota.
* [MESOS-8297] - Built-in driver-based executors ignore kill task if the task has not been launched.
* [MESOS-8312] - Pass resource provider information to master as part of UpdateSlaveMessage.
* [MESOS-8315] - ResourceProviderManagerHttpApiTest.ResubscribeResourceProvider is flaky.
* [MESOS-8316] - Tests that fetch docker images might be flaky due to insufficient wait timeout.
* [MESOS-8318] - OfferOperationStatusUpdateManagerTest tests fail on Windows.
* [MESOS-8320] - Expose information about local resource providers in master.
* [MESOS-8325] - Mesos containerizer does not properly handle old running containers.
* [MESOS-8337] - Invalid state transition attempted when agent is lost.
* [MESOS-8339] - Quota headroom may be insufficiently held when role has more reservation than quota.
* [MESOS-8341] - Agent can become stuck in (re-)registering state during upgrades.
* [MESOS-8344] - Improve JSON v1 operator API performance.
* [MESOS-8346] - Resubscription of a resource provider will crash the agent if its HTTP connection isn't closed.
* [MESOS-8349] - When a resource provider driver is disconnected, it fails to reconnect.
* [MESOS-8350] - Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration.
* [MESOS-8352] - Resources may get over-allocated to some roles while failing to meet the quota of other roles.
* [MESOS-8356] - Persistent volume ownership is set to root regardless of the sandbox owner (frameworkInfo.user) when docker executor is used.
* [MESOS-8369] - CI build failure compiling volume_profile.proto.
* [MESOS-8376] - Bundled GRPC does not build on Debian 9.
* [MESOS-8377] - RecoverTest.CatchupTruncated is flaky.
* [MESOS-8391] - Mesos agent doesn't notice that a pod task exits or crashes after the agent restart.
* [MESOS-8393] - SLRP NewVolumeRecovery and LaunchTaskRecovery tests CHECK failures.
* [MESOS-8410] - Reconfiguration policy fails to handle mount disk resources.
* [MESOS-8417] - Mesos can get "stuck" when a Process throws an exception.
* [MESOS-8419] - RP manager incorrectly setting framework ID leads to CHECK failure.
* [MESOS-8422] - Master's UpdateSlave handler not correctly updating terminated operations.
* [MESOS-8443] - Fix Docker Containerizer PATH on Windows so Docker is usable.
* [MESOS-8444] - GC failure causes agent to miss detaching virtual paths for the executor's sandbox.
* [MESOS-8446] - Agent fails to detach `virtualLatestPath` for the executor's sandbox during recovery.
* [MESOS-8460] - `Slave::detachFile` can segfault because it could use invalid Framework*.
* [MESOS-8461] - SLRP should not assume a CSI plugin always has GetNodeID implemented.
* [MESOS-8469] - Mesos master might drop some events in the operator API stream.
* [MESOS-8480] - Mesos returns high resource usage when killing a Docker task.
* [MESOS-8481] - Agent reboot during checkpointing may result in empty checkpoints.
* [MESOS-8514] - SLRP failed to connect to CSI endpoint.
** Documentation
* [MESOS-5078] - Document TaskStatus reasons.
* [MESOS-7663] - Update the documentation to reflect the addition of reservation refinement.
* [MESOS-8007] - Add documentation for MARK_AGENT_GONE call.
* [MESOS-8303] - Add user doc for agent reconfiguration.
* [MESOS-8304] - Update CHANGELOG to call out agent reconfiguration feature.
* [MESOS-8310] - Document container image garbage collection.
** Epic
* [MESOS-1739] - Allow slave reconfiguration on restart.
* [MESOS-4945] - Garbage collect unused docker layers in the store.
* [MESOS-7235] - Improve Storage Support using Resource Provider and CSI.
* [MESOS-7289] - Support Container Storage Interface (CSI).
* [MESOS-7302] - Support launching standalone containers.
* [MESOS-7749] - Support gRPC client.
** Improvement
* [MESOS-564] - Update Contribution Documentation.
* [MESOS-5675] - Add support for master capabilities.
* [MESOS-5771] - Add benchmark test for shared resources.
* [MESOS-5902] - CMake should generate protobuf definitions for Java.
* [MESOS-6350] - Raise minimum required cmake version.
* [MESOS-6390] - Ensure Python support scripts are linted.
* [MESOS-6971] - Use arena allocation to improve protobuf message passing performance.
* [MESOS-7306] - Support mount propagation for host volumes.
* [MESOS-7330] - Add resource provider to offer.
* [MESOS-7361] - Command checks via agent pollute agent logs.
* [MESOS-7370] - Fix create symlink code to use flag which enables non-admins to make symlinks.
* [MESOS-7497] - Remove CMake anti-pattern of `set(x "${x} ..")`.
* [MESOS-7616] - Consider supporting changes to agent's domain without full drain.
* [MESOS-7675] - Isolate network ports.
* [MESOS-7695] - Add heartbeats to master stream API.
* [MESOS-7737] - Harden Mesos when building with cmake.
* [MESOS-7785] - Pass Operator API subscription events through authorizer.
* [MESOS-7795] - Remove "latest" symlink after agent reboot.
* [MESOS-7798] - Improve libprocess message passing performance.
* [MESOS-7837] - Propagate resource updates from local resource providers to master.
* [MESOS-7840] - Add Mesos CLI command to list active tasks.
* [MESOS-7842] - Basic sandbox GC metrics.
* [MESOS-7861] - Include check output in the DefaultExecutor log.
* [MESOS-7880] - Add an option to skip the Mesos style check when applying a review chain.
* [MESOS-7889] - Avoid Multiple PROTOC invocations when generating Protobuf & GRPC code in libprocess.
* [MESOS-7895] - ZK session timeout is unconfigurable in agent and scheduler drivers.
* [MESOS-7916] - Improve the test coverage of the DefaultExecutor.
* [MESOS-7924] - Add a javascript linter to the webui.
* [MESOS-7941] - Send TASK_STARTING status from built-in executors.
* [MESOS-7951] - Design Doc for Extended KillPolicy.
* [MESOS-7961] - Display task health in the webui.
* [MESOS-7962] - Display task state counters in the framework page of the webui.
* [MESOS-7973] - Non-leading VOTING replica catch-up.
* [MESOS-7987] - Initialize Google Mock rather than Google Test.
* [MESOS-8012] - Support Znode paths for masters in the new CLI.
* [MESOS-8015] - Design a scheduler (V1) HTTP API authenticatee mechanism.
* [MESOS-8016] - Introduce modularized HTTP authenticatee.
* [MESOS-8017] - Introduce a basic HTTP authenticatee.
* [MESOS-8021] - Update HTTP scheduler library to allow for modularized authenticatee.
* [MESOS-8034] - Remove LIBNAME_VERSION from EXTERNAL.
* [MESOS-8040] - Return nested/standalone containers in `GET_CONTAINERS` API call.
* [MESOS-8072] - Change Mesos common events verbose logs to use VLOG(2) instead of 1.
* [MESOS-8074] - Change Libprocess actor state transitions verbose logs to use VLOG(3) instead of 2.
* [MESOS-8078] - Some fields went missing with no replacement in api/v1.
* [MESOS-8115] - Add a master flag to disallow agents that are not configured with fault domain.
* [MESOS-8117] - Update Getting Started documentation.
* [MESOS-8221] - Use protobuf reflection to simplify downgrading of resources.
* [MESOS-8286] - Making bind mounts readonly fails with user namespaces.
* [MESOS-8294] - Support container image basic auto gc.
* [MESOS-8295] - Add excluded image parameter to containerizer::pruneImages() interface.
* [MESOS-8301] - Support moving into defer/dispatch/install handlers.
* [MESOS-8302] - Improve master failover performance.
* [MESOS-8328] - Improve logs displayed after a slave failed recovery.
* [MESOS-8358] - Create agent endpoints for pruning images.
* [MESOS-8365] - Create AuthN support for prune images API.
* [MESOS-8421] - Duration operators drop precision, even when used with integers.
* [MESOS-8455] - Avoid unnecessary copying of protobuf in the v1 API.
** Task
* [MESOS-3107] - Define CMake style guide.
* [MESOS-3110] - Harden the CMake system-dependency-locating routines.
* [MESOS-3384] - Include libsasl in Windows CMake build.
* [MESOS-3437] - Port flags_tests.
* [MESOS-4527] - Roles can exceed limit allocation via reservations.
* [MESOS-6193] - Make the docker/volume isolator nesting aware.
* [MESOS-6709] - Enable HTTP and TCP health checks on Windows.
* [MESOS-6714] - Port `slave_tests.cpp`.
* [MESOS-6733] - Windows: Enable authentication to the master.
* [MESOS-6894] - Checkpoint 'ContainerConfig' in Mesos Containerizer.
* [MESOS-7284] - Allow Mesos CLI to take masters IP.
* [MESOS-7285] - Implement a plugin to list containers on a given agent.
* [MESOS-7303] - Support Isolator capabilities.
* [MESOS-7305] - Adjust the recover logic of MesosContainerizer to allow standalone containers.
* [MESOS-7328] - Validate offer operations for converting disk resources.
* [MESOS-7388] - Update allocator interfaces to support resource providers.
* [MESOS-7443] - Add the MARK_AGENT_GONE call to the Operator v1 API protos.
* [MESOS-7444] - Add support for storing gone agents to the master registry.
* [MESOS-7445] - Implement the API handler on the master for marking agents as gone.
* [MESOS-7446] - Add authorization for the MARK_AGENT_GONE call.
* [MESOS-7448] - Add support for pruning the list of gone agents in the registry.
* [MESOS-7469] - Add resource provider driver.
* [MESOS-7491] - Build a CSI client to talk to a CSI plugin.
* [MESOS-7533] - Add a function stub for resource provider re-registration.
* [MESOS-7534] - Notify resource providers if they've been reregistered.
* [MESOS-7535] - Distinguish between active and inactive resource providers in RP Manager.
* [MESOS-7550] - Publish Local Resource Provider resources in the agent before container launch or update.
* [MESOS-7555] - Add resource provider IDs to the registry.
* [MESOS-7557] - Test that resource providers can reregister after agent fails over.
* [MESOS-7561] - Add storage resource provider specific information in ResourceProviderInfo.
* [MESOS-7578] - Write a proposal to make the I/O Switchboards optional.
* [MESOS-7594] - Implement 'apply' for resource provider related operations.
* [MESOS-7757] - Update master to handle updates to agent total resources.
* [MESOS-7790] - Design hierarchical quota allocation.
* [MESOS-7807] - Docker executor needs to return multiple IP addresses for the container.
* [MESOS-7892] - Filter results of `/state` on agent by role.
* [MESOS-7899] - Expose sandboxes using virtual paths and hide the agent work directory.
* [MESOS-7936] - Move sandbox path volume logic to 'volume/sandbox_path' isolator.
* [MESOS-7982] - Create CentOS 6/7 RPM package.
* [MESOS-7985] - Use ASF CI for automating RPM packaging and upload to bintray.
* [MESOS-7992] - Enable OpenSSL build on Windows.
* [MESOS-8013] - Add test for blkio statistics.
* [MESOS-8032] - Launch CSI plugins in storage local resource provider.
* [MESOS-8050] - Mesos HTTP/HTTPS health checks for IPv6 docker containers.
* [MESOS-8060] - Introduce first class 'profile' for disk resources.
* [MESOS-8071] - Add agent capability for resource provider.
* [MESOS-8075] - Add ReadWriteLock to libprocess.
* [MESOS-8079] - Checkpoint and recover layers used to provision rootfs in provisioner.
* [MESOS-8086] - Update ACCEPT call handler in master for new operations.
* [MESOS-8087] - Add operation status update handler in Master.
* [MESOS-8088] - Introduce Lamport timestamp for offer operations.
* [MESOS-8089] - Add messages to publish resources on a resource provider.
* [MESOS-8097] - Add filesystem layout for local resource providers.
* [MESOS-8098] - Benchmark Master failover performance.
* [MESOS-8099] - Add protobuf for checkpointing resource provider states.
* [MESOS-8100] - Authorize standalone container calls from local resource providers.
* [MESOS-8101] - Import resources from CSI plugins in storage local resource provider.
* [MESOS-8102] - Add a test CSI plugin for storage local resource provider.
* [MESOS-8107] - Add a call to update total resources in the resource provider API.
* [MESOS-8108] - Process offer operations in storage local resource provider.
* [MESOS-8130] - Add placeholder handlers for offer operation feedback.
* [MESOS-8131] - Add new protobuf messages for offer operation feedback.
* [MESOS-8132] - Design a library to send offer operation status updates.
* [MESOS-8139] - Upgrade protobuf to 3.5.x.
* [MESOS-8141] - Add filesystem layout for storage resource providers.
* [MESOS-8143] - Publish and unpublish storage local resources through CSI plugins.
* [MESOS-8181] - Add tests that a failed offer operation on resource provider resources leads to a clock update.
* [MESOS-8183] - Add a container daemon to monitor a long-running standalone container.
* [MESOS-8186] - Implement the agent's AcknowledgeOfferOperationMessage handler.
* [MESOS-8187] - Enable LRP to send operation status updates, checkpoint, and retry using the SUM.
* [MESOS-8193] - Update master's OfferOperationStatusUpdate handler to acknowledge updates to the agent if OfferOperationID is not set.
* [MESOS-8195] - Implement explicit offer operation reconciliation between the master, agent and RPs.
* [MESOS-8196] - Propagate failures from applying offer operations from resource providers.
* [MESOS-8197] - Implement a library to send offer operation status updates.
* [MESOS-8198] - Update the ReconcileOfferOperations protos.
* [MESOS-8199] - Add plumbing for explicit offer operation reconciliation between master, agent, and RPs.
* [MESOS-8207] - Reconcile offer operations between resource providers, agents, and master.
* [MESOS-8211] - Handle agent local resources in offer operation handler.
* [MESOS-8218] - Support `RESERVE`/`CREATE` operations with resource providers.
* [MESOS-8222] - Add resource versions to RunTaskMessage.
* [MESOS-8244] - Add operator API to reload local resource providers.
* [MESOS-8251] - Introduce a way to resolve the "profile" for disk resources.
* [MESOS-8265] - Add state recovery for storage local resource provider.
* [MESOS-8269] - Support resource provider re-subscription in the resource provider manager.
* [MESOS-8270] - Add an agent endpoint to list all active resource providers.
* [MESOS-8309] - Introduce a UUID message type.
* [MESOS-8375] - Use protobuf reflection to simplify upgrading of resources.
* [MESOS-8394] - Bump CSI to 0.1.0.
Release Notes - Mesos - Version 1.4.4 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
* [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
** Improvement
* [MESOS-9159] - Support Foreign URLs in docker registry puller.
* [MESOS-9675] - Docker Manifest V2 Schema2 Support.
Release Notes - Mesos - Version 1.4.3
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-8128] - Make os::pipe file descriptors O_CLOEXEC.
* [MESOS-8568] - Command checks should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`
* [MESOS-8620] - Containers stuck in FETCHING possibly due to unresponsive server.
* [MESOS-8917] - Agent leaking file descriptors into forked processes.
* [MESOS-8921] - Autotools don't work with newer OpenJDK versions
* [MESOS-9144] - Master authentication handling leads to request amplification.
* [MESOS-9145] - Master has a fragile burned-in 5s authentication timeout.
* [MESOS-9146] - Agent has a fragile burned-in 5s authentication timeout.
* [MESOS-9147] - Agent and scheduler driver authentication retry backoff time could overflow.
* [MESOS-9151] - Container stuck at ISOLATING due to FD leak.
* [MESOS-9170] - Zookeeper doesn't compile with newer gcc due to format error.
* [MESOS-9196] - Removing rootfs mounts may fail with EBUSY.
* [MESOS-9221] - If some image layers are large, image pulling may get stuck because the authorization token expires.
* [MESOS-9231] - `docker inspect` may return an unexpected result to Docker executor due to a race condition.
* [MESOS-9279] - Docker Containerizer 'usage' call might be expensive if mount table is big.
* [MESOS-9283] - Docker containerizer actor can get backlogged with large number of containers.
* [MESOS-9304] - Tests `CGROUPS_ROOT_PidNamespaceForward` and `CGROUPS_ROOT_PidNamespaceBackward` fail on 1.4.x.
* [MESOS-9334] - Container stuck at ISOLATING state because libevent poll never returns.
* [MESOS-9419] - Executor to framework message crashes master if framework has not re-registered.
* [MESOS-9480] - Master may skip processing authorization results for `LAUNCH_GROUP`.
* [MESOS-9492] - Persist CNI working directory across reboot.
* [MESOS-9501] - Mesos executor fails to terminate and gets stuck after agent host reboot.
* [MESOS-9502] - IOswitchboard cleanup could get stuck due to FD leak from a race.
* [MESOS-9518] - CNI_NETNS should not be set for orphan containers that do not have network namespace.
* [MESOS-9532] - ResourceOffersTest.ResourceOfferWithMultipleSlaves is flaky.
* [MESOS-9533] - CniIsolatorTest.ROOT_CleanupAfterReboot is flaky.
** Improvement
* [MESOS-9510] - Disallowed nan, inf and so on in `Value::Scalar`.
* [MESOS-9516] - Extend `min_allocatable_resources` flag to cover non-scalar resources.
Release Notes - Mesos - Version 1.4.2
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-4527] - Roles can exceed limit allocation via reservations.
* [MESOS-6616] - Error: dereferencing type-punned pointer will break strict-aliasing rules.
* [MESOS-7099] - Quota can be exceeded due to coarse-grained offer technique.
* [MESOS-7504] - Parent's mount namespace cannot be determined when launching a nested container.
* [MESOS-7975] - The command/default/docker executor can incorrectly send a TASK_FINISHED update even when the task is killed.
* [MESOS-8106] - Docker fetcher plugin unsupported scheme failure message is not accurate.
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused.
* [MESOS-8159] - ns::clone uses an async signal unsafe stack.
* [MESOS-8171] - Using a failoverTimeout of 0 with Mesos native scheduler client can result in infinite subscribe loop.
* [MESOS-8237] - Strip (Offer|Resource).allocation_info for non-MULTI_ROLE schedulers.
* [MESOS-8253] - Mesos CI docker rmi conflict.
* [MESOS-8293] - Reservation may not be allocated when the role has no quota.
* [MESOS-8297] - Built-in driver-based executors ignore kill task if the task has not been launched.
* [MESOS-8339] - Quota headroom may be insufficiently held when role has more reservation than quota.
* [MESOS-8352] - Resources may get over-allocated to some roles while failing to meet the quota of other roles.
* [MESOS-8356] - Persistent volume ownership is set to root despite the sandbox owner (frameworkInfo.user) when docker executor is used.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads.
* [MESOS-8480] - Mesos returns high resource usage when killing a Docker task.
* [MESOS-8488] - Docker bug can cause unkillable tasks.
* [MESOS-8550] - Bug in `Master::detected()` leads to coredump in `MasterZooKeeperTest.MasterInfoAddress`.
* [MESOS-8552] - CGROUPS_ROOT_PidNamespaceForward and CGROUPS_ROOT_PidNamespaceBackward tests fail.
* [MESOS-8569] - Allow newline characters when decoding base64 strings in stout.
* [MESOS-8573] - Container stuck in PULLING when Docker daemon hangs
* [MESOS-8574] - Docker executor makes no progress when 'docker inspect' hangs.
* [MESOS-8575] - Improve discard handling for 'Docker::stop' and 'Docker::pull'.
* [MESOS-8576] - Improve discard handling of 'Docker::inspect()'.
* [MESOS-8604] - Quota headroom tracking may be incorrect in the presence of hierarchical reservation.
* [MESOS-8605] - Terminal task status update will not send if 'docker inspect' is hung.
* [MESOS-8626] - The 'allocatable' check in the allocator is problematic with multi-role frameworks.
* [MESOS-8651] - Potential memory leaks in the `volume/sandbox_path` isolator.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data
* [MESOS-8871] - Agent may fail to recover if the agent dies before image store cache checkpointed.
* [MESOS-8876] - Normal exit of Docker container using rexray volume results in TASK_FAILED.
* [MESOS-8881] - Enable epoll backend in libevent integration.
* [MESOS-8885] - Disable libevent debug mode.
* [MESOS-8904] - Master crash when removing quota.
* [MESOS-8934] - Update python.m4 to support Python 3.
* [MESOS-8935] - Quota limit "chopping" can lead to cpu-only and memory-only offers.
* [MESOS-8936] - Implement a Random Sorter for offer allocations.
* [MESOS-8942] - Master streaming API does not send (health) check updates for tasks.
* [MESOS-8945] - Master check failure due to CHECK_SOME(providerId).
* [MESOS-8947] - Improve the container preparing logging in IOSwitchboard and volume/secret isolator.
* [MESOS-8952] - process::await/collect n^2 performance issue.
* [MESOS-8963] - Executor crash trying to print container ID.
* [MESOS-8980] - mesos-slave can deadlock with docker pull.
* [MESOS-8986] - `slave.available()` in the allocator is expensive and drags down allocation performance.
* [MESOS-8987] - Master asks agent to shutdown upon auth errors.
* [MESOS-9049] - Agent GC could unmount a dangling persistent volume multiple times.
* [MESOS-9088] - `createStrippedScalarQuantity()` should clear all metadata fields.
* [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable"
* [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
Release Notes - Mesos - Version 1.4.1
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-7873] - Expose `ExecutorInfo.ContainerInfo.NetworkInfo` in Mesos `state` endpoint.
* [MESOS-7921] - ProcessManager::resume sometimes crashes accessing EventQueue.
* [MESOS-7964] - Heavy-duty GC makes the agent unresponsive.
* [MESOS-7968] - Handle `/proc/self/ns/pid_for_children` when parsing available namespace.
* [MESOS-7969] - Handle cgroups v2 hierarchy when parsing /proc/self/cgroups.
* [MESOS-7980] - Stout fails to compile with libc >= 2.26.
* [MESOS-8051] - Killing TASK_GROUP fails to kill some tasks.
* [MESOS-8080] - The default executor does not propagate missing task exit status correctly.
* [MESOS-8090] - Mesos 1.4.0 crashes with 1.3.x agent with oversubscription
* [MESOS-8135] - Masters can lose track of tasks' executor IDs.
* [MESOS-8169] - Incorrect master validation forces executor IDs to be globally unique.
Release Notes - Mesos - Version 1.4.0
-------------------------------------
This release contains the following new features:
* [MESOS-5116] - The `disk/xfs` isolator now supports the
`--enforce_container_disk_quota` flag to efficiently measure disk
usage without enforcing usage constraints.
* [MESOS-6223] - Agents are now allowed to recover the agent ID
after a host reboot. See docs/upgrades.md for details.
* [MESOS-6375] - **Experimental** Support for hierarchical resource
allocation roles. Hierarchical roles allow delegation of resource
allocation policies (i.e. fair sharing and quota) further down the
hierarchy. For example, the "engineering" organization gets a 75%
share of the resources, but it's up to the operators within the
"engineering" organization to figure out how to fairly share between
the "engineering/backend" team and the "engineering/frontend" team.
The same delegation applies for quota. NOTE: There are known issues
related to hierarchical roles (e.g. hierarchical quota allocation
is not implemented and quota will be over-allocated if used with
hierarchical roles, see: MESOS-7402) and thus it is not recommended
for production usage at this time.
* [MESOS-7418, MESOS-7088] - File-based secrets are now supported for Mesos
and Universal containerizer. Image-pull secrets are supported for Docker
registry credentials.
* [MESOS-7477] - Linux ambient capabilities are now supported, so
frameworks can run tasks that use ambient capabilities to grant
limited additional privileges to tasks.
* [MESOS-7476, MESOS-7671] - Support for frameworks and operators
specifying Linux bounding capabilities in order to limit the
maximum privileges that a task may acquire.
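The delegation described for hierarchical roles (MESOS-6375) above can be illustrated with a small sketch. This is a hypothetical toy model, not the Mesos allocator: the function names `parent_of` and `delegate_shares`, the per-role weights, and the dictionary representation are assumptions made for illustration. It only shows how a parent role's share (for example 75% for "engineering") might be divided among its child roles.

```python
from collections import defaultdict


def parent_of(role):
    """Return the parent of a hierarchical role name, or None for a top-level role."""
    head, sep, _ = role.rpartition("/")
    return head if sep else None


def delegate_shares(top_level_shares, roles, weights=None):
    """Toy model: split each parent's share among its direct children by weight.

    top_level_shares: share of the cluster held by each top-level role.
    roles: hierarchical role names such as "engineering/backend"; every
           intermediate role must be listed as well.
    weights: optional per-role weights (hypothetical; defaults to 1.0 each).
    """
    weights = weights or {}
    children = defaultdict(list)
    for role in roles:
        parent = parent_of(role)
        if parent is not None:
            children[parent].append(role)

    shares = dict(top_level_shares)
    # Visit shallower parents first so a parent's share is known
    # before it is divided among that parent's children.
    for parent in sorted(children, key=lambda r: r.count("/")):
        total = sum(weights.get(c, 1.0) for c in children[parent])
        for child in children[parent]:
            shares[child] = shares[parent] * weights.get(child, 1.0) / total
    return shares


if __name__ == "__main__":
    result = delegate_shares(
        {"engineering": 0.75},
        ["engineering/backend", "engineering/frontend"],
    )
    for role, share in sorted(result.items()):
        print(role, share)
```

With equal weights the two teams each receive half of the "engineering" share; how the split is actually decided is up to the operators of that part of the hierarchy.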
Deprecations/Removals:
* [MESOS-7671] - LinuxInfo.capabilities is deprecated in favor
of LinuxInfo.effective_capabilities.
* [MESOS-7477] - The agent `--allowed_capabilities` flag is
deprecated in favor of `--effective_capabilities`
Unresolved Critical Issues:
* [MESOS-7643] - The order of isolators provided in '--isolation' flag is not preserved and instead sorted alphabetically
* [MESOS-7402] - Quota is over-allocated when used with hierarchical roles.
Additional API Changes:
* [MESOS-7755] - The interpretation of the optional resource argument
passed in `Allocator::updateSlave` was changed from the total
amount of oversubscribed resources on the agent to the new total
resources (both revocable and non-revocable) on the agent. Custom
allocator implementations should be changed to interpret the
passed value as a total before updating.
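The change in interpretation for MESOS-7755 can be sketched as follows. This is a hypothetical Python model, not the C++ `Allocator` interface: resources are reduced to dictionaries keyed by `(name, revocable)` pairs, and both helper names are assumptions made for illustration.

```python
def total_before_change(non_revocable, passed):
    """Old interpretation: `passed` holds the agent's total oversubscribed
    (revocable) resources, so the agent total is the non-revocable
    resources plus `passed`."""
    total = dict(non_revocable)
    for key, amount in passed.items():
        total[key] = total.get(key, 0.0) + amount
    return total


def total_after_change(passed):
    """New interpretation: `passed` already is the agent's new total
    (revocable and non-revocable), so it is used as-is."""
    return dict(passed)


if __name__ == "__main__":
    non_revocable = {("cpus", False): 8.0}
    oversubscribed = {("cpus", True): 2.0}

    old_total = total_before_change(non_revocable, oversubscribed)
    new_total = total_after_change(
        {("cpus", False): 8.0, ("cpus", True): 2.0}
    )
    print(old_total == new_total)
```

A custom allocator that still adds the passed value to its stored non-revocable resources would double-count under the new interpretation, which is why the value should be treated as a replacement total.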
Feature Graduations:
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3567] - Support TCP checks in Mesos.
All Resolved Issues:
** Bug
* [MESOS-1987] - Add support for SemVer build and prerelease labels to stout.
* [MESOS-4210] - Investigate increasing protobuf protocol message size limit.
* [MESOS-4331] - git commit-msg hook completely breaks fixup commits.
* [MESOS-4467] - Implement `sleep` in Windows
* [MESOS-4983] - Segfault in ProcessTest.Spawn with GCC 6
* [MESOS-4992] - sandbox uri does not work outside mesos http server
* [MESOS-5187] - The filesystem/linux isolator does not set the permissions of the host_path.
* [MESOS-5903] - `GTEST_IS_THREADSAFE` guards prevent many tests from being run on Windows.
* [MESOS-5937] - `flags::parse` assumes the filesystem is rooted at '/'
* [MESOS-5938] - `net::links` is not implemented on Windows.
* [MESOS-6115] - Source tree contains compiled protobuf source
* [MESOS-6539] - Compile warning in GMock: "binding dereferenced null pointer to reference"
* [MESOS-6743] - Docker executor hangs forever if `docker stop` fails.
* [MESOS-6814] - Make sure compilation configuration is propagated correctly to third party dependencies
* [MESOS-6817] - Audit the use of UNICODE-related code paths
* [MESOS-6916] - Improve health checks validation.
* [MESOS-6950] - Launching two tasks with the same Docker image simultaneously may cause a staging dir never cleaned up
* [MESOS-6961] - Executors don't use glog for logging.
* [MESOS-7017] - HTTP API responses can crash the master.
* [MESOS-7115] - Agent should prefer LOG(FATAL) over EXIT().
* [MESOS-7173] - CMake does not define `GIT_SHA` etc. in build.cpp
* [MESOS-7186] - Metrics about used/allocated shared resources are incorrectly accounted.
* [MESOS-7193] - Use of `GTEST_IS_THREADSAFE` in asserts is problematic.
* [MESOS-7252] - Need to fix resource check in long-lived framework
* [MESOS-7268] - CNI isolator should mount network related /etc/* files in readonly mode
* [MESOS-7351] - CMake < 3.8.0 cannot find VS2017 tools
* [MESOS-7373] - Remove thread_local workaround on OSX
* [MESOS-7374] - Running DOCKER images in Mesos Container Runtime without `linux/filesystem` isolation enabled renders host unusable
* [MESOS-7378] - Build failure with glibc 2.12.
* [MESOS-7381] - Flaky tests in NestedMesosContainerizerTest
* [MESOS-7389] - Mesos 1.2.0 crashes with pre-1.0 Mesos agents.
* [MESOS-7403] - Resources::apply(const Offer::Operation&) should fail when a shared persistent volume can't be removed
* [MESOS-7441] - RegisterSlaveValidationTest.DropInvalidRegistration is flaky
* [MESOS-7457] - HierarchicalAllocatorTest.NestedRoleQuota is flaky
* [MESOS-7458] - webui display of framework resources is confusing
* [MESOS-7459] - Fix the duration.hpp warning
* [MESOS-7462] - Flaky test HierarchicalAllocatorTest.NestedRoleDRF
* [MESOS-7464] - Recent Docker versions cannot be parsed by stout.
* [MESOS-7468] - Could not copy the sandbox path on WebUI
* [MESOS-7471] - Provisioner recover should not always assume 'rootfses' dir exists.
* [MESOS-7476] - Restrict capabilities to only the bounding set.
* [MESOS-7484] - VersionTest.ParseInvalid aborts on Windows.
* [MESOS-7496] - The /debug:fastlink linker option is not being respected
* [MESOS-7498] - Remove need to set environment variable `PreferredToolArchitecture`
* [MESOS-7502] - Build error on Windows when using "int" for a file descriptor
* [MESOS-7507] - Add a metric for the network size of replicas for the registry.
* [MESOS-7515] - MasterAllocatorTest/0.ResourcesUnused is flaky
* [MESOS-7524] - Basic fetcher success metrics
* [MESOS-7545] - Volume secret isolator breaks Windows build
* [MESOS-7552] - MasterAllocatorTest/0.FrameworkExited is flaky
* [MESOS-7569] - Allow "old" executors with half-open connections to be preserved during agent upgrade / restart.
* [MESOS-7581] - Specifying an unbundled dependency can cause build to pick up wrong Boost version
* [MESOS-7584] - ASF Jenkins build errors out on missing 'python-six' dependency
* [MESOS-7597] - libprocess build is broken
* [MESOS-7618] - CMake files incompatible with multi-configuration generators
* [MESOS-7627] - Mesos slave gets stuck
* [MESOS-7638] - The command `false` does not exist on Windows
* [MESOS-7640] - Docker containerizer fails to set sandbox logs ownership correctly.
* [MESOS-7652] - Docker image with universal containerizer does not work if WORKDIR is missing in the rootfs.
* [MESOS-7655] - Reservation Refinement: Update the resources logic.
* [MESOS-7662] - Documentation regarding TASK_LOST is misleading
* [MESOS-7666] - Update the agent to use the new resource format
* [MESOS-7667] - Update the master to use the new resource format.
* [MESOS-7669] - Update the test utilities to produce the resources in the new format
* [MESOS-7671] - Let frameworks specify the task bounding capabilities.
* [MESOS-7674] - Update the generic Protobuf to JSON facility to not output deprecated fields
* [MESOS-7679] - V1 Operator API update for reservation refinement.
* [MESOS-7689] - Libprocess can crash on malformed request paths for libprocess messages.
* [MESOS-7690] - The agent can crash when an unknown executor tries to register.
* [MESOS-7700] - Prevent reserve/create operations with refined reservations on non-capable agents.
* [MESOS-7703] - Mesos fails to exec a custom executor when no shell is used
* [MESOS-7711] - Master updates registry for reregistering agents even when they haven't been unreachable
* [MESOS-7714] - Fix agent downgrade for reservation refinement
* [MESOS-7716] - Mesos 1.2.0 agent crashes Mesos 1.4.0 master
* [MESOS-7725] - PersistentVolumeEndpointsTest.ReserveAndSlaveRemoval test is flaky
* [MESOS-7728] - Java HTTP adapter crashes JVM when leading master disconnects.
* [MESOS-7735] - The master crashes when state endpoint is hit during a task authorization.
* [MESOS-7744] - Mesos Agent Sends TASK_KILL status update to Master, and still launches task
* [MESOS-7751] - Mesos failed to build on Windows due to error C2039: 'parse': is not a member of 'mesos::internal::protobuf'
* [MESOS-7753] - `log.LearnedMessage` could be rejected due to being sent from '@0.0.0.0:0'
* [MESOS-7758] - Stout doesn't build standalone.
* [MESOS-7761] - Website ruby deps do not bundle on macOS
* [MESOS-7765] - MasterTest.KillUnknownTask is failing due to a bug in `net::IPv4::ANY()`
* [MESOS-7769] - libprocess initializes to bind to random port if --ip is not specified
* [MESOS-7770] - Persistent volume might not be mounted if there is a sandbox volume whose source is the same as the target of the persistent volume.
* [MESOS-7772] - Copy-n-paste error in slave/main.cpp
* [MESOS-7775] - Eliminate extra process abort in a subprocess watchdog
* [MESOS-7777] - Agent failed to recover due to mount namespace leakage in Docker 1.12/1.13
* [MESOS-7778] - Hide per-platform subprocess headers.
* [MESOS-7783] - Framework might not receive status update when a just launched task is killed immediately
* [MESOS-7794] - Mesos failed with error c2102 when built in conformance mode (/permissive-)
* [MESOS-7796] - LIBPROCESS_IP isn't passed on to the fetcher
* [MESOS-7797] - Hard-coded forward slash breaks windows docker container task in DC/OS
* [MESOS-7805] - mesos-execute has incorrect example TaskInfo in help string
* [MESOS-7817] - CreateProcess wrapper's error message is bad
* [MESOS-7821] - Resource refinement does downgrade task.executor.resources in LAUNCH_GROUP handler.
* [MESOS-7830] - Sandbox_path volume does not have ownership set correctly.
* [MESOS-7831] - Resource refinement is not applied to tasks in completed_frameworks.
* [MESOS-7849] - The rlimits and linux/capabilities isolators should support nested containers
* [MESOS-7858] - Launching a nested container with namespace/pid isolation, with glibc < 2.25, may deadlock the LinuxLauncher and MesosContainerizer
* [MESOS-7863] - Agent may drop pending kill task status updates.
* [MESOS-7865] - Agent may process a kill task and still launch the task.
* [MESOS-7869] - Build fails with `--disable-zlib` or `--with-zlib=DIR`
* [MESOS-7871] - Agent fails assertion during request to '/state'
* [MESOS-7872] - Scheduler hang when registration fails.
* [MESOS-7888] - Track fetcher task success and failures
* [MESOS-7909] - Ordering dependency between 'linux/capabilities' and 'docker/runtime' isolator.
* [MESOS-7912] - Master WebUI not working in Chrome.
* [MESOS-7921] - process::EventQueue sometimes crashes
* [MESOS-7922] - Fix communication between old masters and new agents.
* [MESOS-7926] - Abnormal termination of default executor can cause MesosContainerizer::destroy to fail.
* [MESOS-7934] - OOM due to LibeventSSLSocket send incorrectly returning 0 after shutdown.
** Documentation
* [MESOS-7246] - Add documentation for AGENT_ADDED/AGENT_REMOVED events.
* [MESOS-7349] - Document Mesos "check" feature.
* [MESOS-7501] - Change legacy --with-network-isolator to --with-port-mapping-isolator
** Epic
* [MESOS-6975] - Prevent pre-1.0 agents from registering with 1.3+ master.
* [MESOS-7088] - Support private registry credential per container.
* [MESOS-7623] - Automatically publish website through CI
** Improvement
* [MESOS-5116] - Add support for accounting only mode in XFS isolator.
* [MESOS-5417] - define WSTRINGIFY behaviour on Windows
* [MESOS-6053] - Combine test helpers into one single binary.
* [MESOS-6223] - Allow agents to reregister post a host reboot
* [MESOS-6535] - The default executor should support kill policies
* [MESOS-6549] - Asynchronous dir removal in agent GC
* [MESOS-6782] - Inherit Environment from parent container when launching DEBUG container.
* [MESOS-6905] - Task status updates caused by task health update do not set appropriate reason.
* [MESOS-6976] - Disallow (re-)registration attempts by old agents.
* [MESOS-6977] - Cleanup tech debt in master for old agents
* [MESOS-6978] - Update webui to remove orphan tasks
* [MESOS-7006] - Launch docker containers with --cpus instead of cpu-shares
* [MESOS-7015] - Frameworks should be able to (re)register in suppressed state
* [MESOS-7092] - Health checker duplicates a lot of checker's functionality.
* [MESOS-7228] - Upgrade Mesos to build with proto3.
* [MESOS-7327] - Add a test with multiple tasks and checks for the default executor.
* [MESOS-7343] - Add a ReviewBot for testing patches on Windows
* [MESOS-7355] - Set MESOS_SANDBOX in debug containers.
* [MESOS-7364] - Upgrade vendored GMock / GTest
* [MESOS-7401] - Optionally reject messages when UPIDs do not match IP.
* [MESOS-7418] - Add support for file-based secrets
* [MESOS-7429] - Allow isolators to inject task-specific environment variables.
* [MESOS-7451] - Expose MOUNT volumes of an agent in master's v0 HTTP API
* [MESOS-7477] - Support ambient capabilities.
* [MESOS-7540] - Add an agent flag for executor re-registration timeout.
* [MESOS-7542] - Add executor reconnection retry logic to the agent
* [MESOS-7572] - Attach latest symlink when executor is registered.
* [MESOS-7585] - Added 'mesos config show' command to the new Mesos CLI.
* [MESOS-7608] - Protobuf definitions for domains
* [MESOS-7609] - Protobuf definitions for region-aware framework capability
* [MESOS-7610] - Support domains in master and agent
* [MESOS-7611] - Prevent master from joining mixed-region cluster
* [MESOS-7612] - Prevent agent with misconfigured domain from registering
* [MESOS-7614] - Only offer resources on remote agents to region-aware frameworks
* [MESOS-7630] - Add simple filtering to unversioned operator API
* [MESOS-7644] - Add DomainInfo to offers
* [MESOS-7782] - Add fetcher cache size metrics.
* [MESOS-7792] - Add support for ECDH ciphers
* [MESOS-7808] - Bundling gRPC into 3rdparty
* [MESOS-7809] - Building gRPC with Autotools
* [MESOS-7810] - gRPC support in libprocess
* [MESOS-7814] - Improve the test frameworks.
* [MESOS-7862] - Get rid of timestamp and date in generated javadoc files
* [MESOS-7870] - Refactor libssl and libcrypto checks for building gRPC
* [MESOS-7881] - Building gRPC with CMake
** Task
* [MESOS-6101] - Add Framework events to master's operator API
* [MESOS-6162] - Add support for cgroups blkio subsystem blkio statistics.
* [MESOS-6441] - Display reservations in the agent page in the webui.
* [MESOS-7149] - Support reservations for role subtrees
* [MESOS-7283] - Add ability to initialize a test cluster for Mesos CLI unit-test infrastructure
* [MESOS-7304] - Fetcher should not depend on SlaveID.
* [MESOS-7315] - Design doc for resource provider and storage integration.
* [MESOS-7414] - Enable authorization for master's logging API calls: GET_LOGGING_LEVEL and SET_LOGGING_LEVEL
* [MESOS-7415] - Add authorization to master's operator maintenance API in v0 and v1
* [MESOS-7416] - Filter results of `/master/slaves` and the v1 call GET_AGENTS
* [MESOS-7417] - Design doc for file-based secrets.
* [MESOS-7433] - Set working directory in DEBUG containers.
* [MESOS-7449] - Refactor containerizers to not depend on TaskInfo or ExecutorInfo
* [MESOS-7488] - Add `--ip6` and `--ip6_discovery_command` flag to Mesos agent
* [MESOS-7505] - Enable hierarchical roles
* [MESOS-7560] - Add 'type' and 'name' to ResourceProviderInfo.
* [MESOS-7571] - Add `--resource_provider_config_dir` flag to the agent.
* [MESOS-7576] - Add master flag `--filter-gpu-resources={true|false}`
* [MESOS-7582] - Add Config class to manage the Mesos CLI config file.
* [MESOS-7591] - Update master to use resource provider IDs instead of agent ID in allocator calls.
* [MESOS-7593] - Update offer handling in the master to consider local resource providers
* [MESOS-7624] - Move website from svn to git
* [MESOS-7625] - Create script to automate publishing website
* [MESOS-7626] - Create a CI job to publish the website
* [MESOS-7631] - DefaultExecutor needs to inform tasks about IP addresses
* [MESOS-7632] - Add `HIERARCHICAL_ROLE` agent capability
* [MESOS-7633] - Prevent hierarchical roles from being allocated resources from non-HIERARCHICAL_ROLE agents.
* [MESOS-7665] - V0 Operator API update for reservation refinement.
* [MESOS-7668] - Update authorization to handle reservation refinement.
* [MESOS-7696] - Update resource provider design in the master
* [MESOS-7709] - Add --default_container_dns flag to the agent.
* [MESOS-7713] - Optimize number of copies made in dispatch/defer mechanism
* [MESOS-7755] - Update allocator to support updating agent total resources
* [MESOS-7757] - Update master to handle updates to agent total resources
* [MESOS-7767] - Make `net::IP` fields protected to allow for inheritance
* [MESOS-7780] - Add `SUBSCRIBE` call handling to the resource provider manager
* [MESOS-7806] - Add copy assignment operator to `net::IP::Network`
* [MESOS-7853] - Support shared PID namespace.
* [MESOS-7879] - The kill nested container call should provide ability to specify a signal.
Release Notes - Mesos - Version 1.3.3 (WIP) - cancelled
-------------------------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused.
* [MESOS-8171] - Using a failoverTimeout of 0 with Mesos native scheduler client can result in infinite subscribe loop.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8480] - Mesos returns high resource usage when killing a Docker task.
* [MESOS-8488] - Docker bug can cause unkillable tasks.
* [MESOS-8552] - CGROUPS_ROOT_PidNamespaceForward and CGROUPS_ROOT_PidNamespaceBackward tests fail.
* [MESOS-8574] - Docker executor makes no progress when 'docker inspect' hangs.
* [MESOS-8575] - Improve discard handling for 'Docker::stop' and 'Docker::pull'.
* [MESOS-8576] - Improve discard handling of 'Docker::inspect()'.
* [MESOS-8605] - Terminal task status update will not send if 'docker inspect' is hung.
* [MESOS-8651] - Potential memory leaks in the `volume/sandbox_path` isolator.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8876] - Normal exit of Docker container using rexray volume results in TASK_FAILED.
* [MESOS-8881] - Enable epoll backend in libevent integration.
* [MESOS-8885] - Disable libevent debug mode.
* [MESOS-8904] - Master crash when removing quota.
Release Notes - Mesos - Version 1.3.2
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-6743] - Docker executor hangs forever if `docker stop` fails.
* [MESOS-6950] - Launching two tasks with the same Docker image simultaneously may cause a staging dir to never be cleaned up.
* [MESOS-7652] - Docker image with universal containerizer does not work if WORKDIR is missing in the rootfs.
* [MESOS-7674] - Update the generic Protobuf to JSON facility to not output deprecated fields.
* [MESOS-7858] - Launching a nested container with namespace/pid isolation, with glibc < 2.25, may deadlock the LinuxLauncher and MesosContainerizer.
* [MESOS-7863] - Agent may drop pending kill task status updates.
* [MESOS-7865] - Agent may process a kill task and still launch the task.
* [MESOS-7872] - Scheduler hang when registration fails.
* [MESOS-7909] - Ordering dependency between 'linux/capabilities' and 'docker/runtime' isolator.
* [MESOS-7912] - Master WebUI not working in Chrome.
* [MESOS-7926] - Abnormal termination of default executor can cause MesosContainerizer::destroy to fail.
* [MESOS-7934] - OOM due to LibeventSSLSocket send incorrectly returning 0 after shutdown.
* [MESOS-8135] - Masters can lose track of tasks' executor IDs.
* [MESOS-8237] - Strip (Offer|Resource).allocation_info for non-MULTI_ROLE schedulers.
* [MESOS-8356] - Persistent volume ownership is set to root despite the sandbox owner (frameworkInfo.user) when docker executor is used.
Release Notes - Mesos - Version 1.3.1
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-5187] - The filesystem/linux isolator does not set the permissions of the host_path.
* [MESOS-7252] - Need to fix resource check in long-lived framework.
* [MESOS-7429] - Allow isolators to inject task-specific environment variables.
* [MESOS-7540] - Add an agent flag for executor re-registration timeout.
* [MESOS-7546] - WAIT_NESTED_CONTAINER sometimes returns 404.
* [MESOS-7569] - Allow "old" executors with half-open connections to be preserved during agent upgrade / restart.
* [MESOS-7581] - Fix interference of external Boost installations when using some unbundled dependencies.
* [MESOS-7689] - Libprocess can crash on malformed request paths for libprocess messages.
* [MESOS-7690] - The agent can crash when an unknown executor tries to register.
* [MESOS-7692] - Default environment variables defined in Docker image are not available in Mesos containerizer.
* [MESOS-7703] - Mesos fails to exec a custom executor when no shell is used.
* [MESOS-7728] - Java HTTP adapter crashes JVM when leading master disconnects.
* [MESOS-7770] - Persistent volume might not be mounted if there is a sandbox volume whose source is the same as the target of the persistent volume.
* [MESOS-7777] - Agent failed to recover due to mount namespace leakage in Docker 1.12/1.13.
* [MESOS-7796] - LIBPROCESS_IP isn't passed on to the fetcher.
* [MESOS-7830] - Sandbox_path volume does not have ownership set correctly.
Release Notes - Mesos - Version 1.3.0
-------------------------------------
This release contains the following new features:
* [MESOS-1763] - Support for frameworks to receive resources for multiple
roles. This allows "multi-user" frameworks to leverage the role-based
resource allocation in Mesos. Prior to this support, one had to run
multiple instances of a single-user framework to achieve multi-user
resource allocation, or implement multi-user resource allocation in
the framework.
* [MESOS-6365] - Authentication and authorization support for HTTP executors.
A new `--authenticate_http_executors` agent flag enables required
authentication on the HTTP executor API. A new `--executor_secret_key` flag
sets a key file to be used when generating and authenticating default tokens
that are passed to HTTP executors. Note that enabling these flags after
upgrade is disruptive to HTTP executors that were launched before the
upgrade; see 'docs/authentication.md' for more information on these flags
and the recommended upgrade procedure. Implicit authorization rules have
been added which allow an authenticated executor to make executor API calls
as that executor and make operator API calls which affect that executor's
container. See 'docs/authorization.md' for more information on these
implicit authorization rules.
* [MESOS-6627] - Support for frameworks to modify the role(s) they are
subscribed to. This is essential to supporting "multi-user" frameworks
(see MESOS-1763) in that roles are expected to come and go over time
(e.g. new employees join, new teams are formed, employees leave, teams
are disbanded, etc).
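As a rough illustration of the multi-role features above (MESOS-1763, MESOS-6627), the following sketch builds the JSON body of a v1 scheduler `SUBSCRIBE` call that lists several roles and declares the `MULTI_ROLE` capability. The helper name `build_subscribe_call`, the user, the framework name, the role names, and the framework ID are all hypothetical; treat this as a sketch under those assumptions and check the scheduler HTTP API documentation for the authoritative message layout.

```python
import json


def build_subscribe_call(user, name, roles, framework_id=None):
    """Build the body of a v1 scheduler SUBSCRIBE call (illustrative helper)."""
    framework_info = {
        "user": user,
        "name": name,
        # Multi-role frameworks list their roles in the plural `roles` field.
        "roles": sorted(set(roles)),
        "capabilities": [{"type": "MULTI_ROLE"}],
    }
    call = {"type": "SUBSCRIBE", "subscribe": {"framework_info": framework_info}}
    if framework_id is not None:
        # Re-subscribing (for example with a changed role set) reuses the
        # framework ID that the master assigned on the first subscription.
        framework_info["id"] = {"value": framework_id}
        call["framework_id"] = {"value": framework_id}
    return call


# First subscription with two (hypothetical) roles.
first = build_subscribe_call("alice", "example-framework", ["eng", "sales"])

# Later re-subscription: one role dropped, one added.
later = build_subscribe_call(
    "alice", "example-framework", ["eng", "support"], framework_id="hypothetical-id-0000"
)

print(json.dumps(later, indent=2))
```

Changing the role set is expressed here simply as a second `SUBSCRIBE` with an updated `roles` list and the same framework ID.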
**NOTE**: In Mesos 1.3.0, the master will no longer allow 0.x agents to
register. Interoperability between 1.1+ masters and 0.x agents has never
been supported; however, it was not explicitly disallowed, either.
Starting with this release of Mesos, registration attempts by 0.x Mesos
agents will be ignored.
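The effect of this note can be pictured as a simple version gate. The sketch below is not the master's actual code; the function names and the treatment of a missing version string are assumptions made for illustration.

```python
def parse_major(version):
    """Return the major component of a version string such as '1.3.0' or '0.28.2'."""
    return int(version.split(".", 1)[0])


def should_ignore_agent(agent_version):
    """Decide whether a registration attempt would be ignored (illustrative only).

    Assumption: an agent that reports no version is treated like a 0.x agent.
    """
    if not agent_version:
        return True
    return parse_major(agent_version) < 1


for version in ["0.28.2", "1.0.0", "1.3.0"]:
    print(version, "ignored" if should_ignore_agent(version) else "accepted")
```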
Deprecations/Removals:
* [MESOS-7259] - Remove deprecated ACLs `SetQuota` and `RemoveQuota`.
This change is only applicable to the local authorizer since internally
these ACLs were being translated to the `UPDATE_QUOTA` action.
* [MESOS-7320] - Remove deprecated ACL `ShutdownFramework`.
This change is only applicable to the local authorizer since internally
this ACL was being translated to the `TEARDOWN_FRAMEWORK` action.
Unresolved Critical Issues:
* [MESOS-1625] - Extra trailing CRLF being sent after the HTTP body in libprocess.
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode().
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration.
* [MESOS-3533] - Unable to find and run URIs files.
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string.
* [MESOS-3794] - Master should not store arbitrarily sized data in ExecutorInfo.
* [MESOS-4259] - Mesos HA can't delete the redundant container on a failed slave node.
* [MESOS-4297] - Executor does not shutdown when framework teardown.
* [MESOS-4642] - Mesos Agent Json API can dump binary data from log files out as invalid JSON.
* [MESOS-4996] - 'containerizer->update' will always fail after killing a docker container.
* [MESOS-5352] - Docker volume isolator cleanup can be blocked by first cleanup failure.
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5849] - Agent sandboxes on Windows surpass the 260 character path length limit.
* [MESOS-5859] - Some tasks are always in staged state.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-5995] - Protobuf JSON deserialisation does not accept numbers formatted as strings.
* [MESOS-6356] - ASF CI has interleaved logging.
* [MESOS-6615] - Running mesos-slave in Docker leaves many zombie processes.
* [MESOS-6623] - Re-enable tests impacted by request streaming support.
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-6780] - ContentType/AgentAPIStreamingTest.AttachContainerInput test fails reliably.
* [MESOS-6784] - IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky.
* [MESOS-6804] - Running 'tty' inside a debug container that has a tty reports "Not a tty".
* [MESOS-6843] - Fetcher should not assume stdout/stderr in the sandbox.
* [MESOS-6913] - AgentAPIStreamingTest.AttachInputToNestedContainerSession fails on Mac OS.
* [MESOS-6974] - DefaultExecutorTest.CommitSuicideOnTaskFailure test is flaky.
* [MESOS-6986] - `abort` in `DRFSorter::add`.
* [MESOS-7017] - HTTP API responses can crash the master.
* [MESOS-7082] - ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.KillTask/0 is flaky.
* [MESOS-7099] - Quota can be exceeded due to coarse-grained offer technique.
* [MESOS-7215] - Race condition on re-registration of non-partition-aware frameworks.
* [MESOS-7298] - Fetcher caches files with world-readable permissions.
* [MESOS-7362] - GPU support does not work when running Spark.
* [MESOS-7374] - Running DOCKER images in Mesos Container Runtime without `linux/filesystem` isolation enabled renders host unusable.
* [MESOS-7381] - Flaky tests in NestedMesosContainerizerTest.
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die or are killed.
Feature Graduations:
* [MESOS-2449] - Support group of tasks (Pod) constructs and API in Mesos.
* [MESOS-4641] - Support Container Network Interface (CNI).
* [MESOS-6419] - Teardown unregistered frameworks.
All Experimental Features:
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3094] - Mesos on Windows.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-3567] - Support TCP checks in Mesos.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4355] - Implement isolator for Docker volume.
* [MESOS-4791] - Operator API v1.
* [MESOS-4828] - XFS disk quota isolator.
* [MESOS-5275] - Add capabilities support for mesos containerizer.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-5788] - Added JAVA API adapter for seamless transition to new scheduler API.
* [MESOS-5931] - Support auto backend in Mesos Containerizer.
* [MESOS-6014] - Added port mapping CNI plugin.
* [MESOS-6077] - Added a default (task group) executor.
* [MESOS-6402] - rlimit support for Mesos containerizer.
* [MESOS-6460] - Container Attach/Exec.
* [MESOS-6758] - Support docker registry that requires basic auth.
* [MESOS-6906] - Introduce a general non-interpreting task check.
All Resolved Issues:
** Bug
* [MESOS-1987] - Add support for SemVer build and prerelease labels to stout.
* [MESOS-4245] - Add `dist` target to CMake solution.
* [MESOS-4263] - Report volume usage through ResourceStatistics.
* [MESOS-5028] - Copy provisioner cannot replace directory with symlink.
* [MESOS-5172] - Registry puller cannot fetch blobs correctly from http Redirect 3xx urls.
* [MESOS-5288] - Update leveldb patch file to support s390x.
* [MESOS-5880] - Semantics of `environment` differ across Windows and POSIX.
* [MESOS-6134] - Port CFS quota support to Docker Containerizer using command executor.
* [MESOS-6138] - Add 'syntax=proto2' to all .proto files in Mesos.
* [MESOS-6327] - Large docker images cause container launch failures: Too many levels of symbolic links.
* [MESOS-6560] - The default stout stringify always copies its argument.
* [MESOS-6606] - Reject optimized builds with libcxx before 3.9.
* [MESOS-6720] - Check that `PreferredToolArchitecture` is set to `x64` on Windows before building.
* [MESOS-6730] - Reserve operation should validate reserved resource role against resource allocationInfo role.
* [MESOS-6731] - Create a test filter for stout tests that use `symlink` on Windows, as they will fail if not run as admin.
* [MESOS-6732] - XFS disk isolator should check whether quotas are enabled.
* [MESOS-6742] - Adding support for s390x architecture.
* [MESOS-6815] - Enable glog stack traces when we call things like `ABORT` on Windows.
* [MESOS-6858] - network/cni isolator generates incomplete resolv.conf.
* [MESOS-6868] - Transition Windows away from `os::killtree`.
* [MESOS-6892] - Reconsider process creation primitives on Windows.
* [MESOS-6907] - FutureTest.After3 is flaky.
* [MESOS-6951] - Docker containerizer: mangled environment when env value contains LF byte.
* [MESOS-6953] - A compromised mesos-master node can execute code as root on agents.
* [MESOS-6976] - Disallow (re-)registration attempts by old agents.
* [MESOS-6982] - PerfTest.Version fails on recent Arch Linux.
* [MESOS-7022] - Update framework authorization to support multiple roles.
* [MESOS-7029] - FaultToleranceTest.FrameworkReregister is flaky.
* [MESOS-7035] - Add test for framework upgrading to MULTI_ROLE with tasks running.
* [MESOS-7049] - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_PERF_PerfTest is broken on Fedora 25.
* [MESOS-7097] - Framework credentials can be used to register as an agent.
* [MESOS-7133] - mesos-fetcher fails with openssl-related output.
* [MESOS-7135] - Outstanding offers to a dropped framework role should be rescinded.
* [MESOS-7146] - OSX broken due to wrong configuration of LevelDB after update.
* [MESOS-7158] - Add `role` to task/executor to indicate allocation role of their resources.
* [MESOS-7165] - Agents should be able to upgrade to be MULTI_ROLE capable.
* [MESOS-7172] - CMake does not incrementally recompile.
* [MESOS-7182] - Couple of MULTI_ROLE related tests are flaky.
* [MESOS-7197] - Requesting tiny amount of CPU crashes master.
* [MESOS-7208] - Persistent volume ownership is set to root when task is running with non-root user.
* [MESOS-7210] - HTTP health check doesn't work when mesos runs with --docker_mesos_image.
* [MESOS-7225] - Tasks launched via the default executor cannot access disk resource volumes.
* [MESOS-7236] - Base64 encoding/decoding (via stout) behaves differently on Windows.
* [MESOS-7237] - Enabling cgroups_limit_swap can lead to "invalid argument" error.
* [MESOS-7248] - RemoveNestedContainer returns unsupported.
* [MESOS-7255] - New mesos-style.py linter behavior breaks committing when virtualenv is not installed.
* [MESOS-7259] - Remove deprecated ACLs `SetQuota` and `RemoveQuota`.
* [MESOS-7261] - maintenance.html is missing during packaging.
* [MESOS-7263] - User supplied task environment variables cause warnings in sandbox stdout.
* [MESOS-7264] - Possibly duplicate environment variables should not leak values to the sandbox.
* [MESOS-7265] - Containerizer startup may cause sensitive data to leak into sandbox logs.
* [MESOS-7270] - Java V1 Framework Test failed on macOS.
* [MESOS-7272] - Unified containerizer does not support docker registry version < 2.3.
* [MESOS-7280] - Unified containerizer provisions docker image error with COPY backend.
* [MESOS-7281] - Backwards incompatible UpdateFrameworkMessage handling.
* [MESOS-7287] - Fix post-reviews.py to find `rbt.cmd` on Windows.
* [MESOS-7300] - Mesos failed to build on Windows due to error C2440: 'return': cannot convert from 'Error' to 'bool'.
* [MESOS-7311] - CopyFetcherPluginTest.FetchExistingFile.
* [MESOS-7316] - Upgrading Mesos to 1.2.0 results in some information missing from the `/flags` endpoint.
* [MESOS-7323] - Framework role tracking in allocator results in framework treated as active incorrectly.
* [MESOS-7340] - Log HTTP accesses to the /files endpoint.
* [MESOS-7346] - Agent crashes if the task name is too long.
* [MESOS-7348] - Network isolator crashes agent on startup when network interface cannot be found.
* [MESOS-7350] - Failed to pull image from Nexus Registry due to signature missing.
* [MESOS-7363] - Improve master robustness against duplicate UPIDs.
* [MESOS-7365] - Compile error with recent glibc.
* [MESOS-7372] - Improve agent re-registration robustness.
* [MESOS-7378] - Build failure with glibc 2.12.
* [MESOS-7389] - Mesos 1.2.0 crashes with pre-1.0 Mesos agents.
* [MESOS-7400] - The mesos master crashes due to an incorrect invariant check in the decoder.
* [MESOS-7427] - Registry puller cannot fetch manifests from Amazon ECR: 405 Unsupported.
* [MESOS-7430] - Per-role Suppress call implementation is broken.
* [MESOS-7431] - Registry puller cannot fetch manifests from Google GCR: 403 Forbidden.
* [MESOS-7453] - glyphicons-halflings-regular.woff2 is missing in WebUI.
* [MESOS-7456] - Compilation error on recent glibc in cgroups device subsystem.
* [MESOS-7464] - Recent Docker versions cannot be parsed by stout.
* [MESOS-7471] - Provisioner recover should not always assume 'rootfses' dir exists.
* [MESOS-7478] - Pre-1.2.x master does not work with 1.2.x agent.
* [MESOS-7484] - VersionTest.ParseInvalid aborts on Windows.
* [MESOS-7521] - Major performance regression in DRF sorter.
* [MESOS-7538] - Don't validate re-registrations that are going to be dropped.
** Documentation
* [MESOS-7005] - Add executor authentication documentation.
* [MESOS-7324] - Update documentation to reflect the addition of multi-role framework support.
** Epic
* [MESOS-1763] - Add support for frameworks to receive resources for multiple roles.
* [MESOS-6365] - Executor authentication.
* [MESOS-6627] - Allow frameworks to modify the role(s) they are subscribed to.
** Improvement
* [MESOS-970] - Upgrade bundled leveldb to 1.19.
* [MESOS-5186] - mesos.interface: Allow using protobuf 3.x.
* [MESOS-5992] - Complete the list of API Calls on the Operator HTTP API Doc.
* [MESOS-6280] - Task group executor should support command health checks.
* [MESOS-6304] - Add authentication support to the default executor.
* [MESOS-6523] - Agent cgroup assignment should precede agent initialization.
* [MESOS-6906] - Introduce a general non-interpreting task check.
* [MESOS-7021] - Consistent symlink behavior for os::stat accessors.
* [MESOS-7074] - port_mapping isolator: do not depend on /sys/class/net/<ifname>/speed.
* [MESOS-7101] - ExamplesTest.PersistentVolumeFramework failed on ASF CI.
* [MESOS-7120] - Add an Agent API call to cleanup nested container artifacts.
* [MESOS-7226] - Introduce precompiled headers (on Windows).
* [MESOS-7249] - Default executor does not support general checks.
* [MESOS-7256] - Replace Boost Type Traits leftovers with STL.
* [MESOS-7274] - Health checker does not support pause / resume.
* [MESOS-7275] - General checker does not support TCP checks.
* [MESOS-7276] - General checker does not support pause / resume.
* [MESOS-7277] - General checker does not support command checks via agent.
* [MESOS-7376] - Reduce copying of the Registry to improve Registrar performance.
* [MESOS-7387] - ZK master contender and detector don't respect zk_session_timeout option.
** Task
* [MESOS-3139] - Incorporate CMake into standard documentation.
* [MESOS-5418] - Test case: Escape containerizer command line on Windows.
* [MESOS-6022] - unit-test for port-mapper CNI plugin.
* [MESOS-6032] - Add infrastructure for unit tests in the new python-based CLI.
* [MESOS-6123] - Implement GET_AGENT call in v1 agent API.
* [MESOS-6447] - Display role weight / role quota information in the webui.
* [MESOS-6636] - Validate that tasks / executors / reservations / volumes do not mix Resource.allocation_info.roles.
* [MESOS-6637] - Validate that schedulers cannot perform operations on offers with different allocation roles.
* [MESOS-6657] - Update the webui to reflect that frameworks have multiple roles.
* [MESOS-6691] - Enable SSL in Mesos builds.
* [MESOS-6762] - Update release notes for multi-role changes.
* [MESOS-6791] - Allow specifying the device whitelist entries in cgroup devices subsystem.
* [MESOS-6808] - Refactor Docker::run to only take docker cli parameters.
* [MESOS-6855] - Add `role` section to response of /state endpoint.
* [MESOS-6886] - Add authorization tests for debug API handlers.
* [MESOS-6940] - Do not send offers to MULTI_ROLE schedulers if agent does not have MULTI_ROLE capability.
* [MESOS-6967] - Ensure offer operations can be applied for MULTI_ROLE and non-MULTI_ROLE frameworks.
* [MESOS-6992] - Remove validation against "/" characters in roles to support hierarchical roles.
* [MESOS-6995] - Update the webui to reflect hierarchical roles.
* [MESOS-6996] - Add a 'Secret' protobuf message.
* [MESOS-6997] - Add the SecretGenerator module interface.
* [MESOS-6998] - Add authentication support to agent's '/v1/executor' endpoint.
* [MESOS-6999] - Add agent support for generating and passing executor secrets.
* [MESOS-7000] - Implement a JWT SecretGenerator.
* [MESOS-7001] - Implement a JWT authenticator.
* [MESOS-7003] - Introduce a 'Principal' type.
* [MESOS-7004] - Enable multiple HTTP authenticator modules.
* [MESOS-7009] - Add a 'secret' field to the 'Environment' message.
* [MESOS-7011] - Add an '--executor_secret_key' flag to the agent.
* [MESOS-7013] - Update the authorizer interface for executor authentication.
* [MESOS-7014] - Add implicit executor authorization to local authorizer.
* [MESOS-7024] - Update the allocator to handle hierarchical roles.
* [MESOS-7026] - Update authorization / authorization-filtering to handle hierarchical roles.
* [MESOS-7037] - Prevent setting quota on nested roles not contained by parent role quota.
* [MESOS-7038] - Update quota cluster capacity heuristic for hierarchical roles.
* [MESOS-7039] - Prevent quota removal that violates parent role-child role quota containment.
* [MESOS-7047] - Update agent for hierarchical roles.
* [MESOS-7048] - Remove adjustment code within Resources::apply.
* [MESOS-7061] - Re-persist tasks/executors with allocation info during agent recovery.
* [MESOS-7063] - Add a test for a MULTI_ROLE master reregistering an old agent.
* [MESOS-7269] - Migrate setting in config.py to a TOML file.
* [MESOS-7282] - Create a table abstraction for the Mesos CLI.
* [MESOS-7320] - Remove deprecated ACL `ShutdownFramework`.
* [MESOS-7336] - Add resource provider API protobuf.
* [MESOS-7339] - Add authorization to agent executor API.
* [MESOS-7377] - Add authentication to the checker and health checker libraries.
* [MESOS-7391] - Add deprecation warning for Visual Studio 14 2015.
* [MESOS-7395] - Benchmark performance of hierarchical roles.
* [MESOS-7439] - Bump the default timeout value for docker volume driver unmount operation.
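The executor authentication work in this release (MESOS-6365, MESOS-7000, MESOS-7001, MESOS-7011) revolves around tokens that are generated and later checked against a secret key. As a minimal sketch of that idea, the snippet below signs and verifies an HS256 JSON Web Token using only the Python standard library. It is not the agent's implementation: the claim names `fid`, `eid`, and `cid`, their values, and the key are placeholders chosen for illustration, and the choice of HS256 is an assumption.

```python
import base64
import hashlib
import hmac
import json


def b64url(data):
    """Base64url-encode bytes without padding, as used by JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_jwt(claims, key):
    """Return a compact HS256 JWT for `claims`, signed with `key` (bytes)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def verify_jwt(token, key):
    """Check only the HMAC signature of a compact JWT (no expiry handling)."""
    try:
        signing_input, signature = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = b64url(
        hmac.new(key, signing_input.encode("utf-8"), hashlib.sha256).digest()
    )
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


key = b"contents-of-a-hypothetical-key-file"
token = sign_jwt({"fid": "framework-1", "eid": "executor-1", "cid": "container-1"}, key)
print(verify_jwt(token, key))
```

A token signed with one key fails verification under any other key, which is why rotating or newly enabling the key is disruptive to executors launched earlier.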
Release Notes - Mesos - Version 1.2.3
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-6743] - Docker executor hangs forever if `docker stop` fails.
* [MESOS-6950] - Launching two tasks with the same Docker image simultaneously may cause a staging dir to never be cleaned up.
* [MESOS-7365] - Compile error with recent glibc.
* [MESOS-7378] - Build failure with glibc 2.12.
* [MESOS-7627] - Mesos slave gets stuck.
* [MESOS-7652] - Docker image with universal containerizer does not work if WORKDIR is missing in the rootfs.
* [MESOS-7744] - Mesos Agent Sends TASK_KILL status update to Master, and still launches task.
* [MESOS-7783] - Framework might not receive status update when a just launched task is killed immediately.
* [MESOS-7858] - Launching a nested container with namespace/pid isolation, with glibc < 2.25, may deadlock the LinuxLauncher and MesosContainerizer.
* [MESOS-7863] - Agent may drop pending kill task status updates.
* [MESOS-7865] - Agent may process a kill task and still launch the task.
* [MESOS-7872] - Scheduler hang when registration fails.
* [MESOS-7909] - Ordering dependency between 'linux/capabilities' and 'docker/runtime' isolator.
* [MESOS-7926] - Abnormal termination of default executor can cause MesosContainerizer::destroy to fail.
* [MESOS-7934] - OOM due to LibeventSSLSocket send incorrectly returning 0 after shutdown.
* [MESOS-7968] - Handle `/proc/self/ns/pid_for_children` when parsing available namespace.
* [MESOS-7969] - Handle cgroups v2 hierarchy when parsing /proc/self/cgroups.
* [MESOS-7975] - The command/default/docker executor can incorrectly send a TASK_FINISHED update even when the task is killed.
* [MESOS-7980] - Stout fails to compile with libc >= 2.26.
* [MESOS-8051] - Killing TASK_GROUP fails to kill some tasks.
* [MESOS-8080] - The default executor does not propagate missing task exit status correctly.
* [MESOS-8135] - Masters can lose track of tasks' executor IDs.
Release Notes - Mesos - Version 1.2.2
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-5187] - The filesystem/linux isolator does not set the permissions of the host_path.
* [MESOS-7252] - Need to fix resource check in long-lived framework.
* [MESOS-7546] - WAIT_NESTED_CONTAINER sometimes returns 404.
* [MESOS-7569] - Allow "old" executors with half-open connections to be preserved during agent upgrade / restart.
* [MESOS-7581] - Fix interference of external Boost installations when using some unbundled dependencies.
* [MESOS-7689] - Libprocess can crash on malformed request paths for libprocess messages.
* [MESOS-7690] - The agent can crash when an unknown executor tries to register.
* [MESOS-7703] - Mesos fails to exec a custom executor when no shell is used.
* [MESOS-7728] - Java HTTP adapter crashes JVM when leading master disconnects.
* [MESOS-7770] - Persistent volume might not be mounted if there is a sandbox volume whose source is the same as the target of the persistent volume.
* [MESOS-7777] - Agent failed to recover due to mount namespace leakage in Docker 1.12/1.13.
* [MESOS-7796] - LIBPROCESS_IP isn't passed on to the fetcher.
* [MESOS-7830] - Sandbox_path volume does not have ownership set correctly.
** Improvement
* [MESOS-7540] - Add an agent flag for executor re-registration timeout.
Release Notes - Mesos - Version 1.2.1
-------------------------------------
* This is a bug fix release.
**NOTE**: In Mesos 1.2.1, the master will no longer allow 0.x agents to
register. Interoperability between 1.1+ masters and 0.x agents has never
been supported; however, it was not explicitly disallowed, either.
Starting with this release of Mesos, registration attempts by 0.x Mesos
agents will be ignored.
All Issues:
** Bug
* [MESOS-1987] - Add support for SemVer build and prerelease labels to stout.
* [MESOS-5028] - Copy provisioner cannot replace directory with symlink.
* [MESOS-5172] - Registry puller cannot fetch blobs correctly from http Redirect 3xx urls.
* [MESOS-6327] - Large docker images cause container launch failures: Too many levels of symbolic links.
* [MESOS-6951] - Docker containerizer: mangled environment when env value contains LF byte.
* [MESOS-6976] - Disallow (re-)registration attempts by old agents.
* [MESOS-7133] - mesos-fetcher fails with openssl-related output.
* [MESOS-7197] - Requesting tiny amount of CPU crashes master.
* [MESOS-7208] - Persistent volume ownership is set to root when task is running with non-root user.
* [MESOS-7210] - HTTP health check doesn't work when mesos runs with --docker_mesos_image.
* [MESOS-7232] - Add support to auto-load /dev/nvidia-uvm in the GPU isolator.
* [MESOS-7237] - Enabling cgroups_limit_swap can lead to "invalid argument" error.
* [MESOS-7261] - maintenance.html is missing during packaging.
* [MESOS-7263] - User supplied task environment variables cause warnings in sandbox stdout.
* [MESOS-7264] - Possibly duplicate environment variables should not leak values to the sandbox.
* [MESOS-7265] - Containerizer startup may cause sensitive data to leak into sandbox logs.
* [MESOS-7272] - Unified containerizer does not support docker registry version < 2.3.
* [MESOS-7280] - Unified containerizer provisions docker image error with COPY backend.
* [MESOS-7316] - Upgrading Mesos to 1.2.0 results in some information missing from the `/flags` endpoint.
* [MESOS-7346] - Agent crashes if the task name is too long.
* [MESOS-7350] - Failed to pull image from Nexus Registry due to signature missing.
* [MESOS-7366] - Agent sandbox gc could accidentally delete the entire persistent volume content.
* [MESOS-7368] - Documentation of framework role(s) in proto definition is confusing.
* [MESOS-7383] - Docker executor logs possibly sensitive parameters.
* [MESOS-7389] - Mesos 1.2.0 crashes with pre-1.0 Mesos agents.
* [MESOS-7400] - The mesos master crashes due to an incorrect invariant check in the decoder.
* [MESOS-7427] - Registry puller cannot fetch manifests from Amazon ECR: 405 Unsupported.
* [MESOS-7429] - Allow isolators to inject task-specific environment variables.
* [MESOS-7453] - glyphicons-halflings-regular.woff2 is missing in WebUI.
* [MESOS-7464] - Recent Docker versions cannot be parsed by stout.
* [MESOS-7471] - Provisioner recover should not always assume 'rootfses' dir exists.
* [MESOS-7478] - Pre-1.2.x master does not work with 1.2.x agent.
* [MESOS-7484] - VersionTest.ParseInvalid aborts on Windows.
Release Notes - Mesos - Version 1.2.0
-------------------------------------
This release contains the following new features:
* [MESOS-5931] - **Experimental** Support auto backend in Mesos Containerizer,
preferring overlayfs then aufs. Please note that the bind backend needs to be
specified explicitly through the agent flag '--image_provisioner_backend'
since it requires that the sandbox already exist.
* [MESOS-6402] - **Experimental** Add rlimit support to Mesos containerizer.
The isolator adds support for setting POSIX resource limits (rlimits) for
containers launched using the Mesos containerizer. POSIX rlimits can be used
to control the resources a process can consume. See `docs/posix_rlimits.md`
for details.
* [MESOS-6419] - **Experimental** Teardown unregistered frameworks. The master
now treats recovered frameworks very similarly to frameworks that are registered
but currently disconnected. For example, recovered frameworks will be reported
via the normal "frameworks" key when querying HTTP endpoints. This means there
is no longer a concept of "orphan tasks": if the master knows about a task, the
task will be running under a framework. Similarly, "teardown" operations on
recovered frameworks will now work correctly.
* [MESOS-6460] - **Experimental** Container Attach and Exec. This feature adds
new Agent APIs for attaching a remote client to the stdin, stdout, and stderr
of a running Mesos task, as well as an API for launching new processes inside
the same container as a running Mesos task and attaching to its stdin, stdout,
and stderr. At a high level, these APIs mimic functionality similar to docker
attach and docker exec. The primary motivation for such functionality is to
enable users to debug their running Mesos tasks.
* [MESOS-6758] - **Experimental** Support 'Basic' auth docker private registry
on Mesos Containerizer. Until now, the Mesos containerizer always assumed
Bearer auth, but Basic auth is now also supported for private registries.
Please note that AWS ECS uses Basic authorization, but it does not work yet
due to the redirect issue MESOS-5172.
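As a rough illustration of the difference between the 'Basic' and 'Bearer'
schemes mentioned in MESOS-6758, the Python sketch below builds both kinds of
HTTP Authorization header value. This is not Mesos code: the function names and
the credentials are hypothetical, and the snippet shows only the generic HTTP
header formats, not how the Mesos registry puller is implemented.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build a 'Basic' Authorization header value (generic HTTP, RFC 7617).

    The credentials are joined as "username:password" and base64-encoded.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def bearer_auth_header(token: str) -> str:
    """Build a 'Bearer' Authorization header value from an opaque token."""
    return "Bearer " + token


# Hypothetical credentials, for illustration only.
headers = {"Authorization": basic_auth_header("Aladdin", "open sesame")}
```

With Basic auth the client sends its credentials with each request, whereas
with Bearer auth it first obtains a token and then presents that token.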
Deprecations:
* [MESOS-6650] - Remove slavePreLaunchDockerEnvironmentDecorator and slavePreLaunchDockerHook.
Additional API Changes:
* [MESOS-3601] - Formalize all headers and metadata for HTTP API Event Stream
* [MESOS-6286] - If an agent restarts but fails to complete recovery
within `agent_reregister_timeout`, the master will now mark the
agent as unreachable. This mainly changes behavior in two
situations: (a) the master will now be more robust if agent recovery
hangs indefinitely (e.g., due to a container being in a bad state),
and (b) if agent recovery takes a very long time (e.g., because the
agent's work directory contains a large number of completed tasks),
the master might now mark an agent unreachable that would previously
have been able to eventually recover successfully.
* [MESOS-6419] - When a framework reregisters after master failover,
it is only allowed to change certain fields in its FrameworkInfo.
For example, changing "failover_timeout" is allowed, but changing
"role" is not. In previous Mesos releases, the same restrictions on
changes to FrameworkInfo were only enforced after framework
failover, not master failover.
* [MESOS-6670] - Authz for Agent v1 operator API
* [MESOS-6675] - Changed the allocator API to support adding inactive
frameworks. Custom allocator implementations will need to be updated.
* [MESOS-6865] - Remove the constraint of being only able to launch
2-level nested containers on Agent API.
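The behavior change described under MESOS-6286 can be pictured with a small
Python sketch. It is only a model of the rule stated above, not the master's
actual logic: the function name, the state strings, and the timeout value used
in the example are all hypothetical.

```python
from datetime import timedelta
from typing import Optional


def agent_state_after_restart(
    recovery_time: Optional[timedelta],
    reregister_timeout: timedelta,
) -> str:
    """Model how a restarted agent is treated, per the MESOS-6286 note.

    recovery_time is how long agent recovery takes, or None if recovery
    hangs indefinitely. The agent is considered unreachable unless it
    completes recovery within the timeout.
    """
    if recovery_time is None or recovery_time > reregister_timeout:
        return "unreachable"
    return "registered"


# Arbitrary example value; not necessarily the flag's default.
timeout = timedelta(minutes=10)
hung_recovery = agent_state_after_restart(None, timeout)                    # case (a)
slow_recovery = agent_state_after_restart(timedelta(minutes=15), timeout)   # case (b)
fast_recovery = agent_state_after_restart(timedelta(minutes=2), timeout)
```

Case (b) is the one operators may want to watch for: an agent with a very
large work directory could previously recover eventually, but may now be
marked unreachable first.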
Unresolved Critical Issues:
* [MESOS-1625] - Extra trailing CRLF being sent after the HTTP body in libprocess
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode()
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration
* [MESOS-3533] - Unable to find and run URIs files
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string
* [MESOS-3794] - Master should not store arbitrarily sized data in ExecutorInfo.
* [MESOS-4259] - mesos HA can't delete the redundant container on failure slave node.
* [MESOS-4297] - Executor does not shutdown when framework teardown.
* [MESOS-4642] - Mesos Agent Json API can dump binary data from log files out as invalid JSON.
* [MESOS-4996] - 'containerizer->update' will always fail after killing a docker container.
* [MESOS-5352] - Docker volume isolator cleanup can be blocked by first cleanup failure.
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5849] - Agent sandboxes on Windows surpass the 260 character path length limit
* [MESOS-5859] - Some tasks are always in staged state.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-6327] - Large docker images causes container launch failures: Too many levels of symbolic links.
* [MESOS-6356] - ASF CI has interleaved logging.
* [MESOS-6615] - Running mesos-slave in the docker that leave many zombie process
* [MESOS-6623] - Re-enable tests impacted by request streaming support
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-6780] - ContentType/AgentAPIStreamingTest.AttachContainerInput test fails reliably
* [MESOS-6784] - IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky
* [MESOS-6804] - Running 'tty' inside a debug container that has a tty reports "Not a tty"
* [MESOS-6815] - Enable glog stack traces when we call things like `ABORT` on Windows
* [MESOS-6843] - Fetcher should not assume stdout/stderr in the sandbox.
* [MESOS-6913] - AgentAPIStreamingTest.AttachInputToNestedContainerSession fails on Mac OS.
* [MESOS-6974] - DefaultExecutorTest.CommitSuicideOnTaskFailure test is flaky.
* [MESOS-6986] - abort in DRFSorter::add
* [MESOS-7017] - HTTP API responses can crash the master.
* [MESOS-7050] - IOSwitchboard FDs leaked when containerizer launch fails -- leads to deadlock
* [MESOS-7099] - Quota can be exceeded due to coarse-grained offer technique.
Feature Graduations:
* None
All Experimental Features:
* [MESOS-2449] - Support group of tasks (Pod) constructs and API in Mesos.
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3094] - Mesos on Windows.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-3567] - Support TCP checks in Mesos.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4355] - Implement isolator for Docker volume.
* [MESOS-4641] - Support Container Network Interface (CNI).
* [MESOS-4791] - Operator API v1.
* [MESOS-4828] - XFS disk quota isolator.
* [MESOS-5275] - Add capabilities support for mesos containerizer.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-5788] - Added JAVA API adapter for seamless transition to new scheduler API.
* [MESOS-5931] - **NEW** Support auto backend in Mesos Containerizer.
* [MESOS-6014] - Added port mapping CNI plugin.
* [MESOS-6077] - Added a default (task group) executor.
* [MESOS-6402] - **NEW** rlimit support for Mesos containerizer
* [MESOS-6419] - **NEW** Teardown unregistered frameworks
* [MESOS-6460] - **NEW** Container Attach/Exec
* [MESOS-6758] - **NEW** Support docker registry that requires basic auth.
All Issues:
** Bug
* [MESOS-1802] - HealthCheckTest.HealthStatusChange is flaky on jenkins.
* [MESOS-2537] - AC_ARG_ENABLED checks are broken
* [MESOS-2723] - The mesos-execute tool does not support zk:// master URLs
* [MESOS-3335] - FlagsBase copy-ctor leads to dangling pointer.
* [MESOS-3932] - Silence Boost compiler warnings with CMake
* [MESOS-4601] - Don't dump stack trace on failure to bind()
* [MESOS-4695] - SlaveTest.StateEndpoint is flaky
* [MESOS-4973] - Duplicates in 'unregistered_frameworks' in /state
* [MESOS-4975] - mesos::internal::master::Slave::tasks can grow unboundedly
* [MESOS-5218] - Fetcher should not chown the entire sandbox.
* [MESOS-5303] - Add capabilities support for mesos execute cli.
* [MESOS-5662] - Call parent class `SetUpTestCase` function in our test fixtures.
* [MESOS-5821] - Clean up the thousands of compiler warnings on MSVC
* [MESOS-5835] - Audit `PATCH_CMD`; make sure all patches are being applied on Windows.
* [MESOS-5856] - Logrotate ContainerLogger module does not rotate logs when run as root with `--switch_user`.
* [MESOS-5879] - cgroups/net_cls isolator causing agent recovery issues
* [MESOS-5963] - HealthChecker should not decide when to kill tasks and when to stop performing health checks.
* [MESOS-6001] - Aufs backend cannot support the image with numerous layers.
* [MESOS-6002] - The whiteout file cannot be removed correctly using aufs backend.
* [MESOS-6010] - Docker registry puller shows decode error "No response decoded".
* [MESOS-6119] - TCP health checks are not portable.
* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
* [MESOS-6206] - Change reconciliation to return results for in-progress removals and reregistrations
* [MESOS-6286] - Master does not remove an agent if it is responsive but not registered
* [MESOS-6288] - The default executor should maintain launcher_dir.
* [MESOS-6293] - HealthCheckTest.HealthyTaskViaHTTPWithoutType fails on some distros.
* [MESOS-6316] - CREATE of shared volumes should not be allowed by frameworks not opted in to the capability.
* [MESOS-6320] - Implement clang-tidy check to catch incorrect flags hierarchies
* [MESOS-6349] - JSON Generation breaks if other locale than C is used.
* [MESOS-6360] - The handling of whiteout files in provisioner is not correct.
* [MESOS-6380] - mesos-local failed to start without sudo
* [MESOS-6388] - Report new PARTITION_AWARE task statuses in HTTP endpoints
* [MESOS-6389] - Update webui for PARTITION_AWARE changes
* [MESOS-6409] - mesos-ps - Invalid header value
* [MESOS-6414] - cgroups isolator cleanup failed when the hierarchy is cleaned up by docker daemon
* [MESOS-6419] - The 'master/teardown' endpoint should support tearing down 'unregistered_frameworks'.
* [MESOS-6420] - Mesos Agent leaking sockets when port mapping network isolator is ON
* [MESOS-6432] - Roles with quota assigned can "game" the system to receive excessive resources.
* [MESOS-6444] - Ensure single copy of shared count of total resources in role sorter.
* [MESOS-6446] - WebUI redirect doesn't work with stats from /metric/snapshot
* [MESOS-6448] - Show the leading master hostname in the webUI.
* [MESOS-6452] - Compile error in strerror.h on OSX
* [MESOS-6455] - DefaultExecutorTests fail when running on hosts without docker.
* [MESOS-6459] - PosixRLimitsIsolatorTest.TaskExceedingLimit fails on OS X
* [MESOS-6461] - Duplicate framework ids in /master/frameworks endpoint 'unregistered_frameworks'.
* [MESOS-6478] - "filesystem/linux" isolator leaks (phantom) mounts in `mount` output
* [MESOS-6483] - Check failure when a 1.1 master marking a 0.28 agent as unreachable
* [MESOS-6484] - Memory leak in `Future<T>::after()`
* [MESOS-6501] - Add a test for duplicate framework ids in "unregistered_frameworks"
* [MESOS-6504] - Use 'geteuid()' for the root privileges check.
* [MESOS-6508] - monitor/statistics error in webui when launch mesos via mesos-local
* [MESOS-6516] - Parallel test running does not respect GTEST_FILTER
* [MESOS-6519] - MasterTest.OrphanTasksMultipleAgents
* [MESOS-6520] - Make errno an explicit argument for ErrnoError.
* [MESOS-6526] - `mesos-containerizer launch --environment` exposes executor env vars in `ps`.
* [MESOS-6527] - Memory leak in the libprocess request decoder.
* [MESOS-6544] - MasterMaintenanceTest.InverseOffersFilters is flaky.
* [MESOS-6545] - TestContainerizer is not thread-safe.
* [MESOS-6566] - The Docker executor should not leak task env variables in the Docker command cmd line.
* [MESOS-6569] - MesosContainerizer/DefaultExecutorTest.KillTask/0 failing on ASF CI
* [MESOS-6576] - DefaultExecutorTest.KillTaskGroupOnTaskFailure sometimes fails in CI
* [MESOS-6588] - LinuxRootfs misses required files
* [MESOS-6597] - Include v1 Operator API protos in generated JAR and python packages.
* [MESOS-6598] - Broken Link Framework Development Page
* [MESOS-6602] - Shutdown completed frameworks when unreachable agent reregisters
* [MESOS-6604] - Uninitialized member ObjectApprover::weight_info.
* [MESOS-6606] - Reject optimized builds with libcxx before 3.9
* [MESOS-6618] - Some tests use hardcoded port numbers.
* [MESOS-6619] - Improve task management for unreachable tasks
* [MESOS-6621] - SSL downgrade path will CHECK-fail when using both temporary and persistent sockets
* [MESOS-6624] - Master WebUI does not work on Firefox 45
* [MESOS-6625] - Expose container id in ContainerStatus in DockerContainerizer.
* [MESOS-6640] - mesos-local doesn't handle --work_dir correctly.
* [MESOS-6646] - StreamingRequestDecoder incompletely initializes its http_parser_settings
* [MESOS-6647] - Cyclic header dependency between libprocess' defer.hpp and executor.hpp
* [MESOS-6652] - Perf version not correctly parsed on Fedora 24 (and probably others)
* [MESOS-6653] - Overlayfs backend may fail to mount the rootfs if both container image and image volume are specified.
* [MESOS-6654] - Duplicate image layer ids may make the backend failed to mount rootfs.
* [MESOS-6658] - Mesos tests generated with cmake build fail to unload libraries properly
* [MESOS-6665] - io::redirect might cause stack overflow.
* [MESOS-6666] - HttpServeTest.Discard failed on OSX sierra
* [MESOS-6672] - Class DynamicLibrary's default copy constructor can lead to inconsistent state
* [MESOS-6676] - Always re-link with scheduler during re-registration.
* [MESOS-6677] - Error in Windows agent's Flags::runtime_dir CLI
* [MESOS-6684] - Update addFramework/removeFramework to handle multi-role frameworks
* [MESOS-6685] - Update Role::Resources to correctly account for multi-role frameworks
* [MESOS-6688] - IOSwitchboard should recover spawned server pid on agent restarts
* [MESOS-6689] - Remove of unix domain socket path in IOSwitchboard::cleanup
* [MESOS-6700] - Port `http_tests.cpp`
* [MESOS-6701] - Port `recordio_tests.cpp`
* [MESOS-6704] - Port `executor_http_api_tests.cpp`
* [MESOS-6707] - Port `gc_tests.cpp`
* [MESOS-6710] - Port `http_authentication_tests.cpp`
* [MESOS-6711] - Port `values_tests.cpp`
* [MESOS-6716] - Port `uri_tests.cpp`
* [MESOS-6717] - Add Windows support to agent test harness
* [MESOS-6718] - Should destroy DEBUG containers on agent recovery.
* [MESOS-6722] - Agent tries to use POSIX paths for the variable data runtime directory.
* [MESOS-6725] - The style of `.navbar-text` is inconsistent with the style of texts on the left side
* [MESOS-6726] - IOSwitchboardServerFlags adds flags for non-optional fields w/o providing a default value
* [MESOS-6736] - CMake's `CURRENT_CMAKE_BUILD_DIR` does not escape '\'
* [MESOS-6737] - The agent should synchronize with the IOSwitchboard to determine when it is ready to accept incoming connections.
* [MESOS-6739] - Authorize v1 GET_CONTAINERS call
* [MESOS-6740] - Authorize v1 GET_FLAGS call
* [MESOS-6741] - Authorize v1 SET_LOGGING_LEVEL call
* [MESOS-6744] - DefaultExecutorTest.KillTaskGroupOnTaskFailure is flaky
* [MESOS-6745] - MesosContainerizer/DefaultExecutorTest.KillTask/0 is flaky
* [MESOS-6746] - IOSwitchboard doesn't properly flush data on ATTACH_CONTAINER_OUTPUT
* [MESOS-6747] - ContainerLogger runnable must not inherit the slave environment.
* [MESOS-6748] - I/O switchboard should inherit agent environment variables.
* [MESOS-6750] - Metrics on the Agent view of the Mesos web UI flickers between empty and non-empty states
* [MESOS-6756] - I/O switchboard should deal with the case when reaping of the server failed.
* [MESOS-6757] - Consider using CMake to configure test scripts in the `bin/` directory
* [MESOS-6761] - Implement `os::user` on Windows
* [MESOS-6767] - Reached unreachable statement at <path>/mesos/src/slave/containerizer/mesos/launch.cpp:766
* [MESOS-6772] - Stop building `mesos-agent` twice.
* [MESOS-6775] - The 'http::connect(address)' always uses the DEFAULT_KIND() of socket even if SSL is undesired.
* [MESOS-6781] - Mesos containerizer overrides environment variables passed to the executor incorrectly.
* [MESOS-6788] - Avoid stack overflow when handling streaming responses in API handlers
* [MESOS-6789] - SSL socket's 'shutdown()' method is broken
* [MESOS-6793] - CniIsolatorTest.ROOT_EnvironmentLibprocessIP fails on systems using dash as sh
* [MESOS-6795] - Listening socket might get closed while the accept is still in flight.
* [MESOS-6802] - SSL socket can lose bytes in the case of EOF
* [MESOS-6803] - Agent authentication does not have an initial `delay`
* [MESOS-6805] - Check unreachable task cache for task ID collisions on launch
* [MESOS-6811] - IOSwitchboardServerTest.SendHeartbeat and IOSwitchboardServerTest.ReceiveHeartbeat broken on OS X
* [MESOS-6813] - IOSwitchboardServerTest.AttachOutput has stack overflow issue.
* [MESOS-6820] - FaultToleranceTest.FrameworkReregister is flaky.
* [MESOS-6824] - mesos-this-capture clang-tidy check has false positives
* [MESOS-6826] - OsTest.User fails on recent Arch Linux.
* [MESOS-6829] - Mesos fails to compile when using FORTIFY_SOURCE without optimizations
* [MESOS-6830] - Mesos fails to link with gold when providing -pie without -fPIC
* [MESOS-6837] - FaultToleranceTest.FrameworkReregister is flaky
* [MESOS-6839] - It is currently impossible to kill a task in the Windows executor
* [MESOS-6848] - The default executor does not exit if a single task pod fails.
* [MESOS-6852] - Nested container's launch command is not set correctly in docker/runtime isolator.
* [MESOS-6860] - Some tests use CHECK instead of ASSERT
* [MESOS-6862] - Replace os::system usages to reduce the risk of command injection.
* [MESOS-6864] - Container Exec should be possible with tasks belonging to a task group
* [MESOS-6866] - Mesos agent not checking IDs before using them as part of the paths
* [MESOS-6870] - Port `default_executor_tests.cpp`
* [MESOS-6871] - Scheme parsing is incorrect in libprocess URL::parse().
* [MESOS-6895] - Loop uses dependent nested names for friend declaration which isn't supported by recent clang
* [MESOS-6900] - Add test for framework upgrading to multi-role capability.
* [MESOS-6904] - Perform batching of allocations to reduce allocator queue backlogging.
* [MESOS-6908] - Zero health check timeout is interpreted literally.
* [MESOS-6911] - SlaveRecoveryTest/0.RegisterDisconnectedSlave test is flaky
* [MESOS-6912] - IOSwitchboardServerTest.AttachInput fails consistently on Mac OS.
* [MESOS-6917] - Segfault when the executor sets an invalid UUID when sending a status update.
* [MESOS-6920] - Validate the UUID in Master::statusUpdate.
* [MESOS-6922] - SlaveRecoveryTest/0.RecoverTerminatedExecutor is flaky
* [MESOS-6937] - ContentType/MasterAPITest.ReserveResources/1 fails during Writer close
* [MESOS-6946] - Make wait status checks consistent.
* [MESOS-6948] - AgentAPITest.LaunchNestedContainerSession is flaky
* [MESOS-6954] - Running LAUNCH_NESTED_CONTAINER with a docker container id as parent crashes the agent
* [MESOS-6962] - Navbar overlays breadcrumbs in WebUI on narrow screens
* [MESOS-6963] - The logo doesn't fit in mobile WebUI
* [MESOS-6966] - master/tasks_unreachable metric never decremented
* [MESOS-6969] - Use clipboard.js for copy/paste webui functionality
* [MESOS-6983] - TaskValidationTest.TaskReusesUnreachableTaskID is flaky
* [MESOS-6989] - Docker executor segfaults in ~MesosExecutorDriver()
* [MESOS-6991] - Change `Environment.Variable.Value` from required to optional
* [MESOS-7008] - Quota not recovered from registry in empty cluster.
* [MESOS-7020] - cgroups::internal::write can incorrectly report success
* [MESOS-7027] - CommandExecutor ENV overwritten by Docker Image ENV in Unified Containerizer
* [MESOS-7036] - Rate limiter deadlocks during IO Switchboard-related tests
* [MESOS-7057] - Consider using the relink functionality of libprocess in the executor driver.
* [MESOS-7059] - Unnecessary mkdirs in ProvisionerDockerLocalStoreTest.*
* [MESOS-7060] - Tests depends on DockerArchive and LinuxRootfs failed.
* [MESOS-7075] - mesos-execute rejects all offers
* [MESOS-7077] - Check failed: resource.has_allocation_info().
* [MESOS-7102] - Crash when sending a SIGUSR1 signal to the agent.
* [MESOS-7119] - Mesos master crash while accepting inverse offer.
* [MESOS-7129] - Default executor exits with a stack trace in a few scenarios.
* [MESOS-7133] - mesos-fetcher fails with openssl-related output.
* [MESOS-7137] - Custom executors cannot use any reserved resources.
* [MESOS-7144] - Wrap IOSwitchboard.connect() in a dispatch
* [MESOS-7152] - The agent may be flapping after the machine reboots due to provisioner recover.
* [MESOS-7153] - The new http::Headers abstraction may break some modules.
** Documentation
* [MESOS-5597] - Document Mesos "health check" feature.
* [MESOS-6335] - Add user doc for task group tasks
* [MESOS-6411] - Add documentation for CNI port-mapper plugin.
* [MESOS-6806] - Update the addition, deletion and modification logic of CNI configuration files.
* [MESOS-7154] - Document provisioner auto backend support.
** Epic
* [MESOS-3820] - Test-only libprocess reinitialization
* [MESOS-4641] - Support Container Network Interface (CNI).
* [MESOS-4766] - Improve allocator performance.
* [MESOS-6402] - Add rlimit support to Mesos containerizer
* [MESOS-6460] - Mesos Support for Container Attach and Container Exec
* [MESOS-6670] - Authz for Agent v1 operator API
** Improvement
* [MESOS-3601] - Formalize all headers and metadata for HTTP API Event Stream
* [MESOS-5792] - Add mesos tests to CMake (make check)
* [MESOS-5900] - Support Unix domain socket connections in libprocess
* [MESOS-5931] - Support auto backend in Unified Containerizer.
* [MESOS-5992] - Complete the list of API Calls on the Operator HTTP API Doc
* [MESOS-6177] - Return unregistered agents recovered from registrar in `GetAgents` and/or `/state.json`
* [MESOS-6229] - Default to using hardened compilation flags
* [MESOS-6296] - Default executor should be able to launch multiple task groups
* [MESOS-6305] - Add authorization support for nested container calls
* [MESOS-6309] - Mesos-specific targets appear in libprocess' cmake config.
* [MESOS-6329] - Send TASK_DROPPED for task launch errors
* [MESOS-6330] - Send TASK_UNKNOWN during explicit reconciliation
* [MESOS-6331] - Don't send TASK_LOST when accepting offers in a disconnected scheduler
* [MESOS-6332] - Don't send TASK_LOST in the agent
* [MESOS-6339] - Support docker registry that requires basic auth.
* [MESOS-6361] - Enable partition-awareness in mesos-execute
* [MESOS-6369] - Add a column for FrameworkID when displaying tasks in the WebUI
* [MESOS-6395] - HealthChecker sends updates to executor via libprocess messaging.
* [MESOS-6396] - Hooks should allow sandbox dependent environment variables.
* [MESOS-6397] - Simplify the comparison logic for `ExecutorInfo`.
* [MESOS-6399] - Allowed to pass extra envs when launch development scripts.
* [MESOS-6401] - Authorizer interface should behave more uniform
* [MESOS-6407] - Move DEFAULT_v1_xxx macros to the v1 namespace.
* [MESOS-6426] - Add rlimit support to Mesos containerizer
* [MESOS-6427] - Add documentation for rlimit support of Mesos containerizer
* [MESOS-6443] - Display maintenance information in the webui.
* [MESOS-6530] - Add support for incremental gzip decompression.
* [MESOS-6556] - Hostname support for the network/cni isolator.
* [MESOS-6557] - IPC namespace isolator
* [MESOS-6562] - Use JSON content type in mesos-execute.
* [MESOS-6567] - Actively Scan for CNI Configurations
* [MESOS-6571] - Add "--task" flag to mesos-execute
* [MESOS-6626] - Support `foreachpair` for LinkedHashMap
* [MESOS-6639] - Update 'io::redirect()' to take an optional vector of callback hooks.
* [MESOS-6648] - MesosContainerizer launch helper should take ContainerLaunchInfo.
* [MESOS-6650] - Remove slavePreLaunchDockerEnvironmentDecorator and slavePreLaunchDockerHook.
* [MESOS-6675] - Change allocator API to support adding inactive frameworks
* [MESOS-6719] - Unify "active" and "state"/"connected" fields in Master::Framework
* [MESOS-6758] - Support 'Basic' auth docker private registry on Unified Containerizer.
* [MESOS-6763] - Add heartbeats to both input/output connections in IOSwitchboard
* [MESOS-6821] - Override of automatic resources should be by exact match not substring
* [MESOS-6865] - Remove the constraint of being only able to launch 2 level nested containers on Agent API
* [MESOS-6936] - Add support for media types needed for streaming request/responses.
* [MESOS-6947] - Fix pailer XSS vulnerability
* [MESOS-7045] - Skip already stored layers in local Docker puller
* [MESOS-7051] - Introduce a new http::Headers abstraction.
* [MESOS-7071] - Agent State Lacks Framework Principal
** Story
* [MESOS-3505] - Support specifying Docker image by Image ID.
* [MESOS-3753] - Test the HTTP Scheduler library with SSL enabled
** Task
* [MESOS-3398] - Revisit MAXHOSTNAMELEN implementation in Windows
* [MESOS-3697] - Add `make tests` target to CMake build system.
* [MESOS-3843] - Audit `src/CMakelists.txt` to make sure we're compiling everything we need to build the agent binary.
* [MESOS-3910] - Libprocess: Implement cleanup of the SocketManager in process::finalize
* [MESOS-3934] - Libprocess: Unify the initialization of the MetricsProcess and ReaperProcess
* [MESOS-4119] - Add support for enabling --3way to apply-reviews.py.
* [MESOS-5826] - Streamline building of example frameworks
* [MESOS-5966] - Add libprocess HTTP tests with SSL support
* [MESOS-6040] - Add a CMake build for `mesos-port-mapper`
* [MESOS-6185] - Improve test coverage for shared persistent volumes.
* [MESOS-6214] - Containerizers assume caller will call 'destroy' if 'launch' fails.
* [MESOS-6278] - Add test cases for the HTTP health checks.
* [MESOS-6279] - Add test cases for the TCP health check.
* [MESOS-6366] - Design doc for executor authentication
* [MESOS-6376] - Add documentation for capabilities support of the mesos containerizer
* [MESOS-6403] - Draft design doc for rlimit support for Mesos containerizer
* [MESOS-6431] - Add support for port-mapping in `mesos-execute`
* [MESOS-6462] - Design Doc: Mesos Support for Container Attach and Container Exec
* [MESOS-6463] - Build a prototype for remote pty support
* [MESOS-6464] - Add fine grained control of which namespaces a nested container should inherit (or not).
* [MESOS-6465] - Add a task_id -> container_id mapping in state.json
* [MESOS-6466] - Add support for streaming HTTP requests in Mesos
* [MESOS-6467] - Build a Container I/O Switchboard
* [MESOS-6470] - Support TTY in IOSwitchboard.
* [MESOS-6471] - Build support for LAUNCH_NESTED_CONTAINER_SESSION call into the Agent API in Mesos
* [MESOS-6472] - Build support for ATTACH_CONTAINER_INPUT into the Agent API in Mesos
* [MESOS-6473] - Build support for ATTACH_CONTAINER_OUTPUT into the Agent API in Mesos
* [MESOS-6474] - Add fine-grained ACLs for authorization with the new debugging APIs
* [MESOS-6475] - Mesos Container Attach/Exec Unit Tests
* [MESOS-6476] - Build a Mock HTTP Server that implements the new Debugging API calls
* [MESOS-6477] - Build a standalone python client for connecting to our Mock HTTP Server that implements the new Debug APIs
* [MESOS-6493] - Add test cases for the HTTPS health checks.
* [MESOS-6525] - Add API protos for managing debug containers
* [MESOS-6528] - Container status of a task in a pod is not correct.
* [MESOS-6543] - Add special case for entering the "mount" namespace of a parent container
* [MESOS-6546] - Update the Containerizer to handle attachInput and attachOutput calls.
* [MESOS-6547] - Update the mesos containerizer to launch per-container I/O switchboards
* [MESOS-6553] - Update `MesosContainerizerProcess::_launch()` to pass `ContainerLaunchInfo` to `launcher->fork()`
* [MESOS-6594] - Add `Containerizer::attach()` API call
* [MESOS-6628] - Add a FrameworkInfo.roles field along with a MULTI_ROLE capability.
* [MESOS-6629] - Add master validation of FrameworkInfo.roles.
* [MESOS-6631] - Disallow frameworks from modifying FrameworkInfo.roles.
* [MESOS-6633] - Introduce Resource.AllocationInfo.
* [MESOS-6634] - Add Resource.AllocationInfo in Offer to indicate a single role per offer.
* [MESOS-6638] - Update Suppress and Revive to be per-role.
* [MESOS-6651] - Make IOSwitchboard an isolator.
* [MESOS-6663] - Container should be destroyed if IOSwitchboard server terminates unexpectedly.
* [MESOS-6664] - Force cleanup of IOSwitchboard server if it does not terminate after the container terminates.
* [MESOS-6749] - Update master and agent endpoints to expose FrameworkInfo.roles.
* [MESOS-6764] - Add a grace period for terminating the I/O switchboard server.
* [MESOS-6958] - Support linux filesystem type detection.
* [MESOS-6970] - Display allocation info when printing Resources.
* [MESOS-7062] - Add a test for a MULTI_ROLE framework receiving offers for each of its roles.
Release Notes - Mesos - Version 1.1.3
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-5187] - The filesystem/linux isolator does not set the permissions of the host_path.
* [MESOS-6743] - Docker executor hangs forever if `docker stop` fails.
* [MESOS-6950] - Launching two tasks with the same Docker image simultaneously may cause a staging dir never cleaned up.
* [MESOS-7540] - Add an agent flag for executor re-registration timeout.
* [MESOS-7569] - Allow "old" executors with half-open connections to be preserved during agent upgrade / restart.
* [MESOS-7689] - Libprocess can crash on malformed request paths for libprocess messages.
* [MESOS-7690] - The agent can crash when an unknown executor tries to register.
* [MESOS-7581] - Fix interference of external Boost installations when using some unbundled dependencies.
* [MESOS-7703] - Mesos fails to exec a custom executor when no shell is used.
* [MESOS-7728] - Java HTTP adapter crashes JVM when leading master disconnects.
* [MESOS-7770] - Persistent volume might not be mounted if there is a sandbox volume whose source is the same as the target of the persistent volume.
* [MESOS-7777] - Agent failed to recover due to mount namespace leakage in Docker 1.12/1.13.
* [MESOS-7796] - LIBPROCESS_IP isn't passed on to the fetcher.
* [MESOS-7830] - Sandbox_path volume does not have ownership set correctly.
* [MESOS-7863] - Agent may drop pending kill task status updates.
* [MESOS-7865] - Agent may process a kill task and still launch the task.
Release Notes - Mesos - Version 1.1.2
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-2537] - AC_ARG_ENABLED checks are broken.
* [MESOS-5028] - Copy provisioner cannot replace directory with symlink.
* [MESOS-5172] - Registry puller cannot fetch blobs correctly from http Redirect 3xx urls.
* [MESOS-6327] - Large docker images causes container launch failures: Too many levels of symbolic links.
* [MESOS-7057] - Consider using the relink functionality of libprocess in the executor driver.
* [MESOS-7119] - Mesos master crash while accepting inverse offer.
* [MESOS-7152] - The agent may be flapping after the machine reboots due to provisioner recover.
* [MESOS-7197] - Requesting tiny amount of CPU crashes master.
* [MESOS-7210] - HTTP health check doesn't work when mesos runs with --docker_mesos_image.
* [MESOS-7237] - Enabling cgroups_limit_swap can lead to "invalid argument" error.
* [MESOS-7265] - Containerizer startup may cause sensitive data to leak into sandbox logs.
* [MESOS-7350] - Failed to pull image from Nexus Registry due to signature missing.
* [MESOS-7366] - Agent sandbox gc could accidentally delete the entire persistent volume content.
* [MESOS-7383] - Docker executor logs possibly sensitive parameters.
* [MESOS-7422] - Docker containerizer should not leak possibly sensitive data to agent log.
* [MESOS-7471] - Provisioner recover should not always assume 'rootfses' dir exists.
* [MESOS-7482] - #elif does not match #ifdef when checking the platform.
Release Notes - Mesos - Version 1.1.1
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-6002] - The whiteout file cannot be removed correctly using aufs backend.
* [MESOS-6010] - Docker registry puller shows decode error "No response decoded".
* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
* [MESOS-6360] - The handling of whiteout files in provisioner is not correct.
* [MESOS-6411] - Add documentation for CNI port-mapper plugin.
* [MESOS-6526] - `mesos-containerizer launch --environment` exposes executor env vars in `ps`.
* [MESOS-6571] - Add "--task" flag to mesos-execute.
* [MESOS-6597] - Include v1 Operator API protos in generated JAR and python packages.
* [MESOS-6606] - Reject optimized builds with libcxx before 3.9.
* [MESOS-6621] - SSL downgrade path will CHECK-fail when using both temporary and persistent sockets.
* [MESOS-6624] - Master WebUI does not work on Firefox 45.
* [MESOS-6676] - Always re-link with scheduler during re-registration.
* [MESOS-6848] - The default executor does not exit if a single task pod fails.
* [MESOS-6852] - Nested container's launch command is not set correctly in docker/runtime isolator.
* [MESOS-6917] - Segfault when the executor sets an invalid UUID when sending a status update.
* [MESOS-7008] - Quota not recovered from registry in empty cluster.
* [MESOS-7133] - mesos-fetcher fails with openssl-related output.
Release Notes - Mesos - Version 1.1.0
-------------------------------------
This release contains the following new features:
* [MESOS-2449] - **Experimental** support for launching a group of tasks
via a new `LAUNCH_GROUP` Offer operation. Mesos will guarantee that either
all tasks or none of the tasks in the group are delivered to the executor.
Executors receive the task group via a new `LAUNCH_GROUP` event.
* [MESOS-2533] - **Experimental** support for HTTP and HTTPS health checks.
Executors may now use the updated `HealthCheck` protobuf to implement
HTTP(S) health checks. Both default executors (command and docker) leverage
the `curl` binary for sending HTTP(S) requests and connect to `127.0.0.1`,
hence a task must listen on all interfaces. On Linux, for BRIDGE and USER
modes, the docker executor enters the task's network namespace.
* [MESOS-3421] - **Experimental** support for sharing of resources across
containers. Currently, persistent volumes are the only resources allowed to
be shared.
* [MESOS-3567] - **Experimental** support for TCP health checks. Executors
may now use the updated `HealthCheck` protobuf to implement TCP health
checks. Both default executors (command and docker) connect to `127.0.0.1`,
hence a task must listen on all interfaces. On Linux, for BRIDGE and USER
modes, the docker executor enters the task's network namespace.
* [MESOS-4324] - Allow tasks to access persistent volumes in either a
read-only or read-write manner. Using a volume in read-only mode can
simplify sharing that volume between multiple tasks on the same agent.
* [MESOS-5275] - **Experimental** support for Linux capabilities. Frameworks
or operators now have fine-grained control over the capabilities that a
container may have. This allows a container to run as root, but not have all
the privileges associated with the root user (e.g., CAP_SYS_ADMIN).
* [MESOS-5344] - **Experimental** support for partition-aware Mesos
frameworks. In previous Mesos releases, when an agent is partitioned from
the master and then reregisters with the cluster, all tasks running on the
agent are terminated and the agent is shut down. In Mesos 1.1, partitioned
agents will no longer be shut down when they reregister with the master. By
default, tasks running on such agents will still be killed (for backward
compatibility); however, frameworks can opt in to the new PARTITION_AWARE
capability. If they do this, their tasks will not be killed when a partition
is healed. This allows frameworks to define their own policies for how to
handle partitioned tasks. Enabling the PARTITION_AWARE capability also
introduces a new set of task states: TASK_UNREACHABLE, TASK_DROPPED,
TASK_GONE, TASK_GONE_BY_OPERATOR, and TASK_UNKNOWN. These new states are
intended to eventually replace the TASK_LOST state.
* [MESOS-5788] - **Experimental** support for Java scheduler adapter. This
adapter allows framework developers to toggle between the old/new API
(driver/scheduler library) implementations, thereby allowing them to easily
transition their frameworks to the new v1 Scheduler API.
* [MESOS-6014] - **Experimental** A new port-mapper CNI plugin,
`mesos-cni-port-mapper`, has been introduced. For Mesos containers, with the
CNI port-mapper plugin, users can now expose container ports through host
ports using DNAT. This is especially useful when Mesos containers are
attached to isolated CNI networks such as private bridge networks, and the
services running in the container need to be exposed outside these
isolated networks.
* [MESOS-6077] - **Experimental** A new default executor is introduced which
frameworks can use to launch task groups as nested containers. All the
nested containers share resources like cpu, memory, network and volumes.
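The partition-aware feature above lets frameworks define their own policy for partitioned tasks. As a rough illustration only, such a policy might look like the sketch below. The state names come from the notes above; the function `choose_action`, its parameters, and the action labels `wait`, `replace`, and `ignore` are hypothetical and are not part of any Mesos API.

```python
# Task states introduced alongside the PARTITION_AWARE capability
# (names taken from the release notes above).
PARTITION_AWARE_STATES = {
    "TASK_UNREACHABLE",
    "TASK_DROPPED",
    "TASK_GONE",
    "TASK_GONE_BY_OPERATOR",
    "TASK_UNKNOWN",
}


def choose_action(state, unreachable_seconds=0.0, patience_seconds=300.0):
    """Hypothetical framework policy for a task status update.

    Returns one of the illustrative labels 'wait', 'replace', or 'ignore'.
    """
    # States outside this sketch's scope are left to the framework's
    # existing logic. TASK_LOST is included because the new states are
    # intended to eventually replace it.
    if state not in PARTITION_AWARE_STATES and state != "TASK_LOST":
        return "ignore"
    # For an unreachable task, this policy waits a while for the partition
    # to heal before launching a replacement.
    if state == "TASK_UNREACHABLE":
        return "wait" if unreachable_seconds < patience_seconds else "replace"
    # Every other state handled here is treated as needing a replacement.
    return "replace"
```

A different framework could reasonably choose another policy, for example never replacing unreachable tasks that hold persistent state.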
Deprecations:
* The following metrics are deprecated and will be removed in Mesos 1.4:
master/slave_shutdowns_scheduled,
master/slave_shutdowns_canceled,
master/slave_shutdowns_completed.
As of Mesos 1.1.0, these metrics will always be zero. The following new
metrics have been introduced as replacements:
master/slave_unreachable_scheduled,
master/slave_unreachable_canceled,
master/slave_unreachable_completed.
* [MESOS-5955] - Health check binary "mesos-health-check" is removed.
* [MESOS-6371] - Remove the 'recover()' interface in 'ContainerLogger'.
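Operators with dashboards or alerts built on the deprecated shutdown metrics may want to translate the names mechanically. The sketch below is one way to do that; the metric names come from the deprecation note above, while the helper `replacement_metric` and the `RENAMED` table are hypothetical names introduced here for illustration.

```python
# Deprecated metric name -> replacement name, keyed by the part after the
# last '/', as listed in the deprecation note above.
RENAMED = {
    "slave_shutdowns_scheduled": "slave_unreachable_scheduled",
    "slave_shutdowns_canceled": "slave_unreachable_canceled",
    "slave_shutdowns_completed": "slave_unreachable_completed",
}


def replacement_metric(name):
    """Return the replacement for a deprecated metric, or the name unchanged."""
    # Split off any prefix such as 'master/' so the lookup works with or
    # without it; the prefix is preserved in the result.
    prefix, sep, suffix = name.rpartition("/")
    if suffix in RENAMED:
        return prefix + sep + RENAMED[suffix]
    return name
```

Note that the new metrics count agents marked unreachable rather than agents shut down, so thresholds tuned for the old metrics may need to be revisited.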
Additional API Changes:
* [MESOS-6204] - A new agent flag called `--runtime_dir` has been added. Unlike
`--work_dir` which persists data across reboots, `--runtime_dir` is designed
to checkpoint state that should persist across agent restarts, but not
across reboots. By default this flag is set to `/var/run/mesos` when run as
root and `os::temp/mesos/runtime/` when run as non-root.
* [MESOS-6220] - HTTP handler failures should result in 500 rather than
503 responses. This means that when using the master or agent endpoints,
failures will now result in a `500 Internal Server Error` rather than a
`503 Service Unavailable`.
* [MESOS-6241] - New API calls (LAUNCH_NESTED_CONTAINER,
KILL_NESTED_CONTAINER and WAIT_NESTED_CONTAINER) have been added to the
v1 Agent API to manage nested containers within an executor container.
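Clients that branch on the status code of master or agent endpoints may need a small adjustment for the 500 versus 503 change (MESOS-6220). The sketch below shows one possible client-side classification; the function name and the category labels are hypothetical, and only the two status codes and their meanings come from the note above.

```python
def classify_response(status):
    """Hypothetical classification of a master/agent endpoint HTTP status."""
    if 200 <= status < 300:
        return "ok"
    # As of 1.1.0, a failure inside an HTTP handler is reported as
    # 500 Internal Server Error.
    if status == 500:
        return "handler_failure"
    # Earlier releases reported handler failures as 503 Service Unavailable,
    # so a client talking to mixed versions may still want to recognize it.
    if status == 503:
        return "unavailable"
    return "other"
```

A client that previously retried only on 503 would, under this sketch, also need to decide how to treat `handler_failure`.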
Unresolved Critical Issues:
* [MESOS-3794] - Master should not store arbitrarily sized data in ExecutorInfo.
* [MESOS-4642] - Mesos Agent Json API can dump binary data from log files out as invalid JSON.
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5856] - Logrotate ContainerLogger module does not rotate logs when run as root with `--switch_user`.
* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
* [MESOS-6327] - Large docker images cause container launch failures: Too many levels of symbolic links.
* [MESOS-6360] - The handling of whiteout files in provisioner is not correct.
* [MESOS-6419] - The 'master/teardown' endpoint should support tearing down 'unregistered_frameworks'.
* [MESOS-6432] - Roles with quota assigned can "game" the system to receive excessive resources.
All Experimental Features:
* [MESOS-2449] - Support group of tasks (Pod) constructs and API in Mesos.
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3094] - Mesos on Windows.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-3567] - Support TCP checks in Mesos.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4355] - Implement isolator for Docker volume.
* [MESOS-4641] - Support Container Network Interface (CNI).
* [MESOS-4791] - Operator API v1.
* [MESOS-4828] - XFS disk quota isolator.
* [MESOS-5275] - Add capabilities support for unified containerizer.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-5788] - Added JAVA API adapter for seamless transition to new scheduler API.
* [MESOS-6014] - Added port mapping CNI plugin.
* [MESOS-6077] - Added a default (task group) executor.
All Issues:
** Bug
* [MESOS-1653] - HealthCheckTest.GracePeriod is flaky.
* [MESOS-2346] - Docker tasks exiting normally, but returning TASK_FAILED.
* [MESOS-3471] - Disable perf test when perf version is not supported.
* [MESOS-3760] - Remove fragile sleep() from ProcessManager::settle().
* [MESOS-3959] - Executor page of mesos ui does not show slave hostname.
* [MESOS-4070] - numify() handles negative numbers inconsistently.
* [MESOS-4638] - Versioning preprocessor macros.
* [MESOS-4668] - Agent's /state endpoint does not include full reservation information.
* [MESOS-4948] - Move maintenance tests to use the new scheduler library interface.
* [MESOS-4973] - Duplicates in 'unregistered_frameworks' in /state.
* [MESOS-4975] - mesos::internal::master::Slave::tasks can grow unboundedly.
* [MESOS-5276] - HTTPCommandExecutor should terminate after it receives an ACK from the agent.
* [MESOS-5290] - WebUI shows the active task is launched 46 years ago.
* [MESOS-5320] - SSL related error messages can be misleading or incomplete.
* [MESOS-5448] - Persistent volume deletion on the agent should survive slave restart.
* [MESOS-5481] - PerfFilter disables Registrar_BENCHMARK test cases incorrectly.
* [MESOS-5613] - mesos-local fails to start if MESOS_WORK_DIR isn't set.
* [MESOS-5701] - Add benchmark for sorter performance.
* [MESOS-5752] - ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky.
* [MESOS-5759] - ProcessRemoteLinkTest.RemoteUseStaleLink and RemoteStaleLinkRelink are flaky.
* [MESOS-5812] - MasterAPITest.Subscribe is flaky.
* [MESOS-5846] - AgentAPITest.GetState is flaky.
* [MESOS-5852] - CMake build needs to generate protobufs before building libmesos.
* [MESOS-5860] - MasterAPITest.GetTasks is flaky.
* [MESOS-5864] - Document MESOS_SANDBOX executor env variable.
* [MESOS-5867] - Operator ReadFile API read file bugs.
* [MESOS-5869] - Disable resources validation for `+=` and `-=`.
* [MESOS-5875] - Scalar resource output operator doesn't print full significant digits.
* [MESOS-5878] - Strict/RegistrarTest.UpdateQuota/0 is flaky.
* [MESOS-5888] - SlaveAuthorizerTest/ViewFlags is flaky.
* [MESOS-5891] - /help endpoint does not set Content-Type to HTML.
* [MESOS-5907] - ExamplesTest.DiskFullFramework fails on Arch.
* [MESOS-5909] - Stout "OsTest.User" test can fail on some systems.
* [MESOS-5917] - All actors should have a distinguishable ID.
* [MESOS-5919] - Improve performance for `Resources.contains` and `Resources.filter`.
* [MESOS-5921] - `validate` is a bit heavy to check negative scalar resource.
* [MESOS-5922] - mesos-agent --help exit status is 1.
* [MESOS-5928] - Agent's '--version' flag doesn't work.
* [MESOS-5930] - Orphan tasks can show up as running after they have finished.
* [MESOS-5942] - Windows implementation of `os::rmdir` is not compliant with POSIX version.
* [MESOS-5958] - Reviewbot failing due to python files not being cleaned up after distclean.
* [MESOS-5972] - SharedResourcesTest failing.
* [MESOS-5979] - elfio-3.1.patch is actually not applied.
* [MESOS-5981] - Task failed in Windows Server 2012 client, test-framework example.
* [MESOS-5985] - Fix broken link in `networking.md`.
* [MESOS-5996] - Windows mesos-containerizer crashes.
* [MESOS-6000] - Overlayfs backend cannot support the image with numerous layers.
* [MESOS-6005] - Support docker registry running non-https on localhost:<non-80-port>.
* [MESOS-6013] - Use readdir instead of readdir_r.
* [MESOS-6026] - Tasks mistakenly marked as FAILED due to race b/w sendExecutorTerminatedStatusUpdate() and _statusUpdate().
* [MESOS-6031] - Collect throttle related metrics for DockerContainerizer.
* [MESOS-6041] - Stream ID mismatch should print out expected and received stream ID.
* [MESOS-6049] - XFS disk isolator doesn't handle old containers correctly.
* [MESOS-6052] - Unable to launch containers on CNI networks on CoreOS.
* [MESOS-6057] - docker isolator does not overwrite Dockerfile ENV.
* [MESOS-6059] - Allow clean up unknown container during the clean up phase of the container.
* [MESOS-6069] - Misspelled TASK_KILLED in mesos slave.
* [MESOS-6074] - Master check failure if the metrics endpoint is polled soon after it starts.
* [MESOS-6085] - Agent's /state endpoint does not include total resources.
* [MESOS-6087] - Add master tests for TaskGroup.
* [MESOS-6100] - Make fails compiling 1.0.1.
* [MESOS-6104] - Potential FD double close in libevent's implementation of `sendfile`.
* [MESOS-6110] - Deprecate using health checks without setting the type.
* [MESOS-6118] - Agent would crash with docker container tasks due to host mount table read.
* [MESOS-6122] - Mesos slave throws systemd errors even when passed a flag to disable systemd.
* [MESOS-6131] - Improved performance for resource flatten.
* [MESOS-6141] - Some tests do not properly set 'flags.launcher' with the correct value.
* [MESOS-6144] - Validate that TaskGroup executor and tasks do not use DOCKER ContainerInfo.
* [MESOS-6145] - Isolator namespaces/pid is leaking mounts.
* [MESOS-6152] - Resource leak in libevent_ssl_socket.cpp.
* [MESOS-6153] - Resource leak in slave.cpp.
* [MESOS-6154] - Clean up queued tasks if a task group is killed before launch.
* [MESOS-6157] - ContainerInfo is not validated.
* [MESOS-6159] - Remove stout's Set type.
* [MESOS-6167] - CgroupsIsolatorTest.ROOT_CGROUPS_RevocableCpu is flaky.
* [MESOS-6170] - Health check grace period covers failures happening after first success.
* [MESOS-6173] - Authentication in v2 protobuf should not be `required`.
* [MESOS-6176] - CpuIsolatorTest.ROOT_SystemCpuUsage is flaky.
* [MESOS-6181] - The logic for BadACLNoPrincipal and BadACLDropCreateAndDestroy is not correct.
* [MESOS-6207] - Python bindings fail to build with custom SVN installation path.
* [MESOS-6208] - Containers that use the Mesos containerizer but don't want to provision a container image fail to validate.
* [MESOS-6210] - Master redirect with suffix gets in redirect loop.
* [MESOS-6216] - LibeventSSLSocketImpl::create is not safe to call concurrently with os::getenv.
* [MESOS-6217] - PAGE_SIZE was not declared in PPC64LE.
* [MESOS-6226] - Master crashes while transitioning tasks to 'TASK_UNREACHABLE'.
* [MESOS-6233] - Master CHECK fails during recovery while relinking to other masters.
* [MESOS-6234] - Potential socket leak during Zookeeper network changes.
* [MESOS-6245] - Driver based schedulers performing explicit acknowledgements cannot acknowledge updates from HTTP based executors.
* [MESOS-6246] - Libprocess links will not generate an ExitedEvent if the socket creation fails.
* [MESOS-6248] - mesos-slave cannot start, Assertion `isError()' failed.
* [MESOS-6257] - Resources not recovered after rescinding an offer on DESTROY on shared volumes.
* [MESOS-6259] - CNI isolator should not `CHECK` for `resolv.conf` under `rootContainerDir`.
* [MESOS-6260] - Composing containerizer needs to properly handle nested container launch.
* [MESOS-6262] - Default executor should kill all other tasks in a task group if any task exits with a non-zero exit status.
* [MESOS-6263] - Mesos containerizer should figure out the correct sandbox directory for nested launch.
* [MESOS-6269] - CNI isolator doesn't activate loopback interface.
* [MESOS-6270] - Agent crashes when trying to recover pods.
* [MESOS-6274] - Agent should not allow HTTP executors to re-subscribe before containerizer recovery is done.
* [MESOS-6283] - Fix the Web UI allowing access to the task sandbox for nested containers.
* [MESOS-6289] - Pass the 'user' into nested container launch.
* [MESOS-6290] - Support nested containers for logger in Mesos Containerizer.
* [MESOS-6295] - Excessive logging on agent when oversubscription modules are attached.
* [MESOS-6300] - A destroyed nested container is not reflected in the parent container's children map.
* [MESOS-6301] - Recursive destroy in MesosContainerizer is problematic.
* [MESOS-6302] - Agent recovery can fail after nested containers are launched.
* [MESOS-6308] - CHECK failure in DRF sorter.
* [MESOS-6317] - Race in master/allocator when updating oversubscribed resources of an agent.
* [MESOS-6319] - ContentType/AgentAPITest.NestedContainerLaunch/1 is flaky.
* [MESOS-6321] - CHECK failure in HierarchicalAllocatorTest.NoDoubleAccounting.
* [MESOS-6322] - Agent fails to kill empty parent container.
* [MESOS-6323] - 'mesos-containerizer launch' should inherit agent environment variables.
* [MESOS-6324] - CNI should not use `ifconfig` in executors `pre_exec_command`.
* [MESOS-6363] - Default executor should not crash with a failed assertion if it notices a disconnection from the agent for non checkpointed frameworks.
* [MESOS-6370] - The executor library does not invoke the shutdown callback upon recovery timeout.
* [MESOS-6386] - "Reached unreachable statement" in LinuxCapabilitiesIsolatorTest.
* [MESOS-6391] - Command task's sandbox should not be owned by root if it uses container image.
* [MESOS-6393] - Deprecated SSL_ environment variables are non functional already.
* [MESOS-6420] - Mesos Agent leaking sockets when port mapping network isolator is ON.
* [MESOS-6445] - Reconciliation for unreachable agent after master failover is incorrect.
* [MESOS-6446] - WebUI redirect doesn't work with stats from /metric/snapshot.
* [MESOS-6457] - Tasks shouldn't transition from TASK_KILLING to TASK_RUNNING.
* [MESOS-6461] - Duplicate framework ids in /master/frameworks endpoint 'unregistered_frameworks'.
* [MESOS-6482] - Master check failure when marking an agent unreachable.
* [MESOS-6483] - Check failure when a 1.1 master marks a 0.28 agent as unreachable.
* [MESOS-6497] - Java Scheduler Adapter does not surface MasterInfo.
* [MESOS-6502] - _version uses incorrect MESOS_{MAJOR,MINOR,PATCH}_VERSION in libmesos java binding.
* [MESOS-6527] - Memory leak in the libprocess request decoder.
** Documentation
* [MESOS-5221] - Add Documentation for Nvidia GPU support.
* [MESOS-5808] - Elasticsearch misspelled on homepage.
* [MESOS-6028] - mesos-execute has a typo in volume help.
* [MESOS-6103] - Mesos version is not up to date in getting-started page.
* [MESOS-6343] - Documentation Error: Default Executor does not implicitly construct resources.
** Epic
* [MESOS-2449] - Support group of tasks (Pod) constructs and API in Mesos.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4791] - Operator API v1.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-6014] - Added port mapping CNI plugin.
** Improvement
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3567] - Support TCP checks in Mesos.
* [MESOS-4049] - Allow user to control behavior of partitioned agents/tasks.
* [MESOS-4155] - Speed up ExamplesTest.*.
* [MESOS-4172] - GarbageCollectorIntegrationTest.Restart is slow.
* [MESOS-4324] - Allow access to shared persistent volumes as read only or read write by tasks.
* [MESOS-4325] - Offer shareable resources to frameworks only if it is opted in.
* [MESOS-4431] - Support sharing of persistent volumes via shared resources.
* [MESOS-4663] - Speed up ExamplesTest.PersistentVolumeFramework.
* [MESOS-4694] - DRFAllocator takes very long to allocate resources with a large number of frameworks.
* [MESOS-4892] - Support arithmetic operations for shared resources with consumer counts.
* [MESOS-5038] - Added an any mechanism for futures.
* [MESOS-5070] - Introduce more flexible subprocess interface for child options.
* [MESOS-5425] - Consider using IntervalSet for Port range resource math.
* [MESOS-5464] - The max number of completed executors for a mesos slave should be configurable.
* [MESOS-5565] - Add logging when Offer::Operation::Launch message has no tasks.
* [MESOS-5716] - Document docker private registry with authentication support in Unified Containerizer.
* [MESOS-5732] - MasterAPITest.UnreserveResources is slow.
* [MESOS-5756] - CMake build system needs to regenerate protobufs when they are updated.
* [MESOS-5790] - Ensure all examples in Scheduler HTTP API docs are valid JSON.
* [MESOS-5822] - Add a build script for the Windows CI.
* [MESOS-5870] - Fix the large preview logo in Slack.
* [MESOS-5901] - Make the command executor unversioned.
* [MESOS-5936] - Operator SUBSCRIBE API should provide more task metadata than just state changes.
* [MESOS-5944] - Remove `O_SYNC` from StatusUpdateManager logs.
* [MESOS-5949] - Allow frameworks to learn the time when an agent became unreachable.
* [MESOS-5951] - Remove "strict registry" code.
* [MESOS-5954] - Docker executor does not use HealthChecker library.
* [MESOS-5955] - The "mesos-health-check" binary is not used anymore.
* [MESOS-5961] - HTTP and TCP health checks should support docker executor and bridged mode.
* [MESOS-5965] - Implement garbage collection for unreachable agent lists in registry.
* [MESOS-5978] - Improve run time for arithmetic operators for Resources.
* [MESOS-5983] - Number of libprocess worker threads is not configurable for log-rotation module.
* [MESOS-6006] - Abstract mesos-style.py to allow future linters to be added more easily.
* [MESOS-6008] - Add the infrastructure for a new python-based CLI.
* [MESOS-6025] - Validate health check protobuf in the master.
* [MESOS-6037] - Offer::Operation.type should be optional.
* [MESOS-6039] - Update elfio to version 3.2.
* [MESOS-6050] - Add an agent flag for 'runtime_dir'.
* [MESOS-6051] - Add functions to the 'Launcher' abstraction to aid in checkpointing the exit status of containers.
* [MESOS-6060] - Add MOUNT or PATH disk type in logging resources.
* [MESOS-6063] - Track recovered and prepared subsystems for a container.
* [MESOS-6065] - Support provisioning image volumes in an isolator.
* [MESOS-6075] - Avoid libprocess functions in `mesos-containerizer launch`.
* [MESOS-6080] - Expose metrics in scheduler library.
* [MESOS-6088] - Update launch helper to checkpoint exit status of launched process.
* [MESOS-6090] - Change master to always update registry before in-memory state.
* [MESOS-6096] - Update mesos-execute to support launching task groups.
* [MESOS-6098] - Frameworks UI shows metrics for used resources plus offers.
* [MESOS-6140] - Add a parallel test runner.
* [MESOS-6218] - Avoided concatenating cgroup internally in subsystems.
* [MESOS-6220] - HTTP handler failures should result in 500 response rather than 503 response.
* [MESOS-6242] - Expose unknown container case on Containerizer::wait.
* [MESOS-6243] - Expose failures and unknown container cases from Containerizer::destroy.
* [MESOS-6282] - CNI isolator should print plugin's stderr.
* [MESOS-6299] - Master doesn't remove task from pending when it is invalid.
* [MESOS-6310] - Remove or define non-POSIX function.
* [MESOS-6371] - Remove the 'recover()' interface in 'ContainerLogger'.
** Task
* [MESOS-3370] - Deprecate the external containerizer.
* [MESOS-4390] - Shared Volumes Design Doc.
* [MESOS-5039] - Add Subsystem abstraction for cgroups unified isolator.
* [MESOS-5040] - Add cgroups_subsystems flag for cgroups unified isolator.
* [MESOS-5041] - Add cgroups unified isolator.
* [MESOS-5042] - Add cpu subsystem support in cgroups unified isolator.
* [MESOS-5043] - Add cpuacct subsystem support in cgroups unified isolator.
* [MESOS-5045] - Add memory subsystem support in cgroups unified isolator.
* [MESOS-5046] - Add net_cls subsystem support in cgroups unified isolator.
* [MESOS-5047] - Add perf_event subsystem support in cgroups unified isolator.
* [MESOS-5051] - Create helpers for manipulating Linux capabilities.
* [MESOS-5144] - Cleanup memory leaks in libprocess finalize().
* [MESOS-5228] - Add tests for Capability API.
* [MESOS-5232] - Add capability information to ContainerInfo protobuf message.
* [MESOS-5275] - Add capabilities support for unified containerizer.
* [MESOS-5488] - Implement READ_FILE Call in v1 master API.
* [MESOS-5515] - Implement READ_FILE Call in v1 agent API.
* [MESOS-5516] - Implement GET_STATE Call in v1 agent API.
* [MESOS-5651] - Add devices subsystem support in cgroups unified isolator.
* [MESOS-5652] - Enable cgroups unified isolator.
* [MESOS-5788] - Added JAVA API adapter for seamless transition to new scheduler API.
* [MESOS-5809] - Implement GET_FRAMEWORKS Call in v1 agent API.
* [MESOS-5810] - Implement GET_EXECUTORS Call in v1 agent API.
* [MESOS-5811] - Implement GET_TASKS Call in v1 agent API.
* [MESOS-5855] - Create a 'Disk (not) full' example framework.
* [MESOS-5970] - Remove HTTP_PARSER_VERSION_MAJOR < 2 code in decoder.
* [MESOS-5973] - Remove CgroupsCpushareIsolator.
* [MESOS-5974] - Remove CgroupsMemIsolator.
* [MESOS-5975] - Remove CgroupsPerfEventIsolator.
* [MESOS-5976] - Remove CgroupsNetClsIsolator.
* [MESOS-5977] - Remove CgroupsDevicesIsolator.
* [MESOS-5987] - Update health check protobuf for HTTP and TCP health check.
* [MESOS-6017] - Introduce `PortMapping` protobuf.
* [MESOS-6020] - Remove `slavePid` from the Containerizer::launch API.
* [MESOS-6021] - Consolidate two `Containerizer::launch` methods into one.
* [MESOS-6023] - Create a binary for the port-mapper plugin.
* [MESOS-6036] - Define the Framework API protobufs required for TaskGroups.
* [MESOS-6042] - Validate TaskGroup launch in the master.
* [MESOS-6043] - Add interface for launching nested containers in Containerizer.
* [MESOS-6045] - Implement LAUNCH_GROUP operation in master.
* [MESOS-6067] - Support provisioner to be nested aware for Mesos Pods.
* [MESOS-6068] - Refactor MesosContainerizer::launch to prepare for nesting support.
* [MESOS-6070] - Renamed containerizer::Termination to ContainerTermination.
* [MESOS-6071] - Validate that an explicitly specified DEFAULT executor has disk resources.
* [MESOS-6073] - Update the streaming function for ContainerID to be nesting aware.
* [MESOS-6076] - Implement RunTaskGroup handler on the agent.
* [MESOS-6077] - Added a default (task group) executor.
* [MESOS-6102] - Add event for agent added in master operator API.
* [MESOS-6130] - Make the disk usage isolator nesting-aware.
* [MESOS-6150] - Introduce the new isolator recover interface for nested container support.
* [MESOS-6151] - Populate `CommandInfo` correctly for default executors.
* [MESOS-6156] - Make the `network/cni` isolator nesting aware.
* [MESOS-6160] - Add protobuf definition for a Volume::Source that specifies a path from parent container's sandbox.
* [MESOS-6186] - Make the generic `cgroups` isolator nesting aware.
* [MESOS-6188] - Make the `gpu/nvidia` isolator nesting aware.
* [MESOS-6189] - Add a virtual method to Isolator to indicate if it supports nesting.
* [MESOS-6190] - Make the docker/runtime isolator nesting aware.
* [MESOS-6191] - Make the filesystem/linux isolator nesting aware.
* [MESOS-6192] - Make the appc/runtime isolator nesting aware.
* [MESOS-6194] - Make the disk/du isolator nesting aware.
* [MESOS-6199] - Make the volume/image isolator nesting aware.
* [MESOS-6204] - Introduce a "runtime" directory owned by the containerizer for checkpointing container information.
* [MESOS-6227] - Update the default executor to launch/wait/destroy child containers.
* [MESOS-6230] - Add support for health checks to the default executor.
* [MESOS-6235] - Add 'argv' variant of 'os::system'.
* [MESOS-6241] - Add agent::Call / agent::Response API for managing nested containers.
* [MESOS-6258] - Add `volume/sandbox_path` isolator to support Volume::Source::SANDBOX_PATH.
* [MESOS-6265] - Adjust cgroups layout for nested containers.
* [MESOS-6272] - Allow WebUI/other tools to access the task sandbox for a nested container.
* [MESOS-6284] - MesosContainerizer should skip non-nesting aware isolators for nested container.
* [MESOS-6287] - MesosContainer should allow 'wait' on terminated nested container.
* [MESOS-6312] - Update CHANGELOG to mention addition of agent '--runtime_dir' flag.
* [MESOS-6344] - Allow `network/cni` isolator to take a search path for CNI plugins instead of single directory.
* [MESOS-6408] - Changelog for `mesos-cni-port-mapper` to 1.1.0.
** Wish
* [MESOS-5929] - Total cluster resources on master Mesos UI should have better spacing.
Release Notes - Mesos - Version 1.0.4
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-2537] - AC_ARG_ENABLED checks are broken.
* [MESOS-6606] - Reject optimized builds with libcxx before 3.9.
* [MESOS-7008] - Quota not recovered from registry in empty cluster.
* [MESOS-7265] - Containerizer startup may cause sensitive data to leak into sandbox logs.
* [MESOS-7366] - Agent sandbox gc could accidentally delete the entire persistent volume content.
* [MESOS-7383] - Docker executor logs possibly sensitive parameters.
* [MESOS-7422] - Docker containerizer should not leak possibly sensitive data to agent log.
Release Notes - Mesos - Version 1.0.3
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-6052] - Unable to launch containers on CNI networks on CoreOS.
* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
* [MESOS-6621] - SSL downgrade path will CHECK-fail when using both temporary and persistent sockets.
* [MESOS-6676] - Always re-link with scheduler during re-registration.
* [MESOS-6917] - Segfault when the executor sets an invalid UUID when sending a status update.
Release Notes - Mesos - Version 1.0.2
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-4638] - Versioning preprocessor macros.
* [MESOS-4973] - Duplicates in 'unregistered_frameworks' in /state.
* [MESOS-4975] - mesos::internal::master::Slave::tasks can grow unboundedly.
* [MESOS-5613] - mesos-local fails to start if MESOS_WORK_DIR isn't set.
* [MESOS-6013] - Use readdir instead of readdir_r.
* [MESOS-6026] - Tasks mistakenly marked as FAILED due to race b/w sendExecutorTerminatedStatusUpdate() and _statusUpdate().
* [MESOS-6074] - Master check failure if the metrics endpoint is polled soon after it starts.
* [MESOS-6100] - Make fails compiling 1.0.1.
* [MESOS-6104] - Potential FD double close in libevent's implementation of `sendfile`.
* [MESOS-6118] - Agent would crash with docker container tasks due to host mount table read.
* [MESOS-6122] - Mesos slave throws systemd errors even when passed a flag to disable systemd.
* [MESOS-6152] - Resource leak in libevent_ssl_socket.cpp.
* [MESOS-6212] - Validate the name format of mesos managed docker containers.
* [MESOS-6216] - LibeventSSLSocketImpl::create is not safe to call concurrently with os::getenv.
* [MESOS-6233] - Master CHECK fails during recovery while relinking to other masters.
* [MESOS-6234] - Potential socket leak during Zookeeper network changes.
* [MESOS-6245] - Driver based schedulers performing explicit acknowledgements cannot acknowledge updates from HTTP based executors.
* [MESOS-6246] - Libprocess links will not generate an ExitedEvent if the socket creation fails.
* [MESOS-6269] - CNI isolator doesn't activate loopback interface.
* [MESOS-6274] - Agent should not allow HTTP executors to re-subscribe before containerizer recovery is done.
* [MESOS-6324] - CNI should not use `ifconfig` in executors `pre_exec_command`.
* [MESOS-6391] - Command task's sandbox should not be owned by root if it uses container image.
* [MESOS-6393] - Deprecated SSL_ environment variables are non functional already.
* [MESOS-6420] - Mesos Agent leaking sockets when port mapping network isolator is ON.
* [MESOS-6446] - WebUI redirect doesn't work with stats from /metric/snapshot.
* [MESOS-6457] - Tasks shouldn't transition from TASK_KILLING to TASK_RUNNING.
* [MESOS-6461] - Duplicate framework ids in /master/frameworks endpoint 'unregistered_frameworks'.
* [MESOS-6502] - _version uses incorrect MESOS_{MAJOR,MINOR,PATCH}_VERSION in libmesos java binding.
* [MESOS-6527] - Memory leak in the libprocess request decoder.
** Improvement
* [MESOS-6075] - Avoid libprocess functions in `mesos-containerizer launch`.
* [MESOS-6299] - Master doesn't remove task from pending when it is invalid.
Release Notes - Mesos - Version 1.0.1
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-5388] - MesosContainerizerLaunch flags execute arbitrary commands via shell.
* [MESOS-5862] - External links to .md files broken.
* [MESOS-5911] - Webui redirection to leader in browser does not work.
* [MESOS-5913] - Stale socket FD usage when using libevent + SSL.
* [MESOS-5922] - mesos-agent --help exit status is 1.
* [MESOS-5923] - Ubuntu 14.04 LTS GPU Isolator "/run" directory is noexec.
* [MESOS-5927] - Unable to run "scratch" Dockerfiles with Unified Containerizer.
* [MESOS-5928] - Agent's '--version' flag doesn't work.
* [MESOS-5930] - Orphan tasks can show up as running after they have finished.
* [MESOS-5943] - Incremental http parsing of URLs leads to decoder error.
* [MESOS-5945] - NvidiaVolume::create() should check for root before creating volume.
* [MESOS-5959] - All non-root tests fail on GPU machine.
* [MESOS-5969] - Linux 'MountInfoTable' entries not sorted as expected.
* [MESOS-5982] - NvidiaVolume errors out if any binary is missing.
* [MESOS-5986] - SSL Socket CHECK can fail after socket receives EOF.
* [MESOS-5988] - PollSocketImpl can write to a stale fd.
** Improvement
* [MESOS-5830] - Make a sweep to trim excess space around angle brackets.
** Task
* [MESOS-5970] - Remove HTTP_PARSER_VERSION_MAJOR < 2 code in decoder.
Release Notes - Mesos - Version 1.0.0
-------------------------------------
This release contains the following new features:
* Scheduler and Executor v1 HTTP APIs are now considered stable.
* [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs. These
APIs let operators and services (monitoring, load balancers) send HTTP
requests to '/api/v1' endpoint on master or agent. See
`docs/operator-http-api.md` for details.
* [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
has been added to isolate disk resources more efficiently. Please refer to
docs/mesos-containerizer.md for more details.
* [MESOS-4355] - **Experimental** support for Docker volume plugin. We added a
new isolator 'docker/volume' which allows users to use external volumes in
Mesos containerizer. Currently, the isolator interacts with the Docker
volume plugins using a tool called 'dvdcli'. By speaking the Docker volume
plugin API, most of the Docker volume plugins are supported.
* [MESOS-4641] - **Experimental** A new network isolator, the
`network/cni` isolator, has been introduced in the `MesosContainerizer`. The
`network/cni` isolator implements the Container Network Interface (CNI)
specification proposed by CoreOS. With CNI the `network/cni` isolator is
able to allocate a network namespace to Mesos containers and attach the
container to different types of IP networks by invoking network drivers
called CNI plugins.
* [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored in
order to decouple the ACLs definition language from the interface.
It additionally includes the option of retrieving an `ObjectApprover`. An
`ObjectApprover` can be used to synchronously check authorizations for a
given object and is hence useful when authorizing a large number of objects
and/or large objects (which need to be copied using request based
authorization). NOTE: This is a **breaking change** for authorizer modules.
* [MESOS-5405] - The `subject` and `object` fields in authorization::Request
have been changed from required to optional. If either of these fields is
not set, the request should only be authorized if any subject/object should
be allowed.
NOTE: This is a semantic change for authorizer modules.
* [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP endpoint
filtering enables operators to restrict what part of the cluster state a
user is authorized to see.
Consider for example the `/state` master endpoint: an operator can now
authorize users to only see a subset of the running frameworks, tasks, or
executors. The following endpoints support HTTP endpoint filtering:
'/state', '/state-summary', '/tasks', '/frameworks', '/weights',
and '/roles'. Additionally, the following v1 API calls support filtering:
'GET_ROLES', 'GET_WEIGHTS', 'GET_FRAMEWORKS', 'GET_STATE', and 'GET_TASKS'.
* [MESOS-4909] - Tasks can now specify a kill policy. Kill policies are
best-effort, because machine failures or forcible terminations may occur. Currently, the
only available kill policy is how long to wait between graceful and forcible
task kill. In the future, more policies may be available (e.g. hitting an
HTTP endpoint, running a command, etc). Note that it is the executor's
responsibility to enforce kill policies. For executor-less command-based
tasks, the kill is performed via sending a signal to the task process:
SIGTERM for the graceful kill and SIGKILL for the forcible kill. For docker
executor-less tasks the grace period is passed to 'docker stop --time'. This
feature supersedes the '--docker_stop_timeout', which is now deprecated.
* [MESOS-4908] - The task kill policy defined within 'TaskInfo' can now be
overridden when the scheduler kills the task. This can be used by schedulers
to forcefully kill a task which is already being killed, e.g. if something
went wrong during a graceful kill and a forcible kill is desired. Note that
it is the executor's responsibility to honor the 'Event.kill.kill_policy'
field and override the task's kill policy and kill policy from a previous
kill task request. To use this feature, schedulers and executors must
support HTTP API; use the '--http_command_executor' agent flag to ensure
the agent launches the HTTP API based command executor.
* [MESOS-4949] - The executor shutdown grace period can now be configured in
`ExecutorInfo`, which overrides the agent flag. When shutting down an
executor the agent will wait in a best-effort manner for the grace period
specified here before forcibly destroying the container. The executor must
not assume that it will always be allotted the full grace period, as the
agent may decide to allot a shorter period and failures / forcible
terminations may occur. Together with kill policies this gives frameworks
flexibility around how to clean up tasks and executors.
* [MESOS-3094] - **Experimental** support for launching mesos tasks on
Windows. Note that there are no isolation guarantees provided yet.
* [MESOS-4090] - The `mesos.native` python module has been split into two,
`mesos.executor` and `mesos.scheduler`. This change also removes
unnecessary 3rd party dependencies from `mesos.executor` and
`mesos.scheduler`. `mesos.native` still exists, combining both modules for
backwards compatibility with existing code.
* [MESOS-1478] - Phase I of the Slave to Agent rename is complete. To support
the rename, new duplicate flags (e.g., --agent_reregister_timeout), new
binaries (e.g., mesos-agent) and WebUI sandbox links have been added. All
the logging output has been updated to use the term 'agent' now. Flags,
binaries and scripts with 'slave' keyword have been deprecated (see the
"Deprecations" section below).
* [MESOS-4312] - **Experimental** support for building and running mesos on
IBM PowerPC platform.
* [MESOS-4189] - Weights for resource roles can now be configured dynamically
via the new '/weights' endpoint on the master.
* [MESOS-4424] - Support for using Nvidia GPUs as a resource in the
Mesos "unified" containerizer. This support includes running containers
with and without filesystem isolation (i.e. running both imageless
containers as well as containers using a docker image). Frameworks must
opt-in to receiving GPU resources via the GPU_RESOURCES framework
capability (see the scarce resource problem in MESOS-5377). We support
'nvidia-docker'-style docker containers by injecting a volume that
contains the Nvidia libraries / binaries when the docker image has
the 'com.nvidia.volumes.needed' label. Support for the docker
containerizer will come in a future release.
* [MESOS-5724] - SSL certificate validation allows for additional IP address
subject alternative name extension verification.
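As an illustration of the kill policy described under MESOS-4909 above, the escalation from a graceful kill (SIGTERM) to a forcible kill (SIGKILL) after a grace period can be sketched as follows. This is a minimal Python sketch for POSIX systems, not Mesos code: the function name `kill_with_grace_period` and its parameters are hypothetical, and a real executor does considerably more bookkeeping.

```python
import signal
import subprocess
import sys


def kill_with_grace_period(proc, grace_period_secs):
    """Send SIGTERM, wait up to grace_period_secs, then escalate to SIGKILL.

    Returns "graceful" if the process exited within the grace period,
    otherwise "forcible". (Hypothetical helper, for illustration only.)
    """
    proc.send_signal(signal.SIGTERM)  # graceful kill
    try:
        proc.wait(timeout=grace_period_secs)
        return "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()  # forcible kill (SIGKILL on POSIX)
        proc.wait()  # reap the process so it does not linger as a zombie
        return "forcible"


if __name__ == "__main__":
    # Stand-in for a task process; it does not handle SIGTERM specially.
    task = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    print(kill_with_grace_period(task, grace_period_secs=3.0))
```

A task that ignores SIGTERM would take the "forcible" path once the grace period expires, which is why the notes above call kill policies best-effort.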
Deprecations:
* [MESOS-2281] - Deprecated the plain text format for credentials in favor of
the JSON format.
* [MESOS-4910] - Deprecate the --docker_stop_timeout agent flag.
* [MESOS-5001] - The 'allocator/event_queue_dispatches' metric is now
deprecated in favor of 'allocator/mesos/event_queue_dispatches'.
* [MESOS-5029] - Deprecated the ExecutorInfo.source field in favor of
ExecutorInfo.labels.
* [MESOS-3781] - Deprecated flags with keyword 'slave' in favor of 'agent'.
* [MESOS-3779] - Deprecated sandbox links with 'slave' keyword in the WebUI.
* [MESOS-3784] - Deprecated `slave` subcommand for mesos-cli.
* [MESOS-5155] - Deprecated `SetQuota` and `RemoveQuota` ACLs. This change is
applicable to the local authorizer as well as any custom authorizer module.
* [MESOS-5666] - Deprecated camel cased `taskInfo` and `executorInfo` in
isolator `ContainerConfig`.
* [MESOS-5863] - Deprecated `SSL_*` environment variables used by libprocess
SSL support in favor of using `LIBPROCESS_SSL_*`.
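For deployments that still export the deprecated `SSL_*` variables (MESOS-5863 above), the rename to `LIBPROCESS_SSL_*` can be sketched as a small environment rewrite. This is a hedged Python sketch, not a tool shipped with Mesos: the function name `migrate_ssl_env` is hypothetical, and the variable names in the usage note are only examples.

```python
OLD_PREFIX = "SSL_"
NEW_PREFIX = "LIBPROCESS_SSL_"


def migrate_ssl_env(env):
    """Return a copy of env with `SSL_*` names rewritten to `LIBPROCESS_SSL_*`.

    If both the old and the new name are present, the new-style value wins
    and the old-style entry is dropped. (Hypothetical helper.)
    """
    migrated = {}
    for name, value in env.items():
        if name.startswith(OLD_PREFIX):
            new_name = NEW_PREFIX + name[len(OLD_PREFIX):]
            if new_name not in env:
                migrated[new_name] = value
        else:
            migrated[name] = value
    return migrated
```

For example, `migrate_ssl_env({"SSL_ENABLED": "1"})` yields a mapping with the single key `LIBPROCESS_SSL_ENABLED`; unrelated variables pass through unchanged.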
Additional API Changes:
* [MESOS-4580] - Returning `202` (Accepted) for /reserve and related endpoints.
* [MESOS-4735] - Added 'output_file' field to CommandInfo.URI in Scheduler API
and v1 Scheduler HTTP API.
* [MESOS-5014] - Changed Call and Event Type enums in scheduler.proto
from required to optional for the purpose of backwards compatibility.
* [MESOS-5015] - Changed Call and Event Type enums in executor.proto
from required to optional for the purpose of backwards compatibility.
* [MESOS-5029] - Added 'labels' to ExecutorInfo.
* [MESOS-5030] - Added non-terminal task metadata to the container resource
usage information.
* [MESOS-5408] - Deleted the /observe HTTP endpoint.
* [MESOS-4843, MESOS-5150, MESOS-5286, MESOS-5335, MESOS-5336] - Authorization
has been added to the '/metrics/snapshot', '/logging/toggle', '/quota',
'/files/browse', '/files/download', '/files/read', '/flags', and
'/containers' endpoints. If a Mesos cluster has authorization enabled, these
endpoints now require that ACLs be set to authorize principals to access
them. Note that the '/metrics/snapshot' and '/files/*' endpoints are used by
the web UI, and thus using the web UI in a cluster with authorization
enabled will require that ACLs be set appropriately.
* [MESOS-5064] - Remove default value for the agent `work_dir`. This flag is
now required, and the agent will exit immediately if it is not provided.
* [MESOS-5637] - Authorized endpoints consistently return `503` (Service
Unavailable) instead of `500` (Internal Server Error) when the authenticator
or the authorizer fails to process the request.
* [MESOS-5657] - Executors should not inherit environment variables from the
agent.
* [MESOS-5680] - We should not 'chown -R' on persistent volumes every time
container tries to use it.
* [MESOS-5642] - Namespace and header file of `Allocator` have been moved to
be consistent with other packages.
* [MESOS-5851] - The flag `--authenticate_http` has been deprecated in favor
of `--authenticate_http_readwrite`. This new flag enables authentication for
all HTTP endpoints which support authentication and allow modification of
the state of the cluster. A new flag has also been added,
`--authenticate_http_readonly`, which enables authentication for those
authenticatable endpoints that cannot be used to modify the cluster state.
* [MESOS-5833] - Disable the experimental `registry_strict` master flag.
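The `--authenticate_http` deprecation (MESOS-5851 above) amounts to a one-for-one flag rename, which can be sketched as a rewrite of a command line. This is a minimal Python sketch under stated assumptions: the function name `migrate_http_auth_flag` is hypothetical, only the `--flag` and `--flag=value` spellings are handled, and the example path in the usage note is illustrative.

```python
OLD_FLAG = "--authenticate_http"
NEW_FLAG = "--authenticate_http_readwrite"


def migrate_http_auth_flag(argv):
    """Replace the deprecated flag with its successor, preserving any value.

    Only exact matches on the flag name are rewritten, so
    `--authenticate_http_readonly` is left untouched. (Hypothetical helper.)
    """
    result = []
    for arg in argv:
        name, sep, value = arg.partition("=")
        if name == OLD_FLAG:
            result.append(NEW_FLAG + sep + value)
        else:
            result.append(arg)
    return result
```

For example, `["--authenticate_http=true", "--work_dir=/var/lib/mesos"]` becomes `["--authenticate_http_readwrite=true", "--work_dir=/var/lib/mesos"]`. Matching on the exact name rather than a prefix is what keeps the new read-only flag from being rewritten by mistake.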
3rd Party Upgrades:
* [MESOS-4805] - Upgraded vendored ry-http-parser-1c3624a to nodejs/http-parser 2.6.1.
* [MESOS-4678] - Upgraded vendored protobuf 2.5.0 to 2.6.1.
* [MESOS-4803] - Upgraded vendored libev 4.15 to 4.22.
* [MESOS-4612] - Upgraded vendored ZooKeeper 3.4.5 to 3.4.8.
Binary API Changes:
* [MESOS-5055] - Slave/Agent Rename Phase I - Update strings in the log message
and standard output.
* [MESOS-3782] - Slave/Agent Rename Phase I - Duplicate/Rename binaries.
* [MESOS-5057] - Slave/Agent Rename Phase I - Update strings in error messages and
other strings.
* [MESOS-5230] - Slave/Agent Rename Phase I: Rename '/include/mesos/slave' folder
All Issues:
** Bug
* [MESOS-1495] - Create separate local data file to manage releases
* [MESOS-1575] - master sets failover timeout to 0 when framework requests a high value
* [MESOS-1865] - Redirect to the leader master when current master is not a leader
* [MESOS-2043] - Framework auth fail with timeout error and never get authenticated
* [MESOS-2198] - Document that TaskIDs should not be reused
* [MESOS-2201] - ReplicaTest.Restore fails with leveldb greater than v1.7.
* [MESOS-2331] - MasterSlaveReconciliationTest.ReconcileRace is flaky
* [MESOS-2858] - FetcherCacheHttpTest.HttpMixed is flaky.
* [MESOS-3181] - Implement package rebundling for Windows
* [MESOS-3319] - Mesos will not build when configured with gperftools enabled
* [MESOS-3402] - mesos-execute does not support credentials
* [MESOS-3573] - Mesos does not kill orphaned docker containers
* [MESOS-3714] - `os::environ` collides with the `environ` macro in Windows headers.
* [MESOS-3737] - `limiter.hpp` causes template specialization error on Windows 10/MSVC 1900
* [MESOS-3739] - Mesos does not set Content-Type for 400 Bad Request
* [MESOS-3881] - Implement `stout/os/pstree.hpp` on Windows
* [MESOS-3902] - The Location header when non-leading master redirects to leading master is incomplete.
* [MESOS-3923] - Implement AuthN handling in Master for the Scheduler endpoint
* [MESOS-3976] - C++ HTTP Scheduler Library does not work with SSL enabled
* [MESOS-4099] - parallel make tests does not build all test targets
* [MESOS-4269] - Minor typo in src/linux/cgroups.cpp
* [MESOS-4279] - Docker executor truncates task's output when the task is killed.
* [MESOS-4387] - Added new test cases for reviveOffers in allocator test
* [MESOS-4434] - Install 3rdparty package boost, glog, protobuf and picojson when installing Mesos
* [MESOS-4447] - Renamed reserved() API to reservations()
* [MESOS-4462] - Port `gmtime_r`
* [MESOS-4463] - Implement `hstrerror`
* [MESOS-4464] - Implement cpu count facilities on Windows
* [MESOS-4465] - Implement pagesize facilities in Windows
* [MESOS-4466] - Implement `waitpid` in Windows
* [MESOS-4469] - Implement memory querying in Windows
* [MESOS-4470] - Implement `uname` in Windows
* [MESOS-4471] - Implement process querying/counting in Windows
* [MESOS-4472] - Implement `getenv` in Windows
* [MESOS-4473] - Implement `shell` in Windows
* [MESOS-4474] - Implement `sendfile` in Windows
* [MESOS-4580] - Consider returning `202` (Accepted) for /reserve and related endpoints
* [MESOS-4611] - Passing a lambda to dispatch() always matches the template returning void
* [MESOS-4633] - Tests will dereference stack allocated agent objects upon assertion/expectation failure.
* [MESOS-4634] - Tests will dereference stack allocated master objects upon assertion/expectation failure.
* [MESOS-4658] - process::Connection can lead to process::wait deadlock
* [MESOS-4662] - PortMapping network isolator should not assume BIND_MOUNT_ROOT is a realpath.
* [MESOS-4672] - Implement aufs based provisioner backend.
* [MESOS-4673] - Agent fails to shutdown after reregistering period timed-out.
* [MESOS-4680] - HTTP requests to non leading mesos-master redirect to top level page
* [MESOS-4684] - Create base docker image for test suite.
* [MESOS-4705] - Linux 'perf' parsing logic may fail when OS distribution has perf backports.
* [MESOS-4744] - mesos-execute should allow setting role
* [MESOS-4807] - IOTest.BufferedRead writes to the current directory
* [MESOS-4810] - ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_ShellCommand fails.
* [MESOS-4827] - Destroy Docker container crashes Mesos slave
* [MESOS-4875] - overlayfs does not work when launching tasks
* [MESOS-4876] - bind backend does not work when launching tasks
* [MESOS-4885] - Unzip should force overwrite
* [MESOS-4901] - Build fails on some systems due to unportable use of time.h
* [MESOS-4911] - Executor driver does not respect executor shutdown grace period.
* [MESOS-4912] - LinuxFilesystemIsolatorTest.ROOT_MultipleContainers fails.
* [MESOS-4922] - Setup proper /etc/hostname, /etc/hosts and /etc/resolv.conf for containers in network/cni isolator.
* [MESOS-4924] - MAC OS build failed
* [MESOS-4942] - Docker runtime isolator tests may cause disk issue.
* [MESOS-4950] - Implement reconnect functionality in the scheduler library.
* [MESOS-4952] - Annoying image provisioner logging for when images are not used.
* [MESOS-4954] - URI fetcher error message if plugin is not found is misleading.
* [MESOS-4957] - Typo in Mesos portal
* [MESOS-4961] - ContainerLoggerTest.LOGROTATE_RotateInSandbox is flaky
* [MESOS-4963] - Incorrect CXXFLAGS with GCC 6
* [MESOS-4972] - Implement `os::rename`
* [MESOS-4978] - Update mesos-execute with Appc changes.
* [MESOS-4981] - Framework (re-)register metric counters broken for calls made via scheduler driver
* [MESOS-4984] - MasterTest.SlavesEndpointTwoSlaves is flaky
* [MESOS-5000] - MasterTest.MasterLost is flaky
* [MESOS-5005] - Enforce that DiskInfo principal is equal to framework/operator principal
* [MESOS-5010] - Installation of mesos python package is incomplete
* [MESOS-5012] - Protobuf change for external storage.
* [MESOS-5013] - Add docker volume driver isolator for Mesos containerizer.
* [MESOS-5018] - FrameworkInfo Capability enum does not support upgrades.
* [MESOS-5031] - Authorization Action enum does not support upgrades.
* [MESOS-5060] - Requesting /files/read.json with a negative length value causes subsequent /files requests to 404.
* [MESOS-5063] - SSLTest.HTTPSPost and SSLTest.HTTPSGet tests fail
* [MESOS-5064] - Remove default value for the agent `work_dir`
* [MESOS-5082] - Fix a bug in the Nvidia GPU device isolator that exposes a discrepancy between clang and gcc in 'using' declarations
* [MESOS-5113] - `network/cni` isolator crashes when launched without the --network_cni_plugins_dir flag
* [MESOS-5114] - Flags::parse does not handle empty string correctly.
* [MESOS-5115] - Grant access to /dev/nvidiactl and /dev/nvidia-uvm in the Nvidia GPU isolator.
* [MESOS-5121] - pivot_root is not available on PowerPC
* [MESOS-5125] - Commit message hook iterates over words, rather than lines.
* [MESOS-5126] - Commit message hook iterates over the commented lines.
* [MESOS-5127] - Reset `LIBPROCESS_IP` in `network/cni` isolator.
* [MESOS-5128] - PersistentVolumeTest.AccessPersistentVolume is flaky
* [MESOS-5131] - Slave allows the resource estimator to send non-revocable resources.
* [MESOS-5132] - Commit message hook lints the diff in verbose mode.
* [MESOS-5138] - Fix Nvidia GPU test build for namespace change of MasterDetector
* [MESOS-5142] - Add agent flags for HTTP authorization.
* [MESOS-5146] - MasterAllocatorTest/1.RebalancedForUpdatedWeights is flaky.
* [MESOS-5153] - Sandboxes contents should be protected from unauthorized users
* [MESOS-5162] - Commit message hook behaves incorrectly when a message includes a "*".
* [MESOS-5166] - ExamplesTest.DynamicReservationFramework is slow
* [MESOS-5181] - Master should reject calls from the scheduler driver if the scheduler is not connected.
* [MESOS-5184] - Mesos does not validate role info when framework registered with specified role
* [MESOS-5196] - Sandbox GC shouldn't return early in the face of an error.
* [MESOS-5199] - The mesos-execute prints confusing message when launching tasks.
* [MESOS-5216] - Document docker volume driver isolator.
* [MESOS-5224] - buffer overflow error in slave upon processing malformed UUIDs
* [MESOS-5225] - Command executor can not start when joining a CNI network
* [MESOS-5226] - The image-less task launched by mesos-execute can not join CNI network
* [MESOS-5230] - Slave/Agent Rename Phase I: Rename '/include/mesos/slave' folder
* [MESOS-5233] - python packages installation is broken
* [MESOS-5237] - The windows version of `os::access` has differing behavior than the POSIX version.
* [MESOS-5239] - Persistent volume DockerContainerizer support assumes proper mount propagation setup on the host.
* [MESOS-5240] - Command executor may escalate after the task is reaped.
* [MESOS-5244] - Compilation failure on Ubuntu 16.04
* [MESOS-5253] - Isolator cleanup should not be invoked if they are not prepared yet.
* [MESOS-5263] - pivot_root is not available on ARM
* [MESOS-5265] - Update mesos-execute to support docker volume isolator.
* [MESOS-5266] - add test cases for docker volume driver
* [MESOS-5277] - Need to add REMOVE semantics to the copy backend
* [MESOS-5279] - DRF sorter add/activate doesn't check if it's adding a duplicate entry
* [MESOS-5282] - Destroy container while provisioning volume images may lead to a race.
* [MESOS-5287] - boto is no longer a Mesos dependency.
* [MESOS-5293] - Endpoint handlers for master and agent are implemented surprisingly differently.
* [MESOS-5294] - Status updates after a health check are incomplete or invalid
* [MESOS-5295] - The task launched by non-checkpointed HTTP command executor will keep running till executor shutdown grace period (5s) after agent process exits.
* [MESOS-5304] - /metrics/snapshot endpoint help disappeared on agent.
* [MESOS-5308] - ROOT_XFS_QuotaTest.NoCheckpointRecovery failed.
* [MESOS-5312] - Env `MESOS_SANDBOX` is not set properly for command tasks that changes rootfs.
* [MESOS-5318] - Make `os::close` always catch structured exceptions on Windows
* [MESOS-5326] - Error symbolic link of include/slave
* [MESOS-5330] - Agent should backoff before connecting to the master
* [MESOS-5340] - libevent builds may prevent new connections
* [MESOS-5341] - Enabled docker volume support for DockerContainerizer
* [MESOS-5351] - DockerVolumeIsolatorTest.ROOT_INTERNET_CURL_CommandTaskRootfsWithVolumes is flaky
* [MESOS-5354] - Update "driver" as optional for DockerVolume.
* [MESOS-5359] - The scheduler library should have a delay before initiating a connection with master.
* [MESOS-5380] - Killing a queued task can cause the corresponding command executor to never terminate.
* [MESOS-5381] - Network portmapping isolator disable IPv6 failed
* [MESOS-5382] - Implement os::fsync
* [MESOS-5383] - Implement os::setHostname
* [MESOS-5386] - Add `HANDLE` overloads for functions that take a file descriptor
* [MESOS-5389] - docker containerizer should prefix relative volume.container_path values with the path to the sandbox.
* [MESOS-5390] - v1 Executor Protos not included in maven jar
* [MESOS-5397] - Slave/Agent Rename Phase 1: Update terms in the website
* [MESOS-5403] - Introduce ObjectApprover Interface to Authorizer.
* [MESOS-5405] - Make fields in authorization::Request protobuf optional.
* [MESOS-5407] - Slave/Agent rename: diagrams in docs
* [MESOS-5408] - Delete the /observe HTTP endpoint
* [MESOS-5413] - `network/cni` isolator should skip the bind mounting of the CNI network information root directory if possible
* [MESOS-5414] - configure failed on ubuntu and centos
* [MESOS-5415] - bootstrap of libprocess fails.
* [MESOS-5416] - make check of stout fails.
* [MESOS-5422] - Website README.md is out of date
* [MESOS-5423] - Updating the website section in release-guide is out of date
* [MESOS-5428] - Update the mechanism to define flags in FlagsBase derived classes
* [MESOS-5429] - Enhance error message for mesos-ps
* [MESOS-5432] - Javadoc in project website didn't include the generated protobuf
* [MESOS-5434] - Incomplete bootstrap 3.3.6 upgrade in webui
* [MESOS-5436] - GPU resource broke framework data table in webUI
* [MESOS-5437] - AppC appc_simple_discovery_uri_prefix is lost in configuration.md
* [MESOS-5438] - Add more verbose log for mesos-cat, mesos-tail or mesos-scp
* [MESOS-5445] - Allow libprocess/stout to build without first doing `make` in 3rdparty.
* [MESOS-5449] - Memory leak in SchedulerProcess.declineOffer
* [MESOS-5450] - Make the SASL dependency optional.
* [MESOS-5451] - Show Framework ID in log for long-lived-framework
* [MESOS-5453] - CNI should not store subnet of address in NetworkInfo
* [MESOS-5477] - Implement GET_HEALTH Call in v1 master API.
* [MESOS-5478] - Implement GET_HEALTH Call in v1 agent API.
* [MESOS-5479] - Implement GET_VERSION Call in v1 master API.
* [MESOS-5480] - Implement GET_VERSION Call in v1 agent API.
* [MESOS-5531] - Re-enable style-check for stout.
* [MESOS-5537] - http v1 SUBSCRIBED scheduler event always has nil http_interval_seconds
* [MESOS-5543] - /dev/fd is missing in the Mesos containerizer environment
* [MESOS-5554] - Change major/minor device types for Nvidia GPUs to `unsigned int`
* [MESOS-5556] - Fix method of populating device entries for `/dev/nvidia-uvm`, etc.
* [MESOS-5561] - Need to remove references to "messages/messages.hpp" from `State` API
* [MESOS-5571] - Scheduler JNI throws exception when the major versions of JAR and libmesos don't match
* [MESOS-5575] - Attempting to Parse PID logging is too verbose
* [MESOS-5577] - Modules using replicated log state API require zookeeper headers
* [MESOS-5587] - FullFrameworkWriter makes master segmentation fault.
* [MESOS-5595] - GMock warning in FaultToleranceTest.SchedulerReregisterAfterFailoverTimeout
* [MESOS-5600] - DRF sorter unnecessarily re-sorts due to misuse of "dirty" bit.
* [MESOS-5601] - DRF sorter does not re-calculate share if a client weight is updated.
* [MESOS-5607] - Refactored overlay, overlayfs and aufs checking to fs::supported
* [MESOS-5609] - Put initial scaffolding in place for implementing SUBSCRIBE call on v1 Master API.
* [MESOS-5611] - Error message is not clear when create docker volume with absolute path
* [MESOS-5615] - When using command executor, the ExecutorInfo is useless for sandbox authorization
* [MESOS-5616] - Add missing comments for GET_FLAGS, GET_HEALTH, GET_VERSION, GET_LOGGING_LEVEL, GET_LEADING_MASTER
* [MESOS-5627] - Quota-related authorization actions should be removed rather than deprecated.
* [MESOS-5629] - Agent segfaults after request to '/files/browse'
* [MESOS-5637] - Authorized endpoint results are inconsistent for failures.
* [MESOS-5642] - Move include/mesos/v1/master/allocator.proto to its own directory and package
* [MESOS-5657] - Executors should not inherit environment variables from the agent.
* [MESOS-5660] - ContainerizerTest.ROOT_CGROUPS_BalloonFramework fails because executor environment isn't inherited
* [MESOS-5664] - Invalid resources sent to '/reserve' are silently dropped
* [MESOS-5667] - CniIsolatorTest.ROOT_INTERNET_CURL_LaunchCommandTask fails on CentOS 7.
* [MESOS-5668] - Add CGROUP namespace to linux ns helper.
* [MESOS-5669] - CNI isolator should not return failure if /etc/hostname does not exist on host.
* [MESOS-5670] - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky.
* [MESOS-5671] - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics is flaky.
* [MESOS-5673] - Port mapping isolator may cause segfault if its bind mount root does not exist.
* [MESOS-5674] - Port mapping isolator may fail in 'isolate' method.
* [MESOS-5680] - We should not 'chown -R' on persistent volumes every time container tries to use it.
* [MESOS-5684] - Master captures `this` when creating authorization callback
* [MESOS-5685] - The /files/download endpoint's authorization can be compromised
* [MESOS-5691] - SSL downgrade support will leak sockets in CLOSE_WAIT status
* [MESOS-5692] - Add helper function "begin_with/end_with" to strings
* [MESOS-5695] - Add missing comments for GET_ROLES, GET_WEIGHTS, SUBSCRIBE, CREATE_VOLUMES, DESTROY_VOLUMES, SET_QUOTA
* [MESOS-5698] - Quota sorter not updated for resource changes at agent.
* [MESOS-5715] - Enhance startsWith/endsWith's performance
* [MESOS-5723] - SSL-enabled libprocess will leak incoming links to forks
* [MESOS-5724] - SSL certificate validation should allow IP only verification.
* [MESOS-5727] - Command executor health check does not work when the task specifies container image.
* [MESOS-5748] - Potential segfault in `link` and `send` when linking to a remote process
* [MESOS-5755] - NVML headers are not installed as part of 3rdparty install with --enable-install-module-dependencies
* [MESOS-5757] - Authorize orphaned tasks
* [MESOS-5760] - MAC OS Build failed
* [MESOS-5763] - Task stuck in fetching is not cleaned up after --executor_registration_timeout.
* [MESOS-5766] - Missing License Information for Bundled NVML headers
* [MESOS-5794] - Agent's /containers endpoint should skip terminated executors.
* [MESOS-5799] - docker::inspect() may get wrong output when a docker container is not in "running" state
* [MESOS-5806] - CNI isolator should prepare network related /etc/* files for containers using host mode but specify container images.
* [MESOS-5834] - Mesos may pass --volume-driver to the Docker daemon multiple times.
* [MESOS-5844] - PersistentVolumeEndpointsTest.OfferCreateThenEndpointRemove test is flaky
* [MESOS-5845] - The fetcher can access any local file as root
* [MESOS-5848] - Docker health checks are malformed.
* [MESOS-5851] - Create mechanism to control authentication between different HTTP endpoints
* [MESOS-5863] - Enabling SSL causes fetcher fail to fetch from HTTPS sites.
* [MESOS-5891] - /help endpoint does not set Content-Type to HTML.
** Documentation
* [MESOS-4381] - Improve upgrade compatibility documentation.
* [MESOS-4514] - Document how to implement Mesos HTTP operator endpoints.
* [MESOS-4689] - Design doc for v1 Operator API
* [MESOS-4726] - Document scheduler driver calls in framework development guide.
* [MESOS-4750] - Document: Mesos Executor expects all SSL_* environment variables to be set
* [MESOS-4785] - Reorganize ACL subject/object descriptions.
* [MESOS-4787] - HTTP endpoint docs should use shorter paths
* [MESOS-5215] - Update the documentation for '/reserve' and '/create-volumes'
* [MESOS-5313] - Failed to set quota and update weight according to document
* [MESOS-5366] - Update documentation to include contender/detector module
* [MESOS-5419] - Document all known client libraries for the Scheduler/Executor API
* [MESOS-5583] - Improve authorization documentation when setting permissive flag.
* [MESOS-5586] - Move design docs from wiki to web page
* [MESOS-5702] - CNI documentation example is not explicit enough about external plugins
** Epic
* [MESOS-1478] - Slave to Agent rename (Phase I).
* [MESOS-2297] - Add authentication support for HTTP API
* [MESOS-2948] - Generalize authorizer interface in order to allow for arbitrary Subjects, Actions and Objects
* [MESOS-4189] - Dynamic weights
* [MESOS-4843] - Authorize Master Operator Endpoints
* [MESOS-4847] - Agent HTTP Authentication
* [MESOS-4931] - Authorization based filtering for endpoints.
* [MESOS-5150] - Authorize Agent HTTP Endpoints
* [MESOS-5703] - Authorize operator endpoints for Mesos 1.0
** Improvement
* [MESOS-1571] - Signal escalation timeout is not configurable.
* [MESOS-2145] - Distinguish frameworks according to their state in the webui
* [MESOS-2154] - Port CFS quota support to Docker Containerizer
* [MESOS-2281] - Deprecate plain text Credential format.
* [MESOS-2372] - Test script for verifying compatibility between Mesos components
* [MESOS-2602] - Provide a way to "push" cluster state updates to a registered service.
* [MESOS-2720] - Publish the schema for operator endpoints
* [MESOS-3243] - Replace NULL with nullptr
* [MESOS-3690] - Make Apache Mesos' website mobile friendly
* [MESOS-3774] - Migrate Future tests from process_tests.cpp to future_tests.cpp
* [MESOS-3775] - MasterAllocatorTest.SlaveLost is slow.
* [MESOS-4090] - Create light-weight executor only and scheduler only mesos eggs
* [MESOS-4126] - Construct the error string in `MethodNotAllowed`.
* [MESOS-4160] - Log recover tests are slow.
* [MESOS-4164] - MasterTest.RecoverResources is slow.
* [MESOS-4165] - MasterTest.MasterInfoOnReElection is slow.
* [MESOS-4166] - MasterTest.LaunchCombinedOfferTest is slow.
* [MESOS-4167] - MasterTest.OfferTimeout is slow.
* [MESOS-4170] - OversubscriptionTest.UpdateAllocatorOnSchedulerFailover is slow.
* [MESOS-4171] - OversubscriptionTest.RemoveCapabilitiesOnSchedulerFailover is slow.
* [MESOS-4174] - HookTest.VerifySlaveLaunchExecutorHook is slow.
* [MESOS-4175] - ContentType/SchedulerTest.Decline is slow.
* [MESOS-4309] - Update documentation to cover HTTP authentication.
* [MESOS-4353] - Limit the number of processes created by libprocess
* [MESOS-4369] - Enhance DockerExecutor to support Docker's user-defined networks
* [MESOS-4386] - Deprecate 'authenticate' master flag in favor of 'authenticate_frameworks' flag
* [MESOS-4576] - Introduce a stout helper for "which"
* [MESOS-4610] - MasterContender/MasterDetector should be loadable as modules
* [MESOS-4612] - Update vendored ZooKeeper to 3.4.8
* [MESOS-4678] - Upgrade vendored Protobuf to 2.6.1
* [MESOS-4720] - Add allocator metrics for total vs offered/allocated resources.
* [MESOS-4721] - Expose allocation algorithm latency via a metric.
* [MESOS-4722] - Add allocator metric for number of active offer filters
* [MESOS-4723] - Add allocator metric for currently satisfied quotas
* [MESOS-4724] - Add allocator metric for current dominant shares of frameworks and roles
* [MESOS-4735] - CommandInfo.URI should allow specifying target filename
* [MESOS-4790] - Revert external linkage of symbols in master/constants.hpp
* [MESOS-4801] - Updated `createFrameworkInfo` for hierarchical_allocator_tests.cpp.
* [MESOS-4802] - Update leveldb patch file to support PowerPC LE
* [MESOS-4803] - Update vendored libev to 4.22
* [MESOS-4805] - Update ry-http-parser-1c3624a to nodejs/http-parser 2.6.1
* [MESOS-4839] - Move placement of new processes into the freezer cgroup into a parent hook.
* [MESOS-4868] - PersistentVolumeTests do not need to set up ACLs.
* [MESOS-4879] - Update glog patch to support PowerPC LE
* [MESOS-4886] - Support mesos containerizer force_pull_image option.
* [MESOS-4891] - Add a '/containers' endpoint to the agent to list all the active containers.
* [MESOS-4897] - Update test cases to support PowerPC LE
* [MESOS-4902] - Add authentication to libprocess endpoints
* [MESOS-4908] - Tasks cannot be killed forcefully.
* [MESOS-4909] - Introduce kill policy for tasks.
* [MESOS-4910] - Deprecate the --docker_stop_timeout agent flag.
* [MESOS-4914] - ProcessorManager delegate should be an Option<string>, not just a string.
* [MESOS-4926] - Add a list parser for comma separated integers in flags.
* [MESOS-4928] - Remove all '.get().' calls on Option / Try variables in the resources abstraction.
* [MESOS-4943] - Reduce the size of LinuxRootfs in tests.
* [MESOS-4949] - Executor shutdown grace period should be configurable.
* [MESOS-4951] - Enable actors to pass an authentication realm to libprocess
* [MESOS-4956] - Add authentication to /files endpoints
* [MESOS-5001] - Prefix allocator metrics with "mesos/" to better support custom allocator metrics.
* [MESOS-5002] - Reflecting the Tachyon => Alluxio rename in the documentation.
* [MESOS-5014] - Call and Event Type enums in scheduler.proto should be optional
* [MESOS-5015] - Call and Event Type enums in executor.proto should be optional
* [MESOS-5020] - Drop `404 Not Found` and `307 Temporary Redirect` in the scheduler library.
* [MESOS-5029] - Add labels to ExecutorInfo
* [MESOS-5030] - Expose TaskInfo's metadata to ResourceUsage struct
* [MESOS-5044] - Temporary directories created by environment->mkdtemp cleanup can be problematic.
* [MESOS-5049] - Refactor subprocess setup functions.
* [MESOS-5062] - Update the long-lived-framework example to run on test clusters
* [MESOS-5069] - Upgrade http-parser to v2.6.2
* [MESOS-5073] - Mesos allocator leaks role sorter and quota role sorters.
* [MESOS-5101] - Add CMake build to docker_build.sh
* [MESOS-5117] - Enhance mesos-execute for specifying CNI networks
* [MESOS-5124] - TASK_KILLING is not supported by mesos-execute.
* [MESOS-5155] - Consolidate authorization actions for quota.
* [MESOS-5168] - Benchmark overhead of authorization based filtering.
* [MESOS-5169] - Introduce new Authorizer Actions for authorization based filtering of endpoints.
* [MESOS-5170] - Adapt json creation for authorization based endpoint filtering.
* [MESOS-5174] - Update the balloon-framework to run on test clusters
* [MESOS-5179] - Enhance the error message for Duration flag.
* [MESOS-5212] - Allow any principal in ReservationInfo when HTTP authentication is off
* [MESOS-5214] - Populate FrameworkInfo.principal for authenticated frameworks
* [MESOS-5271] - Add alias support for Flags
* [MESOS-5273] - Need support for Authorization information via HELP.
* [MESOS-5286] - Add authorization to libprocess HTTP endpoints
* [MESOS-5296] - Split Resource and Inverse offer protobufs for V1 API
* [MESOS-5302] - Consider adding an Executor Shim/Adapter for the new/old API
* [MESOS-5307] - Sandbox mounts should not be in the host mount namespace.
* [MESOS-5316] - Authenticate the agent's '/containers' endpoint.
* [MESOS-5317] - Authorize the agent's '/containers' endpoint.
* [MESOS-5331] - Some cleanup in filesystem_isolator_tests.cpp
* [MESOS-5335] - Add authorization to GET /weights.
* [MESOS-5336] - Add authorization to GET /quota.
* [MESOS-5338] - Add `user` to `Task` protobuf message.
* [MESOS-5339] - Create Tests for testing fine-grained HTTP endpoint filtering.
* [MESOS-5347] - Enhance the log message when launching mesos containerizer.
* [MESOS-5348] - Enhance the log message when launching docker containerizer.
* [MESOS-5350] - Add asynchronous hook for validating docker containerizer tasks
* [MESOS-5356] - Add Windows support for StopWatch
* [MESOS-5360] - Set death signal for dvdcli subprocess in docker volume isolator.
* [MESOS-5370] - Add deprecation support for Flags
* [MESOS-5372] - Add random() to os:: namespace
* [MESOS-5373] - Remove `Zookeeper's` NTDDI_VERSION define
* [MESOS-5374] - Add support for Console Ctrl handling in `slave.cpp`
* [MESOS-5375] - Implement stout/os/windows/kill.hpp
* [MESOS-5398] - Rewrite os::read() to be friendlier to reading binary files
* [MESOS-5399] - Add utility for parsing ld.so.cache on linux.
* [MESOS-5400] - Add preliminary support for parsing ELF files in stout.
* [MESOS-5401] - Add ability to inject a Volume of Nvidia libraries/binaries into a docker-image container in mesos containerizer.
* [MESOS-5404] - Allow `Task` to be authorized.
* [MESOS-5420] - Implement os::exists for processes
* [MESOS-5424] - Update the style of code under the website folder to match other existing source code
* [MESOS-5430] - Design the improvement of the home page of mesos.apache.org
* [MESOS-5431] - Update the website generation and development workflows with docker.
* [MESOS-5435] - Add default implementations to all Isolator virtual functions
* [MESOS-5452] - Agent modules should be initialized before all components except firewall.
* [MESOS-5456] - Master anonymous modules should be initialized before any other components.
* [MESOS-5457] - Create a small testing doc for the v1 Scheduler/Executor API
* [MESOS-5459] - Update RUN_TASK_WITH_USER to use additional metadata
* [MESOS-5519] - Refresh Mesos project website homepage
* [MESOS-5532] - Maven build is too verbose for batch builds
* [MESOS-5540] - Support building with non-GNU libc
* [MESOS-5550] - Remove Nvidia GPU Isolator's link-time dependence on `libnvidia-ml`
* [MESOS-5551] - Move the Nvidia GPU isolator from `cgroups/devices/gpu/nvidia` to `gpu/nvidia`
* [MESOS-5552] - Bundle NVML headers for Nvidia GPU support.
* [MESOS-5555] - Always provide access to NVIDIA control devices within containers (if GPU isolation is enabled).
* [MESOS-5557] - Add `NvidiaGpuAllocator` component for cross-containerizer GPU allocation
* [MESOS-5558] - Update `Containerizer::resources()` to use the `NvidiaGpuAllocator`
* [MESOS-5559] - Integrate the `NvidiaGpuAllocator` into the `NvidiaGpuIsolator`
* [MESOS-5562] - Add class to share Nvidia-specific components between containerizers
* [MESOS-5563] - Rearrange Nvidia GPU files to cleanup semantics for header inclusion.
* [MESOS-5572] - Change Operator API RPC handlers return type to http::Response
* [MESOS-5576] - Masters may drop the first message they send between masters after a network partition
* [MESOS-5581] - Guarantee ordering between Isolators
* [MESOS-5582] - Create a `cgroups/devices` isolator.
* [MESOS-5592] - Pass NetworkInfo to CNI Plugins
* [MESOS-5593] - Devolve v1 operator protos before using them in Master/Agent.
* [MESOS-5617] - Mesos website preview incorrect in Facebook
* [MESOS-5618] - Added a metric indicating if replicated log for the registrar has recovered or not.
* [MESOS-5630] - Change build to always enable Nvidia GPU support for Linux
* [MESOS-5636] - Display allocated resources in the agent listing of the webui.
* [MESOS-5666] - Deprecate camel case proto field in isolator ContainerConfig.
* [MESOS-5697] - Support file volume in mesos containerizer.
* [MESOS-5737] - Expose Executor PID in containers endpoint
* [MESOS-5740] - Consider adding `relink` functionality to libprocess
* [MESOS-5743] - Added a flag parser for hashset<std::string>.
* [MESOS-5749] - Have maven run in batch mode
* [MESOS-5753] - Command executor should use `mesos-containerizer launch` to launch user task.
* [MESOS-5758] - Add ability to exclude resources from fair sharing.
* [MESOS-5765] - Add 'systemGetDriverVersion' to NVML abstraction.
* [MESOS-5767] - Add ELFIO as bundled Dependency to Mesos
* [MESOS-5768] - Reimplement the stout ELF abstraction in terms of ELFIO
* [MESOS-5769] - Add get_abi_version() to ELF abstraction in stout
* [MESOS-5782] - Renamed 'commands' to 'pre_exec_commands' in ContainerLaunchInfo.
* [MESOS-5787] - Add ability to set framework capabilities in 'mesos-execute'
* [MESOS-5793] - Add ability to inject Nvidia devices into a container
* [MESOS-5833] - Disable '--registry_strict' master flag
** Task
* [MESOS-338] - Mesos 1.0
* [MESOS-2257] - Version the Operator/Admin API
* [MESOS-2408] - Slave should reclaim storage for destroyed persistent volumes.
* [MESOS-2950] - Implement current mesos Authorizer in terms of generalized Authorizer interface
* [MESOS-3063] - Add an example framework using dynamic reservation
* [MESOS-3103] - Separate OS-specific code in the libprocess library
* [MESOS-3214] - Replace boost foreach with range-based for
* [MESOS-3368] - Add device support in cgroups abstraction
* [MESOS-3371] - Implement process::subprocess on Windows
* [MESOS-3436] - Port dynamiclibrary_test.cpp to Windows
* [MESOS-3438] - Port gzip_test to Windows
* [MESOS-3439] - Port ip_tests
* [MESOS-3443] - Windows: Port protobuf_tests.hpp
* [MESOS-3541] - Add CMakeLists that builds the Mesos master
* [MESOS-3558] - Implement HTTPCommandExecutor that uses the Executor Library
* [MESOS-3559] - Make the Command Scheduler use the HTTP Scheduler Library
* [MESOS-3609] - Port slave/gc.cpp
* [MESOS-3610] - Port slave/flags.cpp to Windows
* [MESOS-3611] - Port slave/http.cpp to Windows
* [MESOS-3612] - Port slave/metrics.cpp to Windows
* [MESOS-3614] - Port slave/slave.cpp to Windows
* [MESOS-3616] - Port slave/status_update_manager.cpp to Windows
* [MESOS-3617] - Port slave/containerizer/containerizer.cpp to Windows
* [MESOS-3618] - Port slave/containerizer/fetcher.cpp
* [MESOS-3619] - Port slave/containerizer/isolator.cpp to Windows
* [MESOS-3620] - Create slave/containerizer/isolators/filesystem/windows.cpp
* [MESOS-3622] - Port slave/containerizer/launcher.cpp to Windows
* [MESOS-3623] - Port slave/containerizer/mesos/containerizer.cpp to Windows
* [MESOS-3624] - Port slave/containerizer/mesos/launch.cpp to Windows
* [MESOS-3634] - Port process/protobuf.hpp
* [MESOS-3635] - Port process/defer.hpp to Windows
* [MESOS-3636] - Port process/dispatch.hpp
* [MESOS-3637] - Port process/process.hpp to Windows
* [MESOS-3639] - Implement stout/os/windows/killtree.hpp
* [MESOS-3641] - Implement stout/os/windows/read.hpp and write.hpp
* [MESOS-3642] - Implement stout/os/windows/sendfile.hpp
* [MESOS-3646] - Port process/clock.hpp to Windows
* [MESOS-3647] - Port process/time.hpp to Windows
* [MESOS-3648] - Port stout/duration.hpp to Windows
* [MESOS-3649] - Port process/future.hpp to Windows
* [MESOS-3650] - Port process/event.hpp to Windows
* [MESOS-3651] - Port process/latch.hpp to Windows
* [MESOS-3652] - Port process/http.hpp to Windows
* [MESOS-3653] - Port process/message.hpp to Windows
* [MESOS-3654] - Port process/filter.hpp to Windows
* [MESOS-3657] - Port process/deferred.hpp to Windows
* [MESOS-3661] - Port slave/metrics.hpp to Windows
* [MESOS-3662] - Port slave/slave.hpp to Windows
* [MESOS-3663] - Port process/metrics/gauge.hpp to Windows
* [MESOS-3664] - Port process/metrics/metric.hpp to Windows
* [MESOS-3666] - Port process/metrics/metrics.hpp to Windows
* [MESOS-3668] - Port process/delay.hpp to Windows
* [MESOS-3669] - Port process/clock.hpp to Windows
* [MESOS-3670] - Port process/time.hpp to Windows
* [MESOS-3671] - Port stout/duration.hpp to Windows
* [MESOS-3672] - Port process/timer.hpp to Windows
* [MESOS-3673] - Port process/timeout.hpp to Windows
* [MESOS-3674] - Port process/async.hpp to Windows
* [MESOS-3675] - Port process/check.hpp to Windows
* [MESOS-3679] - Port slave/containerizer/containerizer.hpp to Windows
* [MESOS-3680] - Port process/subprocess.hpp to Windows
* [MESOS-3681] - Port slave/containerizer/fetcher.hpp to Windows
* [MESOS-3682] - Port slave/containerizer/launcher.hpp to Windows
* [MESOS-3683] - Port slave/containerizer/isolator.hpp to Windows
* [MESOS-3685] - Port process/io.hpp to Windows
* [MESOS-3779] - Slave/Agent Rename Phase I - Update terms in Web UI.
* [MESOS-3781] - Replace Master/Slave Terminology Phase I - Rename flag names and deprecate old ones
* [MESOS-3782] - Slave/Agent Rename Phase I - Add duplicate binaries (or create symlinks)
* [MESOS-3783] - Replace Master/Slave Terminology Phase I - Update documentation
* [MESOS-3784] - Replace Master/Slave Terminology Phase I - Update mesos-cli
* [MESOS-3854] - Finalize design for generalized Authorizer interface
* [MESOS-3945] - Add operator documentation for /weight endpoint
* [MESOS-4033] - Add a commit hook for non-ascii characters.
* [MESOS-4112] - Clean up libprocess gtest macros
* [MESOS-4214] - Introduce HTTP endpoint /weights for updating weight
* [MESOS-4316] - Support get non-default weights by /weights
* [MESOS-4459] - Implement AuthN handling on the scheduler library
* [MESOS-4623] - Add a stub Nvidia GPU isolator.
* [MESOS-4624] - Add allocation metrics for "gpus" resources.
* [MESOS-4625] - Implement Nvidia GPU isolation w/o filesystem isolation enabled.
* [MESOS-4626] - Support Nvidia GPUs with filesystem isolation enabled in mesos containerizer.
* [MESOS-4629] - Implement fault tolerance tests for the HTTP Scheduler API.
* [MESOS-4704] - Enable zlib on Windows.
* [MESOS-4758] - Add a 'name' field into NetworkInfo.
* [MESOS-4759] - Add network/cni isolator for Mesos containerizer.
* [MESOS-4761] - Add agent flags to allow operators to specify CNI plugin and config directories.
* [MESOS-4764] - The network/cni isolator should report assigned IP address.
* [MESOS-4771] - Document the network/cni isolator.
* [MESOS-4788] - Mesos UI should show the role and principal of a framework
* [MESOS-4797] - Add a couple of registrar tests for /weights endpoint
* [MESOS-4813] - Implement base tests for unified container using local puller.
* [MESOS-4818] - Add end to end testing for Appc images.
* [MESOS-4840] - Remove internal usage of deprecated ShutdownFramework ACL
* [MESOS-4844] - Add authentication to master endpoints
* [MESOS-4849] - Add agent flags for HTTP authentication
* [MESOS-4850] - Add authentication to agent endpoints /state and /flags
* [MESOS-4858] - Make changes to executor v1 library around managing connections.
* [MESOS-4860] - Add a script to install the Nvidia GDK on a host.
* [MESOS-4861] - Add configure flags to build with Nvidia GPU support.
* [MESOS-4863] - Add Nvidia GPU isolator tests.
* [MESOS-4864] - Add flag to specify available Nvidia GPUs on an agent's command line.
* [MESOS-4865] - Add GPUs as an explicit resource.
* [MESOS-4881] - Rescind all outstanding offers after changing some weights.
* [MESOS-4887] - Design doc for Slave/Agent rename
* [MESOS-4889] - Implement runtime isolator tests.
* [MESOS-4906] - Upgrade to clang-format-3.8.
* [MESOS-4932] - Propose Design for Authorization based filtering for endpoints.
* [MESOS-4933] - Registrar HTTP Authentication.
* [MESOS-4934] - Enable HELP to include authentication status of endpoint.
* [MESOS-4938] - Support docker registry authentication
* [MESOS-4939] - Support specifying per-container docker registry.
* [MESOS-4944] - Improve overlay backend so that it's writable
* [MESOS-4962] - Support for Mesos releases
* [MESOS-4982] - Update example long running to use v1 API.
* [MESOS-4993] - FetcherTest.ExtractZipFile assumes `unzip` is installed
* [MESOS-5050] - Design Linux capability support for Mesos containerizer
* [MESOS-5055] - Slave/Agent Rename Phase I - Update strings in the log message and standard output
* [MESOS-5057] - Slave/Agent Rename Phase I - Update strings in error messages and other strings
* [MESOS-5065] - Support docker private registry default docker config.
* [MESOS-5108] - Design a short-term solution for a typed error handling mechanism.
* [MESOS-5109] - Capture the error code in `ErrnoError` and `WindowsError`.
* [MESOS-5110] - Introduce an additional template parameter to `Try` for typed error.
* [MESOS-5111] - Update `network::connect` to use the typed error state of `Try`.
* [MESOS-5112] - Introduce `WindowsSocketError`.
* [MESOS-5130] - Enable `network/cni` isolator in `MesosContainerizer` as the default `network` isolator.
* [MESOS-5135] - Update existing documentation to Include references to GPUs as a first class resource.
* [MESOS-5136] - Update the default JSON representation of a Resource to include GPUs
* [MESOS-5137] - Remove 'dashboard.js' from the webui.
* [MESOS-5152] - Add authentication to agent's /monitor/statistics endpoint
* [MESOS-5157] - Update webui for GPU metrics
* [MESOS-5159] - Add test to verify error when requesting fractional GPUs
* [MESOS-5164] - Add authorization to agent's /monitor/statistics endpoint.
* [MESOS-5167] - Add tests for `network/cni` isolator
* [MESOS-5171] - Expose state/state.hpp to public headers
* [MESOS-5173] - Allow master/agent to take multiple modules manifest files
* [MESOS-5178] - Add logic to validate for non-fractional GPU requests in the master
* [MESOS-5209] - Add a slave hook that runs after the fetching is done.
* [MESOS-5222] - Create a benchmark for scale testing HTTP frameworks
* [MESOS-5249] - Update CMake files to reflect reorganized 3rdparty
* [MESOS-5250] - Move 3rdparty/libprocess/3rdparty/* to 3rdparty/
* [MESOS-5256] - Add support for per-containerizer resource enumeration
* [MESOS-5257] - Add autodiscovery for GPU resources
* [MESOS-5272] - Support docker image labels.
* [MESOS-5297] - Add authorization to the master's "/flags" endpoint.
* [MESOS-5365] - Introduce a timeout for docker volume driver mount/unmount operation.
* [MESOS-5394] - Rename isolator name 'xfs/disk' and 'posix/disk' to 'disk/xfs' and 'disk/du'
* [MESOS-5474] - Implement GET_FLAGS Call in v1 master API.
* [MESOS-5475] - Implement GET_FLAGS Call in v1 agent API.
* [MESOS-5484] - Implement GET_METRICS Call in v1 master API.
* [MESOS-5485] - Implement GET_LOGGING_LEVEL Call in v1 master API.
* [MESOS-5486] - Implement SET_LOGGING_LEVEL Call in v1 master API.
* [MESOS-5487] - Implement LIST_FILES Call in v1 master API.
* [MESOS-5489] - Implement GET_STATE Call in v1 master API.
* [MESOS-5491] - Implement GET_AGENTS Call in v1 master API.
* [MESOS-5492] - Implement GET_FRAMEWORKS Call in v1 master API.
* [MESOS-5493] - Implement GET_TASKS Call in v1 master API.
* [MESOS-5494] - Implement GET_ROLES Call in v1 master API.
* [MESOS-5495] - Implement GET_WEIGHTS Call in v1 master API.
* [MESOS-5496] - Implement UPDATE_WEIGHTS Call in v1 master API.
* [MESOS-5497] - Implement GET_MASTER Call in v1 master API.
* [MESOS-5498] - Implement SUBSCRIBE Call in v1 master API.
* [MESOS-5499] - Implement RESERVE_RESOURCES Call in v1 master API.
* [MESOS-5500] - Implement UNRESERVE_RESOURCES Call in v1 master API.
* [MESOS-5501] - Implement CREATE_VOLUMES Call in v1 master API.
* [MESOS-5502] - Implement DESTROY_VOLUMES Call in v1 master API.
* [MESOS-5503] - Implement GET_MAINTENANCE_STATUS Call in v1 master API.
* [MESOS-5504] - Implement GET_MAINTENANCE_SCHEDULE Call in v1 master API.
* [MESOS-5505] - Implement UPDATE_MAINTENANCE_SCHEDULE Call in v1 master API.
* [MESOS-5506] - Implement START_MAINTENANCE Call in v1 master API.
* [MESOS-5507] - Implement STOP_MAINTENANCE Call in v1 master API.
* [MESOS-5508] - Implement GET_QUOTA Call in v1 master API.
* [MESOS-5509] - Implement SET_QUOTA Call in v1 master API.
* [MESOS-5510] - Implement REMOVE_QUOTA Call in v1 master API.
* [MESOS-5511] - Implement GET_METRICS Call in v1 agent API.
* [MESOS-5512] - Implement GET_LOGGING_LEVEL Call in v1 agent API.
* [MESOS-5513] - Implement SET_LOGGING_LEVEL Call in v1 agent API.
* [MESOS-5514] - Implement LIST_FILES Call in v1 agent API.
* [MESOS-5517] - Implement GET_RESOURCE_STATISTICS Call in v1 agent API.
* [MESOS-5518] - Implement GET_CONTAINERS Call in v1 agent API.
* [MESOS-5549] - Document aufs provisioner backend.
* [MESOS-5628] - `QuotaHandler` should only make one authorization request to the authorizer.
* [MESOS-5634] - Add Framework Capability for GPU_RESOURCES
* [MESOS-5639] - Add documentation about metadata for CNI plugins.
* [MESOS-5641] - Update docker-volume.md to add some content for how to test
* [MESOS-5663] - Remove hard dependence on libelf for Linux
* [MESOS-5699] - Create new documentation for Mesos networking.
* [MESOS-5704] - Fine-grained authorization on /frameworks
* [MESOS-5705] - ZK credential is exposed in /flags and /state
* [MESOS-5706] - GET_ENDPOINT_WITH_PATH authz doesn't make sense for /flags
* [MESOS-5707] - LocalAuthorizer should error if passed a GET_ENDPOINT ACL with an unhandled path
* [MESOS-5708] - Add authz to /files/debug
* [MESOS-5709] - Authorization for /roles
* [MESOS-5711] - Update AUTHORIZATION strings in endpoint help
* [MESOS-5712] - Document exactly what is handled by GET_ENDPOINT_WITH_PATH acl
* [MESOS-5750] - Implement GET_EXECUTORS Call in v1 master API.
* [MESOS-5764] - Whitelist the nvidia-uvm-tools device in the Nvidia GPU isolator.
Release Notes - Mesos - Version 0.28.3
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-5571] - Scheduler JNI throws exception when the major versions of JAR and libmesos don't match.
* [MESOS-5673] - Port mapping isolator may cause segfault if its bind mount root does not exist.
* [MESOS-5330] - Agent should backoff before connecting to the master.
* [MESOS-5543] - /dev/fd is missing in the Mesos containerizer environment.
* [MESOS-5691] - SSL downgrade support will leak sockets in CLOSE_WAIT status.
* [MESOS-5723] - SSL-enabled libprocess will leak incoming links to forks.
* [MESOS-5748] - Potential segfault in `link` when linking to a remote process.
* [MESOS-5763] - Task stuck in fetching is not cleaned up after --executor_registration_timeout.
* [MESOS-5073] - Mesos allocator leaks role sorter and quota role sorters.
* [MESOS-5698] - Quota sorter not updated for resource changes at agent.
* [MESOS-5740] - Consider adding `relink` functionality to libprocess.
* [MESOS-5576] - Masters may drop the first message they send between masters after a network partition.
* [MESOS-5913] - Stale socket FD usage when using libevent + SSL.
* [MESOS-5927] - Unable to run "scratch" Dockerfiles with Unified Containerizer.
* [MESOS-5943] - Incremental http parsing of URLs leads to decoder error.
* [MESOS-5986] - SSL Socket CHECK can fail after socket receives EOF.
* [MESOS-6104] - Potential FD double close in libevent's implementation of `sendfile`.
* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
* [MESOS-6152] - Resource leak in libevent_ssl_socket.cpp.
* [MESOS-6233] - Master CHECK fails during recovery while relinking to other masters.
* [MESOS-6234] - Potential socket leak during Zookeeper network changes.
* [MESOS-6246] - Libprocess links will not generate an ExitedEvent if the socket creation fails.
* [MESOS-6299] - Master doesn't remove task from pending when it is invalid.
* [MESOS-6457] - Tasks shouldn't transition from TASK_KILLING to TASK_RUNNING.
* [MESOS-6527] - Memory leak in the libprocess request decoder.
Release Notes - Mesos - Version 0.28.2
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-4705] - Linux 'perf' parsing logic may fail when OS distribution has perf backports.
* [MESOS-5239] - Persistent volume DockerContainerizer support assumes proper mount propagation setup on the host.
* [MESOS-5253] - Isolator cleanup should not be invoked if they are not prepared yet.
* [MESOS-5282] - Destroy container while provisioning volume images may lead to a race.
* [MESOS-5312] - Env `MESOS_SANDBOX` is not set properly for command tasks that change rootfs.
* [MESOS-4885] - Unzip should force overwrite.
* [MESOS-5449] - Memory leak in SchedulerProcess.declineOffer.
* [MESOS-5380] - Killing a queued task can cause the corresponding command executor to never terminate.
** Improvement
* [MESOS-5307] - Sandbox mounts should not be in the host mount namespace.
Release Notes - Mesos - Version 0.28.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-4662] - PortMapping network isolator should not assume BIND_MOUNT_ROOT is a realpath.
* [MESOS-4874] - overlayfs does not work with kernel 4.2.3
* [MESOS-4877] - Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")
* [MESOS-4878] - Task stuck in TASK_STAGING when docker fetcher failed to fetch the image
* [MESOS-4964] - curl based docker fetcher fails to decode chunked encoding
* [MESOS-4985] - Destroying a container while it's provisioning can lead to leaked provisioned directories.
* [MESOS-5009] - local docker puller fails to find private registry repositories
* [MESOS-5018] - FrameworkInfo Capability enum does not support upgrades.
* [MESOS-5021] - Memory leak in subprocess when 'environment' argument is provided.
* [MESOS-5023] - MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky.
* [MESOS-5114] - Flags::parse does not handle empty string correctly.
Release Notes - Mesos - Version 0.28.0
--------------------------------------
This release contains the following new features:
* [MESOS-4343] - A new cgroups isolator for enabling the net_cls subsystem in
Linux. The cgroups/net_cls isolator allows operators to provide network
performance isolation and network segmentation for containers within a Mesos
cluster. To enable the cgroups/net_cls isolator, append `cgroups/net_cls` to
the `--isolation` flag when starting the slave. Please refer to
docs/mesos-containerizer.md for more details.
* [MESOS-4687] - The implementation of scalar resource values (e.g., "2.5
CPUs") has changed. Mesos now reliably supports resources with up to three
decimal digits of precision (e.g., "2.501 CPUs"); resources with more than
three decimal digits of precision will be rounded. Internally, resource math
is now done using a fixed-point format that supports three decimal digits of
precision, and then converted to/from floating point for input and output,
respectively. Frameworks that do their own resource math and manipulate
fractional resources may observe differences in roundoff error and numerical
precision.
* [MESOS-4479] - Reserved resources can now optionally include "labels".
Labels are a set of key-value pairs that can be used to associate metadata
with a reserved resource. For example, frameworks can use this feature to
distinguish between two reservations for the same role at the same agent
that are intended for different purposes.
* [MESOS-2840] - **Experimental** support for container images in Mesos
containerizer (a.k.a. Unified Containerizer). This allows frameworks to
launch Docker/Appc containers using Mesos containerizer without relying on
docker daemon (engine) or rkt. The isolation of the containers is done using
isolators. Please refer to docs/container-image.md for currently supported
features and limitations.
* [MESOS-4793] - **Experimental** support for v1 Executor HTTP API. This
allows executors to send HTTP requests to the /api/v1/executor agent
endpoint without the need for an executor driver. Please refer to
docs/executor-http-api.md for more details.
* [MESOS-4370] - Added support for service discovery of Docker containers that
use Docker Remote API v1.21.
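As an illustration of the fixed-point scalar arithmetic described under
MESOS-4687 above, the following is a minimal Python sketch. It is not the
Mesos implementation; the names `SCALE`, `to_fixed`, `from_fixed` and
`add_scalars` are hypothetical and chosen only for this example. It assumes
values are converted to integer thousandths on input, combined as integers,
and converted back to floating point on output.

```python
# Hypothetical sketch of three-decimal-digit fixed-point resource math.
# None of these names come from Mesos; they are assumptions for illustration.

SCALE = 1000  # three decimal digits of precision


def to_fixed(value: float) -> int:
    """Convert a scalar resource value to integer thousandths, with rounding."""
    return round(value * SCALE)


def from_fixed(units: int) -> float:
    """Convert integer thousandths back to a floating point value."""
    return units / SCALE


def add_scalars(a: float, b: float) -> float:
    """Add two scalar resource values using integer (fixed-point) arithmetic."""
    return from_fixed(to_fixed(a) + to_fixed(b))


if __name__ == "__main__":
    # Plain floating point addition accumulates roundoff error.
    print(0.1 + 0.2 == 0.3)
    # The fixed-point path adds 100 + 200 thousandths exactly.
    print(add_scalars(0.1, 0.2) == 0.3)
    # Digits beyond the third decimal place are rounded away on input.
    print(to_fixed(2.5014))
```

A framework that does its own resource math in plain floating point may see
small differences from results computed this way, which is the roundoff
behavior the note above warns about.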
Additional API Changes:
* [MESOS-4066] - Agent should not return partial state when a request is made
to /state endpoint during recovery.
* [MESOS-4547] - Introduce TASK_KILLING state.
* [MESOS-4712] - Remove 'force' field from the Subscribe Call in v1
Scheduler API.
* [MESOS-4591] - Change the object of ReserveResources and CreateVolume ACLs
to `roles`.
* [MESOS-3583] - Add stream IDs for HTTP schedulers.
* [MESOS-4427] - Ensure ip_address in state.json (from NetworkInfo) is valid.
All Issues:
** Bug
* [MESOS-1187] - precision errors with allocation calculations
* [MESOS-1469] - No output from review bot on timeout
* [MESOS-2007] - AllocatorTest/0.SlaveReregistersFirst is flaky
* [MESOS-2017] - Segfault with "Pure virtual method called" when tests fail
* [MESOS-3273] - EventCall Test Framework is flaky
* [MESOS-3397] - sorter.cpp: Check failed: total.resources.contains(slaveId)
* [MESOS-3413] - Docker containerizer does not symlink persistent volumes into sandbox
* [MESOS-3570] - Make Scheduler Library use HTTP Pipelining Abstraction in Libprocess
* [MESOS-3719] - Core dump on /teardown
* [MESOS-3725] - shared library loading depends on environment variable updates
* [MESOS-3833] - /help endpoints do not work for nested paths
* [MESOS-3940] - /reserve and /unreserve should be permissive under a master without authentication.
* [MESOS-4029] - ContentType/SchedulerTest is flaky.
* [MESOS-4047] - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky
* [MESOS-4071] - Master crash during framework teardown ( Check failed: total.resources.contains(slaveId))
* [MESOS-4249] - Mesos fetcher step skipped with MESOS_DOCKER_MESOS_IMAGE flag
* [MESOS-4255] - Add mechanism for testing recovery of HTTP based executors
* [MESOS-4285] - Mesos command task doesn't support volumes with image
* [MESOS-4291] - fs::enter(rootfs) does not work if 'rootfs' is read only.
* [MESOS-4298] - Sync up configuration.md and flags.cpp
* [MESOS-4338] - Create utilities for common shell commands used.
* [MESOS-4370] - NetworkSettings.IPAddress field is deprecated in Docker
* [MESOS-4383] - Support docker runtime configuration env var from image.
* [MESOS-4395] - Add persistent volume endpoint tests with no principal
* [MESOS-4416] - Get the perf version function return fail
* [MESOS-4427] - Ensure ip_address in state.json (from NetworkInfo) is valid
* [MESOS-4454] - Create common sha512 compute utility function.
* [MESOS-4478] - ReviewBot seemed to be crashing ReviewBoard server when posting large reviews
* [MESOS-4484] - GMock warning in MasterTest.OrphanTasks
* [MESOS-4495] - Delete `os::chown` on Windows
* [MESOS-4496] - Replace `glob` on Windows with something more suited to the platform
* [MESOS-4499] - Docker provisioner store should reuse existing layers in the cache.
* [MESOS-4517] - Introduce docker runtime isolator.
* [MESOS-4542] - MasterQuotaTest.AvailableResourcesAfterRescinding is flaky.
* [MESOS-4546] - Mesos agents need to re-resolve hosts in zk string on leader change / failure to connect
* [MESOS-4555] - Build broken with GCC 5.3.0
* [MESOS-4556] - ShasumTest.SHA512SimpleFile failed on centos7.
* [MESOS-4562] - Mesos UI shows wrong count for "started" tasks
* [MESOS-4563] - Docker::Container::Create should handle NetworkSettings.IPAddress being an empty string.
* [MESOS-4570] - DockerFetcherPluginTest.INTERNET_CURL_FetchImage seems flaky.
* [MESOS-4573] - Design doc for scheduler HTTP Stream IDs
* [MESOS-4583] - Rename `examples/event_call_framework.cpp` to `examples/test_http_framework.cpp`
* [MESOS-4584] - Update Rakefile for mesos site generation
* [MESOS-4585] - mesos-fetcher LIBPROCESS_PORT set to 5051 URI fetch failure
* [MESOS-4587] - Docker environment variables must be able to contain the equal sign
* [MESOS-4591] - `/reserve` and `/create-volumes` endpoints allow operations for any role
* [MESOS-4597] - `freebsd.hpp` is missing from the release tarball
* [MESOS-4598] - Logrotate ContainerLogger should not remove IP from environment.
* [MESOS-4602] - Invalid usage of ATOMIC_FLAG_INIT in member initialization
* [MESOS-4614] - SlaveRecoveryTest/0.CleanupHTTPExecutor is flaky
* [MESOS-4615] - ContainerLoggerTest.DefaultToSandbox is flaky
* [MESOS-4619] - Remove markdown files from doxygen pages
* [MESOS-4637] - Docker process executor can die with agent unit on systemd.
* [MESOS-4639] - Posix process executor can die with agent unit on systemd.
* [MESOS-4640] - Logrotate container logger can die with agent unit on systemd.
* [MESOS-4656] - strings::split behaves incorrectly when n=1
* [MESOS-4661] - SlaveRecoveryTest/0.ReconnectHTTPExecutor is flaky
* [MESOS-4669] - Add common compression utility
* [MESOS-4670] - `cgroup_info` not being exposed in state.json when ComposingContainerizer is used.
* [MESOS-4671] - Status updates from executor can be forwarded out of order by the Agent.
* [MESOS-4674] - Linux filesystem isolator tests are flaky.
* [MESOS-4675] - Cannot disable systemd support
* [MESOS-4676] - ROOT_DOCKER_Logs is flaky.
* [MESOS-4677] - LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids is flaky.
* [MESOS-4681] - Updated libnl3 download links
* [MESOS-4683] - Document docker runtime isolator.
* [MESOS-4693] - Variable shadowing in HookManager::slavePreLaunchDockerHook
* [MESOS-4703] - Make Stout configuration modular and consumable by downstream (e.g., libprocess and agent)
* [MESOS-4711] - Race condition in libevent poll implementation causes crash
* [MESOS-4714] - "make DESTDIR=<path> install" broken
* [MESOS-4743] - Mesos fetcher not working correctly on docker apps on CoreOS
* [MESOS-4747] - ContainerLoggerTest.MesosContainerizerRecover cannot be executed in isolation
* [MESOS-4768] - MasterMaintenanceTest.InverseOffers is flaky
* [MESOS-4774] - Wrong symbolic link of some Mesos libraries
* [MESOS-4784] - SlaveTest.MetricsSlaveLaunchErrors test relies on implicit blocking behavior hitting the global metrics endpoint
* [MESOS-4806] - LevelDBStateTests write to the current directory
* [MESOS-4824] - "filesystem/linux" isolator does not unmount orphaned persistent volumes
* [MESOS-4825] - Master's slave reregister logic does not update version field
* [MESOS-4830] - Bind docker runtime isolator with docker image provider.
* [MESOS-4831] - Master sometimes sends two inverse offers after the agent goes into maintenance.
* [MESOS-4832] - DockerContainerizerTest.ROOT_DOCKER_RecoverOrphanedPersistentVolumes exits when the /tmp directory is bind-mounted
* [MESOS-4833] - Poor allocator performance with labeled resources and/or persistent volumes
* [MESOS-4836] - Fix rmdir for windows
* [MESOS-4866] - Added document for overlayfs backend.
* [MESOS-4888] - Default cmd is executed as an incorrect command.
* [MESOS-4903] - Allow multiple loads of module manifests
** Documentation
* [MESOS-1471] - Document replicated log design/internals
* [MESOS-3831] - Document operator HTTP endpoints
* [MESOS-4376] - Document semantics of `slaveLost`
* [MESOS-4377] - Document units associated with resource types
* [MESOS-4452] - Improve documentation around roles, principals, authz, and reservations
* [MESOS-4622] - Update configuration.md with `--cgroups_net_cls_primary_handle` agent flag.
* [MESOS-4702] - Document default value of "offer_timeout"
* [MESOS-4786] - Example in C++ style guide uses wrong indentation for wrapped line
* [MESOS-4854] - Update CHANGELOG with net_cls isolator
* [MESOS-4873] - Add documentation about container image support.
** Epic
* [MESOS-4343] - Introduce the ability to assign network handles to mesos containers
* [MESOS-4793] - Executor API v1
** Improvement
* [MESOS-197] - Executor sendStatusUpdate should ACK on slave checkpoint
* [MESOS-2585] - Use full width for mesos div.container
* [MESOS-2971] - Implement OverlayFS based provisioner backend
* [MESOS-3608] - Optionally install test binaries.
* [MESOS-4004] - Support default entrypoint and command runtime config in Mesos containerizer
* [MESOS-4005] - Support workdir runtime configuration from image
* [MESOS-4169] - MasterMaintenanceTest.InverseOffers is slow
* [MESOS-4225] - Exposed docker/appc image manifest to mesos containerizer.
* [MESOS-4261] - Remove docker auth server flag
* [MESOS-4333] - Refactor Appc provisioner tests
* [MESOS-4344] - Allow operators to assign net_cls major handles to mesos agents
* [MESOS-4479] - Implement reservation labels
* [MESOS-4486] - Speed up FetcherCacheTest.Local* test cases
* [MESOS-4487] - Introduce status() interface in `Containerizer`
* [MESOS-4488] - Define a CgroupInfo protobuf to expose cgroup isolator configuration.
* [MESOS-4489] - The `cgroups/net_cls` isolator needs to expose handles in the ContainerStatus
* [MESOS-4490] - Get container status information in slave.
* [MESOS-4493] - Add ability to create symlink on Windows
* [MESOS-4494] - Implement `size`, `usage`, and other disk metrics reporting on Windows.
* [MESOS-4497] - Add ZK to the Windows agent build
* [MESOS-4498] - Refactor os.hpp to be less monolithic, and more cross-platform compatible
* [MESOS-4520] - Introduce a status() interface for isolators
* [MESOS-4523] - Enable benchmark tests in ASF CI
* [MESOS-4547] - Introduce TASK_KILLING state.
* [MESOS-4551] - process::collect() and process::await only take a fixed number of arguments (when not using a list).
* [MESOS-4552] - Help strings are not removed from the global help process upon process termination.
* [MESOS-4564] - Separate Appc protobuf messages to its own file.
* [MESOS-4566] - Avoid unnecessary temporary `std::string` constructions and copies in `jsonify`.
* [MESOS-4571] - SlaveRecoveryTest.RecoverStatusUpdateManager is not consistent with its description
* [MESOS-4575] - Fix Appc image caching to share with image fetcher
* [MESOS-4588] - Set title for documentation webpages.
* [MESOS-4618] - Speed up FetcherCacheTest.SimpleEviction
* [MESOS-4628] - Speed up FetcherCache test cases by reducing allocation_interval.
* [MESOS-4636] - Add parent hook to subprocess.
* [MESOS-4657] - Add LOG(INFO) in `cgroups/net_cls` for debugging allocation of net_cls handles.
* [MESOS-4667] - Expose persistent volume information in HTTP endpoints
* [MESOS-4685] - Speed up FetcherCache test cases by disabling framework checkpointing.
* [MESOS-4710] - Add comment about labels caveats to mesos.proto
* [MESOS-4731] - Update /frameworks to use jsonify
* [MESOS-4776] - Libprocess metrics/snapshot endpoint rate limiting should be configurable.
* [MESOS-4783] - Disable rate limiting of the global metrics endpoint for mesos-tests execution
* [MESOS-4792] - Remove src/common/date_utils.{c,h}pp
* [MESOS-4796] - Debug ability enhancement for unified container
** Task
* [MESOS-1940] - Add Mesos-graced/hosted libraries to installation path
* [MESOS-3339] - Implement filtering mechanism for (Scheduler API Events) Testing
* [MESOS-3424] - Support fetching AppC images into the store
* [MESOS-3525] - Figure out how to enforce 64-bit builds on Windows.
* [MESOS-3583] - Introduce stream IDs in HTTP Scheduler API
* [MESOS-3613] - Port slave/paths.cpp to Windows
* [MESOS-3643] - Implement stout/os/windows/shell.hpp
* [MESOS-3763] - Need for http::put request method
* [MESOS-3929] - Automate the process of landing commits for committers
* [MESOS-3943] - Support dynamic weight in allocator
* [MESOS-4066] - Agent should not return partial state when a request is made to /state endpoint during recovery.
* [MESOS-4200] - Test case(s) for weights + allocation behavior
* [MESOS-4345] - Implement a network-handle manager for net_cls cgroup subsystem
* [MESOS-4358] - Expose net_cls network handles in agent's state endpoint
* [MESOS-4421] - Document that /reserve, /create-volumes endpoints can return misleading "success"
* [MESOS-4433] - Implement a callback testing interface for the Executor Library
* [MESOS-4435] - Update `Master::Http::stateSummary` to use `jsonify`.
* [MESOS-4438] - Add 'dependency' message to 'AppcImageManifest' protobuf.
* [MESOS-4439] - Fix appc CachedImage image validation
* [MESOS-4457] - Implement tests for the new Executor library
* [MESOS-4531] - Document multi-disk support.
* [MESOS-4590] - Add test case for reservations with same role, different principals
* [MESOS-4596] - Add common Appc spec utilities.
* [MESOS-4660] - Document net_cls isolator in docs/mesos-containerizer.md.
* [MESOS-4686] - Implement master failover tests for the scheduler library.
* [MESOS-4691] - Add a HierarchicalAllocator benchmark with reservation labels.
* [MESOS-4700] - Allow agent to configure net_cls handle minor range.
* [MESOS-4707] - Add fs::supported() function for detecting whether a file system is supported
* [MESOS-4712] - Remove 'force' field from the Subscribe Call in v1 Scheduler API
* [MESOS-4713] - ReviewBot should not fail hard if there are circular dependencies in a review chain
* [MESOS-4746] - CMake: Add leveldb library to 3rdparty external builds.
* [MESOS-4748] - Add Appc image fetcher tests.
* [MESOS-4780] - Remove `user` and `rootfs` flags in Windows launcher.
* [MESOS-4798] - Make existing scheduler library tests use the callback interface.
* [MESOS-4817] - Remove internal usage of deprecated *.json endpoints.
* [MESOS-4822] - Add support for local image fetching in Appc provisioner.
* [MESOS-4829] - Remove `grace_period_seconds` field from Shutdown event v1 protobuf.
* [MESOS-4834] - Add 'file' fetcher plugin.
Release Notes - Mesos - Version 0.27.4
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-5330] - Agent should backoff before connecting to the master.
* [MESOS-5571] - Scheduler JNI throws exception when the major versions of JAR and libmesos don't match.
* [MESOS-5691] - SSL downgrade support will leak sockets in CLOSE_WAIT status.
* [MESOS-5723] - SSL-enabled libprocess will leak incoming links to forks.
* [MESOS-5748] - Potential segfault in `link` when linking to a remote process.
* [MESOS-5913] - Stale socket FD usage when using libevent + SSL.
* [MESOS-5943] - Incremental http parsing of URLs leads to decoder error.
* [MESOS-5986] - SSL Socket CHECK can fail after socket receives EOF.
* [MESOS-6104] - Potential FD double close in libevent's implementation of `sendfile`.
* [MESOS-6152] - Resource leak in libevent_ssl_socket.cpp.
Release Notes - Mesos - Version 0.27.3
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-4705] - Linux 'perf' parsing logic may fail when OS distribution has perf backports.
* [MESOS-4869] - /usr/libexec/mesos/mesos-health-check using/leaking a lot of memory.
* [MESOS-5018] - FrameworkInfo Capability enum does not support upgrades.
* [MESOS-5021] - Memory leak in subprocess when 'environment' argument is provided.
* [MESOS-5449] - Memory leak in SchedulerProcess.declineOffer.
Release Notes - Mesos - Version 0.27.2
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-4693] - Variable shadowing in HookManager::slavePreLaunchDockerHook.
* [MESOS-4711] - Race condition in libevent poll implementation causes crash.
* [MESOS-4754] - The "executors" field is exposed under a backwards incompatible schema.
** Improvement
* [MESOS-4687] - Implement reliable floating point for scalar resources.
Release Notes - Mesos - Version 0.27.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-4546] - Mesos agents need to re-resolve hosts in zk string on leader change / failure to connect.
* [MESOS-4563] - Docker::Container::Create should handle NetworkSettings.IPAddress being an empty string.
* [MESOS-4582] - state.json serving duplicate "active" fields.
* [MESOS-4585] - mesos-fetcher LIBPROCESS_PORT set to 5051 URI fetch failure.
* [MESOS-4587] - Docker environment variables must be able to contain the equal sign.
* [MESOS-4597] - `freebsd.hpp` is missing from the release tarball.
* [MESOS-4598] - Logrotate ContainerLogger should not remove IP from environment.
* [MESOS-4637] - Docker process executor can die with agent unit on systemd.
* [MESOS-4639] - Posix process executor can die with agent unit on systemd.
* [MESOS-4640] - Logrotate container logger can die with agent unit on systemd.
* [MESOS-4675] - Cannot disable systemd support.
** Improvement
* [MESOS-4566] - Avoid unnecessary temporary `std::string` constructions and copies in `jsonify`.
* [MESOS-4636] - Add parent hook to subprocess.
** Task
* [MESOS-4435] - Update `Master::Http::stateSummary` to use `jsonify`.
* [MESOS-4531] - Document multi-disk support.
Release Notes - Mesos - Version 0.27.0
--------------------------------------------
API Changes:
* [MESOS-313] - Report executor termination to framework schedulers.
* [MESOS-2315] - Removed deprecated CommandInfo::ContainerInfo.
* [MESOS-3988] - Implicit roles.
* [MESOS-4154] - Rename shutdown_frameworks to teardown_frameworks.
All Issues:
** Bug
* [MESOS-934] - 'Logging and Debugging' document is out-of-date.
* [MESOS-1613] - HealthCheckTest.ConsecutiveFailures is flaky
* [MESOS-2209] - Mesos should not use negative exit codes
* [MESOS-2768] - SIGPIPE in process::run_in_event_loop()
* [MESOS-3134] - Port bootstrap to CMake
* [MESOS-3151] - ReservationTest.CompatibleCheckpointedResourcesWithPersistentVolumes is flaky
* [MESOS-3235] - FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky
* [MESOS-3307] - Configurable size of completed task / framework history
* [MESOS-3349] - Removing mount point fails with EBUSY in LinuxFilesystemIsolator.
* [MESOS-3379] - LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint fails
* [MESOS-3472] - RegistryTokenTest.ExpiredToken test is flaky
* [MESOS-3479] - COMMAND Health Checks are not executed if the timeout is exceeded
* [MESOS-3551] - Replace use of strerror with thread-safe alternatives strerror_r / strerror_l.
* [MESOS-3595] - Framework process hangs after master failover when number of frameworks > libprocess thread pool size
* [MESOS-3718] - Implement Quota support in allocator
* [MESOS-3773] - RegistryClientTest.SimpleGetBlob is flaky
* [MESOS-3799] - Compilation warning with Ubuntu wily: auto_ptr is deprecated
* [MESOS-3809] - Expose advertise_ip and advertise_port as command line options in mesos slave
* [MESOS-3817] - Rename offers to outstanding offers
* [MESOS-3832] - Scheduler HTTP API does not redirect to leading master
* [MESOS-3834] - slave upgrade framework checkpoint incompatibility
* [MESOS-3851] - Investigate recent crashes in Command Executor
* [MESOS-3859] - Add github support to apply-reviews.py.
* [MESOS-3860] - Add support for `stout/process.hpp` on Windows.
* [MESOS-3868] - Make apply-review.sh use apply-reviews.py
* [MESOS-3909] - isolator module headers depend on picojson headers
* [MESOS-3916] - MasterMaintenanceTest.InverseOffersFilters is flaky
* [MESOS-3939] - ubsan error in net::IP::create(sockaddr const&): misaligned address
* [MESOS-3963] - Move "using mesos::fetcher::FetcherInfo" into internal namespace in "fetcher.hpp"
* [MESOS-3965] - Ensure resources in `QuotaInfo` protobuf do not contain `role`
* [MESOS-4002] - ReservationEndpointsTest.UnreserveAvailableAndOfferedResources is flaky
* [MESOS-4024] - HealthCheckTest.CheckCommandTimeout is flaky.
* [MESOS-4031] - slave crashed in cgroupstatistics()
* [MESOS-4047] - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky
* [MESOS-4067] - ReservationTest.ACLMultipleOperations is flaky
* [MESOS-4069] - libevent_ssl_socket assertion fails
* [MESOS-4072] - The lt-mesos-master will coredump in some situation.
* [MESOS-4102] - Quota doesn't allocate resources on slave joining.
* [MESOS-4107] - `os::strerror_r` breaks the Windows build
* [MESOS-4108] - Implement `os::mkdtemp` for Windows
* [MESOS-4109] - HTTPConnectionTest.ClosingResponse is flaky
* [MESOS-4110] - Implement `WindowsError` to correspond with `ErrnoError`.
* [MESOS-4154] - Rename shutdown_frameworks to teardown_frameworks
* [MESOS-4177] - Create a user doc for Executor HTTP API
* [MESOS-4184] - Jenkins builds for Centos fail with missing 'which' utility and incorrect 'java.home'
* [MESOS-4192] - Add documentation for API Versioning
* [MESOS-4193] - Port `process/file.hpp`
* [MESOS-4202] - Race in SSL socket shutdown
* [MESOS-4218] - Test for Quota Status Endpoint
* [MESOS-4266] - S3 URIs prefixed with / by fetcher
* [MESOS-4274] - libprocess build fails with libhttp-parser >= 2.0
* [MESOS-4275] - Duration uses fixed-width types inconsistently
* [MESOS-4281] - Correctly handle disk quota usage when volumes are bind mounted into the container.
* [MESOS-4283] - Accept 3-field version of HDFS du output
* [MESOS-4290] - Reject tasks with images with filesystem/posix isolator
* [MESOS-4293] - Updated master help message for acls.
* [MESOS-4294] - Protobuf parse should support parsing JSON object containing JSON Null.
* [MESOS-4310] - Disable support for --switch-user on Windows.
* [MESOS-4311] - Protobuf parse should pass error messages when parsing nested JSON.
* [MESOS-4328] - Docker container REST API /monitor/statistics.json output has no timestamp field
* [MESOS-4347] - GMock warning in ReservationTest.ACLMultipleOperations
* [MESOS-4348] - GMock warning in HookTest.VerifySlaveRunTaskHook, HookTest.VerifySlaveTaskStatusDecorator
* [MESOS-4349] - GMock warning in SlaveTest.ContainerUpdatedBeforeTaskReachesExecutor
* [MESOS-4357] - GMock warning in RoleTest.ImplicitRoleStaticReservation
* [MESOS-4375] - Allow schemes in HDFS URI fetcher plugin to be configurable.
* [MESOS-4409] - MasterTest.MaxCompletedFrameworksFlag is flaky
* [MESOS-4411] - Traverse all roles for quota allocation.
* [MESOS-4417] - Prevent allocator from crashing on successful recovery.
* [MESOS-4425] - Introduce filtering test abstractions for HTTP events to libprocess
* [MESOS-4449] - SegFault on agent during executor startup
* [MESOS-4507] - Replace busybox image with alpine in Docker tests
* [MESOS-4515] - ContainerLoggerTest.LOGROTATE_RotateInSandbox breaks when running on Centos6.
* [MESOS-4530] - NetClsIsolatorTest.ROOT_CGROUPS_NetClsIsolate is flaky
* [MESOS-4533] - DiskUsageCollectorTest.ExcludeRelativePath fails on Linux
* [MESOS-4534] - Resources object can be mutated through the public API
* [MESOS-4535] - Logrotate ContainerLogger may not handle FD ownership correctly
* [MESOS-4539] - Exclude paths in Posix disk isolator should be absolute paths.
** Documentation
* [MESOS-3581] - License headers show up all over doxygen documentation.
* [MESOS-3936] - Document possible task state transitions for framework authors
* [MESOS-3996] - libprocess: document when, why defer() is necessary
* [MESOS-4204] - Document that frameworks that participate in a role should cooperate
* [MESOS-4206] - Write new logging-related documentation
* [MESOS-4207] - Add an example bug due to a lack of defer() to the defer() documentation
* [MESOS-4209] - Document "how to program with dynamic reservations and persistent volumes"
* [MESOS-4314] - Publish Quota Documentation
* [MESOS-4396] - Adding Tachyon to the list of frameworks
** Improvement
* [MESOS-313] - Report executor terminations to framework schedulers.
* [MESOS-920] - Set GLOG_drop_log_memory=false in environment prior to logging initialization.
* [MESOS-2275] - Document header include rules in style guide
* [MESOS-2353] - Improve performance of the state.json endpoint for large clusters.
* [MESOS-3074] - Add capacity heuristic for quota requests in Master
* [MESOS-3232] - Implement HTTP Basic Authentication for Mesos endpoints
* [MESOS-3493] - benchmark for declining offers
* [MESOS-3720] - Tests for Quota support in master
* [MESOS-3827] - Improve compilation speed of GMock tests
* [MESOS-3960] - Standardize quota endpoints
* [MESOS-3979] - Replace `QuotaInfo` with `Quota` in allocator interface
* [MESOS-4020] - Introduce filter for non-revocable resources in `Resources`
* [MESOS-4021] - Remove quota from Registry for quota remove request
* [MESOS-4056] - Respond with `MethodNotAllowed` if a request uses an unsupported method.
* [MESOS-4058] - Do not use `Resource.role` for resources in quota request.
* [MESOS-4085] - Implement implicit roles
* [MESOS-4103] - Show disk usage and allocation in WebUI
* [MESOS-4128] - Refactor sorter factories in allocator and improve comments around them.
* [MESOS-4136] - Add a ContainerLogger module that restrains log sizes
* [MESOS-4183] - Move operator<< definitions to .cpp files and include <iosfwd> in .hpp where possible.
* [MESOS-4195] - Add dynamic reservation tests with no principal
* [MESOS-4231] - Add a new category to cpplint to detect missing white-space in comments
* [MESOS-4241] - Consolidate docker store slave flags
* [MESOS-4262] - Enable net_cls subsystem in cgroup infrastructure
* [MESOS-4277] - Provide constexpr Duration::min() and max()
* [MESOS-4302] - Offer filter timeouts are ignored if the allocator is slow or backlogged.
* [MESOS-4337] - Implement a simple Windows version of dirent.hpp, for compatibility.
* [MESOS-4351] - Remove logic around checkpointing in the slave
* [MESOS-4410] - Introduce protobuf for quota set request.
* [MESOS-4505] - Hierarchical allocator performance is slow due to Quota
* [MESOS-4578] - docker run -c is deprecated
** Task
* [MESOS-2079] - IO.Write test is flaky on OS X 10.10.
* [MESOS-2210] - Disallow special characters in role.
* [MESOS-2296] - Implement the Events stream on slave for Call endpoint
* [MESOS-2315] - Deprecate / Remove CommandInfo::ContainerInfo
* [MESOS-2455] - Add operator endpoints to create/destroy persistent volumes.
* [MESOS-3515] - Support Subscribe Call for HTTP based Executors
* [MESOS-3550] - Create an Executor Library based on the new Executor HTTP API
* [MESOS-3615] - Port slave/state.cpp
* [MESOS-3627] - Port process/pid.hpp to Windows
* [MESOS-3628] - Port process/address.hpp to Windows
* [MESOS-3629] - Port stout/ip.hpp to Windows
* [MESOS-3630] - Port stout/net.hpp to Windows
* [MESOS-3631] - Implement stout/windows/net.hpp
* [MESOS-3633] - Port stout/path.hpp to Windows
* [MESOS-3640] - Implement stout/os/windows/ls.hpp
* [MESOS-3645] - Implement stout/os/windows/stat.hpp
* [MESOS-3658] - Port stout/protobuf.hpp to Windows
* [MESOS-3659] - Port slave/paths.hpp to Windows
* [MESOS-3660] - Port slave/state.hpp to Windows
* [MESOS-3693] - Port stout/os/open.hpp to Windows
* [MESOS-3861] - Authenticate quota requests
* [MESOS-3862] - Authorize set quota requests.
* [MESOS-3864] - Simplify and/or document the libprocess initialization synchronization logic
* [MESOS-3882] - Libprocess: Implement process::Clock::finalize
* [MESOS-3911] - Add a `--force` flag to disable sanity check in quota
* [MESOS-3912] - Rescind offers in order to satisfy quota
* [MESOS-3925] - Add HDFS based URI fetcher plugin.
* [MESOS-3951] - Make HDFS tool wrappers asynchronous.
* [MESOS-3981] - Implement recovery in the Hierarchical allocator
* [MESOS-3983] - Tests for quota request validation
* [MESOS-3984] - Tests for quota support in `allocate()` function.
* [MESOS-3985] - Tests for rescinding offers for quota
* [MESOS-4013] - Introduce status endpoint for quota
* [MESOS-4014] - Introduce remove endpoint for quota
* [MESOS-4064] - Add ContainerInfo to internal Task protobuf.
* [MESOS-4081] - Authorize quota removal
* [MESOS-4087] - Introduce a module for logging executor/task output
* [MESOS-4088] - Modularize existing plain-file logging for executor/task logs launched with the Mesos Containerizer
* [MESOS-4116] - Add tests for quotas + empty roles (no registered frameworks)
* [MESOS-4137] - Modularize plain-file logging for executor/task logs launched with the Docker Containerizer
* [MESOS-4150] - Implement container logger module metadata recovery
* [MESOS-4220] - Introduce result_of with C++14 semantics to stout.
* [MESOS-4221] - Invoke _Deferred's implicit conversion operator explicitly.
* [MESOS-4228] - Use std::is_bind_expression to reroute the result of std::bind.
* [MESOS-4236] - Create a design document for jsonify
* [MESOS-4237] - Introduce `jsonify` to stout.
* [MESOS-4238] - Update `Master::Http::state` to use the `jsonify` facility.
* [MESOS-4239] - Update relevant libprocess components to support the `jsonify` facility.
* [MESOS-4240] - Pull provisioner from linux filesystem isolator to Mesos containerizer.
* [MESOS-4378] - Add Source to Resource.DiskInfo.
* [MESOS-4380] - Adjust Resource arithmetics for DiskInfo.Source.
* [MESOS-4400] - Create persistent volume directories based on DiskInfo.Source.
* [MESOS-4402] - Update filesystem isolators to look for persistent volume directories from the correct location.
* [MESOS-4403] - Check paths in DiskInfo.Source.Path exist during slave initialization.
* [MESOS-4415] - Implement stout/os/windows/rmdir.hpp
* [MESOS-4506] - Posix disk isolator should ignore disk quota enforcement for MOUNT type disk resources.
* [MESOS-4526] - Include the allocated portion of reserved resources in the role sorter for DRF.
* [MESOS-4527] - Include allocated portion of the reserved resources in the quota role sorter for DRF.
* [MESOS-4528] - Account for reserved resources in the quota guarantee check.
* [MESOS-4529] - Update the allocator to not offer unreserved resources beyond quota.
** Wish
* [MESOS-3962] - Add labels to the message Port
Release Notes - Mesos - Version 0.26.2
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-4705] - Linux 'perf' parsing logic may fail when OS distribution has perf backports.
* [MESOS-5449] - Memory leak in SchedulerProcess.declineOffer.
Release Notes - Mesos - Version 0.26.1
--------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-1187] - precision errors with allocation calculations.
* [MESOS-3307] - Configurable size of completed task / framework history.
* [MESOS-3397] - sorter.cpp: Check failed: total.resources.contains(slaveId).
* [MESOS-3605] - hdfs.du() fails on os x due to lack of native-hadoop library.
* [MESOS-3719] - Core dump on /teardown.
* [MESOS-3773] - RegistryClientTest.SimpleGetBlob is flaky.
* [MESOS-3834] - slave upgrade framework checkpoint incompatibility.
* [MESOS-4031] - slave crashed in cgroupstatistics().
* [MESOS-4069] - libevent_ssl_socket assertion fails.
* [MESOS-4071] - Master crash during framework teardown (Check failed: total.resources.contains(slaveId)).
* [MESOS-4283] - Accept 3-field version of HDFS du output.
* [MESOS-4311] - Protobuf parse should pass error messages when parsing nested JSON.
* [MESOS-4409] - MasterTest.MaxCompletedFrameworksFlag is flaky.
* [MESOS-4449] - SegFault on agent during executor startup.
* [MESOS-4518] - MasterTest.MaxCompletedTasksPerFrameworkFlag is flaky.
* [MESOS-4582] - state.json serving duplicate "active" fields.
* [MESOS-4637] - Docker process executor can die with agent unit on systemd.
* [MESOS-4639] - Posix process executor can die with agent unit on systemd.
* [MESOS-4711] - Race condition in libevent poll implementation causes crash.
* [MESOS-4754] - The "executors" field is exposed under a backwards incompatible schema.
* [MESOS-4979] - os::rmdir does not handle special files (e.g., device, socket).
* [MESOS-5021] - Memory leak in subprocess when 'environment' argument is provided.
** Improvement
* [MESOS-920] - Set GLOG_drop_log_memory=false in environment prior to logging initialization.
* [MESOS-2353] - Improve performance of the state.json endpoint for large clusters.
* [MESOS-4302] - Offer filter timeouts are ignored if the allocator is slow or backlogged.
* [MESOS-4566] - Avoid unnecessary temporary `std::string` constructions and copies in `jsonify`.
* [MESOS-4636] - Add parent hook to subprocess.
* [MESOS-4687] - Implement reliable floating point for scalar resources.
** Task
* [MESOS-4237] - Introduce `jsonify` to stout.
* [MESOS-4238] - Update `Master::Http::state` to use the `jsonify` facility.
* [MESOS-4239] - Update relevant libprocess components to support the `jsonify` facility.
* [MESOS-4435] - Update `Master::Http::stateSummary` to use `jsonify`.
Release Notes - Mesos - Version 0.26.0
--------------------------------------
API Changes:
* [MESOS-3560] - Fix JSON-based credential files by changing protobuf
`Credential` field `secret` from bytes to string.
* [MESOS-3824] - Add /frameworks endpoint to master.
All Issues:
** Bug
* [MESOS-1867] - Precision errors in UI.
* [MESOS-2864] - Master should not change the state of a terminal task if it receives another terminal update.
* [MESOS-3030] - Build failure on OS X 10.11 using Xcode 7.
* [MESOS-3280] - Master fails to access replicated log after network partition.
* [MESOS-3293] - Failing ROOT_ tests on CentOS 7.1 - LimitedCpuIsolatorTest.
* [MESOS-3326] - Make use of C++11 atomics.
* [MESOS-3329] - Unused hashmap::existsValue functions have incomplete code paths.
* [MESOS-3411] - ReservationEndpointsTest.AvailableResources appears to be faulty.
* [MESOS-3428] - Support running filesystem isolation with Command Executor in MesosContainerizer.
* [MESOS-3470] - UserCgroupIsolatorTest failed on CentOS 6.6.
* [MESOS-3501] - Configure cannot find libevent headers in CentOS 6.
* [MESOS-3506] - Build instructions for CentOS 6.6 should include `sudo yum update`.
* [MESOS-3517] - Building mesos from source fails when OS language is not English.
* [MESOS-3519] - Fix file descriptor leakage / double close in the code base.
* [MESOS-3522] - MesosScheduler declineOffer results in an acceptOffer.
* [MESOS-3552] - CHECK failure due to floating point precision on reservation request.
* [MESOS-3553] - LIBPROCESS_IP not passed when executor's environment is specified.
* [MESOS-3560] - JSON-based credential files do not work correctly.
* [MESOS-3563] - Revocable task CPU shows as zero in /state.json.
* [MESOS-3569] - Typos in Mesos Monitoring doc page.
* [MESOS-3584] - Rename libprocess tests to "libprocess-tests".
* [MESOS-3591] - mesos-slave: --help output for "--master" is incomplete.
* [MESOS-3594] - Rename http_api_tests.cpp to scheduler_http_api_tests.cpp.
* [MESOS-3597] - Running tests with CMake is annoying and has a bad reporting story.
* [MESOS-3600] - Unable to build with non-default protobuf.
* [MESOS-3602] - hdfs du fails due to prepended / on path.
* [MESOS-3603] - Test build failure due to comparison between signed and unsigned integers.
* [MESOS-3604] - ExamplesTest.PersistentVolumeFramework does not work in OS X El Capitan.
* [MESOS-3605] - hdfs.du() fails on os x due to lack of native-hadoop library.
* [MESOS-3694] - Enable building mesos.apache.org locally in a Docker container.
* [MESOS-3698] - JSON parsing allows non-whitespace trailing characters.
* [MESOS-3700] - Deprecate resource_monitoring_interval flag.
* [MESOS-3708] - Improve process::subprocess ABORT message.
* [MESOS-3715] - Enable Request resource using Call::REQUEST in scheduler driver.
* [MESOS-3716] - Update Allocator interface to support quota.
* [MESOS-3728] - Libprocess: Flaky behavior on test suite when finalizing.
* [MESOS-3733] - ContentType/SchedulerTest.Suppress/0 is flaky.
* [MESOS-3734] - Incorrect sed syntax for Mac OSX.
* [MESOS-3738] - Mesos health check is invoked incorrectly when Mesos slave is within the docker container.
* [MESOS-3743] - Provide diagnostic output in agent log when fetching fails.
* [MESOS-3748] - HTTP scheduler library does not gracefully parse invalid resource identifiers.
* [MESOS-3751] - MESOS_NATIVE_JAVA_LIBRARY not set on MesosContainerizer tasks with --executor_environment_variables.
* [MESOS-3769] - Agent logs are misleading during agent shutdown.
* [MESOS-3770] - SlaveRecoveryTest/0.RecoverCompletedExecutor is flaky.
* [MESOS-3771] - Mesos JSON API creates invalid JSON due to lack of binary data / non-ASCII handling.
* [MESOS-3773] - RegistryClientTest.SimpleGetBlob is flaky.
* [MESOS-3793] - Cannot start mesos local on a Debian GNU/Linux 8 docker machine.
* [MESOS-3800] - Containerizer attempts to create Linux launcher by default.
* [MESOS-3806] - 'mount --make-rslave /' does not work as expected on ubuntu 14.04.
* [MESOS-3810] - Must be able to use NetworkInfo with mesos-executor.
* [MESOS-3822] - Cannot specify multiple masters when slave starts up.
* [MESOS-3834] - Slave upgrade framework checkpoint incompatibility.
* [MESOS-3837] - Rootfs in provisioner test doesn't handle symlink directories properly.
* [MESOS-3840] - Build broken: 'adding 'bool' to a string does not append to the string' in filesystem tests.
* [MESOS-3847] - Root tests for LinuxFilesystemIsolatorTest are broken.
* [MESOS-3849] - Corrected style in Makefiles.
* [MESOS-3937] - Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
* [MESOS-3953] - DockerTest.ROOT_DOCKER_CheckPortResource fails.
* [MESOS-3964] - LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.
* [MESOS-3966] - LinuxFilesystemIsolatorTest.ROOT_ImageInVolumeWithRootFilesystem fails on CentOS 7.1.
* [MESOS-3969] - Failing 'make distcheck' on Debian 8, somehow SSL-related.
* [MESOS-3974] - CgroupsAnyHierarchyMemoryPressureTest tests fail on CentOS 6.7.
* [MESOS-3975] - SSL build of mesos causes flaky testsuite.
* [MESOS-3989] - Add missing DiscoveryInfo field to v1/mesos.proto.
* [MESOS-4106] - The health checker may fail to inform the executor to kill an unhealthy task after max_consecutive_failures.
** Documentation
* [MESOS-2783] - Document the fetcher.
* [MESOS-3692] - Clarify error message 'could not chown work directory'.
* [MESOS-3749] - Configuration docs are missing --enable-libevent and --enable-ssl.
* [MESOS-3752] - CentOS 6 dependency install fails at Maven.
* [MESOS-3905] - Five new docker-related slave flags are not covered by the configuration documentation.
** Improvement
* [MESOS-1841] - Mesos components should expose their version on an endpoint.
* [MESOS-2035] - Add reason to containerizer proto Termination.
* [MESOS-2273] - Add "tests" target to Makefile for building-but-not-running tests.
* [MESOS-2467] - Allow --resources flag to take JSON.
* [MESOS-2613] - Change docker rm command.
* [MESOS-2924] - Allow simple construction via initializer list on hashset.
* [MESOS-2960] - Configure DiscoveryInfo and Visibility per port.
* [MESOS-2972] - Serialize Docker image spec as protobuf.
* [MESOS-3099] - Validation of Docker Image Manifests from Docker Registry.
* [MESOS-3366] - Allow resources/attributes discovery.
* [MESOS-3417] - Log the source address of broadcasts received by the replicated log.
* [MESOS-3429] - Allow HTTP response codes in libprocess to be matched.
* [MESOS-3468] - Improve apply_reviews.sh script to apply chain of reviews.
* [MESOS-3554] - Allocator changes trigger large re-compiles.
* [MESOS-3566] - Add a section to the Scheduler HTTP API docs around RecordIO specification.
* [MESOS-3721] - We need the active flag on frameworks in the /state-summary endpoint.
* [MESOS-3735] - Mesos master should expose the version of registered agents.
* [MESOS-3759] - Document messages.proto.
* [MESOS-3788] - Clarify NetworkInfo semantics for IP addresses and group policies.
* [MESOS-3819] - Add documentation explaining "roles".
* [MESOS-4015] - Expose task / executor health in master & slave state.json.
** Task
* [MESOS-1832] - Slave should accept PingSlaveMessage but not "PING" message.
* [MESOS-2224] - Add explanatory comments for Allocator interface.
* [MESOS-2295] - Implement the Call endpoint on Slave.
* [MESOS-2906] - Slave: Synchronous Validation for Calls.
* [MESOS-3104] - Add an endpoint that exposes component flags.
* [MESOS-3129] - Move all MesosContainerizer related files under src/slave/containerizer/mesos.
* [MESOS-3332] - Support HTTP Pipelining in libprocess (http::post).
* [MESOS-3405] - Add JSON::protobuf for google::protobuf::RepeatedPtrField.
* [MESOS-3407] - Mesos fetcher automatically extracts gz files.
* [MESOS-3480] - Refactor Executor struct in Slave to handle HTTP based executors.
* [MESOS-3762] - Refactor SSLTest fixture such that MesosTest can use the same helpers.
* [MESOS-3824] - Add /frameworks endpoint to master.
* [MESOS-3845] - Send TaskStatus::container_status inside reconciliation updates.
* [MESOS-3900] - Enable mesos-reviewbot project on jenkins to use docker.
Release Notes - Mesos - Version 0.25.1
--------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-1187] - precision errors with allocation calculations.
* [MESOS-3030] - Build failure on OS X 10.11 using Xcode 7.
* [MESOS-3307] - Configurable size of completed task / framework history.
* [MESOS-3397] - sorter.cpp: Check failed: total.resources.contains(slaveId).
* [MESOS-3411] - ReservationEndpointsTest.AvailableResources appears to be faulty.
* [MESOS-3560] - JSON-based credential files do not work correctly.
* [MESOS-3602] - hdfs du fails due to prepended / on path.
* [MESOS-3604] - ExamplesTest.PersistentVolumeFramework does not work in OS X El Capitan.
* [MESOS-3605] - hdfs.du() fails on OS X due to lack of native-hadoop library.
* [MESOS-3719] - Core dump on /teardown.
* [MESOS-3738] - Mesos health check is invoked incorrectly when Mesos slave is within the docker container.
* [MESOS-3773] - RegistryClientTest.SimpleGetBlob is flaky.
* [MESOS-3834] - slave upgrade framework checkpoint incompatibility.
* [MESOS-4031] - slave crashed in cgroupstatistics().
* [MESOS-4069] - libevent_ssl_socket assertion fails.
* [MESOS-4071] - Master crash during framework teardown (Check failed: total.resources.contains(slaveId)).
* [MESOS-4106] - The health checker may fail to inform the executor to kill an unhealthy task after max_consecutive_failures.
* [MESOS-4283] - Accept 3-field version of HDFS du output.
* [MESOS-4311] - Protobuf parse should pass error messages when parsing nested JSON.
* [MESOS-4409] - MasterTest.MaxCompletedFrameworksFlag is flaky.
* [MESOS-4518] - MasterTest.MaxCompletedTasksPerFrameworkFlag is flaky.
* [MESOS-4582] - state.json serving duplicate "active" fields.
* [MESOS-4637] - Docker process executor can die with agent unit on systemd.
* [MESOS-4639] - Posix process executor can die with agent unit on systemd.
* [MESOS-4711] - Race condition in libevent poll implementation causes crash.
* [MESOS-4754] - The "executors" field is exposed under a backwards incompatible schema.
* [MESOS-4979] - os::rmdir does not handle special files (e.g., device, socket).
* [MESOS-5021] - Memory leak in subprocess when 'environment' argument is provided.
** Improvement
* [MESOS-920] - Set GLOG_drop_log_memory=false in environment prior to logging initialization.
* [MESOS-2353] - Improve performance of the state.json endpoint for large clusters.
* [MESOS-4302] - Offer filter timeouts are ignored if the allocator is slow or backlogged.
* [MESOS-4566] - Avoid unnecessary temporary `std::string` constructions and copies in `jsonify`.
* [MESOS-4636] - Add parent hook to subprocess.
* [MESOS-4687] - Implement reliable floating point for scalar resources.
** Task
* [MESOS-4237] - Introduce `jsonify` to stout.
* [MESOS-4238] - Update `Master::Http::state` to use the `jsonify` facility.
* [MESOS-4239] - Update relevant libprocess components to support the `jsonify` facility.
* [MESOS-4435] - Update `Master::Http::stateSummary` to use `jsonify`.
Release Notes - Mesos - Version 0.25.0
--------------------------------------
This release contains:
* [MESOS-1474] - Experimental support for maintenance primitives. Please refer
to maintenance.md for more information.
* [MESOS-2600] - Added master endpoints /reserve and /unreserve for dynamic
reservations. Please refer to reservation.md for more information.
* [MESOS-2044] - Extended Module APIs to enable IP per container assignment,
isolation and resolution.
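As a rough illustration of the /reserve endpoint mentioned above, the sketch below builds a form-encoded request body in Python. The form keys (`slaveId`, `resources`) and the resource field names are assumptions modelled on the `Resource` protobuf; reservation.md remains the authoritative description, and the helper name `build_reserve_body` is hypothetical.

```python
import json
from urllib.parse import urlencode


def build_reserve_body(slave_id, role, principal, cpus):
    """Build a form-encoded body for a POST to the master's /reserve endpoint.

    Assumption: the endpoint takes a `slaveId` form field and a `resources`
    form field holding a JSON array of Resource objects.
    """
    resources = [{
        "name": "cpus",
        "type": "SCALAR",
        "scalar": {"value": cpus},
        "role": role,
        # Dynamic reservations carry the principal that made them.
        "reservation": {"principal": principal},
    }]
    return urlencode({"slaveId": slave_id, "resources": json.dumps(resources)})
```

The same body shape would presumably be sent to /unreserve to release the reservation; sending the request (authentication, host, port) is left out because those details depend on the cluster.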
API Changes:
* [MESOS-3037] - Add a SUPPRESS call to the scheduler.
All Issues:
** Bug
* [MESOS-2635] - Web UI Display Bug when starting lots of tasks with small cpu value.
* [MESOS-2986] - Docker version output is not compatible with Mesos.
* [MESOS-3046] - Stout's UUID re-seeds a new random generator during each call to UUID::random.
* [MESOS-3051] - performance issues with port ranges comparison.
* [MESOS-3052] - Allocator performance issue when using a large number of filters.
* [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are broken.
* [MESOS-3169] - FrameworkInfo should only be updated if the re-registration is valid.
* [MESOS-3185] - Refactor Subprocess logic in linux/perf.cpp to use common subroutine.
* [MESOS-3239] - Refactor master HTTP endpoints help messages such that they cannot be out of sync.
* [MESOS-3245] - The comment on DRFSorter::dirty is not correct.
* [MESOS-3254] - Cgroup CHECK fails test harness.
* [MESOS-3258] - Remove FrameworkInfo capabilities on re-registration.
* [MESOS-3261] - Move QoS plug-ins to a specified folder like resource_estimator.
* [MESOS-3269] - The comment on Master::updateSlave() is not correct.
* [MESOS-3282] - Web UI no longer shows Tasks information.
* [MESOS-3344] - Add more comments for strings::internal::fmt.
* [MESOS-3351] - duplicated slave id in master after master failover.
* [MESOS-3387] - Refactor MesosContainerizer to accept namespace dynamically.
* [MESOS-3408] - Labels field of FrameworkInfo should be added into v1 mesos.proto.
* [MESOS-3411] - ReservationEndpointsTest.AvailableResources appears to be faulty.
* [MESOS-3423] - Perf event isolator stops performing sampling if a single timeout occurs.
* [MESOS-3426] - process::collect and process::await do not perform discard propagation.
* [MESOS-3430] - LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithoutRootFilesystem fails on CentOS 7.1.
* [MESOS-3450] - Update Mesos C++ Style Guide for namespace usage.
* [MESOS-3451] - Failing tests after changes to Isolator/MesosContainerizer API.
* [MESOS-3458] - Segfault when accepting or declining inverse offers.
* [MESOS-3474] - ExamplesTest.{TestFramework, JavaFramework, PythonFramework} failed on CentOS 6.
* [MESOS-3489] - Add support for exposing Accept/Decline responses for inverse offers.
* [MESOS-3490] - Mesos UI fails to represent JSON entities.
* [MESOS-3512] - Don't retry close() on EINTR.
* [MESOS-3513] - Cgroups Test Filters aborts tests on CentOS 6.6.
* [MESOS-3519] - Fix file descriptor leakage / double close in the code base.
* [MESOS-3538] - CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test is flaky.
* [MESOS-3575] - V1 API java/python protos are not generated.
** Documentation
* [MESOS-2083] - Add documentation for maintenance primitives.
* [MESOS-2466] - Write documentation for all the LIBPROCESS_* environment variables.
* [MESOS-3230] - Create a HTTP based Authentication design doc.
* [MESOS-3492] - Expose maintenance user doc via the documentation home page.
* [MESOS-3508] - Update docs for Agent's --launcher flag.
* [MESOS-3516] - Add user doc for networking support in Mesos 0.25.0.
** Improvement
* [MESOS-2719] - Deprecating '.json' extension in master endpoints urls.
* [MESOS-2757] - Add -> operator for Option<T>, Try<T>, Result<T>, Future<T>.
* [MESOS-2875] - Add containerId to ResourceUsage to enable QoS controller to target a container.
* [MESOS-2964] - libprocess io does not support peek().
* [MESOS-2983] - Deprecating '.json' extension in slave endpoints url.
* [MESOS-2984] - Deprecating '.json' extension in files endpoints url.
* [MESOS-3037] - Add a SUPPRESS call to the scheduler.
* [MESOS-3187] - Docker cli option support.
* [MESOS-3304] - Remove remnants of LIBPROCESS_STATISTICS_WINDOW.
* [MESOS-3312] - Factor out JSON to repeated protobuf conversion.
* [MESOS-3340] - Command-line flags should take precedence over OS Env variables.
* [MESOS-3347] - Remove dead code in src/linux/perf.cpp.
* [MESOS-3377] - mesos docker container with container_name as ENV variable.
* [MESOS-3457] - Add flag to disable hostname lookup.
** Task
* [MESOS-1831] - Master should send PingSlaveMessage instead of "PING".
* [MESOS-1935] - Replace hard-coded reap interval with a constant.
* [MESOS-2061] - Add InverseOffer protobuf message.
* [MESOS-2062] - Add InverseOffer to Event/Call API.
* [MESOS-2066] - Add optional 'Unavailability' to resource offers to provide maintenance awareness.
* [MESOS-2067] - Add HTTP API to the master for maintenance operations.
* [MESOS-2600] - Add /reserve and /unreserve endpoints on the master for dynamic reservation.
* [MESOS-2907] - Agent: Create Basic Functionality to handle /call endpoint.
* [MESOS-3015] - Add hooks for Slave exits.
* [MESOS-3038] - Resource offers do not contain Unavailability, given a maintenance schedule.
* [MESOS-3042] - Master/Allocator does not send InverseOffers to resources to be maintained.
* [MESOS-3043] - Master does not handle InverseOffers in the Accept call (Event/Call API).
* [MESOS-3045] - Maintenance information is not populated in case of failover.
* [MESOS-3066] - Replicated registry needs a representation of maintenance schedules.
* [MESOS-3069] - Registry operations do not exist for manipulating maintenance schedules.
* [MESOS-3217] - Replace boost unordered_{set,map} and hash with std versions.
* [MESOS-3223] - Implement token manager for docker registry.
* [MESOS-3265] - Starting maintenance needs to deactivate agents and kill tasks.
* [MESOS-3266] - Stopping/Completing maintenance needs to reactivate agents.
* [MESOS-3299] - Add a protobuf to represent time with integer precision.
* [MESOS-3310] - Support provisioning images specified in volumes.
* [MESOS-3345] - Expand the range of integer precision when converting into/out of json.
* [MESOS-3346] - Add filter support for inverse offers.
* [MESOS-3375] - Add executor protobuf to v1.
* [MESOS-3395] - In CMake build system, download third party dependencies from a "trusted channel" instead of from Mesos GitHub mirror.
* [MESOS-3419] - Add HELP message for reserve/unreserve endpoint.
* [MESOS-3425] - Modify LinuxLauncher to support Systemd.
* [MESOS-3459] - Change /machine/up and /machine/down endpoints to take an array.
* [MESOS-3510] - Synchronize V1 helper functions with pre-v1.
* Work In Progress:
* Functionality for endpoint 'api/v1/executor' introduced on 'Agent' is incomplete.
Release Notes - Mesos - Version 0.24.2
--------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-1187] - precision errors with allocation calculations.
* [MESOS-3030] - Build failure on OS X 10.11 using Xcode 7.
* [MESOS-3046] - Stout's UUID re-seeds a new random generator during each call to UUID::random.
* [MESOS-3051] - performance issues with port ranges comparison.
* [MESOS-3052] - Allocator performance issue when using a large number of filters.
* [MESOS-3307] - Configurable size of completed task / framework history.
* [MESOS-3397] - sorter.cpp: Check failed: total.resources.contains(slaveId).
* [MESOS-3560] - JSON-based credential files do not work correctly.
* [MESOS-3602] - hdfs du fails due to prepended / on path.
* [MESOS-3604] - ExamplesTest.PersistentVolumeFramework does not work in OS X El Capitan.
* [MESOS-3605] - hdfs.du() fails on OS X due to lack of native-hadoop library.
* [MESOS-3719] - Core dump on /teardown.
* [MESOS-3738] - Mesos health check is invoked incorrectly when Mesos slave is within the docker container.
* [MESOS-3773] - RegistryClientTest.SimpleGetBlob is flaky.
* [MESOS-3834] - slave upgrade framework checkpoint incompatibility.
* [MESOS-4031] - slave crashed in cgroupstatistics().
* [MESOS-4069] - libevent_ssl_socket assertion fails.
* [MESOS-4071] - Master crash during framework teardown (Check failed: total.resources.contains(slaveId)).
* [MESOS-4106] - The health checker may fail to inform the executor to kill an unhealthy task after max_consecutive_failures.
* [MESOS-4283] - Accept 3-field version of HDFS du output.
* [MESOS-4311] - Protobuf parse should pass error messages when parsing nested JSON.
* [MESOS-4409] - MasterTest.MaxCompletedFrameworksFlag is flaky.
* [MESOS-4518] - MasterTest.MaxCompletedTasksPerFrameworkFlag is flaky.
* [MESOS-4582] - state.json serving duplicate "active" fields.
* [MESOS-4711] - Race condition in libevent poll implementation causes crash.
* [MESOS-4754] - The "executors" field is exposed under a backwards incompatible schema.
* [MESOS-4979] - os::rmdir does not handle special files (e.g., device, socket).
* [MESOS-5021] - Memory leak in subprocess when 'environment' argument is provided.
** Improvement
* [MESOS-920] - Set GLOG_drop_log_memory=false in environment prior to logging initialization.
* [MESOS-2353] - Improve performance of the state.json endpoint for large clusters.
* [MESOS-4302] - Offer filter timeouts are ignored if the allocator is slow or backlogged.
* [MESOS-4566] - Avoid unnecessary temporary `std::string` constructions and copies in `jsonify`.
* [MESOS-4687] - Implement reliable floating point for scalar resources.
** Task
* [MESOS-4237] - Introduce `jsonify` to stout.
* [MESOS-4238] - Update `Master::Http::state` to use the `jsonify` facility.
* [MESOS-4239] - Update relevant libprocess components to support the `jsonify` facility.
* [MESOS-4435] - Update `Master::Http::stateSummary` to use `jsonify`.
Release Notes - Mesos - Version 0.24.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-2986] - Docker version output is not compatible with Mesos
* [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are broken
Release Notes - Mesos - Version 0.24.0
--------------------------------------
This release contains the experimental v1 scheduler HTTP API, which allows framework
schedulers to send HTTP requests to the master endpoint ('/api/v1/scheduler')
without the need for a driver.
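To make the driver-less flow more concrete, here is a minimal Python sketch of the two pieces a client needs: the JSON body of a SUBSCRIBE call, and a decoder for the streamed response. It assumes the event stream uses RecordIO framing (each record is its length in bytes, a newline, then the record itself) and that the call's field names mirror the v1 scheduler protobufs; the function names are hypothetical and no HTTP request is actually sent.

```python
import json


def build_subscribe_call(user, name):
    """Return the JSON body for a SUBSCRIBE call posted to /api/v1/scheduler.

    Assumption: field names follow the v1 scheduler Call protobuf.
    """
    return json.dumps({
        "type": "SUBSCRIBE",
        "subscribe": {"framework_info": {"user": user, "name": name}},
    })


def decode_recordio(data):
    """Split a byte string of '<length>\\n<record>' frames into records."""
    records = []
    pos = 0
    while pos < len(data):
        newline = data.index(b"\n", pos)   # end of the length prefix
        length = int(data[pos:newline])    # record size in bytes
        start = newline + 1
        end = start + length
        if end > len(data):
            raise ValueError("truncated record")
        records.append(data[start:end])
        pos = end
    return records
```

A real client would read the response incrementally and buffer partial frames; this sketch decodes a complete buffer only, which keeps the framing logic easy to follow.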
The release also includes these features:
* [MESOS-336] - Mesos slave should cache executors.
Additional API Changes:
* [MESOS-2293] - Implement the scheduler endpoint on master.
* [MESOS-3135] - Publish MasterInfo to ZK using JSON.
Binary API Changes (e.g., new flags):
* [MESOS-3154] - Enable Mesos Agent Node to use arbitrary script / module to
figure out IP, HOSTNAME.
* [MESOS-809] - External control of the ip that Mesos components publish to
zookeeper.
Deprecations:
* [MESOS-2736] - MasterInfo `ip`, `port` and `hostname` are deprecated in
favor of using the `address` field (see `Address` protobuf).
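For consumers that read MasterInfo as JSON (see MESOS-3135 above), the deprecation suggests preferring the nested `address` object. The Python sketch below shows one possible way to do that; the exact JSON layout (`address.hostname`, `address.ip`, `address.port`) is an assumption based on the `Address` protobuf, and the helper `master_endpoint` is hypothetical.

```python
import json


def master_endpoint(master_info_json):
    """Return (host, port) from a MasterInfo JSON document.

    Prefers the `address` field and falls back to the deprecated
    top-level `hostname` and `port` fields when it is absent.
    """
    info = json.loads(master_info_json)
    address = info.get("address")
    if address is not None:
        # Assumption: `address` carries hostname, ip and port.
        host = address.get("hostname") or address.get("ip")
        return host, address["port"]
    # Deprecated fields; the top-level `ip` is deliberately not used here.
    return info["hostname"], info["port"]
```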
Work In Progress:
* Support for 'Image' field added to 'Volume' and 'ContainerInfo' protobufs
is incomplete.
This release also includes several bug fixes and stability improvements.
All Issues:
** Bug
* [MESOS-2166] - PerfEventIsolatorTest.ROOT_CGROUPS_Sample requires 'perf' to be installed
* [MESOS-2337] - __init__.py not getting installed in $PREFIX/lib/pythonX.Y/site-packages/mesos
* [MESOS-2480] - Protobuf jar is required for unbundled protobuf regardless of --disable-java flag.
* [MESOS-2493] - google glog link is incorrect
* [MESOS-2497] - Create synchronous validations for Calls
* [MESOS-2552] - C++ Scheduler library should send HTTP Calls to master
* [MESOS-2559] - Do not use RunTaskMessage.framework_id.
* [MESOS-2660] - ROOT_CGROUPS_Listen and ROOT_IncreaseRSS tests are flaky
* [MESOS-2862] - mesos-fetcher won't fetch uris which begin with a " "
* [MESOS-2868] - --attributes flag in slave cannot take a value with ':'
* [MESOS-2882] - Duplicate name-value env-vars in '-e' option of docker run
* [MESOS-2900] - Display capabilities in state.json
* [MESOS-2989] - Changing to "framework" from "framwork"
* [MESOS-3001] - Create a "demo" HTTP API client
* [MESOS-3002] - Rename Option<T>::get(const T& _t) to getOrElse() broke network isolator
* [MESOS-3027] - Compiler warning in stout subcommand tests
* [MESOS-3058] - Cgroup tests rely on cgroups::get() returning in a specific order
* [MESOS-3079] - `sudo make distcheck` fails on Ubuntu 14.04 (and possibly other OSes too)
* [MESOS-3121] - Always disable SSLv2
* [MESOS-3124] - Updating persistent volumes after slave restart is problematic.
* [MESOS-3138] - PersistentVolumeTest.SlaveRecovery test fails on OSX
* [MESOS-3141] - Compiler warning when mocking function type has an enum return type.
* [MESOS-3143] - Disable endpoints rule fails to recognize HTTP path delegates
* [MESOS-3148] - Resolve issue with hanging tests with Zookeeper
* [MESOS-3168] - MesosZooKeeperTest fixture can have side effects across tests
* [MESOS-3170] - 0.23 Build fails when compiling against -lsasl2 which has been statically linked
* [MESOS-3175] - subprocess_tests.cpp:598 delete used but allocated with new[]
* [MESOS-3178] - Perform a self bind mount of rootfs itself in fs::chroot::enter.
* [MESOS-3192] - ContainerInfo::Image::AppC::id should be optional
* [MESOS-3195] - Fix master metrics for scheduler calls
* [MESOS-3197] - MemIsolatorTest/{0,1}.MemUsage fails on OS X
* [MESOS-3201] - Libev handle_async can deadlock with run_in_event_loop
* [MESOS-3203] - MasterAuthorizationTest.DuplicateRegistration test is flaky
* [MESOS-3204] - PortMappingIsolatorProcess shell script can silently fail
* [MESOS-3207] - C++ style guide is not rendered correctly (code section syntax disregarded)
* [MESOS-3209] - parameterize allocator benchmark by framework count
* [MESOS-3234] - enable automake maintainer mode
* [MESOS-3237] - HTTP requests with nested path are not properly handled by libprocess
* [MESOS-3238] - Master endpoint help message is incorrect
* [MESOS-3260] - SchedulerTest.* are broken on OSX and CentOS
* [MESOS-3262] - HTTPTest.NestedGet is flaky
* [MESOS-3263] - SchedulerTask.KillTest fails for JSON Requests
* [MESOS-3267] - JSON serialization/deserialization of bytes is incorrect
* [MESOS-3274] - Build error with port mapping isolator
* [MESOS-3275] - ContentType/HttpApiTest.UpdatePidToHttpSchedulerWithoutForce is flaky
* [MESOS-3284] - JSON representation of Protobuf should use base64 encoding for 'bytes' fields.
* [MESOS-3287] - downloadWithHadoop tries to access Error() for a valid Try<bool>
* [MESOS-3290] - Master should drop HTTP calls when it's recovering
** Documentation
* [MESOS-1838] - Add documentation for Authentication
* [MESOS-2555] - Document issue with slave recovery when using systemd.
* [MESOS-3087] - Typos in oversubscription doc
* [MESOS-3167] - Design doc for versioning the HTTP API
* [MESOS-3278] - Add the revocable metrics information in monitoring doc
* [MESOS-3281] - Create a user doc for Scheduler HTTP API
* [MESOS-3286] - Revocable metrics information is missing for slave node
** Improvement
* [MESOS-2350] - Add support for MesosContainerizerLaunch to chroot to a specified path
* [MESOS-2794] - Implement filesystem isolators
* [MESOS-2795] - Introduce filesystem provisioner abstraction
* [MESOS-2798] - Export statistics on "unevictable" memory
* [MESOS-2800] - Rename Option<T>::get(const T& _t) to getOrElse() and refactor the original function
* [MESOS-2841] - FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata
* [MESOS-2880] - Add FrameworkInfo.capabilities on framework re-registration
* [MESOS-2902] - Enable Mesos to use arbitrary script / module to figure out IP, HOSTNAME
* [MESOS-2924] - Allow simple construction via initializer list on hashset.
* [MESOS-2946] - Authorizer Module: Interface design
* [MESOS-2947] - Authorizer Module: Implementation, Integration & Tests
* [MESOS-2951] - Inefficient container usage collection
* [MESOS-2965] - Add implicit cast to string operator to Path.
* [MESOS-2967] - Missing doxygen documentation for libprocess socket interface
* [MESOS-3020] - Expose major, minor and patch components from stout Version
* [MESOS-3054] - update gitignore
* [MESOS-3093] - Support HTTPS requests in libprocess
* [MESOS-3112] - Fetcher should perform cache eviction based on cache file usage patterns.
* [MESOS-3118] - Remove pthread specific code from Stout
* [MESOS-3119] - Remove pthread specific code from Libprocess
* [MESOS-3120] - Remove pthread specific code from Mesos
* [MESOS-3127] - Improve task reconciliation documentation.
* [MESOS-3173] - Mark Path::basename, Path::dirname as const functions.
* [MESOS-3182] - Make Master::registerFramework() and Master::reregisterFramework() call into Master::subscribe()
** Story
* [MESOS-2860] - Create the basic infrastructure to handle /scheduler endpoint
* [MESOS-3142] - As a Developer I want a better way to run shell commands
* [MESOS-3211] - As a Python developer I want a simple way to obtain information about Master from ZooKeeper
* [MESOS-3212] - As a Java developer I want a simple way to obtain information about Master from ZooKeeper
** Task
* [MESOS-2294] - Implement the Events stream on master for Call endpoint
* [MESOS-2640] - Remove old frameworks and ec2 scripts from core Mesos repository
* [MESOS-2910] - Add an Event message handler to scheduler driver
* [MESOS-2913] - Scheduler driver should send Call messages to the master
* [MESOS-2933] - Pass slave's total resources to the ResourceEstimator and QoSController via Slave::usage().
* [MESOS-2961] - Add cpuacct subsystem utils to cgroups
* [MESOS-3012] - Support existing message passing optimization with Event/Call.
* [MESOS-3067] - Implement a streaming response decoder for events stream
* [MESOS-3088] - Update scheduler driver to send SUBSCRIBE call
* [MESOS-3089] - Update scheduler library to send REQUEST call
* [MESOS-3101] - Standardize separation of Windows/Linux-specific OS code
* [MESOS-3102] - Separate OS-specific code in the stout library
* [MESOS-3130] - Custom isolators should implement Isolator instead of IsolatorProcess.
* [MESOS-3131] - Master should send heartbeats on the subscription connection
* [MESOS-3132] - Allow slave to forward messages through the master for HTTP schedulers.
* [MESOS-3145] - Using an unresolvable hostname crashes the framework on registration
* [MESOS-3149] - Use setuptools to install python cli package
* [MESOS-3162] - Provide a means to check http connection equality for streaming connections.
* [MESOS-3179] - Create a test abstraction for preparing test rootfs.
* [MESOS-3194] - Implement a 'read-only' AppC Image Store
** Wish
* [MESOS-3276] - Add Scrapinghub to the Powered By Mesos page
Release Notes - Mesos - Version 0.23.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-2986] - Docker version output is not compatible with Mesos
* [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are broken
Release Notes - Mesos - Version 0.23.0
--------------------------------------
This release contains new features:
* [MESOS-1585] - Per-container network isolation: bandwidth capping and unique
egress flow to reduce buffer bloat. Refer to the network monitoring and
isolation documentation for more information.
* [MESOS-2115] - Dockerized slaves will properly recover Docker containers
upon failover.
Plus an upgrade to the minimum required compiler versions:
* [MESOS-2604] - Upgrade minimum required compilers to GCC 4.8+ or clang 3.5+.
And experimental support for the following features:
* [MESOS-336] - Fetcher Caching of executor/task binaries. Refer to the
fetcher documentation for more information.
* [MESOS-354] - Support for launching tasks/executors on revocable resources.
These resources can be revoked by Mesos at any time, causing the tasks using
them to be throttled or preempted.
* [MESOS-910] - SSL encryption via libevent. Refer to the SSL documentation
for instructions on building and enabling SSL.
* [MESOS-1554] - Frameworks can create Persistent Volumes from disk resources.
Refer to the persistent volume documentation for more information.
* [MESOS-2018] - Frameworks can dynamically reserve resources for their role.
Refer to the reservation documentation for more information.
Binary API Changes (e.g. new flags):
* [MESOS-1913] - Create libevent/SSL-backed Socket implementation.
* [MESOS-2110] - Configurable Ping Timeouts.
* [MESOS-2155] - Make docker containerizer killing orphan containers optional.
* [MESOS-2832] - Enable configuring Mesos with environment variables without
having them leak to tasks launched.
Framework API Changes:
* [MESOS-1127] - Implement the protobufs for the scheduler API.
* [MESOS-2097] - Update Resource protobuf with DiskInfo.
* [MESOS-2191] - Add ContainerId to the TaskStatus message.
* [MESOS-2292] - Implement Call/Event protobufs for Executor.
* [MESOS-2475] - Add the Resource::ReservationInfo protobuf message.
* [MESOS-2614] - Update name, hostname, failover_timeout, and webui_url in
master on framework re-registration.
* [MESOS-2654] - A new 'capabilities' field has been added to FrameworkInfo
to opt in to revocable resources.
* [MESOS-2691] - Update Resource message to include revocable resources.
* [MESOS-2955] - Introduce acceptOffers scheduler driver API for performing
operations on Offers.
* [MESOS-2957] - Add version to MasterInfo.
Web UI Changes:
* [MESOS-2104] - Correct naming of cgroup memory statistics.
* [MESOS-2485] - Added master metrics for slave removal reasons.
* [MESOS-2620] - Implement a mechanism which allows access control of
endpoints.
* [MESOS-2743] - Include ExecutorInfos for custom executors in
master/state.json.
* [MESOS-2775] - Added slave metrics for revocable resources.
* [MESOS-2776] - Added master metrics for revocable resources.
Module API Changes:
* [MESOS-2050] - Revise Authenticator interface.
* [MESOS-2351] - Enable label and environment decorators (hooks) to remove
label and environment entries.
* [MESOS-2884] - Allow isolators to specify required namespaces.
New Module/Hook interfaces:
* [MESOS-2160] - Add support for allocator modules.
* [MESOS-2650] - Modularize the Resource Estimator.
Deprecations:
* [MESOS-2058] - Remove stats.json endpoints for Master and Slave.
* [MESOS-2697] - '/master/shutdown' endpoint is deprecated in favor of
the new '/master/teardown' endpoint.
This release also includes several bug fixes and stability improvements.
All Issues:
** Bug
* [MESOS-328] - HTTP headers should be considered case-insensitive.
* [MESOS-719] - missing-call-to-setgroups
* [MESOS-757] - The post-reviews.py script hangs if HTTP authentication has expired
* [MESOS-1303] - ExamplesTest.{TestFramework, NoExecutorFramework} flaky
* [MESOS-1690] - Expose metric for container destroy failures
* [MESOS-1795] - Assertion failure in state abstraction crashes JVM
* [MESOS-1825] - Support the webui over HTTPS.
* [MESOS-2016] - docker_name_prefix is too generic
* [MESOS-2020] - mesos should send docker failure messages to scheduler
* [MESOS-2161] - AbstractState JNI check fails for Marathon framework
* [MESOS-2165] - When cyrus sasl MD5 isn't installed configure passes,
tests fail without any output
* [MESOS-2183] - docker containerizer doesn't work when mesos-slave is
running in a container
* [MESOS-2199] - Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
* [MESOS-2309] - Mesos rejects ExecutorInfo as incompatible when there is no
functional difference
* [MESOS-2367] - Improve slave resiliency in the face of orphan containers
* [MESOS-2373] - DRFSorter needs to distinguish resources from different slaves.
* [MESOS-2387] - SlaveTest.TaskLaunchContainerizerUpdateFails is flaky
* [MESOS-2401] - MasterTest.ShutdownFrameworkWhileTaskRunning is flaky
* [MESOS-2402] - MesosContainerizerDestroyTest.LauncherDestroyFailure is flaky
* [MESOS-2403] - MasterAllocatorTest/0.FrameworkReregistersFirst is flaky
* [MESOS-2412] - Potential memleak(s) in stout/os.hpp
* [MESOS-2426] - Developer Guide improvements
* [MESOS-2436] - Adapt unit test relying on non-checkpointing slaves
* [MESOS-2450] - Hardcoded constants in libprocess should be replaced by
their INADDR_XXX equivalents
* [MESOS-2457] - Update post-reviews to rbtools in 'submit your patch' of
developer's guide
* [MESOS-2464] - Authentication failure may lead to slave crash
* [MESOS-2469] - Mesos master/slave should be able to bind to 127.0.0.1 if
explicitly requested
* [MESOS-2479] - Task filter input disappears entirely once the search query
yields no results
* [MESOS-2481] - Update CHANGELOG and upgrades doc about the new acceptOffers API.
* [MESOS-2494] - Clang build broken with "expression result unused" warning
* [MESOS-2514] - Change the default leaf qdisc to fq_codel inside containers
* [MESOS-2530] - Alloc-dealloc-mismatch in OsSendfileTest.sendfile
* [MESOS-2534] - PerfTest.ROOT_SampleInit test fails.
* [MESOS-2538] - Remove unnecessary default flags from PortMappingMesosTest.
* [MESOS-2548] - new `make distcheck` failures inside a docker container
* [MESOS-2557] - Do not pass FrameworkID to Framework constructor for Master/Slave.
* [MESOS-2558] - Mark RunTaskMessage.framework_id as optional
* [MESOS-2566] - Fix the Attributes and Resources documentation
* [MESOS-2592] - The sandbox directory is not chown'ed if the fetcher doesn't run
* [MESOS-2598] - Slave state.json frameworks.executors.queued_tasks wrong format?
* [MESOS-2601] - Tasks are not removed after recovery from slave and mesos containerizer
* [MESOS-2603] - Permissions and ownership of persistent volumes are not set correctly.
* [MESOS-2611] - Get Started about CentOS 6.5 is wrong
* [MESOS-2627] - ExamplesTest.PersistentVolumeFramework is flaky
* [MESOS-2636] - Segfault in inline Try<IP> getIP(const std::string& hostname, int family)
* [MESOS-2656] - Slave should send status update immediately when container
launch fails.
* [MESOS-2659] - update pthread and python autoconf macros
* [MESOS-2660] - ROOT_CGROUPS_Listen test is flaky
* [MESOS-2668] - Slave fails to recover when there are still processes left
in its cgroup
* [MESOS-2671] - Port mapping isolator causes SIGABRT during slave recovery.
* [MESOS-2672] - ContainerizerTest.ROOT_CGROUPS_BalloonFramework flaky
* [MESOS-2690] - --enable-optimize build fails with maybe-uninitialized
* [MESOS-2748] - /help generated links point to wrong URLs
* [MESOS-2764] - Allow Resource Estimator to get Resource Usage information.
* [MESOS-2778] - Non-POD static variables used in fq_codel and ingress.
* [MESOS-2781] - getQdisc function in routing::queueing::internal.cpp returns
incorrect qdisc
* [MESOS-2787] - mesos-ps fails with "KeyError: 'mem_rss_bytes'"
* [MESOS-2788] - mesos-ps truncates memory statistics
* [MESOS-2792] - Remove duplicate literals in ingress & fq_codel queueing disciplines
* [MESOS-2808] - Slave should call into resource estimator whenever it wants
to forward oversubscribed resources
* [MESOS-2809] - Mesos fails to launch Docker images built with large Dockerfiles
* [MESOS-2815] - Flaky test: FetcherCacheHttpTest.HttpCachedSerialized
* [MESOS-2835] - Fix typos in source comments
* [MESOS-2866] - Slave should send oversubscribed resource information after
master failover.
* [MESOS-2869] - OversubscriptionTest.FixedResourceEstimator is flaky
* [MESOS-2873] - style hook prevents valid markdown files from getting committed
* [MESOS-2874] - Convert PortMappingStatistics to use automatic JSON encoding/decoding
* [MESOS-2877] - Allow libprocess firewall to have more control over the
responses sent on failures
* [MESOS-2881] - Linker error when building Mesos with unbundled dependencies
* [MESOS-2889] - Add SSL switch to python configuration
* [MESOS-2890] - Sandbox URL doesn't work in web-ui when using SSL
* [MESOS-2891] - Performance regression in hierarchical allocator.
* [MESOS-2894] - web UI shows "YYYY" for year instead of year
* [MESOS-2904] - Add slave metric to count container launch failures
* [MESOS-2914] - Port mapping isolator should cleanup unknown orphan containers
after all known orphan containers are recovered during recovery.
* [MESOS-2917] - Specify correct libnl version for configure check
* [MESOS-2919] - Framework can overcommit oversubscribable resources during
master failover.
* [MESOS-2925] - Invalid usage of ATOMIC_FLAG_INIT in member initialization
* [MESOS-2932] - There is a typo in docs/docker-containerizer.md file
* [MESOS-2943] - mesos fails to compile under mac when libssl and libevent are enabled
* [MESOS-2962] - Slave fails with Abort stacktrace when DNS cannot resolve hostname
* [MESOS-2973] - SSL tests don't work with --gtest_repeat
* [MESOS-2975] - SSL tests don't work with --gtest_shuffle
* [MESOS-2986] - Docker version output is not compatible with Mesos
* [MESOS-2991] - Compilation Error on Mac OS 10.10.4 with clang 3.5.0
* [MESOS-2993] - Document per container unique egress flow and network queueing statistics
* [MESOS-2996] - Failing Docker tests on CentOS Linux release 7.1.1503.
* [MESOS-2997] - SSL connection failure causes failed CHECK.
* [MESOS-3005] - SSL tests can fail depending on hostname configuration
* [MESOS-3025] - 0.22.x scheduler driver drops 0.23.x reconciliation status
updates due to missing StatusUpdate.uuid.
* [MESOS-3034] - ReservationTest.CompatibleCheckpointedResources is flaky
* [MESOS-3055] - Master doesn't properly handle SUBSCRIBE call
* [MESOS-3060] - FTP response code for success not recognized by fetcher.
** Documentation
* [MESOS-2205] - Add user documentation for reservations
* [MESOS-2395] - Slave recovery documentation shows incorrect recover flag
* [MESOS-2416] - Update or delete release guide in confluence wiki
* [MESOS-2525] - Missing information in Python interface launchTasks scheduler method
* [MESOS-2616] - Update C++ style guide on variable naming.
* [MESOS-2621] - Create documentation for observability metrics
* [MESOS-2622] - Document the semantic change in decorator return values
* [MESOS-2783] - document the fetcher
* [MESOS-2886] - Capture some testing patterns we use in a doc
* [MESOS-2942] - Create documentation for using SSL
* [MESOS-2992] - Improve attribute documentation to reflect current state
* [MESOS-3033] - Add user guide for oversubscription
** Improvement
* [MESOS-692] - Reservations are not reported in master's state.json
* [MESOS-994] - Add an Option<string> os::getenv() to stout
* [MESOS-1733] - Change the stout path utility to declare a single, variadic
'join' function instead of several separate declarations of
various discrete arities
* [MESOS-1991] - Remove dynamic allocation from Option
* [MESOS-2023] - mesos-execute should allow setting environment variables
* [MESOS-2057] - Concurrency control for fetcher cache
* [MESOS-2069] - Basic fetcher cache functionality
* [MESOS-2070] - Implement simple slave recovery behavior for fetcher cache
* [MESOS-2072] - Fetcher cache eviction
* [MESOS-2074] - Fetcher cache test fixture
* [MESOS-2103] - Expose number of processes and threads in a container
* [MESOS-2111] - Add build instructions for OSX in getting started
* [MESOS-2136] - Expose per-cgroup memory pressure
* [MESOS-2277] - Document undocumented HTTP endpoints
* [MESOS-2323] - write flags to log at startup
* [MESOS-2332] - Report per-container metrics for network bandwidth throttling
* [MESOS-2333] - Securing Sandboxes via Filebrowser Access Control
* [MESOS-2340] - Add ability to decode JSON serialized MasterInfo from ZK
* [MESOS-2374] - Support relative host paths for container volumes
* [MESOS-2392] - Rate limit slaves removals during master recovery.
* [MESOS-2400] - Improve NsTest.ROOT_setns
* [MESOS-2438] - Improve support for streaming HTTP Responses in libprocess.
* [MESOS-2454] - Add support for /proc/self/mountinfo on Linux
* [MESOS-2461] - Slave should provide details on processes running in its cgroups
* [MESOS-2462] - Add option for Subprocess to set a death signal for the forked child
* [MESOS-2507] - Performance issue in the master when a large number of
slaves are registering.
* [MESOS-2519] - Log IP addresses from HTTP requests
* [MESOS-2527] - Add default bind to socket
* [MESOS-2528] - Symlink the namespace handle with ContainerID for the port
mapping isolator.
* [MESOS-2547] - Cleanup stale bind mounts for port mapping isolator during
slave recovery.
* [MESOS-2549] - Remove non-variadic strings::format
* [MESOS-2550] - Mesos doesn't compile with clang 3.6
* [MESOS-2565] - Clean up style and comments in modules.
* [MESOS-2571] - Expose Memory Pressure in MemIsolator
* [MESOS-2573] - Use Memory Test Helper to improve some test code.
* [MESOS-2595] - Create docker executor
* [MESOS-2608] - test-framework should support principal only credential
* [MESOS-2609] - Move StatusUpdateStream implementation to a compilation unit
* [MESOS-2624] - "configure" should fail when "patch" is not available.
* [MESOS-2653] - Slave should act on correction events from QoS controller
* [MESOS-2666] - use standard compiler detection macros
* [MESOS-2680] - Update modules doc with hook usage example
* [MESOS-2693] - Printing a resource should show information about
reservation, disk etc
* [MESOS-2709] - Design Master discovery functionality for HTTP-only clients
* [MESOS-2716] - Add non-const reference version of Option<T>::get.
* [MESOS-2729] - Update DRF sorter to update total resources
* [MESOS-2745] - Add 'Path' to stout's user guide
* [MESOS-2752] - Add HTB queueing discipline wrapper class
* [MESOS-2784] - Added constexpr to C++11 whitelist.
* [MESOS-2793] - Add support for container rootfs to Mesos isolators
* [MESOS-2801] - Remove dynamic allocation from Future<T>
* [MESOS-2804] - Log framework capabilities in the master.
* [MESOS-2805] - Make synchronized as primary form of synchronization.
* [MESOS-2836] - Report per-container metrics for network bandwidth
throttling to the slave
* [MESOS-2837] - Decode network statistics from mesos-network-helper
* [MESOS-2870] - Add validation capability to stout Flags
* [MESOS-2888] - Add SSL socket tests
* [MESOS-2928] - Update stout #include headers
* [MESOS-2940] - Reconciliation is expensive for large numbers of tasks.
* [MESOS-2958] - Update Call protobuf to move top level FrameworkInfo inside Subscribe
* [MESOS-2966] - socket::peer() and socket::address() might fail with SSL enabled
** Story
* [MESOS-1552] - Mesos javadoc should include .proto javadoc
* [MESOS-2551] - C++ Scheduler library should send Call messages to Master
* [MESOS-2746] - As a Framework User I want to be able to discover my Task's IP
** Task
* [MESOS-1598] - Add advanced shaping controls to routing library
* [MESOS-1856] - Support specifying libnl3 install location.
* [MESOS-2031] - Manage persistent directories on slave.
* [MESOS-2085] - Add support for encrypted and non-encrypted communication in
parallel for cluster upgrade
* [MESOS-2108] - Add configure flag or environment variable to enable
SSL/libevent Socket
* [MESOS-2123] - Document changes in C++ Resources API in CHANGELOG.
* [MESOS-2139] - Enable the master to handle reservation operations
* [MESOS-2213] - Custom allocators should implement Allocator instead of
AllocatorProcess
* [MESOS-2233] - Run ASF CI mesos builds inside docker
* [MESOS-2289] - Design doc for the HTTP API
* [MESOS-2290] - Move all scheduler driver validations to master
* [MESOS-2291] - Move executor driver validations to slave
* [MESOS-2348] - Introduce a new filter abstraction for Resources.
* [MESOS-2366] - MasterSlaveReconciliationTest.ReconcileLostTask is flaky
* [MESOS-2375] - Remove the checkpoint variable entirely from slave/flags.hpp
* [MESOS-2404] - Add an example framework to test persistent volumes.
* [MESOS-2405] - Add user doc for using persistent volumes.
* [MESOS-2422] - Use fq_codel qdisc for egress network traffic isolation
* [MESOS-2427] - Add Java binding for the acceptOffers API.
* [MESOS-2428] - Add Python bindings for the acceptOffers API.
* [MESOS-2476] - Enable Resources to handle Resource::ReservationInfo
* [MESOS-2477] - Enable Resources::apply to handle reservation operations.
* [MESOS-2489] - Enable a framework to perform reservation operations.
* [MESOS-2491] - Persist the reservation state on the slave
* [MESOS-2496] - Make description consistent when adding flags
* [MESOS-2563] - Add license blobs to Java JNI cpp files
* [MESOS-2596] - Update allocator docs
* [MESOS-2597] - Choose allocator based on master flag and loaded modules
* [MESOS-2615] - Pipe 'updateFramework' path from master to Allocator to
support framework re-registration
* [MESOS-2629] - Update style guide to disallow capture by reference of temporaries
* [MESOS-2630] - Remove capture by reference of temporaries in Stout
* [MESOS-2631] - Remove capture by reference of temporaries in libprocess
* [MESOS-2649] - Implement Resource Estimator
* [MESOS-2652] - Update Mesos containerizer to understand revocable cpu resources
* [MESOS-2655] - Implement a stand alone test framework that uses revocable
cpu resources
* [MESOS-2661] - Remove pre-C++11 codepaths
* [MESOS-2662] - Remove <stout/memory.hpp> and switch from memory:: to std::
* [MESOS-2663] - Remove <stout/tuple.hpp> and switch from tuples:: to std::
* [MESOS-2670] - Update existing lambdas to meet style guide
* [MESOS-2677] - Add unrestricted unions to style guide
* [MESOS-2689] - Slave should forward oversubscribable resources to the master
* [MESOS-2730] - Add a new API call to the allocator to update
oversubscribed resources
* [MESOS-2733] - Update master to handle oversubscribed resource estimate
from the slave
* [MESOS-2734] - Update allocator to allocate revocable resources
* [MESOS-2739] - Remove dynamic allocation from Stout Try<T>
* [MESOS-2740] - Remove dynamic allocation from Stout Result<T>
* [MESOS-2753] - Master should validate tasks using oversubscribed resources
* [MESOS-2761] - Delegating constructors are not allowed by styleguide
* [MESOS-2762] - Explicitly-defaulted functions are not allowed by styleguide
* [MESOS-2770] - Slave should forward total amount of oversubscribed
resources to the master
* [MESOS-2773] - Pass callback to the resource estimator to retrieve
ResourceUsage from Resource Monitor on demand.
* [MESOS-2791] - Create a FixedResourceEstimator to return fixed amount of
oversubscribable resources.
* [MESOS-2807] - As a developer I need an easy way to convert MasterInfo
protobuf to/from JSON
* [MESOS-2818] - Pass 'allocated' resources for each executor to the
resource estimator.
* [MESOS-2823] - Pass callback to the QoS Controller to retrieve
ResourceUsage from Resource Monitor on demand.
* [MESOS-2892] - Add benchmark for hierarchical allocator.
* [MESOS-2893] - Add queue size metrics for the allocator.
* [MESOS-2898] - Write tests for new JSON (ZooKeeper) functionality
** Wish
* [MESOS-2510] - Add a function which tests if a JSON object is contained in
another JSON object
Release Notes - Mesos - Version 0.22.2
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-2986] - Docker version output is not compatible with Mesos
Release Notes - Mesos - Version 0.22.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-1795] - Assertion failure in state abstraction crashes JVM.
* [MESOS-2161] - AbstractState JNI check fails for Marathon framework.
* [MESOS-2583] - Tasks getting stuck in staging.
* [MESOS-2592] - The sandbox directory is not chown'ed if the
fetcher doesn't run.
* [MESOS-2601] - Tasks are not removed after recovery from slave and mesos
containerizer
* [MESOS-2643] - Python scheduler driver disables implicit acknowledgments by
default.
* [MESOS-2668] - Slave fails to recover when there are still processes left in
its cgroup
** Improvement
* [MESOS-2461] - Slave should provide details on processes running in its
cgroups
** Task
* [MESOS-2614] - Update name, hostname, failover_timeout, and webui_url
in master on framework re-registration
Release Notes - Mesos - Version 0.22.0
--------------------------------------
This release contains several new features:
* Support for explicitly sending status update acknowledgements from
schedulers; refer to the upgrades document for upgrading schedulers.
* Rate limiting slave removal, to safeguard against unforeseen bugs leading to
widespread slave removal.
* Disk quota isolation in Mesos containerizer; refer to the containerization
documentation to enable disk quota monitoring and enforcement.
* Support for module hooks in task launch sequence. Refer to the modules
documentation for more information.
* Anonymous modules: a new kind of module that does not receive any callbacks
but coexists with its parent process.
* New service discovery info in task info allows framework users to specify
discoverability of tasks for external service discovery systems. Refer to
the framework development guide for more information.
* New '--external_log_file' flag to serve external logs through the Mesos web UI.
* New '--gc_disk_headroom' flag to control maximum executor sandbox age.
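The idea behind rate limiting slave removal can be illustrated with a minimal
sketch. This is not the Mesos implementation: the class name
`RemovalRateLimiter`, its parameters, and the fixed-interval policy are all
assumptions chosen for illustration. It only shows how a limiter can hold back
a burst of removals so that a bug cannot remove many slaves at once.

```python
import time


class RemovalRateLimiter:
    """Hypothetical limiter: at most `permits` removals per `per_seconds`."""

    def __init__(self, permits, per_seconds, clock=time.monotonic):
        # Minimum spacing between two permitted removals, in seconds.
        self.interval = per_seconds / permits
        self.clock = clock
        # The first removal is allowed immediately.
        self.next_allowed = clock()

    def try_acquire(self):
        """Return True if a removal may proceed now, otherwise False."""
        now = self.clock()
        if now < self.next_allowed:
            return False
        self.next_allowed = now + self.interval
        return True
```

A caller would check `try_acquire()` before each removal and defer the removal
when it returns False. The `clock` parameter exists only so the sketch can be
exercised with a fake clock.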
API Changes:
* [MESOS-1143] - TASK_ERROR is now sent instead of TASK_LOST when rescheduling
a task should not be attempted.
* [MESOS-2086] - Update messages.proto to use a raw bytestream instead of a
string for AuthenticationStartMessage.
* [MESOS-2120] - Task labels: key-value pairs in task info that follow the task
through its life-cycle.
* [MESOS-2185] - Slave state.json will now include custom resource types in
addition to first-class resource types.
* [MESOS-2208] - Service discovery info for tasks and executors.
* [MESOS-2322] - All arguments can now read their values from a file; just
specify --name=file://path/to/file.
* [MESOS-2347] - The C++/Java/Python APIs have been updated to provide the
ability for schedulers to explicitly send acknowledgements. TaskStatus now
includes a UUID to enable this.
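The `--name=file://path/to/file` convention from MESOS-2322 can be sketched as
follows. This is an illustration only, not Mesos code: the helper name
`resolve_flag_value` is hypothetical, and returning the file contents verbatim
(without trimming whitespace) is an assumption.

```python
def resolve_flag_value(value):
    """Return the file's contents if `value` starts with file://, else `value`.

    Hypothetical helper; actual flag loading may treat whitespace and errors
    differently.
    """
    prefix = "file://"
    if value.startswith(prefix):
        # Everything after the prefix is treated as the path to read.
        with open(value[len(prefix):]) as f:
            return f.read()
    return value
```

With this sketch, `resolve_flag_value("file:///etc/mesos/credential")` would
read that (hypothetical) file, while a plain value such as `"5050"` is returned
unchanged.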
Deprecations:
* [MESOS-444] - Slave checkpoint flag has been removed as it will be enabled
for all slaves.
* [MESOS-1876] - Remove deprecated 'slave_id' field in ReregisterSlaveMessage.
* [MESOS-2058] - Deprecate stats.json endpoints for Master and Slave and
task status counts from state.json. See /help/metrics endpoint for more
information.
* [MESOS-2322] - Deprecated specifying JSON blobs to parse using an absolute
path to point at the filename.
This release also includes several bug fixes and stability improvements.
All Issues:
** Technical task
* [MESOS-2263] - Enable protobuf::write to handle google::protobuf::RepeatedPtrField<T>
* [MESOS-2264] - Enable protobuf::read to handle google::protobuf::RepeatedPtrField<T>
* [MESOS-2265] - Introduce an os::rename abstraction to stout.
* [MESOS-2266] - Introduce a checkpoint function to support google::protobuf::Repeated<T>
** Bug
* [MESOS-998] - Slave should wait until Containerizer::update() completes
successfully.
* [MESOS-1432] - Atomically set close-on-exec where possible.
* [MESOS-1708] - Using the wrong resource "name" should report a better error.
* [MESOS-1922] - Slave blocks on the fetcher after terminating an executor.
* [MESOS-2008] - MasterAuthorizationTest.DuplicateReregistration is flaky.
* [MESOS-2048] - Fix MesosContainerizerExecuteTest.IoRedirection test.
* [MESOS-2121] - Fix ProcTest.MultipleThreads flaky.
* [MESOS-2167] - Remove empty resource checker in master.
* [MESOS-2176] - Hierarchical allocator inconsistently accounts for
reserved resources.
* [MESOS-2177] - Create socket wrappers for different protocol families.
* [MESOS-2181] - Build failure - overloaded 'socket(int, __socket_type, int)'
is ambiguous.
* [MESOS-2185] - slave state endpoint does not contain all resources in
the resources field.
* [MESOS-2192] - libprocess fails to build under g++-4.6 - src/clock.cpp.
* [MESOS-2206] - Latest health status omitted during reconciliation.
* [MESOS-2225] - FaultToleranceTest.ReregisterFrameworkExitedExecutor is
flaky.
* [MESOS-2228] - SlaveTest.MesosExecutorGracefulShutdown is flaky.
* [MESOS-2232] - Suppress MockAllocator::transformAllocation() warnings.
* [MESOS-2236] - Compilation failure on GCC 4.4.7.
* [MESOS-2241] - DiskUsageCollectorTest.SymbolicLink test is flaky.
* [MESOS-2279] - Future callbacks should be cleared once the future has transitioned.
* [MESOS-2283] - SlaveRecoveryTest.ReconcileKillTask is flaky.
* [MESOS-2302] - FaultToleranceTest.SchedulerFailoverFrameworkMessage is flaky.
* [MESOS-2305] - Refactor validators in Master.
* [MESOS-2306] - MasterAuthorizationTest.FrameworkRemovedBeforeReregistration
is flaky.
* [MESOS-2313] - fix reviewboard setting so all users have same rbt settings.
* [MESOS-2319] - Unable to set --work_dir to a non /tmp device.
* [MESOS-2324] - MasterAllocatorTest/0.OutOfOrderDispatch is flaky.
* [MESOS-2325] - CPU busy loop in libprocess libev clock.
* [MESOS-2326] - Broken OSX Build after fixed bugs in CREATE/DESTROY operation
handlers.
* [MESOS-2328] - http::URL build error with clang 3.3.
* [MESOS-2344] - segfaults running make check from ev integration.
* [MESOS-2355] - MasterTest.SlavesEndpointTwoSlaves fails sometimes because
the master assigns the same ID to both slaves.
* [MESOS-2366] - Fixed a flaky reconciliation test.
* [MESOS-2377] - Fix leak in libevent's version EventLoop::delay.
* [MESOS-2381] - Put Authentication protobufs back in mesos.internal package.
* [MESOS-2390] - HADOOP_HOME no longer works with fetcher.
* [MESOS-2410] - Broken build on OS X 10.8.5 caused by mac_tests in stout.
* [MESOS-2414] - Java bindings segfault during framework shutdown.
* [MESOS-2420] - Fetcher tests fail to build on ubuntu 14.10.
* [MESOS-2447] - Mesos replicated log does not log the Action type name.
* [MESOS-2452] - The recovered executor directory points to the meta directory.
* [MESOS-2463] - Slave sends mutated copy of executorinfo to new elected
master.
* [MESOS-2486] - With unbundled dependencies Mesos doesn't build with
-Wl,--no-copy-dt-needed-entries.
* [MESOS-2499] - SOURCE_EXECUTOR not set properly in slave.cpp.
** Documentation
* [MESOS-1470] - Add operational documentation for running HA masters.
* [MESOS-2282] - developers guide is missing some details.
* [MESOS-2327] - Authorization docs incorrectly describe how to turn off
permissive mode.
* [MESOS-2391] - Provide user doc for the new posix disk isolator in Mesos
containerizer.
* [MESOS-2396] - Provide user doc for service discovery info.
** Epic
* [MESOS-2150] - Service discovery info for tasks and executors.
** Improvement
* [MESOS-1148] - Add support for rate limiting slave removal.
* [MESOS-1248] - Use JSON instead of our own format for passing URI information
to mesos-fetcher.
* [MESOS-1316] - Implement decent unit test coverage for the mesos-fetcher
tool.
* [MESOS-1587] - Report disk usage from MesosContainerizer.
* [MESOS-1588] - Enforce disk quota in MesosContainerizer.
* [MESOS-1711] - Create method for users to identify HDFS compatible protocols
in fetcher.cpp.
* [MESOS-1960] - Silence symbolic link to pre-commit in bootstrap.
* [MESOS-1974] - Refactor the C++ Resources abstraction for DiskInfo.
* [MESOS-2009] - Libprocess: Introduce mutex.
* [MESOS-2010] - Libprocess: Introduce enable_shared_from_this.
* [MESOS-2011] - Introduce mutex.
* [MESOS-2012] - Introduce enable_shared_from_this.
* [MESOS-2019] - Replace the ip and port pairs from the UPID class and
process namespace with Node class.
* [MESOS-2051] - Pull Metrics struct out of Master and Slave to improve
readability.
* [MESOS-2056] - Refactor fetcher code in preparation for fetcher cache.
* [MESOS-2094] - Libprocess: Introduce make_shared.
* [MESOS-2095] - Introduce make_shared.
* [MESOS-2104] - Correct naming of cgroup memory statistics.
* [MESOS-2126] - Libprocess Future: Improve performance, Vector instead of
Queue.
* [MESOS-2127] - killTask() should perform reconciliation for unknown tasks.
* [MESOS-2169] - Make GC_DISK_HEADROOM configurable through slave command line
flag.
* [MESOS-2172] - Refactor fetcher namespace into a class.
* [MESOS-2173] - Consolidate all fetcher env vars into one that holds a JSON
object.
* [MESOS-2193] - serve an externally managed log via the web ui.
* [MESOS-2230] - Update RateLimiter to allow the acquired future to be
discarded.
* [MESOS-2272] - Remove "internal" namespace from within "mesos"
* [MESOS-2314] - remove unnecessary constants.
* [MESOS-2347] - Add ability for schedulers to explicitly acknowledge status
updates on the driver.
** Story
* [MESOS-444] - Remove --checkpoint flag in the slave once checkpointing is stable.
* [MESOS-1694] - Future::failure should return a const string&
* [MESOS-1830] - Expose master stats differentiating between master-generated
and slave-generated LOST tasks.
* [MESOS-1876] - Remove deprecated 'slave_id' field in ReregisterSlaveMessage.
* [MESOS-1903] - Add backoff to framework re-registration retries.
* [MESOS-2029] - Allow slave to checkpoint resources.
* [MESOS-2060] - Add support for 'hooks' in task launch sequence.
* [MESOS-2098] - Update task validation to be after task authorization.
* [MESOS-2099] - Support acquiring/releasing resources with DiskInfo in allocator.
* [MESOS-2100] - Implement master to slave protocol for persistent disk resources.
* [MESOS-2101] - Add the persistent resources release primitive to the
framework API.
* [MESOS-2106] - Enable libevent backed libprocess with configure flag.
* [MESOS-2107] - Create libevent-backed clock implementation.
* [MESOS-2109] - Introduce socket factory.
* [MESOS-2114] - Extract and generalize WhitelistWatcher.
* [MESOS-2133] - Create libevent-backed poll implementation.
* [MESOS-2135] - Support DiskInfo in C++ Resources.
* [MESOS-2138] - Add an Offer::Operation message for Dynamic Reservations.
* [MESOS-2178] - Add a method for converting the hostname to an ip address and
create initialization wrappers for sockaddr_in and addrinfo.
* [MESOS-2240] - Narrow down file permissions on os::open.
Release Notes - Mesos - Version 0.21.2
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-2986] - Docker version output is not compatible with Mesos
Release Notes - Mesos - Version 0.21.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-2047] - Isolator cleanup failures shouldn't cause TASK_LOST.
* [MESOS-2071] - Libprocess generates invalid HTTP
* [MESOS-2147] - Large number of connections slows statistics.json responses.
* [MESOS-2182] - Performance issue in libprocess SocketManager.
** Improvement
* [MESOS-1925] - Docker kill does not allow containers to exit gracefully
* [MESOS-2113] - Improve configure to find apr and svn libraries/headers in OSX
Release Notes - Mesos - Version 0.21.0
--------------------------------------
This release includes several new features.
* State reconciliation for frameworks:
* Allows frameworks to reconcile the states of the tasks.
* Support for Mesos modules
* Support for modules in master, slave and tests using the --modules flag.
* Task status now includes source and reason:
* [MESOS-343] - Expose TASK_FAILED reason to Frameworks.
* [MESOS-1143] - Add a TASK_ERROR task status.
* A shared filesystem isolator:
* Volumes can be mounted from the host into a container's
filesystem.
* Parts of the shared filesystem can be made private to each
container, e.g., a private /tmp for each container.
* A pid namespace isolator:
* Processes inside a container will not have visibility to host
processes or processes in any other container.
* Containers will be destroyed by terminating the 'init' process for
the pid namespace rather than using the freezer cgroup, avoiding known
kernel bugs.
API Changes:
* [MESOS-1461] - Add task reconciliation to the Python API.
Deprecations:
* [MESOS-1807] - Disallow executors with cpu only or memory only resources.
* [MESOS-1986] - Disabling checkpointing is deprecated and the --checkpoint
flag will be removed in a future release.
Build changes:
* [MESOS-1044] - Require C++11 compiler support.
This release also includes several bug fixes and stability improvements.
All Issues:
** Bug
* [MESOS-487] - Balloon framework fails to run due to bad flags
* [MESOS-631] - Slave started in cleanup mode shouldn't accept new tasks
* [MESOS-947] - Slave should properly handle a killTask() that arrives between runTask() and _runTask()
* [MESOS-1081] - Master should not deactivate authenticated framework/slave on new AuthenticateMessage unless new authentication succeeds.
* [MESOS-1195] - systemd.slice + cgroup enablement fails in multiple ways.
* [MESOS-1208] - 3rdparty/libprocess/3rdparty/boost-1.53.0/boost/math/special_functions/sign.hpp:113:55: error: typedef 'fp_tag' locally defined but not used [-Werror=unused-local-typedefs]
* [MESOS-1219] - Master should disallow frameworks that reconnect after failover timeout.
* [MESOS-1389] - Reconciliation can send TASK_LOST before a terminal update reaches the framework.
* [MESOS-1392] - Failure when znode is removed before we can read its contents.
* [MESOS-1414] - Status updates should not be sent from the slave until it is registered.
* [MESOS-1463] - mesos-local.sh dumps core
* [MESOS-1668] - Handle a temporary one-way master --> slave socket closure.
* [MESOS-1676] - ZooKeeperMasterContenderDetectorTest.MasterDetectorTimedoutSession is flaky
* [MESOS-1688] - No offers if no memory is allocatable
* [MESOS-1695] - The stats.json endpoint on the slave exposes "registered" as a string.
* [MESOS-1696] - Improve reconciliation between master and slave.
* [MESOS-1703] - better error message when replicated log hasn't been initialized
* [MESOS-1712] - Automate disallowing of commits mixing mesos/libprocess/stout
* [MESOS-1715] - The slave does not send pending tasks during re-registration.
* [MESOS-1716] - The slave does not add pending tasks as part of the staging tasks metric.
* [MESOS-1722] - Wrong attributes separator in slave --help
* [MESOS-1741] - mesos-slave shouldn't fail if dockerd is down
* [MESOS-1746] - clear TaskStatus data to avoid OOM
* [MESOS-1748] - MasterZooKeeperTest.LostZooKeeperCluster is flaky
* [MESOS-1769] - Segfault when using external containerizer
* [MESOS-1774] - Fix protobuf detection on systems with Python 3 as default
* [MESOS-1782] - AllocatorTest/0.FrameworkExited is flaky
* [MESOS-1783] - MasterTest.LaunchDuplicateOfferTest is flaky
* [MESOS-1786] - FaultToleranceTest.ReconcilePendingTasks is flaky.
* [MESOS-1797] - Packaged Zookeeper does not compile on OSX Yosemite
* [MESOS-1799] - Reconciliation can send out-of-order updates.
* [MESOS-1814] - Task attempted to use more offers than requested in example java and python frameworks
* [MESOS-1817] - Completed tasks remain in TASK_RUNNING when framework is disconnected
* [MESOS-1821] - CHECK failure in master.
* [MESOS-1824] - when "docker ps -a" returns 400+ lines enabling docker containerizer results in all executors dying
* [MESOS-1833] - Running docker container with colon in executor id generates error
* [MESOS-1834] - Default port for mesos is 5050, but documentation states it as 5051
* [MESOS-1843] - Specifying --with-curl doesn't work.
* [MESOS-1844] - AllocatorTest/0.SlaveLost is flaky
* [MESOS-1849] - Cannot execute container in privileged mode
* [MESOS-1853] - Remove /proc and /sys remounts from port_mapping isolator
* [MESOS-1854] - SlaveRecoveryTest.MultipleSlaves is flaky.
* [MESOS-1855] - Mesos 0.20.1 doesn't compile
* [MESOS-1857] - path::join() is broken
* [MESOS-1858] - Leaked file descriptors in StatusUpdateStream.
* [MESOS-1862] - Performance regression in the Master's http metrics.
* [MESOS-1866] - Race between ~Authenticator() and Authenticator::authenticate() can lead to schedulers/slaves to never get authenticated
* [MESOS-1869] - UpdateFramework message might reach the slave before Reregistered message and get dropped
* [MESOS-1873] - Don't pass task-related arguments to mesos-executor
* [MESOS-1875] - os::killtree() incorrectly returns early if pid has terminated
* [MESOS-1878] - Access to sandbox on slave from master UI does not show the sandbox contents
* [MESOS-1881] - Reviewbot should not apply reviews that are submitted.
* [MESOS-1884] - Composing Containerizer is not sending calls to still launching containers
* [MESOS-1892] - Using mesos-0.20.1.jar with libmesos-0.21.0 reliably segfaults
* [MESOS-1901] - Slave resources obtained from localhost:5051/state.json is not correct.
* [MESOS-1915] - Docker containers that fail to launch are not killed
* [MESOS-1945] - SlaveTest.KillTaskBetweenRunTaskParts is flaky
* [MESOS-1948] - Docker tests are flaky
* [MESOS-1967] - Test RoutingTest.INETSockets fails on some machine
* [MESOS-1969] - RBT only takes revision ranges as args for versions >= 0.6
* [MESOS-1970] - slave and offer ids are indistinguishable in the logs
* [MESOS-1975] - Module manager causes make check failure for annotated mesos versions.
* [MESOS-1989] - Container network stats reported by the port mapping isolator is the reverse of the actual network stats.
* [MESOS-2025] - OsTest.killtreeNoRoot: Process reparent assumes new parent is init pid 1
* [MESOS-2036] - Fix the Json format for the --modules and update the help message
* [MESOS-2046] - Configure should check headers and libraries for svn and apr
* [MESOS-2050] - InMemoryAuxProp plugin used by Authenticators results in SEGFAULT
* [MESOS-2052] - RunState::recover should always recover 'completed'
* [MESOS-2078] - Scheduler driver may ACK status updates when the scheduler threw an exception
** Documentation
* [MESOS-1506] - Update documentation/flags regarding new default hostname semantics
* [MESOS-1950] - Add module writers guide
* [MESOS-1984] - Documentation for Egress Control Limit
* [MESOS-2033] - Documentation for isolator filesystem/shared.
* [MESOS-2034] - Documentation for isolator namespaces/pid.
* [MESOS-2037] - Update docs/configuration.md
** Epic
* [MESOS-1407] - Provide state reconciliation for frameworks.
** Improvement
* [MESOS-186] - Resource offers should be rescinded after some configurable timeout
* [MESOS-750] - Require compilers that support c++11
* [MESOS-1181] - Improve cpplint rule coverage
* [MESOS-1502] - expose message event queue size from libprocess
* [MESOS-1567] - Add logging of the user uid when receiving SIGTERM.
* [MESOS-1586] - Isolate system directories, e.g., per-container /tmp
* [MESOS-1643] - Provide APIs to return port resource for a given role
* [MESOS-1656] - Do not remove docker container until gc process runs
* [MESOS-1728] - Libprocess: report bind parameters on failure
* [MESOS-1752] - Allow variadic templates
* [MESOS-1771] - introduce unique_ptr
* [MESOS-1779] - Mesos style checker should catch trailing white space
* [MESOS-1811] - Reconcile disconnected/deactivated semantics in the master code
* [MESOS-1813] - Fail fast in example frameworks if task goes into unexpected state
* [MESOS-1863] - Split launch tasks and decline offers metrics
* [MESOS-1896] - Enable module specific command line parameters
* [MESOS-1927] - Enable implicit local cluster launch to load modules
* [MESOS-1932] - Install git pre commit hook during bootstrap
* [MESOS-1951] - Add --isolation flag to mesos-tests
* [MESOS-1972] - Move TASK_LOST generations due to invalid tasks from scheduler driver to master
* [MESOS-2038] - Remove dead code in Slave::_runTask
** Story
* [MESOS-343] - Expose TASK_FAILED reason to Frameworks.
* [MESOS-1765] - Use PID namespace to avoid freezing cgroup
** Task
* [MESOS-681] - Document the reconciliation API.
* [MESOS-1410] - Keep terminal unacknowledged tasks in the master's state.
* [MESOS-1808] - Expose RTT in container stats
* [MESOS-1864] - Add test integration for module developers
* [MESOS-1931] - Add support for isolator modules
* [MESOS-1943] - Add event queue size metrics to scheduler driver
* [MESOS-1964] - 0.21.0 release
* [MESOS-1965] - Create mesos::modules namespace for all module related stuff
* [MESOS-1985] - Use more standard debug / release build flags
Release Notes - Mesos - Version 0.20.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-1705] - SubprocessTest.Status sometimes flakes out
* [MESOS-1724] - Can't include port in DockerInfo's image
* [MESOS-1727] - Configure fails with ../configure: line 18439: syntax error near unexpected token `PROTOBUFPREFIX,'
* [MESOS-1729] - LogZooKeeperTest.WriteRead fails due to SIGPIPE (escalated to SIGABRT)
* [MESOS-1730] - Should be an error if commandinfo shell=true when using docker containerizer
* [MESOS-1732] - Mesos containerizer doesn't reject tasks with container info set
* [MESOS-1737] - Isolation=external result in core dump on 0.20.0
* [MESOS-1740] - Bad error message when docker containerizer isn't enabled
* [MESOS-1749] - SlaveRecoveryTest.ShutdownSlave is flaky
* [MESOS-1755] - Add docker support to mesos-execute
* [MESOS-1758] - Freezer failure leads to lost task during container destruction.
* [MESOS-1760] - MasterAuthorizationTest.FrameworkRemovedBeforeReregistration is flaky
* [MESOS-1764] - Build Fixes from 0.20 release
* [MESOS-1766] - MasterAuthorizationTest.DuplicateRegistration test is flaky
* [MESOS-1809] - Modify docker pull to use docker inspect after a successful pull
** Improvement
* [MESOS-1621] - Docker run networking should be configurable and support bridge network
* [MESOS-1762] - Avoid docker pull on each container run
* [MESOS-1770] - Docker with command shell=true should override entrypoint
Release Notes - Mesos - Version 0.20.0
--------------------------------------
This release includes several major new features, listed below:
* Docker support in Mesos:
* Users can now launch executors/tasks within Docker containers.
* Mesos now supports running multiple containerizers simultaneously. The slave
can dynamically choose a containerizer to launch containers based on the
configuration of executors/tasks.
* Container level network monitoring for mesos containerizer:
* Network statistics for each active container can be retrieved through the
/monitor/statistics.json endpoint on the slave.
* Monitoring is completely transparent to the tasks running on the slave; tasks
do not need to change their service discovery mechanism.
* Framework authorization:
* Allows frameworks to (re-)register with authorized roles.
* Allows frameworks to launch tasks/executors as authorized users.
* Allows authorized principals to shut down framework(s) through an HTTP endpoint.
* Framework rate limiting:
* In a multi-framework environment, this feature aims to protect the
throughput of high-SLA (e.g., production, service) frameworks by having the
master throttle messages from other (e.g., development, batch) frameworks.
* Enable building against installed third-party dependencies.
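The throttling idea behind framework rate limiting can be illustrated with a
token bucket. This is only a sketch of the general technique, not the master's
implementation: the class name `TokenBucket`, its `qps` and `burst` parameters,
and the injectable `clock` are all hypothetical and chosen for illustration.

```python
import time


class TokenBucket:
    """Hypothetical helper: admits at most `qps` messages per second on average."""

    def __init__(self, qps, burst=1, clock=time.monotonic):
        self.qps = float(qps)      # sustained rate, in messages per second
        self.burst = float(burst)  # maximum number of tokens that can accumulate
        self.clock = clock         # injectable so the sketch can be tested
        self.tokens = float(burst)
        self.last = clock()

    def allow(self):
        """Return True if one more message may be processed right now."""
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.qps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per principal (illustrative names); unlisted principals are not throttled.
buckets = {"batch-framework": TokenBucket(qps=2)}


def may_process(principal):
    bucket = buckets.get(principal)
    return True if bucket is None else bucket.allow()
```

The sketch only decides whether a message may be processed at this moment; what
happens to a message that is refused is outside its scope.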
API Changes:
* [MESOS-857] - The Python API now uses different namespacing. This will break
existing schedulers; please refer to the upgrades document.
* [MESOS-1409] - Status update acknowledgements are sent through the Master
now. This only affects you if you're using a non-Mesos binding (e.g. pure
language binding), in which case refer to the upgrades document.
HTTP endpoint changes:
* [MESOS-1188] - "deactivated_slaves" represents inactive slaves in
"/stats.json" and "/state.json".
* [MESOS-1390] - An authenticated "/shutdown" endpoint has been added to the
master to shut down a framework.
Deprecations:
* [MESOS-1219] - Master should disallow completed frameworks from
reregistering with same framework id.
* [MESOS-1695] - "/stats.json" on the slave exposes "registered" value as
string instead of integer.
This release also includes several bug fixes and stability improvements.
All Issues:
** Sub-task
* [MESOS-1292] - [MESOS-1259]:Enrich the Java Docs in the src/java files. -- ZooKeeperState.java
* [MESOS-1293] - [MESOS-1259]:Enrich the Java Docs in the src/java files. -- Variable.java
* [MESOS-1294] - [MESOS-1259]:Enrich the Java Docs in the src/java files. -- State.java
** Bug
* [MESOS-445] - Scheduler driver destructor waits forever
* [MESOS-473] - Freezer fails fatally when it is unable to write 'FROZEN' to freezer.state
* [MESOS-759] - The cgroups TaskKiller should skip freezing the cgroup if it is already empty.
* [MESOS-856] - TasksKiller may run forever because the cgroup cannot be frozen.
* [MESOS-878] - Slave should not register with the master when in TERMINATING.
* [MESOS-1001] - registrar doesn't build on Linux/Clang
* [MESOS-1119] - Allocator should make an allocation decision per slave instead of per framework/role.
* [MESOS-1149] - SlaveRecovery.Reboot test doesn't reap executor
* [MESOS-1170] - Update system check (glog)
* [MESOS-1171] - Update system check (gmock)
* [MESOS-1172] - Update system check (libev)
* [MESOS-1173] - Update system check (picojson)
* [MESOS-1174] - Update system check (protobuf)
* [MESOS-1178] - Only enable the oom killer if it's not enabled
* [MESOS-1337] - AllocatorZooKeeperTest/0.FrameworkReregistersFirst runs forever
* [MESOS-1341] - AllocatorZooKeeperTest/0.FrameworkReregistersFirst is flaky
* [MESOS-1348] - The SlaveRecoveryTest.GCExecutor test leaks child processes.
* [MESOS-1354] - Resource leak in jvm.cpp
* [MESOS-1404] - Glibc 'fork()' is not async signal safe
* [MESOS-1417] - Slave should not send terminal status update before containerizer update is finished
* [MESOS-1422] - AllocatorTest/0.SchedulerFailover test is flaky
* [MESOS-1428] - Failed to update 'registry': Failed to perform store within 5secs (caused flaky MasterTest.StatusUpdateAcknowledgementsThroughMaster)
* [MESOS-1435] - RegistrarZooKeeperTest.TaskRunning is flaky, sometimes runs forever.
* [MESOS-1436] - AllocatorZooKeeperTest/0.SlaveReregistersFirst flaky and can run forever
* [MESOS-1437] - SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch is flaky
* [MESOS-1439] - SchedulerTest.MetricsEndpoint is flaky
* [MESOS-1454] - Command executor should have nonzero resources
* [MESOS-1467] - commit msg was changed after run ./support/post-reviews.py
* [MESOS-1477] - Deadlock when terminating ZooKeeperProcess
* [MESOS-1479] - Cgroups cpu isolator should only report cfs stats if cfs is enabled
* [MESOS-1492] - Add support for optionally throttling the frameworks not specified in RateLimits config
* [MESOS-1504] - mesos.pb.h header include is problematic.
* [MESOS-1513] - FaultToleranceTest.SlaveReregisterTerminatedExecutor is flaky
* [MESOS-1526] - Regression in 'make distclean': files left around.
* [MESOS-1529] - Handle a network partition between Master and Slave
* [MESOS-1532] - AllocatorZooKeeperTest/0.SlaveReregistersFirst and AllocatorZooKeeperTest/0.FrameworkReregistersFirst are flaky
* [MESOS-1533] - HealthCheck tests are flaky
* [MESOS-1536] - AllocatorZooKeeperTest/0.FrameworkReregistersFirst
* [MESOS-1540] - Fix a typo in src/Makefile.am to include java test cases
* [MESOS-1543] - MasterTest.OrphanTasks is flaky
* [MESOS-1544] - DRFAllocatorTest.SameShareAllocations is flaky
* [MESOS-1549] - The configure script should check for libnl headers as well
* [MESOS-1555] - ExecutorInfo validity check is broken in Master
* [MESOS-1578] - Improve framework rate limiting by imposing the max number of outstanding messages per framework principal
* [MESOS-1604] - LowLevelSchedulerLibprocess did not receive offers from Master
* [MESOS-1610] - Mesos containerizer should not call isolate if the child process already died.
* [MESOS-1617] - Linux kernel generates duplicated tc u32 filter handles
* [MESOS-1624] - Apache Jenkins build fails due to -lsnappy is set when building leveldb
* [MESOS-1627] - Installed protobuf header files include wrong path to mesos header file
* [MESOS-1629] - GLOG Initialized twice if the Framework Scheduler also uses GLOG
* [MESOS-1632] - Seg fault due to infinite recursion "<< RepeatedPtrField<Resource>"
* [MESOS-1633] - Create a static mesos library
* [MESOS-1635] - zk flag fails when specifying a file and the replicated logs
* [MESOS-1639] - Master OOMs when throttling traffic from LoadGeneratorFramework
* [MESOS-1649] - Network isolator should tolerate slave crashes while doing isolate/cleanup.
* [MESOS-1653] - HealthCheckTest.GracePeriod is flaky.
* [MESOS-1655] - ZooKeeperTest.LeaderDetectorTimeoutHandling is flaky
* [MESOS-1658] - Implementation of process::io::poll can lead to broken pipes.
* [MESOS-1670] - Build Failure on Mac OSX with undefined link
* [MESOS-1673] - The value of MASTER_PING_TIMEOUT is non-deterministic
* [MESOS-1677] - AllocatorTest.FrameworkReregistersFirst is flaky.
* [MESOS-1692] - Build error on gcc-4.4.
* [MESOS-1693] - Enable builds for ARM
* [MESOS-1700] - ThreadLocal does not release pthread keys or log properly.
* [MESOS-1704] - Mac OS X build breaks in DockerContainerizerProcess::fetch
* [MESOS-1705] - SubprocessTest.Status sometimes flakes out
* [MESOS-1710] - Compilation against master fails on make check
** Documentation
* [MESOS-1480] - Write Documentation for Authorization
* [MESOS-1702] - Add document for network monitoring.
** Epic
* [MESOS-1071] - Enable building against installed third-party dependencies.
* [MESOS-1228] - Container level network monitoring
* [MESOS-1342] - Add authorization support.
** Improvement
* [MESOS-292] - Remove unnecessary includes of headers to improve compile times
* [MESOS-320] - Add instrumentation into libprocess.
* [MESOS-857] - restructure mesos python namespace
* [MESOS-921] - Consider simultaneous containerizer support
* [MESOS-987] - Wire up a code coverage tool
* [MESOS-1188] - Rename slaves/frameworks.activated/deactivated
* [MESOS-1236] - stout's os module uses a mix of Try<Nothing> and bool returns
* [MESOS-1237] - stout's os::ls should return a Try<>
* [MESOS-1259] - Enrich the Java Docs in the src/java files.
* [MESOS-1312] - Show active tasks orphaned by a framework disconnect
* [MESOS-1324] - Create a network isolator based on port mapping
* [MESOS-1339] - Add "per-framework-principal" counters for all messages from a scheduler on Master
* [MESOS-1379] - Provide a reconciliation mechanism for tasks unknown to the framework.
* [MESOS-1390] - Add an authenticated '/shutdown' endpoint for shutting down a running framework
* [MESOS-1446] - Create an abstraction for launching an operation in a subprocess.
* [MESOS-1450] - Add setns utilities to stout
* [MESOS-1453] - Update reconciliation semantics send statuses for each task.
* [MESOS-1499] - Add flags parse support for specific protobufs
* [MESOS-1501] - Add flags parse support for RateLimits protobuf
* [MESOS-1511] - Simplify 'Operation' semantics to only handle logics in the subprocess side
* [MESOS-1519] - Expose constructors of types used in java APIs
* [MESOS-1523] - ZooKeeper timeout should be longer
* [MESOS-1525] - Don't require slave id for reconciliation requests.
* [MESOS-1528] - Refactor Subprocess to support execve style launch and customized clone function
* [MESOS-1557] - Allow the network isolator to handle those tasks that are not isolated by the network isolator
* [MESOS-1559] - Allow jenkins build machine to dump stack traces of all threads when timeout
* [MESOS-1590] - Allow LoadGeneratorFramework to read password from a file
* [MESOS-1591] - Do not install LoadGeneratorFramework
* [MESOS-1608] - Add support for installing stout headers
* [MESOS-1616] - ReregisterCompletedFrameworks test does not use real JSON parser
* [MESOS-1620] - Reconciliation does not send back tasks pending validation / authorization.
* [MESOS-1652] - Stream Docker logs into sandbox logs
** Story
* [MESOS-1350] - Initial implementation of framework API rate limiter, taking the config via master flag
* [MESOS-1595] - Provide a way to install libprocess
** Task
* [MESOS-1307] - Authorize offer allocations
* [MESOS-1325] - Create a linux routing library abstraction based on libnl
* [MESOS-1343] - Authorize "/shutdown" HTTP endpoint through ACLs.
* [MESOS-1374] - Verify static libprocess scheduler port works with Mesos Master
* [MESOS-1409] - Send status update acknowledgments through the Master.
* [MESOS-1443] - Create a protobuf for framework rate limit configuration and load it as JSON through master flags
* [MESOS-1444] - Integrate rate limiter into the master
* [MESOS-1445] - Add new tests for framework rate limiting
* [MESOS-1451] - Remove 'offer_id' field from LaunchTasksMessage.
* [MESOS-1505] - Add a test to verify that frameworks with same share get equal number of allocations
* [MESOS-1530] - Create LoadGeneratorScheduler to test Framework Rate Limiting
* [MESOS-1568] - Support ENTRYPOINT style containers
* [MESOS-1580] - Accept --isolation=external through a deprecation cycle.
* [MESOS-1593] - Add DockerInfo Configuration
* [MESOS-1600] - IP classifiers in routing lib should ignore IP packets with IP options
* [MESOS-1601] - Add metrics for port mapping network isolator
* [MESOS-1671] - Expose executor metrics for slave.
* [MESOS-1672] - Add filter to allocator resourcesRecovered method
* [MESOS-1674] - Kill private_resources and treat 'ephemeral_ports' as a resource.
* [MESOS-1683] - Create user doc for framework rate limiting feature
Release Notes - Mesos - Version 0.19.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-1448] - Mesos Fetcher doesn't support URLs that have 30X redirects.
* [MESOS-1534] - Scheduler process is not explicitly terminated in the destructor of MesosSchedulerDriver.
* [MESOS-1538] - A container destruction in the middle of a launch leads to CHECK failure.
* [MESOS-1539] - No longer able to spin up Mesos master in local mode.
* [MESOS-1550] - MesosSchedulerDriver should never, ever, call 'stop'.
* [MESOS-1551] - Master does not create work directory when missing.
Release Notes - Mesos - Version 0.19.0
--------------------------------------
* The primary feature of this release is the "Registrar". This is the addition
of replicated state in the master to ensure the set of slaves in the cluster
remains consistent in the presence of master failovers.
* This feature is currently used in a write-only manner by default to allow
smooth upgrades. 0.20.0 by default will be write *and* read.
* Operators must now specify the 'work_dir' for the master, along with the
'quorum' size of the ensemble of masters.
* This means adding or removing masters must be done carefully! The best
practice is to only ever add or remove a single master at a time and to
allow a small amount of time for the replicated log to catch up on the new
master.
* Authentication support has been added for slaves.
* Metrics reporting has been overhauled and is now exposed on /metrics/snapshot.
* Support for external containerization strategies has been added to support
custom container needs as well as experimentation; this is an alpha release!
* There are also several bug fixes and stability improvements.
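As a rough guide to choosing the 'quorum' value mentioned above, the following
sketch assumes the quorum is picked as a strict majority of the masters, which
is the usual choice for a replicated log. The function name `quorum_size` is
hypothetical and not part of Mesos.

```python
def quorum_size(num_masters: int) -> int:
    """Smallest strict majority of an ensemble of `num_masters` masters.

    Assumption: quorum is chosen as a strict majority (more than half).
    """
    if num_masters < 1:
        raise ValueError("an ensemble needs at least one master")
    return num_masters // 2 + 1


# Print the majority quorum for a few common ensemble sizes.
for n in (1, 3, 5):
    print(n, "masters -> quorum", quorum_size(n))
```

Under this assumption, going from 3 to 4 masters raises the quorum from 2 to 3,
which is one reason to change the ensemble a single master at a time.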
All Issues:
** Sub-task
* [MESOS-562] - Update 'Getting Started' Documentation Page
* [MESOS-783] - Master::killTask must not answer with TASK_LOST when the task is unknown.
* [MESOS-841] - Enforce only leading master can write to the Registrar.
* [MESOS-880] - introduce observe endpoint to master
* [MESOS-957] - introduce RepairCoordinator stub into master
* [MESOS-1226] - Add flags for replicated log backed registry.
* [MESOS-1338] - Add global counters for each message type on Master
** Bug
* [MESOS-361] - Restrict the character space of user provided TaskIDs.
* [MESOS-577] - bootstrap fails with automake 1.14
* [MESOS-578] - configure fails on OSX 10.8.4
* [MESOS-682] - Master should properly consolidate "slaves" and "deactivated" maps
* [MESOS-743] - ReservationAllocatorTest.ResourcesReturned test is flaky
* [MESOS-767] - Slave should reregister with completed frameworks/executors
* [MESOS-779] - mesos python examples use 2 space indent
* [MESOS-873] - Crash in os::killtree on Mavericks
* [MESOS-931] - post-review is deprecated.
* [MESOS-1000] - Clang build broken on 0.18.0 master
* [MESOS-1019] - AllocatorZooKeeperTest/0.SlaveReregistersFirst is flaky.
* [MESOS-1020] - AllocatorZooKeeperTest/0.SlaveReregistersFirst is flaky
* [MESOS-1025] - json_tests fails build
* [MESOS-1042] - Fix bad CGROUPS_ROOT_Write test
* [MESOS-1048] - LimitedCpuIsolatorTest.CgroupsCfs is broken when run as non-root
* [MESOS-1053] - tar: You must specify one of the `-Acdtrux' or `--test-label' options
* [MESOS-1054] - Java extension build is broken if libsnappy is installed
* [MESOS-1058] - Master CHECK failure: hierarchical_allocator_process.hpp:421 Check failed: !slaves.contains(slaveId)
* [MESOS-1062] - CpuIsolatorTest/0.SystemCpuUsage is flaky
* [MESOS-1067] - Specifying minimum logging level doesn't work
* [MESOS-1072] - Update system check (python boto)
* [MESOS-1077] - Registrar tests are flaky.
* [MESOS-1080] - cpplint.py doesn't analyze hpp files
* [MESOS-1082] - Make fails on AWS Ubuntu 12.04 and 13.10
* [MESOS-1083] - Error in CgroupsTest::SetUpTestCase() and TearDownTestCase()
* [MESOS-1088] - ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster is flaky
* [MESOS-1092] - [Doc] "bin/mesos-master --help" to "mesos-master --help"
* [MESOS-1099] - Log health checks in mesos
* [MESOS-1100] - Drop "OOM notifier is triggered" log message
* [MESOS-1124] - Mesos EC2 scripts: Cannot find any cluster
* [MESOS-1126] - Change linkage around libjvm to use dlopen.
* [MESOS-1152] - ProcTest.MultipleThreads is flaky
* [MESOS-1157] - make dist fail
* [MESOS-1158] - make distcheck fail
* [MESOS-1161] - Inconsistent completed frameworks state between slave and master
* [MESOS-1164] - URL encoded urls do not work in slave
* [MESOS-1165] - Retry required when recovering an empty log
* [MESOS-1167] - Update system check (boost)
* [MESOS-1168] - Update system check (zookeeper)
* [MESOS-1175] - Update system check (http-parser)
* [MESOS-1191] - ProcTest unit tests flaky
* [MESOS-1202] - Make it easy to apply GitHub pull requests
* [MESOS-1210] - OsTest.children test is flaky
* [MESOS-1211] - MesosContainerizer should recover isolators after the launcher recovers
* [MESOS-1214] - CHECK failure in Group
* [MESOS-1230] - Compiler warning in libprocess statistics
* [MESOS-1231] - CHECK failed in log coordinator
* [MESOS-1235] - Metrics.Snapshot* tests fail
* [MESOS-1239] - Group CHECK failure
* [MESOS-1264] - Slave authentication retries can trigger TASK_LOST for non-checkpointing frameworks.
* [MESOS-1265] - Group should not process enqueued events from previous ZooKeeper instance (and ZK session)
* [MESOS-1268] - distclean break during maven clean up
* [MESOS-1271] - CHECK failure in replica.
* [MESOS-1273] - SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch is flaky
* [MESOS-1275] - FaultToleranceTest.SlaveReregisterOnZKExpiration is flaky
* [MESOS-1276] - Make the delay between master detection and registration configurable
* [MESOS-1310] - Queuing up slave (re-)registration during authentication causes reply() to fail
* [MESOS-1318] - ProcessWatcher triggers seg fault
* [MESOS-1331] - SlaveRecoveryTest/0.NonCheckpointingFramework is flaky.
* [MESOS-1333] - Runtime error when invoking post-reviews.py with rbt 0.6
* [MESOS-1347] - GarbageCollectorIntegrationTest.DiskUsage is flaky.
* [MESOS-1348] - The SlaveRecoveryTest.GCExecutor test leaks child processes.
* [MESOS-1361] - Flaky test: SlaveRecoveryTest/0.RecoverCompletedExecutor
* [MESOS-1362] - Flaky test: SlaveRecoveryTest/0.RemoveNonCheckpointingFramework
* [MESOS-1365] - SlaveRecoveryTest/0.MultipleFrameworks is flaky
* [MESOS-1368] - Credentials file permissions check is broken
* [MESOS-1370] - SlaveRecoveryTest/0.RemoveNonCheckpointingFramework is flaky
* [MESOS-1372] - Compiler warning from stout flags
* [MESOS-1376] - CHECK failure in the Registrar
* [MESOS-1400] - Master doesn't recover resources for invalid offers
* [MESOS-1406] - Master stats.json using boolean instead of integral value for 'elected'.
* [MESOS-1408] - Unnecessary queuing of status update acknowledgments in the scheduler driver.
* [MESOS-1413] - MesosContainerizerExecuteTest.IoRedirection fails on OSX
* [MESOS-1415] - Web UI master redirect message doesn't show up
* [MESOS-1418] - Master should remove/rescind offers for disconnected slave.
* [MESOS-1419] - Properly rescind offers
* [MESOS-1449] - Isolator::recover will attempt to remove slave cgroup when using --slave_subsystems
* [MESOS-1455] - Segfault in libprocess during Process linking.
** Documentation
* [MESOS-1002] - Add "make check" instruction to getting started doc
* [MESOS-1377] - Update configuration documentation to reflect 0.19.0 master flags.
** Epic
* [MESOS-764] - Implement Master persistence using the Registrar.
** Improvement
* [MESOS-135] - Improve javadoc (use @param, @return, etc)
* [MESOS-269] - Better JSON Support
* [MESOS-295] - Allow new masters to have better understanding of cluster state
* [MESOS-581] - Expose cpu and memory usage statistics for master and slave
* [MESOS-610] - Split slave specific tests out of master_tests
* [MESOS-922] - Containerizer to support launching tasks by TaskInfo
* [MESOS-945] - Show framework host name in the WebUI
* [MESOS-956] - Add a "Sequence" abstraction to serialize callbacks.
* [MESOS-980] - Revisit Future discard semantics to enforce that transitions occur through a Promise.
* [MESOS-982] - Relax slave (re-)registration retries and add a backoff mechanism.
* [MESOS-983] - Expose log coordinator demotion.
* [MESOS-984] - Implement "auto-initialization" of the Replicated Log.
* [MESOS-995] - Extend Subprocess to support environment variables, changing user and working directory
* [MESOS-1015] - Some header files have 'using' statements
* [MESOS-1026] - Pull std::tuple / boost::tuples::tuple into tuples namespace of stout
* [MESOS-1036] - Implement a library for exposing statistical metrics.
* [MESOS-1041] - fatal() should use abort rather than exit(1) to get stacktraces
* [MESOS-1052] - Add a script that can run via CI to verify the reviews.
* [MESOS-1055] - Add explicit to single argument constructors
* [MESOS-1057] - libprocess: Add explicit to single argument constructors
* [MESOS-1068] - No --version command line parameter
* [MESOS-1087] - Display warning for credentials file permissions
* [MESOS-1105] - TODO(benh): choose a better scheme to set mem in slave/containerizer/containerizer.cpp
* [MESOS-1112] - Refactor the Registrar to push the operations to the caller to simplify the interface
* [MESOS-1151] - Make review bot check for style issues
* [MESOS-1155] - Improve the performance of Registrar
* [MESOS-1160] - Support flattening from Try into Future.
* [MESOS-1182] - Implement an output stream operator overload for Master::Slave
* [MESOS-1224] - Add dynamic loadable library abstraction to stout.
* [MESOS-1234] - Mesos ReviewBot should look at old reviews first
* [MESOS-1252] - Support ENV MAVEN_HOME to establish the path of the `mvn` executable.
* [MESOS-1255] - Master UI should show Mesos version
* [MESOS-1270] - Reconcile logging messages in master
* [MESOS-1274] - Disallow further operations in the Registrar when a failure occurs.
* [MESOS-1287] - metrics collection should not wait indefinitely
* [MESOS-1332] - Improve Master and Slave metric names
* [MESOS-1344] - Add flags support for JSON
* [MESOS-1349] - Mesos style checker should only check for updated files
* [MESOS-1358] - Show when the leading master was elected in the webui
* [MESOS-1382] - Include the error message in routing::socket().
* [MESOS-1405] - Mesos fetcher does not support S3(n)
** Story
* [MESOS-804] - Add authentication support for slaves
* [MESOS-838] - Consider exporting queue size as a metric from the master
** Task
* [MESOS-911] - Add pluggable authorization interface
* [MESOS-974] - Add a unit test for java api of replicated log
* [MESOS-981] - Implement Storage on the Replicated Log.
* [MESOS-1116] - Create library to track statistics of metrics
* [MESOS-1123] - Implement tests for stout/cache.hpp
* [MESOS-1132] - Port master stats.json over to new metrics library
* [MESOS-1133] - Port slave stats.json over to new metrics library
* [MESOS-1146] - Port system process stats over to new metrics library
* [MESOS-1197] - Adding signal safe os::system
* [MESOS-1217] - Add Timer metric to Metrics library
* [MESOS-1284] - metrics Timer should use Clock
* [MESOS-1304] - Create framework rate limiting design document and gather feedback
* [MESOS-1305] - Export frameworks QPS through metrics endpoint
* [MESOS-1314] - Update default registry to "replicated_log".
* [MESOS-1317] - Add integration tests to enforce the semantics of a "strict" registry.
* [MESOS-1319] - Add recovery integration tests for a "strict" registry.
* [MESOS-1320] - Add reconciliation integration tests for a "strict" registry.
* [MESOS-1321] - Add killTask integration tests for a "strict" registry.
* [MESOS-1322] - Add failover integration tests for a "strict" registry.
* [MESOS-1371] - Expose libprocess queue length from scheduler driver to metrics endpoint
* [MESOS-1373] - Keep track of the principals for authenticated pids in Master.
* [MESOS-1380] - mesos-local should set default work_dir
* [MESOS-1383] - Expose the authenticated principal through Authenticator::authenticate() result
* [MESOS-1387] - Integrate Authorizer into Master
* [MESOS-1411] - Update Master and Slave to handle status update acknowledgments going through the master.
Release Notes - Mesos - Version 0.18.2
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-1313] - The executor bit is now essentially ignored with the 0.18.1 fetcher implementation
Release Notes - Mesos - Version 0.18.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-979] - Master segfault when tasks.json endpoint is hit
* [MESOS-1045] - Unrecognized file extension in CommandInfo.URI causes executor to exit
* [MESOS-1078] - JNI calls hasNext on ArrayList instead of iterator
* [MESOS-1221] - Slave should update the containerizers with executor resources after recovery
* [MESOS-1241] - Unable to disable the auto-extraction of URIs (mesos-fetcher)
** Improvement
* [MESOS-1212] - Use maven to compile and package Mesos' Java files
Release Notes - Mesos - Version 0.18.0
--------------------------------------
* The primary feature of this release is a refactor of the isolation
abstraction to make it easy to add pluggable isolators/containerizers.
All Issues:
** Sub-task
* [MESOS-1043] - Change configure.ac to use C++11 by default.
** Bug
* [MESOS-422] - Master leader election should be more robust to stale ephemeral nodes
* [MESOS-537] - ZooKeeperMasterDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster is flaky
* [MESOS-672] - Web UI redirection does not work for hosts whose ip addresses are not publicly accessible
* [MESOS-837] - AWAIT_READY should not depend on process::Clock
* [MESOS-904] - Check for libcxx is missing in configure.ac
* [MESOS-912] - Slave sometimes crashes with SIGPIPE
* [MESOS-927] - OsTest.killtree is flaky
* [MESOS-937] - Fix "pure virtual method called" bug in zookeeper::ProcessWatcher
* [MESOS-952] - Clock::resume should adjust timeouts that were created in a paused/advanced Clock context.
* [MESOS-954] - The /__processes__ endpoint in libprocess is missing a needed lock acquisition.
* [MESOS-958] - Group should not ignore the ZNOAUTH error in creating the parent path for the group
* [MESOS-963] - Compile fails on 10.9
* [MESOS-965] - GroupTest.GroupWatchWithSessionExpiration is flaky
* [MESOS-966] - symbolize.cc:235:58: error: invalid suffix on literal; C++11 requires a space between literal and identifier
* [MESOS-967] - configure: error: cannot find libsasl2
* [MESOS-977] - MasterZooKeeperTest.LostZooKeeperCluster is flaky
* [MESOS-985] - FaultToleranceTest.IgnoreKillTaskFromUnregisteredFramework is flaky
* [MESOS-989] - Flaky whitelist tests
* [MESOS-991] - hashmap.hpp error: control reaches end of non-void function
* [MESOS-1009] - src/demangle.cc:170:13: error: comparison between pointer and integer ('const char *' and 'int')
* [MESOS-1029] - lib stout compile errors on Ubuntu 13.10 with Clang 3.5
* [MESOS-1030] - Mesos compile errors on Ubuntu 13.10 with Clang 3.5: const & ..., header guard
* [MESOS-1038] - Log coordinator should demote itself after a write is discarded.
* [MESOS-1045] - Unrecognized file extension in CommandInfo.URI causes executor to exit
* [MESOS-1049] - Cpu Isolator incorrectly writes double values when writing cpu.cfs_quota_us.
* [MESOS-1050] - Containerizer broke getting hadoop binary from $HADOOP_HOME and $PATH
* [MESOS-1051] - tar command used in fetcher not portable to OS X
* [MESOS-1063] - Containerizer fails when fetching more than one URL
* [MESOS-1079] - Mesos python egg build failure on OS X Mavericks (Xcode 5.1)
* [MESOS-1086] - DRF allocator should take into account past allocations when determining an ordering so frameworks are not starved.
* [MESOS-1095] - Build failure on OSX when using gcc-4.7
* [MESOS-1121] - /usr/include/c++/4.7/type_traits:1834:9: error: no match for call to '(std::_Bind<process::Future<process::http::Response> (*(std::_Placeholder<1>))(const std::basic_string<char>&)>) ()'
* [MESOS-1128] - ':' colon in executor work directories is unusual
* [MESOS-1135] - A reregistering framework that authenticates with Master might not get any offers
* [MESOS-1176] - make distcheck fails when enabling c++11
** Documentation
* [MESOS-926] - Document change to separate cgroup mounts
** Improvement
* [MESOS-903] - Store MasterInfo in ZK to enable master web UI redirection etc.
* [MESOS-943] - Provide an abstraction for asynchronous launching of subprocesses.
* [MESOS-975] - Show git tag info in master and slave log output
** New Feature
* [MESOS-600] - Rework Isolator abstraction
Release Notes - Mesos - Version 0.17.0
--------------------------------------
* The primary feature of this release is recovery support for the
replicated log, which makes it more resilient to disk failures.
* If fewer than a quorum of disks fail, the replicated log will
automatically perform catch-up to recover lost data.
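The quorum condition above can be illustrated with a small sketch. It assumes a majority quorum (more than half of the replicas), which this note does not state explicitly, and the function names are hypothetical helpers for illustration, not part of Mesos.

```python
def majority_quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a majority (assumed quorum rule)."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return replicas // 2 + 1


def can_catch_up(replicas: int, failed_disks: int) -> bool:
    """True when fewer than a quorum of disks have failed.

    This mirrors the wording of the release note: lost data is
    recoverable as long as failed_disks < quorum.
    """
    if not 0 <= failed_disks <= replicas:
        raise ValueError("failed_disks must be between 0 and replicas")
    return failed_disks < majority_quorum(replicas)


if __name__ == "__main__":
    for failed in range(0, 6):
        print(failed, can_catch_up(5, failed))
```

With five replicas the assumed quorum is three, so under this model up to two failed disks can be recovered by catch-up.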
All Issues:
** Sub-task
* [MESOS-902] - add post to libprocess
** Bug
* [MESOS-280] - ExecutorDriver methods' javadocs should not be referring to SchedulerDriver methods
* [MESOS-533] - SlaveRecoveryTest/0.CleanupExecutor is flaky on Jenkins.
* [MESOS-789] - Make link to times in the webui clickable
* [MESOS-799] - Mesos python egg is faulty on OS X Mavericks
* [MESOS-831] - script-without-shebang
* [MESOS-861] - FaultToleranceTest.FrameworkReliableRegistration could hang
* [MESOS-875] - A recovering slave should not ignore valid status updates.
* [MESOS-877] - Future::then and Promise::associate have memory leaks.
* [MESOS-897] - Cleanup of stout headers from fedora review
* [MESOS-913] - Help endpoint does not work on slaves.
* [MESOS-916] - add .gitignore-template file for ./bootstrap generated files
* [MESOS-925] - remove --without-curl from libprocess
* [MESOS-941] - Memory limit not correctly set when no memory resource set on executor level
* [MESOS-951] - Build failure: in log/catchup.cpp on Clang
* [MESOS-993] - Performance issue during replicated log catch-up when the initial log position is large
* [MESOS-1014] - Log truncation takes a long time during catch-up if the initial position is very large
** Documentation
* [MESOS-929] - Aurora not added to the framework docs
** Improvement
* [MESOS-749] - Add support for multiple offers in launchTasks
* [MESOS-772] - expose count of running tasks
* [MESOS-827] - Create LOOP_FOR(duration) macro to guard testing loop from running indefinitely
* [MESOS-860] - Get mesos' libprocess dependency glog to compile with clang and libc++
* [MESOS-863] - Get mesos' libprocess dependency protobuf to compile with clang and libc++
* [MESOS-864] - Eliminate the use of internal stdlibc++ templates for achieving libc++ compatibility
* [MESOS-896] - Enable newer versions of http_parser.
** New Feature
* [MESOS-736] - Support catch-up replicated log
** Task
* [MESOS-323] - Get mesos compiling with clang to open up path forward to c++11
* [MESOS-519] - Deprecate and remove old monitoring endpoint.
Release Notes - Mesos - Version 0.16.0
--------------------------------------
* The primary feature of this release is major refactoring work on the
master election and detection process to improve its reliability and
flexibility.
All Issues:
** Sub-task
* [MESOS-645] - Improve the performance of Future.
** Bug
* [MESOS-403] - CoordinatorTest.TruncateLearnedFill test is flaky
* [MESOS-455] - ZooKeeperTest.MasterDetectorShutdownNetwork runs forever
* [MESOS-463] - Detector ZNode creation failure.
* [MESOS-465] - Failures due to ZooKeeper operation timeouts in the master detector.
* [MESOS-498] - ZooKeeperTest.MasterDetectorTimedoutSession is flaky
* [MESOS-536] - GarbageCollectorTest.Unschedule is flaky
* [MESOS-592] - Don't dump a stack trace from bad --zk flag in the detector, use EXIT(1) instead of LOG(FATAL).
* [MESOS-624] - Master improperly prints the exit status of the executor
* [MESOS-641] - Stout killtree / pstree tests fail on Ubuntu 10.04.
* [MESOS-778] - FaultToleranceTest.ReconcileIncompleteTasks test is flaky
* [MESOS-782] - Slaves in local cluster should get unique work directories
* [MESOS-795] - ZooKeeperTest.MasterDetectorTimedoutSession test is flaky
* [MESOS-800] - CHECK failure in cgroups_isolator.
* [MESOS-807] - Discard is not propagated in process::dispatch.
* [MESOS-811] - Group::cancel can return a failed future if the membership is already cancelled
* [MESOS-822] - AllocatorTest/0.SchedulerFailover is flaky
* [MESOS-823] - ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork is flaky
* [MESOS-826] - Bad 'master' flag in slave should not print a stack trace
* [MESOS-828] - CgroupsIsolator BalloonFramework Test is broken.
* [MESOS-842] - ZooKeeperMasterContenderDetectorTest.ContenderDetectorShutdownNetwork runs forever
* [MESOS-844] - Slave should not recover checkpointed data immediately after reboot
* [MESOS-851] - Scheduler Driver does not guarantee that abort() prevents further calls on the Scheduler.
* [MESOS-858] - Ignore launch/kill requests in the slave originating from non-leading masters.
* [MESOS-859] - Cgroup kill should use cgroup.procs, not tasks
* [MESOS-866] - Pailer popup window is not scrollable in Chrome or Safari
* [MESOS-867] - ZK Membership IDs are 32 bit signed integers, not 64 bit unsigned integers.
* [MESOS-870] - Slave http endpoint can crash the slave when no master is detected.
* [MESOS-871] - GroupTest.RetryableErrors is flaky
* [MESOS-883] - Group's handling of non-retryable errors and local timeout is incorrect
* [MESOS-884] - Incorrect asynchronous detection and contention loops in Master
* [MESOS-889] - Bad 'master' string given by scheduler should not print a stack trace
* [MESOS-892] - Additional Issues with contender related change
* [MESOS-935] - Group should tell MasterDetector "no memberships detected" when it locally times out
* [MESOS-940] - Slave should checkpoint bootid after recovery instead of after registration
** Improvement
* [MESOS-111] - Add SVN ignore and git ignore info to repository
* [MESOS-728] - Masters should seppuku using EXIT instead of abort() when leadership is lost.
* [MESOS-756] - Improve release tooling.
* [MESOS-760] - Capture memory usage statistics before OOM
* [MESOS-761] - Export all memory stats from memory.stat via CgroupsIsolator's usage()
* [MESOS-768] - Executor driver stop() should dispatch stop to executor process instead of terminating it
* [MESOS-802] - Web UI shows no errors when navigation to slave fails
* [MESOS-806] - Allow converting from an Owned<T> to a Shared<T>.
* [MESOS-818] - Bump up the minimum number of threads libprocess creates to accommodate new tests
* [MESOS-833] - The Status Update Manager should use a back-off mechanism for retried updates.
* [MESOS-835] - Reduce the minimum amount of CPUs required to make offers
* [MESOS-849] - As a developer I should be able to set the AUTOMAKE and ACLOCAL environment variables for autoconf to pickup when using the bootstrap script.
* [MESOS-881] - Tests are slow because the scheduler attempts to authenticate before the master realizes it is elected.
* [MESOS-900] - Paginate all tables in the web UI
Release Notes - Mesos - Version 0.15.0
--------------------------------------
* The primary feature in this release is to add authentication support
between frameworks and masters.
* You can set --authentication=true on masters to only allow
authenticated frameworks to register.
* Frameworks can call the new `MesosSchedulerDriver` constructor to
enable authentication.
* This release also moves the Jenkins framework out of the mesos repo to
https://github.com/jenkinsci/mesos-plugin.
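The authentication feature in this release is listed below as MESOS-704, which names SASL and CRAM-MD5. As a rough illustration of the CRAM-MD5 mechanism itself (as defined in RFC 2195, not of Mesos' own code), the sketch below computes and checks a challenge response. The function names and the sample principal, secret, and challenge are hypothetical.

```python
import hashlib
import hmac


def cram_md5_response(principal: str, secret: str, challenge: str) -> str:
    """Build the client response: '<principal> <hex HMAC-MD5(secret, challenge)>'."""
    digest = hmac.new(secret.encode(), challenge.encode(), hashlib.md5).hexdigest()
    return f"{principal} {digest}"


def verify_cram_md5(response: str, secret: str, challenge: str) -> bool:
    """Server-side check: recompute the digest and compare in constant time."""
    principal, _, _ = response.partition(" ")
    expected = cram_md5_response(principal, secret, challenge)
    return hmac.compare_digest(response, expected)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    challenge = "<12345.67890@master.example.com>"
    reply = cram_md5_response("framework-a", "s3cret", challenge)
    print(verify_cram_md5(reply, "s3cret", challenge))
    print(verify_cram_md5(reply, "wrong", challenge))
```

The secret never crosses the wire in this scheme; only the keyed digest of the server's challenge does.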
All Issues:
** Sub-task
* [MESOS-742] - GC directories based on modification time
* [MESOS-766] - Make --checkpoint to true by default
** Bug
* [MESOS-400] - Example Java framework test is flaky
* [MESOS-467] - AllocatorTest.FrameworkExited is flaky
* [MESOS-477] - Improve stout duration tests and Stringify(Days(value))
* [MESOS-512] - GroupTest.MultipleGroups is flaky.
* [MESOS-577] - bootstrap fails with automake 1.14
* [MESOS-650] - SlaveExecutorRerouterCtrl does not handle missing slave.
* [MESOS-655] - FaultToleranceTest.MasterFailover not simulating a realistic Master shutdown
* [MESOS-661] - WebUI pailer does not preserve newlines when data is copied from firefox.
* [MESOS-664] - Type resolution issue on 32 bit systems
* [MESOS-685] - SlaveRecoveryTest/0.RecoveryTimeout Java SIGSEGV
* [MESOS-686] - Testing isolator is broken when multiple frameworks are in play
* [MESOS-702] - Webui table headers are not consistently aligned vertically
* [MESOS-729] - ./stout/include/stout/hashmap.hpp:49:5: error: 'erase' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
* [MESOS-732] - Make slave recovery asynchronous
* [MESOS-734] - MasterTest.ReconcileTaskTest "not authenticated"
* [MESOS-737] - Recover completed frameworks/executors during recovery
* [MESOS-738] - CgroupsIsolatorTest.ROOT_CGROUPS_BalloonFramework_NoBuffer can't finish
* [MESOS-746] - master error when start with --weights input parameters
* [MESOS-747] - FaultToleranceTest.ReregisterFrameworkExitedExecutor test fails
* [MESOS-758] - Incorrect memory statistics are reported under linux
* [MESOS-762] - Revert the use of the soft limit and memory threshold notifications.
* [MESOS-771] - StatusUpdateManagerTest.DuplicateTerminalUpdateBeforeAck is flaky
* [MESOS-773] - StatusUpdateManagerTest.DuplicateTerminalUpdateBeforeAck is flaky
* [MESOS-774] - FaultToleranceTest.MasterFailover test is flaky
* [MESOS-777] - GarbageCollectorIntegrationTest.ExitedFramework test is flaky
* [MESOS-787] - Authenticatee process deadlocks
* [MESOS-792] - FaultToleranceTest.SchedulerFailoverFrameworkMessage is flaky
* [MESOS-801] - SlaveRecoveryTest / ReconcileTasksMissingFromSlave is flaky.
** Documentation
* [MESOS-518] - Improve README with Markdown
** Improvement
* [MESOS-769] - Master's authenticate should not depend on 'from'
** New Feature
* [MESOS-704] - Add authentication support using SASL and CRAM-MD5
** Task
* [MESOS-608] - Move Jenkins code out of the mesos repo to Jenkins CI repo
Release Notes - Mesos - Version 0.14.1
--------------------------------------
* This is a bug fix release.
All Issues:
** Sub-task
* [MESOS-725] - Slave should cleanup meta directory if started in non-strict mode and slave info changes.
** Bug
* [MESOS-420] - Master crashes when reregistering framework
* [MESOS-488] - The Master incorrectly sends a "Framework failed over" message when the scheduler driver retries an initial failover re-registration.
* [MESOS-641] - Stout killtree / pstree tests fail on Ubuntu 10.04.
* [MESOS-658] - A framework can be incorrectly removed by the Master.
* [MESOS-662] - Executor OOM could lead to a kernel hang
* [MESOS-679] - Inability to find a latest run should not be considered a recovery error
* [MESOS-680] - Empty files should not be considered as recovery errors
* [MESOS-690] - Slave finalize() throws segfault
* [MESOS-694] - Preserve exit status for SIGTERM.
* [MESOS-711] - Master::reconcile incorrectly recovers resources from reconciled tasks.
* [MESOS-714] - Slave should check if the (re-)registered message is from the expected master
** Improvement
* [MESOS-620] - Add slaveDisconnected and slaveReconnected calls to the Allocator
Release Notes - Mesos - Version 0.14.0
--------------------------------------
* The primary feature in this release is "Slave Recovery" which allows
restarted slaves (e.g., deploys, crashes) to reconnect with old live
executors/tasks. To enable slave recovery:
* First you need to enable checkpointing on slaves with "--checkpoint" flag.
* Frameworks can opt in to this feature by setting "FrameworkInfo.checkpoint"
when registering with the master.
* Once a Framework opts in, a restarted slave will recover all the framework's
tasks and executors. The tasks/executors stay alive through a slave
restart and reconnect with the restarted slave.
* Slave recovery also improves the reliability of delivering status updates.
* The release also includes a new feature called "Resource Reservations" which
allows reserving resources on a slave for particular roles (this is an
experimental feature).
* This release also includes a new Mesos plugin for Jenkins which allows Jenkins
to dynamically launch Jenkins slaves on a Mesos cluster (this is an
experimental feature).
There are also several bug fixes and stability improvements.
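The opt-in rules for slave recovery described above can be summarized in a small model. This is only a sketch of the stated conditions, not Mesos code: the `Slave` and `Framework` classes and the `recovers_tasks` helper are hypothetical, and their `checkpoint` fields stand in for the "--checkpoint" flag and "FrameworkInfo.checkpoint" respectively.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    # Stands in for starting the slave with the "--checkpoint" flag.
    checkpoint: bool


@dataclass
class Framework:
    name: str
    # Stands in for setting "FrameworkInfo.checkpoint" at registration.
    checkpoint: bool


def recovers_tasks(slave: Slave, framework: Framework) -> bool:
    """A restarted slave recovers a framework's tasks and executors only
    when checkpointing is enabled on the slave and the framework opted in."""
    return slave.checkpoint and framework.checkpoint


if __name__ == "__main__":
    slave = Slave(checkpoint=True)
    for fw in (Framework("opted-in", True), Framework("not-opted-in", False)):
        print(fw.name, recovers_tasks(slave, fw))
```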
All Issues:
** Sub-task
* [MESOS-548] - Upgrade angular.js to use the full angular-ui.js
* [MESOS-549] - Change truncated IDs to show on hover
* [MESOS-630] - Improve the performance of Master::Http::stats().
** Bug
* [MESOS-235] - Mesos daemon ignores --conf option
* [MESOS-368] - HTTP.Endpoints test is flaky.
* [MESOS-370] - The process based isolation module should walk the process tree to collect resource usage.
* [MESOS-380] - Command Executor doesn't send TASK_KILLED for killed tasks.
* [MESOS-434] - Process isolator libprocess throws exception
* [MESOS-449] - CgroupsTests are flaky on Ubuntu
* [MESOS-451] - Always update resources for reregistered executors.
* [MESOS-461] - Freezer failure while in FREEZING state.
* [MESOS-479] - SlaveRecoveryTest/0.CleanupExecutor failure.
* [MESOS-485] - Latest trunk fails on strict aliasing on CentOS
* [MESOS-490] - Update mesos-daemon.sh (and associated scripts) to work with new flags mechanisms.
* [MESOS-497] - Queued tasks should be launched in the order they were received
* [MESOS-499] - Local slave run crashes on startup
* [MESOS-508] - Master crash due to Broken Pipe
* [MESOS-514] - FaultToleranceTest.ReconcileIncompleteTasks is flaky
* [MESOS-522] - ZooKeeperMasterDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster
* [MESOS-534] - ReaperTest.TerminatedChildProcess is flaky on Jenkins.
* [MESOS-545] - Remove hack in post-reviews.py for tracking parent branch
* [MESOS-582] - HTTP.Endpoints is flaky
* [MESOS-594] - Add CXXFLAGS='-fno-strict-aliasing' if using gcc 4.4.*.
* [MESOS-597] - Set MESOS_NATIVE_LIBRARY or (DY)LD_LIBRARY_PATH before launching an executor in order to enable JVM based executors to easily find libmesos.so.
* [MESOS-599] - Make sure stderr/stdout get launcher output.
* [MESOS-607] - Slave recovery should properly handle executors that were cleanly terminated in the previous run
* [MESOS-611] - Refactor slave recovery to ensure slave recovers its state first
* [MESOS-612] - Slave should not recover completed executors
* [MESOS-614] - Master should remove checkpointing slave that gets disconnected when the new slave tries to register
* [MESOS-616] - The Master / Slave should not store frameworks as both active and completed.
* [MESOS-619] - Master should properly reconcile KillTasks
* [MESOS-627] - Slave should offer total disk instead of available disk by default
* [MESOS-628] - A non-checkpointing slave should still cleanup the latest slave symlink
* [MESOS-632] - Executor driver should commit suicide if it cannot re-connect with a slave after a timeout
* [MESOS-633] - Master should inform a recovered slave about frameworks that were completed
* [MESOS-635] - Master doesn't update the task state when it generates TASK_LOST
* [MESOS-636] - Executors under cgroups isolator die immediately when a slave dies if it has a controlling TTY attached
* [MESOS-637] - Executor should reregister with the updates in the same order as it received them
* [MESOS-638] - Slave should not send command executor infos to master when it reregisters
* [MESOS-640] - Duplicate status update with same UUID crashes the slave
* [MESOS-644] - Slave doesn't correctly handle checkpointed terminal update whose ack doesn't reach the executor
* [MESOS-646] - Slave recovery doesn't properly handle checkpointed queued tasks
* [MESOS-648] - Slave should properly handle partial writes of status updates
* [MESOS-657] - SlaveRecoveryTest/1.PartitionedSlave fails with cgroups
* [MESOS-668] - SlaveRecoveryTest/0.MultipleFrameworks flaky
* [MESOS-671] - CgroupsIsolator does not listen for OOM events on recovered executors.
* [MESOS-673] - Task reconciliation does not properly release executor resources.
* [MESOS-675] - CHECK failure in the Master.
* [MESOS-676] - Slave::reregistered LOG(FATAL)s due to being in RECOVERING state.
* [MESOS-689] - Master incorrectly rejects tasks for long lived executors if they don't have FrameworkID set
** Improvement
* [MESOS-179] - Need to check for Python development headers
* [MESOS-221] - New Allocators
* [MESOS-329] - Add 'help' endpoints to libprocess.
* [MESOS-552] - Jenkins scheduler should use the latest Mesos jar built from the repo
* [MESOS-553] - Jenkins plugin should bundle the native Mesos library
* [MESOS-554] - Jenkins scheduler should properly handle TASK_LOST
* [MESOS-555] - Jenkins scheduler should reuse a Jenkins slave
* [MESOS-557] - Upgrade to Bootstrap CSS v2.3.2
* [MESOS-558] - Upgrade to full release of Angular JS
* [MESOS-559] - Replace Bootstrap's JS with Angular UI Bootstrap
* [MESOS-580] - Improve Command Executor
* [MESOS-613] - Give better guidance when recovery fails
* [MESOS-626] - Add the ability for example frameworks to checkpoint
* [MESOS-634] - Make slave recovery more robust by ignoring absence of files
* [MESOS-651] - Expose slave re-registration time in the Web UI
* [MESOS-663] - Expose recovery errors when running recovery in --no-strict mode
** New Feature
* [MESOS-110] - Slave Recovery: A slave restart should not restart tasks
* [MESOS-203] - Killtree that recursively kills sessions
* [MESOS-504] - Add weighted DRF.
* [MESOS-505] - Add resource reservations/pools per role.
* [MESOS-506] - Implement Jenkins scheduler for Mesos
** Task
* [MESOS-643] - Revert the semantics of newly introduced changes to FrameworkReregistered messages
* [MESOS-647] - Revert the default recovery mode to strict
Release Notes - Mesos - Version 0.13.0
--------------------------------------
* This release includes a major refactor of the internal testing infrastructure.
* There are also several bug fixes and stability improvements (esp. around ZooKeeper).
* Hadoop on Mesos is moved out of the mesos repo to its own repository (https://github.com/mesos/hadoop).
All Issues:
** Bug
* [MESOS-77] - ExceptionTest.AbortOnFrameworkError sometimes hangs if Mesos built without optimizations
* [MESOS-201] - CppFramework test occasionally fails
* [MESOS-217] - LOST tasks are incorrectly reconciled between mesos and framework
* [MESOS-232] - Unit test CoordinatorTest.Elect triggers non-deterministic assertion failure in libprocess.
* [MESOS-276] - SIGSEGV with current trunk and OpenJDK 7u3
* [MESOS-277] - Java test framework test is flaky
* [MESOS-289] - Zookeeper tests are flaky
* [MESOS-301] - Coordinator test is flaky
* [MESOS-318] - os::memory does not consider sysinfo.mem_unit
* [MESOS-321] - libprocess http::encode fails test
* [MESOS-344] - --disable-java still uses java headers during make check
* [MESOS-353] - ZooKeeper state test GetSetGet hung
* [MESOS-362] - Inconsistent slave maps in the master.
* [MESOS-365] - Slave should reject tasks before registering with the master.
* [MESOS-366] - Master check failure during load tests.
* [MESOS-369] - Mesos tests spitting out error messages.
* [MESOS-379] - Zookeeper MasterDetectorExpireSlaveZKSessionNewMaster test is flaky
* [MESOS-385] - MasterTest.TaskRunning flaky on Jenkins.
* [MESOS-392] - FaultTolerance SchedulerExit test hangs
* [MESOS-393] - Forking at an unlucky time on OS X can cause the C++ library to deadlock.
* [MESOS-394] - Don't do ExecutorLauncher in forked process but exec first instead.
* [MESOS-395] - FaultToleranceTest.SchedulerFailoverFrameworkMessage test is flaky.
* [MESOS-399] - MonitorTest.WatchUnwatch failed.
* [MESOS-400] - Example Java framework test is flaky
* [MESOS-401] - SlaveRecoveryTest/0.RecoverTerminatedExecutor is flaky on OSX.
* [MESOS-402] - CoordinatorTest.TruncateNotLearnedFill test is flaky
* [MESOS-405] - SlaveRecoveryTest/1.ReconnectExecutor crashes.
* [MESOS-406] - Google mock throws a segfault when invoked by TestFilter
* [MESOS-407] - Google test filter processing is incorrect for the empty string.
* [MESOS-408] - FaultToleranceTest.SlavePartitioned is flaky
* [MESOS-412] - MasterTest.ShutdownUnregisteredExecutor flaky
* [MESOS-423] - A slave asked to shutdown should not reregister with a new slave id
* [MESOS-424] - CgroupsIsolatorTest.BalloonFramework runs forever
* [MESOS-436] - FaultToleranceTest.SchedulerFailover test is flaky
* [MESOS-437] - ResourceOffersTest.ResourceOfferWithMultipleSlaves is flaky
* [MESOS-440] - Allow for headroom in the GC algorithm.
* [MESOS-441] - AllocatorZooKeeperTest/0.FrameworkReregistersFirst is flaky
* [MESOS-446] - Master should shutdown slaves that were deactivated
* [MESOS-447] - Master should send TASK_LOST updates for unknown tasks when slave reregisters
* [MESOS-450] - The master should shut down slaves upon removal.
* [MESOS-453] - AllocatorZookeeper tests are using /tmp/mesos work directory
* [MESOS-454] - ResourceOffers tests are using /tmp/mesos working directory
* [MESOS-462] - Resource usage collection failure messages have '1' as the failure message.
* [MESOS-469] - Scheduler driver should call disconnected on master failover
* [MESOS-474] - Mesos 0.10.0: make check fails on Ubuntu 12.04LTS
* [MESOS-476] - Upgrade libev to 4.15
* [MESOS-481] - Slave needs to only inform the master about non-terminal executors, for proper resource accounting
* [MESOS-482] - Status update manager should not cleanup the stream when there are pending updates, even though it received an ACK for a terminal update
* [MESOS-484] - Latest ZooKeeperState.cpp doesn't compile on Mountain Lion
* [MESOS-502] - Slave crashes when handling duplicate terminal updates
* [MESOS-515] - Slave fails to detect free disk space
* [MESOS-524] - JVM tests are flaky on OSX.
* [MESOS-530] - A registered slave should check registration id when it receives multiple (re-)registered messages from the master
* [MESOS-535] - Master crashes when removing non-checkpointing framework from a checkpointing slave
* [MESOS-538] - Master should not offer non-checkpointing slave's resources to checkpointing frameworks
* [MESOS-587] - MesosNativeLibrary.java doesn't read environment variable "MESOS_NATIVE_LIBRARY".
* [MESOS-604] - Slave should use the filesystem containing the work directory for disk usage calculation
* [MESOS-605] - Fix wrong compression for hadoop package
* [MESOS-606] - Slave incorrectly moves a task from 'terminatedTasks' to 'completedTasks'
* [MESOS-609] - Executor should remove the task from queuedTasks when it moves a queued task to terminatedTasks.
** Improvement
* [MESOS-46] - Refactor MasterTest to use fixture
* [MESOS-134] - Add Python documentation
* [MESOS-140] - Unrecognized command line args should fail the process
* [MESOS-242] - Add more tests to Dominant Share Allocator
* [MESOS-305] - Inform the frameworks / slaves about a master failover
* [MESOS-346] - Improve OSX configure output when deprecated headers are present.
* [MESOS-360] - Mesos jar should be built for java 6
* [MESOS-409] - Master detector code should stat nodes before attempting to create
* [MESOS-472] - Separate ResourceStatistics::cpu_time into ResourceStatistics::cpu_user_time and ResourceStatistics::cpu_system_time.
* [MESOS-493] - Expose version information in http endpoints
* [MESOS-503] - Master should log LOST messages sent to the framework
* [MESOS-526] - Change slave command line flag from 'safe' to 'strict'
* [MESOS-602] - Allow Mesos native library to be loaded from an absolute path
* [MESOS-603] - Add support for better test output in newer versions of autotools
** New Feature
* [MESOS-169] - Ability for tests to catch in-flight messages that are dispatched
** Task
* [MESOS-618] - Remove Hadoop on Mesos from repository in favor of external repository
* [MESOS-643] - Revert the semantics of newly introduced changes to FrameworkReregistered messages
Release Notes - Mesos - Version 0.12.1
--------------------------------------
* This release is primarily a bug fix release with a few small features for running JVM frameworks, like Hadoop and Jenkins.
All Issues:
** Bug
* [MESOS-515] - Slave fails to detect free disk space
* [MESOS-524] - JVM tests are flaky on OSX.
* [MESOS-587] - MesosNativeLibrary.java doesn't read environment variable "MESOS_NATIVE_LIBRARY".
* [MESOS-597] - Set MESOS_NATIVE_LIBRARY or (DY)LD_LIBRARY_PATH before launching an executor in order to enable JVM based executors to easily find libmesos.so.
* [MESOS-599] - Make sure stderr/stdout get launcher output.
* [MESOS-604] - Slave should use the filesystem containing the work directory for disk usage calculation
* [MESOS-605] - Fix wrong compression for hadoop package
** Improvement
* [MESOS-179] - Need to check for Python development headers
* [MESOS-346] - Improve OSX configure output when deprecated headers are present.
* [MESOS-360] - Mesos jar should be built for java 6
* [MESOS-602] - Allow Mesos native library to be loaded from an absolute path
* [MESOS-603] - Add support for better test output in newer versions of autotools
Release Notes - Mesos - Version 0.12.0
--------------------------------------
* This release includes bug fixes and stability improvements.
* The primary feature in this release is executor resource consumption monitoring. Slaves now monitor resource consumption of running executors and expose it over JSON and through the webui.
* This release also includes a new and improved Hadoop framework. The new port doesn't require patching Hadoop (i.e., it's a self-contained Hadoop contrib) and lets you use existing schedulers (e.g., the fair scheduler or capacity scheduler)! The tradeoff, however, is that it doesn't take as much advantage of the fine-grained nature of a Mesos task (i.e., there is no longer a 1-1 mapping between a Mesos task and a map/reduce task).
All Issues:
** Sub-task
* [MESOS-214] - Report resources being used by executors
* [MESOS-419] - Old slave directories should not be garbage collected based on file modification time
** Bug
* [MESOS-107] - Scheduler library should not acknowledge a status update if the driver has been aborted.
* [MESOS-152] - Slave should forward status updates for unknown tasks
* [MESOS-285] - configure.macosx checks for version "10.7" but should check for 10.7 or greater
* [MESOS-307] - Web UI file download links are broken.
* [MESOS-317] - python mesos core bindings rejects framework messages with null bytes
* [MESOS-319] - Fix buggy read / write calls.
* [MESOS-325] - make clean is broken
* [MESOS-332] - Executor launcher fetches resources as slave user, instead of executor user.
* [MESOS-340] - Gperftools target always rebuilds.
* [MESOS-374] - HTTP GET requests to /statistics/snapshot.json crash the slave
* [MESOS-422] - Master leader election should be more robust to stale ephemeral nodes
* [MESOS-486] - TaskInfo should include a 'source' in order to enable getting resource monitoring statistics.
** Improvement
* [MESOS-293] - Make clean deletes checked in files.
** New Feature
* [MESOS-99] - display slave resource usage information in the slave webui
** Task
* [MESOS-274] - Unicode / Binary files over http endpoints.
* [MESOS-324] - Monitor executor resource usage.
* [MESOS-331] - Add --disable-perftools to configure.
Release Notes - Mesos - Version 0.11.0
--------------------------------------
** Brainstorming
* [MESOS-357] - participate in GSoC 2013
* [MESOS-358] - do not participate in the GSoC 2013
All Issues:
** Bug
* [MESOS-260] - Implement a duration abstraction
* [MESOS-261] - bootstrap fails when automake version >= 1.12
* [MESOS-263] - Complete the new webui (slave, framework, executor pages)
* [MESOS-264] - Make fails on the latest Ubuntu
* [MESOS-270] - Log viewing broken on mesos-local runs
* [MESOS-364] - cgroup tests fail on Ubuntu
* [MESOS-386] - AllocatorTest/0.TaskFinished has incomplete expectations.
* [MESOS-388] - Latest update breaks building on OSX
* [MESOS-404] - FilesTest.BrowseTest is flaky
* [MESOS-413] - AllocatorTest/0.TaskFinished test has bad expectations.
** Improvement
* [MESOS-252] - Web UI Improvements
* [MESOS-253] - Enable -Wall -Werror on the build
* [MESOS-254] - Improve mesos slave's garbage collection
* [MESOS-259] - Expose slave attributes in slave endpoint & sortable tables
** Task
* [MESOS-275] - HTTP endpoint for file download.
* [MESOS-279] - Impose a limit on HTTP Response size
* [MESOS-389] - Add OSX slave to the Jenkins build.
* [MESOS-491] - Add mesos-0.11.0-incubating jar to maven central
Release Notes - Mesos - Version 0.10.0
--------------------------------------
All Issues:
** Sub-task
* [MESOS-89] - create utilities to collect information from the proc filesystem
* [MESOS-212] - Eliminate Bottle and Python based webui.
* [MESOS-222] - Rename SimpleAllocator to DominantShareAllocator
* [MESOS-223] - Libprocess-ify Allocator
* [MESOS-224] - Write allocator tests
** Bug
* [MESOS-17] - Hadoop executors killed while tasks in COMMIT_PENDING
* [MESOS-34] - Rendering JSON needs to escape strings properly.
* [MESOS-44] - Master Detector uses the wrong ACL when auth is not required
* [MESOS-48] - Remove the failover flag from executor driver
* [MESOS-54] - Mesos ZooKeeper authentication is broken
* [MESOS-83] - Filters should be removed when more resources are available than rejected offer had
* [MESOS-145] - mesos executor holds on to fd spawned by slave after slave death, preventing slave from restarting
* [MESOS-148] - Building of included Hadoop broken
* [MESOS-164] - Crash on Mac OS X due to dlopen not being thread-safe
* [MESOS-183] - Included MPI Framework Fails to Start
* [MESOS-187] - Mesos should not pass an invalid task to a slave
* [MESOS-190] - Slave seg fault when executor exited
* [MESOS-199] - Attempting to use killtree.sh after forked pid has died is fruitless.
* [MESOS-200] - New Linux 'proc' code assuming too new a kernel.
* [MESOS-209] - A race bug in ProcessManager::spawn in libprocess.
* [MESOS-211] - Fix Slave GC tests
* [MESOS-218] - Master throws exception on removeTask() if Framework is not connected
* [MESOS-220] - Slave throws exception in libprocess when master is down
* [MESOS-229] - mesos zookeeper group code fails to connect when pre-existing children of the group path are read-only
* [MESOS-233] - port should not be of type short
* [MESOS-239] - Allocator doesn't handle framework failover correctly
* [MESOS-244] - Mesos slave process is not shutting down cleanly
* [MESOS-247] - Ranges comparison has a bug
* [MESOS-248] - Python quit unexpectedly while using mesos plugin
* [MESOS-251] - DRF allocator doesn't expire filter correctly
* [MESOS-257] - Master doesn't recover resources when executor exits
* [MESOS-262] - Slave should not charge the resources required for launching an executor against the executor
* [MESOS-266] - When the master removes a slave, a shutdown should be sent.
* [MESOS-268] - Slave should force kill executors when it is shutting down
* [MESOS-278] - Master Fails to Connect to ZooKeeper
* [MESOS-284] - Short-term fix for fire-walling slave shutdown and lost slave messages from the 'wrong' master
* [MESOS-286] - AllocatorTest is flaky
* [MESOS-288] - Latest trunk does not finish make check
* [MESOS-290] - Jobtracker can't get TaskTrackerInfo when the JobTracker log file is deleted
* [MESOS-299] - Master detector doesn't notify about leading master after network disconnection
* [MESOS-302] - Scheduler driver shouldn't send an ACK if the driver is aborted while sending stats update
* [MESOS-303] - mesos slave crashes during framework termination
* [MESOS-306] - Mesos-master frequently crashes
* [MESOS-310] - cgroups isolation module should not block on fetching executors
* [MESOS-339] - Release script is expecting enter, not any key
* [MESOS-382] - FaultToleranceTest.FrameworkReliableRegistration test is flaky.
* [MESOS-383] - AllocatorTest/0.FrameworkExited test is broken
* [MESOS-464] - mesos 0.10.0 fails to build on Ubuntu 13.04
** Improvement
* [MESOS-8] - Maintain a history of executed frameworks/tasks and show it on the web UI
* [MESOS-57] - Submit mesos.jar to Maven Central
* [MESOS-142] - Explicitly set and clean test work directories
* [MESOS-149] - Garbage collection on slaves
* [MESOS-171] - Make CommandInfo 'uri' field be repeated, possibly making a URI embedded message to describe whether or not we should 'chmod +x' the resulting resource.
* [MESOS-180] - Update the Hadoop patch to list protobuf-2.4.1 as a dependency so Maven pulls it down.
* [MESOS-193] - Create a single-page javascript interface to replace the existing webui
* [MESOS-194] - Make killtree more verbose.
* [MESOS-255] - Expose files through HTTP endpoints.
* [MESOS-256] - Introduce a cluster name into the Web UI
* [MESOS-272] - Create a 'fs' namespace and migrate as appropriate from our 'os' namespace.
** New Feature
* [MESOS-86] - Expose master url to the scheduler
* [MESOS-158] - Make ExecutorInfo more rich
* [MESOS-185] - Provide a master stat indicating number of outstanding resource offers
* [MESOS-207] - A new isolation module on Linux that uses Linux control groups (cgroups) directly.
* [MESOS-208] - Add whitelist option to master
** Question
* [MESOS-258] - mesos-master / mesos-slave => error: [Errno 32] Broken pipe
* [MESOS-298] - Executor fails to start
* [MESOS-311] - ClassNotFoundException when deploying hadoop on mesos
** Task
* [MESOS-69] - Migrate to Apache wiki and de-activate github wiki
* [MESOS-80] - Add a page or section to the wiki defining project coding standards
* [MESOS-133] - Make Mesos clean of most GCC warnings
* [MESOS-398] - Add mesos-0.10.0-incubating jar to maven central
Release Notes - Mesos - Version 0.9.0
-------------------------------------
** Dependency upgrade
* [MESOS-174] - Upgrade protobuf dependency to version 2.4.1
** Improvement
* [MESOS-3] - Ask executors to shutdown when a framework goes away
* [MESOS-167] - Make API names consistent
** New Feature
* [MESOS-146] - EC2 scripts should find the latest AMI from a known URL