Release Notes - Mesos - Version 1.7.3 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-8467] - Destroyed executors might be used after `Slave::publishResource()`.
* [MESOS-8537] - Default executor doesn't wait for status updates to be ack'd before shutting down.
* [MESOS-9124] - Agent reconfiguration can cause master to unsuppress on scheduler's behalf.
* [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
* [MESOS-9529] - `/proc` should be remounted even if a nested container set `share_pid_namespace` to true.
* [MESOS-9549] - nvidia/cuda 10 does not work on GPU isolator.
* [MESOS-9564] - Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace.
* [MESOS-9568] - SLRP does not clean up mount directories for destroyed MOUNT disks.
* [MESOS-9581] - Mesos package naming appears to be nondeterministic.
* [MESOS-9590] - Mesos CI sometimes, incorrectly, overwrites already-pushed mesos master nightly images with new images built from non-master branches.
* [MESOS-9607] - Removing a resource provider with consumers breaks resource publishing.
* [MESOS-9610] - Fetcher vulnerability - escaping from sandbox.
* [MESOS-9616] - `Filters.refuse_seconds` declines resources not in offers.
* [MESOS-9619] - Mesos Master Crashes with Launch Group when using Port Resources
* [MESOS-9661] - Agent crashes when SLRP recovers dropped operations.
* [MESOS-9692] - Quota may be under allocated for disk resources.
* [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
* [MESOS-9707] - Calling link::lo() may cause runtime error
* [MESOS-9750] - Agent V1 GET_STATE response may report a complete executor's tasks as non-terminal after a graceful agent shutdown.
* [MESOS-9766] - /__processes__ endpoint can hang.
* [MESOS-9785] - Frameworks recovered from reregistered agents are not reported to master `/api/v1` subscribers.
* [MESOS-9786] - Race between two REMOVE_QUOTA calls crashes the master.
* [MESOS-9787] - Log slow SSL (TLS) peer reverse DNS lookup.
* [MESOS-9803] - Memory leak caused by an infinite chain of futures in `UriDiskProfileAdaptor`.
* [MESOS-9836] - Docker containerizer overwrites `/mesos/slave` cgroups.
* [MESOS-9847] - Docker executor doesn't wait for status updates to be ack'd before shutting down.
* [MESOS-9852] - Slow memory growth in master due to deferred deletion of offer filters and timers.
* [MESOS-9856] - REVIVE call with specified role(s) clears filters for all roles of a framework.
* [MESOS-9868] - NetworkInfo from the agent /state endpoint is not correct.
* [MESOS-9870] - Simultaneous adding/removal of a role from framework's roles and its suppressed roles crashes the master.
* [MESOS-9887] - Race condition between two terminal task status updates for Docker/Command executor.
* [MESOS-9889] - Master CPU high due to unexpected foreachkey behaviour in Master::__reregisterSlave.
* [MESOS-9893] - `volume/secret` isolator should cleanup the stored secret from runtime directory when the container is destroyed.
* [MESOS-9925] - Default executor takes a couple of seconds to start and subscribe to the Mesos agent.
* [MESOS-9964] - Support destroying UCR containers in provisioning state.
* [MESOS-9966] - Agent crashes when trying to destroy orphaned nested container if root container is orphaned as well.
* [MESOS-9968] - WWWAuthenticate header parsing fails when commas are in (quoted) realm.
* [MESOS-10007] - Command executor can miss exit status for short-lived commands due to double-reaping.
* [MESOS-10015] - updateAllocation() can stall the allocator with a huge number of reservations on an agent.
* [MESOS-10018] - Duplicate tasks if agent partitioned during maintenance down.
* [MESOS-10084] - Detecting whether executor is generated for command task should work when the launcher_dir changes.
* [MESOS-10092] - Cannot pull image from docker registry which does not reply with 'scope'/'service' in WWW-Authenticate header.
** Improvements
* [MESOS-8880] - Add minimum capabilities in the master.
* [MESOS-9159] - Support Foreign URLs in docker registry puller.
* [MESOS-9540] - Support `DESTROY_DISK` on preprovisioned CSI volumes.
* [MESOS-9545] - Marking an unreachable agent as gone should transition the tasks to terminal state.
* [MESOS-9675] - Docker Manifest V2 Schema2 Support.
* [MESOS-9704] - Support docker manifest v2s2 config GC.
* [MESOS-9759] - Log required quota headroom and available quota headroom in the allocator.
* [MESOS-9948] - master::Slave::hasExecutor occupies 37% of a 150 second perf sample.
* [MESOS-10017] - Log all reverse DNS lookup failures in 'legacy' TLS (SSL) hostname validation scheme.
Release Notes - Mesos - Version 1.7.2
-------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-8887] - Unreachable tasks are not GC'ed when unreachable agent is GC'ed.
* [MESOS-9210] - Mesos v1 scheduler library does not properly handle SUBSCRIBE retries.
* [MESOS-9517] - SLRP should treat gRPC timeouts as non-terminal errors, instead of reporting OPERATION_FAILED.
* [MESOS-9531] - chown error handling is incorrect in createSandboxDirectory.
* [MESOS-9532] - ResourceOffersTest.ResourceOfferWithMultipleSlaves is flaky.
* [MESOS-9533] - CniIsolatorTest.ROOT_CleanupAfterReboot is flaky.
* [MESOS-9537] - SLRP sends inconsistent status updates for dropped operations.
* [MESOS-9544] - SLRP does not clean up destroyed persistent volumes.
* [MESOS-9554] - Allocator might skip allocations because a single framework is incapable of receiving certain resources.
* [MESOS-9555] - Allocator CHECK failure: reservationScalarQuantities.contains(role).
** Improvement
* [MESOS-9340] - Log all socket errors in libprocess.
Release Notes - Mesos - Version 1.7.1
-------------------------------------
* This is a bug fix release. It also includes performance and API
improvements:
* **Allocator**: Improved allocation cycle time substantially
(see MESOS-9239 and MESOS-9249). These reduce the allocation
cycle time in some benchmarks by 80%.
* **Scheduler API**: Improved the experimental `CREATE_DISK` and
`DESTROY_DISK` operations for CSI volume recovery (see MESOS-9275
and MESOS-9321). Storage local resource providers now return disk
resources with the `source.vendor` field set, so frameworks need to
upgrade the `Resource` protobuf definitions.
* **Scheduler API**: Offer operation feedback now includes agent
IDs and resource provider IDs (see MESOS-9293).
** Bug
* [MESOS-7042] - Send SIGKILL after SIGTERM to IOSwitchboard after container termination.
* [MESOS-7474] - Mesos fetcher cache doesn't retry when missed.
* [MESOS-8545] - AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky.
* [MESOS-8907] - Docker image fetcher fails with HTTP/2.
* [MESOS-8978] - Command executor calling setsid breaks the tty support.
* [MESOS-9131] - Health checks launching nested containers while a container is being destroyed lead to unkillable tasks.
* [MESOS-9152] - Close all file descriptors except whitelist_fds in posix/subprocess.
* [MESOS-9154] - MasterTest.TaskStateMetrics is flaky
* [MESOS-9164] - Subprocess should unset CLOEXEC on whitelisted file descriptors.
* [MESOS-9228] - SLRP does not clean up plugin containers after it is removed.
* [MESOS-9231] - `docker inspect` may return an unexpected result to Docker executor due to a race condition.
* [MESOS-9266] - Whenever our packaging tasks trigger errors we run into permission problems.
* [MESOS-9267] - Mesos agent crashes when CNI network is not configured but used.
* [MESOS-9274] - v1 JAVA scheduler library can drop TEARDOWN upon destruction.
* [MESOS-9279] - Docker Containerizer 'usage' call might be expensive if mount table is big.
* [MESOS-9281] - SLRP gets a stale checkpoint after system crash.
* [MESOS-9283] - Docker containerizer actor can get backlogged with large number of containers.
* [MESOS-9293] - If a framework loses operation information it cannot reconcile to acknowledge updates.
* [MESOS-9295] - Nested container launch could fail if the agent is upgraded with new cgroup subsystems.
* [MESOS-9308] - URI disk profile adaptor could deadlock.
* [MESOS-9317] - Some master endpoints do not handle failed authorization properly.
* [MESOS-9324] - Resource fragmentation: frameworks may be starved of port resources in the presence of a large number of frameworks with quota.
* [MESOS-9332] - Nested container should run as the same user of its parent container by default.
* [MESOS-9334] - Container stuck at ISOLATING state due to libevent poll never returns.
* [MESOS-9362] - Test `CgroupsIsolatorTest.ROOT_CGROUPS_CreateRecursively` is flaky.
* [MESOS-9411] - Validation of JWT tokens using HS256 hashing algorithm is not thread safe.
* [MESOS-9418] - Add support for the `Discard` blkio operation type.
* [MESOS-9419] - Executor to framework message crashes master if framework has not re-registered.
* [MESOS-9474] - Master does not respect authorization result for `CREATE_DISK` and `DESTROY_DISK`.
* [MESOS-9479] - SLRP does not set RP ID in produced OperationStatus.
* [MESOS-9480] - Master may skip processing authorization results for `LAUNCH_GROUP`.
* [MESOS-9492] - Persist CNI working directory across reboot.
* [MESOS-9501] - Mesos executor fails to terminate and gets stuck after agent host reboot.
* [MESOS-9502] - IOswitchboard cleanup could get stuck due to FD leak from a race.
* [MESOS-9505] - `make check` failed with linking errors when c-ares is installed.
* [MESOS-9508] - Official 1.7.0 tarball can't be built on Ubuntu 16.04 LTS.
* [MESOS-9518] - CNI_NETNS should not be set for orphan containers that do not have network namespace.
* [MESOS-9519] - Unable to build Mesos with CMake on Ubuntu 14.04.
** Improvement
* [MESOS-6765] - Make the Resources wrapper "copy-on-write" to improve performance.
* [MESOS-9239] - Improve sorting performance in the DRF sorter.
* [MESOS-9249] - Avoid dirtying the DRF sorter when allocating resources.
* [MESOS-9255] - Use consistent "totals" across role / framework DRF.
* [MESOS-9275] - Allow optional `profile` to be specified in `CREATE_DISK` offer operation.
* [MESOS-9305] - Create cgroup recursively to work around systemd deleting cgroups_root.
* [MESOS-9321] - Add an optional `vendor` field in `Resource.DiskInfo.Source`.
* [MESOS-9325] - Optimize `Resources::filter` operation.
* [MESOS-9486] - Set up `object.value` for `CREATE_DISK` and `DESTROY_DISK` authorizations.
* [MESOS-9510] - Disallowed nan, inf and so on in `Value::Scalar`.
* [MESOS-9516] - Extend `min_allocatable_resources` flag to cover non-scalar resources.
Release Notes - Mesos - Version 1.7.0
-------------------------------------
This release contains the following highlights:
* Performance Improvements:
* **Master `/state` endpoint:** Adopted RapidJSON and reduced
copying for a ~130% throughput improvement due to a ~55%
decrease in latency (MESOS-9092). Also, added parallel
processing of `/state` requests to reduce master backlogging
/ interference under high request load (MESOS-9122).
* **Allocator:** Improved allocator cycle time significantly
(MESOS-9087). This, together with the reduced master
backlogging from `/state` improvements, reduces the
end-to-end offer cycling time between Mesos and schedulers.
* **Agent `/containers` endpoint:** Fixed a performance issue
that caused high latency / cpu consumption when there are
many containers on the agent (MESOS-8418).
* **Agent container launching performance improvements**:
The expensive `cgroups::verify()` calls were removed, which
provides a significant improvement to container launch /
destroy throughput (MESOS-9081).
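The relationship between the `/state` latency and throughput figures above can be checked with a little arithmetic. The following is a rough sketch only, assuming requests are served one at a time so that throughput is the inverse of per-request latency (an assumption made for illustration, not something stated in MESOS-9092). Under that assumption, a 55% latency decrease corresponds to a throughput gain of a bit over 120%, in the same range as the rounded ~130% figure. The function name `throughput_gain` is hypothetical.

```python
def throughput_gain(latency_decrease):
    """Relative throughput gain, assuming throughput = 1 / latency.

    latency_decrease is a fraction in [0, 1), e.g. 0.55 for a 55% decrease.
    """
    if not 0 <= latency_decrease < 1:
        raise ValueError("latency_decrease must be in [0, 1)")
    # New latency is (1 - d) times the old one, so new throughput is
    # 1 / (1 - d) times the old one; subtract 1 to get the relative gain.
    return 1.0 / (1.0 - latency_decrease) - 1.0


gain = throughput_gain(0.55)
print(f"{gain:.1%}")
```

Because both release-note figures are rounded, an exact match is not expected; a latency decrease slightly above 55% yields the quoted ~130%.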
* Containerization:
* [MESOS-8794] - **Experimental** Supported docker image tarball
fetching from HDFS through the `--docker_registry` agent flag.
* [MESOS-7691] - Added a new option `cgroups/all` to the agent
flag `--isolation`. This allows cgroups isolator to
automatically load all the local enabled cgroups subsystems.
If this option is specified in the agent flag `--isolation`
along with other cgroups related options
(e.g., `cgroups/cpu`), those options will be just ignored.
* [MESOS-7947] - Added a new `--gc_non_executor_container_sandboxes`
option which tells the agent to garbage collect sandboxes created
via the LAUNCH_NESTED_CONTAINER API. The same flag will apply to
standalone container sandboxes in future.
* [MESOS-8327] - Added container-specific cgroups mounts under
`/sys/fs/cgroup` to containers with image launched by Mesos
containerizer.
* [MESOS-5647] - Expose network statistics for containers on
CNI network in the `network/cni` isolator.
* [MESOS-8792] - Added a new `linux/devices` isolator that
automatically populates containers with devices that have
been whitelisted with the `--allowed_devices` agent flag.
* [MESOS-8340] - Added a new `--enforce_container_ports`
option to toggle ports resource enforcement by the
`network/ports` isolator.
* [MESOS-6451] - Add timer and percentile metrics for docker
pull latency distribution.
* Windows:
* [MESOS-8668] - Added support to libprocess for the Windows
Thread Pool API, replacing libevent with the native Windows
event and thread pool library. This can be enabled with
`-DENABLE_LIBWINIO=ON` during CMake configuration. By
utilizing I/O Completion Ports, this enables non-blocking
asynchronous I/O on Windows for sockets, pipes, and files.
* Multi-Framework Workloads:
* [MESOS-8842] - **Experimental** Added per-framework metrics
to the master. These new metrics provide detailed information
about the behavior of each framework and can help with
scalability testing, debugging, and fine grained monitoring.
Please refer to docs/monitoring.md for more details.
* [MESOS-8238] - Documentation was added in the framework
development guide to provide recommendations on how schedulers
can behave co-operatively in a multi-framework setting, as
well as how to operationally configure Mesos in such a setting.
* [MESOS-8936] - A new weighted random sorter was added as an
alternative to the existing DRF sorter. This allows users
who don't need DRF behavior to opt out.
Additional API Changes:
* [MESOS-9066] - Introduced `CREATE_DISK` and `DESTROY_DISK` offer
operations to replace `CREATE_VOLUME`, `CREATE_BLOCK`,
`DESTROY_VOLUME` and `DESTROY_BLOCK`.
* Container logger module interface has been changed. The `prepare()` method
now takes `ContainerID` and `ContainerConfig` instead.
* `Isolator::recover` interface has been changed to take an `std::vector`
instead of `std::list`.
* JSON endpoints now use rapidjson to provide a performance improvement.
As a result, clients whose JSON deserializer does not conform to the
ECMA-404 spec for JSON may break. For example, Mesos would previously
serialize '/' as '\/', but the spec does not require this escaping
and rapidjson does not escape '/'.
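The escaping difference described above can be checked against any ECMA-404-conforming parser. The following is a minimal sketch using Python's standard `json` module, chosen purely for illustration (it is not part of Mesos). The sample payloads and the `path` key are hypothetical.

```python
import json

# Hypothetical payloads: the same string as it might be serialized before
# and after the switch to rapidjson. Only the escaping of '/' differs.
escaped = r'{"path": "\/var\/lib\/mesos"}'   # older style: '/' written as '\/'
unescaped = '{"path": "/var/lib/mesos"}'     # rapidjson style: '/' left as-is

# A conforming parser treats '\/' and '/' as the same character,
# so both documents decode to identical values.
old_value = json.loads(escaped)
new_value = json.loads(unescaped)
same = old_value == new_value
print(same)
```

A client that decodes both forms to the same value, as here, should not be affected by the change; a hand-rolled parser that only recognizes the escaped form would need to be fixed.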
Changes to Dependencies:
* [MESOS-8395] - Made gRPC a requirement for Mesos builds. The `--enable-grpc`
Autotools option and the `-DENABLE_GRPC=ON` CMake option have been removed.
* [MESOS-8064] - Mesos now requires libarchive to programmatically decode
.zip, .tar, .gzip, and other common file compression schemes. Version 3.3.2
is bundled in Mesos.
* [MESOS-9092] - Adopt rapidjson for improved json serialization performance.
Version 1.1.0 is bundled in Mesos.
Unresolved Critical Issues:
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode()
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-5995] - Protobuf JSON deserialisation does not accept numbers formatted as strings
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-7076] - libprocess tests fail when using libevent 2.1.8
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die/killed
* [MESOS-7566] - Master crash due to failed check in DRFSorter::remove
* [MESOS-7622] - Agent can crash if a HTTP executor tries to retry subscription in running state.
* [MESOS-7721] - Master's agent removal rate limit also applies to agent unreachability.
* [MESOS-7748] - Slow subscribers of streaming APIs can lead to Mesos OOMing.
* [MESOS-7911] - Non-checkpointing framework's tasks should not be marked LOST when agent disconnects.
* [MESOS-7991] - fatal, check failed !framework->recovered()
* [MESOS-8038] - Launching GPU task sporadically fails.
* [MESOS-8137] - Mesos agent can hang during startup.
* [MESOS-8256] - Libprocess can silently deadlock due to worker thread exhaustion.
* [MESOS-8257] - Unified Containerizer "leaks" a target container mount path to the host FS when the target resolves to an absolute path
* [MESOS-8522] - `prepareMounts` in Mesos containerizer is flaky.
* [MESOS-8623] - Crashed framework brings down the whole Mesos cluster
* [MESOS-8679] - If the first KILL stuck in the default executor, all other KILLs will be ignored.
* [MESOS-8703] - Mesos master can't reconnect to zookeeper
* [MESOS-8731] - mesos master APIs become latent
* [MESOS-8769] - Agent crashes when CNI config not defined
* [MESOS-8803] - Libprocess deadlocks in a test.
* [MESOS-8840] - `cpu.cfs_quota_us` may be accidentally set for command task using docker during agent recovery.
* [MESOS-8927] - Default executor cannot kill tasks if `LAUNCH_NESTED_CONTAINER` is stuck.
* [MESOS-9006] - The agent's GET_AGENT leaks resource information when using authorization
* [MESOS-9022] - Race condition in task updates could cause missing event in streaming
* [MESOS-9049] - Agent GC could unmount a dangling persistent volume multiple times.
* [MESOS-9053] - Network ports isolator can falsely trigger while destroying containers.
* [MESOS-9109] - Windows agent uses reserved character :(colon) for file name and crashes when attempting to remove link
* [MESOS-9131] - Health checks launching nested containers while a container is being destroyed lead to unkillable tasks
* [MESOS-9157] - cannot pull docker image from dockerhub
* [MESOS-9169] - docker image fetching fails
All Resolved Issues:
** Bug
* [MESOS-2199] - Failing test: SlaveTest.ROOT_RunTaskWithCommandInfoWithUser
* [MESOS-3202] - Avoid role/framework offer starvation in DRF allocator.
* [MESOS-3475] - TestContainerizer should not modify global environment variables.
* [MESOS-3790] - ZooKeeper connection should retry on EAI_NONAME
* [MESOS-5371] - Implement `fcntl.hpp`
* [MESOS-5904] - Process routes implementation seems to drop routes on Windows.
* [MESOS-6092] - Docker containerizer launch command may access a "Container" struct after it has been destroyed
* [MESOS-6622] - NvidiaGpuTest.ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage is flaky
* [MESOS-6823] - bool/UserContainerLoggerTest.ROOT_LOGROTATE_RotateWithSwitchUserTrueOrFalse/0 is flaky
* [MESOS-6985] - os::getenv() can segfault
* [MESOS-7032] - Mesos fail NvidiaGpuTest.ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage
* [MESOS-7168] - Agent should validate that the nested container ID does not exceed certain length.
* [MESOS-7220] - 'EXPECT_SOME' and other asserts don't work with 'Try's that have a custom error state.
* [MESOS-7342] - Port Docker tests
* [MESOS-7397] - apply-reviews.py silently fails when using chain mode.
* [MESOS-7658] - apply-reviews.py fails with Unicode characters
* [MESOS-7966] - check for maintenance on agent causes fatal error
* [MESOS-8128] - Make os::pipe file descriptors O_CLOEXEC.
* [MESOS-8134] - SlaveTest.ContainersEndpoint is flaky due to getenv crash.
* [MESOS-8429] - Clean up endpoint socket if the container daemon is destroyed while waiting.
* [MESOS-8499] - Change docker health check image to the new nanoserver one
* [MESOS-8567] - Test UriDiskProfileTest.FetchFromHTTP is flaky.
* [MESOS-8568] - Command checks should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`
* [MESOS-8613] - Test `MasterAllocatorTest/*.TaskFinished` is flaky.
* [MESOS-8626] - The 'allocatable' check in the allocator is problematic with multi-role frameworks
* [MESOS-8686] - Mesos build failed with /permissive- + MSVC on windows
* [MESOS-8687] - Check failure in `ProcessBase::_consume()`.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data
* [MESOS-8838] - Consider validating that resubscribing resource providers do not change their name or type
* [MESOS-8857] - Fix subprocess(flags) logic on Windows to handle arguments with quotes
* [MESOS-8871] - Agent may fail to recover if the agent dies before image store cache checkpointed.
* [MESOS-8873] - StorageLocalResourceProviderTest.ROOT_ZeroSizedDisk is flaky.
* [MESOS-8875] - `leveldb::PosixEnv::DeleteFile()` can segfault.
* [MESOS-8884] - Flaky `DockerContainerizerTest.ROOT_DOCKER_MaxCompletionTime`.
* [MESOS-8892] - MasterSlaveReconciliationTest.ReconcileDroppedOperation is flaky
* [MESOS-8897] - ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky
* [MESOS-8906] - `UriDiskProfileAdaptor` fails to update profile selectors.
* [MESOS-8913] - Resource provider manager registry leaks file descriptors into executors.
* [MESOS-8917] - Agent leaking file descriptors into forked processes
* [MESOS-8921] - Autotools don't work with newer OpenJDK versions
* [MESOS-8932] - Quota guarantee metric does not handle removal correctly.
* [MESOS-8935] - Quota limit "chopping" can lead to cpu-only and memory-only offers.
* [MESOS-8945] - Master check failure due to CHECK_SOME(providerId).
* [MESOS-8952] - process::await/collect n^2 performance issue
* [MESOS-8954] - python3/post-reviews.py errors due to TypeError.
* [MESOS-8958] - LinuxDevicesIsolatorTest.ROOT_PopulateWhitelistedDevices fails on some boxes.
* [MESOS-8963] - Executor crash trying to print container ID.
* [MESOS-8970] - Tests relying on metrics segfault on some Linux distros.
* [MESOS-8977] - BuildBot uses Docker with AUFS that has a max file length limit of 242 characters
* [MESOS-8979] - python3/push-commits.py fails due to TypeError
* [MESOS-8980] - mesos-slave can deadlock with docker pull
* [MESOS-8985] - Posting to the operator api with 'accept recordio' header can crash the agent
* [MESOS-8987] - Master asks agent to shutdown upon auth errors.
* [MESOS-9000] - Operator API event stream can miss task status updates.
* [MESOS-9007] - XFS disk isolator doesn't clean up project ID from symlinks
* [MESOS-9008] - Fetcher fails to extract some archives containing hardlinks
* [MESOS-9010] - `UPDATE_STATE` can race with `UPDATE_OPERATION_STATUS` for a resource provider.
* [MESOS-9014] - MasterAPITest.SubscribersReceiveHealthUpdates is flaky
* [MESOS-9025] - The container which joins CNI network and has checkpoint enabled will be mistakenly destroyed by agent
* [MESOS-9027] - GPU Isolator still depends on cgroups/devices agent flag given cgroups/all is supported.
* [MESOS-9037] - DefaultExecutorTest.SigkillExecutor is flaky
* [MESOS-9038] - Archiver utility extracts links within subdirectories incorrectly
* [MESOS-9039] - CNI isolator recovery should wait until unknown orphan cleanup is done
* [MESOS-9051] - Move agent call validation into common validation library.
* [MESOS-9065] - Apply the `override` keyword globally.
* [MESOS-9073] - Tox doesn't run in the support virtualenv when using Python 3 mesos-style.py
* [MESOS-9075] - Virtualenv management in support directory is buggy.
* [MESOS-9094] - On macOS libprocess_tests fail to link when compiling with gRPC
* [MESOS-9114] - cmake build is broken on macos
* [MESOS-9115] - Stout depends on missing rapidjson headers.
* [MESOS-9116] - Launch nested container session fails due to incorrect detection of `mnt` namespace of command executor's task.
* [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable"
* [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
* [MESOS-9137] - GRPC build fails to pass compiler flags
* [MESOS-9142] - CNI detach might fail due to missing network config file.
* [MESOS-9144] - Master authentication handling leads to request amplification.
* [MESOS-9145] - Master has a fragile burned-in 5s authentication timeout.
* [MESOS-9146] - Agent has a fragile burned-in 5s authentication timeout.
* [MESOS-9147] - Agent and scheduler driver authentication retry backoff time could overflow.
* [MESOS-9149] - Failed to build gRPC on Linux without OpenSSL.
* [MESOS-9151] - Container stuck at ISOLATING due to FD leak
* [MESOS-9156] - StorageLocalResourceProviderProcess can deadlock
* [MESOS-9160] - Failed to compile gRPC when the build path contains symlinks.
* [MESOS-9163] - `UriDiskProfileAdaptor` should not update profiles when a poll returns a non-OK HTTP status.
* [MESOS-9170] - Zookeeper doesn't compile with newer gcc due to format error
* [MESOS-9171] - Mesos agent crashes in CNI isolator when usage is queried
* [MESOS-9177] - Mesos master segfaults when responding to /state requests.
* [MESOS-9185] - An attempt to remove or destroy container in composing containerizer leads to segfault.
* [MESOS-9193] - Mesos build fail with Clang 3.5.
* [MESOS-9196] - Removing rootfs mounts may fail with EBUSY.
** Epic
* [MESOS-8564] - Port libprocess-tests suites to Windows
* [MESOS-8668] - Transition libprocess on Windows to use the Thread Pool API
* [MESOS-8705] - Composing containerizer improvements
* [MESOS-8842] - Per Framework Metrics on Master
* [MESOS-8916] - Allocation logic cleanup.
* [MESOS-9013] - Support container Cgroup FS mount.
** Improvement
* [MESOS-6451] - Add timer and percentile for docker pull latency distribution.
* [MESOS-7691] - Support local enabled cgroups subsystems automatically.
* [MESOS-7947] - Add GC capability to nested containers
* [MESOS-8064] - Add capability so mesos can programmatically decode .zip, .tar, .gzip, and other common file compression schemes
* [MESOS-8106] - Docker fetcher plugin unsupported scheme failure message is not accurate.
* [MESOS-8340] - Add a no-enforce option to the `network/ports` isolator.
* [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads
* [MESOS-8680] - Rename variable names in slave.hpp to be more explicit.
* [MESOS-8788] - Add alg RS256 support for JWT generator and validator in libprocess
* [MESOS-8792] - Automatically create whitelisted devices.
* [MESOS-8798] - Build the "unsecure" gRPC libraries to remove SSL dependency.
* [MESOS-8829] - Get rid of extra `containerizer->wait()` calls in tests.
* [MESOS-8908] - Add -fno-omit-frame-pointer to improve debugging and profiling.
* [MESOS-8911] - Add framework metrics benchmark test.
* [MESOS-8919] - Per Framework SUBSCRIBE metrics.
* [MESOS-8920] - Support per-container container logger configuration.
* [MESOS-8924] - Refactor the libprocess gRPC wrapper.
* [MESOS-8955] - Manage Python2 and 3 in build steps
* [MESOS-8986] - `slave.available()` in the allocator is expensive and drags down allocation performance.
* [MESOS-8989] - Add a better benchmark for range type resources.
* [MESOS-8998] - Allow for unbundled libevent in CMake builds to work around 2.1.x SSL issues.
* [MESOS-9015] - Allow resources to be removed when updating the sorter.
* [MESOS-9055] - Make gRPC call deadline configurable.
* [MESOS-9067] - Improve performance of json parsing by avoiding conversion cost.
* [MESOS-9081] - cgroups::verify is expensive and is done implicitly during cgroups operations.
* [MESOS-9086] - Optimize range subtraction operation.
* [MESOS-9092] - Adopt rapidjson for improved json serialization performance.
* [MESOS-9104] - Refactor capability related logic in the allocator.
* [MESOS-9110] - Add move support to the Resources / Resource_ wrappers.
* [MESOS-9122] - Batch '/state' requests in the Master actor.
* [MESOS-9129] - Port mapper CNI plugin should use '-n' option with 'iptables --list'
* [MESOS-9213] - Avoid double copying of master->framework messages when incrementing metrics.
** Task
* [MESOS-2633] - Move implementations of Framework struct functions out of master.hpp.
* [MESOS-3442] - Port path_tests to Windows
* [MESOS-3444] - Port sendfile_tests
* [MESOS-5647] - Expose network statistics for containers on CNI network in the `network/cni` isolator.
* [MESOS-5814] - Port libprocess http_tests.cpp
* [MESOS-5817] - Port libprocess process_tests.cpp
* [MESOS-5941] - RemoteLink tests fail on Windows
* [MESOS-7329] - Authorize offer operations for converting disk resources
* [MESOS-7527] - Enable ProcessTest.THREADSAFE_Http2 on Windows.
* [MESOS-8314] - Add authorization to display of resource provider information in API calls and endpoints
* [MESOS-8327] - Add container-specific CGroup FS mounts under /sys/fs/cgroup/* to Mesos containers
* [MESOS-8383] - Add metrics for operations in Storage Local Resource Provider (SLRP).
* [MESOS-8395] - Made gRPC a requirement for Mesos builds.
* [MESOS-8473] - Authorize `GET_OPERATIONS` calls.
* [MESOS-8670] - Implement `process::io::read/write` using Thread Pool API
* [MESOS-8671] - Add EventLoop implementation using Thread Pool API
* [MESOS-8672] - Replace libprocess `PollSocketImpl` with IOCP and Thread Pool API
* [MESOS-8674] - Fix os::pipe to work in overlapped mode
* [MESOS-8681] - Clean up os::sendfile on Windows
* [MESOS-8712] - Remove `destroyed` promise from `Container` struct
* [MESOS-8713] - Synchronize result of `wait` and `destroy` composing c'zer methods
* [MESOS-8714] - Cleanup `containers_` hashmap once container exits
* [MESOS-8732] - Use composing containerizer in some agent tests.
* [MESOS-8734] - Restore `WaitAfterDestroy` test to check termination status of a terminated nested container.
* [MESOS-8736] - Implement a test which ensures that `wait` and `destroy` return the same result for a terminated nested container.
* [MESOS-8737] - Update composing containerizer tests.
* [MESOS-8774] - Authenticate and authorize calls to the resource provider manager's API
* [MESOS-8794] - Support docker image tarball hdfs based fetching.
* [MESOS-8814] - Mount the volume based on `Volume.mode`.
* [MESOS-8825] - Remove storage pools associated with missing profiles.
* [MESOS-8837] - Add test of resource provider manager recovery
* [MESOS-8843] - Per Framework CALL metrics
* [MESOS-8844] - Per Framework EVENT metrics
* [MESOS-8845] - Per Framework Operation metrics
* [MESOS-8846] - Per Framework state metrics
* [MESOS-8847] - Per Framework task state metrics
* [MESOS-8848] - Per Framework Offer metrics
* [MESOS-8849] - Per Framework resource allocation metrics
* [MESOS-8903] - Update the Python CLI to use Python 3
* [MESOS-8912] - Per Framework terminal task state metrics
* [MESOS-8931] - Add os::shell back to Windows
* [MESOS-8934] - Update python.m4 to support Python 3
* [MESOS-8936] - Implement a Random Sorter for offer allocations.
* [MESOS-8940] - Per Framework Offer metrics with a specific resource type
* [MESOS-8942] - Master streaming API does not send (health) check updates for tasks.
* [MESOS-8943] - Add metrics about CSI calls.
* [MESOS-8961] - Output of tasks gets corrupted if task defines the same environment variables as the executor container
* [MESOS-8990] - Build failure of the google-test dependency on Windows using MSVC.
* [MESOS-8995] - Add SLRP unit tests for missing profiles.
* [MESOS-8997] - Consider dropping PATH disk support for CSI volumes.
* [MESOS-9002] - GCC 8.1 build failure in os::Fork::Tree.
* [MESOS-9043] - Move check validators to the common validation library.
* [MESOS-9066] - Changing `CREATE_VOLUME` and `CREATE_BLOCK` to `CREATE_DISK`.
* [MESOS-9068] - Add a metrics benchmark in libprocess.
* [MESOS-9070] - Support systemd and freezer cgroup subsystems bind mount for container with rootfs.
* [MESOS-9148] - Make cgroups destroy timeout configurable for Mesos containerizer
** Documentation
* [MESOS-8740] - Update description of a Containerizer interface.
* [MESOS-9020] - Seccomp design doc
Release Notes - Mesos - Version 1.6.2 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads.
* [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable"
* [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
Release Notes - Mesos - Version 1.6.1
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-3790] - ZooKeeper connection should retry on `EAI_NONAME`.
* [MESOS-8106] - Docker fetcher plugin unsupported scheme failure message is not accurate.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data
* [MESOS-8871] - Agent may fail to recover if the agent dies before image store cache checkpointed.
* [MESOS-8904] - Master crash when removing quota.
* [MESOS-8906] - `UriDiskProfileAdaptor` fails to update profile selectors.
* [MESOS-8935] - Quota limit "chopping" can lead to cpu-only and memory-only offers.
* [MESOS-8936] - Implement a Random Sorter for offer allocations.
* [MESOS-8942] - Master streaming API does not send (health) check updates for tasks.
* [MESOS-8945] - Master check failure due to CHECK_SOME(providerId).
* [MESOS-8947] - Improve the container preparing logging in IOSwitchboard and volume/secret isolator.
* [MESOS-8952] - process::await/collect n^2 performance issue.
* [MESOS-8963] - Executor crash trying to print container ID.
* [MESOS-8980] - mesos-slave can deadlock with docker pull.
* [MESOS-8986] - `slave.available()` in the allocator is expensive and drags down allocation performance.
* [MESOS-8987] - Master asks agent to shutdown upon auth errors.
* [MESOS-9002] - GCC 8.1 build failure in os::Fork::Tree.
* [MESOS-9024] - Mesos master segfaults with stack overflow under load.
* [MESOS-9025] - The container which joins CNI network and has checkpoint enabled will be mistakenly destroyed by agent.
* [MESOS-9049] - Agent GC could unmount a dangling persistent volume multiple times.
** Improvement
* [MESOS-8934] - Update python.m4 to support Python 3.
Release Notes - Mesos - Version 1.6.0
-------------------------------------------
This release contains the following new features:
* [MESOS-4965] - **Experimental** Persistent volumes can now be resized
through new offer operations and the V1 operator API.
* [MESOS-6575] - Added a new `--xfs_kill_containers` flag to the
Mesos agent. This causes the `disk/xfs` isolator to terminate
containers that exceed their disk quota.
* [MESOS-7944] - **Experimental** Added a new `MemoryProfiler` class to
libprocess to aid in debugging memory issues.
* [MESOS-8054] - **Experimental** Schedulers can now receive feedback about
offer operations which operate on resources managed by resource providers.
In the future, this feature will be extended to operations on agent default
resources.
* [MESOS-8534] - **Experimental** A nested container is now allowed
to join a different CNI network than its parent container.
* [MESOS-8572] - Improvements to the Docker containerizer and executor
to more gracefully handle situations in which the Docker CLI is
unresponsive.
* [MESOS-8607] - The `mesos-execute` tool has been ported to Windows.
* [MESOS-8649] - **Experimental** Support for Container Storage Interface
(CSI) version 0.2 in Mesos.
* [MESOS-8659] - The Windows build now links the C runtime library
dynamically instead of statically. This requires the Visual Studio
redistributable to be available at runtime.
* [MESOS-8682] - The use of the C runtime library's POSIX wrappers on
Windows has been deprecated in favor of the native Windows APIs.
* [MESOS-8725] - Added a new `max_completion_time` field to `TaskInfo`.
Tasks which do not complete by the end of the specified duration will
fail with a new reason `REASON_MAX_COMPLETION_TIME_REACHED`.
* [MESOS-8801] - **Experimental** On Linux, Mesos can now be
configured to use the jemalloc allocator by default via the
`--enable-jemalloc-allocator` configuration option.
* Agents now support the `--fetcher_stall_timeout` flag which allows container
image and artifact fetchers to abort after the timeout when downloads stall.
Deprecations/Removals:
* Support for CSI v0.1 is deprecated in favor of CSI v0.2.
Additional API Changes:
* [MESOS-8306] - Authorization of resource reservation has been updated
to allow the restriction of which agents can statically reserve
resources for which roles.
* [MESOS-8332] - Container sandbox permissions have been changed from
0755 to 0750.
* [MESOS-8388] - Local resource provider resources are now included in
the responses to the GET_AGENTS and GET_RESOURCE_PROVIDER calls.
* [MESOS-8534] - Nested containers within a task group can now specify
separate network namespaces.
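To make the sandbox permission change in MESOS-8332 concrete, the following sketch decodes the old and new modes with Python's standard `stat` module. It only illustrates the mode bits named above; it is not Mesos code, and the helper name `describe` is made up for this example.

```python
import stat

OLD_MODE = 0o755  # sandbox mode before MESOS-8332
NEW_MODE = 0o750  # sandbox mode after MESOS-8332


def describe(mode):
    """Render a directory mode as an `ls -l` style string (illustrative helper)."""
    return stat.filemode(stat.S_IFDIR | mode)


# Permission bits present in the old mode but absent from the new one.
removed = OLD_MODE & ~NEW_MODE

print(describe(OLD_MODE))
print(describe(NEW_MODE))
print(removed == (stat.S_IROTH | stat.S_IXOTH))
```

The difference between the two modes is exactly the read and execute bits for "other" users, so accounts outside the sandbox owner and group can no longer list or traverse the sandbox directory.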
Changes to Dependencies:
* Upgraded minimum required gRPC library to version 1.10+ for gRPC-enabled builds.
Unresolved Critical Issues:
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode()
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration
* [MESOS-3533] - Unable to find and run URIs files
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-5995] - Protobuf JSON deserialisation does not accept numbers formatted as strings
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die/killed
* [MESOS-7566] - Master crash due to failed check in DRFSorter::remove
* [MESOS-7622] - Agent can crash if a HTTP executor tries to retry subscription in running state.
* [MESOS-7721] - Master's agent removal rate limit also applies to agent unreachability.
* [MESOS-7748] - Slow subscribers of streaming APIs can lead to Mesos OOMing.
* [MESOS-7911] - Non-checkpointing framework's tasks should not be marked LOST when agent disconnects.
* [MESOS-7966] - check for maintenance on agent causes fatal error
* [MESOS-7991] - fatal, check failed !framework->recovered()
* [MESOS-8137] - Mesos agent can hang during startup.
* [MESOS-8256] - Libprocess can silently deadlock due to worker thread exhaustion.
* [MESOS-8257] - Unified Containerizer "leaks" a target container mount path to the host FS when the target resolves to an absolute path
* [MESOS-8522] - `prepareMounts` in Mesos containerizer is flaky.
* [MESOS-8623] - Crashed framework brings down the whole Mesos cluster
* [MESOS-8679] - If the first KILL stuck in the default executor, all other KILLs will be ignored.
* [MESOS-8703] - Mesos master can't reconnect to ZooKeeper
* [MESOS-8731] - mesos master APIs become latent
* [MESOS-8769] - Agent crashes when CNI config not defined
* [MESOS-8803] - Libprocess deadlocks in a test.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data
* [MESOS-8840] - `cpu.cfs_quota_us` may be accidentally set for command task using docker during agent recovery.
Feature Graduations:
* [MESOS-4828] - XFS disk quota isolator.
* [MESOS-6906] - Introduce a general non-interpreting task check.
All Experimental Features:
* [MESOS-3094] - Mesos on Windows.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4355] - Implement isolator for Docker volume.
* [MESOS-4965] - Persistent volume resizing.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-5788] - Added Java API adapter for seamless transition to new scheduler API.
* [MESOS-5931] - Support auto backend in Mesos Containerizer.
* [MESOS-6014] - Added port mapping CNI plugin.
* [MESOS-7944] - Libprocess `MemoryProfiler`.
* [MESOS-8054] - Offer operation feedback.
* [MESOS-8534] - Separate CNI networks for nested containers.
* [MESOS-8649] - Support for Container Storage Interface version 0.2.
* [MESOS-8801] - Linux support for jemalloc.
All Resolved Issues:
** Bug
* [MESOS-1720] - Slave should send exited executor message when the executor is never launched.
* [MESOS-3915] - Upgrade vendored Boost
* [MESOS-4420] - Support read host physical link speed from virtio driver
* [MESOS-5333] - GET /master/maintenance/schedule/ produces 404.
* [MESOS-5820] - Port master to Windows
* [MESOS-5882] - `os::cloexec` does not exist on Windows
* [MESOS-5940] - `setPaths` doesn’t work on Windows
* [MESOS-6555] - Namespace 'mnt' is not supported
* [MESOS-6713] - Port `slave_recovery_tests.cpp`
* [MESOS-6715] - Port `uri_fetcher_tests.cpp`
* [MESOS-6822] - CNI reports confusing error message for failed interface setup.
* [MESOS-6973] - Fix BOOST random generator initialization on Windows
* [MESOS-7028] - NetSocketTest.EOFBeforeRecv is flaky.
* [MESOS-7342] - Port Docker tests
* [MESOS-7604] - SlaveTest.ExecutorReregistrationTimeoutFlag aborts on Windows
* [MESOS-7699] - "stdlib.h: No such file or directory" when building with GCC 6 (Debian stable freshly released)
* [MESOS-7742] - Race conditions in IOSwitchboard: listening on unix socket and premature closing of the connection.
* [MESOS-7803] - fs::list drops path components on Windows
* [MESOS-7944] - Implement jemalloc memory profiling support for Mesos
* [MESOS-7979] - reviewboard's GUESS_FIELDS setting leads to redundant information in commit messages
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused
* [MESOS-8140] - Executors should clear their auth tokens
* [MESOS-8232] - SlaveTest.RegisteredAgentReregisterAfterFailover is flaky.
* [MESOS-8258] - Mesos.DockerContainerizerTest.ROOT_DOCKER_SlaveRecoveryTaskContainer is flaky.
* [MESOS-8305] - DefaultExecutorTest.ROOT_MultiTaskgroupSharePidNamespace is flaky.
* [MESOS-8308] - CommandExecutorCheckTest.CommandCheckTimeout is flaky on Windows
* [MESOS-8334] - PartitionedSlaveReregistrationMasterFailover is flaky.
* [MESOS-8336] - MasterTest.RegistryUpdateAfterReconfiguration is flaky
* [MESOS-8348] - Enable function sections in the build.
* [MESOS-8350] - Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration
* [MESOS-8404] - Improve image puller error messages.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8413] - Zookeeper configuration passwords are shown in clear text
* [MESOS-8416] - CHECK failure if trying to recover nested containers but the framework checkpointing is not enabled.
* [MESOS-8440] - `network/ports` isolator kills legitimate tasks on recovery.
* [MESOS-8444] - GC failure causes agent to fail to detach virtual paths for the executor's sandbox
* [MESOS-8446] - Agent fails to detach `virtualLatestPath` for the executor's sandbox during recovery
* [MESOS-8447] - Incomplete output of apply-reviews.py --dry-run
* [MESOS-8453] - ExecutorAuthorizationTest.RunTaskGroup segfaults.
* [MESOS-8463] - Test MasterAllocatorTest/1.SingleFramework is flaky
* [MESOS-8468] - `LAUNCH_GROUP` failure tears down the default executor.
* [MESOS-8474] - Test StorageLocalResourceProviderTest.ROOT_ConvertPreExistingVolume is flaky
* [MESOS-8477] - Make clean fails without Python artifacts.
* [MESOS-8480] - Mesos returns high resource usage when killing a Docker task.
* [MESOS-8482] - Signed/Unsigned comparisons in tests
* [MESOS-8483] - ExampleTests PythonFramework fails with sigabort.
* [MESOS-8484] - stout test NumifyTest.HexNumberTest fails.
* [MESOS-8485] - MasterTest.RegistryGcByCount is flaky
* [MESOS-8489] - LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags is flaky
* [MESOS-8490] - UpdateSlaveMessageWithPendingOffers is flaky.
* [MESOS-8497] - Docker parameter `name` does not work with Docker Containerizer.
* [MESOS-8508] - Missing map header when compiling against unbundled protobuf
* [MESOS-8510] - URI disk profile adaptor does not consider plugin type for a profile.
* [MESOS-8512] - Fetcher doesn't log its stdout/stderr properly to the log file
* [MESOS-8513] - Noisy "transport endpoint is not connected" logs on closing sockets.
* [MESOS-8519] - Fix recovery of job object isolated tasks
* [MESOS-8530] - Default executor tasks can get stuck in KILLING state
* [MESOS-8536] - Pending offer operations on resource provider resources not properly accounted for in allocator
* [MESOS-8545] - AgentAPIStreamingTest.AttachInputToNestedContainerSession is flaky.
* [MESOS-8546] - PythonFramework test fails with cache write failure.
* [MESOS-8548] - Test StorageLocalResourceProviderTest.ROOT_Metrics is flaky
* [MESOS-8550] - Bug in `Master::detected()` leads to coredump in `MasterZooKeeperTest.MasterInfoAddress`.
* [MESOS-8552] - CGROUPS_ROOT_PidNamespaceForward and CGROUPS_ROOT_PidNamespaceBackward tests fail
* [MESOS-8563] - Windows executors cannot re-register
* [MESOS-8565] - Persistent volumes are not visible in Mesos UI when launching a pod using default executor.
* [MESOS-8577] - Destroy nested container if `LAUNCH_NESTED_CONTAINER_SESSION` fails
* [MESOS-8578] - UpgradeTest.UpgradeAgentIntoHierarchicalRoleForNonHierarchicalRole is flaky.
* [MESOS-8585] - Agent crashes when starting a task with an unknown user.
* [MESOS-8586] - apply-reviews.py silently does nothing when a review was submitted already.
* [MESOS-8594] - Mesos master stack overflow in libprocess socket send loop.
* [MESOS-8598] - Allow empty resource provider selector in `UriDiskProfileAdaptor`.
* [MESOS-8601] - Master crashes during slave reregistration after failover.
* [MESOS-8604] - Quota headroom tracking may be incorrect in the presence of hierarchical reservation.
* [MESOS-8605] - Terminal task status update will not send if 'docker inspect' is hung
* [MESOS-8610] - NsTest.SupportedNamespaces fails on CentOS7
* [MESOS-8611] - SlaveTest.RemoveExecutorUponFailedLaunch is flaky.
* [MESOS-8617] - Tests using default executor occasionally fail.
* [MESOS-8618] - ReconciliationTest.ReconcileStatusUpdateTaskState is flaky.
* [MESOS-8619] - Docker on Windows uses USERPROFILE instead of HOME for credentials
* [MESOS-8620] - Containers stuck in FETCHING possibly due to unresponsive server.
* [MESOS-8624] - Valid tasks may be explicitly dropped by agent due to race conditions
* [MESOS-8631] - Agent should be able to start a task with every CPU on a Windows machine
* [MESOS-8641] - Event stream could send heartbeat before subscribed
* [MESOS-8642] - balloon-executor is hard to run as unprivileged user
* [MESOS-8643] - `os::system` and `os::spawn` return -1 on valid Windows commands
* [MESOS-8644] - W* macros wrong on Windows.
* [MESOS-8646] - Agent should be able to resolve file names on open files.
* [MESOS-8647] - Enable resource provider agent capability by default
* [MESOS-8651] - Potential memory leaks in the `volume/sandbox_path` isolator
* [MESOS-8654] - The `/proc/sys` mount point in Mesos containers should also include `nosuid,noexec,nodev` mount options.
* [MESOS-8659] - Fix warning `cl : Command line warning D9025 : overriding '/MTd' with '/MDd'`
* [MESOS-8664] - Perf sampler doesn't handle extra fields and nameless counters
* [MESOS-8691] - Forward CXX_FLAGS to C++ projects and C_FLAGS to C projects in CMake
* [MESOS-8711] - SlaveTest.ChangeDomain is disabled.
* [MESOS-8719] - Mesos configured with `--enable-grpc` doesn't compile on non-Linux builds
* [MESOS-8724] - G++ Warning about libc system macros `major` and `minor` prevents Mesos build
* [MESOS-8733] - OversubscriptionTest.ForwardUpdateSlaveMessage is flaky
* [MESOS-8741] - `Add` to sequence will not run if it races with sequence destruction
* [MESOS-8742] - Agent resource provider config API calls should be idempotent.
* [MESOS-8749] - CSI proto is always included in the build when using CMake
* [MESOS-8761] - Default linker fails to link tests on FreeBSD
* [MESOS-8781] - Mesos master shouldn't silently drop operations
* [MESOS-8784] - OPERATION_DROPPED operation status updates should include the operation/framework IDs
* [MESOS-8787] - RP-related API should be experimental.
* [MESOS-8804] - Fix Ninja Release builds on Windows
* [MESOS-8818] - VolumeSandboxPathIsolatorTest.SharedParentTypeVolume fails on macOS
* [MESOS-8834] - Indirect recursion between `send` and `_send` in libprocess may cause stack overflow.
* [MESOS-8865] - Suspicious enum value comparisons in scheduler Java bindings
* [MESOS-8866] - CMake builds are missing byproduct declaration for jemalloc.
* [MESOS-8868] - Some 'FsTest' test cases fail on macOS
* [MESOS-8870] - Master does not correctly reconcile dropped operations after agent failover
* [MESOS-8874] - ResourceProviderManagerHttpApiTest.ResubscribeResourceProvider is flaky.
* [MESOS-8876] - Normal exit of Docker container using rexray volume results in TASK_FAILED.
* [MESOS-8881] - Enable epoll backend in libevent integration.
* [MESOS-8885] - Disable libevent debug mode.
** Improvement
* [MESOS-2922] - Add move constructors / assignment to Future.
* [MESOS-3022] - export additional metrics from scheduler driver
* [MESOS-4965] - Support resizing of an existing persistent volume
* [MESOS-5362] - Add authentication to example frameworks
* [MESOS-6128] - Make "re-register" vs. "reregister" consistent in the master
* [MESOS-7016] - Make default AWAIT_* duration configurable
* [MESOS-7643] - The order of isolators provided in '--isolation' flag is not preserved and instead sorted alphabetically
* [MESOS-7656] - Update the JSON <=> protobuf message conversion for map support
* [MESOS-7881] - Building gRPC with CMake
* [MESOS-7990] - Support systemd named hierarchy (name=systemd) for Mesos Containerizer.
* [MESOS-8033] - Use more idiomatic CMake for compiler features
* [MESOS-8240] - Add an option to build the new CLI and run unit tests.
* [MESOS-8306] - Restrict which agents can statically reserve resources for which roles
* [MESOS-8332] - Narrow the container sandbox permissions.
* [MESOS-8357] - Example frameworks have an inconsistent UX.
* [MESOS-8361] - Example frameworks to support launching mesos-local.
* [MESOS-8389] - Notion of "removable" task in master code is inaccurate.
* [MESOS-8390] - Notion of "transitioning" agents in the master is now inaccurate.
* [MESOS-8402] - Resource provider manager should persist resource provider information
* [MESOS-8426] - Speed up SLRP tests
* [MESOS-8427] - Clean up residual CSI endpoints for SLRP tests.
* [MESOS-8434] - Cleanup Authorization logic in master and agent
* [MESOS-8454] - Add a download link for master and agent logs in WebUI
* [MESOS-8471] - Allow revocable_resources capability for mesos-execute
* [MESOS-8488] - Docker bug can cause unkillable tasks.
* [MESOS-8506] - Add test coverage for `Resources::find` on revocable resources
* [MESOS-8556] - Boost emits warning repeatedly
* [MESOS-8573] - Container stuck in PULLING when Docker daemon hangs
* [MESOS-8574] - Docker executor makes no progress when 'docker inspect' hangs
* [MESOS-8575] - Improve discard handling for 'Docker::stop' and 'Docker::pull'.
* [MESOS-8576] - Improve discard handling of 'Docker::inspect()'
* [MESOS-8591] - Add infra to test a hung Docker daemon
* [MESOS-8599] - Build with Ninja on Windows
* [MESOS-8607] - Port mesos-execute to Windows
* [MESOS-8609] - Create a metric to indicate how long agent takes to recover executors
* [MESOS-8640] - Validate `DockerInfo` exists when container's type is `DOCKER`
* [MESOS-8656] - Improve stout JSON -> protobuf message conversion to handle more valid JSONs
* [MESOS-8658] - CMake build should use same compiler warnings as Autotools
* [MESOS-8702] - Replace the manual parsing in Mesos code with the native protobuf map support
* [MESOS-8725] - Support max_duration for tasks
* [MESOS-8728] - Don't print full usage for invocation errors
* [MESOS-8772] - Add slave recovery test for default executor.
* [MESOS-8793] - Add more logging to agent recovery path.
* [MESOS-8801] - Add jemalloc as optional third-party memory allocator
* [MESOS-8851] - Introduce a push-based gauge.
** Task
* [MESOS-3441] - Port os_tests to Windows
* [MESOS-3445] - Port signals_tests to Windows
* [MESOS-3644] - Implement stout/os/windows/signals.hpp
* [MESOS-4176] - Support CMake build on FreeBSD
* [MESOS-5726] - Benchmark the v1 Operator API
* [MESOS-5850] - Add a test that runs the 'mesos-local' binary
* [MESOS-6575] - Change `disk/xfs` isolator to terminate executor when it exceeds quota
* [MESOS-7558] - Add resource provider validation
* [MESOS-8184] - Implement master's AcknowledgeOfferOperationMessage handler.
* [MESOS-8189] - Master’s OperationStatusUpdate handler should forward updates to the framework when OfferOperationID is set.
* [MESOS-8190] - Update the master to accept OfferOperationIDs from frameworks.
* [MESOS-8191] - Implement ReconcileOfferOperations handler in the master
* [MESOS-8192] - Update the scheduler library to support request/response API calls.
* [MESOS-8275] - Remove use of ::_stat on Windows
* [MESOS-8284] - Add a ns::supported convenience API.
* [MESOS-8362] - Verify end-to-end operation status update retry after RP failover
* [MESOS-8363] - Verify that the master acknowledges operation status updates correctly
* [MESOS-8373] - Test reconciliation after operation is dropped en route to agent
* [MESOS-8382] - Master should bookkeep local resource providers.
* [MESOS-8388] - Show LRP resources in master and agent endpoints.
* [MESOS-8407] - Add SLRP unit tests for profile updates and corner cases.
* [MESOS-8408] - Add an SLRP test for CSI plugin restart.
* [MESOS-8409] - Add an SLRP test for agent registered with a new ID.
* [MESOS-8415] - Add an SLRP test for agent reboot.
* [MESOS-8420] - Test that operation status updates are retried after being dropped en-route to the master.
* [MESOS-8424] - Test that operations are correctly reported following a master failover
* [MESOS-8442] - Source tree contains generated endpoint documentation
* [MESOS-8445] - Test that `UPDATE_STATE` of a resource provider doesn't have unwanted side-effects in master or agent
* [MESOS-8462] - Unit test for `Slave::detachFile` on removed frameworks.
* [MESOS-8492] - Checkpoint profiles in storage local resource provider.
* [MESOS-8527] - Add metrics about number of subscribed LRPs on the agent.
* [MESOS-8534] - Allow nested containers in TaskGroups to have separate network namespaces
* [MESOS-8539] - Add metrics about CSI plugin terminations.
* [MESOS-8551] - Port libprocess HTTPTest.QueryEncodeDecode
* [MESOS-8569] - Allow newline characters when decoding base64 strings in stout.
* [MESOS-8650] - Bump CSI bundle to v0.2.
* [MESOS-8653] - Make the CSI client support CSI v0.2.
* [MESOS-8657] - Build CSI proto in CMake.
* [MESOS-8673] - Fix os::open to use HANDLEs
* [MESOS-8675] - Remove FD_CRT from WindowsFD
* [MESOS-8676] - Fix os::read and os::write to use HANDLEs
* [MESOS-8678] - Bump gRPC bundle to 1.10.0.
* [MESOS-8683] - Remove _close from Windows close.hpp
* [MESOS-8684] - Replace _dup with DuplicateHandle on Windows
* [MESOS-8685] - Replace _lseek with SetFilePointer
* [MESOS-8692] - Replace _chsize_s with SetEndOfFile on Windows
* [MESOS-8697] - Make gRPC-related tests cross-platform.
* [MESOS-8698] - Enable storage local resource provider in CMake.
* [MESOS-8706] - Unify return type of `wait` and `destroy` containerizer methods
* [MESOS-8710] - Update tests after changing return type of `wait` method
* [MESOS-8717] - Support CSI v0.2 in SLRP.
* [MESOS-8735] - Implement recovery for resource provider manager registrar
* [MESOS-8747] - Support resizing persistent volume through operator API
* [MESOS-8748] - Create ACL for grow and shrink volume
* [MESOS-8750] - Check failed: !slaves.registered.contains(task->slave_id)
* [MESOS-8777] - Support `STAGE_UNSTAGE_VOLUME` CSI capability in SLRP
* [MESOS-8819] - mesos.pom file hardcodes developers
* [MESOS-8833] - Port libprocess subprocess_tests.cpp
** Documentation
* [MESOS-8291] - Add documentation about fault domains
Release Notes - Mesos - Version 1.5.2 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-3790] - ZooKeeper connection should retry on `EAI_NONAME`.
* [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data
* [MESOS-8904] - Master crash when removing quota.
* [MESOS-8906] - `UriDiskProfileAdaptor` fails to update profile selectors.
* [MESOS-8935] - Quota limit "chopping" can lead to cpu-only and memory-only offers.
* [MESOS-8936] - Implement a Random Sorter for offer allocations.
* [MESOS-8942] - Master streaming API does not send (health) check updates for tasks.
* [MESOS-8945] - Master check failure due to CHECK_SOME(providerId).
* [MESOS-8947] - Improve the container preparing logging in IOSwitchboard and volume/secret isolator.
* [MESOS-8952] - process::await/collect n^2 performance issue.
* [MESOS-8963] - Executor crash trying to print container ID.
* [MESOS-8980] - mesos-slave can deadlock with docker pull.
* [MESOS-8986] - `slave.available()` in the allocator is expensive and drags down allocation performance.
* [MESOS-8987] - Master asks agent to shutdown upon auth errors.
* [MESOS-9024] - Mesos master segfaults with stack overflow under load.
* [MESOS-9049] - Agent GC could unmount a dangling persistent volume multiple times.
* [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable"
* [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
Release Notes - Mesos - Version 1.5.1
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-1720] - Slave should send exited executor message when the executor is never launched.
* [MESOS-7742] - Race conditions in IOSwitchboard: listening on unix socket and premature closing of the connection.
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8416] - CHECK failure if trying to recover nested containers but the framework checkpointing is not enabled.
* [MESOS-8468] - `LAUNCH_GROUP` failure tears down the default executor.
* [MESOS-8488] - Docker bug can cause unkillable tasks.
* [MESOS-8510] - URI disk profile adaptor does not consider plugin type for a profile.
* [MESOS-8536] - Pending offer operations on resource provider resources not properly accounted for in allocator.
* [MESOS-8550] - Bug in `Master::detected()` leads to coredump in `MasterZooKeeperTest.MasterInfoAddress`.
* [MESOS-8552] - CGROUPS_ROOT_PidNamespaceForward and CGROUPS_ROOT_PidNamespaceBackward tests fail.
* [MESOS-8565] - Persistent volumes are not visible in Mesos UI when launching a pod using default executor.
* [MESOS-8569] - Allow newline characters when decoding base64 strings in stout.
* [MESOS-8574] - Docker executor makes no progress when 'docker inspect' hangs.
* [MESOS-8575] - Improve discard handling for 'Docker::stop' and 'Docker::pull'.
* [MESOS-8576] - Improve discard handling of 'Docker::inspect()'.
* [MESOS-8577] - Destroy nested container if `LAUNCH_NESTED_CONTAINER_SESSION` fails.
* [MESOS-8594] - Mesos master stack overflow in libprocess socket send loop.
* [MESOS-8598] - Allow empty resource provider selector in `UriDiskProfileAdaptor`.
* [MESOS-8601] - Master crashes during slave reregistration after failover.
* [MESOS-8604] - Quota headroom tracking may be incorrect in the presence of hierarchical reservation.
* [MESOS-8605] - Terminal task status update will not send if 'docker inspect' is hung.
* [MESOS-8619] - Docker on Windows uses `USERPROFILE` instead of `HOME` for credentials.
* [MESOS-8624] - Valid tasks may be explicitly dropped by agent due to race conditions.
* [MESOS-8631] - Agent should be able to start a task with every CPU on a Windows machine.
* [MESOS-8641] - Event stream could send heartbeat before subscribed.
* [MESOS-8646] - Agent should be able to resolve file names on open files.
* [MESOS-8651] - Potential memory leaks in the `volume/sandbox_path` isolator.
* [MESOS-8741] - `Add` to sequence will not run if it races with sequence destruction.
* [MESOS-8742] - Agent resource provider config API calls should be idempotent.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8787] - RP-related API should be experimental.
* [MESOS-8876] - Normal exit of Docker container using rexray volume results in TASK_FAILED.
* [MESOS-8881] - Enable epoll backend in libevent integration.
* [MESOS-8885] - Disable libevent debug mode.
Release Notes - Mesos - Version 1.5.0
-------------------------------------------
This release contains the following new features:
* [MESOS-1739] - **Experimental** Agents now support the
`--reconfiguration_policy` flag which allows them to recover
the agent ID and running tasks after configuration changes.
See docs/agent-recovery.md for more details.
* [MESOS-4945] - **Experimental** Agents now can automatically
garbage collect unused Docker image layers used by Mesos
Containerizer.
* [MESOS-7289, MESOS-7235] - **Experimental** Support for the
Container Storage Interface (CSI) to simplify storage management
in Mesos, and allow third-party vendors to plug into Mesos
easily.
* [MESOS-7302] - Support launching standalone containers on the
agent using MesosContainerizer without a master or framework
running.
* [MESOS-7749] - **Experimental** Support for gRPC client in Mesos.
gRPC is bundled in Mesos and a gRPC client API is built into
libprocess.
* [MESOS-7973] - **Experimental** A non-leading replica is now allowed
to catch up on missing log positions in the replicated log. This opens
the door for implementing hot standby (by offloading some reading
from a leader to standbys) and fast failover (by keeping
in-memory storage represented by the log “hot”).
* Several improvements and fixes to the enforcement of quota
guarantees have been made:
* [MESOS-4527]: Previously a role could "game" the quota system
by amassing reservations that it leaves unused. This is now
prevented by accounting for reservations when allocating
resources.
* [MESOS-7099]: Resources are now allocated in a fine-grained
manner to prevent roles from exceeding their quota.
* [MESOS-8293]: There was a bug where a role may not receive its
reservation when it does not have quota. This has been fixed.
* [MESOS-8339]: When a role has more reservations than quota,
there was a bug previously where an insufficient amount of
quota headroom was held. This has been fixed.
* [MESOS-8352]: When allocating to a role with quota, we
previously included all other resources on the agent that the
role does not have quota for. This made it possible to violate
the quota guarantees of a different role. This has been fixed
by taking into account the headroom that is needed when
allocating the resources.
Deprecations/Removals:
* [MESOS-7305] - Some nested container agent APIs `****_NESTED_CONTAINER`
are deprecated in favor of the new generally named agent APIs
`****_CONTAINER`.
* Agent flag `--executor_secret_key` has been deprecated. Operators
should use `--jwt_secret_key` instead.
Additional API Changes:
* [MESOS-6406, MESOS-7215, MESOS-8337] Now when an agent is partitioned,
the master tracks all noncompleted tasks regardless of partition-awareness
so that when the agent reregisters, it can recover all of them and send their
latest statuses to the scheduler. NOTE: The master now sends updates for
tasks recovered from partitioned agents upon reregistration so the scheduler
can get them before reconciliation. We also fixed the buggy semantics that
exposed terminal unacknowledged tasks when partitioned as "completed" in the
HTTP endpoints and the operator API; they are now shown as "unreachable". We
plan to further improve the API on this in MESOS-8405.
* [MESOS-7550] The fields `Resource.disk.source.path.root` and
`Resource.disk.source.mount.root` can now be set to relative paths
to an agent's work directory.
* [MESOS-7660] `Filter::refuse_seconds` is now capped to 31536000
seconds (365 days).
* [MESOS-7941] Built-in executors will now send a TASK_STARTING
status update when a task is starting.
* [MESOS-7973] A new `catchup` method has been added to the
`Log.Reader` interface (including Java binding).
* [MESOS-8040] Return nested/standalone containers in `GET_CONTAINERS`
API call.
* [MESOS-8165] Master will now send TASK_GONE status for unknown
tasks of PARTITION_AWARE frameworks belonging to registered agents
during explicit reconciliation.
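As a rough illustration of the `Filter::refuse_seconds` cap noted above (MESOS-7660), the arithmetic can be sketched in Python. This is only a sketch: Mesos applies the cap internally, and the helper name `cap_refuse_seconds` and both constants are hypothetical, not part of any Mesos API.

```python
# Illustrative only: Mesos applies this cap internally. The helper and
# constants below are hypothetical and are not part of any Mesos API.
SECONDS_PER_DAY = 24 * 60 * 60
MAX_REFUSE_SECONDS = 365 * SECONDS_PER_DAY  # 31536000 seconds


def cap_refuse_seconds(requested: float) -> float:
    """Return the requested filter duration, capped at 365 days."""
    return min(requested, MAX_REFUSE_SECONDS)
```

Under these assumptions, a scheduler that declines an offer with a very large `refuse_seconds` is treated as if it had asked for 365 days, while shorter durations pass through unchanged.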
Changes to Dependencies:
* Upgraded minimum required Protobuf library to version 3+.
Feature Graduations:
* [MESOS-4791] - v1 Operator API is now considered stable. The performance has
been improved so that when using protobuf it is faster than v0, and when
using JSON it is slightly slower than v0.
* [MESOS-5116] - Add support for accounting only mode in XFS isolator.
* [MESOS-5275, MESOS-7476, MESOS-7477, MESOS-7671] - Add file-based and
protobuf-based capabilities support for mesos containerizer. This
includes the support for effective and bounding capabilities.
* [MESOS-6077] - Added a default (task group) executor.
* [MESOS-6402] - rlimit support for Mesos containerizer.
* [MESOS-6460] - Container Attach/Exec.
* [MESOS-6758] - Support docker registry that requires basic auth.
* [MESOS-7088] - Support private registry credential per container.
* [MESOS-7418] - Add support for file-based secrets.
Unresolved Critical Issues:
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode().
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration.
* [MESOS-3533] - Unable to find and run URIs files.
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string.
* [MESOS-4996] - 'containerizer->update' will always fail after killing a docker container.
* [MESOS-5352] - Docker volume isolator cleanup can be blocked by first cleanup failure.
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-5995] - Protobuf JSON deserialisation does not accept numbers formatted as strings.
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-6804] - Running 'tty' inside a debug container that has a tty reports "Not a tty".
* [MESOS-6986] - abort in DRFSorter::add.
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die/killed.
* [MESOS-7566] - Master crash due to failed check in DRFSorter::remove.
* [MESOS-7622] - Agent can crash if a HTTP executor tries to retry subscription in running state.
* [MESOS-7721] - Master's agent removal rate limit also applies to agent unreachability.
* [MESOS-7748] - Slow subscribers of streaming APIs can lead to Mesos OOMing.
* [MESOS-7911] - Non-checkpointing framework's tasks should not be marked LOST when agent disconnects.
* [MESOS-7966] - check for maintenance on agent causes fatal error.
* [MESOS-7991] - fatal, check failed !framework->recovered().
* [MESOS-8038] - Launching GPU task sporadically fails.
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused.
* [MESOS-8137] - Mesos agent can hang during startup.
* [MESOS-8256] - Libprocess can silently deadlock due to worker thread exhaustion.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8468] - `LAUNCH_GROUP` failure tears down the default executor.
All Resolved Issues:
** Bug
* [MESOS-1216] - Attributes comparator operator should allow multiple attributes of same name and type.
* [MESOS-3576] - Audit CMake linking flags.
* [MESOS-5455] - Transition away from temporary build variables.
* [MESOS-5462] - Re-organize isolator hierarchy.
* [MESOS-5656] - Incomplete modelling of 3rdparty dependencies in cmake build.
* [MESOS-5881] - Semantics of `os::symlink` differ across POSIX and Windows.
* [MESOS-5905] - Zookeeper tests do not work on CMake builds as directory structure changed.
* [MESOS-6086] - PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove is flaky.
* [MESOS-6187] - "double free or corruption" with Java 8.
* [MESOS-6345] - ExamplesTest.PersistentVolumeFramework failing due to double free corruption on Ubuntu 14.04.
* [MESOS-6406] - Send latest status for partition-aware tasks when agent reregisters.
* [MESOS-6428] - Mesos containerizer helper function signalSafeWriteStatus is not AS-Safe.
* [MESOS-6616] - Error: dereferencing type-punned pointer will break strict-aliasing rules.
* [MESOS-6671] - External 3rdparty deps are not built with the configured compiler in cmake build.
* [MESOS-6690] - Wire up resource control API to Windows Job objects API.
* [MESOS-6697] - Port `authentication_tests.cpp`.
* [MESOS-6703] - Port `credentials_tests.cpp`.
* [MESOS-6705] - Port `fetcher_tests.cpp`.
* [MESOS-6708] - Port `group_tests.cpp`.
* [MESOS-6735] - `os::realpath` semantics differ between Windows and POSIX.
* [MESOS-6784] - IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky.
* [MESOS-6790] - Wrong task started time in webui.
* [MESOS-6794] - Properly model header dependencies of cmake build components.
* [MESOS-6816] - Allows frameworks to overwrite system environment variables.
* [MESOS-6942] - CMake build with `-DENABLE_LIBEVENT=ON` requires system-installed `openssl`.
* [MESOS-6949] - SchedulerTest.MasterFailover is flaky.
* [MESOS-7007] - filesystem/shared and --default_container_info broken since 1.1.
* [MESOS-7099] - Quota can be exceeded due to coarse-grained offer technique.
* [MESOS-7130] - port_mapping isolator: executor hangs when running on EC2.
* [MESOS-7160] - Parsing of perf version segfaults.
* [MESOS-7215] - Race condition on re-registration of non-partition-aware frameworks.
* [MESOS-7223] - Linux filesystem isolator cannot mount host volume /dev/log.
* [MESOS-7296] - CMake 2.8.10 does not support TIMESTAMP.
* [MESOS-7312] - Update Resource proto for storage resource providers.
* [MESOS-7425] - ImageAlpine/ProvisionerDockerTest.ROOT_INTERNET_CURL_SimpleCommand/3 is flaky in some OS.
* [MESOS-7440] - Various DefaultExecutorCheckTest* tests flaky on ASF CI.
* [MESOS-7500] - Command checks via agent lead to flaky tests.
* [MESOS-7504] - Parent's mount namespace cannot be determined when launching a nested container.
* [MESOS-7509] - CniIsolatorPortMapperTest.ROOT_INTERNET_CURL_PortMapper fails on some Linux distros.
* [MESOS-7511] - CniIsolatorTest.ROOT_DynamicAddDelofCniConfig is flaky.
* [MESOS-7519] - OversubscriptionTest.RescindRevocableOfferWithIncreasedRevocable is flaky.
* [MESOS-7541] - Cannot compile without pre-compiled headers on Windows.
* [MESOS-7586] - Make use of cout/cerr and glog consistent.
* [MESOS-7589] - CommandExecutorCheckTest.CommandCheckDeliveredAndReconciled is flaky.
* [MESOS-7660] - HierarchicalAllocator uses the default filter instead of a very long one.
* [MESOS-7661] - Libprocess timers with long durations trigger immediately.
* [MESOS-7704] - Remove use of #pragma comment (lib, "IPHLPAPI.lib").
* [MESOS-7726] - MasterTest.IgnoreOldAgentReregistration test is flaky.
* [MESOS-7729] - ExamplesTest.DynamicReservationFramework is flaky.
* [MESOS-7741] - SlaveRecoveryTest/0.MultipleSlaves has double free corruption.
* [MESOS-7781] - Windows API GetVersionExW was declared deprecated.
* [MESOS-7784] - MasterTestPrePostReservationRefinement.CreateAndDestroyVolumesV1 is flaky.
* [MESOS-7791] - subprocess' childMain using ABORT when encountering user errors.
* [MESOS-7811] - libprocess-tests depend on gtest but it's not setup.
* [MESOS-7828] - Current approach to parse protobuf enum from JSON does not support upgrades.
* [MESOS-7835] - CMake build does not support Marathon.
* [MESOS-7851] - Master stores old resource format in the registry.
* [MESOS-7867] - Master doesn't handle scheduler driver downgrade from HTTP based to PID based.
* [MESOS-7873] - Expose `ExecutorInfo.ContainerInfo.NetworkInfo` in Mesos `state` endpoint.
* [MESOS-7877] - Audit test code for undefined behavior in accessing container elements.
* [MESOS-7917] - Docker statistics not reported on Windows.
* [MESOS-7921] - ProcessManager::resume sometimes crashes accessing EventQueue.
* [MESOS-7923] - Make args optional in mesos port mapper plugin.
* [MESOS-7927] - The composing containerizer leaks memory in some scenarios.
* [MESOS-7929] - `Metrics()` hangs on second call on Windows.
* [MESOS-7945] - MasterAPITest.EventAuthorizationFiltering is flaky.
* [MESOS-7963] - Task groups can lose the container limitation status.
* [MESOS-7964] - Heavy-duty GC makes the agent unresponsive.
* [MESOS-7968] - Handle `/proc/self/ns/pid_for_children` when parsing available namespace.
* [MESOS-7969] - Handle cgroups v2 hierarchy when parsing /proc/self/cgroups.
* [MESOS-7972] - SlaveTest.HTTPSchedulerSlaveRestart test is flaky.
* [MESOS-7975] - The command/default/docker executor can incorrectly send a TASK_FINISHED update even when the task is killed.
* [MESOS-7978] - Lint javascript files to enable linting.
* [MESOS-7980] - Stout fails to compile with libc >= 2.26.
* [MESOS-7988] - Mesos attempts to open handle for the system idle process.
* [MESOS-7993] - Fix Windows header orderings.
* [MESOS-7996] - ContentType/SchedulerTest.NoOffersWithAllRolesSuppressed is flaky.
* [MESOS-7997] - ContentType/MasterAPITest.CreateAndDestroyVolumes is flaky.
* [MESOS-7998] - PersistentVolumeEndpointsTest.UnreserveVolumeResources is flaky.
* [MESOS-8000] - DefaultExecutorCniTest.ROOT_VerifyContainerIP is flaky.
* [MESOS-8001] - PersistentVolumeEndpointsTest.NoAuthentication is flaky.
* [MESOS-8003] - PersistentVolumeEndpointsTest.SlavesEndpointFullResources is flaky.
* [MESOS-8010] - AfterTest.Loop is flaky.
* [MESOS-8027] - os::open doesn't always atomically apply O_CLOEXEC.
* [MESOS-8035] - Correct mesos-tests CMake build dependencies.
* [MESOS-8039] - A broken connection during LaunchNestedContainer call might result in the nested container not being cleaned up.
* [MESOS-8046] - MasterTestPrePostReservationRefinement.ReserveAndUnreserveResourcesV1 is flaky.
* [MESOS-8048] - ReservationEndpointsTest.GoodReserveAndUnreserveACL is flaky.
* [MESOS-8051] - Killing TASK_GROUP fails to kill some tasks.
* [MESOS-8052] - "protoc" not found when running "make -j4 check" directly in stout.
* [MESOS-8057] - Apply security patches to AngularJS and JQuery in the Mesos UI.
* [MESOS-8058] - Agent and master can race when updating agent state.
* [MESOS-8066] - Pylint reports errors in apply-reviews.py on Ubuntu 14.04.
* [MESOS-8070] - Bundled GRPC build does not build on Debian 8.
* [MESOS-8076] - PersistentVolumeTest.SharedPersistentVolumeRescindOnDestroy is flaky.
* [MESOS-8080] - The default executor does not propagate missing task exit status correctly.
* [MESOS-8082] - updateAvailable races with a periodic allocation and leads to flaky tests.
* [MESOS-8084] - Double free corruption in tests due to parallel manipulation of signal and control handlers.
* [MESOS-8085] - No point in deallocate() for a framework for maintenance if it is deactivated.
* [MESOS-8090] - Mesos 1.4.0 crashes with 1.3.x agent with oversubscription.
* [MESOS-8093] - Some tests miss subscribed event because expectation is set after event fires.
* [MESOS-8095] - ResourceProviderRegistrarTest.AgentRegistrar is flaky.
* [MESOS-8116] - Fix off-by-one error in Windows long path support.
* [MESOS-8119] - ROOT_DOCKER_DockerHealthyTask segfaults in debian 8.
* [MESOS-8121] - Unified Containerizer Auto backend should check xfs ftype for overlayfs backend.
* [MESOS-8123] - GPU tests are failing due to TASK_STARTING.
* [MESOS-8135] - Masters can lose track of tasks' executor IDs.
* [MESOS-8136] - Update XFS isolator tests to handle TASK_STARTING.
* [MESOS-8157] - Review #62775 broke the build.
* [MESOS-8159] - ns::clone uses an async signal unsafe stack.
* [MESOS-8165] - TASK_UNKNOWN status is ambiguous.
* [MESOS-8169] - Incorrect master validation forces executor IDs to be globally unique.
* [MESOS-8171] - Using a failoverTimeout of 0 with Mesos native scheduler client can result in infinite subscribe loop.
* [MESOS-8173] - Improve fetcher exit status message.
* [MESOS-8178] - UnreachableAgentReregisterAfterFailover is flaky.
* [MESOS-8179] - Scheduler library has incorrect assumptions about connections.
* [MESOS-8180] - Port mesos-fetcher to Windows.
* [MESOS-8200] - Suppressed roles are not honoured for v1 scheduler subscribe requests.
* [MESOS-8217] - Don't run linters on every commit.
* [MESOS-8220] - Can't build with Visual Studio 15.5.
* [MESOS-8223] - Master crashes when suppressed on subscribe is enabled.
* [MESOS-8225] - Port os::which to Windows.
* [MESOS-8237] - Strip (Offer|Resource).allocation_info for non-MULTI_ROLE schedulers.
* [MESOS-8245] - SlaveRecoveryTest/0.ReconnectExecutor is flaky.
* [MESOS-8249] - Support image prune in mesos containerizer and provisioner.
* [MESOS-8263] - ResourceProviderManagerHttpApiTest.ConvertResources is flaky.
* [MESOS-8267] - NestedMesosContainerizerTest.ROOT_CGROUPS_RecoverLauncherOrphans is flaky.
* [MESOS-8272] - Fall back to bind mounting container devices.
* [MESOS-8279] - Persistent volumes are not visible in Mesos UI using default executor on Linux.
* [MESOS-8280] - Mesos Containerizer GC should set 'layers' after checkpointing layer ids in provisioner.
* [MESOS-8282] - Take pending offer operations into account when calculating framework allocated resources.
* [MESOS-8288] - SlaveTest.IgnoreV0ExecutorIfItReregistersWithoutReconnect is flaky.
* [MESOS-8289] - ReservationTest.MasterFailover is flaky when run with `RESOURCE_PROVIDER` capability.
* [MESOS-8293] - Reservation may not be allocated when the role has no quota.
* [MESOS-8297] - Built-in driver-based executors ignore kill task if the task has not been launched.
* [MESOS-8312] - Pass resource provider information to master as part of UpdateSlaveMessage.
* [MESOS-8315] - ResourceProviderManagerHttpApiTest.ResubscribeResourceProvider is flaky.
* [MESOS-8316] - Tests that fetch docker images might be flaky due to insufficient wait timeout.
* [MESOS-8318] - OfferOperationStatusUpdateManagerTest tests fail on Windows.
* [MESOS-8320] - Expose information about local resource providers in master.
* [MESOS-8325] - Mesos containerizer does not properly handle old running containers.
* [MESOS-8337] - Invalid state transition attempted when agent is lost.
* [MESOS-8339] - Quota headroom may be insufficiently held when role has more reservation than quota.
* [MESOS-8341] - Agent can become stuck in (re-)registering state during upgrades.
* [MESOS-8344] - Improve JSON v1 operator API performance.
* [MESOS-8346] - Resubscription of a resource provider will crash the agent if its HTTP connection isn't closed.
* [MESOS-8349] - When a resource provider driver is disconnected, it fails to reconnect.
* [MESOS-8350] - Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration.
* [MESOS-8352] - Resources may get over allocated to some roles while failing to meet the quota of other roles.
* [MESOS-8356] - Persistent volume ownership is set to root despite the sandbox owner (frameworkInfo.user) when docker executor is used.
* [MESOS-8369] - CI build failure compiling volume_profile.proto.
* [MESOS-8376] - Bundled GRPC does not build on Debian 9.
* [MESOS-8377] - RecoverTest.CatchupTruncated is flaky.
* [MESOS-8391] - Mesos agent doesn't notice that a pod task exits or crashes after the agent restart.
* [MESOS-8393] - SLRP NewVolumeRecovery and LaunchTaskRecovery tests CHECK failures.
* [MESOS-8410] - Reconfiguration policy fails to handle mount disk resources.
* [MESOS-8417] - Mesos can get "stuck" when a Process throws an exception.
* [MESOS-8419] - RP manager incorrectly setting framework ID leads to CHECK failure.
* [MESOS-8422] - Master's UpdateSlave handler not correctly updating terminated operations.
* [MESOS-8443] - Fix Docker Containerizer PATH on Windows so Docker is usable.
* [MESOS-8444] - GC failure causes the agent to miss detaching virtual paths for the executor's sandbox.
* [MESOS-8446] - Agent fails to detach `virtualLatestPath` for the executor's sandbox during recovery.
* [MESOS-8460] - `Slave::detachFile` can segfault because it could use invalid Framework*.
* [MESOS-8461] - SLRP should not assume a CSI plugin always has GetNodeID implemented.
* [MESOS-8469] - Mesos master might drop some events in the operator API stream.
* [MESOS-8480] - Mesos returns high resource usage when killing a Docker task.
* [MESOS-8481] - Agent reboot during checkpointing may result in empty checkpoints.
* [MESOS-8514] - SLRP failed to connect to CSI endpoint.
** Documentation
* [MESOS-5078] - Document TaskStatus reasons.
* [MESOS-7663] - Update the documentation to reflect the addition of reservation refinement.
* [MESOS-8007] - Add documentation for MARK_AGENT_GONE call.
* [MESOS-8303] - Add user doc for agent reconfiguration.
* [MESOS-8304] - Update CHANGELOG to call out agent reconfiguration feature.
* [MESOS-8310] - Document container image garbage collection.
** Epic
* [MESOS-1739] - Allow slave reconfiguration on restart.
* [MESOS-4945] - Garbage collect unused docker layers in the store.
* [MESOS-7235] - Improve Storage Support using Resource Provider and CSI.
* [MESOS-7289] - Support Container Storage Interface (CSI).
* [MESOS-7302] - Support launching standalone containers.
* [MESOS-7749] - Support gRPC client.
** Improvement
* [MESOS-564] - Update Contribution Documentation.
* [MESOS-5675] - Add support for master capabilities.
* [MESOS-5771] - Add benchmark test for shared resources.
* [MESOS-5902] - CMake should generate protobuf definitions for Java.
* [MESOS-6350] - Raise minimum required cmake version.
* [MESOS-6390] - Ensure Python support scripts are linted.
* [MESOS-6971] - Use arena allocation to improve protobuf message passing performance.
* [MESOS-7306] - Support mount propagation for host volumes.
* [MESOS-7330] - Add resource provider to offer.
* [MESOS-7361] - Command checks via agent pollute agent logs.
* [MESOS-7370] - Fix create symlink code to use flag which enables non-admins to make symlinks.
* [MESOS-7497] - Remove CMake anti-pattern of `set(x "${x} ..")`.
* [MESOS-7616] - Consider supporting changes to agent's domain without full drain.
* [MESOS-7675] - Isolate network ports.
* [MESOS-7695] - Add heartbeats to master stream API.
* [MESOS-7737] - Harden Mesos when building with cmake.
* [MESOS-7785] - Pass Operator API subscription events through authorizer.
* [MESOS-7795] - Remove "latest" symlink after agent reboot.
* [MESOS-7798] - Improve libprocess message passing performance.
* [MESOS-7837] - Propagate resource updates from local resource providers to master.
* [MESOS-7840] - Add Mesos CLI command to list active tasks.
* [MESOS-7842] - Basic sandbox GC metrics.
* [MESOS-7861] - Include check output in the DefaultExecutor log.
* [MESOS-7880] - Add an option to skip the Mesos style check when applying a review chain.
* [MESOS-7889] - Avoid Multiple PROTOC invocations when generating Protobuf & GRPC code in libprocess.
* [MESOS-7895] - ZK session timeout is unconfigurable in agent and scheduler drivers.
* [MESOS-7916] - Improve the test coverage of the DefaultExecutor.
* [MESOS-7924] - Add a javascript linter to the webui.
* [MESOS-7941] - Send TASK_STARTING status from built-in executors.
* [MESOS-7951] - Design Doc for Extended KillPolicy.
* [MESOS-7961] - Display task health in the webui.
* [MESOS-7962] - Display task state counters in the framework page of the webui.
* [MESOS-7973] - Non-leading VOTING replica catch-up.
* [MESOS-7987] - Initialize Google Mock rather than Google Test.
* [MESOS-8012] - Support Znode paths for masters in the new CLI.
* [MESOS-8015] - Design a scheduler (V1) HTTP API authenticatee mechanism.
* [MESOS-8016] - Introduce modularized HTTP authenticatee.
* [MESOS-8017] - Introduce a basic HTTP authenticatee.
* [MESOS-8021] - Update HTTP scheduler library to allow for modularized authenticatee.
* [MESOS-8034] - Remove LIBNAME_VERSION from EXTERNAL.
* [MESOS-8040] - Return nested/standalone containers in `GET_CONTAINERS` API call.
* [MESOS-8072] - Change Mesos common events verbose logs to use VLOG(2) instead of 1.
* [MESOS-8074] - Change Libprocess actor state transitions verbose logs to use VLOG(3) instead of 2.
* [MESOS-8078] - Some fields went missing with no replacement in api/v1.
* [MESOS-8115] - Add a master flag to disallow agents that are not configured with fault domain.
* [MESOS-8117] - Update Getting Started documentation.
* [MESOS-8221] - Use protobuf reflection to simplify downgrading of resources.
* [MESOS-8286] - Making bind mounts readonly fails with user namespaces.
* [MESOS-8294] - Support container image basic auto gc.
* [MESOS-8295] - Add excluded image parameter to containerizer::pruneImages() interface.
* [MESOS-8301] - Support moving into defer/dispatch/install handlers.
* [MESOS-8302] - Improve master failover performance.
* [MESOS-8328] - Improve logs displayed after a slave failed recovery.
* [MESOS-8358] - Create agent endpoints for pruning images.
* [MESOS-8365] - Create AuthN support for prune images API.
* [MESOS-8421] - Duration operators drop precision, even when used with integers.
* [MESOS-8455] - Avoid unnecessary copying of protobuf in the v1 API.
** Task
* [MESOS-3107] - Define CMake style guide.
* [MESOS-3110] - Harden the CMake system-dependency-locating routines.
* [MESOS-3384] - Include libsasl in Windows CMake build.
* [MESOS-3437] - Port flags_tests.
* [MESOS-4527] - Roles can exceed limit allocation via reservations.
* [MESOS-6193] - Make the docker/volume isolator nesting aware.
* [MESOS-6709] - Enable HTTP and TCP health checks on Windows.
* [MESOS-6714] - Port `slave_tests.cpp`.
* [MESOS-6733] - Windows: Enable authentication to the master.
* [MESOS-6894] - Checkpoint 'ContainerConfig' in Mesos Containerizer.
* [MESOS-7284] - Allow Mesos CLI to take masters IP.
* [MESOS-7285] - Implement a plugin to list containers on a given agent.
* [MESOS-7303] - Support Isolator capabilities.
* [MESOS-7305] - Adjust the recover logic of MesosContainerizer to allow standalone containers.
* [MESOS-7328] - Validate offer operations for converting disk resources.
* [MESOS-7388] - Update allocator interfaces to support resource providers.
* [MESOS-7443] - Add the MARK_AGENT_GONE call to the Operator v1 API protos.
* [MESOS-7444] - Add support for storing gone agents to the master registry.
* [MESOS-7445] - Implement the API handler on the master for marking agents as gone.
* [MESOS-7446] - Add authorization for the MARK_AGENT_GONE call.
* [MESOS-7448] - Add support for pruning the list of gone agents in the registry.
* [MESOS-7469] - Add resource provider driver.
* [MESOS-7491] - Build a CSI client to talk to a CSI plugin.
* [MESOS-7533] - Add a function stub for resource provider re-registration.
* [MESOS-7534] - Notify resource providers if they've been reregistered.
* [MESOS-7535] - Distinguish between active and inactive resource providers in RP Manager.
* [MESOS-7550] - Publish Local Resource Provider resources in the agent before container launch or update.
* [MESOS-7555] - Add resource provider IDs to the registry.
* [MESOS-7557] - Test that resource providers can reregister after agent fails over.
* [MESOS-7561] - Add storage resource provider specific information in ResourceProviderInfo.
* [MESOS-7578] - Write a proposal to make the I/O Switchboards optional.
* [MESOS-7594] - Implement 'apply' for resource provider related operations.
* [MESOS-7757] - Update master to handle updates to agent total resources.
* [MESOS-7790] - Design hierarchical quota allocation.
* [MESOS-7807] - Docker executor needs to return multiple IP addresses for the container.
* [MESOS-7892] - Filter results of `/state` on agent by role.
* [MESOS-7899] - Expose sandboxes using virtual paths and hide the agent work directory.
* [MESOS-7936] - Move sandbox path volume logic to 'volume/sandbox_path' isolator.
* [MESOS-7982] - Create Centos 6/7 RPM package.
* [MESOS-7985] - Use ASF CI for automating RPM packaging and upload to bintray.
* [MESOS-7992] - Enable OpenSSL build on Windows.
* [MESOS-8013] - Add test for blkio statistics.
* [MESOS-8032] - Launch CSI plugins in storage local resource provider.
* [MESOS-8050] - Mesos HTTP/HTTPS health checks for IPv6 docker containers.
* [MESOS-8060] - Introduce first class 'profile' for disk resources.
* [MESOS-8071] - Add agent capability for resource provider.
* [MESOS-8075] - Add ReadWriteLock to libprocess.
* [MESOS-8079] - Checkpoint and recover layers used to provision rootfs in provisioner.
* [MESOS-8086] - Update ACCEPT call handler in master for new operations.
* [MESOS-8087] - Add operation status update handler in Master.
* [MESOS-8088] - Introduce Lamport timestamp for offer operations.
* [MESOS-8089] - Add messages to publish resources on a resource provider.
* [MESOS-8097] - Add filesystem layout for local resource providers.
* [MESOS-8098] - Benchmark Master failover performance.
* [MESOS-8099] - Add protobuf for checkpointing resource provider states.
* [MESOS-8100] - Authorize standalone container calls from local resource providers.
* [MESOS-8101] - Import resources from CSI plugins in storage local resource provider.
* [MESOS-8102] - Add a test CSI plugin for storage local resource provider.
* [MESOS-8107] - Add a call to update total resources in the resource provider API.
* [MESOS-8108] - Process offer operations in storage local resource provider.
* [MESOS-8130] - Add placeholder handlers for offer operation feedback.
* [MESOS-8131] - Add new protobuf messages for offer operation feedback.
* [MESOS-8132] - Design a library to send offer operation status updates.
* [MESOS-8139] - Upgrade protobuf to 3.5.x.
* [MESOS-8141] - Add filesystem layout for storage resource providers.
* [MESOS-8143] - Publish and unpublish storage local resources through CSI plugins.
* [MESOS-8181] - Add tests that a failed offer operation on resource provider resources leads to a clock update.
* [MESOS-8183] - Add a container daemon to monitor a long-running standalone container.
* [MESOS-8186] - Implement the agent's AcknowledgeOfferOperationMessage handler.
* [MESOS-8187] - Enable LRP to send operation status updates, checkpoint, and retry using the SUM.
* [MESOS-8193] - Update master’s OfferOperationStatusUpdate handler to acknowledge updates to the agent if OfferOperationID is not set.
* [MESOS-8195] - Implement explicit offer operation reconciliation between the master, agent and RPs.
* [MESOS-8196] - Propagate failures from applying offer operations from resource providers.
* [MESOS-8197] - Implement a library to send offer operation status updates.
* [MESOS-8198] - Update the ReconcileOfferOperations protos.
* [MESOS-8199] - Add plumbing for explicit offer operation reconciliation between master, agent, and RPs.
* [MESOS-8207] - Reconcile offer operations between resource providers, agents, and master.
* [MESOS-8211] - Handle agent local resources in offer operation handler.
* [MESOS-8218] - Support `RESERVE`/`CREATE` operations with resource providers.
* [MESOS-8222] - Add resource versions to RunTaskMessage.
* [MESOS-8244] - Add operator API to reload local resource providers.
* [MESOS-8251] - Introduce a way to resolve the "profile" for disk resources.
* [MESOS-8265] - Add state recovery for storage local resource provider.
* [MESOS-8269] - Support resource provider re-subscription in the resource provider manager.
* [MESOS-8270] - Add an agent endpoint to list all active resource providers.
* [MESOS-8309] - Introduce a UUID message type.
* [MESOS-8375] - Use protobuf reflection to simplify upgrading of resources.
* [MESOS-8394] - Bump CSI to 0.1.0.
Release Notes - Mesos - Version 1.4.2 (WIP)
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-4527] - Roles can exceed limit allocation via reservations.
* [MESOS-6616] - Error: dereferencing type-punned pointer will break strict-aliasing rules.
* [MESOS-7099] - Quota can be exceeded due to coarse-grained offer technique.
* [MESOS-7504] - Parent's mount namespace cannot be determined when launching a nested container.
* [MESOS-7975] - The command/default/docker executor can incorrectly send a TASK_FINISHED update even when the task is killed.
* [MESOS-8106] - Docker fetcher plugin unsupported scheme failure message is not accurate.
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused.
* [MESOS-8159] - ns::clone uses an async signal unsafe stack.
* [MESOS-8171] - Using a failoverTimeout of 0 with Mesos native scheduler client can result in infinite subscribe loop.
* [MESOS-8237] - Strip (Offer|Resource).allocation_info for non-MULTI_ROLE schedulers.
* [MESOS-8253] - Mesos CI docker rmi conflict.
* [MESOS-8293] - Reservation may not be allocated when the role has no quota.
* [MESOS-8297] - Built-in driver-based executors ignore kill task if the task has not been launched.
* [MESOS-8339] - Quota headroom may be insufficiently held when role has more reservation than quota.
* [MESOS-8352] - Resources may get over allocated to some roles while failing to meet the quota of other roles.
* [MESOS-8356] - Persistent volume ownership is set to root despite the sandbox owner (frameworkInfo.user) when docker executor is used.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads.
* [MESOS-8480] - Mesos returns high resource usage when killing a Docker task.
* [MESOS-8488] - Docker bug can cause unkillable tasks.
* [MESOS-8550] - Bug in `Master::detected()` leads to coredump in `MasterZooKeeperTest.MasterInfoAddress`.
* [MESOS-8552] - CGROUPS_ROOT_PidNamespaceForward and CGROUPS_ROOT_PidNamespaceBackward tests fail.
* [MESOS-8569] - Allow newline characters when decoding base64 strings in stout.
* [MESOS-8573] - Container stuck in PULLING when Docker daemon hangs.
* [MESOS-8574] - Docker executor makes no progress when 'docker inspect' hangs.
* [MESOS-8575] - Improve discard handling for 'Docker::stop' and 'Docker::pull'.
* [MESOS-8576] - Improve discard handling of 'Docker::inspect()'.
* [MESOS-8604] - Quota headroom tracking may be incorrect in the presence of hierarchical reservation.
* [MESOS-8605] - Terminal task status update will not send if 'docker inspect' is hung.
* [MESOS-8626] - The 'allocatable' check in the allocator is problematic with multi-role frameworks.
* [MESOS-8651] - Potential memory leaks in the `volume/sandbox_path` isolator.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8830] - Agent gc on old slave sandboxes could empty persistent volume data.
* [MESOS-8871] - Agent may fail to recover if the agent dies before image store cache checkpointed.
* [MESOS-8876] - Normal exit of Docker container using rexray volume results in TASK_FAILED.
* [MESOS-8881] - Enable epoll backend in libevent integration.
* [MESOS-8885] - Disable libevent debug mode.
* [MESOS-8904] - Master crash when removing quota.
* [MESOS-8934] - Update python.m4 to support Python 3.
* [MESOS-8935] - Quota limit "chopping" can lead to cpu-only and memory-only offers.
* [MESOS-8936] - Implement a Random Sorter for offer allocations.
* [MESOS-8942] - Master streaming API does not send (health) check updates for tasks.
* [MESOS-8945] - Master check failure due to CHECK_SOME(providerId).
* [MESOS-8947] - Improve the container preparing logging in IOSwitchboard and volume/secret isolator.
* [MESOS-8952] - process::await/collect n^2 performance issue.
* [MESOS-8963] - Executor crash trying to print container ID.
* [MESOS-8980] - mesos-slave can deadlock with docker pull.
* [MESOS-8986] - `slave.available()` in the allocator is expensive and drags down allocation performance.
* [MESOS-8987] - Master asks agent to shutdown upon auth errors.
* [MESOS-9049] - Agent GC could unmount a dangling persistent volume multiple times.
* [MESOS-9088] - `createStrippedScalarQuantity()` should clear all metadata fields.
* [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable"
* [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
Release Notes - Mesos - Version 1.4.1
-------------------------------------------
* This is a bug fix release.
** Bug
* [MESOS-7873] - Expose `ExecutorInfo.ContainerInfo.NetworkInfo` in Mesos `state` endpoint.
* [MESOS-7921] - ProcessManager::resume sometimes crashes accessing EventQueue.
* [MESOS-7964] - Heavy-duty GC makes the agent unresponsive.
* [MESOS-7968] - Handle `/proc/self/ns/pid_for_children` when parsing available namespace.
* [MESOS-7969] - Handle cgroups v2 hierarchy when parsing /proc/self/cgroups.
* [MESOS-7980] - Stout fails to compile with libc >= 2.26.
* [MESOS-8051] - Killing TASK_GROUP fails to kill some tasks.
* [MESOS-8080] - The default executor does not propagate missing task exit status correctly.
* [MESOS-8090] - Mesos 1.4.0 crashes with 1.3.x agent with oversubscription
* [MESOS-8135] - Masters can lose track of tasks' executor IDs.
* [MESOS-8169] - Incorrect master validation forces executor IDs to be globally unique.
Release Notes - Mesos - Version 1.4.0
-------------------------------------------
This release contains the following new features:
* [MESOS-5116] - The `disk/xfs` isolator now supports the
`--enforce_container_disk_quota` flag. Disabling it
(`--no-enforce_container_disk_quota`) efficiently measures disk
usage without enforcing usage constraints.
* [MESOS-6223] - Agents are now allowed to recover the agent ID
after a host reboot. See docs/upgrades.md for details.
* [MESOS-6375] - **Experimental** Support for hierarchical resource
allocation roles. Hierarchical roles allow delegation of resource
allocation policies (i.e. fair sharing and quota) further down the
hierarchy. For example, the "engineering" organization gets a 75%
share of the resources, but it's up to the operators within the
"engineering" organization to figure out how to fairly share between
the "engineering/backend" team and the "engineering/frontend" team.
The same delegation applies for quota. NOTE: There are known issues
related to hierarchical roles (e.g. hierarchical quota allocation
is not implemented and quota will be over-allocated if used with
hierarchical roles, see: MESOS-7402) and thus it is not recommended
for production usage at this time.
* [MESOS-7418, MESOS-7088] - File-based secrets are now supported for Mesos
and Universal containerizer. Image-pull secrets are supported for Docker
registry credentials.
* [MESOS-7477] - Linux ambient capabilities are now supported, so
frameworks can run tasks that use ambient capabilities to grant
limited additional privileges to tasks.
* [MESOS-7476, MESOS-7671] - Support for frameworks and operators
specifying Linux bounding capabilities in order to limit the
maximum privileges that a task may acquire.
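The delegation described for hierarchical roles (MESOS-6375) can be sketched
as simple arithmetic. This is an illustration only, not the allocator's
implementation: the helper `delegate_share` and the equal team weights are
assumptions made for the example; only the 75% figure and the role names come
from the note above.

```python
# Hypothetical illustration only; this is not how the Mesos allocator is implemented.
def delegate_share(parent_share, child_weights):
    """Split a parent's fractional share among its children in proportion to their weights."""
    total = sum(child_weights.values())
    return {role: parent_share * weight / total for role, weight in child_weights.items()}

# "engineering" holds 75% of the cluster (figure taken from the release note).
engineering_share = 0.75

# Assumed equal weights for the two teams; the release note does not specify them.
team_shares = delegate_share(
    engineering_share,
    {"engineering/backend": 1, "engineering/frontend": 1},
)

for role, share in sorted(team_shares.items()):
    print(f"{role}: {share:.3f} of the cluster")
```

Changing the assumed weights changes how the 75% is divided between the two
teams, but never the amount delegated to "engineering" as a whole.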
Deprecations/Removals:
* [MESOS-7671] - LinuxInfo.capabilities is deprecated in favor
of LinuxInfo.effective_capabilities.
* [MESOS-7477] - The agent `--allowed_capabilities` flag is
deprecated in favor of `--effective_capabilities`.
Unresolved Critical Issues:
* [MESOS-7643] - The order of isolators provided in '--isolation' flag is not preserved and instead sorted alphabetically
* [MESOS-7402] - Quota is over-allocated when used with hierarchical roles.
Additional API Changes:
* [MESOS-7755] - The interpretation of the optional resource argument
passed in `Allocator::updateSlave` was changed from the total
amount of oversubscribed resources on the agent to the new total
resources (both revocable and non-revocable) on the agent. Custom
allocator implementations should be changed to interpret the
passed value as a total before updating.
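The change in interpretation for `Allocator::updateSlave` (MESOS-7755) can be
illustrated with a small model. This is a hedged sketch, not Mesos code: the
dictionary layout, the functions `update_old` and `update_new`, and the CPU
counts are all assumptions chosen for the example.

```python
# Hypothetical model of an agent's resources; these names are not Mesos APIs.
def update_old(current, oversubscribed):
    """Pre-1.4.0 reading: the argument is the total oversubscribed (revocable) amount."""
    return {"non_revocable": current["non_revocable"], "revocable": oversubscribed}

def update_new(current, total):
    """1.4.0 reading: the argument is the agent's new total, revocable and non-revocable."""
    return {"non_revocable": total["non_revocable"], "revocable": total["revocable"]}

agent = {"non_revocable": 8.0, "revocable": 2.0}  # assumed CPU counts

# The same change (revocable CPUs grow from 2 to 4) expressed both ways.
before_1_4 = update_old(agent, 4.0)
since_1_4 = update_new(agent, {"non_revocable": 8.0, "revocable": 4.0})
print(before_1_4 == since_1_4)
```

In this model, a custom allocator that kept the old reading would treat the
whole passed total as revocable and add it on top of the agent's existing
non-revocable resources, overcounting the agent.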
Feature Graduations:
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3567] - Support TCP checks in Mesos.
All Resolved Issues:
** Bug
* [MESOS-1987] - Add support for SemVer build and prerelease labels to stout.
* [MESOS-4210] - Investigate increasing protobuf protocol message size limit.
* [MESOS-4331] - git commit-msg hook completely breaks fixup commits.
* [MESOS-4467] - Implement `sleep` in Windows
* [MESOS-4983] - Segfault in ProcessTest.Spawn with GCC 6
* [MESOS-4992] - sandbox uri does not work outside mesos http server
* [MESOS-5187] - The filesystem/linux isolator does not set the permissions of the host_path.
* [MESOS-5903] - `GTEST_IS_THREADSAFE` guards prevent many tests from being run on Windows.
* [MESOS-5937] - `flags::parse` assumes the filesystem is rooted at '/'
* [MESOS-5938] - `net::links` is not implemented on Windows.
* [MESOS-6115] - Source tree contains compiled protobuf source
* [MESOS-6539] - Compile warning in GMock: "binding dereferenced null pointer to reference"
* [MESOS-6743] - Docker executor hangs forever if `docker stop` fails.
* [MESOS-6814] - Make sure compilation configuration is propagated correctly to third party dependencies
* [MESOS-6817] - Audit the use of UNICODE-related code paths
* [MESOS-6916] - Improve health checks validation.
* [MESOS-6950] - Launching two tasks with the same Docker image simultaneously may cause a staging dir never cleaned up
* [MESOS-6961] - Executors don't use glog for logging.
* [MESOS-7017] - HTTP API responses can crash the master.
* [MESOS-7115] - Agent should prefer LOG(FATAL) over EXIT().
* [MESOS-7173] - CMake does not define `GIT_SHA` etc. in build.cpp
* [MESOS-7186] - Metrics about used/allocated shared resources are incorrectly accounted.
* [MESOS-7193] - Use of `GTEST_IS_THREADSAFE` in asserts is problematic.
* [MESOS-7252] - Need to fix resource check in long-lived framework
* [MESOS-7268] - CNI isolator should mount network related /etc/* files in readonly mode
* [MESOS-7351] - CMake < 3.8.0 cannot find VS2017 tools
* [MESOS-7373] - Remove thread_local workaround on OSX
* [MESOS-7374] - Running DOCKER images in Mesos Container Runtime without `linux/filesystem` isolation enabled renders host unusable
* [MESOS-7378] - Build failure with glibc 2.12.
* [MESOS-7381] - Flaky tests in NestedMesosContainerizerTest
* [MESOS-7389] - Mesos 1.2.0 crashes with pre-1.0 Mesos agents.
* [MESOS-7403] - Resources::apply(const Offer::Operation&) should fail when a shared persistent volume can't be removed
* [MESOS-7441] - RegisterSlaveValidationTest.DropInvalidRegistration is flaky
* [MESOS-7457] - HierarchicalAllocatorTest.NestedRoleQuota is flaky
* [MESOS-7458] - webui display of framework resources is confusing
* [MESOS-7459] - Fix the duration.hpp warning
* [MESOS-7462] - Flaky test HierarchicalAllocatorTest.NestedRoleDRF
* [MESOS-7464] - Recent Docker versions cannot be parsed by stout.
* [MESOS-7468] - Could not copy the sandbox path on WebUI
* [MESOS-7471] - Provisioner recover should not always assume 'rootfses' dir exists.
* [MESOS-7476] - Restrict capabilities to only the bounding set.
* [MESOS-7484] - VersionTest.ParseInvalid aborts on Windows.
* [MESOS-7496] - The /debug:fastlink linker option is not being respected
* [MESOS-7498] - Remove need to set environment variable `PreferredToolArchitecture`
* [MESOS-7502] - Build error on Windows when using "int" for a file descriptor
* [MESOS-7507] - Add a metric for the network size of replicas for the registry.
* [MESOS-7515] - MasterAllocatorTest/0.ResourcesUnused is flaky
* [MESOS-7524] - Basic fetcher success metrics
* [MESOS-7545] - Volume secret isolator breaks Windows build
* [MESOS-7552] - MasterAllocatorTest/0.FrameworkExited is flaky
* [MESOS-7569] - Allow "old" executors with half-open connections to be preserved during agent upgrade / restart.
* [MESOS-7581] - Specifying an unbundled dependency can cause build to pick up wrong Boost version
* [MESOS-7584] - ASF Jenkins build errors out on missing 'python-six' dependency
* [MESOS-7597] - libprocess build is broken
* [MESOS-7618] - CMake files incompatible with multi-configuration generators
* [MESOS-7627] - Mesos slave gets stuck
* [MESOS-7638] - The command `false` does not exist on Windows
* [MESOS-7640] - Docker containerizer fails to set sandbox logs ownership correctly.
* [MESOS-7652] - Docker image with universal containerizer does not work if WORKDIR is missing in the rootfs.
* [MESOS-7655] - Reservation Refinement: Update the resources logic.
* [MESOS-7662] - Documentation regarding TASK_LOST is misleading
* [MESOS-7666] - Update the agent to use the new resource format
* [MESOS-7667] - Update the master to use the new resource format.
* [MESOS-7669] - Update the test utilities to produce the resources in the new format
* [MESOS-7671] - Let frameworks specify the task bounding capabilities.
* [MESOS-7674] - Update the generic Protobuf to JSON facility to not output deprecated fields
* [MESOS-7679] - V1 Operator API update for reservation refinement.
* [MESOS-7689] - Libprocess can crash on malformed request paths for libprocess messages.
* [MESOS-7690] - The agent can crash when an unknown executor tries to register.
* [MESOS-7700] - Prevent reserve/create operations with refined reservations on non-capable agents.
* [MESOS-7703] - Mesos fails to exec a custom executor when no shell is used
* [MESOS-7711] - Master updates registry for reregistering agents even when they haven't been unreachable
* [MESOS-7714] - Fix agent downgrade for reservation refinement
* [MESOS-7716] - Mesos 1.2.0 agent crashes Mesos 1.4.0 master
* [MESOS-7725] - PersistentVolumeEndpointsTest.ReserveAndSlaveRemoval test is flaky
* [MESOS-7728] - Java HTTP adapter crashes JVM when leading master disconnects.
* [MESOS-7735] - The master crashes when state endpoint is hit during a task authorization.
* [MESOS-7744] - Mesos Agent Sends TASK_KILL status update to Master, and still launches task
* [MESOS-7751] - Mesos failed to build on Windows due to error C2039: 'parse': is not a member of 'mesos::internal::protobuf'
* [MESOS-7753] - `log.LearnedMessage` could be rejected due to being sent from '@0.0.0.0:0'
* [MESOS-7758] - Stout doesn't build standalone.
* [MESOS-7761] - Website ruby deps do not bundle on macOS
* [MESOS-7765] - MasterTest.KillUnknownTask is failing due to a bug in `net::IPv4::ANY()`
* [MESOS-7769] - libprocess initializes to bind to random port if --ip is not specified
* [MESOS-7770] - Persistent volume might not be mounted if there is a sandbox volume whose source is the same as the target of the persistent volume.
* [MESOS-7772] - Copy-n-paste error in slave/main.cpp
* [MESOS-7775] - Eliminate extra process abort in a subprocess watchdog
* [MESOS-7777] - Agent failed to recover due to mount namespace leakage in Docker 1.12/1.13
* [MESOS-7778] - Hide per-platform subprocess headers.
* [MESOS-7783] - Framework might not receive status update when a just launched task is killed immediately
* [MESOS-7794] - Mesos failed with error c2102 when build in conformance mode (/permissive-)
* [MESOS-7796] - LIBPROCESS_IP isn't passed on to the fetcher
* [MESOS-7797] - Hard-coded forward slash breaks windows docker container task in DC/OS
* [MESOS-7805] - mesos-execute has incorrect example TaskInfo in help string
* [MESOS-7817] - CreateProcess wrapper's error message is bad
* [MESOS-7821] - Resource refinement does downgrade task.executor.resources in LAUNCH_GROUP handler.
* [MESOS-7830] - Sandbox_path volume does not have ownership set correctly.
* [MESOS-7831] - Resource refinement is not applied to tasks in completed_frameworks.
* [MESOS-7849] - The rlimits and linux/capabilities isolators should support nested containers
* [MESOS-7858] - Launching a nested container with namespace/pid isolation, with glibc < 2.25, may deadlock the LinuxLauncher and MesosContainerizer
* [MESOS-7863] - Agent may drop pending kill task status updates.
* [MESOS-7865] - Agent may process a kill task and still launch the task.
* [MESOS-7869] - Build fails with `--disable-zlib` or `--with-zlib=DIR`
* [MESOS-7871] - Agent fails assertion during request to '/state'
* [MESOS-7872] - Scheduler hang when registration fails.
* [MESOS-7888] - Track fetcher task success and failures
* [MESOS-7909] - Ordering dependency between 'linux/capabilities' and 'docker/runtime' isolator.
* [MESOS-7912] - Master WebUI not working in Chrome.
* [MESOS-7921] - process::EventQueue sometimes crashes
* [MESOS-7922] - Fix communication between old masters and new agents.
* [MESOS-7926] - Abnormal termination of default executor can cause MesosContainerizer::destroy to fail.
* [MESOS-7934] - OOM due to LibeventSSLSocket send incorrectly returning 0 after shutdown.
** Documentation
* [MESOS-7246] - Add documentation for AGENT_ADDED/AGENT_REMOVED events.
* [MESOS-7349] - Document Mesos "check" feature.
* [MESOS-7501] - Change legacy --with-network-isolator to --with-port-mapping-isolator
** Epic
* [MESOS-6975] - Prevent pre-1.0 agents from registering with 1.3+ master.
* [MESOS-7088] - Support private registry credential per container.
* [MESOS-7623] - Automatically publish website through CI
** Improvement
* [MESOS-5116] - Add support for accounting only mode in XFS isolator.
* [MESOS-5417] - define WSTRINGIFY behaviour on Windows
* [MESOS-6053] - Combine test helpers into one single binary.
* [MESOS-6223] - Allow agents to reregister post a host reboot
* [MESOS-6535] - The default executor should support kill policies
* [MESOS-6549] - Asynchronous dir removal in agent GC
* [MESOS-6782] - Inherit Environment from parent container when launching DEBUG container.
* [MESOS-6905] - Task status updates caused by task health update do not set appropriate reason.
* [MESOS-6976] - Disallow (re-)registration attempts by old agents.
* [MESOS-6977] - Cleanup tech debt in master for old agents
* [MESOS-6978] - Update webui to remove orphan tasks
* [MESOS-7006] - Launch docker containers with --cpus instead of cpu-shares
* [MESOS-7015] - Frameworks should be able to (re)register in suppressed state
* [MESOS-7092] - Health checker duplicates a lot of checker's functionality.
* [MESOS-7228] - Upgrade Mesos to build with proto3.
* [MESOS-7327] - Add a test with multiple tasks and checks for the default executor.
* [MESOS-7343] - Add a ReviewBot for testing patches on Windows
* [MESOS-7355] - Set MESOS_SANDBOX in debug containers.
* [MESOS-7364] - Upgrade vendored GMock / GTest
* [MESOS-7401] - Optionally reject messages when UPIDs do not match IP.
* [MESOS-7418] - Add support for file-based secrets
* [MESOS-7429] - Allow isolators to inject task-specific environment variables.
* [MESOS-7451] - Expose MOUNT volumes of an agent in master's v0 HTTP API
* [MESOS-7477] - Support ambient capabilities.
* [MESOS-7540] - Add an agent flag for executor re-registration timeout.
* [MESOS-7542] - Add executor reconnection retry logic to the agent
* [MESOS-7572] - Attach latest symlink when executor is registered.
* [MESOS-7585] - Added 'mesos config show' command to the new Mesos CLI.
* [MESOS-7608] - Protobuf definitions for domains
* [MESOS-7609] - Protobuf definitions for region-aware framework capability
* [MESOS-7610] - Support domains in master and agent
* [MESOS-7611] - Prevent master from joining mixed-region cluster
* [MESOS-7612] - Prevent agent with misconfigured domain from registering
* [MESOS-7614] - Only offer resources on remote agents to region-aware frameworks
* [MESOS-7630] - Add simple filtering to unversioned operator API
* [MESOS-7644] - Add DomainInfo to offers
* [MESOS-7782] - Add fetcher cache size metrics.
* [MESOS-7792] - Add support for ECDH ciphers
* [MESOS-7808] - Bundling gRPC into 3rdparty
* [MESOS-7809] - Building gRPC with Autotools
* [MESOS-7810] - gRPC support in libprocess
* [MESOS-7814] - Improve the test frameworks.
* [MESOS-7862] - Get rid of timestamp and date in generated javadoc files
* [MESOS-7870] - Refactor libssl and libcrypto checks for building gRPC
* [MESOS-7881] - Building gRPC with CMake
** Task
* [MESOS-6101] - Add Framework events to master's operator API
* [MESOS-6162] - Add support for cgroups blkio subsystem blkio statistics.
* [MESOS-6441] - Display reservations in the agent page in the webui.
* [MESOS-7149] - Support reservations for role subtrees
* [MESOS-7283] - Add ability to initialize a test cluster for Mesos CLI unit-test infrastructure
* [MESOS-7304] - Fetcher should not depend on SlaveID.
* [MESOS-7315] - Design doc for resource provider and storage integration.
* [MESOS-7414] - Enable authorization for master's logging API calls: GET_LOGGING_LEVEL and SET_LOGGING_LEVEL
* [MESOS-7415] - Add authorization to master's operator maintenance API in v0 and v1
* [MESOS-7416] - Filter results of `/master/slaves` and the v1 call GET_AGENTS
* [MESOS-7417] - Design doc for file-based secrets.
* [MESOS-7433] - Set working directory in DEBUG containers.
* [MESOS-7449] - Refactor containerizers to not depend on TaskInfo or ExecutorInfo
* [MESOS-7488] - Add `--ip6` and `--ip6_discovery_command` flag to Mesos agent
* [MESOS-7505] - Enable hierarchical roles
* [MESOS-7560] - Add 'type' and 'name' to ResourceProviderInfo.
* [MESOS-7571] - Add `--resource_provider_config_dir` flag to the agent.
* [MESOS-7576] - Add master flag `--filter-gpu-resources={true|false}`
* [MESOS-7582] - Add Config class to manage the Mesos CLI config file.
* [MESOS-7591] - Update master to use resource provider IDs instead of agent ID in allocator calls.
* [MESOS-7593] - Update offer handling in the master to consider local resource providers
* [MESOS-7624] - Move website from svn to git
* [MESOS-7625] - Create script to automate publishing website
* [MESOS-7626] - Create a CI job to publish the website
* [MESOS-7631] - DefaultExecutor needs to inform tasks about IP addresses
* [MESOS-7632] - Add `HIERARCHICAL_ROLE` agent capability
* [MESOS-7633] - Prevent hierarchical roles from being allocated resources from non-HIERARCHICAL_ROLE agents.
* [MESOS-7665] - V0 Operator API update for reservation refinement.
* [MESOS-7668] - Update authorization to handle reservation refinement.
* [MESOS-7696] - Update resource provider design in the master
* [MESOS-7709] - Add --default_container_dns flag to the agent.
* [MESOS-7713] - Optimize number of copies made in dispatch/defer mechanism
* [MESOS-7755] - Update allocator to support updating agent total resources
* [MESOS-7757] - Update master to handle updates to agent total resources
* [MESOS-7767] - Make `net::IP` fields protected to allow for inheritance
* [MESOS-7780] - Add `SUBSCRIBE` call handling to the resource provider manager
* [MESOS-7806] - Add copy assignment operator to `net::IP::Network`
* [MESOS-7853] - Support shared PID namespace.
* [MESOS-7879] - The kill nested container call should provide ability to specify a signal.
Release Notes - Mesos - Version 1.3.3 (WIP)
-------------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-8125] - Agent should properly handle recovering an executor when its pid is reused.
* [MESOS-8171] - Using a failoverTimeout of 0 with Mesos native scheduler client can result in infinite subscribe loop.
* [MESOS-8411] - Killing a queued task can lead to the command executor never terminating.
* [MESOS-8480] - Mesos returns high resource usage when killing a Docker task.
* [MESOS-8488] - Docker bug can cause unkillable tasks.
* [MESOS-8552] - CGROUPS_ROOT_PidNamespaceForward and CGROUPS_ROOT_PidNamespaceBackward tests fail.
* [MESOS-8574] - Docker executor makes no progress when 'docker inspect' hangs.
* [MESOS-8575] - Improve discard handling for 'Docker::stop' and 'Docker::pull'.
* [MESOS-8576] - Improve discard handling of 'Docker::inspect()'.
* [MESOS-8605] - Terminal task status update will not send if 'docker inspect' is hung.
* [MESOS-8651] - Potential memory leaks in the `volume/sandbox_path` isolator.
* [MESOS-8786] - CgroupIsolatorProcess accesses subsystem processes directly.
* [MESOS-8876] - Normal exit of Docker container using rexray volume results in TASK_FAILED.
* [MESOS-8881] - Enable epoll backend in libevent integration.
* [MESOS-8885] - Disable libevent debug mode.
* [MESOS-8904] - Master crash when removing quota.
Release Notes - Mesos - Version 1.3.2
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-6743] - Docker executor hangs forever if `docker stop` fails.
* [MESOS-6950] - Launching two tasks with the same Docker image simultaneously may cause a staging dir never cleaned up.
* [MESOS-7652] - Docker image with universal containerizer does not work if WORKDIR is missing in the rootfs.
* [MESOS-7674] - Update the generic Protobuf to JSON facility to not output deprecated fields.
* [MESOS-7858] - Launching a nested container with namespace/pid isolation, with glibc < 2.25, may deadlock the LinuxLauncher and MesosContainerizer.
* [MESOS-7863] - Agent may drop pending kill task status updates.
* [MESOS-7865] - Agent may process a kill task and still launch the task.
* [MESOS-7872] - Scheduler hang when registration fails.
* [MESOS-7909] - Ordering dependency between 'linux/capabilities' and 'docker/runtime' isolator.
* [MESOS-7912] - Master WebUI not working in Chrome.
* [MESOS-7926] - Abnormal termination of default executor can cause MesosContainerizer::destroy to fail.
* [MESOS-7934] - OOM due to LibeventSSLSocket send incorrectly returning 0 after shutdown.
* [MESOS-8135] - Masters can lose track of tasks' executor IDs.
* [MESOS-8237] - Strip (Offer|Resource).allocation_info for non-MULTI_ROLE schedulers.
* [MESOS-8356] - Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used.
Release Notes - Mesos - Version 1.3.1
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-5187] - The filesystem/linux isolator does not set the permissions of the host_path.
* [MESOS-7252] - Need to fix resource check in long-lived framework.
* [MESOS-7429] - Allow isolators to inject task-specific environment variables.
* [MESOS-7540] - Add an agent flag for executor re-registration timeout.
* [MESOS-7546] - WAIT_NESTED_CONTAINER sometimes returns 404.
* [MESOS-7569] - Allow "old" executors with half-open connections to be preserved during agent upgrade / restart.
* [MESOS-7581] - Fix interference of external Boost installations when using some unbundled dependencies.
* [MESOS-7689] - Libprocess can crash on malformed request paths for libprocess messages.
* [MESOS-7690] - The agent can crash when an unknown executor tries to register.
* [MESOS-7692] - Default environment variables defined in Docker image are not available in Mesos containerizer.
* [MESOS-7703] - Mesos fails to exec a custom executor when no shell is used.
* [MESOS-7728] - Java HTTP adapter crashes JVM when leading master disconnects.
* [MESOS-7770] - Persistent volume might not be mounted if there is a sandbox volume whose source is the same as the target of the persistent volume.
* [MESOS-7777] - Agent failed to recover due to mount namespace leakage in Docker 1.12/1.13.
* [MESOS-7796] - LIBPROCESS_IP isn't passed on to the fetcher.
* [MESOS-7830] - Sandbox_path volume does not have ownership set correctly.
Release Notes - Mesos - Version 1.3.0
-------------------------------------
This release contains the following new features:
* [MESOS-1763] - Support for frameworks to receive resources for multiple
roles. This allows "multi-user" frameworks to leverage the role-based
resource allocation in Mesos. Prior to this support, one had to run
multiple instances of a single-user framework to achieve multi-user
resource allocation, or implement multi-user resource allocation in
the framework.
* [MESOS-6365] - Authentication and authorization support for HTTP executors.
A new `--authenticate_http_executors` agent flag enables required
authentication on the HTTP executor API. A new `--executor_secret_key` flag
sets a key file to be used when generating and authenticating default tokens
that are passed to HTTP executors. Note that enabling these flags after
upgrade is disruptive to HTTP executors that were launched before the
upgrade; see 'docs/authentication.md' for more information on these flags
and the recommended upgrade procedure. Implicit authorization rules have
been added which allow an authenticated executor to make executor API calls
as that executor and make operator API calls which affect that executor's
container. See 'docs/authorization.md' for more information on these
implicit authorization rules.
* [MESOS-6627] - Support for frameworks to modify the role(s) they are
subscribed to. This is essential to supporting "multi-user" frameworks
(see MESOS-1763) in that roles are expected to come and go over time
(e.g. new employees join, new teams are formed, employees leave, teams
are disbanded, etc).
**NOTE**: In Mesos 1.3.0, the master will no longer allow 0.x agents to
register. Interoperability between 1.1+ masters and 0.x agents has never
been supported; however, it was not explicitly disallowed, either.
Starting with this release of Mesos, registration attempts by 0.x Mesos
agents will be ignored.
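As a hedged sketch of how a framework might use multiple roles (MESOS-1763)
and later change them (MESOS-6627), the snippet below builds the JSON body of
a v1 scheduler `SUBSCRIBE` call with a `roles` list and the `MULTI_ROLE`
capability. The helper `subscribe_call`, the user, the framework name, the
role names, and the framework ID are hypothetical; sending the request to a
master is left out.

```python
import json

def subscribe_call(user, name, roles, framework_id=None):
    """Build the body of a v1 scheduler SUBSCRIBE call for a multi-role framework."""
    framework_info = {
        "user": user,
        "name": name,
        "roles": list(roles),
        # Frameworks that set `roles` are expected to declare MULTI_ROLE.
        "capabilities": [{"type": "MULTI_ROLE"}],
    }
    call = {"type": "SUBSCRIBE", "subscribe": {"framework_info": framework_info}}
    if framework_id is not None:
        # A re-subscription identifies the existing framework.
        framework_info["id"] = {"value": framework_id}
        call["framework_id"] = {"value": framework_id}
    return call

# Initial subscription with two assumed roles.
first = subscribe_call("alice", "multi-user-framework", ["team-a", "team-b"])

# Later re-subscription with a changed role set (one team leaves, one joins).
second = subscribe_call(
    "alice", "multi-user-framework", ["team-b", "team-c"],
    framework_id="hypothetical-framework-id",
)

print(json.dumps(second, indent=2))
```

The second call differs from the first only in its `roles` list and in
carrying the framework ID, which is how this sketch models roles coming and
going over time.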
Deprecations/Removals:
* [MESOS-7259] - Remove deprecated ACLs `SetQuota` and `RemoveQuota`.
This change is only applicable to the local authorizer since internally
these ACLs were being translated to the `UPDATE_QUOTA` action.
* [MESOS-7320] - Remove deprecated ACL `ShutdownFramework`.
This change is only applicable to the local authorizer since internally
this ACL was being translated to the `TEARDOWN_FRAMEWORK` action.
Unresolved Critical Issues:
* [MESOS-1625] - Extra trailing CRLF being sent after the HTTP body in libprocess.
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode().
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration.
* [MESOS-3533] - Unable to find and run URIs files.
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string.
* [MESOS-3794] - Master should not store arbitrarily sized data in ExecutorInfo.
* [MESOS-4259] - mesos HA can't delete the redundant container on failure slave node.
* [MESOS-4297] - Executor does not shutdown when framework teardown.
* [MESOS-4642] - Mesos Agent Json API can dump binary data from log files out as invalid JSON.
* [MESOS-4996] - 'containerizer->update' will always fail after killing a docker container.
* [MESOS-5352] - Docker volume isolator cleanup can be blocked by first cleanup failure.
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5849] - Agent sandboxes on Windows surpass the 260 character path length limit.
* [MESOS-5859] - Some tasks are always in staged state.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-5995] - Protobuf JSON deserialisation does not accept numbers formatted as strings.
* [MESOS-6356] - ASF CI has interleaved logging.
* [MESOS-6615] - Running mesos-slave in Docker leaves many zombie processes.
* [MESOS-6623] - Re-enable tests impacted by request streaming support.
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-6780] - ContentType/AgentAPIStreamingTest.AttachContainerInput test fails reliably.
* [MESOS-6784] - IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky.
* [MESOS-6804] - Running 'tty' inside a debug container that has a tty reports "Not a tty".
* [MESOS-6843] - Fetcher should not assume stdout/stderr in the sandbox.
* [MESOS-6913] - AgentAPIStreamingTest.AttachInputToNestedContainerSession fails on Mac OS.
* [MESOS-6974] - DefaultExecutorTest.CommitSuicideOnTaskFailure test is flaky.
* [MESOS-6986] - `abort` in `DRFSorter::add`.
* [MESOS-7017] - HTTP API responses can crash the master.
* [MESOS-7082] - ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.KillTask/0 is flaky.
* [MESOS-7099] - Quota can be exceeded due to coarse-grained offer technique.
* [MESOS-7215] - Race condition on re-registration of non-partition-aware frameworks.
* [MESOS-7298] - Fetcher caches files with world-readable permissions.
* [MESOS-7362] - GPU support does not work when running Spark.
* [MESOS-7374] - Running DOCKER images in Mesos Container Runtime without `linux/filesystem` isolation enabled renders host unusable.
* [MESOS-7381] - Flaky tests in NestedMesosContainerizerTest.
* [MESOS-7386] - Executor not cleaning up existing running docker containers if external logrotate/logger processes die/killed.
Feature Graduations:
* [MESOS-2449] - Support group of tasks (Pod) constructs and API in Mesos.
* [MESOS-4641] - Support Container Network Interface (CNI).
* [MESOS-6419] - Teardown unregistered frameworks.
All Experimental Features:
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3094] - Mesos on Windows.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-3567] - Support TCP checks in Mesos.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4355] - Implement isolator for Docker volume.
* [MESOS-4791] - Operator API v1.
* [MESOS-4828] - XFS disk quota isolator.
* [MESOS-5275] - Add capabilities support for mesos containerizer.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-5788] - Added JAVA API adapter for seamless transition to new scheduler API.
* [MESOS-5931] - Support auto backend in Mesos Containerizer.
* [MESOS-6014] - Added port mapping CNI plugin.
* [MESOS-6077] - Added a default (task group) executor.
* [MESOS-6402] - rlimit support for Mesos containerizer.
* [MESOS-6460] - Container Attach/Exec.
* [MESOS-6758] - Support docker registry that requires basic auth.
* [MESOS-6906] - Introduce a general non-interpreting task check.
All Resolved Issues:
** Bug
* [MESOS-1987] - Add support for SemVer build and prerelease labels to stout.
* [MESOS-4245] - Add `dist` target to CMake solution.
* [MESOS-4263] - Report volume usage through ResourceStatistics.
* [MESOS-5028] - Copy provisioner cannot replace directory with symlink.
* [MESOS-5172] - Registry puller cannot fetch blobs correctly from http Redirect 3xx urls.
* [MESOS-5288] - Update leveldb patch file to support s390x.
* [MESOS-5880] - Semantics of `environment` differ across Windows and POSIX.
* [MESOS-6134] - Port CFS quota support to Docker Containerizer using command executor.
* [MESOS-6138] - Add 'syntax=proto2' to all .proto files in Mesos.
* [MESOS-6327] - Large docker images cause container launch failures: Too many levels of symbolic links.
* [MESOS-6560] - The default stout stringify always copies its argument.
* [MESOS-6606] - Reject optimized builds with libcxx before 3.9.
* [MESOS-6720] - Check that `PreferredToolArchitecture` is set to `x64` on Windows before building.
* [MESOS-6730] - Reserve operation should validate reserved resource role against resource allocationInfo role.
* [MESOS-6731] - Create a test filter for stout tests that use `symlink` on Windows, as they will fail if not run as admin.
* [MESOS-6732] - XFS disk isolator should check whether quotas are enabled.
* [MESOS-6742] - Adding support for s390x architecture.
* [MESOS-6815] - Enable glog stack traces when we call things like `ABORT` on Windows.
* [MESOS-6858] - network/cni isolator generates incomplete resolv.conf.
* [MESOS-6868] - Transition Windows away from `os::killtree`.
* [MESOS-6892] - Reconsider process creation primitives on Windows.
* [MESOS-6907] - FutureTest.After3 is flaky.
* [MESOS-6951] - Docker containerizer: mangled environment when env value contains LF byte.
* [MESOS-6953] - A compromised mesos-master node can execute code as root on agents.
* [MESOS-6976] - Disallow (re-)registration attempts by old agents.
* [MESOS-6982] - PerfTest.Version fails on recent Arch Linux.
* [MESOS-7022] - Update framework authorization to support multiple roles.
* [MESOS-7029] - FaultToleranceTest.FrameworkReregister is flaky.
* [MESOS-7035] - Add test for framework upgrading to MULTI_ROLE with tasks running.
* [MESOS-7049] - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_PERF_PerfTest is broken on Fedora 25.
* [MESOS-7097] - Framework credentials can be used to register as an agent.
* [MESOS-7133] - mesos-fetcher fails with openssl-related output.
* [MESOS-7135] - Outstanding offers to a dropped framework role should be rescinded.
* [MESOS-7146] - OSX broken due to wrong configuration of LevelDB after update.
* [MESOS-7158] - Add `role` to task/executor to indicate allocation role of their resources.
* [MESOS-7165] - Agents should be able to upgrade to be MULTI_ROLE capable.
* [MESOS-7172] - CMake does not incrementally recompile.
* [MESOS-7182] - Couple of MULTI_ROLE related tests are flaky.
* [MESOS-7197] - Requesting tiny amount of CPU crashes master.
* [MESOS-7208] - Persistent volume ownership is set to root when task is running with non-root user.
* [MESOS-7210] - HTTP health check doesn't work when mesos runs with --docker_mesos_image.
* [MESOS-7225] - Tasks launched via the default executor cannot access disk resource volumes.
* [MESOS-7236] - Base64 encoding/decoding (via stout) behaves differently on Windows.
* [MESOS-7237] - Enabling cgroups_limit_swap can lead to "invalid argument" error.
* [MESOS-7248] - RemoveNestedContainer returns unsupported.
* [MESOS-7255] - New mesos-style.py linter behavior breaks committing when virtualenv is not installed.
* [MESOS-7259] - Remove deprecated ACLs `SetQuota` and `RemoveQuota`.
* [MESOS-7261] - maintenance.html is missing during packaging.
* [MESOS-7263] - User supplied task environment variables cause warnings in sandbox stdout.
* [MESOS-7264] - Possibly duplicate environment variables should not leak values to the sandbox.
* [MESOS-7265] - Containerizer startup may cause sensitive data to leak into sandbox logs.
* [MESOS-7270] - Java V1 Framework Test failed on macOS.
* [MESOS-7272] - Unified containerizer does not support docker registry version < 2.3.
* [MESOS-7280] - Unified containerizer provisions docker image error with COPY backend.
* [MESOS-7281] - Backwards incompatible UpdateFrameworkMessage handling.
* [MESOS-7287] - Fix post-reviews.py to find `rbt.cmd` on Windows.
* [MESOS-7300] - Mesos failed to build on Windows due to error C2440: 'return': cannot convert from 'Error' to 'bool'.
* [MESOS-7311] - CopyFetcherPluginTest.FetchExistingFile.
* [MESOS-7316] - Upgrading Mesos to 1.2.0 results in some information missing from the `/flags` endpoint.
* [MESOS-7323] - Framework role tracking in allocator results in framework treated as active incorrectly.
* [MESOS-7340] - Log HTTP accesses to the /files endpoint.
* [MESOS-7346] - Agent crashes if the task name is too long.
* [MESOS-7348] - Network isolator crashes agent on startup when network interface cannot be found.
* [MESOS-7350] - Failed to pull image from Nexus Registry due to signature missing.
* [MESOS-7363] - Improve master robustness against duplicate UPIDs.
* [MESOS-7365] - Compile error with recent glibc.
* [MESOS-7372] - Improve agent re-registration robustness.
* [MESOS-7378] - Build failure with glibc 2.12.
* [MESOS-7389] - Mesos 1.2.0 crashes with pre-1.0 Mesos agents.
* [MESOS-7400] - The mesos master crashes due to an incorrect invariant check in the decoder.
* [MESOS-7427] - Registry puller cannot fetch manifests from Amazon ECR: 405 Unsupported.
* [MESOS-7430] - Per-role Suppress call implementation is broken.
* [MESOS-7431] - Registry puller cannot fetch manifests from Google GCR: 403 Forbidden.
* [MESOS-7453] - glyphicons-halflings-regular.woff2 is missing in WebUI.
* [MESOS-7456] - Compilation error on recent glibc in cgroups device subsystem.
* [MESOS-7464] - Recent Docker versions cannot be parsed by stout.
* [MESOS-7471] - Provisioner recover should not always assume 'rootfses' dir exists.
* [MESOS-7478] - Pre-1.2.x master does not work with 1.2.x agent.
* [MESOS-7484] - VersionTest.ParseInvalid aborts on Windows.
* [MESOS-7521] - Major performance regression in DRF sorter.
* [MESOS-7538] - Don't validate re-registrations that are going to be dropped.
** Documentation
* [MESOS-7005] - Add executor authentication documentation.
* [MESOS-7324] - Update documentation to reflect the addition of multi-role framework support.
** Epic
* [MESOS-1763] - Add support for frameworks to receive resources for multiple roles.
* [MESOS-6365] - Executor authentication.
* [MESOS-6627] - Allow frameworks to modify the role(s) they are subscribed to.
** Improvement
* [MESOS-970] - Upgrade bundled leveldb to 1.19.
* [MESOS-5186] - mesos.interface: Allow using protobuf 3.x.
* [MESOS-5992] - Complete the list of API Calls on the Operator HTTP API Doc.
* [MESOS-6280] - Task group executor should support command health checks.
* [MESOS-6304] - Add authentication support to the default executor.
* [MESOS-6523] - Agent cgroup assignment should precede agent initialization.
* [MESOS-6906] - Introduce a general non-interpreting task check.
* [MESOS-7021] - Consistent symlink behavior for os::stat accessors.
* [MESOS-7074] - port_mapping isolator: do not depend on /sys/class/net/<ifname>/speed.
* [MESOS-7101] - ExamplesTest.PersistentVolumeFramework failed on ASF CI.
* [MESOS-7120] - Add an Agent API call to cleanup nested container artifacts.
* [MESOS-7226] - Introduce precompiled headers (on Windows).
* [MESOS-7249] - Default executor does not support general checks.
* [MESOS-7256] - Replace Boost Type Traits leftovers with STL.
* [MESOS-7274] - Health checker does not support pause / resume.
* [MESOS-7275] - General checker does not support TCP checks.
* [MESOS-7276] - General checker does not support pause / resume.
* [MESOS-7277] - General checker does not support command checks via agent.
* [MESOS-7376] - Reduce copying of the Registry to improve Registrar performance.
* [MESOS-7387] - ZK master contender and detector don't respect zk_session_timeout option.
** Task
* [MESOS-3139] - Incorporate CMake into standard documentation.
* [MESOS-5418] - Test case: Escape containerizer command line on Windows.
* [MESOS-6022] - unit-test for port-mapper CNI plugin.
* [MESOS-6032] - Add infrastructure for unit tests in the new python-based CLI.
* [MESOS-6123] - Implement GET_AGENT call in v1 agent API.
* [MESOS-6447] - Display role weight / role quota information in the webui.
* [MESOS-6636] - Validate that tasks / executors / reservations / volumes do not mix Resource.allocation_info.roles.
* [MESOS-6637] - Validate that schedulers cannot perform operations on offers with different allocation roles.
* [MESOS-6657] - Update the webui to reflect that frameworks have multiple roles.
* [MESOS-6691] - Enable SSL in Mesos builds.
* [MESOS-6762] - Update release notes for multi-role changes.
* [MESOS-6791] - Allow specifying the device whitelist entries in cgroup devices subsystem.
* [MESOS-6808] - Refactor Docker::run to only take docker cli parameters.
* [MESOS-6855] - Add `role` section to response of /state endpoint.
* [MESOS-6886] - Add authorization tests for debug API handlers.
* [MESOS-6940] - Do not send offers to MULTI_ROLE schedulers if agent does not have MULTI_ROLE capability.
* [MESOS-6967] - Ensure offer operations can be applied for MULTI_ROLE and non-MULTI_ROLE frameworks.
* [MESOS-6992] - Remove validation against "/" characters in roles to support hierarchical roles.
* [MESOS-6995] - Update the webui to reflect hierarchical roles.
* [MESOS-6996] - Add a 'Secret' protobuf message.
* [MESOS-6997] - Add the SecretGenerator module interface.
* [MESOS-6998] - Add authentication support to agent's '/v1/executor' endpoint.
* [MESOS-6999] - Add agent support for generating and passing executor secrets.
* [MESOS-7000] - Implement a JWT SecretGenerator.
* [MESOS-7001] - Implement a JWT authenticator.
* [MESOS-7003] - Introduce a 'Principal' type.
* [MESOS-7004] - Enable multiple HTTP authenticator modules.
* [MESOS-7009] - Add a 'secret' field to the 'Environment' message.
* [MESOS-7011] - Add an '--executor_secret_key' flag to the agent.
* [MESOS-7013] - Update the authorizer interface for executor authentication.
* [MESOS-7014] - Add implicit executor authorization to local authorizer.
* [MESOS-7024] - Update the allocator to handle hierarchical roles.
* [MESOS-7026] - Update authorization / authorization-filtering to handle hierarchical roles.
* [MESOS-7037] - Prevent setting quota on nested roles not contained by parent role quota.
* [MESOS-7038] - Update quota cluster capacity heuristic for hierarchical roles.
* [MESOS-7039] - Prevent quota removal that violates parent role-child role quota containment.
* [MESOS-7047] - Update agent for hierarchical roles.
* [MESOS-7048] - Remove adjustment code within Resources::apply.
* [MESOS-7061] - Re-persist tasks/executors with allocation info during agent recovery.
* [MESOS-7063] - Add a test for a MULTI_ROLE master reregistering an old agent.
* [MESOS-7269] - Migrate setting in config.py to a TOML file.
* [MESOS-7282] - Create a table abstraction for the Mesos CLI.
* [MESOS-7320] - Remove deprecated ACL `ShutdownFramework`.
* [MESOS-7336] - Add resource provider API protobuf.
* [MESOS-7339] - Add authorization to agent executor API.
* [MESOS-7377] - Add authentication to the checker and health checker libraries.
* [MESOS-7391] - Add deprecation warning for Visual Studio 14 2015.
* [MESOS-7395] - Benchmark performance of hierarchical roles.
* [MESOS-7439] - Bump the default timeout value for docker volume driver unmount operation.
Release Notes - Mesos - Version 1.2.3
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-6743] - Docker executor hangs forever if `docker stop` fails.
* [MESOS-6950] - Launching two tasks with the same Docker image simultaneously may cause a staging dir never cleaned up.
* [MESOS-7365] - Compile error with recent glibc.
* [MESOS-7378] - Build failure with glibc 2.12.
* [MESOS-7627] - Mesos slave gets stuck.
* [MESOS-7652] - Docker image with universal containerizer does not work if WORKDIR is missing in the rootfs.
* [MESOS-7744] - Mesos Agent Sends TASK_KILL status update to Master, and still launches task.
* [MESOS-7783] - Framework might not receive status update when a just launched task is killed immediately.
* [MESOS-7858] - Launching a nested container with namespace/pid isolation, with glibc < 2.25, may deadlock the LinuxLauncher and MesosContainerizer.
* [MESOS-7863] - Agent may drop pending kill task status updates.
* [MESOS-7865] - Agent may process a kill task and still launch the task.
* [MESOS-7872] - Scheduler hang when registration fails.
* [MESOS-7909] - Ordering dependency between 'linux/capabilities' and 'docker/runtime' isolator.
* [MESOS-7926] - Abnormal termination of default executor can cause MesosContainerizer::destroy to fail.
* [MESOS-7934] - OOM due to LibeventSSLSocket send incorrectly returning 0 after shutdown.
* [MESOS-7968] - Handle `/proc/self/ns/pid_for_children` when parsing available namespace.
* [MESOS-7969] - Handle cgroups v2 hierarchy when parsing /proc/self/cgroups.
* [MESOS-7975] - The command/default/docker executor can incorrectly send a TASK_FINISHED update even when the task is killed.
* [MESOS-7980] - Stout fails to compile with libc >= 2.26.
* [MESOS-8051] - Killing TASK_GROUP fails to kill some tasks.
* [MESOS-8080] - The default executor does not propagate missing task exit status correctly.
* [MESOS-8135] - Masters can lose track of tasks' executor IDs.
Release Notes - Mesos - Version 1.2.2
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-5187] - The filesystem/linux isolator does not set the permissions of the host_path.
* [MESOS-7252] - Need to fix resource check in long-lived framework.
* [MESOS-7546] - WAIT_NESTED_CONTAINER sometimes returns 404.
* [MESOS-7569] - Allow "old" executors with half-open connections to be preserved during agent upgrade / restart.
* [MESOS-7581] - Fix interference of external Boost installations when using some unbundled dependencies.
* [MESOS-7689] - Libprocess can crash on malformed request paths for libprocess messages.
* [MESOS-7690] - The agent can crash when an unknown executor tries to register.
* [MESOS-7703] - Mesos fails to exec a custom executor when no shell is used.
* [MESOS-7728] - Java HTTP adapter crashes JVM when leading master disconnects.
* [MESOS-7770] - Persistent volume might not be mounted if there is a sandbox volume whose source is the same as the target of the persistent volume.
* [MESOS-7777] - Agent failed to recover due to mount namespace leakage in Docker 1.12/1.13.
* [MESOS-7796] - LIBPROCESS_IP isn't passed on to the fetcher.
* [MESOS-7830] - Sandbox_path volume does not have ownership set correctly.
** Improvement
* [MESOS-7540] - Add an agent flag for executor re-registration timeout.
Release Notes - Mesos - Version 1.2.1
-------------------------------------
* This is a bug fix release.
**NOTE**: In Mesos 1.2.1, the master will no longer allow 0.x agents to
register. Interoperability between 1.1+ masters and 0.x agents has never
been supported; however, it was not explicitly disallowed, either.
Starting with this release of Mesos, registration attempts by 0.x Mesos
agents will be ignored.
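The rule above can be pictured with a minimal sketch. This is an
illustration only, not the master's actual code: the function name
`should_ignore_registration` and the simple `major.minor.patch` parsing
are assumptions made for this example.

```python
def should_ignore_registration(agent_version: str) -> bool:
    """Return True if a registration attempt from this agent version is ignored.

    Hypothetical helper: it models only the "0.x agents are ignored" rule
    described in the note, nothing else the master may check.
    """
    # Take the major component of a version string such as "0.28.2" or "1.2.1".
    major = int(agent_version.split(".")[0])
    return major < 1
```

Under these assumptions, a `0.28.2` agent would be ignored while a `1.0.0`
agent would not.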
All Issues:
** Bug
* [MESOS-1987] - Add support for SemVer build and prerelease labels to stout.
* [MESOS-5028] - Copy provisioner cannot replace directory with symlink.
* [MESOS-5172] - Registry puller cannot fetch blobs correctly from http Redirect 3xx urls.
* [MESOS-6327] - Large docker images cause container launch failures: Too many levels of symbolic links.
* [MESOS-6951] - Docker containerizer: mangled environment when env value contains LF byte.
* [MESOS-6976] - Disallow (re-)registration attempts by old agents.
* [MESOS-7133] - mesos-fetcher fails with openssl-related output.
* [MESOS-7197] - Requesting tiny amount of CPU crashes master.
* [MESOS-7208] - Persistent volume ownership is set to root when task is running with non-root user.
* [MESOS-7210] - HTTP health check doesn't work when mesos runs with --docker_mesos_image.
* [MESOS-7232] - Add support to auto-load /dev/nvidia-uvm in the GPU isolator.
* [MESOS-7237] - Enabling cgroups_limit_swap can lead to "invalid argument" error.
* [MESOS-7261] - maintenance.html is missing during packaging.
* [MESOS-7263] - User supplied task environment variables cause warnings in sandbox stdout.
* [MESOS-7264] - Possibly duplicate environment variables should not leak values to the sandbox.
* [MESOS-7265] - Containerizer startup may cause sensitive data to leak into sandbox logs.
* [MESOS-7272] - Unified containerizer does not support docker registry version < 2.3.
* [MESOS-7280] - Unified containerizer provisions docker image error with COPY backend.
* [MESOS-7316] - Upgrading Mesos to 1.2.0 results in some information missing from the `/flags` endpoint.
* [MESOS-7346] - Agent crashes if the task name is too long.
* [MESOS-7350] - Failed to pull image from Nexus Registry due to signature missing.
* [MESOS-7366] - Agent sandbox gc could accidentally delete the entire persistent volume content.
* [MESOS-7368] - Documentation of framework role(s) in proto definition is confusing.
* [MESOS-7383] - Docker executor logs possibly sensitive parameters.
* [MESOS-7389] - Mesos 1.2.0 crashes with pre-1.0 Mesos agents.
* [MESOS-7400] - The mesos master crashes due to an incorrect invariant check in the decoder.
* [MESOS-7427] - Registry puller cannot fetch manifests from Amazon ECR: 405 Unsupported.
* [MESOS-7429] - Allow isolators to inject task-specific environment variables.
* [MESOS-7453] - glyphicons-halflings-regular.woff2 is missing in WebUI.
* [MESOS-7464] - Recent Docker versions cannot be parsed by stout.
* [MESOS-7471] - Provisioner recover should not always assume 'rootfses' dir exists.
* [MESOS-7478] - Pre-1.2.x master does not work with 1.2.x agent.
* [MESOS-7484] - VersionTest.ParseInvalid aborts on Windows.
Release Notes - Mesos - Version 1.2.0
-------------------------------------
This release contains the following new features:
* [MESOS-5931] - **Experimental** Support auto backend in Mesos Containerizer,
preferring overlayfs, then aufs. Please note that the bind backend needs to be
specified explicitly through the agent flag '--image_provisioner_backend'
since it requires that the sandbox already exist.
* [MESOS-6402] - **Experimental** Add rlimit support to Mesos containerizer.
The isolator adds support for setting POSIX resource limits (rlimits) for
containers launched using the Mesos containerizer. POSIX rlimits can be used
to control the resources a process can consume. See `docs/posix_rlimits.md`
for details.
* [MESOS-6419] - **Experimental** Teardown unregistered frameworks. The master
now treats recovered frameworks very similarly to frameworks that are registered
but currently disconnected. For example, recovered frameworks will be reported
via the normal "frameworks" key when querying HTTP endpoints. This means there
is no longer a concept of "orphan tasks": if the master knows about a task, the
task will be running under a framework. Similarly, "teardown" operations on
recovered frameworks will now work correctly.
* [MESOS-6460] - **Experimental** Container Attach and Exec. This feature adds
new Agent APIs for attaching a remote client to the stdin, stdout, and stderr
of a running Mesos task, as well as an API for launching new processes inside
the same container as a running Mesos task and attaching to its stdin, stdout,
and stderr. At a high level, these APIs mimic functionality similar to docker
attach and docker exec. The primary motivation for such functionality is to
enable users to debug their running Mesos tasks.
* [MESOS-6758] - **Experimental** Support 'Basic' auth for private docker
registries in the Mesos containerizer. Until now, the Mesos containerizer
always assumed Bearer auth, but it now also supports Basic auth for private
registries. Please note that AWS ECS uses Basic authorization, but it does not
work yet due to the redirect issue MESOS-5172.
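The difference between the two schemes mentioned for MESOS-6758 comes down
to the HTTP `Authorization` request header. A minimal sketch in Python,
following the standard Basic and Bearer header formats; the helper names
are hypothetical and this is not the Mesos containerizer's code:

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build a Basic `Authorization` header value (hypothetical helper)."""
    # Basic auth sends base64("username:password") on every request.
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def bearer_auth_header(token: str) -> str:
    """Build a Bearer `Authorization` header value (hypothetical helper)."""
    # Bearer auth sends an opaque token obtained earlier from a token service.
    return "Bearer " + token
```

With Basic auth the client can authenticate directly with its credentials,
whereas Bearer auth requires first obtaining a token from an auth service.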
Deprecations:
* [MESOS-6650] - Remove slavePreLaunchDockerEnvironmentDecorator and slavePreLaunchDockerHook.
Additional API Changes:
* [MESOS-3601] - Formalize all headers and metadata for HTTP API Event Stream
* [MESOS-6286] - If an agent restarts but fails to complete recovery
within `agent_reregister_timeout`, the master will now mark the
agent as unreachable. This mainly changes behavior in two
situations: (a) the master will now be more robust if agent recovery
hangs indefinitely (e.g., due to a container being in a bad state),
and (b) if agent recovery takes a very long time (e.g., because the
agent's work directory contains a large number of completed tasks),
the master might now mark an agent unreachable that would previously
have been able to eventually recover successfully.
* [MESOS-6419] - When a framework reregisters after master failover,
it is only allowed to change certain fields in its FrameworkInfo.
For example, changing "failover_timeout" is allowed, but changing
"role" is not. In previous Mesos releases, the same restrictions on
changes to FrameworkInfo were only enforced after framework
failover, not master failover.
* [MESOS-6670] - Authz for Agent v1 operator API
* [MESOS-6675] - Changed the allocator API to support adding inactive
frameworks. Custom allocator implementations will need to be updated.
* [MESOS-6865] - Remove the constraint of only being able to launch
2-level nested containers via the Agent API.
Unresolved Critical Issues:
* [MESOS-1625] - Extra trailing CRLF being sent after the HTTP body in libprocess
* [MESOS-1718] - Command executor can overcommit the agent.
* [MESOS-2554] - Slave flaps when using --slave_subsystems that are not used for isolation.
* [MESOS-2774] - SIGSEGV received during process::MessageEncoder::encode()
* [MESOS-2842] - Update FrameworkInfo.principal on framework re-registration
* [MESOS-3533] - Unable to find and run URIs files
* [MESOS-3747] - HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string
* [MESOS-3794] - Master should not store arbitrarily sized data in ExecutorInfo.
* [MESOS-4259] - mesos HA can't delete the redundant container on failure slave node.
* [MESOS-4297] - Executor does not shutdown when framework teardown.
* [MESOS-4642] - Mesos Agent Json API can dump binary data from log files out as invalid JSON.
* [MESOS-4996] - 'containerizer->update' will always fail after killing a docker container.
* [MESOS-5352] - Docker volume isolator cleanup can be blocked by first cleanup failure.
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5849] - Agent sandboxes on Windows surpass the 260 character path length limit
* [MESOS-5859] - Some tasks are always in staged state.
* [MESOS-5989] - Libevent SSL Socket downgrade code accesses uninitialized memory / assumes single peek is sufficient.
* [MESOS-6327] - Large docker images cause container launch failures: Too many levels of symbolic links.
* [MESOS-6356] - ASF CI has interleaved logging.
* [MESOS-6615] - Running mesos-slave in docker leaves many zombie processes
* [MESOS-6623] - Re-enable tests impacted by request streaming support
* [MESOS-6632] - ContainerLogger might leak FD if container launch fails.
* [MESOS-6780] - ContentType/AgentAPIStreamingTest.AttachContainerInput test fails reliably
* [MESOS-6784] - IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky
* [MESOS-6804] - Running 'tty' inside a debug container that has a tty reports "Not a tty"
* [MESOS-6815] - Enable glog stack traces when we call things like `ABORT` on Windows
* [MESOS-6843] - Fetcher should not assume stdout/stderr in the sandbox.
* [MESOS-6913] - AgentAPIStreamingTest.AttachInputToNestedContainerSession fails on Mac OS.
* [MESOS-6974] - DefaultExecutorTest.CommitSuicideOnTaskFailure test is flaky.
* [MESOS-6986] - abort in DRFSorter::add
* [MESOS-7017] - HTTP API responses can crash the master.
* [MESOS-7050] - IOSwitchboard FDs leaked when containerizer launch fails -- leads to deadlock
* [MESOS-7099] - Quota can be exceeded due to coarse-grained offer technique.
Feature Graduations:
* None
All Experimental Features:
* [MESOS-2449] - Support group of tasks (Pod) constructs and API in Mesos.
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3094] - Mesos on Windows.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-3567] - Support TCP checks in Mesos.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4355] - Implement isolator for Docker volume.
* [MESOS-4641] - Support Container Network Interface (CNI).
* [MESOS-4791] - Operator API v1.
* [MESOS-4828] - XFS disk quota isolator.
* [MESOS-5275] - Add capabilities support for mesos containerizer.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-5788] - Added JAVA API adapter for seamless transition to new scheduler API.
* [MESOS-5931] - **NEW** Support auto backend in Mesos Containerizer.
* [MESOS-6014] - Added port mapping CNI plugin.
* [MESOS-6077] - Added a default (task group) executor.
* [MESOS-6402] - **NEW** rlimit support for Mesos containerizer
* [MESOS-6419] - **NEW** Teardown unregistered frameworks
* [MESOS-6460] - **NEW** Container Attach/Exec
* [MESOS-6758] - **NEW** Support docker registry that requires basic auth.
All Issues:
** Bug
* [MESOS-1802] - HealthCheckTest.HealthStatusChange is flaky on jenkins.
* [MESOS-2537] - AC_ARG_ENABLED checks are broken
* [MESOS-2723] - The mesos-execute tool does not support zk:// master URLs
* [MESOS-3335] - FlagsBase copy-ctor leads to dangling pointer.
* [MESOS-3932] - Silence Boost compiler warnings with CMake
* [MESOS-4601] - Don't dump stack trace on failure to bind()
* [MESOS-4695] - SlaveTest.StateEndpoint is flaky
* [MESOS-4973] - Duplicates in 'unregistered_frameworks' in /state
* [MESOS-4975] - mesos::internal::master::Slave::tasks can grow unboundedly
* [MESOS-5218] - Fetcher should not chown the entire sandbox.
* [MESOS-5303] - Add capabilities support for mesos execute cli.
* [MESOS-5662] - Call parent class `SetUpTestCase` function in our test fixtures.
* [MESOS-5821] - Clean up the thousands of compiler warnings on MSVC
* [MESOS-5835] - Audit `PATCH_CMD`; make sure all patches are being applied on Windows.
* [MESOS-5856] - Logrotate ContainerLogger module does not rotate logs when run as root with `--switch_user`.
* [MESOS-5879] - cgroups/net_cls isolator causing agent recovery issues
* [MESOS-5963] - HealthChecker should not decide when to kill tasks and when to stop performing health checks.
* [MESOS-6001] - Aufs backend cannot support the image with numerous layers.
* [MESOS-6002] - The whiteout file cannot be removed correctly using aufs backend.
* [MESOS-6010] - Docker registry puller shows decode error "No response decoded".
* [MESOS-6119] - TCP health checks are not portable.
* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
* [MESOS-6206] - Change reconciliation to return results for in-progress removals and reregistrations
* [MESOS-6286] - Master does not remove an agent if it is responsive but not registered
* [MESOS-6288] - The default executor should maintain launcher_dir.
* [MESOS-6293] - HealthCheckTest.HealthyTaskViaHTTPWithoutType fails on some distros.
* [MESOS-6316] - CREATE of shared volumes should not be allowed by frameworks not opted in to the capability.
* [MESOS-6320] - Implement clang-tidy check to catch incorrect flags hierarchies
* [MESOS-6349] - JSON Generation breaks if other locale than C is used.
* [MESOS-6360] - The handling of whiteout files in provisioner is not correct.
* [MESOS-6380] - mesos-local failed to start without sudo
* [MESOS-6388] - Report new PARTITION_AWARE task statuses in HTTP endpoints
* [MESOS-6389] - Update webui for PARTITION_AWARE changes
* [MESOS-6409] - mesos-ps - Invalid header value
* [MESOS-6414] - cgroups isolator cleanup failed when the hierarchy is cleaned up by docker daemon
* [MESOS-6419] - The 'master/teardown' endpoint should support tearing down 'unregistered_frameworks'.
* [MESOS-6420] - Mesos Agent leaking sockets when port mapping network isolator is ON
* [MESOS-6432] - Roles with quota assigned can "game" the system to receive excessive resources.
* [MESOS-6444] - Ensure single copy of shared count of total resources in role sorter.
* [MESOS-6446] - WebUI redirect doesn't work with stats from /metric/snapshot
* [MESOS-6448] - Show the leading master hostname in the webUI.
* [MESOS-6452] - Compile error in strerror.h on OSX
* [MESOS-6455] - DefaultExecutorTests fail when running on hosts without docker.
* [MESOS-6459] - PosixRLimitsIsolatorTest.TaskExceedingLimit fails on OS X
* [MESOS-6461] - Duplicate framework ids in /master/frameworks endpoint 'unregistered_frameworks'.
* [MESOS-6478] - "filesystem/linux" isolator leaks (phantom) mounts in `mount` output
* [MESOS-6483] - Check failure when a 1.1 master marking a 0.28 agent as unreachable
* [MESOS-6484] - Memory leak in `Future<T>::after()`
* [MESOS-6501] - Add a test for duplicate framework ids in "unregistered_frameworks"
* [MESOS-6504] - Use 'geteuid()' for the root privileges check.
* [MESOS-6508] - monitor/statistics error in webui when launch mesos via mesos-local
* [MESOS-6516] - Parallel test running does not respect GTEST_FILTER
* [MESOS-6519] - MasterTest.OrphanTasksMultipleAgents
* [MESOS-6520] - Make errno an explicit argument for ErrnoError.
* [MESOS-6526] - `mesos-containerizer launch --environment` exposes executor env vars in `ps`.
* [MESOS-6527] - Memory leak in the libprocess request decoder.
* [MESOS-6544] - MasterMaintenanceTest.InverseOffersFilters is flaky.
* [MESOS-6545] - TestContainerizer is not thread-safe.
* [MESOS-6566] - The Docker executor should not leak task env variables in the Docker command cmd line.
* [MESOS-6569] - MesosContainerizer/DefaultExecutorTest.KillTask/0 failing on ASF CI
* [MESOS-6576] - DefaultExecutorTest.KillTaskGroupOnTaskFailure sometimes fails in CI
* [MESOS-6588] - LinuxRootfs misses required files
* [MESOS-6597] - Include v1 Operator API protos in generated JAR and python packages.
* [MESOS-6598] - Broken Link Framework Development Page
* [MESOS-6602] - Shutdown completed frameworks when unreachable agent reregisters
* [MESOS-6604] - Uninitialized member ObjectApprover::weight_info.
* [MESOS-6606] - Reject optimized builds with libcxx before 3.9
* [MESOS-6618] - Some tests use hardcoded port numbers.
* [MESOS-6619] - Improve task management for unreachable tasks
* [MESOS-6621] - SSL downgrade path will CHECK-fail when using both temporary and persistent sockets
* [MESOS-6624] - Master WebUI does not work on Firefox 45
* [MESOS-6625] - Expose container id in ContainerStatus in DockerContainerizer.
* [MESOS-6640] - mesos-local doesn't handle --work_dir correctly.
* [MESOS-6646] - StreamingRequestDecoder incompletely initializes its http_parser_settings
* [MESOS-6647] - Cyclic header dependency between libprocess' defer.hpp and executor.hpp
* [MESOS-6652] - Perf version not correctly parsed on Fedora 24 (and probably others)
* [MESOS-6653] - Overlayfs backend may fail to mount the rootfs if both container image and image volume are specified.
* [MESOS-6654] - Duplicate image layer ids may make the backend fail to mount rootfs.
* [MESOS-6658] - Mesos tests generated with cmake build fail to unload libraries properly
* [MESOS-6665] - io::redirect might cause stack overflow.
* [MESOS-6666] - HttpServeTest.Discard failed on OSX sierra
* [MESOS-6672] - Class DynamicLibrary's default copy constructor can lead to inconsistent state
* [MESOS-6676] - Always re-link with scheduler during re-registration.
* [MESOS-6677] - Error in Windows agent's Flags::runtime_dir CLI
* [MESOS-6684] - Update addFramework/removeFramework to handle multi-role frameworks
* [MESOS-6685] - Update Role::Resources to correctly account for multi-role frameworks
* [MESOS-6688] - IOSwitchboard should recover spawned server pid on agent restarts
* [MESOS-6689] - Remove of unix domain socket path in IOSwitchboard::cleanup
* [MESOS-6700] - Port `http_tests.cpp`
* [MESOS-6701] - Port `recordio_tests.cpp`
* [MESOS-6704] - Port `executor_http_api_tests.cpp`
* [MESOS-6707] - Port `gc_tests.cpp`
* [MESOS-6710] - Port `http_authentication_tests.cpp`
* [MESOS-6711] - Port `values_tests.cpp`
* [MESOS-6716] - Port `uri_tests.cpp`
* [MESOS-6717] - Add Windows support to agent test harness
* [MESOS-6718] - Should destroy DEBUG containers on agent recovery.
* [MESOS-6722] - Agent tries to use POSIX paths for the variable data runtime directory.
* [MESOS-6725] - The style of `.navbar-text` is inconsistent with the style of texts on the left side
* [MESOS-6726] - IOSwitchboardServerFlags adds flags for non-optional fields w/o providing a default value
* [MESOS-6736] - CMake's `CURRENT_CMAKE_BUILD_DIR` does not escape '\'
* [MESOS-6737] - The agent should synchronize with the IOSwitchboard to determine when it is ready to accept incoming connections.
* [MESOS-6739] - Authorize v1 GET_CONTAINERS call
* [MESOS-6740] - Authorize v1 GET_FLAGS call
* [MESOS-6741] - Authorize v1 SET_LOGGING_LEVEL call
* [MESOS-6744] - DefaultExecutorTest.KillTaskGroupOnTaskFailure is flaky
* [MESOS-6745] - MesosContainerizer/DefaultExecutorTest.KillTask/0 is flaky
* [MESOS-6746] - IOSwitchboard doesn't properly flush data on ATTACH_CONTAINER_OUTPUT
* [MESOS-6747] - ContainerLogger runnable must not inherit the slave environment.
* [MESOS-6748] - I/O switchboard should inherit agent environment variables.
* [MESOS-6750] - Metrics on the Agent view of the Mesos web UI flickers between empty and non-empty states
* [MESOS-6756] - I/O switchboard should deal with the case when reaping of the server failed.
* [MESOS-6757] - Consider using CMake to configure test scripts in the `bin/` directory
* [MESOS-6761] - Implement `os::user` on Windows
* [MESOS-6767] - Reached unreachable statement at <path>/mesos/src/slave/containerizer/mesos/launch.cpp:766
* [MESOS-6772] - Stop building `mesos-agent` twice.
* [MESOS-6775] - The 'http::connect(address)' always uses the DEFAULT_KIND() of socket even if SSL is undesired.
* [MESOS-6781] - Mesos containerizer overrides environment variables passed to the executor incorrectly.
* [MESOS-6788] - Avoid stack overflow when handling streaming responses in API handlers
* [MESOS-6789] - SSL socket's 'shutdown()' method is broken
* [MESOS-6793] - CniIsolatorTest.ROOT_EnvironmentLibprocessIP fails on systems using dash as sh
* [MESOS-6795] - Listening socket might get closed while the accept is still in flight.
* [MESOS-6802] - SSL socket can lose bytes in the case of EOF
* [MESOS-6803] - Agent authentication does not have an initial `delay`
* [MESOS-6805] - Check unreachable task cache for task ID collisions on launch
* [MESOS-6811] - IOSwitchboardServerTest.SendHeartbeat and IOSwitchboardServerTest.ReceiveHeartbeat broken on OS X
* [MESOS-6813] - IOSwitchboardServerTest.AttachOutput has stack overflow issue.
* [MESOS-6820] - FaultToleranceTest.FrameworkReregister is flaky.
* [MESOS-6824] - mesos-this-capture clang-tidy check has false positives
* [MESOS-6826] - OsTest.User fails on recent Arch Linux.
* [MESOS-6829] - Mesos fails to compile when using FORTIFY_SOURCE without optimizations
* [MESOS-6830] - Mesos fails to link with gold when providing -pie without -fPIC
* [MESOS-6837] - FaultToleranceTest.FrameworkReregister is flaky
* [MESOS-6839] - It is currently impossible to kill a task in the Windows executor
* [MESOS-6848] - The default executor does not exit if a single task pod fails.
* [MESOS-6852] - Nested container's launch command is not set correctly in docker/runtime isolator.
* [MESOS-6860] - Some tests use CHECK instead of ASSERT
* [MESOS-6862] - Replace os::system usages to reduce the risk of command injection.
* [MESOS-6864] - Container Exec should be possible with tasks belonging to a task group
* [MESOS-6866] - Mesos agent not checking IDs before using them as part of the paths
* [MESOS-6870] - Port `default_executor_tests.cpp`
* [MESOS-6871] - Scheme parsing is incorrect in libprocess URL::parse().
* [MESOS-6895] - Loop uses dependent nested names for friend declaration which isn't supported by recent clang
* [MESOS-6900] - Add test for framework upgrading to multi-role capability.
* [MESOS-6904] - Perform batching of allocations to reduce allocator queue backlogging.
* [MESOS-6908] - Zero health check timeout is interpreted literally.
* [MESOS-6911] - SlaveRecoveryTest/0.RegisterDisconnectedSlave test is flaky
* [MESOS-6912] - IOSwitchboardServerTest.AttachInput fails consistently on Mac OS.
* [MESOS-6917] - Segfault when the executor sets an invalid UUID when sending a status update.
* [MESOS-6920] - Validate the UUID in Master::statusUpdate.
* [MESOS-6922] - SlaveRecoveryTest/0.RecoverTerminatedExecutor is flaky
* [MESOS-6937] - ContentType/MasterAPITest.ReserveResources/1 fails during Writer close
* [MESOS-6946] - Make wait status checks consistent.
* [MESOS-6948] - AgentAPITest.LaunchNestedContainerSession is flaky
* [MESOS-6954] - Running LAUNCH_NESTED_CONTAINER with a docker container id as parent crashes the agent
* [MESOS-6962] - Navbar overlays breadcrumbs in WebUI on narrow screens
* [MESOS-6963] - The logo doesn't fit in mobile WebUI
* [MESOS-6966] - master/tasks_unreachable metric never decremented
* [MESOS-6969] - Use clipboard.js for copy/paste webui functionality
* [MESOS-6983] - TaskValidationTest.TaskReusesUnreachableTaskID is flaky
* [MESOS-6989] - Docker executor segfaults in ~MesosExecutorDriver()
* [MESOS-6991] - Change `Environment.Variable.Value` from required to optional
* [MESOS-7008] - Quota not recovered from registry in empty cluster.
* [MESOS-7020] - cgroups::internal::write can incorrectly report success
* [MESOS-7027] - CommandExecutor ENV overwritten by Docker Image ENV in Unified Containerizer
* [MESOS-7036] - Rate limiter deadlocks during IO Switchboard-related tests
* [MESOS-7057] - Consider using the relink functionality of libprocess in the executor driver.
* [MESOS-7059] - Unnecessary mkdirs in ProvisionerDockerLocalStoreTest.*
* [MESOS-7060] - Tests depending on DockerArchive and LinuxRootfs fail.
* [MESOS-7075] - mesos-execute rejects all offers
* [MESOS-7077] - Check failed: resource.has_allocation_info().
* [MESOS-7102] - Crash when sending a SIGUSR1 signal to the agent.
* [MESOS-7119] - Mesos master crash while accepting inverse offer.
* [MESOS-7129] - Default executor exits with a stack trace in a few scenarios.
* [MESOS-7133] - mesos-fetcher fails with openssl-related output.
* [MESOS-7137] - Custom executors cannot use any reserved resources.
* [MESOS-7144] - Wrap IOSwitchboard.connect() in a dispatch
* [MESOS-7152] - The agent may be flapping after the machine reboots due to provisioner recover.
* [MESOS-7153] - The new http::Headers abstraction may break some modules.
** Documentation
* [MESOS-5597] - Document Mesos "health check" feature.
* [MESOS-6335] - Add user doc for task group tasks
* [MESOS-6411] - Add documentation for CNI port-mapper plugin.
* [MESOS-6806] - Update the addition, deletion and modification logic of CNI configuration files.
* [MESOS-7154] - Document provisioner auto backend support.
** Epic
* [MESOS-3820] - Test-only libprocess reinitialization
* [MESOS-4641] - Support Container Network Interface (CNI).
* [MESOS-4766] - Improve allocator performance.
* [MESOS-6402] - Add rlimit support to Mesos containerizer
* [MESOS-6460] - Mesos Support for Container Attach and Container Exec
* [MESOS-6670] - Authz for Agent v1 operator API
** Improvement
* [MESOS-3601] - Formalize all headers and metadata for HTTP API Event Stream
* [MESOS-5792] - Add mesos tests to CMake (make check)
* [MESOS-5900] - Support Unix domain socket connections in libprocess
* [MESOS-5931] - Support auto backend in Unified Containerizer.
* [MESOS-5992] - Complete the list of API Calls on the Operator HTTP API Doc
* [MESOS-6177] - Return unregistered agents recovered from registrar in `GetAgents` and/or `/state.json`
* [MESOS-6229] - Default to using hardened compilation flags
* [MESOS-6296] - Default executor should be able to launch multiple task groups
* [MESOS-6305] - Add authorization support for nested container calls
* [MESOS-6309] - Mesos-specific targets appear in libprocess' cmake config.
* [MESOS-6329] - Send TASK_DROPPED for task launch errors
* [MESOS-6330] - Send TASK_UNKNOWN during explicit reconciliation
* [MESOS-6331] - Don't send TASK_LOST when accepting offers in a disconnected scheduler
* [MESOS-6332] - Don't send TASK_LOST in the agent
* [MESOS-6339] - Support docker registry that requires basic auth.
* [MESOS-6361] - Enable partition-awareness in mesos-execute
* [MESOS-6369] - Add a column for FrameworkID when displaying tasks in the WebUI
* [MESOS-6395] - HealthChecker sends updates to executor via libprocess messaging.
* [MESOS-6396] - Hooks should allow sandbox dependent environment variables.
* [MESOS-6397] - Simplify the comparison logic for `ExecutorInfo`.
* [MESOS-6399] - Allow passing extra envs when launching development scripts.
* [MESOS-6401] - Authorizer interface should behave more uniformly
* [MESOS-6407] - Move DEFAULT_v1_xxx macros to the v1 namespace.
* [MESOS-6426] - Add rlimit support to Mesos containerizer
* [MESOS-6427] - Add documentation for rlimit support of Mesos containerizer
* [MESOS-6443] - Display maintenance information in the webui.
* [MESOS-6530] - Add support for incremental gzip decompression.
* [MESOS-6556] - Hostname support for the network/cni isolator.
* [MESOS-6557] - IPC namespace isolator
* [MESOS-6562] - Use JSON content type in mesos-execute.
* [MESOS-6567] - Actively Scan for CNI Configurations
* [MESOS-6571] - Add "--task" flag to mesos-execute
* [MESOS-6626] - Support `foreachpair` for LinkedHashMap
* [MESOS-6639] - Update 'io::redirect()' to take an optional vector of callback hooks.
* [MESOS-6648] - MesosContainerizer launch helper should take ContainerLaunchInfo.
* [MESOS-6650] - Remove slavePreLaunchDockerEnvironmentDecorator and slavePreLaunchDockerHook.
* [MESOS-6675] - Change allocator API to support adding inactive frameworks
* [MESOS-6719] - Unify "active" and "state"/"connected" fields in Master::Framework
* [MESOS-6758] - Support 'Basic' auth docker private registry on Unified Containerizer.
* [MESOS-6763] - Add heartbeats to both input/output connections in IOSwitchboard
* [MESOS-6821] - Override of automatic resources should be by exact match not substring
* [MESOS-6865] - Remove the constraint of being only able to launch 2 level nested containers on Agent API
* [MESOS-6936] - Add support for media types needed for streaming request/responses.
* [MESOS-6947] - Fix pailer XSS vulnerability
* [MESOS-7045] - Skip already stored layers in local Docker puller
* [MESOS-7051] - Introduce a new http::Headers abstraction.
* [MESOS-7071] - Agent State Lacks Framework Principal
** Story
* [MESOS-3505] - Support specifying Docker image by Image ID.
* [MESOS-3753] - Test the HTTP Scheduler library with SSL enabled
** Task
* [MESOS-3398] - Revisit MAXHOSTNAMELEN implementation in Windows
* [MESOS-3697] - Add `make tests` target to CMake build system.
* [MESOS-3843] - Audit `src/CMakeLists.txt` to make sure we're compiling everything we need to build the agent binary.
* [MESOS-3910] - Libprocess: Implement cleanup of the SocketManager in process::finalize
* [MESOS-3934] - Libprocess: Unify the initialization of the MetricsProcess and ReaperProcess
* [MESOS-4119] - Add support for enabling --3way to apply-reviews.py.
* [MESOS-5826] - Streamline building of example frameworks
* [MESOS-5966] - Add libprocess HTTP tests with SSL support
* [MESOS-6040] - Add a CMake build for `mesos-port-mapper`
* [MESOS-6185] - Improve test coverage for shared persistent volumes.
* [MESOS-6214] - Containerizers assume caller will call 'destroy' if 'launch' fails.
* [MESOS-6278] - Add test cases for the HTTP health checks.
* [MESOS-6279] - Add test cases for the TCP health check.
* [MESOS-6366] - Design doc for executor authentication
* [MESOS-6376] - Add documentation for capabilities support of the mesos containerizer
* [MESOS-6403] - Draft design doc for rlimit support for Mesos containerizer
* [MESOS-6431] - Add support for port-mapping in `mesos-execute`
* [MESOS-6462] - Design Doc: Mesos Support for Container Attach and Container Exec
* [MESOS-6463] - Build a prototype for remote pty support
* [MESOS-6464] - Add fine grained control of which namespaces a nested container should inherit (or not).
* [MESOS-6465] - Add a task_id -> container_id mapping in state.json
* [MESOS-6466] - Add support for streaming HTTP requests in Mesos
* [MESOS-6467] - Build a Container I/O Switchboard
* [MESOS-6470] - Support TTY in IOSwitchboard.
* [MESOS-6471] - Build support for LAUNCH_NESTED_CONTAINER_SESSION call into the Agent API in Mesos
* [MESOS-6472] - Build support for ATTACH_CONTAINER_INPUT into the Agent API in Mesos
* [MESOS-6473] - Build support for ATTACH_CONTAINER_OUTPUT into the Agent API in Mesos
* [MESOS-6474] - Add fine-grained ACLs for authorization with the new debugging APIs
* [MESOS-6475] - Mesos Container Attach/Exec Unit Tests
* [MESOS-6476] - Build a Mock HTTP Server that implements the new Debugging API calls
* [MESOS-6477] - Build a standalone python client for connecting to our Mock HTTP Server that implements the new Debug APIs
* [MESOS-6493] - Add test cases for the HTTPS health checks.
* [MESOS-6525] - Add API protos for managing debug containers
* [MESOS-6528] - Container status of a task in a pod is not correct.
* [MESOS-6543] - Add special case for entering the "mount" namespace of a parent container
* [MESOS-6546] - Update the Containerizer to handle attachInput and attachOutput calls.
* [MESOS-6547] - Update the mesos containerizer to launch per-container I/O switchboards
* [MESOS-6553] - Update `MesosContainerizerProcess::_launch()` to pass `ContainerLaunchInfo` to `launcher->fork()`
* [MESOS-6594] - Add `Containerizer::attach()` API call
* [MESOS-6628] - Add a FrameworkInfo.roles field along with a MULTI_ROLE capability.
* [MESOS-6629] - Add master validation of FrameworkInfo.roles.
* [MESOS-6631] - Disallow frameworks from modifying FrameworkInfo.roles.
* [MESOS-6633] - Introduce Resource.AllocationInfo.
* [MESOS-6634] - Add Resource.AllocationInfo in Offer to indicate a single role per offer.
* [MESOS-6638] - Update Suppress and Revive to be per-role.
* [MESOS-6651] - Make IOSwitchboard an isolator.
* [MESOS-6663] - Container should be destroyed if IOSwitchboard server terminates unexpectedly.
* [MESOS-6664] - Force cleanup of IOSwitchboard server if it does not terminate after the container terminates.
* [MESOS-6749] - Update master and agent endpoints to expose FrameworkInfo.roles.
* [MESOS-6764] - Add a grace period for terminating the I/O switchboard server.
* [MESOS-6958] - Support linux filesystem type detection.
* [MESOS-6970] - Display allocation info when printing Resources.
* [MESOS-7062] - Add a test for a MULTI_ROLE framework receiving offers for each of its roles.
Release Notes - Mesos - Version 1.1.3
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-5187] - The filesystem/linux isolator does not set the permissions of the host_path.
* [MESOS-6743] - Docker executor hangs forever if `docker stop` fails.
* [MESOS-6950] - Launching two tasks with the same Docker image simultaneously may cause a staging dir to never be cleaned up.
* [MESOS-7540] - Add an agent flag for executor re-registration timeout.
* [MESOS-7569] - Allow "old" executors with half-open connections to be preserved during agent upgrade / restart.
* [MESOS-7689] - Libprocess can crash on malformed request paths for libprocess messages.
* [MESOS-7690] - The agent can crash when an unknown executor tries to register.
* [MESOS-7581] - Fix interference of external Boost installations when using some unbundled dependencies.
* [MESOS-7703] - Mesos fails to exec a custom executor when no shell is used.
* [MESOS-7728] - Java HTTP adapter crashes JVM when leading master disconnects.
* [MESOS-7770] - Persistent volume might not be mounted if there is a sandbox volume whose source is the same as the target of the persistent volume.
* [MESOS-7777] - Agent failed to recover due to mount namespace leakage in Docker 1.12/1.13.
* [MESOS-7796] - LIBPROCESS_IP isn't passed on to the fetcher.
* [MESOS-7830] - Sandbox_path volume does not have ownership set correctly.
* [MESOS-7863] - Agent may drop pending kill task status updates.
* [MESOS-7865] - Agent may process a kill task and still launch the task.
Release Notes - Mesos - Version 1.1.2
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-2537] - AC_ARG_ENABLED checks are broken.
* [MESOS-5028] - Copy provisioner cannot replace directory with symlink.
* [MESOS-5172] - Registry puller cannot fetch blobs correctly from http Redirect 3xx urls.
* [MESOS-6327] - Large docker images cause container launch failures: Too many levels of symbolic links.
* [MESOS-7057] - Consider using the relink functionality of libprocess in the executor driver.
* [MESOS-7119] - Mesos master crash while accepting inverse offer.
* [MESOS-7152] - The agent may be flapping after the machine reboots due to provisioner recover.
* [MESOS-7197] - Requesting tiny amount of CPU crashes master.
* [MESOS-7210] - HTTP health check doesn't work when mesos runs with --docker_mesos_image.
* [MESOS-7237] - Enabling cgroups_limit_swap can lead to "invalid argument" error.
* [MESOS-7265] - Containerizer startup may cause sensitive data to leak into sandbox logs.
* [MESOS-7350] - Failed to pull image from Nexus Registry due to signature missing.
* [MESOS-7366] - Agent sandbox gc could accidentally delete the entire persistent volume content.
* [MESOS-7383] - Docker executor logs possibly sensitive parameters.
* [MESOS-7422] - Docker containerizer should not leak possibly sensitive data to agent log.
* [MESOS-7471] - Provisioner recover should not always assume 'rootfses' dir exists.
* [MESOS-7482] - #elif does not match #ifdef when checking the platform.
Release Notes - Mesos - Version 1.1.1
-------------------------------------
* This is a bug fix release.
All Issues:
** Bug
* [MESOS-6002] - The whiteout file cannot be removed correctly using aufs backend.
* [MESOS-6010] - Docker registry puller shows decode error "No response decoded".
* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
* [MESOS-6360] - The handling of whiteout files in provisioner is not correct.
* [MESOS-6411] - Add documentation for CNI port-mapper plugin.
* [MESOS-6526] - `mesos-containerizer launch --environment` exposes executor env vars in `ps`.
* [MESOS-6571] - Add "--task" flag to mesos-execute.
* [MESOS-6597] - Include v1 Operator API protos in generated JAR and python packages.
* [MESOS-6606] - Reject optimized builds with libcxx before 3.9.
* [MESOS-6621] - SSL downgrade path will CHECK-fail when using both temporary and persistent sockets.
* [MESOS-6624] - Master WebUI does not work on Firefox 45.
* [MESOS-6676] - Always re-link with scheduler during re-registration.
* [MESOS-6848] - The default executor does not exit if a single task pod fails.
* [MESOS-6852] - Nested container's launch command is not set correctly in docker/runtime isolator.
* [MESOS-6917] - Segfault when the executor sets an invalid UUID when sending a status update.
* [MESOS-7008] - Quota not recovered from registry in empty cluster.
* [MESOS-7133] - mesos-fetcher fails with openssl-related output.
Release Notes - Mesos - Version 1.1.0
-------------------------------------
This release contains the following new features:
* [MESOS-2449] - **Experimental** support for launching a group of tasks
via a new `LAUNCH_GROUP` Offer operation. Mesos will guarantee that either
all tasks or none of the tasks in the group are delivered to the executor.
Executors receive the task group via a new `LAUNCH_GROUP` event.
* [MESOS-2533] - **Experimental** support for HTTP and HTTPS health checks.
Executors may now use the updated `HealthCheck` protobuf to implement
HTTP(S) health checks. Both default executors (command and docker) leverage
`curl` binary for sending HTTP(S) requests and connect to `127.0.0.1`,
hence a task must listen on all interfaces. On Linux, for BRIDGE and USER
modes, docker executor enters the task's network namespace.
* [MESOS-3421] - **Experimental** Support sharing of resources across
containers. Currently persistent volumes are the only resources allowed to
be shared.
* [MESOS-3567] - **Experimental** support for TCP health checks. Executors
may now use the updated `HealthCheck` protobuf to implement TCP health
checks. Both default executors (command and docker) connect to `127.0.0.1`,
hence a task must listen on all interfaces. On Linux, for BRIDGE and USER
modes, docker executor enters the task's network namespace.
* [MESOS-4324] - Allow tasks to access persistent volumes in either a
read-only or read-write manner. Using a volume in read-only mode can
simplify sharing that volume between multiple tasks on the same agent.
* [MESOS-5275] - **Experimental** support for linux capabilities. Frameworks
or operators now have fine-grained control over the capabilities that a
container may have. This allows a container to run as root, but not have all
the privileges associated with the root user (e.g., CAP_SYS_ADMIN).
* [MESOS-5344] - **Experimental** support for partition-aware Mesos
frameworks. In previous Mesos releases, when an agent is partitioned from
the master and then reregisters with the cluster, all tasks running on the
agent are terminated and the agent is shutdown. In Mesos 1.1, partitioned
agents will no longer be shutdown when they reregister with the master. By
default, tasks running on such agents will still be killed (for backward
compatibility); however, frameworks can opt-in to the new PARTITION_AWARE
capability. If they do this, their tasks will not be killed when a partition
is healed. This allows frameworks to define their own policies for how to
handle partitioned tasks. Enabling the PARTITION_AWARE capability also
introduces a new set of task states: TASK_UNREACHABLE, TASK_DROPPED,
TASK_GONE, TASK_GONE_BY_OPERATOR, and TASK_UNKNOWN. These new states are
intended to eventually replace the TASK_LOST state.
* [MESOS-5788] - **Experimental** support for Java scheduler adapter. This
adapter allows framework developers to toggle between the old/new API
(driver/scheduler library) implementations, thereby allowing them to easily
transition their frameworks to the new v1 Scheduler API.
* [MESOS-6014] - **Experimental** A new port-mapper CNI plugin, the
`mesos-cni-port-mapper` has been introduced. For Mesos containers, with the
CNI port-mapper plugin, users can now expose container ports through host
ports using DNAT. This is especially useful when Mesos containers are
attached to isolated CNI networks such as private bridge networks, and the
services running in the container need to be exposed outside these
isolated networks.
* [MESOS-6077] - **Experimental** A new default executor is introduced which
frameworks can use to launch task groups as nested containers. All the
nested containers share resources like cpu, memory, network and volumes.
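
The TCP health check described under MESOS-3567 boils down to attempting a connection to `127.0.0.1` on the task's port. The following is a minimal Python sketch of that idea, not the executors' actual implementation; the function names `tcp_health_check` and `start_listener` are hypothetical and exist only for illustration. It also shows why the task must listen on all interfaces (or at least loopback): the check never uses the task's external address.

```python
import socket


def tcp_health_check(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper: mirrors the idea of the check (connect to
    127.0.0.1), not the code used by the Mesos executors.
    """
    try:
        # A successful connect is the whole check; no payload is exchanged.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: report the check as failed.
        return False


def start_listener(bind_host="0.0.0.0"):
    """Stand-in for a task: listen on an ephemeral port, return (sock, port)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((bind_host, 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    return server, server.getsockname()[1]
```

A task bound to `0.0.0.0` is reachable through loopback, so `tcp_health_check(port)` succeeds while the listener is open and fails once it is closed.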
Deprecations:
* The following metrics are deprecated and will be removed in Mesos 1.4:
master/slave_shutdowns_scheduled,
master/slave_shutdowns_canceled,
master/slave_shutdowns_completed.
As of Mesos 1.1.0, these metrics will always be zero. The following new
metrics have been introduced as replacements:
master/slave_unreachable_scheduled,
master/slave_unreachable_canceled,
master/slave_unreachable_completed.
* [MESOS-5955] - Health check binary "mesos-health-check" is removed.
* [MESOS-6371] - Remove the 'recover()' interface in 'ContainerLogger'.
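
For dashboards or alerts that still reference the deprecated shutdown metrics listed above, a rename table may be a convenient way to migrate. The sketch below is illustrative only: it assumes the replacements pair up one-to-one by suffix (scheduled, canceled, completed) and that all six names carry the `master/` prefix. The helper name `migrate_metric_names` is hypothetical.

```python
# Assumed one-to-one pairing of deprecated metrics with their replacements,
# matched by suffix. The `master/` prefix on every name is an assumption.
DEPRECATED_TO_REPLACEMENT = {
    "master/slave_shutdowns_scheduled": "master/slave_unreachable_scheduled",
    "master/slave_shutdowns_canceled": "master/slave_unreachable_canceled",
    "master/slave_shutdowns_completed": "master/slave_unreachable_completed",
}


def migrate_metric_names(names):
    """Replace deprecated metric names; leave all other names untouched."""
    return [DEPRECATED_TO_REPLACEMENT.get(name, name) for name in names]
```

Names that are not in the table pass through unchanged, so the helper can be applied to a full list of monitored metrics.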
Additional API Changes:
* [MESOS-6204] - A new agent flag called `--runtime_dir`. Unlike
`--work_dir` which persists data across reboots, `--runtime_dir` is designed
to checkpoint state that should persist across agent restarts, but not
across reboots. By default this flag is set to `/var/run/mesos` when run as
root and `os::temp/mesos/runtime/` when run as non-root.
* [MESOS-6220] - HTTP handler failures should result in 500 rather than
503 responses. This means that when using the master or agent endpoints,
failures will now result in a `500 Internal Server Error` rather than a
`503 Service Unavailable`.
* [MESOS-6241] - New API calls (LAUNCH_NESTED_CONTAINER,
KILL_NESTED_CONTAINER and WAIT_NESTED_CONTAINER) have been added to the
v1 Agent API to manage nested containers within an executor container.
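
The default for `--runtime_dir` described under MESOS-6204 depends on whether the agent runs as root. The rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the agent's code: `default_runtime_dir` is a hypothetical name, and `tempfile.gettempdir()` stands in for the `os::temp` location mentioned in the note.

```python
import os
import tempfile


def default_runtime_dir(is_root=None):
    """Pick a runtime directory following the rule in the release note.

    Hypothetical helper. `is_root` can be passed explicitly; when omitted
    it is derived from the effective user ID on POSIX systems.
    """
    if is_root is None:
        is_root = hasattr(os, "geteuid") and os.geteuid() == 0
    if is_root:
        # Root default as stated in the release note.
        return "/var/run/mesos"
    # Non-root: a path under the temp directory (assumed os::temp stand-in).
    return os.path.join(tempfile.gettempdir(), "mesos", "runtime")
```

Because the non-root default lives under a temporary directory, its contents can be expected to survive agent restarts but not necessarily reboots, which matches the intent of the flag.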
Unresolved Critical Issues:
* [MESOS-3794] - Master should not store arbitrarily sized data in ExecutorInfo.
* [MESOS-4642] - Mesos Agent Json API can dump binary data from log files out as invalid JSON.
* [MESOS-5396] - After failover, master does not remove agents with same UPID.
* [MESOS-5856] - Logrotate ContainerLogger module does not rotate logs when run as root with `--switch_user`.
* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
* [MESOS-6327] - Large docker images cause container launch failures: Too many levels of symbolic links.
* [MESOS-6360] - The handling of whiteout files in provisioner is not correct.
* [MESOS-6419] - The 'master/teardown' endpoint should support tearing down 'unregistered_frameworks'.
* [MESOS-6432] - Roles with quota assigned can "game" the system to receive excessive resources.
All Experimental Features:
* [MESOS-2449] - Support group of tasks (Pod) constructs and API in Mesos.
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3094] - Mesos on Windows.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-3567] - Support TCP checks in Mesos.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4355] - Implement isolator for Docker volume.
* [MESOS-4641] - Support Container Network Interface (CNI).
* [MESOS-4791] - Operator API v1.
* [MESOS-4828] - XFS disk quota isolator.
* [MESOS-5275] - Add capabilities support for unified containerizer.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-5788] - Added JAVA API adapter for seamless transition to new scheduler API.
* [MESOS-6014] - Added port mapping CNI plugin.
* [MESOS-6077] - Added a default (task group) executor.
All Issues:
** Bug
* [MESOS-1653] - HealthCheckTest.GracePeriod is flaky.
* [MESOS-2346] - Docker tasks exiting normally, but returning TASK_FAILED.
* [MESOS-3471] - Disable perf test when perf version is not supported.
* [MESOS-3760] - Remove fragile sleep() from ProcessManager::settle().
* [MESOS-3959] - Executor page of mesos ui does not show slave hostname.
* [MESOS-4070] - numify() handles negative numbers inconsistently.
* [MESOS-4638] - versioning preprocessor macros.
* [MESOS-4668] - Agent's /state endpoint does not include full reservation information.
* [MESOS-4948] - Move maintenance tests to use the new scheduler library interface.
* [MESOS-4973] - Duplicates in 'unregistered_frameworks' in /state
* [MESOS-4975] - mesos::internal::master::Slave::tasks can grow unboundedly.
* [MESOS-5276] - HTTPCommandExecutor should terminate after it receives an ACK from the agent.
* [MESOS-5290] - WebUI shows the active task was launched 46 years ago.
* [MESOS-5320] - SSL related error messages can be misguiding or incomplete.
* [MESOS-5448] - Persistent volume deletion on the agent should survive slave restart.
* [MESOS-5481] - PerfFilter disables Registrar_BENCHMARK test cases incorrectly.
* [MESOS-5613] - mesos-local fails to start if MESOS_WORK_DIR isn't set.
* [MESOS-5701] - Add benchmark for sorter performance.
* [MESOS-5752] - ROOT_GarbageCollectorUndeletableFilesTest.BusyMountPoint is flaky.
* [MESOS-5759] - ProcessRemoteLinkTest.RemoteUseStaleLink and RemoteStaleLinkRelink are flaky.
* [MESOS-5812] - MasterAPITest.Subscribe is flaky.
* [MESOS-5846] - AgentAPITest.GetState is flaky.
* [MESOS-5852] - CMake build needs to generate protobufs before building libmesos.
* [MESOS-5860] - MasterAPITest.GetTasks is flaky.
* [MESOS-5864] - Document MESOS_SANDBOX executor env variable.
* [MESOS-5867] - Operator ReadFile API read file bugs.
* [MESOS-5869] - Disable resources validation for `+=` and `-=`.
* [MESOS-5875] - Scalar resource output operator doesn't print full significant digits.
* [MESOS-5878] - Strict/RegistrarTest.UpdateQuota/0 is flaky.
* [MESOS-5888] - SlaveAuthorizerTest/ViewFlags is flaky.
* [MESOS-5891] - /help endpoint does not set Content-Type to HTML.
* [MESOS-5907] - ExamplesTest.DiskFullFramework fails on Arch.
* [MESOS-5909] - Stout "OsTest.User" test can fail on some systems.
* [MESOS-5917] - All actors should have a distinguishable ID.
* [MESOS-5919] - Improve performance for `Resources.contains` and `Resources.filter`.
* [MESOS-5921] - `validate` is a bit heavy to check negative scalar resource.
* [MESOS-5922] - mesos-agent --help exit status is 1.
* [MESOS-5928] - Agent's '--version' flag doesn't work.
* [MESOS-5930] - Orphan tasks can show up as running after they have finished.
* [MESOS-5942] - Windows implementation of `os::rmdir` is not compliant with POSIX version.
* [MESOS-5958] - Reviewbot failing due to python files not being cleaned up after distclean.
* [MESOS-5972] - SharedResourcesTest failing.
* [MESOS-5979] - elfio-3.1.patch is actually not applied.
* [MESOS-5981] - Task failed in Windows Server 2012 client, test-framework example.
* [MESOS-5985] - Fix broken link in `networking.md`.
* [MESOS-5996] - Windows mesos-containerizer crashes.
* [MESOS-6000] - Overlayfs backend cannot support the image with numerous layers.
* [MESOS-6005] - Support docker registry running non-https on localhost:<non-80-port>.
* [MESOS-6013] - Use readdir instead of readdir_r.
* [MESOS-6026] - Tasks mistakenly marked as FAILED due to race b/w sendExecutorTerminatedStatusUpdate() and _statusUpdate().
* [MESOS-6031] - Collect throttle related metrics for DockerContainerizer.
* [MESOS-6041] - Stream ID mismatch should print out expected and received stream ID.
* [MESOS-6049] - XFS disk isolator doesn't handle old containers correctly.
* [MESOS-6052] - Unable to launch containers on CNI networks on CoreOS.
* [MESOS-6057] - docker isolator does not overwrite Dockerfile ENV.
* [MESOS-6059] - Allow clean up unknown container during the clean up phase of the container.
* [MESOS-6069] - Misspelled TASK_KILLED in mesos slave.
* [MESOS-6074] - Master check failure if the metrics endpoint is polled soon after it starts.
* [MESOS-6085] - Agent's /state endpoint does not include total resources.
* [MESOS-6087] - Add master tests for TaskGroup.
* [MESOS-6100] - Make fails compiling 1.0.1.
* [MESOS-6104] - Potential FD double close in libevent's implementation of `sendfile`.
* [MESOS-6110] - Deprecate using health checks without setting the type.
* [MESOS-6118] - Agent would crash with docker container tasks due to host mount table read.
* [MESOS-6122] - Mesos slave throws systemd errors even when passed a flag to disable systemd.
* [MESOS-6131] - Improved performance for resource flatten.
* [MESOS-6141] - Some tests do not properly set 'flags.launcher' with the correct value.
* [MESOS-6144] - Validate that TaskGroup executor and tasks do not use DOCKER ContainerInfo.
* [MESOS-6145] - Isolator namespaces/pid is leaking mounts.
* [MESOS-6152] - Resource leak in libevent_ssl_socket.cpp.
* [MESOS-6153] - Resource leak in slave.cpp.
* [MESOS-6154] - Clean up queued tasks if a task group is killed before launch.
* [MESOS-6157] - ContainerInfo is not validated.
* [MESOS-6159] - Remove stout's Set type.
* [MESOS-6167] - CgroupsIsolatorTest.ROOT_CGROUPS_RevocableCpu is flaky.
* [MESOS-6170] - Health check grace period covers failures happening after first success.
* [MESOS-6173] - Authentication in v2 protobuf should not be `required`.
* [MESOS-6176] - CpuIsolatorTest.ROOT_SystemCpuUsage is flaky.
* [MESOS-6181] - The logic for BadACLNoPrincipal and BadACLDropCreateAndDestroy is not correct.
* [MESOS-6207] - Python bindings fail to build with custom SVN installation path.
* [MESOS-6208] - Containers that use the Mesos containerizer but don't want to provision a container image fail to validate.
* [MESOS-6210] - Master redirect with suffix gets in redirect loop.
* [MESOS-6216] - LibeventSSLSocketImpl::create is not safe to call concurrently with os::getenv.
* [MESOS-6217] - PAGE_SIZE was not declared in PPC64LE.
* [MESOS-6226] - Master crashes while transitioning tasks to 'TASK_UNREACHABLE'.
* [MESOS-6233] - Master CHECK fails during recovery while relinking to other masters.
* [MESOS-6234] - Potential socket leak during Zookeeper network changes.
* [MESOS-6245] - Driver based schedulers performing explicit acknowledgements cannot acknowledge updates from HTTP based executors.
* [MESOS-6246] - Libprocess links will not generate an ExitedEvent if the socket creation fails.
* [MESOS-6248] - mesos-slave cannot start, Assertion `isError()' failed.
* [MESOS-6257] - Resources not recovered after rescinding an offer on DESTROY on shared volumes.
* [MESOS-6259] - CNI isolator should not `CHECK` for `resolv.conf` under `rootContainerDir`.
* [MESOS-6260] - Composing containerizer needs to properly handle nested container launch.
* [MESOS-6262] - Default executor should kill all other tasks in a task group if any task exits with a non-zero exit status.
* [MESOS-6263] - Mesos containerizer should figure out the correct sandbox directory for nested launch.
* [MESOS-6269] - CNI isolator doesn't activate loopback interface.
* [MESOS-6270] - Agent crashes when trying to recover pods.
* [MESOS-6274] - Agent should not allow HTTP executors to re-subscribe before containerizer recovery is done.
* [MESOS-6283] - Fix the Web UI allowing access to the task sandbox for nested containers.
* [MESOS-6289] - Pass the 'user' into nested container launch.
* [MESOS-6290] - Support nested containers for logger in Mesos Containerizer.
* [MESOS-6295] - Excessive logging on agent when oversubscription modules are attached.
* [MESOS-6300] - A destroyed nested container is not reflected in the parent container's children map.
* [MESOS-6301] - Recursive destroy in MesosContainerizer is problematic.
* [MESOS-6302] - Agent recovery can fail after nested containers are launched.
* [MESOS-6308] - CHECK failure in DRF sorter.
* [MESOS-6317] - Race in master/allocator when updating oversubscribed resources of an agent.
* [MESOS-6319] - ContentType/AgentAPITest.NestedContainerLaunch/1 is flaky.
* [MESOS-6321] - CHECK failure in HierarchicalAllocatorTest.NoDoubleAccounting.
* [MESOS-6322] - Agent fails to kill empty parent container.
* [MESOS-6323] - 'mesos-containerizer launch' should inherit agent environment variables.
* [MESOS-6324] - CNI should not use `ifconfig` in the executor's `pre_exec_command`.
* [MESOS-6363] - Default executor should not crash with a failed assertion if it notices a disconnection from the agent for non-checkpointed frameworks.
* [MESOS-6370] - The executor library does not invoke the shutdown callback upon recovery timeout.
* [MESOS-6386] - "Reached unreachable statement" in LinuxCapabilitiesIsolatorTest.
* [MESOS-6391] - Command task's sandbox should not be owned by root if it uses container image.
* [MESOS-6393] - Deprecated SSL_ environment variables are already non-functional.
* [MESOS-6420] - Mesos Agent leaking sockets when port mapping network isolator is ON.
* [MESOS-6445] - Reconciliation for unreachable agent after master failover is incorrect.
* [MESOS-6446] - WebUI redirect doesn't work with stats from /metrics/snapshot.
* [MESOS-6457] - Tasks shouldn't transition from TASK_KILLING to TASK_RUNNING.
* [MESOS-6461] - Duplicate framework ids in /master/frameworks endpoint 'unregistered_frameworks'.
* [MESOS-6482] - Master check failure when marking an agent unreachable.
* [MESOS-6483] - Check failure when a 1.1 master marks a 0.28 agent as unreachable.
* [MESOS-6497] - Java Scheduler Adapter does not surface MasterInfo.
* [MESOS-6502] - _version uses incorrect MESOS_{MAJOR,MINOR,PATCH}_VERSION in libmesos java binding.
* [MESOS-6527] - Memory leak in the libprocess request decoder.
** Documentation
* [MESOS-5221] - Add Documentation for Nvidia GPU support.
* [MESOS-5808] - Elasticsearch misspelled on homepage.
* [MESOS-6028] - mesos-execute has a typo in volume help.
* [MESOS-6103] - Mesos version is not up to date in getting-started page.
* [MESOS-6343] - Documentation Error: Default Executor does not implicitly construct resources.
** Epic
* [MESOS-2449] - Support group of tasks (Pod) constructs and API in Mesos.
* [MESOS-3421] - Support sharing of resources across task instances.
* [MESOS-4312] - Porting Mesos on Power (ppc64le).
* [MESOS-4791] - Operator API v1.
* [MESOS-5344] - Partition-aware Mesos frameworks.
* [MESOS-6014] - Added port mapping CNI plugin.
** Improvement
* [MESOS-2533] - Support HTTP checks in Mesos.
* [MESOS-3567] - Support TCP checks in Mesos.
* [MESOS-4049] - Allow user to control behavior of partitioned agents/tasks.
* [MESOS-4155] - Speed up ExamplesTest.*.
* [MESOS-4172] - GarbageCollectorIntegrationTest.Restart is slow.
* [MESOS-4324] - Allow access to shared persistent volumes as read only or read write by tasks.
* [MESOS-4325] - Offer shareable resources only to frameworks that have opted in.
* [MESOS-4431] - Support sharing of persistent volumes via shared resources.
* [MESOS-4663] - Speed up ExamplesTest.PersistentVolumeFramework.
* [MESOS-4694] - DRFAllocator takes very long to allocate resources with a large number of frameworks.