---
title: Apache Mesos - Fetcher
layout: documentation
---

# Mesos Fetcher

Mesos 0.23.0 introduced experimental support for the Mesos _fetcher cache_.

In this context we use the term "downloading" loosely to include copying from
local file systems.

## What is the Mesos fetcher?

The Mesos fetcher is a mechanism to download resources into the [sandbox
directory](sandbox.md) of a task in preparation for running
the task. As part of a TaskInfo message, the framework ordering the task's
execution provides a list of `CommandInfo::URI` protobuf values, which becomes
the input to the Mesos fetcher.

The Mesos fetcher can copy files from a local file system, and it also natively
supports the HTTP, HTTPS, FTP and FTPS protocols. If the requested URI is based
on some other protocol, then the fetcher tries to utilize a local Hadoop client
and hence supports any protocol supported by the Hadoop client, e.g., HDFS, S3.
See the agent [configuration documentation](configuration/agent.md)
for how to configure the agent with a path to the Hadoop client.

By default, each requested URI is downloaded directly into the sandbox directory
and repeated requests for the same URI lead to downloading another copy of the
same resource. Alternatively, the fetcher can be instructed to cache URI
downloads in a dedicated directory for reuse by subsequent downloads.

The Mesos fetcher mechanism comprises these two parts:

1. The agent-internal Fetcher Process (in terms of libprocess) that controls and
coordinates all fetch actions. Every agent instance has exactly one internal
fetcher instance that is used by every kind of containerizer.

2. The external program `mesos-fetcher` that is invoked by the former. It
performs all network and disk operations except file deletions and file size
queries for cache-internal bookkeeping. It is run as an external OS process in
order to shield the agent process from I/O-related hazards. It takes its
instructions in the form of an environment variable containing a JSON object
with detailed fetch action descriptions.

## The fetch procedure

Frameworks launch tasks by calling the scheduler driver method `launchTasks()`,
passing `CommandInfo` protobuf structures as arguments. This type of structure
specifies (among other things) a command and a list of URIs that need to be
"fetched" into the sandbox directory on the agent node as a precondition for
task execution. Hence, when the agent receives a request to launch a task, it
first calls upon its fetcher to download the specified resources into the
sandbox directory. If fetching fails, the task is not started and the reported
task status is `TASK_FAILED`.

All URIs requested for a given task are fetched sequentially in a single
invocation of mesos-fetcher. Here, avoiding download concurrency reduces the
risk of bandwidth issues somewhat. However, multiple fetch operations can be
active concurrently due to multiple task launch requests.

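For illustration, here is a minimal C++ sketch of a scheduler attaching
fetchable URIs to a task before launching it. The URLs, file names, and
resource handling are made up; only the `CommandInfo::URI` message and the
`launchTasks()` call are the API elements described above.

```
#include <vector>

#include <mesos/mesos.hpp>
#include <mesos/scheduler.hpp>

// Sketch only: the task name, URLs, and resource handling are made up;
// 'driver' and 'offer' are assumed to come from the scheduler callbacks.
void launchFetchingTask(mesos::SchedulerDriver* driver, const mesos::Offer& offer)
{
  mesos::TaskInfo task;
  task.set_name("fetch-example");
  task.mutable_task_id()->set_value("fetch-example-1");
  task.mutable_slave_id()->CopyFrom(offer.slave_id());
  task.mutable_resources()->CopyFrom(offer.resources());

  mesos::CommandInfo* command = task.mutable_command();
  command->set_value("./start.sh");  // runs only after all URIs are fetched

  // Each URI is fetched into the task's sandbox before the command starts.
  // If any fetch fails, the task is reported as TASK_FAILED.
  mesos::CommandInfo::URI* script = command->add_uris();
  script->set_value("https://example.com/releases/start.sh");
  script->set_executable(true);

  mesos::CommandInfo::URI* bundle = command->add_uris();
  bundle->set_value("https://example.com/releases/app.tar.gz");
  bundle->set_cache(true);  // opt into the fetcher cache for this URI

  driver->launchTasks(offer.id(), std::vector<mesos::TaskInfo>{task});
}
```
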
### The URI protobuf structure

Before mesos-fetcher is started, the specific fetch actions to be performed for
each URI are determined based on the following protobuf structure. (See
`include/mesos/mesos.proto` for more details.)

    message CommandInfo {
      message URI {
        required string value = 1;
        optional bool executable = 2;
        optional bool extract = 3 [default = true];
        optional bool cache = 4;
        optional string output_file = 5;
      }
      ...
      optional string user = 5;
    }

The field "value" contains the URI.

If the "executable" field is "true", the "extract" field is ignored and
has no effect.

If the "cache" field is true, the fetcher cache is to be used for the URI.

If the "output_file" field is set, the fetcher will use that name for the copy
stored in the sandbox directory. "output_file" may contain a directory
component, in which case the path described must be a relative path.

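As a hedged illustration of how these fields can be combined, consider the
following C++ sketch. The URLs, file names, and the helper function are made
up.

```
#include <mesos/mesos.hpp>

// Sketch only: the URLs, file names, and this helper are hypothetical.
void addExampleUris(mesos::CommandInfo* command)
{
  // Stored in the sandbox under a relative path chosen by the framework.
  mesos::CommandInfo::URI* config = command->add_uris();
  config->set_value("https://example.com/config/prod.json");
  config->set_output_file("etc/app/config.json");

  // Marked executable; "extract" is ignored for executable URIs.
  mesos::CommandInfo::URI* tool = command->add_uris();
  tool->set_value("https://example.com/tools/healthcheck");
  tool->set_executable(true);

  // Fetched through the cache; "extract" defaults to true, so the
  // archive is unpacked into the sandbox automatically.
  mesos::CommandInfo::URI* bundle = command->add_uris();
  bundle->set_value("https://example.com/releases/app-1.0.tar.gz");
  bundle->set_cache(true);
}
```
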
### Specifying a user name

The framework may pass along a user name that becomes a fetch parameter. This
causes its executors and tasks to run under that specific user. However, if the
"user" field in the CommandInfo structure is specified, it takes precedence for
the affected task.

If a user name is specified either way, the fetcher first validates that it is
in fact a valid user name on the agent. If it is not, fetching fails right away.
Otherwise, the sandbox directory is assigned to the specified user as owner
(using `chown`) at the end of the fetch procedure, before task execution begins.

The user name in play has an important effect on caching. Caching is managed on
a per-user basis, i.e. the combination of user name and URI uniquely
identifies a cacheable fetch result. If no user name has been specified, the
cache treats this as a separate user as well. Thus cache files for each valid
user are segregated from all others, including those without a specified user.

This means that the exact same URI will be downloaded and cached multiple times
if different users are indicated.

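As a brief, hedged sketch of where these user names come from in the C++ API
(the account names below are made up): the framework-wide user is set on
`FrameworkInfo`, while the per-task override is the `user` field of
`CommandInfo`.

```
#include <mesos/mesos.hpp>

// Sketch only: the account names are made up.
void configureUsers(mesos::FrameworkInfo* framework, mesos::CommandInfo* command)
{
  // Framework-wide default user for executors and tasks, and hence
  // for the fetch operations performed on their behalf.
  framework->set_user("svc-batch");

  // Per-task override; the fetcher caches downloads for this user
  // separately from URIs fetched under any other user name.
  command->set_user("svc-reports");
}
```
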
### Executable fetch results

By default, fetched files are not executable.

If the field "executable" is set to "true", the fetch result will be changed to
be executable (by "chmod") for every user. This happens at the end of the fetch
procedure, in the sandbox directory only. It does not affect any cache file.

### Archive extraction

If the "extract" field is "true", which is the default, then files with
a recognized extension that hints at packed or compressed archives are unpacked
in the sandbox directory. These file extensions are recognized:

- .tar, .tar.gz, .tar.bz2, .tar.xz
- .gz, .tgz, .tbz2, .txz, .zip

In case the cache is bypassed, both the archive and the unpacked results will be
found together in the sandbox. In case a cache file is unpacked, only the
extraction result will be found in the sandbox.

The "output_file" field is useful here for cases where the URI ends with query
parameters, since these will otherwise end up in the file copied to the sandbox
and will subsequently fail to be recognized as archives.

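For example, an artifact served with a signed query string can be given a
recognizable archive name via "output_file". A hedged C++ sketch follows; the
URL and signature are made up.

```
#include <mesos/mesos.hpp>

// Sketch only: the URL and signature are made up.
void addSignedArchive(mesos::CommandInfo* command)
{
  mesos::CommandInfo::URI* uri = command->add_uris();
  // Without "output_file" the sandbox copy would be named
  // "app.tar.gz?sig=abc123" and would not be recognized as an archive.
  uri->set_value("https://cdn.example.com/app.tar.gz?sig=abc123");
  uri->set_output_file("app.tar.gz");
}
```
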
### Bypassing the cache

By default, the URI field "cache" is not present. If this is the case, or if its
value is "false", the fetcher downloads directly into the sandbox directory.

The same also happens dynamically as a fallback strategy if anything goes wrong
when preparing a fetch operation that involves the cache. In this case, a
warning message is logged. Possible fallback conditions are:

- The server offering the URI does not respond or reports an error.
- The URI's download size could not be determined.
- There is not enough space in the cache, even after attempting to evict files.

### Fetching through the cache

If the URI's "cache" field has the value "true", then the fetcher cache is in
effect. If a URI is encountered for the first time (for the same user), it is
first downloaded into the cache, then copied to the sandbox directory from
there. If the same URI is encountered again, and a corresponding cache file is
resident in the cache or still en route into the cache, then downloading is
omitted and the fetcher proceeds directly to copying from the cache. Competing
requests for the same URI simply wait upon completion of the first request that
occurs. Thus every URI is downloaded at most once (per user) as long as it is
cached.

Every cache file stays resident for an unspecified amount of time and can be
removed at the fetcher's discretion at any moment, except while it is in direct
use:

- It is still being downloaded by this fetch procedure.
- It is still being downloaded by a concurrent fetch procedure for a different
  task.
- It is being copied or extracted from the cache.

Once a cache file has been removed, the related URI will thereafter be treated
as described above for the first encounter.

Unfortunately, there is no mechanism to refresh a cache entry in the current
experimental version of the fetcher cache. A future feature may force updates
based on checksum queries to the URI.

Recommended practice for now:

The framework should start using a fresh unique URI whenever the resource's
content has changed.

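One way to follow this practice is to embed a version (or content hash) in the
URI itself, so that publishing new content yields a new URI and hence a fresh
cache entry. A hedged C++ sketch, with a made-up repository URL and helper:

```
#include <string>

#include <mesos/mesos.hpp>

// Sketch only: the repository URL and this helper are hypothetical.
void addCachedRelease(mesos::CommandInfo* command, const std::string& version)
{
  mesos::CommandInfo::URI* uri = command->add_uris();
  // A new version yields a new URI and therefore a fresh cache entry;
  // the content behind any single URI never changes, so caching is safe.
  uri->set_value("https://repo.example.com/app/app-" + version + ".tar.gz");
  uri->set_cache(true);
}
```
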
### Determining resource sizes

Before downloading a resource to the cache, the fetcher first determines the
size of the expected resource. It uses these methods depending on the nature of
the URI.

- Local file sizes are probed with system calls (that follow symbolic links).
- HTTP/HTTPS URIs are queried for the "content-length" field in the header. This
  is performed by `curl`. The reported asset size must be greater than zero or
  the URI is deemed invalid.
- FTP/FTPS is not supported at the time of writing.
- Everything else is queried by the local HDFS client.

If any of this reports an error, the fetcher then falls back on bypassing the
cache as described above.

WARNING: Only URIs for which download sizes can be queried up front and for
which accurate sizes are reported reliably are eligible for any fetcher cache
involvement. If actual cache file sizes exceed the physical capacity of the
cache directory in any way, all further agent behavior is completely
unspecified. Do not use any cache feature with any URI for which you have any
doubts!

To mitigate this problem, cache files that have been found to be larger than
expected are deleted immediately after downloading and delivering the
requested content to the sandbox. Thus exceeding total capacity at least
does not accumulate over subsequent fetcher runs.

If you know for sure that size aberrations are within certain limits, you can
specify a cache directory size that is sufficiently smaller than your actual
physical volume, and fetching should work.

In case of cache files that are smaller than expected, the cache will
dynamically adjust its own bookkeeping according to actual sizes.

### Cache eviction

After determining the prospective size of a cache file and before downloading
it, the cache attempts to ensure that at least as much space as is needed for
this file is available and can be written into. If this is immediately the case,
the requested amount of space is simply marked as reserved. Otherwise, missing
space is freed up by "cache eviction". This means that the cache removes files
at its own discretion until the given space target is met or exceeded.

The eviction process fails if too many files are in use and therefore not
evictable or if the cache is simply too small. Either way, the fetcher then
falls back on bypassing the cache for the given URI as described above.

If multiple evictions happen concurrently, each of them pursues its own
separate space goal. However, leftover freed-up space from one effort is
automatically awarded to others.

## HTTP and SOCKS proxy settings

Sometimes it is desirable to use a proxy to download the file. The Mesos
fetcher uses libcurl internally for downloading content from
HTTP/HTTPS/FTP/FTPS servers, and libcurl can use a proxy automatically if
certain environment variables are set.

The respective environment variable name is `[protocol]_proxy`, where
`protocol` can be one of socks4, socks5, http, https.

For example, the value of the `http_proxy` environment variable would be used
as the proxy for fetching http contents, while `https_proxy` would be used for
fetching https contents. Note that these variable names must be entirely in
lower case.

The value of the proxy variable is of the format
`[protocol://][user:password@]machine[:port]`, where `protocol` can be one of
socks4, socks5, http, https.

FTP/FTPS requests with a proxy also make use of an HTTP/HTTPS proxy. Even
though in general this constrains the available FTP protocol operations,
everything the fetcher uses is supported.

Your proxy settings can be placed in `/etc/default/mesos-slave`. Here is an
example:

```
export http_proxy=https://proxy.example.com:3128
export https_proxy=https://proxy.example.com:3128
```

The fetcher will pick up these environment variable settings since the
`mesos-fetcher` utility program it employs is a child process of the Mesos
agent.

For more details, please check the
[libcurl manual](http://curl.haxx.se/libcurl/c/libcurl-tutorial.html).

## Agent flags

It is highly recommended that you either set these flags explicitly to values
other than their defaults or refrain from using the fetcher cache in production.

- "fetcher_cache_size", default value: enough for testing.
- "fetcher_cache_dir", default value: somewhere inside the directory specified
  by the "work_dir" flag, which is OK for testing.

Recommended practice:

- Use a separate volume as fetcher cache. Do not specify a directory as fetcher
  cache directory that competes with any other contributor for the underlying
  volume's space.
- Set the cache directory size flag of the agent to less than your actual cache
  volume's physical size. Use a safety margin, especially if you do not know
  for sure if all frameworks are going to be compliant.

Ultimate remedy:

You can disable the fetcher cache entirely on each agent by setting its
"fetcher_cache_size" flag to zero bytes.

## Future Features

The following features could additionally be implemented with relatively little
effort.

- Perform cache updates based on resource checksums. For example, query the md5
  field in HTTP headers to determine when a resource at a URL has changed.
- Respect HTTP cache-control directives.
- Enable caching for ftp/ftps.
- Use symbolic links or bind mounts to project cached resources into the
  sandbox, read-only.
- Have a choice whether to copy the extracted archive into the sandbox.
- Have a choice whether to delete the archive after extraction when bypassing
  the cache.
- Make the segregation of cache files by user optional.
- Extract content while downloading when bypassing the cache.
- Prefetch resources for subsequent tasks. This can happen concurrently with
  running the present task, right after fetching its own resources.

## Implementation Details

The [Mesos Fetcher Cache Internals](fetcher-cache-internals.md) page describes how the fetcher cache is implemented.