Posted to commits@mesos.apache.org by dl...@apache.org on 2015/06/07 00:19:32 UTC

svn commit: r1683964 [2/2] - in /mesos/site: publish/documentation/fetcher-cache-internals/ publish/documentation/fetcher/ publish/documentation/latest/fetcher-cache-internals/ publish/documentation/latest/fetcher/ publish/documentation/latest/mesos-do...

Added: mesos/site/source/documentation/latest/fetcher-cache-internals.md
URL: http://svn.apache.org/viewvc/mesos/site/source/documentation/latest/fetcher-cache-internals.md?rev=1683964&view=auto
==============================================================================
--- mesos/site/source/documentation/latest/fetcher-cache-internals.md (added)
+++ mesos/site/source/documentation/latest/fetcher-cache-internals.md Sat Jun  6 22:19:31 2015
@@ -0,0 +1,115 @@
+---
+layout: documentation
+---
+
+# Mesos Fetcher Cache Internals
+
+It is assumed that readers of this document are familiar with the overview and user guide of the Mesos fetcher in "fetcher.md". The present document makes direct references to notions defined there.
+
+## Design goals for the initial fetcher cache prototype:
+
+0. Direct fetching: Provide the pre-existing fetcher functionality (as in Mesos 0.22 and before) when caching is not explicitly requested.
+1. Program isolation: Preserve the approach to employ an external "mesos-fetcher" program to handle all (potentially very lengthy or blocking) content download operations.
+2. Cache effect: Significantly lessen the overall time spent on fetching when requests for the same URI are repeated. This holds for both sequential and concurrent repetition. The latter is the case when concurrently launched tasks on the same slave require overlapping URI sets.
+3. Cache space limit: Use a user-specified directory for cache storage and maintain a user-specified physical storage space limit for it. Evict older cache files as needed to fetch new cache content.
+4. Fallback strategy: Whenever downloading to or from the cache fails for any reason, fetching into the sandbox should still succeed somehow if at all possible.
+5. Slave recovery: Support slave recovery.
+
+For future releases, we foresee additional features:
+1. Automatic refreshing of cache content when a URI's content has changed.
+2. Prefetching URIs for subsequent tasks. Prefetching can run in parallel with task execution.
+
+## How the fetcher cache works
+
+In this section we look deeper into the implementation of design goals #1, #2, #3. The others are sufficiently covered in the user guide.
+
+### Fetcher process and mesos-fetcher
+
+The fetcher mechanism consists of two separate entities:
+
+1. The fetcher process included in the slave program. There is exactly one instance of this per slave.
+2. The separate mesos-fetcher program. There is one invocation of this per fetch request from the slave to the fetcher process.
+
+The fetcher process performs internal bookkeeping of what is in the cache and what is not. As needed, it invokes the mesos-fetcher program to download resources from URIs to the cache or directly to sandbox directories, and to copy resources from the cache to a sandbox directory.
+
+All decision-making "intelligence" resides in the fetcher process; the mesos-fetcher program is a rather simple helper. Except for cache files, there is no persistent state at all in the entire fetcher system. This greatly simplifies dealing with the inherent intricacies and races involved in concurrent fetching with caching.
+
+The mesos-fetcher program takes straightforward per-URI commands and executes them. It has three possible modes of operation for any given URI:
+
+1. Bypass the cache and fetch directly into the specified sandbox directory.
+2. Fetch into the cache and then copy the resulting cache file into the sandbox directory.
+3. Do not download anything. Copy (or extract) a resource from the cache into the sandbox directory.
+
+Apart from minor complications such as archive extraction and setting execution rights, this sums up all it does.
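+
+As a rough sketch (not the actual Mesos code), the choice between the three modes listed above can be expressed as follows. The names "Action" and "chooseAction" are hypothetical, and the fallback to bypassing the cache after errors is left out.
+
+```cpp
+// Hypothetical names for illustration; the real mesos-fetcher receives its
+// instructions as a JSON object, which this sketch does not model.
+enum class Action {
+  BYPASS_CACHE,         // Mode 1: fetch directly into the sandbox.
+  DOWNLOAD_AND_CACHE,   // Mode 2: fetch into the cache, then copy.
+  RETRIEVE_FROM_CACHE   // Mode 3: copy (or extract) from the cache.
+};
+
+// Chooses a mode for one URI. "cacheRequested" mirrors the URI's "cache"
+// field; "cached" says whether a cache file for the URI is resident or
+// currently being downloaded.
+Action chooseAction(bool cacheRequested, bool cached) {
+  if (!cacheRequested) {
+    return Action::BYPASS_CACHE;
+  }
+  return cached ? Action::RETRIEVE_FROM_CACHE : Action::DOWNLOAD_AND_CACHE;
+}
+```
+
+The decision is made in the fetcher process; the helper program only executes the chosen mode.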
+
+Based on this setup, the main program flow in the fetcher process is concerned with assembling a list of parameters to the mesos-fetcher program that describe items to be fetched. This figure illustrates the high-level collaboration of the fetcher process with mesos-fetcher program runs. It also depicts the next level of detail of the fetcher process, which will be described in the following section.
+
+![Fetcher Separation of Labor](images/fetch_components.jpg?raw=true)
+
+
+### Cache state representation and manipulation
+
+The fetcher process uses a private instance of class Cache to represent what URIs are cached, where the respective cache files are, what stage of processing they are in, and so on.
+
+The main data structure to hold all this information is a hashmap from URI/user combinations to Cache::Entry objects, which each contain information about an individual cache file on disk. These objects are referenced by shared_ptr, because they can be addressed by multiple callbacks on behalf of concurrent fetch attempts while also being held in the hashmap.
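+
+The table described above could look roughly like the following sketch. It uses only the C++ standard library; the "Entry" fields, the "Key" and "KeyHash" types, and the method names are assumptions made for illustration and do not reproduce the actual Cache class.
+
+```cpp
+#include <cstddef>
+#include <functional>
+#include <memory>
+#include <string>
+#include <unordered_map>
+
+// Hypothetical stand-in for Cache::Entry; the field names are assumptions.
+struct Entry {
+  std::string filename;  // Name of the cache file on disk.
+  std::size_t size = 0;  // Size of the cache file in bytes.
+};
+
+// One URI as requested on behalf of one user.
+struct Key {
+  std::string user;
+  std::string uri;
+
+  bool operator==(const Key& that) const {
+    return user == that.user && uri == that.uri;
+  }
+};
+
+struct KeyHash {
+  std::size_t operator()(const Key& key) const {
+    std::size_t seed = std::hash<std::string>()(key.user);
+    // Simple hash combination; any reasonable mixing function would do.
+    return seed ^ (std::hash<std::string>()(key.uri) + 0x9e3779b9 +
+                   (seed << 6) + (seed >> 2));
+  }
+};
+
+class Cache {
+public:
+  // Returns the entry for the given user and URI, or nullptr if none.
+  std::shared_ptr<Entry> get(const std::string& user,
+                             const std::string& uri) const {
+    auto it = table.find(Key{user, uri});
+    if (it == table.end()) {
+      return nullptr;
+    }
+    return it->second;
+  }
+
+  // Creates and registers a new entry. Callers may keep the returned
+  // shared_ptr alive even after the entry has left the table.
+  std::shared_ptr<Entry> create(const std::string& user,
+                                const std::string& uri,
+                                const std::string& filename) {
+    auto entry = std::make_shared<Entry>();
+    entry->filename = filename;
+    table[Key{user, uri}] = entry;
+    return entry;
+  }
+
+  void remove(const std::string& user, const std::string& uri) {
+    table.erase(Key{user, uri});
+  }
+
+private:
+  std::unordered_map<Key, std::shared_ptr<Entry>, KeyHash> table;
+};
+```
+
+Because entries are held by shared_ptr, a callback that still holds an entry can keep using it safely after the entry has been removed from the table.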
+
+A cache entry corresponds directly to a cache file on disk throughout the entire lifetime of the latter, including before and after its existence. It holds all pertinent state about the phase and results of fetching the corresponding URI.
+
+This figure illustrates the different states which a cache entry can be in.
+
+![Fetcher Cache State](images/fetch_state.jpg?raw=true)
+
+While a cache entry is referenced, it cannot be evicted by the current or any other concurrent fetch attempt in order to make space for a download of a new cache file.
+
+The two blue states are essentially the same: no cache file exists. The two green disk states on the right are also the same.
+
+The figure only depicts what happens from the point of view of one isolated fetch run. Any given cache entry can be referenced simultaneously by another concurrent fetch run. It must not be evicted as long as it is referenced by any fetching activity. We implement this by reference counting. Every cache entry has a reference count field that gets incremented at the beginning of its use by a fetch run and decremented at its end. The latter must happen regardless of whether the run has been successful or has ended in an error. Increments happen when:
+- A new cache entry is created. It is immediately referenced.
+- An existing cache entry's file download is going to be waited for.
+- An existing cache entry has a resident cache file that is going to be retrieved.
+
+Every increment is recorded in a list. At the very end of the fetch procedure, no matter what its outcome is, each entry in the list gets its reference count decremented.
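+
+The following sketch illustrates this bookkeeping. Note that it swaps in a destructor-based (RAII) guard to obtain the "no matter what its outcome is" guarantee, whereas the fetcher process is built on libprocess continuations; the class names "Entry" and "FetchRun" and their members are hypothetical.
+
+```cpp
+#include <cassert>
+#include <cstddef>
+#include <memory>
+#include <vector>
+
+// Hypothetical cache entry, reduced to its reference count.
+struct Entry {
+  std::size_t referenceCount = 0;
+
+  void reference() { ++referenceCount; }
+
+  void unreference() {
+    assert(referenceCount > 0);
+    --referenceCount;
+  }
+
+  // An entry may only be evicted while no fetch run references it.
+  bool evictable() const { return referenceCount == 0; }
+};
+
+// Records every increment made during one fetch run and undoes all of
+// them when the run ends, whether it succeeded or failed.
+class FetchRun {
+public:
+  FetchRun() = default;
+  FetchRun(const FetchRun&) = delete;
+  FetchRun& operator=(const FetchRun&) = delete;
+
+  ~FetchRun() {
+    for (const std::shared_ptr<Entry>& entry : referenced) {
+      entry->unreference();
+    }
+  }
+
+  // Increments the entry's reference count and remembers the increment.
+  void use(const std::shared_ptr<Entry>& entry) {
+    entry->reference();
+    referenced.push_back(entry);
+  }
+
+private:
+  std::vector<std::shared_ptr<Entry>> referenced;
+};
+```
+
+Two overlapping runs that use the same entry raise its count to two; the entry becomes evictable again only after both runs have ended.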
+
+(Currently, when we fall back to bypassing the cache for a cache entry, we still leave its reference count untouched until the end of the fetch procedure. This may be unnecessary, but it is safe. It should also be rare, because fallbacks only occur to mitigate unexpected error situations. A future version may optimize this behavior.)
+
+### The per-URI control flow
+
+As mentioned above, the fetcher process' main control flow concerns sorting out what to do with each URI presented to it in a fetch request. An overview of the ensuing control flow for a given URI is depicted in this figure.
+
+![Determining Fetcher Actions](images/fetch_flow.jpg?raw=true)
+
+After going through this procedure for each URI, the fetcher process assembles the gathered list of per-URI actions into a JSON object (FetcherInfo), which is passed to the mesos-fetcher program in an environment variable. The possible fetch actions for a URI are shown at the bottom of the flow chart. After they are determined, the fetcher process invokes mesos-fetcher.
+
+The implementation follows this control flow, but its code structure cannot match it directly, because some of these branches must span multiple libprocess continuations. There are two layers of futures, one for each of these phases.
+
+1. Before making fetcher cache items:
+    - a) Wait for concurrent downloads for pre-existing cache entries.
+    - b) Wait for size fetching and then space reservation for new cache entries.
+
+2. After making fetcher cache items and running mesos-fetcher:
+    - Complete new cache items with success/failure, which as an important side effect informs concurrent fetch runs’ futures in phase 1/a.
+
+The futures for phase 1 are not shared outside one fetch run. They exclusively guard asynchronous operations for the same fetch run. Their type parameter does not really matter. But each needs to correspond to one URI and eventual fetch item somehow. Multiple variants have been proposed for this. The complexity remains about the same.
+
+The futures for phase 2 need to be part of the cache entries, because they are shared between concurrent fetch runs.
+
+Some time between phase 1 and 2, the fallback strategy needs to be applied where indicated: when a future from phase 1 has failed for any reason, we fall back on bypassing the cache.
+
+In addition, everything touched in 1/a and 1/b needs to be protected from cache eviction until the end. One can in principle release cache entries right after they fail, but this adds complexity and is harder to prove correct.
+
+
+### Cache eviction
+
+![Before eviction](images/fetch_evict1.jpg?raw=true)
+
+The resources named "A" and "B" have been fetched with caching into sandbox 1 and 2 below. In the course of this, two cache entries have been created and two files have been downloaded into the cache and named "1" and "2". (Cache files have unique names that comprise serial numbers.)
+
+The next figure illustrates the state after fetching a different cached URI into sandbox 3, which in this case requires evicting a cache-resident file and its entry. Steps:
+1. Remove the cache entry for "A" from the fetcher process' cache entry table. Its faded depiction is supposed to indicate this. This immediately makes it appear as if the URI has never been cached, even though the cache file is still around.
+2. Proceed with fetching "C". This creates a new cache file, which has a different unique name. (The fetcher process remembers in its cache entry which file name belongs to which URI.)
+
+![After eviction](images/fetch_evict2.jpg?raw=true)
+
+The next figure then shows what happens if the first URI is fetched once again. Here we also assume that the cache is so full that eviction is necessary, and this time the entry and file for "B" are the victims.
+
+![After another eviction](images/fetch_evict3.jpg?raw=true)

Added: mesos/site/source/documentation/latest/fetcher.md
URL: http://svn.apache.org/viewvc/mesos/site/source/documentation/latest/fetcher.md?rev=1683964&view=auto
==============================================================================
--- mesos/site/source/documentation/latest/fetcher.md (added)
+++ mesos/site/source/documentation/latest/fetcher.md Sat Jun  6 22:19:31 2015
@@ -0,0 +1,255 @@
+---
+layout: documentation
+---
+
+# Mesos Fetcher
+
+Experimental support for the Mesos fetcher _cache_ is introduced in
+Mesos 0.23.0.
+
+In this context we loosely take the term "downloading" to include copying
+from local file systems.
+
+## What is the Mesos fetcher?
+
+The Mesos fetcher is a mechanism to download resources into the sandbox
+directory of a task in preparation of running the task. As part of a TaskInfo
+message, the framework ordering the task's execution provides a list of
+CommandInfo::URI protobuf values, which becomes the input to the Mesos fetcher.
+
+By default, each requested URI is downloaded directly into the sandbox directory
+and repeated requests for the same URI lead to downloading another copy of the
+same resource. Alternatively, the fetcher can be instructed to cache URI
+downloads in a dedicated directory for reuse by subsequent downloads.
+
+The Mesos fetcher mechanism comprises these two parts:
+
+1. The slave-internal Fetcher Process (in terms of libprocess) that controls and
+coordinates all fetch actions. Every slave instance has exactly one internal
+fetcher instance that is used by every kind of containerizer (except the
+external containerizer variant, which is responsible for its own approach to
+fetching).
+
+2. The external program "mesos-fetcher" that is invoked by the former. It
+performs all network and disk operations except file deletions and file size
+queries for cache-internal bookkeeping. It is run as an external OS process in
+order to shield the slave process from I/O-related hazards. It takes
+instructions in the form of an environment variable containing a JSON object
+with detailed fetch action descriptions.
+
+## The fetch procedure
+
+Frameworks launch tasks by calling the scheduler driver method launchTasks(),
+passing CommandInfo protobuf structures as arguments. This type of structure
+specifies (among other things) a command and a list of URIs that need to be
+"fetched" into the sandbox directory on the slave node as a precondition for
+task execution. Hence, when the slave receives a request to launch a task, it
+first calls upon its fetcher to provision the specified resources into the
+sandbox directory. If fetching fails, the task is not started and the reported
+task status is TASK_FAILED.
+
+All URIs requested for a given task are fetched sequentially in a single
+invocation of mesos-fetcher. Here, avoiding download concurrency reduces the
+risk of bandwidth issues somewhat. However, multiple fetch operations can be
+active concurrently due to multiple task launch requests.
+
+### The URI protobuf structure
+
+Before mesos-fetcher is started, the specific fetch actions to be performed for
+each URI are determined based on the following protobuf structure. (See
+"include/mesos/mesos.proto" for more details.)
+
+    message CommandInfo {
+      message URI {
+        required string value = 1;
+        optional bool executable = 2;
+        optional bool extract = 3 [default = true];
+        optional bool cache = 4;
+      }
+      ...
+      optional string user = 5;
+    }
+
+The field "value" contains the URI.
+
+If the "executable" field is "true", the "extract" field is ignored and
+has no effect.
+
+If the "cache" field is true, the fetcher cache is to be used for the URI.
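+
+To make the interplay of these fields concrete, here is a small sketch that mirrors the URI message as a plain C++ struct. It is for illustration only: real code would use the classes generated from "include/mesos/mesos.proto", and the helper names "shouldExtract" and "usesCache" are hypothetical.
+
+```cpp
+#include <string>
+
+// Plain C++ mirror of CommandInfo::URI, for illustration only.
+struct URI {
+  std::string value;        // The URI itself.
+  bool executable = false;  // Optional field without explicit default.
+  bool extract = true;      // Default taken from the protobuf definition.
+  bool cache = false;       // Optional field without explicit default.
+};
+
+// The "extract" field is ignored when "executable" is set.
+bool shouldExtract(const URI& uri) {
+  return !uri.executable && uri.extract;
+}
+
+// The fetcher cache is used only if "cache" is explicitly true.
+bool usesCache(const URI& uri) {
+  return uri.cache;
+}
+```
+
+With these defaults, a URI that sets nothing but "value" is extracted if it looks like an archive and bypasses the cache.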
+
+### Specifying a user name
+
+The framework may pass along a user name that becomes a fetch parameter. This
+causes its executors and tasks to run under a specific user. However, if the
+"user" field in the CommandInfo structure is specified, it takes precedence for
+the affected task.
+
+If a user name is specified either way, the fetcher first validates that it is
+in fact a valid user name on the slave. If it is not, fetching fails right here.
+Otherwise, the sandbox directory is assigned to the specified user as owner
+(using chown) at the end of the fetch procedure, before task execution begins.
+
+The user name in play has an important effect on caching. Caching is managed on
+a per-user basis, i.e. the combination of user name and URI value uniquely
+identifies a cacheable fetch result. If no user name has been specified, this
+counts for the cache as a separate user, too. Thus cache files for each valid
+user are segregated from all others, including those without a specified user.
+
+This means that the exact same URI will be downloaded and cached multiple times
+if different users are indicated.
+
+### Executable fetch results
+
+By default, fetched files are not executable.
+
+If the field "executable" is set to "true", the fetch result will be changed to
+be executable (by "chmod") for every user. This happens at the end of the fetch
+procedure, in the sandbox directory only. It does not affect any cache file.
+
+### Archive extraction
+
+If the "extract" field is "true", which is the default, then files with
+extensions that hint at packed or compressed archives (".zip", ".tar", etc.)
+are unpacked in the sandbox directory.
+
+In case the cache is bypassed, both the archive and the unpacked results will be
+found together in the sandbox. In case a cache file is unpacked, only the
+extraction result will be found in the sandbox.
+
+### Bypassing the cache
+
+By default, the URI field "cache" is not present. If this is the case or its
+value is "false", the fetcher downloads directly into the sandbox directory.
+
+The same also happens dynamically as a fallback strategy if anything goes wrong
+when preparing a fetch operation that involves the cache. In this case, a
+warning message is logged. Possible fallback conditions are:
+
+- The server offering the URI does not respond or reports an error.
+- The URI's download size could not be determined.
+- There is not enough space in the cache, even after attempting to evict files.
+
+### Fetching through the cache
+
+If the URI's "cache" field has the value "true", then the fetcher cache is in
+effect. If a URI is encountered for the first time (for the same user), it is
+first downloaded into the cache, then copied to the sandbox directory from
+there. If the same URI is encountered again, and a corresponding cache file is
+resident in the cache or still en route into the cache, then downloading is
+omitted and the fetcher proceeds directly to copying from the cache. Competing
+requests for the same URI simply wait upon completion of the first request that
+occurs. Thus every URI is downloaded at most once (per user) as long as it is
+cached.
+
+Every cache file stays resident for an unspecified amount of time and can be
+removed at the fetcher's discretion at any moment, except while it is in direct
+use:
+
+- It is still being downloaded by this fetch procedure.
+- It is still being downloaded by a concurrent fetch procedure for a different
+  task.
+- It is being copied or extracted from the cache.
+
+Once a cache file has been removed, the related URI will thereafter be treated
+as described above for the first encounter.
+
+Unfortunately, there is no mechanism to refresh a cache entry in the current
+experimental version of the fetcher cache. A future feature may force updates
+based on checksum queries to the URI.
+
+Recommended practice for now:
+
+The framework should start using a fresh unique URI whenever the resource's
+content has changed.
+
+### Determining resource sizes
+
+Before downloading a resource to the cache, the fetcher first determines the
+size of the expected resource. It uses these methods depending on the nature of
+the URI.
+
+- Local file sizes are probed with system calls (that follow symbolic links).
+- HTTP/HTTPS URIs are queried for the "content-length" field in the header. This
+  is performed by CURL. The reported asset size must be greater than zero or
+  the URI is deemed invalid.
+- FTP/FTPS is not supported at the time of writing.
+- Everything else is queried by the local HDFS client.
+
+If any of this reports an error, the fetcher then falls back on bypassing the
+cache as described above.
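+
+The dispatch described in the list above might be sketched as follows. The names "SizeQuery" and "sizeQueryFor" are hypothetical, and the way local files are recognized here (a leading "/" or a "file://" prefix) is an assumption of this sketch rather than documented behavior.
+
+```cpp
+#include <string>
+
+// How the expected size of a resource would be determined.
+enum class SizeQuery {
+  LOCAL_FILE,           // Probed with system calls.
+  HTTP_CONTENT_LENGTH,  // "content-length" header field, queried via CURL.
+  UNSUPPORTED,          // FTP/FTPS: no size query, so the cache is bypassed.
+  HDFS_CLIENT           // Everything else.
+};
+
+bool startsWith(const std::string& s, const std::string& prefix) {
+  return s.compare(0, prefix.size(), prefix) == 0;
+}
+
+// Picks a size query method based on the form of the URI.
+SizeQuery sizeQueryFor(const std::string& uri) {
+  // Assumption: local files are given as absolute paths or "file://" URIs.
+  if (startsWith(uri, "/") || startsWith(uri, "file://")) {
+    return SizeQuery::LOCAL_FILE;
+  }
+  if (startsWith(uri, "http://") || startsWith(uri, "https://")) {
+    return SizeQuery::HTTP_CONTENT_LENGTH;
+  }
+  if (startsWith(uri, "ftp://") || startsWith(uri, "ftps://")) {
+    return SizeQuery::UNSUPPORTED;
+  }
+  return SizeQuery::HDFS_CLIENT;
+}
+```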
+
+WARNING: Only URIs for which download sizes can be queried up front and for
+which accurate sizes are reported reliably are eligible for any fetcher cache
+involvement. If actual cache file sizes exceed the physical capacity of the
+cache directory in any way, all further slave behavior is completely
+unspecified. Do not use any cache feature with any URI for which you have any
+doubts!
+
+To mitigate this problem, cache files that have been found to be larger than
+expected are deleted immediately after downloading and delivering the
+requested content to the sandbox. Thus exceeding total capacity at least
+does not accumulate over subsequent fetcher runs.
+
+If you know for sure that size aberrations are within certain limits you can
+specify a cache directory size that is sufficiently smaller than your actual
+physical volume and fetching should work.
+
+In case of cache files that are smaller than expected, the cache will
+dynamically adjust its own bookkeeping according to actual sizes.
+
+### Cache eviction
+
+After determining the prospective size of a cache file and before downloading
+it, the cache attempts to ensure that at least as much space as is needed for
+this file is available and can be written into. If this is immediately the case,
+the requested amount of space is simply marked as reserved. Otherwise, missing
+space is freed up by "cache eviction". This means that the cache removes files
+at its own discretion until the given space target is met or exceeded.
+
+The eviction process fails if too many files are in use and therefore not
+evictable or if the cache is simply too small. Either way, the fetcher then
+falls back on bypassing the cache for the given URI as described above.
+
+If multiple evictions happen concurrently, each of them is pursuing its own
+separate space goals. However, leftover freed up space from one effort is
+automatically awarded to others.
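+
+A minimal sketch of space reservation with eviction is shown below. It assumes that files are evicted oldest first and that a failed attempt does not undo evictions it has already made; neither detail is specified above, and the names "CacheFile" and "CacheSpace" are hypothetical.
+
+```cpp
+#include <cassert>
+#include <cstddef>
+#include <list>
+
+// Hypothetical record of one resident cache file.
+struct CacheFile {
+  std::size_t size;
+  bool inUse;  // Files in use must not be evicted.
+};
+
+class CacheSpace {
+public:
+  explicit CacheSpace(std::size_t capacity) : capacity(capacity) {}
+
+  std::size_t available() const { return capacity - used; }
+
+  // Registers a resident file; files are kept oldest first.
+  void add(std::size_t size, bool inUse) {
+    assert(size <= available());
+    files.push_back(CacheFile{size, inUse});
+    used += size;
+  }
+
+  // Tries to reserve "needed" bytes, evicting unused files as required.
+  // Returns false if the space target cannot be met; the caller would
+  // then fall back on bypassing the cache.
+  bool reserve(std::size_t needed) {
+    auto it = files.begin();
+    while (available() < needed && it != files.end()) {
+      if (it->inUse) {
+        ++it;  // Not evictable, try the next file.
+      } else {
+        used -= it->size;
+        it = files.erase(it);
+      }
+    }
+    if (available() < needed) {
+      return false;
+    }
+    used += needed;  // Mark the requested space as reserved.
+    return true;
+  }
+
+private:
+  std::size_t capacity;
+  std::size_t used = 0;
+  std::list<CacheFile> files;
+};
+```
+
+In this sketch, space freed by a failed reservation simply remains available, which loosely corresponds to leftover space being awarded to other efforts.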
+
+## Slave flags
+
+It is highly recommended to set these flags explicitly to values other than
+their defaults or to not use the fetcher cache in production.
+
+- "fetcher_cache_size", default value: enough for testing.
+- "fetcher_cache_dir", default value: somewhere inside the directory specified
+  by the "work_dir" flag, which is OK for testing.
+
+Recommended practice:
+
+- Use a separate volume as fetcher cache. Do not specify a directory as fetcher
+  cache directory that competes with any other contributor for the underlying
+  volume's space.
+- Set the cache directory size flag of the slave to less than your actual cache
+  volume's physical size. Use a safety margin, especially if you do not know
+  for sure if all frameworks are going to be compliant.
+
+Ultimate remedy:
+
+You can disable the fetcher cache entirely on each slave by setting its
+"fetcher_cache_size" flag to zero bytes.
+
+## Future Features
+The following features would be relatively easy to add.
+
+- Perform cache updates based on resource checksums. For example, query the md5
+  field in HTTP headers to determine when a resource at a URL has changed.
+- Respect HTTP cache-control directives.
+- Enable caching for ftp/ftps.
+- Use symbolic links or bind mounts to project cached resources into the
+  sandbox, read-only.
+- Have a choice whether to copy the extracted archive into the sandbox.
+- Have a choice whether to delete the archive after extraction bypassing the
+  cache.
+- Make the segregation of cache files by user optional.
+- Extract content while downloading when bypassing the cache.
+- Prefetch resources for subsequent tasks. This can happen concurrently with
+  running the present task, right after fetching its own resources.

Added: mesos/site/source/documentation/latest/mesos-doxygen-style-guide.md
URL: http://svn.apache.org/viewvc/mesos/site/source/documentation/latest/mesos-doxygen-style-guide.md?rev=1683964&view=auto
==============================================================================
--- mesos/site/source/documentation/latest/mesos-doxygen-style-guide.md (added)
+++ mesos/site/source/documentation/latest/mesos-doxygen-style-guide.md Sat Jun  6 22:19:31 2015
@@ -0,0 +1,102 @@
+# Apache Mesos Doxygen Style Guide
+
+This guide introduces a consistent style
+for [documenting Mesos source code](http://mesos.apache.org/api/latest/c++)
+using [Doxygen](http://www.doxygen.org).
+There is an ongoing, incremental effort with the goal to document all public Mesos, libprocess, and stout APIs this way.
+For now, existing code may not follow these guidelines, but new code should.
+
+## Preliminaries
+
+We follow the [IETF RFC2119](https://www.ietf.org/rfc/rfc2119.txt)
+on how to use words such as "must", "should", "can",
+and other requirement-related notions.
+
+
+## Building Doxygen Documentation
+As of right now, the Doxygen documentation should be built from the *build* subdirectory using *doxygen ../Doxyfile* . The documentation will then be generated into the *./doxygen* subdirectory.
+Todo: We should create a regular make target.
+
+
+## Doxygen Tags
+*When following these links, note that the Doxygen documentation uses a different syntax: @param is documented as \param.*
+
+* [@param](http://doxygen.org/manual/commands.html#cmdparam) Describes function parameters.
+* [@return](http://doxygen.org/manual/commands.html#cmdreturn) Describes return values.
+* [@see](http://doxygen.org/manual/commands.html#cmdsa) Describes a cross-reference to classes, functions, methods, variables, files or URLs.
+* [@file](http://doxygen.org/manual/commands.html#cmdfile) Describes a reference to a file. It is required when documenting global functions, variables, typedefs, or enums in separate files.
+* [@link](http://doxygen.org/manual/commands.html#cmdlink) and [@endlink](http://doxygen.org/manual/commands.html#cmdendlink) Describes a link to a file, class, or member.
+* [@example](http://doxygen.org/manual/commands.html#cmdexample) Describes source code examples.
+* [@todo](http://doxygen.org/manual/commands.html#cmdtodo) Describes a TODO item.
+* [@image](http://doxygen.org/manual/commands.html#cmdimage) Describes an image.
+
+## Wrapping
+We wrap long descriptions using 4 spaces on the next line.
+~~~
+@param uncompressed The input string that requires
+    a very long description and an even longer
+    description on this line as well.
+~~~
+
+
+## Outside Source Code
+
+### Library and Component Overview Pages and User Guides
+
+Substantial libraries, components, and subcomponents of the Mesos system such as
+stout, libprocess, master, slave, containerizer, allocator, and others
+should have an overview page in markdown format that explains their
+purpose, overall structure, and general use. This can even be a complete user guide.
+
+This page must be located in the top directory of the library/component and named "README.md".
+
+The first line in such a document must be a section heading bearing the title which will appear in the generated Doxygen index.
+Example: "# Libprocess User Guide"
+
+## In Source Code
+
+Doxygen documentation needs only to be applied to source code parts that
+constitute an interface for which we want to generate Mesos API documentation
+files. Implementation code that does not participate in this should still be
+enhanced by source code comments as appropriate, but these comments should not follow the doxygen style.
+
+We follow the [Javadoc syntax](http://en.wikipedia.org/wiki/Javadoc) to mark comment blocks.
+These have the general form:
+
+    /**
+     * Brief summary.
+     *
+     * Detailed description. More detail.
+     * @see Some reference
+     *
+     * @param <name> Parameter description.
+     * @return Return value description.
+     */
+
+Example:
+
+    /**
+     * Returns a compressed version of a string.
+     *
+     * Compresses an input string using the foobar algorithm.
+     *
+     * @param uncompressed The input string.
+     * @return A compressed version of the input string.
+     */
+    std::string compress(const std::string& uncompressed);
+
+### Constants and Variables
+
+### Functions
+
+### Classes
+
+#### Methods
+
+#### Fields
+
+### Templates
+
+### Macros
+
+### Global declarations outside classes