Posted to commits@aurora.apache.org by jf...@apache.org on 2016/04/15 22:21:35 UTC

svn commit: r1739360 [5/8] - in /aurora/site: ./ data/ publish/ publish/blog/ publish/blog/aurora-0-13-0-released/ publish/documentation/0.10.0/ publish/documentation/0.10.0/build-system/ publish/documentation/0.10.0/client-cluster-configuration/ publi...

Added: aurora/site/source/documentation/0.13.0/additional-resources/presentations.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/additional-resources/presentations.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/additional-resources/presentations.md (added)
+++ aurora/site/source/documentation/0.13.0/additional-resources/presentations.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,80 @@
+# Apache Aurora Presentations
+Video and slides from presentations and panel discussions about Apache Aurora.
+
+_(Listed in date descending order)_
+
+<table>
+
+	<tr>
+		<td><img src="../images/presentations/10_08_2015_mesos_aurora_on_a_small_scale_thumb.png" alt="Mesos and Aurora on a Small Scale Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=q5iIqhaCJ_o">Mesos &amp; Aurora on a Small Scale (Video)</a></strong>
+		<p>Presented by Florian Pfeiffer</p>
+		<p>October 8, 2015 at <a href="http://events.linuxfoundation.org/events/archive/2015/mesoscon-europe">#MesosCon Europe 2015</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/10_08_2015_sla_aware_maintenance_for_operators_thumb.png" alt="SLA Aware Maintenance for Operators Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=tZ0-SISvCis">SLA Aware Maintenance for Operators (Video)</a></strong>
+		<p>Presented by Joe Smith</p>
+		<p>October 8, 2015 at <a href="http://events.linuxfoundation.org/events/archive/2015/mesoscon-europe">#MesosCon Europe 2015</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/09_20_2015_shipping_code_with_aurora_thumb.png" alt="Shipping Code with Aurora Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=y1hi7K1lPkk">Shipping Code with Aurora (Video)</a></strong>
+		<p>Presented by Bill Farner</p>
+		<p>August 20, 2015 at <a href="http://events.linuxfoundation.org/events/archive/2015/mesoscon">#MesosCon 2015</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/09_20_2015_twitter_production_scale_thumb.png" alt="Twitter Production Scale Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=nNrh-gdu9m4">Twitter’s Production Scale: Mesos and Aurora Operations (Video)</a></strong>
+		<p>Presented by Joe Smith</p>
+		<p>August 20, 2015 at <a href="http://events.linuxfoundation.org/events/archive/2015/mesoscon">#MesosCon 2015</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/04_30_2015_monolith_to_microservices_thumb.png" alt="From Monolith to Microservices with Aurora Video Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=yXkOgnyK4Hw">From Monolith to Microservices w/ Aurora (Video)</a></strong>
+		<p>Presented by Thanos Baskous, Tony Dong, Dobromir Montauk</p>
+		<p>April 30, 2015 at <a href="http://www.meetup.com/Bay-Area-Apache-Aurora-Users-Group/events/221219480/">Bay Area Apache Aurora Users Group</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/03_07_2015_aurora_mesos_in_practice_at_twitter_thumb.png" alt="Aurora + Mesos in Practice at Twitter Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=1XYJGX_qZVU">Aurora + Mesos in Practice at Twitter (Video)</a></strong>
+		<p>Presented by Bill Farner</p>
+		<p>March 07, 2015 at <a href="http://www.bigeng.io/aurora-mesos-in-practice-at-twitter">Bigcommerce TechTalk</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/02_28_2015_apache_aurora_thumb.png" alt="Apache Auroraの始めかた Slideshow Thumbnail" /></td>
+		<td><strong><a href="http://www.slideshare.net/zembutsu/apache-aurora-introduction-and-tutorial-osc15tk">Apache Auroraの始めかた (Slides)</a></strong>
+		<p>Presented by Masahito Zembutsu</p>
+		<p>February 28, 2015 at <a href="http://www.ospn.jp/osc2015-spring/">Open Source Conference 2015 Tokyo Spring</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/02_19_2015_aurora_adopters_panel_thumb.png" alt="Apache Aurora Adopters Panel Video Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=2Jsj0zFdRlg">Apache Aurora Adopters Panel (Video)</a></strong>
+		<p>Panelists Ben Staffin, Josh Adams, Bill Farner, Berk Demir</p>
+		<p>February 19, 2015 at <a href="http://www.meetup.com/Bay-Area-Mesos-User-Group/events/220279080/">Bay Area Mesos Users Group</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/02_19_2015_aurora_at_twitter_thumb.png" alt="Operating Apache Aurora and Mesos at Twitter Video Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=E4lxX6epM_U">Operating Apache Aurora and Mesos at Twitter (Video)</a></strong>
+		<p>Presented by Joe Smith</p>
+		<p>February 19, 2015 at <a href="http://www.meetup.com/Bay-Area-Mesos-User-Group/events/220279080/">Bay Area Mesos Users Group</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/02_19_2015_aurora_at_tellapart_thumb.png" alt="Apache Aurora and Mesos at TellApart" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=ZZXtXLvTXAE">Apache Aurora and Mesos at TellApart (Video)</a></strong>
+		<p>Presented by Steve Niemitz</p>
+		<p>February 19, 2015 at <a href="http://www.meetup.com/Bay-Area-Mesos-User-Group/events/220279080/">Bay Area Mesos Users Group</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/08_21_2014_past_present_future_thumb.png" alt="Past, Present, and Future of the Aurora Scheduler Video Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=Dsc5CPhKs4o">Past, Present, and Future of the Aurora Scheduler (Video)</a></strong>
+		<p>Presented by Bill Farner</p>
+		<p>August 21, 2014 at <a href="http://events.linuxfoundation.org/events/archive/2014/mesoscon">#MesosCon 2014</a></p></td>
+	</tr>
+	<tr>
+		<td><img src="../images/presentations/03_25_2014_introduction_to_aurora_thumb.png" alt="Introduction to Apache Aurora Video Thumbnail" /></td>
+		<td><strong><a href="https://www.youtube.com/watch?v=asd_h6VzaJc">Introduction to Apache Aurora (Video)</a></strong>
+		<p>Presented by Bill Farner</p>
+		<p>March 25, 2014 at <a href="https://www.eventbrite.com/e/aurora-and-mesosframeworksmeetup-tickets-10850994617">Aurora and Mesos Frameworks Meetup</a></p></td>
+	</tr>
+</table>

Added: aurora/site/source/documentation/0.13.0/additional-resources/tools.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/additional-resources/tools.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/additional-resources/tools.md (added)
+++ aurora/site/source/documentation/0.13.0/additional-resources/tools.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,21 @@
+# Tools
+
+Various tools integrate with Aurora. Is there a tool missing? Let us know, or submit a patch to add it!
+
+* Load-balancing technology used to direct traffic to services running on Aurora:
+  - [synapse](https://github.com/airbnb/synapse) based on HAProxy
+  - [aurproxy](https://github.com/tellapart/aurproxy) based on nginx
+  - [jobhopper](https://github.com/benley/aurora-jobhopper) performs HTTP redirects for easy developer and administrator access
+
+* RPC libraries that integrate with Aurora's [service discovery mechanism](../features/service-discovery.md):
+  - [linkerd](https://linkerd.io/) RPC proxy
+  - [finagle](https://twitter.github.io/finagle) (Scala)
+  - [scales](https://github.com/steveniemitz/scales) (Python)
+
+* Monitoring:
+  - [collectd-aurora](https://github.com/zircote/collectd-aurora) for cluster monitoring using collectd
+  - [Prometheus Aurora exporter](https://github.com/tommyulfsparre/aurora_exporter) for cluster monitoring using Prometheus
+  - [Prometheus service discovery integration](http://prometheus.io/docs/operating/configuration/#zookeeper-serverset-sd-configurations-serverset_sd_config) for discovering and monitoring services running on Aurora
+
+* Packaging and deployment:
+  - [aurora-packaging](https://github.com/apache/aurora-packaging), the source of the official Aurora packages

Added: aurora/site/source/documentation/0.13.0/development/client.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/development/client.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/development/client.md (added)
+++ aurora/site/source/documentation/0.13.0/development/client.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,81 @@
+Developing the Aurora Client
+============================
+
+The client is written in Python, and uses the
+[Pants](http://pantsbuild.github.io/python-readme.html) build tool.
+
+
+Building and Testing
+--------------------
+
+Building and testing the client code are both done using Pants. The relevant targets to know about
+are:
+
+   * Build a client executable: `./pants binary src/main/python/apache/aurora/client:aurora`
+   * Test client code: `./pants test src/test/python/apache/aurora/client/cli:cli`
+
+If you want to build a source distribution of the client, you need to run `./build-support/release/make-python-sdists`.
+
+
+Running/Debugging
+------------------
+
+For manually testing client changes against a cluster, we use [Vagrant](https://www.vagrantup.com/).
+To start a virtual cluster, install Vagrant and then run `vagrant up` from the root of
+the aurora workspace. This will create a vagrant host named "devcluster", with a mesos master, a set
+of mesos agents, and an aurora scheduler.
+
+To test a change in your local cluster, rebuild the client:
+
+    vagrant ssh -c 'aurorabuild client'
+
+Once this completes, the `aurora` command will reflect your changes.
+
+
+Running/Debugging in PyCharm
+-----------------------------
+
+It's possible to use PyCharm to run and debug both the client and client tests in an IDE. In order
+to do this, first run:
+
+    build-support/python/make-pycharm-virtualenv
+
+This script will configure a virtualenv with all of our Python requirements. Once the script
+completes it will emit instructions for configuring PyCharm:
+
+    Your PyCharm environment is now set up.  You can open the project root
+    directory with PyCharm.
+
+    Once the project is loaded:
+      - open project settings
+      - click 'Project Interpreter'
+      - click the cog in the upper-right corner
+      - click 'Add Local'
+      - select 'build-support/python/pycharm.venv/bin/python'
+      - click 'OK'
+
+### Running/Debugging Tests
+After following these instructions, you should now be able to run/debug tests directly from the IDE
+by right-clicking on a test (or test class) and choosing to run or debug:
+
+[![Debug Client Test](../images/debug-client-test.png)](../images/debug-client-test.png)
+
+If you've set a breakpoint, you can see the run will now stop and let you debug:
+
+[![Debugging Client Test](../images/debugging-client-test.png)](../images/debugging-client-test.png)
+
+### Running/Debugging the Client
+Actually running and debugging the client is unfortunately a bit more complex. You'll need to create
+a Run configuration:
+
+* Go to Run → Edit Configurations
+* Click the + icon to add a new configuration.
+* Choose python and name the configuration 'client'.
+* Set the script path to `/your/path/to/aurora/src/main/python/apache/aurora/client/cli/client.py`
+* Set the script parameters to the command you want to run (e.g. `job status <job key>`)
+* Expand the Environment section and click the ellipsis to add a new environment variable
+* Click the + at the bottom to add a new variable named AURORA_CONFIG_ROOT whose value is the
+  path where your cluster configuration can be found. For example, to talk to the scheduler
+  running in the vagrant image, it would be set to `/your/path/to/aurora/examples/vagrant` (this
+  is the directory where our example clusters.json is found).
+* You should now be able to run and debug this configuration!
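
As a rough command-line counterpart to the run configuration described in client.md above, the sketch below builds the same script path, parameters, and `AURORA_CONFIG_ROOT` environment variable in Python. The helper name `client_run_config` is hypothetical, and the sketch assumes the interpreter you pass is the one from the PyCharm virtualenv (`build-support/python/pycharm.venv/bin/python`); it only assembles the invocation and does not launch anything.

```python
import os
import sys


def client_run_config(aurora_root, command, interpreter=sys.executable):
    """Build (argv, env) mirroring the 'client' run configuration.

    Hypothetical helper: aurora_root is your Aurora checkout, command is the
    list of client arguments (for example ["job", "status", "<job key>"]).
    """
    script = os.path.join(
        aurora_root, "src", "main", "python",
        "apache", "aurora", "client", "cli", "client.py")
    env = dict(os.environ)
    # Directory containing the example clusters.json for the vagrant scheduler.
    env["AURORA_CONFIG_ROOT"] = os.path.join(aurora_root, "examples", "vagrant")
    return [interpreter, script] + list(command), env


argv, env = client_run_config("/your/path/to/aurora", ["job", "status", "<job key>"])
# To actually start the client you could pass these to subprocess.call(argv, env=env).
```

Keeping the invocation as data makes it easy to compare against what the IDE is configured to run.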

Added: aurora/site/source/documentation/0.13.0/development/committers-guide.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/development/committers-guide.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/development/committers-guide.md (added)
+++ aurora/site/source/documentation/0.13.0/development/committers-guide.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,86 @@
+Committer's Guide
+=================
+
+Information for official Apache Aurora committers.
+
+Setting up your email account
+-----------------------------
+Once your Apache ID has been set up, you can configure your account, add ssh keys, and set up an
+email forwarding address at
+
+    http://id.apache.org
+
+Additional instructions for setting up your new committer email can be found at
+
+    http://www.apache.org/dev/user-email.html
+
+The recommended setup is to configure all services (mailing lists, JIRA, ReviewBoard) to send
+emails to your @apache.org email address.
+
+
+Creating a gpg key for releases
+-------------------------------
+In order to create a release candidate you will need a gpg key published to an external key server
+and that key will need to be added to our KEYS file as well.
+
+1. Create a key:
+
+               gpg --gen-key
+
+2. Add your gpg key to the Apache Aurora KEYS file:
+
+               git clone https://git-wip-us.apache.org/repos/asf/aurora.git
+               (gpg --list-sigs <KEY ID> && gpg --armor --export <KEY ID>) >> KEYS
+               git add KEYS && git commit -m "Adding gpg key for <APACHE ID>"
+               ./rbt post -o -g
+
+3. Publish the key to an external key server:
+
+               gpg --keyserver pgp.mit.edu --send-keys <KEY ID>
+
+4. Apply the KEYS file changes to the Apache Aurora svn dist locations listed below:
+
+               https://dist.apache.org/repos/dist/dev/aurora/KEYS
+               https://dist.apache.org/repos/dist/release/aurora/KEYS
+
+5. Add your key to git config for use with the release scripts:
+
+               git config --global user.signingkey <KEY ID>
+
+
+Creating a release
+------------------
+The following will guide you through the steps to create a release candidate, vote, and finally an
+official Apache Aurora release. Before starting, your gpg key should be in the KEYS file and you
+must have access to commit to the dist.apache.org repositories.
+
+1. Ensure that all issues resolved for this release candidate are tagged with the correct Fix
+Version in JIRA; the changelog script will use this to generate the CHANGELOG in step #2.
+
+2. Create a release candidate. This will automatically update the CHANGELOG and commit it, create a
+branch and update the current version within the trunk. To create a minor version update and publish
+it, run
+
+               ./build-support/release/release-candidate -l m -p
+
+3. Update, if necessary, the draft email created from the `release-candidate` script in step #2 and
+send the [VOTE] email to the dev@ mailing list. You can verify the release signature and checksums
+by running
+
+               ./build-support/release/verify-release-candidate
+
+4. Wait for the vote to complete. If the vote fails, close it by replying to the initial [VOTE]
+email sent in step #3, changing the subject to [RESULT][VOTE] ... and noting the failure reason
+(example [here](http://markmail.org/message/d4d6xtvj7vgwi76f)). Then address any issues, go back to
+step #1, and run the script again, this time using the -r flag to increment the release candidate
+version. This will automatically clean up the release candidate rc0 branch and source distribution.
+
+               ./build-support/release/release-candidate -l m -r 1 -p
+
+5. Once the vote has passed, create the release:
+
+               ./build-support/release/release
+
+6. Update the draft email created from the `release` script in step #5 to include the Apache IDs for
+all binding votes and send the [RESULT][VOTE] email to the dev@ mailing list.
+

Added: aurora/site/source/documentation/0.13.0/development/db-migration.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/development/db-migration.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/development/db-migration.md (added)
+++ aurora/site/source/documentation/0.13.0/development/db-migration.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,33 @@
+DB Migrations
+=============
+
+Changes to the DB schema should be made in the form of migrations. This ensures that all changes
+are applied correctly after a DB dump from a previous version is restored.
+
+DB migrations are managed through a system built on top of
+[MyBatis Migrations](http://www.mybatis.org/migrations/). The migrations are run automatically when
+a snapshot is restored; no manual interaction is required from cluster operators.
+
+Upgrades
+--------
+When adding or altering tables or changing data, a new migration class should be created under the
+org.apache.aurora.scheduler.storage.db.migration package. The class should implement the
+[MigrationScript](https://github.com/mybatis/migrations/blob/master/src/main/java/org/apache/ibatis/migration/MigrationScript.java)
+interface (see [V001_TestMigration](../../src/test/java/org/apache/aurora/scheduler/storage/db/testmigration/V001_TestMigration.java)
+as an example). The upgrade and downgrade scripts are defined in this class. When restoring a
+snapshot the list of migrations on the classpath is compared to the list of applied changes in the
+DB. Any changes that have not yet been applied are executed and their downgrade script is stored
+alongside the changelog entry in the database to facilitate downgrades in the event of a rollback.
+
+Downgrades
+----------
+If, while running migrations, a rollback is detected, i.e. a change exists in the DB changelog that
+does not exist on the classpath, the downgrade script associated with each affected change is
+applied.
+
+Baselines
+---------
+After enough time has passed (at least 1 official release), it should be safe to baseline migrations
+if desired. This can be accomplished by adding the changes from migrations directly to
+[schema.sql](../../src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql), removing
+the corresponding migration classes and adding a migration to remove the changelog entries.
\ No newline at end of file
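
The upgrade and downgrade behaviour described in db-migration.md above can be pictured with a small in-memory model. This is only a sketch of the comparison logic, not the scheduler's Java implementation: `Migration`, `reconcile`, the dictionary standing in for the DB changelog, and the ordering (downgrades newest first, upgrades oldest first) are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a migration class: an id plus its two scripts.
Migration = namedtuple("Migration", ["id", "up", "down"])


def reconcile(classpath_migrations, changelog, execute):
    """Bring the DB changelog in line with the migrations on the classpath.

    changelog maps an applied change id to its stored downgrade script.
    execute is a callable that runs one script against the DB.
    """
    known_ids = {m.id for m in classpath_migrations}

    # Rollback case: a change is recorded in the DB but missing from the
    # classpath, so run the downgrade script that was stored with it.
    for change_id in sorted(set(changelog) - known_ids, reverse=True):
        execute(changelog.pop(change_id))

    # Upgrade case: apply every change not yet recorded and store its
    # downgrade script alongside the new changelog entry.
    for migration in sorted(classpath_migrations, key=lambda m: m.id):
        if migration.id not in changelog:
            execute(migration.up)
            changelog[migration.id] = migration.down


executed = []
changelog = {1: "ALTER TABLE t DROP COLUMN a;", 3: "ALTER TABLE t DROP COLUMN c;"}
migrations = [
    Migration(1, "ALTER TABLE t ADD COLUMN a INT;", "ALTER TABLE t DROP COLUMN a;"),
    Migration(2, "ALTER TABLE t ADD COLUMN b INT;", "ALTER TABLE t DROP COLUMN b;"),
]
reconcile(migrations, changelog, executed.append)
```

In this run, change 3 exists only in the changelog, so its stored downgrade script is executed first; change 2 exists only on the classpath, so its upgrade script runs and its downgrade script is recorded.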

Added: aurora/site/source/documentation/0.13.0/development/design-documents.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/development/design-documents.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/development/design-documents.md (added)
+++ aurora/site/source/documentation/0.13.0/development/design-documents.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,20 @@
+Design Documents
+================
+
+Since its inception as an Apache project, larger feature additions to the
+Aurora code base are discussed in the form of design documents. Design documents
+are living documents until a consensus has been reached to implement a feature
+in the proposed form.
+
+Current and past documents:
+
+* [Command Hooks for the Aurora Client](design/command-hooks.md)
+* [Health Checks for Updates](https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit)
+* [JobUpdateDiff thrift API](https://docs.google.com/document/d/1Fc_YhhV7fc4D9Xv6gJzpfooxbK4YWZcvzw6Bd3qVTL8/edit)
+* [REST API RFC](https://docs.google.com/document/d/11_lAsYIRlD5ETRzF2eSd3oa8LXAHYFD8rSetspYXaf4/edit)
+* [Revocable Mesos offers in Aurora](https://docs.google.com/document/d/1r1WCHgmPJp5wbrqSZLsgtxPNj3sULfHrSFmxp2GyPTo/edit)
+* [Supporting the Mesos Universal Containerizer](https://docs.google.com/document/d/111T09NBF2zjjl7HE95xglsDpRdKoZqhCRM5hHmOfTLA/edit?usp=sharing)
+* [Tier Management In Apache Aurora](https://docs.google.com/document/d/1erszT-HsWf1zCIfhbqHlsotHxWUvDyI2xUwNQQQxLgs/edit?usp=sharing)
+* [Ubiquitous Jobs](https://docs.google.com/document/d/12hr6GnUZU3mc7xsWRzMi3nQILGB-3vyUxvbG-6YmvdE/edit)
+
+Design documents can be found in the Aurora issue tracker via the query [`project = AURORA AND text ~ "docs.google.com" ORDER BY created`](https://issues.apache.org/jira/browse/AURORA-1528?jql=project%20%3D%20AURORA%20AND%20text%20~%20%22docs.google.com%22%20ORDER%20BY%20created).

Added: aurora/site/source/documentation/0.13.0/development/design/command-hooks.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/development/design/command-hooks.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/development/design/command-hooks.md (added)
+++ aurora/site/source/documentation/0.13.0/development/design/command-hooks.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,102 @@
+# Command Hooks for the Aurora Client
+
+## Introduction/Motivation
+
+We've got hooks in the client that surround API calls. These are
+pretty awkward, because they don't correlate with user actions. For
+example, suppose we wanted a policy that said users weren't allowed to
+kill all instances of a production job at once.
+
+Right now, all that we could hook would be the "killJob" api call. But
+kill (at least in newer versions of the client) normally runs in
+batches. If a user called killall, what we would see on the API level
+is a series of "killJob" calls, each of which specified a batch of
+instances. We wouldn't be able to distinguish between really killing
+all instances of a job (which is forbidden under this policy), and
+carefully killing in batches (which is permitted). In each case, the
+hook would just see a series of API calls, and couldn't find out what
+the actual command being executed was!
+
+For most policy enforcement, what we really want to be able to do is
+look at and vet the commands that a user is performing, not the API
+calls that the client uses to implement those commands.
+
+So I propose that we add a new kind of hook, which surrounds noun/verb
+commands. A hook will register itself to handle a collection of (noun,
+verb) pairs. Whenever any of those noun/verb commands is invoked, the
+hook's methods will be called around the execution of the verb. A
+pre-hook will have the ability to reject a command, preventing the
+verb from being executed.
+
+## Registering Hooks
+
+These hooks will be registered via configuration plugins. A configuration plugin
+can register hooks using an API. Hooks registered this way are, effectively,
+hardwired into the client executable.
+
+The order of execution of hooks is unspecified: they may be called in
+any order. There is no way to guarantee that one hook will execute
+before some other hook.
+
+
+### Global Hooks
+
+Hooks registered by the Python call are called _global_ hooks,
+because they will run for all configurations, whether or not they
+specify any hooks in the configuration file.
+
+In the implementation, hooks are registered in the module
+`apache.aurora.client.cli.command_hooks`, using the class
+`GlobalCommandHookRegistry`. A global hook can be registered by calling
+`GlobalCommandHookRegistry.register_command_hook` in a configuration plugin.
+
+### The API
+
+    from abc import abstractmethod
+
+    class CommandHook(object):
+      @property
+      def name(self):
+        """Returns a name for the hook."""
+
+      def get_nouns(self):
+        """Return the nouns that have verbs that should invoke this hook."""
+
+      def get_verbs(self, noun):
+        """Return the verbs for a particular noun that should invoke this hook."""
+
+      @abstractmethod
+      def pre_command(self, noun, verb, context, commandline):
+        """Execute a hook before invoking a verb.
+        * noun: the noun being invoked.
+        * verb: the verb being invoked.
+        * context: the context object that will be used to invoke the verb.
+          The options object will be initialized before calling the hook
+        * commandline: the original argv collection used to invoke the client.
+        Returns: True if the command should be allowed to proceed; False if the command
+        should be rejected.
+        """
+
+      def post_command(self, noun, verb, context, commandline, result):
+        """Execute a hook after invoking a verb.
+        * noun: the noun being invoked.
+        * verb: the verb being invoked.
+        * context: the context object that will be used to invoke the verb.
+          The options object will be initialized before calling the hook
+        * commandline: the original argv collection used to invoke the client.
+        * result: the result code returned by the verb.
+        Returns: nothing
+        """
+
+    class GlobalCommandHookRegistry(object):
+      @classmethod
+      def register_command_hook(cls, hook):
+        pass
+
+### Skipping Hooks
+
+To skip a hook, a user uses a command-line option, `--skip-hooks`. The option can either
+specify specific hooks to skip, or "all":
+
+* `aurora --skip-hooks=all job create east/bozo/devel/myjob` will create a job
+  without running any hooks.
+* `aurora --skip-hooks=test,iq job create east/bozo/devel/myjob` will create a job,
+  and will skip only the hooks named "test" and "iq".
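
To make the command hooks design above concrete, here is a minimal sketch of the policy from its introduction: reject `job killall` for production jobs. `CommandHook` and `GlobalCommandHookRegistry` below are local stand-ins that mirror the API listed in the design, not the real `apache.aurora.client.cli.command_hooks` module. The names `NoProdKillAllHook`, `hooks_for`, and `run_pre_hooks`, and the convention that a production job key has `prod` as its environment component, are assumptions made for illustration.

```python
class CommandHook(object):
    """Local stand-in mirroring the CommandHook API from the design above."""

    @property
    def name(self):
        raise NotImplementedError

    def get_nouns(self):
        return []

    def get_verbs(self, noun):
        return []

    def pre_command(self, noun, verb, context, commandline):
        raise NotImplementedError

    def post_command(self, noun, verb, context, commandline, result):
        pass


class GlobalCommandHookRegistry(object):
    """Local stand-in; the hook list and lookup helper are assumptions."""

    _hooks = []

    @classmethod
    def register_command_hook(cls, hook):
        cls._hooks.append(hook)

    @classmethod
    def hooks_for(cls, noun, verb):
        return [h for h in cls._hooks
                if noun in h.get_nouns() and verb in h.get_verbs(noun)]


class NoProdKillAllHook(CommandHook):
    """Rejects `job killall` for jobs whose environment is 'prod'."""

    @property
    def name(self):
        return "no_prod_killall"

    def get_nouns(self):
        return ["job"]

    def get_verbs(self, noun):
        return ["killall"] if noun == "job" else []

    def pre_command(self, noun, verb, context, commandline):
        # Assumption: the final argument is a job key of the form
        # cluster/role/env/name, and production jobs use the env "prod".
        parts = commandline[-1].split("/")
        return not (len(parts) == 4 and parts[2] == "prod")


def run_pre_hooks(noun, verb, context, commandline, skip_hooks=()):
    """Return True if every applicable, non-skipped pre-hook allows the command."""
    if "all" in skip_hooks:
        return True
    for hook in GlobalCommandHookRegistry.hooks_for(noun, verb):
        if hook.name in skip_hooks:
            continue
        if not hook.pre_command(noun, verb, context, commandline):
            return False
    return True


GlobalCommandHookRegistry.register_command_hook(NoProdKillAllHook())

prod_kill = ["job", "killall", "east/bozo/prod/myjob"]
allowed = run_pre_hooks("job", "killall", None, prod_kill)
skipped = run_pre_hooks("job", "killall", None, prod_kill,
                        skip_hooks=["no_prod_killall"])
```

Here `allowed` is `False` because the pre-hook rejects the command, while `skipped` is `True` because the hook was named in `skip_hooks`. Because the hook sees the noun and verb rather than individual API calls, it can tell `killall` apart from a batched kill, which is the distinction the introduction says API-level hooks cannot make.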

Added: aurora/site/source/documentation/0.13.0/development/scheduler.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/development/scheduler.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/development/scheduler.md (added)
+++ aurora/site/source/documentation/0.13.0/development/scheduler.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,118 @@
+Developing the Aurora Scheduler
+===============================
+
+The Aurora scheduler is written in Java and built with [Gradle](http://gradle.org).
+
+
+Prerequisite
+============
+
+When using Apache Aurora checked out from the source repository or the binary
+distribution, the Gradle wrapper and JavaScript dependencies are provided.
+However, you need to manually install them when using the source release
+downloads:
+
+1. Install Gradle following the instructions on the [Gradle web site](http://gradle.org)
+2. From the root directory of the Apache Aurora project generate the Gradle
+wrapper by running:
+
+    gradle wrapper
+
+
+Getting Started
+===============
+
+You will need Java 8 installed and on your `PATH` or unzipped somewhere with `JAVA_HOME` set. Then
+
+    ./gradlew tasks
+
+will bootstrap the build system and show available tasks. This can take a while the first time you
+run it but subsequent runs will be much faster due to cached artifacts.
+
+Running the Tests
+-----------------
+Aurora has a comprehensive unit test suite. To run the tests use
+
+    ./gradlew build
+
+Gradle will only re-run tests when their dependencies have changed. To force a re-run of all
+tests use
+
+    ./gradlew clean build
+
+Running the build with code quality checks
+------------------------------------------
+To speed up development iteration, the plain gradle commands will not run static analysis tools.
+However, you should run them before posting a review diff, and **always** run them before pushing a
+commit to origin/master.
+
+    ./gradlew build -Pq
+
+Running integration tests
+-------------------------
+To run the same tests that are run in the Apache Aurora continuous integration
+environment:
+
+    ./build-support/jenkins/build.sh
+
+In addition, there is an end-to-end test that runs a suite of aurora commands
+using a virtual cluster:
+
+    ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
+
+Creating a bundle for deployment
+--------------------------------
+Gradle can create a zip file containing Aurora, all of its dependencies, and a launch script with
+
+    ./gradlew distZip
+
+or a tar file containing the same files with
+
+    ./gradlew distTar
+
+The output file will be written to `dist/distributions/aurora-scheduler.zip` or
+`dist/distributions/aurora-scheduler.tar`.
+
+
+
+Developing Aurora Java code
+===========================
+
+Setting up an IDE
+-----------------
+Gradle can generate project files for your IDE. To generate an IntelliJ IDEA project run
+
+    ./gradlew idea
+
+and import the generated `aurora.ipr` file.
+
+Adding or Upgrading a Dependency
+--------------------------------
+New dependencies can be added from Maven central by adding a `compile` dependency to `build.gradle`.
+For example, to add a dependency on `com.example`'s `example-lib` 1.0 add this block:
+
+    compile 'com.example:example-lib:1.0'
+
+NOTE: Anyone thinking about adding a new dependency should first familiarize themselves with the
+Apache Foundation's third-party licensing
+[policy](http://www.apache.org/legal/resolved.html#category-x).
+
+
+
+Developing the Aurora Build System
+==================================
+
+Bootstrapping Gradle
+--------------------
+The following files were autogenerated by `gradle wrapper` using gradle's
+[Wrapper](http://www.gradle.org/docs/current/dsl/org.gradle.api.tasks.wrapper.Wrapper.html) plugin and
+should not be modified directly:
+
+    ./gradlew
+    ./gradlew.bat
+    ./gradle/wrapper/gradle-wrapper.jar
+    ./gradle/wrapper/gradle-wrapper.properties
+
+To upgrade Gradle, unpack the new version somewhere, run `/path/to/new/gradle wrapper` in the
+repository root, and commit the changed files.
+

Added: aurora/site/source/documentation/0.13.0/development/thermos.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/development/thermos.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/development/thermos.md (added)
+++ aurora/site/source/documentation/0.13.0/development/thermos.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,126 @@
+The Python components of Aurora are built using [Pants](https://pantsbuild.github.io).
+
+
+Python Build Conventions
+========================
+The Python code is laid out according to the following conventions:
+
+1. 1 `BUILD` per 3rd level directory. For a list of current top-level packages run:
+
+        % find src/main/python -maxdepth 3 -mindepth 3 -type d |\
+        while read dname; do echo $dname |\
+            sed 's@src/main/python/\(.*\)/\(.*\)/\(.*\).*@\1.\2.\3@'; done
+
+2.  Each `BUILD` file exports 1
+    [`python_library`](https://pantsbuild.github.io/build_dictionary.html#bdict_python_library)
+    that provides a
+    [`setup_py`](https://pantsbuild.github.io/build_dictionary.html#setup_py)
+    containing each
+    [`python_binary`](https://pantsbuild.github.io/build_dictionary.html#python_binary)
+    in the `BUILD` file, named the same as the directory it's in so that it can be referenced
+    without a ':' character. The `sources` field in the `python_library` will almost always be
+    `rglobs('*.py')`.
+
+3.  Other BUILD files may only depend on this single public `python_library`
+    target. Any other target is considered a private implementation detail and
+    should be prefixed with an `_`.
+
+4.  `python_binary` targets are always named the same as the exported console script.
+
+5.  `python_binary` targets must have identical `dependencies` to the `python_library` exported
+    by the package and must use `entry_point`.
+
+    This means a PEX file generated by Pants will contain exactly the same files that will be
+    available on the `PYTHONPATH` in the case of `pip install` of the corresponding library
+    target. This will help our migration off of Pants in the future.
+
+Annotated example - apache.thermos.runner
+-----------------------------------------
+
+    % find src/main/python/apache/thermos/runner
+    src/main/python/apache/thermos/runner
+    src/main/python/apache/thermos/runner/__init__.py
+    src/main/python/apache/thermos/runner/thermos_runner.py
+    src/main/python/apache/thermos/runner/BUILD
+    % cat src/main/python/apache/thermos/runner/BUILD
+    # License boilerplate omitted
+    import os
+
+
+    # Private target so that a setup_py can exist without a circular dependency. Only targets within
+    # this file should depend on this.
+    python_library(
+      name = '_runner',
+      # The target covers every python file under this directory and subdirectories.
+      sources = rglobs('*.py'),
+      dependencies = [
+        '3rdparty/python:twitter.common.app',
+        '3rdparty/python:twitter.common.log',
+        # Source dependencies are always referenced without a ':'.
+        'src/main/python/apache/thermos/common',
+        'src/main/python/apache/thermos/config',
+        'src/main/python/apache/thermos/core',
+      ],
+    )
+
+    # Binary target for thermos_runner.pex. Nothing should depend on this - it's only used as an
+    # argument to ./pants binary.
+    python_binary(
+      name = 'thermos_runner',
+      # Use entry_point, not source so the files used here are the same ones tests see.
+      entry_point = 'apache.thermos.runner.thermos_runner',
+      dependencies = [
+        # Notice that we depend only on the single private target from this BUILD file here.
+        ':_runner',
+      ],
+    )
+
+    # The public library that everyone importing the runner symbols uses.
+    # The test targets and any other dependent source code should depend on this.
+    python_library(
+      name = 'runner',
+      dependencies = [
+        # Again, notice that we depend only on the single private target from this BUILD file here.
+        ':_runner',
+      ],
+      # We always provide a setup_py. This will cause any dependee libraries to automatically
+      # reference this library in their requirements.txt rather than copy the source files into their
+      # sdist.
+      provides = setup_py(
+        # Conventionally named and versioned.
+        name = 'apache.thermos.runner',
+        version = open(os.path.join(get_buildroot(), '.auroraversion')).read().strip().upper(),
+      ).with_binaries({
+        # Every binary in this file should also be repeated here.
+        # Always use the dict-form of .with_binaries so that commands with dashes in their names are
+        # supported.
+        # The console script name is always the same as the PEX with .pex stripped.
+        'thermos_runner': ':thermos_runner',
+      }),
+    )
+
+
+
+Thermos Test resources
+======================
+
+The Aurora source repository and distributions contain several
+[binary files](../../src/test/resources/org/apache/thermos/root/checkpoints) to
+qualify the backwards-compatibility of thermos with checkpoint data. Since
+thermos persists state to disk (to be read by the thermos observer), it is important that we have
+tests that prevent regressions affecting the ability to parse previously-written data.
+
+The files included represent persisted checkpoints that exercise different
+features of thermos. The existing files should not be modified unless
+we are accepting backwards incompatibility, such as with a major release.
+
+It is not practical to write source code to generate these files on the fly,
+as source would be vulnerable to drift (e.g. due to refactoring) in ways
+that would undermine the goal of ensuring backwards compatibility.
+
+The most common reason to add a new checkpoint file would be to provide
+coverage for new thermos features that alter the data format. This is
+accomplished by writing and running a
+[job configuration](../reference/configuration.md) that exercises the feature, and then
+copying the checkpoint file from the sandbox directory. By default, this is
+`/var/run/thermos/checkpoints/<aurora task id>`.

Added: aurora/site/source/documentation/0.13.0/development/thrift.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/development/thrift.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/development/thrift.md (added)
+++ aurora/site/source/documentation/0.13.0/development/thrift.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,54 @@
+Thrift
+======
+
+Aurora uses [Apache Thrift](https://thrift.apache.org/) for representing structured data in its
+client/server RPC protocol as well as for internal data storage. While Thrift is capable of
+correctly handling additions and renames of the existing members, field removals must be done
+carefully to ensure backwards compatibility and provide a predictable deprecation cycle. This
+document describes general guidelines for making Thrift schema changes to the existing fields in
+[api.thrift](../../api/src/main/thrift/org/apache/aurora/gen/api.thrift).
+
+It is highly recommended to go through the
+[Thrift: The Missing Guide](http://diwakergupta.github.io/thrift-missing-guide/) first to refresh on
+basic Thrift schema concepts.
+
+Checklist
+---------
+Every modification of the existing Thrift schema is unique in its requirements and must be analyzed
+carefully to identify its scope and expected consequences. The following checklist may help in that
+analysis:
+
+* Is this a new field/struct? If yes, go ahead.
+* Is this a pure field/struct rename without any type/structure change? If yes, go ahead and rename.
+* For anything else, read further to make sure your change is properly planned.
+
+Deprecation cycle
+-----------------
+Any time a breaking change (e.g.: field replacement or removal) is required, the following cycle
+must be followed:
+
+### vCurrent
+Apply the change in a way that does not prevent a scheduler/client at this version from
+communicating with a scheduler/client from vCurrent-1.
+* Do not remove or rename the old field
+* Add a new field as an eventual replacement of the old one and implement a dual read/write
+anywhere the old field is used. If a thrift struct is mapped in the DB store make sure both columns
+are marked as `NOT NULL`
+* Check [storage.thrift](../../api/src/main/thrift/org/apache/aurora/gen/storage.thrift) to see if
+the affected struct is stored in Aurora scheduler storage. If so, it's almost certainly also
+necessary to perform a [DB migration](db-migration.md).
+* Add a deprecation jira ticket into the vCurrent+1 release candidate
+* Add a TODO for the deprecated field mentioning the jira ticket
+
+### vCurrent+1
+Finalize the change by removing the deprecated fields from the Thrift schema.
+* Drop any dual read/write routines added in the previous version
+* Remove thrift backfilling in scheduler
+* Remove the deprecated Thrift field
+
+Testing
+-------
+It's always advisable to test your changes in the local vagrant environment to build more
+confidence that your change is backwards compatible. It's easy to simulate different
+client/scheduler versions by using the `aurorabuild` command. See [this document](../getting-started/vagrant.md)
+for more.
+

Added: aurora/site/source/documentation/0.13.0/development/ui.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/development/ui.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/development/ui.md (added)
+++ aurora/site/source/documentation/0.13.0/development/ui.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,46 @@
+Developing the Aurora Scheduler UI
+==================================
+
+Installing bower (optional)
+----------------------------
+Third party JS libraries used in Aurora (located at 3rdparty/javascript/bower_components) are
+managed by bower, a JS dependency manager. Bower is only required if you plan to add, remove or
+update JS libraries. Bower can be installed using the following command:
+
+    npm install -g bower
+
+Bower depends on node.js and npm. The easiest way to install node on a mac is via brew:
+
+    brew install node
+
+For more node.js installation options refer to https://github.com/joyent/node/wiki/Installation.
+
+More info on installing and using bower can be found at: http://bower.io/. Once installed, you can
+use the following commands to view and modify the bower repo at
+3rdparty/javascript/bower_components:
+
+    bower list
+    bower install <library name>
+    bower remove <library name>
+    bower update <library name>
+    bower help
+
+
+Faster Iteration in Vagrant
+---------------------------
+The scheduler serves UI assets from the classpath. For production deployments this means the assets
+are served from within a jar. However, for faster development iteration, the vagrant image is
+configured to add the `scheduler` subtree of `/vagrant/dist/resources/main` to the head of
+`CLASSPATH`. This path is configured as a shared filesystem to the path on the host system where
+your Aurora repository lives. This means that any updates under `dist/resources/main/scheduler` in
+your checkout will be reflected immediately in the UI served from within the vagrant image.
+
+The one caveat to this is that this path is under `dist`, not `src`. This is because the assets must
+be processed by gradle before they can be served. So, unfortunately, you cannot just save your local
+changes and see them reflected in the UI; you must first run `./gradlew processResources`. This is
+less than ideal, but better than having to restart the scheduler after every change. Additionally,
+gradle makes this process somewhat easier with the use of the `--continuous` flag. If you run
+`./gradlew processResources --continuous`, gradle will monitor the filesystem for changes and run the
+task automatically as necessary. This doesn't quite provide hot-reload capabilities, but it does
+allow for <5s from save to changes being visible in the UI with no further action required on the
+part of the developer.

Added: aurora/site/source/documentation/0.13.0/features/constraints.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/features/constraints.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/features/constraints.md (added)
+++ aurora/site/source/documentation/0.13.0/features/constraints.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,126 @@
+Scheduling Constraints
+======================
+
+By default, Aurora will pick any random slave with sufficient resources
+in order to schedule a task. This scheduling choice can be further
+restricted with the help of constraints.
+
+
+Mesos Attributes
+----------------
+
+Data centers are often organized with hierarchical failure domains.  Common failure domains
+include hosts, racks, rows, and PDUs.  If you have this information available, it is wise to tag
+the Mesos slave with them as
+[attributes](https://mesos.apache.org/documentation/attributes-resources/).
+
+The Mesos slave `--attributes` command line argument can be used to mark slaves with
+static key/value pairs, so called attributes (not to be confused with `--resources`, which are
+dynamic and accounted).
+
+For example, consider the host `cluster1-aaa-03-sr2` and its following attributes (given in
+key:value format): `host:cluster1-aaa-03-sr2` and `rack:aaa`.
+
+Aurora makes these attributes available for matching with scheduling constraints.
+
+
+Limit Constraints
+-----------------
+
+Limit constraints let you control machine diversity. The constraint below
+ensures that no more than two instances of your job may run on a single host.
+Think of this as a "group by" limit.
+
+    Service(
+      name = 'webservice',
+      role = 'www-data',
+      constraints = {
+        'host': 'limit:2',
+      }
+      ...
+    )
+
+
+Likewise, you can use constraints to control rack diversity, e.g. at
+most one task per rack:
+
+    constraints = {
+      'rack': 'limit:1',
+    }
+
+Use these constraints sparingly as they can dramatically reduce Tasks' schedulability.
+Further details are available in the reference documentation on
+[Scheduling Constraints](../reference/configuration.md#specifying-scheduling-constraints).
+
+
+
+Value Constraints
+-----------------
+
+Value constraints can be used to express that a certain attribute with a certain value
+should be present on a Mesos slave. For example, the following job would only be
+scheduled on nodes that claim to have an `SSD` as their disk.
+
+    Service(
+      name = 'webservice',
+      role = 'www-data',
+      constraints = {
+        'disk': 'SSD',
+      }
+      ...
+    )
+
+
+Further details are available in the reference documentation on
+[Scheduling Constraints](../reference/configuration.md#specifying-scheduling-constraints).
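+
+Both kinds of constraint can appear in the same `constraints` dictionary. The following is a minimal
+sketch, assuming that the slaves advertise both a `rack` and a `disk` attribute and that every listed
+constraint must be satisfied for a slave to be eligible:
+
+    Service(
+      name = 'webservice',
+      role = 'www-data',
+      constraints = {
+        # Value constraint: only slaves tagged with disk:SSD.
+        'disk': 'SSD',
+        # Limit constraint: at most one instance per rack.
+        'rack': 'limit:1',
+      },
+      ...
+    )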
+
+
+Running stateful services
+-------------------------
+
+Aurora is best suited to run stateless applications, but it also accommodates stateful services
+like databases, or services that otherwise need to always run on the same machines.
+
+### Dedicated attribute
+
+Most Mesos attributes are arbitrary and available for custom use.  There is one exception,
+though: the `dedicated` attribute.  Aurora treats it specially: only jobs with a matching
+`dedicated` constraint may run on these machines, and such jobs are scheduled only on these machines.
+
+
+#### Syntax
+The dedicated attribute has semantic meaning. The format is `$role(/.*)?`. When a job is created,
+the scheduler requires that the `$role` component matches the `role` field in the job
+configuration, and will reject the job creation otherwise.  The remainder of the attribute is
+free-form. We've developed the idiom of formatting this attribute as `$role/$job`, but do not
+enforce this. For example: a job `devcluster/www-data/prod/hello` with a dedicated constraint set as
+`www-data/web.multi` will have its tasks scheduled only on Mesos slaves configured with:
+`--attributes=dedicated:www-data/web.multi`.
+
+A wildcard (`*`) may be used for the role portion of the dedicated attribute, which will allow any
+owner to elect for a job to run on the host(s). For example: tasks from both
+`devcluster/www-data/prod/hello` and `devcluster/vagrant/test/hello` with a dedicated constraint
+formatted as `*/web.multi` will be scheduled only on Mesos slaves configured with
+`--attributes=dedicated:*/web.multi`. This may be useful when assembling a virtual cluster of
+machines sharing the same set of traits or requirements.
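+
+As a sketch of the wildcard case, using the attribute value and role named above (the job name
+`hello` is taken from the job keys in this section), a slave started as:
+
+    mesos-slave --attributes="dedicated:*/web.multi" ...
+
+would accept tasks from a job configured as:
+
+    Service(
+      name = 'hello',
+      role = 'www-data',
+      constraints = {
+        # Any role may opt in to these hosts by using the same wildcard value.
+        'dedicated': '*/web.multi',
+      },
+      ...
+    )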
+
+##### Example
+Consider the following slave command line:
+
+    mesos-slave --attributes="dedicated:db_team/redis" ...
+
+And this job configuration:
+
+    Service(
+      name = 'redis',
+      role = 'db_team',
+      constraints = {
+        'dedicated': 'db_team/redis'
+      }
+      ...
+    )
+
+The job configuration is indicating that it should only be scheduled on slaves with the attribute
+`dedicated:db_team/redis`.  Additionally, Aurora will prevent any tasks that do _not_ have that
+constraint from running on those slaves.
+

Added: aurora/site/source/documentation/0.13.0/features/containers.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/features/containers.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/features/containers.md (added)
+++ aurora/site/source/documentation/0.13.0/features/containers.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,43 @@
+Containers
+==========
+
+
+Docker
+------
+
+Aurora has optional support for launching Docker containers, if correctly [configured by an Operator](../operations/configuration.md#docker-containers).
+
+Example (available in the [Vagrant environment](../getting-started/vagrant.md)):
+
+
+    $ cat /vagrant/examples/jobs/docker/hello_docker.aurora
+    hello_world_proc = Process(
+      name = 'hello',
+      cmdline = """
+        while true; do
+          echo hello world
+          sleep 10
+        done
+      """)
+
+    hello_world_docker = Task(
+      name = 'hello docker',
+      processes = [hello_world_proc],
+      resources = Resources(cpu = 1, ram = 1*MB, disk=8*MB)
+    )
+
+    jobs = [
+      Service(
+        cluster = 'devcluster',
+        environment = 'devel',
+        role = 'docker-test',
+        name = 'hello_docker',
+        task = hello_world_docker,
+        container = Container(docker = Docker(image = 'python:2.7'))
+      )
+    ]
+
+
+In order to correctly execute processes inside a job, the docker container must have Python 2.7
+installed. Further details of how to use Docker can be found in the
+[Reference Documentation](../reference/configuration.md#docker-object).

Added: aurora/site/source/documentation/0.13.0/features/cron-jobs.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/features/cron-jobs.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/features/cron-jobs.md (added)
+++ aurora/site/source/documentation/0.13.0/features/cron-jobs.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,124 @@
+# Cron Jobs
+
+Aurora supports execution of scheduled jobs on a Mesos cluster using cron-style syntax.
+
+- [Overview](#overview)
+- [Collision Policies](#collision-policies)
+- [Failure recovery](#failure-recovery)
+- [Interacting with cron jobs via the Aurora CLI](#interacting-with-cron-jobs-via-the-aurora-cli)
+	- [cron schedule](#cron-schedule)
+	- [cron deschedule](#cron-deschedule)
+	- [cron start](#cron-start)
+	- [job killall, job restart, job kill](#job-killall-job-restart-job-kill)
+- [Technical Note About Syntax](#technical-note-about-syntax)
+- [Caveats](#caveats)
+	- [Failovers](#failovers)
+	- [Collision policy is best-effort](#collision-policy-is-best-effort)
+	- [Timezone Configuration](#timezone-configuration)
+
+## Overview
+
+A job is identified as a cron job by the presence of a
+`cron_schedule` attribute containing a cron-style schedule in the
+[`Job`](../reference/configuration.md#job-objects) object. Examples of cron schedules
+include "every 5 minutes" (`*/5 * * * *`), "Fridays at 17:00" (`0 17 * * FRI`), and
+"the 1st and 15th day of the month at 03:00" (`0 3 1,15 * *`).
+
+Example (available in the [Vagrant environment](../getting-started/vagrant.md)):
+
+    $ cat /vagrant/examples/jobs/cron_hello_world.aurora
+    # A cron job that runs every 5 minutes.
+    jobs = [
+      Job(
+        cluster = 'devcluster',
+        role = 'www-data',
+        environment = 'test',
+        name = 'cron_hello_world',
+        cron_schedule = '*/5 * * * *',
+        task = SimpleTask(
+          'cron_hello_world',
+          'echo "Hello world from cron, the time is now $(date --rfc-822)"'),
+      ),
+    ]
+
+## Collision Policies
+
+The `cron_collision_policy` field specifies the scheduler's behavior when a new cron job is
+triggered while an older run hasn't finished. The scheduler has two policies available:
+
+* `KILL_EXISTING`: The default policy. On a collision, the old instances are killed and new instances with the current
+configuration are started.
+* `CANCEL_NEW`: On a collision the new run is cancelled.
+
+Note that the use of `CANCEL_NEW` is likely a code smell - interrupted cron jobs should be able
+to recover their progress on a subsequent invocation, otherwise they risk having their work queue
+grow faster than they can process it.
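+
+As a sketch, the policy is set alongside `cron_schedule` in the job definition. The job below is a
+hypothetical variant of the example above that skips a run while the previous one is still active:
+
+    jobs = [
+      Job(
+        cluster = 'devcluster',
+        role = 'www-data',
+        environment = 'test',
+        name = 'cron_hello_world',
+        cron_schedule = '*/5 * * * *',
+        # Cancel the new run if the previous run has not finished.
+        cron_collision_policy = 'CANCEL_NEW',
+        task = SimpleTask('cron_hello_world', 'echo "Hello world from cron"'),
+      ),
+    ]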
+
+## Failure recovery
+
+Unlike services, which Aurora will always re-execute regardless of exit status, instances of
+cron jobs retry according to the `max_task_failures` attribute of the
+[Job](../reference/configuration.md#job-objects) object. To get "run-until-success" semantics,
+set `max_task_failures` to `-1`.
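+
+As a sketch, assuming `max_task_failures` is set at the job level as listed in the configuration
+reference, a hypothetical cron job that retries until it succeeds could look like this:
+
+    Job(
+      name = 'cron_hello_world',
+      cron_schedule = '*/5 * * * *',
+      # -1 allows an unlimited number of failed attempts.
+      max_task_failures = -1,
+      ...
+    )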
+
+## Interacting with cron jobs via the Aurora CLI
+
+Most interaction with cron jobs takes place using the `cron` subcommand. See `aurora cron -h`
+for up-to-date usage instructions.
+
+### cron schedule
+Schedules a new cron job on the Aurora cluster for later runs or replaces the existing cron template
+with a new one. Only future runs will be affected, any existing active tasks are left intact.
+
+    $ aurora cron schedule devcluster/www-data/test/cron_hello_world /vagrant/examples/jobs/cron_hello_world.aurora
+
+### cron deschedule
+Deschedules a cron job, preventing future runs but allowing current runs to complete.
+
+    $ aurora cron deschedule devcluster/www-data/test/cron_hello_world
+
+### cron start
+Start a cron job immediately, outside of its normal cron schedule.
+
+    $ aurora cron start devcluster/www-data/test/cron_hello_world
+
+### job killall, job restart, job kill
+Cron jobs create instances running on the cluster that you can interact with like normal Aurora
+tasks with `job kill` and `job restart`.
+
+
+## Technical Note About Syntax
+
+`cron_schedule` uses a restricted subset of FreeBSD
+[crontab(5)](http://www.freebsd.org/cgi/man.cgi?crontab(5)) syntax. While the
+execution engine currently uses Quartz, the schedule parsing is custom. See
+[the source](https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/cron/CrontabEntry.java#L106-L124)
+for details.
+
+
+## Caveats
+
+### Failovers
+Aurora has no failover recovery for cron triggers: it does not record the latest minute for which
+it fired triggers across failovers. Therefore it's possible to miss triggers
+on failover. Note that this behavior may change in the future.
+
+It's necessary to sync time between schedulers with something like `ntpd`.
+Clock skew could cause double or missed triggers in the case of a failover.
+
+### Collision policy is best-effort
+Aurora aims to always have *at least one copy* of a given instance running at a time - it's
+an AP system, meaning it chooses Availability and Partition Tolerance at the expense of
+Consistency.
+
+If your collision policy is `CANCEL_NEW` and a task has terminated but
+Aurora has not yet noticed this, Aurora will go ahead and create your new
+task.
+
+If your collision policy is `KILL_EXISTING` and a task was marked `LOST`
+but not yet GCed, Aurora will go ahead and create your new task without
+attempting to kill the old one (outside the GC interval).
+
+### Timezone Configuration
+Cron timezone is configured independently of JVM timezone with the `-cron_timezone` flag and
+defaults to UTC.

Added: aurora/site/source/documentation/0.13.0/features/job-updates.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/features/job-updates.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/features/job-updates.md (added)
+++ aurora/site/source/documentation/0.13.0/features/job-updates.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,111 @@
+Aurora Job Updates
+==================
+
+`Job` configurations can be updated at any point in their lifecycle.
+Usually updates are done incrementally using a process called a *rolling
+upgrade*, in which Tasks are upgraded in small groups, one group at a
+time.  Updates are done using various Aurora Client commands.
+
+
+Rolling Job Updates
+-------------------
+
+There are several sub-commands to manage job updates:
+
+    aurora update start <job key> <configuration file>
+    aurora update info <job key>
+    aurora update pause <job key>
+    aurora update resume <job key>
+    aurora update abort <job key>
+    aurora update list <cluster>
+
+When you `start` a job update, the command will return once it has sent the
+instructions to the scheduler.  At that point, you may view detailed
+progress for the update with the `info` subcommand, in addition to viewing
+graphical progress in the web browser.  You may also get a full listing of
+in-progress updates in a cluster with `list`.
+
+Once an update has been started, you can `pause` it to keep the update but halt
+progress.  This can be useful for things like debugging a partially-updated
+job to determine whether you would like to proceed.  You can `resume` to
+proceed.
+
+You may `abort` a job update regardless of the state it is in. This will
+instruct the scheduler to completely abandon the job update and leave the job
+in the current (possibly partially-updated) state.
+
+For a configuration update, the Aurora Client calculates required changes
+by examining the current job config state and the new desired job config.
+It then starts a *rolling batched update process* by going through every batch
+and performing these operations:
+
+- If an instance is present in the scheduler but isn't in the new config,
+  then that instance is killed.
+- If an instance is not present in the scheduler but is present in
+  the new config, then the instance is created.
+- If an instance is present in both the scheduler and the new config, then
+  the client diffs both task configs. If it detects any changes, it
+  performs an instance update by killing the old config instance and adds
+  the new config instance.
+
+The Aurora client continues through the instance list until all tasks are
+updated, in `RUNNING`, and healthy for a configurable amount of time.
+If the client determines the update is not going well (a percentage of health
+checks have failed), it cancels the update.
+
+Update cancellation runs a procedure similar to the update sequence described
+above, but in reverse order. New instance configs are swapped
+with old instance configs and batch updates proceed backwards
+from the point where the update failed. For example, (0,1,2) (3,4,5) (6,7,
+8-FAIL) results in a rollback in order (8,7,6) (5,4,3) (2,1,0).
+
+For details on how to control a job update, please see the
+[UpdateConfig](../reference/configuration.md#updateconfig-objects) configuration object.
+
+
+Coordinated Job Updates
+------------------------
+
+Some Aurora services may benefit from having more control over updates by explicitly
+acknowledging ("heartbeating") job update progress. This may be helpful for mission-critical
+service updates where explicit job health monitoring is vital during the entire job update
+lifecycle. Such job updates would rely on an external service (or a custom client) periodically
+pulsing an active coordinated job update via a
+[pulseJobUpdate RPC](../../api/src/main/thrift/org/apache/aurora/gen/api.thrift).
+
+A coordinated update is defined by setting a positive
+[pulse_interval_secs](../reference/configuration.md#updateconfig-objects) value in the job configuration
+file. If no pulses are received within the specified interval, the update will be blocked. A blocked
+update is unable to continue rolling forward (or rolling back) but retains its active status.
+It may only be unblocked by a fresh `pulseJobUpdate` call.
+
+NOTE: A coordinated update starts in `ROLL_FORWARD_AWAITING_PULSE` state and will not make any
+progress until the first pulse arrives. However, a paused update (`ROLL_FORWARD_PAUSED` or
+`ROLL_BACK_PAUSED`) is still considered active and upon resuming will immediately make progress
+provided the pulse interval has not expired.
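+
+A minimal sketch of such a configuration follows; the job name, role, and the interval of 60
+seconds are illustrative assumptions:
+
+    Service(
+      name = 'webservice',
+      role = 'www-data',
+      update_config = UpdateConfig(
+        # A positive value marks the update as coordinated: it blocks unless
+        # a pulse arrives within this many seconds.
+        pulse_interval_secs = 60,
+      ),
+      ...
+    )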
+
+
+Canary Deployments
+------------------
+
+Canary deployments are a pattern for rolling out updates to a subset of job instances,
+in order to test different code versions alongside the actual production job.
+It is a risk-mitigation strategy for job owners and commonly used in a form where
+job instance 0 runs with a different configuration than the instances 1-N.
+
+For example, consider a job with 4 instances that each
+request 1 core of cpu, 1 GB of RAM, and 1 GB of disk space as specified
+in the configuration file `hello_world.aurora`. If you want to
+update it so it requests 2 GB of RAM instead of 1, you can create a new
+configuration file called `new_hello_world.aurora` and
+issue
+
+    aurora update start <job_key_value>/0-1 new_hello_world.aurora
+
+This results in instances 0 and 1 having 1 cpu, 2 GB of RAM, and 1 GB of disk space,
+while instances 2 and 3 have 1 cpu, 1 GB of RAM, and 1 GB of disk space. If instance 3
+dies and restarts, it restarts with 1 cpu, 1 GB RAM, and 1 GB disk space.
+
+This means there are two simultaneous task configurations for the same job,
+each valid for a different range of instances. While this isn't a recommended
+pattern, it is valid and supported by the Aurora scheduler.

Added: aurora/site/source/documentation/0.13.0/features/multitenancy.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/features/multitenancy.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/features/multitenancy.md (added)
+++ aurora/site/source/documentation/0.13.0/features/multitenancy.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,62 @@
+Multitenancy
+============
+
+Aurora is a multi-tenant system that can run jobs of multiple clients/tenants.
+Going beyond the [resource isolation on an individual host](resource-isolation.md), it is
+crucial to prevent those jobs from stepping on each other's toes.
+
+
+Job Namespaces
+--------------
+
+The namespace for jobs in Aurora follows a hierarchical structure. This is meant to make it easier
+to differentiate between different jobs. A job key consists of four parts. The four parts are
+`<cluster>/<role>/<environment>/<jobname>` in that order:
+
+* Cluster refers to the name of a particular Aurora installation.
+* Role names are user accounts.
+* Environment names are namespaces.
+* Jobname is the custom name of your job.
+
+Role names correspond to user accounts. They are used for
+[authentication](../operations/security.md), as the linux user used to run jobs, and for the
+assignment of [quota](#preemption). If you don't know what accounts are available, contact your
+sysadmin.
+
+The environment component in the job key serves as a namespace. The values for
+environment are validated in the client and the scheduler so as to allow any of `devel`, `test`,
+`production`, and any value matching the regular expression `staging[0-9]*`.
+
+None of the values imply any difference in the scheduling behavior. Conventionally, the
+"environment" is set so as to indicate a certain level of stability in the behavior of the job
+by ensuring that an appropriate level of testing has been performed on the application code. For
+example, in the case of a typical Job, releases may progress through the following phases in order of
+increasing level of stability: `devel`, `test`, `staging`, `production`.
+
+
+Preemption
+----------
+
+In order to guarantee that important production jobs are always running, Aurora supports
+preemption.
+
+Consider a pending job that is a candidate for scheduling but cannot be placed because of a resource
+shortage. Active tasks can become the victim of preemption, if:
+
+ - both candidate and victim are owned by the same role and the
+   [priority](../reference/configuration.md#job-objects) of a victim is lower than the
+   [priority](../reference/configuration.md#job-objects) of the candidate.
+ - OR a victim is non-[production](../reference/configuration.md#job-objects) and the candidate is
+   [production](../reference/configuration.md#job-objects).
+
+In other words, tasks from [production](../reference/configuration.md#job-objects) jobs may preempt
+tasks from any non-production job. However, a production task may only be preempted by tasks from
+production jobs in the same role with higher [priority](../reference/configuration.md#job-objects).
+
+Aurora requires resource quotas for [production non-dedicated jobs](../reference/configuration.md#job-objects).
+Quota is enforced at the job role level and, when set, defines a non-preemptible pool of compute resources within
+that role. All job types (service, adhoc or cron) require role resource quota unless the job has a
+[dedicated constraint set](constraints.md#dedicated-attribute).
+
+To grant quota to a particular role in production, an operator can use the command
+`aurora_admin set_quota`.
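
The preemption rules in the Preemption section above can be read as a small predicate. The following is an illustrative Python sketch, not the scheduler's implementation: `TaskInfo` and `may_preempt` are hypothetical names, and the guard that stops a non-production candidate from preempting a production victim is an assumption taken from the "In other words" summary rather than from the two bullet points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskInfo:
    """Hypothetical, minimal view of a task for preemption purposes."""
    role: str
    priority: int
    production: bool


def may_preempt(candidate: TaskInfo, victim: TaskInfo) -> bool:
    """Return True if the pending candidate may preempt the active victim."""
    # Second rule: a production candidate may preempt any non-production victim.
    if candidate.production and not victim.production:
        return True
    # Assumption (from the summary sentence): a non-production candidate
    # never preempts a production victim.
    if victim.production and not candidate.production:
        return False
    # First rule: same role, and the victim has strictly lower priority.
    return candidate.role == victim.role and victim.priority < candidate.priority
```

For instance, a production task in role `www-data` with priority 10 may preempt a production task in the same role with priority 1, but not a production task owned by a different role.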

Added: aurora/site/source/documentation/0.13.0/features/resource-isolation.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/features/resource-isolation.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/features/resource-isolation.md (added)
+++ aurora/site/source/documentation/0.13.0/features/resource-isolation.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,167 @@
+Resource Isolation and Sizing
+=============================
+
+- [Isolation](#isolation)
+- [Sizing](#sizing)
+- [Oversubscription](#oversubscription)
+
+
+Isolation
+---------
+
+Aurora is a multi-tenant system; a single software instance runs on a
+server, serving multiple clients/tenants. To share resources among
+tenants, it implements isolation of:
+
+* CPU
+* memory
+* disk space
+
+CPU is a soft limit, and handled differently from memory and disk space.
+Too low a CPU value results in throttling your application and
+slowing it down. Memory and disk space are both hard limits; when your
+application goes over these values, it's killed.
+
+### CPU Isolation
+
+Mesos uses a quota-based CPU scheduler (the *Completely Fair Scheduler*)
+to provide consistent and predictable performance.  This is effectively
+a guarantee of resources -- you receive at least what you requested, but
+also no more than you've requested.
+
+The scheduler gives applications a CPU quota for every 100 ms interval.
+When an application uses its quota for an interval, it is throttled for
+the rest of the 100 ms. Usage resets for each interval and unused
+quota does not carry over.
+
+For example, an application specifying 4.0 CPU has access to 400 ms of
+CPU time every 100 ms. This CPU quota can be used in different ways,
+depending on the application and available resources. Consider the
+scenarios shown in this diagram.
+
+![CPU Availability](../images/CPUavailability.png)
+
+* *Scenario A*: the application can use up to 4 cores continuously for
+every 100 ms interval. It is never throttled and starts processing
+new requests immediately.
+
+* *Scenario B*: the application uses up to 8 cores (depending on
+availability) but is throttled after 50 ms. The CPU quota resets at the
+start of each new 100 ms interval.
+
+* *Scenario C*: like Scenario A, but a garbage collection event in the
+second interval consumes all CPU quota. The application is throttled for
+the remaining 75 ms of that interval and cannot service requests until
+the next interval. In this example, the garbage collection finished in
+one interval but, depending on how much garbage needs collecting, it may
+take more than one interval and further delay service of requests.
+
+*Technical Note*: Mesos considers logical cores, also known as
+hyperthreading or SMT cores, as the unit of CPU.
+
+### Memory Isolation
+
+Mesos uses dedicated memory allocation. Your application always has
+access to the amount of memory specified in your configuration. The
+application's memory use is defined as the sum of the resident set size
+(RSS) of all processes in a shard. Each shard is considered
+independently.
+
+In other words, say you specified a memory size of 10GB. Each shard
+would receive 10GB of memory. If an individual shard's memory demands
+exceed 10GB, that shard is killed, but the other shards continue
+working.
+
+*Technical note*: Total memory size is not enforced at allocation time,
+so your application can request more than its allocation without getting
+an ENOMEM. However, it will be killed shortly after.
+
+### Disk Space
+
+Disk space used by your application is defined as the sum of the files'
+disk space in your application's directory, including the `stdout` and
+`stderr` logged from your application. Each shard is considered
+independently. You should use off-node storage for your application's
+data whenever possible.
+
+In other words, say you specified disk space size of 100MB. Each shard
+would receive 100MB of disk space. If an individual shard's disk space
+demands exceed 100MB, that shard is killed, but the other shards
+continue working.
+
+After your application finishes running, its allocated disk space is
+reclaimed. Thus, your job's final action should move any disk content
+that you want to keep, such as logs, to your home file system or other
+less transitory storage. Disk reclamation takes place an undefined
+period after the application finish time; until then, the disk contents
+are still available but you shouldn't count on them being so.
+
+*Technical note*: Disk space is not enforced at write time, so your
+application can write above its quota without getting an ENOSPC, but it
+will be killed shortly after. This is subject to change.
+
+### Other Resources
+
+Other resources, such as network bandwidth, do not have any performance
+guarantees. For some resources, such as memory bandwidth, there are no
+practical sharing methods so some application combinations collocated on
+the same host may cause contention.
+
+
+Sizing
+-------
+
+### CPU Sizing
+
+To correctly size Aurora-run Mesos tasks, specify a per-shard CPU value
+that lets the task run at its desired performance when at peak load
+distributed across all shards. Include reserve capacity of at least 50%,
+possibly more, depending on how critical your service is (or how
+confident you are about your original estimate :-)), ideally by
+increasing the number of shards to also improve resiliency. When running
+your application, observe its CPU stats over time. If consistently at or
+near your quota during peak load, you should consider increasing either
+per-shard CPU or the number of shards.
+
+### Memory Sizing
+
+Size for your application's peak requirement. Observe the per-instance
+memory statistics over time, as memory requirements can vary over
+different periods. Remember that if your application exceeds its memory
+value, it will be killed, so you should also add a safety margin of
+around 10-20%. If you have the ability to do so, you may also want to
+put alerts on the per-instance memory.
+
+### Disk Space Sizing
+
+Size for your application's peak requirement. Rotate and discard log
+files as needed to stay within your quota. When running a Java process,
+add the maximum size of the Java heap to your disk space requirement, in
+order to account for an out of memory error dumping the heap
+into the application's sandbox space.
+
+
+Oversubscription
+----------------
+
+**WARNING**: This feature is currently in alpha status. Do not use it in production clusters!
+
+Mesos [supports a concept of revocable tasks](http://mesos.apache.org/documentation/latest/oversubscription/)
+by oversubscribing machine resources by an amount deemed safe enough not to affect existing
+non-revocable tasks. Aurora supports revocable jobs via the `tier` setting, set to the value
+`revocable`.
+
+The Aurora scheduler must be configured to receive revocable offers from Mesos and accept revocable
+jobs. If it is not configured properly, revocable tasks will never be assigned to hosts and will stay
+in `PENDING`. Set this scheduler flag to allow receiving revocable Mesos offers:
+
+    -receive_revocable_resources=true
+
+Specify a tier configuration file path (unless you want to use the [default](../../src/main/resources/org/apache/aurora/scheduler/tiers.json)):
+
+    -tier_config=path/to/tiers/config.json
+
+
+See the [Configuration Reference](../reference/configuration.md) for details on how to mark a job
+as being revocable.
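
As a rough illustration of the CPU quota model described under CPU Isolation above, the arithmetic can be sketched in Python. The function names are hypothetical, and the model is a simplification that assumes the application keeps a fixed number of cores busy from the start of each 100 ms interval; it is not how Mesos or the kernel computes throttling.

```python
PERIOD_MS = 100.0  # length of one scheduling interval, as stated in the text


def quota_ms(cpus: float, period_ms: float = PERIOD_MS) -> float:
    """CPU time (in ms) available to the application in one interval."""
    return cpus * period_ms


def throttled_ms(cpus: float, cores_in_use: float,
                 period_ms: float = PERIOD_MS) -> float:
    """Time (in ms) spent throttled in one interval when `cores_in_use`
    cores are kept busy from the start of the interval (simplified model)."""
    if cores_in_use <= 0:
        return 0.0
    # Wall-clock time until the interval's quota is used up.
    run_ms = quota_ms(cpus, period_ms) / cores_in_use
    return max(0.0, period_ms - run_ms)
```

With 4.0 CPU, `quota_ms(4.0)` is 400 ms per interval; `throttled_ms(4.0, 4)` is 0 (Scenario A) and `throttled_ms(4.0, 8)` is 50 (Scenario B).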

Added: aurora/site/source/documentation/0.13.0/features/service-discovery.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/features/service-discovery.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/features/service-discovery.md (added)
+++ aurora/site/source/documentation/0.13.0/features/service-discovery.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,44 @@
+Service Discovery
+=================
+
+It is possible for the Aurora executor to announce tasks into ServerSets for
+the purpose of service discovery.  ServerSets use the Zookeeper [group membership pattern](http://zookeeper.apache.org/doc/trunk/recipes.html#sc_outOfTheBox)
+of which there are several reference implementations:
+
+  - [C++](https://github.com/apache/mesos/blob/master/src/zookeeper/group.cpp)
+  - [Java](https://github.com/twitter/commons/blob/master/src/java/com/twitter/common/zookeeper/ServerSetImpl.java#L221)
+  - [Python](https://github.com/twitter/commons/blob/master/src/python/twitter/common/zookeeper/serverset/serverset.py#L51)
+
+These can also be used natively in Finagle using the [ZookeeperServerSetCluster](https://github.com/twitter/finagle/blob/master/finagle-serversets/src/main/scala/com/twitter/finagle/zookeeper/ZookeeperServerSetCluster.scala).
+
+For more information about how to configure announcing, see the [Configuration Reference](../reference/configuration.md).
+
+Using Mesos DiscoveryInfo
+-------------------------
+Aurora has experimental support for populating DiscoveryInfo in Mesos. This can be used to build
+a custom service discovery system that does not rely on ZooKeeper. See the `Service Discovery` section of the
+[Mesos Framework Development guide](http://mesos.apache.org/documentation/latest/app-framework-development-guide/) for
+an explanation of the protobuf message in Mesos.
+
+To use this feature, enable the `--populate_discovery_info` flag on the scheduler. All jobs started by the scheduler
+afterwards will have their portmap populated to Mesos and be discoverable in the `/state` endpoint of the Mesos master and agent.
+
+### Using Mesos DNS
+One example is [Mesos-DNS](https://github.com/mesosphere/mesos-dns), which can generate multiple DNS
+records. With the current implementation, the example job with key `devcluster/vagrant/test/http-example` generates at
+least the following:
+
+1. An A record for `http_example.test.vagrant.twitterscheduler.mesos` (which only includes IP address);
+2. A [SRV record](https://en.wikipedia.org/wiki/SRV_record) for
+ `_http_example.test.vagrant._tcp.twitterscheduler.mesos`, which includes IP address and every port. This should only
+  be used if the service has one port.
+3. A SRV record `_{port-name}._http_example.test.vagrant._tcp.twitterscheduler.mesos` for each port name
+  defined. This should be used when the service has multiple ports.
+
+Things to note:
+
+1. The domain part (".mesos" in above example) can be configured in [Mesos DNS](http://mesosphere.github.io/mesos-dns/docs/configuration-parameters.html);
+2. The `twitterscheduler` part is the lower-cased framework name, which is not configurable right now (see
+   [TWITTER_SCHEDULER_NAME](https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java#L98));
+3. Right now, the portmap and port aliases in the announcer object are not reflected in DiscoveryInfo and are therefore not visible in
+   Mesos DNS records either. This is because they are only resolved in Thermos executors.
\ No newline at end of file
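
The record names listed under "Using Mesos DNS" follow a visible pattern, which the Python sketch below reproduces. The helper `mesos_dns_names` is hypothetical and is inferred only from the single documented example, including the apparent mapping of `-` to `_` in the job name; it is not Mesos-DNS's actual naming algorithm.

```python
def mesos_dns_names(job_key, port_names=(),
                    framework="twitterscheduler", domain="mesos"):
    """Build the record names shown in the example for a job key of the
    form cluster/role/environment/name (illustrative only)."""
    _cluster, role, environment, name = job_key.split("/")
    # Assumption: '-' in the job name becomes '_', as in the documented example.
    task = "{}.{}.{}".format(name.replace("-", "_"), environment, role)
    suffix = "{}.{}".format(framework, domain)
    a_record = "{}.{}".format(task, suffix)
    srv_record = "_{}._tcp.{}".format(task, suffix)
    # One additional SRV name per named port.
    port_records = ["_{}.{}".format(port, srv_record) for port in port_names]
    return a_record, srv_record, port_records
```

Calling `mesos_dns_names("devcluster/vagrant/test/http-example", ["http"])` yields the A and SRV names from the list above, plus one per-port SRV name for the `http` port.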

Added: aurora/site/source/documentation/0.13.0/features/services.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/features/services.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/features/services.md (added)
+++ aurora/site/source/documentation/0.13.0/features/services.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,99 @@
+Long-running Services
+=====================
+
+Jobs that always restart on completion, whether successful or unsuccessful,
+are called services. This is useful for long-running processes
+such as web services that should always be running unless stopped explicitly.
+
+
+Service Specification
+---------------------
+
+A job is identified as a service by the presence of the flag
+`service=True` in the [`Job`](../reference/configuration.md#job-objects) object.
+The `Service` alias can be used as shorthand for `Job` with `service=True`.
+
+Example (available in the [Vagrant environment](../getting-started/vagrant.md)):
+
+    $ cat /vagrant/examples/jobs/hello_world.aurora
+    hello = Process(
+      name = 'hello',
+      cmdline = """
+        while true; do
+          echo hello world
+          sleep 10
+        done
+      """)
+
+    task = SequentialTask(
+      processes = [hello],
+      resources = Resources(cpu = 1.0, ram = 128*MB, disk = 128*MB)
+    )
+
+    jobs = [
+      Service(
+        task = task,
+        cluster = 'devcluster',
+        role = 'www-data',
+        environment = 'prod',
+        name = 'hello'
+      )
+    ]
+
+
+Jobs without the service bit set only restart up to `max_task_failures` times and only if they
+terminated unsuccessfully either due to human error or machine failure (see the
+[`Job`](../reference/configuration.md#job-objects) object for details).
+
+
+Ports
+-----
+
+In order to be useful, most services have to bind to one or more ports. Aurora enables this
+use case via the [`thermos.ports` namespace](../reference/configuration.md#thermos-namespace), which
+lets a process request arbitrarily named ports:
+
+
+    nginx = Process(
+      name = 'nginx',
+      cmdline = './run_nginx.sh -port {{thermos.ports[http]}}'
+    )
+
+
+When this process is included in a job, the job will be allocated a port, and the command line
+will be replaced with something like:
+
+    ./run_nginx.sh -port 42816
+
+Where 42816 happens to be the allocated port.
+
+For details on how to enable clients to discover this dynamically assigned port, see our
+[Service Discovery](service-discovery.md) documentation.
+
+
+Health Checking
+---------------
+
+Typically, the Thermos executor monitors processes within a task only by liveness of the forked
+process. In addition to that, Aurora has support for rudimentary health checking, either via HTTP
+or via custom shell scripts.
+
+For example, simply by requesting a `health` port, a process can request to be health checked
+via repeated calls to the `/health` endpoint:
+
+    nginx = Process(
+      name = 'nginx',
+      cmdline = './run_nginx.sh -port {{thermos.ports[health]}}'
+    )
+
+Please see the
+[configuration reference](../reference/configuration.md#user-content-healthcheckconfig-objects)
+for configuration options for this feature.
+
+You can pause health checking by touching a file inside of your sandbox, named `.healthchecksnooze`.
+As long as that file is present, health checks will be disabled, enabling users to gather core
+dumps or other performance measurements without worrying about Aurora's health check killing
+their process.
+
+WARNING: Remember to remove this file when you are done; otherwise your instance's health checks
+will remain disabled permanently.
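
One way to avoid leaving `.healthchecksnooze` behind is to create and remove it in the same scope. The Python sketch below assumes you have a script running with write access to the sandbox directory; `snoozed_health_checks` is a hypothetical helper, not part of Aurora or Thermos.

```python
import contextlib
import os


@contextlib.contextmanager
def snoozed_health_checks(sandbox_dir):
    """Create .healthchecksnooze in the sandbox and remove it on exit."""
    snooze_path = os.path.join(sandbox_dir, ".healthchecksnooze")
    with open(snooze_path, "a"):
        pass  # equivalent of `touch`
    try:
        yield snooze_path
    finally:
        try:
            os.remove(snooze_path)
        except FileNotFoundError:
            pass  # someone else already removed it
```

Diagnostic work such as gathering a core dump would go inside a `with snoozed_health_checks(sandbox_dir):` block, so the file is removed even if that work raises an exception.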

Added: aurora/site/source/documentation/0.13.0/features/sla-metrics.md
URL: http://svn.apache.org/viewvc/aurora/site/source/documentation/0.13.0/features/sla-metrics.md?rev=1739360&view=auto
==============================================================================
--- aurora/site/source/documentation/0.13.0/features/sla-metrics.md (added)
+++ aurora/site/source/documentation/0.13.0/features/sla-metrics.md Fri Apr 15 20:21:30 2016
@@ -0,0 +1,178 @@
+Aurora SLA Measurement
+======================
+
+- [Overview](#overview)
+- [Metric Details](#metric-details)
+  - [Platform Uptime](#platform-uptime)
+  - [Job Uptime](#job-uptime)
+  - [Median Time To Assigned (MTTA)](#median-time-to-assigned-\(mtta\))
+  - [Median Time To Running (MTTR)](#median-time-to-running-\(mttr\))
+- [Limitations](#limitations)
+
+## Overview
+
+The primary goal of the feature is the collection and monitoring of Aurora job SLA (Service Level
+Agreement) metrics that define a contractual relationship between the Aurora/Mesos platform
+and hosted services.
+
+The Aurora SLA feature is by default only enabled for service (non-cron)
+production jobs (`"production=True"` in your `.aurora` config). It can be enabled for
+non-production services by an operator via the scheduler command line flag `-sla_non_prod_metrics`.
+
+Counters that track SLA measurements are computed periodically within the scheduler.
+The individual instance metrics are refreshed every minute (configurable via
+`sla_stat_refresh_interval`). The instance counters are subsequently aggregated by
+relevant grouping types before being exported to the scheduler's `/vars` endpoint (when using
+`vagrant`, that would be `http://192.168.33.7:8081/vars`).
+
+
+## Metric Details
+
+### Platform Uptime
+
+*Aggregate amount of time a job spends in a non-runnable state due to platform unavailability
+or scheduling delays. This metric tracks Aurora/Mesos uptime performance and reflects on any
+system-caused downtime events (tasks LOST or DRAINED). Any user-initiated task kills/restarts
+will not degrade this metric.*
+
+**Collection scope:**
+
+* Per job - `sla_<job_key>_platform_uptime_percent`
+* Per cluster - `sla_cluster_platform_uptime_percent`
+
+**Units:** percent
+
+A fault in the task environment may cause Aurora and Mesos to have different views of the task state
+or to lose track of the task's existence. In such cases, the service task is marked as LOST and
+rescheduled by Aurora. For example, this may happen when the task stays in ASSIGNED or STARTING
+for too long or the Mesos slave becomes unhealthy (or disappears completely). The time between
+task entering LOST and its replacement reaching RUNNING state is counted towards platform downtime.
+
+Another example of a platform downtime event is the administrator-requested task rescheduling. This
+happens during planned Mesos slave maintenance when all slave tasks are marked as DRAINED and
+rescheduled elsewhere.
+
+To accurately calculate Platform Uptime, we must separate platform incurred downtime from user
+actions that put a service instance in a non-operational state. It is simpler to isolate
+user-incurred downtime and treat all other downtime as platform incurred.
+
+Currently, a user can cause a healthy service (task) downtime in only two ways: via `killTasks`
+or `restartShards` RPCs. For both, their affected tasks leave an audit state transition trail
+relevant to uptime calculations. By applying a special "SLA meaning" to exposed task state
+transition records, we can build a deterministic downtime trace for every given service instance.
+
+A task going through a state transition carries one of three possible SLA meanings
+(see [SlaAlgorithm.java](../../src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java) for
+sla-to-task-state mapping):
+
+* Task is UP: starts a period where the task is considered to be up and running from the Aurora
+  platform standpoint.
+
+* Task is DOWN: starts a period where the task cannot reach the UP state for some
+  non-user-related reason. Counts towards instance downtime.
+
+* Task is REMOVED from SLA: starts a period where the task is not expected to be UP due to
+  user initiated action or failure. We ignore this period for the uptime calculation purposes.
+
+This metric is recalculated over the last sampling period (last minute) to account for
+any UP/DOWN/REMOVED events. It ignores any UP/DOWN events not immediately adjacent to the
+sampling interval as well as adjacent REMOVED events.
+
+### Job Uptime
+
+*Percentage of the job instances considered to be in RUNNING state for the specified duration
+relative to request time. This is a purely application-side metric that considers the aggregate
+uptime of all RUNNING instances. Any user- or platform-initiated restarts directly affect
+this metric.*
+
+**Collection scope:** We currently expose job uptime values at 5 pre-defined
+percentiles (50th, 75th, 90th, 95th and 99th):
+
+* `sla_<job_key>_job_uptime_50_00_sec`
+* `sla_<job_key>_job_uptime_75_00_sec`
+* `sla_<job_key>_job_uptime_90_00_sec`
+* `sla_<job_key>_job_uptime_95_00_sec`
+* `sla_<job_key>_job_uptime_99_00_sec`
+
+**Units:** seconds
+
+You can also get customized real-time stats from the Aurora client. See `aurora sla -h` for
+more details.
+
+### Median Time To Assigned (MTTA)
+
+*Median time a job spends waiting for its tasks to be assigned to a host. This is a combined
+metric that helps track the dependency of scheduling performance on the requested resources
+(user scope) as well as the internal scheduler bin-packing algorithm efficiency (platform scope).*
+
+**Collection scope:**
+
+* Per job - `sla_<job_key>_mtta_ms`
+* Per cluster - `sla_cluster_mtta_ms`
+* Per instance size (small, medium, large, x-large, xx-large). Sizes are defined in:
+[ResourceAggregates.java](../../src/main/java/org/apache/aurora/scheduler/base/ResourceAggregates.java)
+  * By CPU:
+    * `sla_cpu_small_mtta_ms`
+    * `sla_cpu_medium_mtta_ms`
+    * `sla_cpu_large_mtta_ms`
+    * `sla_cpu_xlarge_mtta_ms`
+    * `sla_cpu_xxlarge_mtta_ms`
+  * By RAM:
+    * `sla_ram_small_mtta_ms`
+    * `sla_ram_medium_mtta_ms`
+    * `sla_ram_large_mtta_ms`
+    * `sla_ram_xlarge_mtta_ms`
+    * `sla_ram_xxlarge_mtta_ms`
+  * By DISK:
+    * `sla_disk_small_mtta_ms`
+    * `sla_disk_medium_mtta_ms`
+    * `sla_disk_large_mtta_ms`
+    * `sla_disk_xlarge_mtta_ms`
+    * `sla_disk_xxlarge_mtta_ms`
+
+**Units:** milliseconds
+
+MTTA only considers instances that have already reached ASSIGNED state and ignores those
+that are still PENDING. This ensures straggler instances (e.g. with unreasonable resource
+constraints) do not affect metric curves.
+
+### Median Time To Running (MTTR)
+
+*Median time a job waits for its tasks to reach RUNNING state. This is a comprehensive metric
+reflecting the overall time it takes for Aurora/Mesos to start executing user content.*
+
+**Collection scope:**
+
+* Per job - `sla_<job_key>_mttr_ms`
+* Per cluster - `sla_cluster_mttr_ms`
+* Per instance size (small, medium, large, x-large, xx-large). Sizes are defined in:
+[ResourceAggregates.java](../../src/main/java/org/apache/aurora/scheduler/base/ResourceAggregates.java)
+  * By CPU:
+    * `sla_cpu_small_mttr_ms`
+    * `sla_cpu_medium_mttr_ms`
+    * `sla_cpu_large_mttr_ms`
+    * `sla_cpu_xlarge_mttr_ms`
+    * `sla_cpu_xxlarge_mttr_ms`
+  * By RAM:
+    * `sla_ram_small_mttr_ms`
+    * `sla_ram_medium_mttr_ms`
+    * `sla_ram_large_mttr_ms`
+    * `sla_ram_xlarge_mttr_ms`
+    * `sla_ram_xxlarge_mttr_ms`
+  * By DISK:
+    * `sla_disk_small_mttr_ms`
+    * `sla_disk_medium_mttr_ms`
+    * `sla_disk_large_mttr_ms`
+    * `sla_disk_xlarge_mttr_ms`
+    * `sla_disk_xxlarge_mttr_ms`
+
+**Units:** milliseconds
+
+MTTR only considers instances in RUNNING state. This ensures straggler instances (e.g. with
+unreasonable resource constraints) do not affect metric curves.
+
+## Limitations
+
+* The availability of Aurora SLA metrics is bound by the scheduler availability.
+
+* All metrics are calculated at a pre-defined interval (currently set at 1 minute).
+  Scheduler restarts may result in missed collections.
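
To make the three SLA meanings from the Platform Uptime section concrete, the Python sketch below computes an uptime percentage from a list of labelled periods. It is a simplified illustration under stated assumptions, not the algorithm in `SlaAlgorithm.java`: `platform_uptime_percent` is a hypothetical function, the input format is invented for the example, and returning 100 when nothing is counted is an assumption.

```python
def platform_uptime_percent(periods):
    """Compute uptime percent from (meaning, seconds) pairs, where meaning
    is one of "UP", "DOWN" or "REMOVED"."""
    up = sum(seconds for meaning, seconds in periods if meaning == "UP")
    down = sum(seconds for meaning, seconds in periods if meaning == "DOWN")
    counted = up + down  # REMOVED periods are ignored entirely
    if counted == 0:
        # Assumption: with nothing to count, report no platform downtime.
        return 100.0
    return 100.0 * up / counted
```

A sampling window with 45 seconds UP and 15 seconds DOWN gives 75 percent, and adding any amount of REMOVED time (for example, a user-initiated restart) leaves that figure unchanged.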