Posted to commits@mesos.apache.org by dl...@apache.org on 2015/07/01 20:55:33 UTC
svn commit: r1688707 [7/7] - in /mesos/site: publish/ publish/documentation/ publish/documentation/allocation-module/ publish/documentation/app-framework-development-guide/ publish/documentation/clang-format/ publish/documentation/configuration/ publis...

Added: mesos/site/source/documentation/latest/monitoring.md
URL: http://svn.apache.org/viewvc/mesos/site/source/documentation/latest/monitoring.md?rev=1688707&view=auto
==============================================================================
--- mesos/site/source/documentation/latest/monitoring.md (added)
+++ mesos/site/source/documentation/latest/monitoring.md Wed Jul  1 18:55:31 2015
@@ -0,0 +1,1057 @@
+---
+layout: documentation
+---
+
+
+# Mesos Observability Metrics
+
+This document describes the observability metrics provided by Mesos master and
+slave nodes. It also provides some initial guidance on which metrics
+you should monitor to detect abnormal situations in your cluster.
+
+
+## Overview
+
+Mesos master and slave nodes report a set of statistics and metrics that enable
+you to monitor resource usage and detect abnormal situations early. The
+information reported by Mesos includes details about available resources, used
+resources, registered frameworks, active slaves, and task state. You can use
+this information to create automated alerts and to plot different metrics over
+time inside a monitoring dashboard.
+
+
+## Metric Types
+
+Mesos provides two different kinds of metrics: counters and gauges.
+
+**Counters** keep track of discrete events and are monotonically increasing. The
+value of a metric of this type is always a natural number. Examples include the
+number of failed tasks and the number of slave registrations. For some metrics
+of this type, the rate of change is often more useful than the value itself.
+
+**Gauges** represent an instantaneous sample of some magnitude. Examples include
+the amount of used memory in the cluster and the number of connected slaves. For
+some metrics of this type, it is often useful to determine whether the value is
+above or below a threshold for a sustained period of time.
+
+The tables in this document indicate the type of each available metric.
+
+
+## Master Nodes
+
+Metrics from the master node are available at the following URL:
+
+    http://<mesos-master-ip>:5050/metrics/snapshot
+
+The response is a JSON object that contains metric names and values as
+key-value pairs.
+
+### Observability Metrics
+
+This section lists all available metrics from Mesos master nodes grouped by
+category.
+
+#### Resources
+
+The following metrics provide information about the total resources available in
+the cluster and their current usage. High resource usage for sustained periods
+of time may indicate that you need to add capacity to your cluster or that a
+framework is misbehaving.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>master/cpus_percent</code>
+  </td>
+  <td>Percentage of allocated CPUs</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/cpus_used</code>
+  </td>
+  <td>Number of allocated CPUs</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/cpus_total</code>
+  </td>
+  <td>Number of CPUs</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/disk_percent</code>
+  </td>
+  <td>Percentage of allocated disk space</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/disk_used</code>
+  </td>
+  <td>Allocated disk space in MB</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/disk_total</code>
+  </td>
+  <td>Disk space in MB</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/mem_percent</code>
+  </td>
+  <td>Percentage of allocated memory</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/mem_used</code>
+  </td>
+  <td>Allocated memory in MB</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/mem_total</code>
+  </td>
+  <td>Memory in MB</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Master
+
+The following metrics provide information about whether a master is currently
+elected and how long it has been running. A cluster with no elected master
+for sustained periods of time indicates a malfunctioning cluster. This
+points to either leadership election issues (so check the connection to
+ZooKeeper) or a flapping Master process. A low uptime value indicates that the
+master has restarted recently.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>master/elected</code>
+  </td>
+  <td>Whether this is the elected master</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/uptime_secs</code>
+  </td>
+  <td>Uptime in seconds</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### System
+
+The following metrics provide information about the resources available on this
+master node and their current usage. High resource usage in a master node for
+sustained periods of time may degrade the performance of the cluster.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>system/cpus_total</code>
+  </td>
+  <td>Number of CPUs available in this master node</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/load_15min</code>
+  </td>
+  <td>Load average for the past 15 minutes</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/load_5min</code>
+  </td>
+  <td>Load average for the past 5 minutes</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/load_1min</code>
+  </td>
+  <td>Load average for the past minute</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/mem_free_bytes</code>
+  </td>
+  <td>Free memory in bytes</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/mem_total_bytes</code>
+  </td>
+  <td>Total memory in bytes</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Slaves
+
+The following metrics provide information about slave events, slave counts, and
+slave states. A low number of active slaves may indicate that slaves are
+unhealthy or that they are not able to connect to the elected master.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>master/slave_registrations</code>
+  </td>
+  <td>Number of slaves that were able to cleanly re-join the cluster and
+      connect back to the master after the master is disconnected.</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/slave_removals</code>
+  </td>
+  <td>Number of slaves removed for various reasons, including maintenance</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/slave_reregistrations</code>
+  </td>
+  <td>Number of slave re-registrations</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/slave_shutdowns_scheduled</code>
+  </td>
+  <td>Number of slaves which have failed their health check and are scheduled
+      to be removed. Because of the slave removal rate limit, they are not
+      removed immediately, but <code>master/slave_shutdowns_completed</code>
+      will start increasing as they do get removed.</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/slave_shutdowns_cancelled</code>
+  </td>
+  <td>Number of cancelled slave shutdowns. This happens when the slave removal
+      rate limit allows for a slave to reconnect and send a <code>PONG</code>
+      to the master before being removed.</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/slave_shutdowns_completed</code>
+  </td>
+  <td>Number of slaves that failed their health check. These are slaves which
+      were not heard from despite the slave-removal rate limit, and have been
+      removed from the master's slave registry.</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/slaves_active</code>
+  </td>
+  <td>Number of active slaves</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/slaves_connected</code>
+  </td>
+  <td>Number of connected slaves</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/slaves_disconnected</code>
+  </td>
+  <td>Number of disconnected slaves</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/slaves_inactive</code>
+  </td>
+  <td>Number of inactive slaves</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Frameworks
+
+The following metrics provide information about the registered frameworks in the
+cluster. A lack of active or connected frameworks may indicate that a scheduler
+is not registered or that it is misbehaving.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>master/frameworks_active</code>
+  </td>
+  <td>Number of active frameworks</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/frameworks_connected</code>
+  </td>
+  <td>Number of connected frameworks</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/frameworks_disconnected</code>
+  </td>
+  <td>Number of disconnected frameworks</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/frameworks_inactive</code>
+  </td>
+  <td>Number of inactive frameworks</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/outstanding_offers</code>
+  </td>
+  <td>Number of outstanding resource offers</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Tasks
+
+The following metrics provide information about active and terminated tasks. A
+high rate of lost tasks may indicate that there is a problem with the cluster.
+The task states listed here match those of the task state machine.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>master/tasks_error</code>
+  </td>
+  <td>Number of tasks that were invalid</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/tasks_failed</code>
+  </td>
+  <td>Number of failed tasks</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/tasks_finished</code>
+  </td>
+  <td>Number of finished tasks</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/tasks_killed</code>
+  </td>
+  <td>Number of killed tasks</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/tasks_lost</code>
+  </td>
+  <td>Number of lost tasks</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/tasks_running</code>
+  </td>
+  <td>Number of running tasks</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/tasks_staging</code>
+  </td>
+  <td>Number of staging tasks</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/tasks_starting</code>
+  </td>
+  <td>Number of starting tasks</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Messages
+
+The following metrics provide information about messages between the master and
+the slaves and between the framework and the executors. A high rate of dropped
+messages may indicate that there is a problem with the network.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>master/invalid_framework_to_executor_messages</code>
+  </td>
+  <td>Number of invalid framework to executor messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/invalid_status_update_acknowledgements</code>
+  </td>
+  <td>Number of invalid status update acknowledgements</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/invalid_status_updates</code>
+  </td>
+  <td>Number of invalid status updates</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/dropped_messages</code>
+  </td>
+  <td>Number of dropped messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_authenticate</code>
+  </td>
+  <td>Number of authentication messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_deactivate_framework</code>
+  </td>
+  <td>Number of framework deactivation messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_exited_executor</code>
+  </td>
+  <td>Number of terminated executor messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_framework_to_executor</code>
+  </td>
+  <td>Number of messages from a framework to an executor</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_kill_task</code>
+  </td>
+  <td>Number of kill task messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_launch_tasks</code>
+  </td>
+  <td>Number of launch task messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_reconcile_tasks</code>
+  </td>
+  <td>Number of reconcile task messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_register_framework</code>
+  </td>
+  <td>Number of framework registration messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_register_slave</code>
+  </td>
+  <td>Number of slave registration messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_reregister_framework</code>
+  </td>
+  <td>Number of framework re-registration messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_reregister_slave</code>
+  </td>
+  <td>Number of slave re-registration messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_resource_request</code>
+  </td>
+  <td>Number of resource request messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_revive_offers</code>
+  </td>
+  <td>Number of offer revival messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_status_update</code>
+  </td>
+  <td>Number of status update messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_status_update_acknowledgement</code>
+  </td>
+  <td>Number of status update acknowledgement messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_unregister_framework</code>
+  </td>
+  <td>Number of framework unregistration messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/messages_unregister_slave</code>
+  </td>
+  <td>Number of slave unregistration messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/valid_framework_to_executor_messages</code>
+  </td>
+  <td>Number of valid framework to executor messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/valid_status_update_acknowledgements</code>
+  </td>
+  <td>Number of valid status update acknowledgement messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>master/valid_status_updates</code>
+  </td>
+  <td>Number of valid status update messages</td>
+  <td>Counter</td>
+</tr>
+</table>
+
+#### Event queue
+
+The following metrics provide information about different types of events in the
+event queue.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>master/event_queue_dispatches</code>
+  </td>
+  <td>Number of dispatches in the event queue</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/event_queue_http_requests</code>
+  </td>
+  <td>Number of HTTP requests in the event queue</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>master/event_queue_messages</code>
+  </td>
+  <td>Number of messages in the event queue</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Registrar
+
+The following metrics provide information about read and write latency to the
+slave registrar.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>registrar/state_fetch_ms</code>
+  </td>
+  <td>Registry read latency in ms</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>registrar/state_store_ms</code>
+  </td>
+  <td>Registry write latency in ms</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>registrar/state_store_ms/max</code>
+  </td>
+  <td>Maximum registry write latency in ms</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>registrar/state_store_ms/min</code>
+  </td>
+  <td>Minimum registry write latency in ms</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>registrar/state_store_ms/p50</code>
+  </td>
+  <td>Median registry write latency in ms</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>registrar/state_store_ms/p90</code>
+  </td>
+  <td>90th percentile registry write latency in ms</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>registrar/state_store_ms/p95</code>
+  </td>
+  <td>95th percentile registry write latency in ms</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>registrar/state_store_ms/p99</code>
+  </td>
+  <td>99th percentile registry write latency in ms</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>registrar/state_store_ms/p999</code>
+  </td>
+  <td>99.9th percentile registry write latency in ms</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>registrar/state_store_ms/p9999</code>
+  </td>
+  <td>99.99th percentile registry write latency in ms</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+
+### Basic Alerts
+
+This section lists some examples of basic alerts that you can use to detect
+abnormal situations in a cluster.
+
+#### master/uptime_secs is low
+
+The master has restarted.
+
+#### master/uptime_secs < 60 for sustained periods of time
+
+The cluster has a flapping master node.
+
+#### master/tasks_lost is increasing rapidly
+
+Tasks in the cluster are disappearing. Possible causes include hardware
+failures, bugs in one of the frameworks, or bugs in Mesos.
+
+#### master/slaves_active is low
+
+Slaves are having trouble connecting to the master.
+
+#### master/cpus_percent > 0.9 for sustained periods of time
+
+Cluster CPU utilization is close to capacity.
+
+#### master/mem_percent > 0.9 for sustained periods of time
+
+Cluster memory utilization is close to capacity.
+
+#### master/elected is 0 for sustained periods of time
+
+No master is currently elected.
+
+
+
+
+## Slave Nodes
+
+Metrics from each slave node are available at the following URL:
+
+    http://<mesos-slave>:5051/metrics/snapshot
+
+The response is a JSON object that contains metric names and values as
+key-value pairs.
+
+
+### Observability Metrics
+
+This section lists all available metrics from Mesos slave nodes grouped by
+category.
+
+#### Resources
+
+The following metrics provide information about the total resources available in
+the slave and their current usage.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>slave/cpus_percent</code>
+  </td>
+  <td>Percentage of allocated CPUs</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/cpus_used</code>
+  </td>
+  <td>Number of allocated CPUs</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/cpus_total</code>
+  </td>
+  <td>Number of CPUs</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/disk_percent</code>
+  </td>
+  <td>Percentage of allocated disk space</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/disk_used</code>
+  </td>
+  <td>Allocated disk space in MB</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/disk_total</code>
+  </td>
+  <td>Disk space in MB</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/mem_percent</code>
+  </td>
+  <td>Percentage of allocated memory</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/mem_used</code>
+  </td>
+  <td>Allocated memory in MB</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/mem_total</code>
+  </td>
+  <td>Memory in MB</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Slave
+
+The following metrics provide information about whether a slave is currently
+registered with a master and for how long it has been running.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>slave/registered</code>
+  </td>
+  <td>Whether this slave is registered with a master</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/uptime_secs</code>
+  </td>
+  <td>Uptime in seconds</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### System
+
+The following metrics provide information about the slave system.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>system/cpus_total</code>
+  </td>
+  <td>Number of CPUs available</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/load_15min</code>
+  </td>
+  <td>Load average for the past 15 minutes</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/load_5min</code>
+  </td>
+  <td>Load average for the past 5 minutes</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/load_1min</code>
+  </td>
+  <td>Load average for the past minute</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/mem_free_bytes</code>
+  </td>
+  <td>Free memory in bytes</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>system/mem_total_bytes</code>
+  </td>
+  <td>Total memory in bytes</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Executors
+
+The following metrics provide information about the executor instances running
+on the slave.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>slave/frameworks_active</code>
+  </td>
+  <td>Number of active frameworks</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/executors_registering</code>
+  </td>
+  <td>Number of executors registering</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/executors_running</code>
+  </td>
+  <td>Number of executors running</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/executors_terminated</code>
+  </td>
+  <td>Number of terminated executors</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/executors_terminating</code>
+  </td>
+  <td>Number of terminating executors</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Tasks
+
+The following metrics provide information about active and terminated tasks.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>slave/tasks_failed</code>
+  </td>
+  <td>Number of failed tasks</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/tasks_finished</code>
+  </td>
+  <td>Number of finished tasks</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/tasks_killed</code>
+  </td>
+  <td>Number of killed tasks</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/tasks_lost</code>
+  </td>
+  <td>Number of lost tasks</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/tasks_running</code>
+  </td>
+  <td>Number of running tasks</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/tasks_staging</code>
+  </td>
+  <td>Number of staging tasks</td>
+  <td>Gauge</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/tasks_starting</code>
+  </td>
+  <td>Number of starting tasks</td>
+  <td>Gauge</td>
+</tr>
+</table>
+
+#### Messages
+
+The following metrics provide information about messages between the slave and
+the master it is registered with.
+
+<table class="table table-striped">
+<thead>
+<tr><th>Metric</th><th>Description</th><th>Type</th>
+</thead>
+<tr>
+  <td>
+  <code>slave/invalid_framework_messages</code>
+  </td>
+  <td>Number of invalid framework messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/invalid_status_updates</code>
+  </td>
+  <td>Number of invalid status updates</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/valid_framework_messages</code>
+  </td>
+  <td>Number of valid framework messages</td>
+  <td>Counter</td>
+</tr>
+<tr>
+  <td>
+  <code>slave/valid_status_updates</code>
+  </td>
+  <td>Number of valid status updates</td>
+  <td>Counter</td>
+</tr>
+</table>
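
Note on the "Metric Types" section of monitoring.md above: the text says that for counters the rate of change is often more useful than the raw value, and that for gauges it is often useful to know whether the value stays above or below a threshold for a sustained period. The following is a minimal Python sketch of both ideas. The function names and the sampling scheme are hypothetical and not part of Mesos. Treating a decreasing counter as a process restart is an assumption made here for illustration.

```python
def counter_rate(prev_value, curr_value, interval_secs):
    """Per-second rate of change of a counter between two samples."""
    if interval_secs <= 0:
        raise ValueError("interval_secs must be positive")
    delta = curr_value - prev_value
    # Assumption: a counter only goes down when the process restarted,
    # so the current value is taken as the increase since the restart.
    if delta < 0:
        delta = curr_value
    return delta / interval_secs


def sustained_above(samples, threshold, min_samples):
    """True if the last `min_samples` gauge samples all exceed `threshold`."""
    if min_samples <= 0 or len(samples) < min_samples:
        return False
    return all(value > threshold for value in samples[-min_samples:])
```

For example, two samples of `master/tasks_lost` taken 60 seconds apart can be passed to `counter_rate`, and a list of recent `master/cpus_percent` samples can be passed to `sustained_above` with a threshold of 0.9.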

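Note on the "Basic Alerts" section of monitoring.md above: the listed conditions can be checked against the JSON object returned by the `/metrics/snapshot` endpoint. The sketch below is one possible way to do this in Python with only the standard library. The helper names, the `min_active_slaves` parameter, and the alert strings are hypothetical. It looks at a single snapshot only, so the "sustained periods of time" part of each alert would still need repeated sampling. It also assumes the snapshot comes from the master you expect to be elected, since `master/elected` is reported per master.

```python
import json
from urllib.request import urlopen


def fetch_snapshot(host, port=5050, timeout=5):
    """Fetch the metrics snapshot from a master (port 5050) or slave (5051)."""
    url = "http://%s:%d/metrics/snapshot" % (host, port)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def basic_master_alerts(snapshot, min_active_slaves=1):
    """Return the basic alert conditions that hold for one master snapshot."""
    alerts = []
    if snapshot.get("master/elected", 0) == 0:
        alerts.append("master/elected is 0")
    if snapshot.get("master/uptime_secs", 0) < 60:
        alerts.append("master/uptime_secs < 60")
    for name in ("master/cpus_percent", "master/mem_percent"):
        if snapshot.get(name, 0) > 0.9:
            alerts.append(name + " > 0.9")
    # Hypothetical threshold: what counts as "low" depends on the cluster.
    if snapshot.get("master/slaves_active", 0) < min_active_slaves:
        alerts.append("master/slaves_active is low")
    return alerts
```

A monitoring job could call `basic_master_alerts(fetch_snapshot("<mesos-master-ip>"))` at a fixed interval and raise an alert only when the same condition appears in several consecutive results.
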
Modified: mesos/site/source/documentation/latest/network-monitoring.md
URL: http://svn.apache.org/viewvc/mesos/site/source/documentation/latest/network-monitoring.md?rev=1688707&r1=1688706&r2=1688707&view=diff
==============================================================================
--- mesos/site/source/documentation/latest/network-monitoring.md (original)
+++ mesos/site/source/documentation/latest/network-monitoring.md Wed Jul  1 18:55:31 2015
@@ -23,21 +23,20 @@ Currently, network monitoring is only su
 
 Make sure the following packages are installed on the slave:
 
-* [libnl3](http://www.infradead.org/~tgr/libnl/) >= 3.2.25
+* [libnl3](http://www.infradead.org/~tgr/libnl/) >= 3.2.26
 * [iproute](http://www.linuxfoundation.org/collaborate/workgroups/networking/iproute2) (>= 2.6.39 is advised but not required for debugging purpose)
 
 On the build machine, you need to install the following packages:
 
-* [libnl3-devel](http://www.infradead.org/~tgr/libnl/) >= 3.2.25
+* [libnl3-devel](http://www.infradead.org/~tgr/libnl/) >= 3.2.26
 
 ### Configure and build
 
 Network monitoring will NOT be built in by default. To build Mesos with network monitoring support, you need to add a configure option:
 
-```
-$ ./configure --with-network-isolator
-$ make
-```
+    $ ./configure --with-network-isolator
+    $ make
+
 
 ### Host ephemeral ports squeeze
 
@@ -47,24 +46,22 @@ For non-ephemeral ports (e.g, listening
 
 For ephemeral ports, without network monitoring, all executors/tasks running on the slave share the same ephemeral port range of the host. The default ephemeral port range on most Linux distributions is [32768, 61000]. With network monitoring, for each container, we need to reserve a range for ports on the host which will be used as the ephemeral port range for the container network stack (these ports are directly mapped into the container). We need to ensure none of the host processes are using those ports. Because of that, you may want to squeeze the host ephemeral port range in order to support more containers on each slave. To do that, you can use the following command (need root permission). A host reboot is required to ensure there are no connections using ports outside the new ephemeral range.
 
-```
-# This sets the host ephemeral port range to [57345, 61000].
-$ echo "57345 61000" > /proc/sys/net/ipv4/ip_local_port_range
-```
+    # This sets the host ephemeral port range to [57345, 61000].
+    $ echo "57345 61000" > /proc/sys/net/ipv4/ip_local_port_range
+
 
 ### Turn on network monitoring
 
 After the host ephemeral ports squeeze and reboot, you can turn on network monitoring by appending `network/port_mapping` to the isolation flag. Notice that you need specify the `ephemeral_ports` resource (via --resources flag). It tells the slave which ports on the host are reserved for containers. It must NOT overlap with the host ephemeral port range. You can also specify how many ephemeral ports you want to allocate to each container. It is recommended but not required that this number is power of 2 aligned (e.g., 512, 1024). If not, there will be some performance impact for classifying packets. The maximum number of containers on the slave will be limited by approximately |ephemeral_ports|/ephemeral_ports_per_container, subject to alignment etc.
 
-```
-mesos-slave \
-	--checkpoint \
-	--log_dir=/var/log/mesos \
-	--work_dir=/var/lib/mesos \
-	--isolation=cgroups/cpu,cgroups/mem,network/port_mapping \
-	--resources=cpus:22;mem:62189;ports:[31000-32000];disk:400000;ephemeral_ports:[32768-57344] \
-	--ephemeral_ports_per_container=1024
-```
+    mesos-slave \
+        --checkpoint \
+        --log_dir=/var/log/mesos \
+        --work_dir=/var/lib/mesos \
+        --isolation=cgroups/cpu,cgroups/mem,network/port_mapping \
+        --resources=cpus:22;mem:62189;ports:[31000-32000];disk:400000;ephemeral_ports:[32768-57344] \
+        --ephemeral_ports_per_container=1024
+
 
 ## How to get statistics?
 
@@ -81,40 +78,39 @@ Currently, we report the following netwo
 
 For example, these are the statistics you will get by hitting the `/monitor/statistics.json` endpoint on a slave with network monitoring turned on:
 
-```
-$ curl -s http://localhost:5051/monitor/statistics.json | python2.6
--mjson.tool
-[
-    {
-        "executor_id": "sample_executor_id-ebd8fa62-757d-489e-9e23-678a21d078d6",
-        "executor_name": "sample_executor",
-        "framework_id": "201103282247-0000000019-0000",
-        "source": "sample_executor",
-        "statistics": {
-            "cpus_limit": 0.35,
-            "cpus_nr_periods": 520883,
-            "cpus_nr_throttled": 2163,
-            "cpus_system_time_secs": 154.42,
-            "cpus_throttled_time_secs": 145.96,
-            "cpus_user_time_secs": 258.74,
-            "mem_anon_bytes": 109137920,
-            "mem_file_bytes": 30613504,
-            "mem_limit_bytes": 167772160,
-            "mem_mapped_file_bytes": 8192,
-            "mem_rss_bytes": 140341248,
-            "net_rx_bytes": 2402099,
-            "net_rx_dropped": 0,
-            "net_rx_errors": 0,
-            "net_rx_packets": 33273,
-            "net_tx_bytes": 1507798,
-            "net_tx_dropped": 0,
-            "net_tx_errors": 0,
-            "net_tx_packets": 17726,
-            "timestamp": 1408043826.91626
+    $ curl -s http://localhost:5051/monitor/statistics.json | python2.6 \
+        -mjson.tool
+    [
+        {
+            "executor_id": "sample_executor_id-ebd8fa62-757d-489e-9e23-678a21d078d6",
+            "executor_name": "sample_executor",
+            "framework_id": "201103282247-0000000019-0000",
+            "source": "sample_executor",
+            "statistics": {
+                "cpus_limit": 0.35,
+                "cpus_nr_periods": 520883,
+                "cpus_nr_throttled": 2163,
+                "cpus_system_time_secs": 154.42,
+                "cpus_throttled_time_secs": 145.96,
+                "cpus_user_time_secs": 258.74,
+                "mem_anon_bytes": 109137920,
+                "mem_file_bytes": 30613504,
+                "mem_limit_bytes": 167772160,
+                "mem_mapped_file_bytes": 8192,
+                "mem_rss_bytes": 140341248,
+                "net_rx_bytes": 2402099,
+                "net_rx_dropped": 0,
+                "net_rx_errors": 0,
+                "net_rx_packets": 33273,
+                "net_tx_bytes": 1507798,
+                "net_tx_dropped": 0,
+                "net_tx_errors": 0,
+                "net_tx_packets": 17726,
+                "timestamp": 1408043826.91626
+            }
         }
-    }
-]
-```
+    ]
+
 
 # Network Egress Rate Limit
 
@@ -124,13 +120,11 @@ Mesos 0.21.0 adds an optional feature to
 
 Egress Rate Limit requires Network Monitoring. To enable it, please follow all the steps in the [previous section](#Network_Monitoring) to enable the Network Monitoring first, and then use the newly introduced `egress_rate_limit_per_container` flag to specify the rate limit for each container. Note that this flag expects a `Bytes` type like the following:
 
-```
-mesos-slave \
-	--checkpoint \
-	--log_dir=/var/log/mesos \
-	--work_dir=/var/lib/mesos \
-	--isolation=cgroups/cpu,cgroups/mem,network/port_mapping \
-	--resources=cpus:22;mem:62189;ports:[31000-32000];disk:400000;ephemeral_ports:[32768-57344] \
-	--ephemeral_ports_per_container=1024 \
-	--egress_rate_limit_per_container=37500KB # Convert to ~300Mbits/s.
-```
+    mesos-slave \
+        --checkpoint \
+        --log_dir=/var/log/mesos \
+        --work_dir=/var/lib/mesos \
+        --isolation=cgroups/cpu,cgroups/mem,network/port_mapping \
+        --resources=cpus:22;mem:62189;ports:[31000-32000];disk:400000;ephemeral_ports:[32768-57344] \
+        --ephemeral_ports_per_container=1024 \
+        --egress_rate_limit_per_container=37500KB # Convert to ~300Mbits/s.

Modified: mesos/site/source/documentation/latest/reconciliation.md
URL: http://svn.apache.org/viewvc/mesos/site/source/documentation/latest/reconciliation.md?rev=1688707&r1=1688706&r2=1688707&view=diff
==============================================================================
--- mesos/site/source/documentation/latest/reconciliation.md (original)
+++ mesos/site/source/documentation/latest/reconciliation.md Wed Jul  1 18:55:31 2015
@@ -54,17 +54,16 @@ reconciliation in a framework scheduler.
 
 Frameworks send a list of `TaskStatus` messages to the master:
 
-```
-  // Allows the framework to query the status for non-terminal tasks.
-  // This causes the master to send back the latest task status for
-  // each task in 'statuses', if possible. Tasks that are no longer
-  // known will result in a TASK_LOST update. If statuses is empty,
-  // then the master will send the latest status for each task
-  // currently known.
-  message Reconcile {
-    repeated TaskStatus statuses = 1; // Should be non-terminal only.
-  }
-```
+    // Allows the framework to query the status for non-terminal tasks.
+    // This causes the master to send back the latest task status for
+    // each task in 'statuses', if possible. Tasks that are no longer
+    // known will result in a TASK_LOST update. If statuses is empty,
+    // then the master will send the latest status for each task
+    // currently known.
+    message Reconcile {
+      repeated TaskStatus statuses = 1; // Should be non-terminal only.
+    }
+
 
 Currently, the master will only examine two fields in `TaskStatus`:
 
@@ -105,4 +104,4 @@ Offers are reconciled automatically afte
 
 * Offers do not persist beyond the lifetime of a Master.
 * If a disconnection occurs, offers are no longer valid.
-* Offers are rescinded and regenerated each time the framework (re-)registers.
\ No newline at end of file
+* Offers are rescinded and regenerated each time the framework (re-)registers.

Modified: mesos/site/source/documentation/latest/release-guide.md
URL: http://svn.apache.org/viewvc/mesos/site/source/documentation/latest/release-guide.md?rev=1688707&r1=1688706&r2=1688707&view=diff
==============================================================================
--- mesos/site/source/documentation/latest/release-guide.md (original)
+++ mesos/site/source/documentation/latest/release-guide.md Wed Jul  1 18:55:31 2015
@@ -27,22 +27,21 @@ This guide describes the process of doin
    you can get from running `mvn --encrypt-password` (NOTE: you may
    need to first generate a [master
    password](http://maven.apache.org/guides/mini/guide-encryption.html)).
-```
-<settings>
-  <servers>
-    <server>
-      <id>apache.snapshots.https</id>
-      <username>APACHE USERNAME</username>
-      <password>APACHE ENCRYPTED PASSWORD</password>
-    </server>
-    <server>
-      <id>apache.releases.https</id>
-      <username>APACHE USERNAME</username>
-      <password>APACHE ENCRYPTED PASSWORD</password>
-    </server>
-  </servers>
-</settings>
-```
+
+        <settings>
+          <servers>
+            <server>
+              <id>apache.snapshots.https</id>
+              <username>APACHE USERNAME</username>
+              <password>APACHE ENCRYPTED PASSWORD</password>
+            </server>
+            <server>
+              <id>apache.releases.https</id>
+              <username>APACHE USERNAME</username>
+              <password>APACHE ENCRYPTED PASSWORD</password>
+            </server>
+          </servers>
+        </settings>
 
 6. Use `gpg-agent` to avoid typing your passphrase repeatedly.
 

Added: mesos/site/source/documentation/latest/reservation.md
URL: http://svn.apache.org/viewvc/mesos/site/source/documentation/latest/reservation.md?rev=1688707&view=auto
==============================================================================
--- mesos/site/source/documentation/latest/reservation.md (added)
+++ mesos/site/source/documentation/latest/reservation.md Wed Jul  1 18:55:31 2015
@@ -0,0 +1,294 @@
+---
+layout: documentation
+---
+
+# Reservation
+
+Mesos provides mechanisms to __reserve__ resources in specific slaves.
+The concept was first introduced with __static reservation__ in 0.14.0,
+which enabled operators to specify the reserved resources on slave startup.
+This was extended with __dynamic reservation__ in 0.23.0, which enabled operators
+and authorized __frameworks__ to dynamically reserve resources in the cluster.
+
+No breaking changes were introduced with dynamic reservation, which means the
+existing static reservation mechanism continues to be fully supported.
+
+In both types of reservations, resources are reserved for a __role__.
+
+
+## Static Reservation (since 0.14.0)
+
+An operator can configure a slave with resources reserved for a role.
+The reserved resources are specified via the `--resources` flag.
+For example, suppose we have 12 CPUs and 6144 MB of RAM available on a slave and
+that we want to reserve 8 CPUs and 4096 MB of RAM for the `ads` role.
+We start the slave like so:
+
+        $ mesos-slave \
+          --master=<ip>:<port> \
+          --resources="cpus:4;mem:2048;cpus(ads):8;mem(ads):4096"
+
+We now have 8 CPUs and 4096 MB of RAM reserved for `ads` on this slave.
+
+__CAVEAT:__ In order to modify a static reservation, the operator must drain and
+            restart the slave with the new configuration specified in the
+            `--resources` flag.
+
+__NOTE:__ This feature is supported for backwards compatibility.
+          The recommended approach is to specify the total resources available
+          on the slave as unreserved via the `--resources` flag and manage
+          reservations dynamically via the master HTTP endpoints.
+
+
+## Dynamic Reservation (since 0.23.0)
+
+As mentioned in [Static Reservation](#static-reservation-since-0140), specifying the
+reserved resources via the `--resources` flag makes the reservation static.
+That is, statically reserved resources cannot be reserved for another role nor
+be unreserved. Dynamic Reservation enables operators and authorized frameworks
+to reserve and unreserve resources after slave startup.
+
+We require a `principal` from the operator or framework in order to
+authenticate/authorize the operations. [Authorization](/documentation/latest/authorization/) is
+specified via the existing ACL mechanism. (_Coming Soon_)
+
+* `Offer::Operation::Reserve` and `Offer::Operation::Unreserve` messages are
+  available for __frameworks__ to send back via the `acceptOffers` API as a
+  response to a resource offer.
+* `/reserve` and `/unreserve` HTTP endpoints are available for __operators__
+  to manage dynamic reservations through the master. (_Coming Soon_).
+
+In the following sections, we will walk through examples of each of the
+interfaces described above.
+
+
+### `Offer::Operation::Reserve`
+
+A framework is able to reserve resources through the resource offer cycle.
+Suppose we receive a resource offer with 12 CPUs and 6144 MB of RAM unreserved.
+
+        {
+          "id": <offer_id>,
+          "framework_id": <framework_id>,
+          "slave_id": <slave_id>,
+          "hostname": <hostname>,
+          "resources": [
+            {
+              "name": "cpus",
+              "type": "SCALAR",
+              "scalar": { "value": 12 },
+              "role": "*"
+            },
+            {
+              "name": "mem",
+              "type": "SCALAR",
+              "scalar": { "value": 6144 },
+              "role": "*"
+            }
+          ]
+        }
+
+We can reserve 8 CPUs and 4096 MB of RAM by sending the following
+`Offer::Operation` message. `Offer::Operation::Reserve` has a `resources` field
+which we specify with the resources to be reserved. We need to explicitly set
+the `role` and `principal` fields with the framework's role and principal.
+
+        {
+          "type": Offer::Operation::RESERVE,
+          "reserve": {
+            "resources": [
+              {
+                "name": "cpus",
+                "type": "SCALAR",
+                "scalar": { "value": 8 },
+                "role": <framework_role>,
+                "reservation": {
+                  "principal": <framework_principal>
+                }
+              },
+              {
+                "name": "mem",
+                "type": "SCALAR",
+                "scalar": { "value": 4096 },
+                "role": <framework_role>,
+                "reservation": {
+                  "principal": <framework_principal>
+                }
+              }
+            ]
+          }
+        }
+
+The subsequent resource offer will __contain__ the following reserved resources:
+
+        {
+          "id": <offer_id>,
+          "framework_id": <framework_id>,
+          "slave_id": <slave_id>,
+          "hostname": <hostname>,
+          "resources": [
+            {
+              "name": "cpus",
+              "type": "SCALAR",
+              "scalar": { "value": 8 },
+              "role": <framework_role>,
+              "reservation": {
+                "principal": <framework_principal>
+              }
+            },
+            {
+              "name": "mem",
+              "type": "SCALAR",
+              "scalar": { "value": 4096 },
+              "role": <framework_role>,
+              "reservation": {
+                "principal": <framework_principal>
+              }
+            }
+          ]
+        }
+
+
+### `Offer::Operation::Unreserve`
+
+A framework is able to unreserve resources through the resource offer cycle.
+In [Offer::Operation::Reserve](#offeroperationreserve), we reserved 8 CPUs
+and 4096 MB of RAM for our `role`. The master will continue to offer these
+resources to our `role`. Suppose we would like to unreserve these resources.
+First, we receive a resource offer (identical to the one shown above):
+
+        {
+          "id": <offer_id>,
+          "framework_id": <framework_id>,
+          "slave_id": <slave_id>,
+          "hostname": <hostname>,
+          "resources": [
+            {
+              "name": "cpus",
+              "type": "SCALAR",
+              "scalar": { "value": 8 },
+              "role": <framework_role>,
+              "reservation": {
+                "principal": <framework_principal>
+              }
+            },
+            {
+              "name": "mem",
+              "type": "SCALAR",
+              "scalar": { "value": 4096 },
+              "role": <framework_role>,
+              "reservation": {
+                "principal": <framework_principal>
+              }
+            }
+          ]
+        }
+
+We unreserve the 8 CPUs and 4096 MB of RAM by sending the following
+`Offer::Operation` message. `Offer::Operation::Unreserve` has a `resources` field
+which we specify with the resources to be unreserved.
+
+        {
+          "type": Offer::Operation::UNRESERVE,
+          "unreserve": {
+            "resources": [
+              {
+                "name": "cpus",
+                "type": "SCALAR",
+                "scalar": { "value": 8 },
+                "role": <framework_role>,
+                "reservation": {
+                  "principal": <framework_principal>
+                }
+              },
+              {
+                "name": "mem",
+                "type": "SCALAR",
+                "scalar": { "value": 4096 },
+                "role": <framework_role>,
+                "reservation": {
+                  "principal": <framework_principal>
+                }
+              }
+            ]
+          }
+        }
+
+The unreserved resources may now be offered to other frameworks.
+
+
+### `/reserve` (_Coming Soon_)
+
+Suppose we want to reserve 8 CPUs and 4096 MB of RAM for the `ads` role on
+a slave with id=`<slave_id>`. We send an HTTP POST request to the `/reserve`
+HTTP endpoint like so:
+
+        $ curl -i \
+          -u <operator_principal>:<password> \
+          -d slaveId=<slave_id> \
+          -d resources='[
+            {
+              "name": "cpus",
+              "type": "SCALAR",
+              "scalar": { "value": 8 },
+              "role": "ads",
+              "reservation": {
+                "principal": <operator_principal>
+              }
+            },
+            {
+              "name": "mem",
+              "type": "SCALAR",
+              "scalar": { "value": 4096 },
+              "role": "ads",
+              "reservation": {
+                "principal": <operator_principal>
+              }
+            }
+          ]' \
+          -X POST http://<ip>:<port>/master/reserve
+
+The user receives one of the following HTTP responses:
+
+* `200 OK`: Success
+* `400 Bad Request`: Invalid arguments (e.g. missing parameters).
+* `401 Unauthorized`: Unauthorized request.
+* `409 Conflict`: Insufficient resources to satisfy the reserve operation.
+
+
+### `/unreserve` (_Coming Soon_)
+
+Suppose we want to unreserve the resources that we dynamically reserved above.
+We can send an HTTP POST request to the `/unreserve` HTTP endpoint like so:
+
+        $ curl -i \
+          -u <operator_principal>:<password> \
+          -d slaveId=<slave_id> \
+          -d resources='[
+            {
+              "name": "cpus",
+              "type": "SCALAR",
+              "scalar": { "value": 8 },
+              "role": "ads",
+              "reservation": {
+                "principal": <operator_principal>
+              }
+            },
+            {
+              "name": "mem",
+              "type": "SCALAR",
+              "scalar": { "value": 4096 },
+              "role": "ads",
+              "reservation": {
+                "principal": <operator_principal>
+              }
+            }
+          ]' \
+          -X POST http://<ip>:<port>/master/unreserve
+
+The user receives one of the following HTTP responses:
+
+* `200 OK`: Success
+* `400 Bad Request`: Invalid arguments (e.g. missing parameters).
+* `401 Unauthorized`: Unauthorized request.
+* `409 Conflict`: Insufficient resources to satisfy the unreserve operation.
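
The `resources` payload used by the `/reserve` and `/unreserve` examples in reservation.md repeats the same structure for every resource, so it can be easier to generate than to write by hand. The following is a minimal Python sketch, not an official client. The helper names `reservation_resources` and `reservation_form_body` and the principal `ops` are hypothetical. The field names follow the examples in the document, and because both endpoints are marked as coming soon, the accepted format may still change.

```python
import json
from urllib.parse import urlencode


def reservation_resources(role, principal, cpus, mem_mb):
    """Build the `resources` list in the shape used by the examples in the document."""
    def scalar(name, value):
        return {
            "name": name,
            "type": "SCALAR",
            "scalar": {"value": value},
            "role": role,
            "reservation": {"principal": principal},
        }

    return [scalar("cpus", cpus), scalar("mem", mem_mb)]


def reservation_form_body(slave_id, resources):
    """Encode the two form fields that the curl examples pass with -d."""
    return urlencode({"slaveId": slave_id, "resources": json.dumps(resources)})


# "ops" is a placeholder principal chosen for illustration only.
resources = reservation_resources("ads", "ops", cpus=8, mem_mb=4096)
print(json.dumps(resources, indent=2))
```

The same list can be sent to either endpoint, since the unreserve request names exactly the resources that the reserve request created.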

Modified: mesos/site/source/documentation/latest/upgrades.md
URL: http://svn.apache.org/viewvc/mesos/site/source/documentation/latest/upgrades.md?rev=1688707&r1=1688706&r2=1688707&view=diff
==============================================================================
--- mesos/site/source/documentation/latest/upgrades.md (original)
+++ mesos/site/source/documentation/latest/upgrades.md Wed Jul  1 18:55:31 2015
@@ -10,6 +10,12 @@ This document serves as a guide for user
 
 **NOTE** In order to enable decorator modules to remove metadata (environment variables or labels), we changed the meaning of the return value for decorator hooks in Mesos 0.23.0. Please refer to the modules documentation for more details.
 
+**NOTE** Slave ping timeouts are now configurable on the master via `--slave_ping_timeout` and `--max_slave_ping_timeouts`. Slaves should be upgraded to 0.23.x before changing these flags.
+
+**NOTE** A new scheduler driver API, `acceptOffers`, has been introduced. This is a more general version of the `launchTasks` API, which allows the scheduler to accept an offer and specify a list of operations (Offer.Operation) to perform using the resources in the offer. Currently, the supported operations include LAUNCH (launching tasks), RESERVE (making dynamic reservations), UNRESERVE (releasing dynamic reservations), CREATE (creating persistent volumes) and DESTROY (releasing persistent volumes). Similar to the `launchTasks` API, any unused resources will be considered declined, and the specified filters will be applied on all unused resources.
+
+**NOTE** The Resource protobuf has been extended to include more metadata for supporting persistence (DiskInfo), dynamic reservations (ReservationInfo) and oversubscription (RevocableInfo). You must not combine two Resource objects if they have different metadata.
+
 ## Upgrading from 0.21.x to 0.22.x
 
 **NOTE** Slave checkpoint flag has been removed as it will be enabled for all
@@ -23,12 +29,11 @@ Please refer to the metrics/snapshot end
 
 **NOTE**: The Authentication API has changed slightly in this release to support additional authentication mechanisms. The change from 'string' to 'bytes' for AuthenticationStartMessage.data has no impact on C++ or the over-the-wire representation, so it only impacts pure language bindings for languages like Java and Python that use different types for UTF-8 strings vs. byte arrays.
 
-```
-message AuthenticationStartMessage {
-  required string mechanism = 1;
-  optional bytes data = 2;
-}
-```
+    message AuthenticationStartMessage {
+      required string mechanism = 1;
+      optional bytes data = 2;
+    }
+
 
 **NOTE** All Mesos arguments can now be passed using file:// to read them out of a file (either an absolute or relative path). The --credentials, --whitelist, and any flags that expect JSON backed arguments (such as --modules) behave as before, although support for just passing an absolute path for any JSON flags rather than file:// has been deprecated and will produce a warning (and the absolute path behavior will be removed in a future release).
 
@@ -42,6 +47,7 @@ In order to upgrade a running cluster:
 * Restart the schedulers.
 * Upgrade the executors by linking the latest native library / jar / egg.
 
+
 ## Upgrading from 0.20.x to 0.21.x
 
 **NOTE** Disabling slave checkpointing has been deprecated; the slave --checkpoint flag has been deprecated and will be removed in a future release.
@@ -59,25 +65,23 @@ In order to upgrade a running cluster:
 
 **NOTE**: The Mesos API has been changed slightly in this release. The CommandInfo has been changed (see below), which makes launching a command more flexible. The 'value' field has been changed from _required_ to _optional_. However, it will not cause any issue during the upgrade (since the existing schedulers always set this field).
 
-```
-message CommandInfo {
-  ...
-  // There are two ways to specify the command:
-  // 1) If 'shell == true', the command will be launched via shell
-  //    (i.e., /bin/sh -c 'value'). The 'value' specified will be
-  //    treated as the shell command. The 'arguments' will be ignored.
-  // 2) If 'shell == false', the command will be launched by passing
-  //    arguments to an executable. The 'value' specified will be
-  //    treated as the filename of the executable. The 'arguments'
-  //    will be treated as the arguments to the executable. This is
-  //    similar to how POSIX exec families launch processes (i.e.,
-  //    execlp(value, arguments(0), arguments(1), ...)).
-  optional bool shell = 6 [default = true];
-  optional string value = 3;
-  repeated string arguments = 7;
-  ...
-}
-```
+    message CommandInfo {
+      ...
+      // There are two ways to specify the command:
+      // 1) If 'shell == true', the command will be launched via shell
+      //    (i.e., /bin/sh -c 'value'). The 'value' specified will be
+      //    treated as the shell command. The 'arguments' will be ignored.
+      // 2) If 'shell == false', the command will be launched by passing
+      //    arguments to an executable. The 'value' specified will be
+      //    treated as the filename of the executable. The 'arguments'
+      //    will be treated as the arguments to the executable. This is
+      //    similar to how POSIX exec families launch processes (i.e.,
+      //    execlp(value, arguments(0), arguments(1), ...)).
+      optional bool shell = 6 [default = true];
+      optional string value = 3;
+      repeated string arguments = 7;
+      ...
+    }
 
 **NOTE**: The Python bindings are also changing in this release. There are now sub-modules which allow you to use either the interfaces and/or the native driver.
 
@@ -86,7 +90,6 @@ message CommandInfo {
 
 To ensure a smooth upgrade, we recommend upgrading your Python framework and executor first. You will be able to import using either the new configuration or the old one. Replace the existing imports with something like the following:
 
-```
     try:
         from mesos.native import MesosExecutorDriver, MesosSchedulerDriver
         from mesos.interface import Executor, Scheduler
@@ -94,7 +97,6 @@ To ensure a smooth upgrade, we recommend
     except ImportError:
         from mesos import Executor, MesosExecutorDriver, MesosSchedulerDriver, Scheduler
         import mesos_pb2
-```
 
 **NOTE**: If you're using a pure language binding, please ensure that it sends status update acknowledgements through the master before upgrading.