You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mesos.apache.org by mp...@apache.org on 2016/02/23 05:18:43 UTC

svn commit: r1731786 - in /mesos/site/publish: assets/img/documentation/ blog/mesos-0-27-1-released/ blog/mesoscon-2016-cfp-is-now-open/ documentation/latest/multiple-disk/ documentation/latest/replicated-log-internals/ documentation/multiple-disk/ doc...

Author: mpark
Date: Tue Feb 23 04:18:42 2016
New Revision: 1731786

URL: http://svn.apache.org/viewvc?rev=1731786&view=rev
Log:
Add missing files from 0.27.1 website update.


Added:
    mesos/site/publish/assets/img/documentation/log-architecture.png   (with props)
    mesos/site/publish/assets/img/documentation/log-cluster.png   (with props)
    mesos/site/publish/blog/mesos-0-27-1-released/
    mesos/site/publish/blog/mesos-0-27-1-released/index.html
    mesos/site/publish/blog/mesoscon-2016-cfp-is-now-open/
    mesos/site/publish/blog/mesoscon-2016-cfp-is-now-open/index.html
    mesos/site/publish/documentation/latest/multiple-disk/
    mesos/site/publish/documentation/latest/multiple-disk/index.html
    mesos/site/publish/documentation/latest/replicated-log-internals/
    mesos/site/publish/documentation/latest/replicated-log-internals/index.html
    mesos/site/publish/documentation/multiple-disk/
    mesos/site/publish/documentation/multiple-disk/index.html
    mesos/site/publish/documentation/replicated-log-internals/
    mesos/site/publish/documentation/replicated-log-internals/index.html

Added: mesos/site/publish/assets/img/documentation/log-architecture.png
URL: http://svn.apache.org/viewvc/mesos/site/publish/assets/img/documentation/log-architecture.png?rev=1731786&view=auto
==============================================================================
Binary file - no diff available.

Propchange: mesos/site/publish/assets/img/documentation/log-architecture.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: mesos/site/publish/assets/img/documentation/log-cluster.png
URL: http://svn.apache.org/viewvc/mesos/site/publish/assets/img/documentation/log-cluster.png?rev=1731786&view=auto
==============================================================================
Binary file - no diff available.

Propchange: mesos/site/publish/assets/img/documentation/log-cluster.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: mesos/site/publish/blog/mesos-0-27-1-released/index.html
URL: http://svn.apache.org/viewvc/mesos/site/publish/blog/mesos-0-27-1-released/index.html?rev=1731786&view=auto
==============================================================================
--- mesos/site/publish/blog/mesos-0-27-1-released/index.html (added)
+++ mesos/site/publish/blog/mesos-0-27-1-released/index.html Tue Feb 23 04:18:42 2016
@@ -0,0 +1,163 @@
+<!DOCTYPE html>
+<html>
+    <head>
+        <meta charset="utf-8">
+        <title>Apache Mesos 0.27.1 Released</title>
+		    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+		    <link href="//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css" rel="stylesheet">
+		    <link rel="alternate" type="application/atom+xml" title="Apache Mesos Blog" href="/blog/feed.xml">
+		    
+		    <link href="../../assets/css/main.css" media="screen" rel="stylesheet" type="text/css" />
+				
+		    
+			
+			<!-- Google Analytics Magic -->
+			<script type="text/javascript">
+			  var _gaq = _gaq || [];
+			  _gaq.push(['_setAccount', 'UA-20226872-1']);
+			  _gaq.push(['_setDomainName', 'apache.org']);
+			  _gaq.push(['_trackPageview']);
+
+			  (function() {
+			    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+			    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+			    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+			  })();
+			</script>
+    </head>
+    <body>
+			<!-- magical breadcrumbs -->
+			<div class="topnav">
+			<ul class="breadcrumb">
+			  <li>
+					<div class="dropdown">
+					  <a data-toggle="dropdown" href="#">Apache Software Foundation <span class="caret"></span></a>
+					  <ul class="dropdown-menu" role="menu" aria-labelledby="dLabel">
+							<li><a href="http://www.apache.org">Apache Homepage</a></li>
+							<li><a href="http://www.apache.org/licenses/">License</a></li>
+					  	<li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>  
+					  	<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+							<li><a href="http://www.apache.org/security/">Security</a></li>
+					  </ul>
+					</div>
+				</li>
+				<li><a href="http://mesos.apache.org">Apache Mesos</a></li>
+				
+				
+					<li><a href="/blog
+/">Blog
+</a></li>
+				
+				
+			</ul><!-- /breadcrumb -->
+			</div>
+			
+			<!-- navbar excitement -->
+	    <div class="navbar navbar-static-top" role="navigation">
+	      <div class="navbar-inner">
+	        <div class="container">
+						<a href="/" class="logo"><img src="/assets/img/mesos_logo.png" alt="Apache Mesos logo" /></a>
+					<div class="nav-collapse">
+						<ul class="nav nav-pills navbar-right">
+						  <li><a href="/gettingstarted/">Getting Started</a></li>
+						  <li><a href="/documentation/latest/">Documentation</a></li>
+						  <li><a href="/downloads/">Downloads</a></li>
+						  <li><a href="/community/">Community</a></li>
+						</ul>
+					</div>
+	        </div>
+	      </div>
+	    </div><!-- /.navbar -->
+
+      <div class="container">
+
+			<div class="row">
+
+<div class="col-md-3">
+	<div class="meta">
+		<span class="author">
+			
+			  <img src="http://www.gravatar.com/avatar/2ab9cab3a7cf782261c583c1f48a81b0?s=80" class="author_gravatar">
+			
+			<span class="author_contact">
+			  <p><strong>Michael Park</strong></p>
+			  <p><a href="http://twitter.com/mcypark`">@mcypark`</a></p>
+			</span>
+		</span>
+		<p><em>Posted February 22, 2016</em></p>
+	</div>
+	
+	<div class="share">
+		<span class="social-share-button"><a href="https://twitter.com/share" class="twitter-share-button" data-via="apachemesos">Tweet</a></span>
+		
+		<span><script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></span>
+
+		<span><div class="g-plusone" data-size="medium"></div></span>
+
+		<!-- Place this tag after the last +1 button tag. -->
+		<script type="text/javascript">
+		  (function() {
+		    var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true;
+		    po.src = 'https://apis.google.com/js/plusone.js';
+		    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s);
+		  })();
+		</script>
+		
+		<script src="//platform.linkedin.com/in.js" type="text/javascript">
+		 lang: en_US
+		</script>
+		<script type="IN/Share" data-counter="right"></script>
+	</div>
+</div>
+
+<div class="post col-md-9">
+	<h1>Apache Mesos 0.27.1 Released</h1>
+	
+	<p>The latest Mesos release, 0.27.1, is now available for <a href="http://mesos.apache.org/downloads">download</a>.
+This release includes fixes and improvements for: reconnection logic for Zookeeper client, <code>/state</code> endpoint and <code>systemd</code> integration.</p>
+
+<ul>
+<li><a href="https://issues.apache.org/jira/browse/MESOS-4546">MESOS-4546</a> - Mesos Agents needs to re-resolve hosts in zk string on leader change / failure to connect.</li>
+<li><a href="https://issues.apache.org/jira/browse/MESOS-4582">MESOS-4582</a> - state.json serving duplicate &ldquo;active&rdquo; fields.</li>
+<li><a href="https://issues.apache.org/jira/browse/MESOS-3007">MESOS-3007</a> - Support systemd with Mesos.</li>
+</ul>
+
+
+<p>Full release notes are available in the release <a href="https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.27.1">CHANGELOG</a>.</p>
+
+<h3>Upgrades</h3>
+
+<p>Rolling upgrades from a Mesos 0.27.0 cluster to Mesos 0.27.1 are straightforward.
+Please refer to the <a href="http://mesos.apache.org/documentation/latest/upgrades/">upgrade guide</a> for detailed information on upgrading to Mesos 0.27.1.</p>
+
+<h3>Try it out</h3>
+
+<p>We encourage you to try out this release and let us know what you think.
+If you run into any issues, please let us know on the <a href="https://mesos.apache.org/community">user mailing list and IRC</a>.</p>
+
+<h3>Thanks!</h3>
+
+<p>Thanks to the 8 contributors who made 0.27.1 possible:</p>
+
+<p>Jie Yu, Joerg Shad, Joris Van Remoortere, Joseph Wu, Kapil Arya, Michael Park, Neil Conway, Shuai Lin</p>
+
+</div>
+</div>
+
+			
+	      <hr>
+
+				<!-- footer -->
+	      <div class="footer">
+	        <p>&copy; 2012-2015 <a href="http://apache.org">The Apache Software Foundation</a>.
+	        Apache Mesos, the Apache feather logo, and the Apache Mesos project logo are trademarks of The Apache Software Foundation.<p>
+	      </div><!-- /footer -->
+
+	    </div> <!-- /container -->
+
+	    <!-- JS -->
+	    <script src="//code.jquery.com/jquery-1.11.0.min.js" type="text/javascript"></script>
+			<script src="//netdna.bootstrapcdn.com/bootstrap/3.1.1/js/bootstrap.min.js" type="text/javascript"></script>
+    </body>
+</html>

Added: mesos/site/publish/blog/mesoscon-2016-cfp-is-now-open/index.html
URL: http://svn.apache.org/viewvc/mesos/site/publish/blog/mesoscon-2016-cfp-is-now-open/index.html?rev=1731786&view=auto
==============================================================================
--- mesos/site/publish/blog/mesoscon-2016-cfp-is-now-open/index.html (added)
+++ mesos/site/publish/blog/mesoscon-2016-cfp-is-now-open/index.html Tue Feb 23 04:18:42 2016
@@ -0,0 +1,163 @@
+<!DOCTYPE html>
+<html>
+    <head>
+        <meta charset="utf-8">
+        <title>MesosCon 2016 CFP is now open!</title>
+		    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+		    <link href="//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css" rel="stylesheet">
+		    <link rel="alternate" type="application/atom+xml" title="Apache Mesos Blog" href="/blog/feed.xml">
+		    
+		    <link href="../../assets/css/main.css" media="screen" rel="stylesheet" type="text/css" />
+				
+		    
+			
+			<!-- Google Analytics Magic -->
+			<script type="text/javascript">
+			  var _gaq = _gaq || [];
+			  _gaq.push(['_setAccount', 'UA-20226872-1']);
+			  _gaq.push(['_setDomainName', 'apache.org']);
+			  _gaq.push(['_trackPageview']);
+
+			  (function() {
+			    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+			    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+			    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+			  })();
+			</script>
+    </head>
+    <body>
+			<!-- magical breadcrumbs -->
+			<div class="topnav">
+			<ul class="breadcrumb">
+			  <li>
+					<div class="dropdown">
+					  <a data-toggle="dropdown" href="#">Apache Software Foundation <span class="caret"></span></a>
+					  <ul class="dropdown-menu" role="menu" aria-labelledby="dLabel">
+							<li><a href="http://www.apache.org">Apache Homepage</a></li>
+							<li><a href="http://www.apache.org/licenses/">License</a></li>
+					  	<li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>  
+					  	<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+							<li><a href="http://www.apache.org/security/">Security</a></li>
+					  </ul>
+					</div>
+				</li>
+				<li><a href="http://mesos.apache.org">Apache Mesos</a></li>
+				
+				
+					<li><a href="/blog
+/">Blog
+</a></li>
+				
+				
+			</ul><!-- /breadcrumb -->
+			</div>
+			
+			<!-- navbar excitement -->
+	    <div class="navbar navbar-static-top" role="navigation">
+	      <div class="navbar-inner">
+	        <div class="container">
+						<a href="/" class="logo"><img src="/assets/img/mesos_logo.png" alt="Apache Mesos logo" /></a>
+					<div class="nav-collapse">
+						<ul class="nav nav-pills navbar-right">
+						  <li><a href="/gettingstarted/">Getting Started</a></li>
+						  <li><a href="/documentation/latest/">Documentation</a></li>
+						  <li><a href="/downloads/">Downloads</a></li>
+						  <li><a href="/community/">Community</a></li>
+						</ul>
+					</div>
+	        </div>
+	      </div>
+	    </div><!-- /.navbar -->
+
+      <div class="container">
+
+			<div class="row">
+
+<div class="col-md-3">
+	<div class="meta">
+		<span class="author">
+			
+			  <img src="http://www.gravatar.com/avatar/2ab9cab3a7cf782261c583c1f48a81b0?s=80" class="author_gravatar">
+			
+			<span class="author_contact">
+			  <p><strong>Michael Park</strong></p>
+			  <p><a href="http://twitter.com/mcypark">@mcypark</a></p>
+			</span>
+		</span>
+		<p><em>Posted February 12, 2016</em></p>
+	</div>
+	
+	<div class="share">
+		<span class="social-share-button"><a href="https://twitter.com/share" class="twitter-share-button" data-via="apachemesos">Tweet</a></span>
+		
+		<span><script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script></span>
+
+		<span><div class="g-plusone" data-size="medium"></div></span>
+
+		<!-- Place this tag after the last +1 button tag. -->
+		<script type="text/javascript">
+		  (function() {
+		    var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true;
+		    po.src = 'https://apis.google.com/js/plusone.js';
+		    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s);
+		  })();
+		</script>
+		
+		<script src="//platform.linkedin.com/in.js" type="text/javascript">
+		 lang: en_US
+		</script>
+		<script type="IN/Share" data-counter="right"></script>
+	</div>
+</div>
+
+<div class="post col-md-9">
+	<h1>MesosCon 2016 CFP is now open!</h1>
+	
+	<p>MesosCon North America 2016 will be held in Denver, CO on June 1-2, 2016.</p>
+
+<p>Talk submissions, sponsorship opportunities, and early-bird registration are now open for the conference.</p>
+
+<h2>Speak at MesosCon</h2>
+
+<p>Several formats are being accepted for speaking proposals, including:
+Presentations, Panels, Keynotes and Lightning Talks.
+Submissions are being accepted through <strong>March 9, 2016</strong>.</p>
+
+<p>You can take a look at <a href="https://www.youtube.com/playlist?list=PLVjgeV_avap2arug3vIz8c6l72rvh9poV">MesosCon 2015 list of talks</a>
+for ideas. If you&rsquo;re unsure about your proposal, or want some feedback or advice in general,
+please don&rsquo;t hesitate to reach out to the <a href="https://mail-archives.apache.org/mod_mbox/mesos-dev">mesos-dev</a> mailing list.
+We&rsquo;ll be happy to help out! Further details are available on the <a href="http://events.linuxfoundation.org/events/mesoscon/program/cfp">CFP website</a>.</p>
+
+<p>All submissions for MesosCon North America will also be considered for the upcoming MesosCon Europe and China events (details to be announced shortly).</p>
+
+<h2>Sponsor MesosCon</h2>
+
+<p>The <a href="http://events.linuxfoundation.org/events/mesoscon/sponsors">sponsorship prospectus for MesosCon</a> is now available,
+including several opportunities with limited availability.</p>
+
+<h2>Attend MesosCon</h2>
+
+<p><a href="http://events.linuxfoundation.org/events/mesoscon/attend/register">Early-Bird registration</a> is open through April 22nd!</p>
+
+<p>We look forward to seeing you at MesosCon North America, as well as MesosCon Europe and MesosCon China taking place later this year!</p>
+
+</div>
+</div>
+
+			
+	      <hr>
+
+				<!-- footer -->
+	      <div class="footer">
+	        <p>&copy; 2012-2015 <a href="http://apache.org">The Apache Software Foundation</a>.
+	        Apache Mesos, the Apache feather logo, and the Apache Mesos project logo are trademarks of The Apache Software Foundation.<p>
+	      </div><!-- /footer -->
+
+	    </div> <!-- /container -->
+
+	    <!-- JS -->
+	    <script src="//code.jquery.com/jquery-1.11.0.min.js" type="text/javascript"></script>
+			<script src="//netdna.bootstrapcdn.com/bootstrap/3.1.1/js/bootstrap.min.js" type="text/javascript"></script>
+    </body>
+</html>

Added: mesos/site/publish/documentation/latest/multiple-disk/index.html
URL: http://svn.apache.org/viewvc/mesos/site/publish/documentation/latest/multiple-disk/index.html?rev=1731786&view=auto
==============================================================================
--- mesos/site/publish/documentation/latest/multiple-disk/index.html (added)
+++ mesos/site/publish/documentation/latest/multiple-disk/index.html Tue Feb 23 04:18:42 2016
@@ -0,0 +1,242 @@
+<!DOCTYPE html>
+<html>
+    <head>
+        <meta charset="utf-8">
+        <title></title>
+		    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+		    <link href="//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css" rel="stylesheet">
+		    <link rel="alternate" type="application/atom+xml" title="Apache Mesos Blog" href="/blog/feed.xml">
+		    
+		    <link href="../../../assets/css/main.css" media="screen" rel="stylesheet" type="text/css" />
+				
+		    
+			
+			<!-- Google Analytics Magic -->
+			<script type="text/javascript">
+			  var _gaq = _gaq || [];
+			  _gaq.push(['_setAccount', 'UA-20226872-1']);
+			  _gaq.push(['_setDomainName', 'apache.org']);
+			  _gaq.push(['_trackPageview']);
+
+			  (function() {
+			    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+			    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+			    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+			  })();
+			</script>
+    </head>
+    <body>
+			<!-- magical breadcrumbs -->
+			<div class="topnav">
+			<ul class="breadcrumb">
+			  <li>
+					<div class="dropdown">
+					  <a data-toggle="dropdown" href="#">Apache Software Foundation <span class="caret"></span></a>
+					  <ul class="dropdown-menu" role="menu" aria-labelledby="dLabel">
+							<li><a href="http://www.apache.org">Apache Homepage</a></li>
+							<li><a href="http://www.apache.org/licenses/">License</a></li>
+					  	<li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>  
+					  	<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+							<li><a href="http://www.apache.org/security/">Security</a></li>
+					  </ul>
+					</div>
+				</li>
+				<li><a href="http://mesos.apache.org">Apache Mesos</a></li>
+				
+				
+					<li><a href="/documentation
+/">Documentation
+</a></li>
+				
+				
+			</ul><!-- /breadcrumb -->
+			</div>
+			
+			<!-- navbar excitement -->
+	    <div class="navbar navbar-static-top" role="navigation">
+	      <div class="navbar-inner">
+	        <div class="container">
+						<a href="/" class="logo"><img src="/assets/img/mesos_logo.png" alt="Apache Mesos logo" /></a>
+					<div class="nav-collapse">
+						<ul class="nav nav-pills navbar-right">
+						  <li><a href="/gettingstarted/">Getting Started</a></li>
+						  <li><a href="/documentation/latest/">Documentation</a></li>
+						  <li><a href="/downloads/">Downloads</a></li>
+						  <li><a href="/community/">Community</a></li>
+						</ul>
+					</div>
+	        </div>
+	      </div>
+	    </div><!-- /.navbar -->
+
+      <div class="container">
+
+			<div class="row-fluid">
+	<div class="col-md-4">
+		<h4>If you're new to Mesos</h4>
+		<p>See the <a href="/gettingstarted/">getting started</a> page for more information about downloading, building, and deploying Mesos.</p>
+		
+		<h4>If you'd like to get involved or you're looking for support</h4>
+		<p>See our <a href="/community/">community</a> page for more details.</p>
+	</div>
+	<div class="col-md-8">
+		<h1>Multiple Disks</h1>
+
+<p>Mesos provides a mechanism for operators to expose multiple disk resources. When
+creating <a href="/documentation/latest/./persistent-volume/">persistent volumes</a> frameworks can decide
+whether to use specific disks by examining the <code>source</code> field on the disk
+resources offered.</p>
+
+<p><code>Disk</code> resources come in three forms:</p>
+
+<ul>
+<li>A <code>Root</code> disk is presented by not having the <code>source</code> set in <code>DiskInfo</code>.</li>
+<li>A <code>Path</code> disk is presented by having the <code>PATH</code> enum set for <code>source</code> in
+<code>DiskInfo</code>. It also has a <code>root</code> which the operator uses to specify the
+directory to be used to store data.</li>
+<li>A <code>Mount</code> disk is presented by having the <code>MOUNT</code> enum set for <code>source</code> in
+<code>DiskInfo</code>. It also has a <code>root</code> which the operator uses to specify the
+mount point used to store data.</li>
+</ul>
+
+
+<p>Operators can use the JSON-formated <code>--resources</code> option on the agent to provide
+these different kind of disk resources on agent start-up.</p>
+
+<h4><code>Root</code> disk</h4>
+
+<p>A <code>Root</code> disk is the basic disk resource in Mesos. It usually maps to the
+storage on the main operating system drive that the operator has presented to
+the agent. Data is mapped into the <code>work_dir</code> of the agent.</p>
+
+<pre><code>    {
+      "resources" : [
+        {
+          "name" : "disk",
+          "type" : "SCALAR",
+          "scalar" : { "value" : 2048 },
+          "role" : &lt;framework_role&gt;
+        }
+      ]
+    }
+</code></pre>
+
+<h4><code>Path</code> disks</h4>
+
+<p>A <code>Path</code> disk is an auxiliary disk resource provided by the operator. This can
+can be carved up into smaller chunks by accepting less than the total amount as
+frameworks see fit. Common uses for this kind of disk are extra logging space,
+file archives or caches, or other non performance-critical applications.
+Operators can present extra disks on their agents as <code>Path</code> disks simply by
+creating a directory and making that the <code>root</code> of the <code>Path</code> in <code>DiskInfo</code>&rsquo;s
+<code>source</code>.</p>
+
+<p><code>Path</code> disks are also useful for mocking up a multiple disk environment by
+creating some directories on the operating system drive. This should only be
+done in a testing or staging environment.</p>
+
+<pre><code>    {
+      "resources" : [
+        {
+          "name" : "disk",
+          "type" : "SCALAR",
+          "scalar" : { "value" : 2048 },
+          "role" : &lt;framework_role&gt;,
+          "disk" : {
+            "source" : {
+              "type" : "PATH",
+              "path" : { "root" : "/mnt/data" }
+            }
+          }
+        }
+      ]
+    }
+</code></pre>
+
+<h4><code>Mount</code> disks</h4>
+
+<p>A <code>Mount</code> disk is an auxiliary disk resource provided by the operator. This
+<strong>cannot</strong> be carved up into smaller chunks by frameworks. This lack of
+flexibility allows operators to provide assurances to frameworks that they will
+have exclusive access to the disk device. Common uses for this kind of disk
+include database storage, write-ahead logs, or other performance-critical
+applications.</p>
+
+<p>Another requirement of <code>Mount</code> disks is that (on Linux) they map to a <code>mount</code>
+point in the <code>/proc/mounts</code> table. Operators should mount a physical disk with
+their preferred file system and provide the mount point as the <code>root</code> of the
+<code>Mount</code> in <code>DiskInfo</code>&rsquo;s <code>source</code>.</p>
+
+<p>Aside from the performance advantages of <code>Mount</code> disks, applications running on
+them should be able to rely on disk errors when they attempt to exceed the
+capacity of the volume. This holds true as long as the file system in use
+correctly propagates these errors. Due to this expectation, the <code>posix/disk</code>
+isolation is disabled for <code>Mount</code> disks.</p>
+
+<pre><code>    {
+      "resources" : [
+        {
+          "name" : "disk",
+          "type" : "SCALAR",
+          "scalar" : { "value" : 2048 },
+          "role" : &lt;framework_role&gt;,
+          "disk" : {
+            "source" : {
+              "type" : "MOUNT",
+              "mount" : { "root" : "/mnt/data" }
+            }
+          }
+        }
+      ]
+    }
+</code></pre>
+
+<h4><code>Block</code> disks</h4>
+
+<p>Mesos currently does not allow operators to expose raw block devices. It may do
+so in the future, but there are security and flexibility concerns that need to
+be addressed in a design document first.</p>
+
+<h3>Storage Management</h3>
+
+<p>Mesos currently does not clean up or destroy data when persistent volumes are
+destroyed. It may do so in the future; however, the expectation is currently
+upon the framework, executor, and application to delete their data before
+destroying their persistent volumes. This is strongly encouraged for both
+security and ensuring that future users of the underlying disk resource are not
+penalized due to prior consumption of the disk capacity.</p>
+
+<h3>Implementation</h3>
+
+<p>A <code>Path</code> disk will have sub-directories created within the <code>root</code> which will be
+used to differentiate the different volumes that are created on it.</p>
+
+<p>A <code>Mount</code> disk will <strong>not</strong> have sub-directories created, allowing applications
+to use the full file system mounted on the device. This construct allows Mesos
+tasks to access volumes that contain pre-existing directory structures. This can
+be useful to simplify ingesting data such as a pre-existing Postgres database or
+HDFS data directory.</p>
+
+<p>Operators should be aware of these distinctions when inspecting or cleaning up
+remnant data.</p>
+
+	</div>
+</div>
+
+			
+	      <hr>
+
+				<!-- footer -->
+	      <div class="footer">
+	        <p>&copy; 2012-2015 <a href="http://apache.org">The Apache Software Foundation</a>.
+	        Apache Mesos, the Apache feather logo, and the Apache Mesos project logo are trademarks of The Apache Software Foundation.<p>
+	      </div><!-- /footer -->
+
+	    </div> <!-- /container -->
+
+	    <!-- JS -->
+	    <script src="//code.jquery.com/jquery-1.11.0.min.js" type="text/javascript"></script>
+			<script src="//netdna.bootstrapcdn.com/bootstrap/3.1.1/js/bootstrap.min.js" type="text/javascript"></script>
+    </body>
+</html>

Added: mesos/site/publish/documentation/latest/replicated-log-internals/index.html
URL: http://svn.apache.org/viewvc/mesos/site/publish/documentation/latest/replicated-log-internals/index.html?rev=1731786&view=auto
==============================================================================
--- mesos/site/publish/documentation/latest/replicated-log-internals/index.html (added)
+++ mesos/site/publish/documentation/latest/replicated-log-internals/index.html Tue Feb 23 04:18:42 2016
@@ -0,0 +1,218 @@
+<!DOCTYPE html>
+<html>
+    <head>
+        <meta charset="utf-8">
+        <title></title>
+		    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+		    <link href="//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css" rel="stylesheet">
+		    <link rel="alternate" type="application/atom+xml" title="Apache Mesos Blog" href="/blog/feed.xml">
+		    
+		    <link href="../../../assets/css/main.css" media="screen" rel="stylesheet" type="text/css" />
+				
+		    
+			
+			<!-- Google Analytics Magic -->
+			<script type="text/javascript">
+			  var _gaq = _gaq || [];
+			  _gaq.push(['_setAccount', 'UA-20226872-1']);
+			  _gaq.push(['_setDomainName', 'apache.org']);
+			  _gaq.push(['_trackPageview']);
+
+			  (function() {
+			    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+			    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+			    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+			  })();
+			</script>
+    </head>
+    <body>
+			<!-- magical breadcrumbs -->
+			<div class="topnav">
+			<ul class="breadcrumb">
+			  <li>
+					<div class="dropdown">
+					  <a data-toggle="dropdown" href="#">Apache Software Foundation <span class="caret"></span></a>
+					  <ul class="dropdown-menu" role="menu" aria-labelledby="dLabel">
+							<li><a href="http://www.apache.org">Apache Homepage</a></li>
+							<li><a href="http://www.apache.org/licenses/">License</a></li>
+					  	<li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>  
+					  	<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+							<li><a href="http://www.apache.org/security/">Security</a></li>
+					  </ul>
+					</div>
+				</li>
+				<li><a href="http://mesos.apache.org">Apache Mesos</a></li>
+				
+				
+					<li><a href="/documentation
+/">Documentation
+</a></li>
+				
+				
+			</ul><!-- /breadcrumb -->
+			</div>
+			
+			<!-- navbar excitement -->
+	    <div class="navbar navbar-static-top" role="navigation">
+	      <div class="navbar-inner">
+	        <div class="container">
+						<a href="/" class="logo"><img src="/assets/img/mesos_logo.png" alt="Apache Mesos logo" /></a>
+					<div class="nav-collapse">
+						<ul class="nav nav-pills navbar-right">
+						  <li><a href="/gettingstarted/">Getting Started</a></li>
+						  <li><a href="/documentation/latest/">Documentation</a></li>
+						  <li><a href="/downloads/">Downloads</a></li>
+						  <li><a href="/community/">Community</a></li>
+						</ul>
+					</div>
+	        </div>
+	      </div>
+	    </div><!-- /.navbar -->
+
+      <div class="container">
+
+			<div class="row-fluid">
+	<div class="col-md-4">
+		<h4>If you're new to Mesos</h4>
+		<p>See the <a href="/gettingstarted/">getting started</a> page for more information about downloading, building, and deploying Mesos.</p>
+		
+		<h4>If you'd like to get involved or you're looking for support</h4>
+		<p>See our <a href="/community/">community</a> page for more details.</p>
+	</div>
+	<div class="col-md-8">
+		<h1>The Mesos Replicated Log</h1>
+
+<p>Mesos provides a library that lets you create replicated fault-tolerant append-only logs; this library is known as the <em>replicated log</em>. The Mesos master uses this library to store cluster state in a replicated, durable way; the library is also available for use by frameworks to store replicated framework state or to implement the common &ldquo;<a href="https://en.wikipedia.org/wiki/State_machine_replication">replicated state machine</a>&rdquo; pattern.</p>
+
+<h2>What is the replicated log?</h2>
+
+<p><img src="/assets/img/documentation/log-cluster.png" alt="Aurora and the Replicated Log" /></p>
+
+<p>The replicated log provides <em>append-only</em> storage of <em>log entries</em>; each log entry can contain arbitrary data. The log is <em>replicated</em>, which means that each log entry has multiple copies in the system. Replication provides both fault tolerance and high availability. In the following example, we use <a href="https://aurora.apache.org/">Apache Aurora</a>, a fault tolerant scheduler (i.e., framework) running on top of Mesos, to show a typical replicated log setup.</p>
+
+<p>As shown above, there are multiple Aurora instances running simultaneously (for high availability), with one elected as the leader. There is a log replica on each host running Aurora. Aurora can access the replicated log through a thin library containing the log API.</p>
+
+<p>Typically, the leader is the only one that appends data to the log. Each log entry is replicated and sent to all replicas in the system. Replicas are strongly consistent. In other words, all replicas agree on the value of each log entry. Because the log is replicated, when Aurora decides to failover, it does not need to copy the log from a remote host.</p>
+
+<h2>Use Cases</h2>
+
+<p>The replicated log can be used to build a wide variety of distributed applications. For example, Aurora uses the replicated log to store all task states and job configurations. The Mesos master&rsquo;s <em>registry</em> also leverages the replicated log to store information about all slaves in the cluster.</p>
+
+<p>The replicated log is often used to allow applications to manage replicated state in a strongly consistent way. One way to do this is to store a state-mutating operation in each log entry and have all instances of the distributed application agree on the same initial state (e.g., empty state). The replicated log ensures that each application instance will observe the same sequence of log entries in the same order; as long as applying a state-mutating operation is deterministic, this ensures that all application instances will remain consistent with one another. If any instance of the application crashes, it can reconstruct the current version of the replicated state by starting at the initial state and re-applying all the logged mutations in order.</p>
+
+<p>If the log grows too large, an application can write out a snapshot and then delete all the log entries that occurred before the snapshot. Using this approach, we will be exposing a <a href="https://github.com/apache/mesos/blob/master/src/state/state.hpp">distributed state</a> abstraction in Mesos with replicated log as a backend.</p>
+
+<p>Similarly, the replicated log can be used to build <a href="https://en.wikipedia.org/wiki/State_machine_replication">replicated state machines</a>. In this scenario, each log entry contains a state machine command. Since replicas are strongly consistent, all servers will execute the same commands in the same order.</p>
+
+<h2>Implementation</h2>
+
+<p><img src="/assets/img/documentation/log-architecture.png" alt="Replicated Log Architecture" /></p>
+
+<p>The replicated log uses the <a href="https://en.wikipedia.org/wiki/Paxos_%28computer_science%29">Paxos consensus algorithm</a> to ensure that all replicas agree on every log entry’s value. It is similar to what’s described in <a href="https://ramcloud.stanford.edu/~ongaro/userstudy/paxos.pdf">these slides</a>. Readers who are familiar with Paxos can skip this section.</p>
+
+<p>The above figure is an implementation overview. When a user wants to append data to the log, the system creates a log writer. The log writer internally creates a coordinator. The coordinator contacts all replicas and executes the Paxos algorithm to make sure all replicas agree about the appended data. The coordinator is sometimes referred to as the <a href="https://en.wikipedia.org/wiki/Paxos_%28computer_science%29"><em>proposer</em></a>.</p>
+
+<p>Each replica keeps an array of log entries. The array index is the log position. Each log entry is composed of three components: the value written by the user, the associated Paxos state and a <em>learned</em> bit where true means this log entry’s value has been agreed. Therefore, a replica in our implementation is both an <a href="https://en.wikipedia.org/wiki/Paxos_%28computer_science%29"><em>acceptor</em></a> and a <a href="https://en.wikipedia.org/wiki/Paxos_%28computer_science%29"><em>learner</em></a>.</p>
+
+<h3>Reaching consensus for a single log entry</h3>
+
+<p>A Paxos round can help all replicas reach consensus on a single log entry’s value. It has two phases: a promise phase and a write phase. Note that we are using slightly different terminology from the <a href="https://research.microsoft.com/en-us/um/people/lamport/pubs/paxes-simple.pdf">original Paxos paper</a>. In our implementation, the <em>prepare</em> and <em>accept</em> phases in the original paper are referred to as the <em>promise</em> and <em>write</em> phases, respectively. Consequently, a prepare request (response) is referred to as a promise request (response), and an accept request (response) is referred to as a write request (response).</p>
+
+<p>To append value <em>X</em> to the log at position <em>p</em>, the coordinator first broadcasts a promise request to all replicas with proposal number <em>n</em>, asking replicas to promise that they will not respond to any request (promise/write request) with a proposal number lower than <em>n</em>. We assume that <em>n</em> is higher than any other previously used proposal number, and will explain how we do this later.</p>
+
+<p>When receiving the promise request, each replica checks its Paxos state to decide if it can safely respond to the request, depending on the promises it has previously given out. If the replica is able to give the promise (i.e., passes the proposal number check), it will first persist its promise (the proposal number <em>n</em>) on disk and reply with a promise response. If the replica has been previously written (i.e., accepted a write request), it needs to include the previously written value along with the proposal number used in that write request into the promise response it’s about to send out.</p>
+
+<p>Upon receiving promise responses from a <a href="https://en.wikipedia.org/wiki/Quorum_%28distributed_computing%29">quorum</a> of replicas, the coordinator first checks if there exist any previously written value from those responses. The append operation cannot continue if a previously written value is found because it’s likely that a value has already been agreed on for that log entry. This is one of the key ideas in Paxos: restrict the value that can be written to ensure consistency.</p>
+
+<p>If no previous written value is found, the coordinator broadcasts a write request to all replicas with value <em>X</em> and proposal number <em>n</em>. On receiving the write request, each replica checks the promise it has given again, and replies with a write response if the write request’s proposal number is equal to or larger than the proposal number it has promised. Once the coordinator receives write responses from a quorum of replicas, the append operation succeeds.</p>
+
+<h3>Optimizing append latency using Multi-Paxos</h3>
+
+<p>One naive solution to implement a replicated log is to run a full Paxos round (promise phase and write phase) for each log entry. As discussed in the <a href="https://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf">original Paxos paper</a>, if the leader is relatively stable, <em>Multi-Paxos</em> can be used to eliminate the need for the promise phase for most of the append operations, resulting in improved performance.</p>
+
+<p>To do that, we introduce a new type of promise request called an <em>implicit</em> promise request. An implicit promise request can be viewed as a <em>batched</em> promise request for a (potentially infinite) set of log entries. Broadcasting an implicit promise request is conceptually equivalent to broadcasting a promise request for every log entry whose value has not yet been agreed. If the implicit promise request broadcasted by a coordinator gets accepted by a quorum of replicas, this coordinator is no longer required to run the promise phase if it wants to append to a log entry whose value has not yet been agreed because the promise phase has already been done in <em>batch</em>. The coordinator in this case is therefore called <em>elected</em> (a.k.a., the leader), and has <em>exclusive</em> access to the replicated log. An elected coordinator may be <em>demoted</em> (or lose exclusive access) if another coordinator broadcasts an implicit promise request with a higher proposa
 l number.</p>
+
+<p>One question remaining is how can we find out those log entries whose values have not yet been agreed. We have a very simple solution: if a replica accepts an implicit promise request, it will include its largest known log position in the response. An elected coordinator will only append log entries at positions larger than <em>p</em>, where <em>p</em> is greater than any log position seen in these responses.</p>
+
+<p>Multi-Paxos has better performance if the leader is stable. The replicated log itself does not perform leader election. Instead, we rely on the user of the replicated log to choose a stable leader. For example, Aurora uses <a href="https://zookeeper.apache.org/">ZooKeeper</a> to elect the leader.</p>
+
+<h3>Enabling local reads</h3>
+
+<p>As discussed above, in our implementation, each replica is both an acceptor and a learner. Treating each replica as a learner allows us to do local reads without involving other replicas. When a log entry’s value has been agreed, the coordinator will broadcast a <em>learned</em> message to all replicas. Once a replica receives the learned message, it will set the learned bit in the corresponding log entry, indicating the value of that log entry has been agreed. We say a log entry is &ldquo;learned&rdquo; if its learned bit is set. The coordinator does not have to wait for replicas’ acknowledgments.</p>
+
+<p>To perform a read, the log reader will directly look up the underlying local replica. If the corresponding log entry is learned, the reader can just return the value to the user. Otherwise, a full Paxos round is needed to discover the agreed value. We always make sure that the replica co-located with the elected coordinator always has all log entries learned. We achieve that by running full Paxos rounds for those unlearned log entries after the coordinator is elected.</p>
+
+<h3>Reducing log size using garbage collection</h3>
+
+<p>In case the log grows large, the application has the choice to truncate the log. To perform a truncation, we append a special log entry whose value is the log position to which the user wants to truncate the log. A replica can actually truncate the log once this special log entry has been learned.</p>
+
+<h3>Unique proposal number</h3>
+
+<p>Many of the <a href="https://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf">Paxos research papers</a> assume that each proposal number is globally unique, and a coordinator can always come up with a proposal number that is larger than any other proposal numbers in the system. However, implementing this is not trivial, especially in a distributed environment. <a href="https://ramcloud.stanford.edu/~ongaro/userstudy/paxos.pdf">Some researchers suggest</a> concatenating a globally unique server id to each proposal number. But it is still not clear how to generate a globally unique id for each server.</p>
+
+<p>Our solution does not make the above assumptions. A coordinator can use an arbitrary proposal number initially. During the promise phase, if a replica knows a proposal number higher than the proposal number used by the coordinator, it will send the largest known proposal number back to the coordinator. The coordinator will retry the promise phase with a higher proposal number.</p>
+
+<p>To avoid livelock (e.g., when two coordinators completing), we inject a randomly delay between T and 2T before each retry. T has to be chosen carefully. On one hand, we want T >> broadcast time such that one coordinator usually times out and wins before others wake up. On the other hand, we want T to be as small as possible such that we can reduce the wait time. Currently, we use T = 100ms. This idea is actually borrowed from <a href="https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf">Raft</a>.</p>
+
+<h2>Automatic replica recovery</h2>
+
+<p>The algorithm described above has a critical vulnerability: if a replica loses its durable state (i.e., log files) due to either disk failure or operational error, that replica may cause inconsistency in the log if it is simply restarted and re-added to the group. The operator needs to stop the application on all hosts, copy the log files from the leader’s host, and then restart the application. Note that the operator cannot copy the log files from an arbitrary replica because copying an unlearned log entry may falsely assemble a quorum for an incorrect value, leading to inconsistency.</p>
+
+<p>To avoid the need for operator intervention in this situation, the Mesos replicated log includes support for <em>auto recovery</em>. As long as a quorum of replicas is working properly, the users of the application won’t notice any difference.</p>
+
+<h3>Non-voting replicas</h3>
+
+<p>To enable auto recovery, a key insight is that a replica that loses its durable state should not be allowed to respond to requests from coordinators after restart. Otherwise, it may introduce inconsistency in the log as it could have accepted a promise/write request which it would not have accepted if its previous Paxos state had not been lost.</p>
+
+<p>To solve that, we introduce a new status variable for each replica. A normal replica is said in VOTING status, meaning that it is allowed to respond to requests from coordinators. A replica with no persisted state is put in EMPTY status by default. A replica in EMPTY status is not allowed to respond to any request from coordinators.</p>
+
+<p>A replica in EMPTY status will be promoted to VOTING status if the following two conditions are met:</p>
+
+<ol>
+<li>a sufficient amount of missing log entries are recovered such that if other replicas fail, the remaining replicas can recover all the learned log entries, and</li>
+<li>its future responses to a coordinator will not break any of the promises (potentially lost) it has given out.</li>
+</ol>
+
+
+<p>In the following, we discuss how we achieve these two conditions.</p>
+
+<h3>Catch-up</h3>
+
+<p>To satisfy the above two conditions, a replica needs to perform <em>catch-up</em> to recover lost states. In other words, it will run Paxos rounds to find out those log entries whose values that have already been agreed. The question is how many log entries the local replica should catch-up before the above two conditions can be satisfied.</p>
+
+<p>We found that it is sufficient to catch-up those log entries from position <em>begin</em> to position <em>end</em> where <em>begin</em> is the smallest position seen in a quorum of VOTING replicas and <em>end</em> is the largest position seen in a quorum of VOTING replicas.</p>
+
+<p>Here is our correctness argument. For a log entry at position <em>e</em> where <em>e</em> is larger than <em>end</em>, obviously no value has been agreed on. Otherwise, we should find at least one VOTING replica in a quorum of replicas such that its end position is larger than <em>end</em>. For the same reason, a coordinator should not have collected enough promises for the log entry at position <em>e</em>. Therefore, it&rsquo;s safe for the recovering replica to respond requests for that log entry. For a log entry at position <em>b</em> where <em>b</em> is smaller than <em>begin</em>, it should have already been truncated and the truncation should have already been agreed. Therefore, allowing the recovering replica to respond requests for that position is also safe.</p>
+
+<h3>Auto initialization</h3>
+
+<p>Since we don’t allow an empty replica (a replica in EMPTY status) to respond to requests from coordinators, that raises a question for bootstrapping because initially, each replica is empty. The replicated log provides two choices here. One choice is to use a tool (`mesos-log) to explicitly initialize the log on each replica by setting the replica&rsquo;s status to VOTING, but that requires an extra step when setting up an application.</p>
+
+<p>The other choice is to do automatic initialization. Our idea is: we allow a replica in EMPTY status to become VOTING immediately if it finds all replicas are in EMPTY status. This is based on the assumption that the only time <em>all</em> replicas are in EMPTY status is during start-up. This may not be true if a catastrophic failure causes all replicas to lose their durable state, and that&rsquo;s exactly the reason we allow conservative users to disable auto-initialization.</p>
+
+<p>To do auto-initialization, if we use a single-phase protocol and allow a replica to directly transit from EMPTY status to VOTING status, we may run into a state where we cannot make progress even if all replicas are in EMPTY status initially. For example, say the quorum size is 2. All replicas are in EMPTY status initially. One replica will first set its status to VOTING because if finds all replicas are in EMPTY status. After that, neither the VOTING replica nor the EMPTY replicas can make progress. To solve this problem, we use a two-phase protocol and introduce an intermediate transient status (STARTING) between EMPTY and VOTING status. A replica in EMPTY status can transit to STARTING status if it finds all replicas are in either EMPTY or STARTING status. A replica in STARTING status can transit to VOTING status if it finds all replicas are in either STARTING or VOTING status. In that way, in our previous example, all replicas will be in STARTING status before any of them can
  transit to VOTING status.</p>
+
+<h2>Future work</h2>
+
+<p>Currently, replicated log does not support dynamic quorum size change, also known as <em>reconfiguration</em>. Supporting reconfiguration would allow us more easily to add, move or swap hosts for replicas. We plan to support reconfiguration in the future.</p>
+
+	</div>
+</div>
+
+			
+	      <hr>
+
+				<!-- footer -->
+	      <div class="footer">
+	        <p>&copy; 2012-2015 <a href="http://apache.org">The Apache Software Foundation</a>.
+	        Apache Mesos, the Apache feather logo, and the Apache Mesos project logo are trademarks of The Apache Software Foundation.<p>
+	      </div><!-- /footer -->
+
+	    </div> <!-- /container -->
+
+	    <!-- JS -->
+	    <script src="//code.jquery.com/jquery-1.11.0.min.js" type="text/javascript"></script>
+			<script src="//netdna.bootstrapcdn.com/bootstrap/3.1.1/js/bootstrap.min.js" type="text/javascript"></script>
+    </body>
+</html>

Added: mesos/site/publish/documentation/multiple-disk/index.html
URL: http://svn.apache.org/viewvc/mesos/site/publish/documentation/multiple-disk/index.html?rev=1731786&view=auto
==============================================================================
--- mesos/site/publish/documentation/multiple-disk/index.html (added)
+++ mesos/site/publish/documentation/multiple-disk/index.html Tue Feb 23 04:18:42 2016
@@ -0,0 +1,242 @@
+<!DOCTYPE html>
+<html>
+    <head>
+        <meta charset="utf-8">
+        <title></title>
+		    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+		    <link href="//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css" rel="stylesheet">
+		    <link rel="alternate" type="application/atom+xml" title="Apache Mesos Blog" href="/blog/feed.xml">
+		    
+		    <link href="../../assets/css/main.css" media="screen" rel="stylesheet" type="text/css" />
+				
+		    
+			
+			<!-- Google Analytics Magic -->
+			<script type="text/javascript">
+			  var _gaq = _gaq || [];
+			  _gaq.push(['_setAccount', 'UA-20226872-1']);
+			  _gaq.push(['_setDomainName', 'apache.org']);
+			  _gaq.push(['_trackPageview']);
+
+			  (function() {
+			    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+			    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+			    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+			  })();
+			</script>
+    </head>
+    <body>
+			<!-- magical breadcrumbs -->
+			<div class="topnav">
+			<ul class="breadcrumb">
+			  <li>
+					<div class="dropdown">
+					  <a data-toggle="dropdown" href="#">Apache Software Foundation <span class="caret"></span></a>
+					  <ul class="dropdown-menu" role="menu" aria-labelledby="dLabel">
+							<li><a href="http://www.apache.org">Apache Homepage</a></li>
+							<li><a href="http://www.apache.org/licenses/">License</a></li>
+					  	<li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>  
+					  	<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+							<li><a href="http://www.apache.org/security/">Security</a></li>
+					  </ul>
+					</div>
+				</li>
+				<li><a href="http://mesos.apache.org">Apache Mesos</a></li>
+				
+				
+					<li><a href="/documentation
+/">Documentation
+</a></li>
+				
+				
+			</ul><!-- /breadcrumb -->
+			</div>
+			
+			<!-- navbar excitement -->
+	    <div class="navbar navbar-static-top" role="navigation">
+	      <div class="navbar-inner">
+	        <div class="container">
+						<a href="/" class="logo"><img src="/assets/img/mesos_logo.png" alt="Apache Mesos logo" /></a>
+					<div class="nav-collapse">
+						<ul class="nav nav-pills navbar-right">
+						  <li><a href="/gettingstarted/">Getting Started</a></li>
+						  <li><a href="/documentation/latest/">Documentation</a></li>
+						  <li><a href="/downloads/">Downloads</a></li>
+						  <li><a href="/community/">Community</a></li>
+						</ul>
+					</div>
+	        </div>
+	      </div>
+	    </div><!-- /.navbar -->
+
+      <div class="container">
+
+			<div class="row-fluid">
+	<div class="col-md-4">
+		<h4>If you're new to Mesos</h4>
+		<p>See the <a href="/gettingstarted/">getting started</a> page for more information about downloading, building, and deploying Mesos.</p>
+		
+		<h4>If you'd like to get involved or you're looking for support</h4>
+		<p>See our <a href="/community/">community</a> page for more details.</p>
+	</div>
+	<div class="col-md-8">
+		<h1>Multiple Disks</h1>
+
+<p>Mesos provides a mechanism for operators to expose multiple disk resources. When
+creating <a href="/documentation/latest/./persistent-volume/">persistent volumes</a> frameworks can decide
+whether to use specific disks by examining the <code>source</code> field on the disk
+resources offered.</p>
+
+<p><code>Disk</code> resources come in three forms:</p>
+
+<ul>
+<li>A <code>Root</code> disk is presented by not having the <code>source</code> set in <code>DiskInfo</code>.</li>
+<li>A <code>Path</code> disk is presented by having the <code>PATH</code> enum set for <code>source</code> in
+<code>DiskInfo</code>. It also has a <code>root</code> which the operator uses to specify the
+directory to be used to store data.</li>
+<li>A <code>Mount</code> disk is presented by having the <code>MOUNT</code> enum set for <code>source</code> in
+<code>DiskInfo</code>. It also has a <code>root</code> which the operator uses to specify the
+mount point used to store data.</li>
+</ul>
+
+
+<p>Operators can use the JSON-formated <code>--resources</code> option on the agent to provide
+these different kind of disk resources on agent start-up.</p>
+
+<h4><code>Root</code> disk</h4>
+
+<p>A <code>Root</code> disk is the basic disk resource in Mesos. It usually maps to the
+storage on the main operating system drive that the operator has presented to
+the agent. Data is mapped into the <code>work_dir</code> of the agent.</p>
+
+<pre><code>    {
+      "resources" : [
+        {
+          "name" : "disk",
+          "type" : "SCALAR",
+          "scalar" : { "value" : 2048 },
+          "role" : &lt;framework_role&gt;
+        }
+      ]
+    }
+</code></pre>
+
+<h4><code>Path</code> disks</h4>
+
+<p>A <code>Path</code> disk is an auxiliary disk resource provided by the operator. This can
+can be carved up into smaller chunks by accepting less than the total amount as
+frameworks see fit. Common uses for this kind of disk are extra logging space,
+file archives or caches, or other non performance-critical applications.
+Operators can present extra disks on their agents as <code>Path</code> disks simply by
+creating a directory and making that the <code>root</code> of the <code>Path</code> in <code>DiskInfo</code>&rsquo;s
+<code>source</code>.</p>
+
+<p><code>Path</code> disks are also useful for mocking up a multiple disk environment by
+creating some directories on the operating system drive. This should only be
+done in a testing or staging environment.</p>
+
+<pre><code>    {
+      "resources" : [
+        {
+          "name" : "disk",
+          "type" : "SCALAR",
+          "scalar" : { "value" : 2048 },
+          "role" : &lt;framework_role&gt;,
+          "disk" : {
+            "source" : {
+              "type" : "PATH",
+              "path" : { "root" : "/mnt/data" }
+            }
+          }
+        }
+      ]
+    }
+</code></pre>
+
+<h4><code>Mount</code> disks</h4>
+
+<p>A <code>Mount</code> disk is an auxiliary disk resource provided by the operator. This
+<strong>cannot</strong> be carved up into smaller chunks by frameworks. This lack of
+flexibility allows operators to provide assurances to frameworks that they will
+have exclusive access to the disk device. Common uses for this kind of disk
+include database storage, write-ahead logs, or other performance-critical
+applications.</p>
+
+<p>Another requirement of <code>Mount</code> disks is that (on Linux) they map to a <code>mount</code>
+point in the <code>/proc/mounts</code> table. Operators should mount a physical disk with
+their preferred file system and provide the mount point as the <code>root</code> of the
+<code>Mount</code> in <code>DiskInfo</code>&rsquo;s <code>source</code>.</p>
+
+<p>Aside from the performance advantages of <code>Mount</code> disks, applications running on
+them should be able to rely on disk errors when they attempt to exceed the
+capacity of the volume. This holds true as long as the file system in use
+correctly propagates these errors. Due to this expectation, the <code>posix/disk</code>
+isolation is disabled for <code>Mount</code> disks.</p>
+
+<pre><code>    {
+      "resources" : [
+        {
+          "name" : "disk",
+          "type" : "SCALAR",
+          "scalar" : { "value" : 2048 },
+          "role" : &lt;framework_role&gt;,
+          "disk" : {
+            "source" : {
+              "type" : "MOUNT",
+              "mount" : { "root" : "/mnt/data" }
+            }
+          }
+        }
+      ]
+    }
+</code></pre>
+
+<h4><code>Block</code> disks</h4>
+
+<p>Mesos currently does not allow operators to expose raw block devices. It may do
+so in the future, but there are security and flexibility concerns that need to
+be addressed in a design document first.</p>
+
+<h3>Storage Management</h3>
+
+<p>Mesos currently does not clean up or destroy data when persistent volumes are
+destroyed. It may do so in the future; however, the expectation is currently
+upon the framework, executor, and application to delete their data before
+destroying their persistent volumes. This is strongly encouraged for both
+security and ensuring that future users of the underlying disk resource are not
+penalized due to prior consumption of the disk capacity.</p>
+
+<h3>Implementation</h3>
+
+<p>A <code>Path</code> disk will have sub-directories created within the <code>root</code> which will be
+used to differentiate the different volumes that are created on it.</p>
+
+<p>A <code>Mount</code> disk will <strong>not</strong> have sub-directories created, allowing applications
+to use the full file system mounted on the device. This construct allows Mesos
+tasks to access volumes that contain pre-existing directory structures. This can
+be useful to simplify ingesting data such as a pre-existing Postgres database or
+HDFS data directory.</p>
+
+<p>Operators should be aware of these distinctions when inspecting or cleaning up
+remnant data.</p>
+
+	</div>
+</div>
+
+			
+	      <hr>
+
+				<!-- footer -->
+	      <div class="footer">
+	        <p>&copy; 2012-2015 <a href="http://apache.org">The Apache Software Foundation</a>.
+	        Apache Mesos, the Apache feather logo, and the Apache Mesos project logo are trademarks of The Apache Software Foundation.<p>
+	      </div><!-- /footer -->
+
+	    </div> <!-- /container -->
+
+	    <!-- JS -->
+	    <script src="//code.jquery.com/jquery-1.11.0.min.js" type="text/javascript"></script>
+			<script src="//netdna.bootstrapcdn.com/bootstrap/3.1.1/js/bootstrap.min.js" type="text/javascript"></script>
+    </body>
+</html>

Added: mesos/site/publish/documentation/replicated-log-internals/index.html
URL: http://svn.apache.org/viewvc/mesos/site/publish/documentation/replicated-log-internals/index.html?rev=1731786&view=auto
==============================================================================
--- mesos/site/publish/documentation/replicated-log-internals/index.html (added)
+++ mesos/site/publish/documentation/replicated-log-internals/index.html Tue Feb 23 04:18:42 2016
@@ -0,0 +1,218 @@
+<!DOCTYPE html>
+<html>
+    <head>
+        <meta charset="utf-8">
+        <title></title>
+		    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+		    <link href="//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css" rel="stylesheet">
+		    <link rel="alternate" type="application/atom+xml" title="Apache Mesos Blog" href="/blog/feed.xml">
+		    
+		    <link href="../../assets/css/main.css" media="screen" rel="stylesheet" type="text/css" />
+				
+		    
+			
+			<!-- Google Analytics Magic -->
+			<script type="text/javascript">
+			  var _gaq = _gaq || [];
+			  _gaq.push(['_setAccount', 'UA-20226872-1']);
+			  _gaq.push(['_setDomainName', 'apache.org']);
+			  _gaq.push(['_trackPageview']);
+
+			  (function() {
+			    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+			    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+			    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+			  })();
+			</script>
+    </head>
+    <body>
+			<!-- magical breadcrumbs -->
+			<div class="topnav">
+			<ul class="breadcrumb">
+			  <li>
+					<div class="dropdown">
+					  <a data-toggle="dropdown" href="#">Apache Software Foundation <span class="caret"></span></a>
+					  <ul class="dropdown-menu" role="menu" aria-labelledby="dLabel">
+							<li><a href="http://www.apache.org">Apache Homepage</a></li>
+							<li><a href="http://www.apache.org/licenses/">License</a></li>
+					  	<li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>  
+					  	<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+							<li><a href="http://www.apache.org/security/">Security</a></li>
+					  </ul>
+					</div>
+				</li>
+				<li><a href="http://mesos.apache.org">Apache Mesos</a></li>
+				
+				
+					<li><a href="/documentation
+/">Documentation
+</a></li>
+				
+				
+			</ul><!-- /breadcrumb -->
+			</div>
+			
+			<!-- navbar excitement -->
+	    <div class="navbar navbar-static-top" role="navigation">
+	      <div class="navbar-inner">
+	        <div class="container">
+						<a href="/" class="logo"><img src="/assets/img/mesos_logo.png" alt="Apache Mesos logo" /></a>
+					<div class="nav-collapse">
+						<ul class="nav nav-pills navbar-right">
+						  <li><a href="/gettingstarted/">Getting Started</a></li>
+						  <li><a href="/documentation/latest/">Documentation</a></li>
+						  <li><a href="/downloads/">Downloads</a></li>
+						  <li><a href="/community/">Community</a></li>
+						</ul>
+					</div>
+	        </div>
+	      </div>
+	    </div><!-- /.navbar -->
+
+      <div class="container">
+
+			<div class="row-fluid">
+	<div class="col-md-4">
+		<h4>If you're new to Mesos</h4>
+		<p>See the <a href="/gettingstarted/">getting started</a> page for more information about downloading, building, and deploying Mesos.</p>
+		
+		<h4>If you'd like to get involved or you're looking for support</h4>
+		<p>See our <a href="/community/">community</a> page for more details.</p>
+	</div>
+	<div class="col-md-8">
+		<h1>The Mesos Replicated Log</h1>
+
+<p>Mesos provides a library that lets you create replicated fault-tolerant append-only logs; this library is known as the <em>replicated log</em>. The Mesos master uses this library to store cluster state in a replicated, durable way; the library is also available for use by frameworks to store replicated framework state or to implement the common &ldquo;<a href="https://en.wikipedia.org/wiki/State_machine_replication">replicated state machine</a>&rdquo; pattern.</p>
+
+<h2>What is the replicated log?</h2>
+
+<p><img src="/assets/img/documentation/log-cluster.png" alt="Aurora and the Replicated Log" /></p>
+
+<p>The replicated log provides <em>append-only</em> storage of <em>log entries</em>; each log entry can contain arbitrary data. The log is <em>replicated</em>, which means that each log entry has multiple copies in the system. Replication provides both fault tolerance and high availability. In the following example, we use <a href="https://aurora.apache.org/">Apache Aurora</a>, a fault tolerant scheduler (i.e., framework) running on top of Mesos, to show a typical replicated log setup.</p>
+
+<p>As shown above, there are multiple Aurora instances running simultaneously (for high availability), with one elected as the leader. There is a log replica on each host running Aurora. Aurora can access the replicated log through a thin library containing the log API.</p>
+
+<p>Typically, the leader is the only one that appends data to the log. Each log entry is replicated and sent to all replicas in the system. Replicas are strongly consistent. In other words, all replicas agree on the value of each log entry. Because the log is replicated, when Aurora decides to failover, it does not need to copy the log from a remote host.</p>
+
+<h2>Use Cases</h2>
+
+<p>The replicated log can be used to build a wide variety of distributed applications. For example, Aurora uses the replicated log to store all task states and job configurations. The Mesos master&rsquo;s <em>registry</em> also leverages the replicated log to store information about all slaves in the cluster.</p>
+
+<p>The replicated log is often used to allow applications to manage replicated state in a strongly consistent way. One way to do this is to store a state-mutating operation in each log entry and have all instances of the distributed application agree on the same initial state (e.g., empty state). The replicated log ensures that each application instance will observe the same sequence of log entries in the same order; as long as applying a state-mutating operation is deterministic, this ensures that all application instances will remain consistent with one another. If any instance of the application crashes, it can reconstruct the current version of the replicated state by starting at the initial state and re-applying all the logged mutations in order.</p>
+
+<p>If the log grows too large, an application can write out a snapshot and then delete all the log entries that occurred before the snapshot. Using this approach, we will be exposing a <a href="https://github.com/apache/mesos/blob/master/src/state/state.hpp">distributed state</a> abstraction in Mesos with replicated log as a backend.</p>
+
+<p>Similarly, the replicated log can be used to build <a href="https://en.wikipedia.org/wiki/State_machine_replication">replicated state machines</a>. In this scenario, each log entry contains a state machine command. Since replicas are strongly consistent, all servers will execute the same commands in the same order.</p>
+
+<h2>Implementation</h2>
+
+<p><img src="/assets/img/documentation/log-architecture.png" alt="Replicated Log Architecture" /></p>
+
+<p>The replicated log uses the <a href="https://en.wikipedia.org/wiki/Paxos_%28computer_science%29">Paxos consensus algorithm</a> to ensure that all replicas agree on every log entry’s value. It is similar to what’s described in <a href="https://ramcloud.stanford.edu/~ongaro/userstudy/paxos.pdf">these slides</a>. Readers who are familiar with Paxos can skip this section.</p>
+
+<p>The above figure is an implementation overview. When a user wants to append data to the log, the system creates a log writer. The log writer internally creates a coordinator. The coordinator contacts all replicas and executes the Paxos algorithm to make sure all replicas agree about the appended data. The coordinator is sometimes referred to as the <a href="https://en.wikipedia.org/wiki/Paxos_%28computer_science%29"><em>proposer</em></a>.</p>
+
+<p>Each replica keeps an array of log entries. The array index is the log position. Each log entry is composed of three components: the value written by the user, the associated Paxos state and a <em>learned</em> bit where true means this log entry’s value has been agreed. Therefore, a replica in our implementation is both an <a href="https://en.wikipedia.org/wiki/Paxos_%28computer_science%29"><em>acceptor</em></a> and a <a href="https://en.wikipedia.org/wiki/Paxos_%28computer_science%29"><em>learner</em></a>.</p>
+
+<h3>Reaching consensus for a single log entry</h3>
+
+<p>A Paxos round can help all replicas reach consensus on a single log entry’s value. It has two phases: a promise phase and a write phase. Note that we are using slightly different terminology from the <a href="https://research.microsoft.com/en-us/um/people/lamport/pubs/paxes-simple.pdf">original Paxos paper</a>. In our implementation, the <em>prepare</em> and <em>accept</em> phases in the original paper are referred to as the <em>promise</em> and <em>write</em> phases, respectively. Consequently, a prepare request (response) is referred to as a promise request (response), and an accept request (response) is referred to as a write request (response).</p>
+
+<p>To append value <em>X</em> to the log at position <em>p</em>, the coordinator first broadcasts a promise request to all replicas with proposal number <em>n</em>, asking replicas to promise that they will not respond to any request (promise/write request) with a proposal number lower than <em>n</em>. We assume that <em>n</em> is higher than any other previously used proposal number, and will explain how we do this later.</p>
+
+<p>When receiving the promise request, each replica checks its Paxos state to decide if it can safely respond to the request, depending on the promises it has previously given out. If the replica is able to give the promise (i.e., passes the proposal number check), it will first persist its promise (the proposal number <em>n</em>) on disk and reply with a promise response. If the replica has been previously written (i.e., accepted a write request), it needs to include the previously written value along with the proposal number used in that write request into the promise response it’s about to send out.</p>
+
+<p>Upon receiving promise responses from a <a href="https://en.wikipedia.org/wiki/Quorum_%28distributed_computing%29">quorum</a> of replicas, the coordinator first checks if there exist any previously written value from those responses. The append operation cannot continue if a previously written value is found because it’s likely that a value has already been agreed on for that log entry. This is one of the key ideas in Paxos: restrict the value that can be written to ensure consistency.</p>
+
+<p>If no previous written value is found, the coordinator broadcasts a write request to all replicas with value <em>X</em> and proposal number <em>n</em>. On receiving the write request, each replica checks the promise it has given again, and replies with a write response if the write request’s proposal number is equal to or larger than the proposal number it has promised. Once the coordinator receives write responses from a quorum of replicas, the append operation succeeds.</p>
+
+<h3>Optimizing append latency using Multi-Paxos</h3>
+
+<p>One naive solution to implement a replicated log is to run a full Paxos round (promise phase and write phase) for each log entry. As discussed in the <a href="https://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf">original Paxos paper</a>, if the leader is relatively stable, <em>Multi-Paxos</em> can be used to eliminate the need for the promise phase for most of the append operations, resulting in improved performance.</p>
+
+<p>To do that, we introduce a new type of promise request called an <em>implicit</em> promise request. An implicit promise request can be viewed as a <em>batched</em> promise request for a (potentially infinite) set of log entries. Broadcasting an implicit promise request is conceptually equivalent to broadcasting a promise request for every log entry whose value has not yet been agreed. If the implicit promise request broadcasted by a coordinator gets accepted by a quorum of replicas, this coordinator is no longer required to run the promise phase if it wants to append to a log entry whose value has not yet been agreed because the promise phase has already been done in <em>batch</em>. The coordinator in this case is therefore called <em>elected</em> (a.k.a., the leader), and has <em>exclusive</em> access to the replicated log. An elected coordinator may be <em>demoted</em> (or lose exclusive access) if another coordinator broadcasts an implicit promise request with a higher proposa
 l number.</p>
+
+<p>One question remaining is how can we find out those log entries whose values have not yet been agreed. We have a very simple solution: if a replica accepts an implicit promise request, it will include its largest known log position in the response. An elected coordinator will only append log entries at positions larger than <em>p</em>, where <em>p</em> is greater than any log position seen in these responses.</p>
+
+<p>Multi-Paxos has better performance if the leader is stable. The replicated log itself does not perform leader election. Instead, we rely on the user of the replicated log to choose a stable leader. For example, Aurora uses <a href="https://zookeeper.apache.org/">ZooKeeper</a> to elect the leader.</p>
+
+<h3>Enabling local reads</h3>
+
+<p>As discussed above, in our implementation, each replica is both an acceptor and a learner. Treating each replica as a learner allows us to do local reads without involving other replicas. When a log entry’s value has been agreed, the coordinator will broadcast a <em>learned</em> message to all replicas. Once a replica receives the learned message, it will set the learned bit in the corresponding log entry, indicating the value of that log entry has been agreed. We say a log entry is &ldquo;learned&rdquo; if its learned bit is set. The coordinator does not have to wait for replicas’ acknowledgments.</p>
+
+<p>To perform a read, the log reader will directly look up the underlying local replica. If the corresponding log entry is learned, the reader can just return the value to the user. Otherwise, a full Paxos round is needed to discover the agreed value. We always make sure that the replica co-located with the elected coordinator always has all log entries learned. We achieve that by running full Paxos rounds for those unlearned log entries after the coordinator is elected.</p>
+
+<h3>Reducing log size using garbage collection</h3>
+
+<p>In case the log grows large, the application has the choice to truncate the log. To perform a truncation, we append a special log entry whose value is the log position to which the user wants to truncate the log. A replica can actually truncate the log once this special log entry has been learned.</p>
+
+<h3>Unique proposal number</h3>
+
+<p>Many of the <a href="https://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf">Paxos research papers</a> assume that each proposal number is globally unique, and a coordinator can always come up with a proposal number that is larger than any other proposal numbers in the system. However, implementing this is not trivial, especially in a distributed environment. <a href="https://ramcloud.stanford.edu/~ongaro/userstudy/paxos.pdf">Some researchers suggest</a> concatenating a globally unique server id to each proposal number. But it is still not clear how to generate a globally unique id for each server.</p>
+
+<p>Our solution does not make the above assumptions. A coordinator can use an arbitrary proposal number initially. During the promise phase, if a replica knows a proposal number higher than the proposal number used by the coordinator, it will send the largest known proposal number back to the coordinator. The coordinator will retry the promise phase with a higher proposal number.</p>
+
+<p>To avoid livelock (e.g., when two coordinators completing), we inject a randomly delay between T and 2T before each retry. T has to be chosen carefully. On one hand, we want T >> broadcast time such that one coordinator usually times out and wins before others wake up. On the other hand, we want T to be as small as possible such that we can reduce the wait time. Currently, we use T = 100ms. This idea is actually borrowed from <a href="https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf">Raft</a>.</p>
+
+<h2>Automatic replica recovery</h2>
+
+<p>The algorithm described above has a critical vulnerability: if a replica loses its durable state (i.e., log files) due to either disk failure or operational error, that replica may cause inconsistency in the log if it is simply restarted and re-added to the group. The operator needs to stop the application on all hosts, copy the log files from the leader’s host, and then restart the application. Note that the operator cannot copy the log files from an arbitrary replica because copying an unlearned log entry may falsely assemble a quorum for an incorrect value, leading to inconsistency.</p>
+
+<p>To avoid the need for operator intervention in this situation, the Mesos replicated log includes support for <em>auto recovery</em>. As long as a quorum of replicas is working properly, the users of the application won’t notice any difference.</p>
+
+<h3>Non-voting replicas</h3>
+
+<p>To enable auto recovery, a key insight is that a replica that loses its durable state should not be allowed to respond to requests from coordinators after restart. Otherwise, it may introduce inconsistency in the log as it could have accepted a promise/write request which it would not have accepted if its previous Paxos state had not been lost.</p>
+
+<p>To solve that, we introduce a new status variable for each replica. A normal replica is said in VOTING status, meaning that it is allowed to respond to requests from coordinators. A replica with no persisted state is put in EMPTY status by default. A replica in EMPTY status is not allowed to respond to any request from coordinators.</p>
+
+<p>A replica in EMPTY status will be promoted to VOTING status if the following two conditions are met:</p>
+
+<ol>
+<li>a sufficient amount of missing log entries are recovered such that if other replicas fail, the remaining replicas can recover all the learned log entries, and</li>
+<li>its future responses to a coordinator will not break any of the promises (potentially lost) it has given out.</li>
+</ol>
+
+
+<p>In the following, we discuss how we achieve these two conditions.</p>
+
+<h3>Catch-up</h3>
+
+<p>To satisfy the above two conditions, a replica needs to perform <em>catch-up</em> to recover lost states. In other words, it will run Paxos rounds to find out those log entries whose values that have already been agreed. The question is how many log entries the local replica should catch-up before the above two conditions can be satisfied.</p>
+
+<p>We found that it is sufficient to catch-up those log entries from position <em>begin</em> to position <em>end</em> where <em>begin</em> is the smallest position seen in a quorum of VOTING replicas and <em>end</em> is the largest position seen in a quorum of VOTING replicas.</p>
+
+<p>Here is our correctness argument. For a log entry at position <em>e</em> where <em>e</em> is larger than <em>end</em>, obviously no value has been agreed on. Otherwise, we should find at least one VOTING replica in a quorum of replicas such that its end position is larger than <em>end</em>. For the same reason, a coordinator should not have collected enough promises for the log entry at position <em>e</em>. Therefore, it&rsquo;s safe for the recovering replica to respond requests for that log entry. For a log entry at position <em>b</em> where <em>b</em> is smaller than <em>begin</em>, it should have already been truncated and the truncation should have already been agreed. Therefore, allowing the recovering replica to respond requests for that position is also safe.</p>
+
+<h3>Auto initialization</h3>
+
+<p>Since we don’t allow an empty replica (a replica in EMPTY status) to respond to requests from coordinators, that raises a question for bootstrapping because initially, each replica is empty. The replicated log provides two choices here. One choice is to use a tool (`mesos-log) to explicitly initialize the log on each replica by setting the replica&rsquo;s status to VOTING, but that requires an extra step when setting up an application.</p>
+
+<p>The other choice is to do automatic initialization. Our idea is: we allow a replica in EMPTY status to become VOTING immediately if it finds all replicas are in EMPTY status. This is based on the assumption that the only time <em>all</em> replicas are in EMPTY status is during start-up. This may not be true if a catastrophic failure causes all replicas to lose their durable state, and that&rsquo;s exactly the reason we allow conservative users to disable auto-initialization.</p>
+
+<p>To do auto-initialization, if we use a single-phase protocol and allow a replica to directly transit from EMPTY status to VOTING status, we may run into a state where we cannot make progress even if all replicas are in EMPTY status initially. For example, say the quorum size is 2. All replicas are in EMPTY status initially. One replica will first set its status to VOTING because if finds all replicas are in EMPTY status. After that, neither the VOTING replica nor the EMPTY replicas can make progress. To solve this problem, we use a two-phase protocol and introduce an intermediate transient status (STARTING) between EMPTY and VOTING status. A replica in EMPTY status can transit to STARTING status if it finds all replicas are in either EMPTY or STARTING status. A replica in STARTING status can transit to VOTING status if it finds all replicas are in either STARTING or VOTING status. In that way, in our previous example, all replicas will be in STARTING status before any of them can
  transit to VOTING status.</p>
+
+<h2>Future work</h2>
+
+<p>Currently, replicated log does not support dynamic quorum size change, also known as <em>reconfiguration</em>. Supporting reconfiguration would allow us more easily to add, move or swap hosts for replicas. We plan to support reconfiguration in the future.</p>
+
+	</div>
+</div>
+
+			
+	      <hr>
+
+				<!-- footer -->
+	      <div class="footer">
+	        <p>&copy; 2012-2015 <a href="http://apache.org">The Apache Software Foundation</a>.
+	        Apache Mesos, the Apache feather logo, and the Apache Mesos project logo are trademarks of The Apache Software Foundation.<p>
+	      </div><!-- /footer -->
+
+	    </div> <!-- /container -->
+
+	    <!-- JS -->
+	    <script src="//code.jquery.com/jquery-1.11.0.min.js" type="text/javascript"></script>
+			<script src="//netdna.bootstrapcdn.com/bootstrap/3.1.1/js/bootstrap.min.js" type="text/javascript"></script>
+    </body>
+</html>