Posted to commits@storm.apache.org by bo...@apache.org on 2016/03/22 16:38:57 UTC

[31/51] [partial] storm git commit: STORM-1617: Versioned Docs

http://git-wip-us.apache.org/repos/asf/storm/blob/335bbf94/_site/documentation/Distributed-RPC.html
----------------------------------------------------------------------
diff --git a/_site/documentation/Distributed-RPC.html b/_site/documentation/Distributed-RPC.html
deleted file mode 100644
index b2bbdd2..0000000
--- a/_site/documentation/Distributed-RPC.html
+++ /dev/null
@@ -1,355 +0,0 @@
-<!DOCTYPE html>
-<html>
-    <head>
-    <meta charset="utf-8">
-    <meta http-equiv="X-UA-Compatible" content="IE=edge">
-    <meta name="viewport" content="width=device-width, initial-scale=1">
-
-    <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
-    <link rel="icon" href="/favicon.ico" type="image/x-icon">
-
-    <title>Distributed RPC</title>
-
-    <!-- Bootstrap core CSS -->
-    <link href="/assets/css/bootstrap.min.css" rel="stylesheet">
-    <!-- Bootstrap theme -->
-    <link href="/assets/css/bootstrap-theme.min.css" rel="stylesheet">
-
-    <!-- Custom styles for this template -->
-    <link rel="stylesheet" href="http://fortawesome.github.io/Font-Awesome/assets/font-awesome/css/font-awesome.css">
-    <link href="/css/style.css" rel="stylesheet">
-    <link href="/assets/css/owl.theme.css" rel="stylesheet">
-    <link href="/assets/css/owl.carousel.css" rel="stylesheet">
-    <script type="text/javascript" src="/assets/js/jquery.min.js"></script>
-    <script type="text/javascript" src="/assets/js/bootstrap.min.js"></script>
-    <script type="text/javascript" src="/assets/js/owl.carousel.min.js"></script>
-    <script type="text/javascript" src="/assets/js/storm.js"></script>
-    <!-- Just for debugging purposes. Don't actually copy these 2 lines! -->
-    <!--[if lt IE 9]><script src="../../assets/js/ie8-responsive-file-warning.js"></script><![endif]-->
-    
-    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
-    <!--[if lt IE 9]>
-      <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
-      <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
-    <![endif]-->
-  </head>
-
-
-  <body>
-    <header>
-  <div class="container-fluid">
-      <div class="row">
-          <div class="col-md-10">
-              <a href="/index.html"><img src="/images/logo.png" class="logo" /></a>
-            </div>
-            <div class="col-md-2">
-              <a href="/downloads.html" class="btn-std btn-block btn-download">Download</a>
-            </div>
-        </div>
-    </div>
-</header>
-<!--Header End-->
-<!--Navigation Begin-->
-<div class="navbar" role="banner">
-  <div class="container-fluid">
-      <div class="navbar-header">
-          <button class="navbar-toggle" type="button" data-toggle="collapse" data-target=".bs-navbar-collapse">
-                <span class="icon-bar"></span>
-                <span class="icon-bar"></span>
-                <span class="icon-bar"></span>
-            </button>
-        </div>
-        <nav class="collapse navbar-collapse bs-navbar-collapse" role="navigation">
-          <ul class="nav navbar-nav">
-              <li><a href="/index.html" id="home">Home</a></li>
-                <li><a href="/getting-help.html" id="getting-help">Getting Help</a></li>
-                <li><a href="/about/integrates.html" id="project-info">Project Information</a></li>
-                <li><a href="/documentation.html" id="documentation">Documentation</a></li>
-                <li><a href="/talksAndVideos.html">Talks and Slideshows</a></li>
-                <li class="dropdown">
-                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" id="contribute">Community <b class="caret"></b></a>
-                    <ul class="dropdown-menu">
-                        <li><a href="/contribute/Contributing-to-Storm.html">Contributing</a></li>
-                        <li><a href="/contribute/People.html">People</a></li>
-                        <li><a href="/contribute/BYLAWS.html">ByLaws</a></li>
-                    </ul>
-                </li>
-                <li><a href="/2015/11/05/storm096-released.html" id="news">News</a></li>
-            </ul>
-        </nav>
-    </div>
-</div>
-
-
-
-    <div class="container-fluid">
-    <h1 class="page-title">Distributed RPC</h1>
-          <div class="row">
-           	<div class="col-md-12">
-	             <!-- Documentation -->
-
-<p class="post-meta"></p>
-
-<p>The idea behind distributed RPC (DRPC) is to parallelize the computation of really intense functions on the fly using Storm. The Storm topology takes in as input a stream of function arguments, and it emits an output stream of the results for each of those function calls. </p>
-
-<p>DRPC is not so much a feature of Storm as it is a pattern expressed from Storm&#39;s primitives of streams, spouts, bolts, and topologies. DRPC could have been packaged as a separate library from Storm, but it&#39;s so useful that it&#39;s bundled with Storm.</p>
-
-<h3 id="high-level-overview">High level overview</h3>
-
-<p>Distributed RPC is coordinated by a &quot;DRPC server&quot; (Storm comes packaged with an implementation of this). The DRPC server coordinates receiving an RPC request, sending the request to the Storm topology, receiving the results from the Storm topology, and sending the results back to the waiting client. From a client&#39;s perspective, a distributed RPC call looks just like a regular RPC call. For example, here&#39;s how a client would compute the results for the &quot;reach&quot; function with the argument &quot;http://twitter.com&quot;:</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DRPCClient</span> <span class="n">client</span> <span class="o">=</span> <span class="k">new</span> <span class="n">DRPCClient</span><span class="o">(</span><span class="s">"drpc-host"</span><span class="o">,</span> <span class="mi">3772</span><span class="o">);</span>
-<span class="n">String</span> <span class="n">result</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="na">execute</span><span class="o">(</span><span class="s">"reach"</span><span class="o">,</span> <span class="s">"http://twitter.com"</span><span class="o">);</span>
-</code></pre></div>
-<p>The distributed RPC workflow looks like this:</p>
-
-<p><img src="images/drpc-workflow.png" alt="DRPC workflow"></p>
-
-<p>A client sends the DRPC server the name of the function to execute and the arguments to that function. The topology implementing that function uses a <code>DRPCSpout</code> to receive a function invocation stream from the DRPC server. Each function invocation is tagged with a unique id by the DRPC server. The topology then computes the result, and at the end of the topology a bolt called <code>ReturnResults</code> connects to the DRPC server and gives it the result for the function invocation id. The DRPC server then uses the id to match that result to the waiting client, unblocks the client, and sends it the result.</p>
-
-<h3 id="lineardrpctopologybuilder">LinearDRPCTopologyBuilder</h3>
-
-<p>Storm comes with a topology builder called <a href="/javadoc/apidocs/backtype/storm/drpc/LinearDRPCTopologyBuilder.html">LinearDRPCTopologyBuilder</a> that automates almost all the steps involved for doing DRPC. These include:</p>
-
-<ol>
-<li>Setting up the spout</li>
-<li>Returning the results to the DRPC server</li>
-<li>Providing functionality to bolts for doing finite aggregations over groups of tuples</li>
-</ol>
-
-<p>Let&#39;s look at a simple example. Here&#39;s the implementation of a DRPC topology that returns its input argument with a &quot;!&quot; appended:</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">static</span> <span class="kd">class</span> <span class="nc">ExclaimBolt</span> <span class="kd">extends</span> <span class="n">BaseBasicBolt</span> <span class="o">{</span>
-    <span class="kd">public</span> <span class="kt">void</span> <span class="n">execute</span><span class="o">(</span><span class="n">Tuple</span> <span class="n">tuple</span><span class="o">,</span> <span class="n">BasicOutputCollector</span> <span class="n">collector</span><span class="o">)</span> <span class="o">{</span>
-        <span class="n">String</span> <span class="n">input</span> <span class="o">=</span> <span class="n">tuple</span><span class="o">.</span><span class="na">getString</span><span class="o">(</span><span class="mi">1</span><span class="o">);</span>
-        <span class="n">collector</span><span class="o">.</span><span class="na">emit</span><span class="o">(</span><span class="k">new</span> <span class="n">Values</span><span class="o">(</span><span class="n">tuple</span><span class="o">.</span><span class="na">getValue</span><span class="o">(</span><span class="mi">0</span><span class="o">),</span> <span class="n">input</span> <span class="o">+</span> <span class="s">"!"</span><span class="o">));</span>
-    <span class="o">}</span>
-
-    <span class="kd">public</span> <span class="kt">void</span> <span class="n">declareOutputFields</span><span class="o">(</span><span class="n">OutputFieldsDeclarer</span> <span class="n">declarer</span><span class="o">)</span> <span class="o">{</span>
-        <span class="n">declarer</span><span class="o">.</span><span class="na">declare</span><span class="o">(</span><span class="k">new</span> <span class="n">Fields</span><span class="o">(</span><span class="s">"id"</span><span class="o">,</span> <span class="s">"result"</span><span class="o">));</span>
-    <span class="o">}</span>
-<span class="o">}</span>
-
-<span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="p">(</span><span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">Exception</span> <span class="o">{</span>
-    <span class="n">LinearDRPCTopologyBuilder</span> <span class="n">builder</span> <span class="o">=</span> <span class="k">new</span> <span class="n">LinearDRPCTopologyBuilder</span><span class="o">(</span><span class="s">"exclamation"</span><span class="o">);</span>
-    <span class="n">builder</span><span class="o">.</span><span class="na">addBolt</span><span class="o">(</span><span class="k">new</span> <span class="n">ExclaimBolt</span><span class="o">(),</span> <span class="mi">3</span><span class="o">);</span>
-    <span class="c1">// ...</span>
-<span class="o">}</span>
-</code></pre></div>
-<p>As you can see, there&#39;s very little to it. When creating the <code>LinearDRPCTopologyBuilder</code>, you tell it the name of the DRPC function for the topology. A single DRPC server can coordinate many functions, and the function name distinguishes the functions from one another. The first bolt you declare will take in as input 2-tuples, where the first field is the request id and the second field is the arguments for that request. <code>LinearDRPCTopologyBuilder</code> expects the last bolt to emit an output stream containing 2-tuples of the form [id, result]. Finally, all intermediate tuples must contain the request id as the first field.</p>
-
-<p>In this example, <code>ExclaimBolt</code> simply appends a &quot;!&quot; to the second field of the tuple. <code>LinearDRPCTopologyBuilder</code> handles the rest of the coordination of connecting to the DRPC server and sending results back.</p>
-
-<h3 id="local-mode-drpc">Local mode DRPC</h3>
-
-<p>DRPC can be run in local mode. Here&#39;s how to run the above example in local mode:</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">LocalDRPC</span> <span class="n">drpc</span> <span class="o">=</span> <span class="k">new</span> <span class="n">LocalDRPC</span><span class="o">();</span>
-<span class="n">LocalCluster</span> <span class="n">cluster</span> <span class="o">=</span> <span class="k">new</span> <span class="n">LocalCluster</span><span class="o">();</span>
-
-<span class="n">cluster</span><span class="o">.</span><span class="na">submitTopology</span><span class="o">(</span><span class="s">"drpc-demo"</span><span class="o">,</span> <span class="n">conf</span><span class="o">,</span> <span class="n">builder</span><span class="o">.</span><span class="na">createLocalTopology</span><span class="o">(</span><span class="n">drpc</span><span class="o">));</span>
-
-<span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"Results for 'hello':"</span> <span class="o">+</span> <span class="n">drpc</span><span class="o">.</span><span class="na">execute</span><span class="o">(</span><span class="s">"exclamation"</span><span class="o">,</span> <span class="s">"hello"</span><span class="o">));</span>
-
-<span class="n">cluster</span><span class="o">.</span><span class="na">shutdown</span><span class="o">();</span>
-<span class="n">drpc</span><span class="o">.</span><span class="na">shutdown</span><span class="o">();</span>
-</code></pre></div>
-<p>First you create a <code>LocalDRPC</code> object. This object simulates a DRPC server in process, just like how <code>LocalCluster</code> simulates a Storm cluster in process. Then you create the <code>LocalCluster</code> to run the topology in local mode. <code>LinearDRPCTopologyBuilder</code> has separate methods for creating local topologies and remote topologies. In local mode the <code>LocalDRPC</code> object does not bind to any ports so the topology needs to know about the object to communicate with it. This is why <code>createLocalTopology</code> takes in the <code>LocalDRPC</code> object as input.</p>
-
-<p>After launching the topology, you can do DRPC invocations using the <code>execute</code> method on <code>LocalDRPC</code>.</p>
-
-<h3 id="remote-mode-drpc">Remote mode DRPC</h3>
-
-<p>Using DRPC on an actual cluster is also straightforward. There are three steps:</p>
-
-<ol>
-<li>Launch DRPC server(s)</li>
-<li>Configure the locations of the DRPC servers</li>
-<li>Submit DRPC topologies to Storm cluster</li>
-</ol>
-
-<p>Launching a DRPC server can be done with the <code>storm</code> script and is just like launching Nimbus or the UI:</p>
-<div class="highlight"><pre><code class="language-" data-lang="">bin/storm drpc
-</code></pre></div>
-<p>Next, you need to configure your Storm cluster to know the locations of the DRPC server(s). This is how <code>DRPCSpout</code> knows from where to read function invocations. This can be done through the <code>storm.yaml</code> file or the topology configurations. Configuring this through the <code>storm.yaml</code> looks something like this:</p>
-<div class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="s">drpc.servers</span><span class="pi">:</span>
-  <span class="pi">-</span> <span class="s2">"</span><span class="s">drpc1.foo.com"</span>
-  <span class="pi">-</span> <span class="s2">"</span><span class="s">drpc2.foo.com"</span>
-</code></pre></div>
-<p>Finally, you launch DRPC topologies using <code>StormSubmitter</code> just like you launch any other topology. To run the above example in remote mode, you do something like this:</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">StormSubmitter</span><span class="o">.</span><span class="na">submitTopology</span><span class="o">(</span><span class="s">"exclamation-drpc"</span><span class="o">,</span> <span class="n">conf</span><span class="o">,</span> <span class="n">builder</span><span class="o">.</span><span class="na">createRemoteTopology</span><span class="o">());</span>
-</code></pre></div>
-<p><code>createRemoteTopology</code> is used to create topologies suitable for Storm clusters.</p>
-
-<h3 id="a-more-complex-example">A more complex example</h3>
-
-<p>The exclamation DRPC example was a toy example for illustrating the concepts of DRPC. Let&#39;s look at a more complex example which really needs the parallelism a Storm cluster provides for computing the DRPC function. The example we&#39;ll look at is computing the reach of a URL on Twitter.</p>
-
-<p>The reach of a URL is the number of unique people exposed to a URL on Twitter. To compute reach, you need to:</p>
-
-<ol>
-<li>Get all the people who tweeted the URL</li>
-<li>Get all the followers of all those people</li>
-<li>Deduplicate the set of followers</li>
-<li>Count the unique set of followers</li>
-</ol>
-
-<p>A single reach computation can involve thousands of database calls and tens of millions of follower records during the computation. It&#39;s a really, really intense computation. As you&#39;re about to see, implementing this function on top of Storm is dead simple. On a single machine, reach can take minutes to compute; on a Storm cluster, you can compute reach for even the hardest URLs in a couple of seconds.</p>
-
-<p>A sample reach topology is defined in storm-starter <a href="https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/storm/starter/ReachTopology.java">here</a>. Here&#39;s how you define the reach topology:</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">LinearDRPCTopologyBuilder</span> <span class="n">builder</span> <span class="o">=</span> <span class="k">new</span> <span class="n">LinearDRPCTopologyBuilder</span><span class="o">(</span><span class="s">"reach"</span><span class="o">);</span>
-<span class="n">builder</span><span class="o">.</span><span class="na">addBolt</span><span class="o">(</span><span class="k">new</span> <span class="n">GetTweeters</span><span class="o">(),</span> <span class="mi">3</span><span class="o">);</span>
-<span class="n">builder</span><span class="o">.</span><span class="na">addBolt</span><span class="o">(</span><span class="k">new</span> <span class="n">GetFollowers</span><span class="o">(),</span> <span class="mi">12</span><span class="o">)</span>
-        <span class="o">.</span><span class="na">shuffleGrouping</span><span class="o">();</span>
-<span class="n">builder</span><span class="o">.</span><span class="na">addBolt</span><span class="o">(</span><span class="k">new</span> <span class="n">PartialUniquer</span><span class="o">(),</span> <span class="mi">6</span><span class="o">)</span>
-        <span class="o">.</span><span class="na">fieldsGrouping</span><span class="o">(</span><span class="k">new</span> <span class="n">Fields</span><span class="o">(</span><span class="s">"id"</span><span class="o">,</span> <span class="s">"follower"</span><span class="o">));</span>
-<span class="n">builder</span><span class="o">.</span><span class="na">addBolt</span><span class="o">(</span><span class="k">new</span> <span class="n">CountAggregator</span><span class="o">(),</span> <span class="mi">2</span><span class="o">)</span>
-        <span class="o">.</span><span class="na">fieldsGrouping</span><span class="o">(</span><span class="k">new</span> <span class="n">Fields</span><span class="o">(</span><span class="s">"id"</span><span class="o">));</span>
-</code></pre></div>
-<p>The topology executes as four steps:</p>
-
-<ol>
-<li><code>GetTweeters</code> gets the users who tweeted the URL. It transforms an input stream of <code>[id, url]</code> into an output stream of <code>[id, tweeter]</code>. Each <code>url</code> tuple will map to many <code>tweeter</code> tuples.</li>
-<li><code>GetFollowers</code> gets the followers for the tweeters. It transforms an input stream of <code>[id, tweeter]</code> into an output stream of <code>[id, follower]</code>. Across all the tasks, there may of course be duplication of follower tuples when someone follows multiple people who tweeted the same URL.</li>
-<li><code>PartialUniquer</code> groups the followers stream by the follower id. This has the effect of the same follower going to the same task. So each task of <code>PartialUniquer</code> will receive mutually disjoint sets of followers. Once <code>PartialUniquer</code> receives all the follower tuples directed at it for the request id, it emits the unique count of its subset of followers.</li>
-<li>Finally, <code>CountAggregator</code> receives the partial counts from each of the <code>PartialUniquer</code> tasks and sums them up to complete the reach computation.</li>
-</ol>
-
-<p>Let&#39;s take a look at the <code>PartialUniquer</code> bolt:</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">class</span> <span class="nc">PartialUniquer</span> <span class="kd">extends</span> <span class="n">BaseBatchBolt</span> <span class="o">{</span>
-    <span class="n">BatchOutputCollector</span> <span class="n">_collector</span><span class="o">;</span>
-    <span class="n">Object</span> <span class="n">_id</span><span class="o">;</span>
-    <span class="n">Set</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">_followers</span> <span class="o">=</span> <span class="k">new</span> <span class="n">HashSet</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;();</span>
-
-    <span class="nd">@Override</span>
-    <span class="kd">public</span> <span class="kt">void</span> <span class="n">prepare</span><span class="o">(</span><span class="n">Map</span> <span class="n">conf</span><span class="o">,</span> <span class="n">TopologyContext</span> <span class="n">context</span><span class="o">,</span> <span class="n">BatchOutputCollector</span> <span class="n">collector</span><span class="o">,</span> <span class="n">Object</span> <span class="n">id</span><span class="o">)</span> <span class="o">{</span>
-        <span class="n">_collector</span> <span class="o">=</span> <span class="n">collector</span><span class="o">;</span>
-        <span class="n">_id</span> <span class="o">=</span> <span class="n">id</span><span class="o">;</span>
-    <span class="o">}</span>
-
-    <span class="nd">@Override</span>
-    <span class="kd">public</span> <span class="kt">void</span> <span class="n">execute</span><span class="o">(</span><span class="n">Tuple</span> <span class="n">tuple</span><span class="o">)</span> <span class="o">{</span>
-        <span class="n">_followers</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">tuple</span><span class="o">.</span><span class="na">getString</span><span class="o">(</span><span class="mi">1</span><span class="o">));</span>
-    <span class="o">}</span>
-
-    <span class="nd">@Override</span>
-    <span class="kd">public</span> <span class="kt">void</span> <span class="n">finishBatch</span><span class="o">()</span> <span class="o">{</span>
-        <span class="n">_collector</span><span class="o">.</span><span class="na">emit</span><span class="o">(</span><span class="k">new</span> <span class="n">Values</span><span class="o">(</span><span class="n">_id</span><span class="o">,</span> <span class="n">_followers</span><span class="o">.</span><span class="na">size</span><span class="o">()));</span>
-    <span class="o">}</span>
-
-    <span class="nd">@Override</span>
-    <span class="kd">public</span> <span class="kt">void</span> <span class="n">declareOutputFields</span><span class="o">(</span><span class="n">OutputFieldsDeclarer</span> <span class="n">declarer</span><span class="o">)</span> <span class="o">{</span>
-        <span class="n">declarer</span><span class="o">.</span><span class="na">declare</span><span class="o">(</span><span class="k">new</span> <span class="n">Fields</span><span class="o">(</span><span class="s">"id"</span><span class="o">,</span> <span class="s">"partial-count"</span><span class="o">));</span>
-    <span class="o">}</span>
-<span class="o">}</span>
-</code></pre></div>
-<p><code>PartialUniquer</code> implements <code>IBatchBolt</code> by extending <code>BaseBatchBolt</code>. A batch bolt provides a first-class API for processing a batch of tuples as a concrete unit. A new instance of the batch bolt is created for each request id, and Storm takes care of cleaning up the instances when appropriate.</p>
-
-<p>When <code>PartialUniquer</code> receives a follower tuple in the <code>execute</code> method, it adds it to the set for the request id in an internal <code>HashSet</code>. </p>
-
-<p>Batch bolts provide the <code>finishBatch</code> method which is called after all the tuples for this batch targeted at this task have been processed. In the callback, <code>PartialUniquer</code> emits a single tuple containing the unique count for its subset of follower ids.</p>
-
-<p>Under the hood, <code>CoordinatedBolt</code> is used to detect when a given bolt has received all of the tuples for any given request id. <code>CoordinatedBolt</code> makes use of direct streams to manage this coordination.</p>
-
-<p>The rest of the topology should be self-explanatory. As you can see, every single step of the reach computation is done in parallel, and defining the DRPC topology was extremely simple.</p>
-
-<h3 id="non-linear-drpc-topologies">Non-linear DRPC topologies</h3>
-
-<p><code>LinearDRPCTopologyBuilder</code> only handles &quot;linear&quot; DRPC topologies, where the computation is expressed as a sequence of steps (like reach). It&#39;s not hard to imagine functions that would require a more complicated topology with branching and merging of the bolts. For now, to do this you&#39;ll need to drop down into using <code>CoordinatedBolt</code> directly. Be sure to talk about your use case for non-linear DRPC topologies on the mailing list to inform the construction of more general abstractions for DRPC topologies.</p>
-
-<h3 id="how-lineardrpctopologybuilder-works">How LinearDRPCTopologyBuilder works</h3>
-
-<ul>
-<li>DRPCSpout emits [args, return-info]. return-info is the host and port of the DRPC server as well as the id generated by the DRPC server</li>
-<li>constructs a topology consisting of:
-
-<ul>
-<li>DRPCSpout</li>
-<li>PrepareRequest (generates a request id and creates a stream for the return info and a stream for the args)</li>
-<li>CoordinatedBolt wrappers and direct groupings</li>
-<li>JoinResult (joins the result with the return info)</li>
-<li>ReturnResults (connects to the DRPC server and returns the result)</li>
-</ul></li>
-<li>LinearDRPCTopologyBuilder is a good example of a higher level abstraction built on top of Storm&#39;s primitives</li>
-</ul>
-
-<h3 id="advanced">Advanced</h3>
-
-<ul>
-<li>KeyedFairBolt for weaving the processing of multiple requests at the same time</li>
-<li>How to use <code>CoordinatedBolt</code> directly</li>
-</ul>
-
-
-
-	          </div>
-	       </div>
-	  </div>
-<footer>
-    <div class="container-fluid">
-        <div class="row">
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>Meetups</h5>
-                    <ul class="latest-news">
-                        
-                        <li><a href="http://www.meetup.com/Apache-Storm-Apache-Kafka/">Apache Storm & Apache Kafka</a> <span class="small">(Sunnyvale, CA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Apache Storm & Kafka Users</a> <span class="small">(Seattle, WA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/New-York-City-Storm-User-Group/">NYC Storm User Group</a> <span class="small">(New York, NY)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/Bay-Area-Stream-Processing">Bay Area Stream Processing</a> <span class="small">(Emeryville, CA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/Boston-Storm-Users/">Boston Realtime Data</a> <span class="small">(Boston, MA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/storm-london">London Storm User Group</a> <span class="small">(London, UK)</span></li>
-                        
-                        <!-- <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Seatle, WA</a> <span class="small">(27 Jun 2015)</span></li> -->
-                    </ul>
-                </div>
-            </div>
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>About Storm</h5>
-                    <p>Storm integrates with any queueing system and any database system. Storm's spout abstraction makes it easy to integrate a new queuing system. Likewise, integrating Storm with database systems is easy.</p>
-               </div>
-            </div>
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>First Look</h5>
-                    <ul class="footer-list">
-                        <li><a href="/documentation/Rationale.html">Rationale</a></li>
-                        <li><a href="/tutorial.html">Tutorial</a></li>
-                        <li><a href="/documentation/Setting-up-development-environment.html">Setting up development environment</a></li>
-                        <li><a href="/documentation/Creating-a-new-Storm-project.html">Creating a new Storm project</a></li>
-                    </ul>
-                </div>
-            </div>
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>Documentation</h5>
-                    <ul class="footer-list">
-                        <li><a href="/doc-index.html">Index</a></li>
-                        <li><a href="/documentation.html">Manual</a></li>
-                        <li><a href="https://storm.apache.org/javadoc/apidocs/index.html">Javadoc</a></li>
-                        <li><a href="/documentation/FAQ.html">FAQ</a></li>
-                    </ul>
-                </div>
-            </div>
-        </div>
-        <hr/>
-        <div class="row">   
-            <div class="col-md-12">
-                <p align="center">Copyright © 2015 <a href="http://www.apache.org">Apache Software Foundation</a>. All Rights Reserved. 
-                    <br>Apache Storm, Apache, the Apache feather logo, and the Apache Storm project logos are trademarks of The Apache Software Foundation. 
-                    <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p>
-            </div>
-        </div>
-    </div>
-</footer>
-<!--Footer End-->
-<!-- Scroll to top -->
-<span class="totop"><a href="#"><i class="fa fa-angle-up"></i></a></span> 
-
-</body>
-
-</html>
-
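
Note on the removed Distributed-RPC.html page: its reach example relies on grouping the follower stream by follower id, so that each PartialUniquer task sees a disjoint subset and the partial counts can simply be summed. The following is a minimal plain-Java sketch of that idea only, not Storm code. The class name ReachSketch, the helper partitionFor, and the hash-modulo partitioning are assumptions made for illustration; they stand in for the fields grouping and do not reflect how Storm assigns tuples to tasks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the PartialUniquer / CountAggregator steps.
class ReachSketch {

    // Assumption: models a fields grouping on "follower". The same follower
    // always lands in the same partition, so partitions are disjoint.
    static int partitionFor(String follower, int numTasks) {
        return Math.floorMod(follower.hashCode(), numTasks);
    }

    // Deduplicates within each partition, then sums the partial counts.
    static int reach(List<String> followerTuples, int numTasks) {
        List<Set<String>> partials = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            partials.add(new HashSet<>());
        }
        // "PartialUniquer": each task keeps a set of the followers it has seen.
        for (String follower : followerTuples) {
            partials.get(partitionFor(follower, numTasks)).add(follower);
        }
        // "CountAggregator": add up the per-task unique counts.
        int total = 0;
        for (Set<String> partial : partials) {
            total += partial.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // Duplicates appear when someone follows several tweeters of the URL.
        List<String> followers =
                Arrays.asList("alice", "bob", "alice", "carol", "bob", "dave");
        System.out.println(reach(followers, 6)); // prints 4
    }
}
```

Because every copy of a given follower lands in the same partition, no follower is counted twice across partitions, which is why summing the partial counts gives the number of unique followers regardless of the number of tasks.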

http://git-wip-us.apache.org/repos/asf/storm/blob/335bbf94/_site/documentation/FAQ.html
----------------------------------------------------------------------
diff --git a/_site/documentation/FAQ.html b/_site/documentation/FAQ.html
deleted file mode 100644
index e9781a0..0000000
--- a/_site/documentation/FAQ.html
+++ /dev/null
@@ -1,303 +0,0 @@
-<!DOCTYPE html>
-<html>
-    <head>
-    <meta charset="utf-8">
-    <meta http-equiv="X-UA-Compatible" content="IE=edge">
-    <meta name="viewport" content="width=device-width, initial-scale=1">
-
-    <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
-    <link rel="icon" href="/favicon.ico" type="image/x-icon">
-
-    <title>FAQ</title>
-
-    <!-- Bootstrap core CSS -->
-    <link href="/assets/css/bootstrap.min.css" rel="stylesheet">
-    <!-- Bootstrap theme -->
-    <link href="/assets/css/bootstrap-theme.min.css" rel="stylesheet">
-
-    <!-- Custom styles for this template -->
-    <link rel="stylesheet" href="http://fortawesome.github.io/Font-Awesome/assets/font-awesome/css/font-awesome.css">
-    <link href="/css/style.css" rel="stylesheet">
-    <link href="/assets/css/owl.theme.css" rel="stylesheet">
-    <link href="/assets/css/owl.carousel.css" rel="stylesheet">
-    <script type="text/javascript" src="/assets/js/jquery.min.js"></script>
-    <script type="text/javascript" src="/assets/js/bootstrap.min.js"></script>
-    <script type="text/javascript" src="/assets/js/owl.carousel.min.js"></script>
-    <script type="text/javascript" src="/assets/js/storm.js"></script>
-    <!-- Just for debugging purposes. Don't actually copy these 2 lines! -->
-    <!--[if lt IE 9]><script src="../../assets/js/ie8-responsive-file-warning.js"></script><![endif]-->
-    
-    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
-    <!--[if lt IE 9]>
-      <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
-      <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
-    <![endif]-->
-  </head>
-
-
-  <body>
-    <header>
-  <div class="container-fluid">
-      <div class="row">
-          <div class="col-md-10">
-              <a href="/index.html"><img src="/images/logo.png" class="logo" /></a>
-            </div>
-            <div class="col-md-2">
-              <a href="/downloads.html" class="btn-std btn-block btn-download">Download</a>
-            </div>
-        </div>
-    </div>
-</header>
-<!--Header End-->
-<!--Navigation Begin-->
-<div class="navbar" role="banner">
-  <div class="container-fluid">
-      <div class="navbar-header">
-          <button class="navbar-toggle" type="button" data-toggle="collapse" data-target=".bs-navbar-collapse">
-                <span class="icon-bar"></span>
-                <span class="icon-bar"></span>
-                <span class="icon-bar"></span>
-            </button>
-        </div>
-        <nav class="collapse navbar-collapse bs-navbar-collapse" role="navigation">
-          <ul class="nav navbar-nav">
-              <li><a href="/index.html" id="home">Home</a></li>
-                <li><a href="/getting-help.html" id="getting-help">Getting Help</a></li>
-                <li><a href="/about/integrates.html" id="project-info">Project Information</a></li>
-                <li><a href="/documentation.html" id="documentation">Documentation</a></li>
-                <li><a href="/talksAndVideos.html">Talks and Slideshows</a></li>
-                <li class="dropdown">
-                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" id="contribute">Community <b class="caret"></b></a>
-                    <ul class="dropdown-menu">
-                        <li><a href="/contribute/Contributing-to-Storm.html">Contributing</a></li>
-                        <li><a href="/contribute/People.html">People</a></li>
-                        <li><a href="/contribute/BYLAWS.html">ByLaws</a></li>
-                    </ul>
-                </li>
-                <li><a href="/2015/11/05/storm096-released.html" id="news">News</a></li>
-            </ul>
-        </nav>
-    </div>
-</div>
-
-
-
-    <div class="container-fluid">
-    <h1 class="page-title">FAQ</h1>
-          <div class="row">
-           	<div class="col-md-12">
-	             <!-- Documentation -->
-
-<p class="post-meta"></p>
-
-<h2 id="best-practices">Best Practices</h2>
-
-<h3 id="what-rules-of-thumb-can-you-give-me-for-configuring-storm-trident">What rules of thumb can you give me for configuring Storm+Trident?</h3>
-
-<ul>
-<li>Make the number of workers a multiple of the number of machines, the parallelism a multiple of the number of workers, and the number of Kafka partitions a multiple of the spout parallelism.</li>
-<li>Use one worker per topology per machine</li>
-<li>Start with fewer, larger aggregators, one per machine with workers on it</li>
-<li>Use the isolation scheduler</li>
-<li>Use one acker per worker -- 0.9 makes that the default, but earlier versions do not.</li>
-<li>Enable GC logging; you should see very few major GCs if things are in reasonable shape.</li>
-<li>Set the Trident batch millis to about 50% of your typical end-to-end latency.</li>
-<li>Start with a max spout pending that is certainly too small -- one for Trident, or the number of executors for Storm -- and increase it until you stop seeing changes in the flow. You&#39;ll probably end up with something near <code>2*(throughput in recs/sec)*(end-to-end latency)</code> (2x the Little&#39;s law capacity).</li>
-</ul>
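-
-<p>As a rough illustration of the last rule of thumb, the arithmetic can be sketched in Java. The class name, method names and sample numbers are hypothetical and not part of Storm; the sketch only evaluates the formula quoted above and does not set any configuration.</p>
-

```java
// Sketch only: evaluates the FAQ's rule of thumb for max spout pending.
// Names and sample numbers are illustrative assumptions, not Storm APIs.
final class SpoutPendingEstimate {

    /** Little's law: items in flight = arrival rate * time in the system. */
    static long littlesLawCapacity(double recordsPerSecond, double latencySeconds) {
        return Math.round(recordsPerSecond * latencySeconds);
    }

    /** The rule of thumb above: about twice the Little's law capacity. */
    static long suggestedMaxPending(double recordsPerSecond, double latencySeconds) {
        return 2 * littlesLawCapacity(recordsPerSecond, latencySeconds);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 5,000 records/sec, 200 ms end-to-end latency.
        System.out.println(suggestedMaxPending(5_000, 0.2)); // prints 2000
    }
}
```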
-
-<h3 id="what-are-some-of-the-best-ways-to-get-a-worker-to-mysteriously-and-bafflingly-die">What are some of the best ways to get a worker to mysteriously and bafflingly die?</h3>
-
-<ul>
-<li>Do you have write access to the log directory?</li>
-<li>Are you blowing out your heap?</li>
-<li>Are all the right libraries installed on all of the workers?</li>
-<li>Is the zookeeper hostname still set to localhost?</li>
-<li>Did you supply a correct, unique hostname -- one that resolves back to the machine -- to each worker, and put it in the storm conf file?</li>
-<li>Have you opened firewall/security group permissions <em>bidirectionally</em> among a) all the workers, b) the Storm master, c) ZooKeeper? Also, from the workers to any Kafka/Kestrel/database/etc. that your topology accesses? Use netcat to poke the appropriate ports and be sure.</li>
-</ul>
-
-<h3 id="halp-i-cannot-see">Halp! I cannot see:</h3>
-
-<ul>
-<li><strong>my logs</strong> Logs by default go to $STORM_HOME/logs. Check that you have write permissions to that directory. They are configured in:
-
-<ul>
-<li>log4j2/{cluster, worker}.xml (&gt; 0.9);</li>
-<li>logback/cluster.xml (0.9);</li>
-<li>log4j/*.properties in earlier versions (&lt; 0.9).</li>
-</ul></li>
-<li><strong>final JVM settings</strong> Add the <code>-XX:+PrintFlagsFinal</code> command-line option in the childopts (see the conf file).</li>
-<li><strong>final Java system properties</strong> Add <code>Properties props = System.getProperties(); props.list(System.out);</code> near where you build your topology.</li>
-</ul>
-
-<h3 id="how-many-workers-should-i-use">How many Workers should I use?</h3>
-
-<p>The total number of workers is set by the supervisors -- there&#39;s some number of JVM slots each supervisor will superintend. The thing you set on the topology is how many worker slots it will try to claim.</p>
-
-<p>There&#39;s no great reason to use more than one worker per topology per machine.</p>
-
-<p>With one topology running on three 8-core nodes, and parallelism hint 24, each bolt gets 8 executors per machine, i.e. one for each core. There are three big benefits to running three workers (with 8 assigned executors each) compared to running, say, 24 workers (one assigned executor each).</p>
-
-<p>First, data that is repartitioned (shuffles or group-bys) to executors in the same worker will not have to hit the transfer buffer. Instead, tuples are deposited directly from send to receive buffer. That&#39;s a big win. By contrast, if the destination executor were on the same machine in a different worker, it would have to go send -&gt; worker transfer -&gt; local socket -&gt; worker recv -&gt; exec recv buffer. It doesn&#39;t hit the network card, but it&#39;s not as big a win as when executors are in the same worker.</p>
-
-<p>Second, you&#39;re typically better off with three aggregators that each have a very large backing cache than with twenty-four aggregators that each have a small one. This reduces the effect of skew and improves LRU efficiency.</p>
-
-<p>Lastly, fewer workers reduce control flow chatter.</p>
-
-<h2 id="topology">Topology</h2>
-
-<h3 id="can-a-trident-topology-have-multiple-streams">Can a Trident topology have Multiple Streams?</h3>
-
-<blockquote>
-<p>Can a Trident Topology work like a workflow with conditional paths (if-else)? e.g. A Spout (S1) connects to a bolt (B0) which based on certain values in the incoming tuple routes them to either bolt (B1) or bolt (B2) but not both.</p>
-</blockquote>
-
-<p>A Trident &quot;each&quot; operator returns a Stream object, which you can store in a variable. You can then run multiple eaches on the same Stream to split it, e.g.: </p>
-<div class="highlight"><pre><code class="language-" data-lang="">    Stream s = topology.each(...).groupBy(...).aggregate(...);
-    Stream branch1 = s.each(..., FilterA);
-    Stream branch2 = s.each(..., FilterB);
-</code></pre></div>
-<p>You can join streams with join, merge or multiReduce.</p>
-
-<p>At the time of writing, you can&#39;t emit to multiple output streams from Trident -- see <a href="https://issues.apache.org/jira/browse/STORM-68">STORM-68</a>.</p>
-
-<h2 id="spouts">Spouts</h2>
-
-<h3 id="what-is-a-coordinator-and-why-are-there-several">What is a coordinator, and why are there several?</h3>
-
-<p>A trident-spout is actually run within a storm <em>bolt</em>. The storm-spout of a trident topology is the MasterBatchCoordinator -- it coordinates trident batches and is the same no matter what spouts you use. A batch is born when the MBC dispenses a seed tuple to each of the spout-coordinators. The spout-coordinator bolts know how your particular spouts should cooperate -- so in the kafka case, it&#39;s what helps figure out what partition and offset range each spout should pull from.</p>
-
-<h3 id="what-can-i-store-into-the-spout-39-s-metadata-record">What can I store into the spout&#39;s metadata record?</h3>
-
-<p>You should only store static data, and as little of it as possible, in the metadata record (note: maybe you <em>can</em> store more interesting things; you shouldn&#39;t, though).</p>
-
-<h3 id="how-often-is-the-39-emitpartitionbatchnew-39-function-called">How often is the &#39;emitPartitionBatchNew&#39; function called?</h3>
-
-<p>Since the MBC is the actual spout, all the tuples in a batch are just members of its tuple tree. That means Storm&#39;s &quot;max spout pending&quot; config effectively defines the number of concurrent batches Trident runs. The MBC emits a new batch if it has fewer than max-spout-pending tuples pending and if at least one <a href="https://github.com/apache/storm/blob/master/conf/defaults.yaml#L115">trident batch interval</a> has passed since the last batch.</p>
-
-<h3 id="if-nothing-was-emitted-does-trident-slow-down-the-calls">If nothing was emitted does Trident slow down the calls?</h3>
-
-<p>Yes, there&#39;s a pluggable &quot;spout wait strategy&quot;; the default is to sleep for a <a href="https://github.com/apache/storm/blob/master/conf/defaults.yaml#L110">configurable amount of time</a>.</p>
-
-<h3 id="ok-then-what-is-the-trident-batch-interval-for">OK, then what is the trident batch interval for?</h3>
-
-<p>You know how computers of the 486 era had a <a href="http://en.wikipedia.org/wiki/Turbo_button">turbo button</a> on them? It&#39;s like that. </p>
-
-<p>Actually, it has two practical uses. One is to throttle spouts that poll a remote source without throttling processing. For example, we have a spout that looks in a given S3 bucket for a new batch-uploaded file to read, linebreak and emit. We don&#39;t want it hitting S3 more than every few seconds: files don&#39;t show up more than once every few minutes, and a batch takes a few seconds to process.</p>
-
-<p>The other is to limit overpressure on the internal queues during startup or under a heavy burst load -- if the spouts spring to life and suddenly jam ten batches&#39; worth of records into the system, you could have a mass of less-urgent tuples from batch 7 clog up the transfer buffer and prevent the $commit tuple from batch 3 from getting through (or even just the regular old tuples from batch 3). What we do is set the trident batch interval to about half the typical end-to-end processing latency -- if it takes 600ms to process a batch, it&#39;s OK to only kick off a batch every 300ms.</p>
-
-<p>Note that this is a cap, not an additional delay -- with a period of 300ms, if your batch takes 258ms Trident will only delay an additional 42ms.</p>
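-
-<p>The &quot;cap, not an additional delay&quot; behaviour can be sketched as simple arithmetic. This is an illustration of the statement above, not Trident&#39;s actual scheduling code; the class and method names are hypothetical.</p>
-

```java
// Sketch only: models the batch interval as a minimum spacing between batches.
// This is not Trident's implementation; names are illustrative assumptions.
final class BatchIntervalCap {

    /** Extra wait before the next batch may start, given how long the last one took. */
    static long remainingDelayMillis(long intervalMillis, long elapsedMillis) {
        return Math.max(0L, intervalMillis - elapsedMillis);
    }

    public static void main(String[] args) {
        // The FAQ's example: a 300 ms interval and a batch that took 258 ms.
        System.out.println(remainingDelayMillis(300, 258)); // prints 42
        // A batch slower than the interval incurs no extra delay.
        System.out.println(remainingDelayMillis(300, 600)); // prints 0
    }
}
```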
-
-<h3 id="how-do-you-set-the-batch-size">How do you set the batch size?</h3>
-
-<p>Trident doesn&#39;t place its own limits on the batch count. In the case of the Kafka spout, the max fetch bytes size divided by the average record size defines the effective number of records per sub-batch partition.</p>
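-
-<p>A minimal sketch of that division follows. The fetch size and record size are made-up sample values, and the class and method names are hypothetical rather than part of the Kafka spout.</p>
-

```java
// Sketch only: estimates records per partition fetch from two sample numbers.
// Names and values are illustrative assumptions, not Kafka spout settings.
final class BatchSizeEstimate {

    /** Effective records per sub-batch partition = max fetch bytes / average record size. */
    static long recordsPerPartitionFetch(long maxFetchBytes, long avgRecordBytes) {
        if (avgRecordBytes <= 0) {
            throw new IllegalArgumentException("avgRecordBytes must be positive");
        }
        return maxFetchBytes / avgRecordBytes;
    }

    public static void main(String[] args) {
        // Hypothetical: a 1 MiB fetch limit and 512-byte records on average.
        System.out.println(recordsPerPartitionFetch(1_048_576, 512)); // prints 2048
    }
}
```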
-
-<h3 id="how-do-i-resize-a-batch">How do I resize a batch?</h3>
-
-<p>The Trident batch is a somewhat overloaded facility. Together with the number of partitions, the batch size is constrained by or serves to define:</p>
-
-<ol>
-<li>the unit of transactional safety (tuples at risk vs time);</li>
-<li>per partition, an effective windowing mechanism for windowed stream analytics;</li>
-<li>per partition, the number of simultaneous queries that will be made by a partitionQuery, partitionPersist, etc.;</li>
-<li>per partition, the number of records convenient for the spout to dispatch at the same time.</li>
-</ol>
-
-<p>You can&#39;t change the overall batch size once generated, but you can change the number of partitions -- do a shuffle and then change the parallelism hint.</p>
-
-<h2 id="time-series">Time Series</h2>
-
-<h3 id="how-do-i-aggregate-events-by-time">How do I aggregate events by time?</h3>
-
-<p>If you have records with an immutable timestamp, and you would like to count, average or otherwise aggregate them into discrete time buckets, Trident is an excellent and scalable solution.</p>
-
-<p>Write an <code>Each</code> function that turns the timestamp into a time bucket: if the bucket size is &quot;by hour&quot;, then the timestamp <code>2013-08-08 12:34:56</code> would be mapped to the <code>2013-08-08 12:00:00</code> time bucket, and so would everything else in the twelve o&#39;clock hour. Then group on that time bucket and use a grouped persistentAggregate. The persistentAggregate uses a local cacheMap backed by a data store. Groups with many records require very few reads from the data store, and use efficient bulk reads and writes; as long as your data feed is relatively prompt, Trident will make very efficient use of memory and network. Even if a server drops offline for a day, then delivers that full day&#39;s worth of data in a rush, the old results will be calmly retrieved and updated -- and without interfering with calculating the current results.</p>
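-
-<p>The bucketing step on its own can be sketched in plain Java. This shows only the timestamp-to-bucket mapping, assuming timestamps arrive as <code>yyyy-MM-dd HH:mm:ss</code> strings; the class and method names are hypothetical, and wiring the logic into a Trident function is left out.</p>
-

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Sketch only: maps a timestamp string to its hourly bucket.
// The class name and the assumed timestamp format are illustrative.
final class HourBucket {

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Truncates the timestamp to the start of its hour and formats it back. */
    static String toHourBucket(String timestamp) {
        LocalDateTime parsed = LocalDateTime.parse(timestamp, FORMAT);
        return parsed.truncatedTo(ChronoUnit.HOURS).format(FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(toHourBucket("2013-08-08 12:34:56")); // prints 2013-08-08 12:00:00
    }
}
```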
-
-<h3 id="how-can-i-know-that-all-records-for-a-time-bucket-have-been-received">How can I know that all records for a time bucket have been received?</h3>
-
-<p>You cannot know that all events are collected -- this is an epistemological challenge, not a distributed systems challenge. You can:</p>
-
-<ul>
-<li>Set a time limit using domain knowledge</li>
-<li>Introduce a <em>punctuation</em>: a record known to come after all records in the given time bucket. Trident uses this scheme to know when a batch is complete. If, for instance, you receive records from a set of sensors, each in order for that sensor, then once all sensors have sent you a 3:02:xx or later timestamp, you know you can commit.</li>
-<li>When possible, make your process incremental: each value that comes in makes the answer more and more true. A Trident ReducerAggregator is an operator that takes a prior result and a set of new records and returns a new result. This lets the result be cached and serialized to a datastore; if a server drops offline for a day and then comes back with a full day&#39;s worth of data in a rush, the old results will be calmly retrieved and updated.</li>
-<li>Lambda architecture: Record all events into an archival store (S3, HBase, HDFS) on receipt. In the fast layer, once the time window is clear, process the bucket to get an actionable answer, and ignore everything older than the time window. Periodically run a global aggregation to calculate a &quot;correct&quot; answer.</li>
-</ul>
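-
-<p>The sensor punctuation idea from the list above can be sketched as follows. The sketch assumes a fixed, known set of sensors, each delivering its own records in timestamp order; the class and its methods are hypothetical and independent of Trident.</p>
-

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Sketch only: tracks the latest timestamp seen per sensor and treats a bucket
// as complete once every sensor has reported at or past the bucket's end.
// Assumes a fixed sensor set and per-sensor in-order delivery.
final class SensorPunctuation {

    private final Map<String, Long> latestBySensor = new HashMap<>();

    SensorPunctuation(Iterable<String> sensorIds) {
        for (String id : sensorIds) {
            latestBySensor.put(id, Long.MIN_VALUE); // nothing seen yet
        }
    }

    /** Records a timestamp for a sensor, keeping the largest one seen. */
    void observe(String sensorId, long timestampMillis) {
        latestBySensor.merge(sensorId, timestampMillis, Long::max);
    }

    /** True once all sensors have sent a timestamp at or after the bucket end. */
    boolean bucketComplete(long bucketEndMillis) {
        for (long latest : latestBySensor.values()) {
            if (latest < bucketEndMillis) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SensorPunctuation p = new SensorPunctuation(Arrays.asList("a", "b"));
        p.observe("a", 1200L);
        System.out.println(p.bucketComplete(1000L)); // prints false: "b" has not reported
        p.observe("b", 1001L);
        System.out.println(p.bucketComplete(1000L)); // prints true
    }
}
```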
-
-
-
-	          </div>
-	       </div>
-	  </div>
-<footer>
-    <div class="container-fluid">
-        <div class="row">
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>Meetups</h5>
-                    <ul class="latest-news">
-                        
-                        <li><a href="http://www.meetup.com/Apache-Storm-Apache-Kafka/">Apache Storm & Apache Kafka</a> <span class="small">(Sunnyvale, CA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Apache Storm & Kafka Users</a> <span class="small">(Seattle, WA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/New-York-City-Storm-User-Group/">NYC Storm User Group</a> <span class="small">(New York, NY)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/Bay-Area-Stream-Processing">Bay Area Stream Processing</a> <span class="small">(Emeryville, CA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/Boston-Storm-Users/">Boston Realtime Data</a> <span class="small">(Boston, MA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/storm-london">London Storm User Group</a> <span class="small">(London, UK)</span></li>
-                        
-                        <!-- <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Seatle, WA</a> <span class="small">(27 Jun 2015)</span></li> -->
-                    </ul>
-                </div>
-            </div>
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>About Storm</h5>
-                    <p>Storm integrates with any queueing system and any database system. Storm's spout abstraction makes it easy to integrate a new queuing system. Likewise, integrating Storm with database systems is easy.</p>
-               </div>
-            </div>
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>First Look</h5>
-                    <ul class="footer-list">
-                        <li><a href="/documentation/Rationale.html">Rationale</a></li>
-                        <li><a href="/tutorial.html">Tutorial</a></li>
-                        <li><a href="/documentation/Setting-up-development-environment.html">Setting up development environment</a></li>
-                        <li><a href="/documentation/Creating-a-new-Storm-project.html">Creating a new Storm project</a></li>
-                    </ul>
-                </div>
-            </div>
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>Documentation</h5>
-                    <ul class="footer-list">
-                        <li><a href="/doc-index.html">Index</a></li>
-                        <li><a href="/documentation.html">Manual</a></li>
-                        <li><a href="https://storm.apache.org/javadoc/apidocs/index.html">Javadoc</a></li>
-                        <li><a href="/documentation/FAQ.html">FAQ</a></li>
-                    </ul>
-                </div>
-            </div>
-        </div>
-        <hr/>
-        <div class="row">   
-            <div class="col-md-12">
-                <p align="center">Copyright © 2015 <a href="http://www.apache.org">Apache Software Foundation</a>. All Rights Reserved. 
-                    <br>Apache Storm, Apache, the Apache feather logo, and the Apache Storm project logos are trademarks of The Apache Software Foundation. 
-                    <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p>
-            </div>
-        </div>
-    </div>
-</footer>
-<!--Footer End-->
-<!-- Scroll to top -->
-<span class="totop"><a href="#"><i class="fa fa-angle-up"></i></a></span> 
-
-</body>
-
-</html>
-

http://git-wip-us.apache.org/repos/asf/storm/blob/335bbf94/_site/documentation/Fault-tolerance.html
----------------------------------------------------------------------
diff --git a/_site/documentation/Fault-tolerance.html b/_site/documentation/Fault-tolerance.html
deleted file mode 100644
index 9ce6412..0000000
--- a/_site/documentation/Fault-tolerance.html
+++ /dev/null
@@ -1,194 +0,0 @@
-<!DOCTYPE html>
-<html>
-    <head>
-    <meta charset="utf-8">
-    <meta http-equiv="X-UA-Compatible" content="IE=edge">
-    <meta name="viewport" content="width=device-width, initial-scale=1">
-
-    <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
-    <link rel="icon" href="/favicon.ico" type="image/x-icon">
-
-    <title>Fault Tolerance</title>
-
-    <!-- Bootstrap core CSS -->
-    <link href="/assets/css/bootstrap.min.css" rel="stylesheet">
-    <!-- Bootstrap theme -->
-    <link href="/assets/css/bootstrap-theme.min.css" rel="stylesheet">
-
-    <!-- Custom styles for this template -->
-    <link rel="stylesheet" href="http://fortawesome.github.io/Font-Awesome/assets/font-awesome/css/font-awesome.css">
-    <link href="/css/style.css" rel="stylesheet">
-    <link href="/assets/css/owl.theme.css" rel="stylesheet">
-    <link href="/assets/css/owl.carousel.css" rel="stylesheet">
-    <script type="text/javascript" src="/assets/js/jquery.min.js"></script>
-    <script type="text/javascript" src="/assets/js/bootstrap.min.js"></script>
-    <script type="text/javascript" src="/assets/js/owl.carousel.min.js"></script>
-    <script type="text/javascript" src="/assets/js/storm.js"></script>
-    <!-- Just for debugging purposes. Don't actually copy these 2 lines! -->
-    <!--[if lt IE 9]><script src="../../assets/js/ie8-responsive-file-warning.js"></script><![endif]-->
-    
-    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
-    <!--[if lt IE 9]>
-      <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
-      <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
-    <![endif]-->
-  </head>
-
-
-  <body>
-    <header>
-  <div class="container-fluid">
-      <div class="row">
-          <div class="col-md-10">
-              <a href="/index.html"><img src="/images/logo.png" class="logo" /></a>
-            </div>
-            <div class="col-md-2">
-              <a href="/downloads.html" class="btn-std btn-block btn-download">Download</a>
-            </div>
-        </div>
-    </div>
-</header>
-<!--Header End-->
-<!--Navigation Begin-->
-<div class="navbar" role="banner">
-  <div class="container-fluid">
-      <div class="navbar-header">
-          <button class="navbar-toggle" type="button" data-toggle="collapse" data-target=".bs-navbar-collapse">
-                <span class="icon-bar"></span>
-                <span class="icon-bar"></span>
-                <span class="icon-bar"></span>
-            </button>
-        </div>
-        <nav class="collapse navbar-collapse bs-navbar-collapse" role="navigation">
-          <ul class="nav navbar-nav">
-              <li><a href="/index.html" id="home">Home</a></li>
-                <li><a href="/getting-help.html" id="getting-help">Getting Help</a></li>
-                <li><a href="/about/integrates.html" id="project-info">Project Information</a></li>
-                <li><a href="/documentation.html" id="documentation">Documentation</a></li>
-                <li><a href="/talksAndVideos.html">Talks and Slideshows</a></li>
-                <li class="dropdown">
-                    <a href="#" class="dropdown-toggle" data-toggle="dropdown" id="contribute">Community <b class="caret"></b></a>
-                    <ul class="dropdown-menu">
-                        <li><a href="/contribute/Contributing-to-Storm.html">Contributing</a></li>
-                        <li><a href="/contribute/People.html">People</a></li>
-                        <li><a href="/contribute/BYLAWS.html">ByLaws</a></li>
-                    </ul>
-                </li>
-                <li><a href="/2015/11/05/storm096-released.html" id="news">News</a></li>
-            </ul>
-        </nav>
-    </div>
-</div>
-
-
-
-    <div class="container-fluid">
-    <h1 class="page-title">Fault Tolerance</h1>
-          <div class="row">
-           	<div class="col-md-12">
-	             <!-- Documentation -->
-
-<p class="post-meta"></p>
-
-<p>This page explains the design details of Storm that make it a fault-tolerant system.</p>
-
-<h2 id="what-happens-when-a-worker-dies">What happens when a worker dies?</h2>
-
-<p>When a worker dies, the supervisor will restart it. If it continuously fails on startup and is unable to heartbeat to Nimbus, Nimbus will reassign the worker to another machine.</p>
-
-<h2 id="what-happens-when-a-node-dies">What happens when a node dies?</h2>
-
-<p>The tasks assigned to that machine will time out and Nimbus will reassign those tasks to other machines.</p>
-
-<h2 id="what-happens-when-nimbus-or-supervisor-daemons-die">What happens when Nimbus or Supervisor daemons die?</h2>
-
-<p>The Nimbus and Supervisor daemons are designed to be fail-fast (process self-destructs whenever any unexpected situation is encountered) and stateless (all state is kept in Zookeeper or on disk). As described in <a href="Setting-up-a-Storm-cluster.html">Setting up a Storm cluster</a>, the Nimbus and Supervisor daemons must be run under supervision using a tool like daemontools or monit. So if the Nimbus or Supervisor daemons die, they restart like nothing happened.</p>
-
-<p>Most notably, no worker processes are affected by the death of Nimbus or the Supervisors. This is in contrast to Hadoop, where if the JobTracker dies, all the running jobs are lost. </p>
-
-<h2 id="is-nimbus-a-single-point-of-failure">Is Nimbus a single point of failure?</h2>
-
-<p>If you lose the Nimbus node, the workers will still continue to function. Additionally, supervisors will continue to restart workers if they die. However, without Nimbus, workers won&#39;t be reassigned to other machines when necessary (like if you lose a worker machine). </p>
-
-<p>So the answer is that Nimbus is &quot;sort of&quot; a SPOF. In practice, it&#39;s not a big deal since nothing catastrophic happens when the Nimbus daemon dies. There are plans to make Nimbus highly available in the future.</p>
-
-<h2 id="how-does-storm-guarantee-data-processing">How does Storm guarantee data processing?</h2>
-
-<p>Storm provides mechanisms to guarantee data processing even if nodes die or messages are lost. See <a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a> for the details.</p>
-
-
-
-	          </div>
-	       </div>
-	  </div>
-<footer>
-    <div class="container-fluid">
-        <div class="row">
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>Meetups</h5>
-                    <ul class="latest-news">
-                        
-                        <li><a href="http://www.meetup.com/Apache-Storm-Apache-Kafka/">Apache Storm & Apache Kafka</a> <span class="small">(Sunnyvale, CA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Apache Storm & Kafka Users</a> <span class="small">(Seattle, WA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/New-York-City-Storm-User-Group/">NYC Storm User Group</a> <span class="small">(New York, NY)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/Bay-Area-Stream-Processing">Bay Area Stream Processing</a> <span class="small">(Emeryville, CA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/Boston-Storm-Users/">Boston Realtime Data</a> <span class="small">(Boston, MA)</span></li>
-                        
-                        <li><a href="http://www.meetup.com/storm-london">London Storm User Group</a> <span class="small">(London, UK)</span></li>
-                        
-                        <!-- <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Seatle, WA</a> <span class="small">(27 Jun 2015)</span></li> -->
-                    </ul>
-                </div>
-            </div>
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>About Storm</h5>
-                    <p>Storm integrates with any queueing system and any database system. Storm's spout abstraction makes it easy to integrate a new queuing system. Likewise, integrating Storm with database systems is easy.</p>
-               </div>
-            </div>
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>First Look</h5>
-                    <ul class="footer-list">
-                        <li><a href="/documentation/Rationale.html">Rationale</a></li>
-                        <li><a href="/tutorial.html">Tutorial</a></li>
-                        <li><a href="/documentation/Setting-up-development-environment.html">Setting up development environment</a></li>
-                        <li><a href="/documentation/Creating-a-new-Storm-project.html">Creating a new Storm project</a></li>
-                    </ul>
-                </div>
-            </div>
-            <div class="col-md-3">
-                <div class="footer-widget">
-                    <h5>Documentation</h5>
-                    <ul class="footer-list">
-                        <li><a href="/doc-index.html">Index</a></li>
-                        <li><a href="/documentation.html">Manual</a></li>
-                        <li><a href="https://storm.apache.org/javadoc/apidocs/index.html">Javadoc</a></li>
-                        <li><a href="/documentation/FAQ.html">FAQ</a></li>
-                    </ul>
-                </div>
-            </div>
-        </div>
-        <hr/>
-        <div class="row">   
-            <div class="col-md-12">
-                <p align="center">Copyright © 2015 <a href="http://www.apache.org">Apache Software Foundation</a>. All Rights Reserved. 
-                    <br>Apache Storm, Apache, the Apache feather logo, and the Apache Storm project logos are trademarks of The Apache Software Foundation. 
-                    <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p>
-            </div>
-        </div>
-    </div>
-</footer>
-<!--Footer End-->
-<!-- Scroll to top -->
-<span class="totop"><a href="#"><i class="fa fa-angle-up"></i></a></span> 
-
-</body>
-
-</html>
-