Posted to commits@storm.apache.org by na...@apache.org on 2014/05/25 19:47:13 UTC

svn commit: r1597454 [2/9] - in /incubator/storm/site: ./ publish/ publish/about/ publish/documentation/

Modified: incubator/storm/site/publish/documentation/Concepts.html
URL: http://svn.apache.org/viewvc/incubator/storm/site/publish/documentation/Concepts.html?rev=1597454&r1=1597453&r2=1597454&view=diff
==============================================================================
--- incubator/storm/site/publish/documentation/Concepts.html (original)
+++ incubator/storm/site/publish/documentation/Concepts.html Sun May 25 17:47:12 2014
@@ -68,52 +68,49 @@
 <p>This page lists the main concepts of Storm and links to resources where you can find more information. The concepts discussed are:</p>
 
 <ol>
-<li>Topologies</li>
-<li>Streams</li>
-<li>Spouts</li>
-<li>Bolts</li>
-<li>Stream groupings</li>
-<li>Reliability</li>
-<li>Tasks</li>
-<li>Workers</li>
+  <li>Topologies</li>
+  <li>Streams</li>
+  <li>Spouts</li>
+  <li>Bolts</li>
+  <li>Stream groupings</li>
+  <li>Reliability</li>
+  <li>Tasks</li>
+  <li>Workers</li>
 </ol>
 
-
-<h3>Topologies</h3>
+<h3 id="topologies">Topologies</h3>
 
 <p>The logic for a realtime application is packaged into a Storm topology. A Storm topology is analogous to a MapReduce job. One key difference is that a MapReduce job eventually finishes, whereas a topology runs forever (or until you kill it, of course). A topology is a graph of spouts and bolts that are connected with stream groupings. These concepts are described below.</p>
 
 <p><strong>Resources:</strong></p>
 
 <ul>
-<li><a href="/apidocs/backtype/storm/topology/TopologyBuilder.html">TopologyBuilder</a>: use this class to construct topologies in Java</li>
-<li><a href="Running-topologies-on-a-production-cluster.html">Running topologies on a production cluster</a></li>
-<li><a href="Local-mode.html">Local mode</a>: Read this to learn how to develop and test topologies in local mode.</li>
+  <li><a href="/apidocs/backtype/storm/topology/TopologyBuilder.html">TopologyBuilder</a>: use this class to construct topologies in Java</li>
+  <li><a href="Running-topologies-on-a-production-cluster.html">Running topologies on a production cluster</a></li>
+  <li><a href="Local-mode.html">Local mode</a>: Read this to learn how to develop and test topologies in local mode.</li>
 </ul>
 
+<h3 id="streams">Streams</h3>
 
-<h3>Streams</h3>
-
-<p>The stream is the core abstraction in Storm. A stream is an unbounded sequence of tuples that is processed and created in parallel in a distributed fashion. Streams are defined with a schema that names the fields in the stream's tuples. By default, tuples can contain integers, longs, shorts, bytes, strings, doubles, floats, booleans, and byte arrays. You can also define your own serializers so that custom types can be used natively within tuples.</p>
+<p>The stream is the core abstraction in Storm. A stream is an unbounded sequence of tuples that is processed and created in parallel in a distributed fashion. Streams are defined with a schema that names the fields in the stream&#8217;s tuples. By default, tuples can contain integers, longs, shorts, bytes, strings, doubles, floats, booleans, and byte arrays. You can also define your own serializers so that custom types can be used natively within tuples.</p>
 
-<p>Every stream is given an id when declared. Since single-stream spouts and bolts are so common, <a href="/apidocs/backtype/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a> has convenience methods for declaring a single stream without specifying an id. In this case, the stream is given the default id of "default".</p>
+<p>Every stream is given an id when declared. Since single-stream spouts and bolts are so common, <a href="/apidocs/backtype/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a> has convenience methods for declaring a single stream without specifying an id. In this case, the stream is given the default id of &#8220;default&#8221;.</p>
 
 <p><strong>Resources:</strong></p>
 
 <ul>
-<li><a href="/apidocs/backtype/storm/tuple/Tuple.html">Tuple</a>: streams are composed of tuples</li>
-<li><a href="/apidocs/backtype/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a>: used to declare streams and their schemas</li>
-<li><a href="Serialization.html">Serialization</a>: Information about Storm's dynamic typing of tuples and declaring custom serializations</li>
-<li><a href="/apidocs/backtype/storm/serialization/ISerialization.html">ISerialization</a>: custom serializers must implement this interface</li>
-<li><a href="/apidocs/backtype/storm/Config.html#TOPOLOGY_SERIALIZATIONS">CONFIG.TOPOLOGY_SERIALIZATIONS</a>: custom serializers can be registered using this configuration</li>
+  <li><a href="/apidocs/backtype/storm/tuple/Tuple.html">Tuple</a>: streams are composed of tuples</li>
+  <li><a href="/apidocs/backtype/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a>: used to declare streams and their schemas</li>
+  <li><a href="Serialization.html">Serialization</a>: Information about Storm&#8217;s dynamic typing of tuples and declaring custom serializations</li>
+  <li><a href="/apidocs/backtype/storm/serialization/ISerialization.html">ISerialization</a>: custom serializers must implement this interface</li>
+  <li><a href="/apidocs/backtype/storm/Config.html#TOPOLOGY_SERIALIZATIONS">CONFIG.TOPOLOGY_SERIALIZATIONS</a>: custom serializers can be registered using this configuration</li>
 </ul>
 
-
-<h3>Spouts</h3>
+<h3 id="spouts">Spouts</h3>
 
 <p>A spout is a source of streams in a topology. Generally spouts will read tuples from an external source and emit them into the topology (e.g. a Kestrel queue or the Twitter API). Spouts can either be <strong>reliable</strong> or <strong>unreliable</strong>. A reliable spout is capable of replaying a tuple if it failed to be processed by Storm, whereas an unreliable spout forgets about the tuple as soon as it is emitted.</p>
 
-<p>Spouts can emit more than one stream. To do so, declare multiple streams using the <code>declareStream</code> method of <a href="/apidocs/backtype/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a> and specify the stream to emit to when using the <code>emit</code> method on <a href="/apidocs/backtype/storm/spout/SpoutOutputCollector.html">SpoutOutputCollector</a>.</p>
+<p>Spouts can emit more than one stream. To do so, declare multiple streams using the <code>declareStream</code> method of <a href="/apidocs/backtype/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a> and specify the stream to emit to when using the <code>emit</code> method on <a href="/apidocs/backtype/storm/spout/SpoutOutputCollector.html">SpoutOutputCollector</a>. </p>
 
 <p>The main method on spouts is <code>nextTuple</code>. <code>nextTuple</code> either emits a new tuple into the topology or simply returns if there are no new tuples to emit. It is imperative that <code>nextTuple</code> does not block for any spout implementation, because Storm calls all the spout methods on the same thread.</p>
 
@@ -122,20 +119,19 @@
 <p><strong>Resources:</strong></p>
 
 <ul>
-<li><a href="/apidocs/backtype/storm/topology/IRichSpout.html">IRichSpout</a>: this is the interface that spouts must implement.</li>
-<li><a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a></li>
+  <li><a href="/apidocs/backtype/storm/topology/IRichSpout.html">IRichSpout</a>: this is the interface that spouts must implement. </li>
+  <li><a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a></li>
 </ul>
 
+<h3 id="bolts">Bolts</h3>
 
-<h3>Bolts</h3>
-
-<p>All processing in topologies is done in bolts. Bolts can do anything from filtering, functions, aggregations, joins, talking to databases, and more.</p>
+<p>All processing in topologies is done in bolts. Bolts can do anything from filtering, functions, aggregations, joins, talking to databases, and more. </p>
 
-<p>Bolts can do simple stream transformations. Doing complex stream transformations often requires multiple steps and thus multiple bolts. For example, transforming a stream of tweets into a stream of trending images requires at least two steps: a bolt to do a rolling count of retweets for each image, and one or more bolts to stream out the top X images (you can do this particular stream transformation in a more scalable way with three bolts than with two).</p>
+<p>Bolts can do simple stream transformations. Doing complex stream transformations often requires multiple steps and thus multiple bolts. For example, transforming a stream of tweets into a stream of trending images requires at least two steps: a bolt to do a rolling count of retweets for each image, and one or more bolts to stream out the top X images (you can do this particular stream transformation in a more scalable way with three bolts than with two). </p>
 
 <p>Bolts can emit more than one stream. To do so, declare multiple streams using the <code>declareStream</code> method of <a href="/apidocs/backtype/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a> and specify the stream to emit to when using the <code>emit</code> method on <a href="/apidocs/backtype/storm/task/OutputCollector.html">OutputCollector</a>.</p>
 
-<p>When you declare a bolt's input streams, you always subscribe to specific streams of another component. If you want to subscribe to all the streams of another component, you have to subscribe to each one individually. <a href="/apidocs/backtype/storm/topology/InputDeclarer.html">InputDeclarer</a> has syntactic sugar for subscribing to streams declared on the default stream id. Saying <code>declarer.shuffleGrouping("1")</code> subscribes to the default stream on component "1" and is equivalent to <code>declarer.shuffleGrouping("1", DEFAULT_STREAM_ID)</code>.</p>
+<p>When you declare a bolt&#8217;s input streams, you always subscribe to specific streams of another component. If you want to subscribe to all the streams of another component, you have to subscribe to each one individually. <a href="/apidocs/backtype/storm/topology/InputDeclarer.html">InputDeclarer</a> has syntactic sugar for subscribing to streams declared on the default stream id. Saying <code>declarer.shuffleGrouping("1")</code> subscribes to the default stream on component &#8220;1&#8221; and is equivalent to <code>declarer.shuffleGrouping("1", DEFAULT_STREAM_ID)</code>. </p>
 
 <p>The main method in bolts is the <code>execute</code> method, which takes a new tuple as input. Bolts emit new tuples using the <a href="/apidocs/backtype/storm/task/OutputCollector.html">OutputCollector</a> object. Bolts must call the <code>ack</code> method on the <code>OutputCollector</code> for every tuple they process so that Storm knows when tuples are completed (and can eventually determine that it's safe to ack the original spout tuples). For the common case of processing an input tuple, emitting 0 or more tuples based on that tuple, and then acking the input tuple, Storm provides an <a href="/apidocs/backtype/storm/topology/IBasicBolt.html">IBasicBolt</a> interface which does the acking automatically.</p>
 
@@ -144,62 +140,58 @@
 <p><strong>Resources:</strong></p>
 
 <ul>
-<li><a href="/apidocs/backtype/storm/topology/IRichBolt.html">IRichBolt</a>: this is general interface for bolts.</li>
-<li><a href="/apidocs/backtype/storm/topology/IBasicBolt.html">IBasicBolt</a>: this is a convenience interface for defining bolts that do filtering or simple functions.</li>
-<li><a href="/apidocs/backtype/storm/task/OutputCollector.html">OutputCollector</a>: bolts emit tuples to their output streams using an instance of this class</li>
-<li><a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a></li>
+  <li><a href="/apidocs/backtype/storm/topology/IRichBolt.html">IRichBolt</a>: this is the general interface for bolts.</li>
+  <li><a href="/apidocs/backtype/storm/topology/IBasicBolt.html">IBasicBolt</a>: this is a convenience interface for defining bolts that do filtering or simple functions.</li>
+  <li><a href="/apidocs/backtype/storm/task/OutputCollector.html">OutputCollector</a>: bolts emit tuples to their output streams using an instance of this class</li>
+  <li><a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a></li>
 </ul>
 
+<h3 id="stream-groupings">Stream groupings</h3>
 
-<h3>Stream groupings</h3>
-
-<p>Part of defining a topology is specifying for each bolt which streams it should receive as input. A stream grouping defines how that stream should be partitioned among the bolt's tasks.</p>
+<p>Part of defining a topology is specifying for each bolt which streams it should receive as input. A stream grouping defines how that stream should be partitioned among the bolt&#8217;s tasks.</p>
 
 <p>There are seven built-in stream groupings in Storm, and you can implement a custom stream grouping by implementing the <a href="/apidocs/backtype/storm/grouping/CustomStreamGrouping.html">CustomStreamGrouping</a> interface:</p>
 
 <ol>
-<li><strong>Shuffle grouping</strong>: Tuples are randomly distributed across the bolt's tasks in a way such that each bolt is guaranteed to get an equal number of tuples.</li>
-<li><strong>Fields grouping</strong>: The stream is partitioned by the fields specified in the grouping. For example, if the stream is grouped by the "user-id" field, tuples with the same "user-id" will always go to the same task, but tuples with different "user-id"'s may go to different tasks.</li>
-<li><strong>All grouping</strong>: The stream is replicated across all the bolt's tasks. Use this grouping with care.</li>
-<li><strong>Global grouping</strong>: The entire stream goes to a single one of the bolt's tasks. Specifically, it goes to the task with the lowest id.</li>
-<li><strong>None grouping</strong>: This grouping specifies that you don't care how the stream is grouped. Currently, none groupings are equivalent to shuffle groupings. Eventually though, Storm will push down bolts with none groupings to execute in the same thread as the bolt or spout they subscribe from (when possible).</li>
-<li><strong>Direct grouping</strong>: This is a special kind of grouping. A stream grouped this way means that the <strong>producer</strong> of the tuple decides which task of the consumer will receive this tuple. Direct groupings can only be declared on streams that have been declared as direct streams. Tuples emitted to a direct stream must be emitted using one of the <a href="/apidocs/backtype/storm/task/OutputCollector.html#emitDirect(int,%20int,%20java.util.List">emitDirect</a> methods. A bolt can get the task ids of its consumers by either using the provided <a href="/apidocs/backtype/storm/task/TopologyContext.html">TopologyContext</a> or by keeping track of the output of the <code>emit</code> method in <a href="/apidocs/backtype/storm/task/OutputCollector.html">OutputCollector</a> (which returns the task ids that the tuple was sent to).</li>
-<li><strong>Local or shuffle grouping</strong>: If the target bolt has one or more tasks in the same worker process, tuples will be shuffled to just those in-process tasks. Otherwise, this acts like a normal shuffle grouping.</li>
+  <li><strong>Shuffle grouping</strong>: Tuples are randomly distributed across the bolt&#8217;s tasks in a way such that each bolt is guaranteed to get an equal number of tuples.</li>
+  <li><strong>Fields grouping</strong>: The stream is partitioned by the fields specified in the grouping. For example, if the stream is grouped by the &#8220;user-id&#8221; field, tuples with the same &#8220;user-id&#8221; will always go to the same task, but tuples with different &#8220;user-id&#8221;&#8217;s may go to different tasks.</li>
+  <li><strong>All grouping</strong>: The stream is replicated across all the bolt&#8217;s tasks. Use this grouping with care.</li>
+  <li><strong>Global grouping</strong>: The entire stream goes to a single one of the bolt&#8217;s tasks. Specifically, it goes to the task with the lowest id.</li>
+  <li><strong>None grouping</strong>: This grouping specifies that you don&#8217;t care how the stream is grouped. Currently, none groupings are equivalent to shuffle groupings. Eventually though, Storm will push down bolts with none groupings to execute in the same thread as the bolt or spout they subscribe from (when possible).</li>
+  <li><strong>Direct grouping</strong>: This is a special kind of grouping. A stream grouped this way means that the <strong>producer</strong> of the tuple decides which task of the consumer will receive this tuple. Direct groupings can only be declared on streams that have been declared as direct streams. Tuples emitted to a direct stream must be emitted using one of the <a href="/apidocs/backtype/storm/task/OutputCollector.html#emitDirect(int,%20int,%20java.util.List">emitDirect</a> methods. A bolt can get the task ids of its consumers by either using the provided <a href="/apidocs/backtype/storm/task/TopologyContext.html">TopologyContext</a> or by keeping track of the output of the <code>emit</code> method in <a href="/apidocs/backtype/storm/task/OutputCollector.html">OutputCollector</a> (which returns the task ids that the tuple was sent to).</li>
+  <li><strong>Local or shuffle grouping</strong>: If the target bolt has one or more tasks in the same worker process, tuples will be shuffled to just those in-process tasks. Otherwise, this acts like a normal shuffle grouping.</li>
 </ol>
 
-
 <p><strong>Resources:</strong></p>
 
 <ul>
-<li><a href="/apidocs/backtype/storm/topology/TopologyBuilder.html">TopologyBuilder</a>: use this class to define topologies</li>
-<li><a href="/apidocs/backtype/storm/topology/InputDeclarer.html">InputDeclarer</a>: this object is returned whenever <code>setBolt</code> is called on <code>TopologyBuilder</code> and is used for declaring a bolt's input streams and how those streams should be grouped</li>
-<li><a href="/apidocs/backtype/storm/task/CoordinatedBolt.html">CoordinatedBolt</a>: this bolt is useful for distributed RPC topologies and makes heavy use of direct streams and direct groupings</li>
+  <li><a href="/apidocs/backtype/storm/topology/TopologyBuilder.html">TopologyBuilder</a>: use this class to define topologies</li>
+  <li><a href="/apidocs/backtype/storm/topology/InputDeclarer.html">InputDeclarer</a>: this object is returned whenever <code>setBolt</code> is called on <code>TopologyBuilder</code> and is used for declaring a bolt&#8217;s input streams and how those streams should be grouped</li>
+  <li><a href="/apidocs/backtype/storm/task/CoordinatedBolt.html">CoordinatedBolt</a>: this bolt is useful for distributed RPC topologies and makes heavy use of direct streams and direct groupings</li>
 </ul>
 
+<h3 id="reliability">Reliability</h3>
 
-<h3>Reliability</h3>
+<p>Storm guarantees that every spout tuple will be fully processed by the topology. It does this by tracking the tree of tuples triggered by every spout tuple and determining when that tree of tuples has been successfully completed. Every topology has a &#8220;message timeout&#8221; associated with it. If Storm fails to detect that a spout tuple has been completed within that timeout, then it fails the tuple and replays it later. </p>
 
-<p>Storm guarantees that every spout tuple will be fully processed by the topology. It does this by tracking the tree of tuples triggered by every spout tuple and determining when that tree of tuples has been successfully completed. Every topology has a "message timeout" associated with it. If Storm fails to detect that a spout tuple has been completed within that timeout, then it fails the tuple and replays it later.</p>
+<p>To take advantage of Storm&#8217;s reliability capabilities, you must tell Storm when new edges in a tuple tree are being created and tell Storm whenever you&#8217;ve finished processing an individual tuple. These are done using the <a href="/apidocs/backtype/storm/task/OutputCollector.html">OutputCollector</a> object that bolts use to emit tuples. Anchoring is done in the <code>emit</code> method, and you declare that you&#8217;re finished with a tuple using the <code>ack</code> method.</p>
 
-<p>To take advantage of Storm's reliability capabilities, you must tell Storm when new edges in a tuple tree are being created and tell Storm whenever you've finished processing an individual tuple. These are done using the <a href="/apidocs/backtype/storm/task/OutputCollector.html">OutputCollector</a> object that bolts use to emit tuples. Anchoring is done in the <code>emit</code> method, and you declare that you're finished with a tuple using the <code>ack</code> method.</p>
+<p>This is all explained in much more detail in <a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a>. </p>
 
-<p>This is all explained in much more detail in <a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a>.</p>
+<h3 id="tasks">Tasks</h3>
 
-<h3>Tasks</h3>
+<p>Each spout or bolt executes as many tasks across the cluster. Each task corresponds to one thread of execution, and stream groupings define how to send tuples from one set of tasks to another set of tasks. You set the parallelism for each spout or bolt in the <code>setSpout</code> and <code>setBolt</code> methods of <a href="/apidocs/backtype/storm/topology/TopologyBuilder.html">TopologyBuilder</a>. </p>
 
-<p>Each spout or bolt executes as many tasks across the cluster. Each task corresponds to one thread of execution, and stream groupings define how to send tuples from one set of tasks to another set of tasks. You set the parallelism for each spout or bolt in the <code>setSpout</code> and <code>setBolt</code> methods of <a href="/apidocs/backtype/storm/topology/TopologyBuilder.html">TopologyBuilder</a>.</p>
-
-<h3>Workers</h3>
+<h3 id="workers">Workers</h3>
 
 <p>Topologies execute across one or more worker processes. Each worker process is a physical JVM and executes a subset of all the tasks for the topology. For example, if the combined parallelism of the topology is 300 and 50 workers are allocated, then each worker will execute 6 tasks (as threads within the worker). Storm tries to spread the tasks evenly across all the workers.</p>
 
 <p><strong>Resources:</strong></p>
 
 <ul>
-<li><a href="/apidocs/backtype/storm/Config.html#TOPOLOGY_WORKERS">Config.TOPOLOGY_WORKERS</a>: this config sets the number of workers to allocate for executing the topology</li>
+  <li><a href="/apidocs/backtype/storm/Config.html#TOPOLOGY_WORKERS">Config.TOPOLOGY_WORKERS</a>: this config sets the number of workers to allocate for executing the topology</li>
 </ul>
 
-
 </div>
 </div>
 <div id="clear"></div></div>
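Note on the fields grouping described in the Concepts.html hunk above: the guarantee that tuples with the same "user-id" always reach the same task can be pictured as hashing the grouping values and taking the result modulo the number of tasks. The Java sketch below illustrates only that idea. The class FieldsGroupingSketch, the chooseTask helper, the task count, and the hash-then-modulo rule are assumptions made for illustration, not Storm's actual routing code.

```java
import java.util.Arrays;
import java.util.List;

public class FieldsGroupingSketch {
    // Picks a task index in [0, numTasks) from the values of the grouping fields.
    // Assumption: plain hash-then-modulo; Storm's real implementation may differ.
    static int chooseTask(List<Object> groupingValues, int numTasks) {
        return Math.floorMod(groupingValues.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 4; // hypothetical parallelism of the consuming bolt

        int first = chooseTask(Arrays.<Object>asList("user-42"), numTasks);
        int again = chooseTask(Arrays.<Object>asList("user-42"), numTasks);
        int other = chooseTask(Arrays.<Object>asList("user-7"), numTasks);

        // The same "user-id" value always maps to the same task index.
        System.out.println("user-42 -> task " + first);
        System.out.println("user-42 -> task " + again);
        // A different "user-id" may or may not land on a different task.
        System.out.println("user-7  -> task " + other);
    }
}
```

Because the choice depends only on the grouping values and the task count, repeated calls with the same "user-id" agree, while different ids may or may not share a task.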

Modified: incubator/storm/site/publish/documentation/Configuration.html
URL: http://svn.apache.org/viewvc/incubator/storm/site/publish/documentation/Configuration.html?rev=1597454&r1=1597453&r2=1597454&view=diff
==============================================================================
--- incubator/storm/site/publish/documentation/Configuration.html (original)
+++ incubator/storm/site/publish/documentation/Configuration.html Sun May 25 17:47:12 2014
@@ -65,41 +65,38 @@
   </ul>
 </div>
 <div id="aboutcontent">
-<p>Storm has a variety of configurations for tweaking the behavior of nimbus, supervisors, and running topologies. Some configurations are system configurations and cannot be modified on a topology by topology basis, whereas other configurations can be modified per topology.</p>
+<p>Storm has a variety of configurations for tweaking the behavior of nimbus, supervisors, and running topologies. Some configurations are system configurations and cannot be modified on a topology by topology basis, whereas other configurations can be modified per topology. </p>
 
-<p>Every configuration has a default value defined in <a href="https://github.com/apache/incubator-storm/blob/master/conf/defaults.yaml">defaults.yaml</a> in the Storm codebase. You can override these configurations by defining a storm.yaml in the classpath of Nimbus and the supervisors. Finally, you can define a topology-specific configuration that you submit along with your topology when using <a href="/apidocs/backtype/storm/StormSubmitter.html">StormSubmitter</a>. However, the topology-specific configuration can only override configs prefixed with "TOPOLOGY".</p>
+<p>Every configuration has a default value defined in <a href="https://github.com/apache/incubator-storm/blob/master/conf/defaults.yaml">defaults.yaml</a> in the Storm codebase. You can override these configurations by defining a storm.yaml in the classpath of Nimbus and the supervisors. Finally, you can define a topology-specific configuration that you submit along with your topology when using <a href="/apidocs/backtype/storm/StormSubmitter.html">StormSubmitter</a>. However, the topology-specific configuration can only override configs prefixed with &#8220;TOPOLOGY&#8221;.</p>
 
 <p>Storm 0.7.0 and onwards lets you override configuration on a per-bolt/per-spout basis. The only configurations that can be overridden this way are:</p>
 
 <ol>
-<li>"topology.debug"</li>
-<li>"topology.max.spout.pending"</li>
-<li>"topology.max.task.parallelism"</li>
-<li>"topology.kryo.register": This works a little bit differently than the other ones, since the serializations will be available to all components in the topology. More details on <a href="Serialization.html">Serialization</a>.</li>
+  <li>&#8220;topology.debug&#8221;</li>
+  <li>&#8220;topology.max.spout.pending&#8221;</li>
+  <li>&#8220;topology.max.task.parallelism&#8221;</li>
+  <li>&#8220;topology.kryo.register&#8221;: This works a little bit differently than the other ones, since the serializations will be available to all components in the topology. More details on <a href="Serialization.html">Serialization</a>. </li>
 </ol>
 
-
 <p>The Java API lets you specify component specific configurations in two ways:</p>
 
 <ol>
-<li><em>Internally:</em> Override <code>getComponentConfiguration</code> in any spout or bolt and return the component-specific configuration map.</li>
-<li><em>Externally:</em> <code>setSpout</code> and <code>setBolt</code> in <code>TopologyBuilder</code> return an object with methods <code>addConfiguration</code> and <code>addConfigurations</code> that can be used to override the configurations for the component.</li>
+  <li><em>Internally:</em> Override <code>getComponentConfiguration</code> in any spout or bolt and return the component-specific configuration map.</li>
+  <li><em>Externally:</em> <code>setSpout</code> and <code>setBolt</code> in <code>TopologyBuilder</code> return an object with methods <code>addConfiguration</code> and <code>addConfigurations</code> that can be used to override the configurations for the component.</li>
 </ol>
 
-
-<p>The preference order for configuration values is defaults.yaml &lt; storm.yaml &lt; topology specific configuration &lt; internal component specific configuration &lt; external component specific configuration.</p>
+<p>The preference order for configuration values is defaults.yaml &lt; storm.yaml &lt; topology specific configuration &lt; internal component specific configuration &lt; external component specific configuration. </p>
 
 <p><strong>Resources:</strong></p>
 
 <ul>
-<li><a href="/apidocs/backtype/storm/Config.html">Config</a>: a listing of all configurations as well as a helper class for creating topology specific configurations</li>
-<li><a href="https://github.com/apache/incubator-storm/blob/master/conf/defaults.yaml">defaults.yaml</a>: the default values for all configurations</li>
-<li><a href="Setting-up-a-Storm-cluster.html">Setting up a Storm cluster</a>: explains how to create and configure a Storm cluster</li>
-<li><a href="Running-topologies-on-a-production-cluster.html">Running topologies on a production cluster</a>: lists useful configurations when running topologies on a cluster</li>
-<li><a href="Local-mode.html">Local mode</a>: lists useful configurations when using local mode</li>
+  <li><a href="/apidocs/backtype/storm/Config.html">Config</a>: a listing of all configurations as well as a helper class for creating topology specific configurations</li>
+  <li><a href="https://github.com/apache/incubator-storm/blob/master/conf/defaults.yaml">defaults.yaml</a>: the default values for all configurations</li>
+  <li><a href="Setting-up-a-Storm-cluster.html">Setting up a Storm cluster</a>: explains how to create and configure a Storm cluster</li>
+  <li><a href="Running-topologies-on-a-production-cluster.html">Running topologies on a production cluster</a>: lists useful configurations when running topologies on a cluster</li>
+  <li><a href="Local-mode.html">Local mode</a>: lists useful configurations when using local mode</li>
 </ul>
 
-
 </div>
 </div>
 <div id="clear"></div></div>
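Note on the preference order stated in the Configuration.html hunk above (defaults.yaml < storm.yaml < topology specific < internal component specific < external component specific): one way to picture it is as a stack of maps merged from lowest to highest precedence. The Java sketch below is a simplified model only. The class ConfigPrecedenceSketch, the resolve helper, and all values are hypothetical, and it ignores the restrictions on which keys each layer is allowed to override.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConfigPrecedenceSketch {
    // Merges layers given from lowest to highest precedence; later layers win.
    static Map<String, Object> resolve(List<Map<String, Object>> layersLowToHigh) {
        Map<String, Object> merged = new HashMap<>();
        for (Map<String, Object> layer : layersLowToHigh) {
            merged.putAll(layer);
        }
        return merged;
    }

    public static void main(String[] args) {
        // All values below are made up for illustration.
        Map<String, Object> defaults = new HashMap<>();
        defaults.put("topology.debug", false);
        defaults.put("topology.max.spout.pending", 1000);

        Map<String, Object> stormYaml = new HashMap<>(); // no overrides here

        Map<String, Object> topology = new HashMap<>();
        topology.put("topology.debug", true);

        Map<String, Object> internalComponent = new HashMap<>();
        internalComponent.put("topology.max.spout.pending", 500);

        Map<String, Object> externalComponent = new HashMap<>();
        externalComponent.put("topology.max.spout.pending", 200);

        Map<String, Object> effective = resolve(Arrays.asList(
                defaults, stormYaml, topology, internalComponent, externalComponent));

        System.out.println(effective.get("topology.debug"));             // true
        System.out.println(effective.get("topology.max.spout.pending")); // 200
    }
}
```

The external component setting wins for topology.max.spout.pending because it is merged last, which mirrors the right-most position in the stated ordering.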

Modified: incubator/storm/site/publish/documentation/Contributing-to-Storm.html
URL: http://svn.apache.org/viewvc/incubator/storm/site/publish/documentation/Contributing-to-Storm.html?rev=1597454&r1=1597453&r2=1597454&view=diff
==============================================================================
--- incubator/storm/site/publish/documentation/Contributing-to-Storm.html (original)
+++ incubator/storm/site/publish/documentation/Contributing-to-Storm.html Sun May 25 17:47:12 2014
@@ -65,36 +65,36 @@
   </ul>
 </div>
 <div id="aboutcontent">
-<h3>Getting started with contributing</h3>
+<h3 id="getting-started-with-contributing">Getting started with contributing</h3>
 
-<p>Some of the issues on the <a href="https://issues.apache.org/jira/browse/STORM">issue tracker</a> are marked with the "Newbie" label. If you're interesting in contributing to Storm but don't know where to begin, these are good issues to start with. These issues are a great way to get your feet wet with learning the codebase because they require learning about only an isolated portion of the codebase and are a relatively small amount of work.</p>
+<p>Some of the issues on the <a href="https://issues.apache.org/jira/browse/STORM">issue tracker</a> are marked with the &#8220;Newbie&#8221; label. If you&#8217;re interested in contributing to Storm but don&#8217;t know where to begin, these are good issues to start with. These issues are a great way to get your feet wet with learning the codebase because they require learning about only an isolated portion of the codebase and are a relatively small amount of work.</p>
 
-<h3>Learning the codebase</h3>
+<h3 id="learning-the-codebase">Learning the codebase</h3>
 
 <p>The <a href="Implementation-docs.html">Implementation docs</a> section of the wiki gives detailed walkthroughs of the codebase. Reading through these docs is highly recommended to understand the codebase.</p>
 
-<h3>Contribution process</h3>
+<h3 id="contribution-process">Contribution process</h3>
 
-<p>Contributions to the Storm codebase should be sent as GitHub pull requests. If there's any problems to the pull request we can iterate on it using GitHub's commenting features.</p>
+<p>Contributions to the Storm codebase should be sent as GitHub pull requests. If there are any problems with the pull request, we can iterate on it using GitHub&#8217;s commenting features.</p>
 
 <p>For small patches, feel free to submit pull requests directly for them. For larger contributions, please use the following process. The idea behind this process is to prevent any wasted work and catch design issues early on:</p>
 
 <ol>
-<li>Open an issue on the <a href="https://issues.apache.org/jira/browse/STORM">issue tracker</a> if one doesn't exist already</li>
-<li>Comment on the issue with your plan for implementing the issue. Explain what pieces of the codebase you're going to touch and how everything is going to fit together.</li>
-<li>Storm committers will iterate with you on the design to make sure you're on the right track</li>
-<li>Implement your issue, submit a pull request, and iterate from there.</li>
+  <li>Open an issue on the <a href="https://issues.apache.org/jira/browse/STORM">issue tracker</a> if one doesn&#8217;t exist already</li>
+  <li>Comment on the issue with your plan for implementing the issue. Explain what pieces of the codebase you&#8217;re going to touch and how everything is going to fit together.</li>
+  <li>Storm committers will iterate with you on the design to make sure you&#8217;re on the right track</li>
+  <li>Implement your issue, submit a pull request, and iterate from there.</li>
 </ol>
 
+<h3 id="modules-built-on-top-of-storm">Modules built on top of Storm</h3>
 
-<h3>Modules built on top of Storm</h3>
+<p>Modules built on top of Storm (like spouts, bolts, etc.) that aren&#8217;t appropriate for Storm core can be developed as your own project or as part of <a href="https://github.com/stormprocessor">@stormprocessor</a>. To be part of @stormprocessor, put your project on your own GitHub account and then send an email to the mailing list proposing to make it part of @stormprocessor. The community can then discuss whether it&#8217;s useful enough to be included. If so, you&#8217;ll be added to the @stormprocessor organization and can maintain your project there. The advantage of hosting your module in @stormprocessor is that it will be easier for potential users to find your project.</p>
 
-<p>Modules built on top of Storm (like spouts, bolts, etc) that aren't appropriate for Storm core can be done as your own project or as part of <a href="https://github.com/stormprocessor">@stormprocessor</a>. To be part of @stormprocessor put your project on your own Github and then send an email to the mailing list proposing to make it part of @stormprocessor. Then the community can discuss whether it's useful enough to be part of @stormprocessor. Then you'll be added to the @stormprocessor organization and can maintain your project there. The advantage of hosting your module in @stormprocessor is that it will be easier for potential users to find your project.</p>
-
-<h3>Contributing documentation</h3>
+<h3 id="contributing-documentation">Contributing documentation</h3>
 
 <p>Documentation contributions are very welcome! The best way to send contributions is as emails through the mailing list.</p>
 
+
 </div>
 </div>
 <div id="clear"></div></div>

Modified: incubator/storm/site/publish/documentation/Creating-a-new-Storm-project.html
URL: http://svn.apache.org/viewvc/incubator/storm/site/publish/documentation/Creating-a-new-Storm-project.html?rev=1597454&r1=1597453&r2=1597454&view=diff
==============================================================================
--- incubator/storm/site/publish/documentation/Creating-a-new-Storm-project.html (original)
+++ incubator/storm/site/publish/documentation/Creating-a-new-Storm-project.html Sun May 25 17:47:12 2014
@@ -68,22 +68,21 @@
 <p>This page outlines how to set up a Storm project for development. The steps are:</p>
 
 <ol>
-<li>Add Storm jars to classpath</li>
-<li>If using multilang, add multilang dir to classpath</li>
+  <li>Add Storm jars to classpath</li>
+  <li>If using multilang, add multilang dir to classpath</li>
 </ol>
 
-
 <p>Follow along to see how to set up the <a href="http://github.com/nathanmarz/storm-starter">storm-starter</a> project in Eclipse.</p>
 
-<h3>Add Storm jars to classpath</h3>
+<h3 id="add-storm-jars-to-classpath">Add Storm jars to classpath</h3>
 
-<p>You'll need the Storm jars on your classpath to develop Storm topologies. Using <a href="Maven.html">Maven</a> is highly recommended. <a href="https://github.com/nathanmarz/storm-starter/blob/master/m2-pom.xml">Here's an example</a> of how to setup your pom.xml for a Storm project. If you don't want to use Maven, you can include the jars from the Storm release on your classpath.</p>
+<p>You&#8217;ll need the Storm jars on your classpath to develop Storm topologies. Using <a href="Maven.html">Maven</a> is highly recommended. <a href="https://github.com/nathanmarz/storm-starter/blob/master/m2-pom.xml">Here&#8217;s an example</a> of how to set up your pom.xml for a Storm project. If you don&#8217;t want to use Maven, you can include the jars from the Storm release on your classpath.</p>
 
 <p><a href="http://github.com/nathanmarz/storm-starter">storm-starter</a> uses <a href="http://github.com/technomancy/leiningen">Leiningen</a> for build and dependency resolution. You can install leiningen by downloading <a href="https://raw.github.com/technomancy/leiningen/stable/bin/lein">this script</a>, placing it on your path, and making it executable. To retrieve the dependencies for Storm, simply run <code>lein deps</code> in the project root.</p>
 
 <p>To set up the classpath in Eclipse, create a new Java project, include <code>src/jvm/</code> as a source path, and make sure all the jars in <code>lib/</code> and <code>lib/dev/</code> are in the <code>Referenced Libraries</code> section of the project.</p>
 
-<h3>If using multilang, add multilang dir to classpath</h3>
+<h3 id="if-using-multilang-add-multilang-dir-to-classpath">If using multilang, add multilang dir to classpath</h3>
 
 <p>If you implement spouts or bolts in languages other than Java, then those implementations should be under the <code>multilang/resources/</code> directory of the project. For Storm to find these files in local mode, the <code>resources/</code> dir needs to be on the classpath. You can do this in Eclipse by adding <code>multilang/</code> as a source folder. You may also need to add multilang/resources as a source directory.</p>
 

Modified: incubator/storm/site/publish/documentation/DSLs-and-multilang-adapters.html
URL: http://svn.apache.org/viewvc/incubator/storm/site/publish/documentation/DSLs-and-multilang-adapters.html?rev=1597454&r1=1597453&r2=1597454&view=diff
==============================================================================
--- incubator/storm/site/publish/documentation/DSLs-and-multilang-adapters.html (original)
+++ incubator/storm/site/publish/documentation/DSLs-and-multilang-adapters.html Sun May 25 17:47:12 2014
@@ -66,15 +66,14 @@
 </div>
 <div id="aboutcontent">
 <ul>
-<li><a href="https://github.com/velvia/ScalaStorm">Scala DSL</a></li>
-<li><a href="https://github.com/colinsurprenant/redstorm">JRuby DSL</a></li>
-<li><a href="Clojure-DSL.html">Clojure DSL</a></li>
-<li><a href="https://github.com/tomdz/storm-esper">Storm/Esper integration</a>: Streaming SQL on top of Storm</li>
-<li><a href="https://github.com/gphat/io-storm">io-storm</a>: Perl multilang adapter</li>
-<li><a href="https://github.com/lazyshot/storm-php">storm-php</a>: PHP multilang adapter</li>
+  <li><a href="https://github.com/velvia/ScalaStorm">Scala DSL</a></li>
+  <li><a href="https://github.com/colinsurprenant/redstorm">JRuby DSL</a></li>
+  <li><a href="Clojure-DSL.html">Clojure DSL</a></li>
+  <li><a href="https://github.com/tomdz/storm-esper">Storm/Esper integration</a>: Streaming SQL on top of Storm</li>
+  <li><a href="https://github.com/gphat/io-storm">io-storm</a>: Perl multilang adapter</li>
+  <li><a href="https://github.com/lazyshot/storm-php">storm-php</a>: PHP multilang adapter</li>
 </ul>
 
-
 </div>
 </div>
 <div id="clear"></div></div>

Modified: incubator/storm/site/publish/documentation/Defining-a-non-jvm-language-dsl-for-storm.html
URL: http://svn.apache.org/viewvc/incubator/storm/site/publish/documentation/Defining-a-non-jvm-language-dsl-for-storm.html?rev=1597454&r1=1597453&r2=1597454&view=diff
==============================================================================
--- incubator/storm/site/publish/documentation/Defining-a-non-jvm-language-dsl-for-storm.html (original)
+++ incubator/storm/site/publish/documentation/Defining-a-non-jvm-language-dsl-for-storm.html Sun May 25 17:47:12 2014
@@ -69,29 +69,33 @@
 
 <p>When you create the Thrift structs for spouts and bolts, the code for the spout or bolt is specified in the ComponentObject struct:</p>
 
-<pre><code>union ComponentObject {
+<p><code>
+union ComponentObject {
   1: binary serialized_java;
   2: ShellComponent shell;
   3: JavaObject java_object;
 }
-</code></pre>
+</code></p>
 
-<p>For a Python DSL, you would want to make use of "2" and "3". ShellComponent lets you specify a script to run that component (e.g., your python code). And JavaObject lets you specify native java spouts and bolts for the component (and Storm will use reflection to create that spout or bolt).</p>
+<p>For a Python DSL, you would want to make use of fields &#8220;2&#8221; (<code>shell</code>) and &#8220;3&#8221; (<code>java_object</code>). ShellComponent lets you specify a script to run that component (e.g., your Python code). JavaObject lets you specify native Java spouts and bolts for the component (and Storm will use reflection to create that spout or bolt).</p>
 
-<p>There's a "storm shell" command that will help with submitting a topology. Its usage is like this:</p>
+<p>There&#8217;s a &#8220;storm shell&#8221; command that will help with submitting a topology. Its usage is like this:</p>
 
-<pre><code>storm shell resources/ python topology.py arg1 arg2
-</code></pre>
+<p><code>
+storm shell resources/ python topology.py arg1 arg2
+</code></p>
 
 <p>storm shell will then package resources/ into a jar, upload the jar to Nimbus, and call your topology.py script like this:</p>
 
-<pre><code>python topology.py arg1 arg2 {nimbus-host} {nimbus-port} {uploaded-jar-location}
-</code></pre>
-
-<p>Then you can connect to Nimbus using the Thrift API and submit the topology, passing {uploaded-jar-location} into the submitTopology method. For reference, here's the submitTopology definition:</p>
-
-<pre><code class="java">void submitTopology(1: string name, 2: string uploadedJarLocation, 3: string jsonConf, 4: StormTopology topology) throws (1: AlreadyAliveException e, 2: InvalidTopologyException ite);
-</code></pre>
+<p><code>
+python topology.py arg1 arg2 {nimbus-host} {nimbus-port} {uploaded-jar-location}
+</code></p>
+
+<p>Then you can connect to Nimbus using the Thrift API and submit the topology, passing {uploaded-jar-location} into the submitTopology method. For reference, here&#8217;s the submitTopology definition:</p>
+
+<p><code>java
+void submitTopology(1: string name, 2: string uploadedJarLocation, 3: string jsonConf, 4: StormTopology topology) throws (1: AlreadyAliveException e, 2: InvalidTopologyException ite);
+</code></p>
 
 <p>Finally, one of the key things to do in a non-JVM DSL is make it easy to define the entire topology in one file (the bolts, spouts, and the definition of the topology).</p>
 

Modified: incubator/storm/site/publish/documentation/Distributed-RPC.html
URL: http://svn.apache.org/viewvc/incubator/storm/site/publish/documentation/Distributed-RPC.html?rev=1597454&r1=1597453&r2=1597454&view=diff
==============================================================================
--- incubator/storm/site/publish/documentation/Distributed-RPC.html (original)
+++ incubator/storm/site/publish/documentation/Distributed-RPC.html Sun May 25 17:47:12 2014
@@ -65,17 +65,18 @@
   </ul>
 </div>
 <div id="aboutcontent">
-<p>The idea behind distributed RPC (DRPC) is to parallelize the computation of really intense functions on the fly using Storm. The Storm topology takes in as input a stream of function arguments, and it emits an output stream of the results for each of those function calls.</p>
+<p>The idea behind distributed RPC (DRPC) is to parallelize the computation of really intense functions on the fly using Storm. The Storm topology takes in as input a stream of function arguments, and it emits an output stream of the results for each of those function calls. </p>
 
-<p>DRPC is not so much a feature of Storm as it is a pattern expressed from Storm's primitives of streams, spouts, bolts, and topologies. DRPC could have been packaged as a separate library from Storm, but it's so useful that it's bundled with Storm.</p>
+<p>DRPC is not so much a feature of Storm as it is a pattern expressed from Storm&#8217;s primitives of streams, spouts, bolts, and topologies. DRPC could have been packaged as a separate library from Storm, but it&#8217;s so useful that it&#8217;s bundled with Storm.</p>
 
-<h3>High level overview</h3>
+<h3 id="high-level-overview">High level overview</h3>
 
-<p>Distributed RPC is coordinated by a "DRPC server" (Storm comes packaged with an implementation of this). The DRPC server coordinates receiving an RPC request, sending the request to the Storm topology, receiving the results from the Storm topology, and sending the results back to the waiting client. From a client's perspective, a distributed RPC call looks just like a regular RPC call. For example, here's how a client would compute the results for the "reach" function with the argument "http://twitter.com":</p>
+<p>Distributed RPC is coordinated by a &#8220;DRPC server&#8221; (Storm comes packaged with an implementation of this). The DRPC server coordinates receiving an RPC request, sending the request to the Storm topology, receiving the results from the Storm topology, and sending the results back to the waiting client. From a client&#8217;s perspective, a distributed RPC call looks just like a regular RPC call. For example, here&#8217;s how a client would compute the results for the &#8220;reach&#8221; function with the argument &#8220;http://twitter.com&#8221;:</p>
 
-<pre><code class="java">DRPCClient client = new DRPCClient("drpc-host", 3772);
+<p><code>java
+DRPCClient client = new DRPCClient("drpc-host", 3772);
 String result = client.execute("reach", "http://twitter.com");
-</code></pre>
+</code></p>
 
 <p>The distributed RPC workflow looks like this:</p>
 
@@ -83,109 +84,112 @@ String result = client.execute("reach", 
 
 <p>A client sends the DRPC server the name of the function to execute and the arguments to that function. The topology implementing that function uses a <code>DRPCSpout</code> to receive a function invocation stream from the DRPC server. Each function invocation is tagged with a unique id by the DRPC server. The topology then computes the result and at the end of the topology a bolt called <code>ReturnResults</code> connects to the DRPC server and gives it the result for the function invocation id. The DRPC server then uses the id to match up that result with which client is waiting, unblocks the waiting client, and sends it the result.</p>
 
-<h3>LinearDRPCTopologyBuilder</h3>
+<h3 id="lineardrpctopologybuilder">LinearDRPCTopologyBuilder</h3>
 
 <p>Storm comes with a topology builder called <a href="/apidocs/backtype/storm/drpc/LinearDRPCTopologyBuilder.html">LinearDRPCTopologyBuilder</a> that automates almost all the steps involved for doing DRPC. These include:</p>
 
 <ol>
-<li>Setting up the spout</li>
-<li>Returning the results to the DRPC server</li>
-<li>Providing functionality to bolts for doing finite aggregations over groups of tuples</li>
+  <li>Setting up the spout</li>
+  <li>Returning the results to the DRPC server</li>
+  <li>Providing functionality to bolts for doing finite aggregations over groups of tuples</li>
 </ol>
 
+<p>Let&#8217;s look at a simple example. Here&#8217;s the implementation of a DRPC topology that returns its input argument with a &#8220;!&#8221; appended:</p>
 
-<p>Let's look at a simple example. Here's the implementation of a DRPC topology that returns its input argument with a "!" appended:</p>
-
-<pre><code class="java">public static class ExclaimBolt extends BaseBasicBolt {
+<p>```java
+public static class ExclaimBolt extends BaseBasicBolt {
     public void execute(Tuple tuple, BasicOutputCollector collector) {
         String input = tuple.getString(1);
-        collector.emit(new Values(tuple.getValue(0), input + "!"));
-    }
+        collector.emit(new Values(tuple.getValue(0), input + "!"));
+    }</p>
 
-    public void declareOutputFields(OutputFieldsDeclarer declarer) {
-        declarer.declare(new Fields("id", "result"));
-    }
-}
+<pre><code>public void declareOutputFields(OutputFieldsDeclarer declarer) {
+    declarer.declare(new Fields("id", "result"));
+} }
+</code></pre>
 
-public static void main(String[] args) throws Exception {
-    LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("exclamation");
+<p>public static void main(String[] args) throws Exception {
+    LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("exclamation");
     builder.addBolt(new ExclaimBolt(), 3);
-    // ...
+    // ...
 }
-</code></pre>
+```</p>
 
-<p>As you can see, there's very little to it. When creating the <code>LinearDRPCTopologyBuilder</code>, you tell it the name of the DRPC function for the topology. A single DRPC server can coordinate many functions, and the function name distinguishes the functions from one another. The first bolt you declare will take in as input 2-tuples, where the first field is the request id and the second field is the arguments for that request. <code>LinearDRPCTopologyBuilder</code> expects the last bolt to emit an output stream containing 2-tuples of the form [id, result]. Finally, all intermediate tuples must contain the request id as the first field.</p>
+<p>As you can see, there&#8217;s very little to it. When creating the <code>LinearDRPCTopologyBuilder</code>, you tell it the name of the DRPC function for the topology. A single DRPC server can coordinate many functions, and the function name distinguishes the functions from one another. The first bolt you declare will take in as input 2-tuples, where the first field is the request id and the second field is the arguments for that request. <code>LinearDRPCTopologyBuilder</code> expects the last bolt to emit an output stream containing 2-tuples of the form [id, result]. Finally, all intermediate tuples must contain the request id as the first field.</p>
 
-<p>In this example, <code>ExclaimBolt</code> simply appends a "!" to the second field of the tuple. <code>LinearDRPCTopologyBuilder</code> handles the rest of the coordination of connecting to the DRPC server and sending results back.</p>
+<p>In this example, <code>ExclaimBolt</code> simply appends a &#8220;!&#8221; to the second field of the tuple. <code>LinearDRPCTopologyBuilder</code> handles the rest of the coordination of connecting to the DRPC server and sending results back.</p>
 
-<h3>Local mode DRPC</h3>
+<h3 id="local-mode-drpc">Local mode DRPC</h3>
 
-<p>DRPC can be run in local mode. Here's how to run the above example in local mode:</p>
+<p>DRPC can be run in local mode. Here&#8217;s how to run the above example in local mode:</p>
 
-<pre><code class="java">LocalDRPC drpc = new LocalDRPC();
-LocalCluster cluster = new LocalCluster();
+<p>```java
+LocalDRPC drpc = new LocalDRPC();
+LocalCluster cluster = new LocalCluster();</p>
 
-cluster.submitTopology("drpc-demo", conf, builder.createLocalTopology(drpc));
+<p>cluster.submitTopology("drpc-demo", conf, builder.createLocalTopology(drpc));</p>
 
-System.out.println("Results for 'hello':" + drpc.execute("exclamation", "hello"));
+<p>System.out.println("Results for 'hello':" + drpc.execute("exclamation", "hello"));</p>
 
-cluster.shutdown();
+<p>cluster.shutdown();
 drpc.shutdown();
-</code></pre>
+```</p>
 
 <p>First you create a <code>LocalDRPC</code> object. This object simulates a DRPC server in process, just like how <code>LocalCluster</code> simulates a Storm cluster in process. Then you create the <code>LocalCluster</code> to run the topology in local mode. <code>LinearDRPCTopologyBuilder</code> has separate methods for creating local topologies and remote topologies. In local mode the <code>LocalDRPC</code> object does not bind to any ports so the topology needs to know about the object to communicate with it. This is why <code>createLocalTopology</code> takes in the <code>LocalDRPC</code> object as input.</p>
 
 <p>After launching the topology, you can do DRPC invocations using the <code>execute</code> method on <code>LocalDRPC</code>.</p>
 
-<h3>Remote mode DRPC</h3>
+<h3 id="remote-mode-drpc">Remote mode DRPC</h3>
 
-<p>Using DRPC on an actual cluster is also straightforward. There's three steps:</p>
+<p>Using DRPC on an actual cluster is also straightforward. There are three steps:</p>
 
 <ol>
-<li>Launch DRPC server(s)</li>
-<li>Configure the locations of the DRPC servers</li>
-<li>Submit DRPC topologies to Storm cluster</li>
+  <li>Launch DRPC server(s)</li>
+  <li>Configure the locations of the DRPC servers</li>
+  <li>Submit DRPC topologies to Storm cluster</li>
 </ol>
 
-
 <p>Launching a DRPC server can be done with the <code>storm</code> script and is just like launching Nimbus or the UI:</p>
 
-<pre><code>bin/storm drpc
-</code></pre>
+<p><code>
+bin/storm drpc
+</code></p>
 
 <p>Next, you need to configure your Storm cluster to know the locations of the DRPC server(s). This is how <code>DRPCSpout</code> knows from where to read function invocations. This can be done through the <code>storm.yaml</code> file or the topology configurations. Configuring this through the <code>storm.yaml</code> looks something like this:</p>
 
-<pre><code class="yaml">drpc.servers:
+<p><code>yaml
+drpc.servers:
   - "drpc1.foo.com"
   - "drpc2.foo.com"
-</code></pre>
+</code></p>
 
 <p>Finally, you launch DRPC topologies using <code>StormSubmitter</code> just like you launch any other topology. To run the above example in remote mode, you do something like this:</p>
 
-<pre><code class="java">StormSubmitter.submitTopology("exclamation-drpc", conf, builder.createRemoteTopology());
-</code></pre>
+<p><code>java
+StormSubmitter.submitTopology("exclamation-drpc", conf, builder.createRemoteTopology());
+</code></p>
 
 <p><code>createRemoteTopology</code> is used to create topologies suitable for Storm clusters.</p>
 
-<h3>A more complex example</h3>
+<h3 id="a-more-complex-example">A more complex example</h3>
 
-<p>The exclamation DRPC example was a toy example for illustrating the concepts of DRPC. Let's look at a more complex example which really needs the parallelism a Storm cluster provides for computing the DRPC function. The example we'll look at is computing the reach of a URL on Twitter.</p>
+<p>The exclamation DRPC example was a toy example for illustrating the concepts of DRPC. Let&#8217;s look at a more complex example which really needs the parallelism a Storm cluster provides for computing the DRPC function. The example we&#8217;ll look at is computing the reach of a URL on Twitter.</p>
 
 <p>The reach of a URL is the number of unique people exposed to a URL on Twitter. To compute reach, you need to:</p>
 
 <ol>
-<li>Get all the people who tweeted the URL</li>
-<li>Get all the followers of all those people</li>
-<li>Unique the set of followers</li>
-<li>Count the unique set of followers</li>
+  <li>Get all the people who tweeted the URL</li>
+  <li>Get all the followers of all those people</li>
+  <li>Unique the set of followers</li>
+  <li>Count the unique set of followers</li>
 </ol>
 
+<p>A single reach computation can involve thousands of database calls and tens of millions of follower records. It&#8217;s a really, really intense computation. As you&#8217;re about to see, implementing this function on top of Storm is dead simple. On a single machine, reach can take minutes to compute; on a Storm cluster, you can compute reach for even the hardest URLs in a couple of seconds.</p>
 
-<p>A single reach computation can involve thousands of database calls and tens of millions of follower records during the computation. It's a really, really intense computation. As you're about to see, implementing this function on top of Storm is dead simple. On a single machine, reach can take minutes to compute; on a Storm cluster, you can compute reach for even the hardest URLs in a couple seconds.</p>
-
-<p>A sample reach topology is defined in storm-starter <a href="https://github.com/nathanmarz/storm-starter/blob/master/src/jvm/storm/starter/ReachTopology.java">here</a>. Here's how you define the reach topology:</p>
+<p>A sample reach topology is defined in storm-starter <a href="https://github.com/nathanmarz/storm-starter/blob/master/src/jvm/storm/starter/ReachTopology.java">here</a>. Here&#8217;s how you define the reach topology:</p>
 
-<pre><code class="java">LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("reach");
+<p><code>java
+LinearDRPCTopologyBuilder builder = new LinearDRPCTopologyBuilder("reach");
 builder.addBolt(new GetTweeters(), 3);
 builder.addBolt(new GetFollowers(), 12)
         .shuffleGrouping();
@@ -193,51 +197,50 @@ builder.addBolt(new PartialUniquer(), 6)
         .fieldsGrouping(new Fields("id", "follower"));
 builder.addBolt(new CountAggregator(), 2)
         .fieldsGrouping(new Fields("id"));
-</code></pre>
+</code></p>
 
 <p>The topology executes as four steps:</p>
 
 <ol>
-<li><code>GetTweeters</code> gets the users who tweeted the URL. It transforms an input stream of <code>[id, url]</code> into an output stream of <code>[id, tweeter]</code>. Each <code>url</code> tuple will map to many <code>tweeter</code> tuples.</li>
-<li><code>GetFollowers</code> gets the followers for the tweeters. It transforms an input stream of <code>[id, tweeter]</code> into an output stream of <code>[id, follower]</code>. Across all the tasks, there may of course be duplication of follower tuples when someone follows multiple people who tweeted the same URL.</li>
-<li><code>PartialUniquer</code> groups the followers stream by the follower id. This has the effect of the same follower going to the same task. So each task of <code>PartialUniquer</code> will receive mutually independent sets of followers. Once <code>PartialUniquer</code> receives all the follower tuples directed at it for the request id, it emits the unique count of its subset of followers.</li>
-<li>Finally, <code>CountAggregator</code> receives the partial counts from each of the <code>PartialUniquer</code> tasks and sums them up to complete the reach computation.</li>
+  <li><code>GetTweeters</code> gets the users who tweeted the URL. It transforms an input stream of <code>[id, url]</code> into an output stream of <code>[id, tweeter]</code>. Each <code>url</code> tuple will map to many <code>tweeter</code> tuples.</li>
+  <li><code>GetFollowers</code> gets the followers for the tweeters. It transforms an input stream of <code>[id, tweeter]</code> into an output stream of <code>[id, follower]</code>. Across all the tasks, there may of course be duplication of follower tuples when someone follows multiple people who tweeted the same URL.</li>
+  <li><code>PartialUniquer</code> groups the followers stream by the follower id. This has the effect of the same follower going to the same task. So each task of <code>PartialUniquer</code> will receive mutually independent sets of followers. Once <code>PartialUniquer</code> receives all the follower tuples directed at it for the request id, it emits the unique count of its subset of followers.</li>
+  <li>Finally, <code>CountAggregator</code> receives the partial counts from each of the <code>PartialUniquer</code> tasks and sums them up to complete the reach computation.</li>
 </ol>
 
+<p>Let&#8217;s take a look at the <code>PartialUniquer</code> bolt:</p>
 
-<p>Let's take a look at the <code>PartialUniquer</code> bolt:</p>
-
-<pre><code class="java">public class PartialUniquer extends BaseBatchBolt {
+<p>```java
+public class PartialUniquer extends BaseBatchBolt {
     BatchOutputCollector _collector;
     Object _id;
-    Set&lt;String&gt; _followers = new HashSet&lt;String&gt;();
+    Set&lt;String&gt; _followers = new HashSet&lt;String&gt;();</p>
+
+<pre><code>@Override
+public void prepare(Map conf, TopologyContext context, BatchOutputCollector collector, Object id) {
+    _collector = collector;
+    _id = id;
+}
 
-    @Override
-    public void prepare(Map conf, TopologyContext context, BatchOutputCollector collector, Object id) {
-        _collector = collector;
-        _id = id;
-    }
-
-    @Override
-    public void execute(Tuple tuple) {
-        _followers.add(tuple.getString(1));
-    }
-
-    @Override
-    public void finishBatch() {
-        _collector.emit(new Values(_id, _followers.size()));
-    }
-
-    @Override
-    public void declareOutputFields(OutputFieldsDeclarer declarer) {
-        declarer.declare(new Fields("id", "partial-count"));
-    }
+@Override
+public void execute(Tuple tuple) {
+    _followers.add(tuple.getString(1));
 }
+
+@Override
+public void finishBatch() {
+    _collector.emit(new Values(_id, _followers.size()));
+}
+
+@Override
+public void declareOutputFields(OutputFieldsDeclarer declarer) {
+    declarer.declare(new Fields("id", "partial-count"));
+} } ```
 </code></pre>
 
-<p><code>PartialUniquer</code> implements <code>IBatchBolt</code> by extending <code>BaseBatchBolt</code>. A batch bolt provides a first class API to processing a batch of tuples as a concrete unit. A new instance of the batch bolt is created for each request id, and Storm takes care of cleaning up the instances when appropriate.</p>
+<p><code>PartialUniquer</code> implements <code>IBatchBolt</code> by extending <code>BaseBatchBolt</code>. A batch bolt provides a first-class API for processing a batch of tuples as a concrete unit. A new instance of the batch bolt is created for each request id, and Storm takes care of cleaning up the instances when appropriate.</p>
 
-<p>When <code>PartialUniquer</code> receives a follower tuple in the <code>execute</code> method, it adds it to the set for the request id in an internal <code>HashSet</code>.</p>
+<p>When <code>PartialUniquer</code> receives a follower tuple in the <code>execute</code> method, it adds it to the set for the request id in an internal <code>HashSet</code>. </p>
 
 <p>Batch bolts provide the <code>finishBatch</code> method which is called after all the tuples for this batch targeted at this task have been processed. In the callback, <code>PartialUniquer</code> emits a single tuple containing the unique count for its subset of follower ids.</p>
 
@@ -245,36 +248,32 @@ builder.addBolt(new CountAggregator(), 2
 
 <p>The rest of the topology should be self-explanatory. As you can see, every single step of the reach computation is done in parallel, and defining the DRPC topology was extremely simple.</p>
 
-<h3>Non-linear DRPC topologies</h3>
+<h3 id="non-linear-drpc-topologies">Non-linear DRPC topologies</h3>
 
-<p><code>LinearDRPCTopologyBuilder</code> only handles "linear" DRPC topologies, where the computation is expressed as a sequence of steps (like reach). It's not hard to imagine functions that would require a more complicated topology with branching and merging of the bolts. For now, to do this you'll need to drop down into using <code>CoordinatedBolt</code> directly. Be sure to talk about your use case for non-linear DRPC topologies on the mailing list to inform the construction of more general abstractions for DRPC topologies.</p>
+<p><code>LinearDRPCTopologyBuilder</code> only handles &#8220;linear&#8221; DRPC topologies, where the computation is expressed as a sequence of steps (like reach). It&#8217;s not hard to imagine functions that would require a more complicated topology with branching and merging of the bolts. For now, to do this you&#8217;ll need to drop down into using <code>CoordinatedBolt</code> directly. Be sure to talk about your use case for non-linear DRPC topologies on the mailing list to inform the construction of more general abstractions for DRPC topologies.</p>
 
-<h3>How LinearDRPCTopologyBuilder works</h3>
+<h3 id="how-lineardrpctopologybuilder-works">How LinearDRPCTopologyBuilder works</h3>
 
 <ul>
-<li>DRPCSpout emits [args, return-info]. return-info is the host and port of the DRPC server as well as the id generated by the DRPC server</li>
-<li>constructs a topology comprising of:
-
-<ul>
-<li>DRPCSpout</li>
-<li>PrepareRequest (generates a request id and creates a stream for the return info and a stream for the args)</li>
-<li>CoordinatedBolt wrappers and direct groupings</li>
-<li>JoinResult (joins the result with the return info)</li>
-<li>ReturnResult (connects to the DRPC server and returns the result)</li>
-</ul>
-</li>
-<li>LinearDRPCTopologyBuilder is a good example of a higher level abstraction built on top of Storm's primitives</li>
+  <li>DRPCSpout emits [args, return-info]. return-info is the host and port of the DRPC server as well as the id generated by the DRPC server</li>
+  <li>constructs a topology comprising of:
+    <ul>
+      <li>DRPCSpout</li>
+      <li>PrepareRequest (generates a request id and creates a stream for the return info and a stream for the args)</li>
+      <li>CoordinatedBolt wrappers and direct groupings</li>
+      <li>JoinResult (joins the result with the return info)</li>
+      <li>ReturnResult (connects to the DRPC server and returns the result)</li>
+    </ul>
+  </li>
+  <li>LinearDRPCTopologyBuilder is a good example of a higher level abstraction built on top of Storm&#8217;s primitives</li>
 </ul>
 
-
-<h3>Advanced</h3>
-
+<h3 id="advanced">Advanced</h3>
 <ul>
-<li>KeyedFairBolt for weaving the processing of multiple requests at the same time</li>
-<li>How to use <code>CoordinatedBolt</code> directly</li>
+  <li>KeyedFairBolt for weaving the processing of multiple requests at the same time</li>
+  <li>How to use <code>CoordinatedBolt</code> directly</li>
 </ul>
 
-
 </div>
 </div>
 <div id="clear"></div></div>

Modified: incubator/storm/site/publish/documentation/Documentation.html
URL: http://svn.apache.org/viewvc/incubator/storm/site/publish/documentation/Documentation.html?rev=1597454&r1=1597453&r2=1597454&view=diff
==============================================================================
--- incubator/storm/site/publish/documentation/Documentation.html (original)
+++ incubator/storm/site/publish/documentation/Documentation.html Sun May 25 17:47:12 2014
@@ -65,69 +65,64 @@
   </ul>
 </div>
 <div id="aboutcontent">
-<h3>Basics of Storm</h3>
+<h3 id="basics-of-storm">Basics of Storm</h3>
 
 <ul>
-<li><a href="http://nathanmarz.github.com/storm">Javadoc</a></li>
-<li><a href="Concepts.html">Concepts</a></li>
-<li><a href="Configuration.html">Configuration</a></li>
-<li><a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a></li>
-<li><a href="Fault-tolerance.html">Fault-tolerance</a></li>
-<li><a href="Command-line-client.html">Command line client</a></li>
-<li><a href="Understanding-the-parallelism-of-a-Storm-topology.html">Understanding the parallelism of a Storm topology</a></li>
-<li><a href="FAQ.html">FAQ</a></li>
+  <li><a href="http://nathanmarz.github.com/storm">Javadoc</a></li>
+  <li><a href="Concepts.html">Concepts</a></li>
+  <li><a href="Configuration.html">Configuration</a></li>
+  <li><a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a></li>
+  <li><a href="Fault-tolerance.html">Fault-tolerance</a></li>
+  <li><a href="Command-line-client.html">Command line client</a></li>
+  <li><a href="Understanding-the-parallelism-of-a-Storm-topology.html">Understanding the parallelism of a Storm topology</a></li>
+  <li><a href="FAQ.html">FAQ</a></li>
 </ul>
 
+<h3 id="trident">Trident</h3>
 
-<h3>Trident</h3>
-
-<p>Trident is an alternative interface to Storm. It provides exactly-once processing, "transactional" datastore persistence, and a set of common stream analytics operations.</p>
+<p>Trident is an alternative interface to Storm. It provides exactly-once processing, &#8220;transactional&#8221; datastore persistence, and a set of common stream analytics operations.</p>
 
 <ul>
-<li><a href="Trident-tutorial.html">Trident Tutorial</a>     -- basic concepts and walkthrough</li>
-<li><a href="Trident-API-Overview.html">Trident API Overview</a> -- operations for transforming and orchestrating data</li>
-<li><a href="Trident-state.html">Trident State</a>        -- exactly-once processing and fast, persistent aggregation</li>
-<li><a href="Trident-spouts.html">Trident spouts</a>       -- transactional and non-transactional data intake</li>
+  <li><a href="Trident-tutorial.html">Trident Tutorial</a>     &#8211; basic concepts and walkthrough</li>
+  <li><a href="Trident-API-Overview.html">Trident API Overview</a> &#8211; operations for transforming and orchestrating data</li>
+  <li><a href="Trident-state.html">Trident State</a>        &#8211; exactly-once processing and fast, persistent aggregation</li>
+  <li><a href="Trident-spouts.html">Trident spouts</a>       &#8211; transactional and non-transactional data intake</li>
 </ul>
 
-
-<h3>Setup and deploying</h3>
+<h3 id="setup-and-deploying">Setup and deploying</h3>
 
 <ul>
-<li><a href="Setting-up-a-Storm-cluster.html">Setting up a Storm cluster</a></li>
-<li><a href="Local-mode.html">Local mode</a></li>
-<li><a href="Troubleshooting.html">Troubleshooting</a></li>
-<li><a href="Running-topologies-on-a-production-cluster.html">Running topologies on a production cluster</a></li>
-<li><a href="Maven.html">Building Storm</a> with Maven</li>
+  <li><a href="Setting-up-a-Storm-cluster.html">Setting up a Storm cluster</a></li>
+  <li><a href="Local-mode.html">Local mode</a></li>
+  <li><a href="Troubleshooting.html">Troubleshooting</a></li>
+  <li><a href="Running-topologies-on-a-production-cluster.html">Running topologies on a production cluster</a></li>
+  <li><a href="Maven.html">Building Storm</a> with Maven</li>
 </ul>
 
-
-<h3>Intermediate</h3>
+<h3 id="intermediate">Intermediate</h3>
 
 <ul>
-<li><a href="Serialization.html">Serialization</a></li>
-<li><a href="Common-patterns.html">Common patterns</a></li>
-<li><a href="Clojure-DSL.html">Clojure DSL</a></li>
-<li><a href="Using-non-JVM-languages-with-Storm.html">Using non-JVM languages with Storm</a></li>
-<li><a href="Distributed-RPC.html">Distributed RPC</a></li>
-<li><a href="Transactional-topologies.html">Transactional topologies</a></li>
-<li><a href="Kestrel-and-Storm.html">Kestrel and Storm</a></li>
-<li><a href="Direct-groupings.html">Direct groupings</a></li>
-<li><a href="Hooks.html">Hooks</a></li>
-<li><a href="Metrics.html">Metrics</a></li>
-<li><a href="">Lifecycle of a trident tuple</a></li>
+  <li><a href="Serialization.html">Serialization</a></li>
+  <li><a href="Common-patterns.html">Common patterns</a></li>
+  <li><a href="Clojure-DSL.html">Clojure DSL</a></li>
+  <li><a href="Using-non-JVM-languages-with-Storm.html">Using non-JVM languages with Storm</a></li>
+  <li><a href="Distributed-RPC.html">Distributed RPC</a></li>
+  <li><a href="Transactional-topologies.html">Transactional topologies</a></li>
+  <li><a href="Kestrel-and-Storm.html">Kestrel and Storm</a></li>
+  <li><a href="Direct-groupings.html">Direct groupings</a></li>
+  <li><a href="Hooks.html">Hooks</a></li>
+  <li><a href="Metrics.html">Metrics</a></li>
+  <li><a href="">Lifecycle of a trident tuple</a></li>
 </ul>
 
-
-<h3>Advanced</h3>
+<h3 id="advanced">Advanced</h3>
 
 <ul>
-<li><a href="Defining-a-non-jvm-language-dsl-for-storm.html">Defining a non-JVM language DSL for Storm</a></li>
-<li><a href="Multilang-protocol.html">Multilang protocol</a> (how to provide support for another language)</li>
-<li><a href="Implementation-docs.html">Implementation docs</a></li>
+  <li><a href="Defining-a-non-jvm-language-dsl-for-storm.html">Defining a non-JVM language DSL for Storm</a></li>
+  <li><a href="Multilang-protocol.html">Multilang protocol</a> (how to provide support for another language)</li>
+  <li><a href="Implementation-docs.html">Implementation docs</a></li>
 </ul>
 
-
 </div>
 </div>
 <div id="clear"></div></div>