You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@storm.apache.org by sr...@apache.org on 2018/05/12 15:58:52 UTC
[49/51] [abbrv] [partial] storm-site git commit: Publish up to date 2.0.0-SNAPSHOT documentation

http://git-wip-us.apache.org/repos/asf/storm-site/blob/ef81b5ca/releases/2.0.0-SNAPSHOT/Performance.md
----------------------------------------------------------------------
diff --git a/releases/2.0.0-SNAPSHOT/Performance.md b/releases/2.0.0-SNAPSHOT/Performance.md
new file mode 100644
index 0000000..20acb32
--- /dev/null
+++ b/releases/2.0.0-SNAPSHOT/Performance.md
@@ -0,0 +1,165 @@
+---
+title: Performance Tuning
+layout: documentation
+documentation: true
+---
+
+Latency, throughput and resource consumption are the three key dimensions involved in performance tuning.
+In the following sections we discuss the settings that can used to tune along these dimension and understand the trade-offs.
+
+It is important to understand that these settings can vary depending on the topology, the type of hardware and the number of hosts used by the topology.
+
+## 1. Buffer Size
+Spouts and Bolts operate asynchronously using message passing. Message queues used for this purpose are of fixed but configurable size. Buffer size
+refers to the size of these queues. Every consumer has its own receive queue. The messages wait in the queue until the consumer is ready to process them.
+The queue will typically be almost empty or almost full depending whether the consumer is operating faster or slower than the rate at which producers
+are generating messages for it. Storm queues always have a single consumer and potentially multiple producers. There are two buffer size settings
+of interest:
+
+- `topology.executor.receive.buffer.size` : This is the size of the message queue employed for each spout and bolt executor.
+- `topology.transfer.buffer.size` : This is the size of the outbound message queue used for inter-worker messaging. This queue is referred to as
+the *Worker Transfer Queue*.
+
+**Note:** If the specified buffer size is not a power of 2, it is internally rounded up to the next power of 2.
+
+#### Guidance
+Very small message queues (size < 1024) are likely to hamper throughput by not providing enough isolation between the consumer and producer. This
+can affect the asynchronous nature of the processing as the producers are likely to find the downstream queue to be full.
+
+Very large message queues are also not desirable to deal with slow consumers. Better to employ more consumers (i.e. bolts) on additional CPU cores instead. If queues
+are large and often full, the messages will end up waiting longer in these queues at each step of the processing, leading to poor latency being
+reported on the Storm UI. Large queues also imply higher memory consumption especially if the queues are typically full.
+
+
+## 2. Batch Size
+Producers can either write a batch of messages to the consumer's queue or write each message individually. This batch size can be configured.
+Inserting messages in batches to downstream queues helps reduce the number of synchronization operations required for the inserts. Consequently this helps achieve higher throughput. However,
+sometimes it may take a little time for the buffer to fill up, before it is flushed into the downstream queue. This implies that the buffered messages
+will take longer to become visible to the downstream consumer who is waiting to process them. This can increase the average end-to-end latency for
+these messages. The latency can get very bad if the batch sizes are large and the topology is not experiencing high traffic.
+
+- `topology.producer.batch.size` : The batch size for writes into the receive queue of any spout/bolt is controlled via this setting. This setting
+impacts the communication within a worker process. Each upstream producer maintains a separate batch to a component's receive queue. So if two spout
+instances are writing to the same downstream bolt instance, each of the spout instances will have maintain a separate batch.
+
+-  `topology.transfer.batch.size` : Messages that are destined to a spout/bolt running on a different worker process, are sent to a queue called
+the **Worker Transfer Queue**. The Worker Transfer Thread is responsible for draining the messages in this queue and send them to the appropriate
+worker process over the network. This setting controls the batch size for writes into the Worker Transfer Queue.  This impacts the communication
+between worker processes.
+
+#### Guidance
+
+**For Low latency:** Set batch size to 1. This basically disables batching. This is likely to reduce peak sustainable throughput under heavy traffic, but
+not likely to impact throughput much under low/medium traffic situations.
+
+**For High throughput:** Set batch size > 1. Try values like 10, 100, 1000 or even higher and see what yields the best throughput for the topology.
+Beyond a certain point the throughput is likely to get worse.
+
+**Varying throughput:** Topologies often experience fluctuating amounts of incoming traffic over the day. Other topos may experience higher traffic in some
+paths and lower throughput in other paths simultaneously. If latency is not a concern, a small bach size (e.g. 10) and in conjunction with the right flush
+frequency may provide a reasonable compromise for such scenarios. For meeting stricter latency SLAs, consider setting it to 1.
+
+
+## 3. Flush Tuple Frequency
+In low/medium traffic situations or when batch size is too large, the batches may take too long to fill up and consequently the messages could take unacceptably
+long time to become visible to downstream components. In such case, periodic flushing of batches is necessary to keep the messages moving and avoid compromising
+latencies when batching is enabled.
+
+When batching has been enabled, special messages called *flush tuples* are inserted periodically into the receive queues of all spout and bolt instances.
+This causes each spout/bolt instance to flush all its outstanding batches to their respective downstream components.
+
+`topology.flush.tuple.freq.millis` : This setting controls how often the flush tuples are generated. Flush tuples are not generated if this configuration is
+set to 0 or if (`topology.producer.batch.size`=1 and `topology.transfer.batch.size`=1).
+
+
+#### Guidance
+Flushing interval can be used as tool to retain the higher throughput benefits of batching and avoid batched messages getting stuck for too long waiting for their.
+batch to fill. Preferably this value should be larger than the average execute latencies of the bolts in the topology. Trying to flush the queues more frequently than
+the amount of time it takes to produce the messages may hurt performance. Understanding the average execute latencies of each bolt will help determine the average
+number of messages in the queues between two flushes.
+
+**For Low latency:** A smaller value helps achieve tighter latency SLAs.
+
+**For High throughput:**  When trying to maximize throughput under high traffic situations, the batches are likely to get filled and flushed automatically.
+To optimize for such cases, this value can be set to a higher number.
+
+**Varying throughput:** If latency is not a concern, a larger value will optimize for high traffic situations. For meeting tighter SLAs set this to lower
+values.
+
+
+## 4. Wait Strategy
+Wait strategies are used to conserve CPU usage by trading off some latency and throughput. They are applied for the following situations:
+
+4.1 **Spout Wait:**  In low/no traffic situations, Spout's nextTuple() may not produce any new emits. To prevent invoking the Spout's nextTuple too often,
+this wait strategy is used between nextTuple() calls, allowing the spout's executor thread to idle and conserve CPU. Spout wait strategy is also used
+when the `topology.max.spout.pending` limit has been reached when ACKers are enabled. Select a strategy using `topology.spout.wait.strategy`. Configure the
+chosen wait strategy using one of the `topology.spout.wait.*` settings.
+
+4.2 **Bolt Wait:** : When a bolt polls it's receive queue for new messages to process, it is possible that the queue is empty. This typically happens
+in case of low/no traffic situations or when the upstream spout/bolt is inherently slower. This wait strategy is used in such cases. It avoids high CPU usage
+due to the bolt continuously checking on a typically empty queue. Select a strategy using `topology.bolt.wait.strategy`. The chosen strategy can be further configured
+using the `topology.bolt.wait.*` settings.
+
+4.3 **Backpressure Wait** : Select a strategy using `topology.backpressure.wait.strategy`. When a spout/bolt tries to write to a downstream component's receive queue,
+there is a possibility that the queue is full. In such cases the write needs to be retried. This wait strategy is used to induce some idling in-between re-attempts for
+conserving CPU. The chosen strategy can be further configured using the `topology.backpressure.wait.*` settings.
+
+
+#### Built-in wait strategies:
+These wait strategies are availabe for use with all of the above mentioned wait situations.
+
+- **ProgressiveWaitStrategy** : This strategy can be used for Bolt Wait or Backpressure Wait situations. Set the strategy to 'org.apache.storm.policy.WaitStrategyProgressive' to
+select this wait strategy. This is a dynamic wait strategy that enters into progressively deeper states of CPU conservation if the Backpressure Wait or Bolt Wait situations persist.
+It has 3 levels of idling and allows configuring how long to stay at each level :
+
+  1. Level1 / No Waiting - The first few times it will return immediately. This does not conserve any CPU. The number of times it remains in this state is configured using
+  `topology.spout.wait.progressive.level1.count` or `topology.bolt.wait.progressive.level1.count` or `topology.backpressure.wait.progressive.level1.count` depending which
+  situation it is being used.
+
+  2. Level 2 / Park Nanos - In this state it disables the current thread for thread scheduling purposes, for 1 nano second using LockSupport.parkNanos(). This puts the CPU in a minimal
+  conservation state. It remains in this state for `topology.spout.wait.progressive.level2.count` or `topology.bolt.wait.progressive.level2.count` or
+  `topology.backpressure.wait.progressive.level2.count` iterations.
+
+  3. Level 3 / Thread.sleep() - In this level it calls Thread.sleep() with the value specified in `topology.spout.wait.progressive.level3.sleep.millis` or
+  `topology.bolt.wait.progressive.level3.sleep.millis` or `topology.backpressure.wait.progressive.level3.sleep.millis`. This is the most CPU conserving level and it remains in
+  this level for the remaining iterations.
+
+
+- **ParkWaitStrategy** : This strategy can be used for Bolt Wait or Backpressure Wait situations. Set the strategy to `org.apache.storm.policy.WaitStrategyPark` to use this.
+This strategy disables the current thread for thread scheduling purposes by calling LockSupport.parkNanos(). The amount of park time is configured using either
+`topology.bolt.wait.park.microsec` or `topology.backpressure.wait.park.microsec` based on the wait situation it is used. Setting the park time to 0, effectively disables
+invocation of LockSupport.parkNanos and this mode can be used to achieve busy polling (which at the cost of high CPU utilization even when idle, may improve latency and/or throughput).
+
+
+## 5. Max.spout.pending
+The setting `topology.max.spout.pending` limits the number of un-ACKed tuples at the spout level. Once a spout reaches this limit, the spout's nextTuple()
+method will not be called until some ACKs are received for the outstanding emits. This setting does not have any affect if ACKing is disabled. It
+is a spout throttling mechanism which can impact throughput and latency. Setting it to null disables it for storm-core topologies. Impact on throughput
+is dependent on the topology and its concurrency (workers/executors), so experimentation is necessary to determine optimal setting. Latency and memory consumption
+is expected to typically increase with higher and higher values for this.
+
+
+## 6. Load Aware messaging
+When load aware messaging is enabled (default), shuffle grouping takes additional factors into consideration for message routing.
+Impact of this on performance is dependent on the topology and its deployment footprint (i.e. distribution over process and machines).
+Consequently it is useful to assess the impact of setting `topology.disable.loadaware.messaging` to `true` or `false` for your
+specific case.
+
+
+## 7. Sampling Rate
+Sampling rate is used to control how often certain metrics are computed on the Spout and Bolt executors. This is configured using `topology.stats.sample.rate`
+Setting it to 1 means, the stats are computed for every emitted message. As an example, to sample once every 1000 messages it can be set to  0.001. It may be
+possible to improve throughput and latency by reducing the sampling rate.
+
+
+## 8. Budgeting CPU cores for Executors
+There are three main types of executors (i.e threads) to take into account when budgeting CPU cores for them. Spout Executors, Bolt Executors, Worker Transfer (handles outbound
+messages) and NettyWorker (handles inbound messages).
+The first two are used to run spout, bolt and acker instances. The Worker Transfer thread is used to serialize and send messages to other workers (in multi-worker mode).
+
+Executors that are expected to remain busy, either because they are handling a lot of messages, or because their processing is inherently CPU intensive, should be allocated
+1 physical core each. Allocating logical cores (instead of physical) or less than 1 physical core for CPU intensive executors increases CPU contention and performance can suffer.
+Executors that are not expected to be busy can be allocated a smaller fraction of the physical core (or even logical cores). It maybe not be economical to allocate a full physical
+core for executors that are not likely to saturate the CPU.
+
+The *system bolt* generally processes very few messages per second, and so requires very little cpu (typically less than 10% of a physical core).

http://git-wip-us.apache.org/repos/asf/storm-site/blob/ef81b5ca/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.md
----------------------------------------------------------------------
diff --git a/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.md b/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.md
index e3e2b56..4d03f88 100644
--- a/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.md
+++ b/releases/2.0.0-SNAPSHOT/Resource_Aware_Scheduler_overview.md
@@ -14,27 +14,29 @@ http://www.slideshare.net/HadoopSummit/resource-aware-scheduling-in-apache-storm
 1. [Using Resource Aware Scheduler](#Using-Resource-Aware-Scheduler)
 2. [API Overview](#API-Overview)
     1. [Setting Memory Requirement](#Setting-Memory-Requirement)
-    2. [Setting CPU Requirement](#Setting-CPU-Requirement)
-    3. [Limiting the Heap Size per Worker (JVM) Process](#Limiting-the-Heap-Size-per-Worker-(JVM)Process)
-    4. [Setting Available Resources on Node](#Setting-Available-Resources-on-Node)
-    5. [Other Configurations](#Other-Configurations)
+    2. [Shared Memory Requirement](#Setting-Shared-Memory)
+    3. [Setting CPU Requirement](#Setting-CPU-Requirement)
+    4. [Limiting the Heap Size per Worker (JVM) Process](#Limiting-the-Heap-Size-per-Worker-(JVM)Process)
+    5. [Setting Available Resources on Node](#Setting-Available-Resources-on-Node)
+    6. [Other Configurations](#Other-Configurations)
 3. [Topology Priorities and Per User Resource](#Topology-Priorities-and-Per-User-Resource)
     1. [Setup](#Setup)
     2. [Specifying Topology Priority](#Specifying-Topology-Priority)
     3. [Specifying Scheduling Strategy](#Specifying-Scheduling-Strategy)
     4. [Specifying Topology Prioritization Strategy](#Specifying-Topology-Prioritization-Strategy)
-    5. [Specifying Eviction Strategy](#Specifying-Eviction-Strategy)
 4. [Profiling Resource Usage](#Profiling-Resource-Usage)
 5. [Enhancements on original DefaultResourceAwareStrategy](#Enhancements-on-original-DefaultResourceAwareStrategy)
 
 <div id='Using-Resource-Aware-Scheduler'/>
+
 ## Using Resource Aware Scheduler
 
 The user can switch to using the Resource Aware Scheduler by setting the following in *conf/storm.yaml*
-
+```
     storm.scheduler: “org.apache.storm.scheduler.resource.ResourceAwareScheduler”
-    
+```    
 <div id='API-Overview'/>
+
 ## API Overview
 
 For use with Trident, please see the [Trident RAS API](Trident-RAS-API.html)
@@ -42,41 +44,67 @@ For use with Trident, please see the [Trident RAS API](Trident-RAS-API.html)
 For a Storm Topology, the user can now specify the amount of resources a topology component (i.e. Spout or Bolt) is required to run a single instance of the component.  The user can specify the resource requirement for a topology component by using the following API calls.
 
 <div id='Setting-Memory-Requirement'/>
+
 ### Setting Memory Requirement
 
 API to set component memory requirement:
-
+```
     public T setMemoryLoad(Number onHeap, Number offHeap)
-
+```
 Parameters:
 * Number onHeap – The amount of on heap memory an instance of this component will consume in megabytes
 * Number offHeap – The amount of off heap memory an instance of this component will consume in megabytes
 
 The user also has to option to just specify the on heap memory requirement if the component does not have an off heap memory need.
-
+```
     public T setMemoryLoad(Number onHeap)
-
+```
 Parameters:
 * Number onHeap – The amount of on heap memory an instance of this component will consume
 
 If no value is provided for offHeap, 0.0 will be used. If no value is provided for onHeap, or if the API is never called for a component, the default value will be used.
 
 Example of Usage:
-
+```
     SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
     s1.setMemoryLoad(1024.0, 512.0);
     builder.setBolt("exclaim1", new ExclamationBolt(), 3)
                 .shuffleGrouping("word").setMemoryLoad(512.0);
-
+```
 The entire memory requested for this topology is 16.5 GB. That is from 10 spouts with 1GB on heap memory and 0.5 GB off heap memory each and 3 bolts with 0.5 GB on heap memory each.
 
+<div id='Setting-Shared-Memory'/>
+
+### Shared Memory
+
+In some cases you may have memory that is shared between components. It may be a as simple as a large static data structure, or as complex as static data that is memory mapped into a bolt and is shared across workers.  In any case you can specify your shared memory request by
+creating one of `SharedOffHeapWithinNode`, `SharedOffHeapWithinWorker`, or `SharedOnHeap` and adding it to bolts and spouts that use that shared memory.
+
+Example of Usage:
+
+```
+ builder.setBolt("exclaim1", new ExclamationBolt(), 3).shuffleGrouping("word")
+          .addSharedMemory(new SharedOnHeap(100, "exclaim-cache"));
+```
+
+In the above example all of the "exclaim1" bolts in a worker will share 100MB of memory.
+
+```
+ builder.setBolt("lookup", new LookupBolt(), 3).shuffleGrouping("spout")
+          .addSharedMemory(new SharedOffHeapWithinNode(500, "static-lookup"));
+```
+
+In this example all "lookup" bolts on a given node will share 500 MB or memory off heap. 
+
+
 <div id='Setting-CPU-Requirement'/>
+
 ### Setting CPU Requirement
 
 API to set component CPU requirement:
-
+```
     public T setCPULoad(Double amount)
-
+```
 Parameters:
 * Number amount – The amount of on CPU an instance of this component will consume.
 
@@ -85,18 +113,22 @@ Currently, the amount of CPU resources a component requires or is available on a
 By convention a CPU core typically will get 100 points. If you feel that your processors are more or less powerful you can adjust this accordingly. Heavy tasks that are CPU bound will get 100 points, as they can consume an entire core. Medium tasks should get 50, light tasks 25, and tiny tasks 10. In some cases you have a task that spawns other threads to help with processing. These tasks may need to go above 100 points to express the amount of CPU they are using. If these conventions are followed the common case for a single threaded task the reported Capacity * 100 should be the number of CPU points that the task needs.
 
 Example of Usage:
-
+```
     SpoutDeclarer s1 = builder.setSpout("word", new TestWordSpout(), 10);
     s1.setCPULoad(15.0);
     builder.setBolt("exclaim1", new ExclamationBolt(), 3)
                 .shuffleGrouping("word").setCPULoad(10.0);
     builder.setBolt("exclaim2", new HeavyBolt(), 1)
                     .shuffleGrouping("exclaim1").setCPULoad(450.0);
+```
 
 <div id='Limiting-the-Heap-Size-per-Worker-(JVM)Process'/>
+
 ###	Limiting the Heap Size per Worker (JVM) Process
 
+```
     public void setTopologyWorkerMaxHeapSize(Number size)
+```
 
 Parameters:
 * Number size – The memory limit a worker process will be allocated in megabytes
@@ -104,36 +136,41 @@ Parameters:
 The user can limit the amount of memory resources the resource aware scheduler allocates to a single worker on a per topology basis by using the above API.  This API is in place so that the users can spread executors to multiple workers.  However, spreading executors to multiple workers may increase the communication latency since executors will not be able to use Disruptor Queue for intra-process communication.
 
 Example of Usage:
-
+```
     Config conf = new Config();
     conf.setTopologyWorkerMaxHeapSize(512.0);
-    
+```
+
 <div id='Setting-Available-Resources-on-Node'/>
+
 ### Setting Available Resources on Node
 
 A storm administrator can specify node resource availability by modifying the *conf/storm.yaml* file located in the storm home directory of that node.
 
 A storm administrator can specify how much available memory a node has in megabytes adding the following to *storm.yaml*
-
+```
     supervisor.memory.capacity.mb: [amount<Double>]
-
+```
 A storm administrator can also specify how much available CPU resources a node has available adding the following to *storm.yaml*
-
+```
     supervisor.cpu.capacity: [amount<Double>]
-
+```
 
 Note: that the amount the user can specify for the available CPU is represented using a point system like discussed earlier.
 
 Example of Usage:
-
+```
     supervisor.memory.capacity.mb: 20480.0
     supervisor.cpu.capacity: 100.0
+```
 
 <div id='Other-Configurations'/>
+
 ### Other Configurations
 
 The user can set some default configurations for the Resource Aware Scheduler in *conf/storm.yaml*:
 
+```
     //default value if on heap memory requirement is not specified for a component 
     topology.component.resources.onheap.memory.mb: 128.0
 
@@ -145,24 +182,31 @@ The user can set some default configurations for the Resource Aware Scheduler in
 
     //default value for the max heap size for a worker  
     topology.worker.max.heap.size.mb: 768.0
+```
+
+### Warning
+
+If Resource Aware Scheduling is enabled, it will dynamically calculate the number of workers and the `topology.workers` setting is ignored.
 
 <div id='Topology-Priorities-and-Per-User-Resource'/>
+
 ## Topology Priorities and Per User Resource 
 
-The Resource Aware Scheduler or RAS also has multitenant capabilities since many Storm users typically share a Storm cluster.  Resource Aware Scheduler can allocate resources on a per user basis.  Each user can be guaranteed a certain amount of resources to run his or her topologies and the Resource Aware Scheduler will meet those guarantees when possible.  When the Storm cluster has extra free resources, Resource Aware Scheduler will to be able allocate additional resources to user in a fair manner. The importance of topologies can also vary.  Topologies can be used for actual production or just experimentation, thus Resource Aware Scheduler will take into account the importance of a topology when determining the order in which to schedule topologies or when to evict topologies
+The Resource Aware Scheduler or RAS also has multi-tenant capabilities since many Storm users typically share a Storm cluster.  Resource Aware Scheduler can allocate resources on a per user basis.  Each user can be guaranteed a certain amount of resources to run his or her topologies and the Resource Aware Scheduler will meet those guarantees when possible.  When the Storm cluster has extra free resources, Resource Aware Scheduler will to be able allocate additional resources to user in a fair manner. The importance of topologies can also vary.  Topologies can be used for actual production or just experimentation, thus Resource Aware Scheduler will take into account the importance of a topology when determining the order in which to schedule topologies or when to evict topologies
 
 <div id='Setup'/>
+
 ### Setup
 
 The resource guarantees of a user can be specified *conf/user-resource-pools.yaml*.  Specify the resource guarantees of a user in the following format:
-
+```
     resource.aware.scheduler.user.pools:
 	[UserId]
 		cpu: [Amount of Guarantee CPU Resources]
 		memory: [Amount of Guarantee Memory Resources]
-
+```
 An example of what *user-resource-pools.yaml* can look like:
-
+```
     resource.aware.scheduler.user.pools:
         jerry:
             cpu: 1000
@@ -173,11 +217,13 @@ An example of what *user-resource-pools.yaml* can look like:
         bobby:
             cpu: 5000.0
             memory: 16384.0
-
+```
 Please note that the specified amount of Guaranteed CPU and Memory can be either a integer or double
 
 <div id='Specifying-Topology-Priority'/>
+
 ### Specifying Topology Priority
+
 The range of topology priorities can range form 0-29.  The topologies priorities will be partitioned into several priority levels that may contain a range of priorities. 
 For example we can create a priority level mapping:
 
@@ -186,28 +232,29 @@ For example we can create a priority level mapping:
     DEV => 20 – 29
 
 Thus, each priority level contains 10 sub priorities. Users can set the priority level of a topology by using the following API
-
+```
     conf.setTopologyPriority(int priority)
-    
+``` 
 Parameters:
 * priority – an integer representing the priority of the topology
 
 Please note that the 0-29 range is not a hard limit.  Thus, a user can set a priority number that is higher than 29. However, the property of higher the priority number, lower the importance still holds
 
 <div id='Specifying-Scheduling-Strategy'/>
+
 ### Specifying Scheduling Strategy
 
 A user can specify on a per topology basis what scheduling strategy to use.  Users can implement the IStrategy interface and define new strategies to schedule specific topologies.  This pluggable interface was created since we realize different topologies might have different scheduling needs.  A user can set the topology strategy within the topology definition by using the API:
-
+```
     public void setTopologyStrategy(Class<? extends IStrategy> clazz)
-    
+```
 Parameters:
 * clazz – The strategy class that implements the IStrategy interface
 
 Example Usage:
-
+```
     conf.setTopologyStrategy(org.apache.storm.scheduler.resource.strategies.scheduling.DefaultResourceAwareStrategy.class);
-
+```
 A default scheduling is provided.  The DefaultResourceAwareStrategy is implemented based off the scheduling algorithm in the original paper describing resource aware scheduling in Storm:
 
 Peng, Boyang, Mohammad Hosseini, Zhihao Hong, Reza Farivar, and Roy Campbell. "R-storm: Resource-aware scheduling in storm." In Proceedings of the 16th Annual Middleware Conference, pp. 149-161. ACM, 2015.
@@ -217,60 +264,85 @@ http://dl.acm.org/citation.cfm?id=2814808
 **Please Note: Enhancements have to made on top of the original scheduling strategy as described in the paper.  Please see section "Enhancements on original DefaultResourceAwareStrategy"**
 
 <div id='Specifying-Topology-Prioritization-Strategy'/>
-### Specifying Topology Prioritization Strategy
 
-The order of scheduling is a pluggable interface in which a user could define a strategy that prioritizes topologies.  For a user to define his or her own prioritization strategy, he or she needs to implement the ISchedulingPriorityStrategy interface.  A user can set the scheduling priority strategy by setting the *Config.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY* to point to the class that implements the strategy. For instance:
+### Specifying Topology Prioritization Strategy
 
+The order of scheduling and eviction is determined by a pluggable interface in which the cluster owner can define how topologies should be scheduled.  For the owner to define his or her own prioritization strategy, she or he needs to implement the ISchedulingPriorityStrategy interface.  A user can set the scheduling priority strategy by setting the `DaemonConfig.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY` to point to the class that implements the strategy. For instance:
+```
     resource.aware.scheduler.priority.strategy: "org.apache.storm.scheduler.resource.strategies.priority.DefaultSchedulingPriorityStrategy"
-    
-A default strategy will be provided.  The following explains how the default scheduling priority strategy works.
+```
+
+Topologies are scheduled starting at the beginning of the list returned by this plugin.  If there are not enough resources to schedule the topology others are evicted starting at the end of the list.  Eviction stops when there are no lower priority topologies left to evict.
 
 **DefaultSchedulingPriorityStrategy**
 
-The order of scheduling should be based on the distance between a user’s current resource allocation and his or her guaranteed allocation.  We should prioritize the users who are the furthest away from their resource guarantee. The difficulty of this problem is that a user may have multiple resource guarantees, and another user can have another set of resource guarantees, so how can we compare them in a fair manner?  Let's use the average percentage of resource guarantees satisfied as a method of comparison.
+In the past the order of scheduling was based on the distance between a user’s current resource allocation and his or her guaranteed allocation.
+
+We currently use a slightly different approach. We simulate scheduling the highest priority topology for each user and score the topology for each of the resources using the formula
+
+```
+(Requested + Assigned - Guaranteed)/Available
+```
+
+Where
 
-For example:
+ * `Requested` is the resource requested by this topology (or a approximation of it for complex requests like shared memory)
+ * `Assigned` is the resources already assigned by the simulation.
+ * `Guaranteed` is the resource guarantee for this user
+ * `Available` is the amount of that resource currently available in the cluster.
 
-|User|Resource Guarantee|Resource Allocated|
-|----|------------------|------------------|
-|A|<10 CPU, 50GB>|<2 CPU, 40 GB>|
-|B|< 20 CPU, 25GB>|<15 CPU, 10 GB>|
+This gives a score that is negative for guaranteed requests and a score that is positive for requests that are not within the guarantee.
 
-User A’s average percentage satisfied of resource guarantee: 
+To combine different resources the maximum of all the individual resource scores is used.  This guarantees that if a user would go over a guarantee for a single resource it would not be offset by being under guarantee on any other resources.
 
-(2/10+40/50)/2  = 0.5
+For Example:
 
-User B’s average percentage satisfied of resource guarantee: 
+Assume we have to schedule the following topologies.
 
-(15/20+10/25)/2  = 0.575
+|ID|User|CPU|Memory|Priority|
+|---|----|---|------|-------|
+|A-1|A|100|1,000|1|
+|A-2|A|100|1,000|10|
+|B-1|B|100|1,000|1|
+|B-2|B|100|1,000|10|
 
-Thus, in this example User A has a smaller average percentage of his or her resource guarantee satisfied than User B.  Thus, User A should get priority to be allocated more resource, i.e., schedule a topology submitted by User A.
+The cluster as a whole has 300 CPU and 4,000 Memory.
 
-When scheduling, RAS sorts users by the average percentage satisfied of resource guarantee and schedule topologies from users based on that ordering starting from the users with the lowest average percentage satisfied of resource guarantee.  When a user’s resource guarantee is completely satisfied, the user’s average percentage satisfied of resource guarantee will be greater than or equal to 1.
+User A is guaranteed 100 CPU and 1,000 Memory.  User B is guaranteed 200 CPU and 1,500 Memory.  The scores for the most important, lowest priority number, topologies for each user would be.
 
-<div id='Specifying-Eviction-Strategy'/>
-### Specifying Eviction Strategy
-The eviction strategy is used when there are not enough free resources in the cluster to schedule new topologies. If the cluster is full, we need a mechanism to evict topologies so that user resource guarantees can be met and additional resource can be shared fairly among users. The strategy for evicting topologies is also a pluggable interface in which the user can implement his or her own topology eviction strategy.  For a user to implement his or her own eviction strategy, he or she needs to implement the IEvictionStrategy Interface and set *Config.RESOURCE_AWARE_SCHEDULER_EVICTION_STRATEGY* to point to the implemented strategy class. For instance:
+```
+A-1 Score = max(CPU: (100 + 0 - 100)/300, MEM: (1,000 + 0 - 1,000)/4,000) = 0
+B-1 Score = max(CPU: (100 + 0 - 200)/300, MEM: (1,000 + 0 - 1,500)/4,000) = -0.125
+``` 
 
-    resource.aware.scheduler.eviction.strategy: "org.apache.storm.scheduler.resource.strategies.eviction.DefaultEvictionStrategy"
+`B-1` has the lowest score so it would be the highest priority topology to schedule. In the next round the scores would be.
 
-A default eviction strategy is provided.  The following explains how the default topology eviction strategy works
+```
+A-1 Score = max(CPU: (100 + 0 - 100)/200, MEM: (1,000 + 0 - 1,000)/3,000) = 0
+B-2 Score = max(CPU: (100 + 100 - 200)/200, MEM: (1,000 + 1,000 - 1,500)/3,000) = 0.167
+```
 
-**DefaultEvictionStrategy**
+`A-1` has the lowest score now so it would be the next highest priority topology to schedule.
 
-To determine if topology eviction should occur we should take into account the priority of the topology that we are trying to schedule and whether the resource guarantees for the owner of the topology have been met.  
+This process would be repeated until all of the topologies are ordered, even if there are no resources left on the cluster to schedule a topology.
 
-We should never evict a topology from a user that does not have his or her resource guarantees satisfied.  The following flow chart should describe the logic for the eviction process.
+**FIFOSchedulingPriorityStrategy**
 
-![Viewing metrics with VisualVM](images/resource_aware_scheduler_default_eviction_strategy.png)
+The FIFO strategy is intended more for test or staging clusters where users are running integration tests or doing development work.  Topologies in these situations tend to be short lived and at times a user may forget that they are running a topology at all.
+
+To try and be as fair as possible to users running short lived topologies the `FIFOSchedulingPriorityStrategy` extends the `DefaultSchedulingPriorityStrategy` so that any negative score (a.k.a. a topology that fits within a user's guarantees) would remain unchanged, but positive scores are replaced with the up-time of the topology.
+
+This respects the guarantees of a user, but at the same time it gives priority for the rest of the resources to the most recently launched topology.  Older topologies, that have probably been forgotten about, are then least likely to get resources.
 
 <div id='Profiling-Resource-Usage'/>
+
 ## Profiling Resource Usage
 
 Figuring out resource usage for your topology:
  
 To get an idea of how much memory/CPU your topology is actually using you can add the following to your topology launch code.
- 
+
+```
     //Log all storm metrics
     conf.registerMetricsConsumer(backtype.storm.metric.LoggingMetricsConsumer.class);
  
@@ -278,29 +350,34 @@ To get an idea of how much memory/CPU your topology is actually using you can ad
     Map<String, String> workerMetrics = new HashMap<String, String>();
     workerMetrics.put("CPU", "org.apache.storm.metrics.sigar.CPUMetric");
     conf.put(Config.TOPOLOGY_WORKER_METRICS, workerMetrics);
- 
+```
+
 The CPU metrics will require you to add
- 
+
+``` 
     <dependency>
         <groupId>org.apache.storm</groupId>
         <artifactId>storm-metrics</artifactId>
         <version>1.0.0</version>
     </dependency>
- 
+```
+
 as a topology dependency (1.0.0 or higher).
  
 You can then go to your topology on the UI, turn on the system metrics, and find the log that the LoggingMetricsConsumer is writing to.  It will output results in the log like.
- 
+
+```
     1454526100 node1.nodes.com:6707 -1:__system CPU {user-ms=74480, sys-ms=10780}
     1454526100 node1.nodes.com:6707 -1:__system memory/nonHeap     {unusedBytes=2077536, virtualFreeBytes=-64621729, initBytes=2555904, committedBytes=66699264, maxBytes=-1, usedBytes=64621728}
     1454526100 node1.nodes.com:6707 -1:__system memory/heap  {unusedBytes=573861408, virtualFreeBytes=694644256, initBytes=805306368, committedBytes=657719296, maxBytes=778502144, usedBytes=83857888}
+```
 
 The metrics with -1:__system are generally metrics for the entire worker.  In the example above that worker is running on node1.nodes.com:6707.  These metrics are collected every 60 seconds.  For the CPU you can see that over the 60 seconds this worker used  74480 + 10780 = 85260 ms of CPU time.  This is equivalent to 85260/60000 or about 1.5 cores.
  
 The Memory usage is similar but look at the usedBytes.  offHeap is 64621728 or about 62MB, and onHeap is 83857888 or about 80MB, but you should know what you set your heap to in each of your workers already.  How do you divide this up per bolt/spout?  That is a bit harder and may require some trial and error from your end.
 
 <div id='Enhancements-on-original-DefaultResourceAwareStrategy'/>
-## * Enhancements on original DefaultResourceAwareStrategy *
+## Enhancements on original DefaultResourceAwareStrategy
 
 The default resource aware scheduling strategy as described in the paper above has two main scheduling phases:
 
@@ -311,12 +388,12 @@ Enhancements have been made for both scheduling phases
 
 ### Task Selection Enhancements 
 
-Instead of using a breadth first traversal of the topology graph to create a ordering of components and its executors, a new heuristic is used that orders components by the number of in and out edges (potential connections) of the component.  This is discovered to be a more effective way to colocate executors that communicate with each other and reduce the network latency.
+Instead of using a breadth first traversal of the topology graph to create a ordering of components and its executors, a new heuristic is used that orders components by the number of in and out edges (potential connections) of the component.  This is discovered to be a more effective way to co-locate executors that communicate with each other and reduce the network latency.
 
 
 ### Node Selection Enhancements
 
-Node selection comes down first selecting which rack (server rack) and then which node on that rack to choose. The gist of strategy in choosing a rack and node is finding the rack that has the "most" resource available and in that rack find the node with the "most" free resources.  The assumption we are making for this strategy is that the node or rack with the most free resources will have the highest probability that allows us to schedule colocate the most number of executors on the node or rack to reduce network communication latency
+Node selection comes down first selecting which rack (server rack) and then which node on that rack to choose. The gist of strategy in choosing a rack and node is finding the rack that has the "most" resource available and in that rack find the node with the "most" free resources.  The assumption we are making for this strategy is that the node or rack with the most free resources will have the highest probability that allows us to schedule co-locate the most number of executors on the node or rack to reduce network communication latency
 
 Racks and nodes will be sorted from best choice to worst choice.  When finding an executor, the strategy will iterate through all racks and nodes, starting from best to worst, before giving up.  Racks and nodes will be sorted in the following matter:
 
@@ -327,8 +404,8 @@ Racks and nodes will be sorted from best choice to worst choice.  When finding a
  -- Please refer the section on Subordinate Resource Availability
 
 3. Average of the all the resource availability  
- -- This is simply taking the average of the percent available (available resources on node or rack divied by theavailable resources on rack or cluster, repectively).  This situation will only be used when "effective resources" for two objects (rack or node) are the same. Then we consider the average of all the percentages of resources as a metric for sorting. For example:
-
+ -- This is simply taking the average of the percent available (available resources on node or rack divided by the available resources on rack or cluster, respectively).  This situation will only be used when "effective resources" for two objects (rack or node) are the same. Then we consider the average of all the percentages of resources as a metric for sorting. For example:
+```
         Avail Resources:
         node 1: CPU = 50 Memory = 1024 Slots = 20
         node 2: CPU = 50 Memory = 8192 Slots = 40
@@ -338,20 +415,20 @@ Racks and nodes will be sorted from best choice to worst choice.  When finding a
         node 1 = 50 / (50+50+1000) = 0.045 (CPU bound)
         node 2 = 50 / (50+50+1000) = 0.045 (CPU bound)
         node 3 = 0 (memory and slots are 0)
-
+```
 ode 1 and node 2 have the same effective resources but clearly node 2 has more resources (memory and slots) than node 1 and we would want to pick node 2 first since there is a higher probability we will be able to schedule more executors on it. This is what the phase 2 averaging does
 
-Thus the sorting follows the following progression. Compare based on 1) and if equal then compare based on 2) and if equal compare based on 3) and if equal we break ties by arbitrarly assigning ordering based on comparing the ids of the node or rack.
+Thus the sorting follows the following progression. Compare based on 1) and if equal then compare based on 2) and if equal compare based on 3) and if equal we break ties by arbitrarily assigning ordering based on comparing the ids of the node or rack.
 
 **Subordinate Resource Availability**
 
-Originally the getBestClustering algorithm for RAS finds the "Best" rack based on which rack has the "most available" resources by finding the rack with the biggest sum of available memory + available across all nodes in the rack. This method is not very accurate since memory and cpu usage aree values on a different scale and the values are not normailized. This method is also not effective since it does not consider the number of slots available and it fails to identifying racks that are not schedulable due to the exhaustion of one of the resources either memory, cpu, or slots. Also the previous method does not consider failures of workers. When executors of a topology gets unassigned and needs to be scheduled again, the current logic in getBestClustering may be inadequate since it will likely return a cluster that is different from where the majority of executors from the topology is originally scheduling in.
+Originally the getBestClustering algorithm for RAS finds the "Best" rack based on which rack has the "most available" resources by finding the rack with the biggest sum of available memory + available across all nodes in the rack. This method is not very accurate since memory and cpu usage agree values on a different scale and the values are not normalized. This method is also not effective since it does not consider the number of slots available and it fails to identifying racks that are not schedulable due to the exhaustion of one of the resources either memory, cpu, or slots. Also the previous method does not consider failures of workers. When executors of a topology gets unassigned and needs to be scheduled again, the current logic in getBestClustering may be inadequate since it will likely return a cluster that is different from where the majority of executors from the topology is originally scheduling in.
 
 The new strategy/algorithm to find the "best" rack or node, I dub subordinate resource availability ordering (inspired by Dominant Resource Fairness), sorts racks and nodes by the subordinate (not dominant) resource availability.
 
 For example given 4 racks with the following resource availabilities
-
-    //generate some that has alot of memory but little of cpu
+```
+    //generate some that has a lot of memory but little of cpu
     rack-3 Avail [ CPU 100.0 MEM 200000.0 Slots 40 ] Total [ CPU 100.0 MEM 200000.0 Slots 40 ]
     //generate some supervisors that are depleted of one resource
     rack-2 Avail [ CPU 0.0 MEM 80000.0 Slots 40 ] Total [ CPU 0.0 MEM 80000.0 Slots 40 ]
@@ -362,7 +439,7 @@ For example given 4 racks with the following resource availabilities
     //best rack to choose
     rack-0 Avail [ CPU 4000.0 MEM 80000.0 Slots 40( ] Total [ CPU 4000.0 MEM 80000.0 Slots 40 ]
     Cluster Overall Avail [ CPU 12200.0 MEM 410000.0 Slots 200 ] Total [ CPU 12200.0 MEM 410000.0 Slots 200 ]
-
+```
 It is clear that rack-0 is the best cluster since its the most balanced and can potentially schedule the most executors, while rack-2 is the worst rack since rack-2 is depleted of cpu resource thus rendering it unschedulable even though there are other resources available.
 
 We first calculate the resource availability percentage of all the racks for each resource by computing:
@@ -372,13 +449,13 @@ We first calculate the resource availability percentage of all the racks for eac
 We do this calculation to normalize the values otherwise the resource values would not be comparable.
 
 So for our example:
-
+```
     rack-3 Avail [ CPU 0.819672131147541% MEM 48.78048780487805% Slots 20.0% ] effective resources: 0.00819672131147541
     rack-2 Avail [ 0.0% MEM 19.51219512195122% Slots 20.0% ] effective resources: 0.0
     rack-4 Avail [ CPU 50.0% MEM 2.4390243902439024% Slots 20.0% ] effective resources: 0.024390243902439025
     rack-1 Avail [ CPU 16.39344262295082% MEM 9.75609756097561% Slots 20.0% ] effective resources: 0.0975609756097561
     rack-0 Avail [ CPU 32.78688524590164% MEM 19.51219512195122% Slots 20.0% ] effective resources: 0.1951219512195122
-
+```
 The effective resource of a rack, which is also the subordinate resource, is computed by: 
 
     MIN(resource availability percentage of {CPU, Memory, # of free Slots}).
@@ -409,7 +486,7 @@ The below graphs provides a comparison of how well the various strategies schedu
 
 For this network metric, the larger the number is number is the more potential network latency the topology will have for this scheduling.  Two types of experiments are performed.  One set experiments are performed with randomly generated topologies and randomly generate clusters.  The other set of experiments are performed with a dataset containing all of the running topologies at yahoo and semi-randomly generated clusters based on the size of the topology.  Both set of experiments are run millions of iterations until results converge.  
 
-For the experiments involving randomly generated topologies, an optimal strategy is implemented that exhausively finds the most optimal solution if a solution exists.  The topologies and clusters used in this experiment are relatively small so that the optimal strategy traverse to solution space to find a optimal solution in a reasonable amount of time.  This strategy is not run with the Yahoo topologies since the topologies are large and would take unreasonable amount of time to run, since the solutions space is W^N (ordering doesn't matter within a worker) where W is the number of workers and N is the number of executors. The NextGenStrategy represents the scheduling strategy with these enhancements.  The DefaultResourceAwareStrategy represents the original scheduling strategy.  The RoundRobinStrategy represents a naive strategy that simply schedules executors in a round robin fashion while respecting the resource constraints.  The graph below presents averages of the network metr
 ic.  A CDF graph is also presented further down.
+For the experiments involving randomly generated topologies, an optimal strategy is implemented that exhaustively finds the most optimal solution if a solution exists.  The topologies and clusters used in this experiment are relatively small so that the optimal strategy traverse to solution space to find a optimal solution in a reasonable amount of time.  This strategy is not run with the Yahoo topologies since the topologies are large and would take unreasonable amount of time to run, since the solutions space is W^N (ordering doesn't matter within a worker) where W is the number of workers and N is the number of executors. The NextGenStrategy represents the scheduling strategy with these enhancements.  The DefaultResourceAwareStrategy represents the original scheduling strategy.  The RoundRobinStrategy represents a naive strategy that simply schedules executors in a round robin fashion while respecting the resource constraints.  The graph below presents averages of the network met
 ric.  A CDF graph is also presented further down.
 
 | Random Topologies | Yahoo topologies |
 |-------------------|------------------|

http://git-wip-us.apache.org/repos/asf/storm-site/blob/ef81b5ca/releases/2.0.0-SNAPSHOT/Running-topologies-on-a-production-cluster.md
----------------------------------------------------------------------
diff --git a/releases/2.0.0-SNAPSHOT/Running-topologies-on-a-production-cluster.md b/releases/2.0.0-SNAPSHOT/Running-topologies-on-a-production-cluster.md
index 802fd2e..b92f3b2 100644
--- a/releases/2.0.0-SNAPSHOT/Running-topologies-on-a-production-cluster.md
+++ b/releases/2.0.0-SNAPSHOT/Running-topologies-on-a-production-cluster.md
@@ -16,7 +16,7 @@ conf.setMaxSpoutPending(5000);
 StormSubmitter.submitTopology("mytopology", conf, topology);
 ```
 
-3) Create a jar containing your code and all the dependencies of your code (except for Storm -- the Storm jars will be added to the classpath on the worker nodes).
+3) Create a JAR containing your topology code. You have the option to either bundle all of the dependencies of your code into that JAR (except for Storm -- the Storm JARs will be added to the classpath on the worker nodes), or you can leverage the [Classpath handling](Classpath-handling.html) features in Storm for using external libraries without bundling them into your topology JAR.
 
 If you're using Maven, the [Maven Assembly Plugin](http://maven.apache.org/plugins/maven-assembly-plugin/) can do the packaging for you. Just add this to your pom.xml:
 

http://git-wip-us.apache.org/repos/asf/storm-site/blob/ef81b5ca/releases/2.0.0-SNAPSHOT/SECURITY.md
----------------------------------------------------------------------
diff --git a/releases/2.0.0-SNAPSHOT/SECURITY.md b/releases/2.0.0-SNAPSHOT/SECURITY.md
index a9d0d7f..96aef73 100644
--- a/releases/2.0.0-SNAPSHOT/SECURITY.md
+++ b/releases/2.0.0-SNAPSHOT/SECURITY.md
@@ -17,6 +17,9 @@ Authentication and Authorization. But to do so usually requires
 configuring your Operating System to restrict the operations that can be done.
 This is generally a good idea even if you plan on running your cluster with Auth.
 
+Storm's OS level security relies on running Storm processes using OS accounts that have only the permissions they need. 
+Note that workers run under the same OS account as the Supervisor daemon by default.
+
 The exact detail of how to setup these precautions varies a lot and is beyond
 the scope of this document.
 
@@ -74,6 +77,7 @@ ui.filter.params:
    "kerberos.name.rules": "RULE:[2:$1@$0]([jt]t@.*EXAMPLE.COM)s/.*/$MAPRED_USER/ RULE:[2:$1@$0]([nd]n@.*EXAMPLE.COM)s/.*/$HDFS_USER/DEFAULT"
 ```
 make sure to create a principal 'HTTP/{hostname}' (here hostname should be the one where UI daemon runs
+Be aware that the UI user *MUST* be HTTP.
 
 Once configured users needs to do kinit before accessing UI.
 Ex:
@@ -88,9 +92,9 @@ curl  -i --negotiate -u:anyUser  -b ~/cookiejar.txt -c ~/cookiejar.txt  http://s
 **Caution**: In AD MIT Keberos setup the key size is bigger than the default UI jetty server request header size. Make sure you set ui.header.buffer.bytes to 65536 in storm.yaml. More details are on [STORM-633](https://issues.apache.org/jira/browse/STORM-633)
 
 
-## UI / DRPC SSL 
+## UI / DRPC / LOGVIEWER SSL 
 
-Both UI and DRPC allows users to configure ssl .
+UI,DRPC and LOGVIEWER allows users to configure ssl .
 
 ### UI
 
@@ -135,6 +139,26 @@ If users want to setup 2-way auth
 
 
 
+### LOGVIEWER
+similarly to UI and DRPC , users need to configure following for LOGVIEWER
+
+1. logviewer.https.port 
+2. logviewer.https.keystore.type (example "jks")
+3. logviewer.https.keystore.path (example "/etc/ssl/storm_keystore.jks")
+4. logviewer.https.keystore.password (keystore password)
+5. logviewer.https.key.password (private key password)
+
+optional config 
+6. logviewer.https.truststore.path (example "/etc/ssl/storm_truststore.jks")
+7. logviewer.https.truststore.password (truststore password)
+8. logviewer.https.truststore.type (example "jks")
+
+If users want to setup 2-way auth
+9. logviewer.https.want.client.auth (If this set to true server requests for client certifcate authentication, but keeps the connection if no authentication provided)
+10. logviewer.https.need.client.auth (If this set to true server requires client to provide authentication)
+
+
+
 
 ## Authentication (Kerberos)
 
@@ -423,16 +447,20 @@ nimbus.impersonation.acl:
 
 ### Automatic Credentials Push and Renewal
 Individual topologies have the ability to push credentials (tickets and tokens) to workers so that they can access secure services.  Exposing this to all of the users can be a pain for them.
-To hide this from them in the common case plugins can be used to populate the credentials, unpack them on the other side into a java Subject, and also allow Nimbus to renew the credentials if needed.
-These are controlled by the following configs. topology.auto-credentials is a list of java plugins, all of which must implement IAutoCredentials interface, that populate the credentials on gateway 
-and unpack them on the worker side. On a kerberos secure cluster they should be set by default to point to org.apache.storm.security.auth.kerberos.AutoTGT.  
-nimbus.credential.renewers.classes should also be set to this value so that nimbus can periodically renew the TGT on behalf of the user.
+To hide this from them in the common case plugins can be used to populate the credentials, unpack them on the other side into a java Subject, and also allow Nimbus to renew the credentials if needed. These are controlled by the following configs.
+ 
+`topology.auto-credentials` is a list of java plugins, all of which must implement the `IAutoCredentials` interface, that populate the credentials on gateway 
+and unpack them on the worker side. On a kerberos secure cluster they should be set by default to point to `org.apache.storm.security.auth.kerberos.AutoTGT`
+
+`nimbus.credential.renewers.classes` should also be set to `org.apache.storm.security.auth.kerberos.AutoTGT` so that nimbus can periodically renew the TGT on behalf of the user.  
+
+All autocredential classes that desire to implement the IMetricsRegistrant interface can register metrics automatically for each topology.  The AutoTGT class currently implements this interface and adds a metric named TGT-TimeToExpiryMsecs showing the remaining time until the TGT needs to be renewed.
 
-nimbus.credential.renewers.freq.secs controls how often the renewer will poll to see if anything needs to be renewed, but the default should be fine.
+`nimbus.credential.renewers.freq.secs` controls how often the renewer will poll to see if anything needs to be renewed, but the default should be fine.
 
-In addition Nimbus itself can be used to get credentials on behalf of the user submitting topologies. This can be configures using nimbus.autocredential.plugins.classes which is a list 
-of fully qualified class names ,all of which must implement INimbusCredentialPlugin.  Nimbus will invoke the populateCredentials method of all the configured implementation as part of topology
-submission. You should use this config with topology.auto-credentials and nimbus.credential.renewers.classes so the credentials can be populated on worker side and nimbus can automatically renew
+In addition Nimbus itself can be used to get credentials on behalf of the user submitting topologies. This can be configured using `nimbus.autocredential.plugins.classes` which is a list 
+of fully qualified class names, all of which must implement `INimbusCredentialPlugin`.  Nimbus will invoke the populateCredentials method of all the configured implementation as part of topology
+submission. You should use this config with `topology.auto-credentials` and `nimbus.credential.renewers.classes` so the credentials can be populated on worker side and nimbus can automatically renew
 them. Currently there are 2 examples of using this config, AutoHDFS and AutoHBase which auto populates hdfs and hbase delegation tokens for topology submitter so they don't have to distribute keytabs
 on all possible worker hosts.
 
@@ -475,6 +503,44 @@ nimbus.groups:
  
 
 ### DRPC
-Hopefully more on this soon
+ 
+ Storm provides the Access Control List for the DRPC Authorizer.Users can see [org.apache.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer](javadocs/org/apache/storm/security/auth/authorizer/DRPCSimpleACLAuthorizer.html) for more details.
+ 
+ There are several DRPC ACL related configurations.
+ 
+ | YAML Setting | Description |
+ |------------|----------------------|
+ | drpc.authorizer.acl | A class that will perform authorization for DRPC operations. Set this to org.apache.storm.security.auth.authorizer.DRPCSimpleACLAuthorizer when using security.|
+ | drpc.authorizer.acl.filename | This is the name of a file that the ACLs will be loaded from. It is separate from storm.yaml to allow the file to be updated without bringing down a DRPC server. Defaults to drpc-auth-acl.yaml |
+ | drpc.authorizer.acl.strict| It is useful to set this to false for staging where users may want to experiment, but true for production where you want users to be secure. Defaults to false. |
 
+ The file pointed to by drpc.authorizer.acl.filename will have only one config in it drpc.authorizer.acl this should be of the form
 
+ drpc.authorizer.acl:
+   "functionName1":
+     "client.users":
+       - "alice"
+       - "bob"
+     "invocation.user": "bob"
+     
+ In this the users bob and alice as client.users are allowed to run DRPC requests against functionName1, but only bob as the invocation.user is allowed to run the topology that actually processes those requests.
+
+
+## Cluster Zookeeper Authentication
+ 
+ Users can implement cluster Zookeeper authentication by setting several configurations are shown below.
+ 
+ | YAML Setting | Description |
+ |------------|----------------------|
+ | storm.zookeeper.auth.scheme | The cluster Zookeeper authentication scheme to use, e.g. "digest". Defaults to no authentication. |
+ | storm.zookeeper.auth.payload | A string representing the payload for cluster Zookeeper authentication.It should only be set in the storm-cluster-auth.yaml.Users can see storm-cluster-auth.yaml.example for more details. |
+ 
+ 
+ Also,there are several configurations for topology Zookeeper authentication:
+ 
+ | YAML Setting | Description |
+ |------------|----------------------|
+ | storm.zookeeper.topology.auth.scheme | The topology Zookeeper authentication scheme to use, e.g. "digest". It is the internal config and user shouldn't set it. |
+ | storm.zookeeper.topology.auth.payload | A string representing the payload for topology Zookeeper authentication. |
+ 
+ Note: If storm.zookeeper.topology.auth.payload isn't set,storm will generate a ZooKeeper secret payload for MD5-digest with generateZookeeperDigestSecretPayload() method.

http://git-wip-us.apache.org/repos/asf/storm-site/blob/ef81b5ca/releases/2.0.0-SNAPSHOT/STORM-UI-REST-API.md
----------------------------------------------------------------------
diff --git a/releases/2.0.0-SNAPSHOT/STORM-UI-REST-API.md b/releases/2.0.0-SNAPSHOT/STORM-UI-REST-API.md
index 35404f2..2e2dd21 100644
--- a/releases/2.0.0-SNAPSHOT/STORM-UI-REST-API.md
+++ b/releases/2.0.0-SNAPSHOT/STORM-UI-REST-API.md
@@ -290,7 +290,7 @@ Sample response:
     "supervisors": [{ 
         "totalMem": 4096.0, 
         "host":"192.168.10.237",
-        "id":"bdfe8eff-f1d8-4bce-81f5-9d3ae1bf432e-169.254.129.212",
+        "id":"bdfe8eff-f1d8-4bce-81f5-9d3ae1bf432e",
         "uptime":"7m 8s",
         "totalCpu":400.0,
         "usedCpu":495.0,
@@ -305,12 +305,12 @@ Sample response:
         "topologyName":"ras",
         "topologyId":"ras-4-1460229987",
         "host":"192.168.10.237",
-        "supervisorId":"bdfe8eff-f1d8-4bce-81f5-9d3ae1bf432e-169.254.129.212",
+        "supervisorId":"bdfe8eff-f1d8-4bce-81f5-9d3ae1bf432e",
         "assignedMemOnHeap":704.0,
         "uptime":"2m 47s",
         "uptimeSeconds":167,
         "port":6707,
-        "workerLogLink":"http:\/\/192.168.10.237:8000\/log?file=ras-4-1460229987%2F6707%2Fworker.log",
+        "workerLogLink":"http:\/\/host:8000\/log?file=ras-4-1460229987%2F6707%2Fworker.log",
         "componentNumTasks": {
             "word":5
         },
@@ -322,11 +322,11 @@ Sample response:
         "topologyName":"ras",
         "topologyId":"ras-4-1460229987",
         "host":"192.168.10.237",
-        "supervisorId":"bdfe8eff-f1d8-4bce-81f5-9d3ae1bf432e-169.254.129.212",
+        "supervisorId":"bdfe8eff-f1d8-4bce-81f5-9d3ae1bf432e",
         "assignedMemOnHeap":904.0,
         "uptime":"2m 53s",
         "port":6706,
-        "workerLogLink":"http:\/\/192.168.10.237:8000\/log?file=ras-4-1460229987%2F6706%2Fworker.log",
+        "workerLogLink":"http:\/\/host:8000\/log?file=ras-4-1460229987%2F6706%2Fworker.log",
         "componentNumTasks":{
             "exclaim2":2,
             "exclaim1":3,
@@ -340,6 +340,63 @@ Sample response:
 }
 ```
 
+### /api/v1/owner-resources (GET)
+
+Returns summary information aggregated at the topology owner level. Pass an optional id for a specific owner (e.g. /api/v1/owner-resources/theowner)
+
+Response fields:
+
+|Field  |Value|Description|
+|---	|---	|---
+|owner  |String |Topology owner
+|totalTopologies|Integer|Total number of topologies owner is running
+|totalExecutors |Integer|Total number of executors used by owner
+|totalWorkers |Integer|Total number of workers used by owner
+|totalTasks|Integer|Total number of tasks used by owner
+|totalMemoryUsage|Double|Total Memory Assigned on behalf of owner in MB
+|totalCpuUsage|Double|Total CPU Resource Assigned on behalf of User. Every 100 means 1 core
+|memoryGuarantee|Double|The amount of memory resource (in MB) guaranteed to owner
+|cpuGuarantee|Double|The amount of CPU resource (every 100 means 1 core) guaranteed to owner
+|isolatedNodes|Integer|The amount of nodes that are guaranteed isolated to owner
+|memoryGuaranteeRemaining|Double|The amount of guaranteed memory resources (in MB) remaining
+|cpuGuaranteeRemaining|Double|The amount of guaranteed CPU resource (every 100 means 1 core) remaining
+|totalReqOnHeapMem|Double| Total On-Heap Memory Requested by User in MB
+|totalReqOffHeapMem|Double|Total Off-Heap Memory Requested by User in MB
+|totalReqMem|Double|Total Memory Requested by User in MB
+|totalReqCpu|Double|Total CPU Resource Requested by User. Every 100 means 1 core
+|totalAssignedOnHeapMem|Double|Total On-Heap Memory Assigned on behalf of owner in MB
+|totalAssignedOffHeapMem|Double|Total Off-Heap Memory Assigned on behalf of owner in MB
+
+Sample response:
+ 
+```json
+{
+    "owners": [
+        {
+            "totalReqOnHeapMem": 896,
+            "owner": "ownerA",
+            "totalExecutors": 7,
+            "cpuGuaranteeRemaining": 30,
+            "totalReqMem": 896,
+            "cpuGuarantee": 100,
+            "isolatedNodes": "N/A",
+            "memoryGuarantee": 4000,
+            "memoryGuaranteeRemaining": 3104,
+            "totalTasks": 7,
+            "totalMemoryUsage": 896,
+            "totalReqOffHeapMem": 0,
+            "totalReqCpu": 70,
+            "totalWorkers": 2,
+            "totalCpuUsage": 70,
+            "totalAssignedOffHeapMem": 0,
+            "totalAssignedOnHeapMem": 896,
+            "totalTopologies": 1
+        }
+    ],
+    "schedulerDisplayResource": true
+}
+```
+
 ### /api/v1/topology/summary (GET)
 
 Returns summary information for all topologies.
@@ -396,9 +453,10 @@ Sample response:
 }
 ```
 
-### /api/v1/topology-workers/:id (GET)
+### /api/v1/topology-workers/\<id\> (GET)
 
-Returns the worker' information (host and port) for a topology.
+Returns the worker' information (host and port) for a topology whose id is substituted for \<id\>. 
+The topology id is obtained by the above /topology/summary call. 
 
 Response fields:
 
@@ -429,9 +487,9 @@ Sample response:
 }
 ```
 
-### /api/v1/topology/:id (GET)
+### /api/v1/topology/\<id\> (GET)
 
-Returns topology information and statistics.  Substitute id with topology id.
+Returns topology information and statistics.  Substitute \<id\> with the topology id.
 
 Request parameters:
 
@@ -545,6 +603,25 @@ Sample response:
     "msgTimeout": 30,
     "windowHint": "10m 0s",
     "schedulerDisplayResource": true,
+    "workers": [{
+        "topologyName": "WordCount3",
+        "topologyId": "WordCount3-1-1402960825",
+        "host": "my-host",
+        "supervisorId": "9124ca9a-42e8-481e-9bf3-a041d9595430",
+        "assignedMemOnHeap": 1452.0,
+        "uptime": "27m 26s",
+        "port": 6702,
+        "workerLogLink": "logs",
+        "componentNumTasks": {
+            "spout": 2,
+            "count": 3,
+            "split": 10
+        },
+        "executorsTotal": 15,
+        "uptimeSeconds": 1646,
+        "assignedCpu": 260.0,
+        "assignedMemOffHeap": 160.0
+    }]
     "topologyStats": [
         {
             "windowPretty": "10m 0s",
@@ -690,9 +767,9 @@ Sample response:
 }
 ```
 
-### /api/v1/topology/:id/metrics
+### /api/v1/topology/\<id\>/metrics
 
-Returns detailed metrics for topology. It shows metrics per component, which are aggregated by stream.
+Returns detailed metrics for topology for a topology whose id is substituted for \<id\>. It shows metrics per component, which are aggregated by stream.
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -967,9 +1044,9 @@ Sample response:
 }
 ```
 
-### /api/v1/topology/:id/component/:component (GET)
+### /api/v1/topology/\<id\>/component/\<component\> (GET)
 
-Returns detailed metrics and executor information
+Returns detailed metrics and executor information for a topology whose id is substituted for \<id\> and a component whose id is substituted for \<component\>
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1205,9 +1282,10 @@ Sample response:
 
 ## Profiling and Debugging GET Operations
 
-###  /api/v1/topology/:id/profiling/start/:host-port/:timeout (GET)
+###  /api/v1/topology/\<id\>/profiling/start/\<host-port\>/\<timeout\> (GET)
 
 Request to start profiler on worker with timeout. Returns status and link to profiler artifacts for worker.
+Substitute appropriate values for \<id\>, \<host-port\> and \<timeout\>.
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1243,9 +1321,10 @@ Sample response:
 }
 ```
 
-###  /api/v1/topology/:id/profiling/dumpprofile/:host-port (GET)
+###  /api/v1/topology/\<id\>/profiling/dumpprofile/\<host-port\> (GET)
 
 Request to dump profiler recording on worker. Returns status and worker id for the request.
+Substitute for \<id\> and \<host-port\>. 
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1274,9 +1353,10 @@ Sample response:
 }
 ```
 
-###  /api/v1/topology/:id/profiling/stop/:host-port (GET)
+###  /api/v1/topology/\<id\>/profiling/stop/\<host-port\> (GET)
 
 Request to stop profiler on worker. Returns status and worker id for the request.
+Substitute for \<id\> and \<host-port\>. 
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1305,9 +1385,10 @@ Sample response:
 }
 ```
 
-###  /api/v1/topology/:id/profiling/dumpjstack/:host-port (GET)
+###  /api/v1/topology/\<id\>/profiling/dumpjstack/\<host-port\> (GET)
 
 Request to dump jstack on worker. Returns status and worker id for the request.
+Substitute for \<id\> and \<host-port\>. 
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1336,9 +1417,10 @@ Sample response:
 }
 ```
 
-###  /api/v1/topology/:id/profiling/dumpheap/:host-port (GET)
+###  /api/v1/topology/\<id\>/profiling/dumpheap/\<host-port\> (GET)
 
 Request to dump heap (jmap) on worker. Returns status and worker id for the request.
+Substitute for \<id\> and \<host-port\>. 
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1367,9 +1449,10 @@ Sample response:
 }
 ```
 
-###  /api/v1/topology/:id/profiling/restartworker/:host-port (GET)
+###  /api/v1/topology/\<id\>/profiling/restartworker/\<host-port\> (GET)
 
 Request to request the worker. Returns status and worker id for the request.
+Substitute for \<id\> and \<host-port\>. 
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1400,9 +1483,9 @@ Sample response:
 
 ## POST Operations
 
-### /api/v1/topology/:id/activate (POST)
+### /api/v1/topology/\<id\>/activate (POST)
 
-Activates a topology.
+Activates a topology. Substitute for \<id\>. 
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1415,9 +1498,9 @@ Sample Response:
 ```
 
 
-### /api/v1/topology/:id/deactivate (POST)
+### /api/v1/topology/\<id\>/deactivate (POST)
 
-Deactivates a topology.
+Deactivates a topology. Substitute for \<id\>. 
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1430,9 +1513,10 @@ Sample Response:
 ```
 
 
-### /api/v1/topology/:id/rebalance/:wait-time (POST)
+### /api/v1/topology/\<id\>/rebalance/\<wait-time\> (POST)
 
 Rebalances a topology.
+Substitute for \<id\> and \<wait-time\>. 
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1464,9 +1548,10 @@ Sample Response:
 
 
 
-### /api/v1/topology/:id/kill/:wait-time (POST)
+### /api/v1/topology/\<id\>/kill/\<wait-time\> (POST)
 
 Kills a topology.
+Substitute for \<id\> and \<wait-time\>. 
 
 |Parameter |Value   |Description  |
 |----------|--------|-------------|
@@ -1495,3 +1580,28 @@ Sample response:
   "errorMessage": "java.lang.NullPointerException\n\tat clojure.core$name.invoke(core.clj:1505)\n\tat org.apache.storm.ui.core$component_page.invoke(core.clj:752)\n\tat org.apache.storm.ui.core$fn__7766.invoke(core.clj:782)\n\tat compojure.core$make_route$fn__5755.invoke(core.clj:93)\n\tat compojure.core$if_route$fn__5743.invoke(core.clj:39)\n\tat compojure.core$if_method$fn__5736.invoke(core.clj:24)\n\tat compojure.core$routing$fn__5761.invoke(core.clj:106)\n\tat clojure.core$some.invoke(core.clj:2443)\n\tat compojure.core$routing.doInvoke(core.clj:106)\n\tat clojure.lang.RestFn.applyTo(RestFn.java:139)\n\tat clojure.core$apply.invoke(core.clj:619)\n\tat compojure.core$routes$fn__5765.invoke(core.clj:111)\n\tat ring.middleware.reload$wrap_reload$fn__6880.invoke(reload.clj:14)\n\tat org.apache.storm.ui.core$catch_errors$fn__7800.invoke(core.clj:836)\n\tat ring.middleware.keyword_params$wrap_keyword_params$fn__6319.invoke(keyword_params.clj:27)\n\tat ring.middleware.nested_params$wra
 p_nested_params$fn__6358.invoke(nested_params.clj:65)\n\tat ring.middleware.params$wrap_params$fn__6291.invoke(params.clj:55)\n\tat ring.middleware.multipart_params$wrap_multipart_params$fn__6386.invoke(multipart_params.clj:103)\n\tat ring.middleware.flash$wrap_flash$fn__6675.invoke(flash.clj:14)\n\tat ring.middleware.session$wrap_session$fn__6664.invoke(session.clj:43)\n\tat ring.middleware.cookies$wrap_cookies$fn__6595.invoke(cookies.clj:160)\n\tat ring.adapter.jetty$proxy_handler$fn__6112.invoke(jetty.clj:16)\n\tat ring.adapter.jetty.proxy$org.mortbay.jetty.handler.AbstractHandler$0.handle(Unknown Source)\n\tat org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)\n\tat org.mortbay.jetty.Server.handle(Server.java:326)\n\tat org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)\n\tat org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)\n\tat org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)\n\tat org
 .mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)\n\tat org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)\n\tat org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)\n\tat org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)\n"
 }
 ```
+
+# DRPC REST API
+
+If DRPC is configured with either an http or https port it will expose a REST endpoint. (See [Setting up a Storm cluster](Setting-up-a-Storm-cluster.html) for how to do that)
+
+In all of these commands `:func` is the DRPC function and `:args` is the arguments to it.  The only difference is in how those arguments are supplied.  In all cases the response
+is in the response's body.
+
+In all cases DRPC does not have state, so if your request times out or results in an error please retry the request, but preferably with an exponential backoff to avoid doing a
+DDOS on the DRPC servers.
+
+### /drpc/:func (POST)
+
+In this case the `:args` to the drpc request are in the body of the post.
+
+### /drpc/:func/:args (GET)
+
+In this case the `:args` are supplied as a part of the URL itself.  There are limitations on URL lengths by many tools, so if this is above a hundred characters it is recomended 
+to use the POST option instead.
+
+### /drpc/:func (GET)
+
+In some rare cases `:args` may not be needed by the DRPC command.  If no `:args` section is given in the DRPC request and empty string `""` will be used for the arguments.
+
+

http://git-wip-us.apache.org/repos/asf/storm-site/blob/ef81b5ca/releases/2.0.0-SNAPSHOT/Serialization.md
----------------------------------------------------------------------
diff --git a/releases/2.0.0-SNAPSHOT/Serialization.md b/releases/2.0.0-SNAPSHOT/Serialization.md
index c2e129b..e35a0f9 100644
--- a/releases/2.0.0-SNAPSHOT/Serialization.md
+++ b/releases/2.0.0-SNAPSHOT/Serialization.md
@@ -9,7 +9,7 @@ Tuples can be comprised of objects of any types. Since Storm is a distributed sy
 
 Storm uses [Kryo](https://github.com/EsotericSoftware/kryo) for serialization. Kryo is a flexible and fast serialization library that produces small serializations.
 
-By default, Storm can serialize primitive types, strings, byte arrays, ArrayList, HashMap, HashSet, and the Clojure collection types. If you want to use another type in your tuples, you'll need to register a custom serializer.
+By default, Storm can serialize primitive types, strings, byte arrays, ArrayList, HashMap, and HashSet. If you want to use another type in your tuples, you'll need to register a custom serializer.
 
 ### Dynamic typing
 
@@ -25,7 +25,7 @@ Finally, another reason for using dynamic typing is so Storm can be used in a st
 
 As mentioned, Storm uses Kryo for serialization. To implement custom serializers, you need to register new serializers with Kryo. It's highly recommended that you read over [Kryo's home page](https://github.com/EsotericSoftware/kryo) to understand how it handles custom serialization.
 
-Adding custom serializers is done through the "topology.kryo.register" property in your topology config. It takes a list of registrations, where each registration can take one of two forms:
+Adding custom serializers is done through the "topology.kryo.register" property in your topology config or through a ServiceLoader described later. The config takes a list of registrations, where each registration can take one of two forms:
 
 1. The name of a class to register. In this case, Storm will use Kryo's `FieldsSerializer` to serialize the class. This may or may not be optimal for the class -- see the Kryo docs for more details.
 2. A map from the name of a class to register to an implementation of [com.esotericsoftware.kryo.Serializer](https://github.com/EsotericSoftware/kryo/blob/master/src/com/esotericsoftware/kryo/Serializer.java).
@@ -45,6 +45,12 @@ Storm provides helpers for registering serializers in a topology config. The [Co
 
 There's an advanced config called `Config.TOPOLOGY_SKIP_MISSING_KRYO_REGISTRATIONS`. If you set this to true, Storm will ignore any serializations that are registered but do not have their code available on the classpath. Otherwise, Storm will throw errors when it can't find a serialization. This is useful if you run many topologies on a cluster that each have different serializations, but you want to declare all the serializations across all topologies in the `storm.yaml` files.
 
+#### SerializationRegister Service Loader
+
+If you want to provide language bindings to storm, have a library that you want to interact cleanly with storm or have some other reason to provide serialization bindings and don't want to force the user to update their configs you can use the org.apache.storm.serialization.SerializationRegister service loader.
+
+You may use this like any other service loader and storm will register the bindings without forceing users to update their configs. The storm-clojure package uses this to provide transparent support for clojure types.
+
 ### Java serialization
 
 If Storm encounters a type for which it doesn't have a serialization registered, it will use Java serialization if possible. If the object can't be serialized with Java serialization, then Storm will throw an error.

http://git-wip-us.apache.org/repos/asf/storm-site/blob/ef81b5ca/releases/2.0.0-SNAPSHOT/Setting-up-a-Storm-cluster.md
----------------------------------------------------------------------
diff --git a/releases/2.0.0-SNAPSHOT/Setting-up-a-Storm-cluster.md b/releases/2.0.0-SNAPSHOT/Setting-up-a-Storm-cluster.md
index 45005e2..67ad727 100644
--- a/releases/2.0.0-SNAPSHOT/Setting-up-a-Storm-cluster.md
+++ b/releases/2.0.0-SNAPSHOT/Setting-up-a-Storm-cluster.md
@@ -14,22 +14,23 @@ Here's a summary of the steps for setting up a Storm cluster:
 3. Download and extract a Storm release to Nimbus and worker machines
 4. Fill in mandatory configurations into storm.yaml
 5. Launch daemons under supervision using "storm" script and a supervisor of your choice
+6. Setup DRPC servers (Optional)
 
 ### Set up a Zookeeper cluster
 
-Storm uses Zookeeper for coordinating the cluster. Zookeeper **is not** used for message passing, so the load Storm places on Zookeeper is quite low. Single node Zookeeper clusters should be sufficient for most cases, but if you want failover or are deploying large Storm clusters you may want larger Zookeeper clusters. Instructions for deploying Zookeeper are [here](http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html). 
+Storm uses Zookeeper for coordinating the cluster. Zookeeper **is not** used for message passing, so the load Storm places on Zookeeper is quite low. Single node Zookeeper clusters should be sufficient for most cases, but if you want failover or are deploying large Storm clusters you may want larger Zookeeper clusters. Instructions for deploying Zookeeper are [here](http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html).
 
 A few notes about Zookeeper deployment:
 
-1. It's critical that you run Zookeeper under supervision, since Zookeeper is fail-fast and will exit the process if it encounters any error case. See [here](http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_supervision) for more details. 
+1. It's critical that you run Zookeeper under supervision, since Zookeeper is fail-fast and will exit the process if it encounters any error case. See [here](http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_supervision) for more details.
 2. It's critical that you set up a cron to compact Zookeeper's data and transaction logs. The Zookeeper daemon does not do this on its own, and if you don't set up a cron, Zookeeper will quickly run out of disk space. See [here](http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_maintenance) for more details.
 
 ### Install dependencies on Nimbus and worker machines
 
 Next you need to install Storm's dependencies on Nimbus and the worker machines. These are:
 
-1. Java 7
-2. Python 2.6.6
+1. Java 8+ (Apache Storm 2.x is tested through travis ci against a java 8 JDK)
+2. Python 2.6.6 (Python 3.x should work too, but is not tested as part of our CI enviornment)
 
 These are the versions of the dependencies that have been tested with Storm. Storm may or may not work with different versions of Java and/or Python.
 
@@ -58,11 +59,12 @@ If the port that your Zookeeper cluster uses is different than the default, you
 ```yaml
 storm.local.dir: "/mnt/storm"
 ```
-If you run storm on windows,it could be:
+If you run storm on windows, it could be:
+
 ```yaml
 storm.local.dir: "C:\\storm-local"
 ```
-If you use a relative path,it will be relative to where you installed storm(STORM_HOME).
+If you use a relative path, it will be relative to where you installed storm(STORM_HOME).
 You can leave it empty with default value `$STORM_HOME/storm-local`
 
 3) **nimbus.seeds**: The worker nodes need to know which machines are the candidate of master in order to download topology jars and confs. For example:
@@ -82,9 +84,15 @@ supervisor.slots.ports:
     - 6703
 ```
 
+5) **drpc.servers**: If you want to setup DRPC servers they need to specified so that the workers can find them. This should be a list of the DRPC servers.  For example:
+
+```yaml
+drpc.servers: ["111.222.333.44"]
+```
+
 ### Monitoring Health of Supervisors
 
-Storm provides a mechanism by which administrators can configure the supervisor to run administrator supplied scripts periodically to determine if a node is healthy or not. Administrators can have the supervisor determine if the node is in a healthy state by performing any checks of their choice in scripts located in storm.health.check.dir. If a script detects the node to be in an unhealthy state, it must print a line to standard output beginning with the string ERROR. The supervisor will periodically run the scripts in the health check dir and check the output. If the script’s output contains the string ERROR, as described above, the supervisor will shut down any workers and exit. 
+Storm provides a mechanism by which administrators can configure the supervisor to run administrator supplied scripts periodically to determine if a node is healthy or not. Administrators can have the supervisor determine if the node is in a healthy state by performing any checks of their choice in scripts located in storm.health.check.dir. If a script detects the node to be in an unhealthy state, it must print a line to standard output beginning with the string ERROR. The supervisor will periodically run the scripts in the health check dir and check the output. If the script’s output contains the string ERROR, as described above, the supervisor will shut down any workers and exit.
 
 If the supervisor is running with supervision "/bin/storm node-health-check" can be called to determine if the supervisor should be launched or if the node is unhealthy.
 
@@ -101,17 +109,27 @@ The time to allow any given healthcheck script to run before it is marked failed
 storm.health.check.timeout.ms: 5000
 ```
 
-### Configure external libraries and environmental variables (optional)
+### Configure external libraries and environment variables (optional)
 
-If you need support from external libraries or custom plugins, you can place such jars into the extlib/ and extlib-daemon/ directories. Note that the extlib-daemon/ directory stores jars used only by daemons (Nimbus, Supervisor, DRPC, UI, Logviewer), e.g., HDFS and customized scheduling libraries. Accordingly, two environmental variables STORM_EXT_CLASSPATH and STORM_EXT_CLASSPATH_DAEMON can be configured by users for including the external classpath and daemon-only external classpath.
+If you need support from external libraries or custom plugins, you can place such jars into the extlib/ and extlib-daemon/ directories. Note that the extlib-daemon/ directory stores jars used only by daemons (Nimbus, Supervisor, DRPC, UI, Logviewer), e.g., HDFS and customized scheduling libraries. Accordingly, two environment variables STORM_EXT_CLASSPATH and STORM_EXT_CLASSPATH_DAEMON can be configured by users for including the external classpath and daemon-only external classpath. See [Classpath handling](Classpath-handling.html) for more details on using external libraries.
 
 
 ### Launch daemons under supervision using "storm" script and a supervisor of your choice
 
 The last step is to launch all the Storm daemons. It is critical that you run each of these daemons under supervision. Storm is a __fail-fast__ system which means the processes will halt whenever an unexpected error is encountered. Storm is designed so that it can safely halt at any point and recover correctly when the process is restarted. This is why Storm keeps no state in-process -- if Nimbus or the Supervisors restart, the running topologies are unaffected. Here's how to run the Storm daemons:
 
-1. **Nimbus**: Run the command "bin/storm nimbus" under supervision on the master machine.
-2. **Supervisor**: Run the command "bin/storm supervisor" under supervision on each worker machine. The supervisor daemon is responsible for starting and stopping worker processes on that machine.
-3. **UI**: Run the Storm UI (a site you can access from the browser that gives diagnostics on the cluster and topologies) by running the command "bin/storm ui" under supervision. The UI can be accessed by navigating your web browser to http://{ui host}:8080. 
+1. **Nimbus**: Run the command `bin/storm nimbus` under supervision on the master machine.
+2. **Supervisor**: Run the command `bin/storm supervisor` under supervision on each worker machine. The supervisor daemon is responsible for starting and stopping worker processes on that machine.
+3. **UI**: Run the Storm UI (a site you can access from the browser that gives diagnostics on the cluster and topologies) by running the command "bin/storm ui" under supervision. The UI can be accessed by navigating your web browser to http://{ui host}:8080.
 
 As you can see, running the daemons is very straightforward. The daemons will log to the logs/ directory in wherever you extracted the Storm release.
+
+### Setup DRPC servers (Optional)
+
+Just like with nimbus or the supervisors you will need to launch the drpc server.  To do this run the command `bin/storm drpc` on each of the machines that you configured as a part of the `drpc.servers` config.
+
+#### DRPC Http Setup
+
+DRPC optionally offers a REST API as well.  To enable this set teh config `drpc.http.port` to the port you want to run on before launching the DRPC server. See the [REST documentation](STORM-UI-REST-API.html) for more information on how to use it.
+
+It also supports SSL by setting `drpc.https.port` along with the keystore and optional truststore similar to how you would configure the UI.