Posted to dev@storm.apache.org by knusbaum <gi...@git.apache.org> on 2016/03/24 20:44:23 UTC

[GitHub] storm pull request: Adding documentation for Trident RAS API

GitHub user knusbaum opened a pull request:

    https://github.com/apache/storm/pull/1256

    Adding documentation for Trident RAS API

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/knusbaum/incubator-storm master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1256.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1256
    
----
commit 3b56c6b1847e5388e94b5c195b0965d55198d81c
Author: Kyle Nusbaum <ky...@gmail.com>
Date:   2016-03-24T19:42:17Z

    Adding documentation for Trident RAS API

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by jerrypeng <gi...@git.apache.org>.
Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1256#discussion_r57385884
  
    --- Diff: docs/Trident-RAS-API.md ---
    @@ -0,0 +1,49 @@
    +---
    +title: Trident RAS API
    +layout: documentation
    +documentation: true
    +---
    +
    +## Trident RAS API
    +
    +The Trident RAS (Resource Aware Scheduler) API provides a mechanism to specify the resource consumption of their topology. The API looks exactly like the base RAS API, only it is called on Trident Streams instead of Bolts and Spouts.
    --- End diff --
    
    grammar:
    
    ... the resource consumption of **a Trident** topology ...



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by satishd <gi...@git.apache.org>.
Github user satishd commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1256#discussion_r57557021
  
    --- Diff: docs/Trident-RAS-API.md ---
    @@ -0,0 +1,50 @@
    +---
    +title: Trident RAS API
    +layout: documentation
    +documentation: true
    +---
    +
    +## Trident RAS API
    +
    +The Trident RAS (Resource Aware Scheduler) API provides a mechanism to allow users to specify the resource consumption of a Trident topology. The API looks exactly like the base RAS API, only it is called on Trident Streams instead of Bolts and Spouts.
    +
    +In order to avoid duplication and inconsistency in documentation, the purpose and effects of resource setting are not described here, but are instead found in the [Resource Aware Scheduler Overview](Resource_Aware_Scheduler_overview.html)
    +
    +### Use
    +
    +First, an example:
    +
    +```java
    +    TridentTopology topology = new TridentTopology();
    +    TridentState wordCounts =
    +        topology
    +            .newStream("words", feeder)
    +            .parallelismHint(5)
    +            .setCPULoad(20)
    +            .setMemoryLoad(512,256)
    +            .each( new Fields("sentence"),  new Split(), new Fields("word"))
    +            .setCPULoad(10)
    +            .setMemoryLoad(512)
    +            .each(new Fields("word"), new BangAdder(), new Fields("word!"))
    +            .parallelismHint(10)
    +            .setCPULoad(50)
    +            .setMemoryLoad(1024)
    +            .groupBy(new Fields("word!"))
    +            .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
    +            .setCPULoad(100)
    +            .setMemoryLoad(2048);
    +```
    +
    +Resources can be set for each operation (except for grouping, shuffling, partitioning).
    +Operations that are combined by Trident into single Bolts will have their resources summed.
    +
    +Every Bolt is given **at least** the default resources, regardless of user settings.
    +
    +The Trident Stream above becomes a Storm Topology with:
    + * a spout and spout coordinator with a CPU load of 20% each, and a memory load of 512MiB on heap and 256MiB off heap.
    + * a bolt with 60% cpu load (10% + 50%) and a memory load of 1536MiB (1024 + 512) on heap from the combined `Split` and `BangAdder`
    + * a bolt with 100% cpu load and a memory load of 2048MiB.
    --- End diff --
    
    minor nit: 100% cpu load, and a memory load of 2048MiB **on heap and default value for off heap**
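
As a rough illustration of the summing rule quoted in the diff above, here is a minimal Java sketch of the arithmetic for the combined `Split` and `BangAdder` bolt. It does not use the Storm API: the class `CombinedResourcesSketch`, the `Load` holder, and the `sum` helper are hypothetical names introduced only for this example, and the sketch models only CPU and on-heap memory.

```java
import java.util.Arrays;
import java.util.List;

public class CombinedResourcesSketch {

    /** Hypothetical holder for the resources declared on one Trident operation. */
    public static final class Load {
        public final double cpuPercent; // value passed to setCPULoad(...)
        public final double onHeapMiB;  // first value passed to setMemoryLoad(...)

        public Load(double cpuPercent, double onHeapMiB) {
            this.cpuPercent = cpuPercent;
            this.onHeapMiB = onHeapMiB;
        }
    }

    /** Sums the declared resources of operations that end up in the same bolt. */
    public static Load sum(List<Load> operations) {
        double cpu = 0;
        double mem = 0;
        for (Load op : operations) {
            cpu += op.cpuPercent;
            mem += op.onHeapMiB;
        }
        return new Load(cpu, mem);
    }

    public static void main(String[] args) {
        Load split = new Load(10, 512);      // .setCPULoad(10).setMemoryLoad(512)
        Load bangAdder = new Load(50, 1024); // .setCPULoad(50).setMemoryLoad(1024)
        Load bolt = sum(Arrays.asList(split, bangAdder));
        // prints "60.0% CPU, 1536.0 MiB on heap"
        System.out.println(bolt.cpuPercent + "% CPU, " + bolt.onHeapMiB + " MiB on heap");
    }
}
```

The result matches the 60% CPU and 1536MiB on-heap figures listed for the combined bolt. Off-heap memory and the "at least the default resources" rule are left out because the quoted text does not spell out how they interact with the sum.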



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on the pull request:

    https://github.com/apache/storm/pull/1256#issuecomment-202475739
  
    Squashed and merged. Going in 1.x-branch as well.



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by jerrypeng <gi...@git.apache.org>.
Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1256#discussion_r57389473
  
    --- Diff: docs/Trident-RAS-API.md ---
    @@ -0,0 +1,49 @@
    +---
    +title: Trident RAS API
    +layout: documentation
    +documentation: true
    +---
    +
    +## Trident RAS API
    +
    +The Trident RAS (Resource Aware Scheduler) API provides a mechanism to specify the resource consumption of their topology. The API looks exactly like the base RAS API, only it is called on Trident Streams instead of Bolts and Spouts.
    +
    +In order to avoid duplication and inconsistency in documentation, the purpose and effects of resource setting are not described here, but are instead found in the [Resource Aware Scheduler Overview](Resource_Aware_Scheduler_overview.html)
    +
    +### Use
    +
    +First an example:
    +
    +```java
    +    TridentTopology topology = new TridentTopology();
    +    TridentState wordCounts =
    +        topology
    +            .newStream("words", feeder)
    +            .parallelismHint(5)
    +            .setCPULoad(20)
    +            .setMemoryLoad(512,256)
    +            .each( new Fields("sentence"),  new Split(), new Fields("word"))
    +            .setCPULoad(10)
    +            .setMemoryLoad(512)
    +            .each(new Fields("word"), new BangAdder(), new Fields("word!"))
    +            .parallelismHint(10)
    +            .setCPULoad(50)
    +            .setMemoryLoad(1024)
    +            .groupBy(new Fields("word!"))
    +            .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
    +            .setCPULoad(100)
    +            .setMemoryLoad(2048);
    +```
    +
    +Resources can be set per operation (except for grouping, shuffling, partitioning).
    +Operations that are combined by Trident into single Bolts have their resources summed.
    +
    +Every Bolt is given **at least** the default resources, regardless of user settings.
    +
    +In the above case, we end up with
    + * a spout and spout coordinator with a CPU load of 20% each, and a memory load of 512MiB on heap and 256MiB off heap.
    + * a bolt with 60% cpu load (10% + 50%) and a memory load of 1536MiB (1024 + 512) on heap from the combined `Split` and `BangAdder`
    + * a bolt with 100% cpu load and a memory load of 2048MiB.
    +
    +The methods can be called for every operation (or some of the operations) or used in the same manner as `parallelismHint()`.
    --- End diff --
    
    reword:
    The RAS API can be used in the same manner as parallelismHint(), i.e. resource declarations have the same *boundaries* as parallelismHints.



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by jerrypeng <gi...@git.apache.org>.
Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1256#discussion_r57387004
  
    --- Diff: docs/Trident-RAS-API.md ---
    @@ -0,0 +1,49 @@
    +---
    +title: Trident RAS API
    +layout: documentation
    +documentation: true
    +---
    +
    +## Trident RAS API
    +
    +The Trident RAS (Resource Aware Scheduler) API provides a mechanism to specify the resource consumption of their topology. The API looks exactly like the base RAS API, only it is called on Trident Streams instead of Bolts and Spouts.
    +
    +In order to avoid duplication and inconsistency in documentation, the purpose and effects of resource setting are not described here, but are instead found in the [Resource Aware Scheduler Overview](Resource_Aware_Scheduler_overview.html)
    +
    +### Use
    +
    +First an example:
    +
    +```java
    +    TridentTopology topology = new TridentTopology();
    +    TridentState wordCounts =
    +        topology
    +            .newStream("words", feeder)
    +            .parallelismHint(5)
    +            .setCPULoad(20)
    +            .setMemoryLoad(512,256)
    +            .each( new Fields("sentence"),  new Split(), new Fields("word"))
    +            .setCPULoad(10)
    +            .setMemoryLoad(512)
    +            .each(new Fields("word"), new BangAdder(), new Fields("word!"))
    +            .parallelismHint(10)
    +            .setCPULoad(50)
    +            .setMemoryLoad(1024)
    +            .groupBy(new Fields("word!"))
    +            .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
    +            .setCPULoad(100)
    +            .setMemoryLoad(2048);
    +```
    +
    +Resources can be set per operation (except for grouping, shuffling, partitioning).
    --- End diff --
    
    Resources can be set **for each** operation



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by jerrypeng <gi...@git.apache.org>.
Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1256#discussion_r57386690
  
    --- Diff: docs/Trident-RAS-API.md ---
    @@ -0,0 +1,49 @@
    +---
    +title: Trident RAS API
    +layout: documentation
    +documentation: true
    +---
    +
    +## Trident RAS API
    +
    +The Trident RAS (Resource Aware Scheduler) API provides a mechanism to specify the resource consumption of their topology. The API looks exactly like the base RAS API, only it is called on Trident Streams instead of Bolts and Spouts.
    +
    +In order to avoid duplication and inconsistency in documentation, the purpose and effects of resource setting are not described here, but are instead found in the [Resource Aware Scheduler Overview](Resource_Aware_Scheduler_overview.html)
    +
    +### Use
    +
    +First an example:
    --- End diff --
    
    First,



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum closed the pull request at:

    https://github.com/apache/storm/pull/1256



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by jerrypeng <gi...@git.apache.org>.
Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1256#discussion_r57387622
  
    --- Diff: docs/Trident-RAS-API.md ---
    @@ -0,0 +1,49 @@
    +---
    +title: Trident RAS API
    +layout: documentation
    +documentation: true
    +---
    +
    +## Trident RAS API
    +
    +The Trident RAS (Resource Aware Scheduler) API provides a mechanism to specify the resource consumption of their topology. The API looks exactly like the base RAS API, only it is called on Trident Streams instead of Bolts and Spouts.
    +
    +In order to avoid duplication and inconsistency in documentation, the purpose and effects of resource setting are not described here, but are instead found in the [Resource Aware Scheduler Overview](Resource_Aware_Scheduler_overview.html)
    +
    +### Use
    +
    +First an example:
    +
    +```java
    +    TridentTopology topology = new TridentTopology();
    +    TridentState wordCounts =
    +        topology
    +            .newStream("words", feeder)
    +            .parallelismHint(5)
    +            .setCPULoad(20)
    +            .setMemoryLoad(512,256)
    +            .each( new Fields("sentence"),  new Split(), new Fields("word"))
    +            .setCPULoad(10)
    +            .setMemoryLoad(512)
    +            .each(new Fields("word"), new BangAdder(), new Fields("word!"))
    +            .parallelismHint(10)
    +            .setCPULoad(50)
    +            .setMemoryLoad(1024)
    +            .groupBy(new Fields("word!"))
    +            .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
    +            .setCPULoad(100)
    +            .setMemoryLoad(2048);
    +```
    +
    +Resources can be set per operation (except for grouping, shuffling, partitioning).
    +Operations that are combined by Trident into single Bolts have their resources summed.
    --- End diff --
    
    ... into single Bolts **will** have their ...



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by jerrypeng <gi...@git.apache.org>.
Github user jerrypeng commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1256#discussion_r57391018
  
    --- Diff: docs/Trident-RAS-API.md ---
    @@ -0,0 +1,49 @@
    +---
    +title: Trident RAS API
    +layout: documentation
    +documentation: true
    +---
    +
    +## Trident RAS API
    +
    +The Trident RAS (Resource Aware Scheduler) API provides a mechanism to specify the resource consumption of their topology. The API looks exactly like the base RAS API, only it is called on Trident Streams instead of Bolts and Spouts.
    +
    +In order to avoid duplication and inconsistency in documentation, the purpose and effects of resource setting are not described here, but are instead found in the [Resource Aware Scheduler Overview](Resource_Aware_Scheduler_overview.html)
    +
    +### Use
    +
    +First an example:
    +
    +```java
    +    TridentTopology topology = new TridentTopology();
    +    TridentState wordCounts =
    +        topology
    +            .newStream("words", feeder)
    +            .parallelismHint(5)
    +            .setCPULoad(20)
    +            .setMemoryLoad(512,256)
    +            .each( new Fields("sentence"),  new Split(), new Fields("word"))
    +            .setCPULoad(10)
    +            .setMemoryLoad(512)
    +            .each(new Fields("word"), new BangAdder(), new Fields("word!"))
    +            .parallelismHint(10)
    +            .setCPULoad(50)
    +            .setMemoryLoad(1024)
    +            .groupBy(new Fields("word!"))
    +            .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
    +            .setCPULoad(100)
    +            .setMemoryLoad(2048);
    +```
    +
    +Resources can be set per operation (except for grouping, shuffling, partitioning).
    +Operations that are combined by Trident into single Bolts have their resources summed.
    +
    +Every Bolt is given **at least** the default resources, regardless of user settings.
    +
    +In the above case, we end up with
    --- End diff --
    
    The aforementioned Trident Topology example will be transformed into a Storm Topology with the following components:



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by knusbaum <gi...@git.apache.org>.
Github user knusbaum commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1256#discussion_r57590466
  
    --- Diff: docs/Trident-RAS-API.md ---
    @@ -0,0 +1,50 @@
    +---
    +title: Trident RAS API
    +layout: documentation
    +documentation: true
    +---
    +
    +## Trident RAS API
    +
    +The Trident RAS (Resource Aware Scheduler) API provides a mechanism to allow users to specify the resource consumption of a Trident topology. The API looks exactly like the base RAS API, only it is called on Trident Streams instead of Bolts and Spouts.
    +
    +In order to avoid duplication and inconsistency in documentation, the purpose and effects of resource setting are not described here, but are instead found in the [Resource Aware Scheduler Overview](Resource_Aware_Scheduler_overview.html)
    +
    +### Use
    +
    +First, an example:
    +
    +```java
    +    TridentTopology topology = new TridentTopology();
    +    TridentState wordCounts =
    +        topology
    +            .newStream("words", feeder)
    +            .parallelismHint(5)
    +            .setCPULoad(20)
    +            .setMemoryLoad(512,256)
    +            .each( new Fields("sentence"),  new Split(), new Fields("word"))
    +            .setCPULoad(10)
    +            .setMemoryLoad(512)
    +            .each(new Fields("word"), new BangAdder(), new Fields("word!"))
    +            .parallelismHint(10)
    +            .setCPULoad(50)
    +            .setMemoryLoad(1024)
    +            .groupBy(new Fields("word!"))
    +            .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
    +            .setCPULoad(100)
    +            .setMemoryLoad(2048);
    +```
    +
    +Resources can be set for each operation (except for grouping, shuffling, partitioning).
    +Operations that are combined by Trident into single Bolts will have their resources summed.
    +
    +Every Bolt is given **at least** the default resources, regardless of user settings.
    +
    +The Trident Stream above becomes a Storm Topology with:
    + * a spout and spout coordinator with a CPU load of 20% each, and a memory load of 512MiB on heap and 256MiB off heap.
    + * a bolt with 60% cpu load (10% + 50%) and a memory load of 1536MiB (1024 + 512) on heap from the combined `Split` and `BangAdder`
    + * a bolt with 100% cpu load and a memory load of 2048MiB.
    --- End diff --
    
    Right you are.



[GitHub] storm pull request: Adding documentation for Trident RAS API

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/1256#issuecomment-201010784
  
    +1 looks good to me.

