You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by satishd <gi...@git.apache.org> on 2016/03/28 11:29:48 UTC

[GitHub] storm pull request: STORM-1652 Added trident windowing API documen...

GitHub user satishd opened a pull request:

    https://github.com/apache/storm/pull/1270

    STORM-1652 Added trident windowing API documentation to Trident API doc.

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/satishd/storm STORM-1652

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1270.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1270
    
----
commit e4c88e7932253aca54b8b9734bfe072843b9e1fc
Author: Satish Duggana <sd...@hortonworks.com>
Date:   2016-03-28T07:18:40Z

    STORM-1652 Added trident windowing API documentation to Trident API doc.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-1652 Added trident windowing API documen...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/1270#issuecomment-202472310
  
    Overall looks good I am +1 after the links are updated, but I am not super familiar with the windowing work being done in trident, so if someone else could take a look at it too that would be great.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-1652 Added trident windowing API documen...

Posted by ptgoetz <gi...@git.apache.org>.
Github user ptgoetz commented on the pull request:

    https://github.com/apache/storm/pull/1270#issuecomment-203603553
  
    +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-1652 Added trident windowing API documen...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the pull request:

    https://github.com/apache/storm/pull/1270#issuecomment-203850400
  
    +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-1652 Added trident windowing API documen...

Posted by harshach <gi...@git.apache.org>.
Github user harshach commented on the pull request:

    https://github.com/apache/storm/pull/1270#issuecomment-203520598
  
    +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-1652 Added trident windowing API documen...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/1270


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request: STORM-1652 Added trident windowing API documen...

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1270#discussion_r57591836
  
    --- Diff: docs/Trident-API-Overview.md ---
    @@ -299,6 +299,106 @@ Below example shows how these APIs can be used to find maximum using respective
     
     Example applications of these APIs can be located at [TridentMinMaxOfDevicesTopology](https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentMinMaxOfDevicesTopology.java) and [TridentMinMaxOfVehiclesTopology](https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentMinMaxOfVehiclesTopology.java) 
     
    +### Windowing
    +Trident streams can process tuples in batches which are of the same window and emit aggregated result to the next operation. 
    +There are two kinds of windowing supported which are based on processing time or tuples count:
    +    1. Tumbling window
    +    2. Sliding window
    +
    +#### Tumbling window
    +Tuples are grouped in a single window based on processing time or count. Any tuple belongs to only one of the windows.
    +
    +```java 
    +
    +    /**
    +     * Returns a stream of tuples which are aggregated results of a tumbling window with every {@code windowCount} of tuples.
    +     */
    +    public Stream tumblingWindow(int windowCount, WindowsStoreFactory windowStoreFactory,
    +                                      Fields inputFields, Aggregator aggregator, Fields functionFields);
    +    
    +    /**
    +     * Returns a stream of tuples which are aggregated results of a window that tumbles at duration of {@code windowDuration}
    +     */
    +    public Stream tumblingWindow(BaseWindowedBolt.Duration windowDuration, WindowsStoreFactory windowStoreFactory,
    +                                     Fields inputFields, Aggregator aggregator, Fields functionFields);
    +                                     
    +```
    +
    +#### Sliding window
    +Tuples are grouped in windows and window slides for every sliding interval. A tuple can belong to more than one window.
    +
    +```java
    + 
    +    /**
    +     * Returns a stream of tuples which are aggregated results of a sliding window with every {@code windowCount} of tuples
    +     * and slides the window after {@code slideCount}.
    +     */
    +    public Stream slidingWindow(int windowCount, int slideCount, WindowsStoreFactory windowStoreFactory,
    +                                      Fields inputFields, Aggregator aggregator, Fields functionFields);
    +     
    +    /**
    +     * Returns a stream of tuples which are aggregated results of a window which slides at duration of {@code slidingInterval}
    +     * and completes a window at {@code windowDuration}
    +     */
    +    public Stream slidingWindow(BaseWindowedBolt.Duration windowDuration, BaseWindowedBolt.Duration slidingInterval,
    +                                    WindowsStoreFactory windowStoreFactory, Fields inputFields, Aggregator aggregator, Fields functionFields);
    +```
    +
    +Examples of tumbling and sliding windows can be found [here](Windowing.html)
    +
    +#### Common windowing API
    +Below is the common windowing API which takes `WindowConfig` for any supported windowing configurations. 
    +
    +```java
    +
    +    public Stream window(WindowConfig windowConfig, WindowsStoreFactory windowStoreFactory, Fields inputFields,
    +                         Aggregator aggregator, Fields functionFields)
    +                         
    +```
    +
    +`windowConfig` can be any of the below.
    + - `SlidingCountWindow.of(int windowCount, int slidingCount)`
    + - `SlidingDurationWindow.of(BaseWindowedBolt.Duration windowDuration, BaseWindowedBolt.Duration slidingDuration)`
    + - `TumblingCountWindow.of(int windowLength)`
    + - `TumblingDurationWindow.of(BaseWindowedBolt.Duration windowLength)`
    + 
    + 
    +Trident windowing APIs need `WindowsStoreFactory` to store received tuples and aggregated values. Currently, basic implementation 
    +for HBase is given with `HBaseWindowsStoreFactory`. It can further be extended to address respective usecases. 
    +Example of using `HBaseWindowStoreFactory` for windowing can be seen below.    
    +
    +```java
    +
    +    // window-state table should already be created with cf:tuples column
    +    HBaseWindowsStoreFactory windowStoreFactory = new HBaseWindowsStoreFactory(new HashMap<String, Object>(), "window-state", "cf".getBytes("UTF-8"), "tuples".getBytes("UTF-8"));
    +    FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 3, new Values("the cow jumped over the moon"),
    +            new Values("the man went to the store and bought some candy"), new Values("four score and seven years ago"),
    +            new Values("how many apples can you eat"), new Values("to be or not to be the person"));
    +    spout.setCycle(true);
    +
    +    TridentTopology topology = new TridentTopology();
    +
    +    Stream stream = topology.newStream("spout1", spout).parallelismHint(16).each(new Fields("sentence"),
    +            new Split(), new Fields("word"))
    +            .window(TumblingCountWindow.of(1000), windowStoreFactory, new Fields("word"), new CountAsAggregator(), new Fields("count"))
    +            .peek(new Consumer() {
    +                @Override
    +                public void accept(TridentTuple input) {
    +                    LOG.info("Received tuple: [{}]", input);
    +                }
    +            });
    +
    +    StormTopology stormTopology =  topology.build();
    +    
    +```
    +
    +Detailed description of all the above APIs in this section can be found [here](javadocs/org/apache/storm/trident/Stream.html)  
    +
    +#### Example applications
    +Example applications of these APIs are located at [TridentHBaseWindowingStoreTopology](https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentHBaseWindowingStoreTopology.java) 
    +and [TridentWindowingInmemoryStoreTopology](https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentWindowingInmemoryStoreTopology.java) 
    --- End diff --
    
    Instead of putting full github links in here, could you please use `{{page.git-blob-base}}` instead.
    
    i.e.
    
    ```
    Example applications of these APIs are located at [TridentHBaseWindowingStoreTopology]({{page.git-blob-base}}/examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentHBaseWindowingStoreTopology.java) 
    and [TridentWindowingInmemoryStoreTopology]({{page.git-blob-base}}/examples/storm-starter/src/jvm/org/apache/storm/starter/trident/TridentWindowingInmemoryStoreTopology.java) 
    ```
    
    That way it will point to the correct thing on github associated with the release this documentation is a part of.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---