Posted to reviews@spark.apache.org by proflin <gi...@git.apache.org> on 2016/03/02 14:51:27 UTC

[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI][WIP] Make Str...

GitHub user proflin opened a pull request:

    https://github.com/apache/spark/pull/11470

    [SPARK-13618][STREAMING][WEB-UI][WIP] Make Streaming web UI page display rate-limit lines on statistics graph

    ## What changes were proposed in this pull request?
    
    This PR makes Streaming web UI display rate-limit lines in the statistics graph.
    
    Specifically, this PR:
    
    1. adds to `RateLimiter` a data structure that keeps the history of rate-limit changes, so that the upper bound on the number of records we can receive in a block interval can be calculated;
    2. adds the `numRecordsLimit` information to the path from where `BlockGenerator` generates a `Block` to the `ReceivedBlockInfo` (so that `numRecordsLimit` can be transferred over the wire to the driver side's `ReceivedBlockTracker`);
    3. makes changes in `StreamingJobProgressListener` and related places, so that the aggregated `numRecordsLimit` information for every batch can be calculated;
    4. makes changes in `StreamingPage` and related places, so that two or more lines can be drawn on a single statistics graph.
    
    ## How was this patch tested? 
    
    - unit tests
    - manually checked the UI (see below)
    
    ## Screenshots
    
    ### without back pressure
    ![](https://issues.apache.org/jira/secure/attachment/12790928/1.png)
    
    ### with back pressure
    ![](https://issues.apache.org/jira/secure/attachment/12790929/2.png)
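Item 1 of the description above relies on a history of rate-limit changes to bound how many records can arrive in a block interval. The following is only a rough sketch of that idea, not the code from this PR: `RateChange`, `RateLimitHistory`, and `maxRecords` are hypothetical names, and the sketch assumes the history is sorted by time and that its first entry is at or before the start of the interval.

```scala
// Hypothetical sketch; none of these names come from the PR or from Spark.
// Each entry means: from timeMs onward, the limit is recordsPerSec.
final case class RateChange(timeMs: Long, recordsPerSec: Double)

object RateLimitHistory {
  /** Upper bound on the records receivable in [startMs, endMs).
    * Assumes `history` is sorted by time and starts at or before startMs. */
  def maxRecords(history: Seq[RateChange], startMs: Long, endMs: Long): Double = {
    require(startMs <= endMs, "interval must not be reversed")
    require(history.nonEmpty && history.head.timeMs <= startMs,
      "history must cover the start of the interval")
    // Each change stays in effect until the next change (or endMs for the last one).
    val segmentEnds = history.drop(1).map(_.timeMs) :+ endMs
    history.zip(segmentEnds).map { case (change, nextMs) =>
      // Clip the segment to the interval; segments outside it contribute zero.
      val from = math.max(change.timeMs, startMs)
      val to = math.min(nextMs, endMs)
      val overlapMs = math.max(0L, to - from)
      change.recordsPerSec * overlapMs / 1000.0
    }.sum
  }
}
```

For example, with a limit of 100 records/s that drops to 50 records/s halfway through a 200 ms block interval, the sketch sums the two partial contributions rather than applying either rate to the whole interval.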

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/proflin/spark display-rate-limit-on-streaming-web-ui

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11470.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11470
    
----
commit a4e0739b941b61a67a3cf78f4d00223aa20b0e93
Author: proflin <pr...@gmail.com>
Date:   2016-03-02T12:40:23Z

    Adds in `RateLimiter` a data structure keeping history of rate limit changes

commit 239eab9f70bbf3de432bf19d5b2259dce9818a53
Author: proflin <pr...@gmail.com>
Date:   2016-03-02T12:45:52Z

    Add `numRecordLimit` information into the `BlockGenerator` -> `ReceivedBlockInfo` path

commit 54afb2e6c0a26fb155ba93265813bbe546258b9d
Author: proflin <pr...@gmail.com>
Date:   2016-03-02T12:52:52Z

    Enables `StreamingJobProgressListener` calculate the aggregated `numRecordsLimit` information for every batch

commit 7aba06393484c822c25f03d8911cf07d2e0258a9
Author: proflin <pr...@gmail.com>
Date:   2016-03-02T12:54:10Z

    Display the rate limit lines on `StreamingPage`

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

Posted by lw-lin <gi...@git.apache.org>.
Github user lw-lin commented on the pull request:

    https://github.com/apache/spark/pull/11470#issuecomment-203804602
  
    This PR has many conflicts to resolve, so I'm closing this for now and will re-open later, thanks.




[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI][WIP] Make Str...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the pull request:

    https://github.com/apache/spark/pull/11470#issuecomment-191369955
  
    Why do we need to display the rate-limit line when backpressure is disabled?




[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

Posted by lw-lin <gi...@git.apache.org>.
Github user lw-lin closed the pull request at:

    https://github.com/apache/spark/pull/11470




[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

Posted by lw-lin <gi...@git.apache.org>.
Github user lw-lin commented on the pull request:

    https://github.com/apache/spark/pull/11470#issuecomment-194880816
  
    All three parts are now ready for review. I've also drafted a design doc (please see [SPARK-13618](https://issues.apache.org/jira/browse/SPARK-13618)); hopefully it helps reviewers get through this faster.
    
    @tdas @zsxwing would you mind taking a look when you have time? Thanks!




[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI][WIP] Make Str...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11470#issuecomment-191245968
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI] Make Streamin...

Posted by lw-lin <gi...@git.apache.org>.
Github user lw-lin commented on the pull request:

    https://github.com/apache/spark/pull/11470#issuecomment-194874328
  
    @zsxwing
    Point taken: there is indeed no need to display the rate-limit line when an `InputDStream` instance is not _under rate control_. I've added a field `underRateControl` for this purpose (please see Part 2). Thanks!
    
    Regarding where to collect the rate limit numbers (on the driver side or on the executor side), I've attached a design doc that includes a short discussion. Thanks for pointing this out!





[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI][WIP] Make Str...

Posted by proflin <gi...@git.apache.org>.
Github user proflin commented on the pull request:

    https://github.com/apache/spark/pull/11470#issuecomment-191261292
  
    Two things I'm still working on:
    - better handling of situations where `numRecordsLimit` is `Some(...)` for some batches and `None` for the others;
    - fixing compilation and test issues in modules such as `external/kafka`.
    
    The `streaming` module compiles and passes unit tests. I'm posting this at this stage hoping to get some early feedback so I can improve it. @tdas @zsxwing would you please take a look when you have time? Thanks!
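The open question above, what to do when some limits are `Some(...)` and others are `None`, comes down to choosing an aggregation policy. The sketch below shows two possible policies and is not taken from the PR; `LimitAggregation`, `sumIfAllDefined`, and `sumKnown` are hypothetical names, and treating an empty input as "no limit" is an assumption.

```scala
// Hypothetical sketch of two aggregation policies; not the PR's implementation.
object LimitAggregation {
  // Policy A (assumption): a total limit exists only if every part reports one,
  // because a single unlimited part makes the overall bound unknown.
  def sumIfAllDefined(limits: Seq[Option[Long]]): Option[Long] =
    if (limits.nonEmpty && limits.forall(_.isDefined)) Some(limits.flatten.sum)
    else None

  // Policy B (assumption): sum the known limits and report whether the sum is complete,
  // leaving it to the caller (e.g. the UI) to decide how to render a partial value.
  def sumKnown(limits: Seq[Option[Long]]): (Long, Boolean) =
    (limits.flatten.sum, limits.forall(_.isDefined))
}
```

Policy A avoids drawing a misleading line on the graph, at the cost of gaps; policy B keeps a continuous line but needs a way to mark partial values.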




[GitHub] spark pull request: [SPARK-13618][STREAMING][WEB-UI][WIP] Make Str...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the pull request:

    https://github.com/apache/spark/pull/11470#issuecomment-191370838
  
    A suggestion: could you just add the rate limit information to some Batch info and then store the history of limits in `StreamingJobProgressListener`, so that we don't need to add the history to `RateLimiter`?
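One way to read the suggestion above is that each completed batch carries its own limit and the listener keeps a bounded history of them. The following is a loose sketch under that reading. `BatchLimit` and `LimitHistory` are hypothetical stand-ins and are not Spark's `BatchInfo` or `StreamingJobProgressListener`. The retention count is likewise an assumption.

```scala
import scala.collection.mutable

// Hypothetical stand-in for per-batch info carrying the limit; not Spark's BatchInfo.
final case class BatchLimit(batchTimeMs: Long, numRecordsLimit: Option[Long])

// Hypothetical listener-side store keeping only the most recent `retained` batches.
final class LimitHistory(retained: Int) {
  require(retained > 0, "must retain at least one batch")
  private val batches = mutable.Queue.empty[BatchLimit]

  // Called once per completed batch; drops the oldest entry when over capacity.
  def onBatchCompleted(info: BatchLimit): Unit = synchronized {
    batches.enqueue(info)
    if (batches.size > retained) batches.dequeue()
  }

  // (batch time, limit) pairs for plotting; batches without a limit are skipped.
  def points: Seq[(Long, Long)] = synchronized {
    batches.toList.collect { case BatchLimit(t, Some(limit)) => (t, limit) }
  }
}
```

Keeping the history on the listener side means the receiver-side rate limiter only needs to expose its current limit, which appears to be the simplification the suggestion is after.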

