You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jerryshao <gi...@git.apache.org> on 2018/04/28 07:38:25 UTC

[GitHub] spark pull request #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <...

GitHub user jerryshao opened a pull request:

    https://github.com/apache/spark/pull/21188

    [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampUpTime corner case

    ## What changes were proposed in this pull request?
    
    Current Rate source has some issues when calculating `valueAtSecond` if `rowsPerSecond` <= `rampUpTime`, value will not be gradually increased, details can be found in [JIRA](https://issues.apache.org/jira/browse/SPARK-24046). So here propose to fix this issue. 
    
    ## How was this patch tested?
    
    Add UT


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-24046

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21188.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21188
    
----
commit bf62aed080c9a2b6b46e8ee656c70b5ae76c0d45
Author: jerryshao <ss...@...>
Date:   2018-04-28T07:29:17Z

    Fix rate source rowsPerSecond <= rampUpTime corner case

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by maasg <gi...@git.apache.org>.
Github user maasg commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    Hi Jerry, 
    I don't think the issue is solved with this patch. I plugged the new function in my notebook and it still shows a rather flat ramp-up: 
    ![image](https://user-images.githubusercontent.com/874997/39434014-b06b52b4-4c97-11e8-871e-3064cbdc101f.png)
    
    I created a PR with an alternative rate computation: https://github.com/apache/spark/pull/21194
    Could you have a look?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by maasg <gi...@git.apache.org>.
Github user maasg commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    Hi Jerry,
    There's an improvement from the original situation, but the initial ramp-up phase starts only when the time gets very close to `rampUpTime`. Here you have another example that shows the behavior:
    Above: The original implementation
    Below: This PR. (Notebook src:https://gist.github.com/maasg/95eab0e49eeef9c7457bca2900bf412e)
    ![image](https://user-images.githubusercontent.com/874997/39543852-9704b0ac-4e4c-11e8-9661-9f2bc8681c99.png)
    
    This behavior is not consistent with the working, case when `rowsPerSecond > rampUpTime`
    In this case, the ramp-up is quite smooth:
    
    ![image](https://user-images.githubusercontent.com/874997/39543996-075e0330-4e4d-11e8-89b2-90861cea9a71.png)



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21188#discussion_r185852663
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala ---
    @@ -107,14 +107,25 @@ object RateStreamProvider {
         // seconds   = 0 1 2  3  4  5  6
         // speed     = 0 2 4  6  8 10 10 (speedDeltaPerSecond * seconds)
         // end value = 0 2 6 12 20 30 40 (0 + speedDeltaPerSecond * seconds) * (seconds + 1) / 2
    -    val speedDeltaPerSecond = rowsPerSecond / (rampUpTimeSeconds + 1)
    +    val speedDeltaPerSecond = math.max(1, rowsPerSecond / (rampUpTimeSeconds + 1))
    --- End diff --
    
    Keep at-least 1 per second and leave other seconds to 0 is ok IMOP. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89952/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    @maasg as comment in #21194, I just consider we should not change the behavior while `seconds > rampUpTimeSeconds`. Maybe it more important than smooth.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    **[Test build #89952 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89952/testReport)** for PR 21188 at commit [`bf62aed`](https://github.com/apache/spark/commit/bf62aed080c9a2b6b46e8ee656c70b5ae76c0d45).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    Isn't this a flat ramp-up smoothly increasing the rows per second? Your proposal is another solution, but just two options...


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4227/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    **[Test build #89952 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89952/testReport)** for PR 21188 at commit [`bf62aed`](https://github.com/apache/spark/commit/bf62aed080c9a2b6b46e8ee656c70b5ae76c0d45).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21188
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2736/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org