You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by jerryshao <gi...@git.apache.org> on 2018/04/28 07:38:25 UTC
[GitHub] spark pull request #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <...
GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/21188
[SPARK-24046][SS] Fix rate source rowsPerSecond <= rampUpTime corner case
## What changes were proposed in this pull request?
Current Rate source has some issues when calculating `valueAtSecond` if `rowsPerSecond` <= `rampUpTime`, value will not be gradually increased, details can be found in [JIRA](https://issues.apache.org/jira/browse/SPARK-24046). So here propose to fix this issue.
## How was this patch tested?
Add UT
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jerryshao/apache-spark SPARK-24046
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21188.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21188
----
commit bf62aed080c9a2b6b46e8ee656c70b5ae76c0d45
Author: jerryshao <ss...@...>
Date: 2018-04-28T07:29:17Z
Fix rate source rowsPerSecond <= rampUpTime corner case
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21188
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by maasg <gi...@git.apache.org>.
Github user maasg commented on the issue:
https://github.com/apache/spark/pull/21188
Hi Jerry,
I don't think the issue is solved with this patch. I plugged the new function in my notebook and it still shows a rather flat ramp-up:
![image](https://user-images.githubusercontent.com/874997/39434014-b06b52b4-4c97-11e8-871e-3064cbdc101f.png)
I created a PR with an alternative rate computation: https://github.com/apache/spark/pull/21194
Could you have a look?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by maasg <gi...@git.apache.org>.
Github user maasg commented on the issue:
https://github.com/apache/spark/pull/21188
Hi Jerry,
There's an improvement from the original situation, but the initial ramp-up phase starts only when the time gets very close to `rampUpTime`. Here you have another example that shows the behavior:
Above: The original implementation
Below: This PR. (Notebook src:https://gist.github.com/maasg/95eab0e49eeef9c7457bca2900bf412e)
![image](https://user-images.githubusercontent.com/874997/39543852-9704b0ac-4e4c-11e8-9661-9f2bc8681c99.png)
This behavior is not consistent with the working, case when `rowsPerSecond > rampUpTime`
In this case, the ramp-up is quite smooth:
![image](https://user-images.githubusercontent.com/874997/39543996-075e0330-4e4d-11e8-89b2-90861cea9a71.png)
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <...
Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:
https://github.com/apache/spark/pull/21188#discussion_r185852663
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala ---
@@ -107,14 +107,25 @@ object RateStreamProvider {
// seconds = 0 1 2 3 4 5 6
// speed = 0 2 4 6 8 10 10 (speedDeltaPerSecond * seconds)
// end value = 0 2 6 12 20 30 40 (0 + speedDeltaPerSecond * seconds) * (seconds + 1) / 2
- val speedDeltaPerSecond = rowsPerSecond / (rampUpTimeSeconds + 1)
+ val speedDeltaPerSecond = math.max(1, rowsPerSecond / (rampUpTimeSeconds + 1))
--- End diff --
Keep at-least 1 per second and leave other seconds to 0 is ok IMOP.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21188
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89952/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21188
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:
https://github.com/apache/spark/pull/21188
@maasg as comment in #21194, I just consider we should not change the behavior while `seconds > rampUpTimeSeconds`. Maybe it more important than smooth.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21188
**[Test build #89952 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89952/testReport)** for PR 21188 at commit [`bf62aed`](https://github.com/apache/spark/commit/bf62aed080c9a2b6b46e8ee656c70b5ae76c0d45).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:
https://github.com/apache/spark/pull/21188
Isn't this a flat ramp-up smoothly increasing the rows per second? Your proposal is another solution, but just two options...
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21188
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4227/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21188
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21188
**[Test build #89952 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89952/testReport)** for PR 21188 at commit [`bf62aed`](https://github.com/apache/spark/commit/bf62aed080c9a2b6b46e8ee656c70b5ae76c0d45).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21188
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2736/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org