Posted to reviews@spark.apache.org by tdas <gi...@git.apache.org> on 2016/07/26 02:38:14 UTC

[GitHub] spark pull request #14361: [TEST][STREAMING] Fix flaky Kafka rate controllin...

GitHub user tdas opened a pull request:

    https://github.com/apache/spark/pull/14361

    [TEST][STREAMING] Fix flaky Kafka rate controlling test

    ## What changes were proposed in this pull request?
    
    The current test is incorrect because:
    - The expected number of messages does not account for the topic having 2 partitions, while the rate is set per partition.
    - In some cases, the test ran out of data in Kafka while waiting for the right amount of data per batch.
    
    The PR
    - Reduces the partitions to 1
    - Adds more data to Kafka
    - Runs with a 0.5 second batch interval so that batches are created slowly
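
The arithmetic behind the two failure modes in this description can be sketched as below. This is a minimal illustration, not code from the patch: the function names, the rate of 100 messages per second, and the 200 messages of test data are all hypothetical. It also takes the description's premise at face value, namely that the rate limit applies to each partition separately.

```python
def expected_messages_per_batch(rate_per_partition, batch_interval_s, num_partitions):
    """Records one batch may hold if the rate limit applies to each partition."""
    return int(rate_per_partition * batch_interval_s * num_partitions)


def full_batches_available(total_messages, messages_per_batch):
    """How many full-size batches the topic can supply before it runs dry."""
    return total_messages // messages_per_batch


# Hypothetical values, chosen only for illustration.
rate = 100        # messages per second, per partition
interval = 0.5    # batch interval in seconds
total = 200       # messages written to the topic by the test

one_partition = expected_messages_per_batch(rate, interval, 1)
two_partitions = expected_messages_per_batch(rate, interval, 2)

print(one_partition, full_batches_available(total, one_partition))
print(two_partitions, full_batches_available(total, two_partitions))
```

Under these assumptions, a second partition doubles the batch size the test should expect and halves the number of full batches the same data can supply, which matches both problems listed above.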
    
    ## How was this patch tested?
    Ran many times locally; going to run it many times in Jenkins.
    
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark kafka-rate-test-fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14361.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14361
    
----
commit 52b5a209eab2b11565e0e9403b35b5eae6429e53
Author: Tathagata Das <ta...@gmail.com>
Date:   2016-07-26T02:28:55Z

    Fixed rate controller test

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    **[Test build #62861 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62861/consoleFull)** for PR 14361 at commit [`52b5a20`](https://github.com/apache/spark/commit/52b5a209eab2b11565e0e9403b35b5eae6429e53).




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    @koeninger Can you take a look?




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62861/




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    I also ran it on my machine over 250 times without a single failure, so I am merging this to master and 2.0.




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #14361: [TEST][STREAMING] Fix flaky Kafka rate controllin...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14361




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    **[Test build #3192 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3192/consoleFull)** for PR 14361 at commit [`52b5a20`](https://github.com/apache/spark/commit/52b5a209eab2b11565e0e9403b35b5eae6429e53).




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    **[Test build #3191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3191/consoleFull)** for PR 14361 at commit [`52b5a20`](https://github.com/apache/spark/commit/52b5a209eab2b11565e0e9403b35b5eae6429e53).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    **[Test build #3192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3192/consoleFull)** for PR 14361 at commit [`52b5a20`](https://github.com/apache/spark/commit/52b5a209eab2b11565e0e9403b35b5eae6429e53).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    **[Test build #62861 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62861/consoleFull)** for PR 14361 at commit [`52b5a20`](https://github.com/apache/spark/commit/52b5a209eab2b11565e0e9403b35b5eae6429e53).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    **[Test build #3191 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3191/consoleFull)** for PR 14361 at commit [`52b5a20`](https://github.com/apache/spark/commit/52b5a209eab2b11565e0e9403b35b5eae6429e53).




[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    Flaky test fixes are always a higher priority for merging because flaky tests block productivity for others. We have often ignored tests at very short notice to unblock others. My apologies for not mentioning this explicitly; I still welcome your feedback, as we can always incorporate it in another PR.





[GitHub] spark issue #14361: [TEST][STREAMING] Fix flaky Kafka rate controlling test

Posted by koeninger <gi...@git.apache.org>.
Github user koeninger commented on the issue:

    https://github.com/apache/spark/pull/14361
  
    - This is testing RateEstimator, not maxRatePerPartition.  I didn't write the rate estimator code, but my understanding of the rate expressed there is that it is on a per-stream basis, not a per-partition basis.  So your explanation of why partitions need to be reduced to 1 doesn't make sense to me.
    
    - Even if that is the case, it seems like a better idea to fix the expected sizes, not limit to 1 partition, because people will be using backpressure with multi-partition topics
    
    - This test exists in both 0.8 and 0.10, but this patch only applies to 0.10
    
    Just as kind of a meta-comment, I'm not sure what the point of asking for feedback is if it's going to be merged to master within 5 hours regardless.  I was asleep during that entire time.  I understand the rush for 2.0, and I'm not trying to play the "Apache Process" card or get in your face... I'd just ask you to consider the reasoning involved.

