Posted to reviews@spark.apache.org by jose-torres <gi...@git.apache.org> on 2018/02/20 23:48:03 UTC

[GitHub] spark pull request #20646: [SPARK-23408][SS] Synchronize successive AddDataM...

GitHub user jose-torres opened a pull request:

    https://github.com/apache/spark/pull/20646

    [SPARK-23408][SS] Synchronize successive AddDataMemory actions in StreamTest.

    ## What changes were proposed in this pull request?
    
    The stream-stream join tests add data to multiple sources and expect all of it to show up in the next batch. But there is a race condition: the new batch might trigger after only one of the AddData actions has run.
    
    Fortunately, MemoryStream synchronizes batch generation on itself, and StreamExecution won't generate empty batches. So we can resolve this race condition by running successive AddDataMemory actions while synchronized on every MemoryStream at once. Then we can be sure StreamExecution won't start generating a batch before all the data is present.
    
    ## How was this patch tested?
    Existing tests.
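
The locking idea in the description above can be sketched with plain JVM monitors. This is only an illustration, not the code in this pull request: `FakeSource`, `withAllLocked`, and `demo` are hypothetical names, and `FakeSource` is a stand-in that assumes, as the description says of MemoryStream, that a source synchronizes reads and writes on itself. The sketch holds every source's monitor while adding data to each one, so a concurrent reader cannot observe a state where only one source has data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SyncSketch {
    /** Hypothetical stand-in for a memory-backed source; not Spark's MemoryStream. */
    static final class FakeSource {
        private final List<Integer> rows = new ArrayList<>();

        // Both methods lock on the source itself, as the description assumes.
        synchronized void addData(int value) { rows.add(value); }
        synchronized int rowCount() { return rows.size(); }
    }

    /** Runs body while holding the monitor of every source, acquired in list order. */
    static void withAllLocked(List<FakeSource> sources, int index, Runnable body) {
        if (index == sources.size()) {
            body.run();
            return;
        }
        synchronized (sources.get(index)) {
            withAllLocked(sources, index + 1, body);
        }
    }

    /** Adds one row to each source under all locks; returns the counts a reader sees. */
    static int[] demo() throws InterruptedException {
        FakeSource left = new FakeSource();
        FakeSource right = new FakeSource();
        int[] seen = new int[2];

        // Plays the role of batch generation: it reads each source under that source's lock.
        Thread reader = new Thread(() -> {
            seen[0] = left.rowCount();
            seen[1] = right.rowCount();
        });

        withAllLocked(Arrays.asList(left, right), 0, () -> {
            left.addData(1);   // monitors are reentrant, so this does not self-deadlock
            reader.start();    // the reader blocks on left's monitor until all locks are released
            right.addData(2);
        });

        reader.join();         // join makes the reader's writes to 'seen' visible here
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] seen = demo();
        System.out.println(seen[0] + " " + seen[1]); // prints "1 1"
    }
}
```

Acquiring the monitors in one fixed order, as the list order does here, is what keeps two such callers from deadlocking each other. The sketch only blocks a reader that starts while the locks are held; it does not make a reader's own two reads atomic.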


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jose-torres/spark flaky

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20646.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20646
    
----
commit d540be6bb051a33d2f6bd69a49fbe11afe9f0a65
Author: Jose Torres <jo...@...>
Date:   2018-02-20T23:34:16Z

    just use synchronization

commit d91c55f1a17b03aa2d46682e76c6eb207e71a521
Author: Jose Torres <jo...@...>
Date:   2018-02-20T23:38:35Z

    Merge branch 'master' of https://github.com/apache/spark into flaky

commit dce075f53c8a1418dac99c9b7b7f9b7e79ed17ff
Author: Jose Torres <jo...@...>
Date:   2018-02-20T23:45:40Z

    fix merge

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #20646: [SPARK-23408][SS] Synchronize successive AddDataMemory a...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20646
  
    Merged build finished. Test FAILed.


---

[GitHub] spark issue #20646: [SPARK-23408][SS] Synchronize successive AddDataMemory a...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20646
  
    **[Test build #87571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87571/testReport)** for PR 20646 at commit [`1df90e7`](https://github.com/apache/spark/commit/1df90e796e9388d7992bc55f9f87bfd71af2f7f9).


---

[GitHub] spark issue #20646: [SPARK-23408][SS] Synchronize successive AddDataMemory a...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20646
  
    **[Test build #87571 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87571/testReport)** for PR 20646 at commit [`1df90e7`](https://github.com/apache/spark/commit/1df90e796e9388d7992bc55f9f87bfd71af2f7f9).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

[GitHub] spark issue #20646: [SPARK-23408][SS] Synchronize successive AddDataMemory a...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20646
  
    Merged build finished. Test FAILed.


---

[GitHub] spark issue #20646: [SPARK-23408][SS] Synchronize successive AddDataMemory a...

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on the issue:

    https://github.com/apache/spark/pull/20646
  
    I opened a new PR to test out an alternate approach. PTAL - https://github.com/apache/spark/pull/20650/files?w=1
    
    (Note the `w=1`; it makes the diff view ignore whitespace-only changes.)


---

[GitHub] spark issue #20646: [SPARK-23408][SS] Synchronize successive AddDataMemory a...

Posted by jose-torres <gi...@git.apache.org>.

Github user jose-torres commented on the issue:

    https://github.com/apache/spark/pull/20646
  
    (https://issues.apache.org/jira/browse/SPARK-23369 was already filed for the previous flake.)


---

[GitHub] spark issue #20646: [SPARK-23408][SS] Synchronize successive AddDataMemory a...

Posted by jose-torres <gi...@git.apache.org>.

Github user jose-torres commented on the issue:

    https://github.com/apache/spark/pull/20646
  
    retest this please


---

[GitHub] spark issue #20646: [SPARK-23408][SS] Synchronize successive AddDataMemory a...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20646
  
    Can one of the admins verify this patch?


---
