You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by rekhajoshm <gi...@git.apache.org> on 2017/10/01 00:21:42 UTC

[GitHub] spark pull request #19404: [SPARK-21760] [Streaming] Fix for Structured stre...

GitHub user rekhajoshm opened a pull request:

    https://github.com/apache/spark/pull/19404

    [SPARK-21760] [Streaming] Fix for Structured streaming terminates with Exception

    ## What changes were proposed in this pull request?
    Updated serialize call for flush and FSDataOutputStream hflush
    
    ## How was this patch tested?
    manual test

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rekhajoshm/spark SPARK-21760

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19404.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19404
    
----
commit e3677c9fa9697e0d34f9df52442085a6a481c9e9
Author: Rekha Joshi <re...@gmail.com>
Date:   2015-05-05T23:10:08Z

    Merge pull request #1 from apache/master
    
    Pulling functionality from apache spark

commit 106fd8eee8f6a6f7c67cfc64f57c1161f76d8f75
Author: Rekha Joshi <re...@gmail.com>
Date:   2015-05-08T21:49:09Z

    Merge pull request #2 from apache/master
    
    pull latest from apache spark

commit 0be142d6becba7c09c6eba0b8ea1efe83d649e8c
Author: Rekha Joshi <re...@gmail.com>
Date:   2015-06-22T00:08:08Z

    Merge pull request #3 from apache/master
    
    Pulling functionality from apache spark

commit 6c6ee12fd733e3f9902e10faf92ccb78211245e3
Author: Rekha Joshi <re...@gmail.com>
Date:   2015-09-17T01:03:09Z

    Merge pull request #4 from apache/master
    
    Pulling functionality from apache spark

commit b123c601e459d1ad17511fd91dd304032154882a
Author: Rekha Joshi <re...@gmail.com>
Date:   2015-11-25T18:50:32Z

    Merge pull request #5 from apache/master
    
    pull request from apache/master

commit c73c32aadd6066e631956923725a48d98a18777e
Author: Rekha Joshi <re...@gmail.com>
Date:   2016-03-18T19:13:51Z

    Merge pull request #6 from apache/master
    
    pull latest from apache spark

commit 7dbf7320057978526635bed09dabc8cf8657a28a
Author: Rekha Joshi <re...@gmail.com>
Date:   2016-04-05T20:26:40Z

    Merge pull request #8 from apache/master
    
    pull latest from apache spark

commit 5e9d71827f8e2e4d07027281b80e4e073e7fecd1
Author: Rekha Joshi <re...@gmail.com>
Date:   2017-05-01T23:00:30Z

    Merge pull request #9 from apache/master
    
    Pull apache spark

commit 63d99b3ce5f222d7126133170a373591f0ac67dd
Author: Rekha Joshi <re...@gmail.com>
Date:   2017-09-30T22:26:44Z

    Merge pull request #10 from apache/master
    
    pull latest apache spark

commit 89cdb3bdf70ec39a09b4e598935bb20a8f64f0cb
Author: rjoshi2 <re...@gmail.com>
Date:   2017-10-01T00:15:58Z

    Fix for Structured streaming terminating with Exception

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19404: [SPARK-21760] [Streaming] Fix for Structured streaming t...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19404
  
    **[Test build #82439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82439/testReport)** for PR 19404 at commit [`a2d5bc7`](https://github.com/apache/spark/commit/a2d5bc706987accaeb6c9516e7d4a07b5bb3104f).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19404: [SPARK-21760] [Streaming] Fix for Structured streaming t...

Posted by rekhajoshm <gi...@git.apache.org>.

Github user rekhajoshm commented on the issue:

    https://github.com/apache/spark/pull/19404
  
    Seems to be apache spark git/jenkins issue.Please retest after a while.thanks


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19404: [SPARK-21760] [Streaming] Fix for Structured streaming t...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19404
  
    **[Test build #82433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82433/testReport)** for PR 19404 at commit [`9cd3ee6`](https://github.com/apache/spark/commit/9cd3ee69b9ddd678b77d8b78feec51ea8c55e377).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19404: [SPARK-21760] [Streaming] Fix for Structured streaming t...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19404
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82439/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19404: [SPARK-21760] [Streaming] Fix for Structured streaming t...

Posted by steveloughran <gi...@git.apache.org>.

Github user steveloughran commented on the issue:

    https://github.com/apache/spark/pull/19404
  
    Problem here is that a stream which doesn't implement hflush/hsync is required to throw an exception; it's a way of guaranteeing that if hsync/hflush does complete, the action has done what you want - HBase &c utterly depend on this.
    
    The fact that FSDataOutputStream implements Syncable and yet streams it may relay to may not is the whole reason for [HDFS-11644](https://issues.apache.org/jira/browse/HDFS-11644) and the `StreamCapabilities` method. As with Erasure Coding, even HDFS streams may not support hflush/hsync
    
    This patch is at risk of raising an exception whenever it tries to call `hflush()` on non HDFS store or HDFS with Erasure Coding enabled. IF you were targeting Hadoop 2.9+ you could just check `hasCapability("hsync")` use it if present. For Hadoop 2.6+ you'll have to call `out.hflush()` on the first attempt, if any exception (IOE, UnsupportedOperationException, RTE) is raised, catch, swallow and never try to hflush again. 
    
    Sorry, it's messy: its why I'd like that `hasCapability(`) probe up for all features which are only intermittently available. Can complicate caller code if you want to know these things, but stops you getting caught out when you really want to know the durability semantics of the FS.
    
    see also WiP [OutputStream](https://github.com/steveloughran/hadoop/blob/s3/HADOOP-13327-outputstream-trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/outputstream.md)
    
    (thanks for mentioning me BTW; this is one of those things that would probably work well in local tests but blow up in production somewhere)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19404: [SPARK-21760] [Streaming] Fix for Structured streaming t...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19404
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82433/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #19404: [SPARK-21760] [Streaming] Fix for Structured streaming t...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19404
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82359/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org