You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by JoshRosen <gi...@git.apache.org> on 2014/12/31 10:23:10 UTC

[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/3857

    [SPARK-5035] [Streaming] ReceiverMessage trait should extend Serializable

    Spark Streaming's ReceiverMessage trait should extend Serializable in order to fix a subtle bug that only occurs when running on a real cluster:
    
    If you attempt to send a fire-and-forget message to a remote Akka actor and that message cannot be serialized, then this seems to lead to more-or-less silent failures. As an optimization, Akka skips message serialization for messages sent within the same JVM. As a result, Spark's unit tests will never fail due to non-serializable Akka messages, but these will cause mostly-silent failures when running on a real cluster.
    
    Before this patch, here was the code for ReceiverMessage:
    
    ```
    /** Messages sent to the NetworkReceiver. */
    private[streaming] sealed trait ReceiverMessage
    private[streaming] object StopReceiver extends ReceiverMessage
    ```
    
    Since ReceiverMessage does not extend Serializable and StopReceiver is a regular `object`, not a `case object`, StopReceiver will throw serialization errors. As a result, graceful receiver shutdown is broken on real clusters but works in local and local-cluster modes. If you want to reproduce this, try running the word count example from the Streaming Programming Guide in the Spark shell:
    
    ```
    import org.apache.spark._
    import org.apache.spark.streaming._
    import org.apache.spark.streaming.StreamingContext._
    val ssc = new StreamingContext(sc, Seconds(10))
    // Create a DStream that will connect to hostname:port, like localhost:9999
    val lines = ssc.socketTextStream("localhost", 9999)
    // Split each line into words
    val words = lines.flatMap(_.split(" "))
    import org.apache.spark.streaming.StreamingContext._
    // Count each word in each batch
    val pairs = words.map(word => (word, 1))
    val wordCounts = pairs.reduceByKey(_ + _)
    // Print the first ten elements of each RDD generated in this DStream to the console
    wordCounts.print()
    ssc.start()
    Thread.sleep(10000)
    ssc.stop(true, true)
    ```
    
    Prior to this patch, this would work correctly in local mode but fail when running against a real cluster.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark SPARK-5035

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3857.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3857
    
----
commit 71d0eae7658641b9a820b86e8017dc9c7d3c6029
Author: Josh Rosen <jo...@databricks.com>
Date:   2014-12-31T09:20:37Z

    [SPARK-5035] ReceiverMessage trait should extend Serializable.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/3857


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68431882
  
      [Test build #24954 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24954/consoleFull) for   PR 3857 at commit [`71d0eae`](https://github.com/apache/spark/commit/71d0eae7658641b9a820b86e8017dc9c7d3c6029).
     * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68432566
  
    Also, just to sanity check and make sure I haven't overlooked something, at least one other person besides me should run the `spark-shell` reproduction listed in the PR description.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68472986
  
    Actually, my commit message claimed that the graceful shutdown worked in `local-cluster` mode, but that's probably not true since the executors would be in separate JVMs there.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68432275
  
    Also, this was a really nasty bug because it seems very hard to test for this in Spark's own unit tests.  Akka has a configuration option to force all messages to be serialized, even between local actors, but unfortunately this breaks Spark core because we send some non-serializable SparkContext references when initializing the DAGScheduler actor.
    
    Can we force serialization by spinning up separate actor systems for the master / worker / executor processes when running in local-cluster mode?  Or is there some other way that we can selectively force serialization in order to uncover these sorts of issues?
    
    We can definitely reproduce these sorts of issues in my spark-integration-tests system, since that uses multiple JVMs, but for completeness's sake I guess we'd need that tool's suites to send all of the remote messages (so this could be a lot of test code duplication). 
    
    Maybe the simplest (general) preventative test would have been something that just tries to call the Java serializer on an instance of each message class, so we just test serializability independent of Akka.  Checking for serializability (either manually or through handwritten tests) should be part of our review checklist when adding new Akka messages.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68431968
  
    /cc @tdas.
    
    This might fix both https://issues.apache.org/jira/browse/SPARK-4986 and https://issues.apache.org/jira/browse/SPARK-2892, although there could possibly be more pieces to solving those (e.g. replace 10 second timeout with a configurable timeout).
    
    I want to give a huge thanks to @cleaton for filing SPARK-4986 and for coming up with a workaround patch for SPARK-4986 which helped to spot this issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68472892
  
    Goood catch! Merging this right away!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68473048
  
    Can you try to check whether https://issues.apache.org/jira/browse/SPARK-2892 is solved with this? Use a local cluster (not "local-cluster").


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68435331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24954/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68435325
  
      [Test build #24954 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24954/consoleFull) for   PR 3857 at commit [`71d0eae`](https://github.com/apache/spark/commit/71d0eae7658641b9a820b86e8017dc9c7d3c6029).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68473104
  
    And if this is to be merged, this should be backported all the way to branch-1.0


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-5035] [Streaming] ReceiverMessage trait...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3857#issuecomment-68474468
  
    Yep, turns out that prior to this patch the test fails in `local-cluster` mode:
    
    ```
    [info] - stop gracefully *** FAILED *** (18 seconds, 445 milliseconds)
    [info]   1 did not equal 6680633, and 1 did not equal 6680634 Received records = 1, processed records = 6680632 (StreamingContextSuite.scala:198)
    [info]   org.scalatest.exceptions.TestFailedException:
    [info]   at org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:500)
    [info]   at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
    ```
    
    I tested out the NetworkWordCount example described in https://issues.apache.org/jira/browse/SPARK-2892 and was able to both reproduce the issue reported there and verify that it's been fixed by this patch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org