Posted to reviews@spark.apache.org by 397090770 <gi...@git.apache.org> on 2015/11/16 11:26:27 UTC

[GitHub] spark pull request: [SPARK-11751]Doc describe error in the "Spark ...

GitHub user 397090770 opened a pull request:

    https://github.com/apache/spark/pull/9734

    [SPARK-11751]Doc describe error in the "Spark Streaming Programming Guide" page

    In the **[Task Launching Overheads](http://spark.apache.org/docs/latest/streaming-programming-guide.html#task-launching-overheads)** section,
    >Task Serialization: Using Kryo serialization for serializing tasks can reduce the task sizes, and therefore reduce the time taken to send them to the slaves.
    
    As far as we know, **Task Serialization** is configured by the **spark.closure.serializer** parameter, but currently only the Java serializer is supported. If we set **spark.closure.serializer** to **org.apache.spark.serializer.KryoSerializer**, Spark throws an exception.
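
    For illustration, a minimal configuration sketch of the distinction described above, assuming the Spark version this pull request targets. The application name is a placeholder, and the comment about what `spark.serializer` covers reflects the general Spark configuration docs rather than anything stated in this pull request:

    ```scala
    import org.apache.spark.SparkConf

    // Sketch only: "closure-serializer-sketch" is a hypothetical app name.
    val conf = new SparkConf()
      .setAppName("closure-serializer-sketch")
      // Data serialization (e.g. shuffled or cached records): Kryo can be selected here.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Closure (task) serialization: per this pull request, only the Java
      // serializer is supported, so this is kept on the Java serializer
      // instead of being pointed at KryoSerializer.
      .set("spark.closure.serializer", "org.apache.spark.serializer.JavaSerializer")
    ```

    Setting the second property explicitly is only for clarity; leaving it unset should have the same effect if the Java serializer is the default.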


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/397090770/spark 397090770-patch-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9734.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9734
    
----
commit afd1dbc7e45e49f0f38d58968ad57f0a670fb008
Author: yangping.wu <wy...@163.com>
Date:   2015-11-16T10:16:03Z

    Doc describe error
    
    Doc describe error

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-11751]Doc describe error in the "Spark ...

Posted by srowen <gi...@git.apache.org>.
GitHub user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9734#discussion_r44907725
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -2001,8 +2001,7 @@ If the number of tasks launched per second is high (say, 50 or more per second),
     of sending out tasks to the slaves may be significant and will make it hard to achieve sub-second
     latencies. The overhead can be reduced by the following changes:
     
    -* **Task Serialization**: Using Kryo serialization for serializing tasks can reduce the task
    -  sizes, and therefore reduce the time taken to send them to the slaves.
    +* **Task Serialization**: Using Kryo serialization for serializing tasks can reduce the task sizes, and therefore reduce the time taken to send them to the slaves. This is controlled by the spark.closure.serializer property. However, at this time, Kryo serialization cannot be enabled for closure serialization. This may be resolved in a future release.
    --- End diff --
    
    Let's see if others have comments about the text. Note you should back-tick-quote the `spark.closure.serializer` property.




[GitHub] spark pull request: [SPARK-11751]Doc describe error in the "Spark ...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9734#issuecomment-156981853
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-11751]Doc describe error in the "Spark ...

Posted by 397090770 <gi...@git.apache.org>.
GitHub user 397090770 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9734#discussion_r44910286
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -2001,8 +2001,7 @@ If the number of tasks launched per second is high (say, 50 or more per second),
     of sending out tasks to the slaves may be significant and will make it hard to achieve sub-second
     latencies. The overhead can be reduced by the following changes:
     
    -* **Task Serialization**: Using Kryo serialization for serializing tasks can reduce the task
    -  sizes, and therefore reduce the time taken to send them to the slaves.
    +* **Task Serialization**: Using Kryo serialization for serializing tasks can reduce the task sizes, and therefore reduce the time taken to send them to the slaves. This is controlled by the spark.closure.serializer property. However, at this time, Kryo serialization cannot be enabled for closure serialization. This may be resolved in a future release.
    --- End diff --
    
    done.

