Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:38:03 UTC

[jira] [Resolved] (SPARK-7708) Incorrect task serialization with Kryo closure serializer

     [ https://issues.apache.org/jira/browse/SPARK-7708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-7708.
---------------------------------
    Resolution: Incomplete

> Incorrect task serialization with Kryo closure serializer
> ---------------------------------------------------------
>
>                 Key: SPARK-7708
>                 URL: https://issues.apache.org/jira/browse/SPARK-7708
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.2.2
>            Reporter: Akshat Aranya
>            Priority: Major
>              Labels: bulk-closed
>
> I've been investigating the use of Kryo for closure serialization with Spark 1.2, and it seems like I've hit upon a bug:
> When a task is serialized before scheduling, the following log message is generated:
> [info] o.a.s.s.TaskSetManager - Starting task 124.1 in stage 0.0 (TID 342, <host>, PROCESS_LOCAL, 302 bytes)
> This message comes from TaskSetManager, which serializes the task using the closure serializer.  Before the message is sent out, the TaskDescription (which includes the original task as a byte array) is serialized again into a byte array with the closure serializer.  I added a log message for this in CoarseGrainedSchedulerBackend, which produces the following output:
> [info] o.a.s.s.c.CoarseGrainedSchedulerBackend - 124.1 size=132
> The serialized size of TaskDescription (132 bytes) turns out to be _smaller_ than the serialized task that it contains (302 bytes). This implies that TaskDescription.buffer is not being serialized correctly.
> On the executor side, the deserialization produces a null value for TaskDescription.buffer.
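
The size comparison in the report can be turned into a small, self-contained check. The sketch below is only an illustration of the symptom, not the Spark or Kryo code path: SizeCheck, Envelope and payloadLooksDropped are hypothetical names, and plain Java serialization with a transient field stands in for a serializer that silently skips the payload. Whether this is what actually happens inside TaskDescription under Kryo is an assumption the report does not confirm.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SizeCheck {
    // Hypothetical stand-in for a task description; this is NOT Spark's class.
    static class Envelope implements Serializable {
        private static final long serialVersionUID = 1L;
        final String taskId;
        // transient: default Java serialization skips this field, which
        // mimics a serializer that drops the embedded task bytes.
        final transient byte[] payload;

        Envelope(String taskId, byte[] payload) {
            this.taskId = taskId;
            this.payload = payload;
        }
    }

    // Serialize any object to a byte array with default Java serialization.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Read an Envelope back from its serialized form.
    static Envelope deserialize(byte[] wire) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(wire))) {
            return (Envelope) in.readObject();
        }
    }

    // A wrapper that embeds the payload uncompressed cannot serialize to
    // fewer bytes than the payload itself; if it does, the payload was lost.
    static boolean payloadLooksDropped(int wrapperSize, int payloadSize) {
        return wrapperSize < payloadSize;
    }

    public static void main(String[] args) throws Exception {
        byte[] task = new byte[302]; // same size as the task in the log above
        byte[] wire = serialize(new Envelope("124.1", task));
        Envelope received = deserialize(wire);

        System.out.println("payload bytes = " + task.length);
        System.out.println("wrapper bytes = " + wire.length);
        System.out.println("dropped       = " + payloadLooksDropped(wire.length, task.length));
        System.out.println("payload after round trip is null: " + (received.payload == null));
    }
}
```

In this sketch the wrapper serializes to fewer bytes than the 302-byte payload and the payload comes back as null after the round trip, which matches both symptoms described above. The same two checks (compare sizes on the sending side, test for null on the receiving side) could be applied to the real serializer to confirm or rule out a dropped field.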



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
