Posted to issues@spark.apache.org by "Jim Kleckner (JIRA)" <ji...@apache.org> on 2017/03/09 05:54:37 UTC

[jira] [Comment Edited] (SPARK-11141) Batching of ReceivedBlockTrackerLogEvents for efficient WAL writes

    [ https://issues.apache.org/jira/browse/SPARK-11141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902537#comment-15902537 ] 

Jim Kleckner edited comment on SPARK-11141 at 3/9/17 5:54 AM:
--------------------------------------------------------------

FYI, batching can cause problems during shutdown when S3 is not in use, as described in this AWS forum posting: https://forums.aws.amazon.com/thread.jspa?threadID=223378

The workaround suggested there is to pass --conf spark.streaming.driver.writeAheadLog.allowBatching=false to spark-submit.
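
As a rough sketch of that workaround, the submit command might look like the following. The class name com.example.StreamingApp and the jar name streaming-app.jar are hypothetical placeholders, not taken from the AWS thread; only the --conf setting comes from it.
{code}
# Hypothetical application class and jar; substitute your own.
# Only the allowBatching setting is the workaround described above.
spark-submit \
  --class com.example.StreamingApp \
  --conf spark.streaming.driver.writeAheadLog.allowBatching=false \
  streaming-app.jar
{code}
Disabling batching presumably returns the driver to writing each ReceivedBlockTracker event to the WAL individually, so the write latency that this issue set out to reduce may come back.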

The exception contains the text:
{code}
streaming stop ReceivedBlockTracker: Exception thrown while writing record: BatchAllocationEvent
{code}


> Batching of ReceivedBlockTrackerLogEvents for efficient WAL writes
> ------------------------------------------------------------------
>
>                 Key: SPARK-11141
>                 URL: https://issues.apache.org/jira/browse/SPARK-11141
>             Project: Spark
>          Issue Type: Improvement
>          Components: DStreams
>            Reporter: Burak Yavuz
>            Assignee: Burak Yavuz
>             Fix For: 1.6.0
>
>
> When S3 is used as the directory for WALs, writes take too long. The driver is easily bottlenecked when multiple receivers send AddBlock events to the ReceiverTracker. This PR adds batching of events in the ReceivedBlockTracker so that receivers are not blocked by the driver for too long.


