You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Wenchen Fan (JIRA)" <ji...@apache.org> on 2017/06/23 01:26:00 UTC

[jira] [Commented] (SPARK-20923) TaskMetrics._updatedBlockStatuses uses a lot of memory

    [ https://issues.apache.org/jira/browse/SPARK-20923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16060283#comment-16060283 ] 

Wenchen Fan commented on SPARK-20923:
-------------------------------------

This patch changes the public behavior and we should mention it in the release notes. Basically users can track the status of updated blocks via {{SparkListenerTaskEnd}} event, but this feature was introduced for internal usage at the beginning and I'm wondering how many users are using this feature. After this patch we don't trach it anymore by default, users can still turn it on by setting {{spark.taskMetrics.trackUpdatedBlockStatuses}} to true.

> TaskMetrics._updatedBlockStatuses uses a lot of memory
> ------------------------------------------------------
>
>                 Key: SPARK-20923
>                 URL: https://issues.apache.org/jira/browse/SPARK-20923
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>              Labels: releasenotes
>             Fix For: 2.3.0
>
>
> The driver appears to use a ton of memory in certain cases to store the task metrics updated block status'.  For instance I had a user reading data form hive and caching it.  The # of tasks to read was around 62,000, they were using 1000 executors and it ended up caching a couple TB's of data.  The driver kept running out of memory. 
> I investigated and it looks like there was 5GB of a 10GB heap being used up by the TaskMetrics._updatedBlockStatuses because there are a lot of blocks.
> The updatedBlockStatuses was already removed from the task end event under SPARK-20084.  I don't see anything else that seems to be using this.  Anybody know if I missed something?
>  If its not being used we should remove it, otherwise we need to figure out a better way of doing it so it doesn't use so much memory.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org