You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "jin xing (JIRA)" <ji...@apache.org> on 2018/05/02 05:40:00 UTC
[jira] [Created] (SPARK-24143) filter empty blocks when convert
mapstatus to (blockId, size) pair
jin xing created SPARK-24143:
--------------------------------
Summary: filter empty blocks when convert mapstatus to (blockId, size) pair
Key: SPARK-24143
URL: https://issues.apache.org/jira/browse/SPARK-24143
Project: Spark
Issue Type: New Feature
Components: Spark Core
Affects Versions: 2.3.0
Reporter: jin xing
In current code(MapOutputTracker.convertMapStatuses), mapstatus are converted to (blockId, size) pair for all blocks -- no matter the block is empty or not, which result in OOM when there are lots of consecutive empty blocks, especially when adaptive execution is enabled.
(blockId, size) pair is only used in ShuffleBlockFetcherIterator to control shuffle-read and only non-empty block request is sent. Can we just filter out the empty blocks in MapOutputTracker.convertMapStatuses and save memory?
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org