Posted to issues@spark.apache.org by "Imran Rashid (JIRA)" <ji...@apache.org> on 2018/08/06 17:23:00 UTC
[jira] [Created] (SPARK-25035) Replication of disk-stored blocks should avoid memory mapping
Imran Rashid created SPARK-25035:
------------------------------------
Summary: Replication of disk-stored blocks should avoid memory mapping
Key: SPARK-25035
URL: https://issues.apache.org/jira/browse/SPARK-25035
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 2.3.1
Reporter: Imran Rashid
This is a follow-up to SPARK-24296.
When replicating a disk-cached block, even if we fetch-to-disk, we still memory-map the file just to copy it to another location.
Ideally we'd just move the tmp file to the right location. But even without that, we could read the file as an input stream instead of memory-mapping the whole thing. Memory-mapping is particularly problematic when running under YARN: the OS may believe there is plenty of memory available, while YARN decides to kill the process for exceeding its memory limits.
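The streaming alternative described above can be sketched as follows. This is a hypothetical illustration, not Spark's actual replication code: it copies a file in fixed-size chunks through plain streams, so the copy never maps the whole file into the process's address space the way FileChannel.map would (the mapped pages being what YARN's memory accounting can penalize).

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical sketch (not Spark's implementation): stream a disk-stored
// block to its destination in small chunks instead of memory-mapping it.
public class StreamCopy {
    static long copyStreamed(String src, String dst, int chunkSize)
            throws IOException {
        try (FileInputStream in = new FileInputStream(src);
             FileOutputStream out = new FileOutputStream(dst)) {
            // Only chunkSize bytes are resident in the heap at any time,
            // regardless of how large the block file is.
            byte[] buf = new byte[chunkSize];
            long total = 0;
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                total += n;
            }
            return total;
        }
    }
}
```

The memory cost here is bounded by chunkSize rather than the file size, which is the property the issue is after when replicating large disk-cached blocks.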
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)