Posted to issues@spark.apache.org by "Matei Zaharia (JIRA)" <ji...@apache.org> on 2014/05/18 03:34:14 UTC

[jira] [Updated] (SPARK-1145) Memory mapping with many small blocks can cause JVM allocation failures

     [ https://issues.apache.org/jira/browse/SPARK-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matei Zaharia updated SPARK-1145:
---------------------------------

    Fix Version/s: 0.9.2

> Memory mapping with many small blocks can cause JVM allocation failures
> -----------------------------------------------------------------------
>
>                 Key: SPARK-1145
>                 URL: https://issues.apache.org/jira/browse/SPARK-1145
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.0
>            Reporter: Patrick Wendell
>            Assignee: Patrick Wendell
>             Fix For: 1.0.0, 0.9.2
>
>
> During a shuffle each block or block segment is memory mapped to a file. When the segments are very small and there are a large number of them, the memory maps can start failing and eventually the JVM will terminate. It's not clear exactly what's happening, but it appears that when the JVM terminates, about 265MB of virtual address space is used by memory-mapped files. This doesn't seem to be affected at all by `-XX:MaxDirectMemorySize`; AFAIK that option just gives the JVM its own self-imposed limit rather than letting it run all the way into OS limits.
> At the time of JVM failure it appears that overall OS memory becomes scarce, so it's possible there are per-file overheads for each memory-mapped file that add up here. One overhead is that memory mapping occurs at page granularity, so really small blocks are padded out to a full page (e.g. a ~1000-byte segment still occupies an entire page, typically 4KB).
> In the particular case where I saw this, the JVM was running 4 reducers, each of which was trying to access about 30,000 blocks, for a total of 120,000 concurrent reads. At about 65,000 open files the JVM gave out. In this case each file was about 1000 bytes.
> Users should really be coalescing or using fewer reducers if they have 1000-byte shuffle files, but I expect this to happen nonetheless. My proposal is that if the file is smaller than a few pages, we should just read it into a Java buffer and not bother to memory-map it. Memory mapping huge numbers of small files in the JVM is neither recommended nor good for performance, AFAIK.
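> A minimal sketch of that fallback, assuming a hypothetical threshold of a couple of pages (the `getSegment` helper and `minMemoryMapBytes` names below are illustrative, not the actual DiskStore API):
> {code}
> import java.io.{File, IOException, RandomAccessFile}
> import java.nio.ByteBuffer
> import java.nio.channels.FileChannel.MapMode
>
> // Hypothetical threshold: only memory-map segments larger than a couple of pages.
> val minMemoryMapBytes: Long = 2 * 4096L
>
> // Sketch of the proposed fallback: read small segments into a plain buffer and
> // memory-map only the larger ones. Not the actual DiskStore implementation.
> def getSegment(file: File, offset: Long, length: Long): ByteBuffer = {
>   val channel = new RandomAccessFile(file, "r").getChannel
>   try {
>     if (length < minMemoryMapBytes) {
>       // Small segment: a plain read avoids creating yet another mapped region.
>       val buf = ByteBuffer.allocate(length.toInt)
>       channel.position(offset)
>       while (buf.remaining() > 0) {
>         if (channel.read(buf) == -1) {
>           throw new IOException("Unexpected end of file " + file)
>         }
>       }
>       buf.flip()
>       buf
>     } else {
>       // Large segment: memory mapping is still the cheaper option.
>       channel.map(MapMode.READ_ONLY, offset, length)
>     }
>   } finally {
>     channel.close()
>   }
> }
> {code}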
> Below is the stack trace:
> {code}
> 14/02/27 08:32:35 ERROR storage.BlockManagerWorker: Exception handling buffer message
> java.io.IOException: Map failed
>   at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:888)
>   at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:89)
>   at org.apache.spark.storage.BlockManager.getLocalBytes(BlockManager.scala:285)
>   at org.apache.spark.storage.BlockManagerWorker.getBlock(BlockManagerWorker.scala:90)
>   at org.apache.spark.storage.BlockManagerWorker.processBlockMessage(BlockManagerWorker.scala:69)
>   at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
>   at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at org.apache.spark.storage.BlockMessageArray.foreach(BlockMessageArray.scala:28)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at org.apache.spark.storage.BlockMessageArray.map(BlockMessageArray.scala:28)
>   at org.apache.spark.storage.BlockManagerWorker.onBlockMessageReceive(BlockManagerWorker.scala:44)
>   at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
>   at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
>   at org.apache.spark.network.ConnectionManager.org$apache$spark$network$ConnectionManager$$handleMessage(ConnectionManager.scala:512)
>   at org.apache.spark.network.ConnectionManager$$anon$8.run(ConnectionManager.scala:478)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> {code}
> And the JVM error log had a bunch of entries like this:
> {code}
> 7f4b48f89000-7f4b48f8a000 r--s 00000000 ca:30 1622077901                 /mnt4/spark/spark-local-20140227020022-227c/26/shuffle_0_22312_38
> 7f4b48f8a000-7f4b48f8b000 r--s 00000000 ca:20 545892715                  /mnt3/spark/spark-local-20140227020022-5ef5/3a/shuffle_0_26808_20
> 7f4b48f8b000-7f4b48f8c000 r--s 00000000 ca:50 1622480741                 /mnt2/spark/spark-local-20140227020022-315b/1c/shuffle_0_29013_19
> 7f4b48f8c000-7f4b48f8d000 r--s 00000000 ca:30 10082610                   /mnt4/spark/spark-local-20140227020022-227c/3b/shuffle_0_28002_9
> 7f4b48f8d000-7f4b48f8e000 r--s 00000000 ca:50 1622268539                 /mnt2/spark/spark-local-20140227020022-315b/3e/shuffle_0_23983_17
> 7f4b48f8e000-7f4b48f8f000 r--s 00000000 ca:50 1083068239                 /mnt2/spark/spark-local-20140227020022-315b/37/shuffle_0_25505_22
> 7f4b48f8f000-7f4b48f90000 r--s 00000000 ca:30 9921006                    /mnt4/spark/spark-local-20140227020022-227c/31/shuffle_0_24072_95
> 7f4b48f90000-7f4b48f91000 r--s 00000000 ca:50 10441349                   /mnt2/spark/spark-local-20140227020022-315b/20/shuffle_0_27409_47
> 7f4b48f91000-7f4b48f92000 r--s 00000000 ca:50 10406042                   /mnt2/spark/spark-local-20140227020022-315b/0e/shuffle_0_26481_84
> 7f4b48f92000-7f4b48f93000 r--s 00000000 ca:50 1622268192                 /mnt2/spark/spark-local-20140227020022-315b/14/shuffle_0_23818_92
> 7f4b48f93000-7f4b48f94000 r--s 00000000 ca:50 1082957628                 /mnt2/spark/spark-local-20140227020022-315b/09/shuffle_0_22824_45
> 7f4b48f94000-7f4b48f95000 r--s 00000000 ca:20 1082199965                 /mnt3/spark/spark-local-20140227020022-5ef5/00/shuffle_0_1429_13
> 7f4b48f95000-7f4b48f96000 r--s 00000000 ca:20 10940995                   /mnt3/spark/spark-local-20140227020022-5ef5/38/shuffle_0_28705_44
> 7f4b48f96000-7f4b48f97000 r--s 00000000 ca:10 17456971                   /mnt/spark/spark-local-20140227020022-b372/28/shuffle_0_23055_72
> 7f4b48f97000-7f4b48f98000 r--s 00000000 ca:30 9853895                    /mnt4/spark/spark-local-20140227020022-227c/08/shuffle_0_22797_42
> 7f4b48f98000-7f4b48f99000 r--s 00000000 ca:20 1622089728                 /mnt3/spark/spark-local-20140227020022-5ef5/27/shuffle_0_24017_97
> 7f4b48f99000-7f4b48f9a000 r--s 00000000 ca:50 1082937570                 /mnt2/spark/spark-local-20140227020022-315b/24/shuffle_0_22291_38
> 7f4b48f9a000-7f4b48f9b000 r--s 00000000 ca:30 10056604                   /mnt4/spark/spark-local-20140227020022-227c/2f/shuffle_0_27408_59
> {code}
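> Each of those lines is one mapped region in the process address space (the same format as Linux's `/proc/<pid>/maps`). A quick way to see how many shuffle files a running executor currently has mapped, assuming Linux (this relies on `/proc`, not on any Spark API):
> {code}
> import scala.io.Source
>
> // Count memory-mapped shuffle segments for the current JVM (Linux only).
> val source = Source.fromFile("/proc/self/maps")
> try {
>   val mappedShuffleFiles = source.getLines().count(_.contains("/shuffle_"))
>   println(s"Currently mapped shuffle files: $mappedShuffleFiles")
> } finally {
>   source.close()
> }
> {code}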



--
This message was sent by Atlassian JIRA
(v6.2#6252)