Posted to issues@spark.apache.org by "Imran Rashid (JIRA)" <ji...@apache.org> on 2018/10/10 17:57:00 UTC
[jira] [Created] (SPARK-25704) Replication of > 2GB block fails due to bad config default
Imran Rashid created SPARK-25704:
------------------------------------
Summary: Replication of > 2GB block fails due to bad config default
Key: SPARK-25704
URL: https://issues.apache.org/jira/browse/SPARK-25704
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 2.4.0
Reporter: Imran Rashid
Assignee: Imran Rashid
Replicating a block > 2GB currently fails because it tries to allocate a bytebuffer that is just a *bit* too large, due to a bad default config. This [line|https://github.com/apache/spark/blob/cd40655965072051dfae65eabd979edff0e4d398/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L454]:
{code}
ChunkedByteBuffer.fromFile(tmpFile, conf.get(config.MEMORY_MAP_LIMIT_FOR_TESTS).toInt)
{code}
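A minimal sketch of the kind of fix this implies (the object and method names below are hypothetical, not Spark's actual code): clamp the chunk size a bit below {{Integer.MAX_VALUE}}, since the JVM's maximum array length is slightly smaller than that.

{code}
object ChunkSizeSketch {
  // The exact VM limit varies slightly between JVMs, so leave generous
  // headroom below Integer.MAX_VALUE rather than cutting it close.
  val MaxSafeArraySize: Int = Integer.MAX_VALUE - 512

  // Clamp a requested chunk size to something the JVM can actually allocate.
  def clampChunkSize(requested: Long): Int =
    math.min(requested, MaxSafeArraySize.toLong).toInt
}
{code}

With a clamp like this, passing the config's default of {{Integer.MAX_VALUE}} would yield an allocatable size instead of triggering the OOM below.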
{{MEMORY_MAP_LIMIT_FOR_TESTS}} defaults to {{Integer.MAX_VALUE}}, but unfortunately that is just a tiny bit too big. You'll see an exception like:
{noformat}
18/10/09 21:21:54 WARN server.TransportChannelHandler: Exception in connection from /172.31.118.153:53534
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$8.apply(ChunkedByteBuffer.scala:199)
at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$8.apply(ChunkedByteBuffer.scala:199)
at org.apache.spark.util.io.ChunkedByteBufferOutputStream.allocateNewChunkIfNeeded(ChunkedByteBufferOutputStream.scala:87)
at org.apache.spark.util.io.ChunkedByteBufferOutputStream.write(ChunkedByteBufferOutputStream.scala:75)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2315)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$fromFile$1.apply$mcI$sp(ChunkedByteBuffer.scala:201)
at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$fromFile$1.apply(ChunkedByteBuffer.scala:201)
at org.apache.spark.util.io.ChunkedByteBuffer$$anonfun$fromFile$1.apply(ChunkedByteBuffer.scala:201)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.util.io.ChunkedByteBuffer$.fromFile(ChunkedByteBuffer.scala:202)
at org.apache.spark.util.io.ChunkedByteBuffer$.fromFile(ChunkedByteBuffer.scala:184)
at org.apache.spark.storage.BlockManager$$anon$1.onComplete(BlockManager.scala:454)
{noformat}
At least on my system, it's just 2 bytes too big :(
{noformat}
$ scala -J-Xmx4G
scala> import java.nio.ByteBuffer
scala> ByteBuffer.allocate(Integer.MAX_VALUE)
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
... 30 elided
scala> ByteBuffer.allocate(Integer.MAX_VALUE - 1)
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
... 30 elided
scala> ByteBuffer.allocate(Integer.MAX_VALUE - 2)
res3: java.nio.ByteBuffer = java.nio.HeapByteBuffer[pos=0 lim=2147483645 cap=2147483645]
{noformat}
*Workaround*: Set {{spark.storage.memoryMapLimitForTests}} to something a bit smaller, e.g. 2147483135 (that's Integer.MAX_VALUE - 512, just in case the limit is a bit different on other systems).
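For example, the workaround can go in {{spark-defaults.conf}} (the value shown is the suggested Integer.MAX_VALUE - 512):

{code}
spark.storage.memoryMapLimitForTests 2147483135
{code}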
This was introduced by SPARK-25422. I'll file a PR shortly.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org