Posted to issues@spark.apache.org by "Karl D. Gierach (JIRA)" <ji...@apache.org> on 2015/10/15 19:55:05 UTC

[jira] [Commented] (SPARK-5739) Size exceeds Integer.MAX_VALUE in File Map

    [ https://issues.apache.org/jira/browse/SPARK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959298#comment-14959298 ] 

Karl D. Gierach commented on SPARK-5739:
----------------------------------------

Is there any way to increase this block limit? I'm hitting the same issue during a UnionRDD operation.

Also, the issue's status above is shown as "Resolved", but I'm not sure what the resolution actually was.
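
In case it helps anyone else landing here: as far as I understand, the 2GB limit itself cannot be raised. Spark reads large blocks back through Java's FileChannel.map (visible in the stack trace below), which backs the mapping with an Int-indexed ByteBuffer, so a single block can never exceed Integer.MAX_VALUE bytes. The practical workaround is to raise the partition count so that no individual block gets that large. A minimal sketch against the Spark 1.x RDD API; the paths and partition counts are made-up values for illustration:

    import org.apache.spark.{SparkConf, SparkContext}

    object RepartitionSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("repartition-sketch"))
        // Hypothetical inputs: choose enough partitions that each partition
        // (and therefore each cached or shuffled block) stays well under 2GB.
        val a = sc.textFile("hdfs:///tmp/input-a", minPartitions = 2000)
        val b = sc.textFile("hdfs:///tmp/input-b", minPartitions = 2000)
        // More partitions => smaller blocks, at the cost of more tasks.
        val combined = a.repartition(4000).union(b.repartition(4000))
        println(combined.count())
        sc.stop()
      }
    }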


> Size exceeds Integer.MAX_VALUE in File Map
> ------------------------------------------
>
>                 Key: SPARK-5739
>                 URL: https://issues.apache.org/jira/browse/SPARK-5739
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>    Affects Versions: 1.1.1
>         Environment: Spark 1.1.1 on a cluster of 12 nodes. Each node has 128GB RAM and 24 cores. The data is just 40GB, and there are 48 parallel tasks per node.
>            Reporter: DjvuLee
>            Priority: Minor
>
> I ran the k-means algorithm on randomly generated data, but this problem occurred after some iterations. I tried several times, and the problem is reproducible.
> Because the data is randomly generated, I wonder whether this is a bug. Or, if random data can legitimately produce a block whose size exceeds Integer.MAX_VALUE, can we check the size before using the file map? (A sketch of such a check follows the stack trace below.)
> 2015-02-11 00:39:36,057 [sparkDriver-akka.actor.default-dispatcher-15] WARN  org.apache.spark.util.SizeEstimator - Failed to check whether UseCompressedOops is set; assuming yes
> [error] (run-main-0) java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
> java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
> 	at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:850)
> 	at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:105)
> 	at org.apache.spark.storage.DiskStore.putIterator(DiskStore.scala:86)
> 	at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:140)
> 	at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:105)
> 	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:747)
> 	at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:598)
> 	at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:869)
> 	at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:79)
> 	at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:68)
> 	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:36)
> 	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
> 	at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
> 	at org.apache.spark.SparkContext.broadcast(SparkContext.scala:809)
> 	at org.apache.spark.mllib.clustering.KMeans.initKMeansParallel(KMeans.scala:270)
> 	at org.apache.spark.mllib.clustering.KMeans.runBreeze(KMeans.scala:143)
> 	at org.apache.spark.mllib.clustering.KMeans.run(KMeans.scala:126)
> 	at org.apache.spark.mllib.clustering.KMeans$.train(KMeans.scala:338)
> 	at org.apache.spark.mllib.clustering.KMeans$.train(KMeans.scala:348)
> 	at KMeansDataGenerator$.main(kmeans.scala:105)
> 	at KMeansDataGenerator.main(kmeans.scala)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> 	at java.lang.reflect.Method.invoke(Method.java:619)
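
Regarding the reporter's suggestion to check the size before using the file map: FileChannel.map accepts a Long length but backs the mapping with an Int-indexed ByteBuffer, which is exactly why it throws for anything above Integer.MAX_VALUE. A guard along the following lines could fail fast with a clearer error, or let the caller fall back to streamed reads. This is only an illustrative sketch, not the actual DiskStore code, and the helper name is made up:

    import java.io.{File, RandomAccessFile}
    import java.nio.MappedByteBuffer
    import java.nio.channels.FileChannel

    // Sketch: refuse to memory-map files too large for a single ByteBuffer.
    def mapIfSmallEnough(file: File): Option[MappedByteBuffer] = {
      val length = file.length()
      if (length > Integer.MAX_VALUE.toLong) {
        None // too large to map in one piece; caller must stream or split the block
      } else {
        val channel = new RandomAccessFile(file, "r").getChannel
        try Some(channel.map(FileChannel.MapMode.READ_ONLY, 0, length))
        finally channel.close() // the mapping remains valid after the channel is closed
      }
    }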



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
