Posted to dev@spark.apache.org by vinodep <vp...@andrew.cmu.edu> on 2016/09/06 22:37:56 UTC

BlockMatrix Multiplication fails with Out of Memory

Hi,
I am trying to multiply a 67584 x 67584 matrix in a loop. In the first
iteration, the multiplication goes through, but in the second iteration it
fails with a Java heap out-of-memory error. I'm using PySpark; the
configuration is below.
Setup:
70 nodes (1 driver + 69 workers) with
SPARK_DRIVER_MEMORY=32g, SPARK_WORKER_CORES=16, SPARK_WORKER_MEMORY=20g,
SPARK_EXECUTOR_MEMORY=5g, spark.executor.cores=5

Data: a 67584 x 67584 matrix, block size 1024
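A quick back-of-envelope check on the footprint of this matrix (a sketch, assuming dense float64 blocks):

```python
# Rough memory footprint of the matrix described above,
# assuming dense float64 (8-byte) entries.
n = 67584          # matrix dimension
bs = 1024          # block size
bytes_per_block = bs * bs * 8       # 8 MiB per dense block
blocks_per_dim = n // bs            # 66 blocks along each dimension
total_blocks = blocks_per_dim ** 2  # 4356 blocks in total
total_bytes = total_blocks * bytes_per_block
print(blocks_per_dim, total_blocks, total_bytes / 2**30)  # ~34 GiB per copy
```

So one dense copy of A is roughly 34 GiB, and a multiply additionally materializes shuffled block copies and the result, which may be why repeated multiplication exhausts the 69 x 20g of worker memory.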
So, I basically load a number of MATLAB .mat files using textFile, form a
block RDD in which each file read becomes one block, and create a
BlockMatrix (A). Then I multiply the matrix by itself in a loop, basically
to compute its powers (A^2, A^4, ...). But the multiplication always fails
with out-of-memory errors after the second iteration. I'm using the
multiply method from BlockMatrix:

for i in range(3):
    A = A.multiply(A)   # squares A each pass: A^2, A^4, A^8
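For reference, BlockMatrix.multiply computes each output block as a sum of per-block products (the block pairing is a shuffle in Spark). A minimal NumPy sketch of the same blocked scheme on a toy matrix (block_multiply is a hypothetical helper for illustration, not Spark API):

```python
import numpy as np

def block_multiply(A_blocks, B_blocks, n_blocks):
    """Multiply two square matrices stored as {(row, col): block} dicts,
    the way a blocked multiply pairs and sums blocks."""
    C = {}
    for i in range(n_blocks):
        for j in range(n_blocks):
            # Output block (i, j) is the sum over k of A[i,k] @ B[k,j].
            # In Spark, all these intermediate products for one output
            # block are shuffled to the same executor before summing.
            C[(i, j)] = sum(A_blocks[(i, k)] @ B_blocks[(k, j)]
                            for k in range(n_blocks))
    return C

# Toy 4x4 matrix tiled into 2x2 blocks of size 2
# (the post's matrix tiles into 67584/1024 = 66 blocks per dimension).
n, bs = 4, 2
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
blocks = {(i, j): A[i*bs:(i+1)*bs, j*bs:(j+1)*bs]
          for i in range(n // bs) for j in range(n // bs)}
C_blocks = block_multiply(blocks, blocks, n // bs)
C = np.block([[C_blocks[(i, j)] for j in range(n // bs)]
              for i in range(n // bs)])
assert np.allclose(C, A @ A)  # blocked result matches the dense product
```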

What am I missing? What is the correct way to load a big matrix file
(.mat) from the local filesystem into an RDD, create a BlockMatrix, and do
repeated multiplication?



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/BlockMatrix-Multiplication-fails-with-Out-of-Memory-tp18869.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
