Posted to user@spark.apache.org by klrmowse <kl...@gmail.com> on 2018/05/01 15:49:36 UTC

Re: [EXT] [Spark 2.x Core] .collect() size limit

Okay, I may have found an alternative/workaround to using .collect() for what I
am trying to achieve...

Initially, for the Spark application I am working on, I was going to call
.collect() on two separate RDDs and load them into a couple of ArrayLists
(which was the reason I was asking what the size limit on the driver is).

I need to map the 1st RDD to the 2nd RDD according to a computation/function,
resulting in key-value pairs.

It turns out I don't need to call .collect() at all if I instead use
.zipPartitions(), to which I can pass that function directly.

I am currently testing it out...



Thanks, all, for your responses.



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org