Posted to user@spark.apache.org by Yadid Ayzenberg <ya...@media.mit.edu> on 2013/11/20 16:02:05 UTC

running transformation on group of RDDs concurrently

Suppose I also want to run n concurrent jobs of the following type:
each RDD has the same form (JavaPairRDD), and I would like to run the
same transformation on all of them.
The brute-force way would be to instantiate n threads and submit a job
from each thread.
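
To make the brute-force idea concrete, here is a minimal sketch of the threading pattern only. It uses just the Java standard library so that it runs on its own: the class name ConcurrentJobs, the method runAll, and the generic "dataset" type D are hypothetical stand-ins, not Spark API. In a real Spark program, D would be a JavaPairRDD and the job would end in an action on that RDD; as far as I understand, Spark accepts jobs submitted from separate threads of one application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: not part of Spark. D stands in for an RDD type.
public class ConcurrentJobs {

    // Submits the same job once per dataset, each from its own pool thread,
    // and returns the results in the same order as the input datasets.
    public static <D, R> List<R> runAll(List<D> datasets, Function<D, R> job) {
        // One thread per dataset (at least one, so an empty input is allowed).
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, datasets.size()));
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (D dataset : datasets) {
                Callable<R> task = () -> job.apply(dataset);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                // Blocks until the corresponding job has finished.
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            // Covers interruption and failures thrown inside a job.
            throw new RuntimeException("A concurrent job failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, runAll(rdds, rdd -> someAction(rdd)) would start all n jobs before waiting on any of them, where someAction is a placeholder for whatever transformation and action is applied to each RDD.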

Would the following approach be valid as well? Create a new RDD that
combines the n RDDs (something like a group-by across multiple RDDs).
Is there a way to implement this using the existing Java API?
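
To illustrate what I mean by combining, here is a rough sketch of the idea on plain Java lists rather than RDDs, so it is only an analogy. The class CombinedDatasets and its methods are hypothetical names; each record is tagged with the index of the dataset it came from, everything is concatenated, the transformation is applied once over the combined collection, and the results are split back out by tag. I assume the Spark counterpart would re-key each JavaPairRDD with its index and then merge them with union, but I have not verified that this is the recommended way.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical helper: plain lists stand in for RDDs.
public class CombinedDatasets {

    // Tags every record with the index of its source dataset and
    // concatenates all records into one list (the analogue of a union).
    public static <T> List<Map.Entry<Integer, T>> combine(List<List<T>> datasets) {
        List<Map.Entry<Integer, T>> combined = new ArrayList<>();
        for (int i = 0; i < datasets.size(); i++) {
            for (T record : datasets.get(i)) {
                combined.add(new SimpleEntry<Integer, T>(i, record));
            }
        }
        return combined;
    }

    // Applies the transformation in a single pass over the combined list,
    // then groups the transformed records by source index.
    // Note: an empty source dataset produces no entry in the result map.
    public static <T, R> Map<Integer, List<R>> transformAll(
            List<List<T>> datasets, Function<T, R> transformation) {
        Map<Integer, List<R>> bySource = new TreeMap<>();
        for (Map.Entry<Integer, T> tagged : combine(datasets)) {
            bySource.computeIfAbsent(tagged.getKey(), k -> new ArrayList<>())
                    .add(transformation.apply(tagged.getValue()));
        }
        return bySource;
    }
}
```

The point of the tag is that a single job over the combined data can still report results per original RDD, which the n-thread version gets for free.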

Yadid