Posted to user@spark.apache.org by Yadid Ayzenberg <ya...@media.mit.edu> on 2013/11/20 16:02:05 UTC
running transformation on group of RDDs concurrently
Suppose I also want to run n concurrent jobs of the following type:
each RDD has the same form (JavaPairRDD), and I would like to run the
same transformation on all of the RDDs.
The brute-force way would be to instantiate n threads and submit one job
from each thread.
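
To make the thread-per-job idea concrete, here is a minimal sketch using only
the JDK's ExecutorService. It is an illustration under assumptions, not Spark
code: the class name ConcurrentJobs and the helper runAll are hypothetical,
and the generic "job" function stands in for whatever transformation plus
action would be run on each JavaPairRDD.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: runs the same job on every input, one pool thread per input.
final class ConcurrentJobs {

    // "inputs" stands in for the n RDDs; "job" stands in for the shared
    // transformation followed by an action. Results come back in input order.
    static <T, R> List<R> runAll(List<T> inputs, Function<T, R> job) {
        // One thread per input, as in the brute-force approach (at least 1,
        // because a fixed pool cannot be created with zero threads).
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, inputs.size()));
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (T input : inputs) {
                Callable<R> task = () -> job.apply(input);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // blocks until that job has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            throw new RuntimeException("a job failed", e.getCause());
        } finally {
            pool.shutdown(); // stop accepting work; submitted tasks still complete
        }
    }
}
```

With Spark, the function passed as "job" would presumably take a JavaPairRDD,
apply the transformation, and call an action such as count() or collect(), so
that each thread actually triggers a job.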
Would the following approach be valid as well? Create a new RDD that is a
combination of the n RDDs (something like a group-by across multiple RDDs).
Is there a way to implement this using the existing Java API?
Yadid