Posted to dev@spark.apache.org by JaeSung Jun <ja...@gmail.com> on 2015/12/10 14:19:53 UTC
Does RDD[Type1, Iterable[Type2]] split into multiple partitions?
Hi,
I'm currently working with an RDD whose values are Iterables, like this:
val keyValueIterableRDD: RDD[(CaseClass1, Iterable[CaseClass2])] = buildRDD(...)
If there is only one unique key and the Iterable is big enough, would this
Iterable be partitioned across all executors, like the following?
(executor1)
(xxx, iterator from 0 to 10,000)
(executor2)
(xxx, iterator from 10,001 to 20,000)
(executor3)
(xxx, iterator from 20,001 to 30,000)
...
Thanks
Jason
Re: Does RDD[Type1, Iterable[Type2]] split into multiple partitions?
Posted by Reynold Xin <rx...@databricks.com>.
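No, since the signature itself limits it. Each element of an
RDD[(K, Iterable[V])] is a single (key, Iterable) record, and a record
lives in exactly one partition, so the Iterable is never split across
executors.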
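
To make the point concrete, here is a minimal sketch, assuming a local Spark master and using hypothetical names (Key, Value, IterablePartitioning, recordsPerPartition) that are not from the original question. Key and Value stand in for CaseClass1 and CaseClass2. It counts records per partition for the grouped shape, then for a flattened RDD[(Key, Value)], which is one possible way to let the values spread out.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical stand-ins for CaseClass1 and CaseClass2.
case class Key(id: String)
case class Value(n: Int)

object IterablePartitioning {
  // Number of records held by each partition of the given RDD.
  def recordsPerPartition[T](rdd: RDD[T]): Array[Int] =
    rdd.mapPartitions(it => Iterator(it.size)).collect()

  // Returns (sizes for the grouped RDD, sizes for the flattened RDD).
  def run(sc: SparkContext): (Array[Int], Array[Int]) = {
    val values: Iterable[Value] = (0 until 30000).map(Value(_))

    // One unique key with a large Iterable: the RDD contains exactly ONE
    // record, so at most one of the 4 partitions can be non-empty.
    val grouped: RDD[(Key, Iterable[Value])] =
      sc.parallelize(Seq((Key("xxx"), values)), numSlices = 4)

    // Flattening to one record per value changes the element type to
    // (Key, Value); those records can then be shuffled across partitions.
    val flat: RDD[(Key, Value)] =
      grouped.flatMap { case (k, vs) => vs.map(v => (k, v)) }.repartition(4)

    (recordsPerPartition(grouped), recordsPerPartition(flat))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[4]").setAppName("iterable-partitioning")
    val sc = new SparkContext(conf)
    try {
      val (groupedSizes, flatSizes) = run(sc)
      println("grouped: " + groupedSizes.mkString(", "))
      println("flat:    " + flatSizes.mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

In the grouped case the whole Iterable travels with its single record. Whether flattening is appropriate depends on what the downstream code needs, since it gives up the per-key grouping.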