Posted to dev@spark.apache.org by "张志强(旺轩)" <zz...@alibaba-inc.com> on 2015/10/13 10:16:30 UTC

How to split one RDD to small ones according to its key's value

Hi everyone,

 

I need to split one RDD into several smaller ones, but I want to split it
according to the value of each element's key. For example, elements whose key
is X would go into RDD1, elements whose key is Y would go into RDD2, and so on.

 

I know there is a routine called randomSplit, but I don't think it meets my need.

 

Thanks for your feedback,

-Allen Zhang


RE: How to split one RDD to small ones according to its key's value

Posted by PK Gnanam <pk...@bridgepearl.com>.
I think you will need to use the partitionBy method:

.partitionBy(number of partitions, function that maps each key to a partition index)

 

Thanks,

PK
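
The partitionBy suggestion above can be sketched roughly as follows. This is
only an illustration, assuming the PySpark API, where partitionBy takes a
partition count and a function from key to partition index, and assuming the
number of distinct keys is small enough to collect on the driver. The helper
names make_key_partitioner and split_by_key are hypothetical and not part of
Spark. Note that partitionBy on its own only groups keys into partitions of a
single RDD; the separate per-key RDDs in this sketch come from filter.

```python
# Sketch only: `rdd` is assumed to be a PySpark pair RDD of (key, value) tuples.
# make_key_partitioner and split_by_key are hypothetical helper names.

def make_key_partitioner(keys):
    """Build a function that maps each known key to its own partition index."""
    index = {k: i for i, k in enumerate(keys)}

    def partition_for(key):
        return index[key]

    return partition_for


def split_by_key(rdd):
    """Return a dict {key: RDD holding only the pairs with that key}."""
    # Collect the distinct keys on the driver (assumes there are few of them).
    keys = rdd.keys().distinct().collect()
    part_func = make_key_partitioner(keys)
    # Place all pairs sharing a key in the same partition, then cache so the
    # per-key filters below do not recompute the shuffle each time.
    partitioned = rdd.partitionBy(max(len(keys), 1), part_func).cache()
    # One filtered RDD per key; k=k binds the current key to each lambda.
    return {k: partitioned.filter(lambda kv, k=k: kv[0] == k) for k in keys}
```

With a real SparkContext `sc` (assumed here), usage would look like
split_by_key(sc.parallelize([("X", 1), ("Y", 2), ("X", 3)])), giving one RDD
for key X and one for key Y. Each filter still scans every partition, so this
is practical mainly when the number of distinct keys is modest.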

 
