Posted to dev@spark.apache.org by "张志强(旺轩)" <zz...@alibaba-inc.com> on 2015/10/13 10:16:30 UTC
How to split one RDD to small ones according to its key's value
Hi everyone,
I need to split one RDD into several smaller ones according to the value of
each record's key. For example, records whose key is X should go into RDD1,
records whose key is Y should go into RDD2, and so on.
I know there is a routine called randomSplit, but I don't think it meets my
need.
Thanks for your feedback,
-Allen Zhang
RE: How to split one RDD to small ones according to its key's value
Posted by PK Gnanam <pk...@bridgepearl.com>.
I think you will need to use the partitionBy method:
.partitionBy(number of partitions, lambda that maps a key to a partition index)
Thanks,
PK
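A minimal sketch of the suggestion above, assuming PySpark (the reply's mention of a lambda points that way) and a small, known set of keys. The helper names make_key_partitioner, split_by_key and demo are hypothetical and not part of Spark. Note that partitionBy only groups records with the same key into the same partition of a single RDD; getting one RDD per key still takes a filter per key, which the second helper shows.

```python
def make_key_partitioner(keys):
    """Return (num_partitions, partition_func) giving each listed key its own partition."""
    unique_keys = list(dict.fromkeys(keys))  # drop duplicates, keep order
    index_of = {key: i for i, key in enumerate(unique_keys)}
    overflow = len(unique_keys)  # one extra partition for keys that are not listed

    def partition_func(key):
        return index_of.get(key, overflow)

    return overflow + 1, partition_func


def split_by_key(rdd, keys):
    """Return {key: RDD holding only the (key, value) records with that key}."""
    # k=key binds the current key when the lambda is created (avoids late binding).
    return {key: rdd.filter(lambda kv, k=key: kv[0] == k) for key in keys}


def demo():
    # Imported here so the helpers above can be used and tested without Spark.
    from pyspark import SparkContext

    sc = SparkContext("local[2]", "split-by-key-sketch")
    try:
        pairs = sc.parallelize([("X", 1), ("Y", 2), ("X", 3), ("Z", 4)])
        keys = ["X", "Y"]

        # Co-locate records by key, then cache so each filter reuses the result.
        num_partitions, partition_func = make_key_partitioner(keys)
        partitioned = pairs.partitionBy(num_partitions, partition_func).cache()

        for key, sub_rdd in split_by_key(partitioned, keys).items():
            print(key, sub_rdd.collect())
    finally:
        sc.stop()
```

Calling demo() needs a local Spark installation. Each filter is a separate pass over the cached data, so this approach is reasonable only when the number of distinct keys is small.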