Posted to user@spark.apache.org by Vikash Kumar <vi...@gmail.com> on 2016/10/09 04:11:14 UTC
"How to change rdd fields for each key combination."
I have an RDD with records in the format below:
id,name,age,houseno,childPresent
1,gupta,35,100,None
1,verma,16,100,None
1,ravi,10,100,None
2, Abc,32,200,None
2,xyz,23,200,None
I have to change the childPresent field for all rows with the same id if any
record with that id has age < 18. How can I do that?
I want the output to look like this:
1,gupta,35,100,Y
1,verma,16,100,Y -- because this record has age less than 18, childPresent
is Y for every row with id = 1
1,ravi,10,100,Y
2, Abc,32,200,N
2,xyz,23,200,N -- because no record with id = 2 has age < 18.
Please let me know how I can achieve this using Spark/Scala.
Thanks
Vikash
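
One possible approach, offered as a minimal sketch rather than a definitive answer: compute a single Boolean per id with reduceByKey (true if any record for that id has age < 18), then join that flag back onto every record. The names Person, parse, and markChildPresent are hypothetical and chosen for illustration, as is the assumption that every line has exactly five comma-separated fields with numeric id, age, and houseno. The parser trims whitespace, so " Abc" in the sample becomes "Abc".

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ChildPresentSketch {
  // Hypothetical record type mirroring the columns id,name,age,houseno,childPresent.
  case class Person(id: Int, name: String, age: Int, houseNo: Int, childPresent: String)

  // Parse one CSV line (assumption: exactly five fields, numeric id/age/houseno).
  def parse(line: String): Person = {
    val f = line.split(",").map(_.trim)
    Person(f(0).toInt, f(1), f(2).toInt, f(3).toInt, f(4))
  }

  def markChildPresent(people: RDD[Person]): RDD[Person] = {
    // One Boolean per id: true if any record with that id has age < 18.
    val hasChild: RDD[(Int, Boolean)] =
      people.map(p => (p.id, p.age < 18)).reduceByKey(_ || _)

    // Join the per-id flag back onto every record with the same id.
    people.keyBy(_.id).join(hasChild).map { case (_, (p, flag)) =>
      p.copy(childPresent = if (flag) "Y" else "N")
    }
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for trying the sketch out on one machine.
    val conf = new SparkConf().setAppName("child-present-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val lines = sc.parallelize(Seq(
        "1,gupta,35,100,None",
        "1,verma,16,100,None",
        "1,ravi,10,100,None",
        "2, Abc,32,200,None",
        "2,xyz,23,200,None"
      ))
      // Row order after the join is not guaranteed, so sort by id for display.
      markChildPresent(lines.map(parse)).collect().sortBy(_.id).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

reduceByKey followed by a join avoids collecting all rows of an id into one in-memory group, which groupByKey would do. If the number of distinct ids is small, collecting the flags to the driver and broadcasting them as a Map is another option.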