Posted to user@spark.apache.org by Shepherd <Ch...@huawei.com> on 2015/10/16 21:14:40 UTC

Question of RDD in calculation

Hi all,

I am new to Spark, and I have a question about working with RDDs.

I've converted my RDDs to DataFrames, so there are two DataFrames, DF1 and DF2:

DF1 contains: userID, time, dataUsage, duration
DF2 contains: userID

Each userID has multiple rows in DF1, while DF2 holds the distinct userIDs.
For each userID, I would like to compute the average, max, and min of both
dataUsage and duration in DF1, and store the results in a new DataFrame.

How can I do that?

Thanks a lot.

Best,
Frank



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Question-of-RDD-in-calculation-tp25100.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.