Posted to user@spark.apache.org by Yogesh Vyas <in...@gmail.com> on 2016/10/03 06:19:58 UTC

filtering in SparkR

Hi,

I have two SparkDataFrames, df1 and df2.
Their schemas are as follows:
df1=>SparkDataFrame[id:double, c1:string, c2:string]
df2=>SparkDataFrame[id:double, c3:string, c4:string]

I want to filter out the rows of df1 whose df1$id has no matching value in df2$id.

I tried the expression filter(df1, !(df1$id %in% df2$id)), but it does not
work.
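[Editor's note: in SparkR, `%in%` tests a column's membership in a local list of literal values, not in another SparkDataFrame's column, which is why the expression above fails. One standard alternative is an anti-join pattern: left-outer join df1 to df2 on id, then keep the rows where the df2 side is null. A minimal sketch, with hypothetical example data:]

```r
library(SparkR)
sparkR.session()

# Hypothetical data matching the schemas in the question
df1 <- createDataFrame(data.frame(id = c(1, 2, 3),
                                  c1 = c("a", "b", "c"),
                                  c2 = c("x", "y", "z")))
df2 <- createDataFrame(data.frame(id = c(2, 3),
                                  c3 = c("p", "q"),
                                  c4 = c("r", "s")))

# Left-outer join keeps every df1 row; unmatched rows get nulls on the df2 side
joined <- join(df1, df2, df1$id == df2$id, "left_outer")

# Rows with no match in df2 have a null df2$id
noMatch <- filter(joined, isNull(df2$id))

# Project back to df1's columns only
result <- select(noMatch, df1$id, df1$c1, df1$c2)
```

[On Spark versions whose SparkR `join` accepts the "left_anti" join type, the same result can be written more directly as join(df1, df2, df1$id == df2$id, "left_anti").]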

Anybody could please provide me a solution for it?

Regards,
Yogesh
