Posted to user@spark.apache.org by vishnu86 <vi...@yahoo.com> on 2014/09/12 11:29:20 UTC

Using filter in joined dataset

I am a newbie to Scala and Spark. I am joining two datasets: the first one
comes from a stream and the second one is in HDFS.

I am using Scala in Spark. After joining the two datasets, I need to apply a
filter on the joined dataset, but here I am facing an issue. Please assist
me in resolving it.

I am using the code below:

val streamkv = streamrecs.map(_.split("~")).map(r => (r(0), (r(5), r(6))))
val HDFSlines = sc.textFile("/user/Rest/sample.dat").map(_.split("~")).map(r => (r(1), (r(0), r(3), r(4))))
val streamwindow = streamkv.window(Minutes(1))

val join1 = streamwindow.transform(joinRDD => { joinRDD.join(HDFSlines) })
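
For context: joining two pair datasets in Spark keeps the key and nests the
two value sides in an inner pair, (key, (leftValue, rightValue)), rather than
flattening them. A minimal standalone illustration of the resulting shape,
using plain Scala collections and made-up sample values:

val streamSide = Seq(("key1", ("field5", "field6")))
val hdfsSide   = Seq(("key1", ("field0", "field3", "iPhone")))

// Emulate the join: match on key, then pair the two value tuples.
val joined = for {
  (k1, streamVal) <- streamSide
  (k2, hdfsVal)   <- hdfsSide
  if k1 == k2
} yield (k1, (streamVal, hdfsVal))
// joined: Seq[(String, ((String, String), (String, String, String)))]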

I am getting the following error when I use the filter:

val tofilter = join1.filter {
  case (_, (_, _),(_,_,device)) =>
    device.contains("iPhone")
}.count()

 error: constructor cannot be instantiated to expected type;
 found   : (T1, T2, T3)
 required: (String, ((String, String), (String, String, String)))
       case (_, (_, _),(_,_,device)) =>
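
The pattern above treats each record as a flat 3-tuple, but, as the
"required" type in the error shows, join produces a pair whose second element
is itself a pair of the two value tuples. A sketch of the filter with that
extra level of nesting added, assuming the tuple shapes from the code above:

val tofilter = join1.filter {
  // Each record is (key, (streamValue, hdfsValue)), i.e.
  // (String, ((String, String), (String, String, String))).
  case (_, ((_, _), (_, _, device))) =>
    device.contains("iPhone")
}.count()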



