You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by roy <rp...@njit.edu> on 2015/03/31 15:16:05 UTC

Re: can't union two rdds

use zip



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/can-t-union-two-rdds-tp22320p22321.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: can't union two rdds

Posted by "ankurjain.nitrr" <an...@gmail.com>.
Rdd union will result in  

  1 2 
  3 4 
  5 6 
  7 8 
  9 10 
11 12

What you are trying to do is join.
There must be a logic/key to perform join operation.

I think in your case you want the order (index) to be the joining key here.
RDD is a distributed data structure and is not apt for your case.

If that amount for data is less, you can use rdd.collect, just iterate on it
both the list and produce the desired result



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/can-t-union-two-rdds-tp22320p22323.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org