Posted to user@spark.apache.org by Yijie Shen <he...@gmail.com> on 2015/03/08 11:29:36 UTC
A way to share RDD directly using Tachyon?
Hi,
I would like to share an RDD across several Spark applications,
i.e., create it in application A, publish its ID somewhere, and get the RDD back directly by that ID in application B.
I know I can use Tachyon simply as a filesystem, e.g. s.saveAsTextFile("tachyon://localhost:19998/Y").
But I think getting an RDD directly from Tachyon, rather than from a text file, would avoid parsing the same file repeatedly in different applications.
What should I do to share RDDs with better performance?
—
Best Regards!
Yijie Shen
Re: A way to share RDD directly using Tachyon?
Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Did you try something like:
myRDD.saveAsObjectFile("tachyon://localhost:19998/Y")
val newRDD = sc.objectFile[MyObject]("tachyon://localhost:19998/Y")
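To spell the two lines above out a bit, here is a minimal sketch of how the two applications might be split. MyObject, SharedRdd, publish and load are hypothetical names used only for illustration. The sketch assumes the record class is Serializable and is on the classpath of both applications, since saveAsObjectFile writes Java-serialized objects, and that the target path does not already exist.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical record type. Case classes are Serializable, which
// saveAsObjectFile needs; both applications must have this class available.
case class MyObject(id: Int, name: String)

object SharedRdd {
  // Application A: write the RDD out as serialized objects.
  // The output path must not exist yet, otherwise the save fails.
  def publish(rdd: RDD[MyObject], path: String): Unit =
    rdd.saveAsObjectFile(path)

  // Application B: read the objects back as an RDD, with no text parsing.
  def load(sc: SparkContext, path: String): RDD[MyObject] =
    sc.objectFile[MyObject](path)
}
```

Application A would call SharedRdd.publish(myRDD, "tachyon://localhost:19998/Y") and application B would call SharedRdd.load(sc, "tachyon://localhost:19998/Y"). The "ID" you publish is then just the path. Note this still goes through deserialization on read; it only skips re-parsing the text.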
Thanks
Best Regards