Posted to user@spark.apache.org by mark <ma...@googlemail.com> on 2015/10/23 04:27:19 UTC

Saving RDDs in Tachyon

I have Avro records stored in Parquet files in HDFS. I want to read these
out as an RDD and save that RDD in Tachyon for any spark job that wants the
data.

How do I save the RDD in Tachyon? What format do I use? Which RDD
'saveAs...' method do I want?

Thanks

Re: Saving RDDs in Tachyon

Posted by Calvin Jia <ji...@gmail.com>.
Hi Mark,

Were you able to successfully store the RDD with Akhil's method? When you
read it back with sc.objectFile, you will also need to specify the correct
element type as a type parameter.
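
As a rough sketch of what specifying the type looks like: sc.objectFile takes the element type as a type parameter, because it cannot be inferred from the stored files. The `Reading` case class, the object name, and the tachyon:// address below are hypothetical placeholders, not anything from this thread. The sketch also assumes the same class is on the classpath of the job that reads the data.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical record type; it must match the type that was saved.
case class Reading(id: Long, value: Double)

object ReadFromTachyon {
  // The element type is passed explicitly as a type parameter.
  def load(sc: SparkContext, uri: String): RDD[Reading] =
    sc.objectFile[Reading](uri)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("read-from-tachyon"))
    // Assumed Tachyon master address and path; adjust for your cluster.
    val readings = load(sc, "tachyon://localhost:19998/shared/readings")
    readings.take(5).foreach(println)
    sc.stop()
  }
}
```

If the type parameter does not match what was written, the mismatch typically surfaces only later, as a ClassCastException when the elements are actually used, rather than at the objectFile call itself.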

You can find more information about integrating Spark and Tachyon on this
page: http://tachyon-project.org/documentation/Running-Spark-on-Tachyon.html

Hope this helps,
Calvin

On Fri, Oct 30, 2015 at 7:04 AM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> I guess you can do a .saveAsObjectFile and read it back with sc.objectFile
>
> Thanks
> Best Regards
>
> On Fri, Oct 23, 2015 at 7:57 AM, mark <ma...@googlemail.com> wrote:
>
>> I have Avro records stored in Parquet files in HDFS. I want to read these
>> out as an RDD and save that RDD in Tachyon for any spark job that wants the
>> data.
>>
>> How do I save the RDD in Tachyon? What format do I use? Which RDD
>> 'saveAs...' method do I want?
>>
>> Thanks
>>
>
>

Re: Saving RDDs in Tachyon

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
I guess you can do a .saveAsObjectFile and read it back with sc.objectFile
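
A minimal sketch of the write side, under some assumptions: a Tachyon master is reachable at the made-up address tachyon://localhost:19998, the Tachyon client is on Spark's classpath, and the Avro records have already been converted to a Java-serializable type (the hypothetical `Record` case class here). The object-file format relies on Java serialization, so Avro records that are not serializable would need such a conversion first.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical serializable stand-in for the converted Avro records.
case class Record(id: Long, value: Double)

object SaveToTachyon {
  // Writes the RDD as Java-serialized objects under the given URI.
  def save(rdd: RDD[Record], uri: String): Unit =
    rdd.saveAsObjectFile(uri)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("save-to-tachyon"))
    // Placeholder data; in practice this would be the RDD read from Parquet.
    val records = sc.parallelize(Seq(Record(1L, 0.5), Record(2L, 1.5)))
    // Assumed Tachyon master address and output path; adjust for your cluster.
    save(records, "tachyon://localhost:19998/shared/records")
    sc.stop()
  }
}
```

The output path must not already exist, since the underlying Hadoop output format refuses to overwrite an existing directory.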

Thanks
Best Regards

On Fri, Oct 23, 2015 at 7:57 AM, mark <ma...@googlemail.com> wrote:

> I have Avro records stored in Parquet files in HDFS. I want to read these
> out as an RDD and save that RDD in Tachyon for any spark job that wants the
> data.
>
> How do I save the RDD in Tachyon? What format do I use? Which RDD
> 'saveAs...' method do I want?
>
> Thanks
>