Posted to user@spark.apache.org by abhiguruvayya <sh...@gmail.com> on 2014/06/20 02:54:43 UTC

How to store JavaRDD as a sequence file using spark java API?

I want to store a JavaRDD as a sequence file instead of a text file, but I don't
see any Java API for that. Is there a way to do this? Please let me know.
Thanks!



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-store-JavaRDD-as-a-sequence-file-using-spark-java-API-tp7969.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: How to store JavaRDD as a sequence file using spark java API?

Posted by Kan Zhang <kz...@apache.org>.
Yes, it can if you set the output format to SequenceFileOutputFormat. The
difference is saveAsSequenceFile does the conversion to Writable for you if
needed and then calls saveAsHadoopFile.
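
For illustration, a minimal sketch of the saveAsHadoopFile route described above. It assumes Java 8 lambdas and a local master; the class name, the helper saveWordLengths, and the sample words are hypothetical. Because saveAsHadoopFile does no conversion for you, the sketch wraps each key and value in a Writable by hand before saving.

```java
import java.util.Arrays;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class SaveAsSequenceFileSketch {

    // Hypothetical helper: wraps each word and its length in Writables by hand,
    // then writes the pairs with the old-API (mapred) SequenceFileOutputFormat.
    static void saveWordLengths(JavaRDD<String> words, String outputDir) {
        JavaPairRDD<Text, IntWritable> pairs = words.mapToPair(
            w -> new Tuple2<>(new Text(w), new IntWritable(w.length())));
        pairs.saveAsHadoopFile(outputDir, Text.class, IntWritable.class,
            SequenceFileOutputFormat.class);
    }

    public static void main(String[] args) {
        // Assumption: local mode and a default output path, for demonstration only.
        String outputDir = args.length > 0 ? args[0] : "seqfile-out";
        SparkConf conf = new SparkConf()
            .setAppName("SaveAsSequenceFileSketch")
            .setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        try {
            JavaRDD<String> words = sc.parallelize(Arrays.asList("spark", "hadoop", "rdd"));
            saveWordLengths(words, outputDir);
        } finally {
            sc.stop();
        }
    }
}
```

The output directory must not already exist. The files can be read back with JavaSparkContext.sequenceFile(path, Text.class, IntWritable.class).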


On Fri, Jun 20, 2014 at 12:43 AM, abhiguruvayya <sh...@gmail.com>
wrote:

> Does JavaPairRDD.saveAsHadoopFile store data as a sequence file? If so, what
> is the significance of RDD.saveAsSequenceFile?

Re: How to store JavaRDD as a sequence file using spark java API?

Posted by abhiguruvayya <sh...@gmail.com>.
Does JavaPairRDD.saveAsHadoopFile store data as a sequence file? If so, what is
the significance of RDD.saveAsSequenceFile?




Re: How to store JavaRDD as a sequence file using spark java API?

Posted by Shixiong Zhu <zs...@gmail.com>.
You can use JavaPairRDD.saveAsHadoopFile or JavaPairRDD.saveAsNewAPIHadoopFile.
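
A rough sketch of the new-API variant, in case it helps. It assumes the SequenceFileOutputFormat from org.apache.hadoop.mapreduce.lib.output, Java 8 lambdas, and a local master; the class name, the helper saveIndexedLines, and the sample lines are made up for the example.

```java
import java.util.Arrays;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class NewApiSequenceFileSketch {

    // Hypothetical helper: keys each line by its index and writes the pairs
    // with the new-API (mapreduce) SequenceFileOutputFormat.
    static void saveIndexedLines(JavaRDD<String> lines, String outputDir) {
        JavaPairRDD<LongWritable, Text> pairs = lines.zipWithIndex().mapToPair(
            t -> new Tuple2<>(new LongWritable(t._2()), new Text(t._1())));
        pairs.saveAsNewAPIHadoopFile(outputDir, LongWritable.class, Text.class,
            SequenceFileOutputFormat.class);
    }

    public static void main(String[] args) {
        // Assumption: local mode and a default output path, for demonstration only.
        String outputDir = args.length > 0 ? args[0] : "seqfile-newapi-out";
        SparkConf conf = new SparkConf()
            .setAppName("NewApiSequenceFileSketch")
            .setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        try {
            JavaRDD<String> lines = sc.parallelize(Arrays.asList("first line", "second line"));
            saveIndexedLines(lines, outputDir);
        } finally {
            sc.stop();
        }
    }
}
```

The only differences from the old-API call are the method name and the package the output format class comes from.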

Best Regards,
Shixiong Zhu


2014-06-20 14:22 GMT+08:00 abhiguruvayya <sh...@gmail.com>:

> Any input on this would be helpful.

Re: How to store JavaRDD as a sequence file using spark java API?

Posted by abhiguruvayya <sh...@gmail.com>.
Any input on this would be helpful.




Re: How to store JavaRDD as a sequence file using spark java API?

Posted by abhiguruvayya <sh...@gmail.com>.
No. From reading the code, my understanding is that RDD.saveAsObjectFile uses
Java serialization, whereas RDD.saveAsSequenceFile uses Writable, which is tied
to Hadoop's Writable serialization framework.
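
To make the distinction concrete, here is a small sketch that uses only the JDK. SimpleWritable is a hypothetical stand-in with the same two methods as Hadoop's Writable (write and readFields); it is not the Hadoop interface itself, and Point is an invented example type. The sketch serializes one object both ways so the byte counts can be compared.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSketch {

    // Hypothetical stand-in mirroring the two methods of Hadoop's Writable.
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Example type that supports both serialization styles.
    static class Point implements Serializable, SimpleWritable {
        private static final long serialVersionUID = 1L;
        int x;
        int y;

        Point() {}

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            // Only the raw field values are written: two 4-byte ints.
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Java serialization: the stream also carries class metadata.
    static byte[] javaSerialize(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Writable-style serialization: the object writes its own fields.
    static byte[] writableSerialize(SimpleWritable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            value.write(out);
        }
        return bytes.toByteArray();
    }

    // Rebuilds a Point from Writable-style bytes.
    static Point readPoint(byte[] data) throws IOException {
        Point p = new Point();
        p.readFields(new DataInputStream(new ByteArrayInputStream(data)));
        return p;
    }

    public static void main(String[] args) throws IOException {
        Point p = new Point(3, 4);
        System.out.println("Java serialization: " + javaSerialize(p).length + " bytes");
        System.out.println("Writable-style:     " + writableSerialize(p).length + " bytes");
    }
}
```

In the Writable style the reader must already know the type, since nothing in the bytes identifies it. That is why a sequence file records the key and value class names once in its header.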




Re: How to store JavaRDD as a sequence file using spark java API?

Posted by Kan Zhang <kz...@apache.org>.
Can you use saveAsObjectFile?


On Thu, Jun 19, 2014 at 5:54 PM, abhiguruvayya <sh...@gmail.com>
wrote:

> I want to store a JavaRDD as a sequence file instead of a text file, but I don't
> see any Java API for that. Is there a way to do this? Please let me know.
> Thanks!