Posted to user@spark.apache.org by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/11 13:58:13 UTC
save as file
Hi,
I am using Spark 1.1.0. I need help saving an RDD as a JSON file.
How do I do that? And how do I specify an HDFS path in the program?
-Naveen
Re: save as file
Posted by Ritesh Kumar Singh <ri...@gmail.com>.
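We have RDD.saveAsTextFile and RDD.saveAsObjectFile for saving the output
to any location specified. Both take the path of the storage location.
The number of partitions is not a parameter of these methods: one part
file is written per partition of the RDD, so call repartition or coalesce
on the RDD first if you need a different number of output files.
For an HDFS path, use the following format:
"/user/<user-name>/<directory-to-store>/"
A path without a scheme is resolved against the cluster's default file
system; a full URI of the form "hdfs://<namenode-host>:<port>/user/..."
also works.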
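As a rough sketch of the above, the program below writes a small RDD to HDFS as text and as an object file. The NameNode address "namenode:8020", the user name "naveen", the directory names, and the hdfsUri helper are all made up for illustration; substitute your cluster's values. It also assumes the master URL is supplied by spark-submit.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SaveToHdfs {
  // Hypothetical helper (not part of Spark): builds a fully qualified HDFS URI.
  def hdfsUri(nameNode: String, user: String, dir: String): String =
    s"hdfs://$nameNode/user/$user/$dir"

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("SaveToHdfs"))

    // An RDD with 4 partitions.
    val rdd = sc.parallelize(1 to 100, 4)

    // Assumed NameNode address and user name; replace with real values.
    val base = hdfsUri("namenode:8020", "naveen", "demo")

    // Writes one part file per partition (4 here) under the directory.
    rdd.saveAsTextFile(base + "/numbers-text")

    // coalesce(1) first to end up with a single part file.
    rdd.coalesce(1).saveAsTextFile(base + "/numbers-single")

    // Stores Java-serialized objects in a SequenceFile;
    // it can be read back with sc.objectFile[Int](...).
    rdd.saveAsObjectFile(base + "/numbers-object")

    sc.stop()
  }
}
```

Note that each output directory must not exist yet, otherwise the save fails.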
Re: save as file
Posted by Akhil Das <ak...@sigmoidanalytics.com>.
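One approach would be to use saveAsNewAPIHadoopFile and specify a JSON
output format.
Another, simpler one would be:
import scala.util.parsing.json.JSONObject

val rdd = sc.parallelize(1 to 100)
// Wrap each element in a one-field JSON object keyed by "id".
val json = rdd.map { x =>
  val m: Map[String, Int] = Map("id" -> x)
  new JSONObject(m)
}
// saveAsTextFile writes each object's toString, one JSON object per line.
json.saveAsTextFile("output")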
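For the first approach, here is a minimal sketch. Since I am not sure which JSON output format class was meant, it swaps in Hadoop's stock TextOutputFormat and writes each record as a JSON string with a NullWritable key, so only the JSON text ends up in the files. The object name, the toJsonLine helper, and the "output-json" path are made up for illustration. It assumes the Scala 2.10 build that Spark 1.1.0 ships with, where scala.util.parsing.json is bundled.

```scala
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on Spark 1.x
import scala.util.parsing.json.JSONObject

object SaveJsonWithHadoopApi {
  // Hypothetical helper: renders one record as a single-line JSON object.
  def toJsonLine(id: Int): String = JSONObject(Map("id" -> id)).toString()

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("SaveJsonWithHadoopApi"))

    // TextOutputFormat skips a NullWritable key and writes only the value,
    // so each output line is just the JSON text.
    val pairs = sc.parallelize(1 to 100)
      .map(x => (NullWritable.get(), new Text(toJsonLine(x))))

    // "output-json" is a placeholder path; an hdfs:// URI works here as well.
    pairs.saveAsNewAPIHadoopFile(
      "output-json",
      classOf[NullWritable],
      classOf[Text],
      classOf[TextOutputFormat[NullWritable, Text]])

    sc.stop()
  }
}
```

For plain JSON lines this produces the same kind of output as saveAsTextFile, so the extra ceremony mostly pays off when you need a custom OutputFormat.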
Thanks
Best Regards