Posted to user@spark.apache.org by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/11 13:58:13 UTC

save as file

Hi,

I am using Spark 1.1.0. I need help saving an RDD as a JSON file.

How can I do that? And how do I specify an HDFS path in the program?


-Naveen

Re: save as file

Posted by Ritesh Kumar Singh <ri...@gmail.com>.

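We have RDD.saveAsTextFile and RDD.saveAsObjectFile for saving the output
to any location you specify. The parameter to provide is:
>path of storage location

The number of output files is not a parameter. Spark writes one part file
per RDD partition, so use coalesce or repartition before saving if you
need to control it.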

To give an HDFS path, use the following format:
"/user/<user-name>/<directory-to-store>/"
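
As a rough sketch of how this might look in a program, assuming the job is launched with spark-submit (which supplies the master URL): the host name, port, and output directory below are placeholders, and hdfsUri is a hypothetical helper written only for this illustration. A fully qualified hdfs:// URI is used here so the path does not depend on the cluster's default filesystem setting.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SaveToHdfs {
  // Hypothetical helper: builds a fully qualified HDFS URI from its parts.
  def hdfsUri(host: String, port: Int, path: String): String =
    s"hdfs://$host:$port$path"

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("save-as-file"))
    val rdd = sc.parallelize(1 to 100)

    // Placeholder NameNode host, port, and directory: replace with your own.
    val out = hdfsUri("namenode.example.com", 8020, "/user/naveen/output")

    // coalesce(1) yields a single part file; drop it to keep one file per partition.
    // saveAsTextFile fails if the output directory already exists.
    rdd.coalesce(1).saveAsTextFile(out)

    sc.stop()
  }
}
```

The output is a directory of part files rather than a single named file.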

Re: save as file

Posted by Akhil Das <ak...@sigmoidanalytics.com>.

One approach would be to use saveAsNewAPIHadoopFile and specify an
OutputFormat that writes JSON.
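
A minimal sketch of the Hadoop-API route, under some assumptions: no particular JSON OutputFormat class is named in this thread, so Hadoop's stock TextOutputFormat is swapped in here, with each record already serialized to a JSON string. The toJsonLine helper, the object name, and the "output-hadoop" path are illustrative only.

```scala
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on Spark 1.1.0
import scala.util.parsing.json.JSONObject

object SaveJsonWithHadoopApi {
  // Hypothetical helper: serializes one id as a JSON object string.
  def toJsonLine(id: Int): String = JSONObject(Map("id" -> id)).toString

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("save-json-hadoop-api"))

    // With a NullWritable key, TextOutputFormat writes only the value on each line.
    val pairs = sc.parallelize(1 to 100)
      .map(x => (NullWritable.get(), new Text(toJsonLine(x))))

    // "output-hadoop" is a placeholder path; an hdfs:// URI works here as well.
    pairs.saveAsNewAPIHadoopFile[TextOutputFormat[NullWritable, Text]]("output-hadoop")

    sc.stop()
  }
}
```

For plain line-oriented JSON this gives much the same result as saveAsTextFile; the Hadoop API mainly pays off when a custom OutputFormat is available.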

Another simple one would be like:

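import scala.util.parsing.json.JSONObject

val rdd = sc.parallelize(1 to 100)
// Build one JSON object string per element
val json = rdd.map { x =>
  val m: Map[String, Any] = Map("id" -> x)
  JSONObject(m).toString
}

// Writes one JSON object per line into part files under "output"
json.saveAsTextFile("output")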

Thanks
Best Regards
