Posted to user@spark.apache.org by Sree Eedupuganti <sr...@inndata.in> on 2016/11/15 07:20:59 UTC

How to read a Multi Line json object via Spark

I tried this from spark-shell and I am getting the following error:

Here is the test.json file:

{
    "colorsArray": [{
        "red": "#f00",
        "green": "#0f0",
        "blue": "#00f",
        "cyan": "#0ff",
        "magenta": "#f0f",
        "yellow": "#ff0",
        "black": "#000"
    }]}


scala> val jtex = sqlContext.read.format("json").option("samplingRatio","1.0").load("/user/spark/test.json")

       jtex: org.apache.spark.sql.DataFrame = [_corrupt_record: string]


Any suggestions please. Thanks.
-- 
Best Regards,
Sreeharsha Eedupuganti
Data Engineer
innData Analytics Private Limited

Re: How to read a Multi Line json object via Spark

Posted by Hyukjin Kwon <gu...@gmail.com>.
Hi Sree,


There is a blog post about that:
http://searchdatascience.com/spark-adventures-1-processing-multi-line-json-files/

It is pretty old but I am sure that it is helpful.

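Currently, the JSON data source only supports reading JSON documents formatted
according to http://jsonlines.org/, that is, one complete JSON document per
line. A document spread over several lines is parsed line by line, which is
why you get _corrupt_record.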

There is an open issue to support multi-line JSON:
https://issues.apache.org/jira/browse/SPARK-18352
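
Until that is supported, one possible workaround is to read each file as a
single string and pass those strings to the JSON reader. The following is only
a minimal sketch, assuming a spark-shell session (so that sc and sqlContext
already exist), the path from the original question, and files small enough to
fit in memory one at a time. The names wholeFiles and jsonDocs are just
illustrative.

```scala
// Run inside spark-shell, where `sc` and `sqlContext` are already defined.

// wholeTextFiles yields one (path, content) pair per file, so a JSON
// document that spans several lines stays together as a single string.
val wholeFiles = sc.wholeTextFiles("/user/spark/test.json")

// Keep only the file contents; each RDD element is one complete document.
val jsonDocs = wholeFiles.values

// read.json(RDD[String]) parses each element as one JSON document.
val jtex = sqlContext.read.json(jsonDocs)

// Inspect the inferred schema instead of assuming it.
jtex.printSchema()
```

Because every file is loaded whole, this is better suited to many small files
than to one very large one. Alternatively, rewriting test.json so that the
whole object sits on a single line should also satisfy the JSON Lines format.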

I hope this is helpful.


Thanks.




RE: How to read a Multi Line json object via Spark

Posted by "Kappaganthu, Sivaram (ES)" <Si...@ADP.com>.
Hello,

Please find attached the old mail on this subject

Thanks,
Sivaram
