Posted to user@spark.apache.org by Nirav Patel <np...@xactlycorp.com> on 2018/08/30 23:19:06 UTC

CSV parser - how to parse column containing json data

Is there a way to parse a CSV file in which a column in the middle contains a
JSON data structure?

"a",102,"c","{"x":"xx","y":false,"z":123}","d","e",102.2


Thanks,
Nirav


Re: CSV parser - how to parse column containing json data

Posted by Brandon Geise <br...@gmail.com>.
Do your schema inference on the CSV first, then apply the JSON schema with withColumn, overwriting the string representation of that column.
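
A minimal sketch of that two-step approach, under some assumptions of mine: the file has no header (so the JSON lands in the default column _c3), and the JSON field is written as valid CSV with its inner quotes doubled, which is not how the sample row in the question is written. The in-memory sample line, the option values and the name jsonSchema are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").appName("csv-json-column").getOrCreate()
import spark.implicits._

// Hypothetical sample row. ASSUMPTION: quotes inside the JSON field are doubled ("")
// so that the row is valid CSV.
val lines = Seq(""""a",102,"c","{""x"":""xx"",""y"":false,""z"":123}","d","e",102.2""").toDS()

// Step 1: let Spark infer the types of every column; the JSON column is inferred as a string.
val raw = spark.read
  .option("inferSchema", "true")
  .option("quote", "\"")
  .option("escape", "\"") // treat "" inside a quoted field as an escaped quote
  .csv(lines)

// Step 2: overwrite only the JSON column with a struct parsed by from_json.
val jsonSchema = StructType(Array(
  StructField("x", StringType, true),
  StructField("y", StringType, true),
  StructField("z", IntegerType, true)))

val parsed = raw.withColumn("_c3", from_json(col("_c3"), jsonSchema))
parsed.printSchema()
```

The other columns keep their inferred types; only _c3 changes from string to struct. For a real file, the same options should apply with .csv(path) in place of the in-memory Dataset.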

 


Re: CSV parser - how to parse column containing json data

Posted by Nirav Patel <np...@xactlycorp.com>.
I need to infer the schema from the CSV as well. With your solution, I am
creating a StructType only for the JSON field. So how do I mix and match here?
That is, how do I get type inference for all fields except the JSON field and
use the custom json_schema for the JSON field?






Re: CSV parser - how to parse column containing json data

Posted by Brandon Geise <br...@gmail.com>.
If you know your json schema you can create a struct and then apply that using from_json:

 

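import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types._

val json_schema = StructType(Array(StructField("x", StringType, true), StructField("y", StringType, true), StructField("z", IntegerType, true)))

// Assumes the JSON string was read into the default column _c3; the parsed struct overwrites it
.withColumn("_c3", from_json(col("_c3"), json_schema))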
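
To show what from_json produces, here is a small sketch on an in-memory DataFrame rather than a CSV file. The DataFrame df, its column names and the flat projection are hypothetical, made up for illustration; only the schema and the from_json call come from the snippet above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val spark = SparkSession.builder().master("local[*]").appName("from-json-sketch").getOrCreate()
import spark.implicits._

// Hypothetical one-row DataFrame whose _c3 column holds the JSON text from the question.
val df = Seq(("a", """{"x":"xx","y":false,"z":123}""")).toDF("_c0", "_c3")

val json_schema = StructType(Array(
  StructField("x", StringType, true),
  StructField("y", StringType, true),
  StructField("z", IntegerType, true)))

// Replace the JSON string with a struct column.
val parsed = df.withColumn("_c3", from_json(col("_c3"), json_schema))

// Fields of the struct can then be selected with dot notation.
val flat = parsed.select(col("_c0"), col("_c3.x"), col("_c3.y"), col("_c3.z"))
flat.printSchema()
```

Rows whose JSON text cannot be parsed against the schema generally end up with a null struct rather than failing the job, so it may be worth checking for nulls in _c3 afterwards.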

 
