Posted to dev@flink.apache.org by Norman Spangenberg <wi...@studserv.uni-leipzig.de> on 2014/08/06 17:15:14 UTC
parse json-file with scala-api and json4s
Hello,
I hope this is the right place for this question.
I'm currently experimenting with and comparing Flink/Stratosphere and Apache Spark.
My goal is to analyse large JSON files of Twitter data, and I'm now
looking for a way to parse the JSON tuples in a map function and put them
into a DataSet.
For this I'm using the Flink Scala API and json4s.
But in Flink the problem is parsing the JSON file:
val words = cleaned.map { line => parse(line) }
The error message is:
Error analyzing UDT org.json4s.JValue:
Subtype org.json4s.JsonAST.JInt - Field num: BigInt - Unsupported type BigInt
Subtype org.json4s.JsonAST.JArray - Field arr: List[org.json4s.JsonAST.JValue] - Subtype org.json4s.JsonAST.JInt - Field num: BigInt - Unsupported type BigInt
Subtype org.json4s.JsonAST.JArray - Field arr: List[org.json4s.JsonAST.JValue] - Subtype org.json4s.JsonAST.JDecimal - Field num: BigDecimal - Unsupported type BigDecimal
Subtype org.json4s.JsonAST.JDecimal - Field num: BigDecimal - Unsupported type BigDecimal
Subtype org.json4s.JsonAST.JObject - Field obj: List[(String, org.json4s.JsonAST.JValue)] - Field _2: org.json4s.JsonAST.JValue - Subtype org.json4s.JsonAST.JInt - Field num: BigInt - Unsupported type BigInt
Subtype org.json4s.JsonAST.JObject - Field obj: List[(String, org.json4s.JsonAST.JValue)] - Field _2: org.json4s.JsonAST.JValue - Subtype org.json4s.JsonAST.JDecimal - Field num: BigDecimal - Unsupported type BigDecimal
In Spark I found a way based on
https://gist.github.com/cotdp/b471cfff183b59d65ae1:
val user_interest = lines.map(line => parse(line))
  .map { json =>
    implicit lazy val formats = org.json4s.DefaultFormats
    val name = (json \ "name").extract[String]
    val location_x = (json \ "location" \ "x").extract[Double]
    val location_y = (json \ "location" \ "y").extract[Double]
    val likes = (json \ "likes").extract[Seq[String]].map(_.toLowerCase()).mkString(";")
    UserInterest(name, location_x, location_y, likes)
  }
This works fine in Spark, but is it possible to do the same with Flink?
Kind regards,
Norman
Re: parse json-file with scala-api and json4s
Posted by Aljoscha Krettek <al...@apache.org>.
Hi Norman,
right now it is only possible to use primitive types and case classes (of
which tuples are a special case) as Scala data types. Your program could
work if you omit the second map function and instead put that code in your
first map function. This way you avoid having that custom JSON type as the
data type of your first intermediate DataSet.
Just let me know if you need more information.
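Merged into a single map function, the Spark snippet above might look roughly like this in Flink. This is an untested sketch: the input path and the UserInterest case class are assumptions carried over from the Spark gist, and the exact API surface depends on the Flink/Stratosphere version you are running.

```scala
import org.apache.flink.api.scala._
import org.json4s._
import org.json4s.native.JsonMethods._

// All fields use types the Flink Scala API can analyze,
// so no org.json4s.JValue ever becomes a DataSet element type.
case class UserInterest(name: String, locationX: Double,
                        locationY: Double, likes: String)

object TwitterJsonJob {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val lines = env.readTextFile("path/to/tweets.json") // hypothetical path

    // Parse and extract in ONE map, so the intermediate DataSet
    // holds UserInterest values, not JValue trees.
    val userInterest = lines.map { line =>
      implicit val formats = DefaultFormats
      val json = parse(line)
      val name = (json \ "name").extract[String]
      val x    = (json \ "location" \ "x").extract[Double]
      val y    = (json \ "location" \ "y").extract[Double]
      val likes = (json \ "likes").extract[Seq[String]]
        .map(_.toLowerCase).mkString(";")
      UserInterest(name, x, y, likes)
    }

    userInterest.print()
  }
}
```

The key difference to your version is only where the json4s extraction happens, not how: everything lives inside one UDF, so Flink's type analysis never sees the JSON AST.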
Aljoscha