Posted to dev@spark.apache.org by Kevin Chen <kc...@palantir.com> on 2015/09/08 21:46:46 UTC

Deserializing JSON into Scala objects in Java code

Hello Spark Devs,

 I am trying to use the new Spark API json endpoints at /api/v1/[path]
(added in SPARK-3454).

 In order to minimize maintenance on our end, I would like to use
Retrofit/Jackson to parse the json directly into the Scala classes in
org/apache/spark/status/api/v1/api.scala (ApplicationInfo,
ApplicationAttemptInfo, etc…). However, Jackson does not seem to know how to
handle Scala Seqs, and will throw an error when trying to parse the
attempts: Seq[ApplicationAttemptInfo] field of ApplicationInfo. Our codebase
is in Java.

 My questions are:
1. Do you have any recommendations on how to easily deserialize Scala
objects from json? For example, do you have any current usage examples of
SPARK-3454 with Java?
2. Alternatively, are you committed to the json formats of /api/v1/path? I
would guess so, because of the ‘v1’, but wanted to confirm. If so, I could
deserialize the json into instances of my own Java classes instead, without
worrying about changing the class structure later due to changes in the
Spark API.
Some further information:
* The error I am getting with Jackson when trying to deserialize the json
into ApplicationInfo is Caused by:
com.fasterxml.jackson.databind.JsonMappingException: Can not construct
instance of scala.collection.Seq, problem: abstract types either need to be
mapped to concrete types, have custom deserializer, or be instantiated with
additional type information
* I tried using Jackson’s DefaultScalaModule, which seems to have support
for Scala Seqs, but had no luck.
* Deserialization works if the Scala class does not have any Seq fields, and
works if the fields are Java Lists instead of Seqs.
Thanks very much for your help!
Kevin Chen




Re: Deserializing JSON into Scala objects in Java code

Posted by Marcelo Vanzin <va...@cloudera.com>.
Hi Kevin,

This code works fine for me (output is "List(1, 2)"):

import org.apache.spark.status.api.v1.RDDPartitionInfo;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.module.scala.DefaultScalaModule;

class jackson {
  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    mapper.registerModule(new DefaultScalaModule());

    String json = "{ \"blockName\" : \"name\", \"executors\" : [ \"1\", \"2\" ] }";
    RDDPartitionInfo info = mapper.readValue(json, RDDPartitionInfo.class);
    System.out.println(info.executors());
  }
}





-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: Deserializing JSON into Scala objects in Java code

Posted by Kevin Chen <kc...@palantir.com>.
Hi Marcelo,

 Thanks for the quick response. I understand that I can just write my own
Java classes (I will use that as a fallback option), but to avoid code
duplication and possible future changes, I was hoping to use the Spark API
classes directly, since that seems like it should be possible.

 I registered the Scala module in the same way (except in Java instead of
Scala),

mapper.registerModule(new DefaultScalaModule());

However, the module does not seem to be registered or used properly. Do you
happen to know whether the above line should work in Java?




Re: Deserializing JSON into Scala objects in Java code

Posted by Marcelo Vanzin <va...@cloudera.com>.
Hi Kevin,

How did you try to use the Scala module? Spark has this code when
setting up the ObjectMapper used to generate the output:

  mapper.registerModule(com.fasterxml.jackson.module.scala.DefaultScalaModule)

As for supporting direct serialization to Java objects, I don't think
that was the goal of the API. The Scala API classes are public mostly
so that API compatibility checks are performed against them. If you
don't mind the duplication, you could write your own Java POJOs that
mirror the Scala API, and use them to deserialize the JSON.
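A Java mirror along the lines Marcelo suggests might look something like the
sketch below. The field names are assumptions drawn from this thread and the
v1 REST API, not taken from the thread itself, so verify them against
org/apache/spark/status/api/v1/api.scala for your Spark version:

```java
import java.util.Arrays;
import java.util.Date;
import java.util.List;

// Sketch of Java POJOs mirroring the v1 ApplicationInfo/ApplicationAttemptInfo
// classes. Field names are illustrative assumptions; check them against the
// JSON the REST API actually emits. Using java.util.List where the Scala class
// uses Seq sidesteps the Seq deserialization problem entirely.
public class ApplicationInfoPojo {
  public String id;
  public String name;
  public List<ApplicationAttemptInfoPojo> attempts;

  public static class ApplicationAttemptInfoPojo {
    public String attemptId;  // Option[String] in Scala; null when absent
    public Date startTime;
    public Date endTime;
    public String sparkUser;
    public boolean completed;
  }

  public static void main(String[] args) {
    ApplicationInfoPojo info = new ApplicationInfoPojo();
    info.id = "app-1";
    ApplicationAttemptInfoPojo attempt = new ApplicationAttemptInfoPojo();
    attempt.completed = true;
    info.attempts = Arrays.asList(attempt);
    System.out.println(info.id + " has " + info.attempts.size() + " attempt(s)");
    // prints "app-1 has 1 attempt(s)"
  }
}
```

Jackson can populate public fields like these directly; adding
@JsonIgnoreProperties(ignoreUnknown = true) would make the POJOs tolerant of
new fields appearing in later Spark releases.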





-- 
Marcelo



Re: Deserializing JSON into Scala objects in Java code

Posted by Kevin Chen <kc...@palantir.com>.
Marcelo and Christopher,

 Thanks for your help! The problem turned out to arise from a different part
of the code (we have multiple ObjectMappers), but because I am not very
familiar with Jackson I had thought there was a problem with the Scala
module.
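For readers who hit the same multiple-ObjectMapper pitfall, one way to avoid a
stray unconfigured mapper is to route all (de)serialization through a single
shared instance. A minimal sketch (the holder class name is illustrative, and
this assumes jackson-module-scala is on the classpath):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.module.scala.DefaultScalaModule;

// Sketch: one shared, Scala-aware ObjectMapper for the whole codebase, so no
// code path can accidentally use a mapper without DefaultScalaModule
// registered. registerModule returns the mapper, so this chains cleanly.
public final class Mappers {
  public static final ObjectMapper SCALA_AWARE =
      new ObjectMapper().registerModule(new DefaultScalaModule());

  private Mappers() {}
}
```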

Thank you again,
Kevin

From:  Christopher Currie <ch...@currie.com>
Date:  Wednesday, September 9, 2015 at 10:17 AM
To:  Kevin Chen <kc...@palantir.com>, "dev@spark.apache.org"
<de...@spark.apache.org>
Cc:  Matt Cheah <mc...@palantir.com>, Mingyu Kim <mk...@palantir.com>
Subject:  Fwd: Deserializing JSON into Scala objects in Java code

Kevin,

I'm not a Spark dev, but I maintain the Scala module for Jackson. If you're
continuing to have issues with parsing JSON using the Spark Scala datatypes,
let me know or chime in on the jackson mailing list
(jackson-user@googlegroups.com) and I'll see what I can do to help.

Christopher Currie

---------- Forwarded message ----------
From: Paul Brown <pr...@mult.ifario.us>
Date: Tue, Sep 8, 2015 at 8:58 PM
Subject: Fwd: Deserializing JSON into Scala objects in Java code
To: Christopher Currie <ch...@currie.com>


Passing along. 

---------- Forwarded message ----------
From: Kevin Chen <kc...@palantir.com>
Date: Tuesday, September 8, 2015
Subject: Deserializing JSON into Scala objects in Java code
To: "dev@spark.apache.org" <de...@spark.apache.org>
Cc: Matt Cheah <mc...@palantir.com>, Mingyu Kim <mk...@palantir.com>






-- 
(Sent from mobile. Pardon brevity.)