Posted to yarn-dev@hadoop.apache.org by Daniel Templeton <da...@cloudera.com> on 2017/11/16 03:23:43 UTC

Changing the JSON Serializer

Looks like our REST endpoints return malformed JSON for any DAO that 
includes a Map.  That includes:

* the resourceSecondsMap and preemptedResourceSecondsMap entries in all 
the GET /apps/* endpoints,
* the operationsInfo entry in the GET /scheduler endpoint for capacity 
scheduler,
* the local_resources, environment, and acls entries in the POST /apps 
endpoint, and
* the labelsToNodes entry in the GET /label-mappings endpoint.

The issue is that each entry in the map is included with a duplicate key 
("entry").  Some JSON parsers will choke on the error, and some will 
quietly drop the duplicates.  I've filed YARN-7505 to address the issue.
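
For anyone who wants to see the two parser behaviors locally, here is a minimal Python sketch. Python is just an illustration choice, and the `legacy` string, `reject_duplicates`, and `parse_strict` are made-up names, not anything in YARN. It shows the standard `json` module quietly keeping only the last duplicate, and a strict variant (built on `object_pairs_hook`) that rejects the document instead.

```python
import json

# Fragment shaped like the current output: every map entry reuses the key "entry".
legacy = (
    '{"entry": {"key": "memory-mb", "value": "11225"},'
    ' "entry": {"key": "vcores", "value": "5"}}'
)

# Python's json module accepts duplicate keys and keeps only the last one.
lenient = json.loads(legacy)


def reject_duplicates(pairs):
    """object_pairs_hook that raises instead of dropping repeated keys."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def parse_strict(text):
    """Parse JSON, failing on any object that repeats a member name."""
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

With the lenient call, the memory-mb entry is simply gone; `parse_strict(legacy)` raises `ValueError`.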

The solution is to replace the Jersey JSON serializer with the Jackson 
JSON serializer.  This change fixes the issue, but it changes the 
structure of the resulting JSON.  For example, without YARN-7505, 
hitting /apps might yield JSON that contains something like:

"resourceSecondsMap":{
   "entry":{"key":"memory-mb","value":"11225"},
   "entry":{"key":"vcores","value":"5"},
   "entry":{"key":"test","value":"0"},
   "entry":{"key":"test2","value":"0"}
}

With YARN-7505, we get:

"resourceSecondsMap": {
   "test2":0,
   "test":0,
   "memory-mb":11225,
   "vcores":5
}

The first example is obviously broken, so the second one is clearly 
better, but it's structurally different: the entries become direct 
members of the map, and the values change from strings to numbers.
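
As a rough idea of what a client would need in order to cope with both shapes during a transition, here is a hedged Python sketch. The helper names `_entries_to_map` and `read_resource_seconds` are hypothetical, and the sketch only handles maps with plain string keys. It uses `object_pairs_hook` so the repeated "entry" members are seen before any of them is dropped.

```python
import json


def _entries_to_map(pairs):
    """Collapse an object made only of repeated "entry" members into a dict."""
    legacy = bool(pairs) and all(
        k == "entry" and isinstance(v, dict) and isinstance(v.get("key"), str)
        for k, v in pairs
    )
    if legacy:
        return {v["key"]: v.get("value") for _, v in pairs}
    return dict(pairs)


def read_resource_seconds(text):
    """Return resourceSecondsMap as {name: int} for either JSON shape."""
    doc = json.loads(text, object_pairs_hook=_entries_to_map)
    # The old shape carries values as strings, the new one as numbers.
    return {name: int(value) for name, value in doc["resourceSecondsMap"].items()}
```

Both the old and the new fragment above, wrapped in an enclosing object, come out as the same `{"memory-mb": 11225, "vcores": 5, ...}` dictionary.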

For the GET /label-mappings endpoint, the keys of the map also have to 
be changed to simple strings because JSON doesn't allow for complex map 
keys.  So this:

"labelsToNodes":{
   "entry":{
     "key":{"name":"label1","exclusivity":"true"},
     "value":{"nodes":"localhost:63261"}
   }
}

becomes this:

"labelsToNodes":{
   "label1":{
     "nodes":["dhcp-10-16-0-181.pa.cloudera.com:63261"]
   }
}

The first one sucks and is invalid, but changing to the second one will 
break clients that are parsing the first one, especially if they're 
expecting to get the label exclusivity from this endpoint.
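
To make the client impact concrete, here is a small Python sketch of a reader that accepts either labelsToNodes shape. All names (`_keep_pairs`, `_as_list`, `read_labels_to_nodes`) are hypothetical, and reporting the exclusivity as `None` for the new shape is my assumption about how a client might flag the missing field, not something the patch defines.

```python
import json


def _keep_pairs(pairs):
    """Keep repeated "entry" members as a list; anything else becomes a dict."""
    if pairs and all(k == "entry" for k, _ in pairs):
        return [v for _, v in pairs]
    return dict(pairs)


def _as_list(value):
    """The old shape may emit a single node as a bare string."""
    return value if isinstance(value, list) else [value]


def read_labels_to_nodes(text):
    """Return {label: {"nodes": [...], "exclusivity": bool or None}}."""
    mapping = json.loads(text, object_pairs_hook=_keep_pairs)["labelsToNodes"]
    result = {}
    if isinstance(mapping, list):
        # Old shape: a list of {"key": {...label...}, "value": {...}} entries.
        for entry in mapping:
            label = entry["key"]
            result[label["name"]] = {
                "nodes": _as_list(entry["value"]["nodes"]),
                "exclusivity": label.get("exclusivity") == "true",
            }
    else:
        # New shape: label name -> {"nodes": [...]}; exclusivity is not available.
        for name, info in mapping.items():
            result[name] = {"nodes": _as_list(info["nodes"]), "exclusivity": None}
    return result
```

A client that needs the exclusivity would have to get it from somewhere else once it sees `None` here.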

Before I try to get YARN-7505 committed, I want to give the community a 
chance to voice any concerns about the change.  It's too late to get 
into 3.0.0, so we'd be looking at 3.0.1 and 3.1.0.

Feel free to comment here or on the JIRA directly.

Thanks,
Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org


Re: Changing the JSON Serializer

Posted by Daniel Templeton <da...@cloudera.com>.
Yeah, we're not going to be able to change the REST APIs without 
updating the version number and leaving the old version around for a 
while.  We should make sure that the fix makes future WS version revs 
easy (or at least easier than this one).

Daniel

On 11/17/17 3:56 PM, Eric Yang wrote:
> This means YARN-7505 can have /ws/v2/* running in parallel with /ws/v1/* for the 3.1 or 3.0.1 release, and deprecate /ws/v1/*.  In version 4, we drop /ws/v1/*, right?
> I think this plan can work.
>
> Regards,
> Eric



Re: Changing the JSON Serializer

Posted by Daniel Templeton <da...@cloudera.com>.
Oh, yeah, I agree.  Adding the new version would be a 3.1 thing.

Daniel

On 11/17/17 4:48 PM, Sean Busbey wrote:
> I personally wouldn't be comfortable adding an API version in a maintenance
> release; it's essentially adding a feature. But I'm not set to be the RM
> for 3.0.1. :)



Re: Changing the JSON Serializer

Posted by Sean Busbey <bu...@cloudera.com>.
I personally wouldn't be comfortable adding an API version in a maintenance
release; it's essentially adding a feature. But I'm not set to be the RM
for 3.0.1. :)

On Nov 17, 2017 17:56, "Eric Yang" <ey...@hortonworks.com> wrote:

> This means YARN-7505 can have /ws/v2/* running in parallel with /ws/v1/* for
> the 3.1 or 3.0.1 release, and deprecate /ws/v1/*.  In version 4, we drop
> /ws/v1/*, right?
>
> I think this plan can work.
>
>
>
> Regards,
>
> Eric

Re: Changing the JSON Serializer

Posted by Eric Yang <ey...@hortonworks.com>.
This means YARN-7505 can have /ws/v2/* running in parallel with /ws/v1/* for the 3.1 or 3.0.1 release, and deprecate /ws/v1/*.  In version 4, we drop /ws/v1/*, right?
I think this plan can work.

Regards,
Eric


From: Sean Busbey <bu...@cloudera.com>
Date: Friday, November 17, 2017 at 3:08 PM
To: Eric Yang <ey...@hortonworks.com>
Cc: "yarn-dev@hadoop.apache.org" <ya...@hadoop.apache.org>
Subject: Re: Changing the JSON Serializer

3.0.0 RCs are in progress already. Bit late to make a breaking change.

The REST APIs are versioned for a reason. So long as we're outputting these changes on a new version, this change should be fine on whatever branch we like. When we open up for changes to go in the next major release we can drop the v1 APIs.


Re: Changing the JSON Serializer

Posted by Sean Busbey <bu...@cloudera.com>.
3.0.0 RCs are in progress already. Bit late to make a breaking change.

The REST APIs are versioned for a reason. So long as we're outputting these
changes on a new version, this change should be fine on whatever branch we
like. When we open up for changes to go in the next major release we can
drop the v1 APIs.

On Fri, Nov 17, 2017 at 11:41 AM, Eric Yang <ey...@hortonworks.com> wrote:

> +1 on changing the JSON serializer.  Hadoop was an early adopter of
> Jersey, but a proper JSON deserializer for Jackson didn’t appear until mid
> 2016, after the Jackson 2.5 release.  Hence, some early versions of the
> Hadoop REST API were not JSON compliant.  Hadoop more or less follows
> semantic versioning, so it would be best to make this change in the 3.0
> release.  This will reduce some of the baggage carried forward from
> Hadoop 2.x.
> I think the community will respond positively to this change.  Thank you
> for bringing this up.
>
> regards,
> Eric


-- 
busbey

Re: Changing the JSON Serializer

Posted by Eric Yang <ey...@hortonworks.com>.
+1 on changing the JSON serializer.  Hadoop was an early adopter of Jersey, but a proper JSON deserializer for Jackson didn’t appear until mid 2016, after the Jackson 2.5 release.  Hence, some early versions of the Hadoop REST API were not JSON compliant.  Hadoop more or less follows semantic versioning, so it would be best to make this change in the 3.0 release.  This will reduce some of the baggage carried forward from Hadoop 2.x.
I think the community will respond positively to this change.  Thank you for bringing this up.

regards,
Eric

On 11/16/17, 10:02 PM, "Sean Busbey" <bu...@cloudera.com> wrote:

    The REST APIs are covered under the compatibility guidelines[1]. Presuming
    these are under a new API version number, it's not clear to me from the
    existing guidelines if adding one is okay in a maintenance release. It
    sounds surprising to me.
    
    [1]:
    https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#REST_APIs
    
    On Wed, Nov 15, 2017 at 9:23 PM, Daniel Templeton <da...@cloudera.com>
    wrote:
    
    > Looks like our REST endpoints return malformed JSON for any DAO that
    > includes a Map.  That includes:
    >
    > * the resourceSecondsMap and preemptedResourceSecondsMap entries in all
    > the GET /apps/* endpoints,
    > * the operationsInfo entry in the GET /scheduler endpoint for capacity
    > scheduler,
    > * the local_resources, environment, and acls entries in the POST /apps
    > endpoint, and
    > * the labelsToNodes entry in the GET /label-mappings endpoint.
    >
    > The issue is that each entry in the map is included with a duplicate key
    > ("entry").  Some JSON parsers will choke on the error, and some will
    > quietly drop the duplicates.  I've filed YARN-7505 to address the issue.
    >
    > The solution is to replace the Jersey JSON serializer with the Jackson
    > JSON serializer.  This change fixes the issue, but it changes the structure
    > of the resulting JSON.  For example, without YARN-7505, hitting /apps might
    > yield JSON that contains something like:
    >
    > "resourceSecondsMap":{
    >   "entry":{"key":"memory-mb","value":"11225"},
    >   "entry":{"key":"vcores","value":"5"}
    >   "entry":{"key":"test","value":"0"}
    >   "entry":{"key":"test2","value":"0"}
    > }
    >
    > With YARN-7505, we get:
    >
    > "resourceSecondsMap": {
    >   "test2":0,
    >   "test":0,
    >   "memory-mb":11225,
    >   "vcores":5
    > }
    >
    > The first example is obviously broken, so the second one is clearly
    > better, but it's structurally different.
    >
    > For the GET /label-mappings endpoint, the keys of the map also have to be
    > changed to simple strings because JSON doesn't allow for complex map keys.
    > So this:
    >
    > "labelsToNodes":{
    >   "entry":{
    >     "key":{"name":"label1","exclusivity":"true"},
    >     "value":{"nodes":"localhost:63261"}
    >   }
    > }
    >
    > becomes this:
    >
    > "labelsToNodes":{
    >   "label1":{
    >     "nodes":["dhcp-10-16-0-181.pa.cloudera.com:63261"]
    >   }
    > }
    >
    > The first one sucks and is invalid, but changing to the second one will
    > break clients that are parsing the first one, especially if they're
    > expecting to get the label exclusivity from this endpoint.
    >
    > Before I try to get YARN-7505 committed, I want to give the community a
    > chance to voice any concerns about the change.  It's too late to get into
    > 3.0.0, so we'd be looking at 3.0.1 and 3.1.0.
    >
    > Feel free to comment here or on the JIRA directly.
    >
    > Thanks,
    > Daniel
    
    
    -- 
    busbey
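
To make the "some will choke, some will quietly drop the duplicates" point from the quoted message concrete, here is a minimal sketch using Python's standard json module. The payload is adapted from the example in the thread, with commas added so that the duplicate "entry" keys are the only problem left. The names OLD_PAYLOAD and reject_duplicates are made up for illustration and are not part of any YARN client.

```python
import json

# Old-style payload, adapted from the thread's example (commas added so
# that the duplicate keys are the only issue).
OLD_PAYLOAD = """
{"resourceSecondsMap": {
   "entry": {"key": "memory-mb", "value": "11225"},
   "entry": {"key": "vcores", "value": "5"},
   "entry": {"key": "test", "value": "0"},
   "entry": {"key": "test2", "value": "0"}
}}
"""


def reject_duplicates(pairs):
    """object_pairs_hook that raises on a repeated key instead of dropping it."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


# Default behaviour: the parser keeps only the last value for a repeated key,
# so three of the four resources silently disappear.
lenient = json.loads(OLD_PAYLOAD)
print(lenient["resourceSecondsMap"])

# Strict behaviour: a client that checks for duplicates fails outright.
try:
    json.loads(OLD_PAYLOAD, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
print(strict_error)
```

Either outcome is bad for a client: the lenient parse loses data without any warning, and the strict parse cannot read the response at all.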
    


Re: Changing the JSON Serializer

Posted by Sean Busbey <bu...@cloudera.com>.
The REST APIs are covered under the compatibility guidelines[1]. Presuming
these changes would go in under a new API version number, it's not clear to
me from the existing guidelines whether adding one is okay in a maintenance
release. It would surprise me if it were.

[1]:
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#REST_APIs

On Wed, Nov 15, 2017 at 9:23 PM, Daniel Templeton <da...@cloudera.com>
wrote:

> Looks like our REST endpoints return malformed JSON for any DAO that
> includes a Map.  That includes:
>
> * the resourceSecondsMap and preemptedResourceSecondsMap entries in all
> the GET /apps/* endpoints,
> * the operationsInfo entry in the GET /scheduler endpoint for capacity
> scheduler,
> * the local_resources, environment, and acls entries in the POST /apps
> endpoint, and
> * the labelsToNodes entry in the GET /label-mappings endpoint.
>
> The issue is that each entry in the map is included with a duplicate key
> ("entry").  Some JSON parsers will choke on the error, and some will
> quietly drop the duplicates.  I've filed YARN-7505 to address the issue.
>
> The solution is to replace the Jersey JSON serializer with the Jackson
> JSON serializer.  This change fixes the issue, but it changes the structure
> of the resulting JSON.  For example, without YARN-7505, hitting /apps might
> yield JSON that contains something like:
>
> "resourceSecondsMap":{
>   "entry":{"key":"memory-mb","value":"11225"},
>   "entry":{"key":"vcores","value":"5"},
>   "entry":{"key":"test","value":"0"},
>   "entry":{"key":"test2","value":"0"}
> }
>
> With YARN-7505, we get:
>
> "resourceSecondsMap": {
>   "test2":0,
>   "test":0,
>   "memory-mb":11225,
>   "vcores":5
> }
>
> The first example is obviously broken, so the second one is clearly
> better, but it's structurally different.
>
> For the GET /label-mappings endpoint, the keys of the map also have to be
> changed to simple strings because JSON doesn't allow for complex map keys.
> So this:
>
> "labelsToNodes":{
>   "entry":{
>     "key":{"name":"label1","exclusivity":"true"},
>     "value":{"nodes":"localhost:63261"}
>   }
> }
>
> becomes this:
>
> "labelsToNodes":{
>   "label1":{
>     "nodes":["dhcp-10-16-0-181.pa.cloudera.com:63261"]
>   }
> }
>
> The first one sucks and is invalid, but changing to the second one will
> break clients that are parsing the first one, especially if they're
> expecting to get the label exclusivity from this endpoint.
>
> Before I try to get YARN-7505 committed, I want to give the community a
> chance to voice any concerns about the change.  It's too late to get into
> 3.0.0, so we'd be looking at 3.0.1 and 3.1.0.
>
> Feel free to comment here or on the JIRA directly.
>
> Thanks,
> Daniel


-- 
busbey
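
For clients worried about the structural change discussed above, one possible approach is to accept both shapes during a transition. The following is only a sketch under assumptions: it uses Python's standard json module, it handles the resourceSecondsMap case with simple string keys (not the complex label keys from /label-mappings), and the names Pairs, to_plain, normalize_map, and read_resource_seconds are hypothetical helpers, not anything shipped with Hadoop.

```python
import json


class Pairs(list):
    """A JSON object kept as an ordered list of (key, value) pairs,
    so that repeated "entry" keys survive parsing."""


def to_plain(node):
    """Recursively turn Pairs back into dicts (last duplicate wins)."""
    if isinstance(node, Pairs):
        return {key: to_plain(value) for key, value in node}
    if isinstance(node, list):
        return [to_plain(item) for item in node]
    return node


def normalize_map(node):
    """Return a plain dict for a map field in either the old or the new shape."""
    is_old_shape = len(node) > 0 and all(
        key == "entry"
        and isinstance(value, Pairs)
        and {k for k, _ in value} == {"key", "value"}
        for key, value in node
    )
    if not is_old_shape:
        return to_plain(node)
    result = {}
    for _, entry in node:
        fields = to_plain(entry)
        # Assumes a simple string key; a complex key object would need
        # its own mapping rule.
        result[fields["key"]] = fields["value"]
    return result


def read_resource_seconds(text):
    """Parse a response body and return resourceSecondsMap as {name: int}."""
    doc = json.loads(text, object_pairs_hook=Pairs)
    raw = dict(doc)["resourceSecondsMap"]
    # The old format sent values as strings and the new one as numbers.
    return {name: int(value) for name, value in normalize_map(raw).items()}
```

A client written this way would read the same values from a pre-YARN-7505 and a post-YARN-7505 response. It would still have no way to recover the label exclusivity from the new /label-mappings output, since that field is no longer in the payload.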