You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-dev@hadoop.apache.org by Eric Badger <eb...@verizonmedia.com.INVALID> on 2019/09/24 16:23:07 UTC

Incompatible changes between branch-2.8 and branch-2.9

We (Verizon Media) are currently moving towards upgrading our clusters from
our internal fork of branch-2.8 to an internal fork of branch-2. During
this process, we have found multiple incompatible changes in protobufs
between branch-2.8 and branch-2. These incompatibilities were all
introduced between branch-2.8 and branch-2.9. I did a git diff over all
.proto files across the branch-2.8 and branch-2.9 and found 3 instances of
incompatibilities from 3 separate commits. All of the incompatibilities are
in yarn_protos.proto


I would like to discuss how to fix these incompatible changes. Otherwise,
rolling upgrades will not be supported between branch-2.8 (or below) and
branch-2.9 (or beyond). We could revert the incompatible changes, but then
the new releases would be incompatible with the releases that have these
incompatible changes. If we do nothing, then rolling upgrades won't work
between 2.8- and 2.9+.


Thanks,


Eric


-------------------------------------------------------------------


git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')


https://issues.apache.org/jira/browse/YARN-6616

   - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch

@@ -211,7 +245,20 @@ message ApplicationReportProto {

   optional PriorityProto priority = 23;

   optional string appNodeLabelExpression = 24;

   optional string amNodeLabelExpression = 25;

-  optional int64 submitTime = 26;

+  repeated AppTimeoutsMapProto appTimeouts = 26;

+  optional int64 launchTime = 27;

+  optional int64 submitTime = 28;


https://issues.apache.org/jira/browse/YARN-6050

   - Trunk and branch-2 patches both change the protobuf type in the same
   way.

@@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {

   optional LogAggregationContextProto log_aggregation_context = 14;

   optional ReservationIdProto reservation_id = 15;

   optional string node_label_expression = 16;

-  optional ResourceRequestProto am_container_resource_request = 17;

+  repeated ResourceRequestProto am_container_resource_request = 17;

+  repeated ApplicationTimeoutMapProto application_timeouts = 18;


https://issues.apache.org/jira/browse/YARN-7813

   - Trunk (applied through branch-3.1) and branch-3.0 (applied through
   branch-2.9) patches differ from branch-2.8 patch

@@ -425,7 +501,21 @@ message QueueInfoProto {

   optional string defaultNodeLabelExpression = 9;

   optional QueueStatisticsProto queueStatistics = 10;

   optional bool preemptionDisabled = 11;

-  optional bool intraQueuePreemptionDisabled = 12;

+  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;

+  optional bool intraQueuePreemptionDisabled = 13;

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
* For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
incompatible with this strategy? It should be OK to remove/add optional
fields (removing the field with id 12, and adding the field with id 13)
  - Ahhhh I misunderstood. I was thinking that the field was overwritten in
branch-2.8 as well. Yea I think that approach will be fine.

On Wed, Sep 25, 2019 at 2:31 AM Robert Kanter <rk...@apache.org> wrote:

> >
> > *   For YARN-6050, there's a bit here:
> > https://developers.google.com/protocol-buffers/docs/proto that says
> > "optional is compatible with repeated", so I think we should be OK there.
> >   - Optional is compatible with repeatable over the wire such that
> > protobuf won't blow up, but does that actually mean that it's compatible
> in
> > this case? If it's expecting an optional and gets a repeated, it's going
> to
> > drop everything except for the last value. I don't know enough about
> > YARN-6050 to say if this will be ok or not.
>
>
> It's been a while since I looked into this, but I think it should be okay.
>  If an older client (using optional) sends the message to a newer server
> (using repeated), then there will never be more than one value for the
> field.  The server puts these into a list, so the list would simply have a
> single value in it.  The server's logic should be able to handle a single
> valued list here because (a) IIRC we wanted to make sure compatibility
> wasn't a problem (Cloudera supported rolling upgrades between CDH 5.x so
> this was important) and (b) sending a single resource request, even in a
> newer client, is a still a valid thing to do.
> If a newer client (using repeated) sends the message to an older server
> (using optional), I'm not sure what will happen.  My guess is that it will
> drop the extra values (though I wonder if it will keep the first or last
> value...).  In any case, I believe most clients will only send the one
> value - in order for a client to send multiple values, you'd have to
> specify some additional MR configs (see MAPREDUCE-6871).  IIRC, there's
> also a SPARK JIRA similar to MAPREDUCE-6871, but I can't find it right now.
>
> - Robert
>
> On Tue, Sep 24, 2019 at 9:49 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > - I've created YARN-9855 and uploaded patches to fix YARN-6616 in
> > branch-2.8 and branch-2.7.
> > - For YARN-6050, not sure either. Robert/Wangda, can you comment on
> > YARN-6050 compatibility?
> > - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
> > incompatible with this strategy? It should be OK to remove/add optional
> > fields (removing the field with id 12, and adding the field with id 13).
> > The difficulties I see here are, we would have to leave id 12 blank in
> > 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
> > would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
> > would still work IIUC.
> >
> > Jonathan Hung
> >
> >
> > On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
> > wrote:
> >
> >> *   For YARN-6616, for branch-2.8 and below, it was only committed to
> >> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we
> can
> >> revert YARN-6616 from branch-2.7 and branch-2.8.
> >>   - This seems reasonable. Since we haven't released anything, it should
> >> be no issue to change the 2.7/2.8 protobuf field to have the same value
> as
> >> 2.9+
> >>
> >> *   For YARN-6050, there's a bit here:
> >> https://developers.google.com/protocol-buffers/docs/proto that says
> >> "optional is compatible with repeated", so I think we should be OK
> there.
> >>   - Optional is compatible with repeatable over the wire such that
> >> protobuf won't blow up, but does that actually mean that it's
> compatible in
> >> this case? If it's expecting an optional and gets a repeated, it's
> going to
> >> drop everything except for the last value. I don't know enough about
> >> YARN-6050 to say if this will be ok or not.
> >>
> >> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> >> 2.8.5 to a 2.9+ version will be an issue. One option could be to move
> the
> >> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8,
> then
> >> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> >> release this), then upgrade from 2.8.6 to 2.9+.
> >>   - I'm ok with this, but it should be noted that the upgrade from
> >> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
> >> upgrade. So this would cause some pain to anybody with clusters on those
> >> versions.
> >>
> >> Eric
> >>
> >> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Sorry, let me edit my first point. We can just create addendums for
> >>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to
> the
> >>> correct id 28. We don’t need to revert YARN-6616 from these branches
> >>> completely.
> >>>
> >>> Jonathan
> >>>
> >>> ________________________________
> >>> From: Jonathan Hung <jy...@gmail.com>
> >>> Sent: Tuesday, September 24, 2019 11:38 AM
> >>> To: Eric Badger
> >>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
> >>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
> >>>
> >>> Hi Eric, thanks for the investigation.
> >>>
> >>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
> >>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we
> can
> >>> revert YARN-6616 from branch-2.7 and branch-2.8.
> >>>   *   For YARN-6050, there's a bit here:
> >>> https://developers.google.com/protocol-buffers/docs/proto that says
> >>> "optional is compatible with repeated", so I think we should be OK
> there.
> >>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> >>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move
> the
> >>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8,
> then
> >>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> >>> release this), then upgrade from 2.8.6 to 2.9+.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <ebadger@verizonmedia.com
> .invalid>
> >>> wrote:
> >>> We (Verizon Media) are currently moving towards upgrading our clusters
> >>> from
> >>> our internal fork of branch-2.8 to an internal fork of branch-2. During
> >>> this process, we have found multiple incompatible changes in protobufs
> >>> between branch-2.8 and branch-2. These incompatibilities were all
> >>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> >>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances
> >>> of
> >>> incompatibilities from 3 separate commits. All of the incompatibilities
> >>> are
> >>> in yarn_protos.proto
> >>>
> >>>
> >>> I would like to discuss how to fix these incompatible changes.
> Otherwise,
> >>> rolling upgrades will not be supported between branch-2.8 (or below)
> and
> >>> branch-2.9 (or beyond). We could revert the incompatible changes, but
> >>> then
> >>> the new releases would be incompatible with the releases that have
> these
> >>> incompatible changes. If we do nothing, then rolling upgrades won't
> work
> >>> between 2.8- and 2.9+.
> >>>
> >>>
> >>> Thanks,
> >>>
> >>>
> >>> Eric
> >>>
> >>>
> >>> -------------------------------------------------------------------
> >>>
> >>>
> >>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-6616
> >>>
> >>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
> >>> patch
> >>>
> >>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
> >>>
> >>>    optional PriorityProto priority = 23;
> >>>
> >>>    optional string appNodeLabelExpression = 24;
> >>>
> >>>    optional string amNodeLabelExpression = 25;
> >>>
> >>> -  optional int64 submitTime = 26;
> >>>
> >>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
> >>>
> >>> +  optional int64 launchTime = 27;
> >>>
> >>> +  optional int64 submitTime = 28;
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-6050
> >>>
> >>>    - Trunk and branch-2 patches both change the protobuf type in the
> same
> >>>    way.
> >>>
> >>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
> >>>
> >>>    optional LogAggregationContextProto log_aggregation_context = 14;
> >>>
> >>>    optional ReservationIdProto reservation_id = 15;
> >>>
> >>>    optional string node_label_expression = 16;
> >>>
> >>> -  optional ResourceRequestProto am_container_resource_request = 17;
> >>>
> >>> +  repeated ResourceRequestProto am_container_resource_request = 17;
> >>>
> >>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-7813
> >>>
> >>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
> >>>    branch-2.9) patches differ from branch-2.8 patch
> >>>
> >>> @@ -425,7 +501,21 @@ message QueueInfoProto {
> >>>
> >>>    optional string defaultNodeLabelExpression = 9;
> >>>
> >>>    optional QueueStatisticsProto queueStatistics = 10;
> >>>
> >>>    optional bool preemptionDisabled = 11;
> >>>
> >>> -  optional bool intraQueuePreemptionDisabled = 12;
> >>>
> >>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
> >>>
> >>> +  optional bool intraQueuePreemptionDisabled = 13;
> >>>
> >>
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
* For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
incompatible with this strategy? It should be OK to remove/add optional
fields (removing the field with id 12, and adding the field with id 13)
  - Ahhhh I misunderstood. I was thinking that the field was overwritten in
branch-2.8 as well. Yea I think that approach will be fine.

On Wed, Sep 25, 2019 at 2:31 AM Robert Kanter <rk...@apache.org> wrote:

> >
> > *   For YARN-6050, there's a bit here:
> > https://developers.google.com/protocol-buffers/docs/proto that says
> > "optional is compatible with repeated", so I think we should be OK there.
> >   - Optional is compatible with repeatable over the wire such that
> > protobuf won't blow up, but does that actually mean that it's compatible
> in
> > this case? If it's expecting an optional and gets a repeated, it's going
> to
> > drop everything except for the last value. I don't know enough about
> > YARN-6050 to say if this will be ok or not.
>
>
> It's been a while since I looked into this, but I think it should be okay.
>  If an older client (using optional) sends the message to a newer server
> (using repeated), then there will never be more than one value for the
> field.  The server puts these into a list, so the list would simply have a
> single value in it.  The server's logic should be able to handle a single
> valued list here because (a) IIRC we wanted to make sure compatibility
> wasn't a problem (Cloudera supported rolling upgrades between CDH 5.x so
> this was important) and (b) sending a single resource request, even in a
> newer client, is a still a valid thing to do.
> If a newer client (using repeated) sends the message to an older server
> (using optional), I'm not sure what will happen.  My guess is that it will
> drop the extra values (though I wonder if it will keep the first or last
> value...).  In any case, I believe most clients will only send the one
> value - in order for a client to send multiple values, you'd have to
> specify some additional MR configs (see MAPREDUCE-6871).  IIRC, there's
> also a SPARK JIRA similar to MAPREDUCE-6871, but I can't find it right now.
>
> - Robert
>
> On Tue, Sep 24, 2019 at 9:49 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > - I've created YARN-9855 and uploaded patches to fix YARN-6616 in
> > branch-2.8 and branch-2.7.
> > - For YARN-6050, not sure either. Robert/Wangda, can you comment on
> > YARN-6050 compatibility?
> > - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
> > incompatible with this strategy? It should be OK to remove/add optional
> > fields (removing the field with id 12, and adding the field with id 13).
> > The difficulties I see here are, we would have to leave id 12 blank in
> > 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
> > would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
> > would still work IIUC.
> >
> > Jonathan Hung
> >
> >
> > On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
> > wrote:
> >
> >> *   For YARN-6616, for branch-2.8 and below, it was only committed to
> >> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we
> can
> >> revert YARN-6616 from branch-2.7 and branch-2.8.
> >>   - This seems reasonable. Since we haven't released anything, it should
> >> be no issue to change the 2.7/2.8 protobuf field to have the same value
> as
> >> 2.9+
> >>
> >> *   For YARN-6050, there's a bit here:
> >> https://developers.google.com/protocol-buffers/docs/proto that says
> >> "optional is compatible with repeated", so I think we should be OK
> there.
> >>   - Optional is compatible with repeatable over the wire such that
> >> protobuf won't blow up, but does that actually mean that it's
> compatible in
> >> this case? If it's expecting an optional and gets a repeated, it's
> going to
> >> drop everything except for the last value. I don't know enough about
> >> YARN-6050 to say if this will be ok or not.
> >>
> >> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> >> 2.8.5 to a 2.9+ version will be an issue. One option could be to move
> the
> >> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8,
> then
> >> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> >> release this), then upgrade from 2.8.6 to 2.9+.
> >>   - I'm ok with this, but it should be noted that the upgrade from
> >> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
> >> upgrade. So this would cause some pain to anybody with clusters on those
> >> versions.
> >>
> >> Eric
> >>
> >> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Sorry, let me edit my first point. We can just create addendums for
> >>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to
> the
> >>> correct id 28. We don’t need to revert YARN-6616 from these branches
> >>> completely.
> >>>
> >>> Jonathan
> >>>
> >>> ________________________________
> >>> From: Jonathan Hung <jy...@gmail.com>
> >>> Sent: Tuesday, September 24, 2019 11:38 AM
> >>> To: Eric Badger
> >>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
> >>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
> >>>
> >>> Hi Eric, thanks for the investigation.
> >>>
> >>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
> >>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we
> can
> >>> revert YARN-6616 from branch-2.7 and branch-2.8.
> >>>   *   For YARN-6050, there's a bit here:
> >>> https://developers.google.com/protocol-buffers/docs/proto that says
> >>> "optional is compatible with repeated", so I think we should be OK
> there.
> >>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> >>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move
> the
> >>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8,
> then
> >>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> >>> release this), then upgrade from 2.8.6 to 2.9+.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <ebadger@verizonmedia.com
> .invalid>
> >>> wrote:
> >>> We (Verizon Media) are currently moving towards upgrading our clusters
> >>> from
> >>> our internal fork of branch-2.8 to an internal fork of branch-2. During
> >>> this process, we have found multiple incompatible changes in protobufs
> >>> between branch-2.8 and branch-2. These incompatibilities were all
> >>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> >>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances
> >>> of
> >>> incompatibilities from 3 separate commits. All of the incompatibilities
> >>> are
> >>> in yarn_protos.proto
> >>>
> >>>
> >>> I would like to discuss how to fix these incompatible changes.
> Otherwise,
> >>> rolling upgrades will not be supported between branch-2.8 (or below)
> and
> >>> branch-2.9 (or beyond). We could revert the incompatible changes, but
> >>> then
> >>> the new releases would be incompatible with the releases that have
> these
> >>> incompatible changes. If we do nothing, then rolling upgrades won't
> work
> >>> between 2.8- and 2.9+.
> >>>
> >>>
> >>> Thanks,
> >>>
> >>>
> >>> Eric
> >>>
> >>>
> >>> -------------------------------------------------------------------
> >>>
> >>>
> >>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-6616
> >>>
> >>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
> >>> patch
> >>>
> >>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
> >>>
> >>>    optional PriorityProto priority = 23;
> >>>
> >>>    optional string appNodeLabelExpression = 24;
> >>>
> >>>    optional string amNodeLabelExpression = 25;
> >>>
> >>> -  optional int64 submitTime = 26;
> >>>
> >>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
> >>>
> >>> +  optional int64 launchTime = 27;
> >>>
> >>> +  optional int64 submitTime = 28;
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-6050
> >>>
> >>>    - Trunk and branch-2 patches both change the protobuf type in the
> same
> >>>    way.
> >>>
> >>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
> >>>
> >>>    optional LogAggregationContextProto log_aggregation_context = 14;
> >>>
> >>>    optional ReservationIdProto reservation_id = 15;
> >>>
> >>>    optional string node_label_expression = 16;
> >>>
> >>> -  optional ResourceRequestProto am_container_resource_request = 17;
> >>>
> >>> +  repeated ResourceRequestProto am_container_resource_request = 17;
> >>>
> >>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-7813
> >>>
> >>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
> >>>    branch-2.9) patches differ from branch-2.8 patch
> >>>
> >>> @@ -425,7 +501,21 @@ message QueueInfoProto {
> >>>
> >>>    optional string defaultNodeLabelExpression = 9;
> >>>
> >>>    optional QueueStatisticsProto queueStatistics = 10;
> >>>
> >>>    optional bool preemptionDisabled = 11;
> >>>
> >>> -  optional bool intraQueuePreemptionDisabled = 12;
> >>>
> >>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
> >>>
> >>> +  optional bool intraQueuePreemptionDisabled = 13;
> >>>
> >>
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
* For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
incompatible with this strategy? It should be OK to remove/add optional
fields (removing the field with id 12, and adding the field with id 13)
  - Ahhhh I misunderstood. I was thinking that the field was overwritten in
branch-2.8 as well. Yea I think that approach will be fine.

On Wed, Sep 25, 2019 at 2:31 AM Robert Kanter <rk...@apache.org> wrote:

> >
> > *   For YARN-6050, there's a bit here:
> > https://developers.google.com/protocol-buffers/docs/proto that says
> > "optional is compatible with repeated", so I think we should be OK there.
> >   - Optional is compatible with repeatable over the wire such that
> > protobuf won't blow up, but does that actually mean that it's compatible
> in
> > this case? If it's expecting an optional and gets a repeated, it's going
> to
> > drop everything except for the last value. I don't know enough about
> > YARN-6050 to say if this will be ok or not.
>
>
> It's been a while since I looked into this, but I think it should be okay.
>  If an older client (using optional) sends the message to a newer server
> (using repeated), then there will never be more than one value for the
> field.  The server puts these into a list, so the list would simply have a
> single value in it.  The server's logic should be able to handle a single
> valued list here because (a) IIRC we wanted to make sure compatibility
> wasn't a problem (Cloudera supported rolling upgrades between CDH 5.x so
> this was important) and (b) sending a single resource request, even in a
> newer client, is a still a valid thing to do.
> If a newer client (using repeated) sends the message to an older server
> (using optional), I'm not sure what will happen.  My guess is that it will
> drop the extra values (though I wonder if it will keep the first or last
> value...).  In any case, I believe most clients will only send the one
> value - in order for a client to send multiple values, you'd have to
> specify some additional MR configs (see MAPREDUCE-6871).  IIRC, there's
> also a SPARK JIRA similar to MAPREDUCE-6871, but I can't find it right now.
>
> - Robert
>
> On Tue, Sep 24, 2019 at 9:49 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > - I've created YARN-9855 and uploaded patches to fix YARN-6616 in
> > branch-2.8 and branch-2.7.
> > - For YARN-6050, not sure either. Robert/Wangda, can you comment on
> > YARN-6050 compatibility?
> > - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
> > incompatible with this strategy? It should be OK to remove/add optional
> > fields (removing the field with id 12, and adding the field with id 13).
> > The difficulties I see here are, we would have to leave id 12 blank in
> > 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
> > would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
> > would still work IIUC.
> >
> > Jonathan Hung
> >
> >
> > On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
> > wrote:
> >
> >> *   For YARN-6616, for branch-2.8 and below, it was only committed to
> >> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we
> can
> >> revert YARN-6616 from branch-2.7 and branch-2.8.
> >>   - This seems reasonable. Since we haven't released anything, it should
> >> be no issue to change the 2.7/2.8 protobuf field to have the same value
> as
> >> 2.9+
> >>
> >> *   For YARN-6050, there's a bit here:
> >> https://developers.google.com/protocol-buffers/docs/proto that says
> >> "optional is compatible with repeated", so I think we should be OK
> there.
> >>   - Optional is compatible with repeatable over the wire such that
> >> protobuf won't blow up, but does that actually mean that it's
> compatible in
> >> this case? If it's expecting an optional and gets a repeated, it's
> going to
> >> drop everything except for the last value. I don't know enough about
> >> YARN-6050 to say if this will be ok or not.
> >>
> >> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> >> 2.8.5 to a 2.9+ version will be an issue. One option could be to move
> the
> >> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8,
> then
> >> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> >> release this), then upgrade from 2.8.6 to 2.9+.
> >>   - I'm ok with this, but it should be noted that the upgrade from
> >> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
> >> upgrade. So this would cause some pain to anybody with clusters on those
> >> versions.
> >>
> >> Eric
> >>
> >> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Sorry, let me edit my first point. We can just create addendums for
> >>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to
> the
> >>> correct id 28. We don’t need to revert YARN-6616 from these branches
> >>> completely.
> >>>
> >>> Jonathan
> >>>
> >>> ________________________________
> >>> From: Jonathan Hung <jy...@gmail.com>
> >>> Sent: Tuesday, September 24, 2019 11:38 AM
> >>> To: Eric Badger
> >>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
> >>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
> >>>
> >>> Hi Eric, thanks for the investigation.
> >>>
> >>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
> >>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we
> can
> >>> revert YARN-6616 from branch-2.7 and branch-2.8.
> >>>   *   For YARN-6050, there's a bit here:
> >>> https://developers.google.com/protocol-buffers/docs/proto that says
> >>> "optional is compatible with repeated", so I think we should be OK
> there.
> >>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> >>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move
> the
> >>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8,
> then
> >>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> >>> release this), then upgrade from 2.8.6 to 2.9+.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <ebadger@verizonmedia.com
> .invalid>
> >>> wrote:
> >>> We (Verizon Media) are currently moving towards upgrading our clusters
> >>> from
> >>> our internal fork of branch-2.8 to an internal fork of branch-2. During
> >>> this process, we have found multiple incompatible changes in protobufs
> >>> between branch-2.8 and branch-2. These incompatibilities were all
> >>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> >>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances
> >>> of
> >>> incompatibilities from 3 separate commits. All of the incompatibilities
> >>> are
> >>> in yarn_protos.proto
> >>>
> >>>
> >>> I would like to discuss how to fix these incompatible changes.
> Otherwise,
> >>> rolling upgrades will not be supported between branch-2.8 (or below)
> and
> >>> branch-2.9 (or beyond). We could revert the incompatible changes, but
> >>> then
> >>> the new releases would be incompatible with the releases that have
> these
> >>> incompatible changes. If we do nothing, then rolling upgrades won't
> work
> >>> between 2.8- and 2.9+.
> >>>
> >>>
> >>> Thanks,
> >>>
> >>>
> >>> Eric
> >>>
> >>>
> >>> -------------------------------------------------------------------
> >>>
> >>>
> >>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-6616
> >>>
> >>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
> >>> patch
> >>>
> >>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
> >>>
> >>>    optional PriorityProto priority = 23;
> >>>
> >>>    optional string appNodeLabelExpression = 24;
> >>>
> >>>    optional string amNodeLabelExpression = 25;
> >>>
> >>> -  optional int64 submitTime = 26;
> >>>
> >>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
> >>>
> >>> +  optional int64 launchTime = 27;
> >>>
> >>> +  optional int64 submitTime = 28;
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-6050
> >>>
> >>>    - Trunk and branch-2 patches both change the protobuf type in the
> same
> >>>    way.
> >>>
> >>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
> >>>
> >>>    optional LogAggregationContextProto log_aggregation_context = 14;
> >>>
> >>>    optional ReservationIdProto reservation_id = 15;
> >>>
> >>>    optional string node_label_expression = 16;
> >>>
> >>> -  optional ResourceRequestProto am_container_resource_request = 17;
> >>>
> >>> +  repeated ResourceRequestProto am_container_resource_request = 17;
> >>>
> >>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-7813
> >>>
> >>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
> >>>    branch-2.9) patches differ from branch-2.8 patch
> >>>
> >>> @@ -425,7 +501,21 @@ message QueueInfoProto {
> >>>
> >>>    optional string defaultNodeLabelExpression = 9;
> >>>
> >>>    optional QueueStatisticsProto queueStatistics = 10;
> >>>
> >>>    optional bool preemptionDisabled = 11;
> >>>
> >>> -  optional bool intraQueuePreemptionDisabled = 12;
> >>>
> >>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
> >>>
> >>> +  optional bool intraQueuePreemptionDisabled = 13;
> >>>
> >>
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
* For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
incompatible with this strategy? It should be OK to remove/add optional
fields (removing the field with id 12, and adding the field with id 13)
  - Ahhhh I misunderstood. I was thinking that the field was overwritten in
branch-2.8 as well. Yea I think that approach will be fine.

On Wed, Sep 25, 2019 at 2:31 AM Robert Kanter <rk...@apache.org> wrote:

> >
> > *   For YARN-6050, there's a bit here:
> > https://developers.google.com/protocol-buffers/docs/proto that says
> > "optional is compatible with repeated", so I think we should be OK there.
> >   - Optional is compatible with repeatable over the wire such that
> > protobuf won't blow up, but does that actually mean that it's compatible
> in
> > this case? If it's expecting an optional and gets a repeated, it's going
> to
> > drop everything except for the last value. I don't know enough about
> > YARN-6050 to say if this will be ok or not.
>
>
> It's been a while since I looked into this, but I think it should be okay.
>  If an older client (using optional) sends the message to a newer server
> (using repeated), then there will never be more than one value for the
> field.  The server puts these into a list, so the list would simply have a
> single value in it.  The server's logic should be able to handle a single
> valued list here because (a) IIRC we wanted to make sure compatibility
> wasn't a problem (Cloudera supported rolling upgrades between CDH 5.x so
> this was important) and (b) sending a single resource request, even in a
> newer client, is a still a valid thing to do.
> If a newer client (using repeated) sends the message to an older server
> (using optional), I'm not sure what will happen.  My guess is that it will
> drop the extra values (though I wonder if it will keep the first or last
> value...).  In any case, I believe most clients will only send the one
> value - in order for a client to send multiple values, you'd have to
> specify some additional MR configs (see MAPREDUCE-6871).  IIRC, there's
> also a SPARK JIRA similar to MAPREDUCE-6871, but I can't find it right now.
>
> - Robert
>
> On Tue, Sep 24, 2019 at 9:49 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > - I've created YARN-9855 and uploaded patches to fix YARN-6616 in
> > branch-2.8 and branch-2.7.
> > - For YARN-6050, not sure either. Robert/Wangda, can you comment on
> > YARN-6050 compatibility?
> > - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
> > incompatible with this strategy? It should be OK to remove/add optional
> > fields (removing the field with id 12, and adding the field with id 13).
> > The difficulties I see here are, we would have to leave id 12 blank in
> > 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
> > would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
> > would still work IIUC.
> >
> > Jonathan Hung
> >
> >
> > On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
> > wrote:
> >
> >> *   For YARN-6616, for branch-2.8 and below, it was only committed to
> >> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we
> can
> >> revert YARN-6616 from branch-2.7 and branch-2.8.
> >>   - This seems reasonable. Since we haven't released anything, it should
> >> be no issue to change the 2.7/2.8 protobuf field to have the same value
> as
> >> 2.9+
> >>
> >> *   For YARN-6050, there's a bit here:
> >> https://developers.google.com/protocol-buffers/docs/proto that says
> >> "optional is compatible with repeated", so I think we should be OK
> there.
> >>   - Optional is compatible with repeatable over the wire such that
> >> protobuf won't blow up, but does that actually mean that it's
> compatible in
> >> this case? If it's expecting an optional and gets a repeated, it's
> going to
> >> drop everything except for the last value. I don't know enough about
> >> YARN-6050 to say if this will be ok or not.
> >>
> >> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> >> 2.8.5 to a 2.9+ version will be an issue. One option could be to move
> the
> >> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8,
> then
> >> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> >> release this), then upgrade from 2.8.6 to 2.9+.
> >>   - I'm ok with this, but it should be noted that the upgrade from
> >> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
> >> upgrade. So this would cause some pain to anybody with clusters on those
> >> versions.
> >>
> >> Eric
> >>
> >> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Sorry, let me edit my first point. We can just create addendums for
> >>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to
> the
> >>> correct id 28. We don’t need to revert YARN-6616 from these branches
> >>> completely.
> >>>
> >>> Jonathan
> >>>
> >>> ________________________________
> >>> From: Jonathan Hung <jy...@gmail.com>
> >>> Sent: Tuesday, September 24, 2019 11:38 AM
> >>> To: Eric Badger
> >>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
> >>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
> >>>
> >>> Hi Eric, thanks for the investigation.
> >>>
> >>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
> >>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we
> can
> >>> revert YARN-6616 from branch-2.7 and branch-2.8.
> >>>   *   For YARN-6050, there's a bit here:
> >>> https://developers.google.com/protocol-buffers/docs/proto that says
> >>> "optional is compatible with repeated", so I think we should be OK
> there.
> >>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> >>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move
> the
> >>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8,
> then
> >>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> >>> release this), then upgrade from 2.8.6 to 2.9+.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <ebadger@verizonmedia.com
> .invalid>
> >>> wrote:
> >>> We (Verizon Media) are currently moving towards upgrading our clusters
> >>> from
> >>> our internal fork of branch-2.8 to an internal fork of branch-2. During
> >>> this process, we have found multiple incompatible changes in protobufs
> >>> between branch-2.8 and branch-2. These incompatibilities were all
> >>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> >>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances
> >>> of
> >>> incompatibilities from 3 separate commits. All of the incompatibilities
> >>> are
> >>> in yarn_protos.proto
> >>>
> >>>
> >>> I would like to discuss how to fix these incompatible changes.
> Otherwise,
> >>> rolling upgrades will not be supported between branch-2.8 (or below)
> and
> >>> branch-2.9 (or beyond). We could revert the incompatible changes, but
> >>> then
> >>> the new releases would be incompatible with the releases that have
> these
> >>> incompatible changes. If we do nothing, then rolling upgrades won't
> work
> >>> between 2.8- and 2.9+.
> >>>
> >>>
> >>> Thanks,
> >>>
> >>>
> >>> Eric
> >>>
> >>>
> >>> -------------------------------------------------------------------
> >>>
> >>>
> >>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-6616
> >>>
> >>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
> >>> patch
> >>>
> >>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
> >>>
> >>>    optional PriorityProto priority = 23;
> >>>
> >>>    optional string appNodeLabelExpression = 24;
> >>>
> >>>    optional string amNodeLabelExpression = 25;
> >>>
> >>> -  optional int64 submitTime = 26;
> >>>
> >>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
> >>>
> >>> +  optional int64 launchTime = 27;
> >>>
> >>> +  optional int64 submitTime = 28;
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-6050
> >>>
> >>>    - Trunk and branch-2 patches both change the protobuf type in the
> same
> >>>    way.
> >>>
> >>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
> >>>
> >>>    optional LogAggregationContextProto log_aggregation_context = 14;
> >>>
> >>>    optional ReservationIdProto reservation_id = 15;
> >>>
> >>>    optional string node_label_expression = 16;
> >>>
> >>> -  optional ResourceRequestProto am_container_resource_request = 17;
> >>>
> >>> +  repeated ResourceRequestProto am_container_resource_request = 17;
> >>>
> >>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/YARN-7813
> >>>
> >>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
> >>>    branch-2.9) patches differ from branch-2.8 patch
> >>>
> >>> @@ -425,7 +501,21 @@ message QueueInfoProto {
> >>>
> >>>    optional string defaultNodeLabelExpression = 9;
> >>>
> >>>    optional QueueStatisticsProto queueStatistics = 10;
> >>>
> >>>    optional bool preemptionDisabled = 11;
> >>>
> >>> -  optional bool intraQueuePreemptionDisabled = 12;
> >>>
> >>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
> >>>
> >>> +  optional bool intraQueuePreemptionDisabled = 13;
> >>>
> >>
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Robert Kanter <rk...@apache.org>.
>
> *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   - Optional is compatible with repeatable over the wire such that
> protobuf won't blow up, but does that actually mean that it's compatible in
> this case? If it's expecting an optional and gets a repeated, it's going to
> drop everything except for the last value. I don't know enough about
> YARN-6050 to say if this will be ok or not.


It's been a while since I looked into this, but I think it should be okay.
 If an older client (using optional) sends the message to a newer server
(using repeated), then there will never be more than one value for the
field.  The server puts these into a list, so the list would simply have a
single value in it.  The server's logic should be able to handle a single
valued list here because (a) IIRC we wanted to make sure compatibility
wasn't a problem (Cloudera supported rolling upgrades between CDH 5.x so
this was important) and (b) sending a single resource request, even in a
newer client, is a still a valid thing to do.
If a newer client (using repeated) sends the message to an older server
(using optional), I'm not sure what will happen.  My guess is that it will
drop the extra values (though I wonder if it will keep the first or last
value...).  In any case, I believe most clients will only send the one
value - in order for a client to send multiple values, you'd have to
specify some additional MR configs (see MAPREDUCE-6871).  IIRC, there's
also a SPARK JIRA similar to MAPREDUCE-6871, but I can't find it right now.

- Robert

On Tue, Sep 24, 2019 at 9:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> - I've created YARN-9855 and uploaded patches to fix YARN-6616 in
> branch-2.8 and branch-2.7.
> - For YARN-6050, not sure either. Robert/Wangda, can you comment on
> YARN-6050 compatibility?
> - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
> incompatible with this strategy? It should be OK to remove/add optional
> fields (removing the field with id 12, and adding the field with id 13).
> The difficulties I see here are, we would have to leave id 12 blank in
> 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
> would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
> would still work IIUC.
>
> Jonathan Hung
>
>
> On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
> wrote:
>
>> *   For YARN-6616, for branch-2.8 and below, it was only committed to
>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>   - This seems reasonable. Since we haven't released anything, it should
>> be no issue to change the 2.7/2.8 protobuf field to have the same value as
>> 2.9+
>>
>> *   For YARN-6050, there's a bit here:
>> https://developers.google.com/protocol-buffers/docs/proto that says
>> "optional is compatible with repeated", so I think we should be OK there.
>>   - Optional is compatible with repeatable over the wire such that
>> protobuf won't blow up, but does that actually mean that it's compatible in
>> this case? If it's expecting an optional and gets a repeated, it's going to
>> drop everything except for the last value. I don't know enough about
>> YARN-6050 to say if this will be ok or not.
>>
>> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>> release this), then upgrade from 2.8.6 to 2.9+.
>>   - I'm ok with this, but it should be noted that the upgrade from
>> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
>> upgrade. So this would cause some pain to anybody with clusters on those
>> versions.
>>
>> Eric
>>
>> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Sorry, let me edit my first point. We can just create addendums for
>>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
>>> correct id 28. We don’t need to revert YARN-6616 from these branches
>>> completely.
>>>
>>> Jonathan
>>>
>>> ________________________________
>>> From: Jonathan Hung <jy...@gmail.com>
>>> Sent: Tuesday, September 24, 2019 11:38 AM
>>> To: Eric Badger
>>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
>>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>>>
>>> Hi Eric, thanks for the investigation.
>>>
>>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
>>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>>   *   For YARN-6050, there's a bit here:
>>> https://developers.google.com/protocol-buffers/docs/proto that says
>>> "optional is compatible with repeated", so I think we should be OK there.
>>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>>> release this), then upgrade from 2.8.6 to 2.9+.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
>>> wrote:
>>> We (Verizon Media) are currently moving towards upgrading our clusters
>>> from
>>> our internal fork of branch-2.8 to an internal fork of branch-2. During
>>> this process, we have found multiple incompatible changes in protobufs
>>> between branch-2.8 and branch-2. These incompatibilities were all
>>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
>>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances
>>> of
>>> incompatibilities from 3 separate commits. All of the incompatibilities
>>> are
>>> in yarn_protos.proto
>>>
>>>
>>> I would like to discuss how to fix these incompatible changes. Otherwise,
>>> rolling upgrades will not be supported between branch-2.8 (or below) and
>>> branch-2.9 (or beyond). We could revert the incompatible changes, but
>>> then
>>> the new releases would be incompatible with the releases that have these
>>> incompatible changes. If we do nothing, then rolling upgrades won't work
>>> between 2.8- and 2.9+.
>>>
>>>
>>> Thanks,
>>>
>>>
>>> Eric
>>>
>>>
>>> -------------------------------------------------------------------
>>>
>>>
>>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-6616
>>>
>>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
>>> patch
>>>
>>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>>>
>>>    optional PriorityProto priority = 23;
>>>
>>>    optional string appNodeLabelExpression = 24;
>>>
>>>    optional string amNodeLabelExpression = 25;
>>>
>>> -  optional int64 submitTime = 26;
>>>
>>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>>>
>>> +  optional int64 launchTime = 27;
>>>
>>> +  optional int64 submitTime = 28;
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-6050
>>>
>>>    - Trunk and branch-2 patches both change the protobuf type in the same
>>>    way.
>>>
>>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>>>
>>>    optional LogAggregationContextProto log_aggregation_context = 14;
>>>
>>>    optional ReservationIdProto reservation_id = 15;
>>>
>>>    optional string node_label_expression = 16;
>>>
>>> -  optional ResourceRequestProto am_container_resource_request = 17;
>>>
>>> +  repeated ResourceRequestProto am_container_resource_request = 17;
>>>
>>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-7813
>>>
>>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>>>    branch-2.9) patches differ from branch-2.8 patch
>>>
>>> @@ -425,7 +501,21 @@ message QueueInfoProto {
>>>
>>>    optional string defaultNodeLabelExpression = 9;
>>>
>>>    optional QueueStatisticsProto queueStatistics = 10;
>>>
>>>    optional bool preemptionDisabled = 11;
>>>
>>> -  optional bool intraQueuePreemptionDisabled = 12;
>>>
>>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>>>
>>> +  optional bool intraQueuePreemptionDisabled = 13;
>>>
>>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Robert Kanter <rk...@apache.org>.
>
> *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   - Optional is compatible with repeatable over the wire such that
> protobuf won't blow up, but does that actually mean that it's compatible in
> this case? If it's expecting an optional and gets a repeated, it's going to
> drop everything except for the last value. I don't know enough about
> YARN-6050 to say if this will be ok or not.


It's been a while since I looked into this, but I think it should be okay.
 If an older client (using optional) sends the message to a newer server
(using repeated), then there will never be more than one value for the
field.  The server puts these into a list, so the list would simply have a
single value in it.  The server's logic should be able to handle a single
valued list here because (a) IIRC we wanted to make sure compatibility
wasn't a problem (Cloudera supported rolling upgrades between CDH 5.x so
this was important) and (b) sending a single resource request, even in a
newer client, is a still a valid thing to do.
If a newer client (using repeated) sends the message to an older server
(using optional), I'm not sure what will happen.  My guess is that it will
drop the extra values (though I wonder if it will keep the first or last
value...).  In any case, I believe most clients will only send the one
value - in order for a client to send multiple values, you'd have to
specify some additional MR configs (see MAPREDUCE-6871).  IIRC, there's
also a SPARK JIRA similar to MAPREDUCE-6871, but I can't find it right now.

- Robert

On Tue, Sep 24, 2019 at 9:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> - I've created YARN-9855 and uploaded patches to fix YARN-6616 in
> branch-2.8 and branch-2.7.
> - For YARN-6050, not sure either. Robert/Wangda, can you comment on
> YARN-6050 compatibility?
> - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
> incompatible with this strategy? It should be OK to remove/add optional
> fields (removing the field with id 12, and adding the field with id 13).
> The difficulties I see here are, we would have to leave id 12 blank in
> 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
> would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
> would still work IIUC.
>
> Jonathan Hung
>
>
> On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
> wrote:
>
>> *   For YARN-6616, for branch-2.8 and below, it was only committed to
>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>   - This seems reasonable. Since we haven't released anything, it should
>> be no issue to change the 2.7/2.8 protobuf field to have the same value as
>> 2.9+
>>
>> *   For YARN-6050, there's a bit here:
>> https://developers.google.com/protocol-buffers/docs/proto that says
>> "optional is compatible with repeated", so I think we should be OK there.
>>   - Optional is compatible with repeatable over the wire such that
>> protobuf won't blow up, but does that actually mean that it's compatible in
>> this case? If it's expecting an optional and gets a repeated, it's going to
>> drop everything except for the last value. I don't know enough about
>> YARN-6050 to say if this will be ok or not.
>>
>> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>> release this), then upgrade from 2.8.6 to 2.9+.
>>   - I'm ok with this, but it should be noted that the upgrade from
>> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
>> upgrade. So this would cause some pain to anybody with clusters on those
>> versions.
>>
>> Eric
>>
>> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Sorry, let me edit my first point. We can just create addendums for
>>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
>>> correct id 28. We don’t need to revert YARN-6616 from these branches
>>> completely.
>>>
>>> Jonathan
>>>
>>> ________________________________
>>> From: Jonathan Hung <jy...@gmail.com>
>>> Sent: Tuesday, September 24, 2019 11:38 AM
>>> To: Eric Badger
>>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
>>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>>>
>>> Hi Eric, thanks for the investigation.
>>>
>>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
>>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>>   *   For YARN-6050, there's a bit here:
>>> https://developers.google.com/protocol-buffers/docs/proto that says
>>> "optional is compatible with repeated", so I think we should be OK there.
>>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>>> release this), then upgrade from 2.8.6 to 2.9+.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
>>> wrote:
>>> We (Verizon Media) are currently moving towards upgrading our clusters
>>> from
>>> our internal fork of branch-2.8 to an internal fork of branch-2. During
>>> this process, we have found multiple incompatible changes in protobufs
>>> between branch-2.8 and branch-2. These incompatibilities were all
>>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
>>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances
>>> of
>>> incompatibilities from 3 separate commits. All of the incompatibilities
>>> are
>>> in yarn_protos.proto
>>>
>>>
>>> I would like to discuss how to fix these incompatible changes. Otherwise,
>>> rolling upgrades will not be supported between branch-2.8 (or below) and
>>> branch-2.9 (or beyond). We could revert the incompatible changes, but
>>> then
>>> the new releases would be incompatible with the releases that have these
>>> incompatible changes. If we do nothing, then rolling upgrades won't work
>>> between 2.8- and 2.9+.
>>>
>>>
>>> Thanks,
>>>
>>>
>>> Eric
>>>
>>>
>>> -------------------------------------------------------------------
>>>
>>>
>>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-6616
>>>
>>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
>>> patch
>>>
>>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>>>
>>>    optional PriorityProto priority = 23;
>>>
>>>    optional string appNodeLabelExpression = 24;
>>>
>>>    optional string amNodeLabelExpression = 25;
>>>
>>> -  optional int64 submitTime = 26;
>>>
>>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>>>
>>> +  optional int64 launchTime = 27;
>>>
>>> +  optional int64 submitTime = 28;
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-6050
>>>
>>>    - Trunk and branch-2 patches both change the protobuf type in the same
>>>    way.
>>>
>>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>>>
>>>    optional LogAggregationContextProto log_aggregation_context = 14;
>>>
>>>    optional ReservationIdProto reservation_id = 15;
>>>
>>>    optional string node_label_expression = 16;
>>>
>>> -  optional ResourceRequestProto am_container_resource_request = 17;
>>>
>>> +  repeated ResourceRequestProto am_container_resource_request = 17;
>>>
>>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-7813
>>>
>>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>>>    branch-2.9) patches differ from branch-2.8 patch
>>>
>>> @@ -425,7 +501,21 @@ message QueueInfoProto {
>>>
>>>    optional string defaultNodeLabelExpression = 9;
>>>
>>>    optional QueueStatisticsProto queueStatistics = 10;
>>>
>>>    optional bool preemptionDisabled = 11;
>>>
>>> -  optional bool intraQueuePreemptionDisabled = 12;
>>>
>>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>>>
>>> +  optional bool intraQueuePreemptionDisabled = 13;
>>>
>>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Robert Kanter <rk...@apache.org>.
>
> *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   - Optional is compatible with repeatable over the wire such that
> protobuf won't blow up, but does that actually mean that it's compatible in
> this case? If it's expecting an optional and gets a repeated, it's going to
> drop everything except for the last value. I don't know enough about
> YARN-6050 to say if this will be ok or not.


It's been a while since I looked into this, but I think it should be okay.
 If an older client (using optional) sends the message to a newer server
(using repeated), then there will never be more than one value for the
field.  The server puts these into a list, so the list would simply have a
single value in it.  The server's logic should be able to handle a single
valued list here because (a) IIRC we wanted to make sure compatibility
wasn't a problem (Cloudera supported rolling upgrades between CDH 5.x so
this was important) and (b) sending a single resource request, even in a
newer client, is a still a valid thing to do.
If a newer client (using repeated) sends the message to an older server
(using optional), I'm not sure what will happen.  My guess is that it will
drop the extra values (though I wonder if it will keep the first or last
value...).  In any case, I believe most clients will only send the one
value - in order for a client to send multiple values, you'd have to
specify some additional MR configs (see MAPREDUCE-6871).  IIRC, there's
also a SPARK JIRA similar to MAPREDUCE-6871, but I can't find it right now.

- Robert

On Tue, Sep 24, 2019 at 9:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> - I've created YARN-9855 and uploaded patches to fix YARN-6616 in
> branch-2.8 and branch-2.7.
> - For YARN-6050, not sure either. Robert/Wangda, can you comment on
> YARN-6050 compatibility?
> - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
> incompatible with this strategy? It should be OK to remove/add optional
> fields (removing the field with id 12, and adding the field with id 13).
> The difficulties I see here are, we would have to leave id 12 blank in
> 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
> would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
> would still work IIUC.
>
> Jonathan Hung
>
>
> On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
> wrote:
>
>> *   For YARN-6616, for branch-2.8 and below, it was only committed to
>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>   - This seems reasonable. Since we haven't released anything, it should
>> be no issue to change the 2.7/2.8 protobuf field to have the same value as
>> 2.9+
>>
>> *   For YARN-6050, there's a bit here:
>> https://developers.google.com/protocol-buffers/docs/proto that says
>> "optional is compatible with repeated", so I think we should be OK there.
>>   - Optional is compatible with repeatable over the wire such that
>> protobuf won't blow up, but does that actually mean that it's compatible in
>> this case? If it's expecting an optional and gets a repeated, it's going to
>> drop everything except for the last value. I don't know enough about
>> YARN-6050 to say if this will be ok or not.
>>
>> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>> release this), then upgrade from 2.8.6 to 2.9+.
>>   - I'm ok with this, but it should be noted that the upgrade from
>> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
>> upgrade. So this would cause some pain to anybody with clusters on those
>> versions.
>>
>> Eric
>>
>> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Sorry, let me edit my first point. We can just create addendums for
>>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
>>> correct id 28. We don’t need to revert YARN-6616 from these branches
>>> completely.
>>>
>>> Jonathan
>>>
>>> ________________________________
>>> From: Jonathan Hung <jy...@gmail.com>
>>> Sent: Tuesday, September 24, 2019 11:38 AM
>>> To: Eric Badger
>>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
>>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>>>
>>> Hi Eric, thanks for the investigation.
>>>
>>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
>>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>>   *   For YARN-6050, there's a bit here:
>>> https://developers.google.com/protocol-buffers/docs/proto that says
>>> "optional is compatible with repeated", so I think we should be OK there.
>>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>>> release this), then upgrade from 2.8.6 to 2.9+.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
>>> wrote:
>>> We (Verizon Media) are currently moving towards upgrading our clusters
>>> from
>>> our internal fork of branch-2.8 to an internal fork of branch-2. During
>>> this process, we have found multiple incompatible changes in protobufs
>>> between branch-2.8 and branch-2. These incompatibilities were all
>>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
>>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances
>>> of
>>> incompatibilities from 3 separate commits. All of the incompatibilities
>>> are
>>> in yarn_protos.proto
>>>
>>>
>>> I would like to discuss how to fix these incompatible changes. Otherwise,
>>> rolling upgrades will not be supported between branch-2.8 (or below) and
>>> branch-2.9 (or beyond). We could revert the incompatible changes, but
>>> then
>>> the new releases would be incompatible with the releases that have these
>>> incompatible changes. If we do nothing, then rolling upgrades won't work
>>> between 2.8- and 2.9+.
>>>
>>>
>>> Thanks,
>>>
>>>
>>> Eric
>>>
>>>
>>> -------------------------------------------------------------------
>>>
>>>
>>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-6616
>>>
>>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
>>> patch
>>>
>>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>>>
>>>    optional PriorityProto priority = 23;
>>>
>>>    optional string appNodeLabelExpression = 24;
>>>
>>>    optional string amNodeLabelExpression = 25;
>>>
>>> -  optional int64 submitTime = 26;
>>>
>>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>>>
>>> +  optional int64 launchTime = 27;
>>>
>>> +  optional int64 submitTime = 28;
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-6050
>>>
>>>    - Trunk and branch-2 patches both change the protobuf type in the same
>>>    way.
>>>
>>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>>>
>>>    optional LogAggregationContextProto log_aggregation_context = 14;
>>>
>>>    optional ReservationIdProto reservation_id = 15;
>>>
>>>    optional string node_label_expression = 16;
>>>
>>> -  optional ResourceRequestProto am_container_resource_request = 17;
>>>
>>> +  repeated ResourceRequestProto am_container_resource_request = 17;
>>>
>>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-7813
>>>
>>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>>>    branch-2.9) patches differ from branch-2.8 patch
>>>
>>> @@ -425,7 +501,21 @@ message QueueInfoProto {
>>>
>>>    optional string defaultNodeLabelExpression = 9;
>>>
>>>    optional QueueStatisticsProto queueStatistics = 10;
>>>
>>>    optional bool preemptionDisabled = 11;
>>>
>>> -  optional bool intraQueuePreemptionDisabled = 12;
>>>
>>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>>>
>>> +  optional bool intraQueuePreemptionDisabled = 13;
>>>
>>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Robert Kanter <rk...@apache.org>.
>
> *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   - Optional is compatible with repeatable over the wire such that
> protobuf won't blow up, but does that actually mean that it's compatible in
> this case? If it's expecting an optional and gets a repeated, it's going to
> drop everything except for the last value. I don't know enough about
> YARN-6050 to say if this will be ok or not.


It's been a while since I looked into this, but I think it should be okay.
 If an older client (using optional) sends the message to a newer server
(using repeated), then there will never be more than one value for the
field.  The server puts these into a list, so the list would simply have a
single value in it.  The server's logic should be able to handle a single
valued list here because (a) IIRC we wanted to make sure compatibility
wasn't a problem (Cloudera supported rolling upgrades between CDH 5.x so
this was important) and (b) sending a single resource request, even in a
newer client, is a still a valid thing to do.
If a newer client (using repeated) sends the message to an older server
(using optional), I'm not sure what will happen.  My guess is that it will
drop the extra values (though I wonder if it will keep the first or last
value...).  In any case, I believe most clients will only send the one
value - in order for a client to send multiple values, you'd have to
specify some additional MR configs (see MAPREDUCE-6871).  IIRC, there's
also a SPARK JIRA similar to MAPREDUCE-6871, but I can't find it right now.

- Robert

On Tue, Sep 24, 2019 at 9:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> - I've created YARN-9855 and uploaded patches to fix YARN-6616 in
> branch-2.8 and branch-2.7.
> - For YARN-6050, not sure either. Robert/Wangda, can you comment on
> YARN-6050 compatibility?
> - For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
> incompatible with this strategy? It should be OK to remove/add optional
> fields (removing the field with id 12, and adding the field with id 13).
> The difficulties I see here are, we would have to leave id 12 blank in
> 2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
> would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
> would still work IIUC.
>
> Jonathan Hung
>
>
> On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
> wrote:
>
>> *   For YARN-6616, for branch-2.8 and below, it was only committed to
>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>   - This seems reasonable. Since we haven't released anything, it should
>> be no issue to change the 2.7/2.8 protobuf field to have the same value as
>> 2.9+
>>
>> *   For YARN-6050, there's a bit here:
>> https://developers.google.com/protocol-buffers/docs/proto that says
>> "optional is compatible with repeated", so I think we should be OK there.
>>   - Optional is compatible with repeatable over the wire such that
>> protobuf won't blow up, but does that actually mean that it's compatible in
>> this case? If it's expecting an optional and gets a repeated, it's going to
>> drop everything except for the last value. I don't know enough about
>> YARN-6050 to say if this will be ok or not.
>>
>> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>> release this), then upgrade from 2.8.6 to 2.9+.
>>   - I'm ok with this, but it should be noted that the upgrade from
>> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
>> upgrade. So this would cause some pain to anybody with clusters on those
>> versions.
>>
>> Eric
>>
>> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Sorry, let me edit my first point. We can just create addendums for
>>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
>>> correct id 28. We don’t need to revert YARN-6616 from these branches
>>> completely.
>>>
>>> Jonathan
>>>
>>> ________________________________
>>> From: Jonathan Hung <jy...@gmail.com>
>>> Sent: Tuesday, September 24, 2019 11:38 AM
>>> To: Eric Badger
>>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
>>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>>>
>>> Hi Eric, thanks for the investigation.
>>>
>>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
>>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>>   *   For YARN-6050, there's a bit here:
>>> https://developers.google.com/protocol-buffers/docs/proto that says
>>> "optional is compatible with repeated", so I think we should be OK there.
>>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>>> release this), then upgrade from 2.8.6 to 2.9+.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
>>> wrote:
>>> We (Verizon Media) are currently moving towards upgrading our clusters
>>> from
>>> our internal fork of branch-2.8 to an internal fork of branch-2. During
>>> this process, we have found multiple incompatible changes in protobufs
>>> between branch-2.8 and branch-2. These incompatibilities were all
>>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
>>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances
>>> of
>>> incompatibilities from 3 separate commits. All of the incompatibilities
>>> are
>>> in yarn_protos.proto
>>>
>>>
>>> I would like to discuss how to fix these incompatible changes. Otherwise,
>>> rolling upgrades will not be supported between branch-2.8 (or below) and
>>> branch-2.9 (or beyond). We could revert the incompatible changes, but
>>> then
>>> the new releases would be incompatible with the releases that have these
>>> incompatible changes. If we do nothing, then rolling upgrades won't work
>>> between 2.8- and 2.9+.
>>>
>>>
>>> Thanks,
>>>
>>>
>>> Eric
>>>
>>>
>>> -------------------------------------------------------------------
>>>
>>>
>>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-6616
>>>
>>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
>>> patch
>>>
>>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>>>
>>>    optional PriorityProto priority = 23;
>>>
>>>    optional string appNodeLabelExpression = 24;
>>>
>>>    optional string amNodeLabelExpression = 25;
>>>
>>> -  optional int64 submitTime = 26;
>>>
>>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>>>
>>> +  optional int64 launchTime = 27;
>>>
>>> +  optional int64 submitTime = 28;
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-6050
>>>
>>>    - Trunk and branch-2 patches both change the protobuf type in the same
>>>    way.
>>>
>>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>>>
>>>    optional LogAggregationContextProto log_aggregation_context = 14;
>>>
>>>    optional ReservationIdProto reservation_id = 15;
>>>
>>>    optional string node_label_expression = 16;
>>>
>>> -  optional ResourceRequestProto am_container_resource_request = 17;
>>>
>>> +  repeated ResourceRequestProto am_container_resource_request = 17;
>>>
>>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>>>
>>>
>>> https://issues.apache.org/jira/browse/YARN-7813
>>>
>>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>>>    branch-2.9) patches differ from branch-2.8 patch
>>>
>>> @@ -425,7 +501,21 @@ message QueueInfoProto {
>>>
>>>    optional string defaultNodeLabelExpression = 9;
>>>
>>>    optional QueueStatisticsProto queueStatistics = 10;
>>>
>>>    optional bool preemptionDisabled = 11;
>>>
>>> -  optional bool intraQueuePreemptionDisabled = 12;
>>>
>>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>>>
>>> +  optional bool intraQueuePreemptionDisabled = 13;
>>>
>>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
- I've created YARN-9855 and uploaded patches to fix YARN-6616 in
branch-2.8 and branch-2.7.
- For YARN-6050, not sure either. Robert/Wangda, can you comment on
YARN-6050 compatibility?
- For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
incompatible with this strategy? It should be OK to remove/add optional
fields (removing the field with id 12, and adding the field with id 13).
The difficulties I see here are, we would have to leave id 12 blank in
2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
would still work IIUC.

Jonathan Hung


On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
wrote:

> *   For YARN-6616, for branch-2.8 and below, it was only committed to
> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
> revert YARN-6616 from branch-2.7 and branch-2.8.
>   - This seems reasonable. Since we haven't released anything, it should
> be no issue to change the 2.7/2.8 protobuf field to have the same value as
> 2.9+
>
> *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   - Optional is compatible with repeatable over the wire such that
> protobuf won't blow up, but does that actually mean that it's compatible in
> this case? If it's expecting an optional and gets a repeated, it's going to
> drop everything except for the last value. I don't know enough about
> YARN-6050 to say if this will be ok or not.
>
> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
> to a 2.9+ version will be an issue. One option could be to move the
> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> release this), then upgrade from 2.8.6 to 2.9+.
>   - I'm ok with this, but it should be noted that the upgrade from
> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
> upgrade. So this would cause some pain to anybody with clusters on those
> versions.
>
> Eric
>
> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
>> Sorry, let me edit my first point. We can just create addendums for
>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
>> correct id 28. We don’t need to revert YARN-6616 from these branches
>> completely.
>>
>> Jonathan
>>
>> ________________________________
>> From: Jonathan Hung <jy...@gmail.com>
>> Sent: Tuesday, September 24, 2019 11:38 AM
>> To: Eric Badger
>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>>
>> Hi Eric, thanks for the investigation.
>>
>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>   *   For YARN-6050, there's a bit here:
>> https://developers.google.com/protocol-buffers/docs/proto that says
>> "optional is compatible with repeated", so I think we should be OK there.
>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>> release this), then upgrade from 2.8.6 to 2.9+.
>>
>> Jonathan Hung
>>
>>
>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
>> wrote:
>> We (Verizon Media) are currently moving towards upgrading our clusters
>> from
>> our internal fork of branch-2.8 to an internal fork of branch-2. During
>> this process, we have found multiple incompatible changes in protobufs
>> between branch-2.8 and branch-2. These incompatibilities were all
>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
>> incompatibilities from 3 separate commits. All of the incompatibilities
>> are
>> in yarn_protos.proto
>>
>>
>> I would like to discuss how to fix these incompatible changes. Otherwise,
>> rolling upgrades will not be supported between branch-2.8 (or below) and
>> branch-2.9 (or beyond). We could revert the incompatible changes, but then
>> the new releases would be incompatible with the releases that have these
>> incompatible changes. If we do nothing, then rolling upgrades won't work
>> between 2.8- and 2.9+.
>>
>>
>> Thanks,
>>
>>
>> Eric
>>
>>
>> -------------------------------------------------------------------
>>
>>
>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>>
>>
>> https://issues.apache.org/jira/browse/YARN-6616
>>
>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
>> patch
>>
>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>>
>>    optional PriorityProto priority = 23;
>>
>>    optional string appNodeLabelExpression = 24;
>>
>>    optional string amNodeLabelExpression = 25;
>>
>> -  optional int64 submitTime = 26;
>>
>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>>
>> +  optional int64 launchTime = 27;
>>
>> +  optional int64 submitTime = 28;
>>
>>
>> https://issues.apache.org/jira/browse/YARN-6050
>>
>>    - Trunk and branch-2 patches both change the protobuf type in the same
>>    way.
>>
>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>>
>>    optional LogAggregationContextProto log_aggregation_context = 14;
>>
>>    optional ReservationIdProto reservation_id = 15;
>>
>>    optional string node_label_expression = 16;
>>
>> -  optional ResourceRequestProto am_container_resource_request = 17;
>>
>> +  repeated ResourceRequestProto am_container_resource_request = 17;
>>
>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>>
>>
>> https://issues.apache.org/jira/browse/YARN-7813
>>
>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>>    branch-2.9) patches differ from branch-2.8 patch
>>
>> @@ -425,7 +501,21 @@ message QueueInfoProto {
>>
>>    optional string defaultNodeLabelExpression = 9;
>>
>>    optional QueueStatisticsProto queueStatistics = 10;
>>
>>    optional bool preemptionDisabled = 11;
>>
>> -  optional bool intraQueuePreemptionDisabled = 12;
>>
>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>>
>> +  optional bool intraQueuePreemptionDisabled = 13;
>>
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
- I've created YARN-9855 and uploaded patches to fix YARN-6616 in
branch-2.8 and branch-2.7.
- For YARN-6050, not sure either. Robert/Wangda, can you comment on
YARN-6050 compatibility?
- For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
incompatible with this strategy? It should be OK to remove/add optional
fields (removing the field with id 12, and adding the field with id 13).
The difficulties I see here are, we would have to leave id 12 blank in
2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
would still work IIUC.

Jonathan Hung


On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
wrote:

> *   For YARN-6616, for branch-2.8 and below, it was only committed to
> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
> revert YARN-6616 from branch-2.7 and branch-2.8.
>   - This seems reasonable. Since we haven't released anything, it should
> be no issue to change the 2.7/2.8 protobuf field to have the same value as
> 2.9+
>
> *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   - Optional is compatible with repeatable over the wire such that
> protobuf won't blow up, but does that actually mean that it's compatible in
> this case? If it's expecting an optional and gets a repeated, it's going to
> drop everything except for the last value. I don't know enough about
> YARN-6050 to say if this will be ok or not.
>
> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
> to a 2.9+ version will be an issue. One option could be to move the
> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> release this), then upgrade from 2.8.6 to 2.9+.
>   - I'm ok with this, but it should be noted that the upgrade from
> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
> upgrade. So this would cause some pain to anybody with clusters on those
> versions.
>
> Eric
>
> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
>> Sorry, let me edit my first point. We can just create addendums for
>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
>> correct id 28. We don’t need to revert YARN-6616 from these branches
>> completely.
>>
>> Jonathan
>>
>> ________________________________
>> From: Jonathan Hung <jy...@gmail.com>
>> Sent: Tuesday, September 24, 2019 11:38 AM
>> To: Eric Badger
>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>>
>> Hi Eric, thanks for the investigation.
>>
>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>   *   For YARN-6050, there's a bit here:
>> https://developers.google.com/protocol-buffers/docs/proto that says
>> "optional is compatible with repeated", so I think we should be OK there.
>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>> release this), then upgrade from 2.8.6 to 2.9+.
>>
>> Jonathan Hung
>>
>>
>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
>> wrote:
>> We (Verizon Media) are currently moving towards upgrading our clusters
>> from
>> our internal fork of branch-2.8 to an internal fork of branch-2. During
>> this process, we have found multiple incompatible changes in protobufs
>> between branch-2.8 and branch-2. These incompatibilities were all
>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
>> incompatibilities from 3 separate commits. All of the incompatibilities
>> are
>> in yarn_protos.proto
>>
>>
>> I would like to discuss how to fix these incompatible changes. Otherwise,
>> rolling upgrades will not be supported between branch-2.8 (or below) and
>> branch-2.9 (or beyond). We could revert the incompatible changes, but then
>> the new releases would be incompatible with the releases that have these
>> incompatible changes. If we do nothing, then rolling upgrades won't work
>> between 2.8- and 2.9+.
>>
>>
>> Thanks,
>>
>>
>> Eric
>>
>>
>> -------------------------------------------------------------------
>>
>>
>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>>
>>
>> https://issues.apache.org/jira/browse/YARN-6616
>>
>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
>> patch
>>
>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>>
>>    optional PriorityProto priority = 23;
>>
>>    optional string appNodeLabelExpression = 24;
>>
>>    optional string amNodeLabelExpression = 25;
>>
>> -  optional int64 submitTime = 26;
>>
>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>>
>> +  optional int64 launchTime = 27;
>>
>> +  optional int64 submitTime = 28;
>>
>>
>> https://issues.apache.org/jira/browse/YARN-6050
>>
>>    - Trunk and branch-2 patches both change the protobuf type in the same
>>    way.
>>
>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>>
>>    optional LogAggregationContextProto log_aggregation_context = 14;
>>
>>    optional ReservationIdProto reservation_id = 15;
>>
>>    optional string node_label_expression = 16;
>>
>> -  optional ResourceRequestProto am_container_resource_request = 17;
>>
>> +  repeated ResourceRequestProto am_container_resource_request = 17;
>>
>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>>
>>
>> https://issues.apache.org/jira/browse/YARN-7813
>>
>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>>    branch-2.9) patches differ from branch-2.8 patch
>>
>> @@ -425,7 +501,21 @@ message QueueInfoProto {
>>
>>    optional string defaultNodeLabelExpression = 9;
>>
>>    optional QueueStatisticsProto queueStatistics = 10;
>>
>>    optional bool preemptionDisabled = 11;
>>
>> -  optional bool intraQueuePreemptionDisabled = 12;
>>
>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>>
>> +  optional bool intraQueuePreemptionDisabled = 13;
>>
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
- I've created YARN-9855 and uploaded patches to fix YARN-6616 in
branch-2.8 and branch-2.7.
- For YARN-6050, not sure either. Robert/Wangda, can you comment on
YARN-6050 compatibility?
- For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
incompatible with this strategy? It should be OK to remove/add optional
fields (removing the field with id 12, and adding the field with id 13).
The difficulties I see here are, we would have to leave id 12 blank in
2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
would still work IIUC.

Jonathan Hung


On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
wrote:

> *   For YARN-6616, for branch-2.8 and below, it was only committed to
> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
> revert YARN-6616 from branch-2.7 and branch-2.8.
>   - This seems reasonable. Since we haven't released anything, it should
> be no issue to change the 2.7/2.8 protobuf field to have the same value as
> 2.9+
>
> *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   - Optional is compatible with repeatable over the wire such that
> protobuf won't blow up, but does that actually mean that it's compatible in
> this case? If it's expecting an optional and gets a repeated, it's going to
> drop everything except for the last value. I don't know enough about
> YARN-6050 to say if this will be ok or not.
>
> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
> to a 2.9+ version will be an issue. One option could be to move the
> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> release this), then upgrade from 2.8.6 to 2.9+.
>   - I'm ok with this, but it should be noted that the upgrade from
> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
> upgrade. So this would cause some pain to anybody with clusters on those
> versions.
>
> Eric
>
> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
>> Sorry, let me edit my first point. We can just create addendums for
>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
>> correct id 28. We don’t need to revert YARN-6616 from these branches
>> completely.
>>
>> Jonathan
>>
>> ________________________________
>> From: Jonathan Hung <jy...@gmail.com>
>> Sent: Tuesday, September 24, 2019 11:38 AM
>> To: Eric Badger
>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>>
>> Hi Eric, thanks for the investigation.
>>
>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>   *   For YARN-6050, there's a bit here:
>> https://developers.google.com/protocol-buffers/docs/proto that says
>> "optional is compatible with repeated", so I think we should be OK there.
>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>> release this), then upgrade from 2.8.6 to 2.9+.
>>
>> Jonathan Hung
>>
>>
>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
>> wrote:
>> We (Verizon Media) are currently moving towards upgrading our clusters
>> from
>> our internal fork of branch-2.8 to an internal fork of branch-2. During
>> this process, we have found multiple incompatible changes in protobufs
>> between branch-2.8 and branch-2. These incompatibilities were all
>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
>> incompatibilities from 3 separate commits. All of the incompatibilities
>> are
>> in yarn_protos.proto
>>
>>
>> I would like to discuss how to fix these incompatible changes. Otherwise,
>> rolling upgrades will not be supported between branch-2.8 (or below) and
>> branch-2.9 (or beyond). We could revert the incompatible changes, but then
>> the new releases would be incompatible with the releases that have these
>> incompatible changes. If we do nothing, then rolling upgrades won't work
>> between 2.8- and 2.9+.
>>
>>
>> Thanks,
>>
>>
>> Eric
>>
>>
>> -------------------------------------------------------------------
>>
>>
>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>>
>>
>> https://issues.apache.org/jira/browse/YARN-6616
>>
>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
>> patch
>>
>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>>
>>    optional PriorityProto priority = 23;
>>
>>    optional string appNodeLabelExpression = 24;
>>
>>    optional string amNodeLabelExpression = 25;
>>
>> -  optional int64 submitTime = 26;
>>
>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>>
>> +  optional int64 launchTime = 27;
>>
>> +  optional int64 submitTime = 28;
>>
>>
>> https://issues.apache.org/jira/browse/YARN-6050
>>
>>    - Trunk and branch-2 patches both change the protobuf type in the same
>>    way.
>>
>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>>
>>    optional LogAggregationContextProto log_aggregation_context = 14;
>>
>>    optional ReservationIdProto reservation_id = 15;
>>
>>    optional string node_label_expression = 16;
>>
>> -  optional ResourceRequestProto am_container_resource_request = 17;
>>
>> +  repeated ResourceRequestProto am_container_resource_request = 17;
>>
>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>>
>>
>> https://issues.apache.org/jira/browse/YARN-7813
>>
>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>>    branch-2.9) patches differ from branch-2.8 patch
>>
>> @@ -425,7 +501,21 @@ message QueueInfoProto {
>>
>>    optional string defaultNodeLabelExpression = 9;
>>
>>    optional QueueStatisticsProto queueStatistics = 10;
>>
>>    optional bool preemptionDisabled = 11;
>>
>> -  optional bool intraQueuePreemptionDisabled = 12;
>>
>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>>
>> +  optional bool intraQueuePreemptionDisabled = 13;
>>
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
- I've created YARN-9855 and uploaded patches to fix YARN-6616 in
branch-2.8 and branch-2.7.
- For YARN-6050, not sure either. Robert/Wangda, can you comment on
YARN-6050 compatibility?
- For YARN-7813, not sure why moving from 2.8.4/5 -> 2.8.6 would be
incompatible with this strategy? It should be OK to remove/add optional
fields (removing the field with id 12, and adding the field with id 13).
The difficulties I see here are, we would have to leave id 12 blank in
2.8.6 (so we cannot have YARN-6164 in branch-2.8), and users on 2.8.4/5
would have to move to 2.8.6 before moving to 2.9+. But rolling upgrade
would still work IIUC.

Jonathan Hung


On Tue, Sep 24, 2019 at 2:52 PM Eric Badger <eb...@verizonmedia.com>
wrote:

> *   For YARN-6616, for branch-2.8 and below, it was only committed to
> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
> revert YARN-6616 from branch-2.7 and branch-2.8.
>   - This seems reasonable. Since we haven't released anything, it should
> be no issue to change the 2.7/2.8 protobuf field to have the same value as
> 2.9+
>
> *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   - Optional is compatible with repeatable over the wire such that
> protobuf won't blow up, but does that actually mean that it's compatible in
> this case? If it's expecting an optional and gets a repeated, it's going to
> drop everything except for the last value. I don't know enough about
> YARN-6050 to say if this will be ok or not.
>
> *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
> to a 2.9+ version will be an issue. One option could be to move the
> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> release this), then upgrade from 2.8.6 to 2.9+.
>   - I'm ok with this, but it should be noted that the upgrade from
> 2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
> upgrade. So this would cause some pain to anybody with clusters on those
> versions.
>
> Eric
>
> On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
>> Sorry, let me edit my first point. We can just create addendums for
>> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
>> correct id 28. We don’t need to revert YARN-6616 from these branches
>> completely.
>>
>> Jonathan
>>
>> ________________________________
>> From: Jonathan Hung <jy...@gmail.com>
>> Sent: Tuesday, September 24, 2019 11:38 AM
>> To: Eric Badger
>> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
>> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>>
>> Hi Eric, thanks for the investigation.
>>
>>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
>> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
>> revert YARN-6616 from branch-2.7 and branch-2.8.
>>   *   For YARN-6050, there's a bit here:
>> https://developers.google.com/protocol-buffers/docs/proto that says
>> "optional is compatible with repeated", so I think we should be OK there.
>>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
>> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
>> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
>> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
>> release this), then upgrade from 2.8.6 to 2.9+.
>>
>> Jonathan Hung
>>
>>
>> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
>> wrote:
>> We (Verizon Media) are currently moving towards upgrading our clusters
>> from
>> our internal fork of branch-2.8 to an internal fork of branch-2. During
>> this process, we have found multiple incompatible changes in protobufs
>> between branch-2.8 and branch-2. These incompatibilities were all
>> introduced between branch-2.8 and branch-2.9. I did a git diff over all
>> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
>> incompatibilities from 3 separate commits. All of the incompatibilities
>> are
>> in yarn_protos.proto
>>
>>
>> I would like to discuss how to fix these incompatible changes. Otherwise,
>> rolling upgrades will not be supported between branch-2.8 (or below) and
>> branch-2.9 (or beyond). We could revert the incompatible changes, but then
>> the new releases would be incompatible with the releases that have these
>> incompatible changes. If we do nothing, then rolling upgrades won't work
>> between 2.8- and 2.9+.
>>
>>
>> Thanks,
>>
>>
>> Eric
>>
>>
>> -------------------------------------------------------------------
>>
>>
>> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>>
>>
>> https://issues.apache.org/jira/browse/YARN-6616
>>
>>    - Trunk patch (applied through branch-2.9) differs from branch-2.8
>> patch
>>
>> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>>
>>    optional PriorityProto priority = 23;
>>
>>    optional string appNodeLabelExpression = 24;
>>
>>    optional string amNodeLabelExpression = 25;
>>
>> -  optional int64 submitTime = 26;
>>
>> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>>
>> +  optional int64 launchTime = 27;
>>
>> +  optional int64 submitTime = 28;
>>
>>
>> https://issues.apache.org/jira/browse/YARN-6050
>>
>>    - Trunk and branch-2 patches both change the protobuf type in the same
>>    way.
>>
>> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>>
>>    optional LogAggregationContextProto log_aggregation_context = 14;
>>
>>    optional ReservationIdProto reservation_id = 15;
>>
>>    optional string node_label_expression = 16;
>>
>> -  optional ResourceRequestProto am_container_resource_request = 17;
>>
>> +  repeated ResourceRequestProto am_container_resource_request = 17;
>>
>> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>>
>>
>> https://issues.apache.org/jira/browse/YARN-7813
>>
>>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>>    branch-2.9) patches differ from branch-2.8 patch
>>
>> @@ -425,7 +501,21 @@ message QueueInfoProto {
>>
>>    optional string defaultNodeLabelExpression = 9;
>>
>>    optional QueueStatisticsProto queueStatistics = 10;
>>
>>    optional bool preemptionDisabled = 11;
>>
>> -  optional bool intraQueuePreemptionDisabled = 12;
>>
>> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>>
>> +  optional bool intraQueuePreemptionDisabled = 13;
>>
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
*   For YARN-6616, for branch-2.8 and below, it was only committed to
2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
revert YARN-6616 from branch-2.7 and branch-2.8.
  - This seems reasonable. Since we haven't released anything, it should be
no issue to change the 2.7/2.8 protobuf field to have the same value as 2.9+

*   For YARN-6050, there's a bit here:
https://developers.google.com/protocol-buffers/docs/proto that says
"optional is compatible with repeated", so I think we should be OK there.
  - Optional is compatible with repeatable over the wire such that protobuf
won't blow up, but does that actually mean that it's compatible in this
case? If it's expecting an optional and gets a repeated, it's going to drop
everything except for the last value. I don't know enough about YARN-6050
to say if this will be ok or not.

*   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
to a 2.9+ version will be an issue. One option could be to move the
intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
release this), then upgrade from 2.8.6 to 2.9+.
  - I'm ok with this, but it should be noted that the upgrade from
2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
upgrade. So this would cause some pain to anybody with clusters on those
versions.

Eric

On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com> wrote:

> Sorry, let me edit my first point. We can just create addendums for
> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
> correct id 28. We don’t need to revert YARN-6616 from these branches
> completely.
>
> Jonathan
>
> ________________________________
> From: Jonathan Hung <jy...@gmail.com>
> Sent: Tuesday, September 24, 2019 11:38 AM
> To: Eric Badger
> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>
> Hi Eric, thanks for the investigation.
>
>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
> revert YARN-6616 from branch-2.7 and branch-2.8.
>   *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> release this), then upgrade from 2.8.6 to 2.9+.
>
> Jonathan Hung
>
>
> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
> wrote:
> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
>
> Thanks,
>
>
> Eric
>
>
> -------------------------------------------------------------------
>
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
>
> https://issues.apache.org/jira/browse/YARN-6616
>
>    - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>
>    optional PriorityProto priority = 23;
>
>    optional string appNodeLabelExpression = 24;
>
>    optional string amNodeLabelExpression = 25;
>
> -  optional int64 submitTime = 26;
>
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>
> +  optional int64 launchTime = 27;
>
> +  optional int64 submitTime = 28;
>
>
> https://issues.apache.org/jira/browse/YARN-6050
>
>    - Trunk and branch-2 patches both change the protobuf type in the same
>    way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>
>    optional LogAggregationContextProto log_aggregation_context = 14;
>
>    optional ReservationIdProto reservation_id = 15;
>
>    optional string node_label_expression = 16;
>
> -  optional ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
>
> https://issues.apache.org/jira/browse/YARN-7813
>
>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>    branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>
>    optional string defaultNodeLabelExpression = 9;
>
>    optional QueueStatisticsProto queueStatistics = 10;
>
>    optional bool preemptionDisabled = 11;
>
> -  optional bool intraQueuePreemptionDisabled = 12;
>
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>
> +  optional bool intraQueuePreemptionDisabled = 13;
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
*   For YARN-6616, for branch-2.8 and below, it was only committed to
2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
revert YARN-6616 from branch-2.7 and branch-2.8.
  - This seems reasonable. Since we haven't released anything, it should be
no issue to change the 2.7/2.8 protobuf field to have the same value as 2.9+

*   For YARN-6050, there's a bit here:
https://developers.google.com/protocol-buffers/docs/proto that says
"optional is compatible with repeated", so I think we should be OK there.
  - Optional is compatible with repeatable over the wire such that protobuf
won't blow up, but does that actually mean that it's compatible in this
case? If it's expecting an optional and gets a repeated, it's going to drop
everything except for the last value. I don't know enough about YARN-6050
to say if this will be ok or not.

*   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
to a 2.9+ version will be an issue. One option could be to move the
intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
release this), then upgrade from 2.8.6 to 2.9+.
  - I'm ok with this, but it should be noted that the upgrade from
2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
upgrade. So this would cause some pain to anybody with clusters on those
versions.

Eric

On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com> wrote:

> Sorry, let me edit my first point. We can just create addendums for
> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
> correct id 28. We don’t need to revert YARN-6616 from these branches
> completely.
>
> Jonathan
>
> ________________________________
> From: Jonathan Hung <jy...@gmail.com>
> Sent: Tuesday, September 24, 2019 11:38 AM
> To: Eric Badger
> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>
> Hi Eric, thanks for the investigation.
>
>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
> revert YARN-6616 from branch-2.7 and branch-2.8.
>   *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> release this), then upgrade from 2.8.6 to 2.9+.
>
> Jonathan Hung
>
>
> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
> wrote:
> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
>
> Thanks,
>
>
> Eric
>
>
> -------------------------------------------------------------------
>
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
>
> https://issues.apache.org/jira/browse/YARN-6616
>
>    - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>
>    optional PriorityProto priority = 23;
>
>    optional string appNodeLabelExpression = 24;
>
>    optional string amNodeLabelExpression = 25;
>
> -  optional int64 submitTime = 26;
>
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>
> +  optional int64 launchTime = 27;
>
> +  optional int64 submitTime = 28;
>
>
> https://issues.apache.org/jira/browse/YARN-6050
>
>    - Trunk and branch-2 patches both change the protobuf type in the same
>    way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>
>    optional LogAggregationContextProto log_aggregation_context = 14;
>
>    optional ReservationIdProto reservation_id = 15;
>
>    optional string node_label_expression = 16;
>
> -  optional ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
>
> https://issues.apache.org/jira/browse/YARN-7813
>
>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>    branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>
>    optional string defaultNodeLabelExpression = 9;
>
>    optional QueueStatisticsProto queueStatistics = 10;
>
>    optional bool preemptionDisabled = 11;
>
> -  optional bool intraQueuePreemptionDisabled = 12;
>
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>
> +  optional bool intraQueuePreemptionDisabled = 13;
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
*   For YARN-6616, for branch-2.8 and below, it was only committed to
2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
revert YARN-6616 from branch-2.7 and branch-2.8.
  - This seems reasonable. Since we haven't released anything, it should be
no issue to change the 2.7/2.8 protobuf field to have the same value as 2.9+

*   For YARN-6050, there's a bit here:
https://developers.google.com/protocol-buffers/docs/proto that says
"optional is compatible with repeated", so I think we should be OK there.
  - Optional is compatible with repeatable over the wire such that protobuf
won't blow up, but does that actually mean that it's compatible in this
case? If it's expecting an optional and gets a repeated, it's going to drop
everything except for the last value. I don't know enough about YARN-6050
to say if this will be ok or not.

*   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
to a 2.9+ version will be an issue. One option could be to move the
intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
release this), then upgrade from 2.8.6 to 2.9+.
  - I'm ok with this, but it should be noted that the upgrade from
2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
upgrade. So this would cause some pain to anybody with clusters on those
versions.

Eric

On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com> wrote:

> Sorry, let me edit my first point. We can just create addendums for
> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
> correct id 28. We don’t need to revert YARN-6616 from these branches
> completely.
>
> Jonathan
>
> ________________________________
> From: Jonathan Hung <jy...@gmail.com>
> Sent: Tuesday, September 24, 2019 11:38 AM
> To: Eric Badger
> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>
> Hi Eric, thanks for the investigation.
>
>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
> revert YARN-6616 from branch-2.7 and branch-2.8.
>   *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> release this), then upgrade from 2.8.6 to 2.9+.
>
> Jonathan Hung
>
>
> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
> wrote:
> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
>
> Thanks,
>
>
> Eric
>
>
> -------------------------------------------------------------------
>
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
>
> https://issues.apache.org/jira/browse/YARN-6616
>
>    - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>
>    optional PriorityProto priority = 23;
>
>    optional string appNodeLabelExpression = 24;
>
>    optional string amNodeLabelExpression = 25;
>
> -  optional int64 submitTime = 26;
>
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>
> +  optional int64 launchTime = 27;
>
> +  optional int64 submitTime = 28;
>
>
> https://issues.apache.org/jira/browse/YARN-6050
>
>    - Trunk and branch-2 patches both change the protobuf type in the same
>    way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>
>    optional LogAggregationContextProto log_aggregation_context = 14;
>
>    optional ReservationIdProto reservation_id = 15;
>
>    optional string node_label_expression = 16;
>
> -  optional ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
>
> https://issues.apache.org/jira/browse/YARN-7813
>
>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>    branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>
>    optional string defaultNodeLabelExpression = 9;
>
>    optional QueueStatisticsProto queueStatistics = 10;
>
>    optional bool preemptionDisabled = 11;
>
> -  optional bool intraQueuePreemptionDisabled = 12;
>
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>
> +  optional bool intraQueuePreemptionDisabled = 13;
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
*   For YARN-6616, for branch-2.8 and below, it was only committed to
2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
revert YARN-6616 from branch-2.7 and branch-2.8.
  - This seems reasonable. Since we haven't released anything, it should be
no issue to change the 2.7/2.8 protobuf field to have the same value as 2.9+

*   For YARN-6050, there's a bit here:
https://developers.google.com/protocol-buffers/docs/proto that says
"optional is compatible with repeated", so I think we should be OK there.
  - Optional is compatible with repeatable over the wire such that protobuf
won't blow up, but does that actually mean that it's compatible in this
case? If it's expecting an optional and gets a repeated, it's going to drop
everything except for the last value. I don't know enough about YARN-6050
to say if this will be ok or not.

*   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
to a 2.9+ version will be an issue. One option could be to move the
intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
release this), then upgrade from 2.8.6 to 2.9+.
  - I'm ok with this, but it should be noted that the upgrade from
2.8.4/2.8.5 to 2.8.6 (or 2.9+) would not be compatible for a rolling
upgrade. So this would cause some pain to anybody with clusters on those
versions.

Eric

On Tue, Sep 24, 2019 at 2:42 PM Jonathan Hung <jy...@gmail.com> wrote:

> Sorry, let me edit my first point. We can just create addendums for
> YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the
> correct id 28. We don’t need to revert YARN-6616 from these branches
> completely.
>
> Jonathan
>
> ________________________________
> From: Jonathan Hung <jy...@gmail.com>
> Sent: Tuesday, September 24, 2019 11:38 AM
> To: Eric Badger
> Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
> Subject: Re: Incompatible changes between branch-2.8 and branch-2.9
>
> Hi Eric, thanks for the investigation.
>
>   *   For YARN-6616, for branch-2.8 and below, it was only committed to
> 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
> revert YARN-6616 from branch-2.7 and branch-2.8.
>   *   For YARN-6050, there's a bit here:
> https://developers.google.com/protocol-buffers/docs/proto that says
> "optional is compatible with repeated", so I think we should be OK there.
>   *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or
> 2.8.5 to a 2.9+ version will be an issue. One option could be to move the
> intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
> users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
> release this), then upgrade from 2.8.6 to 2.9+.
>
> Jonathan Hung
>
>
> On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid>
> wrote:
> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
>
> Thanks,
>
>
> Eric
>
>
> -------------------------------------------------------------------
>
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
>
> https://issues.apache.org/jira/browse/YARN-6616
>
>    - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>
>    optional PriorityProto priority = 23;
>
>    optional string appNodeLabelExpression = 24;
>
>    optional string amNodeLabelExpression = 25;
>
> -  optional int64 submitTime = 26;
>
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>
> +  optional int64 launchTime = 27;
>
> +  optional int64 submitTime = 28;
>
>
> https://issues.apache.org/jira/browse/YARN-6050
>
>    - Trunk and branch-2 patches both change the protobuf type in the same
>    way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>
>    optional LogAggregationContextProto log_aggregation_context = 14;
>
>    optional ReservationIdProto reservation_id = 15;
>
>    optional string node_label_expression = 16;
>
> -  optional ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
>
> https://issues.apache.org/jira/browse/YARN-7813
>
>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>    branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>
>    optional string defaultNodeLabelExpression = 9;
>
>    optional QueueStatisticsProto queueStatistics = 10;
>
>    optional bool preemptionDisabled = 11;
>
> -  optional bool intraQueuePreemptionDisabled = 12;
>
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>
> +  optional bool intraQueuePreemptionDisabled = 13;
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
Sorry, let me edit my first point. We can just create addendums for YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the correct id 28. We don’t need to revert YARN-6616 from these branches completely.

Jonathan

________________________________
From: Jonathan Hung <jy...@gmail.com>
Sent: Tuesday, September 24, 2019 11:38 AM
To: Eric Badger
Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
Subject: Re: Incompatible changes between branch-2.8 and branch-2.9

Hi Eric, thanks for the investigation.

  *   For YARN-6616, for branch-2.8 and below, it was only committed to 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can revert YARN-6616 from branch-2.7 and branch-2.8.
  *   For YARN-6050, there's a bit here: https://developers.google.com/protocol-buffers/docs/proto that says "optional is compatible with repeated", so I think we should be OK there.
  *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5 to a 2.9+ version will be an issue. One option could be to move the intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to release this), then upgrade from 2.8.6 to 2.9+.

Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
We (Verizon Media) are currently moving towards upgrading our clusters from
our internal fork of branch-2.8 to an internal fork of branch-2. During
this process, we have found multiple incompatible changes in protobufs
between branch-2.8 and branch-2. These incompatibilities were all
introduced between branch-2.8 and branch-2.9. I did a git diff over all
.proto files across the branch-2.8 and branch-2.9 and found 3 instances of
incompatibilities from 3 separate commits. All of the incompatibilities are
in yarn_protos.proto


I would like to discuss how to fix these incompatible changes. Otherwise,
rolling upgrades will not be supported between branch-2.8 (or below) and
branch-2.9 (or beyond). We could revert the incompatible changes, but then
the new releases would be incompatible with the releases that have these
incompatible changes. If we do nothing, then rolling upgrades won't work
between 2.8- and 2.9+.


Thanks,


Eric


-------------------------------------------------------------------


git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')


https://issues.apache.org/jira/browse/YARN-6616

   - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch

@@ -211,7 +245,20 @@ message ApplicationReportProto {

   optional PriorityProto priority = 23;

   optional string appNodeLabelExpression = 24;

   optional string amNodeLabelExpression = 25;

-  optional int64 submitTime = 26;

+  repeated AppTimeoutsMapProto appTimeouts = 26;

+  optional int64 launchTime = 27;

+  optional int64 submitTime = 28;


https://issues.apache.org/jira/browse/YARN-6050

   - Trunk and branch-2 patches both change the protobuf type in the same
   way.

@@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {

   optional LogAggregationContextProto log_aggregation_context = 14;

   optional ReservationIdProto reservation_id = 15;

   optional string node_label_expression = 16;

-  optional ResourceRequestProto am_container_resource_request = 17;

+  repeated ResourceRequestProto am_container_resource_request = 17;

+  repeated ApplicationTimeoutMapProto application_timeouts = 18;


https://issues.apache.org/jira/browse/YARN-7813

   - Trunk (applied through branch-3.1) and branch-3.0 (applied through
   branch-2.9) patches differ from branch-2.8 patch

@@ -425,7 +501,21 @@ message QueueInfoProto {

   optional string defaultNodeLabelExpression = 9;

   optional QueueStatisticsProto queueStatistics = 10;

   optional bool preemptionDisabled = 11;

-  optional bool intraQueuePreemptionDisabled = 12;

+  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;

+  optional bool intraQueuePreemptionDisabled = 13;

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
Sorry, let me edit my first point. We can just create addendums for YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the correct id 28. We don’t need to revert YARN-6616 from these branches completely.

Jonathan

________________________________
From: Jonathan Hung <jy...@gmail.com>
Sent: Tuesday, September 24, 2019 11:38 AM
To: Eric Badger
Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
Subject: Re: Incompatible changes between branch-2.8 and branch-2.9

Hi Eric, thanks for the investigation.

  *   For YARN-6616, for branch-2.8 and below, it was only committed to 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can revert YARN-6616 from branch-2.7 and branch-2.8.
  *   For YARN-6050, there's a bit here: https://developers.google.com/protocol-buffers/docs/proto that says "optional is compatible with repeated", so I think we should be OK there.
  *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5 to a 2.9+ version will be an issue. One option could be to move the intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to release this), then upgrade from 2.8.6 to 2.9+.

Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
We (Verizon Media) are currently moving towards upgrading our clusters from
our internal fork of branch-2.8 to an internal fork of branch-2. During
this process, we have found multiple incompatible changes in protobufs
between branch-2.8 and branch-2. These incompatibilities were all
introduced between branch-2.8 and branch-2.9. I did a git diff over all
.proto files across the branch-2.8 and branch-2.9 and found 3 instances of
incompatibilities from 3 separate commits. All of the incompatibilities are
in yarn_protos.proto


I would like to discuss how to fix these incompatible changes. Otherwise,
rolling upgrades will not be supported between branch-2.8 (or below) and
branch-2.9 (or beyond). We could revert the incompatible changes, but then
the new releases would be incompatible with the releases that have these
incompatible changes. If we do nothing, then rolling upgrades won't work
between 2.8- and 2.9+.


Thanks,


Eric


-------------------------------------------------------------------


git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')


https://issues.apache.org/jira/browse/YARN-6616

   - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch

@@ -211,7 +245,20 @@ message ApplicationReportProto {

   optional PriorityProto priority = 23;

   optional string appNodeLabelExpression = 24;

   optional string amNodeLabelExpression = 25;

-  optional int64 submitTime = 26;

+  repeated AppTimeoutsMapProto appTimeouts = 26;

+  optional int64 launchTime = 27;

+  optional int64 submitTime = 28;


https://issues.apache.org/jira/browse/YARN-6050

   - Trunk and branch-2 patches both change the protobuf type in the same
   way.

@@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {

   optional LogAggregationContextProto log_aggregation_context = 14;

   optional ReservationIdProto reservation_id = 15;

   optional string node_label_expression = 16;

-  optional ResourceRequestProto am_container_resource_request = 17;

+  repeated ResourceRequestProto am_container_resource_request = 17;

+  repeated ApplicationTimeoutMapProto application_timeouts = 18;


https://issues.apache.org/jira/browse/YARN-7813

   - Trunk (applied through branch-3.1) and branch-3.0 (applied through
   branch-2.9) patches differ from branch-2.8 patch

@@ -425,7 +501,21 @@ message QueueInfoProto {

   optional string defaultNodeLabelExpression = 9;

   optional QueueStatisticsProto queueStatistics = 10;

   optional bool preemptionDisabled = 11;

-  optional bool intraQueuePreemptionDisabled = 12;

+  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;

+  optional bool intraQueuePreemptionDisabled = 13;

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
Sorry, let me edit my first point. We can just create addendums for YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the correct id 28. We don’t need to revert YARN-6616 from these branches completely.

Jonathan

________________________________
From: Jonathan Hung <jy...@gmail.com>
Sent: Tuesday, September 24, 2019 11:38 AM
To: Eric Badger
Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
Subject: Re: Incompatible changes between branch-2.8 and branch-2.9

Hi Eric, thanks for the investigation.

  *   For YARN-6616, for branch-2.8 and below, it was only committed to 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can revert YARN-6616 from branch-2.7 and branch-2.8.
  *   For YARN-6050, there's a bit here: https://developers.google.com/protocol-buffers/docs/proto that says "optional is compatible with repeated", so I think we should be OK there.
  *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5 to a 2.9+ version will be an issue. One option could be to move the intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to release this), then upgrade from 2.8.6 to 2.9+.

Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
We (Verizon Media) are currently moving towards upgrading our clusters from
our internal fork of branch-2.8 to an internal fork of branch-2. During
this process, we have found multiple incompatible changes in protobufs
between branch-2.8 and branch-2. These incompatibilities were all
introduced between branch-2.8 and branch-2.9. I did a git diff over all
.proto files across the branch-2.8 and branch-2.9 and found 3 instances of
incompatibilities from 3 separate commits. All of the incompatibilities are
in yarn_protos.proto


I would like to discuss how to fix these incompatible changes. Otherwise,
rolling upgrades will not be supported between branch-2.8 (or below) and
branch-2.9 (or beyond). We could revert the incompatible changes, but then
the new releases would be incompatible with the releases that have these
incompatible changes. If we do nothing, then rolling upgrades won't work
between 2.8- and 2.9+.


Thanks,


Eric


-------------------------------------------------------------------


git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')


https://issues.apache.org/jira/browse/YARN-6616

   - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch

@@ -211,7 +245,20 @@ message ApplicationReportProto {

   optional PriorityProto priority = 23;

   optional string appNodeLabelExpression = 24;

   optional string amNodeLabelExpression = 25;

-  optional int64 submitTime = 26;

+  repeated AppTimeoutsMapProto appTimeouts = 26;

+  optional int64 launchTime = 27;

+  optional int64 submitTime = 28;


https://issues.apache.org/jira/browse/YARN-6050

   - Trunk and branch-2 patches both change the protobuf type in the same
   way.

@@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {

   optional LogAggregationContextProto log_aggregation_context = 14;

   optional ReservationIdProto reservation_id = 15;

   optional string node_label_expression = 16;

-  optional ResourceRequestProto am_container_resource_request = 17;

+  repeated ResourceRequestProto am_container_resource_request = 17;

+  repeated ApplicationTimeoutMapProto application_timeouts = 18;


https://issues.apache.org/jira/browse/YARN-7813

   - Trunk (applied through branch-3.1) and branch-3.0 (applied through
   branch-2.9) patches differ from branch-2.8 patch

@@ -425,7 +501,21 @@ message QueueInfoProto {

   optional string defaultNodeLabelExpression = 9;

   optional QueueStatisticsProto queueStatistics = 10;

   optional bool preemptionDisabled = 11;

-  optional bool intraQueuePreemptionDisabled = 12;

+  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;

+  optional bool intraQueuePreemptionDisabled = 13;

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
Sorry, let me edit my first point. We can just create addendums for YARN-6616 in branch-2.7 and branch-2.8 to edit the submitTime field to the correct id 28. We don’t need to revert YARN-6616 from these branches completely.

Jonathan

________________________________
From: Jonathan Hung <jy...@gmail.com>
Sent: Tuesday, September 24, 2019 11:38 AM
To: Eric Badger
Cc: Hadoop Common; yarn-dev; mapreduce-dev; Hdfs-dev
Subject: Re: Incompatible changes between branch-2.8 and branch-2.9

Hi Eric, thanks for the investigation.

  *   For YARN-6616, for branch-2.8 and below, it was only committed to 2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can revert YARN-6616 from branch-2.7 and branch-2.8.
  *   For YARN-6050, there's a bit here: https://developers.google.com/protocol-buffers/docs/proto that says "optional is compatible with repeated", so I think we should be OK there.
  *   For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5 to a 2.9+ version will be an issue. One option could be to move the intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to release this), then upgrade from 2.8.6 to 2.9+.

Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
We (Verizon Media) are currently moving towards upgrading our clusters from
our internal fork of branch-2.8 to an internal fork of branch-2. During
this process, we have found multiple incompatible changes in protobufs
between branch-2.8 and branch-2. These incompatibilities were all
introduced between branch-2.8 and branch-2.9. I did a git diff over all
.proto files across the branch-2.8 and branch-2.9 and found 3 instances of
incompatibilities from 3 separate commits. All of the incompatibilities are
in yarn_protos.proto


I would like to discuss how to fix these incompatible changes. Otherwise,
rolling upgrades will not be supported between branch-2.8 (or below) and
branch-2.9 (or beyond). We could revert the incompatible changes, but then
the new releases would be incompatible with the releases that have these
incompatible changes. If we do nothing, then rolling upgrades won't work
between 2.8- and 2.9+.


Thanks,


Eric


-------------------------------------------------------------------


git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')


https://issues.apache.org/jira/browse/YARN-6616

   - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch

@@ -211,7 +245,20 @@ message ApplicationReportProto {

   optional PriorityProto priority = 23;

   optional string appNodeLabelExpression = 24;

   optional string amNodeLabelExpression = 25;

-  optional int64 submitTime = 26;

+  repeated AppTimeoutsMapProto appTimeouts = 26;

+  optional int64 launchTime = 27;

+  optional int64 submitTime = 28;


https://issues.apache.org/jira/browse/YARN-6050

   - Trunk and branch-2 patches both change the protobuf type in the same
   way.

@@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {

   optional LogAggregationContextProto log_aggregation_context = 14;

   optional ReservationIdProto reservation_id = 15;

   optional string node_label_expression = 16;

-  optional ResourceRequestProto am_container_resource_request = 17;

+  repeated ResourceRequestProto am_container_resource_request = 17;

+  repeated ApplicationTimeoutMapProto application_timeouts = 18;


https://issues.apache.org/jira/browse/YARN-7813

   - Trunk (applied through branch-3.1) and branch-3.0 (applied through
   branch-2.9) patches differ from branch-2.8 patch

@@ -425,7 +501,21 @@ message QueueInfoProto {

   optional string defaultNodeLabelExpression = 9;

   optional QueueStatisticsProto queueStatistics = 10;

   optional bool preemptionDisabled = 11;

-  optional bool intraQueuePreemptionDisabled = 12;

+  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;

+  optional bool intraQueuePreemptionDisabled = 13;

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
Hi Eric, thanks for the investigation.

   - For YARN-6616, for branch-2.8 and below, it was only committed to
   2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
   revert YARN-6616 from branch-2.7 and branch-2.8.
   - For YARN-6050, there's a bit here:
   https://developers.google.com/protocol-buffers/docs/proto that says
   "optional is compatible with repeated", so I think we should be OK there.
   - For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
   to a 2.9+ version will be an issue. One option could be to move the
   intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
   users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
   release this), then upgrade from 2.8.6 to 2.9+.


Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger
<eb...@verizonmedia.com.invalid> wrote:

> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
>
> Thanks,
>
>
> Eric
>
>
> -------------------------------------------------------------------
>
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
>
> https://issues.apache.org/jira/browse/YARN-6616
>
>    - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>
>    optional PriorityProto priority = 23;
>
>    optional string appNodeLabelExpression = 24;
>
>    optional string amNodeLabelExpression = 25;
>
> -  optional int64 submitTime = 26;
>
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>
> +  optional int64 launchTime = 27;
>
> +  optional int64 submitTime = 28;
>
>
> https://issues.apache.org/jira/browse/YARN-6050
>
>    - Trunk and branch-2 patches both change the protobuf type in the same
>    way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>
>    optional LogAggregationContextProto log_aggregation_context = 14;
>
>    optional ReservationIdProto reservation_id = 15;
>
>    optional string node_label_expression = 16;
>
> -  optional ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
>
> https://issues.apache.org/jira/browse/YARN-7813
>
>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>    branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>
>    optional string defaultNodeLabelExpression = 9;
>
>    optional QueueStatisticsProto queueStatistics = 10;
>
>    optional bool preemptionDisabled = 11;
>
> -  optional bool intraQueuePreemptionDisabled = 12;
>
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>
> +  optional bool intraQueuePreemptionDisabled = 13;
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
Hi Eric, thanks for the investigation.

   - For YARN-6616, for branch-2.8 and below, it was only committed to
   2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
   revert YARN-6616 from branch-2.7 and branch-2.8.
   - For YARN-6050, there's a bit here:
   https://developers.google.com/protocol-buffers/docs/proto that says
   "optional is compatible with repeated", so I think we should be OK there.
   - For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
   to a 2.9+ version will be an issue. One option could be to move the
   intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
   users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
   release this), then upgrade from 2.8.6 to 2.9+.


Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger
<eb...@verizonmedia.com.invalid> wrote:

> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
>
> Thanks,
>
>
> Eric
>
>
> -------------------------------------------------------------------
>
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
>
> https://issues.apache.org/jira/browse/YARN-6616
>
>    - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>
>    optional PriorityProto priority = 23;
>
>    optional string appNodeLabelExpression = 24;
>
>    optional string amNodeLabelExpression = 25;
>
> -  optional int64 submitTime = 26;
>
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>
> +  optional int64 launchTime = 27;
>
> +  optional int64 submitTime = 28;
>
>
> https://issues.apache.org/jira/browse/YARN-6050
>
>    - Trunk and branch-2 patches both change the protobuf type in the same
>    way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>
>    optional LogAggregationContextProto log_aggregation_context = 14;
>
>    optional ReservationIdProto reservation_id = 15;
>
>    optional string node_label_expression = 16;
>
> -  optional ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
>
> https://issues.apache.org/jira/browse/YARN-7813
>
>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>    branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>
>    optional string defaultNodeLabelExpression = 9;
>
>    optional QueueStatisticsProto queueStatistics = 10;
>
>    optional bool preemptionDisabled = 11;
>
> -  optional bool intraQueuePreemptionDisabled = 12;
>
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>
> +  optional bool intraQueuePreemptionDisabled = 13;
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
Hi Eric, thanks for the investigation.

   - For YARN-6616, for branch-2.8 and below, it was only committed to
   2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
   revert YARN-6616 from branch-2.7 and branch-2.8.
   - For YARN-6050, there's a bit here:
   https://developers.google.com/protocol-buffers/docs/proto that says
   "optional is compatible with repeated", so I think we should be OK there.
   - For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
   to a 2.9+ version will be an issue. One option could be to move the
   intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
   users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
   release this), then upgrade from 2.8.6 to 2.9+.


Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger
<eb...@verizonmedia.com.invalid> wrote:

> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
>
> Thanks,
>
>
> Eric
>
>
> -------------------------------------------------------------------
>
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
>
> https://issues.apache.org/jira/browse/YARN-6616
>
>    - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>
>    optional PriorityProto priority = 23;
>
>    optional string appNodeLabelExpression = 24;
>
>    optional string amNodeLabelExpression = 25;
>
> -  optional int64 submitTime = 26;
>
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>
> +  optional int64 launchTime = 27;
>
> +  optional int64 submitTime = 28;
>
>
> https://issues.apache.org/jira/browse/YARN-6050
>
>    - Trunk and branch-2 patches both change the protobuf type in the same
>    way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>
>    optional LogAggregationContextProto log_aggregation_context = 14;
>
>    optional ReservationIdProto reservation_id = 15;
>
>    optional string node_label_expression = 16;
>
> -  optional ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
>
> https://issues.apache.org/jira/browse/YARN-7813
>
>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>    branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>
>    optional string defaultNodeLabelExpression = 9;
>
>    optional QueueStatisticsProto queueStatistics = 10;
>
>    optional bool preemptionDisabled = 11;
>
> -  optional bool intraQueuePreemptionDisabled = 12;
>
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>
> +  optional bool intraQueuePreemptionDisabled = 13;
>

Re: Incompatible changes between branch-2.8 and branch-2.9

Posted by Jonathan Hung <jy...@gmail.com>.
Hi Eric, thanks for the investigation.

   - For YARN-6616, for branch-2.8 and below, it was only committed to
   2.7.8/2.8.6 which have not been released (as I understand). Perhaps we can
   revert YARN-6616 from branch-2.7 and branch-2.8.
   - For YARN-6050, there's a bit here:
   https://developers.google.com/protocol-buffers/docs/proto that says
   "optional is compatible with repeated", so I think we should be OK there.
   - For YARN-7813, it's in 2.8.4 so it seems upgrading from 2.8.4 or 2.8.5
   to a 2.9+ version will be an issue. One option could be to move the
   intraQueuePreemptionDisabled field from id 12 to id 13 in branch-2.8, then
   users would upgrade from 2.8.4/2.8.5 to 2.8.6 (someone would have to
   release this), then upgrade from 2.8.6 to 2.9+.


Jonathan Hung


On Tue, Sep 24, 2019 at 9:23 AM Eric Badger
<eb...@verizonmedia.com.invalid> wrote:

> We (Verizon Media) are currently moving towards upgrading our clusters from
> our internal fork of branch-2.8 to an internal fork of branch-2. During
> this process, we have found multiple incompatible changes in protobufs
> between branch-2.8 and branch-2. These incompatibilities were all
> introduced between branch-2.8 and branch-2.9. I did a git diff over all
> .proto files across the branch-2.8 and branch-2.9 and found 3 instances of
> incompatibilities from 3 separate commits. All of the incompatibilities are
> in yarn_protos.proto
>
>
> I would like to discuss how to fix these incompatible changes. Otherwise,
> rolling upgrades will not be supported between branch-2.8 (or below) and
> branch-2.9 (or beyond). We could revert the incompatible changes, but then
> the new releases would be incompatible with the releases that have these
> incompatible changes. If we do nothing, then rolling upgrades won't work
> between 2.8- and 2.9+.
>
>
> Thanks,
>
>
> Eric
>
>
> -------------------------------------------------------------------
>
>
> git diff branch-2.8..branch-2.9 $(find . -name '*\.proto')
>
>
> https://issues.apache.org/jira/browse/YARN-6616
>
>    - Trunk patch (applied through branch-2.9) differs from branch-2.8 patch
>
> @@ -211,7 +245,20 @@ message ApplicationReportProto {
>
>    optional PriorityProto priority = 23;
>
>    optional string appNodeLabelExpression = 24;
>
>    optional string amNodeLabelExpression = 25;
>
> -  optional int64 submitTime = 26;
>
> +  repeated AppTimeoutsMapProto appTimeouts = 26;
>
> +  optional int64 launchTime = 27;
>
> +  optional int64 submitTime = 28;
>
>
> https://issues.apache.org/jira/browse/YARN-6050
>
>    - Trunk and branch-2 patches both change the protobuf type in the same
>    way.
>
> @@ -356,7 +416,22 @@ message ApplicationSubmissionContextProto {
>
>    optional LogAggregationContextProto log_aggregation_context = 14;
>
>    optional ReservationIdProto reservation_id = 15;
>
>    optional string node_label_expression = 16;
>
> -  optional ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ResourceRequestProto am_container_resource_request = 17;
>
> +  repeated ApplicationTimeoutMapProto application_timeouts = 18;
>
>
> https://issues.apache.org/jira/browse/YARN-7813
>
>    - Trunk (applied through branch-3.1) and branch-3.0 (applied through
>    branch-2.9) patches differ from branch-2.8 patch
>
> @@ -425,7 +501,21 @@ message QueueInfoProto {
>
>    optional string defaultNodeLabelExpression = 9;
>
>    optional QueueStatisticsProto queueStatistics = 10;
>
>    optional bool preemptionDisabled = 11;
>
> -  optional bool intraQueuePreemptionDisabled = 12;
>
> +  repeated QueueConfigurationsMapProto queueConfigurationsMap = 12;
>
> +  optional bool intraQueuePreemptionDisabled = 13;
>