Posted to user@mesos.apache.org by Alex Rukletsov <al...@mesosphere.com> on 2016/10/12 16:34:23 UTC

On Mesos versioning and deprecation policy

Folks,

There have been a bunch of online [1, 2] and offline discussions about our
deprecation and versioning policy. I found that people—including
myself—read the versioning doc [3] differently; moreover some aspects are
not captured there. I would like to start a discussion around this topic by
sharing my confusions and suggestions. This will hopefully help us stay on
the same page and have similar expectations. The second goal is to
eliminate ambiguities from the versioning doc (thanks Vinod for
volunteering to update it).

1. API vs. semantic changes.
The current versioning guide treats features (e.g. flags, metrics, endpoints)
and API differently: incompatible changes for the former are allowed after
a 6-month deprecation cycle, while for the latter they require bumping a
major version. I suggest we consolidate these policies.

We should also define and clearly explain what changes require bumping the
major version. I have no strong opinion here and would love to hear what
people think. The original motivation for maintaining backwards
compatibility is to make sure vN schedulers can correctly work with vN API
without being updated. But what about semantic changes that do not touch
the API? For example, what if we decide to send fewer task health updates to
schedulers based on some health policy? It influences the flow of task
status updates; should such a change be considered compatible? Taking it to
an extreme, we may not even be able to fix some bugs because someone may
already rely on this behaviour!

Another tightly related thing we should explicitly call out is
upgradability and rollback capabilities inside a major release. Committing
to this may significantly limit what we can change within a major release;
on the other hand it will give users more time and a better experience
using and maintaining Mesos clusters.

2. Versioned vs. unversioned protobufs.
Currently we have v1 and unnamed protobufs, which simultaneously mean v0,
v2, and internal. I am sometimes confused about what is the right way to
update or introduce a field or message there; do people feel the same? How
about splitting the unnamed version into explicit v0, v2, and internal?

Food for thought. It would be great if we could maintain only "diffs" to the
internal protobufs in the code, instead of duplicating them altogether.

3. API and feature labelling.
I suggest introducing explicit labels for API and features, to ensure
users have the right assumptions about their lifetime while engineers
have the ability to change a WIP feature in a non-compatible way. I
propose the following:
API: stable, non-stable, pure (not used by Mesos components)
Feature: experimental, normal.
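
To make the proposed labels concrete, here is a rough Python sketch. It is an
illustration only and does not correspond to anything in the Mesos code base:
the names `ApiStability`, `FeatureMaturity`, `Label`, and
`may_change_incompatibly` are hypothetical, and the rule that only a stable
API backing a normal feature is frozen is an assumption about how the labels
could be read.

```python
from dataclasses import dataclass
from enum import Enum


class ApiStability(Enum):
    """Proposed API labels (names taken from the proposal above)."""
    STABLE = "stable"
    NON_STABLE = "non-stable"
    PURE = "pure"  # not used by Mesos components


class FeatureMaturity(Enum):
    """Proposed feature labels."""
    EXPERIMENTAL = "experimental"
    NORMAL = "normal"


@dataclass(frozen=True)
class Label:
    """Hypothetical record attaching both labels to a named API or feature."""
    name: str
    api: ApiStability
    feature: FeatureMaturity


def may_change_incompatibly(label: Label) -> bool:
    """Assumed rule: only a stable API of a normal feature is frozen."""
    frozen = (label.api is ApiStability.STABLE
              and label.feature is FeatureMaturity.NORMAL)
    return not frozen
```

Under this reading, something labelled non-stable or experimental could still
change freely, while a stable API of a normal feature could not.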

Looking forward to your thoughts and suggestions.
AlexR

[1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
[2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
[3]
https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md

Re: On Mesos versioning and deprecation policy

Posted by haosdent <ha...@gmail.com>.
>How about splitting the unnamed version into explicit v0, v2, and internal?

Currently our internal protobufs and the v0 protobufs share the same
unnamed-version protobuf files under the same namespace (`package mesos`).
Splitting v0 and internal would require copying all protobuf files under
`package mesos` into `package mesos.internal` and changing the whole code
base to use the protobufs in `package mesos.internal`.
Still, it would be beneficial, because we could then avoid [the hacks][1]
that convert from the unversioned protobuf (v0) to the unversioned
protobuf (internal).

[1]
https://github.com/apache/mesos/blob/fa976c22ac66ff5c905157a5a36bda1d21525b32/src/master/master.cpp#L4077-L4108

-- 
Best Regards,
Haosdent Huang

Re: On Mesos versioning and deprecation policy

Posted by haosdent <ha...@gmail.com>.
+1 for the summary. Now it is clear to me.

On Sat, Oct 29, 2016 at 6:45 AM, Vinod Kone <vi...@apache.org> wrote:

> We had an extended discussion around this in the last community sync.
> Thanks to those who participated!
>
> To sum up the discussion:
>
> --> As Mesos devs, we should strive to not make incompatible changes in
> APIs, flags, or environment variables.
>
> --> In the rare case where an incompatible change is preferred (e.g., because
> of code complexity), we should give users a clear 6-month heads-up that a
> breaking change is going to take place.
>
> --> Breaking changes do not necessitate a major version bump. This is
> because we want to allow live upgrades between major versions (e.g., 1.10
> to 2.0).
>
> --> Compatibility guarantees do not apply to experimental features (incl.
> APIs).
>
> --> We need to have clear documentation about the procedure that devs can
> follow when deprecating/removing stable features and adding experimental
> features.
>
> --> We need to improve upgrades.md to make it easy for operators to know
> what features are deprecated/removed between versions X and Y.
>
> --> We should decouple internal protos used by Mesos from the unversioned
> protos used by driver based frameworks.
>
> I will spend some time in the next few weeks to create/update the
> documentation reflecting these points.
>
> Anything else I missed?
>
> Thanks,
>
> On Sat, Oct 15, 2016 at 11:47 AM, haosdent <ha...@gmail.com> wrote:
>
> > Thanks for @yan's great input! I agree with almost all of it.
> >
> > > Also the API is not just what the machine reads but all the
> > > documentation associated with it, right? It depends on what the
> > > documentation says; what the user _should_ expect.
> >
> > I think different users may have different expectations, and the person
> > who developed an API may understand it differently from some users as
> > well. Our documentation should cover most cases.
> >
> > But if we didn't, or forgot to, write something explicitly in the
> > documentation, should we give up on updating the API? It is like user
> > Alice saying this is a bug while user Bob says it is a feature. I think
> > we still need to raise such changes case by case to ensure most users
> > are not affected by breaking API changes.
> >
> > On Sat, Oct 15, 2016 at 6:55 AM, Vinod Kone <vi...@apache.org> wrote:
> >
> > > We will chat about this in the upcoming community sync (Thursday 3 PM).
> > > So, please make sure to attend if you are interested.
> > >
> > > On Fri, Oct 14, 2016 at 3:44 PM, Yan Xu <xu...@apple.com> wrote:
> > >
> > >>
> > >> On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu <xu...@apple.com> wrote:
> > >>
> > >>> Thanks Alex for starting this!
> > >>>
> > >>> In addition to comments below, I think it'll be helpful to keep the
> > >>> existing versioning doc concise and user-friendly while having a
> > >>> dedicated doc for the "implementation details" where precise
> > >>> requirements and procedures go. Maybe some duplication/cross-referencing
> > >>> is needed but Mesos developers will find the latter much more helpful
> > >>> while users/framework developers will find the former easy to read.
> > >>>
> > >>> e.g., a similar split:
> > >>> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
> > >>> https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
> > >>> (which has a lot of details on how the Kubernetes community is
> > >>> thinking about similar issues, which we can learn from)
> > >>>
> > >>> Jiang Yan Xu 
> > >>>
> > >>> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <alex@mesosphere.com>
> > >>> wrote:
> > >>>
> > >>>> Folks,
> > >>>>
> > >>>> There have been a bunch of online [1, 2] and offline discussions
> > >>>> about our
> > >>>> deprecation and versioning policy. I found that people—including
> > >>>> myself—read the versioning doc [3] differently; moreover some
> > >>>> aspects are
> > >>>> not captured there. I would like to start a discussion around this
> > >>>> topic by sharing my confusions and suggestions. This will hopefully
> > >>>> help us stay on the same page and have similar expectations. The
> > >>>> second goal is to
> > >>>> eliminate ambiguities from the versioning doc (thanks Vinod for
> > >>>> volunteering to update it).
> > >>>>
> > >>>
> > >>> +1 Let me know if there are things I can help with.
> > >>>
> > >>>
> > >>>>
> > >>>> 1. API vs. semantic changes.
> > >>>> The current versioning guide treats features (e.g. flags, metrics,
> > >>>> endpoints) and API differently: incompatible changes for the former
> > >>>> are allowed after a 6-month deprecation cycle, while for the latter
> > >>>> they require bumping a major version. I suggest we consolidate these
> > >>>> policies.
> > >>>>
> > >>>
> > >>> I feel that the distinction is not API vs. semantic changes. A
> > >>> backwards-compatible API guarantee should imply backwards-compatible
> > >>> semantics (of the API), i.e., if a change in the API doesn't cause the
> > >>> message to be dropped on the floor but leads to a behavior change that
> > >>> causes problems in the system, it still breaks compatibility.
> > >>>
> > >>> IMO the distinction is more between:
> > >>> - Compatibility between components that are impossible/very unpleasant
> > >>> to upgrade in lockstep - high priority for compatibility guarantee.
> > >>> - Compatibility between components that are generally bundled (modules)
> > >>> or things that usually aren't built into automated tooling (e.g., the
> > >>> /state endpoint) - more relaxed for now but we should explicitly
> > >>> exclude them from the guarantee.
> > >>>
> > >>>
> > >>>>
> > >>>> We should also define and clearly explain what changes require bumping
> > >>>> the major version. I have no strong opinion here and would love to
> > >>>> hear what people think. The original motivation for maintaining
> > >>>> backwards compatibility is to make sure vN schedulers can correctly
> > >>>> work with vN API without being updated. But what about semantic
> > >>>> changes that do not touch the API? For example, what if we decide to
> > >>>> send fewer task health updates to schedulers based on some health
> > >>>> policy? It influences the flow of task status updates; should such a
> > >>>> change be considered compatible? Taking it to an extreme, we may not
> > >>>> even be able to fix some bugs because someone may already rely on
> > >>>> this behaviour!
> > >>>>
> > >>>
> > >>> API changes should warrant a major version bump. Also the API is not
> > >>> just what the machine reads but all the documentation associated with
> > >>> it, right? It depends on what the documentation says; what the user
> > >>> _should_ expect.
> > >>>
> > >>> That said, I feel that these things are hard to talk about in the
> > >>> abstract. Even with a guideline, we still need to make case-by-case
> > >>> decisions. (e.g., has the documentation precisely defined this
> > >>> behavior? If not, is it reasonable for the users to expect some
> > >>> behavior because it's common sense? How bad is it if some behavior
> > >>> changes just a tiny bit?) Therefore we need to make sure the process
> > >>> for API changes is more rigorously defined.
> > >>>
> > >>> Whether something is a bug depends on whether the API does what it says
> > >>> it'll do. The line may sometimes be blurry but in general I don't feel
> > >>> it's a problem. If someone is relying on behavior that is a bug, we
> > >>> should still help them fix it but the bug shouldn't count as "our
> > >>> guarantee".
> > >>>
> > >>>
> > >>>>
> > >>>> Another tightly related thing we should explicitly call out is
> > >>>> upgradability and rollback capabilities inside a major release.
> > >>>> Committing to this may significantly limit what we can change within
> > >>>> a major release; on the other hand it will give users more time and a
> > >>>> better experience using and maintaining Mesos clusters.
> > >>>>
> > >>>
> > >>> According to the versioning doc, upgradability depends on whether you
> > >>> depend on deprecated/removed features.
> > >>>
> > >>> That paragraph should be explained more precisely:
> > >>> - "deprecated" means your system won't break but warnings are shown.
> > >>> (Maybe we should use some standard deprecation warning keywords so the
> > >>> operator can monitor the log for such warnings!)
> > >>> - "removed" means it may break.
> > >>>
> > >>> If you deprecate a flag/env variable that interfaces with operator
> > >>> tooling in the next minor release, the operator basically has 6 months
> > >>> from the next minor release to change her tooling. I feel this is
> > >>> pretty acceptable.
> > >>> If you deprecate a flag/env variable that interfaces with the framework
> > >>> (executor) in the next minor release, I feel it may not be enough and
> > >>> it probably warrants a major version bump. So perhaps the API shouldn't
> > >>> be just the protos.
> > >>>
> > >>>
> > >>>> 2. Versioned vs. unversioned protobufs.
> > >>>> Currently we have v1 and unnamed protobufs, which simultaneously mean
> > >>>> v0, v2, and internal. I am sometimes confused about what is the right
> > >>>> way to update or introduce a field or message there; do people feel
> > >>>> the same? How about splitting the unnamed version into explicit v0,
> > >>>> v2, and internal?
> > >>>>
> > >>>
> > >>> As haosdent mentioned, we have captured this in MESOS-6268. The benefit
> > >>> is clear but I guess people will be more motivated when we find some v2
> > >>> feature can't be made compatible with the v0 API (Anand's point
> > >>> in MESOS-6016). On the other hand, if we cut v0 API access before that
> > >>> happens (is the v0 API obsolete and should it be removed 6 months after
> > >>> 1.0?) then we don't need to worry about v0 and can use unversioned
> > >>> protos as "internal"?
> > >>>
> > >>>
> > >>>> Food for thought. It would be great if we could maintain only "diffs"
> > >>>> to the internal protobufs in the code, instead of duplicating them
> > >>>> altogether.
> > >>>>
> > >>>> 3. API and feature labelling.
> > >>>> I suggest introducing explicit labels for API and features, to ensure
> > >>>> users have the right assumptions about their lifetime while engineers
> > >>>> have the ability to change a WIP feature in a non-compatible way. I
> > >>>> propose the following:
> > >>>> API: stable, non-stable, pure (not used by Mesos components)
> > >>>> Feature: experimental, normal.
> > >>>>
> > >>>
> > >>> +1 on formalizing the terminology.
> > >>>
> > >>> Historically the distinction between the following has not been clear:
> > >>>
> > >>> 1. The API has no compatibility guarantee at all.
> > >>> 2. The feature provided by this API is experimental.
> > >>>
> > >>
> > >> To add to this point: because 2) logically doesn't apply to the "pure
> > >> (not used by Mesos components)" fields in the API, it could be more
> > >> confusing and thus require more precise definition.
> > >>
> > >>
> > >>>
> > >>> IMO it's OK to say that we don't distinguish the two (the API has
> > >>> no compatibility guarantee until the feature is fully released) but we
> > >>> have to make it clear.
> > >>> If we don't make such a distinction, ALL API additions should be marked
> > >>> as unstable first and changed to stable later (as a formal process).
> > >>>
> > >>>
> > >>>>
> > >>>> Looking forward to your thoughts and suggestions.
> > >>>> AlexR
> > >>>>
> > >>>> [1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
> > >>>> [2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
> > >>>> [3]
> > >>>> https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md
> > >>>>
> > >>>
> > >>>
> > >>
> > >
> >
> >
> > --
> > Best Regards,
> > Haosdent Huang
> >
>



-- 
Best Regards,
Haosdent Huang

Re: On Mesos versioning and deprecation policy

Posted by Vinod Kone <vi...@apache.org>.
We had an extended discussion around this in the last community sync.
Thanks to those who participated!

To sum up the discussion:

--> As Mesos devs, we should strive to not make incompatible changes in
APIs, flags, or environment variables.

--> In the rare case where an incompatible change is preferred (e.g., because
of code complexity), we should give users a clear 6-month heads-up that a
breaking change is going to take place.

--> Breaking changes do not necessitate a major version bump. This is
because we want to allow live upgrades between major versions (e.g., 1.10
to 2.0).

--> Compatibility guarantees do not apply to experimental features (incl.
APIs).

--> We need to have clear documentation about the procedure that devs could
follow when deprecating/removing stable features and adding experimental
features.

--> We need to improve upgrades.md to make it easy for operators to know
what features are deprecated/removed between versions X and Y.

--> We should decouple internal protos used by Mesos from the unversioned
protos used by driver-based frameworks.
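
To make the 6-month heads-up and the experimental exemption concrete, here is
a minimal Python sketch of a removal check. It is an illustration only: the
names add_months and may_remove are hypothetical, and counting the window from
the date the deprecation is announced is an assumption, since the summary does
not say when the clock starts.

```python
import calendar
from datetime import date

# Length of the heads-up period from the summary above.
DEPRECATION_WINDOW_MONTHS = 6


def add_months(start: date, months: int) -> date:
    """Return `start` moved forward by `months` calendar months.

    The day is clamped to the last day of the target month, so
    Aug 31 + 6 months becomes Feb 28 (or Feb 29 in a leap year).
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def may_remove(announced: date, today: date, experimental: bool = False) -> bool:
    """Decide whether a deprecated feature may be removed on `today`.

    Assumption: the window starts when the deprecation is announced.
    Experimental features carry no compatibility guarantee, so they are
    exempt from the window.
    """
    if experimental:
        return True
    return today >= add_months(announced, DEPRECATION_WINDOW_MONTHS)
```

For example, under these assumptions a flag deprecated on 2016-10-20 could be
removed on or after 2017-04-20, while an experimental API could change at any
time.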

I will spend some time in the next few weeks to create/update the
documentation reflecting these points.
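
As a rough idea of what "deprecated/removed between versions X and Y" could
look like if upgrades.md were backed by structured data, here is a small
Python sketch. Everything in it is hypothetical: the Change record, the
CHANGES entries (placeholder names, not real Mesos flags or endpoints) and
the changes_between helper are assumptions for illustration, not an existing
Mesos tool or file format.

```python
from typing import List, NamedTuple, Tuple


class Change(NamedTuple):
    version: Tuple[int, int]  # (major, minor) release where the change landed
    kind: str                 # "deprecated" or "removed"
    feature: str              # flag, endpoint, env variable, ...


# Placeholder data for illustration only; not real Mesos release notes.
CHANGES = [
    Change((1, 1), "deprecated", "--example_flag"),
    Change((1, 3), "removed", "--example_flag"),
    Change((1, 2), "deprecated", "/example-endpoint"),
]


def parse_version(text: str) -> Tuple[int, int]:
    """Parse "1.10" or "1.10.2" into (1, 10) so versions compare numerically."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def changes_between(current: str, target: str,
                    changes: List[Change] = CHANGES) -> List[Change]:
    """List changes an operator hits when upgrading from `current` to `target`.

    Changes that landed in `current` itself are excluded; changes that land
    in `target` are included.
    """
    lo, hi = parse_version(current), parse_version(target)
    return sorted(c for c in changes if lo < c.version <= hi)
```

Comparing versions as integer tuples rather than strings matters here: 1.10
must sort after 1.9, which a plain string comparison would get wrong.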

Anything else I missed?

Thanks,

On Sat, Oct 15, 2016 at 11:47 AM, haosdent <ha...@gmail.com> wrote:

> Thanks @yan's great inputs! I couldn't agree more almost of them.
>
> > Also the API is not just what the machine reads but all the documentation
> > associated with it, right? It depends on what the documentation says; what
> > the user _should_ expect.
>
> I think different users may have different expectations. And the guy who
> developed the APIs may have different understand from some users as well.
> Our documentations should cover most of cases.
>
> But in case that we didn't or forgot to write it explicitly in the
> document, should we give up to update the API? Just like user Alice said
> this is a BUG while user Bob said this is a feature. I think we still need
> to raise it case by case to ensure most users are not affected by the
> breaking API changes.
>
> On Sat, Oct 15, 2016 at 6:55 AM, Vinod Kone <vi...@apache.org> wrote:
>
> > We will chat about this in the upcoming community sync (thursday 3 PM).
> > So, please make sure to attend if you are interested.
> >
> > On Fri, Oct 14, 2016 at 3:44 PM, Yan Xu <xu...@apple.com> wrote:
> >
> >>
> >> On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu <xu...@apple.com> wrote:
> >>
> >>> Thanks Alex for starting this!
> >>>
> >>> In addition to comments below, I think it'll be helpful to keep the
> >>> existing versioning doc concise and user-friendly while having a dedicated
> >>> doc for the "implementation details" where precise requirements and
> >>> procedures go. Maybe some duplication/cross-referencing is needed but Mesos
> >>> developers will find the latter much more helpful while the users/framework
> >>> developer will find the former easy to read.
> >>>
> >>> e.g., a similar split:
> >>> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
> >>> https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
> >>> (which has a lot of details on how the kubernetes
> >>> community is thinking about similar issues, which we can learn from)
> >>>
> >>> Jiang Yan Xu 
> >>>
> >>> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <al...@mesosphere.com>
> >>> wrote:
> >>>
> >>>> Folks,
> >>>>
> >>>> There have been a bunch of online [1, 2] and offline discussions about
> >>>> our
> >>>> deprecation and versioning policy. I found that people—including
> >>>> myself—read the versioning doc [3] differently; moreover some aspects
> >>>> are
> >>>> not captured there. I would like to start a discussion around this
> >>>> topic by
> >>>> sharing my confusions and suggestions. This will hopefully help us
> >>>> stay
> >>>> on
> >>>> the same page and have similar expectations. The second goal is to
> >>>> eliminate ambiguities from the versioning doc (thanks Vinod for
> >>>> volunteering to update it).
> >>>>
> >>>
> >>> +1 Let me know if there are things I can help with.
> >>>
> >>>
> >>>>
> >>>> 1. API vs. semantic changes.
> >>>> Current versioning guide treat features (e.g. flags, metrics,
> >>>> endpoints)
> >>>> and API differently: incompatible changes for the former are allowed
> >>>> after
> >>>> 6 month deprecation cycle, while for the latter they require bumping a
> >>>> major version. I suggest we consolidate these policies.
> >>>>
> >>>
> >>> I feel that the distinction is not API vs. semantic changes, Backwards
> >>> compatible API guarantee should imply backwards compatible semantics (of
> >>> the API).
> >>> i.e., if a change in API doesn't cause the message to be dropped to the
> >>> floor but leads to behavior change that causes problems in the system, it
> >>> still breaks compatibility.
> >>>
> >>> IMO the distinction is more between:
> >>> - Compatibility between components that are impossible/very unpleasant
> >>> to upgrade in lockstep - high priority for compatibility guarantee.
> >>> - Compatibility between components that are generally bundled (modules)
> >>> or things that usually aren't built into automated tooling (e.g., the
> >>> /state endpoint) - more relaxed for now but we should explicitly exclude
> >>> them from the guarantee.
> >>>
> >>>
> >>>>
> >>>> We should also define and clearly explain what changes require bumping
> >>>> the
> >>>> major version. I have no strong opinion here and would love to hear
> >>>> what
> >>>> people think. The original motivation for maintaining backwards
> >>>> compatibility is to make sure vN schedulers can correctly work with vN
> >>>> API
> >>>> without being updated. But what about semantic changes that do not
> >>>> touch
> >>>> the API? For example, what if we decide to send less task health
> >>>> updates to
> >>>> schedulers based on some health policy? It influences the flow of task
> >>>> status updates, should such change be considered compatible? Taking it
> >>>> to
> >>>> an extreme, we may not even be able to fix some bugs because someone
> >>>> may
> >>>> already rely on this behaviour!
> >>>>
> >>>
> >>> API changes should warrant a major version bump. Also the API is not
> >>> just what the machine reads but all the documentation associated with it,
> >>> right? It depends on what the documentation says; what the user _should_
> >>> expect.
> >>>
> >>> That said, I feel that these things are hard to talk about in the
> >>> abstract. Even with a guideline, we still need to make case-by-case
> >>> decisions. (e.g., has the documentation precisely defined this precise
> >>> behavior? If not, is it reasonable for the users to expect some behavior
> >>> because it's common sense? How bad is it if some behavior just changes a
> >>> tiny bit?) Therefore we need to make sure the process for API changes is
> >>> more rigorously defined.
> >>>
> >>> Whether something is a bug depends on whether the API does what it says
> >>> it'll do. The line may sometimes be blurry but in general I don't feel it's
> >>> a problem. If someone is relying on the behavior that is a bug, we should
> >>> still help them fix it but the bug shouldn't count as "our guarantee".
> >>>
> >>>
> >>>>
> >>>> Another tightly related thing we should explicitly call out is
> >>>> upgradability and rollback capabilities inside a major release.
> >>>> Committing
> >>>> to this may significantly limit what we can change within a major
> >>>> release;
> >>>> on the other side it will give users more time and a better experience
> >>>> about using and maintaining Mesos clusters.
> >>>>
> >>>
> >>> According to the versioning doc upgradability depends on whether you
> >>> depend on deprecated/removed features.
> >>>
> >>> That paragraph should be explained more precisely:
> >>> - "deprecated" means your system won't break but warnings are shown
> >>> (Maybe we should use some standard deprecation warning keywords so the
> >>> operator can monitor the log for such warnings!)
> >>> - "removed" means it may break.
> >>>
> >>> If you deprecate a flag/env that interfaces with operator tooling in the
> >>> next minor release, the operator basically has 6 months from the next minor
> >>> release to change her tooling. I feel this is pretty acceptable.
> >>> If you deprecate a flag/env variable that interfaces with the framework
> >>> (executor) in the next minor release, I feel it may not be enough and it
> >>> probably warrants a major version bump. So perhaps the API shouldn't be
> >>> just the protos.
> >>>
> >>>
> >>>> 2. Versioned vs. unversioned protobufs.
> >>>> Currently we have v1 and unnamed protobufs, which simultaneously mean
> >>>> v0,
> >>>> v2, and internal. I am sometimes confused about what is the right way
> >>>> to
> >>>> update or introduce a field or message there, do people feel the same?
> >>>> How
> >>>> about splitting the unnamed version into explicit v0, v2, and
> >>>> internal?
> >>>>
> >>>
> >>> As haosdent mentioned, we have captured this in MESOS-6268. The benefit
> >>> is clear but I guess the people will be more motivated when we find some v2
> >>> feature can't be made compatible with the v0 API. (Anand's point
> >>> in MESOS-6016). On the other hand, if we cut v0 API access before that
> >>> happens (is v0 API obsolete and should be removed 6 months after 1.0?) then
> >>> we don't need to worry about v0 and can use unversioned protos as
> >>> "internal"?
> >>>
> >>>
> >>>> Food for thought. It would be great if we can only maintain "diffs" to
> >>>> the
> >>>> internal protobufs in the code, instead of duplicating them
> >>>> altogether.
> >>>>
> >>>> 3. API and feature labelling.
> >>>> I suggest to introduce explicit labels for API and features, to ensure
> >>>> users have the right assumptions about the their lifetime while
> >>>> engineers
> >>>> have the ability to change a wip feature in an non-compatible way. I
> >>>> propose the following:
> >>>> API: stable, non-stable, pure (not used by Mesos components)
> >>>> Feature: experimental, normal.
> >>>>
> >>>
> >>>  +1 on formalizing the terminologies.
> >>>
> >>> Historically the distinction is not clear for the following:
> >>>
> >>> 1. The API has no compatibility guarantee at all.
> >>> 2. The feature provided by this API is experimental
> >>>
> >>
> >> To add to this point: because 2) logically doesn't apply to the "pure
> >> (not used by Mesos components)" fields in the API, it could be more
> >> confusing and thus require more precise definition.
> >>
> >>
> >>>
> >>> IMO It's OK that we say that we don't distinguish the two (the API has
> >>> no compatibility guarantee until the feature is fully released) but we have
> >>> to make it clear.
> >>> If we don't make such distinction, ALL API additions should be marked as
> >>> unstable first and be changed stable later (as a formal process).
> >>>
> >>>
> >>>>
> >>>> Looking forward to your thoughts and suggestions.
> >>>> AlexR
> >>>>
> >>>> [1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
> >>>> [2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
> >>>> [3]
> >>>> https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md
> >>>>
> >>>
> >>>
> >>
> >
>
>
> --
> Best Regards,
> Haosdent Huang
>

Re: On Mesos versioning and deprecation policy

Posted by haosdent <ha...@gmail.com>.
Thanks for the great input, @yan! I agree with almost all of it.

> Also the API is not just what the machine reads but all the documentation
> associated with it, right? It depends on what the documentation says; what
> the user _should_ expect.

I think different users may have different expectations, and the person who
developed an API may understand it differently from some users as well.
Our documentation should cover most cases.

But when we didn't, or forgot to, state something explicitly in the
documentation, should we give up on updating the API? For example, user Alice
may say a behavior is a bug while user Bob says it is a feature. I think we
still need to evaluate such changes case by case to ensure most users are not
affected by breaking API changes.

On Sat, Oct 15, 2016 at 6:55 AM, Vinod Kone <vi...@apache.org> wrote:

> We will chat about this in the upcoming community sync (thursday 3 PM).
> So, please make sure to attend if you are interested.
>
> On Fri, Oct 14, 2016 at 3:44 PM, Yan Xu <xu...@apple.com> wrote:
>
>>
>> On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu <xu...@apple.com> wrote:
>>
>>> Thanks Alex for starting this!
>>>
>>> In addition to comments below, I think it'll be helpful to keep the
>>> existing versioning doc concise and user-friendly while having a dedicated
>>> doc for the "implementation details" where precise requirements and
>>> procedures go. Maybe some duplication/cross-referencing is needed but Mesos
>>> developers will find the latter much more helpful while the users/framework
>>> developer will find the former easy to read.
>>>
>>> e.g., a similar split:
>>> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
>>> https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
>>> (which has a lot of details on how the kubernetes
>>> community is thinking about similar issues, which we can learn from)
>>>
>>> Jiang Yan Xu 
>>>
>>> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <al...@mesosphere.com>
>>> wrote:
>>>
>>>> Folks,
>>>>
>>>> There have been a bunch of online [1, 2] and offline discussions about
>>>> our
>>>> deprecation and versioning policy. I found that people—including
>>>> myself—read the versioning doc [3] differently; moreover some aspects
>>>> are
>>>> not captured there. I would like to start a discussion around this
>>>> topic by
>>>> sharing my confusions and suggestions. This will hopefully help us stay
>>>> on
>>>> the same page and have similar expectations. The second goal is to
>>>> eliminate ambiguities from the versioning doc (thanks Vinod for
>>>> volunteering to update it).
>>>>
>>>
>>> +1 Let me know if there are things I can help with.
>>>
>>>
>>>>
>>>> 1. API vs. semantic changes.
>>>> Current versioning guide treat features (e.g. flags, metrics, endpoints)
>>>> and API differently: incompatible changes for the former are allowed
>>>> after
>>>> 6 month deprecation cycle, while for the latter they require bumping a
>>>> major version. I suggest we consolidate these policies.
>>>>
>>>
>>> I feel that the distinction is not API vs. semantic changes, Backwards
>>> compatible API guarantee should imply backwards compatible semantics (of
>>> the API).
>>> i.e., if a change in API doesn't cause the message to be dropped to the
>>> floor but leads to behavior change that causes problems in the system, it
>>> still breaks compatibility.
>>>
>>> IMO the distinction is more between:
>>> - Compatibility between components that are impossible/very unpleasant
>>> to upgrade in lockstep - high priority for compatibility guarantee.
>>> - Compatibility between components that are generally bundled (modules)
>>> or things that usually aren't built into automated tooling (e.g., the
>>> /state endpoint) - more relaxed for now but we should explicitly exclude
>>> them from the guarantee.
>>>
>>>
>>>>
>>>> We should also define and clearly explain what changes require bumping
>>>> the
>>>> major version. I have no strong opinion here and would love to hear what
>>>> people think. The original motivation for maintaining backwards
>>>> compatibility is to make sure vN schedulers can correctly work with vN
>>>> API
>>>> without being updated. But what about semantic changes that do not touch
>>>> the API? For example, what if we decide to send less task health
>>>> updates to
>>>> schedulers based on some health policy? It influences the flow of task
>>>> status updates, should such change be considered compatible? Taking it
>>>> to
>>>> an extreme, we may not even be able to fix some bugs because someone may
>>>> already rely on this behaviour!
>>>>
>>>
>>> API changes should warrant a major version bump. Also the API is not
>>> just what the machine reads but all the documentation associated with it,
>>> right? It depends on what the documentation says; what the user _should_
>>> expect.
>>>
>>> That said, I feel that these things are hard to talk about in the
>>> abstract. Even with a guideline, we still need to make case-by-case
>>> decisions. (e.g., has the documentation precisely defined this precise
>>> behavior? If not, is it reasonable for the users to expect some behavior
>>> because it's common sense? How bad is it if some behavior just changes a
>>> tiny bit?) Therefore we need to make sure the process for API changes is
>>> more rigorously defined.
>>>
>>> Whether something is a bug depends on whether the API does what it says
>>> it'll do. The line may sometimes be blurry but in general I don't feel it's
>>> a problem. If someone is relying on the behavior that is a bug, we should
>>> still help them fix it but the bug shouldn't count as "our guarantee".
>>>
>>>
>>>>
>>>> Another tightly related thing we should explicitly call out is
>>>> upgradability and rollback capabilities inside a major release.
>>>> Committing
>>>> to this may significantly limit what we can change within a major
>>>> release;
>>>> on the other side it will give users more time and a better experience
>>>> about using and maintaining Mesos clusters.
>>>>
>>>
>>> According to the versioning doc upgradability depends on whether you
>>> depend on deprecated/removed features.
>>>
>>> That paragraph should be explained more precisely:
>>> - "deprecated" means your system won't break but warnings are shown
>>> (Maybe we should use some standard deprecation warning keywords so the
>>> operator can monitor the log for such warnings!)
>>> - "removed" means it may break.
>>>
>>> If you deprecate a flag/env that interfaces with operator tooling in the
>>> next minor release, the operator basically has 6 months from the next minor
>>> release to change her tooling. I feel this is pretty acceptable.
>>> If you deprecate a flag/env variable that interfaces with the framework
>>> (executor) in the next minor release, I feel it may not be enough and it
>>> probably warrants a major version bump. So perhaps the API shouldn't be
>>> just the protos.
>>>
>>>
>>>> 2. Versioned vs. unversioned protobufs.
>>>> Currently we have v1 and unnamed protobufs, which simultaneously mean
>>>> v0,
>>>> v2, and internal. I am sometimes confused about what is the right way to
>>>> update or introduce a field or message there, do people feel the same?
>>>> How
>>>> about splitting the unnamed version into explicit v0, v2, and internal?
>>>>
>>>
>>> As haosdent mentioned, we have captured this in MESOS-6268. The benefit
>>> is clear but I guess the people will be more motivated when we find some v2
>>> feature can't be made compatible with the v0 API. (Anand's point
>>> in MESOS-6016). On the other hand, if we cut v0 API access before that
>>> happens (is v0 API obsolete and should be removed 6 months after 1.0?) then
>>> we don't need to worry about v0 and can use unversioned protos as
>>> "internal"?
>>>
>>>
>>>> Food for thought. It would be great if we can only maintain "diffs" to
>>>> the
>>>> internal protobufs in the code, instead of duplicating them altogether.
>>>>
>>>> 3. API and feature labelling.
>>>> I suggest to introduce explicit labels for API and features, to ensure
>>>> users have the right assumptions about the their lifetime while
>>>> engineers
>>>> have the ability to change a wip feature in an non-compatible way. I
>>>> propose the following:
>>>> API: stable, non-stable, pure (not used by Mesos components)
>>>> Feature: experimental, normal.
>>>>
>>>
>>>  +1 on formalizing the terminologies.
>>>
>>> Historically the distinction is not clear for the following:
>>>
>>> 1. The API has no compatibility guarantee at all.
>>> 2. The feature provided by this API is experimental
>>>
>>
>> To add to this point: because 2) logically doesn't apply to the "pure
>> (not used by Mesos components)" fields in the API, it could be more
>> confusing and thus require more precise definition.
>>
>>
>>>
>>> IMO It's OK that we say that we don't distinguish the two (the API has
>>> no compatibility guarantee until the feature is fully released) but we have
>>> to make it clear.
>>> If we don't make such distinction, ALL API additions should be marked as
>>> unstable first and be changed stable later (as a formal process).
>>>
>>>
>>>>
>>>> Looking forward to your thoughts and suggestions.
>>>> AlexR
>>>>
>>>> [1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
>>>> [2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
>>>> [3]
>>>> https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md
>>>>
>>>
>>>
>>
>


-- 
Best Regards,
Haosdent Huang

Re: On Mesos versioning and deprecation policy

Posted by haosdent <ha...@gmail.com>.
Thanks for the great input, @yan! I agree with almost all of it.

> Also the API is not just what the machine reads but all the documentation
associated with it, right? It depends on what the documentation says; what
the user _should_ expect.

I think different users may have different expectations, and the person who
developed an API may understand it differently from some users as well.
Our documentation should cover most cases.

But if we didn't, or forgot to, write something explicitly in the
documentation, should we give up on changing the API? Say user Alice calls
it a bug while user Bob calls it a feature. I think we still need to
discuss such changes case by case to ensure most users are not affected by
breaking API changes.

On Sat, Oct 15, 2016 at 6:55 AM, Vinod Kone <vi...@apache.org> wrote:

> We will chat about this in the upcoming community sync (thursday 3 PM).
> So, please make sure to attend if you are interested.
>
> On Fri, Oct 14, 2016 at 3:44 PM, Yan Xu <xu...@apple.com> wrote:
>
>>
>> On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu <xu...@apple.com> wrote:
>>
>>> Thanks Alex for starting this!
>>>
>>> In addition to comments below, I think it'll be helpful to keep the
>>> existing versioning doc concise and user-friendly while having a dedicated
>>> doc for the "implementation details" where precise requirements and
>>> procedures go. Maybe some duplication/cross-referencing is needed but Mesos
>>> developers will find the latter much more helpful while the users/framework
>>> developer will find the former easy to read.
>>>
>>> e.g., a similar split:
>>> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
>>> https://github.com/kubernetes/kubernetes/blob/master/docs/de
>>> vel/api_changes.md (which has a lot of details on how the kubernetes
>>> community is thinking about similar issues, which we can learn from)
>>>
>>> Jiang Yan Xu 
>>>
>>> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <al...@mesosphere.com>
>>> wrote:
>>>
>>>> Folks,
>>>>
>>>> There have been a bunch of online [1, 2] and offline discussions about
>>>> our
>>>> deprecation and versioning policy. I found that people—including
>>>> myself—read the versioning doc [3] differently; moreover some aspects
>>>> are
>>>> not captured there. I would like to start a discussion around this
>>>> topic by
>>>> sharing my confusions and suggestions. This will hopefully help us stay
>>>> on
>>>> the same page and have similar expectations. The second goal is to
>>>> eliminate ambiguities from the versioning doc (thanks Vinod for
>>>> volunteering to update it).
>>>>
>>>
>>> +1 Let me know if there are things I can help with.
>>>
>>>
>>>>
>>>> 1. API vs. semantic changes.
>>>> Current versioning guide treat features (e.g. flags, metrics, endpoints)
>>>> and API differently: incompatible changes for the former are allowed
>>>> after
>>>> 6 month deprecation cycle, while for the latter they require bumping a
>>>> major version. I suggest we consolidate these policies.
>>>>
>>>
>>> I feel that the distinction is not API vs. semantic changes, Backwards
>>> compatible API guarantee should imply backwards compatible semantics (of
>>> the API).
>>> i.e., if a change in API doesn't cause the message to be dropped to the
>>> floor but leads to behavior change that causes problems in the system, it
>>> still breaks compatibility.
>>>
>>> IMO the distinction is more between:
>>> - Compatibility between components that are impossible/very unpleasant
>>> to upgrade in lockstep - high priority for compatibility guarantee.
>>> - Compatibility between components that are generally bundled (modules)
>>> or things that usually aren't built into automated tooling (e.g., the
>>> /state endpoint) - more relaxed for now but we should explicitly exclude
>>> them from the guarantee.
>>>
>>>
>>>>
>>>> We should also define and clearly explain what changes require bumping
>>>> the
>>>> major version. I have no strong opinion here and would love to hear what
>>>> people think. The original motivation for maintaining backwards
>>>> compatibility is to make sure vN schedulers can correctly work with vN
>>>> API
>>>> without being updated. But what about semantic changes that do not touch
>>>> the API? For example, what if we decide to send less task health
>>>> updates to
>>>> schedulers based on some health policy? It influences the flow of task
>>>> status updates, should such change be considered compatible? Taking it
>>>> to
>>>> an extreme, we may not even be able to fix some bugs because someone may
>>>> already rely on this behaviour!
>>>>
>>>
>>> API changes should warrant a major version bump. Also the API is not
>>> just what the machine reads but all the documentation associated with it,
>>> right? It depends on what the documentation says; what the user _should_
>>> expect.
>>>
>>> That said, I feel that these things are hard to be talked about in the
>>> abstract. Even with a guideline, we still need to make case-by-case
>>> decisions. (e.g., has the documentation precisely defined this precise
>>> behavior? If not, is it reasonable for the users to expect some behavior
>>> because it's common sense? How bad is it if some behavior just changes a
>>> tiny bit?) Therefore we need to make sure the process for API changes is
>>> more rigorously defined.
>>>
>>> Whether something is a bug depends on whether the API does what it says
>>> it'll do. The line may sometimes be blurry but in general I don't feel it's
>>> a problem. If someone is relying on the behavior that is a bug, we should
>>> still help them fix it but the bug shouldn't count as "our guarantee".
>>>
>>>
>>>>
>>>> Another tightly related thing we should explicitly call out is
>>>> upgradability and rollback capabilities inside a major release.
>>>> Committing
>>>> to this may significantly limit what we can change within a major
>>>> release;
>>>> on the other side it will give users more time and a better experience
>>>> about using and maintaining Mesos clusters.
>>>>
>>>
>>> According to the versioning doc upgradability depends on whether you
>>> depend on deprecated/removed features.
>>>
>>> That paragraph should be explained more precisely:
>>> - "deprecated" means your system won't break but warnings are shown
>>> (maybe we should use some standard deprecation warning keywords so the
>>> operator can monitor the log for such warnings).
>>> - "removed" means it may break.
>>>
>>> If you deprecate a flag/env variable that interfaces with operator tooling
>>> in the next minor release, the operator basically has 6 months from that
>>> release to change her tooling. I feel this is pretty acceptable.
>>> If you deprecate a flag/env variable that interfaces with the framework
>>> (executor) in the next minor release, I feel 6 months may not be enough and
>>> it probably warrants a major version bump. So perhaps the API shouldn't be
>>> just the protos.
>>>
>>>
>>>> 2. Versioned vs. unversioned protobufs.
>>>> Currently we have v1 and unnamed protobufs, which simultaneously mean
>>>> v0,
>>>> v2, and internal. I am sometimes confused about what is the right way to
>>>> update or introduce a field or message there, do people feel the same?
>>>> How
>>>> about splitting the unnamed version into explicit v0, v2, and internal?
>>>>
>>>
>>> As haosdent mentioned, we have captured this in MESOS-6268. The benefit
>>> is clear but I guess the people will be more motivated when we find some v2
>>> feature can't be made compatible with the v0 API. (Anand's point
>>> in MESOS-6016). On the other hand, if we cut v0 API access before that
>>> happens (is v0 API obsolete and should be removed 6 months after 1.0?) then
>>> we don't need to worry about v0 and can use unversioned protos as
>>> "internal"?
>>>
>>>
>>>> Food for thought. It would be great if we can only maintain "diffs" to
>>>> the
>>>> internal protobufs in the code, instead of duplicating them altogether.
>>>>
>>>> 3. API and feature labelling.
>>>> I suggest introducing explicit labels for API and features, to ensure
>>>> users have the right assumptions about their lifetime while engineers
>>>> have the ability to change a WIP feature in a non-compatible way. I
>>>> propose the following:
>>>> API: stable, non-stable, pure (not used by Mesos components)
>>>> Feature: experimental, normal.
>>>>
>>>
>>>  +1 on formalizing the terminologies.
>>>
>>> Historically the distinction is not clear for the following:
>>>
>>> 1. The API has no compatibility guarantee at all.
>>> 2. The feature provided by this API is experimental
>>>
>>
>> To add to this point: because 2) logically doesn't apply to the "pure
>> (not used by Mesos components)" fields in the API, it could be more
>> confusing and thus require more precise definition.
>>
>>
>>>
>>> IMO It's OK that we say that we don't distinguish the two (the API has
>>> no compatibility guarantee until the feature is fully released) but we have
>>> to make it clear.
>>> If we don't make such distinction, ALL API additions should be marked as
>>> unstable first and be changed stable later (as a formal process).
>>>
>>>
>>>>
>>>> Looking forward to your thoughts and suggestions.
>>>> AlexR
>>>>
>>>> [1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
>>>> [2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
>>>> [3]
>>>> https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e9681
>>>> 36caa7a1f292ba20e/docs/versioning.md
>>>>
>>>
>>>
>>
>


-- 
Best Regards,
Haosdent Huang

Re: On Mesos versioning and deprecation policy

Posted by Vinod Kone <vi...@apache.org>.
We will chat about this in the upcoming community sync (thursday 3 PM). So,
please make sure to attend if you are interested.

On Fri, Oct 14, 2016 at 3:44 PM, Yan Xu <xu...@apple.com> wrote:

>
> On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu <xu...@apple.com> wrote:
>
>> Thanks Alex for starting this!
>>
>> In addition to comments below, I think it'll be helpful to keep the
>> existing versioning doc concise and user-friendly while having a dedicated
>> doc for the "implementation details" where precise requirements and
>> procedures go. Maybe some duplication/cross-referencing is needed but Mesos
>> developers will find the latter much more helpful while the users/framework
>> developer will find the former easy to read.
>>
>> e.g., a similar split:
>> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
>> https://github.com/kubernetes/kubernetes/blob/master/docs/de
>> vel/api_changes.md (which has a lot of details on how the kubernetes
>> community is thinking about similar issues, which we can learn from)
>>
>> Jiang Yan Xu 
>>
>> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <al...@mesosphere.com>
>> wrote:
>>
>>> Folks,
>>>
>>> There have been a bunch of online [1, 2] and offline discussions about
>>> our
>>> deprecation and versioning policy. I found that people—including
>>> myself—read the versioning doc [3] differently; moreover some aspects are
>>> not captured there. I would like to start a discussion around this topic
>>> by
>>> sharing my confusions and suggestions. This will hopefully help us stay
>>> on
>>> the same page and have similar expectations. The second goal is to
>>> eliminate ambiguities from the versioning doc (thanks Vinod for
>>> volunteering to update it).
>>>
>>
>> +1 Let me know if there are things I can help with.
>>
>>
>>>
>>> 1. API vs. semantic changes.
>>> Current versioning guide treat features (e.g. flags, metrics, endpoints)
>>> and API differently: incompatible changes for the former are allowed
>>> after
>>> 6 month deprecation cycle, while for the latter they require bumping a
>>> major version. I suggest we consolidate these policies.
>>>
>>
>> I feel that the distinction is not API vs. semantic changes, Backwards
>> compatible API guarantee should imply backwards compatible semantics (of
>> the API).
>> i.e., if a change in API doesn't cause the message to be dropped to the
>> floor but leads to behavior change that causes problems in the system, it
>> still breaks compatibility.
>>
>> IMO the distinction is more between:
>> - Compatibility between components that are impossible/very unpleasant to
>> upgrade in lockstep - high priority for compatibility guarantee.
>> - Compatibility between components that are generally bundled (modules)
>> or things that usually aren't built into automated tooling (e.g., the
>> /state endpoint) - more relaxed for now but we should explicitly exclude
>> them from the guarantee.
>>
>>
>>>
>>> We should also define and clearly explain what changes require bumping
>>> the
>>> major version. I have no strong opinion here and would love to hear what
>>> people think. The original motivation for maintaining backwards
>>> compatibility is to make sure vN schedulers can correctly work with vN
>>> API
>>> without being updated. But what about semantic changes that do not touch
>>> the API? For example, what if we decide to send less task health updates
>>> to
>>> schedulers based on some health policy? It influences the flow of task
>>> status updates, should such change be considered compatible? Taking it to
>>> an extreme, we may not even be able to fix some bugs because someone may
>>> already rely on this behaviour!
>>>
>>
>> API changes should warrant a major version bump. Also the API is not just
>> what the machine reads but all the documentation associated with it, right?
>> It depends on what the documentation says; what the user _should_ expect.
>>
>> That said, I feel that these things are hard to be talked about in the
>> abstract. Even with a guideline, we still need to make case-by-case
>> decisions. (e.g., has the documentation precisely defined this precise
>> behavior? If not, is it reasonable for the users to expect some behavior
>> because it's common sense? How bad is it if some behavior just changes a
>> tiny bit?) Therefore we need to make sure the process for API changes is
>> more rigorously defined.
>>
>> Whether something is a bug depends on whether the API does what it says
>> it'll do. The line may sometimes be blurry but in general I don't feel it's
>> a problem. If someone is relying on the behavior that is a bug, we should
>> still help them fix it but the bug shouldn't count as "our guarantee".
>>
>>
>>>
>>> Another tightly related thing we should explicitly call out is
>>> upgradability and rollback capabilities inside a major release.
>>> Committing
>>> to this may significantly limit what we can change within a major
>>> release;
>>> on the other side it will give users more time and a better experience
>>> about using and maintaining Mesos clusters.
>>>
>>
>> According to the versioning doc upgradability depends on whether you
>> depend on deprecated/removed features.
>>
>> That paragraph should be explained more precisely:
>> - "deprecated" means your system won't break but warnings are shown
>> (maybe we should use some standard deprecation warning keywords so the
>> operator can monitor the log for such warnings).
>> - "removed" means it may break.
>>
>> If you deprecate a flag/env variable that interfaces with operator tooling
>> in the next minor release, the operator basically has 6 months from that
>> release to change her tooling. I feel this is pretty acceptable.
>> If you deprecate a flag/env variable that interfaces with the framework
>> (executor) in the next minor release, I feel 6 months may not be enough and
>> it probably warrants a major version bump. So perhaps the API shouldn't be
>> just the protos.
>>
>>
>>> 2. Versioned vs. unversioned protobufs.
>>> Currently we have v1 and unnamed protobufs, which simultaneously mean v0,
>>> v2, and internal. I am sometimes confused about what is the right way to
>>> update or introduce a field or message there, do people feel the same?
>>> How
>>> about splitting the unnamed version into explicit v0, v2, and internal?
>>>
>>
>> As haosdent mentioned, we have captured this in MESOS-6268. The benefit
>> is clear but I guess the people will be more motivated when we find some v2
>> feature can't be made compatible with the v0 API. (Anand's point
>> in MESOS-6016). On the other hand, if we cut v0 API access before that
>> happens (is v0 API obsolete and should be removed 6 months after 1.0?) then
>> we don't need to worry about v0 and can use unversioned protos as
>> "internal"?
>>
>>
>>> Food for thought. It would be great if we can only maintain "diffs" to
>>> the
>>> internal protobufs in the code, instead of duplicating them altogether.
>>>
>>> 3. API and feature labelling.
>>> I suggest introducing explicit labels for API and features, to ensure
>>> users have the right assumptions about their lifetime while engineers
>>> have the ability to change a WIP feature in a non-compatible way. I
>>> propose the following:
>>> API: stable, non-stable, pure (not used by Mesos components)
>>> Feature: experimental, normal.
>>>
>>
>>  +1 on formalizing the terminologies.
>>
>> Historically the distinction is not clear for the following:
>>
>> 1. The API has no compatibility guarantee at all.
>> 2. The feature provided by this API is experimental
>>
>
> To add to this point: because 2) logically doesn't apply to the "pure (not
> used by Mesos components)" fields in the API, it could be more confusing
> and thus require more precise definition.
>
>
>>
>> IMO It's OK that we say that we don't distinguish the two (the API has no
>> compatibility guarantee until the feature is fully released) but we have to
>> make it clear.
>> If we don't make such distinction, ALL API additions should be marked as
>> unstable first and be changed stable later (as a formal process).
>>
>>
>>>
>>> Looking forward to your thoughts and suggestions.
>>> AlexR
>>>
>>> [1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
>>> [2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
>>> [3]
>>> https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e9681
>>> 36caa7a1f292ba20e/docs/versioning.md
>>>
>>
>>
>

Re: On Mesos versioning and deprecation policy

Posted by Yan Xu <xu...@apple.com>.
On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu <xu...@apple.com> wrote:

> Thanks Alex for starting this!
>
> In addition to comments below, I think it'll be helpful to keep the
> existing versioning doc concise and user-friendly while having a dedicated
> doc for the "implementation details" where precise requirements and
> procedures go. Maybe some duplication/cross-referencing is needed but Mesos
> developers will find the latter much more helpful while the users/framework
> developer will find the former easy to read.
>
> e.g., a similar split:
> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
> https://github.com/kubernetes/kubernetes/blob/master/docs/de
> vel/api_changes.md (which has a lot of details on how the kubernetes
> community is thinking about similar issues, which we can learn from)
>
> Jiang Yan Xu 
>
> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <al...@mesosphere.com>
> wrote:
>
>> Folks,
>>
>> There have been a bunch of online [1, 2] and offline discussions about our
>> deprecation and versioning policy. I found that people—including
>> myself—read the versioning doc [3] differently; moreover some aspects are
>> not captured there. I would like to start a discussion around this topic
>> by
>> sharing my confusions and suggestions. This will hopefully help us stay on
>> the same page and have similar expectations. The second goal is to
>> eliminate ambiguities from the versioning doc (thanks Vinod for
>> volunteering to update it).
>>
>
> +1 Let me know if there are things I can help with.
>
>
>>
>> 1. API vs. semantic changes.
>> The current versioning guide treats features (e.g. flags, metrics, endpoints)
>> and API differently: incompatible changes for the former are allowed after
>> a 6-month deprecation cycle, while for the latter they require bumping a
>> major version. I suggest we consolidate these policies.
>>
>
> I feel that the distinction is not API vs. semantic changes. A backwards
> compatible API guarantee should imply backwards compatible semantics (of
> the API).
> i.e., if a change in API doesn't cause the message to be dropped to the
> floor but leads to behavior change that causes problems in the system, it
> still breaks compatibility.
>
> IMO the distinction is more between:
> - Compatibility between components that are impossible/very unpleasant to
> upgrade in lockstep - high priority for compatibility guarantee.
> - Compatibility between components that are generally bundled (modules) or
> things that usually aren't built into automated tooling (e.g., the /state
> endpoint) - more relaxed for now but we should explicitly exclude them from
> the guarantee.
>
>
>>
>> We should also define and clearly explain what changes require bumping the
>> major version. I have no strong opinion here and would love to hear what
>> people think. The original motivation for maintaining backwards
>> compatibility is to make sure vN schedulers can correctly work with vN API
>> without being updated. But what about semantic changes that do not touch
>> the API? For example, what if we decide to send fewer task health updates
>> to schedulers based on some health policy? It influences the flow of task
>> status updates; should such a change be considered compatible? Taking it to
>> an extreme, we may not even be able to fix some bugs because someone may
>> already rely on this behaviour!
>>
>
> API changes should warrant a major version bump. Also the API is not just
> what the machine reads but all the documentation associated with it, right?
> It depends on what the documentation says; what the user _should_ expect.
>
> That said, I feel that these things are hard to be talked about in the
> abstract. Even with a guideline, we still need to make case-by-case
> decisions. (e.g., has the documentation precisely defined this precise
> behavior? If not, is it reasonable for the users to expect some behavior
> because it's common sense? How bad is it if some behavior just changes a
> tiny bit?) Therefore we need to make sure the process for API changes is
> more rigorously defined.
>
> Whether something is a bug depends on whether the API does what it says
> it'll do. The line may sometimes be blurry but in general I don't feel it's
> a problem. If someone is relying on the behavior that is a bug, we should
> still help them fix it but the bug shouldn't count as "our guarantee".
>
>
>>
>> Another tightly related thing we should explicitly call out is
>> upgradability and rollback capabilities inside a major release. Committing
>> to this may significantly limit what we can change within a major release;
>> on the other side it will give users more time and a better experience
>> about using and maintaining Mesos clusters.
>>
>
> According to the versioning doc upgradability depends on whether you
> depend on deprecated/removed features.
>
> That paragraph should be explained more precisely:
> - "deprecated" means your system won't break but warnings are shown (maybe
> we should use some standard deprecation warning keywords so the operator
> can monitor the log for such warnings).
> - "removed" means it may break.
>
> If you deprecate a flag/env variable that interfaces with operator tooling
> in the next minor release, the operator basically has 6 months from that
> release to change her tooling. I feel this is pretty acceptable.
> If you deprecate a flag/env variable that interfaces with the framework
> (executor) in the next minor release, I feel 6 months may not be enough and
> it probably warrants a major version bump. So perhaps the API shouldn't be
> just the protos.
>
>
>> 2. Versioned vs. unversioned protobufs.
>> Currently we have v1 and unnamed protobufs, which simultaneously mean v0,
>> v2, and internal. I am sometimes confused about what is the right way to
>> update or introduce a field or message there, do people feel the same? How
>> about splitting the unnamed version into explicit v0, v2, and internal?
>>
>
> As haosdent mentioned, we have captured this in MESOS-6268. The benefit is
> clear but I guess people will be more motivated when we find some v2
> feature can't be made compatible with the v0 API. (Anand's point
> in MESOS-6016). On the other hand, if we cut v0 API access before that
> happens (is v0 API obsolete and should be removed 6 months after 1.0?) then
> we don't need to worry about v0 and can use unversioned protos as
> "internal"?
>
>
>> Food for thought. It would be great if we can only maintain "diffs" to the
>> internal protobufs in the code, instead of duplicating them altogether.
>>
>> 3. API and feature labelling.
>> I suggest introducing explicit labels for API and features, to ensure
>> users have the right assumptions about their lifetime while engineers
>> have the ability to change a WIP feature in a non-compatible way. I
>> propose the following:
>> API: stable, non-stable, pure (not used by Mesos components)
>> Feature: experimental, normal.
>>
>
> +1 on formalizing the terminology.
>
> Historically the distinction is not clear between the following:
>
> 1. The API has no compatibility guarantee at all.
> 2. The feature provided by this API is experimental.
>

To add to this point: because 2) logically doesn't apply to the "pure (not
used by Mesos components)" fields in the API, it could be more confusing
and thus requires a more precise definition.


>
> IMO it's OK to say that we don't distinguish the two (the API has no
> compatibility guarantee until the feature is fully released) but we have to
> make it clear.
> If we don't make such a distinction, ALL API additions should be marked as
> unstable first and changed to stable later (as a formal process).
>
>
>>
>> Looking forward to your thoughts and suggestions.
>> AlexR
>>
>> [1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
>> [2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
>> [3]
>> https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md
>>
>
>

Re: On Mesos versioning and deprecation policy

Posted by Yan Xu <xu...@apple.com>.
Thanks Alex for starting this!

In addition to comments below, I think it'll be helpful to keep the
existing versioning doc concise and user-friendly while having a dedicated
doc for the "implementation details" where precise requirements and
procedures go. Maybe some duplication/cross-referencing is needed but Mesos
developers will find the latter much more helpful while the users/framework
developers will find the former easy to read.

e.g., a similar split:
https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
(which has a lot of details on how the kubernetes community is thinking
about similar issues, which we can learn from)

Jiang Yan Xu 

On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <al...@mesosphere.com> wrote:

> Folks,
>
> There have been a bunch of online [1, 2] and offline discussions about our
> deprecation and versioning policy. I found that people—including
> myself—read the versioning doc [3] differently; moreover some aspects are
> not captured there. I would like to start a discussion around this topic by
> sharing my confusions and suggestions. This will hopefully help us stay on
> the same page and have similar expectations. The second goal is to
> eliminate ambiguities from the versioning doc (thanks Vinod for
> volunteering to update it).
>

+1 Let me know if there are things I can help with.


>
> 1. API vs. semantic changes.
> The current versioning guide treats features (e.g. flags, metrics, endpoints)
> and API differently: incompatible changes for the former are allowed after
> a 6-month deprecation cycle, while for the latter they require bumping a
> major version. I suggest we consolidate these policies.
>

I feel that the distinction is not API vs. semantic changes. A backwards
compatible API guarantee should imply backwards compatible semantics (of
the API).
I.e., if a change in the API doesn't cause the message to be dropped on the
floor but leads to a behavior change that causes problems in the system, it
still breaks compatibility.

IMO the distinction is more between:
- Compatibility between components that are impossible/very unpleasant to
upgrade in lockstep - high priority for compatibility guarantee.
- Compatibility between components that are generally bundled (modules) or
things that usually aren't built into automated tooling (e.g., the /state
endpoint) - more relaxed for now but we should explicitly exclude them from
the guarantee.


>
> We should also define and clearly explain what changes require bumping the
> major version. I have no strong opinion here and would love to hear what
> people think. The original motivation for maintaining backwards
> compatibility is to make sure vN schedulers can correctly work with vN API
> without being updated. But what about semantic changes that do not touch
> the API? For example, what if we decide to send fewer task health updates to
> schedulers based on some health policy? It influences the flow of task
> status updates; should such a change be considered compatible? Taking it to
> an extreme, we may not even be able to fix some bugs because someone may
> already rely on this behaviour!
>

API changes should warrant a major version bump. Also the API is not just
what the machine reads but all the documentation associated with it, right?
It depends on what the documentation says; what the user _should_ expect.

That said, I feel that these things are hard to talk about in the
abstract. Even with a guideline, we still need to make case-by-case
decisions. (e.g., has the documentation precisely defined this
behavior? If not, is it reasonable for the users to expect some behavior
because it's common sense? How bad is it if some behavior just changes a
tiny bit?) Therefore we need to make sure the process for API changes is
more rigorously defined.

Whether something is a bug depends on whether the API does what it says
it'll do. The line may sometimes be blurry but in general I don't feel it's
a problem. If someone is relying on behavior that is a bug, we should
still help them fix it but the bug shouldn't count as "our guarantee".


>
> Another tightly related thing we should explicitly call out is
> upgradability and rollback capabilities inside a major release. Committing
> to this may significantly limit what we can change within a major release;
> on the other hand it will give users more time and a better experience
> using and maintaining Mesos clusters.
>

According to the versioning doc, upgradability depends on whether you depend
on deprecated/removed features.

That paragraph should be explained more precisely:
- "deprecated" means your system won't break but warnings are shown. (Maybe
we should use some standard deprecation warning keywords so the operator
can monitor the log for such warnings!)
- "removed" means it may break.

If you deprecate a flag/env that interfaces with operator tooling in the
next minor release, the operator basically has 6 months from the next minor
release to change her tooling. I feel this is pretty acceptable.
If you deprecate a flag/env variable that interfaces with the framework
(executor) in the next minor release, I feel it may not be enough and it
probably warrants a major version bump. So perhaps the API shouldn't be
just the protos.
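
To make the log-monitoring idea above a bit more concrete, here is a minimal
sketch in Python, assuming every deprecation warning carried one agreed
keyword. The `DEPRECATED` marker, the sample log lines and the `--old_flag`
name are all made up for illustration; nothing in this thread says Mesos
defines such a keyword.

```python
import re

# Hypothetical marker: an assumption for this sketch, not an existing
# Mesos convention.
DEPRECATION_MARKER = "DEPRECATED"


def find_deprecation_warnings(log_lines, marker=DEPRECATION_MARKER):
    """Return (line_number, text) pairs for lines that carry the marker."""
    pattern = re.compile(r"\b" + re.escape(marker) + r"\b")
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if pattern.search(line)
    ]


# Made-up log lines, only to exercise the function.
sample_log = [
    "I1012 16:34:23 example.cpp:100] Starting up\n",
    "W1012 16:34:24 example.cpp:42] DEPRECATED: flag --old_flag will be removed\n",
    "I1012 16:34:25 example.cpp:200] Ready\n",
]

hits = find_deprecation_warnings(sample_log)
for number, text in hits:
    print(number, text)
```

With a single well-known keyword, an operator could run a check like this
against the logs during the deprecation window and fix the tooling before
the feature is removed.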


> 2. Versioned vs. unversioned protobufs.
> Currently we have v1 and unnamed protobufs, which simultaneously mean v0,
> v2, and internal. I am sometimes confused about what is the right way to
> update or introduce a field or message there, do people feel the same? How
> about splitting the unnamed version into explicit v0, v2, and internal?
>

As haosdent mentioned, we have captured this in MESOS-6268. The benefit is
clear but I guess people will be more motivated when we find some v2
feature can't be made compatible with the v0 API. (Anand's point
in MESOS-6016). On the other hand, if we cut v0 API access before that
happens (is v0 API obsolete and should be removed 6 months after 1.0?) then
we don't need to worry about v0 and can use unversioned protos as
"internal"?


> Food for thought. It would be great if we can only maintain "diffs" to the
> internal protobufs in the code, instead of duplicating them altogether.
>
> 3. API and feature labelling.
> I suggest introducing explicit labels for API and features, to ensure
> users have the right assumptions about their lifetime while engineers
> have the ability to change a WIP feature in a non-compatible way. I
> propose the following:
> API: stable, non-stable, pure (not used by Mesos components)
> Feature: experimental, normal.
>

+1 on formalizing the terminology.

Historically the distinction is not clear between the following:

1. The API has no compatibility guarantee at all.
2. The feature provided by this API is experimental.

IMO it's OK to say that we don't distinguish the two (the API has no
compatibility guarantee until the feature is fully released) but we have to
make it clear.
If we don't make such a distinction, ALL API additions should be marked as
unstable first and changed to stable later (as a formal process).
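
As a rough illustration of that "unstable first, stable later" process, here
is a small Python sketch. The registry, the function names and the
`EXAMPLE_CALL` API name are hypothetical; this only models the rule, it is
not how Mesos tracks API stability.

```python
from enum import Enum


class Stability(Enum):
    STABLE = "stable"
    UNSTABLE = "unstable"


# Hypothetical registry mapping API names to their declared stability.
_registry = {}


def add_api(name):
    """Register a new API; every addition starts out unstable."""
    _registry[name] = Stability.UNSTABLE


def promote(name):
    """The formal step that marks an already registered API as stable."""
    if name not in _registry:
        raise KeyError(f"unknown API: {name}")
    _registry[name] = Stability.STABLE


def may_break_compatibility(name):
    """Incompatible changes are acceptable only while the API is unstable."""
    return _registry[name] is Stability.UNSTABLE


add_api("EXAMPLE_CALL")
before = may_break_compatibility("EXAMPLE_CALL")
promote("EXAMPLE_CALL")
after = may_break_compatibility("EXAMPLE_CALL")
print(before, after)
```

The point of the sketch is that promotion is an explicit, separate step, so
there is never a moment where it is unclear which guarantee applies.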


>
> Looking forward to your thoughts and suggestions.
> AlexR
>
> [1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
> [2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
> [3]
> https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md
>

Re: On Mesos versioning and deprecation policy

Posted by haosdent <ha...@gmail.com>.
> How about splitting the unnamed version into explicit v0, v2, and internal?

Currently our internal protobufs and v0 protobufs use the same unnamed
version of the protobufs and live under the same namespace (`package mesos`).
If we are going to split v0 and internal, that requires copying all protobuf
files under `package mesos` into `package mesos.internal` and changing
the whole code base to use the protobufs in `package mesos.internal`.
But it is beneficial to do this, so that we can avoid [the hacks][1]
that convert from the unversioned protobuf (v0) to the unversioned
protobuf (internal).

[1]
https://github.com/apache/mesos/blob/fa976c22ac66ff5c905157a5a36bda1d21525b32/src/master/master.cpp#L4077-L4108
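
To illustrate what the split could buy us, here is a minimal Python sketch
that uses plain dataclasses as stand-ins for protobuf messages. The
`V0Message` and `InternalMessage` types, their fields and the `to_internal`
helper are hypothetical and are not taken from the Mesos code base; the
sketch only shows that once the two types are distinct, the conversion
becomes one explicit function.

```python
from dataclasses import dataclass, fields


@dataclass
class V0Message:
    """Stand-in for a message that would live in `package mesos` (v0)."""
    name: str
    value: str


@dataclass
class InternalMessage:
    """Stand-in for the same message in `package mesos.internal`."""
    name: str
    value: str


def to_internal(message: V0Message) -> InternalMessage:
    """Copy every field by name into the internal type."""
    values = {f.name: getattr(message, f.name) for f in fields(message)}
    return InternalMessage(**values)


v0 = V0Message(name="example", value="42")
internal = to_internal(v0)
print(internal)
```

Because the two types no longer share a namespace, passing a v0 message where
an internal one is expected becomes visible in review or to a type checker
instead of going unnoticed.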

On Thu, Oct 13, 2016 at 12:34 AM, Alex Rukletsov <al...@mesosphere.com>
wrote:

> Folks,
>
> There have been a bunch of online [1, 2] and offline discussions about our
> deprecation and versioning policy. I found that people—including
> myself—read the versioning doc [3] differently; moreover some aspects are
> not captured there. I would like to start a discussion around this topic by
> sharing my confusions and suggestions. This will hopefully help us stay on
> the same page and have similar expectations. The second goal is to
> eliminate ambiguities from the versioning doc (thanks Vinod for
> volunteering to update it).
>
> 1. API vs. semantic changes.
> The current versioning guide treats features (e.g. flags, metrics, endpoints)
> and API differently: incompatible changes for the former are allowed after
> a 6-month deprecation cycle, while for the latter they require bumping a
> major version. I suggest we consolidate these policies.
>
> We should also define and clearly explain what changes require bumping the
> major version. I have no strong opinion here and would love to hear what
> people think. The original motivation for maintaining backwards
> compatibility is to make sure vN schedulers can correctly work with vN API
> without being updated. But what about semantic changes that do not touch
> the API? For example, what if we decide to send fewer task health updates to
> schedulers based on some health policy? It influences the flow of task
> status updates; should such a change be considered compatible? Taking it to
> an extreme, we may not even be able to fix some bugs because someone may
> already rely on this behaviour!
>
> Another tightly related thing we should explicitly call out is
> upgradability and rollback capabilities inside a major release. Committing
> to this may significantly limit what we can change within a major release;
> on the other hand it will give users more time and a better experience
> using and maintaining Mesos clusters.
>
> 2. Versioned vs. unversioned protobufs.
> Currently we have v1 and unnamed protobufs, which simultaneously mean v0,
> v2, and internal. I am sometimes confused about what is the right way to
> update or introduce a field or message there, do people feel the same? How
> about splitting the unnamed version into explicit v0, v2, and internal?
>
> Food for thought. It would be great if we can only maintain "diffs" to the
> internal protobufs in the code, instead of duplicating them altogether.
>
> 3. API and feature labelling.
> I suggest introducing explicit labels for API and features, to ensure
> users have the right assumptions about their lifetime while engineers
> have the ability to change a WIP feature in a non-compatible way. I
> propose the following:
> API: stable, non-stable, pure (not used by Mesos components)
> Feature: experimental, normal.
>
> Looking forward to your thoughts and suggestions.
> AlexR
>
> [1] https://www.mail-archive.com/user@mesos.apache.org/msg08025.html
> [2] https://www.mail-archive.com/dev@mesos.apache.org/msg36621.html
> [3]
> https://github.com/apache/mesos/blob/b2beef37f6f85a8c75e968136caa7a1f292ba20e/docs/versioning.md
>



-- 
Best Regards,
Haosdent Huang