Posted to dev@trafficcontrol.apache.org by Dave Neuman <ne...@apache.org> on 2021/09/07 15:36:55 UTC

Re: [EXTERNAL] Proposal: stable vs unstable TO API versions

Another versioning discussion, yay!
In all seriousness, I am +1 for whatever makes development easier as long
as the risk to operations doesn't outweigh the savings.  In other words, if
we feel like this change is going to provide a lot more simplicity for our
development cycle --  including the ability to release quicker -- and we
don't feel the cost is too high operationally then I am all for it.

I understand the concerns about potentially breaking clients that are using
some version of the unstable or latest version of the API, but I think as a
client using the latest/unstable version of the API that is an
inherent risk.  I would also hope that we would invest in the proper amount
of automated testing necessary to catch these types of issues ahead of
time.

Ultimately I think that we have spent a lot of time discussing API
versioning and yet it still comes up as a pain for our development
process.  I am all for any reasonable change that could make that better.
I am also for trying new things and if it doesn't work, we go back to the
way things were.

I do think that we will have to probably put some thought into when we
determine an API is "stable" and what that process looks like.  It is a
little uncomfortable to just leave that as a gut feel type thing, but I
understand that it is also very hard to put more rules/processes around
something that is pretty subjective.

This is just my $.02, I am definitely not actively doing a ton of
development against the API these days, but I do see the pain that the API
brings us and hope we can make that better.

--Dave

On Tue, Aug 31, 2021 at 10:27 AM Rawlin Peters <ra...@apache.org> wrote:

> For your 1st reason, that all hinges on whether or not the software
> needs to use the unstable version of the API. That is why you also
> have the choice to stay on the stable version and not have to worry
> about coordinating upgrades. Mind you, upgrades would only need to be
> coordinated in the cases where a component actually uses one of the
> broken APIs in the unstable version. We can easily keep track of
> breaking changes in the changelog in order to call out certain
> upgrades that would need to be coordinated (for any components that
> use the unstable API). Just because that process might be more
> error-prone than keeping the latest API version stable doesn't mean we
> shouldn't do it. It's a small risk that has a huge reward in time
> saved by not having to deal with so many API upgrades.
>
> I think your 2nd reason is actually supporting this proposal:
>
> > The removal of the 1.x API is showing how expensive it truly is to
> > safely remove API versions, and that’s something to be weighed in addition
> > to maintenance cost to the project for those versions.
>
> The 1.x API removal was a prime example of just how much code was able
> to stay on the stable API version until we decided to remove it. With
> this proposal, all of that code would still be able to remain
> unchanged for a longer period of time than without this proposal,
> saving much unnecessary toil. It also reduces maintenance cost of
> prior versions because in creating fewer new major versions, we will
> have fewer of them to support over time.
>
> > I think the million-dollar question revolves more around how much/far
> > back we are willing to support. If it’s only one release at a time, that’s
> > going to drive those 3rd party code maintenance costs up significantly
> > higher as part of just doing business which will slow down deployments even
> > if releases are moving faster.
>
> I don't think so, because we'd be creating fewer major versions to
> remove in the first place, so we wouldn't have to worry about
> upgrading 3rd party code that stays on the stable API version. From
> the lessons learned with the API 1.x removal, the vast majority of 3rd
> party code stays on the stable API version until that version is
> getting removed. So we would be releasing faster *and* deploying
> faster.
>
> For your 3rd reason, developers working on the same route generally
> have to coordinate changes in some way, and we are usually very
> good about that. That is how it's always been done and will continue
> to be done, unaffected by this proposal. It's not really the release
> manager's responsibility to figure out what has been broken and what
> upgrades need to be coordinated. That is a collective responsibility
> of all ATC developers when making breaking changes. Breaking changes
> should be called out in the changelog, along with any prescribed
> upgrade orders. If this proposal is accepted, I think we should give
> these types of changes their own specific section in the changelog.
>
> For your 4th reason, I don't think we've ever decided to merge
> something that was half-baked just to avoid API versioning issues. A
> PR is already a feature branch and can remain open until ready to
> merge. The problem this proposal solves is when a developer starts
> developing a feature towards e.g. API 4.0, but we just cut a release
> and are now on API 5.0, so that developer then needs to *rework* their
> PR to now target API 5.0. Unnecessary rework decreases productivity
> and makes the feature take longer to get to production and produce
> value for us. This proposal basically extends the runway, so that we
> don't have to make the decision to delay the release if the feature is
> nearly complete in order to avoid that unnecessary rework. We can
> simply cut the release on time and have the new feature land in the
> subsequent release (with no unnecessary rework for the developer).
> Additionally, it is always somewhat disappointing when we have to
> *wait* to start developing a new feature because a release is about to
> be cut in order to avoid unnecessary rework caused by API versioning.
> This proposal would allow that work to start at any point in time
> without adding any unnecessary rework.
>
> For your last point, I know you keep linking to Rob's
> https://github.com/rob05c/apiver library whenever conversations
> related to API versioning come up, but this proposal is mainly
> concerned with major version changes, for which that library was not
> made. Also, I'm not really sure how Elixir would help solve this
> problem.
>
> - Rawlin
>

Re: [EXTERNAL] Proposal: stable vs unstable TO API versions

Posted by Rawlin Peters <ra...@apache.org>.
Replies below:

On Tue, Sep 7, 2021 at 9:37 AM Dave Neuman <ne...@apache.org> wrote:
> I do think that we will have to probably put some thought into when we
> determine an API is "stable" and what that process looks like.  It is a
> little uncomfortable to just leave that as a gut feel type thing, but I
> understand that it is also very hard to put more rules/processes around
> something that is pretty subjective.

I think if our general guideline is to only make breaking changes when
absolutely necessary for a new feature being added (i.e. we can't just
add a new optional field with a default for some reason or adding new
routes that tie into existing routes would make the API too unwieldy),
then we should just look at what we have planned on our roadmaps for
the next 6-12 months or so. If there is anything that sticks out as
needing a breaking API change, then perhaps we hold off on stabilizing
until we get that breaking change into the unstable API. Or, if the
API version has already been unstable for a certain amount of time,
perhaps we would stabilize it even if we have breaking changes on the
roadmap.
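
To illustrate the non-breaking option mentioned above (a new optional field
with a default), here is a minimal sketch. The `Example` type and the `name`
and `note` fields are hypothetical and not taken from the TO API; the point is
only that a payload written before the field existed still parses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """Hypothetical API object; not a real Traffic Ops type."""
    name: str
    # Hypothetical field added later. Older payloads omit it,
    # so it is given a default instead of being required.
    note: str = ""


def parse_example(payload: dict) -> Example:
    """Build an Example, falling back to the default for a missing 'note'."""
    return Example(name=payload["name"], note=payload.get("note", ""))


old_payload = {"name": "edge1"}                      # written before 'note' existed
new_payload = {"name": "edge1", "note": "rack 12"}   # written after

print(parse_example(old_payload))
print(parse_example(new_payload))
```

A breaking change would be the case where no sensible default exists, so the
old payload can no longer be accepted as-is.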

On Tue, Sep 7, 2021 at 9:58 AM Robert O Butts <ro...@apache.org> wrote:
>
> I'm concerned that using this "unstable" version makes it impossible to
> upgrade in-place.
>
> Because if a client (cache config, Traffic Monitor, random ops scripts,
> etc) uses it, and a breaking change is made, if you upgrade Traffic Ops
> first you'll break all clients, and if you upgrade clients first, they'll
> try to talk to TO and get 200's but the data will be malformed.

I understand your concern about upgrading, but in reality it's still
possible to upgrade components that use the unstable API version. It
will just require more coordination than upgrading components that use
the stable API. Plus, keep in mind, it's not like every single
breaking change to the unstable API automatically breaks every client
of the unstable API. Only clients using the particular route(s) being
broken in the unstable API would require coordination to upgrade.
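
As a rough sketch of that check, assuming the changelog lists the routes
broken in the unstable API and each client knows which routes it calls (the
route names and client lists below are made up for illustration):

```python
def routes_needing_coordination(broken_routes, client_routes):
    """Return the routes a client uses that the changelog lists as broken."""
    return sorted(set(broken_routes) & set(client_routes))


# Hypothetical route names, for illustration only.
broken = ["example/servers", "example/profiles"]
monitor_routes = ["example/cdns", "example/servers"]
script_routes = ["example/cdns"]

# Non-empty result: this client's upgrade must be coordinated with TO.
print(routes_needing_coordination(broken, monitor_routes))
# Empty result: this client can be upgraded in any order.
print(routes_needing_coordination(broken, script_routes))
```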

> Worse, it seems like this isn't obvious. Which makes it a pretty big
> footgun, if ATC operators use the "beta" API in their production CDN
> without realizing they just made it impossible to upgrade.

If we declare a certain API version unstable, ATC operators should
understand the risks of using it, just like there are risks involved
in using the API in general. Using the API to make changes is
generally a last-resort option when making the same changes in the UI
would take much longer. Using the UI is generally the much safer
option since it has a lot more built-in safeties (confirmations, form
validation, etc) than the API, but in the case where ATC operators
absolutely need the new features in the unstable API and can't use the
UI instead, they will have to take that risk.

> On the other hand, I'm not seeing the big development savings.

https://github.com/apache/trafficcontrol/pull/6145 -- not having to add
60,000 lines of code just to create a new major TO API version is a pretty
big savings,
and that is not even counting all of the "if version == x"
conditionals that have to clutter the code to handle multiple API
versions. The fewer version-specific conditionals we have to deal with
in the code, the easier it is to develop and the less bug-prone it is.
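
For anyone who hasn't run into these, a minimal sketch of the kind of
conditional I mean. The field names and the change between versions are
hypothetical, not the actual TO code; every additional supported major
version tends to add another branch like this to each affected handler.

```python
def render_example(row: dict, major_version: int) -> dict:
    """Shape one stored row for the requested API major version."""
    body = {"name": row["name"]}
    if major_version >= 4:
        # Hypothetical breaking change: a single value became a list.
        body["tags"] = list(row["tags"])
    else:
        # Older major versions still expect the single-value field.
        body["tag"] = row["tags"][0]
    return body


row = {"name": "edge1", "tags": ["a", "b"]}
print(render_example(row, 3))
print(render_example(row, 4))
```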

> Since using it makes it impossible to upgrade,
> this means all production CDNs will have to wait 2 major versions for new
> features.

Again, this is a false statement. CDNs will have access to unstable
features via the API immediately upon release, and if certain
components need new changes in the unstable API, their upgrades may
need to be coordinated with the TO upgrade. Since `t3c` uses a large
percentage of the API currently and will most likely need to use the
unstable API, most of its upgrade concerns will be alleviated by the
addition of Cache Config Snapshots. The Cache Config Snapshots API
will generally always be stable in that the JSON snapshot will only
have fields added in a backwards-compatible manner. We should never
make a breaking change to a snapshot, and in general we never really
have (at least for the CRConfig snapshot that I know of). So with
Cache Config Snapshots, `t3c` will always have access to new features
right away and won't have to use the unstable API. Hopefully that
alleviates some of your upgrade concerns with respect to `t3c`. Most
other ATC components use a much smaller percentage of the API and
generally don't always need to use the latest API version.
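
One way the "fields are only ever added" rule for snapshots could be checked
is sketched below. This is an assumption about how such a check might look,
not an existing ATC tool, and the snapshot contents are invented. It treats a
new snapshot as compatible if every key in the old one is still present with
a value of the same JSON type.

```python
def is_additive(old, new) -> bool:
    """True if `new` keeps every key of `old`, recursively, with the same type."""
    if isinstance(old, dict):
        if not isinstance(new, dict):
            return False
        return all(k in new and is_additive(v, new[k]) for k, v in old.items())
    if isinstance(old, list):
        # Simplification: lists are compared by type only, not element shape.
        return isinstance(new, list)
    return type(old) is type(new)


# Hypothetical snapshot fragments, for illustration only.
old_snapshot = {"servers": {"edge1": {"port": 80}}}
added_field = {"servers": {"edge1": {"port": 80, "note": "new"}}, "extra": []}
removed_field = {"servers": {"edge1": {}}}

print(is_additive(old_snapshot, added_field))    # field added: compatible
print(is_additive(old_snapshot, removed_field))  # field removed: breaking
```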

- Rawlin