Posted to dev@arrow.apache.org by Wes McKinney <we...@gmail.com> on 2019/07/01 19:53:56 UTC

Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"

hi Micah,

Sorry for the delay in feedback. I looked at the document and it seems
like a reasonable perspective about forward- and
backward-compatibility.

It seems like the main thing you are proposing is to apply Semantic
Versioning to Format and Library versions separately. That's an
interesting idea; my thought had been to have a single version number
of the form FORMAT_VERSION.LIBRARY_VERSION.PATCH_VERSION. But your
proposal is more flexible in some ways, so let me clarify it for
others reading.

In what you are proposing, the next release would be:

Format version: 1.0.0
Library version: 1.0.0

Suppose that 20 major versions down the road we stand at:

Format version: 1.5.0
Library version: 20.0.0

The minor version of the Format would indicate that there are
additions, like new elements in the Type union, but that the Format is
otherwise backward and forward compatible. So the Minor version means
"new things, but old clients will not be disrupted if those new things
are not used". We've already been doing this since the V4 Format
iteration, but we have not had a way to signal that there may be new
features. As a corollary to this, I wonder if we should create a dual
version in the metadata:

PROTOCOL VERSION: (what is currently MetadataVersion, V2)
FEATURE VERSION: not tracked at all

So Minor version bumps in the format would trigger a bump in the
FeatureVersion. Note that we don't really have a mechanism for clients
and servers to report to each other what features they support, so
this could help with that for applications where it might matter.
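
To make the dual-version idea concrete, here is a rough Python sketch
of how a client and server might agree on a feature level. Everything
in it is hypothetical: FormatMetadata, its two fields, and
negotiated_feature_version are illustrative names, not part of any
existing Arrow API, and it assumes the FeatureVersion is a plain
integer that grows with each Format minor release.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormatMetadata:
    """Hypothetical pair of versions that each side would report."""
    protocol_version: int  # plays the role of today's MetadataVersion
    feature_version: int   # assumed counter, bumped on each Format minor release


def negotiated_feature_version(client: FormatMetadata,
                               server: FormatMetadata) -> Optional[int]:
    """Return the highest feature level both sides understand.

    Returns None when the protocol versions differ, since this sketch
    assumes that different protocol versions cannot interoperate.
    """
    if client.protocol_version != server.protocol_version:
        return None
    return min(client.feature_version, server.feature_version)
```

Taking the minimum means the newer side simply avoids features the
older side has never heard of, which matches the "old clients will not
be disrupted if those new things are not used" reading above.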

Should backward/forward compatibility be disrupted in the future, then
a change to the major version would be required. So in year 2025, say,
we might decide that we want to do:

Format version: 2.0.0
Library version: 21.0.0

The Format version would live in the project's Documentation, so the
Apache releases are only the library version.
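
As an illustration only, the Format version rules described above
could be checked along these lines in Python. The helper names
(FormatVersion, parse_format_version, same_compatibility_line,
may_have_unknown_additions) are made up for this sketch and assume a
plain MAJOR.MINOR.PATCH string.

```python
from typing import NamedTuple


class FormatVersion(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_format_version(text: str) -> FormatVersion:
    """Parse a 'MAJOR.MINOR.PATCH' string such as '1.5.0'."""
    major, minor, patch = (int(part) for part in text.split("."))
    return FormatVersion(major, minor, patch)


def same_compatibility_line(a: FormatVersion, b: FormatVersion) -> bool:
    """Backward/forward compatibility is only expected within one major version."""
    return a.major == b.major


def may_have_unknown_additions(reader: FormatVersion,
                               writer: FormatVersion) -> bool:
    """True when the writer's minor version is newer than the reader's.

    The data may then contain additions (for example new Type union
    members) that the reader does not know; it is still readable if
    those additions are not actually used.
    """
    return same_compatibility_line(reader, writer) and writer.minor > reader.minor
```

Under these assumptions a 1.0.0 reader and a 1.5.0 writer are on the
same compatibility line, with a warning that unknown additions may be
present, while 1.5.0 and 2.0.0 are not.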

Regarding your open questions:

1. Should we clean up "warts" on the specification, like redundant information?

I don't think it's necessary. So if Metadata V5 is Format Version
1.0.0 (currently we are V4, but we're discussing some possible
non-forward-compatible changes...) I think that's OK. None of these
things are "hurting" anything.

2. Do we need additional mechanisms for marking some features as experimental?

Not sure, but I think this can be mostly addressed through
documentation. Flight will still be experimental in 1.0.0, for
example.

3. Do we need protocol negotiation mechanisms in Flight?

Could you explain what you mean? Are you thinking of a case where
there is some major revamp of the protocol and you need to switch
between a "V1 Flight Protocol" and a "V2 Flight Protocol"?

- Wes

On Thu, Jun 13, 2019 at 2:17 AM Micah Kornfield <em...@gmail.com> wrote:
>
> Hi Everyone,
> I think there might be some ideas that we still need to reach consensus on
> for how the format and libraries evolve in a post-1.0.0 release world.
>  Specifically, I think we need to agree on definitions for
> backwards/forwards compatibility and its implications for versioning the
> format.
>
> To this end I put some thoughts down in a Google Doc [1] for the purposes
> of discussion.  Comments welcome.  I will start threads for any comments in
> the document that seem to warrant further discussion, and once we reach
> consensus I can create a patch to document what we decide on as part of the
> specification.
>
> Thanks,
> Micah
>
> [1]
> https://docs.google.com/document/d/1uBitWu57rDu85tNHn0NwstAbrlYqor9dPFg_7QaE-nc/edit#

Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"

Posted by Micah Kornfield <em...@gmail.com>.
SGTM. Could you or another PMC member start one?

Thanks,
Micah

On Saturday, July 13, 2019, Wes McKinney <we...@gmail.com> wrote:

> Micah -- I would suggest that -- absent more opinions -- we vote about
> adopting the versioning scheme you described here (Format Version and
> Library Version)

Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"

Posted by Wes McKinney <we...@gmail.com>.
Micah -- I would suggest that -- absent more opinions -- we vote about
adopting the versioning scheme you described here (Format Version and
Library Version)


Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"

Posted by Wes McKinney <we...@gmail.com>.
On Wed, Jul 10, 2019 at 12:43 AM Micah Kornfield <em...@gmail.com> wrote:
>
> Hi Eric,
> Short answer: I think your understanding matches what I was proposing.  Longer answer below.
>
>> So, for example, we release library v1.0.0 in a few months and then library v2.0.0 a few months after that.  In v2.0.0, C++, Python, and Java didn't make any breaking API changes from 1.0.0. But C# made 3 API breaking changes. This would be acceptable?
>
> Yes.  I think all language bindings are going under rapid enough iteration that we are making at least a few small breaking API changes on each release even though we try to avoid it.  I think it will be worth having further discussions on the release process once at least a few languages get to a more stable point.
>

I agree with this. I think we are a pretty long way from making
API stability _guarantees_ in any of the implementations, though we
certainly should try to be courteous about the changes we do make,
allowing for graceful transitions over a period of 1-2 releases if
possible.


Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"

Posted by Micah Kornfield <em...@gmail.com>.
Hi Eric,
Short answer: I think your understanding matches what I was proposing.
Longer answer below.

> So, for example, we release library v1.0.0 in a few months and then library
> v2.0.0 a few months after that.  In v2.0.0, C++, Python, and Java didn't
> make any breaking API changes from 1.0.0. But C# made 3 API breaking
> changes. This would be acceptable?

Yes.  I think all language bindings are undergoing rapid enough iteration
that we are making at least a few small breaking API changes on each
release even though we try to avoid it.  I think it will be worth having
further discussions on the release process once at least a few languages
get to a more stable point.
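
For readers following along, the Library Version rule as Eric
summarized it (MAJOR bump on every normal release, MINOR bump for a
patch/bug-fix release) could be sketched in Python like this. The
function name and the "normal"/"bugfix" labels are made up for the
example and are not part of any release tooling.

```python
def next_library_version(current: str, release_kind: str) -> str:
    """Return the next Library Version under the scheme discussed here.

    release_kind is a hypothetical label:
      "normal" -> bump MAJOR (API-breaking changes are allowed)
      "bugfix" -> bump MINOR (patch/bug-fix type of release)
    """
    major, minor, _patch = (int(part) for part in current.split("."))
    if release_kind == "normal":
        return f"{major + 1}.0.0"
    if release_kind == "bugfix":
        return f"{major}.{minor + 1}.0"
    raise ValueError(f"unknown release kind: {release_kind!r}")
```

So a normal release after 1.0.0 would be 2.0.0 whether or not a given
language binding actually broke its API, and a bug-fix release after
1.0.0 would be 1.1.0.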

Thanks,
Micah

On Tue, Jul 9, 2019 at 2:26 PM Eric Erhardt <Er...@microsoft.com>
wrote:

> Just to be sure I fully understand the proposal:
>
> For the Library Version, we are going to increment the MAJOR version on
> every normal release, and increment the MINOR version if we need to release
> a patch/bug fix type of release.
>
> Since SemVer allows for API breaking changes on MAJOR versions, this
> basically means, each library (C++, Python, C#, Java, etc) _can_ introduce
> API breaking changes on every normal release (like we have been with the
> 0.x.0 releases).
>
> So, for example, we release library v1.0.0 in a few months and then
> library v2.0.0 a few months after that.  In v2.0.0, C++, Python, and Java
> didn't make any breaking API changes from 1.0.0. But C# made 3 API breaking
> changes. This would be acceptable?
>
> If my understanding above is correct, then I think this is a good plan.
> Initially I was concerned that the C# library wouldn't be free to make API
> breaking changes with making the version `1.0.0`. The C# library is still
> pretty inadequate, and I have a feeling there are a few things that will
> need to change about it in the future. But with the above plan, this
> concern won't be a problem.
>
> Eric
>
> -----Original Message-----
> From: Micah Kornfield <em...@gmail.com>
> Sent: Monday, July 1, 2019 10:02 PM
> To: Wes McKinney <we...@gmail.com>
> Cc: dev@arrow.apache.org
> Subject: Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"
>
> Hi Wes,
> Thanks for your response.  In regards to the protocol negotiation your
> description of feature reporting (snipped below) is along the lines of what
> I was thinking.  It might not be necessary for 1.0.0, but at some point
> might become useful.
>
>
> >  Note that we don't really have a mechanism for clients and servers to
> > report to each other what features they support, so this could help
> > with that when for applications where it might matter.
>
>
> Thanks,
> Micah
>
>
> On Mon, Jul 1, 2019 at 12:54 PM Wes McKinney <we...@gmail.com> wrote:
>
> > hi Micah,
> >
> > Sorry for the delay in feedback. I looked at the document and it seems
> > like a reasonable perspective about forward- and
> > backward-compatibility.
> >
> > It seems like the main thing you are proposing is to apply Semantic
> > Versioning to Format and Library versions separately. That's an
> > interesting idea, my thought had been to have a version number that is
> > FORMAT_VERSION.LIBRARY_VERSION.PATCH_VERSION. But your proposal is
> > more flexible in some ways, so let me clarify for others reading
> >
> > In what you are proposing, the next release would be:
> >
> > Format version: 1.0.0
> > Library version: 1.0.0
> >
> > Suppose that 20 major versions down the road we stand at
> >
> > Format version: 1.5.0
> > Library version: 20.0.0
> >
> > The minor version of the Format would indicate that there are
> > additions, like new elements in the Type union, but otherwise backward
> > and forward compatible. So the Minor version means "new things, but
> > old clients will not be disrupted if those new things are not used".
> > We've already been doing this since the V4 Format iteration but we
> > have not had a way to signal that there may be new features. As a
> > corollary to this, I wonder if we should create a dual version in the
> > metadata
> >
> > PROTOCOL VERSION: (what is currently MetadataVersion, V2)
> > FEATURE VERSION: not tracked at all
> >
> > So Minor version bumps in the format would trigger a bump in the
> > FeatureVersion. Note that we don't really have a mechanism for clients
> > and servers to report to each other what features they support, so
> > this could help with that for applications where it might matter.
> >
> > Should backward/forward compatibility be disrupted in the future, then
> > a change to the major version would be required. So in year 2025, say,
> > we might decide that we want to do:
> >
> > Format version: 2.0.0
> > Library version: 21.0.0
> >
> > The Format version would live in the project's Documentation, so the
> > Apache releases are only the library version.
> >
> > Regarding your open questions:
> >
> > 1. Should we clean up "warts" on the specification, like redundant
> > information
> >
> > I don't think it's necessary. So if Metadata V5 is Format Version
> > 1.0.0 (currently we are V4, but we're discussing some possible
> > non-forward compatible changes...) I think that's OK. None of these
> > things are "hurting" anything
> >
> > 2. Do we need additional mechanisms for marking some features as
> > experimental?
> >
> > Not sure, but I think this can be mostly addressed through
> > documentation. Flight will still be experimental in 1.0.0, for
> > example.
> >
> > 3. Do we need protocol negotiation mechanisms in Flight
> >
> > Could you explain what you mean? Are you thinking if there is some
> > major revamp of the protocol and you need to switch between a "V1
> > Flight Protocol" and a "V2 Flight Protocol"?
> >
> > - Wes
> >
> > On Thu, Jun 13, 2019 at 2:17 AM Micah Kornfield
> > <em...@gmail.com>
> > wrote:
> > >
> > > Hi Everyone,
> > > I think there might be some ideas that we still need to reach
> > > consensus
> > on
> > > for how the format and libraries evolve in a post-1.0.0 release world.
> > >  Specifically, I think we need to agree on definitions for
> > > backwards/forwards compatibility and its implications for versioning
> > > the format.
> > >
> > > To this end I put some thoughts down in a Google Doc [1] for the
> > > purposes of discussion.  Comments welcome.  I will start threads for
> > > any comments
> > in
> > > the document that seem to warrant further discussion, and once we
> > > reach consensus I can create a patch to document what we decide on
> > > as part of
> > the
> > > specification.
> > >
> > > Thanks,
> > > Micah
> > >
> > > [1]
> > >
> > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.google.com%2Fdocument%2Fd%2F1uBitWu57rDu85tNHn0NwstAbrlYqor9dPFg_7QaE-nc%2Fedit%23&data=02%7C01%7CEric.Erhardt%40microsoft.com%7C6fc59049ffb049c9ddb108d6fe99bebf%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636976334243577292&sdata=YNQ%2FgL5rvmvRqvvW%2Bxjmb%2F4KeEe2JHe1ruws2VP%2BvK4%3D&reserved=0
> >
>

RE: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"

Posted by Eric Erhardt <Er...@microsoft.com.INVALID>.
Just to be sure I fully understand the proposal:

For the Library Version, we are going to increment the MAJOR version on every normal release, and increment the MINOR version if we need to release a patch/bug fix type of release.

Since SemVer allows for API-breaking changes on MAJOR versions, this basically means that each library (C++, Python, C#, Java, etc.) _can_ introduce API-breaking changes on every normal release (as we have been doing with the 0.x.0 releases).

So, for example, we release library v1.0.0 in a few months and then library v2.0.0 a few months after that.  In v2.0.0, C++, Python, and Java didn't make any API-breaking changes from 1.0.0, but C# made 3 API-breaking changes. Would this be acceptable?

If my understanding above is correct, then I think this is a good plan. Initially I was concerned that the C# library wouldn't be free to make API-breaking changes once the version became `1.0.0`. The C# library is still pretty inadequate, and I have a feeling there are a few things that will need to change about it in the future. But with the above plan, this concern won't be a problem.
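
As a rough illustration of the library versioning scheme described above, here is a minimal Python sketch. It assumes that every normal release increments MAJOR and every patch/bug-fix release increments MINOR; the class and function names are hypothetical and are not part of any Arrow library.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class LibraryVersion:
    """Hypothetical MAJOR.MINOR.PATCH library version (illustration only)."""
    major: int
    minor: int = 0
    patch: int = 0

    def next_normal_release(self) -> "LibraryVersion":
        # Assumption: every normal release bumps MAJOR and resets the rest.
        return LibraryVersion(self.major + 1, 0, 0)

    def next_bugfix_release(self) -> "LibraryVersion":
        # Assumption: a patch/bug-fix release bumps MINOR only.
        return LibraryVersion(self.major, self.minor + 1, 0)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def may_break_api(old: LibraryVersion, new: LibraryVersion) -> bool:
    """Under SemVer, API-breaking changes are only allowed on a MAJOR bump."""
    return new.major > old.major


v1 = LibraryVersion(1)
v2 = v1.next_normal_release()    # a language binding may break its API here
v1_fix = v1.next_bugfix_release()  # no API breaks allowed here
```

Under this reading, a binding such as C# is allowed (but not required) to break its API between `v1` and `v2`, while `v1_fix` must stay API-compatible with `v1`.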

Eric

-----Original Message-----
From: Micah Kornfield <em...@gmail.com> 
Sent: Monday, July 1, 2019 10:02 PM
To: Wes McKinney <we...@gmail.com>
Cc: dev@arrow.apache.org
Subject: Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"

Hi Wes,
Thanks for your response.  Regarding the protocol negotiation, your description of feature reporting (snipped below) is along the lines of what I was thinking.  It might not be necessary for 1.0.0, but at some point it might become useful.


>  Note that we don't really have a mechanism for clients and servers to 
> report to each other what features they support, so this could help 
> with that for applications where it might matter.


Thanks,
Micah


On Mon, Jul 1, 2019 at 12:54 PM Wes McKinney <we...@gmail.com> wrote:

> hi Micah,
>
> Sorry for the delay in feedback. I looked at the document and it seems 
> like a reasonable perspective about forward- and 
> backward-compatibility.
>
> It seems like the main thing you are proposing is to apply Semantic 
> Versioning to Format and Library versions separately. That's an 
> interesting idea, my thought had been to have a version number that is 
> FORMAT_VERSION.LIBRARY_VERSION.PATCH_VERSION. But your proposal is 
> more flexible in some ways, so let me clarify for others reading
>
> In what you are proposing, the next release would be:
>
> Format version: 1.0.0
> Library version: 1.0.0
>
> Suppose that 20 major versions down the road we stand at
>
> Format version: 1.5.0
> Library version: 20.0.0
>
> The minor version of the Format would indicate that there are 
> additions, like new elements in the Type union, but otherwise backward 
> and forward compatible. So the Minor version means "new things, but 
> old clients will not be disrupted if those new things are not used".
> We've already been doing this since the V4 Format iteration but we 
> have not had a way to signal that there may be new features. As a 
> corollary to this, I wonder if we should create a dual version in the 
> metadata
>
> PROTOCOL VERSION: (what is currently MetadataVersion, V2)
> FEATURE VERSION: not tracked at all
>
> So Minor version bumps in the format would trigger a bump in the 
> FeatureVersion. Note that we don't really have a mechanism for clients 
> and servers to report to each other what features they support, so 
> this could help with that for applications where it might matter.
>
> Should backward/forward compatibility be disrupted in the future, then 
> a change to the major version would be required. So in year 2025, say, 
> we might decide that we want to do:
>
> Format version: 2.0.0
> Library version: 21.0.0
>
> The Format version would live in the project's Documentation, so the 
> Apache releases are only the library version.
>
> Regarding your open questions:
>
> 1. Should we clean up "warts" on the specification, like redundant 
> information
>
> I don't think it's necessary. So if Metadata V5 is Format Version
> 1.0.0 (currently we are V4, but we're discussing some possible 
> non-forward compatible changes...) I think that's OK. None of these 
> things are "hurting" anything
>
> 2. Do we need additional mechanisms for marking some features as 
> experimental?
>
> Not sure, but I think this can be mostly addressed through 
> documentation. Flight will still be experimental in 1.0.0, for 
> example.
>
> 3. Do we need protocol negotiation mechanisms in Flight
>
> Could you explain what you mean? Are you thinking if there is some 
> major revamp of the protocol and you need to switch between a "V1 
> Flight Protocol" and a "V2 Flight Protocol"?
>
> - Wes
>
> On Thu, Jun 13, 2019 at 2:17 AM Micah Kornfield 
> <em...@gmail.com>
> wrote:
> >
> > Hi Everyone,
> > I think there might be some ideas that we still need to reach 
> > consensus
> on
> > for how the format and libraries evolve in a post-1.0.0 release world.
> >  Specifically, I think we need to agree on definitions for 
> > backwards/forwards compatibility and its implications for versioning 
> > the format.
> >
> > To this end I put some thoughts down in a Google Doc [1] for the 
> > purposes of discussion.  Comments welcome.  I will start threads for 
> > any comments
> in
> > the document that seem to warrant further discussion, and once we 
> > reach consensus I can create a patch to document what we decide on 
> > as part of
> the
> > specification.
> >
> > Thanks,
> > Micah
> >
> > [1]
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.google.com%2Fdocument%2Fd%2F1uBitWu57rDu85tNHn0NwstAbrlYqor9dPFg_7QaE-nc%2Fedit%23&data=02%7C01%7CEric.Erhardt%40microsoft.com%7C6fc59049ffb049c9ddb108d6fe99bebf%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636976334243577292&sdata=YNQ%2FgL5rvmvRqvvW%2Bxjmb%2F4KeEe2JHe1ruws2VP%2BvK4%3D&reserved=0
>

Re: [Discuss] Compatibility Guarantees and Versioning Post "1.0.0"

Posted by Micah Kornfield <em...@gmail.com>.
Hi Wes,
Thanks for your response.  Regarding the protocol negotiation, your
description of feature reporting (snipped below) is along the lines of what
I was thinking.  It might not be necessary for 1.0.0, but at some point
it might become useful.


>  Note that we don't really have a mechanism for clients
> and servers to report to each other what features they support, so
> this could help with that for applications where it might matter.
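
For readers following along, one way such feature reporting could work is sketched below in Python. This is only an illustration under assumptions: the `FormatVersion` type and the `negotiate` function are hypothetical names, nothing like this exists in the metadata or in Flight today, and it assumes the "protocol version" maps to the Format MAJOR version and the "feature version" to the Format MINOR version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormatVersion:
    """Hypothetical dual version reported by a client or server."""
    protocol: int  # assumed to change only on compatibility-breaking changes
    feature: int   # assumed to change when new, optional features are added


def negotiate(client: FormatVersion,
              server: FormatVersion) -> Optional[FormatVersion]:
    """Return the highest version both peers support, or None if incompatible.

    Peers with different protocol versions cannot talk to each other. Peers
    with the same protocol version fall back to the lower feature version,
    so neither side uses features the other does not know about.
    """
    if client.protocol != server.protocol:
        return None
    return FormatVersion(client.protocol, min(client.feature, server.feature))


# An old client talking to a newer server within the same protocol version.
agreed = negotiate(FormatVersion(1, 2), FormatVersion(1, 5))

# Peers on different protocol versions have no common ground.
mismatch = negotiate(FormatVersion(1, 5), FormatVersion(2, 0))
```

Falling back to the lower feature version matches the idea that old clients are not disrupted as long as the new things are not used.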


Thanks,
Micah


On Mon, Jul 1, 2019 at 12:54 PM Wes McKinney <we...@gmail.com> wrote:

> hi Micah,
>
> Sorry for the delay in feedback. I looked at the document and it seems
> like a reasonable perspective about forward- and
> backward-compatibility.
>
> It seems like the main thing you are proposing is to apply Semantic
> Versioning to Format and Library versions separately. That's an
> interesting idea, my thought had been to have a version number that is
> FORMAT_VERSION.LIBRARY_VERSION.PATCH_VERSION. But your proposal is
> more flexible in some ways, so let me clarify for others reading
>
> In what you are proposing, the next release would be:
>
> Format version: 1.0.0
> Library version: 1.0.0
>
> Suppose that 20 major versions down the road we stand at
>
> Format version: 1.5.0
> Library version: 20.0.0
>
> The minor version of the Format would indicate that there are
> additions, like new elements in the Type union, but otherwise backward
> and forward compatible. So the Minor version means "new things, but
> old clients will not be disrupted if those new things are not used".
> We've already been doing this since the V4 Format iteration but we
> have not had a way to signal that there may be new features. As a
> corollary to this, I wonder if we should create a dual version in the
> metadata
>
> PROTOCOL VERSION: (what is currently MetadataVersion, V2)
> FEATURE VERSION: not tracked at all
>
> So Minor version bumps in the format would trigger a bump in the
> FeatureVersion. Note that we don't really have a mechanism for clients
> and servers to report to each other what features they support, so
> this could help with that for applications where it might matter.
>
> Should backward/forward compatibility be disrupted in the future, then
> a change to the major version would be required. So in year 2025, say,
> we might decide that we want to do:
>
> Format version: 2.0.0
> Library version: 21.0.0
>
> The Format version would live in the project's Documentation, so the
> Apache releases are only the library version.
>
> Regarding your open questions:
>
> 1. Should we clean up "warts" on the specification, like redundant
> information
>
> I don't think it's necessary. So if Metadata V5 is Format Version
> 1.0.0 (currently we are V4, but we're discussing some possible
> non-forward compatible changes...) I think that's OK. None of these
> things are "hurting" anything
>
> 2. Do we need additional mechanisms for marking some features as
> experimental?
>
> Not sure, but I think this can be mostly addressed through
> documentation. Flight will still be experimental in 1.0.0, for
> example.
>
> 3. Do we need protocol negotiation mechanisms in Flight
>
> Could you explain what you mean? Are you thinking if there is some
> major revamp of the protocol and you need to switch between a "V1
> Flight Protocol" and a "V2 Flight Protocol"?
>
> - Wes
>
> On Thu, Jun 13, 2019 at 2:17 AM Micah Kornfield <em...@gmail.com>
> wrote:
> >
> > Hi Everyone,
> > I think there might be some ideas that we still need to reach consensus
> on
> > for how the format and libraries evolve in a post-1.0.0 release world.
> >  Specifically, I think we need to agree on definitions for
> > backwards/forwards compatibility and its implications for versioning the
> > format.
> >
> > To this end I put some thoughts down in a Google Doc [1] for the purposes
> > of discussion.  Comments welcome.  I will start threads for any comments
> in
> > the document that seem to warrant further discussion, and once we reach
> > consensus I can create a patch to document what we decide on as part of
> the
> > specification.
> >
> > Thanks,
> > Micah
> >
> > [1]
> >
> https://docs.google.com/document/d/1uBitWu57rDu85tNHn0NwstAbrlYqor9dPFg_7QaE-nc/edit#
>