Posted to dev@thrift.apache.org by Dave Engberg <de...@evernote.com> on 2010/08/24 20:44:31 UTC

Philosophy about forward compatibility of generated code?

  [ Caveat:  I should obviously spend more time keeping up with the 
daily email traffic to weigh in on proposed changes before full 
releases, but you know how it goes at a start-up.  Anyhoo ... ]

I'm starting to go through the process of updating our API to Thrift 
0.4, which hits around a dozen apps that we maintain (clients and 
server) and around a hundred implemented by third-party developers.

I tend to see changes in Thrift releases that break the external 
interfaces of generated code.  In some cases this may reflect a 
deliberate, carefully weighed decision, but other changes seem to 
break compatibility casually, for virtually no benefit.

For example, in 0.4.0, the generated Java interfaces for 'binary' 
fields changed from the built-in 'byte[]' Java type to the 
java.nio.ByteBuffer class.  This means that all relevant method 
signatures and fields were changed, and all instance setters and 
getters went from:
   public byte[] getImage() ...
   public void setImage(byte[] image) ...
to:
   public ByteBuffer getImage()
   public void setImage(ByteBuffer image)

That means that we'll need to change hundreds of lines of code from:
    foo.setImage(bytes);
to:
   foo.setImage(ByteBuffer.wrap(bytes));
and from:
   byte[] bar = foo.getImage();
to:
   byte[] bar = new byte[foo.getImage().capacity()];
   foo.getImage().get(bar, 0, bar.length);

(This compatibility change seems particularly gratuitous since it 
actually increases the cost and overhead of constructing objects and 
marshaling data, while replacing a core Java type with an odd 
dependency on a low-level java.nio class that isn't really designed 
for this kind of use.)
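
For what it's worth, one way to limit that churn on the application 
side is to centralize the conversions in a small helper, so only one 
place has to reason about buffer positions and copies.  The following 
is a minimal sketch using only the standard java.nio API; the class 
and method names are illustrative, not part of Thrift:

   import java.nio.ByteBuffer;

   // Illustrative helper for the byte[] <-> ByteBuffer migration.
   // Hypothetical utility code, not part of the Thrift library.
   public final class BinaryFields {
       private BinaryFields() {}

       // Copies the buffer's remaining bytes without touching its
       // position, so the same field can be read more than once.
       public static byte[] copyOf(ByteBuffer buf) {
           ByteBuffer dup = buf.duplicate();  // independent position/limit
           byte[] out = new byte[dup.remaining()];
           dup.get(out);
           return out;
       }

       // Wraps without copying; the buffer shares the array's storage.
       public static ByteBuffer wrap(byte[] bytes) {
           return ByteBuffer.wrap(bytes);
       }
   }

Call sites then become foo.setImage(BinaryFields.wrap(bytes)) and 
byte[] bar = BinaryFields.copyOf(foo.getImage()).  Note that the 
capacity()/get() pattern above also silently advances the returned 
buffer's position; copyOf avoids that by reading through a duplicate.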

So my meta-question is:  what's the Thrift project's philosophy about 
maintaining interface consistency and backward compatibility between 
releases?

At some point in the project's lifecycle, I think that Thrift should 
shift to a "don't break existing code unless absolutely, positively 
necessary for performance/security/etc." stance.
But perhaps there's an explicit policy in place now that "Since we're 
only at version 0.4, all bets are off and any marginal change justifies 
breaking existing code. We'll stabilize at 1.0."


Re: Philosophy about forward compatibility of generated code?

Posted by Roger Meier <ro...@bufferoverflow.ch>.
I really like the http://semver.org/ approach, which is very simple 
to understand and described within a few lines.
Nobody likes to search the mailing list or Jira's generated changelog 
to find out what the relevant changes are.

Probably two wiki pages would be very helpful for users and developers:
- http://wiki.apache.org/thrift/VersioningSystem => describes the 
overall concept
- http://wiki.apache.org/thrift/RoadMap => what the plans are for the 
next releases, just a few words about the overall target... JIRA is 
just a tool ;-)

I absolutely agree with David's position about 1.0 versions.
...any version might be useful, even a 0.1.2.

--roger



Re: Philosophy about forward compatibility of generated code?

Posted by Doug Cutting <cu...@apache.org>.
On 08/26/2010 01:56 PM, Bryan Duxbury wrote:
> This all sounds really cool, except that there is more than one kind of
> protocol. Further, I don't think we should *ever* break binary compatibility
> with a protocol. The cost is simply too high.

FWIW, I started out Avro with 1.0 so we never had to argue when it was 
ready to go 1.0.

For Avro we use roughly FORMAT.API.PATCH.  In other words, 2.0 could 
make an incompatible change to a binary format, but hopefully we'll 
never have a 2.0 release.  Incompatible API changes are generally 
discouraged, but permitted in minor releases.  Patch releases can 
include new features and bugfixes, but no incompatible API changes.

Doug

Re: Philosophy about forward compatibility of generated code?

Posted by Todd Lipcon <to...@cloudera.com>.
On Thu, Aug 26, 2010 at 1:56 PM, Bryan Duxbury <br...@rapleaf.com> wrote:

> This all sounds really cool, except that there is more than one kind of
> protocol. Further, I don't think we should *ever* break binary
> compatibility
> with a protocol. The cost is simply too high.

+1. A backwards-incompatible break in TBinaryProtocol would be fairly
disastrous.

I would be fine, however, with a TBinaryProtocolv2, which users could
elect to use at some point, provided the old version was kept around
for a year or two to give people time to transition (yes, I said
year!).





-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Philosophy about forward compatibility of generated code?

Posted by Joe Schaefer <jo...@yahoo.com>.
FWIW, I think the issues Anthony is discussing are exactly
focused on the right type of question: what stability promises
do you wish to convey using versioning?  When people say
"no promises til 1.0" they mean that the project has yet
to formalize via version numbers what sorts of promises it
wishes to make, so all bets are off.  And that convention
is well understood in the open-source world.  And again I
will point out that while Hadoop is shit-hot, its developer
community is not in nearly as fabulous shape, so suggesting
that this project emulate Hadoop won't win over your mentors.

Ideally the versioning policy will be written down somewhere,
voted on, and include instructions for managing "stable" branches.
There is absolutely nothing wrong with moving quickly through
major version numbers as David alludes to, so a thrift 3.0.0(.0)
in 1 year's time is certainly a possibility, depending on what
sorts of versioning rules get adopted.

It's definitely worth your trouble as committers to flesh this out
as you get an increasingly better grasp of the impact your code
changes make on downstream users through subsequent releases.




Re: Philosophy about forward compatibility of generated code?

Posted by Bryan Duxbury <br...@rapleaf.com>.
This all sounds really cool, except that there is more than one kind of
protocol. Further, I don't think we should *ever* break binary compatibility
with a protocol. The cost is simply too high.

I'm fine with us not making 1.0 magical, but we need to be aware that
some people will just assume it is and complain accordingly.

I'm happy to do "minor" releases monthly or so for the foreseeable future. I
think we should flip the major version to 1 when we feel like the set of
features we have in the languages we're calling "mature" is reasonably
consistent.


Re: Philosophy about forward compatibility of generated code?

Posted by Anthony Molinaro <an...@alumni.caltech.edu>.
I'd like to second David's opinion that 1.0 is a meaningless construct.
I like the philosophy outlined here about version numbers

http://semver.org/

In a nutshell, it's MAJOR.MINOR.PATCH, where MAJOR is a
binary-incompatible change, MINOR is a backward-compatible new
feature, and PATCH is a bugfix of an existing feature.

Now, with Thrift we are dealing with two things (a serialization
protocol and libraries that interact with it), so these version
numbers are not quite right.  I would say that Thrift should adopt a
four-part version.

PROTO_VERSION.MAJOR.MINOR.PATCH

where PROTO_VERSION is the serialization format version.  Any
version of Thrift released with a PROTO_VERSION of 0 is compatible
with all other 0 versions at the serialization level.

MAJOR would be for binary-incompatible changes to the generated code.
Since we are still pre-"1.0", these tend to happen anyway.  What this
means is that if 0.5.0.0 is released, you'll most likely have to
recompile any code which linked against 0.4.x.x if you want some new
features in the client or server library.  However, since the
PROTO_VERSION has not changed, you'd be able to interoperate with
older servers, and your server could take calls from older clients.

MINOR would be for new features which don't require you to rebuild if
you aren't using them.  This could involve things like adding a new
server type.  If it's a completely new server type, you haven't broken
any existing code, so you are okay.

PATCH would be for bugfix releases.  You probably wouldn't see a lot
of these, but they allow for quick fixes: if, say, a 0.5.0.0 release
had a major bug which was fixed quickly, you release a 0.5.0.1 and
move on.
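
To make the proposal concrete, here is a minimal sketch in Java of how
a client might interpret such a four-part version (the class and its
compatibility rules are hypothetical, not an actual Thrift API):

   // Hypothetical PROTO_VERSION.MAJOR.MINOR.PATCH version, per the
   // scheme proposed above; not an actual Thrift API.
   final class ThriftVersion {
       final int proto, major, minor, patch;

       ThriftVersion(String s) {
           String[] p = s.split("\\.");
           proto = Integer.parseInt(p[0]);
           major = Integer.parseInt(p[1]);
           minor = Integer.parseInt(p[2]);
           patch = Integer.parseInt(p[3]);
       }

       // Same PROTO_VERSION => compatible at the serialization level.
       boolean wireCompatibleWith(ThriftVersion other) {
           return proto == other.proto;
       }

       // Same PROTO_VERSION and MAJOR => generated code links without
       // a rebuild.
       boolean linkCompatibleWith(ThriftVersion other) {
           return proto == other.proto && major == other.major;
       }
   }

Under these rules, 0.5.0.0 and 0.4.2.1 would be wire-compatible (both
have PROTO_VERSION 0) but not link-compatible, matching the
recompile-but-still-interoperate behavior described above.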

Within such a framework, I think you then make decisions about when
to include new things and when to release according to the criteria
that David laid out.  Also, if this policy is spelled out explicitly
on the website, then questions about compatibility can be answered
there (via a pointer from a FAQ).

Also, this framework allows for quicker turnaround of certain features
and bugfixes.  I would expect to see several MINOR releases between
MAJOR releases.  It is "release early, release often", right?  I also
follow Cassandra pretty closely, and they seem to do MINOR releases
(although they use the PATCH number) every month or six weeks or so,
and MAJOR releases every six months or so.  Of course, a faster
release cycle is more of a headache for Bryan, so maybe that's not a
good idea :)

Anyway, just my opinion, I wouldn't expect any action on this unless
others thought it was a good idea.

-Anthony


-- 
------------------------------------------------------------------------
Anthony Molinaro                           <an...@alumni.caltech.edu>

Re: Philosophy about forward compatibility of generated code?

Posted by David Reiss <dr...@facebook.com>.
My opinion is that any time we face an incompatible change, the factors we
should consider are:

- How significant will the benefit be?  (In this case, it was huge
  because the byte[] fields were inherently broken.)
- How many users do we expect it to affect?  (E.g., Java vs. OCaml
  changes.)
- How difficult is the upgrade path?  (Changing the code to create a
  server is relatively simple.  Changing all code that works with
  generated structures is harder.  Changing the wire protocol is
  prohibitive.)
- What is the effect if client code is not updated?  (In C++ or Java,
  you probably get a nice, safe compile error, as the sketch after
  this list shows.  In Python, you might get difficult-to-explain
  behavior after your app has been running for a while.)
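
To make the Java case concrete: after a change like the 0.4.0 byte[]
to ByteBuffer one, stale call sites fail at compile time rather than
at runtime.  A small illustration (Foo is a stand-in for a generated
struct, not real generated code):

   import java.nio.ByteBuffer;

   // Stand-in for a Thrift-generated struct with a binary field.
   class Foo {
       private ByteBuffer image;
       public void setImage(ByteBuffer image) { this.image = image; }
   }

   class CallSite {
       static void update(Foo foo, byte[] bytes) {
           // Old call site: rejected by javac once the setter takes a
           // ByteBuffer, so the break is caught before deployment.
           // foo.setImage(bytes);   // error: incompatible types

           // Updated call site compiles and runs:
           foo.setImage(ByteBuffer.wrap(bytes));
       }
   }

A dynamically typed language has no such checkpoint, which is exactly
the Python hazard noted in the last point above.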

It's possible that I've left something off the list, but these are the
ones that come to mind.

This might be a controversial opinion, but I place no weight on version
numbers at all.  You'll notice that I intentionally left "whether we
have released version 1.0" off of my list.  Of course, others might
place weight on a 1.0 release, so it could indirectly affect my second
point.

>> At some point in the project's lifecycle, I think that Thrift should shift
>> to a "don't break existing code unless absolutely, positively necessary for
>> performance/security/etc."
>> But perhaps there's an explicit policy in place now that "Since we're only
>> at version 0.4, all bets are off and any marginal change justifies breaking
>> existing code. We'll stabilize at 1.0."
That's a reasonable position, but as both a developer and user of
Thrift, I personally would prefer not to do that.  When upgrade paths
are easy and upgrade errors are easily detected, I would *never* want to
be prevented from making a beneficial change because it wasn't
"absolutely, positively necessary".

If this means never releasing version 1.0, that's fine with me.  I
recently learned that Hadoop (an incredibly successful project) is still
hanging out at 0.22.  I'd be fine with Thrift doing the same.

If we do decide to release a version 1.0 and refrain from making
unnecessary breaking changes even if they are very beneficial and easy
to upgrade to, I would want to immediately create a 2.0 branch that
would accept more aggressive changes.

--David

On 08/24/2010 12:12 PM, Bryan Duxbury wrote:
> In general, I don't aim to break things. I hate updating my code as much as
> the next guy. However, we are still pre-1.0, which means if we're going to
> break code, now is the time to do it. In fact, we should get as much of it
> out of the way as we can before our user base and feature set become so
> entrenched that we're permanently committed to given interfaces.
> 
> That said, I reject your premise that we "break compatibility for virtually
> no benefit". In fact, this particular change was effected specifically to
> give us a substantial performance benefit for structs with many binary
> fields. (It also makes a lot of operations easier, including map keys and
> set elements, comparison and equality, etc.)
> 
> Finally, I would *gladly* accept a patch to the Java generator that
> generates accessors with the old-style signatures that wrap the new style
> ones. There's no reason for us not to have a 0.4.1 if it would ease folks'
> lives.
> 
> -Bryan

Re: Philosophy about forward compatibility of generated code?

Posted by Joe Schaefer <jo...@yahoo.com>.
----- Original Message ----

> From: Bryan Duxbury <br...@rapleaf.com>
> To: thrift-dev@incubator.apache.org
> Sent: Tue, August 24, 2010 3:12:18 PM
> Subject: Re: Philosophy about forward compatibility of generated code?
> 
> In general, I don't aim to break things. I hate updating my code as much as
> the next guy. However, we are still pre-1.0, which means if we're going to
> break code, now is the time to do it. In fact, we should get as much of it
> out of the way as we can before our user base and feature set becomes so
> entrenched that we're permanently committed to given interfaces.

+1.



Re: Philosophy about forward compatibility of generated code?

Posted by ro...@bufferoverflow.ch.
+1

Bryan!
I like the generator approach and your readiness to publish a 0.4.1  
bug fix release if really required.


Quoting Bryan Duxbury <br...@rapleaf.com>:

> In general, I don't aim to break things. I hate updating my code as much as
> the next guy. However, we are still pre-1.0, which means if we're going to
> break code, now is the time to do it. In fact, we should get as much of it
> out of the way as we can before our user base and feature set becomes so
> entrenched that we're permanently committed to given interfaces.
>
> That said, I reject your premise that we "break compatibility for virtually
> no benefit". In fact, this particular change was effected specifically to
> give us a substantial performance benefit for structs with many binary
> fields. (It also makes a lot of operations easier, including map keys and
> set elements, comparison and equality, etc.)
>
> Finally, I would *gladly* accept a patch to the Java generator that
> generates accessors with the old-style signatures that wrap the new style
> ones. There's no reason for us not to have a 0.4.1 if it would ease folks'
> lives.
>
> -Bryan


Re: Philosophy about forward compatibility of generated code?

Posted by Bryan Duxbury <br...@rapleaf.com>.
In general, I don't aim to break things. I hate updating my code as much as
the next guy. However, we are still pre-1.0, which means if we're going to
break code, now is the time to do it. In fact, we should get as much of it
out of the way as we can before our user base and feature set becomes so
entrenched that we're permanently committed to given interfaces.

That said, I reject your premise that we "break compatibility for virtually
no benefit". In fact, this particular change was effected specifically to
give us a substantial performance benefit for structs with many binary
fields. (It also makes a lot of operations easier, including map keys and
set elements, comparison and equality, etc.)
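
For illustration, here is a JDK-only sketch of the map-key and equality
problem with byte[] fields that the ByteBuffer change addresses; nothing
below is Thrift-generated code:

    import java.nio.ByteBuffer;
    import java.util.HashMap;
    import java.util.Map;

    public class ArrayVsBuffer {
      public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};
        System.out.println(a.equals(b));  // false: arrays compare by identity
        System.out.println(ByteBuffer.wrap(a).equals(ByteBuffer.wrap(b)));  // true: by contents

        Map<byte[], String> m = new HashMap<byte[], String>();
        m.put(a, "hit");
        System.out.println(m.get(b));  // null: byte[] hashCode is identity-based, too
      }
    }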

Finally, I would *gladly* accept a patch to the Java generator that
generates accessors with the old-style signatures that wrap the new-style
ones. There's no reason for us not to have a 0.4.1 if it would ease folks'
lives.
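
For illustration, a rough sketch of what such a wrapper could look like;
the ByteBuffer-typed field mirrors what the 0.4 generator stores, but the
class and names here are made up, not actual generator output:

    import java.nio.ByteBuffer;

    class ImageHolder {
      private ByteBuffer image;  // what a 0.4-style generated struct holds

      // Old-style setter: wrap the byte[] instead of storing it directly.
      public void setImage(byte[] bytes) {
        this.image = (bytes == null) ? null : ByteBuffer.wrap(bytes);
      }

      // Old-style getter: copy the readable span back out as a byte[].
      public byte[] getImage() {
        if (image == null) return null;
        byte[] copy = new byte[image.remaining()];
        image.duplicate().get(copy);  // duplicate() leaves the field's position untouched
        return copy;
      }
    }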

-Bryan


Re: Philosophy about forward compatibility of generated code?

Posted by Dave Engberg <de...@evernote.com>.
I've attached the test code I was playing with.

The code isn't particularly pretty since I was just doing a quick test 
with one of our existing data structures:

    struct Data {
       1:  optional  binary bodyHash,
       2:  optional  i32 size,
       3:  optional  binary body
    }


I wouldn't be surprised if there was some combination of variables 
(number of binary fields, size of the contents of the binary fields, 
origin of the data, JVM, etc.) that would produce notably different 
results, but I used this with a few different parameters as just a rough 
test to make sure I wasn't completely crazy.
In particular, I wouldn't be surprised if you got different results if 
the input was coming out of some other NIO transport that could hand off 
more cleanly to ByteBuffer without as much copying, since there could be 
some buffer sharing.

But it seems like if ...
(1) your transport layer isn't already optimized into NIO
(2) your application layer really needs to get the data in/out of a byte 
array anyway (e.g. for use as a SQL column)
(3) your data structures are sort of "normal" hand written structs and 
not crazy generated pseudo-code with thousands of fields
... then the ByteBuffer layer may add a bit of overhead.
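
The attachment doesn't survive in this archive, so as a stand-in, here is
a minimal sketch of the kind of round-trip test described above. It
assumes the generated Data class from the IDL and Thrift's TSerializer /
TDeserializer APIs; the binary setter shapes differ by version, as noted:

    import java.util.Random;

    import org.apache.thrift.TDeserializer;
    import org.apache.thrift.TSerializer;
    import org.apache.thrift.protocol.TBinaryProtocol;

    public class BinaryFieldBench {
      public static void main(String[] args) throws Exception {
        byte[] blob = new byte[100];
        new Random(42).nextBytes(blob);

        Data d = new Data();     // generated from the struct above
        d.setBodyHash(blob);     // byte[] on 0.2; ByteBuffer.wrap(blob) on 0.4
        d.setSize(blob.length);
        d.setBody(blob);

        TSerializer ser = new TSerializer(new TBinaryProtocol.Factory());
        TDeserializer deser = new TDeserializer(new TBinaryProtocol.Factory());

        byte[] wire = null;
        long t0 = System.currentTimeMillis();
        for (int i = 0; i < 1000000; i++) wire = ser.serialize(d);
        long t1 = System.currentTimeMillis();
        for (int i = 0; i < 1000000; i++) deser.deserialize(new Data(), wire);
        long t2 = System.currentTimeMillis();
        System.out.println("ser ms: " + (t1 - t0) + "  deser ms: " + (t2 - t1));
      }
    }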

I don't really care about the negligible CPU cost for our app (we're not 
CPU constrained, and definitely not CPU constrained due to Thrift 
marshaling!), but the changed interface doesn't really give us any benefit.

Thanks


On 8/26/10 5:25 PM, Bryan Duxbury wrote:
> Hm, your performance results are very much at odds with mine. Can you share
> your test code?
>
> I think your memory usage sounds right, but personally I found that
> allocating new wrapping ByteBuffers instead of allocating a new byte[] and
> doing the copy made things about 2.5x as fast on the deserialization side. I
> guess it's possible that I didn't account for a performance loss on the
> write side?

Re: Philosophy about forward compatibility of generated code?

Posted by Bryan Duxbury <br...@rapleaf.com>.
Hm, your performance results are very much at odds with mine. Can you share
your test code?

I think your memory usage sounds right, but personally I found that
allocating new wrapping ByteBuffers instead of allocating a new byte[] and
doing the copy made things about 2.5x as fast on the deserialization side. I
guess it's possible that I didn't account for a performance loss on the
write side?
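
A sketch of the mechanism behind that difference -- not Thrift's actual
deserializer code, just the two allocation strategies side by side:

    import java.nio.ByteBuffer;

    final class BinaryReadSketch {
      // Pre-0.4 style: allocate a fresh byte[] and copy each binary field
      // out of the buffer that was read from the transport.
      static byte[] readBinaryCopy(byte[] wire, int off, int len) {
        byte[] field = new byte[len];
        System.arraycopy(wire, off, field, 0, len);
        return field;
      }

      // 0.4 style: wrap the same backing array in place -- no payload copy,
      // which is where the deserialization speedup would come from.
      static ByteBuffer readBinaryWrap(byte[] wire, int off, int len) {
        return ByteBuffer.wrap(wire, off, len);
      }
    }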

On Thu, Aug 26, 2010 at 5:19 PM, Dave Engberg <de...@evernote.com> wrote:

>
> Thanks, I hadn't actually changed any of my code yet, I was just typing
> that into email after a quick Google search yielded:
> http://www.exampledepot.com/egs/java.nio/Buf2Array.html
>
> I decided to take Bryan up on the offer to submit a patch for 0.4.1
> instead.
> I did some quick performance testing of the 0.4 implementation compared to
> 0.2 by serializing a struct with two 100-byte binary fields a million times,
> and then deserializing.
> The 0.4 implementation was about 10% slower due to all of the extra
> wrapping and unwrapping, but total memory usage was about 10% less.  I.e.
> the performance is a wash for anything we're doing, and the conversion's a
> hassle (breaking all of our usage of Thrift structs as POJO data
> structures), so I'll add a compiler option.
>
> Thanks
>

Re: Philosophy about forward compatibility of generated code?

Posted by Dave Engberg <de...@evernote.com>.
Thanks, I hadn't actually changed any of my code yet, I was just typing 
that into email after a quick Google search yielded:
http://www.exampledepot.com/egs/java.nio/Buf2Array.html

I decided to take Bryan up on the offer to submit a patch for 0.4.1 instead.
I did some quick performance testing of the 0.4 implementation compared 
to 0.2 by serializing a struct with two 100-byte binary fields a million 
times, and then deserializing.
The 0.4 implementation was about 10% slower due to all of the extra 
wrapping and unwrapping, but total memory usage was about 10% less.  
I.e. the performance is a wash for anything we're doing, and the 
conversion's a hassle (breaking all of our usage of Thrift structs as 
POJO data structures), so I'll add a compiler option.

Thanks


On 8/26/10 3:03 PM, Mathias Herberts wrote:
> The changes you intend to do to your code are not correct.
> ByteBuffer.capacity() does not return the actual size of the data
> stored in a ByteBuffer.
>
> The size is actually ByteBuffer.limit() - ByteBuffer.arrayOffset().
>
> And by creating a new byte[] you copy the data and do not hold a
> reference to it. Which means that if you modify the returned byte[] it
> won't be reflected in the Thrift struct.
>
> I can only agree that this one change is a pain in the b... to deal with.



Re: Philosophy about forward compatibility of generated code?

Posted by Mathias Herberts <ma...@gmail.com>.
On Fri, Aug 27, 2010 at 01:15, Bryan Duxbury <br...@rapleaf.com> wrote:
> On Thu, Aug 26, 2010 at 3:03 PM, Mathias Herberts <
> mathias.herberts@gmail.com> wrote:
>
>> The changes you intend to do to your code are not correct.
>> ByteBuffer.capacity() does not return the actual size of the data
>> stored in a ByteBuffer.
>>
>> The size is actually ByteBuffer.limit() - ByteBuffer.arrayOffset().
>>
>
> Actually, I think it's bb.limit() - bb.arrayOffset() - bb.position(). (I
> would welcome a patch that added a helper method for this somewhere.)

position() will move as you add or read data. From what I see in the
trunk's t_java_generator.cc, limit() - arrayOffset() is used but not
position() (see generate_deep_copy_non_container), which I think might be
closer to the semantics of a native byte[].

Relying on ByteBuffer's relative 'get' method is, from my POV, very
dangerous: 'get' copies data starting at 'position' and then advances
'position', which means that 'get' is not idempotent.
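
A small JDK-only example of that non-idempotence:

    import java.nio.ByteBuffer;

    public class GetMovesPosition {
      public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
        byte[] out = new byte[4];
        buf.get(out);                         // position advances 0 -> 4
        System.out.println(buf.remaining());  // 0: a second get(out) would
                                              // throw BufferUnderflowException
        // buf.duplicate().get(out) or the absolute get(int) leave position alone.
      }
    }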

I foresee lots of hard to find bugs resulting from code converted from
byte[] to ByteBuffer :-(

Mathias.

Re: Philosophy about forward compatibility of generated code?

Posted by Bryan Duxbury <br...@rapleaf.com>.
On Thu, Aug 26, 2010 at 3:03 PM, Mathias Herberts <
mathias.herberts@gmail.com> wrote:

> The changes you intend to do to your code are not correct.
> ByteBuffer.capacity() does not return the actual size of the data
> stored in a ByteBuffer.
>
> The size is actually ByteBuffer.limit() - ByteBuffer.arrayOffset().
>

Actually, I think it's bb.limit() - bb.arrayOffset() - bb.position(). (I
would welcome a patch that added a helper method for this somewhere.)
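
One way such a helper might look -- a sketch, not an existing libthrift
method; it sidesteps the offset arithmetic by copying the position..limit
span, which is what a freshly wrapped or deserialized buffer exposes:

    import java.nio.ByteBuffer;

    public final class ByteBufferHelper {
      private ByteBufferHelper() {}

      // Copy out the readable bytes without disturbing the buffer's position.
      public static byte[] toByteArray(ByteBuffer buf) {
        byte[] copy = new byte[buf.remaining()];  // limit() - position()
        buf.duplicate().get(copy);
        return copy;
      }
    }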

Re: Philosophy about forward compatibility of generated code?

Posted by Mathias Herberts <ma...@gmail.com>.
> I'm starting to go through the process of updating our API to Thrift 0.4,
> which hits around a dozen apps that we maintain (clients and server) and
> around a hundred implemented by third party developers.
>
> I tend to see changes in Thrift releases that break the external interfaces
> of generated code.  In some cases, this may be based on a heavy conscious
> decision, but other changes seem to casually break compatibility for
> virtually no benefit.
>
> For example, in 0.4.0, the Java generated interfaces for 'binary' fields
> changed everything from the built-in 'byte[]' Java type to the
> java.nio.ByteBuffer class.  This means that all relevant method signatures
> and fields were changed, and all instance setters and getters went from:
>  public byte[] getImage() ...
>  public void setImage(byte[] image) ...
> to:
>  public ByteBuffer getImage()
>  public void setImage(ByteBuffer image)
>
> That means that we'll need to change hundreds of lines of code from:
>   foo.setImage(bytes);
> to:
>  foo.setImage(ByteBuffer.wrap(bytes));
> and from:
>  byte[] bar = foo.getImage();
> to:
>  byte[] bar = new byte[foo.getImage().capacity()];
>  foo.getImage().get(bar, 0, bar.length);

The changes you intend to do to your code are not correct.
ByteBuffer.capacity() does not return the actual size of the data
stored in a ByteBuffer.

The size is actually ByteBuffer.limit() - ByteBuffer.arrayOffset().

And by creating a new byte[] you copy the data and do not hold a
reference to it. Which means that if you modify the returned byte[] it
won't be reflected in the Thrift struct.

I can only agree that this one change is a pain in the b... to deal with.
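
A JDK-only sketch contrasting the two behaviors described above, assuming
an array-backed buffer (which anything built with ByteBuffer.wrap() is):

    import java.nio.ByteBuffer;

    public class CopyVsAlias {
      public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {10, 20, 30});

        // Copy: changes to the copy are NOT visible through the buffer.
        byte[] copy = new byte[buf.remaining()];
        buf.duplicate().get(copy);
        copy[0] = 99;
        System.out.println(buf.get(0));  // still 10

        // Alias: changes to the backing array ARE visible through the buffer.
        byte[] alias = buf.array();      // array-backed buffers only
        alias[buf.arrayOffset()] = 99;
        System.out.println(buf.get(0));  // now 99
      }
    }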