Posted to common-user@hadoop.apache.org by Joydeep Sen Sarma <js...@facebook.com> on 2008/02/21 22:17:41 UTC

define backwards compatibility (was: changes to compression interfaces in 0.15?)

Arun - if you can't pull the api - then you must redirect the api to the new call in a way that preserves its semantics.

in this case - had we re-implemented SequenceFile.setCompressionType in 0.15 to call SequenceFileOutputFormat.setOutputCompressionType() - then it would have been a backwards compatible change. and the deprecation would have served as fair warning of the eventual pullout.
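
A minimal sketch of that kind of redirect, for illustration only. Every class here (SketchConf, SketchCompressionType, NewOutputFormatSketch, OldSequenceFileSketch) and the key name are hypothetical stand-ins, not the Hadoop classes or the actual 0.15 code. The NONE default is arbitrary as well. The point is only that the old entry point stays, is marked deprecated, and forwards to the new one, so old callers keep their behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a job configuration (not Hadoop's Configuration).
class SketchConf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) { values.put(key, value); }

    String get(String key, String defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }
}

// Hypothetical stand-in for the compression type enum.
enum SketchCompressionType { NONE, RECORD, BLOCK }

// Stand-in for the NEW api: the only place that knows the new key.
class NewOutputFormatSketch {
    // Assumed key name, made up for this sketch.
    static final String KEY = "sketch.output.compression.type";

    static void setOutputCompressionType(SketchConf job, SketchCompressionType val) {
        job.set(KEY, val.name());
    }

    static SketchCompressionType getOutputCompressionType(SketchConf job) {
        // NONE is an arbitrary default chosen for the sketch.
        return SketchCompressionType.valueOf(job.get(KEY, SketchCompressionType.NONE.name()));
    }
}

// Stand-in for the OLD api: kept, deprecated, and redirected.
class OldSequenceFileSketch {
    /**
     * @deprecated use NewOutputFormatSketch.setOutputCompressionType instead.
     */
    @Deprecated
    static void setCompressionType(SketchConf job, SketchCompressionType val) {
        // Forward to the new call so existing callers still get what they asked for.
        NewOutputFormatSketch.setOutputCompressionType(job, val);
    }
}

class DelegationDemo {
    public static void main(String[] args) {
        SketchConf job = new SketchConf();
        // An application that was never updated still calls the old method
        // (the compiler flags this call with a deprecation warning).
        OldSequenceFileSketch.setCompressionType(job, SketchCompressionType.BLOCK);
        // The new code path sees the setting the old caller requested.
        System.out.println(NewOutputFormatSketch.getOutputCompressionType(job));
    }
}
```

With this shape an un-migrated application compiles with a deprecation warning but still gets the compression it asked for, instead of silently falling back to the default.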

i find the confusion over what backwards compatibility means scary - and i am really hoping that the outcome of this thread is a clear definition from the committers/hadoop-board of what to reasonably expect (or not!) going forward.



-----Original Message-----
From: Pete Wyckoff [mailto:pwyckoff@facebook.com]
Sent: Thu 2/21/2008 12:47 PM
To: core-user@hadoop.apache.org
Subject: Re: changes to compression interfaces in 0.15?
 

If the API semantics are changing under you, you have to change your code
whether or not the API is pulled or deprecated.  Pulling it makes it more
obvious that the user has to change his/her code.

-- pete


On 2/21/08 12:41 PM, "Arun C Murthy" <ac...@yahoo-inc.com> wrote:

> 
> On Feb 21, 2008, at 12:20 PM, Joydeep Sen Sarma wrote:
> 
>>> To maintain backward compat, we cannot remove old apis - the standard
>>> procedure is to deprecate them for the next release and remove them
>>> in subsequent releases.
>> 
>> you've got to be kidding.
>> 
>> we didn't maintain backwards compatibility. my app broke. Simple
>> and straightforward. and the old interfaces are not deprecated (to
>> quote 0.15.3 on a 'deprecated' interface:
>> 
> 
> You are right, HADOOP-1851 didn't fix it right. I've filed HADOOP-2869.
> 
> We do need to be more diligent about listing config changes in
> CHANGES.txt for starters, and that point is taken. However, we can't
> start pulling out apis without deprecating them first.
> 
> Arun
> 
>> 
>>   /**
>>    * Set the compression type for sequence files.
>>    * @param job the configuration to modify
>>    * @param val the new compression type (none, block, record)
>>    */
>>   static public void setCompressionType(Configuration job,
>>                                         CompressionType val) {
>> )
>> 
>> I (and i would suspect any average user willing to recompile code)
>> would much much rather that we broke backwards compatibility
>> immediately rather than carry over defunct apis that
>> insidiously break application behavior.
>> 
>> and of course - this does not address the point that the option
>> strings themselves are deprecated. (remember - people set options
>> explicitly from xml files and streaming. not everyone goes through
>> java apis).
>> 
>> --
>> 
>> as one of my dear professors once said - put yourself in the other
>> person's shoes. consider that you were in my position and that a
>> production app suddenly went from consuming 100G to 1TB. and
>> everything slowed down drastically. and it did not give any sign
>> that anything was amiss. everything looked golden on the outside.
>> what would be your reaction if you found out after a week that the
>> system was full and numerous processes had to be re-run? how would
>> you have figured that was going to happen by looking at the
>> INCOMPATIBLE section (which btw - i did carefully before sending my
>> mail).
>> 
>> (fortunately i escaped the worst case - but i think this is a real
>> call to action)
>> 
>> 
>> -----Original Message-----
>> From: Arun C Murthy [mailto:acm@yahoo-inc.com]
>> Sent: Thu 2/21/2008 11:21 AM
>> To: core-user@hadoop.apache.org
>> Subject: Re: changes to compression interfaces in 0.15?
>> 
>> Joydeep,
>> 
>> On Feb 20, 2008, at 5:06 PM, Joydeep Sen Sarma wrote:
>> 
>>> Hi developers,
>>> 
>>> In migrating to 0.15 - i am noticing that the compression interfaces
>>> have changed:
>>> 
>>> -          compression type for sequencefile outputs used to be set
>>> by:
>>> SequenceFile.setCompressionType()
>>> 
>>> -          now it seems to be set using:
>>> SequenceFileOutputFormat.setOutputCompressionType()
>>> 
>>> 
>> 
>> Yes, we added SequenceFileOutputFormat.setOutputCompressionType and
>> deprecated the old api. (HADOOP-1851)
>> 
>>> 
>>> The change is for the better - but would it be possible to:
>>> 
>>> -          remove old/dead interfaces. That would have been a
>>> straightforward hint for applications to look for new interfaces.
>>> (hadoop-default.xml also still has setting for old conf variable:
>>> io.seqfile.compression.type)
>>> 
>> 
>> To maintain backward compat, we cannot remove old apis - the standard
>> procedure is to deprecate them for the next release and remove them
>> in subsequent releases.
>> 
>>> -          if possible - document changed interfaces in the release
>>> notes (there's no way we can find this out by looking at the long
>>> list of Jiras).
>>> 
>> 
>> Please look at the INCOMPATIBLE CHANGES section of CHANGES.txt,
>> HADOOP-1851 is listed there. Admittedly we can do better, but that is
>> a good place to look for when upgrading to newer releases.
>>> 
>>> i am not sure how updated the wiki is on the compression stuff (my
>>> responsibility to update it) - but please do consider the impact of
>> 
>> Please use the forrest-based docs (on the hadoop website - e.g.
>> mapred_tutorial.html) rather than the wiki as the gold-standard. The
>> reason we moved away from the wiki is precisely this - harder to
>> maintain docs per release etc.
>> 
>>> changing interfaces on existing applications. (maybe we should have a
>>> JIRA tag to mark out bugs that change interfaces).
>>> 
>>> 
>> 
>> Again, CHANGES.txt and INCOMPATIBLE CHANGES section for now.
>> 
>> Arun
>> 
>>> 
>>> 
>>> As always - thanks for all the fish (err .. working code),
>>> 
>>> 
>>> 
>>> Joydeep
>>> 
>>> 
>>> 
>> 
>> 
> 



Re: define backwards compatibility

Posted by Doug Cutting <cu...@apache.org>.
Joydeep Sen Sarma wrote:
> i find the confusion over what backwards compatibility means scary - and i am really hoping that the outcome of this thread is a clear definition from the committers/hadoop-board of what to reasonably expect (or not!) going forward.

The goal is clear: code that compiles and runs warning-free in one 
release should not have to be altered to try the next release.  It 
may generate warnings, and these should be addressed before another 
upgrade is attempted.

Sometimes it is not possible to achieve this.  In these cases 
applications should fail with a clear error message, either at 
compilation or runtime.
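
As one illustration of failing clearly at runtime, here is a minimal sketch of a check that rejects a retired configuration key instead of silently ignoring it. This is not how Hadoop handles its configuration. The class RetiredKeyCheck, the method validate, and the replacement key name are all hypothetical. Only io.seqfile.compression.type comes from this thread, where it is named as the old conf variable.

```java
import java.util.Map;

class RetiredKeyCheck {
    // Retired key -> replacement key. The replacement name is made up for this sketch.
    static final Map<String, String> RETIRED = Map.of(
        "io.seqfile.compression.type", "sketch.output.compression.type");

    // Throws if the user-supplied settings contain a key that is no longer honored.
    static void validate(Map<String, String> userConf) {
        for (String key : userConf.keySet()) {
            String replacement = RETIRED.get(key);
            if (replacement != null) {
                throw new IllegalArgumentException(
                    "Configuration key '" + key + "' is no longer honored; use '"
                        + replacement + "' instead.");
            }
        }
    }

    public static void main(String[] args) {
        try {
            // Settings as they might arrive from an xml file or a streaming command line.
            validate(Map.of("io.seqfile.compression.type", "BLOCK"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check like this covers options set outside the Java apis too, since it looks at the key strings rather than at which method was called.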

In both cases, incompatible changes should be well documented in the 
release notes.

This is described (in part) in http://wiki.apache.org/hadoop/Roadmap

That's the goal.  Implementing and enforcing it is another story.  For 
that we depend on developer and user vigilance.  The current issue seems 
a case of failure to implement the policy rather than a lack of policy.

Doug