Posted to common-user@hadoop.apache.org by Steve Lewis <lo...@gmail.com> on 2010/11/13 19:36:08 UTC

Caution using Hadoop 0.21

Our group made a very poorly considered decision to build our cluster using
Hadoop 0.21. We discovered that a number of programs written and running
properly under 0.20.2 did not work under 0.21.

The first issue is that Mapper.Context and Reducer.Context, along with many
of their superclasses, were converted from concrete classes to interfaces.
This change is guaranteed to break any code that subclasses these classes; in
15 years of programming Java I have never seen so major a change to
well-known public classes.
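
To make the breakage concrete, here is a minimal sketch of the kind of
subclass that compiles against 0.20.2 but not against 0.21 (the constructor
signature is the 0.20.2 one as I recall it; the class names are illustrative):

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.InputSplit;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.OutputCommitter;
  import org.apache.hadoop.mapreduce.RecordReader;
  import org.apache.hadoop.mapreduce.RecordWriter;
  import org.apache.hadoop.mapreduce.StatusReporter;
  import org.apache.hadoop.mapreduce.TaskAttemptID;

  public class LoggingMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {

    // This compiles under 0.20.2, where Mapper.Context is a concrete inner
    // class with this public constructor; under 0.21 the context types were
    // reworked as interfaces and this subclass no longer compiles.
    public class LoggingContext extends Context {

      public LoggingContext(Configuration conf, TaskAttemptID taskId,
                            RecordReader<LongWritable, Text> reader,
                            RecordWriter<Text, IntWritable> writer,
                            OutputCommitter committer, StatusReporter reporter,
                            InputSplit split)
          throws IOException, InterruptedException {
        super(conf, taskId, reader, writer, committer, reporter, split);
      }

      @Override
      public void write(Text key, IntWritable value)
          throws IOException, InterruptedException {
        System.err.println("write: " + key);  // log every write
        super.write(key, value);
      }
    }
  }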

While making these classes interfaces may well be the better design, the
manner of the change, and the fact that it is so poorly documented, show
extraordinarily poor judgement on the part of the Hadoop developers.

http://lordjoesoftware.blogspot.com/

-- 
Steven M. Lewis PhD
4221 105th Ave Ne
Kirkland, WA 98033
206-384-1340 (cell)
Institute for Systems Biology
Seattle WA

Re: Caution using Hadoop 0.21

Posted by Steve Lewis <lo...@gmail.com>.
Two reasons -
1) we want a unit test to log whenever a write occurs
2) I want the keys generated by writes in one subsection of the app to be
augmented with added data before being sent to Hadoop
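
For what it is worth, here is a rough sketch of the direction we are looking
at for 0.21, where MapContext is now an interface: wrap the real context in a
java.lang.reflect.Proxy that intercepts write(). The class and method names
below are illustrative, not code from our app:

  import java.lang.reflect.InvocationHandler;
  import java.lang.reflect.Method;
  import java.lang.reflect.Proxy;

  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.MapContext;

  public final class WriteInterceptor {

    // Returns a proxy that logs every write (reason 1) and is the natural
    // place to augment the key (reason 2) before forwarding to the real
    // 0.21 context.
    @SuppressWarnings("unchecked")
    public static MapContext<Text, Text, Text, IntWritable> wrap(
        final MapContext<Text, Text, Text, IntWritable> real) {
      return (MapContext<Text, Text, Text, IntWritable>) Proxy.newProxyInstance(
          real.getClass().getClassLoader(),
          new Class<?>[] { MapContext.class },
          new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args)
                throws Throwable {
              if ("write".equals(method.getName())) {
                System.err.println("write: " + args[0]);
                // augment the key here, e.g. args[0] = new Text(...)
              }
              return method.invoke(real, args);
            }
          });
    }
  }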


On Mon, Nov 15, 2010 at 11:21 PM, Owen O'Malley <om...@apache.org> wrote:

> I'm very sorry that you got burned by the change. Most MapReduce
> applications don't extend the Context classes, since those are objects
> provided by the framework. In 0.21, we've marked which interfaces are
> stable and which are still evolving. We try to hold all of the interfaces
> stable, but evolving ones do change as we figure out what they should look
> like.
>
> Can I ask why you were extending the Context classes?
>
> -- Owen
>



-- 
Steven M. Lewis PhD
4221 105th Ave Ne
Kirkland, WA 98033
206-384-1340 (cell)
Institute for Systems Biology
Seattle WA

Re: Caution using Hadoop 0.21

Posted by Owen O'Malley <om...@apache.org>.
I'm very sorry that you got burned by the change. Most MapReduce
applications don't extend the Context classes, since those are objects
provided by the framework. In 0.21, we've marked which interfaces are
stable and which are still evolving. We try to hold all of the interfaces
stable, but evolving ones do change as we figure out what they should look
like.
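
(For reference, the marking is done with the annotations in
org.apache.hadoop.classification; ExampleContext below is a made-up name used
only to illustrate the tags:)

  import org.apache.hadoop.classification.InterfaceAudience;
  import org.apache.hadoop.classification.InterfaceStability;

  // Evolving means the type may change incompatibly between minor releases;
  // only Stable carries a compatibility promise.
  @InterfaceAudience.Public
  @InterfaceStability.Evolving
  public interface ExampleContext {
    void progress();
  }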

Can I ask why you were extending the Context classes?

-- Owen

Re: Caution using Hadoop 0.21

Posted by Steve Lewis <lo...@gmail.com>.
I did not say that I never saw an API change - what I said, and stand by, is
that I have not seen a major public class changed in a way that would break
older code. Adding a few methods to a class is OK. Adding new classes and
packages is OK. Adding methods to a public interface is very bad - although
Java did this with different versions of JDBC. But changing an important
public class from a class to an interface - a move guaranteed to break the
code of anyone subclassing that class - is pretty much unheard of.
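
To spell the distinction out with a toy example (the names are made up):

  // Version 1 of a library type:
  interface Sink {
    void write(String s);
    // Version 2 adds "void flush();" here. Every existing class that
    // implements Sink stops compiling until it adds flush(), because an
    // interface method has no body for implementors to inherit.
  }

  class BaseSink {
    void write(String s) { /* ... */ }
    // Version 2 can instead add flush() with a body here. Existing
    // subclasses inherit it and keep compiling, which is why adding
    // methods to a class is the safer way to evolve an API.
    void flush() { }
  }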

On Sat, Nov 13, 2010 at 8:53 PM, Edward Capriolo <ed...@gmail.com> wrote:

> On Sat, Nov 13, 2010 at 4:33 PM, Shi Yu <sh...@uchicago.edu> wrote:
> > I agree with Steve. That's why I am still using 0.19.2 in production.
> >
> > Shi
> >
> > [snip]
>
> At times we have been frustrated by rapidly changing APIs.
>
> # 23 August, 2010: release 0.21.0 available
> # 26 February, 2010: release 0.20.2 available
> # 14 September, 2009: release 0.20.1 available
> # 23 July, 2009: release 0.19.2 available
> # 22 April, 2009: release 0.20.0 available
>
> By the standard major/minor/revision scheme, 0.20.X -> 0.21.X is a minor
> release. However, since Hadoop has never had a major release, you might
> consider 0.20 -> 0.21 to be a "major" release.
>
> In any case, are you saying that in 15 years of coding you have never
> seen an API change between minor releases? I think that is quite common.
> It was also more than a year between 0.20.X and 0.21.X. Again, it is
> common to expect a change in that time frame.
>



-- 
Steven M. Lewis PhD
4221 105th Ave Ne
Kirkland, WA 98033
206-384-1340 (cell)
Institute for Systems Biology
Seattle WA

Re: Caution using Hadoop 0.21

Posted by Edward Capriolo <ed...@gmail.com>.
On Sat, Nov 13, 2010 at 4:33 PM, Shi Yu <sh...@uchicago.edu> wrote:
> I agree with Steve. That's why I am still using 0.19.2 in production.
>
> Shi
>
> On 2010-11-13 12:36, Steve Lewis wrote:
>> [snip]
>

At times we have been frustrated by rapidly changing APIs.

# 23 August, 2010: release 0.21.0 available
# 26 February, 2010: release 0.20.2 available
# 14 September, 2009: release 0.20.1 available
# 23 July, 2009: release 0.19.2 available
# 22 April, 2009: release 0.20.0 available

By the standard major/minor/revision scheme, 0.20.X -> 0.21.X is a minor
release. However, since Hadoop has never had a major release, you might
consider 0.20 -> 0.21 to be a "major" release.

In any case, are you saying that in 15 years of coding you have never
seen an API change between minor releases? I think that is quite common.
It was also more than a year between 0.20.X and 0.21.X. Again, it is
common to expect a change in that time frame.

Re: Caution using Hadoop 0.21

Posted by Shi Yu <sh...@uchicago.edu>.
I agree with Steve. That's why I am still using 0.19.2 in production.

Shi

On 2010-11-13 12:36, Steve Lewis wrote:
> [snip]