Posted to dev@crunch.apache.org by Josh Wills <jw...@cloudera.com> on 2012/06/12 22:14:49 UTC

avro 1.7.0: should we upgrade crunch?

My fellow developers,

I saw Chris' tweet that Avro 1.7.0 was released-- thoughts on
upgrading? Does it buy us any good stuff we want?

J

-- 
Director of Data Science
Cloudera
Twitter: @josh_wills

Re: avro 1.7.0: should we upgrade crunch?

Posted by Josh Wills <jw...@cloudera.com>.
I can put it through the paces w/cdh4, hadoop-1.0.3, and
hadoop-2.0.0-alpha, verify that tests pass, and then integrate the
pull req.

J

On Tue, Jun 12, 2012 at 3:55 PM, Christian Tzolov
<ch...@gmail.com> wrote:
> I am afraid I have pulled the Avro upgrade request far too soon. I should
> have checked my mailbox first.
>
> We have been using avro-1.7.0-rc0 for a week now to resolve the AVRO-1046 issue.
> No problems have been observed so far, but we have not tested it with any
> Hadoop version other than cdh3.
>
> Cheers,
> Chris
>
> On Tue, Jun 12, 2012 at 10:47 PM, Tom White <to...@cloudera.com> wrote:
>>
>> Funny - I was just writing an email about potential dragons...
>>
>> Before upgrading, someone should check that Avro 1.7.0 works with
>> released versions of Hadoop. In the past there have been problems with
>> Avro and dependencies like Jackson conflicting with versions that
>> Hadoop uses. In particular, the MR classpath can be controlled via the
>> configuration property mapreduce.user.classpath.first and the env
>> property HADOOP_USER_CLASSPATH_FIRST. Setting these to true makes MR
>> use the newer Avro libraries; however, there is a risk that Hadoop will
>> not work with the newer versions.
>>
>> Cheers,
>> Tom
>>
>> On Tue, Jun 12, 2012 at 3:36 PM, Josh Wills <jw...@cloudera.com> wrote:
>> > On Tue, Jun 12, 2012 at 1:28 PM, Gabriel Reid <ga...@gmail.com>
>> > wrote:
>> >> On Tue, Jun 12, 2012 at 10:14 PM, Josh Wills <jw...@cloudera.com>
>> >> wrote:
>> >>>
>> >>> I saw Chris' tweet that Avro 1.7.0 was released-- thoughts on
>> >>> upgrading? Does it buy us any good stuff we want?
>> >>
>> >> The fix of ReflectDatumReader not working correctly with Specific
>> >> Records (https://issues.apache.org/jira/browse/AVRO-1046) would allow
>> >> us to remove a fair bit of Avro code that works around that bug -- if
>> >> we do do the upgrade, I'd certainly volunteer to weed out those
>> >> workarounds.
>> >>
>> >> On the other hand, I just did a quick scan of the release notes for
>> >> 1.7.0
>> >> (https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310911&version=12318848)
>> >> and I didn't notice anything that would really be a big reason to
>> >> upgrade right away.
>> >>
>> >> Are there any reasons *not* to upgrade (other than risks of something
>> >> else being broken)? Maybe the cleanup of the Avro code that we can do
>> >> is reason enough to do the upgrade.
>> >
>> > +tom explicitly
>> >
>> > I remember having a bunch of frustrations with 1.6.0 and having to
>> > stay on 1.5.4 for longer than I wanted because of some critical bugs
>> > that didn't get fixed until 1.6.2, but I also think the move from
>> > 1.5.4 to 1.6.0 involved a much larger rewrite than what I see from the
>> > release notes for 1.7.0. Tom White is traveling across the US right
>> > now, but I'm wondering if he has a feel for whether 1.7.0 is likely to
>> > contain any dragons. :)
>> >
>> >>
>> >> - Gabriel
>> >
>> >
>> >
>> > --
>> > Director of Data Science
>> > Cloudera
>> > Twitter: @josh_wills
>
>



-- 
Director of Data Science
Cloudera
Twitter: @josh_wills

Re: avro 1.7.0: should we upgrade crunch?

Posted by Christian Tzolov <ch...@gmail.com>.
I am afraid I have pulled the Avro upgrade request far too soon. I should
have checked my mailbox first.

We have been using avro-1.7.0-rc0 for a week now to resolve the
AVRO-1046 issue (https://issues.apache.org/jira/browse/AVRO-1046).
No problems have been observed so far, but we have not tested it with any
Hadoop version other than cdh3.

Cheers,
Chris

On Tue, Jun 12, 2012 at 10:47 PM, Tom White <to...@cloudera.com> wrote:

> Funny - I was just writing an email about potential dragons...
>
> Before upgrading, someone should check that Avro 1.7.0 works with
> released versions of Hadoop. In the past there have been problems with
> Avro and dependencies like Jackson conflicting with versions that
> Hadoop uses. In particular, the MR classpath can be controlled via the
> configuration property mapreduce.user.classpath.first and the env
> property HADOOP_USER_CLASSPATH_FIRST. Setting these to true makes MR
> use the newer Avro libraries; however, there is a risk that Hadoop will
> not work with the newer versions.
>
> Cheers,
> Tom
>
> On Tue, Jun 12, 2012 at 3:36 PM, Josh Wills <jw...@cloudera.com> wrote:
> > On Tue, Jun 12, 2012 at 1:28 PM, Gabriel Reid <ga...@gmail.com> wrote:
> >> On Tue, Jun 12, 2012 at 10:14 PM, Josh Wills <jw...@cloudera.com> wrote:
> >>>
> >>> I saw Chris' tweet that Avro 1.7.0 was released-- thoughts on
> >>> upgrading? Does it buy us any good stuff we want?
> >>
> >> The fix of ReflectDatumReader not working correctly with Specific
> >> Records (https://issues.apache.org/jira/browse/AVRO-1046) would allow
> >> us to remove a fair bit of Avro code that works around that bug -- if
> >> we do do the upgrade, I'd certainly volunteer to weed out those
> >> workarounds.
> >>
> >> On the other hand, I just did a quick scan of the release notes for
> >> 1.7.0 (https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310911&version=12318848)
> >> and I didn't notice anything that would really be a big reason to
> >> upgrade right away.
> >>
> >> Are there any reasons *not* to upgrade (other than risks of something
> >> else being broken)? Maybe the cleanup of the Avro code that we can do
> >> is reason enough to do the upgrade.
> >
> > +tom explicitly
> >
> > I remember having a bunch of frustrations with 1.6.0 and having to
> > stay on 1.5.4 for longer than I wanted because of some critical bugs
> > that didn't get fixed until 1.6.2, but I also think the move from
> > 1.5.4 to 1.6.0 involved a much larger rewrite than what I see from the
> > release notes for 1.7.0. Tom White is traveling across the US right
> > now, but I'm wondering if he has a feel for whether 1.7.0 is likely to
> > contain any dragons. :)
> >
> >>
> >> - Gabriel
> >
> >
> >
> > --
> > Director of Data Science
> > Cloudera
> > Twitter: @josh_wills
>

Re: avro 1.7.0: should we upgrade crunch?

Posted by Tom White <to...@cloudera.com>.
Funny - I was just writing an email about potential dragons...

Before upgrading, someone should check that Avro 1.7.0 works with
released versions of Hadoop. In the past there have been problems with
Avro and dependencies like Jackson conflicting with versions that
Hadoop uses. In particular, the MR classpath can be controlled via the
configuration property mapreduce.user.classpath.first and the env
property HADOOP_USER_CLASSPATH_FIRST. Setting these to true makes MR
use the newer Avro libraries; however, there is a risk that Hadoop will
not work with the newer versions.

Cheers,
Tom
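
One rough way to see which copy of a library actually wins on the
classpath is to ask the JVM where a class was loaded from. The sketch
below is only an illustration and is not part of Crunch, Avro, or Hadoop:
the ClasspathCheck class and its locate method are hypothetical names,
and the two class names in main are assumed examples of an Avro class
and a Jackson 1.x class. It uses only the standard Java library.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    // Returns the location (usually a jar URL) a class was loaded from,
    // or a short note if the class is missing or belongs to the JVM itself.
    public static String locate(String className) {
        try {
            // initialize=false: find the class without running its static initializers.
            Class<?> cls = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes from the JVM's own runtime have no code source.
            return src == null
                    ? "JVM runtime (no code source)"
                    : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Assumed example class names; substitute whatever library is in doubt.
        String[] names = {
            "org.apache.avro.Schema",
            "org.codehaus.jackson.map.ObjectMapper"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running this inside a task, with and without the classpath-first
settings mentioned above, should show whether the job's jars or
Hadoop's own jars are being picked up.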

On Tue, Jun 12, 2012 at 3:36 PM, Josh Wills <jw...@cloudera.com> wrote:
> On Tue, Jun 12, 2012 at 1:28 PM, Gabriel Reid <ga...@gmail.com> wrote:
>> On Tue, Jun 12, 2012 at 10:14 PM, Josh Wills <jw...@cloudera.com> wrote:
>>>
>>> I saw Chris' tweet that Avro 1.7.0 was released-- thoughts on
>>> upgrading? Does it buy us any good stuff we want?
>>
>> The fix of ReflectDatumReader not working correctly with Specific
>> Records (https://issues.apache.org/jira/browse/AVRO-1046) would allow
>> us to remove a fair bit of Avro code that works around that bug -- if
>> we do do the upgrade, I'd certainly volunteer to weed out those
>> workarounds.
>>
>> On the other hand, I just did a quick scan of the release notes for
>> 1.7.0 (https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310911&version=12318848)
>> and I didn't notice anything that would really be a big reason to
>> upgrade right away.
>>
>> Are there any reasons *not* to upgrade (other than risks of something
>> else being broken)? Maybe the cleanup of the Avro code that we can do
>> is reason enough to do the upgrade.
>
> +tom explicitly
>
> I remember having a bunch of frustrations with 1.6.0 and having to
> stay on 1.5.4 for longer than I wanted because of some critical bugs
> that didn't get fixed until 1.6.2, but I also think the move from
> 1.5.4 to 1.6.0 involved a much larger rewrite than what I see from the
> release notes for 1.7.0. Tom White is traveling across the US right
> now, but I'm wondering if he has a feel for whether 1.7.0 is likely to
> contain any dragons. :)
>
>>
>> - Gabriel
>
>
>
> --
> Director of Data Science
> Cloudera
> Twitter: @josh_wills

Re: avro 1.7.0: should we upgrade crunch?

Posted by Josh Wills <jw...@cloudera.com>.
On Tue, Jun 12, 2012 at 1:28 PM, Gabriel Reid <ga...@gmail.com> wrote:
> On Tue, Jun 12, 2012 at 10:14 PM, Josh Wills <jw...@cloudera.com> wrote:
>>
>> I saw Chris' tweet that Avro 1.7.0 was released-- thoughts on
>> upgrading? Does it buy us any good stuff we want?
>
> The fix of ReflectDatumReader not working correctly with Specific
> Records (https://issues.apache.org/jira/browse/AVRO-1046) would allow
> us to remove a fair bit of Avro code that works around that bug -- if
> we do do the upgrade, I'd certainly volunteer to weed out those
> workarounds.
>
> On the other hand, I just did a quick scan of the release notes for
> 1.7.0 (https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310911&version=12318848)
> and I didn't notice anything that would really be a big reason to
> upgrade right away.
>
> Are there any reasons *not* to upgrade (other than risks of something
> else being broken)? Maybe the cleanup of the Avro code that we can do
> is reason enough to do the upgrade.

+tom explicitly

I remember having a bunch of frustrations with 1.6.0 and having to
stay on 1.5.4 for longer than I wanted because of some critical bugs
that didn't get fixed until 1.6.2, but I also think the move from
1.5.4 to 1.6.0 involved a much larger rewrite than what I see from the
release notes for 1.7.0. Tom White is traveling across the US right
now, but I'm wondering if he has a feel for whether 1.7.0 is likely to
contain any dragons. :)

>
> - Gabriel



-- 
Director of Data Science
Cloudera
Twitter: @josh_wills

Re: avro 1.7.0: should we upgrade crunch?

Posted by Gabriel Reid <ga...@gmail.com>.
On Tue, Jun 12, 2012 at 10:14 PM, Josh Wills <jw...@cloudera.com> wrote:
>
> I saw Chris' tweet that Avro 1.7.0 was released-- thoughts on
> upgrading? Does it buy us any good stuff we want?

The fix of ReflectDatumReader not working correctly with Specific
Records (https://issues.apache.org/jira/browse/AVRO-1046) would allow
us to remove a fair bit of Avro code that works around that bug -- if
we do do the upgrade, I'd certainly volunteer to weed out those
workarounds.

On the other hand, I just did a quick scan of the release notes for
1.7.0 (https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310911&version=12318848)
and I didn't notice anything that would really be a big reason to
upgrade right away.

Are there any reasons *not* to upgrade (other than risks of something
else being broken)? Maybe the cleanup of the Avro code that we can do
is reason enough to do the upgrade.

- Gabriel