Posted to dev@hbase.apache.org by Jonathan Hsieh <jo...@cloudera.com> on 2012/05/16 20:24:08 UTC

Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Hey Devs,

I've gotten pinged by folks working on Apache Flume, a project that depends
directly upon hbase and hadoop hdfs jars, about how to get the proper hbase
jars that work against hadoop 1.0 and hadoop 0.23/2.0.
Unfortunately, the transition from hadoop 1.0.0 to hadoop 0.23.x/2.0
requires hbase to be recompiled to run against the different hadoop
version ("compile compatible" but not "binary compatible").

Currently, we build and publish hbase jars compiled against hadoop 1.0.x.

What is the right way to publish poms/jars for those who want to use hbase
jars compiled against hadoop 0.23/2.0?  Is there a right way?

Jon.

-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Konstantin Boudnik <co...@apache.org>.
See my comments inlined...

On Wed, May 16, 2012 at 04:06PM, Andrew Purtell wrote:
> [cc bigtop-dev]
> 
> On Wed, May 16, 2012 at 3:22 PM, Jesse Yates <je...@gmail.com> wrote:
> >  +1 on a small number of supported versions with different classifiers that
> > only span a limited api skew to avoid a mountain of reflection. Along with
> > that, support for the builds via jenkins testing.
> 
> and
> 
> >> I think HBase should consider having a single blessed set of
> >> dependencies and only one build for a given release,
> >
> > This would be really nice, but seems a bit unreasonable given that we are
> > the "hadoop database" (if not in name, at least by connotation). I think
> > limiting our support to the latest X versions (2-3?) is reasonable given
> > consistent APIs
> 
> I was talking release mechanics not source/compilation/testing level
> support. Hence the suggestion for multiple Jenkins projects for the
> dependency versions we care about. That care could be scoped like you
> suggest.
> 
> I like what Bigtop espouses: carefully constructed snapshots of the
> world, well tested in total. Seems easier to manage than laying out
> various planes from increasingly higher dimensional spaces. If they
> get traction we can act as a responsible upstream project. As for our
> official release, we'd have maybe two, I'll grant you that, Hadoop 1
> and Hadoop 2.
> 
> X=2 will be a challenge. It's not just the Hadoop version that could
> change, but the versions of all of its dependencies, SLF4J, Guava,
> JUnit, protobuf, etc. etc. etc.; and that could happen at any time on
> point releases. If we are supporting the whole series of 1.x and 2.x
> releases, then that could be a real pain. Guava is a good example: it
> was a bit painful for us to move from 9 to 11, but not so for core as
> far as I know.

One of the by-design advantages of the stack-assembly-validation automation
approach (which Bigtop, incidentally, took ;) is that it allows relatively
effortless creation of stack updates when one or more dependencies change.
Yes, it requires a certain upfront time investment to produce the first
base stack definition, but from there it should be pretty much downhill.

We applied a similar approach to the creation of x86 Solaris-based stacks
for Sun Microsystems' rack-mount servers, and it was a hoot and saved us a
tremendous amount of money back then (not that it helped Sun in the long
run).

> > - we should be very careful in picking which new versions
> > we support and when. A lot of the pain with the hadoop distributions has
> > been the wildly shifting APIs, making a lot of work painful for handling
> > different versions (distcp/backup situations come to mind here, among other
> > things.)
> 
> We also have test dependencies on interfaces that are LimitedPrivate
> at best. It's a source of friction.
> 
> > +1 on the idea of having classifiers for the different versions we actually
> > release as proper artifacts, and should be completely reasonable to enable
> > via profiles. I'd have to double check as to _how_ people would specify
> > that classifier/version of hbase from the maven repo, but it seems entirely
> > possible (my worry here is about the collision with the -tests and -sources
> > classifiers, which are standard mvn conventions for different builds).
> > Otherwise, with maven it is very reasonable to have people hosting profiles
> > for versions that they want to support - generally, this means just another
> > settings.xml file that includes another profile that people can activate on
> > their own, when they want to build against their own version.
> 
> This was a question I had, maybe you know. What happens if you want to
> build something like <artifact>-<version>-<classifier>-tests or
> -source? Would that work? Otherwise we'd have to add a suffix using
> property substitutions in profiles, right?

*-tests artifacts in maven are somewhat special animals and can't be
depended upon in the usual sense. This was actually the reason Bigtop chose
to make/use regular binary jar artifacts with a name designator for their
test-related nature.
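
As a sketch of that alternative -- an ordinary binary jar whose name marks
its test nature, depended upon like any other artifact (the coordinates
below are hypothetical, purely to show the shape):

    <dependency>
      <groupId>org.apache.bigtop.itest</groupId>
      <artifactId>hbase-smoke</artifactId>
      <version>0.4.0</version>
    </dependency>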

With regards,
  Cos

> Best regards,
> 
>    - Andy
> 
> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein (via Tom White)

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
On Wed, May 16, 2012 at 4:06 PM, Andrew Purtell <ap...@apache.org> wrote:
>> +1 on the idea of having classifiers for the different versions we actually
>> release as proper artifacts, and should be completely reasonable to enable
>> via profiles. I'd have to double check as to _how_ people would specify
>> that classifier/version of hbase from the maven repo, but it seems entirely
>> possible (my worry here is about the collision with the -tests and -sources
>> classifiers, which are standard mvn conventions for different builds).
>> Otherwise, with maven it is very reasonable to have people hosting profiles
>> for versions that they want to support - generally, this means just another
>> settings.xml file that includes another profile that people can activate on
>> their own, when they want to build against their own version.
>
> This was a question I had, maybe you know. What happens if you want to
> build something like <artifact>-<version>-<classifier>-tests or
> -source? Would that work? Otherwise we'd have to add a suffix using
> property substitutions in profiles, right?

Well, we'd have to test whether using <classifier> and <type>
(http://maven.apache.org/guides/mini/guide-attached-tests.html) works
as expected.
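
For reference, the shape in question per that mini guide -- a minimal
sketch, with a hypothetical hbase version:

    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <version>0.94.0</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>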

Otherwise (and it may be easier/cleaner) just use 2 different versions
for the hbase JARs, one for Hadoop1 and one for Hadoop2 (i.e. embedding
h1 & h2 in the version). This may be simpler and less error-prone for
users.
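
A sketch of what that scheme would look like from a consumer's POM (the
version strings are hypothetical):

    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <version>0.94.0-hadoop2</version>  <!-- or 0.94.0-hadoop1 -->
    </dependency>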

Whatever we do should not be based on profiles, as (AFAIK) the published
POMs cannot be consumed with profiles activated/deactivated.

And again, it would be great if all projects affected by this end up
using an identical solution.

thx
-- 
Alejandro

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Andrew Purtell <ap...@apache.org>.
[cc bigtop-dev]

On Wed, May 16, 2012 at 3:22 PM, Jesse Yates <je...@gmail.com> wrote:
>  +1 on a small number of supported versions with different classifiers that
> only span a limited api skew to avoid a mountain of reflection. Along with
> that, support for the builds via jenkins testing.

and

>> I think HBase should consider having a single blessed set of
>> dependencies and only one build for a given release,
>
> This would be really nice, but seems a bit unreasonable given that we are
> the "hadoop database" (if not in name, at least by connotation). I think
> limiting our support to the latest X versions (2-3?) is reasonable given
> consistent APIs

I was talking release mechanics not source/compilation/testing level
support. Hence the suggestion for multiple Jenkins projects for the
dependency versions we care about. That care could be scoped like you
suggest.

I like what Bigtop espouses: carefully constructed snapshots of the
world, well tested in total. Seems easier to manage than laying out
various planes from increasingly higher dimensional spaces. If they
get traction we can act as a responsible upstream project. As for our
official release, we'd have maybe two, I'll grant you that, Hadoop 1
and Hadoop 2.

X=2 will be a challenge. It's not just the Hadoop version that could
change, but the versions of all of its dependencies, SLF4J, Guava,
JUnit, protobuf, etc. etc. etc.; and that could happen at any time on
point releases. If we are supporting the whole series of 1.x and 2.x
releases, then that could be a real pain. Guava is a good example: it
was a bit painful for us to move from 9 to 11, but not so for core as
far as I know.

> - we should be very careful in picking which new versions
> we support and when. A lot of the pain with the hadoop distributions has
> been the wildly shifting APIs, making a lot of work painful for handling
> different versions (distcp/backup situations come to mind here, among other
> things.)

We also have test dependencies on interfaces that are LimitedPrivate
at best. It's a source of friction.

> +1 on the idea of having classifiers for the different versions we actually
> release as proper artifacts, and should be completely reasonable to enable
> via profiles. I'd have to double check as to _how_ people would specify
> that classifier/version of hbase from the maven repo, but it seems entirely
> possible (my worry here is about the collision with the -tests and -sources
> classifiers, which are standard mvn conventions for different builds).
> Otherwise, with maven it is very reasonable to have people hosting profiles
> for versions that they want to support - generally, this means just another
> settings.xml file that includes another profile that people can activate on
> their own, when they want to build against their own version.

This was a question I had, maybe you know. What happens if you want to
build something like <artifact>-<version>-<classifier>-tests or
-source? Would that work? Otherwise we'd have to add a suffix using
property substitutions in profiles, right?

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Jesse Yates <je...@gmail.com>.
Comments inline.

TL;DR
 +1 on a small number of supported versions with different classifiers that
only span a limited api skew to avoid a mountain of reflection. Along with
that, support for the builds via jenkins testing.

Any further dependency resolution should be considered 'external projects'
and handled via their own maven settings.xml, which can be hosted in
external repos by people who want hbase to support other versions of our
dependencies (and possibly have a branch of hbase with the appropriate
modifications). Any new dependency versions we want to support should be
heavily vetted for ease of integration and stability.
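
A minimal sketch of such an externally hosted settings.xml (profile id,
property name, and version are hypothetical):

    <settings>
      <profiles>
        <profile>
          <id>hadoop-0.23</id>
          <properties>
            <hadoop.version>0.23.1</hadoop.version>
          </properties>
        </profile>
      </profiles>
      <activeProfiles>
        <activeProfile>hadoop-0.23</activeProfile>
      </activeProfiles>
    </settings>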

-1 on keeping code in the POMs for things we don't directly release, as
that means more potential maintenance for things we (as a community) don't
care that much about (a la current Avro support).

-------------------
Jesse Yates
@jesse_yates
jyates.github.com


On Wed, May 16, 2012 at 2:09 PM, Andrew Purtell <ap...@apache.org> wrote:

> On Wed, May 16, 2012 at 1:00 PM, Roman Shaposhnik <rv...@apache.org> wrote:
> > Now, it seems like in Bigtop we're going to soon expose the Maven repo
> > with all of the Maven artifacts constituting a particular Bigtop
> "stack". You
> > could think of it as a transitive closure of all of the deps. built
> against
> > each other. This, of course, will not tackle an issue of a random
> combination
> > of components (we only support the versions of components as
> > specified in our own BOM for each particular Bigtop release) but it will
> > provide a pretty stable body of Maven artifacts that are KNOWN (as
> > in tested) to be compiled against each other.
>
> I think HBase should consider having a single blessed set of
> dependencies and only one build for a given release,


This would be really nice, but seems a bit unreasonable given that we are
the "hadoop database" (if not in name, at least by connotation). I think
limiting our support to the latest X versions (2-3?) is reasonable given
consistent APIs - we should be very careful in picking which new versions
we support and when. A lot of the pain with the hadoop distributions has
been the wildly shifting APIs, making a lot of work painful for handling
different versions (distcp/backup situations come to mind here, among other
things.)


> but also several
> Jenkins projects set up to ensure that release also builds against
> some larger set of additional dependencies according to contributor
> needs,


Definitely a necessity if we support more than 1 version. Only problem here
is that we then have to worry about multiple builds, which seemed to be a
problem in the past. If we are going to support more than 1 version, we
need to have full support for that version/permutation of options (e.g.
Hadoop X with ZooKeeper Y).


> and otherwise the user is welcome to mvn
> -Ddependency.version=foo.


I'd prefer not to have pieces in the code that are not being regularly
tested/used. If we find we have a lot of people using a given version and
willing to support it, then we should roll it in (as with other external
dependencies, like the Avro stuff that we are stuck with).

The mvn command you recommend above is already quite close to what we are
doing, just specifying the hadoop version as a profile, e.g. (or close
enough)
 -Dhadoop.version=0.23
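
As a concrete sketch (profile id and property wiring are hypothetical;
hadoop-common is the Hadoop 0.23/2.0 artifact), a build-side profile could
pin the Hadoop dependency off such a property:

    <profile>
      <id>hadoop-0.23</id>
      <dependencies>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-common</artifactId>
          <version>${hadoop.version}</version>
        </dependency>
      </dependencies>
    </profile>

invoked as, say: mvn clean install -Phadoop-0.23 -Dhadoop.version=0.23.1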

+1 on the idea of having classifiers for the different versions we actually
release as proper artifacts, and should be completely reasonable to enable
via profiles. I'd have to double check as to _how_ people would specify
that classifier/version of hbase from the maven repo, but it seems entirely
possible (my worry here is about the collision with the -tests and -sources
classifiers, which are standard mvn conventions for different builds).
Otherwise, with maven it is very reasonable to have people hosting profiles
for versions that they want to support - generally, this means just another
settings.xml file that includes another profile that people can activate on
their own, when they want to build against their own version.
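
For reference, a downstream consumer would select such a classified
artifact like this (the "hadoop2" classifier and the version are
hypothetical):

    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <version>0.94.0</version>
      <classifier>hadoop2</classifier>
    </dependency>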


> A project like BigTop could separately
> handle a broader set of combinations according to "distribution
> consumer" demand, we could point potential users at that if it's an
> option.
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein (via Tom White)
>

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Andrew Purtell <ap...@apache.org>.
On Wed, May 16, 2012 at 1:00 PM, Roman Shaposhnik <rv...@apache.org> wrote:
> Now, it seems like in Bigtop we're going to soon expose the Maven repo
> with all of the Maven artifacts constituting a particular Bigtop "stack". You
> could think of it as a transitive closure of all of the deps. built against
> each other. This, of course, will not tackle an issue of a random combination
> of components (we only support the versions of components as
> specified in our own BOM for each particular Bigtop release) but it will
> provide a pretty stable body of Maven artifacts that are KNOWN (as
> in tested) to be compiled against each other.

I think HBase should consider having a single blessed set of
dependencies and only one build for a given release, but also several
Jenkins projects set up to ensure that release also builds against
some larger set of additional dependencies according to contributor
needs, and otherwise the user is welcome to mvn
-Ddependency.version=foo. A project like BigTop could separately
handle a broader set of combinations according to "distribution
consumer" demand, we could point potential users at that if it's an
option.
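
An illustrative invocation (the property names vary by project and are
hypothetical here):

    mvn clean install -Dhadoop.version=0.23.1 -Dzookeeper.version=3.4.3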

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Roman Shaposhnik <rv...@apache.org>.
+Bigtop

On Wed, May 16, 2012 at 12:50 PM, Alejandro Abdelnur <tu...@cloudera.com> wrote:
> A while ago I raised this issue in Pig.
>
> This is an issue that most if not all projects (hbase, pig, sqoop,
> hive, oozie,...) based on Hadoop will face.
>
> It would be great if all these projects come up with a consistent way
> of doing this.
>
> Any idea how to tackle it? Starting the discussion on all dev aliases?

This is something we've pondered in Bigtop. Our current thinking
is that while it is probably OK to lean on the "leaf-node" (think Pig,
Hive, to some extent HBase) projects to at least take Hadoop
compatibility into account, the full problem is going to combinatorially
explode pretty soon.

Take Hive as an example -- for that project just taking care of Hadoop
is not enough; if there are incompatibilities between HBase releases,
Hive needs to publish an HxB matrix of artifacts, where H is the # of
incompatible Hadoop versions and B is the # of incompatible HBase
versions. And that doesn't take into account the fact that Hive might be
interested in publishing different artifacts to begin with (think
-security artifacts in HBase). This gets pretty ugly pretty quickly.

Oh, and don't forget that somebody has to test all of the above.

Now, it seems like in Bigtop we're going to soon expose the Maven repo
with all of the Maven artifacts constituting a particular Bigtop "stack". You
could think of it as a transitive closure of all of the deps. built against
each other. This, of course, will not tackle an issue of a random combination
of components (we only support the versions of components as
specified in our own BOM for each particular Bigtop release) but it will
provide a pretty stable body of Maven artifacts that are KNOWN (as
in tested) to be compiled against each other.
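
For a downstream project, consuming such a stack repo would presumably
amount to adding it to the POM and taking artifact versions from the
Bigtop BOM (the URL below is hypothetical):

    <repositories>
      <repository>
        <id>bigtop-stack</id>
        <url>https://repository.example.org/bigtop/releases</url>
      </repository>
    </repositories>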

If this sounds interesting and useful for upstream projects, I'd invite
continuing this discussion on bigtop-dev@.

Thanks,
Roman.

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
A while ago I raised this issue in Pig.

This is an issue that most if not all projects (hbase, pig, sqoop,
hive, oozie,...) based on Hadoop will face.

It would be great if all these projects come up with a consistent way
of doing this.

Any idea how to tackle it? Starting the discussion on all dev aliases?

thx

On Wed, May 16, 2012 at 12:45 PM, Gary Helmling <gh...@gmail.com> wrote:
> Maven's support for "classifiers" in dependencies seems to be targeted
> at this kind of case:
> http://maven.apache.org/pom.html#Dependencies
>
> I'm not sure how exactly that works with publishing artifacts though.
> It may just amount to appending the "classifier" as a suffix anyway.
> But may be worth looking at in more detail.
>
>
> On Wed, May 16, 2012 at 11:35 AM, Jonathan Hsieh <jo...@cloudera.com> wrote:
>> Andy,
>>
>> Ah, ok that sounds reasonable.  So this would be similar to how the
>> security build used to have a "-security" suffix but for hadoop2 we'd have
>> something like a "-hadoop2" suffix instead.
>>
>> Jon.
>>
>> On Wed, May 16, 2012 at 11:27 AM, Andrew Purtell <ap...@apache.org> wrote:
>>
>>> On Wed, May 16, 2012 at 11:24 AM, Jonathan Hsieh <jo...@cloudera.com> wrote:
>>> > I've gotten pinged by folks working on Apache Flume, a project that
>>> depends
>>> > directly upon hbase and hadoop hdfs jars about how to get the proper
>>> hbase
>>> > jars that work against hadoop 1.0 and hadoop 0.23/2.0.
>>> > Unfortunately, the transition from hadoop 1.0.0 to hadoop 0.23.x/2.0
>>> > requires hbase to be recompiled to run against the different hadoop
>>> > version. ("compile compatible" but not "binary compatible").
>>> >
>>> > Currently, we build and publish hbase jars compiled against hadoop 1.0.x.
>>> >
>>> > What is the right way to publish poms/jars for those who want to use
>>> hbase
>>> > jars compiled against hadoop 0.23/2.0?  Is there a right way?
>>>
>>> This requires we add a version suffix for the Hadoop version used during
>>> build?
>>>
>>> Best regards,
>>>
>>>    - Andy
>>>
>>> Problems worthy of attack prove their worth by hitting back. - Piet
>>> Hein (via Tom White)
>>>
>>
>>
>>
>> --
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // jon@cloudera.com



-- 
Alejandro

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Gary Helmling <gh...@gmail.com>.
Maven's support for "classifiers" in dependencies seems to be targeted
at this kind of case:
http://maven.apache.org/pom.html#Dependencies

I'm not sure how exactly that works with publishing artifacts though.
It may just amount to appending the "classifier" as a suffix anyway.
But may be worth looking at in more detail.
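
Concretely, in a Maven repository a classifier does surface as a filename
suffix next to the main artifact, alongside the standard ones; e.g. (the
"hadoop2" classifier and the version are hypothetical):

    org/apache/hbase/hbase/0.94.0/
      hbase-0.94.0.jar               (main artifact)
      hbase-0.94.0-hadoop2.jar       (hypothetical hadoop2 classifier)
      hbase-0.94.0-tests.jar         (standard tests classifier)
      hbase-0.94.0-sources.jar       (standard sources classifier)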


On Wed, May 16, 2012 at 11:35 AM, Jonathan Hsieh <jo...@cloudera.com> wrote:
> Andy,
>
> Ah, ok that sounds reasonable.  So this would be similar to how the
> security build used to have a "-security" suffix but for hadoop2 we'd have
> something like a "-hadoop2" suffix instead.
>
> Jon.
>
> On Wed, May 16, 2012 at 11:27 AM, Andrew Purtell <ap...@apache.org> wrote:
>
>> On Wed, May 16, 2012 at 11:24 AM, Jonathan Hsieh <jo...@cloudera.com> wrote:
>> > I've gotten pinged by folks working on Apache Flume, a project that
>> depends
>> > directly upon hbase and hadoop hdfs jars about how to get the proper
>> hbase
>> > jars that work against hadoop 1.0 and hadoop 0.23/2.0.
>> > Unfortunately, the transition from hadoop 1.0.0 to hadoop 0.23.x/2.0
>> > requires hbase to be recompiled to run against the different hadoop
>> > version. ("compile compatible" but not "binary compatible").
>> >
>> > Currently, we build and publish hbase jars compiled against hadoop 1.0.x.
>> >
>> > What is the right way to publish poms/jars for those who want to use
>> hbase
>> > jars compiled against hadoop 0.23/2.0?  Is there a right way?
>>
>> This requires we add a version suffix for the Hadoop version used during
>> build?
>>
>> Best regards,
>>
>>    - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet
>> Hein (via Tom White)
>>
>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Jonathan Hsieh <jo...@cloudera.com>.
Andy,

Ah, ok that sounds reasonable.  So this would be similar to how the
security build used to have a "-security" suffix but for hadoop2 we'd have
something like a "-hadoop2" suffix instead.

Jon.

On Wed, May 16, 2012 at 11:27 AM, Andrew Purtell <ap...@apache.org> wrote:

> On Wed, May 16, 2012 at 11:24 AM, Jonathan Hsieh <jo...@cloudera.com> wrote:
> > I've gotten pinged by folks working on Apache Flume, a project that
> depends
> > directly upon hbase and hadoop hdfs jars about how to get the proper
> hbase
> > jars that work against hadoop 1.0 and hadoop 0.23/2.0.
> > Unfortunately, the transition from hadoop 1.0.0 to hadoop 0.23.x/2.0
> > requires hbase to be recompiled to run against the different hadoop
> > version. ("compile compatible" but not "binary compatible").
> >
> > Currently, we build and publish hbase jars compiled against hadoop 1.0.x.
> >
> > What is the right way to publish poms/jars for those who want to use
> hbase
> > jars compiled against hadoop 0.23/2.0?  Is there a right way?
>
> This requires we add a version suffix for the Hadoop version used during
> build?
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein (via Tom White)
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

Re: Publishing jars for hbase compiled against hadoop 0.23.x/hadoop 2.0.x

Posted by Andrew Purtell <ap...@apache.org>.
On Wed, May 16, 2012 at 11:24 AM, Jonathan Hsieh <jo...@cloudera.com> wrote:
> I've gotten pinged by folks working on Apache Flume, a project that depends
> directly upon hbase and hadoop hdfs jars about how to get the proper hbase
> jars that work against hadoop 1.0 and hadoop 0.23/2.0.
> Unfortunately, the transition from hadoop 1.0.0 to hadoop 0.23.x/2.0
> requires hbase to be recompiled to run against the different hadoop
> version. ("compile compatible" but not "binary compatible").
>
> Currently, we build and publish hbase jars compiled against hadoop 1.0.x.
>
> What is the right way to publish poms/jars for those who want to use hbase
> jars compiled against hadoop 0.23/2.0?  Is there a right way?

This requires we add a version suffix for the Hadoop version used during build?

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)