Posted to user@hbase.apache.org by Mike Spreitzer <ms...@us.ibm.com> on 2011/02/05 08:47:46 UTC

Using the Hadoop bundled in the lib directory of HBase

Hi, I'm new to HBase and have a stupid question about its dependency on 
Hadoop.  Section 1.3.1.2 (http://hbase.apache.org/notsoquick.html) says
there is an "instance" of Hadoop in the lib directory of HBase.  What 
exactly is meant by "instance"?  Is it all I need, or do I need to get a 
"full" copy of Hadoop from elsewhere?  If HBase already has all I need, I 
am having trouble finding it.  The Hadoop instructions refer to commands, 
for example, that I can't find.

Thanks,
Mike

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Norbert Burger <no...@gmail.com>.
For testing purposes, it is possible to run HBase without HDFS, giving up
the durability benefits.  Benoit Sigoure has a good writeup here:

http://opentsdb.net/setup-hbase.html

But for larger deployments, HDFS is the way to go.  Another approach you
might consider is the pseudo-distributed option, where you get Hadoop+HBase
running all on the same node (http://goo.gl/Rytnp).
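
If it helps, the pseudo-distributed setup boils down to two properties in
conf/hbase-site.xml -- a sketch (the HDFS port and path here are just the
common defaults; use whatever your fs.default.name says):

  <configuration>
    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://localhost:9000/hbase</value>
    </property>
    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>
  </configuration>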

Norbert

On Sat, Feb 5, 2011 at 9:51 AM, Norbert Burger <no...@gmail.com> wrote:

> Mike, you'll also need access to an installation of Hadoop, whether
> this is on the same machines as your HBase install (common), or somewhere
> else.  Often, people install Hadoop first and then layer HBase over it.
>
> HBase depends on core Hadoop functionality like HDFS, and uses the Hadoop
> JAR in lib/ to support this.  But this is library code only; what you're
> missing is the rest of the Hadoop ecosystem (config files, directory
> structure, command-line tools, etc.)
>
> Norbert
>
>
> On Sat, Feb 5, 2011 at 9:21 AM, Ted Yu <yu...@gmail.com> wrote:
>
>> On a related note:
>> http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
>> http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled
>> out.
>>
>> On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com>
>> wrote:
>>
>> > Hi, I'm new to HBase and have a stupid question about its dependency on
>> > Hadoop.  Section 1.3.1.2 (http://hbase.apache.org/notsoquick.html) says
>> > there is an "instance" of Hadoop in the lib directory of HBase.  What
>> > exactly is meant by "instance"?  Is it all I need, or do I need to get a
>> > "full" copy of Hadoop from elsewhere?  If HBase already has all I need,
>> > I am having trouble finding it.  The Hadoop instructions refer to
>> > commands, for example, that I can't find.
>> >
>> > Thanks,
>> > Mike
>>
>
>

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Stack <st...@duboce.net>.
Oh, sorry.  Thanks for the noob POV, Joe.  Invaluable.  Let me have a go at it.
St.Ack


On Mon, Feb 7, 2011 at 9:21 AM, Joe Pallas <pa...@cs.stanford.edu> wrote:
>
> On Feb 7, 2011, at 9:02 AM, Stack wrote:
>
>> Here is our Hadoop story for 0.90.0:
>> http://hbase.apache.org/notsoquick.html#hadoop
>
> And for someone who is new to HBase and Hadoop, those two paragraphs are immensely confusing.  First it says you have to build your own Hadoop, and then it says a copy of Hadoop is bundled with HBase.
>
> The poor newbie is at a total loss.  Do I have to build my own copy of Hadoop or not?  Can I download the latest 0.20.x Hadoop and just replace the hadoop-core jar with the one from the HBase distribution?  And then all those other versions get mentioned, which just makes things even more confusing.
>
> I think this could be clearer.
> joe
>
>

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Joe Pallas <pa...@cs.stanford.edu>.
On Feb 7, 2011, at 9:02 AM, Stack wrote:

> Here is our Hadoop story for 0.90.0:
> http://hbase.apache.org/notsoquick.html#hadoop

And for someone who is new to HBase and Hadoop, those two paragraphs are immensely confusing.  First it says you have to build your own Hadoop, and then it says a copy of Hadoop is bundled with HBase.

The poor newbie is at a total loss.  Do I have to build my own copy of Hadoop or not?  Can I download the latest 0.20.x Hadoop and just replace the hadoop-core jar with the one from the HBase distribution?  And then all those other versions get mentioned, which just makes things even more confusing.

I think this could be clearer.
joe


Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Ryan Rawson <ry...@gmail.com>.
Hey guys,

If you are running on hadoop 0.20.2, you are going to lose data when
you crash.  So don't do it :-)

You will need to either use a cdh3 beta (we use b2), or build the
hadoop-20-append branch.  We have built the hadoop-20-append tip and
included the JAR with the default distribution. It is not compatible
with hadoop 0.20.2 (stock/native) nor cdh3 beta*.
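
If you go the build-it-yourself route, the outline is roughly this (a
sketch, assuming you have svn and ant on your path; the full release build
additionally wants java5 and forrest):

  svn checkout \
    http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/ \
    hadoop-append
  cd hadoop-append
  ant jar    # produces a hadoop-*-core.jar under build/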

It's really confusing, but the basic fact is there is no ASF-released
version of hadoop that runs HBase properly. My best suggestion is to
complain to general@, and file JIRAs if you can. It helps when users
complain, since I think everyone has gone tone deaf from me
complaining :-)

-ryan

On Thu, Feb 10, 2011 at 6:13 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> Yes, you've got it right.  Let me emphasize that what I did was *much*
> easier than the other way around --- which I tried first and in which I
> had problems.  The Cloudera release specifically depends on Sun security
> classes that are not in the Java (IBM's) that I used.  I tried building
> Hadoop's 0.20-append branch but had some difficulties and it took a long
> time.  The various build instructions I found all talked about running the
> regression test suite once or twice --- and a single run takes hours.  The
> first time I ran it, from a clean download and build, it had problems. And
> the instructions are confusing regarding building the native part.  The
> instructions seem to say you can build and test without building the
> native support; how can that be?
>
> Regards,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>
>
>
> From:   Suraj Varma <sv...@gmail.com>
> To:     user@hbase.apache.org
> Date:   02/10/2011 08:02 AM
> Subject:        Re: Using the Hadoop bundled in the lib directory of HBase
>
>
>
> This procedure does seem a bit opposite of what I've seen folks recommend
> (and the way it is documented in the notsoquick.html). But it might be
> equivalent in this specific case (not completely sure as scripts etc are
> different). I'll let one of the experts comment on that.
>
> If I understood you right, you took the hadoop 0.20.2 release (which does
> not have the append support needed to prevent data loss in some
> situations) and installed that. Next you took hbase 0.90.0's
> hadoop-core.jar (which is from a separately built branch-0.20-append) and
> copied that over to the hadoop installation.
>
> What folks usually do is copy the hadoop install's jar file over to
> hbase - so, if you have a Cloudera install, you would copy the Cloudera
> built hadoop jar over to your hbase install (replacing the hbase hadoop
> jar).
>
> I'm guessing that in your specific situation since branch-0.20-append and
> hadoop 0.20.2 are fairly close (other than the append changes), it "might"
> work. But - not sure if this is what folks normally do ...
>
> Can someone clarify this? The above procedure Mike followed certainly is
> much simpler in this specific case as he doesn't have to build out his
> own branch-0.20-append and can rather "reuse" the one that was built for
> hbase-0.90.
>
> Thanks,
> --Suraj
>
>
> On Mon, Feb 7, 2011 at 9:17 AM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>
>> After a few false starts, what I have done is: fetch the 0.20.2 release
>> of hadoop core (which appears to be common + dfs + mapred), install it,
>> delete hadoop/hadoop-core.jar, unpack the hbase distribution, copy its
>> lib/hadoop-core-...jar file to hadoop/hadoop-...-core.jar, configure,
>> and test.  It seems to be working.  Is that what you expected?  Should I
>> expect subtle problems?
>>
>> If that was the right procedure, this could be explained a little more
>> clearly at (http://hbase.apache.org/notsoquick.html#hadoop).  The first
>> thing that set me on the wrong path was the statement that I have to
>> either build my own Hadoop or use Cloudera; apparently that's not right,
>> I can use a built release if I replace one jar in it.  That web page says
>> "If you want to run HBase on an Hadoop cluster that is other than a
>> version made from branch-0.20.append" (which is my case, using a
>> standard release) "you must replace the hadoop jar found in the HBase
>> lib directory with the hadoop jar you are running out on your cluster to
>> avoid version mismatch issues" --- but I think it's the other way around
>> in my case.
>>
>> Thanks,
>> Mike Spreitzer
>> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
>> Office phone: +1-914-784-6424 (IBM T/L 863-)
>> AOL Instant Messaging: M1k3Sprtzr
>>
>>
>>
>> From:   Stack <st...@duboce.net>
>> To:     user@hbase.apache.org
>> Date:   02/07/2011 12:07 PM
>> Subject:        Re: Using the Hadoop bundled in the lib directory of HBase
>> Sent by:        saint.ack@gmail.com
>>
>>
>>
>> On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
>> > Hadoop core contained hdfs / mapreduce, all bundled together until
>> > 0.20.x.  Since 0.21, it got forked into common, hdfs and mapreduce
>> > sub-projects.
>> >
>>
>> What Vijay said.
>>
>> > In this case - what is needed is a 0.20.2 download from hadoop and
>> > configuring the same. The hadoop-0.20.2.jar needs to be replaced by
>> > the patched hadoop-0.20.2-xxxx.jar available in the HBASE_HOME/lib
>> > directory, to make things work.
>> >
>>
>> This is a little off.
>>
>> Here is our Hadoop story for 0.90.0:
>> http://hbase.apache.org/notsoquick.html#hadoop
>>
>> It links to the branch.   If you need instruction on how to check out
>> and build, just say (do we need to add pointers to book?)
>>
>> St.Ack
>>
>>
>
>

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Mike Spreitzer <ms...@us.ibm.com>.
Yes, you've got it right.  Let me emphasize that what I did was *much* 
easier than the other way around --- which I tried first and in which I 
had problems.  The Cloudera release specifically depends on Sun security 
classes that are not in the Java (IBM's) that I used.  I tried building 
Hadoop's 0.20-append branch but had some difficulties and it took a long 
time.  The various build instructions I found all talked about running the 
regression test suite once or twice --- and a single run takes hours.  The 
first time I ran it, from a clean download and build, it had problems. And 
the instructions are confusing regarding building the native part.  The 
instructions seem to say you can build and test without building the 
native support; how can that be?

Regards,
Mike Spreitzer
SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
Office phone: +1-914-784-6424 (IBM T/L 863-)
AOL Instant Messaging: M1k3Sprtzr



From:   Suraj Varma <sv...@gmail.com>
To:     user@hbase.apache.org
Date:   02/10/2011 08:02 AM
Subject:        Re: Using the Hadoop bundled in the lib directory of HBase



This procedure does seem a bit opposite of what I've seen folks recommend
(and the way it is documented in the notsoquick.html). But it might be
equivalent in this specific case (not completely sure as scripts etc are
different). I'll let one of the experts comment on that.

If I understood you right, you took the hadoop 0.20.2 release (which does
not have the append support needed to prevent data loss in some situations)
and installed that. Next you took hbase 0.90.0's hadoop-core.jar (which is
from a separately built branch-0.20-append) and copied that over to the
hadoop installation.

What folks usually do is copy the hadoop install's jar file over to
hbase - so, if you have a Cloudera install, you would copy the Cloudera
built hadoop jar over to your hbase install (replacing the hbase hadoop
jar).

I'm guessing that in your specific situation since branch-0.20-append and
hadoop 0.20.2 are fairly close (other than the append changes), it "might"
work. But - not sure if this is what folks normally do ...

Can someone clarify this? The above procedure Mike followed certainly is
much simpler in this specific case as he doesn't have to build out his own
branch-0.20-append and can rather "reuse" the one that was built for
hbase-0.90.

Thanks,
--Suraj


On Mon, Feb 7, 2011 at 9:17 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:

> After a few false starts, what I have done is: fetch the 0.20.2 release
> of hadoop core (which appears to be common + dfs + mapred), install it,
> delete hadoop/hadoop-core.jar, unpack the hbase distribution, copy its
> lib/hadoop-core-...jar file to hadoop/hadoop-...-core.jar, configure,
> and test.  It seems to be working.  Is that what you expected?  Should I
> expect subtle problems?
>
> If that was the right procedure, this could be explained a little more
> clearly at (http://hbase.apache.org/notsoquick.html#hadoop).  The first
> thing that set me on the wrong path was the statement that I have to
> either build my own Hadoop or use Cloudera; apparently that's not right,
> I can use a built release if I replace one jar in it.  That web page says
> "If you want to run HBase on an Hadoop cluster that is other than a
> version made from branch-0.20.append" (which is my case, using a standard
> release) "you must replace the hadoop jar found in the HBase lib
> directory with the hadoop jar you are running out on your cluster to
> avoid version mismatch issues" --- but I think it's the other way around
> in my case.
>
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>
>
>
> From:   Stack <st...@duboce.net>
> To:     user@hbase.apache.org
> Date:   02/07/2011 12:07 PM
> Subject:        Re: Using the Hadoop bundled in the lib directory of HBase
> Sent by:        saint.ack@gmail.com
>
>
>
> On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
> > Hadoop core contained hdfs / mapreduce, all bundled together until
> > 0.20.x.  Since 0.21, it got forked into common, hdfs and mapreduce
> > sub-projects.
> >
>
> What Vijay said.
>
> > In this case - what is needed is a 0.20.2 download from hadoop and
> > configuring the same. The hadoop-0.20.2.jar needs to be replaced by the
> > patched hadoop-0.20.2-xxxx.jar available in the HBASE_HOME/lib
> > directory, to make things work.
> >
>
> This is a little off.
>
> Here is our Hadoop story for 0.90.0:
> http://hbase.apache.org/notsoquick.html#hadoop
>
> It links to the branch.   If you need instruction on how to check out
> and build, just say (do we need to add pointers to book?)
>
> St.Ack
>
>


Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Suraj Varma <sv...@gmail.com>.
This procedure does seem a bit opposite of what I've seen folks recommend
(and the way it is documented in the notsoquick.html). But it might be
equivalent in this specific case (not completely sure as scripts etc are
different). I'll let one of the experts comment on that.

If I understood you right, you took the hadoop 0.20.2 release (which does
not have the append support needed to prevent data loss in some situations)
and installed that. Next you took hbase 0.90.0's hadoop-core.jar (which is
from a separately built branch-0.20-append) and copied that over to the
hadoop installation.

What folks usually do is copy the hadoop install's jar file over to
hbase - so, if you have a Cloudera install, you would copy the Cloudera
built hadoop jar over to your hbase install (replacing the hbase hadoop
jar).
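
Concretely, that usual direction is just a one-jar swap, something like
this (jar names are illustrative -- use whatever your installs actually
contain):

  # drop the jar HBase ships with ...
  rm $HBASE_HOME/lib/hadoop-core-0.20-append-r1056497.jar
  # ... and put in the jar your cluster actually runs
  cp $HADOOP_HOME/hadoop-*-core.jar $HBASE_HOME/lib/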

I'm guessing that in your specific situation since branch-0.20-append and
hadoop 0.20.2 are fairly close (other than the append changes), it "might"
work. But - not sure if this is what folks normally do ...

Can someone clarify this? The above procedure Mike followed certainly is
much simpler in this specific case as he doesn't have to build out his own
branch-0.20-append and can rather "reuse" the one that was built for
hbase-0.90.

Thanks,
--Suraj


On Mon, Feb 7, 2011 at 9:17 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:

> After a few false starts, what I have done is: fetch the 0.20.2 release of
> hadoop core (which appears to be common + dfs + mapred), install it,
> delete hadoop/hadoop-core.jar, unpack the hbase distribution, copy its
> lib/hadoop-core-...jar file to hadoop/hadoop-...-core.jar, configure, and
> test.  It seems to be working.  Is that what you expected?  Should I
> expect subtle problems?
>
> If that was the right procedure, this could be explained a little more
> clearly at (http://hbase.apache.org/notsoquick.html#hadoop).  The first
> thing that set me on the wrong path was the statement that I have to
> either build my own Hadoop or use Cloudera; apparently that's not right, I
> can use a built release if I replace one jar in it.  That web page says "
> If you want to run HBase on an Hadoop cluster that is other than a version
> made from branch-0.20.append " (which is my case, using a standard
> release) "you must replace the hadoop jar found in the HBase lib directory
> with the hadoop jar you are running out on your cluster to avoid version
> mismatch issues" --- but I think it's the other way around in my case.
>
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>
>
>
> From:   Stack <st...@duboce.net>
> To:     user@hbase.apache.org
> Date:   02/07/2011 12:07 PM
> Subject:        Re: Using the Hadoop bundled in the lib directory of HBase
> Sent by:        saint.ack@gmail.com
>
>
>
> On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
> > Hadoop core contained hdfs / mapreduce, all bundled together until
> > 0.20.x.  Since 0.21, it got forked into common, hdfs and mapreduce
> > sub-projects.
> >
>
> What Vijay said.
>
> > In this case - what is needed is a 0.20.2 download from hadoop and
> > configuring the same. The hadoop-0.20.2.jar needs to be replaced by the
> > patched hadoop-0.20.2-xxxx.jar available in the HBASE_HOME/lib
> > directory, to make things work.
> >
>
> This is a little off.
>
> Here is our Hadoop story for 0.90.0:
> http://hbase.apache.org/notsoquick.html#hadoop
>
> It links to the branch.   If you need instruction on how to check out
> and build, just say (do we need to add pointers to book?)
>
> St.Ack
>
>

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Mike Spreitzer <ms...@us.ibm.com>.
I do not see a BlockChannel.java in 
http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/ 
--- nor do I see any references in there to BlockChannel.
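
(For the record, this is how I looked -- a fresh checkout plus a recursive
grep, which turned up nothing for me:

  svn checkout \
    http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/ \
    append-branch
  grep -r BlockChannel append-branch/src
)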

Thanks,
Mike Spreitzer




From:   Ryan Rawson <ry...@gmail.com>
To:     user@hbase.apache.org
Date:   02/13/2011 03:51 PM
Subject:        Re: Using the Hadoop bundled in the lib directory of HBase



On Sun, Feb 13, 2011 at 8:29 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> Yes, I simply took the Hadoop 0.20.2 release, deleted its
> hadoop-core.jar, and replaced it with the contents of
> lib/hadoop-core-0.20-append-r1056497.jar from hbase.
>
> I'm not sure what to do with "this approach might work".  How can I know
> if it really does?

I'm not sure, maybe it'll work great until one day in a month everything
will crash and burn due to <thing no one could have guessed>.  Perhaps
someone with extensive hdfs code experience might be able to tell you.

>
> BTW, I see that HBase's lib/hadoop-core-0.20-append-r1056497.jar contains
> org/apache/hadoop/hdfs/server/datanode/BlockChannel.class but I am having
> trouble figuring out why.  From where in SVN does that come?

Is it not in the append-20-branch ?


>
> Thanks,
> Mike Spreitzer
>
>
>
>
> From:   Ryan Rawson <ry...@gmail.com>
> To:     user@hbase.apache.org
> Cc:     stack <sa...@gmail.com>
> Date:   02/13/2011 02:33 AM
> Subject:        Re: Using the Hadoop bundled in the lib directory of HBase
>
>
>
> If you are taking the jar that we ship and slamming it in a hadoop
> 0.20.2 based distro that might work.  I'm not sure if there are any
> differences other than pure code (which would then be expressed in the
> jar only), so this approach might work.
>
> You could also check out the revision that we built our JAR from and
> try that. By default you need apache forrest (argh) and java5 to
> build hadoop (ARGH) which makes it not buildable on OSX.
>
> Building sucks, there are no short cuts. Good luck out there!
> -ryan
>
> On Sat, Feb 12, 2011 at 11:24 PM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>> Let me be clear about the amount of testing I did: extremely little.  I
>> should also point out that at first I did not appreciate fully the
>> meaning of your earlier comment to Vijay saying "this is a little off"
>> --- I now realize you were in fact saying that Vijay told me to do
>> things backward.
>>
>> Since my note saying the backward approach worked, two things have
>> happened: (1) someone made a link to it from (
>> http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied
>> saying, in no uncertain terms, that the backward approach is unreliable.
>> I would not have noticed a reliability issue in the negligible testing I
>> did.
>>
>> Having gotten two opposite opinions, I am now unsure of the truth of 
the
>> matter.  Is there any chance of Vijay and Ryan agreeing?
>>
>> Thanks,
>> Mike Spreitzer
>> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
>> Office phone: +1-914-784-6424 (IBM T/L 863-)
>> AOL Instant Messaging: M1k3Sprtzr
>>
>
>


Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Ryan Rawson <ry...@gmail.com>.
On Sun, Feb 13, 2011 at 8:29 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> Yes, I simply took the Hadoop 0.20.2 release, deleted its hadoop-core.jar,
> and replaced it with the contents of
> lib/hadoop-core-0.20-append-r1056497.jar from hbase.
>
> I'm not sure what to do with "this approach might work".  How can I know
> if it really does?

I'm not sure, maybe it'll work great until one day in a month everything
will crash and burn due to <thing no one could have guessed>.  Perhaps
someone with extensive hdfs code experience might be able to tell you.

>
> BTW, I see that HBase's lib/hadoop-core-0.20-append-r1056497.jar contains
> org/apache/hadoop/hdfs/server/datanode/BlockChannel.class but I am having
> trouble figuring out why.  From where in SVN does that come?

Is it not in the append-20-branch ?


>
> Thanks,
> Mike Spreitzer
>
>
>
>
> From:   Ryan Rawson <ry...@gmail.com>
> To:     user@hbase.apache.org
> Cc:     stack <sa...@gmail.com>
> Date:   02/13/2011 02:33 AM
> Subject:        Re: Using the Hadoop bundled in the lib directory of HBase
>
>
>
> If you are taking the jar that we ship and slamming it in a hadoop
> 0.20.2 based distro that might work.  I'm not sure if there are any
> differences other than pure code (which would then be expressed in the
> jar only), so this approach might work.
>
> You could also check out the revision that we built our JAR from and
> try that. By default you need apache forrest (argh) and java5 to
> build hadoop (ARGH) which makes it not buildable on OSX.
>
> Building sucks, there are no short cuts. Good luck out there!
> -ryan
>
> On Sat, Feb 12, 2011 at 11:24 PM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>> Let me be clear about the amount of testing I did: extremely little.  I
>> should also point out that at first I did not appreciate fully the
>> meaning of your earlier comment to Vijay saying "this is a little off"
>> --- I now realize you were in fact saying that Vijay told me to do
>> things backward.
>>
>> Since my note saying the backward approach worked, two things have
>> happened: (1) someone made a link to it from (
>> http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied
>> saying, in no uncertain terms, that the backward approach is unreliable.
>> I would not have noticed a reliability issue in the negligible testing I
>> did.
>>
>> Having gotten two opposite opinions, I am now unsure of the truth of the
>> matter.  Is there any chance of Vijay and Ryan agreeing?
>>
>> Thanks,
>> Mike Spreitzer
>> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
>> Office phone: +1-914-784-6424 (IBM T/L 863-)
>> AOL Instant Messaging: M1k3Sprtzr
>>
>
>

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Mike Spreitzer <ms...@us.ibm.com>.
Yes, I simply took the Hadoop 0.20.2 release, deleted its hadoop-core.jar, 
and replaced it with the contents of 
lib/hadoop-core-0.20-append-r1056497.jar from hbase.

I'm not sure what to do with "this approach might work".  How can I know 
if it really does?

BTW, I see that HBase's lib/hadoop-core-0.20-append-r1056497.jar contains 
org/apache/hadoop/hdfs/server/datanode/BlockChannel.class but I am having 
trouble figuring out why.  From where in SVN does that come?
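
(Listing the jar's contents shows it, e.g.:

  jar tf lib/hadoop-core-0.20-append-r1056497.jar | grep BlockChannel
)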

Thanks,
Mike Spreitzer




From:   Ryan Rawson <ry...@gmail.com>
To:     user@hbase.apache.org
Cc:     stack <sa...@gmail.com>
Date:   02/13/2011 02:33 AM
Subject:        Re: Using the Hadoop bundled in the lib directory of HBase



If you are taking the jar that we ship and slamming it in a hadoop
0.20.2 based distro that might work.  I'm not sure if there are any
differences other than pure code (which would then be expressed in the jar
only), so this approach might work.

You could also check out the revision that we built our JAR from and
try that. By default you need apache forrest (argh) and java5 to
build hadoop (ARGH) which makes it not buildable on OSX.

Building sucks, there are no short cuts. Good luck out there!
-ryan

On Sat, Feb 12, 2011 at 11:24 PM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> Let me be clear about the amount of testing I did: extremely little.  I
> should also point out that at first I did not appreciate fully the
> meaning of your earlier comment to Vijay saying "this is a little off"
> --- I now realize you were in fact saying that Vijay told me to do
> things backward.
>
> Since my note saying the backward approach worked, two things have
> happened: (1) someone made a link to it from (
> http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied
> saying, in no uncertain terms, that the backward approach is unreliable.
> I would not have noticed a reliability issue in the negligible testing I
> did.
>
> Having gotten two opposite opinions, I am now unsure of the truth of the
> matter.  Is there any chance of Vijay and Ryan agreeing?
>
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>


Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Ryan Rawson <ry...@gmail.com>.
If you are taking the jar that we ship and slamming it in a hadoop
0.20.2 based distro that might work.  I'm not sure if there are any
differences other than pure code (which would then be expressed in the jar
only), so this approach might work.

You could also check out the revision that we built our JAR from and
try that. By default you need apache forrest (argh) and java5 to
build hadoop (ARGH) which makes it not buildable on OSX.
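
(Going by the jar's name, the revision would be r1056497 -- something
like this, untested:

  svn checkout -r 1056497 \
    http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/ \
    hadoop-append
)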

Building sucks, there are no short cuts. Good luck out there!
-ryan

On Sat, Feb 12, 2011 at 11:24 PM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> Let me be clear about the amount of testing I did: extremely little.  I
> should also point out that at first I did not appreciate fully the meaning
> of your earlier comment to Vijay saying "this is a little off" --- I now
> realize you were in fact saying that Vijay told me to do things backward.
>
> Since my note saying the backward approach worked, two things have
> happened: (1) someone made a link to it from (
> http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied
> saying, in no uncertain terms, that the backward approach is unreliable. I
> would not have noticed a reliability issue in the negligible testing I
> did.
>
> Having gotten two opposite opinions, I am now unsure of the truth of the
> matter.  Is there any chance of Vijay and Ryan agreeing?
>
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Mike Spreitzer <ms...@us.ibm.com>.
Let me be clear about the amount of testing I did: extremely little.  I 
should also point out that at first I did not appreciate fully the meaning 
of your earlier comment to Vijay saying "this is a little off" --- I now 
realize you were in fact saying that Vijay told me to do things backward.

Since my note saying the backward approach worked, two things have 
happened: (1) someone made a link to it from (
http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied 
saying, in no uncertain terms, that the backward approach is unreliable. I 
would not have noticed a reliability issue in the negligible testing I 
did.

Having gotten two opposite opinions, I am now unsure of the truth of the 
matter.  Is there any chance of Vijay and Ryan agreeing?

Thanks,
Mike Spreitzer
SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
Office phone: +1-914-784-6424 (IBM T/L 863-)
AOL Instant Messaging: M1k3Sprtzr

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Mike Spreitzer <ms...@us.ibm.com>.
After a few false starts, what I have done is: fetch the 0.20.2 release of 
hadoop core (which appears to be common + dfs + mapred), install it, 
delete hadoop/hadoop-core.jar, unpack the hbase distribution, copy its 
lib/hadoop-core-...jar file to hadoop/hadoop-...-core.jar, configure, and 
test.  It seems to be working.  Is that what you expected?  Should I 
expect subtle problems?
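
(Spelled out, the steps were roughly the following; version strings are
from my downloads and may differ in yours:

  tar xzf hadoop-0.20.2.tar.gz
  tar xzf hbase-0.90.0.tar.gz
  rm hadoop-0.20.2/hadoop-0.20.2-core.jar
  cp hbase-0.90.0/lib/hadoop-core-0.20-append-r1056497.jar hadoop-0.20.2/
)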

If that was the right procedure, this could be explained a little more 
clearly at (http://hbase.apache.org/notsoquick.html#hadoop).  The first 
thing that set me on the wrong path was the statement that I have to 
either build my own Hadoop or use Cloudera; apparently that's not right, I 
can use a built release if I replace one jar in it.  That web page says "
If you want to run HBase on an Hadoop cluster that is other than a version 
made from branch-0.20.append " (which is my case, using a standard 
release) "you must replace the hadoop jar found in the HBase lib directory 
with the hadoop jar you are running out on your cluster to avoid version 
mismatch issues" --- but I think it's the other way around in my case.

Thanks,
Mike Spreitzer
SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
Office phone: +1-914-784-6424 (IBM T/L 863-)
AOL Instant Messaging: M1k3Sprtzr



From:   Stack <st...@duboce.net>
To:     user@hbase.apache.org
Date:   02/07/2011 12:07 PM
Subject:        Re: Using the Hadoop bundled in the lib directory of HBase
Sent by:        saint.ack@gmail.com



On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
> Hadoop core contained hdfs / mapreduce, all bundled together until
> 0.20.x.  Since 0.21, it got forked into common, hdfs and mapreduce
> sub-projects.
>

What Vijay said.

> In this case - what is needed is a 0.20.2 download from hadoop and
> configuring the same. The hadoop-0.20.2.jar needs to be replaced by the
> patched hadoop-0.20.2-xxxx.jar available in the HBASE_HOME/lib directory,
> to make things work.
>

This is a little off.

Here is our Hadoop story for 0.90.0:
http://hbase.apache.org/notsoquick.html#hadoop

It links to the branch.   If you need instruction on how to check out
and build, just say (do we need to add pointers to book?)

St.Ack


Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Stack <st...@duboce.net>.
On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
> Hadoop core contained hdfs / mapreduce, all bundled together until 0.20.x.
> Since 0.21, it got forked into common, hdfs and mapreduce sub-projects.
>

What Vijay said.

> In this case - what is needed is a 0.20.2 download from hadoop and configuring
> the same. The hadoop-0.20.2.jar needs to be replaced by the patched
> hadoop-0.20.2-xxxx.jar available in the HBASE_HOME/lib directory, to make
> things work.
>

This is a little off.

Here is our Hadoop story for 0.90.0:
http://hbase.apache.org/notsoquick.html#hadoop

It links to the branch.   If you need instruction on how to check out
and build, just say (do we need to add pointers to book?)

St.Ack

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Vijay Raj <vi...@sargasdata.com>.

----- Original Message ----
> From: Mike Spreitzer <ms...@us.ibm.com>
> To: user@hbase.apache.org
> Sent: Sun, February 6, 2011 9:12:18 PM
> Subject: Re: Using the Hadoop bundled in the lib directory of HBase
> 
> OK, I am building the 0.20-append branch of hadoop-common.  Do I then have 
> to build hadoop-hdfs or can I use a pre-built release of hadoop-hdfs?  If
> the latter, where would I find such a thing?  When I try following the
> links (http://hadoop.apache.org/hdfs/ -> 
> http://hadoop.apache.org/hdfs/releases.html -> 
> http://hadoop.apache.org/hdfs/releases.html#Download -> 
> http://www.apache.org/dyn/closer.cgi/hadoop/core/ ) I get to releases of 
> hadoop/core --- what is that?

Hadoop core contained hdfs / mapreduce, all bundled together until 0.20.x.
Since 0.21, it got forked into common, hdfs and mapreduce sub-projects.

In this case - what is needed is a 0.20.2 download from hadoop and configuring 
the same. The hadoop-0.20.2.jar needs to be replaced by the patched
hadoop-0.20.2-xxxx.jar available in the HBASE_HOME/lib directory, to make
things work.



> 
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
> 
> 
> 
> From:   Norbert Burger <no...@gmail.com>
> To:     user@hbase.apache.org
> Date:   02/05/2011 09:51 AM
> Subject:        Re: Using the Hadoop bundled in the lib directory of HBase
> 
> 
> 
> Mike, you'll also need access to an installation of Hadoop, whether
> this is on the same machines as your HBase install (common), or somewhere
> else.  Often, people install Hadoop first and then layer HBase over it.
> 
> HBase depends on core Hadoop functionality like HDFS, and uses the Hadoop
> JAR in lib/ to support this.  But this is library code only; what you're
> missing is the rest of the Hadoop ecosystem (config files, directory
> structure, command-line tools, etc.)
> 
> Norbert
> 
> On Sat, Feb 5, 2011 at 9:21 AM, Ted Yu <yu...@gmail.com> wrote:
> 
> >  On a related note:
> > http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
> > http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled
> > out.
> >
> > On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com>
> > wrote:
> >
> > > Hi, I'm new to HBase and have a stupid question about its dependency
> > > on Hadoop.  Section 1.3.1.2 (http://hbase.apache.org/notsoquick.html)
> > > says there is an "instance" of Hadoop in the lib directory of HBase.
> > > What exactly is meant by "instance"?  Is it all I need, or do I need
> > > to get a "full" copy of Hadoop from elsewhere?  If HBase already has
> > > all I need, I am having trouble finding it.  The Hadoop instructions
> > > refer to commands, for example, that I can't find.
> > >
> > > Thanks,
> > > Mike
> >
> 
> 

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Mike Spreitzer <ms...@us.ibm.com>.
OK, I am building the 0.20-append branch of hadoop-common.  Do I then have 
to build hadoop-hdfs or can I use a pre-built release of hadoop-hdfs?  If 
the latter, where would I find such a thing?  When I try following the 
links (http://hadoop.apache.org/hdfs/ -> 
http://hadoop.apache.org/hdfs/releases.html -> 
http://hadoop.apache.org/hdfs/releases.html#Download -> 
http://www.apache.org/dyn/closer.cgi/hadoop/core/ ) I get to releases of 
hadoop/core --- what is that?

Thanks,
Mike Spreitzer
SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
Office phone: +1-914-784-6424 (IBM T/L 863-)
AOL Instant Messaging: M1k3Sprtzr



From:   Norbert Burger <no...@gmail.com>
To:     user@hbase.apache.org
Date:   02/05/2011 09:51 AM
Subject:        Re: Using the Hadoop bundled in the lib directory of HBase



Mike, you'll also need access to an installation of Hadoop, whether
this is on the same machines as your HBase install (common), or somewhere
else.  Often, people install Hadoop first and then layer HBase over it.

HBase depends on core Hadoop functionality like HDFS, and uses the Hadoop
JAR in lib/ to support this.  But this is library code only; what you're
missing is the rest of the Hadoop ecosystem (config files, directory
structure, command-line tools, etc.)

Norbert

On Sat, Feb 5, 2011 at 9:21 AM, Ted Yu <yu...@gmail.com> wrote:

> On a related note:
> http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
> http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled
> out.
>
> On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>
> > Hi, I'm new to HBase and have a stupid question about its dependency
> > on Hadoop.  Section 1.3.1.2 (http://hbase.apache.org/notsoquick.html)
> > says there is an "instance" of Hadoop in the lib directory of HBase.
> > What exactly is meant by "instance"?  Is it all I need, or do I need
> > to get a "full" copy of Hadoop from elsewhere?  If HBase already has
> > all I need, I am having trouble finding it.  The Hadoop instructions
> > refer to commands, for example, that I can't find.
> >
> > Thanks,
> > Mike
>


Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Norbert Burger <no...@gmail.com>.
Mike, you'll also need access to an installation of Hadoop, whether
this is on the same machines as your HBase install (common), or somewhere
else.  Often, people install Hadoop first and then layer HBase over it.

HBase depends on core Hadoop functionality like HDFS, and uses the Hadoop
JAR in lib/ to support this.  But this is library code only; what you're
missing is the rest of the Hadoop ecosystem (config files, directory structure,
command-line tools, etc.)
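
For example, a full Hadoop 0.20.x install unpacks to something like the
following, none of which ships inside the HBase lib/ jar:

  hadoop-0.20.2/
    bin/hadoop           # the command-line tool the Hadoop docs refer to
    bin/start-dfs.sh     # daemon start/stop scripts
    conf/core-site.xml   # site configuration
    conf/hadoop-env.sh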

Norbert

On Sat, Feb 5, 2011 at 9:21 AM, Ted Yu <yu...@gmail.com> wrote:

> On a related note:
> http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
> http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled
> out.
>
> On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>
> > Hi, I'm new to HBase and have a stupid question about its dependency on
> > Hadoop.  Section 1.3.1.2 of (http://hbase.apache.org/notsoquick.html)
> says
> > there is an "instance" of Hadoop in the lib directory of HBase.  What
> > exactly is meant by "instance"?  Is it all I need, or do I need to get a
> > "full" copy of Hadoop from elsewhere?  If HBase already has all I need, I
> > am having trouble finding it.  The Hadoop instructions refer to commands,
> > for example, that I can't find.
> >
> > Thanks,
> > Mike
>

Re: Using the Hadoop bundled in the lib directory of HBase

Posted by Ted Yu <yu...@gmail.com>.
On a related note:
http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled out.

On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com> wrote:

> Hi, I'm new to HBase and have a stupid question about its dependency on
> Hadoop.  Section 1.3.1.2 (http://hbase.apache.org/notsoquick.html) says
> there is an "instance" of Hadoop in the lib directory of HBase.  What
> exactly is meant by "instance"?  Is it all I need, or do I need to get a
> "full" copy of Hadoop from elsewhere?  If HBase already has all I need, I
> am having trouble finding it.  The Hadoop instructions refer to commands,
> for example, that I can't find.
>
> Thanks,
> Mike