Posted to user@hbase.apache.org by Mike Spreitzer <ms...@us.ibm.com> on 2011/02/05 08:47:46 UTC
Using the Hadoop bundled in the lib directory of HBase
Hi, I'm new to HBase and have a stupid question about its dependency on
Hadoop. Section 1.3.1.2 of (http://hbase.apache.org/notsoquick.html) says
there is an "instance" of Hadoop in the lib directory of HBase. What
exactly is meant by "instance"? Is it all I need, or do I need to get a
"full" copy of Hadoop from elsewhere? If HBase already has all I need, I
am having trouble finding it. The Hadoop instructions refer to commands,
for example, that I can't find.
Thanks,
Mike
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Norbert Burger <no...@gmail.com>.
For testing purposes, it is possible to run HBase without HDFS and the
benefits of durability. Benoit Sigoure has a good writeup here:
http://opentsdb.net/setup-hbase.html
But for larger deployments, HDFS is the way to go. Another approach you
might consider is the pseudo-distributed option, where you get Hadoop+HBase
running all on the same node (http://goo.gl/Rytnp).
Norbert
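Running HBase without HDFS, as described above, amounts to pointing HBase at the local filesystem. A minimal hbase-site.xml sketch (the rootdir path is illustrative, and data stored this way is not durable):

```xml
<?xml version="1.0"?>
<!-- Minimal standalone configuration: local filesystem, no HDFS.
     The path is illustrative; pick any writable local directory. -->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///tmp/hbase-test</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
</configuration>
```

For pseudo-distributed mode, hbase.rootdir would instead point at an hdfs:// URL and hbase.cluster.distributed would be true.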
On Sat, Feb 5, 2011 at 9:51 AM, Norbert Burger <no...@gmail.com> wrote:
> Mike, you'll also need access to an installation of Hadoop, whether
> this is on the same machines as your HBase install (common), or somewhere
> else. Often, people install Hadoop first and then layer HBase over it.
>
> HBase depends on core Hadoop functionality like HDFS, and uses the Hadoop
> JAR in lib/ to support this. But this is library code only; what you're
> missing is the rest of the Hadoop ecosystem (config files, directory structure,
> command-line tools, etc.)
>
> Norbert
>
>
> On Sat, Feb 5, 2011 at 9:21 AM, Ted Yu <yu...@gmail.com> wrote:
>
>> On a related note:
>> http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
>> http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled
>> out.
>>
>> On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com>
>> wrote:
>>
>> > Hi, I'm new to HBase and have a stupid question about its dependency on
>> > Hadoop. Section 1.3.1.2 of (http://hbase.apache.org/notsoquick.html)
>> > says there is an "instance" of Hadoop in the lib directory of HBase.
>> > What exactly is meant by "instance"? Is it all I need, or do I need to
>> > get a "full" copy of Hadoop from elsewhere? If HBase already has all I
>> > need, I am having trouble finding it. The Hadoop instructions refer to
>> > commands, for example, that I can't find.
>> >
>> > Thanks,
>> > Mike
>>
>
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Stack <st...@duboce.net>.
Oh, sorry. Thanks for the noob POV Joe. Invaluable. Let me have a go at it.
St.Ack
On Mon, Feb 7, 2011 at 9:21 AM, Joe Pallas <pa...@cs.stanford.edu> wrote:
>
> On Feb 7, 2011, at 9:02 AM, Stack wrote:
>
>> Here is our Hadoop story for 0.90.0:
>> http://hbase.apache.org/notsoquick.html#hadoop
>
> And for someone who is new to HBase and Hadoop, those two paragraphs are immensely confusing. First it says you have to build your own Hadoop, and then it says a copy of Hadoop is bundled with HBase.
>
> The poor newbie is at a total loss. Do I have to build my own copy of Hadoop or not? Can I download the latest 0.20.x Hadoop and just replace the hadoop-core jar with the one from the HBase distribution? And then all those other versions get mentioned, which just makes things even more confusing.
>
> I think this could be clearer.
> joe
>
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Joe Pallas <pa...@cs.stanford.edu>.
On Feb 7, 2011, at 9:02 AM, Stack wrote:
> Here is our Hadoop story for 0.90.0:
> http://hbase.apache.org/notsoquick.html#hadoop
And for someone who is new to HBase and Hadoop, those two paragraphs are immensely confusing. First it says you have to build your own Hadoop, and then it says a copy of Hadoop is bundled with HBase.
The poor newbie is at a total loss. Do I have to build my own copy of Hadoop or not? Can I download the latest 0.20.x Hadoop and just replace the hadoop-core jar with the one from the HBase distribution? And then all those other versions get mentioned, which just makes things even more confusing.
I think this could be clearer.
joe
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Ryan Rawson <ry...@gmail.com>.
Hey guys,
If you are running on hadoop 0.20.2 you are going to lose data when
you crash. So don't do it :-)
You will need to either use a cdh3 beta (we use b2), or build the
hadoop-20-append branch. We have built the hadoop-20-append tip and
included the JAR with the default distribution. It is not compatible
with hadoop 0.20.2 (stock/native) nor cdh3 beta*.
It's really confusing, but the basic fact is there is no ASF released
version of hadoop that runs HBase properly. My best suggestion is to
complain to general@, and file JIRAs if you can. It helps when users
complain, since I think everyone has gone tone deaf from me
complaining :-)
-ryan
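One concrete symptom of the incompatibility Ryan describes is that the hadoop-core jar name under HADOOP_HOME differs from the one under HBASE_HOME/lib. A rough shell sketch of that check, using dummy files in a temp directory to stand in for real installs (all paths and version strings here are illustrative):

```shell
# Create stand-ins for a stock Hadoop 0.20.2 install and an HBase 0.90.0 install.
root=$(mktemp -d)
mkdir -p "$root/hadoop-0.20.2" "$root/hbase-0.90.0/lib"
touch "$root/hadoop-0.20.2/hadoop-0.20.2-core.jar"
touch "$root/hbase-0.90.0/lib/hadoop-core-0.20-append-r1056497.jar"

# Compare the hadoop core jar each side carries; different names suggest a
# version mismatch between the cluster and what HBase was built against.
hadoop_jar=$(basename "$root"/hadoop-0.20.2/hadoop-*core*.jar)
hbase_jar=$(basename "$root"/hbase-0.90.0/lib/hadoop-core-*.jar)
if [ "$hadoop_jar" != "$hbase_jar" ]; then
  echo "mismatch: $hadoop_jar vs $hbase_jar"
fi
```

A name match is only a rough heuristic, of course; the jar contents are what actually matter.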
On Thu, Feb 10, 2011 at 6:13 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> Yes, you've got it right. Let me emphasize that what I did was *much*
> easier than the other way around --- which I tried first and in which I
> had problems. The Cloudera release specifically depends on Sun security
> classes that are not in the Java (IBM's) that I used. I tried building
> Hadoop's 0.20-append branch but had some difficulties and it took a long
> time. The various build instructions I found all talked about running the
> regression test suite once or twice --- and a single run takes hours. The
> first time I ran it, from a clean download and build, it had problems. And
> the instructions are confusing regarding building the native part. The
> instructions seem to say you can build and test without building the
> native support; how can that be?
>
> Regards,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>
>
>
> From: Suraj Varma <sv...@gmail.com>
> To: user@hbase.apache.org
> Date: 02/10/2011 08:02 AM
> Subject: Re: Using the Hadoop bundled in the lib directory of HBase
>
>
>
> This procedure does seem a bit opposite of what I've seen folks recommend
> (and the way it is documented in the notsoquick.html). But it might be
> equivalent in this specific case (not completely sure as scripts etc are
> different). I'll let one of the experts comment on that.
>
> If I understood you right, you took the hadoop 0.20.2 release (which does
> not have append support needed to prevent data loss in some situations)
> and installed that. Next you took hbase 0.90.0's hadoop-core.jar (which
> is from a separately built branch-0.20-append) and copied that over to
> the hadoop installation.
>
> What folks usually do is copy over the hadoop install's jar file over to
> hbase - so, if you have a Cloudera install, you would copy over the
> Cloudera-built hadoop jar over to your hbase install (replacing the
> hbase hadoop jar).
>
> I'm guessing that in your specific situation since branch-0.20-append and
> hadoop 0.20.2 are fairly close (other than the append changes), it "might"
> work. But - not sure if this is what folks normally do ...
>
> Can someone clarify this? The above procedure Mike followed certainly is
> much simpler in this specific case as he doesn't have to build out his own
> branch-0.20-append and rather "reuse" the one that was built for
> hbase-0.90.
>
> Thanks,
> --Suraj
>
>
> On Mon, Feb 7, 2011 at 9:17 AM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>
>> After a few false starts, what I have done is: fetch the 0.20.2 release
>> of hadoop core (which appears to be common + dfs + mapred), install it,
>> delete hadoop/hadoop-core.jar, unpack the hbase distribution, copy its
>> lib/hadoop-core-...jar file to hadoop/hadoop-...-core.jar, configure,
>> and test. It seems to be working. Is that what you expected? Should I
>> expect subtle problems?
>>
>> If that was the right procedure, this could be explained a little more
>> clearly at (http://hbase.apache.org/notsoquick.html#hadoop). The first
>> thing that set me on the wrong path was the statement that I have to
>> either build my own Hadoop or use Cloudera; apparently that's not right,
>> I can use a built release if I replace one jar in it. That web page says
>> "If you want to run HBase on an Hadoop cluster that is other than a
>> version made from branch-0.20.append" (which is my case, using a
>> standard release) "you must replace the hadoop jar found in the HBase
>> lib directory with the hadoop jar you are running out on your cluster to
>> avoid version mismatch issues" --- but I think it's the other way around
>> in my case.
>>
>> Thanks,
>> Mike Spreitzer
>> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
>> Office phone: +1-914-784-6424 (IBM T/L 863-)
>> AOL Instant Messaging: M1k3Sprtzr
>>
>>
>>
>> From: Stack <st...@duboce.net>
>> To: user@hbase.apache.org
>> Date: 02/07/2011 12:07 PM
>> Subject: Re: Using the Hadoop bundled in the lib directory of
> HBase
>> Sent by: saint.ack@gmail.com
>>
>>
>>
>> On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
>> > Hadoop core contained hdfs / mapreduce, all bundled together until
>> > 0.20.x. Since 0.21, it got forked into common, hdfs and mapreduce
>> > sub-projects.
>> >
>>
>> What Vijay said.
>>
>> > In this case - what is needed is a 0.20.2 download from hadoop and
>> > configuring the same. The hadoop-0.20.2.jar needs to be replaced by
>> > the patched hadoop-0.20.2-xxxx.jar available in HBASE_HOME/lib/*.jar
>> > directory, to make things work.
>> >
>>
>> This is a little off.
>>
>> Here is our Hadoop story for 0.90.0:
>> http://hbase.apache.org/notsoquick.html#hadoop
>>
>> It links to the branch. If you need instruction on how to check out
>> and build, just say (do we need to add pointers to book?)
>>
>> St.Ack
>>
>>
>
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Mike Spreitzer <ms...@us.ibm.com>.
Yes, you've got it right. Let me emphasize that what I did was *much*
easier than the other way around --- which I tried first and in which I
had problems. The Cloudera release specifically depends on Sun security
classes that are not in the Java (IBM's) that I used. I tried building
Hadoop's 0.20-append branch but had some difficulties and it took a long
time. The various build instructions I found all talked about running the
regression test suite once or twice --- and a single run takes hours. The
first time I ran it, from a clean download and build, it had problems. And
the instructions are confusing regarding building the native part. The
instructions seem to say you can build and test without building the
native support; how can that be?
Regards,
Mike Spreitzer
SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
Office phone: +1-914-784-6424 (IBM T/L 863-)
AOL Instant Messaging: M1k3Sprtzr
From: Suraj Varma <sv...@gmail.com>
To: user@hbase.apache.org
Date: 02/10/2011 08:02 AM
Subject: Re: Using the Hadoop bundled in the lib directory of HBase
This procedure does seem a bit opposite of what I've seen folks recommend
> (and the way it is documented in the notsoquick.html). But it might be
equivalent in this specific case (not completely sure as scripts etc are
different). I'll let one of the experts comment on that.
If I understood you right, you took the hadoop 0.20.2 release (which does
not have append support needed to prevent data loss in some situations)
and installed that. Next you took hbase 0.90.0's hadoop-core.jar (which is
from a separately built branch-0.20-append) and copied that over to the
hadoop installation.
What folks usually do is copy over the hadoop install's jar file over to
hbase - so, if you have a Cloudera install, you would copy over the
Cloudera-built hadoop jar over to your hbase install (replacing the hbase
hadoop jar).
I'm guessing that in your specific situation since branch-0.20-append and
hadoop 0.20.2 are fairly close (other than the append changes), it "might"
work. But - not sure if this is what folks normally do ...
Can someone clarify this? The above procedure Mike followed certainly is
much simpler in this specific case as he doesn't have to build out his own
branch-0.20-append and rather "reuse" the one that was built for
hbase-0.90.
Thanks,
--Suraj
On Mon, Feb 7, 2011 at 9:17 AM, Mike Spreitzer <ms...@us.ibm.com>
wrote:
> After a few false starts, what I have done is: fetch the 0.20.2 release
> of hadoop core (which appears to be common + dfs + mapred), install it,
> delete hadoop/hadoop-core.jar, unpack the hbase distribution, copy its
> lib/hadoop-core-...jar file to hadoop/hadoop-...-core.jar, configure,
> and test. It seems to be working. Is that what you expected? Should I
> expect subtle problems?
>
> If that was the right procedure, this could be explained a little more
> clearly at (http://hbase.apache.org/notsoquick.html#hadoop). The first
> thing that set me on the wrong path was the statement that I have to
> either build my own Hadoop or use Cloudera; apparently that's not right,
> I can use a built release if I replace one jar in it. That web page says
> "If you want to run HBase on an Hadoop cluster that is other than a
> version made from branch-0.20.append" (which is my case, using a
> standard release) "you must replace the hadoop jar found in the HBase
> lib directory with the hadoop jar you are running out on your cluster to
> avoid version mismatch issues" --- but I think it's the other way around
> in my case.
>
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>
>
>
> From: Stack <st...@duboce.net>
> To: user@hbase.apache.org
> Date: 02/07/2011 12:07 PM
> Subject: Re: Using the Hadoop bundled in the lib directory of
HBase
> Sent by: saint.ack@gmail.com
>
>
>
> On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
> > Hadoop core contained hdfs / mapreduce, all bundled together until
> > 0.20.x. Since 0.21, it got forked into common, hdfs and mapreduce
> > sub-projects.
> >
>
> What Vijay said.
>
> > In this case - what is needed is a 0.20.2 download from hadoop and
> > configuring the same. The hadoop-0.20.2.jar needs to be replaced by
> > the patched hadoop-0.20.2-xxxx.jar available in HBASE_HOME/lib/*.jar
> > directory, to make things work.
> >
>
> This is a little off.
>
> Here is our Hadoop story for 0.90.0:
> http://hbase.apache.org/notsoquick.html#hadoop
>
> It links to the branch. If you need instruction on how to check out
> and build, just say (do we need to add pointers to book?)
>
> St.Ack
>
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Suraj Varma <sv...@gmail.com>.
This procedure does seem a bit opposite of what I've seen folks recommend
(and the way it is documented in the notsoquick.html). But it might be
equivalent in this specific case (not completely sure as scripts etc are
different). I'll let one of the experts comment on that.
If I understood you right, you took the hadoop 0.20.2 release (which does
not have append support needed to prevent data loss in some situations) and
installed that. Next you took hbase 0.90.0's hadoop-core.jar (which is from
a separately built branch-0.20-append) and copied that over to the hadoop
installation.
What folks usually do is copy over the hadoop install's jar file over to
hbase - so, if you have a Cloudera install, you would copy over the
Cloudera-built hadoop jar over to your hbase install (replacing the hbase hadoop
jar).
I'm guessing that in your specific situation since branch-0.20-append and
hadoop 0.20.2 are fairly close (other than the append changes), it "might"
work. But - not sure if this is what folks normally do ...
Can someone clarify this? The above procedure Mike followed certainly is
much simpler in this specific case as he doesn't have to build out his own
branch-0.20-append and rather "reuse" the one that was built for hbase-0.90.
Thanks,
--Suraj
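The usual direction described above (the cluster's Hadoop jar copied over the one HBase bundles) can be sketched in shell. This is only an illustration: dummy files in a temp directory stand in for the real installs, and the paths and jar names are illustrative.

```shell
# Stand-ins for a Hadoop install and an HBase install with its bundled jar.
root=$(mktemp -d)
mkdir -p "$root/hadoop" "$root/hbase/lib"
touch "$root/hadoop/hadoop-0.20.2-core.jar"                  # jar the cluster runs
touch "$root/hbase/lib/hadoop-core-0.20-append-r1056497.jar" # jar HBase ships

# Remove the jar HBase bundles and drop in the cluster's jar instead,
# so client and cluster agree on the Hadoop version.
rm "$root"/hbase/lib/hadoop-core-*.jar
cp "$root/hadoop/hadoop-0.20.2-core.jar" "$root/hbase/lib/"
ls "$root/hbase/lib"
```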
On Mon, Feb 7, 2011 at 9:17 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> After a few false starts, what I have done is: fetch the 0.20.2 release of
> hadoop core (which appears to be common + dfs + mapred), install it,
> delete hadoop/hadoop-core.jar, unpack the hbase distribution, copy its
> lib/hadoop-core-...jar file to hadoop/hadoop-...-core.jar, configure, and
> test. It seems to be working. Is that what you expected? Should I
> expect subtle problems?
>
> If that was the right procedure, this could be explained a little more
> clearly at (http://hbase.apache.org/notsoquick.html#hadoop). The first
> thing that set me on the wrong path was the statement that I have to
> either build my own Hadoop or use Cloudera; apparently that's not right, I
> can use a built release if I replace one jar in it. That web page says "
> If you want to run HBase on an Hadoop cluster that is other than a version
> made from branch-0.20.append " (which is my case, using a standard
> release) "you must replace the hadoop jar found in the HBase lib directory
> with the hadoop jar you are running out on your cluster to avoid version
> mismatch issues" --- but I think it's the other way around in my case.
>
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>
>
>
> From: Stack <st...@duboce.net>
> To: user@hbase.apache.org
> Date: 02/07/2011 12:07 PM
> Subject: Re: Using the Hadoop bundled in the lib directory of HBase
> Sent by: saint.ack@gmail.com
>
>
>
> On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
> > Hadoop core contained hdfs / mapreduce, all bundled together until
> > 0.20.x. Since 0.21, it got forked into common, hdfs and mapreduce
> > sub-projects.
> >
>
> What Vijay said.
>
> > In this case - what is needed is a 0.20.2 download from hadoop and
> > configuring the same. The hadoop-0.20.2.jar needs to be replaced by
> > the patched hadoop-0.20.2-xxxx.jar available in HBASE_HOME/lib/*.jar
> > directory, to make things work.
> >
>
> This is a little off.
>
> Here is our Hadoop story for 0.90.0:
> http://hbase.apache.org/notsoquick.html#hadoop
>
> It links to the branch. If you need instruction on how to check out
> and build, just say (do we need to add pointers to book?)
>
> St.Ack
>
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Mike Spreitzer <ms...@us.ibm.com>.
I do not see a BlockChannel.java in
http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/
--- nor do I see any references in there to BlockChannel.
Thanks,
Mike Spreitzer
From: Ryan Rawson <ry...@gmail.com>
To: user@hbase.apache.org
Date: 02/13/2011 03:51 PM
Subject: Re: Using the Hadoop bundled in the lib directory of HBase
On Sun, Feb 13, 2011 at 8:29 AM, Mike Spreitzer <ms...@us.ibm.com>
wrote:
> Yes, I simply took the Hadoop 0.20.2 release, deleted its
> hadoop-core.jar, and replaced it with the contents of
> lib/hadoop-core-0.20-append-r1056497.jar from hbase.
>
> I'm not sure what to do with "this approach might work". How can I know
> if it really does?
I'm not sure, maybe it'll work great until one day in a month everything
will crash and burn due to <thing no one could have guessed>. Perhaps
someone with extensive hdfs code experience might be able to tell you.
>
> BTW, I see that HBase's lib/hadoop-core-0.20-append-r1056497.jar
> contains org/apache/hadoop/hdfs/server/datanode/BlockChannel.class but I
> am having trouble figuring out why. From where in SVN does that come?
Is it not in the append-20-branch ?
>
> Thanks,
> Mike Spreitzer
>
>
>
>
> From: Ryan Rawson <ry...@gmail.com>
> To: user@hbase.apache.org
> Cc: stack <sa...@gmail.com>
> Date: 02/13/2011 02:33 AM
> Subject: Re: Using the Hadoop bundled in the lib directory of
HBase
>
>
>
> If you are taking the jar that we ship and slamming it in a hadoop
> 0.20.2 based distro that might work. I'm not sure if there are any
> differences than pure code (which would then be expressed in the jar
> only), so this approach might work.
>
> You could also check out the revision at which we built our JAR and
> try that. By default you need apache forrest (argh) and java5 to
> build hadoop (ARGH) which makes it not buildable on OSX.
>
> Building sucks, there are no short cuts. Good luck out there!
> -ryan
>
> On Sat, Feb 12, 2011 at 11:24 PM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>> Let me be clear about the amount of testing I did: extremely little. I
>> should also point out that at first I did not appreciate fully the
>> meaning of your earlier comment to Vijay saying "this is a little off"
>> --- I now realize you were in fact saying that Vijay told me to do
>> things backward.
>>
>> Since my note saying the backward approach worked, two things have
>> happened: (1) someone made a link to it from
>> (http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied
>> saying, in no uncertain terms, that the backward approach is unreliable.
>> I would not have noticed a reliability issue in the negligible testing I
>> did.
>>
>> Having gotten two opposite opinions, I am now unsure of the truth of the
>> matter. Is there any chance of Vijay and Ryan agreeing?
>>
>> Thanks,
>> Mike Spreitzer
>> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
>> Office phone: +1-914-784-6424 (IBM T/L 863-)
>> AOL Instant Messaging: M1k3Sprtzr
>>
>
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Ryan Rawson <ry...@gmail.com>.
On Sun, Feb 13, 2011 at 8:29 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> Yes, I simply took the Hadoop 0.20.2 release, deleted its hadoop-core.jar,
> and replaced it with the contents of
> lib/hadoop-core-0.20-append-r1056497.jar from hbase.
>
> I'm not sure what to do with "this approach might work". How can I know
> if it really does?
I'm not sure, maybe it'll work great until one day in a month everything
will crash and burn due to <thing no one could have guessed>. Perhaps
someone with extensive hdfs code experience might be able to tell you.
>
> BTW, I see that HBase's lib/hadoop-core-0.20-append-r1056497.jar contains
> org/apache/hadoop/hdfs/server/datanode/BlockChannel.class but I am having
> trouble figuring out why. From where in SVN does that come?
Is it not in the append-20-branch ?
>
> Thanks,
> Mike Spreitzer
>
>
>
>
> From: Ryan Rawson <ry...@gmail.com>
> To: user@hbase.apache.org
> Cc: stack <sa...@gmail.com>
> Date: 02/13/2011 02:33 AM
> Subject: Re: Using the Hadoop bundled in the lib directory of HBase
>
>
>
> If you are taking the jar that we ship and slamming it in a hadoop
> 0.20.2 based distro that might work. I'm not sure if there are any
> differences than pure code (which would then be expressed in the jar
> only), so this approach might work.
>
> You could also check out the revision at which we built our JAR and
> try that. By default you need apache forrest (argh) and java5 to
> build hadoop (ARGH) which makes it not buildable on OSX.
>
> Building sucks, there are no short cuts. Good luck out there!
> -ryan
>
> On Sat, Feb 12, 2011 at 11:24 PM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>> Let me be clear about the amount of testing I did: extremely little. I
>> should also point out that at first I did not appreciate fully the
>> meaning of your earlier comment to Vijay saying "this is a little off"
>> --- I now realize you were in fact saying that Vijay told me to do
>> things backward.
>>
>> Since my note saying the backward approach worked, two things have
>> happened: (1) someone made a link to it from
>> (http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied
>> saying, in no uncertain terms, that the backward approach is unreliable.
>> I would not have noticed a reliability issue in the negligible testing I
>> did.
>>
>> Having gotten two opposite opinions, I am now unsure of the truth of the
>> matter. Is there any chance of Vijay and Ryan agreeing?
>>
>> Thanks,
>> Mike Spreitzer
>> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
>> Office phone: +1-914-784-6424 (IBM T/L 863-)
>> AOL Instant Messaging: M1k3Sprtzr
>>
>
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Mike Spreitzer <ms...@us.ibm.com>.
Yes, I simply took the Hadoop 0.20.2 release, deleted its hadoop-core.jar,
and replaced it with the contents of
lib/hadoop-core-0.20-append-r1056497.jar from hbase.
I'm not sure what to do with "this approach might work". How can I know
if it really does?
BTW, I see that HBase's lib/hadoop-core-0.20-append-r1056497.jar contains
org/apache/hadoop/hdfs/server/datanode/BlockChannel.class but I am having
trouble figuring out why. From where in SVN does that come?
Thanks,
Mike Spreitzer
From: Ryan Rawson <ry...@gmail.com>
To: user@hbase.apache.org
Cc: stack <sa...@gmail.com>
Date: 02/13/2011 02:33 AM
Subject: Re: Using the Hadoop bundled in the lib directory of HBase
If you are taking the jar that we ship and slamming it in a hadoop
0.20.2 based distro that might work. I'm not sure if there are any
differences than pure code (which would then be expressed in the jar
only), so this approach might work.
You could also check out the revision at which we built our JAR and
try that. By default you need apache forrest (argh) and java5 to
build hadoop (ARGH) which makes it not buildable on OSX.
Building sucks, there are no short cuts. Good luck out there!
-ryan
On Sat, Feb 12, 2011 at 11:24 PM, Mike Spreitzer <ms...@us.ibm.com>
wrote:
> Let me be clear about the amount of testing I did: extremely little. I
> should also point out that at first I did not appreciate fully the
> meaning of your earlier comment to Vijay saying "this is a little off"
> --- I now realize you were in fact saying that Vijay told me to do
> things backward.
>
> Since my note saying the backward approach worked, two things have
> happened: (1) someone made a link to it from
> (http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied
> saying, in no uncertain terms, that the backward approach is unreliable.
> I would not have noticed a reliability issue in the negligible testing I
> did.
>
> Having gotten two opposite opinions, I am now unsure of the truth of the
> matter. Is there any chance of Vijay and Ryan agreeing?
>
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Ryan Rawson <ry...@gmail.com>.
If you are taking the jar that we ship and slamming it in a hadoop
0.20.2 based distro that might work. I'm not sure if there are any
differences than pure code (which would then be expressed in the jar
only), so this approach might work.
You could also check out the revision at which we built our JAR and
try that. By default you need apache forrest (argh) and java5 to
build hadoop (ARGH) which makes it not buildable on OSX.
Building sucks, there are no short cuts. Good luck out there!
-ryan
On Sat, Feb 12, 2011 at 11:24 PM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> Let me be clear about the amount of testing I did: extremely little. I
> should also point out that at first I did not appreciate fully the meaning
> of your earlier comment to Vijay saying "this is a little off" --- I now
> realize you were in fact saying that Vijay told me to do things backward.
>
> Since my note saying the backward approach worked, two things have
> happened: (1) someone made a link to it from (
> http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied
> saying, in no uncertain terms, that the backward approach is unreliable. I
> would not have noticed a reliability issue in the negligible testing I
> did.
>
> Having gotten two opposite opinions, I am now unsure of the truth of the
> matter. Is there any chance of Vijay and Ryan agreeing?
>
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Mike Spreitzer <ms...@us.ibm.com>.
Let me be clear about the amount of testing I did: extremely little. I
should also point out that at first I did not appreciate fully the meaning
of your earlier comment to Vijay saying "this is a little off" --- I now
realize you were in fact saying that Vijay told me to do things backward.
Since my note saying the backward approach worked, two things have
happened: (1) someone made a link to it from (
http://hbase.apache.org/notsoquick.html), and (2) Ryan Rawson replied
saying, in no uncertain terms, that the backward approach is unreliable. I
would not have noticed a reliability issue in the negligible testing I
did.
Having gotten two opposite opinions, I am now unsure of the truth of the
matter. Is there any chance of Vijay and Ryan agreeing?
Thanks,
Mike Spreitzer
SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
Office phone: +1-914-784-6424 (IBM T/L 863-)
AOL Instant Messaging: M1k3Sprtzr
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Mike Spreitzer <ms...@us.ibm.com>.
After a few false starts, what I have done is: fetch the 0.20.2 release of
hadoop core (which appears to be common + dfs + mapred), install it,
delete hadoop/hadoop-core.jar, unpack the hbase distribution, copy its
lib/hadoop-core-...jar file to hadoop/hadoop-...-core.jar, configure, and
test. It seems to be working. Is that what you expected? Should I
expect subtle problems?
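The steps in the preceding paragraph can be sketched as shell commands. This is only an illustration of the jar swap, with dummy files in a temp directory standing in for the real release tarballs; the version strings and paths are illustrative.

```shell
# Stand-ins for an unpacked hadoop 0.20.2 release and an hbase distribution.
root=$(mktemp -d)
mkdir -p "$root/hadoop-0.20.2" "$root/hbase-0.90.0/lib"
touch "$root/hadoop-0.20.2/hadoop-0.20.2-core.jar"
touch "$root/hbase-0.90.0/lib/hadoop-core-0.20-append-r1056497.jar"

# 1. Delete the stock hadoop core jar.
rm "$root/hadoop-0.20.2/hadoop-0.20.2-core.jar"
# 2. Copy in the append-branch jar that HBase ships, keeping a file name
#    that Hadoop's scripts will pick up on the classpath.
cp "$root/hbase-0.90.0/lib/hadoop-core-0.20-append-r1056497.jar" \
   "$root/hadoop-0.20.2/hadoop-0.20.2-core.jar"
ls "$root/hadoop-0.20.2"
```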
If that was the right procedure, this could be explained a little more
clearly at (http://hbase.apache.org/notsoquick.html#hadoop). The first
thing that set me on the wrong path was the statement that I have to
either build my own Hadoop or use Cloudera; apparently that's not right, I
can use a built release if I replace one jar in it. That web page says "
If you want to run HBase on an Hadoop cluster that is other than a version
made from branch-0.20.append " (which is my case, using a standard
release) "you must replace the hadoop jar found in the HBase lib directory
with the hadoop jar you are running out on your cluster to avoid version
mismatch issues" --- but I think it's the other way around in my case.
Thanks,
Mike Spreitzer
SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
Office phone: +1-914-784-6424 (IBM T/L 863-)
AOL Instant Messaging: M1k3Sprtzr
From: Stack <st...@duboce.net>
To: user@hbase.apache.org
Date: 02/07/2011 12:07 PM
Subject: Re: Using the Hadoop bundled in the lib directory of HBase
Sent by: saint.ack@gmail.com
On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
> Hadoop core contained hdfs / mapreduce, all bundled together until
> 0.20.x. Since 0.21, it got forked into common, hdfs and mapreduce
> sub-projects.
>
What Vijay said.
> In this case - what is needed is a 0.20.2 download from hadoop and
> configuring the same. The hadoop-0.20.2.jar needs to be replaced by the
> patched hadoop-0.20.2-xxxx.jar available in HBASE_HOME/lib/*.jar
> directory, to make things work.
>
This is a little off.
Here is our Hadoop story for 0.90.0:
http://hbase.apache.org/notsoquick.html#hadoop
It links to the branch. If you need instruction on how to check out
and build, just say (do we need to add pointers to book?)
St.Ack
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Stack <st...@duboce.net>.
On Sun, Feb 6, 2011 at 9:31 PM, Vijay Raj <vi...@sargasdata.com> wrote:
> Hadoop core contained hdfs / mapreduce , all bundled together until 0.20.x .
> Since 0.21, it got forked into common, hdfs and mapreduce sub-projects.
>
What Vijay said.
> In this case, what is needed is a 0.20.2 download from Hadoop, configured
> as usual. Its hadoop-0.20.2.jar then needs to be replaced by the patched
> hadoop-0.20.2-xxxx.jar available in the HBASE_HOME/lib/ directory, to make
> things work.
>
This is a little off.
Here is our Hadoop story for 0.90.0:
http://hbase.apache.org/notsoquick.html#hadoop
It links to the branch. If you need instructions on how to check out
and build, just say so (do we need to add pointers to the book?)
St.Ack
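For anyone following along, checking out and building the append branch might look roughly like the sketch below. Both the svn URL and the `ant jar` target are assumptions on my part, so confirm them against the page Stack links before relying on this.

```shell
# Hypothetical sketch of fetching and building branch-0.20-append.
# The repository URL and the ant target are guesses -- verify before use.
build_append_branch() {
  workdir=$1
  # Check out the append branch of Hadoop (URL assumed, not confirmed).
  svn checkout \
    http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append \
    "$workdir" || return 1
  # Build the jar; with a stock ant build it should land under build/.
  (cd "$workdir" && ant jar)
}
```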
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Vijay Raj <vi...@sargasdata.com>.
----- Original Message ----
> From: Mike Spreitzer <ms...@us.ibm.com>
> To: user@hbase.apache.org
> Sent: Sun, February 6, 2011 9:12:18 PM
> Subject: Re: Using the Hadoop bundled in the lib directory of HBase
>
> OK, I am building the 0.20-append branch of hadoop-common. Do I then have
> to build hadoop-hdfs or can I use a pre-built release of hadoop-hdfs? If
> the latter, where would I find such a thing? When I try following the
> links (http://hadoop.apache.org/hdfs/ ->
> http://hadoop.apache.org/hdfs/releases.html ->
> http://hadoop.apache.org/hdfs/releases.html#Download ->
> http://www.apache.org/dyn/closer.cgi/hadoop/core/ ) I get to releases of
> hadoop/core --- what is that?
Hadoop core contained HDFS and MapReduce, all bundled together, until 0.20.x.
Since 0.21, it has been split into the common, hdfs, and mapreduce sub-projects.
In this case, what is needed is a 0.20.2 download from Hadoop, configured
as usual. Its hadoop-0.20.2.jar then needs to be replaced by the patched
hadoop-0.20.2-xxxx.jar available in the HBASE_HOME/lib/ directory, to make
things work.
>
> Thanks,
> Mike Spreitzer
> SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
> Office phone: +1-914-784-6424 (IBM T/L 863-)
> AOL Instant Messaging: M1k3Sprtzr
>
>
>
> From: Norbert Burger <no...@gmail.com>
> To: user@hbase.apache.org
> Date: 02/05/2011 09:51 AM
> Subject: Re: Using the Hadoop bundled in the lib directory of HBase
>
>
>
> Mike, you'll also need access to an installation of Hadoop, whether this is
> on the same machines as your HBase install (common) or somewhere else.
> Often, people install Hadoop first and then layer HBase over it.
>
> HBase depends on core Hadoop functionality like HDFS, and uses the Hadoop
> JAR in lib/ to support this. But this is library code only; what you're
> missing is the rest of the Hadoop ecosystem (config files, directory
> structure, command-line tools, etc.)
>
> Norbert
>
> On Sat, Feb 5, 2011 at 9:21 AM, Ted Yu <yu...@gmail.com> wrote:
>
> > On a related note:
> > http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
> > http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled
> > out.
> >
> > On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com>
> > wrote:
> >
> > > Hi, I'm new to HBase and have a stupid question about its dependency on
> > > Hadoop. Section 1.3.1.2 of (http://hbase.apache.org/notsoquick.html) says
> > > there is an "instance" of Hadoop in the lib directory of HBase. What
> > > exactly is meant by "instance"? Is it all I need, or do I need to get a
> > > "full" copy of Hadoop from elsewhere? If HBase already has all I need, I
> > > am having trouble finding it. The Hadoop instructions refer to commands,
> > > for example, that I can't find.
> > >
> > > Thanks,
> > > Mike
> >
>
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Mike Spreitzer <ms...@us.ibm.com>.
OK, I am building the 0.20-append branch of hadoop-common. Do I then have
to build hadoop-hdfs or can I use a pre-built release of hadoop-hdfs? If
the latter, where would I find such a thing? When I try following the
links (http://hadoop.apache.org/hdfs/ ->
http://hadoop.apache.org/hdfs/releases.html ->
http://hadoop.apache.org/hdfs/releases.html#Download ->
http://www.apache.org/dyn/closer.cgi/hadoop/core/ ) I get to releases of
hadoop/core --- what is that?
Thanks,
Mike Spreitzer
SMTP: mspreitz@us.ibm.com, Lotus Notes: Mike Spreitzer/Watson/IBM
Office phone: +1-914-784-6424 (IBM T/L 863-)
AOL Instant Messaging: M1k3Sprtzr
From: Norbert Burger <no...@gmail.com>
To: user@hbase.apache.org
Date: 02/05/2011 09:51 AM
Subject: Re: Using the Hadoop bundled in the lib directory of HBase
Mike, you'll also need access to an installation of Hadoop, whether this is
on the same machines as your HBase install (common) or somewhere else.
Often, people install Hadoop first and then layer HBase over it.
HBase depends on core Hadoop functionality like HDFS, and uses the Hadoop
JAR in lib/ to support this. But this is library code only; what you're
missing is the rest of the Hadoop ecosystem (config files, directory
structure, command-line tools, etc.)
Norbert
On Sat, Feb 5, 2011 at 9:21 AM, Ted Yu <yu...@gmail.com> wrote:
> On a related note:
> http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
> http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled
> out.
>
> On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>
> > Hi, I'm new to HBase and have a stupid question about its dependency on
> > Hadoop. Section 1.3.1.2 of (http://hbase.apache.org/notsoquick.html) says
> > there is an "instance" of Hadoop in the lib directory of HBase. What
> > exactly is meant by "instance"? Is it all I need, or do I need to get a
> > "full" copy of Hadoop from elsewhere? If HBase already has all I need, I
> > am having trouble finding it. The Hadoop instructions refer to commands,
> > for example, that I can't find.
> >
> > Thanks,
> > Mike
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Norbert Burger <no...@gmail.com>.
Mike, you'll also need access to an installation of Hadoop, whether this is
on the same machines as your HBase install (common) or somewhere else.
Often, people install Hadoop first and then layer HBase over it.
HBase depends on core Hadoop functionality like HDFS, and uses the Hadoop
JAR in lib/ to support this. But this is library code only; what you're
missing is the rest of the Hadoop ecosystem (config files, directory structure,
command-line tools, etc.)
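To make the version-match point concrete, a quick check is to compare the jar HBase ships with against the one your cluster runs. This is a sketch only; the paths in the comments are illustrative, not prescriptive.

```shell
# Sketch: report whether HBase's bundled Hadoop jar is byte-identical to
# the one the cluster runs. Example paths in the comments are illustrative.
check_jar_match() {
  hbase_jar=$1     # e.g. /opt/hbase/lib/hadoop-0.20.2-core.jar
  cluster_jar=$2   # e.g. /opt/hadoop-0.20.2/hadoop-0.20.2-core.jar
  if cmp -s "$hbase_jar" "$cluster_jar"; then
    echo "jars match"
  else
    echo "jars DIFFER -- expect version mismatch errors"
  fi
}
```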
Norbert
On Sat, Feb 5, 2011 at 9:21 AM, Ted Yu <yu...@gmail.com> wrote:
> On a related note:
> http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
> http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled
> out.
>
> On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
>
> > Hi, I'm new to HBase and have a stupid question about its dependency on
> > Hadoop. Section 1.3.1.2 of (http://hbase.apache.org/notsoquick.html)
> says
> > there is an "instance" of Hadoop in the lib directory of HBase. What
> > exactly is meant by "instance"? Is it all I need, or do I need to get a
> > "full" copy of Hadoop from elsewhere? If HBase already has all I need, I
> > am having trouble finding it. The Hadoop instructions refer to commands,
> > for example, that I can't find.
> >
> > Thanks,
> > Mike
>
Re: Using the Hadoop bundled in the lib directory of HBase
Posted by Ted Yu <yu...@gmail.com>.
On a related note:
http://wiki.apache.org/hadoop/Hadoop%20Upgrade (referenced by
http://wiki.apache.org/hadoop/Hbase/HowToMigrate#90) needs to be filled out.
On Fri, Feb 4, 2011 at 11:47 PM, Mike Spreitzer <ms...@us.ibm.com> wrote:
> Hi, I'm new to HBase and have a stupid question about its dependency on
> Hadoop. Section 1.3.1.2 of (http://hbase.apache.org/notsoquick.html) says
> there is an "instance" of Hadoop in the lib directory of HBase. What
> exactly is meant by "instance"? Is it all I need, or do I need to get a
> "full" copy of Hadoop from elsewhere? If HBase already has all I need, I
> am having trouble finding it. The Hadoop instructions refer to commands,
> for example, that I can't find.
>
> Thanks,
> Mike