Posted to common-user@hadoop.apache.org by praveenesh kumar <pr...@gmail.com> on 2011/04/26 07:25:06 UTC

HBASE on Hadoop

Hello everyone,

Thanks, everyone, for guiding me every time. I have been able to set up a
Hadoop cluster of 10 nodes.
Now comes HBase!

I am new to all this.
My problem is that I have a huge amount of data to analyze.
So should I go for a single-node HBase installation on all nodes, or for a
distributed HBase installation?

How is a distributed installation different from a single-node installation?
Now suppose I have a distributed HBase,
and I design a table on my master node and then store data in it,
say around 100M. How is the data going to be distributed? Will HBase do it
automatically, or do we have to write code to get it distributed?
Is there any good tutorial that tells us more about HBase and how to work
with it?

Thanks,
Praveenesh

Re: HBASE on Hadoop

Posted by Bennett Andrews <be...@gmail.com>.
Set up a "distributed" HBase cluster.

On each node that runs a TaskTracker and a DataNode, also run an HBase
RegionServer.

On the node that runs the JobTracker and the NameNode, run the HBase Master.
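
The layout above can be sketched in a few lines of Python. This is only an
illustration: the host names node01..node10 are hypothetical, and the split
into one master plus nine workers is an assumption based on the 10-node
cluster mentioned in the question. The second function produces text in the
format of HBase's conf/regionservers file (one RegionServer host per line).

```python
# Hypothetical host names; substitute the real ones from your cluster.
MASTER = "node01"
WORKERS = [f"node{i:02d}" for i in range(2, 11)]  # node02 .. node10


def plan_daemons(master, workers):
    """Map each host to the daemons suggested in the reply above."""
    plan = {master: ["NameNode", "JobTracker", "HMaster"]}
    for host in workers:
        plan[host] = ["DataNode", "TaskTracker", "HRegionServer"]
    return plan


def regionservers_file(workers):
    """Text for conf/regionservers: one RegionServer host per line."""
    return "\n".join(workers) + "\n"


if __name__ == "__main__":
    for host, daemons in plan_daemons(MASTER, WORKERS).items():
        print(host, ", ".join(daemons))
    print(regionservers_file(WORKERS), end="")
```

Keeping the master free of a RegionServer follows the suggestion above; on a
small cluster it is a choice, not a requirement.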

There is a getting started guide here.
http://hbase.apache.org/book/notsoquick.html

Check out the HBase user list for more.
http://hbase.apache.org/mail-lists.html



On Wed, Apr 27, 2011 at 4:58 PM, gaurav garg <ga...@gmail.com> wrote:

> Praveenesh,
>
> I recommend reading the Google Bigtable paper
> (http://labs.google.com/papers/bigtable.html), which is the foundation for
> HBase.
> The terminology is a little different, though.
>
> Mapping of terms (not exhaustive):
> ***********************************
> Bigtable         HBase
> ***********************************
> Master server    HMaster
> Tablet           Region
> Tablet server    RegionServer
> Chubby           ZooKeeper (the Apache implementation of a
>                  distributed synchronization server)
>
> HBase stores its data on the Hadoop DFS; HBase is a client of HDFS. Hence
> Hadoop automatically distributes and replicates the data across your Hadoop
> cluster.
> The HBase Master and RegionServers format/transform the data and rely on
> Hadoop for storage and retrieval.
>
> An HBase cluster can run on a separate set of nodes, or it can share the
> Hadoop nodes.
>
> Once you have set up the HDFS cluster, the HBase cluster can be set up
> easily.
>
> Thanks
> Gaurav

Re: HBASE on Hadoop

Posted by gaurav garg <ga...@gmail.com>.
Praveenesh,

I recommend reading the Google Bigtable paper
(http://labs.google.com/papers/bigtable.html), which is the foundation for
HBase.
The terminology is a little different, though.

Mapping of terms (not exhaustive):
***********************************
Bigtable         HBase
***********************************
Master server    HMaster
Tablet           Region
Tablet server    RegionServer
Chubby           ZooKeeper (the Apache implementation of a
                 distributed synchronization server)

HBase stores its data on the Hadoop DFS; HBase is a client of HDFS. Hence
Hadoop automatically distributes and replicates the data across your Hadoop
cluster.
The HBase Master and RegionServers format/transform the data and rely on
Hadoop for storage and retrieval.
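
To make the "is it automatic?" question concrete: besides the HDFS-level
replication described above, HBase keeps a table's rows sorted by row key,
cuts the table into regions (contiguous key ranges), splits a region when it
grows too large, and has the Master assign regions to RegionServers. The toy
Python model below only imitates that idea. It is not HBase code; the class
name ToyTable, the server names rs1..rs3, and the tiny split threshold are
all made up for illustration.

```python
import bisect


class ToyTable:
    """Toy model: a table split into regions by sorted row-key ranges."""

    def __init__(self, servers, max_rows_per_region=4):
        self.servers = list(servers)
        self.max_rows = max_rows_per_region
        # Region i covers keys in [start_keys[i], start_keys[i + 1]).
        self.start_keys = [""]
        self.regions = [[]]               # sorted row keys held by each region
        self.owner = [self.servers[0]]    # server hosting each region

    def _region_index(self, row_key):
        # The last region whose start key is <= row_key.
        return bisect.bisect_right(self.start_keys, row_key) - 1

    def locate(self, row_key):
        """Return the server that hosts the region containing row_key."""
        return self.owner[self._region_index(row_key)]

    def put(self, row_key):
        i = self._region_index(row_key)
        rows = self.regions[i]
        if row_key not in rows:           # a repeated key is an overwrite
            bisect.insort(rows, row_key)
        if len(rows) > self.max_rows:
            self._split(i)

    def _split(self, i):
        rows = self.regions[i]
        mid = len(rows) // 2
        upper = rows[mid:]
        self.regions[i] = rows[:mid]
        self.regions.insert(i + 1, upper)
        self.start_keys.insert(i + 1, upper[0])
        # Put the new region on the server that hosts the fewest regions.
        load = {s: 0 for s in self.servers}
        for s in self.owner:
            load[s] += 1
        self.owner.insert(i + 1, min(self.servers, key=lambda s: load[s]))


if __name__ == "__main__":
    table = ToyTable(["rs1", "rs2", "rs3"])
    for n in range(20):
        table.put(f"row{n:03d}")
    for start, rows, server in zip(table.start_keys, table.regions, table.owner):
        print(repr(start), server, rows)
```

The client code only calls put(); nothing in it decides where a row lives.
That is the point of the analogy: in a distributed HBase the placement is
handled by the system, not by code you write.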


An HBase cluster can run on a separate set of nodes, or it can share the
Hadoop nodes.

Once you have set up the HDFS cluster, the HBase cluster can be set up easily.
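
As a rough sketch of what that setup involves: a distributed HBase is
mostly configured through a few properties in conf/hbase-site.xml, namely
hbase.rootdir, hbase.cluster.distributed and hbase.zookeeper.quorum. The
Python snippet below generates such a file. The NameNode URI
hdfs://node01:9000 and the ZooKeeper host names are hypothetical; the URI
has to match whatever your HDFS is actually configured with, so check the
HBase documentation for your version before relying on this.

```python
import xml.etree.ElementTree as ET


def hbase_site(namenode_uri, zookeeper_hosts):
    """Build minimal hbase-site.xml text for a distributed cluster."""
    props = {
        # Directory in HDFS where HBase keeps its data.
        "hbase.rootdir": f"{namenode_uri}/hbase",
        # "true" means distributed mode rather than standalone.
        "hbase.cluster.distributed": "true",
        # Comma-separated list of ZooKeeper hosts.
        "hbase.zookeeper.quorum": ",".join(zookeeper_hosts),
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(hbase_site("hdfs://node01:9000", ["node01", "node02", "node03"]))
```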

Thanks
Gaurav

