
Posted to user@hbase.apache.org by som_shekhar <ko...@wipro.com> on 2011/02/07 13:51:34 UTC

Why Hbase?

Hi All,
I am new to HBase. Can anyone shed light on why HBase was invented in the
first place, even though HDFS already seems able to do everything, i.e.
distribute the data and provide data locality? Why did HBase and Bigtable
come into the picture, and what was the need?
-- 
View this message in context: http://old.nabble.com/Why-Hbase--tp30863301p30863301.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: Why Hbase?

Posted by tsuna <ts...@gmail.com>.

On Mon, Feb 7, 2011 at 7:01 AM, Ted Dunning <td...@maprtech.com> wrote:
> HDFS does not provide keyed access to data, nor column-oriented access
> when only a subset of related data is needed.  Also, HDFS is a write-once
> file system, while HBase provides random updates.

Note that the write-once aspect is a peculiarity of HDFS; GFS doesn't
have this limitation.

Bigtable was originally invented to provide low-latency key-value
oriented serving.  Without Bigtable, an application had to load its
data directly from GFS (which is a high throughput but high latency
distributed file system) to serve it out of memory.  For large data
sets that can't fit on a single machine, this is actually hard to do,
especially if you want to be able to change individual data items.

HBase is solving the same problem as Bigtable.  The main goal is low
latency serving for a very large number of fairly small data items.
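
As a rough illustration of serving keyed data that doesn't fit on one
machine, here is a minimal Python sketch of routing a row key to the
server that owns its key range. It is not HBase or Bigtable code:
`RangeRouter`, the split points and the server names are all made up
for the example.

```python
import bisect


class RangeRouter:
    """Maps a row key to the server that owns its key range.

    split_points holds the first key of every range except the first one;
    servers[i] owns the i-th range. All names here are hypothetical.
    """

    def __init__(self, split_points, servers):
        if len(servers) != len(split_points) + 1:
            raise ValueError("need exactly one more server than split points")
        self.split_points = sorted(split_points)
        self.servers = list(servers)

    def server_for(self, key):
        # Binary search over the split points; a key equal to a split
        # point belongs to the range that starts at that split point.
        return self.servers[bisect.bisect_right(self.split_points, key)]


# Three key ranges: keys below "g", "g" up to (not including) "p", "p" and above.
router = RangeRouter(["g", "p"], ["server-a", "server-b", "server-c"])
```

With this layout a client only needs the small table of split points to
find the one machine holding a given key, instead of reading whole files.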

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

Re: Why Hbase?

Posted by Tatsuya Kawano <ta...@gmail.com>.

Well, you don't have to rebalance your HDFS cluster every time you add
or remove a node. You can let the HDFS NameNode manage data placement,
and this usually works fine. So this isn't a big difference between
HDFS and HBase.

As others said, HDFS is designed for high throughput, and HBase is
designed for low latency. Use HDFS when you need sequential access to
big data, and use HBase when you need random access to big data. Also,
HDFS doesn't let you update part of an existing file, so consider HBase
if you want to do that.
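
To show how updates can be offered on top of files that are never
modified, here is a greatly simplified Python sketch of the
log-structured idea: recent writes go to memory, get flushed to new
immutable "files", and reads check the newest data first. This is an
illustration only, not HBase's implementation; `TinyLogStructuredStore`
and `flush_threshold` are made-up names, and tuples stand in for
write-once files.

```python
class TinyLogStructuredStore:
    """Toy key-value store that never modifies a flushed file."""

    def __init__(self, flush_threshold=2):
        self.memory = {}   # recent writes, mutable
        self.files = []    # immutable sorted snapshots, oldest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # An update is just a newer write; older copies are left alone.
        self.memory[key] = value
        if len(self.memory) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the in-memory data out as a new immutable, sorted "file".
        if self.memory:
            self.files.append(tuple(sorted(self.memory.items())))
            self.memory = {}

    def get(self, key):
        # Newest data wins: memory first, then files from newest to oldest.
        if key in self.memory:
            return self.memory[key]
        for snapshot in reversed(self.files):
            for k, v in snapshot:
                if k == key:
                    return v
        return None
```

A real system would also merge old files in the background and index
them so a read doesn't scan each file, but the write path above is the
part that sidesteps the "can't update an existing file" limitation.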

Thanks,

-- 
河野 達也
Tatsuya Kawano (Mr.)
Tokyo, Japan

twitter: http://twitter.com/tatsuya6502


2011/2/8 som_shekhar <ko...@wipro.com>:
>
> Thanks for the above explanation.
> When a node is added or fails, the data on Hadoop has to be redistributed
> manually. I have read somewhere that this is why HBase or NoSQL comes
> into the picture.
> Can you please give some more details?

Re: Why Hbase?

Posted by som_shekhar <ko...@wipro.com>.

Thanks for the above explanation.
When a node is added or fails, the data on Hadoop has to be redistributed
manually. I have read somewhere that this is why HBase or NoSQL comes
into the picture.
Can you please give some more details?


Ted Dunning-2 wrote:
> 
> HDFS does not provide keyed access to data, nor column-oriented access
> when only a subset of related data is needed.  Also, HDFS is a write-once
> file system, while HBase provides random updates.

-- 
View this message in context: http://old.nabble.com/Why-Hbase--tp30863301p30870366.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: Why Hbase?

Posted by Ted Dunning <td...@maprtech.com>.

HDFS does not provide keyed access to data, nor column-oriented access
when only a subset of related data is needed.  Also, HDFS is a write-once
file system, while HBase provides random updates.
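
To make the difference concrete, here is a small Python sketch that
contrasts finding a record by scanning a flat file with looking it up in
a store kept sorted by key. It is only an illustration, not HBase code:
`scan_lookup`, `KeyedStore` and the sample records are made up.

```python
import bisect

# Hypothetical (row key, value) records, as they might sit in one flat file.
records = [("user3", "c"), ("user1", "a"), ("user2", "b")]


def scan_lookup(records, key):
    """Plain-file style: read records one by one until the key turns up."""
    for k, v in records:
        if k == key:
            return v
    return None


class KeyedStore:
    """Keyed style: keys are kept sorted, so a lookup is a binary search."""

    def __init__(self, records):
        pairs = sorted(records)
        self.keys = [k for k, _ in pairs]
        self.values = [v for _, v in pairs]

    def get(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None

    def put(self, key, value):
        """Random update: overwrite an existing key or insert a new one."""
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value
        else:
            self.keys.insert(i, key)
            self.values.insert(i, value)


store = KeyedStore(records)
```

The scan touches every record in the worst case, while the keyed lookup
needs only a handful of comparisons and can change a single item without
rewriting everything else.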
