Posted to user@hbase.apache.org by Vaibhav Puranik <vp...@gmail.com> on 2009/04/03 01:16:10 UTC

Shift entirely to HBase?

Hi,

In our system we have some fast growing tables and some slow growing tables
(as in every system).
Some slow growing tables only have 100s of rows and they grow at a very slow
rate - e.g. users in a small organization.

We are thinking about ditching MySQL and adopting HBase. Is it advisable to
ditch MySQL altogether and shift to HBase, or do you recommend shifting only
the fast growing tables to HBase?

What are other people doing?

Regards,
Vaibhav

Fwd: Shift entirely to HBase?

Posted by Vaibhav Puranik <vp...@gmail.com>.
Thanks Stack and Erik.
I will look at the backup MR job in the issue.

BTW, we won't be able to use the table based indexer, as it's not useful
whenever you use the column name as data (which we do so that we can store
multiple columns per row). Most of our tables have column names as data, and
hence we cannot specify a fully qualified column name in the
IndexSpecification.
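
A minimal sketch of what "column name as data" looks like, using made-up row
keys and qualifiers rather than any real schema. WideRows is a hypothetical
in-memory model built on plain Java maps, not an HBase API; it only shows
that the set of column names differs from row to row, so there is no single
fixed column name to hand to an index definition.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory model of a "column name as data" row layout.
class WideRows {
    // row key -> (column qualifier -> cell value); the qualifier itself carries data.
    private final Map<String, NavigableMap<String, String>> rows =
            new TreeMap<String, NavigableMap<String, String>>();

    // Store one cell; each new qualifier adds another column to that row only.
    void put(String rowKey, String qualifier, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<String, String>())
            .put(qualifier, value);
    }

    // Columns present in one row, sorted by qualifier; empty if the row is unknown.
    NavigableMap<String, String> columnsOf(String rowKey) {
        NavigableMap<String, String> cols = rows.get(rowKey);
        return cols == null ? new TreeMap<String, String>() : cols;
    }
}
```

For example, after put("user1", "2009-04-01", "12") and
put("user2", "2009-03-15", "3"), the two rows have no column name in common,
even though both store the same kind of value.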

Regards,
Vaibhav

---------- Forwarded message ----------
From: Erik Holstad <er...@gmail.com>
Date: Fri, Apr 3, 2009 at 4:18 PM
Subject: Re: Shift entirely to HBase?
To: hbase-user@hadoop.apache.org


Hi Vaibhav!

https://issues.apache.org/jira/browse/HBASE-974
is what Stack is talking about, just look at the last comment that I made
so you know why it is a little bit slow at the moment.
Internally at Streamy we use a setup class so that every table is only
initiated once, and you get a table and put it back so that
other MR jobs can use it.

Hope it works for you.

Regards Erik

Re: Shift entirely to HBase?

Posted by Erik Holstad <er...@gmail.com>.
Hi Vaibhav!

https://issues.apache.org/jira/browse/HBASE-974
is what Stack is talking about, just look at the last comment that I made
so you know why it is a little bit slow at the moment.
Internally at Streamy we use a setup class so that every table is only
initiated once, and you get a table and put it back so that
other MR jobs can use it.
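
A minimal sketch of that get-and-put-back pattern, in plain Java. This is not
Streamy's actual setup class: TablePool, get, put and createdCount are
hypothetical names, and the generic type T stands in for whatever table
handle object your client code really uses.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical pool: a table handle is created only when no idle one exists,
// and callers hand it back after use so the next caller can reuse it.
class TablePool<T> {
    private final Function<String, T> factory;   // creates a handle for a table name
    private final Map<String, Deque<T>> idle = new HashMap<String, Deque<T>>();
    private int created = 0;

    TablePool(Function<String, T> factory) {
        this.factory = factory;
    }

    // Hand out an idle handle for this table, creating one only if none is idle.
    synchronized T get(String tableName) {
        Deque<T> handles = idle.get(tableName);
        if (handles != null && !handles.isEmpty()) {
            return handles.pop();
        }
        created++;
        return factory.apply(tableName);
    }

    // Return a handle so that later callers can reuse it.
    synchronized void put(String tableName, T handle) {
        idle.computeIfAbsent(tableName, k -> new ArrayDeque<T>()).push(handle);
    }

    // Number of handles the factory has been asked to create so far.
    synchronized int createdCount() {
        return created;
    }
}
```

As long as each job puts its handle back before the next one asks for it, the
(presumably expensive) creation step runs once per table instead of once per
job.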

Hope it works for you.

Regards Erik

Re: Shift entirely to HBase?

Posted by stack <st...@duboce.net>.
Oh, you could also set your columns to flush more frequently than the default
so edits are persisted more often.
St.Ack

On Sat, Apr 4, 2009 at 12:16 AM, stack <st...@duboce.net> wrote:

> Good.  I'm glad you are doing the evaluations.  My guess is that you'll
> need to wait on 0.20.0 to get the realtime numbers you'll be happy with.
>
> Look through hbase JIRAs for issues on backups.  There are a few.  I think
> Erik Holstad's is the most up-to-date.  Check it out (a Google Summer of
> Code project looks like it will concentrate on this particular issue,
> snapshotting tables).
>
> St.Ack

Re: Shift entirely to HBase?

Posted by stack <st...@duboce.net>.
Good.  I'm glad you are doing the evaluations.  My guess is that you'll need
to wait on 0.20.0 to get the realtime numbers you'll be happy with.

Look through hbase JIRAs for issues on backups.  There are a few.  I think
Erik Holstad's is the most up-to-date.  Check it out (a Google Summer of Code
project looks like it will concentrate on this particular issue, snapshotting
tables).

St.Ack

On Fri, Apr 3, 2009 at 8:19 PM, Vaibhav Puranik <vp...@gmail.com> wrote:

> Stack,
>
> We are still trying to explore answers to these questions.
>
> For example, we are at this moment doing performance testing on hbase to
> see whether it can be used as a real time database.
>
> We haven't finalized the new schema - I am exploring the table based
> indexes feature to see how it can help us to do "where somecolumn = value"
> where somecolumn is not the row key. We are still getting used to
> designing the new way.
>
> We certainly cannot afford to lose data. Can you (or anybody) give me
> pointers for backing up hbase data?
>
> Regards,
> Vaibhav Puranik
> Gumgum Inc.

Re: Shift entirely to HBase?

Posted by Vaibhav Puranik <vp...@gmail.com>.
Stack,

We are still trying to explore answers to these questions.

For example, we are at this moment doing performance testing on hbase to see
whether it can be used as a real time database.

We haven't finalized the new schema - I am exploring the table based indexes
feature to see how it can help us to do "where somecolumn = value" where
somecolumn is not the row key. We are still getting used to the new way of
designing schemas.
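
A minimal sketch of the general idea behind such a lookup, assuming a second
"index table" that is kept in step with the main one. It uses plain in-memory
maps and is not the HBase table based index API; IndexedTable, put and
rowKeysWhere are hypothetical names for illustration only.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of a manually maintained secondary index.
class IndexedTable {
    // Primary "table": row key -> value of the one indexed column.
    private final Map<String, String> rows = new TreeMap<String, String>();
    // "Index table": column value -> row keys currently holding that value.
    private final Map<String, Set<String>> index = new TreeMap<String, Set<String>>();

    // Write a row and keep the index in step, dropping any stale index entry.
    void put(String rowKey, String value) {
        String old = rows.put(rowKey, value);
        if (old != null) {
            Set<String> oldKeys = index.get(old);
            oldKeys.remove(rowKey);
            if (oldKeys.isEmpty()) {
                index.remove(old);
            }
        }
        index.computeIfAbsent(value, v -> new TreeSet<String>()).add(rowKey);
    }

    // Equivalent of "where somecolumn = value": one index lookup, no full scan.
    Set<String> rowKeysWhere(String value) {
        Set<String> keys = index.get(value);
        return keys == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(keys);
    }
}
```

The cost is an extra write on every put, and the index only covers the one
column it was built for.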

We certainly cannot afford to lose data. Can you (or anybody) give me
pointers for backing up hbase data?

Regards,
Vaibhav Puranik
Gumgum Inc.


On Thu, Apr 2, 2009 at 11:37 PM, stack <st...@duboce.net> wrote:

> Can you do all queries you do in mysql against hbase?
>
> Does hbase run fast enough for your needs?
>
> Can you tolerate lost data (hdfs does not yet have a working flush/sync so
> dataloss is possible on machine crash -- hopefully fixed in hadoop 0.21).
>
> St.Ack

Re: Shift entirely to HBase?

Posted by stack <st...@duboce.net>.
Can you do all queries you do in mysql against hbase?

Does hbase run fast enough for your needs?

Can you tolerate lost data? (hdfs does not yet have a working flush/sync, so
data loss is possible on machine crash -- hopefully fixed in hadoop 0.21.)

St.Ack


On Fri, Apr 3, 2009 at 1:16 AM, Vaibhav Puranik <vp...@gmail.com> wrote:

> Hi,
>
> In our system we have some fast growing tables and some slow growing tables
> (as in every system).
> Some slow growing tables only have 100s of rows and they grow at a very
> slow rate - e.g. users in a small organization.
>
> We are thinking about ditching MySQL and adopting HBase. Is it advisable to
> ditch MySQL altogether and shift to HBase, or do you recommend shifting only
> the fast growing tables to HBase?
>
> What are other people doing?
>
> Regards,
> Vaibhav
>