Posted to user@hbase.apache.org by "Hiller, Dean (Contractor)" <de...@broadridge.com> on 2011/01/10 19:59:23 UTC

RE: some data replication support in hbase?

Our customers do not have hbase.  We need to replicate data to their
database that is at the customer location.  We only replicate a portion
of their data.

I was hoping for some hooks into hbase so right after data was
successfully written, we could then fire an event.  I guess we can just
put it on our client as part of a framework, but I kinda wanted it to be
server side.

I.e., we could have our own hbase api on top of the hbase api that would
fire events after store operations succeed, but I was wondering whether
there are already built-in hooks.
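
The client-side wrapper idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: RowStore, WriteListener and NotifyingStore are hypothetical names invented here, not HBase classes, and the write is assumed to throw an exception when it fails. The table filter reflects the point that only a portion of the data is replicated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for whatever client API performs the write.
// This is NOT an HBase interface; put() is assumed to throw on failure.
interface RowStore {
    void put(String table, String row, Map<String, String> columns);
}

// Hypothetical callback fired once a write has returned successfully.
interface WriteListener {
    void onWritten(String table, String row, Map<String, String> columns);
}

// Wrapper that delegates the write and then notifies listeners, but only
// for tables that are marked for replication to the customer database.
class NotifyingStore implements RowStore {
    private final RowStore delegate;
    private final Set<String> replicatedTables;
    private final List<WriteListener> listeners = new ArrayList<>();

    NotifyingStore(RowStore delegate, Set<String> replicatedTables) {
        this.delegate = delegate;
        this.replicatedTables = replicatedTables;
    }

    void addListener(WriteListener listener) {
        listeners.add(listener);
    }

    @Override
    public void put(String table, String row, Map<String, String> columns) {
        // If the delegate throws, we never reach the notification below.
        delegate.put(table, row, columns);
        if (!replicatedTables.contains(table)) {
            return; // only a subset of tables is replicated
        }
        // Hand listeners a copy so later changes by the caller do not leak in.
        Map<String, String> snapshot = new HashMap<>(columns);
        for (WriteListener listener : listeners) {
            listener.onWritten(table, row, snapshot);
        }
    }
}

class NotifyingStoreDemo {
    public static void main(String[] args) {
        // In-memory placeholder for the real store; every write "succeeds".
        RowStore inMemory = (table, row, columns) -> { };
        NotifyingStore store =
                new NotifyingStore(inMemory, Collections.singleton("accounts"));
        store.addListener((table, row, columns) ->
                System.out.println("replicate " + table + "/" + row + " " + columns));

        Map<String, String> columns = new HashMap<>();
        columns.put("balance", "100");
        store.put("accounts", "row1", columns); // listener fires
        store.put("audit", "row2", columns);    // not replicated, no event
    }
}
```

One limitation of doing this on the client: if the process dies after the write returns but before the listener runs, the event is lost, which is one reason a server-side hook is attractive.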

Dean

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com] 
Sent: Friday, December 24, 2010 9:13 AM
To: user@hbase.apache.org; hbase-user@hadoop.apache.org
Subject: Re: some data replication support in hbase?

Hi,

Hm, maybe I'm missing something.... but HBase runs on top of HDFS
(that's where it gets/puts data), which itself provides data
replication.  So that should be all you need, no?

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
Hadoop ecosystem search :: http://search-hadoop.com/



----- Original Message ----
> From: "Hiller, Dean (Contractor)" <de...@broadridge.com>
> To: hbase-user@hadoop.apache.org
> Sent: Tue, December 21, 2010 12:14:25 PM
> Subject: some data replication support in hbase?
> 
> Are there any hooks in hbase to do data replication?  We have to try to
> move our 12 hour batch jobs down to 3 hours or so and are looking at
> moving into a NoSQL environment, but currently customers have
> replicated data (only a small subset of tables, because our data set
> is so big).  Are there any good strategies for data replication?
>
> It probably doesn't matter, but our customers' local db (multiple
> customers) is Sybase right now (I think we dictated that to them a
> while back).  Any ideas here?  All we really care about is that it is
> eventually consistent with our cluster.
>
> I think we may also have issues where an update of two rows should
> show what hbase had either before or after the update, that kind of
> thing.
> 
> 
> 
> Ideas?
> 
> Thanks,
> 
> Dean
> 
> 
> 
> 
> 
> 
> This message and any attachments are intended only for the use of the
> addressee and may contain information that is privileged and
> confidential. If the reader of the message is not the intended
> recipient or an authorized representative of the intended recipient,
> you are hereby notified that any dissemination of this communication
> is strictly prohibited. If you have received this communication in
> error, please notify us immediately by e-mail and delete the message
> and any attachments from your system.
> 


RE: some data replication support in hbase?

Posted by "Hiller, Dean (Contractor)" <de...@broadridge.com>.
Heh, that has also been in the back of my head.  On that topic though, do you know what happens if a node fails after a put but before the coprocessor has run?  Will another node take over running the coprocessor, so that either the coprocessor is guaranteed to run or the put fails?  (I guess I am asking whether there is any atomicity between a put and the coprocessor running to completion, even if duplicates occur.)
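
On the duplicates point: if a hook may run more than once for the same put (for example after a failover), one common way to cope is to make the receiving side idempotent. The following is a minimal sketch, assuming each event carries the row key and an increasing version number such as the cell timestamp. ChangeEvent and IdempotentApplier are hypothetical names, and whether HBase coprocessors actually re-run after a node failure is not settled in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical change event: the row key, a version that only grows for a
// given row (assumed here to be something like the cell timestamp), and
// the new value.
final class ChangeEvent {
    final String row;
    final long version;
    final String value;

    ChangeEvent(String row, long version, String value) {
        this.row = row;
        this.version = version;
        this.value = value;
    }
}

// Applies events to a target so that redelivered or stale events are
// harmless: an event is applied only if it is newer than what was
// already applied for that row.
class IdempotentApplier {
    private final Map<String, Long> appliedVersion = new HashMap<>();
    private final Map<String, String> target = new HashMap<>();

    // Returns true if the event changed the target, false if it was a
    // duplicate or older than the version already applied.
    boolean apply(ChangeEvent event) {
        Long seen = appliedVersion.get(event.row);
        if (seen != null && seen >= event.version) {
            return false;
        }
        target.put(event.row, event.value);
        appliedVersion.put(event.row, event.version);
        return true;
    }

    String valueOf(String row) {
        return target.get(row);
    }
}

class IdempotentApplierDemo {
    public static void main(String[] args) {
        IdempotentApplier applier = new IdempotentApplier();
        ChangeEvent first = new ChangeEvent("row1", 1L, "a");
        System.out.println(applier.apply(first)); // first delivery
        System.out.println(applier.apply(first)); // same event delivered again
        System.out.println(applier.valueOf("row1"));
    }
}
```

With this in place, at-least-once delivery from the source is enough for the target to end up eventually consistent, since replaying an event never moves a row backwards.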

If I remember correctly, a Put succeeds only if it is written to a minimum number of nodes and fails otherwise; I believe the default minimum is 2 and the default replication is 3 (or am I getting that mixed up with Cassandra?).

Thanks,
Dean


-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Monday, January 10, 2011 12:54 PM
To: user@hbase.apache.org
Subject: Re: some data replication support in hbase?

Sounds like something Coprocessors (in HBase TRUNK / HBase 0.92) could do for you.
St.Ack



Re: some data replication support in hbase?

Posted by Stack <st...@duboce.net>.
Sounds like something Coprocessors (in HBase TRUNK / HBase 0.92) could do for you.
St.Ack
