Posted to dev@hbase.apache.org by Demai Ni <ni...@gmail.com> on 2013/07/23 01:42:24 UTC

[question about replication] how to apply delta from Master to Slave after crash ?

Hi folks,

I am trying to address a recovery scenario after a Master crash.

Take a simple Master-Slave replication setup for t1, and suppose the Master
suddenly crashes. There will be a small delta of t1 that was already changed
on the Master but hasn't been replicated to the Slave yet. Assuming the
Master's filesystem and network are still available, how can I apply the
delta from the Master to the Slave to sync t1?

Has anyone addressed this scenario in your shop? Can you please give me some
pointers?

Basically, I need to address two things:
1) Get the timestamp of the last consistent point of t1, something like
oldest(lastAppliedOP_of_T1 of each regionServer on Slave Cluster). Is there
any other way to do so?

2) Use something like the CopyTable utility with the starttime from 1).
However, CopyTable requires HBase and ZooKeeper to be available on the
source cluster (right?)
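
A minimal sketch of the two steps above, under stated assumptions: the
per-region-server "last applied" timestamps are hypothetical inputs (how to
collect them is exactly the open question in 1), and the host names and
values are made up. The CopyTable class name and its --starttime and
--peer.adr options are the standard ones; the command would still have to
run against a source cluster whose HBase and ZooKeeper are up.

```python
import shlex


def last_consistent_timestamp(last_applied_by_server):
    """Return the oldest 'last applied' timestamp across slave region servers.

    Everything older than this value has been applied on every server, so it
    is a safe (if conservative) starting point for a catch-up copy.
    """
    if not last_applied_by_server:
        raise ValueError("no region server timestamps supplied")
    return min(last_applied_by_server.values())


def copytable_command(table, start_ms, peer_adr):
    """Build a CopyTable command line that copies cells from start_ms onwards."""
    args = [
        "hbase",
        "org.apache.hadoop.hbase.mapreduce.CopyTable",
        f"--starttime={start_ms}",
        f"--peer.adr={peer_adr}",
        table,
    ]
    return " ".join(shlex.quote(a) for a in args)


# Hypothetical timestamps (ms since epoch) reported by three slave region servers.
last_applied = {
    "rs1.slave": 1374536500000,
    "rs2.slave": 1374536400000,
    "rs3.slave": 1374536450000,
}

start = last_consistent_timestamp(last_applied)
command = copytable_command("t1", start, "slave-zk1,slave-zk2,slave-zk3:2181:/hbase")
print(command)
```

Taking the minimum rather than the maximum means some cells that already
reached the Slave are copied again, which trades extra work for not missing
anything.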


I have been thinking about ways to address this for quite a few days and
still don't have any very good ideas. Any suggestion is greatly appreciated.

thanks.

Demai

Re: [question about replication] how to apply delta from Master to Slave after crash ?

Posted by Jean-Daniel Cryans <jd...@apache.org>.

For more information on how replication works, please see
http://hbase.apache.org/replication.html

Also regarding the tool, I don't have time right now to work on it, so
don't expect something soon unless you want to work on it :)

Have a good weekend,

J-D

On Fri, Jul 26, 2013 at 1:29 PM, Demai Ni <ni...@gmail.com> wrote:
> JD, thanks for the jira and the further explanation. For some reason, I was
> always thinking about 'pull' while considering the solution. Certainly
> 'push' is the natural way to address this in Apache HBase. I got your
> points now. Appreciate it... Demai

Re: [question about replication] how to apply delta from Master to Slave after crash ?

Posted by Demai Ni <ni...@gmail.com>.

JD, thanks for the jira and the further explanation. For some reason, I was
always thinking about 'pull' while considering the solution. Certainly
'push' is the natural way to address this in Apache HBase. I got your
points now. Appreciate it... Demai


On Fri, Jul 26, 2013 at 1:08 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> https://issues.apache.org/jira/browse/HBASE-9047

Re: [question about replication] how to apply delta from Master to Slave after crash ?

Posted by Jean-Daniel Cryans <jd...@apache.org>.

https://issues.apache.org/jira/browse/HBASE-9047

On Fri, Jul 26, 2013 at 12:59 PM, Jean-Daniel Cryans
<jd...@apache.org> wrote:
> I guess I didn't explain my ideas clearly.
>
> So, first, replication in HBase is master-push, so you don't want to
> reverse the process. It means that this tool needs to run on the
> master cluster.
>
> Then I don't think you need to specify a timestamp since the
> replication state is in ZK. Basically that tool we're talking about
> would be able to read the replication state of each master region
> server, finish replicating what's missing, and then clear that state
> in zookeeper.
>
> The code that handles replication does most of that already. Check
> ReplicationSourceManager and ReplicationSource. Basically when
> ReplicationSourceManager.init() is called, it will check all the
> queues in ZK and try to grab those that aren't attached to a region
> server. If the whole cluster is down, it will grab all of them.
>
> The beautiful thing here is that you could start that tool on all your
> machines and the load will be spread out, but that might not be a big
> concern if replication wasn't lagging since it would take a few
> seconds to finish replicating the missing data for each region server.
>
> I'll open a jira.
>
> J-D

Re: [question about replication] how to apply delta from Master to Slave after crash ?

Posted by Jean-Daniel Cryans <jd...@apache.org>.

I guess I didn't explain my ideas clearly.

So, first, replication in HBase is master-push, so you don't want to
reverse the process. It means that this tool needs to run on the
master cluster.

Then I don't think you need to specify a timestamp since the
replication state is in ZK. Basically that tool we're talking about
would be able to read the replication state of each master region
server, finish replicating what's missing, and then clear that state
in zookeeper.
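
To make the read-state, finish, clear-state sequence concrete, here is a toy
in-memory sketch. It is not HBase code: the dictionaries stand in for the
replication state in ZK and for the HLogs, the position is modelled as a
list index rather than a file offset, and finish_replication and ship are
hypothetical names.

```python
def finish_replication(queues, logs, ship):
    """Ship every entry that was not replicated yet, then clear the state.

    queues: {region_server: {log_name: next_entry_index}}  (stand-in for ZK)
    logs:   {log_name: [entry, ...]}                        (stand-in for HLogs)
    ship:   callable that sends a list of entries to the slave cluster
    """
    for server in list(queues):
        for log_name, position in sorted(queues[server].items()):
            pending = logs[log_name][position:]
            if pending:
                ship(pending)
        # The state for a server is removed only after all its logs were shipped.
        del queues[server]


# Hypothetical state: rs1 still has one entry left in log.1 and all of log.2;
# rs2 has already shipped everything in log.3.
queues = {"rs1": {"log.1": 2, "log.2": 0}, "rs2": {"log.3": 1}}
logs = {"log.1": ["a", "b", "c"], "log.2": ["d"], "log.3": ["e"]}

received = []
finish_replication(queues, logs, received.extend)
print(received, queues)
```

No timestamp appears anywhere in the sketch: the saved positions alone
determine what is still missing.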

The code that handles replication does most of that already. Check
ReplicationSourceManager and ReplicationSource. Basically when
ReplicationSourceManager.init() is called, it will check all the
queues in ZK and try to grab those that aren't attached to a region
server. If the whole cluster is down, it will grab all of them.
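
The claiming rule described here can be illustrated with a small model. This
is only a sketch of the idea, not the logic of ReplicationSourceManager: the
function name, the queue ids, and the flat queue-to-owner mapping are all
assumptions made for the example.

```python
def claim_orphaned_queues(queue_owners, live_servers, claimer):
    """Take over every queue whose owning region server is not alive.

    queue_owners: {queue_id: owning_region_server}, updated in place
    live_servers: set of region servers currently registered as alive
    claimer:      name of the process taking over the orphaned queues
    """
    claimed = []
    for queue_id, owner in sorted(queue_owners.items()):
        if owner not in live_servers:
            queue_owners[queue_id] = claimer
            claimed.append(queue_id)
    return claimed


owners = {"q-rs1": "rs1", "q-rs2": "rs2", "q-rs3": "rs3"}

# Only rs2 is alive: its queue is left alone, the other two are claimed.
partial = claim_orphaned_queues(dict(owners), {"rs2"}, "sync-tool")

# Whole cluster down: no server is alive, so every queue is claimed.
all_down = claim_orphaned_queues(dict(owners), set(), "sync-tool")

print(partial, all_down)
```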

The beautiful thing here is that you could start that tool on all your
machines and the load will be spread out, but that might not be a big
concern if replication wasn't lagging since it would take a few
seconds to finish replicating the missing data for each region server.
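
A rough sketch of how the load could spread when several copies of the tool
run at once. Threads stand in for machines and a lock stands in for the
atomic claim that ZK would provide; run_workers and the queue ids are
hypothetical.

```python
import threading


def run_workers(queue_ids, worker_names):
    """Let several workers race to claim queues; each queue is claimed once."""
    unclaimed = list(queue_ids)
    lock = threading.Lock()
    claimed = {name: [] for name in worker_names}

    def worker(name):
        while True:
            with lock:  # the claim itself must be atomic
                if not unclaimed:
                    return
                queue_id = unclaimed.pop()
            # Replicating the claimed queue would happen here, outside the lock.
            claimed[name].append(queue_id)

    threads = [threading.Thread(target=worker, args=(n,)) for n in worker_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed


queue_ids = [f"q-rs{i}" for i in range(1, 9)]
result = run_workers(queue_ids, ["tool-a", "tool-b", "tool-c"])
all_claimed = sorted(q for qs in result.values() for q in qs)
print(result)
```

How the queues end up divided between workers varies from run to run; the
only guarantee is that every queue is handled by exactly one worker.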

I'll open a jira.

J-D

On Fri, Jul 26, 2013 at 11:50 AM, Demai Ni <ni...@gmail.com> wrote:
> JD,
>
> Yeah, that sounds like what I will need to do: a tool like this
> [slave_cluster]$tool_to_syncup master_ZKquorum table_name start_timestamp
>
> So two tasks for me:
> 1) identify the start_timestamp
> 2) write the tool_to_syncup, which will reach out to master_ZK, copy the
> HLogs from the Master, and replay the HLogs on the Slave.
>
> Are you aware of some example code for task 2) that I can leverage?
> Thanks
>
> Demai

Re: [question about replication] how to apply delta from Master to Slave after crash ?

Posted by Demai Ni <ni...@gmail.com>.

JD,

Yeah, that sounds like what I will need to do: a tool like this
[slave_cluster]$tool_to_syncup master_ZKquorum table_name start_timestamp

So two tasks for me:
1) identify the start_timestamp
2) write the tool_to_syncup, which will reach out to master_ZK, copy the
HLogs from the Master, and replay the HLogs on the Slave.

Are you aware of some example code for task 2) that I can leverage?
Thanks

Demai

Re: [question about replication] how to apply delta from Master to Slave after crash ?

Posted by Jean-Daniel Cryans <jd...@apache.org>.

It's not possible right now to copy the delta if your cluster is down,
but since replication only uses the HLogs, as long as you can keep ZK
running (to know what needs to be replicated) it should be possible to
write a small tool that will finish replicating what's missing.

That could be a nice tool to have in HBase. Would it work for you?

J-D


On Fri, Jul 26, 2013 at 10:44 AM, Demai Ni <ni...@gmail.com> wrote:
> JD,
>
> Thanks. I agree with you. The current replication mechanism will handle
> such a situation exactly as you suggested. Were I the DBA, I would focus on
> restarting my Master cluster instead of shipping the delta over to the
> Slave cluster.
>
> However, my customer would still like to have a tool to sync up the data
> in case the Master is down. I guess the reasons are not purely technical;
> it is more of a business check mark. :-(
>
> That is why I am looking for a way. Can you please give me some pointers to
> start with? Many thanks
>
> Demai

Re: [question about replication] how to apply delta from Master to Slave after crash ?

Posted by Demai Ni <ni...@gmail.com>.

JD,

Thanks. I agree with you. The current replication mechanism will handle
such a situation exactly as you suggested. Were I the DBA, I would focus on
restarting my Master cluster instead of shipping the delta over to the
Slave cluster.

However, my customer would still like to have a tool to sync up the data
in case the Master is down. I guess the reasons are not purely technical;
it is more of a business check mark. :-(

That is why I am looking for a way. Can you please give me some pointers to
start with? Many thanks

Demai

On Fri, Jul 26, 2013 at 10:08 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> As soon as you restart your master, it will continue replicating from
> where it was, so it will send that delta.
>
> J-D

Re: [question about replication] how to apply delta from Master to Slave after crash ?

Posted by Jean-Daniel Cryans <jd...@apache.org>.

As soon as you restart your master, it will continue replicating from
where it was, so it will send that delta.

J-D

On Mon, Jul 22, 2013 at 4:42 PM, Demai Ni <ni...@gmail.com> wrote:
> Hi folks,
>
> I am trying to address a recovery scenario after a Master crash.
>
> Take a simple Master-Slave replication setup for t1, and suppose the Master
> suddenly crashes. There will be a small delta of t1 that was already changed
> on the Master but hasn't been replicated to the Slave yet. Assuming the
> Master's filesystem and network are still available, how can I apply the
> delta from the Master to the Slave to sync t1?
>
> Has anyone addressed this scenario in your shop? Can you please give me some
> pointers?
>
> Basically, I need to address two things:
> 1) Get the timestamp of the last consistent point of t1, something like
> oldest(lastAppliedOP_of_T1 of each regionServer on Slave Cluster). Is there
> any other way to do so?
>
> 2) Use something like the CopyTable utility with the starttime from 1).
> However, CopyTable requires HBase and ZooKeeper to be available on the
> source cluster (right?)
>
>
> I have been thinking about ways to address this for quite a few days and
> still don't have any very good ideas. Any suggestion is greatly appreciated.
>
> thanks.
>
> Demai