Posted to user@hbase.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2011/03/02 17:50:59 UTC

HBase replication documentation

Hi,

What's the best place to learn about HBase replication?
I found http://hbase.apache.org/book/cluster_replication.html , but note how 
there is only a link there, and that link points to a 404.

Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
Hadoop ecosystem search :: http://search-hadoop.com/

Re: Questions about HBase Cluster Replication

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Here it is: https://issues.apache.org/jira/browse/HBASE-3597

I think we'll have the opportunity to test out cluster replication and provide 
feedback soon.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Jean-Daniel Cryans <jd...@apache.org>
> To: user@hbase.apache.org
> Sent: Thu, March 3, 2011 3:41:04 PM
> Subject: Re: Questions about HBase Cluster Replication
> 
> Yep, it just occurred to me while answering you :) I'm the only dev
> who worked on the replication stuff, any contribution or just testing
> out the software is really appreciated.
> 
> J-D

Re: Questions about HBase Cluster Replication

Posted by Jean-Daniel Cryans <jd...@apache.org>.

Yep, it just occurred to me while answering you :) I'm the only dev
who worked on the replication stuff, so any contribution or just testing
out the software is really appreciated.

J-D

On Thu, Mar 3, 2011 at 12:10 PM, Otis Gospodnetic
<ot...@yahoo.com> wrote:
> Aha, so the fact that the age doesn't change when replication keeps retrying is
> really a bug?
>
> Otis

Re: Questions about HBase Cluster Replication

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Aha, so the fact that the age doesn't change when replication keeps retrying is 
really a bug?

Otis




----- Original Message ----
> From: Jean-Daniel Cryans <jd...@apache.org>
> To: user@hbase.apache.org
> Sent: Thu, March 3, 2011 2:17:08 PM
> Subject: Re: Questions about HBase Cluster Replication
> 
> No it's the age in ms:
> 
> ageOfLastAppliedOp.set(System.currentTimeMillis() - timestamp);
> 
> And the timestamp is the one given to the HLogEdit, not the timestamp
> of the cell.
> 
> J-D

Re: Questions about HBase Cluster Replication

Posted by Jean-Daniel Cryans <jd...@apache.org>.

No, it's the age in ms:

ageOfLastAppliedOp.set(System.currentTimeMillis() - timestamp);

And the timestamp is the one given to the HLogEdit, not the timestamp
of the cell.

J-D
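
To make the arithmetic in the reply above concrete, here is a minimal Java sketch of the same computation. It is only an illustration: the class AgeOfLastOpSketch, its method names, and the injected clock are assumptions made for this example, not HBase code. The only thing taken from the thread is the formula itself, current time minus the timestamp given to the log edit (not the cell timestamp).

```java
import java.util.function.LongSupplier;

// Illustrative only: mirrors the one-line snippet quoted in the thread.
// The class and method names are hypothetical, not part of HBase.
final class AgeOfLastOpSketch {
    private final LongSupplier clock; // injected so the example is deterministic
    private volatile long ageMillis;  // stand-in for the published metric

    AgeOfLastOpSketch(LongSupplier clock) {
        this.clock = clock;
    }

    // editWriteTimeMillis is the time the log edit was written,
    // not the timestamp stored in the cell.
    long recordOp(long editWriteTimeMillis) {
        ageMillis = clock.getAsLong() - editWriteTimeMillis;
        return ageMillis;
    }

    long ageMillis() {
        return ageMillis;
    }
}
```

With a clock fixed at 10000 ms, recording an edit written at 9250 ms yields an age of 750 ms. In real use the clock would be System::currentTimeMillis.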

On Thu, Mar 3, 2011 at 11:13 AM, Otis Gospodnetic
<ot...@yahoo.com> wrote:
> Is that really the *age* really the *timestamp* of last successful log shipment?
> If so, one could calculate the real age with age = now() -
> ageOfLastShippedOnWhichIsReallyTimestamp .  And that would be useful to have.
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/

Re: Questions about HBase Cluster Replication

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Is that *age* really the *timestamp* of the last successful log shipment?
If so, one could calculate the real age with age = now() -
ageOfLastShippedOnWhichIsReallyTimestamp.  And that would be useful to have.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Jean-Daniel Cryans <jd...@apache.org>
> To: user@hbase.apache.org
> Sent: Thu, March 3, 2011 12:21:09 PM
> Subject: Re: Questions about HBase Cluster Replication
> 
> It's a work in progress, that information is currently published by
> every region server in the master cluster (since it's push
> replication, not pull) through JMX under the name
> "ageOfLastShippedOp". It's really not perfect though, since if it
> fails to replicate and starts retrying then the age won't change but
> the actual lag will go up. Also it will have to be revisited when we
> add multiple slaves since you don't really want to publish the same
> metric for multiple slaves... it really wouldn't work.
> 
> J-D

Re: Questions about HBase Cluster Replication

Posted by Jean-Daniel Cryans <jd...@apache.org>.

It's a work in progress. That information is currently published by
every region server in the master cluster (since it's push
replication, not pull) through JMX under the name
"ageOfLastShippedOp". It's really not perfect though: if it
fails to replicate and starts retrying, the age won't change but
the actual lag will go up. It will also have to be revisited when we
add multiple slaves, since you don't really want to publish the same
metric for multiple slaves... it really wouldn't work.

J-D
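
The limitation described above (the age stays put during retries while the real lag grows) can be shown with a small Java model. This is a sketch under stated assumptions, not how HBase implements it: the class StalledAgeSketch, its queue of pending write times, and the actualLag method are all hypothetical names invented for the illustration. The only behaviour borrowed from the thread is that the age is refreshed when an edit is shipped and left alone when shipping fails.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative model only; names are hypothetical, not HBase APIs.
final class StalledAgeSketch {
    // Write times of edits that have not been shipped yet, oldest first.
    private final Deque<Long> pendingWriteTimes = new ArrayDeque<>();
    // Updated only when a shipment succeeds.
    private long ageOfLastShippedOp;

    void enqueue(long writeTimeMillis) {
        pendingWriteTimes.addLast(writeTimeMillis);
    }

    // One shipment attempt for the oldest pending edit.
    void attemptShip(long nowMillis, boolean slaveReachable) {
        if (pendingWriteTimes.isEmpty() || !slaveReachable) {
            return; // failed or empty attempt: the metric is left untouched
        }
        long writeTime = pendingWriteTimes.removeFirst();
        ageOfLastShippedOp = nowMillis - writeTime;
    }

    long ageOfLastShippedOp() {
        return ageOfLastShippedOp;
    }

    // Age of the oldest edit still waiting to be shipped, or 0 if none.
    long actualLag(long nowMillis) {
        Long oldest = pendingWriteTimes.peekFirst();
        return oldest == null ? 0L : nowMillis - oldest;
    }
}
```

If an edit written at t=1000 ships at t=1050, the age is 50 ms. If the next edit, written at t=2000, keeps failing until t=60000, the published age is still 50 ms while the model's actual lag is 58000 ms. A metric along the lines of actualLag is one possible way to close that gap, offered here as a suggestion rather than something proposed in the thread.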

On Thu, Mar 3, 2011 at 9:10 AM, Bill Graham <bi...@gmail.com> wrote:
> Actually, how far behind replication is w.r.t. edit logs is different
> than how out of sync they are, but you get the idea.

Re: Questions about HBase Cluster Replication

Posted by Bill Graham <bi...@gmail.com>.

Actually, how far behind replication is w.r.t. edit logs is different
than how out of sync they are, but you get the idea.

On Thu, Mar 3, 2011 at 9:07 AM, Bill Graham <bi...@gmail.com> wrote:
> One more question for the FAQ:
>
> 6. Is it possible for an admin to tell just how out of sync the two
> clusters are? Something like Seconds_Behind_Master in MySQL's SHOW
> SLAVE STATUS?

Re: Questions about HBase Cluster Replication

Posted by Bill Graham <bi...@gmail.com>.

One more question for the FAQ:

6. Is it possible for an admin to tell just how out of sync the two
clusters are? Something like Seconds_Behind_Master in MySQL's SHOW
SLAVE STATUS?
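
For readers wondering what polling such a value might look like: the reply elsewhere in this thread says the age is published through JMX as "ageOfLastShippedOp". The Java sketch below reads a numeric MBean attribute with the standard javax.management API and converts milliseconds to seconds for a Seconds_Behind_Master-style readout. The thread does not give the MBean's ObjectName, so none is guessed here; the runnable demo reads the JVM's own java.lang:type=Runtime "Uptime" attribute as a stand-in. The class JmxAgeReader and its method names are hypothetical.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.concurrent.TimeUnit;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Illustrative helper; not part of HBase.
final class JmxAgeReader {
    private JmxAgeReader() {}

    // Reads a numeric attribute from the named MBean and returns it as a long.
    static long readNumericAttribute(MBeanServerConnection conn,
                                     String objectName,
                                     String attribute) {
        try {
            Object value = conn.getAttribute(new ObjectName(objectName), attribute);
            return ((Number) value).longValue();
        } catch (JMException | IOException e) {
            throw new IllegalStateException(
                "could not read " + attribute + " from " + objectName, e);
        }
    }

    // Converts an age in milliseconds to whole seconds (truncating).
    static long toSeconds(long ageMillis) {
        return TimeUnit.MILLISECONDS.toSeconds(ageMillis);
    }

    // Stand-in demo: reads this JVM's uptime, an attribute that always exists.
    // For a region server, pass its MBean name and "ageOfLastShippedOp" instead.
    static long readLocalUptimeMillis() {
        return readNumericAttribute(
            ManagementFactory.getPlatformMBeanServer(),
            "java.lang:type=Runtime",
            "Uptime");
    }
}
```

Against a remote region server the MBeanServerConnection would come from a JMX connector rather than the platform MBean server. As noted in the thread, the value is per region server and does not advance while replication is retrying, so it is only a rough analogue of MySQL's counter.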


On Wed, Mar 2, 2011 at 9:32 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
> Although, I would add that this feature is still experimental so who knows :)
>
> I think the worst that happened to us was that replication was broken
> (see the jira where if the master loses it's zk session with the slave
> zk ensemble, it requires a HBase restart on the master side) for a few
> days because of maintenance of the link between the two datacenters
> which took more than a minute. When we finally did restart the master
> cluster, it had to process about 2TBs of HLogs... those ICVs can
> really generate a lot of data!
>
> J-D

Re: Questions about HBase Cluster Replication

Posted by Jean-Daniel Cryans <jd...@apache.org>.

Although, I would add that this feature is still experimental so who knows :)

I think the worst that happened to us was that replication was broken
(see the jira where if the master loses its zk session with the slave
zk ensemble, it requires an HBase restart on the master side) for a few
days because of maintenance of the link between the two datacenters
which took more than a minute. When we finally did restart the master
cluster, it had to process about 2 TB of HLogs... those ICVs can
really generate a lot of data!

J-D

On Wed, Mar 2, 2011 at 9:25 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
>> 5. If one is adding replication on the *production* Master cluster, what's the
>> worst thing that can happen to this Master cluster?  Nothing scary other than
>> changing configs + interruption during a restart? (which is currently still bad
>> because of region assignments?)
>>
>
> The replication code is pretty much encapsulated from the rest of the
> region server code, it won't mess with your Puts or change your
> birthday date.
>
> With 0.90 the regions are reassigned where they were before, so it's
> really just the block cache that gets screwed.
>
> J-D
>

Re: Questions about HBase Cluster Replication

Posted by Jean-Daniel Cryans <jd...@apache.org>.

> 5. If one is adding replication on the *production* Master cluster, what's the
> worst thing that can happen to this Master cluster?  Nothing scary other than
> changing configs + interruption during a restart? (which is currently still bad
> because of region assignments?)
>

The replication code is pretty much encapsulated from the rest of the
region server code; it won't mess with your Puts or change your
birthday date.

With 0.90 the regions are reassigned where they were before, so it's
really just the block cache that gets screwed.

J-D

Re: Questions about HBase Cluster Replication

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Hello,

I've got an additional question below:

> 1. What happens if data is modified in the Slave cluster?  Is replication
> one-way, so if I delete, change, or add something in the Slave cluster, the
> Master cluster won't detect it and my changes will remain in the Slave cluster
> until one sunny day the same data in the Master cluster changes and replication
> overwrites my changes in the Slave cluster?
>
> 2. Is there a way to tell the replication "And now don't just send the edits -
> replicate all data now and overwrite everything in the Slave cluster"?
>
> 3. What has to happen for Master and Slave cluster(s) to get out of sync?
>
> 4. Are people using this in production?

5. If one is adding replication on the *production* Master cluster, what's the 
worst thing that can happen to this Master cluster?  Nothing scary other than 
changing configs + interruption during a restart? (which is currently still bad 
because of region assignments?)

Thanks,
Otis



----- Original Message ----
> From: Otis Gospodnetic <ot...@yahoo.com>
> To: user@hbase.apache.org
> Sent: Wed, March 2, 2011 1:05:14 PM
> Subject: Questions about HBase replication

Questions about HBase replication

Posted by Otis Gospodnetic <ot...@yahoo.com>.

Hello,

That http://hbase.apache.org/replication.html is informative, but I have some
more Qs (that may end up qualifying as FAQs):

1. What happens if data is modified in the Slave cluster?  Is replication 
one-way, so if I delete, change, or add something in the Slave cluster, the 
Master cluster won't detect it and my changes will remain in the Slave cluster 
until one sunny day the same data in the Master cluster changes and replication 
overwrites my changes in the Slave cluster?

2. Is there a way to tell the replication "And now don't just send the edits - 
replicate all data now and overwrite everything in the Slave cluster"?

3. What has to happen for Master and Slave cluster(s) to get out of sync?

4. Are people using this in production?

Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Jean-Daniel Cryans <jd...@apache.org>
> To: user@hbase.apache.org
> Sent: Wed, March 2, 2011 12:12:13 PM
> Subject: Re: HBase replication documentation
> 
> I think it's suppose to point there http://hbase.apache.org/replication.html
> 
> J-D

Re: HBase replication documentation

Posted by Jean-Daniel Cryans <jd...@apache.org>.

I think it's supposed to point there: http://hbase.apache.org/replication.html

J-D

On Wed, Mar 2, 2011 at 8:50 AM, Otis Gospodnetic
<ot...@yahoo.com> wrote:
> Hi,
>
> What's the best place to learn about HBase replication?
> I found http://hbase.apache.org/book/cluster_replication.html , but note how
> there is only a link there, and that link points to a 404.
>
> Thanks,
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
> Hadoop ecosystem search :: http://search-hadoop.com/
>
>