Posted to user@hbase.apache.org by Asaf Mesika <as...@gmail.com> on 2013/06/18 08:48:44 UTC

HBase Replication is talking to the wrong peer

Hi,

I have two clusters set up in a lab, each with 1 Master and 3 RS.
I'm inserting roughly 15GB into the master cluster, but I see a delay of
5 - 10 minutes between the master and slave clusters (ageOfLastShippedOp).

On my Graphite I see that replicateLogEntries_num_ops is increasing on only
one region server (IP 85) of the slave cluster, out of 3 (IPs 83,84,85).

I ran a grep on the logs of each region server of the master cluster, and
saw "Chosen peer" messages saying the following:
RS ip 74: Chosen peer 83
RS ip 75: Chosen peer 85
RS ip 76: Chosen peer 85

So first problem: Why are only two slave RS (83,85) receiving replicated
log entries instead of all 3?

Second and biggest problem: I ran netstat -tnp and grepped for 83,84,85 on
RS ip 74, and saw that it is in fact talking to RS 85, even though its log
says it chose peer 83!
This matches the Graphite graph of replicateLogEntries_num_ops, which
showed that only RS 85 was receiving replicated log entries.
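
To make the two discrepancies explicit, here is a minimal sketch that compares
what the logs claim with what netstat shows. The values are the last IP octets
quoted above; the variable names are hypothetical and nothing here calls HBase
itself.

```python
from collections import Counter

# Last IP octets copied from the findings above; names are illustrative only.
slave_rs = ["83", "84", "85"]
logged_peer = {"74": "83", "75": "85", "76": "85"}  # from "Chosen peer" log lines
observed_peer = {"74": "85"}                        # from netstat -tnp on RS 74

# Count how many master-side region servers logged each slave RS as their peer.
picks = Counter(logged_peer.values())
never_chosen = [rs for rs in slave_rs if picks[rs] == 0]

# Master-side region servers whose observed connection differs from the logged peer.
mismatches = {
    src: (logged_peer[src], dst)
    for src, dst in observed_peer.items()
    if logged_peer.get(src) != dst
}

print("never chosen:", never_chosen)
print("log vs netstat mismatches:", mismatches)
```

With the numbers above, RS 84 is never chosen, and RS 74 is the one whose
logged peer (83) differs from the connection actually seen (85).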

To me it looks like a bug.

Does anyone have any ideas on how to solve these two issues?

Re: HBase Replication is talking to the wrong peer

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Did you find what the issue was? From your other thread it looks like
you got it working.

Thx,

J-D

On Mon, Jun 17, 2013 at 11:48 PM, Asaf Mesika <as...@gmail.com> wrote:
> Hi,
>
> I have two cluster setup in a lab, each has 1 Master and 3 RS.
> I'm inserting roughly 15GB into the master cluster, but I see between 5 -
> 10 minutes delay between master and slave cluster (ageOfLastShippedOp) them.
>
> On my Graphite I see that replicateLogEntries_num_ops is increasing in one
> region server (IP 85) of the slave cluster, out of 3 (IPs 83,84,85).
>
> I ran a grep on the logs of each region server of the master, and saw
> Chosen peer message saying the following:
> RS ip 74: Chosen peer 83
> RS ip 75: Chosen peer 85
> RS ip 76: Chosen peer 85
>
> So first problem: Why only two slave RS (83,85) are receiving replicated
> log entries instead of 3?
>
> Second and biggest problem: I ran netstat -tnp and grepped for 83,84,85 on
> the RS ip 74, and saw that it is in fact talking with RS 85!
> This was correlated with the Graphite graph of replicateLogEntries_num_ops
> which showed that only RS 85 was receiving replicated log entries.
>
> For me it looks like a bug.
>
> Anyone has any ideas how to solve those two issues?