Posted to dev@hbase.apache.org by 何良均 <20...@163.com> on 2022/11/22 03:33:13 UTC

Re: [DISCUSS] Move replication queue storage from zookeeper to a separated HBase table

Last time we discussed the replication tooling in the design doc "Move replication queue storage from zookeeper to a separated HBase table". However, we think the implementation of the ReplicationSyncUp tool is somewhat complicated, so we have decided to discuss it again separately.

We plan to hold an online meeting from 7 PM to 8 PM on 23 Nov 2022 (GMT+8), using Tencent Meeting.


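何良均 (Liangjun He) invites you to a Tencent Meeting
Meeting topic: ReplicationSyncUp discussion
Meeting time: 2022/11/23 19:00-20:00 (GMT+08:00) China Standard Time - Beijing

Click the link to join the meeting, or add it to your meeting list:
https://meeting.tencent.com/dm/uAO9OU5ghD3y

#Tencent Meeting: 138-478-728
Meeting password: 432745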


More attendees are always welcome.

Re: [DISCUSS] Move replication queue storage from zookeeper to a separated HBase table

Posted by "张铎(Duo Zhang)" <pa...@gmail.com>.

Thanks, Liangjun, for putting this up. I think the discussion was very
constructive. We found some blind spots in the previous discussions;
for example, once the ReplicationSyncUp tool has finished, all the data
in hbase:replication is useless.

Later we also need to modify the design doc accordingly.

Liangjun He <20...@163.com> wrote on Mon, 28 Nov 2022 at 23:34:
>
> Attendees: Duo Zhang, Yu Li, Liangjun He
>
> First, Liangjun summarized the previous discussion of the ReplicationSyncUp tool (see: https://lists.apache.org/thread/1yzy60wbgomvlhlbocps1jklc0x5t349). In that proposal, the ReplicationSyncUp tool removes its ZK dependency by reading the latest snapshot of the hbase:replication table, and generates a new snapshot of the hbase:replication table after it finishes. In addition, when the master cluster recovers, HMaster must be started before the RegionServers so that the hbase:replication snapshot can be restored to a table. Duo thinks that using snapshots is relatively complicated and suggested flushing the hbase:replication table periodically instead, so that the ReplicationSyncUp tool can read the table data directly. We then discussed the relevant solutions; the discussion is summarized below:
>
> 1. How does the ReplicationSyncUp tool read the data of the hbase:replication table if we rely on periodically flushing the hbase:replication table?
>
> When the ReplicationSyncUp tool is executed, the master cluster is down. Because the hbase:replication table is flushed periodically, ReplicationSyncUp can read the hbase:replication table data directly offline. This approach poses no technical challenges and is simpler. Of course, the flush approach has the same problem as the snapshot approach: because flushes happen only periodically, the flushed data lags behind the actual replication progress, so some redundant data will be replicated to the slave cluster when ReplicationSyncUp is executed.
>
> 2. How should the hbase:replication table data be modified after ReplicationSyncUp is executed?
>
> After ReplicationSyncUp is executed, the data that the master cluster needed to replicate has already been replicated, so in theory the data in the hbase:replication table needs to be cleaned up. When ReplicationSyncUp is executed, a flag can be written to the file system; when the HMaster of the master cluster recovers, it can clean the data in the hbase:replication table according to this flag. After cleaning, the flag must be deleted to avoid cleaning the hbase:replication table repeatedly.
>
> 3. Does cleaning the hbase:replication table require that HMaster be started before the RegionServers when the master cluster recovers, to avoid inconsistency in the hbase:replication data?
>
> HMaster does not need to be started before the RegionServers, for two reasons:
>
> a. If a RegionServer is started first, it stays in the initialization state until HMaster is started. No regions are assigned to it, so no data needs to be replicated, and the hbase:replication table will not be modified;
>
> b. If a RegionServer is started first, it will not claim the replication queues of dead RegionServers, because claiming is launched from ServerCrashProcedure, and ServerCrashProcedure is executed by HMaster.
>
>
>
>
> Thanks.

Re: [DISCUSS] Move replication queue storage from zookeeper to a separated HBase table

Posted by Liangjun He <20...@163.com>.

Attendees: Duo Zhang, Yu Li, Liangjun He

First, Liangjun summarized the previous discussion of the ReplicationSyncUp tool (see: https://lists.apache.org/thread/1yzy60wbgomvlhlbocps1jklc0x5t349). In that proposal, the ReplicationSyncUp tool removes its ZK dependency by reading the latest snapshot of the hbase:replication table, and generates a new snapshot of the hbase:replication table after it finishes. In addition, when the master cluster recovers, HMaster must be started before the RegionServers so that the hbase:replication snapshot can be restored to a table. Duo thinks that using snapshots is relatively complicated and suggested flushing the hbase:replication table periodically instead, so that the ReplicationSyncUp tool can read the table data directly. We then discussed the relevant solutions; the discussion is summarized below:

1. How does the ReplicationSyncUp tool read the data of the hbase:replication table if we rely on periodically flushing the hbase:replication table?

When the ReplicationSyncUp tool is executed, the master cluster is down. Because the hbase:replication table is flushed periodically, ReplicationSyncUp can read the hbase:replication table data directly offline. This approach poses no technical challenges and is simpler. Of course, the flush approach has the same problem as the snapshot approach: because flushes happen only periodically, the flushed data lags behind the actual replication progress, so some redundant data will be replicated to the slave cluster when ReplicationSyncUp is executed.
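
To make the flush lag concrete, here is a toy model in Java. It is only an illustration, not HBase code: the class FlushLagModel and all of its methods are made up, and a replication position is reduced to a single counter. It shows why an offline reader that sees only flushed data would re-ship entries recorded after the last flush.

```java
/** Toy model (not HBase code): a replication offset that becomes durable only on flush. */
final class FlushLagModel {
    // Latest replicated position, held only in memory (hypothetical simplification).
    private long inMemoryOffset = 0;
    // Position that reached disk at the last periodic flush.
    private long flushedOffset = 0;

    /** The cluster ships one more entry and records the new position in memory. */
    void shipOneEntry() {
        inMemoryOffset++;
    }

    /** Periodic flush: the in-memory position becomes durable. */
    void flush() {
        flushedOffset = inMemoryOffset;
    }

    /** What an offline reader sees after a crash: only the flushed position. */
    long offsetVisibleOffline() {
        return flushedOffset;
    }

    /** Entries already shipped that an offline sync-up would ship again. */
    long redundantEntries() {
        return inMemoryOffset - flushedOffset;
    }

    public static void main(String[] args) {
        FlushLagModel model = new FlushLagModel();
        for (int i = 0; i < 10; i++) {
            model.shipOneEntry();
        }
        model.flush(); // positions up to 10 are now durable
        for (int i = 0; i < 3; i++) {
            model.shipOneEntry(); // the cluster goes down before the next flush
        }
        System.out.println("offline reader resumes from " + model.offsetVisibleOffline());
        System.out.println("entries shipped twice: " + model.redundantEntries());
    }
}
```

In this model a shorter flush interval shrinks the redundant range but never removes it, which matches the point above that flush and snapshot share the same problem.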

2. How should the hbase:replication table data be modified after ReplicationSyncUp is executed?

After ReplicationSyncUp is executed, the data that the master cluster needed to replicate has already been replicated, so in theory the data in the hbase:replication table needs to be cleaned up. When ReplicationSyncUp is executed, a flag can be written to the file system; when the HMaster of the master cluster recovers, it can clean the data in the hbase:replication table according to this flag. After cleaning, the flag must be deleted to avoid cleaning the hbase:replication table repeatedly.
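
The flag handshake could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class SyncUpFlagSketch, the flag file name, and the use of java.nio.file on a local directory (a real implementation would presumably go through the cluster's file system and the actual table cleanup code, neither of which is shown here).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch of the flag handshake; names and the local-filesystem API are illustrative only. */
final class SyncUpFlagSketch {
    /** Hypothetical flag file name. */
    static final String FLAG_NAME = "replication-syncup-done.flag";

    /** Called by the sync-up tool once it has finished replicating. */
    static void markSyncUpDone(Path dir) throws IOException {
        Path flag = dir.resolve(FLAG_NAME);
        if (!Files.exists(flag)) {
            Files.createFile(flag);
        }
    }

    /**
     * Called by the master when it recovers. Runs the cleanup only if the flag
     * exists, then deletes the flag so that a later restart does not clean again.
     * Returns true if the cleanup ran.
     */
    static boolean cleanIfFlagged(Path dir, Runnable cleanReplicationTable) throws IOException {
        Path flag = dir.resolve(FLAG_NAME);
        if (!Files.exists(flag)) {
            return false;
        }
        cleanReplicationTable.run(); // placeholder for cleaning hbase:replication
        Files.delete(flag);          // delete only after the cleanup has completed
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("syncup-flag-demo");
        markSyncUpDone(dir);
        // First recovery: flag present, cleanup runs.
        System.out.println(cleanIfFlagged(dir, () -> System.out.println("cleaning table")));
        // Second recovery: flag already deleted, nothing happens.
        System.out.println(cleanIfFlagged(dir, () -> System.out.println("cleaning table")));
    }
}
```

Deleting the flag only after the cleanup means that a crash between the two steps leads to the cleanup being run again on the next start, so this ordering assumes the cleanup is safe to repeat.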

3. Does cleaning the hbase:replication table require that HMaster be started before the RegionServers when the master cluster recovers, to avoid inconsistency in the hbase:replication data?

HMaster does not need to be started before the RegionServers, for two reasons:

a. If a RegionServer is started first, it stays in the initialization state until HMaster is started. No regions are assigned to it, so no data needs to be replicated, and the hbase:replication table will not be modified;

b. If a RegionServer is started first, it will not claim the replication queues of dead RegionServers, because claiming is launched from ServerCrashProcedure, and ServerCrashProcedure is executed by HMaster.

Thanks.
