Posted to issues@hbase.apache.org by "chenglei (Jira)" <ji...@apache.org> on 2022/04/19 10:26:00 UTC
[jira] [Updated] (HBASE-26960) Another case for unnecessary replication suspending in RegionReplicationSink
[ https://issues.apache.org/jira/browse/HBASE-26960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenglei updated HBASE-26960:
-----------------------------
Description:
There is another case in which replication in {{RegionReplicationSink}} can be suspended unnecessarily:
When there is a replication error, {{RegionReplicationSink}} invokes {{MemStoreFlusher#requestFlush}} to request a flush, and after receiving the {{FlushAction#START_FLUSH}} or {{FlushAction#CANNOT_FLUSH}} flush marker, it resumes replication. But when {{MemStoreFlusher}} performs the flush, it invokes the following method, {{HRegion.flushcache}}, with {{writeFlushRequestWalMarker}} set to false:
{code:java}
public FlushResultImpl flushcache(List<byte[]> families,
    boolean writeFlushRequestWalMarker, FlushLifeCycleTracker tracker) throws IOException {
  // method body omitted
}
{code}
When {{writeFlushRequestWalMarker}} is set to false, {{HRegion.flushcache}} does not write the {{FlushAction#CANNOT_FLUSH}} flush marker to the {{WAL}} when the memstore is empty. So if a replication error occurs while the memstore is empty (e.g. while replicating the {{FlushAction#START_FLUSH}} or {{FlushAction#COMMIT_FLUSH}} marker), replication may stay suspended until the next memstore flush, even though there are user writes and they could be replicated normally.
I simulate this problem in the PR. The {{writeFlushRequestWalMarker}} parameter was introduced by HBASE-11580 and only determines whether the {{FlushAction#CANNOT_FLUSH}} flush marker is written to the WAL when the memstore is empty, so I think that, for simplicity, we could always set it to true for {{MemStoreFlusher}}.
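To make the scenario concrete, here is a minimal toy model of the behavior described above. It is only a sketch and does not use any HBase classes: {{FlushMarkerSketch}}, {{ToySink}}, {{flush}} and {{suspendedAfterRequestedFlush}} are hypothetical names invented for illustration, and the real logic in {{RegionReplicationSink}} and {{HRegion}} is more involved. It only assumes what the description states: the sink suspends on a replication error, resumes on a {{START_FLUSH}} or {{CANNOT_FLUSH}} marker, and an empty memstore produces a {{CANNOT_FLUSH}} marker only when {{writeFlushRequestWalMarker}} is true.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are HBase classes.
public class FlushMarkerSketch {
  enum Marker { START_FLUSH, COMMIT_FLUSH, CANNOT_FLUSH }

  /** Hypothetical stand-in for the sink: suspended on error, resumed by a flush marker. */
  static class ToySink {
    boolean suspended;

    void onReplicationError() {
      suspended = true;
    }

    void onMarker(Marker marker) {
      // Per the description, START_FLUSH or CANNOT_FLUSH resumes replication.
      if (marker == Marker.START_FLUSH || marker == Marker.CANNOT_FLUSH) {
        suspended = false;
      }
    }
  }

  /** Hypothetical stand-in for the flush path: returns the markers written to the WAL. */
  static List<Marker> flush(int memstoreSize, boolean writeFlushRequestWalMarker) {
    List<Marker> markers = new ArrayList<>();
    if (memstoreSize == 0) {
      // Empty memstore: a marker is written only if the caller asked for it.
      if (writeFlushRequestWalMarker) {
        markers.add(Marker.CANNOT_FLUSH);
      }
      return markers;
    }
    markers.add(Marker.START_FLUSH);
    markers.add(Marker.COMMIT_FLUSH);
    return markers;
  }

  /** Simulates: replication error, requested flush, then delivery of the markers to the sink. */
  static boolean suspendedAfterRequestedFlush(int memstoreSize, boolean writeMarker) {
    ToySink sink = new ToySink();
    sink.onReplicationError();
    for (Marker marker : flush(memstoreSize, writeMarker)) {
      sink.onMarker(marker);
    }
    return sink.suspended;
  }

  public static void main(String[] args) {
    // Empty memstore, flag false: no marker is written, so the sink stays suspended.
    System.out.println(suspendedAfterRequestedFlush(0, false));
    // Empty memstore, flag true: CANNOT_FLUSH is written and the sink resumes.
    System.out.println(suspendedAfterRequestedFlush(0, true));
    // Non-empty memstore: START_FLUSH resumes the sink regardless of the flag.
    System.out.println(suspendedAfterRequestedFlush(10, false));
  }
}
```

In this model, only the combination of an empty memstore and {{writeFlushRequestWalMarker}} set to false leaves the sink suspended, which is the case the proposed change is meant to remove.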
> Another case for unnecessary replication suspending in RegionReplicationSink
> ----------------------------------------------------------------------------
>
> Key: HBASE-26960
> URL: https://issues.apache.org/jira/browse/HBASE-26960
> Project: HBase
> Issue Type: Bug
> Components: read replicas
> Affects Versions: 3.0.0-alpha-2
> Reporter: chenglei
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.20.1#820001)