Posted to dev@hbase.apache.org by "chenglei (Jira)" <ji...@apache.org> on 2022/07/09 03:33:00 UTC
[jira] [Resolved] (HBASE-26950) Use AsyncConnection in ReplicationSink
[ https://issues.apache.org/jira/browse/HBASE-26950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenglei resolved HBASE-26950.
------------------------------
Assignee: chenglei
Resolution: Fixed
> Use AsyncConnection in ReplicationSink
> --------------------------------------
>
> Key: HBASE-26950
> URL: https://issues.apache.org/jira/browse/HBASE-26950
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.4.11
> Reporter: Bryan Beaudreault
> Assignee: chenglei
> Priority: Major
> Labels: branch-2, replication
> Fix For: 2.5.0, 2.6.0
>
>
> We don't necessarily need to rewrite ReplicationSink to work fully async. I think it would simply benefit from using ConnectionFactory.createAsyncConnection instead of ConnectionFactory.createConnection.
> The reasons for this are:
> * AsyncConnection is the more modern implementation, the only implementation in master, and where most future development effort will go.
> * ReplicationSink only does batch calls, and batch calls are done with AsyncProcess. It's likely that the native AsyncTable is better than AsyncProcess for this.
> ** One specific example: AsyncProcess calls findAllLocationsOrFail sequentially for all actions in a batch. This can take quite a while with the default replication batch size of 5k if actions are spread across many regions. In AsyncTable, these calls are done in parallel.
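
The sequential-versus-parallel lookup difference described above can be illustrated with a minimal, standard-library-only sketch. It is not HBase code: the class LocateSketch, the locate method, and the simulated latency are hypothetical stand-ins for a region location lookup, assumed here only to show why issuing all lookups at once tends to finish faster than issuing them one after another.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names exist in HBase.
class LocateSketch {

    // Stand-in for one region location lookup; the sleep mimics a remote round trip.
    static String locate(String rowKey, long latencyMillis) {
        try {
            Thread.sleep(latencyMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "region-for-" + rowKey;
    }

    // Sequential pattern: each lookup waits for the previous one,
    // so total time grows with the number of actions in the batch.
    static List<String> locateSequentially(List<String> rowKeys, long latencyMillis) {
        List<String> locations = new ArrayList<>();
        for (String rowKey : rowKeys) {
            locations.add(locate(rowKey, latencyMillis));
        }
        return locations;
    }

    // Parallel pattern: submit every lookup first, then wait for all of them.
    // Results are returned in the same order as the input keys.
    static List<String> locateInParallel(List<String> rowKeys, long latencyMillis,
                                         ExecutorService pool) {
        List<CompletableFuture<String>> futures = rowKeys.stream()
            .map(rowKey -> CompletableFuture.supplyAsync(() -> locate(rowKey, latencyMillis), pool))
            .collect(Collectors.toList());
        return futures.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rowKeys = Arrays.asList("a", "b", "c", "d");
        ExecutorService pool = Executors.newFixedThreadPool(rowKeys.size());
        try {
            long start = System.nanoTime();
            locateSequentially(rowKeys, 50);
            long afterSequential = System.nanoTime();
            locateInParallel(rowKeys, 50, pool);
            long afterParallel = System.nanoTime();
            System.out.println("sequential ms: " + (afterSequential - start) / 1_000_000);
            System.out.println("parallel ms:   " + (afterParallel - afterSequential) / 1_000_000);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a fixed per-lookup latency, the sequential variant takes roughly the latency multiplied by the number of keys, while the parallel variant takes roughly one latency period as long as the pool has a thread per key. Actual HBase behavior also depends on location caching and meta table load, which this sketch does not model.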
--
This message was sent by Atlassian Jira
(v8.20.10#820010)