Posted to issues@hbase.apache.org by "chenglei (Jira)" <ji...@apache.org> on 2022/07/04 04:15:00 UTC

[jira] [Comment Edited] (HBASE-26950) Use AsyncConnection in ReplicationSink

    [ https://issues.apache.org/jira/browse/HBASE-26950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17561993#comment-17561993 ] 

chenglei edited comment on HBASE-26950 at 7/4/22 4:14 AM:
----------------------------------------------------------

[~bbeaudreault], master already uses {{AsyncClusterConnection}} for {{ReplicationSink}}. For branch-2, the plan is to rewrite {{ReplicationSink.batch}} with {{AsyncConnection}}, as master does, and to keep using {{Connection}} for {{HFileReplicator}}. I have opened a PR for branch-2.


was (Author: comnetwork):
[~bbeaudreault], master has already used {{AsyncClusterConnection}} for {{ReplicationSink}}. for branch-2,  what we're going to do is to just use {{AsyncConnection}} to rewrite the {{ReplicationSink.batch}},and for {{HFileReplicator}}, we continue to use the {{Connection}}.  I have opened a PR for branch-2.

> Use AsyncConnection in ReplicationSink
> --------------------------------------
>
>                 Key: HBASE-26950
>                 URL: https://issues.apache.org/jira/browse/HBASE-26950
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.4.11
>            Reporter: Bryan Beaudreault
>            Priority: Major
>
> We don't necessarily need to rewrite ReplicationSink to work fully async. I think it would simply benefit from using ConnectionFactory.createAsyncConnection instead of ConnectionFactory.createConnection.
> The reasons for this are:
>  * AsyncConnection is the more modern implementation, the only implementation in master, and where most future effort will go.
>  * ReplicationSink only does batch calls, and batch calls are done with AsyncProcess. It's likely that the native AsyncTable is better than AsyncProcess for this.
>  ** One specific example: AsyncProcess calls findAllLocationsOrFail sequentially for all actions in a batch. With the default replication batch size of 5k, this can take quite a while if the actions are spread across many regions. In AsyncTable, these calls are done in parallel.
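
The sequential-versus-parallel point in the description can be illustrated with a small sketch. This is not HBase code: the class LocationLookupSketch, the locate method, the fake region names, and the 20 ms latency are all hypothetical stand-ins, and only java.util.concurrent is used. It assumes each region lookup is an independent blocking call, which is roughly the situation the description attributes to findAllLocationsOrFail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Hypothetical stand-in, NOT HBase code: models resolving the region for each
// row of a replication batch, first one at a time and then concurrently.
final class LocationLookupSketch {

  // Assumed per-lookup latency in milliseconds (made up for illustration).
  static final long LOOKUP_MILLIS = 20;

  // Pretend region lookup: sleeps to simulate a remote call, then returns a
  // fake region name derived from the row key.
  static String locate(String row) {
    try {
      Thread.sleep(LOOKUP_MILLIS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return "region-" + Math.floorMod(row.hashCode(), 4);
  }

  // Sequential style: every lookup waits for the previous one to finish,
  // so total latency grows with the number of rows.
  static List<String> locateSequentially(List<String> rows) {
    List<String> regions = new ArrayList<>();
    for (String row : rows) {
      regions.add(locate(row));
    }
    return regions;
  }

  // Parallel style: start all lookups first, then wait for the results.
  // Results are collected in the same order as the input rows.
  static List<String> locateInParallel(List<String> rows) {
    int threads = Math.max(1, Math.min(rows.size(), 16));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<CompletableFuture<String>> futures = rows.stream()
          .map(row -> CompletableFuture.supplyAsync(() -> locate(row), pool))
          .collect(Collectors.toList());
      return futures.stream()
          .map(CompletableFuture::join)
          .collect(Collectors.toList());
    } finally {
      pool.shutdown();
    }
  }
}
```

Both methods return the same list. The difference is that the sequential version pays the lookup latency once per row, while the parallel version overlaps the lookups up to the pool size. With a 5k-action batch spread over many regions, that gap is the cost the description refers to.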



--
This message was sent by Atlassian Jira
(v8.20.10#820010)