Posted to issues@flink.apache.org by "TingWuHuang (Jira)" <ji...@apache.org> on 2021/01/10 06:44:00 UTC

[jira] [Comment Edited] (FLINK-20460) Support async lookup for HBase connector

    [ https://issues.apache.org/jira/browse/FLINK-20460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262083#comment-17262083 ] 

TingWuHuang edited comment on FLINK-20460 at 1/10/21, 6:43 AM:
---------------------------------------------------------------

We have recently completed development of this feature and found that only HBase 2.0 provides an async client; HBase 1.0 does not provide async methods. So there are now three ways to do this. First, define a thread pool that calls the synchronous client to simulate asynchronous requests. Second, have only the HBase 2.2 connector use asynchronous requests. Third, introduce an externally implemented asynchronous client, such as Asynchbase; however, the functions such clients provide can only retrieve column data one column family at a time. From the above analysis, the first way is probably better. If you agree, could you please assign it to me? Thanks.
Detailed scheme:
1. Add a lookup.isAsync option to control whether asynchronous mode is enabled.
2. Use a FixedThreadPool to handle requests. A FixedThreadPool has a fixed number of threads and an unbounded work queue, which fits here because we cannot create threads without limit. Inside each pool thread, get the connection and look up the database.
3. Use a Guava cache as an in-memory cache, as JdbcRowDataLookupFunction does.
4. To expose cache usage, we can also add a hitRate metric.
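
To make the scheme above concrete, here is a minimal sketch using only the JDK. It is not the connector code: the class name AsyncLookupSketch, the syncLookup function (a stand-in for a blocking HBase Get), and the LinkedHashMap-based LRU (a stand-in for a Guava cache) are all assumptions for illustration, and the lookup.isAsync switch is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

/** Hypothetical sketch: simulates async lookups by running a blocking lookup on a fixed thread pool. */
class AsyncLookupSketch implements AutoCloseable {
    // Stand-in for the blocking client call (assumption, not the HBase API).
    private final Function<String, String> syncLookup;
    // Fixed number of threads; Executors.newFixedThreadPool uses an unbounded work queue.
    private final ExecutorService pool;
    // Access-ordered LinkedHashMap used as a small LRU cache (stand-in for a Guava cache).
    private final Map<String, String> cache;
    private final AtomicLong requests = new AtomicLong();
    private final AtomicLong hits = new AtomicLong();

    AsyncLookupSketch(Function<String, String> syncLookup, int threads, int maxCacheSize) {
        this.syncLookup = syncLookup;
        this.pool = Executors.newFixedThreadPool(threads);
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxCacheSize; // evict the least recently used entry
            }
        };
    }

    /** Returns immediately; the blocking lookup runs on a pool thread on a cache miss. */
    CompletableFuture<String> lookup(String rowKey) {
        requests.incrementAndGet();
        String cached;
        synchronized (cache) {
            cached = cache.get(rowKey);
        }
        if (cached != null) {
            hits.incrementAndGet();
            return CompletableFuture.completedFuture(cached);
        }
        return CompletableFuture.supplyAsync(() -> {
            String value = syncLookup.apply(rowKey);
            if (value != null) {
                synchronized (cache) {
                    cache.put(rowKey, value);
                }
            }
            return value;
        }, pool);
    }

    /** Fraction of lookups served from the cache; 0.0 before any lookup. */
    double hitRate() {
        long total = requests.get();
        return total == 0 ? 0.0 : (double) hits.get() / total;
    }

    @Override
    public void close() {
        pool.shutdown();
    }
}
```

In an actual AsyncTableFunction, the returned future would be used to complete the result future that Flink passes to eval, and hitRate would be registered as a gauge; both are omitted here to keep the sketch self-contained.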

cc [~jark]



> Support async lookup for HBase connector
> ----------------------------------------
>
>                 Key: FLINK-20460
>                 URL: https://issues.apache.org/jira/browse/FLINK-20460
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / HBase, Table SQL / Ecosystem
>            Reporter: Jark Wu
>            Priority: Major
>
> Currently, {{HBaseRowDataLookupFunction}} implements {{TableFunction}}, which is a sync operation. It would be better to have an {{AsyncTableFunction}} implementation, which has better performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)