Posted to issues@kylin.apache.org by "Xiaoxiang Yu (Jira)" <ji...@apache.org> on 2019/09/22 06:55:00 UTC

[jira] [Comment Edited] (KYLIN-4141) Build Global Dictionary in no time

    [ https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935228#comment-16935228 ] 

Xiaoxiang Yu edited comment on KYLIN-4141 at 9/22/19 6:54 AM:
--------------------------------------------------------------

h2. Performance

On my CDH cluster (CDH 5.7), I measured the following response times:

 
||Action||Avg Response Time||Response Time Range||
|HBase CheckAndPut|0.5 ms|0.4 ms ~ 1.0 ms|
|HBase Get|0.4 ms|0.3 ms ~ 1.0 ms|
|RocksDB Get|0.10 ms|0.05 ms ~ 2.0 ms|
|RocksDB Put|0.005 ms|0.001 ms ~ 0.01 ms|

 

In the worst case, the input string is a value this receiver has never seen and that already exists remotely, so encoding goes through four steps and costs about 1.0 ms on average:

 1. first try to get it from the local RocksDB (Not Found),

 2. then do a CheckAndPut to HBase (Remote Exists),

 3. then Get from HBase,

 4. then set it in the local RocksDB.

In the better case, only one CheckAndPut to HBase is needed, and encoding costs about 0.6 ms on average.

In the best case, the value is found in the local RocksDB, and encoding costs about 0.1 ms on average.
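
The three cases can be sketched roughly as follows. This is only an illustration and not the Kylin implementation: plain Python dicts stand in for the local RocksDB cache and the shared HBase table, `RemoteDict`, `Receiver` and `estimated_ms` are hypothetical names, and the shared counter used to propose candidate IDs is an assumption (how IDs are allocated is not described here). The per-operation costs are the average values from the table, and the better case is assumed to also include the local lookup miss and the local write.

```python
import itertools

# Average latencies in milliseconds, copied from the table above.
AVG_MS = {
    "rocksdb_get": 0.10,
    "hbase_check_and_put": 0.5,
    "hbase_get": 0.4,
    "rocksdb_put": 0.005,
}


class RemoteDict:
    """In-memory stand-in for the shared HBase table (hypothetical)."""

    def __init__(self):
        self._rows = {}

    def check_and_put(self, key, candidate_id):
        # Mimics an atomic put-if-absent: succeeds only when the key is new.
        if key in self._rows:
            return False
        self._rows[key] = candidate_id
        return True

    def get(self, key):
        return self._rows[key]


class Receiver:
    """One streaming receiver with its own local cache (stand-in for RocksDB)."""

    def __init__(self, remote, id_source):
        self.local = {}
        self.remote = remote
        self.id_source = id_source  # assumption: shared source of candidate IDs

    def encode(self, value):
        """Return (dict_id, ops); ops lists the store operations that were used."""
        ops = ["rocksdb_get"]
        if value in self.local:
            # Best case: local hit.
            return self.local[value], ops
        ops.append("hbase_check_and_put")
        candidate = next(self.id_source)
        if self.remote.check_and_put(value, candidate):
            # Better case: this receiver registered the value first.
            dict_id = candidate
        else:
            # Worst case: another receiver already registered it.
            ops.append("hbase_get")
            dict_id = self.remote.get(value)
        ops.append("rocksdb_put")
        self.local[value] = dict_id
        return dict_id, ops


def estimated_ms(ops):
    """Sum the average cost of each operation in ops."""
    return sum(AVG_MS[op] for op in ops)


remote = RemoteDict()
ids = itertools.count(1)
receiver_a = Receiver(remote, ids)
receiver_b = Receiver(remote, ids)

id_better, ops_better = receiver_a.encode("user-42")  # new everywhere
id_worst, ops_worst = receiver_b.encode("user-42")    # new locally, exists remotely
id_best, ops_best = receiver_a.encode("user-42")      # already cached locally

for name, ops in [("better", ops_better), ("worst", ops_worst), ("best", ops_best)]:
    print(name, ops, round(estimated_ms(ops), 3))
```

Summing the averages this way gives roughly 1.0 ms, 0.6 ms and 0.1 ms for the worst, better and best cases, in line with the figures quoted above.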

 

With my test messages, the ingest rate I measured was about 1.8k ~ 2.1k messages per second.

 

 

 !image-2019-09-22-14-54-07-772.png! 

 

 



> Build Global Dictionary in no time
> ----------------------------------
>
>                 Key: KYLIN-4141
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4141
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Real-time Streaming
>    Affects Versions: v3.0.0-beta
>            Reporter: Xiaoxiang Yu
>            Assignee: Xiaoxiang Yu
>            Priority: Major
>             Fix For: v3.0.0-beta
>
>         Attachments: image-2019-09-20-19-04-47-937.png, image-2019-09-20-19-04-55-935.png, image-2019-09-20-20-06-15-960.png, image-2019-09-22-14-54-07-772.png
>
>
> h2. Background
> Currently, realtime OLAP does not support COUNT_DISTINCT(bitmap) for non-integer types.
> This is because it lacks the ability to encode strings immediately, so I want to use RocksDB & HBase to implement a streaming distributed dictionary. 
> h2. Design
>  # each receiver will own a local dict cache
>  # all receivers will share a remote dict storage
>  # we choose RocksDB as the local dict cache
>  # we choose HBase as the remote dict storage
>  
>  # for each cube, we will create a local dict and an HBase table
>  # we will create a column family in both RocksDB and HBase for each column that occurs in COUNT_DISTINCT
> h2. Design Diagram
> !image-2019-09-20-19-04-47-937.png!
> !image-2019-09-20-19-04-55-935.png!
>  
> !image-2019-09-20-20-06-15-960.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)