Posted to issues@kylin.apache.org by "ZhouKang (Jira)" <ji...@apache.org> on 2019/12/31 06:57:00 UTC

[jira] [Updated] (KYLIN-4322) ROA of compression HBase result

     [ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZhouKang updated KYLIN-4322:
----------------------------
    Summary: ROA of compression HBase result  (was: ROC of compression HBase result)

> ROA of compression HBase result
> -------------------------------
>
>                 Key: KYLIN-4322
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4322
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: ZhouKang
>            Priority: Major
>
> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
> In our production environment, when the HBase scan result is larger than 200M, compressing the data takes more than 10s.
> We can see this in HBase's log:
> ||Size||avg rate||max rate||avg time||max time||
> |<1M|0.12|0.25|0.18ms|0.7s|
> |1M ~ 10M|0.39|0.97|0.2s|0.6s|
> |10M ~ 100M|0.47|0.81|2s|6.3s|
> |>100M|0.95|0.96|15.7s|24.8s|
> rate: compressed data size / original data size
>  Also, please note that
> when the source data size is less than 1M, 65% of the compressed results are larger than the source data.
> When the source data is less than 10M, the data transmission latency is acceptable. When the data is larger than 100M, compressing it takes a long time.
>  
> So I think kylin.storage.hbase.endpoint-compress-result should be FALSE by default.
>  
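
The "rate" and "time" columns reported above can be reproduced in miniature with a small measurement sketch. This is only an illustration: it uses Python's zlib as a stand-in codec, which is an assumption and not necessarily what the Kylin HBase endpoint uses, and the helper name measure_compression is hypothetical. It computes rate as compressed size / original size, matching the definition in the issue.

```python
import os
import time
import zlib


def measure_compression(payload: bytes, level: int = 6):
    """Return (rate, seconds) for one payload.

    rate = compressed size / original size, as defined in the issue.
    zlib is a stand-in codec here (an assumption, not Kylin's actual one).
    """
    if not payload:
        raise ValueError("payload must not be empty")
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    return len(compressed) / len(payload), elapsed


if __name__ == "__main__":
    # Synthetic payloads, chosen only to show both ends of the rate range.
    samples = {
        "random 1 KiB": os.urandom(1024),
        "repetitive ~1 MiB": b"kylin" * 200_000,
    }
    for name, data in samples.items():
        rate, seconds = measure_compression(data)
        print(f"{name}: rate={rate:.3f}, time={seconds * 1000:.2f} ms")
```

A rate above 1 means the compressed output is larger than the input, which is the situation the issue describes for many results under 1M. Timings from this sketch depend on the machine and the codec, so they are not comparable to the production numbers in the table.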



--
This message was sent by Atlassian Jira
(v8.3.4#803005)