Posted to issues@kylin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/12/31 07:39:00 UTC

[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result

    [ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005968#comment-17005968 ] 

ASF GitHub Bot commented on KYLIN-4322:
---------------------------------------

zhoukangcn commented on pull request #1033: KYLIN-4322: set storage.hbase.endpoint-compress-result default value …
URL: https://github.com/apache/kylin/pull/1033
 
 
   …false
   
   see: https://issues.apache.org/jira/browse/KYLIN-4322
   
 


> Cost–benefit of compression HBase result
> ----------------------------------------
>
>                 Key: KYLIN-4322
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4322
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: ZhouKang
>            Priority: Major
>
> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
> In our production environment, when the HBase scan result is larger than 200M, compressing the data takes more than 10s.
> The HBase log shows this:
> ||Size||avg rate||max rate||avg time||max time||
> |<1M|0.12|0.25|0.18ms|0.7s|
> |1M ~ 10M|0.39|0.97|0.2s|0.6s|
> |10M ~ 100M|0.47|0.81|2s|6.3s|
> |>100M|0.95|0.96|15.7s|24.8s|
> rate: compressed data size / original data size
> Please also note that when the source data is smaller than 1M, in 65% of cases the compressed data is larger than the source data.
> When the source data is smaller than 10M, the data transmission latency is acceptable. When the data is larger than 100M, compressing it takes a long time.
>  
> So I think kylin.storage.hbase.endpoint-compress-result should be FALSE by default.
>  
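
The "rate" used in the table above (compressed data size / original data size) and the compression time can be measured with a small script. The following is only a minimal sketch: it uses Python's standard zlib module as a stand-in codec, since the message does not say which codec the HBase endpoint uses, and the helper name compression_stats and the sample payloads are hypothetical. It illustrates how a small, poorly compressible payload can end up with a rate above 1.0.

```python
import os
import time
import zlib


def compression_stats(payload: bytes, level: int = 6):
    """Return (rate, seconds) for one payload.

    rate = compressed size / original size, as defined in the issue.
    zlib is an assumed stand-in; the real endpoint codec may differ.
    """
    if not payload:
        raise ValueError("payload must not be empty")
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    return len(compressed) / len(payload), elapsed


if __name__ == "__main__":
    # Hypothetical sample payloads, not data from the issue.
    samples = {
        "small random (64 B)": os.urandom(64),
        "repetitive (about 1 MB)": b"kylin" * 200_000,
    }
    for name, data in samples.items():
        rate, seconds = compression_stats(data)
        print(f"{name}: rate={rate:.2f}, time={seconds * 1000:.2f} ms")
```

A rate above 1.0 means the compressed result is larger than the input, which is the case described for many results under 1M. Timings from this sketch depend on the machine and codec and are not comparable to the figures in the table.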


