Posted to issues@kylin.apache.org by "liyang (JIRA)" <ji...@apache.org> on 2017/02/17 05:03:42 UTC

[jira] [Resolved] (KYLIN-2442) Re-calculate expansion rate, count raw data size regardless of flat table compression

     [ https://issues.apache.org/jira/browse/KYLIN-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang resolved KYLIN-2442.
---------------------------
       Resolution: Fixed
         Assignee: liyang
    Fix Version/s: v2.0.0

> Re-calculate expansion rate, count raw data size regardless of flat table compression
> -------------------------------------------------------------------------------------
>
>                 Key: KYLIN-2442
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2442
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: liyang
>            Assignee: liyang
>             Fix For: v2.0.0
>
>
> Right now the expansion rate is calculated as "Cube Size / Raw Data Size", where the raw data size is the size of the intermediate Hive table. This means the Raw Data Size depends on the compression format of the intermediate table, which affects the correctness of the expansion rate and of other estimates based on the raw data size.
> The change calculates the Raw Data Size from the uncompressed cell values of the intermediate Hive table: each cell is taken in its string form, and the UTF-8 byte sizes of those strings are summed. The result serves as the Raw Data Size and is stable regardless of compression and other environment parameters.
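
The calculation described above can be sketched roughly as follows. This is only an illustration, not the code committed for KYLIN-2442: the class name RawDataSizeSketch and its methods are hypothetical, rows are modelled as in-memory lists rather than read from Hive, and treating a null cell as 0 bytes is an assumption that the issue text does not state.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Kylin code base.
public class RawDataSizeSketch {

    // Bytes contributed by one cell: its string form, encoded as UTF-8.
    // Assumption: a null cell contributes 0 bytes.
    public static long cellBytes(Object cell) {
        if (cell == null) {
            return 0L;
        }
        return String.valueOf(cell).getBytes(StandardCharsets.UTF_8).length;
    }

    // Raw Data Size: sum of the UTF-8 byte sizes of all cells in all rows.
    // The result does not depend on how the table is stored or compressed.
    public static long rawDataSize(List<List<Object>> rows) {
        long total = 0L;
        for (List<Object> row : rows) {
            for (Object cell : row) {
                total += cellBytes(cell);
            }
        }
        return total;
    }

    // Expansion rate as defined in the issue: Cube Size / Raw Data Size.
    public static double expansionRate(long cubeSizeBytes, long rawDataSizeBytes) {
        if (rawDataSizeBytes <= 0) {
            throw new IllegalArgumentException("raw data size must be positive");
        }
        return (double) cubeSizeBytes / rawDataSizeBytes;
    }

    public static void main(String[] args) {
        // Two made-up rows standing in for the intermediate Hive table.
        List<List<Object>> rows = Arrays.asList(
                Arrays.<Object>asList("2017-02-17", 42, "\u00e9"),
                Arrays.<Object>asList("kylin", null, 3.5));
        long raw = rawDataSize(rows);
        System.out.println("raw data size (bytes): " + raw);
        System.out.println("expansion rate: " + expansionRate(2 * raw, raw));
    }
}
```

Because the sum is taken over the string form of each cell, a non-ASCII character such as "\u00e9" counts by its UTF-8 length (2 bytes here) rather than by character count.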



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)