Posted to issues@kylin.apache.org by "kangkaisen (JIRA)" <ji...@apache.org> on 2016/05/16 03:20:12 UTC

[jira] [Created] (KYLIN-1694) make multiply coefficient configurable when estimating cuboid size

kangkaisen created KYLIN-1694:
---------------------------------

             Summary: make multiply coefficient configurable when estimating cuboid size
                 Key: KYLIN-1694
                 URL: https://issues.apache.org/jira/browse/KYLIN-1694
             Project: Kylin
          Issue Type: Bug
          Components: Job Engine
    Affects Versions: v1.5.1, v1.5.0
            Reporter: kangkaisen
            Assignee: Dong Li


In the current version of the MRv2 build engine, CubeStatsReader estimates cuboid size using fixed multipliers: "cube is memory hungry, storage size estimation multiply 0.05" and "cube is not memory hungry, storage size estimation multiply 0.25".

This has one major problem: the default multiply coefficient is too small, so the estimated cuboid size is much less than the actual
cuboid size. As a result, both the number of HBase regions and the number of reducers for CubeHFileJob are too small, which
makes CubeHFileJob much slower.

After we removed the default multiply coefficient, CubeHFileJob became much faster.

We should make the multiply coefficient configurable, which would be more user-friendly.
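
A minimal sketch of the idea, not Kylin's actual code: the class name, the property keys, and the per-partition size below are hypothetical and chosen only for illustration. The 0.05 and 0.25 defaults are the multipliers quoted above. The sketch reads the coefficient from configuration instead of hard-coding it, and shows how a smaller coefficient leads to fewer partitions (standing in for HBase regions or CubeHFileJob reducers).

```java
import java.util.Properties;

public class CuboidSizeEstimateSketch {
    // Hypothetical property keys; these are NOT real Kylin configuration names.
    static final String RATIO_KEY = "sketch.cuboid-size-ratio";
    static final String MEM_HUNGRY_RATIO_KEY = "sketch.cuboid-size-memhungry-ratio";

    // Returns the configured coefficient, falling back to the hard-coded
    // values described in this issue (0.05 memory hungry, 0.25 otherwise).
    static double ratio(Properties conf, boolean memoryHungry) {
        String key = memoryHungry ? MEM_HUNGRY_RATIO_KEY : RATIO_KEY;
        String fallback = memoryHungry ? "0.05" : "0.25";
        return Double.parseDouble(conf.getProperty(key, fallback));
    }

    // Scales a raw size estimate by the coefficient.
    static double estimateSizeMB(double rawSizeMB, Properties conf, boolean memoryHungry) {
        return rawSizeMB * ratio(conf, memoryHungry);
    }

    // Assumed rule for illustration: one partition per sizePerPartitionMB
    // of estimated data, with a minimum of one partition.
    static int partitionCount(double estimatedSizeMB, double sizePerPartitionMB) {
        return Math.max(1, (int) Math.ceil(estimatedSizeMB / sizePerPartitionMB));
    }

    public static void main(String[] args) {
        double rawSizeMB = 40000.0;          // illustrative raw estimate
        double sizePerPartitionMB = 1000.0;  // illustrative target size

        Properties conf = new Properties();
        int withDefault = partitionCount(
                estimateSizeMB(rawSizeMB, conf, false), sizePerPartitionMB);

        // Overriding the coefficient raises the estimate and the partition count.
        conf.setProperty(RATIO_KEY, "1.0");
        int withOverride = partitionCount(
                estimateSizeMB(rawSizeMB, conf, false), sizePerPartitionMB);

        System.out.println("partitions with default coefficient: " + withDefault);
        System.out.println("partitions with coefficient 1.0: " + withOverride);
    }
}
```

With the coefficient exposed as a setting, users whose actual cuboid sizes are far above the estimate could raise it without patching the code.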



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)