Posted to dev@kylin.apache.org by "kangkaisen (JIRA)" <ji...@apache.org> on 2016/05/16 03:20:12 UTC
[jira] [Created] (KYLIN-1694) make multiply coefficient configurable when estimating cuboid size
kangkaisen created KYLIN-1694:
---------------------------------
Summary: make multiply coefficient configurable when estimating cuboid size
Key: KYLIN-1694
URL: https://issues.apache.org/jira/browse/KYLIN-1694
Project: Kylin
Issue Type: Bug
Components: Job Engine
Affects Versions: v1.5.1, v1.5.0
Reporter: kangkaisen
Assignee: Dong Li
In the current version of the MRv2 build engine, CubeStatsReader estimates cuboid size with hard-coded coefficients: "cube is memory hungry, storage size estimation multiply 0.05" and "cube is not memory hungry, storage size estimation multiply 0.25".
This has one major problem: the default multiply coefficient is too small, so the estimated cuboid size is much less than the actual
cuboid size. As a result, both the number of HBase regions and the number of reducers in CubeHFileJob are too small, which
makes CubeHFileJob much slower.
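
To illustrate the effect described above, here is a minimal sketch in Java. Only the two coefficients (0.05 and 0.25) come from this issue; the class name, the helper methods, the raw size and the per-region size cut are hypothetical and are not taken from CubeStatsReader. It assumes the region count is derived by dividing the estimated size by a fixed per-region size.

```java
public class CuboidSizeEstimateSketch {
    // Coefficients quoted in this issue.
    static final double MEM_HUNGRY_RATIO = 0.05;
    static final double NORMAL_RATIO = 0.25;

    /** Scales a raw size estimate (in MB) by the hard-coded coefficient. */
    static double estimateSizeMB(double rawSizeMB, boolean memoryHungry) {
        return rawSizeMB * (memoryHungry ? MEM_HUNGRY_RATIO : NORMAL_RATIO);
    }

    /** Hypothetical rule: one region per regionCutMB of estimated data, at least one. */
    static int regionCount(double estimatedSizeMB, double regionCutMB) {
        return Math.max(1, (int) Math.ceil(estimatedSizeMB / regionCutMB));
    }

    public static void main(String[] args) {
        double rawSizeMB = 400_000;   // hypothetical raw estimate
        double regionCutMB = 6_000;   // hypothetical size per region

        // Without any coefficient the raw estimate is used directly.
        System.out.println("no coefficient:   " + regionCount(rawSizeMB, regionCutMB));
        System.out.println("not mem hungry:   "
                + regionCount(estimateSizeMB(rawSizeMB, false), regionCutMB));
        System.out.println("memory hungry:    "
                + regionCount(estimateSizeMB(rawSizeMB, true), regionCutMB));
    }
}
```

With these made-up numbers the region count drops from 67 to 17 or 4 once the coefficient is applied. If the reducer count follows the region count, the same reduction in parallelism applies to CubeHFileJob.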
After we removed the default multiply coefficient, CubeHFileJob became much faster.
We'd better make the multiply coefficient configurable, which would be friendlier for users.
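
One possible shape for such a setting is sketched below in Java. The property keys and the class are hypothetical, chosen only for illustration, and are not existing Kylin configuration names. The sketch assumes the current values (0.25 and 0.05) stay as defaults, so behaviour is unchanged unless a user overrides them.

```java
import java.util.Properties;

public class ConfigurableCuboidRatioSketch {
    // Hypothetical property keys, not actual Kylin settings.
    static final String RATIO_KEY = "example.cuboid.size.ratio";
    static final String MEM_HUNGRY_RATIO_KEY = "example.cuboid.size.memhungry.ratio";

    /** Reads the coefficient from config, falling back to the values quoted in this issue. */
    static double ratio(Properties props, boolean memoryHungry) {
        String key = memoryHungry ? MEM_HUNGRY_RATIO_KEY : RATIO_KEY;
        String fallback = memoryHungry ? "0.05" : "0.25";
        return Double.parseDouble(props.getProperty(key, fallback));
    }

    /** Applies the configured coefficient to a raw size estimate (in MB). */
    static double estimateSizeMB(double rawSizeMB, Properties props, boolean memoryHungry) {
        return rawSizeMB * ratio(props, memoryHungry);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println("default ratio:    " + ratio(props, false));

        // Setting the ratio to 1.0 has the same effect as removing the coefficient.
        props.setProperty(RATIO_KEY, "1.0");
        System.out.println("overridden ratio: " + ratio(props, false));
    }
}
```

Keeping the existing values as defaults means only users who see slow CubeHFileJob runs need to change anything.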
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)