Posted to issues@kylin.apache.org by "Shaofeng SHI (JIRA)" <ji...@apache.org> on 2016/05/18 09:28:12 UTC
[jira] [Updated] (KYLIN-1677) Distribute source data by certain columns when creating flat table
[ https://issues.apache.org/jira/browse/KYLIN-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shaofeng SHI updated KYLIN-1677:
--------------------------------
Fix Version/s: v1.5.3
> Distribute source data by certain columns when creating flat table
> ------------------------------------------------------------------
>
> Key: KYLIN-1677
> URL: https://issues.apache.org/jira/browse/KYLIN-1677
> Project: Kylin
> Issue Type: Improvement
> Components: Job Engine
> Reporter: Shaofeng SHI
> Assignee: Shaofeng SHI
> Fix For: v1.5.3
>
>
> Inspired by KYLIN-1656, Kylin can distribute the source data by certain columns when creating the flat Hive table. The rows assigned to each mapper will then be more similar, so more aggregation can happen on the mapper side and less shuffle and reduce work is needed.
> Columns that can be used for the distribution include: ultra-high-cardinality columns, mandatory columns, the partition date/time column, etc.
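
The effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Kylin's job engine code: the names `mapper_output_size` and `simulate` are hypothetical, each row is reduced to a single integer key column, and the mapper-side aggregation is modeled as keeping one record per distinct key in each mapper's split. It compares random assignment of rows to mappers with assignment by key, which is roughly what a Hive `DISTRIBUTE BY` on that column would give.

```python
import random
from collections import defaultdict


def mapper_output_size(rows, num_mappers, assign):
    """Count the records left after every mapper pre-aggregates its own split.

    `rows` is a list of integer keys (a stand-in for one dimension column);
    `assign` maps a key to an integer that selects the mapper.
    """
    splits = defaultdict(set)
    for key in rows:
        # Mapper-side aggregation: each mapper keeps one record per distinct key.
        splits[assign(key) % num_mappers].add(key)
    return sum(len(keys) for keys in splits.values())


def simulate(num_rows=10000, num_keys=500, num_mappers=10, seed=0):
    """Return (records shuffled with random assignment, with key-based assignment)."""
    rng = random.Random(seed)
    rows = [rng.randrange(num_keys) for _ in range(num_rows)]

    # Assumption: without distribution, a row may land on any mapper.
    random_out = mapper_output_size(
        rows, num_mappers, lambda key: rng.randrange(num_mappers)
    )
    # Assumption: distributing by the column sends equal keys to the same mapper.
    distributed_out = mapper_output_size(rows, num_mappers, lambda key: key)
    return random_out, distributed_out


if __name__ == "__main__":
    random_out, distributed_out = simulate()
    print("records shuffled, random assignment:  ", random_out)
    print("records shuffled, distributed by key: ", distributed_out)
```

With key-based assignment every distinct key is aggregated in exactly one mapper, so the number of records passed to the shuffle cannot exceed the number of distinct keys. With random assignment the same key is typically emitted by several mappers.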
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)