Posted to user@kylin.apache.org by 热爱大发挥 <38...@qq.com> on 2016/01/31 08:02:35 UTC
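Question about cube build
My fact table has about 50 million rows. When the cube build reaches the second step (Extract Fact Table Distinct Columns), why is the number of reducers always 1? I checked the source code and it is indeed hard-coded to 1. This overloads a single node, and the job exits because it runs out of memory.
How can this be solved? Is it possible to customize the MapReduce parameters for each step?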
Re: Question about cube build
Posted by ShaoFeng Shi <sh...@gmail.com>.
The screenshot did not come through.
So far Kylin does not support customizing MR configurations for each step. As a workaround, you can try giving the reducer more memory in conf/kylin_job_conf.xml.
Besides, you need to consider whether that ultra-high-cardinality column is meaningful as a dimension. If not, remove it. If it is, and even adding memory to the reducer does not help, you can set "dictionary" to No in the "Advanced" tab and set a max length for that column; Kylin will then not use a dictionary to encode it and will just copy the value into the rowkey.
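For illustration, the memory workaround could look roughly like the fragment below. This is only a sketch: the two property names are the standard Hadoop 2 MapReduce settings for the reducer container size and reducer JVM options, and the values are placeholder assumptions, not recommendations from this thread. Since per-step settings are not supported, anything set here would apply to every MapReduce step of the build.

```xml
<!-- conf/kylin_job_conf.xml (inside <configuration>): illustrative values only, tune for your cluster -->
<property>
    <name>mapreduce.reduce.memory.mb</name>
    <!-- assumed container size for each reducer, in MB -->
    <value>4096</value>
</property>
<property>
    <name>mapreduce.reduce.java.opts</name>
    <!-- assumed JVM heap, kept below the container size above -->
    <value>-Xmx3276m</value>
</property>
```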
Re: Question about cube build
Posted by 热爱大发挥 <38...@qq.com>.
Sorry to take up your valuable time, and thank you very much!
------------------ Original Message ------------------
From: "ShaoFeng Shi";<sh...@gmail.com>;
Sent: Sunday, January 31, 2016, 3:45 PM
To: "user"<us...@kylin.apache.org>;
Subject: Re: Question about cube build
In most cases, one reducer is enough to merge the distinct values from all dimension columns on the fact table; but if there are multiple ultra-high-cardinality columns, using multiple reducers would give better concurrency. Actually, this is the task I'm working on today as part of the work for another feature; it will be rolled out in a release after 2.0.
By the way, please try to use English to reach a wider audience.
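To make the single-reducer versus multi-reducer point concrete, here is a toy Python model. It is not Kylin's implementation: the function names and the "column index modulo reducer count" routing rule are assumptions made up for illustration. It only shows how the number of distinct values one reducer must hold shrinks when high-cardinality columns are spread over several reducers.

```python
from collections import defaultdict


def extract_distinct(rows, num_columns, num_reducers=1):
    """Toy model: collect distinct values per column, routing each column to one reducer."""
    # reducer id -> column index -> set of distinct values seen for that column
    reducers = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for col in range(num_columns):
            reducer_id = col % num_reducers  # hypothetical partitioning rule
            reducers[reducer_id][col].add(row[col])
    return reducers


def reducer_load(reducers):
    """Number of distinct values each reducer has to keep in memory."""
    return {rid: sum(len(vals) for vals in cols.values())
            for rid, cols in reducers.items()}


# Two high-cardinality columns (1000 distinct values each) and one low-cardinality column.
rows = [(f"user{i}", f"order{i}", i % 3) for i in range(1000)]

single = reducer_load(extract_distinct(rows, 3, num_reducers=1))
multi = reducer_load(extract_distinct(rows, 3, num_reducers=3))
print(single)  # one reducer holds every distinct value
print(multi)   # the load is split by column
```

With one reducer, all distinct values from all three columns land on the same node; with three, each high-cardinality column gets its own reducer.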