Posted to dev@kylin.apache.org by lk_hadoop <lk...@163.com> on 2019/04/09 01:21:59 UTC
can't pass step Build Cube In-Mem
Hi all,

I'm using kylin-2.6.1-cdh57 with a source row count of 500 million, and I can build the cube successfully. But when I use the cube planner, the OPTIMIZE CUBE job includes a step called "Build Cube In-Mem".

The relevant config in kylin_job_conf_inmem.xml is:
<property>
    <name>mapreduce.map.memory.mb</name>
    <value>9216</value>
</property>

<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx8192m -XX:OnOutOfMemoryError='kill -9 %p'</value>
</property>

<property>
    <name>mapreduce.job.is-mem-hungry</name>
    <value>true</value>
</property>

<property>
    <name>mapreduce.job.split.metainfo.maxsize</name>
    <value>-1</value>
    <description>The maximum permissible size of the split metainfo file.
        The JobTracker won't attempt to read split metainfo files bigger than
        the configured value. No limits if set to -1.
    </description>
</property>

<property>
    <name>mapreduce.job.max.split.locations</name>
    <value>2000</value>
</property>

<property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>200</value>
</property>
Eventually the map tasks are killed by the OnOutOfMemoryError handler. But when I give the map tasks more memory, I get a different error: java.nio.BufferOverflowException.
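
When raising memory I scale both settings together and keep the JVM heap below the container size, to leave headroom for off-heap usage. A further bump (values purely illustrative) would look like:

<property>
    <name>mapreduce.map.memory.mb</name>
    <!-- container size: JVM heap (-Xmx10g) plus roughly 25% headroom -->
    <value>13312</value>
</property>

<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx10g -XX:OnOutOfMemoryError='kill -9 %p'</value>
</property>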
Why does Kylin run this job in-mem, and how can I avoid it?
2019-04-08
lk_hadoop
Re: Re: Re: can't pass step Build Cube In-Mem
Posted by lk_hadoop <lk...@163.com>.
Thank you very much! @Long Chao
2019-04-12
lk_hadoop
Re: Re: Re: can't pass step Build Cube In-Mem
Posted by Long Chao <ch...@gmail.com>.
Hi lk,
I have fixed this issue, and the code is now in Kylin's master branch.
If your situation is urgent, you can apply the commit [
https://github.com/apache/kylin/commit/ed266aa98d8524a344469b1e1ead8bfd462702d8]
and build a new binary package.

Btw, to keep the previous behavior (the optimize job uses the inmem
algorithm by default), I added a new configuration parameter,
*kylin.cube.algorithm.inmem-auto-optimize*, to remove the above limitation.
Set *kylin.cube.algorithm.inmem-auto-optimize* to *false* and the optimize
job will use the algorithm you configured (e.g. *kylin.cube.algorithm=layer*).
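
For example, a minimal kylin.properties snippet (assuming a binary built
from a revision that contains the commit above) would be:

# let OPTIMIZE CUBE jobs honor the configured algorithm instead of forcing inmem
kylin.cube.algorithm.inmem-auto-optimize=false
# the algorithm the optimize job will then use
kylin.cube.algorithm=layer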
On Thu, Apr 11, 2019 at 6:00 PM lk_hadoop <lk...@163.com> wrote:
> thank you ~ @Long Chao
>
> 2019-04-11
>
> lk_hadoop
>
>
>
> 发件人:Long Chao <ch...@gmail.com>
> 发送时间:2019-04-11 17:56
> 主题:Re: 答复: can't pass step Build Cube In-Mem
> 收件人:"dev"<de...@kylin.apache.org>
> 抄送:
>
> Hi lk,
> Optimize job will only build the newly generated cuboids in the
> recommended cuboid list, usually the amount of them is not too large.
> So, by default, we use inmem algorithm to build those new cuboids,
> but
> now the algorithm can't be overwritten by properties file.
>
> And I create a jira for this problem to make the algorithm
> configurable. https://issues.apache.org/jira/browse/KYLIN-3950
>
> On Thu, Apr 11, 2019 at 5:49 PM lk_hadoop <lk...@163.com> wrote:
>
> > I think that's not too much :
> >
> > Cuboid Distribution
> > Current Cuboid Distribution
> > [Cuboid Count: 49] [Row Count: 1117994636]
> >
> > Recommend Cuboid Distribution
> > [Cuboid Count: 168] [Row Count: 464893216]
> >
> >
> > 2019-04-11
> >
> > lk_hadoop
> >
> >
> >
> > 发件人:Na Zhai <na...@kyligence.io>
> > 发送时间:2019-04-11 17:42
> > 主题:答复: can't pass step Build Cube In-Mem
> > 收件人:"dev@kylin.apache.org"<de...@kylin.apache.org>
> > 抄送:
> >
> > Hi, lk_hadoop.
> >
> >
> >
> > Does Cube planner recommend too many cuboid? If so, it may cause OOM.
> >
> >
> >
> >
> >
> > 发送自 Windows 10 版邮件<https://go.microsoft.com/fwlink/?LinkId=550986>应用
> >
> >
> >
> > ________________________________
> > 发件人: lk_hadoop <lk...@163.com>
> > 发送时间: Tuesday, April 9, 2019 9:21:59 AM
> > 收件人: dev
> > 主题: can't pass step Build Cube In-Mem
> >
> > hi,all :
> > I'm using kylin-2.6.1-cdh57, and the source row count is 500
> million,I
> > can success build cube .
> > but when I use the cube planner , it has one step : Build Cube In-Mem
> > for job :OPTIMIZE CUBE
> > the config about the kylin_job_conf_inmem.xml is :
> >
> > <property>
> > <name>mapreduce.map.memory.mb</name>
> > <value>9216</value>
> > <description></description>
> > </property>
> >
> > <property>
> > <name>mapreduce.map.java.opts</name>
> > <value>-Xmx8192m -XX:OnOutOfMemoryError='kill -9 %p'</value>
> > <description></description>
> > </property>
> >
> > <property>
> > <name>mapreduce.job.is-mem-hungry</name>
> > <value>true</value>
> > </property>
> >
> > <property>
> > <name>mapreduce.job.split.metainfo.maxsize</name>
> > <value>-1</value>
> > <description>The maximum permissible size of the split metainfo
> > file.
> > The JobTracker won't attempt to read split metainfo files
> > bigger than
> > the configured value. No limits if set to -1.
> > </description>
> > </property>
> >
> > <property>
> > <name>mapreduce.job.max.split.locations</name>
> > <value>2000</value>
> > <description>No description</description>
> > </property>
> >
> > <property>
> > <name>mapreduce.task.io.sort.mb</name>
> > <value>200</value>
> > <description></description>
> > </property>
> >
> >
> > finally the map job will be killed for OnOutOfMemoryError , but
> when
> > I giev more mem for map job , I will get another error :
> > java.nio.BufferOverflowException
> >
> > why kylin will run the job inmem ? how can I avoid it ?
> >
> >
> >
> > 2019-04-08
> >
> >
> > lk_hadoop
Re: Re: Re: can't pass step Build Cube In-Mem
Posted by lk_hadoop <lk...@163.com>.
thank you ~ @Long Chao
2019-04-11
lk_hadoop
Re: Re: can't pass step Build Cube In-Mem
Posted by Long Chao <ch...@gmail.com>.
Hi lk,
The optimize job only builds the newly generated cuboids in the
recommended cuboid list, and usually there are not too many of them.
So, by default, we use the inmem algorithm to build those new cuboids,
but currently that algorithm can't be overridden via the properties file.

I have created a JIRA issue to make the algorithm
configurable: https://issues.apache.org/jira/browse/KYLIN-3950
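
For reference, for normal build jobs the algorithm is selected in
kylin.properties (the standard options are shown below); as described
above, the optimize job currently ignores this setting:

# cubing algorithm for build jobs: auto | layer | inmem
# "auto" lets Kylin pick layer vs inmem per job based on sampled statistics
kylin.cube.algorithm=auto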
On Thu, Apr 11, 2019 at 5:49 PM lk_hadoop <lk...@163.com> wrote:
> I think that's not too much :
>
> Cuboid Distribution
> Current Cuboid Distribution
> [Cuboid Count: 49] [Row Count: 1117994636]
>
> Recommend Cuboid Distribution
> [Cuboid Count: 168] [Row Count: 464893216]
>
>
> 2019-04-11
>
> lk_hadoop
>
>
>
> 发件人:Na Zhai <na...@kyligence.io>
> 发送时间:2019-04-11 17:42
> 主题:答复: can't pass step Build Cube In-Mem
> 收件人:"dev@kylin.apache.org"<de...@kylin.apache.org>
> 抄送:
>
> Hi, lk_hadoop.
>
>
>
> Does Cube planner recommend too many cuboid? If so, it may cause OOM.
>
>
>
>
>
> 发送自 Windows 10 版邮件<https://go.microsoft.com/fwlink/?LinkId=550986>应用
>
>
>
> ________________________________
> 发件人: lk_hadoop <lk...@163.com>
> 发送时间: Tuesday, April 9, 2019 9:21:59 AM
> 收件人: dev
> 主题: can't pass step Build Cube In-Mem
>
> hi,all :
> I'm using kylin-2.6.1-cdh57, and the source row count is 500 million,I
> can success build cube .
> but when I use the cube planner , it has one step : Build Cube In-Mem
> for job :OPTIMIZE CUBE
> the config about the kylin_job_conf_inmem.xml is :
>
> <property>
> <name>mapreduce.map.memory.mb</name>
> <value>9216</value>
> <description></description>
> </property>
>
> <property>
> <name>mapreduce.map.java.opts</name>
> <value>-Xmx8192m -XX:OnOutOfMemoryError='kill -9 %p'</value>
> <description></description>
> </property>
>
> <property>
> <name>mapreduce.job.is-mem-hungry</name>
> <value>true</value>
> </property>
>
> <property>
> <name>mapreduce.job.split.metainfo.maxsize</name>
> <value>-1</value>
> <description>The maximum permissible size of the split metainfo
> file.
> The JobTracker won't attempt to read split metainfo files
> bigger than
> the configured value. No limits if set to -1.
> </description>
> </property>
>
> <property>
> <name>mapreduce.job.max.split.locations</name>
> <value>2000</value>
> <description>No description</description>
> </property>
>
> <property>
> <name>mapreduce.task.io.sort.mb</name>
> <value>200</value>
> <description></description>
> </property>
>
>
> finally the map job will be killed for OnOutOfMemoryError , but when
> I giev more mem for map job , I will get another error :
> java.nio.BufferOverflowException
>
> why kylin will run the job inmem ? how can I avoid it ?
>
>
>
> 2019-04-08
>
>
> lk_hadoop
Re: Re: can't pass step Build Cube In-Mem
Posted by lk_hadoop <lk...@163.com>.
I don't think that's too many:

Cuboid Distribution

Current Cuboid Distribution
[Cuboid Count: 49] [Row Count: 1117994636]

Recommend Cuboid Distribution
[Cuboid Count: 168] [Row Count: 464893216]
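
That works out to roughly 1117994636 / 49 ≈ 22.8 million rows per existing
cuboid, versus 464893216 / 168 ≈ 2.8 million rows per recommended cuboid.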
2019-04-11
lk_hadoop
Re: can't pass step Build Cube In-Mem
Posted by Na Zhai <na...@kyligence.io>.
Hi, lk_hadoop.

Does the Cube planner recommend too many cuboids? If so, it may cause OOM.