Posted to user@kylin.apache.org by Alberto Ramón <a....@gmail.com> on 2017/05/10 07:29:21 UTC

Re: kylin nonsupport Multi-value dimensions?

Hi,
Not all Hive types are supported.

Check these lines:
https://github.com/apache/kylin/blob/5d4982e247a2172d97d44c85309cef4b3dbfce09/core-metadata/src/main/java/org/apache/kylin/dimension/DimensionEncodingFactory.java#L76

On 10 May 2017 at 08:10, jianhui.yi <ji...@zhiyoubao.com> wrote:

> I encountered a problem with a multi-dimensional dimension and tried to
> solve it with a bridge table, but building the cube reports an error:
>
> java.lang.IllegalStateException: The table: DIM_XXX Dup key found, key=[1446], value1=[1446,29,1,1], value2=[1446,28,0,0]
>          at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:86)
>          at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:69)
>          at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
>          at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:57)
>          at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
>          at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:644)
>          at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:98)
>          at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:54)
>          at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:66)
>          at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>          at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>          at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
>          at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
>          at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
>          at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
>          at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>          at java.lang.Thread.run(Thread.java:745)
>
> result code:2

Re: Reply: kylin nonsupport Multi-value dimensions?

Posted by Li Yang <li...@apache.org>.
> java.lang.IllegalStateException: The table: DIM_XXX Dup key found, key=[1446], value1=[1446,29,1,1], value2=[1446,28,0,0]

This error reports a duplicate key in a dimension table. The primary key of
a dimension table must be unique across all rows, and in this case the key
"1446" appears twice.

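To see which keys would trigger this error before building the cube, the uniqueness rule can be checked on the lookup rows directly. Below is a minimal Python sketch, not Kylin's actual code: find_duplicate_keys is a hypothetical helper, the key is assumed to be the first column, and the 1445 row is made up for illustration (the two 1446 rows are the ones from the error message).

```python
from collections import defaultdict


def find_duplicate_keys(rows, key_index=0):
    """Group rows by their key column and return only the keys seen more than once."""
    rows_by_key = defaultdict(list)
    for row in rows:
        rows_by_key[row[key_index]].append(row)
    return {key: group for key, group in rows_by_key.items() if len(group) > 1}


# Hypothetical lookup-table rows. The two 1446 rows come from the error
# message; the 1445 row is invented so that there is one unique key too.
dim_rows = [
    (1445, 27, 1, 1),
    (1446, 29, 1, 1),
    (1446, 28, 0, 0),
]

duplicates = find_duplicate_keys(dim_rows)
for key, group in duplicates.items():
    print(f"Dup key found, key=[{key}], rows={group}")
```

Any key this reports has to be made unique in the dimension table (or the multiple values moved elsewhere) before the lookup table can be loaded.
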
On Wed, May 10, 2017 at 6:59 PM, Alberto Ramón <a....@gmail.com>
wrote:

> You can convert this dimension to a string and check the performance of
> LIKE filters.
>
> With Hive, duplicate the rows in the fact table, one for each dimension
> value.
>
> Another, more complex solution would be to extend the dictionary encoding
> of the dimension so that it understands multiple values.
>
> No more ideas :)

Re: Reply: kylin nonsupport Multi-value dimensions?

Posted by Alberto Ramón <a....@gmail.com>.
You can convert this dimension to a string and check the performance of
LIKE filters.
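
A rough sketch of the string idea, in Python for illustration only. It assumes the values are joined into one delimited string with a separator on both ends, so that a filter in the spirit of LIKE '%,28,%' cannot match 128 by accident. encode_multi_value and like_contains are hypothetical helpers, not Kylin or Hive functions.

```python
def encode_multi_value(values, separator=","):
    """Join the values into one string, with a separator on both ends."""
    return separator + separator.join(str(value) for value in values) + separator


def like_contains(encoded, value, separator=","):
    """Mimic a filter like: encoded LIKE '%,<value>,%' (substring match)."""
    return f"{separator}{value}{separator}" in encoded


# Hypothetical salespeople ids attached to a single order.
encoded = encode_multi_value([28, 29])

print(encoded)
print(like_contains(encoded, 28))
print(like_contains(encode_multi_value([128, 29]), 28))
```

The cost of this approach is that every distinct combination of values becomes its own dimension value, so cardinality and filter performance need to be checked on real data.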

With Hive, duplicate the rows in the fact table, one for each dimension
value.

Another, more complex solution would be to extend the dictionary encoding
of the dimension so that it understands multiple values.

No more ideas :)
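
The second suggestion above (one fact row per dimension value) can be sketched as follows. This is plain Python standing in for whatever Hive query would do the flattening; explode_fact_rows, the column names sales_ids, sales_id and amount, and the sample rows are all assumptions for illustration, not part of the original tables.

```python
def explode_fact_rows(fact_rows, source_column, target_column, separator=","):
    """Return one output row per value found in the multi-value column.

    Rows whose multi-value column is empty produce no output rows.
    """
    exploded = []
    for row in fact_rows:
        values = [v.strip() for v in row[source_column].split(separator) if v.strip()]
        for value in values:
            # Copy every other column unchanged and add the single value.
            new_row = {k: v for k, v in row.items() if k != source_column}
            new_row[target_column] = value
            exploded.append(new_row)
    return exploded


# Hypothetical fact_order rows: order 1 has two salespeople.
fact_order = [
    {"order_id": 1, "sales_ids": "28,29", "amount": 100},
    {"order_id": 2, "sales_ids": "30", "amount": 50},
]

flat_rows = explode_fact_rows(fact_order, "sales_ids", "sales_id")
for row in flat_rows:
    print(row)

# Order-level measures are repeated on every exploded row, so this sum
# is 250 rather than the original 150.
print(sum(row["amount"] for row in flat_rows))
```

After flattening, each fact row carries a single key, so the dimension table can keep a unique primary key. The repeated measures mean that totals across salespeople need care.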


On 10 May 2017 8:51 a.m., "jianhui.yi" <ji...@zhiyoubao.com> wrote:

Sorry, I described it wrongly; the problem is with a multi-value dimension.

Example: I have a fact table named fact_order and a dimension table named
dim_sales.

In the fact_order table, one order can have multiple salespeople.

When I join fact_order with dim_sales, it reports the error: Dup key found.

How can I solve it?




Reply: kylin nonsupport Multi-value dimensions?

Posted by "jianhui.yi" <ji...@zhiyoubao.com>.
Sorry, I described it wrongly; the problem is with a multi-value dimension.

Example: I have a fact table named fact_order and a dimension table named dim_sales.

In the fact_order table, one order can have multiple salespeople.

When I join fact_order with dim_sales, it reports the error: Dup key found.

How can I solve it?

 

From: Alberto Ramón [mailto:a.ramonportoles@gmail.com]
Sent: 10 May 2017 15:29
To: user <us...@kylin.apache.org>
Subject: Re: kylin nonsupport Multi-value dimensions?

 

Hi, 

Not all Hive types are supported.

Check these lines:
https://github.com/apache/kylin/blob/5d4982e247a2172d97d44c85309cef4b3dbfce09/core-metadata/src/main/java/org/apache/kylin/dimension/DimensionEncodingFactory.java#L76

 
