Posted to user@kylin.apache.org by 仇同心 <qi...@jd.com> on 2017/02/10 05:29:39 UTC

create dictionary error

Hi all,
     The cube build fails at the step "Build Dimension Dictionary":

    java.lang.RuntimeException: Failed to create dictionary on DMT.DMT_KYLIN_JDMALL_ORDR_DTL_I_D.SALE_ORD_ID
         at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:325)
         at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:185)
         at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
         at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
         at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
         at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
         at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
         at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
         at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Too big dictionary, dictionary cannot be bigger than 2GB
         at org.apache.kylin.dict.TrieDictionaryBuilder.buildTrieBytes(TrieDictionaryBuilder.java:421)
         at org.apache.kylin.dict.TrieDictionaryBuilder.build(TrieDictionaryBuilder.java:408)
         at org.apache.kylin.dict.DictionaryGenerator$StringDictBuilder.build(DictionaryGenerator.java:165)
         at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
         at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:73)
         at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:321)
         ... 14 more

  The cardinality of "SALE_ORD_ID" is 157644463, but this column was not selected as a dimension.

  In addition, I'm confused about one thing: is the dictionary built from the full data set, or only from the data in the selected time range?


Thank you~






Re: create dictionary error

Posted by Alberto Ramón <a....@gmail.com>.
Hi, moving this thread to the user mailing list.

SALE_ORD_ID is not a dimension of the cube, but is it a PK-FK column? I think so :)
Are you using derived dimensions on this table?

See this
<https://github.com/apache/kylin/blob/576d2dd352b11f428db1d6308e35350a2fca122e/core-dictionary/src/main/java/org/apache/kylin/dict/TrieDictionaryBuilder.java#L421>,
the 2G limit is hardcoded, so I don't think increasing Xmx will solve your case.
The error means the dictionary is larger than "final int _2GB = 2000000000;".
Can you check whether this is true?
Can you review the statistics for this column?
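
To get a feel for whether the reported cardinality could plausibly hit that limit, here is a rough back-of-the-envelope sketch. It is not Kylin's footprint calculation: the class and method names are hypothetical, and it assumes every value is stored in full with no prefix sharing and no per-node overhead, which a real trie does not do. Only the cardinality (157644463) and the limit (2000000000) come from this thread.

```java
// Rough estimate only; NOT how Kylin computes the trie footprint.
final class DictSizeEstimate {
    // Same value as the _2GB constant quoted in this thread.
    static final long LIMIT_BYTES = 2_000_000_000L;

    // Naive size: cardinality * average value length, assuming no shared prefixes.
    static long naiveBytes(long cardinality, int avgValueBytes) {
        return Math.multiplyExact(cardinality, (long) avgValueBytes);
    }

    static boolean exceedsLimit(long cardinality, int avgValueBytes) {
        return naiveBytes(cardinality, avgValueBytes) > LIMIT_BYTES;
    }

    public static void main(String[] args) {
        long cardinality = 157_644_463L; // SALE_ORD_ID cardinality reported above
        // The average value lengths tried here are assumptions, not measured data.
        for (int len = 10; len <= 16; len += 2) {
            System.out.printf("avg %d bytes -> %d bytes, over limit: %b%n",
                    len, naiveBytes(cardinality, len), exceedsLimit(cardinality, len));
        }
    }
}
```

Under these assumptions the limit is crossed once the average value length passes roughly 12 bytes, so the column statistics (cardinality and typical value length) are worth checking.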








Re: create dictionary error

Posted by Li Yang <li...@apache.org>.
Dictionary is not the best encoding for columns like ID, serial number, phone number,
etc.

Try using the "integer" encoding instead; the "fixed_length" encoding
could be a workaround too.

Yang
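
As an illustration of why a fixed-width encoding sidesteps the dictionary size problem for ID-like columns, here is a minimal sketch. It is not Kylin's "integer" encoding implementation; the class and method names are hypothetical, and it assumes the ID fits in a Java long. The point is only that each value maps to a fixed number of bytes by arithmetic, so no lookup table has to be built or kept in memory.

```java
import java.nio.ByteBuffer;

// Illustrative only; NOT Kylin's encoder.
final class FixedWidthIdEncoding {
    // Encode a numeric ID as 8 big-endian bytes (ByteBuffer's default byte order).
    static byte[] encode(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    // Decode the 8 bytes back to the original ID; no dictionary lookup is needed.
    static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    public static void main(String[] args) {
        long id = 157_644_463L; // arbitrary sample value, not a real order ID
        byte[] encoded = encode(id);
        System.out.println(encoded.length + " bytes, round trip ok: " + (decode(encoded) == id));
    }
}
```

The encoded size per value stays constant no matter how many distinct IDs the column holds, which is the property that matters for very high cardinality columns.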


Re: create dictionary error

Posted by Luke_Selina <hu...@gmail.com>.
Have you solved the problem? Please let me know. Or maybe you can try another
dimension encoding method (fixed_length, integer, and so on).

Good Luck!

Luke_Selina

--
View this message in context: http://apache-kylin.74782.x6.nabble.com/create-dictionary-error-tp7155p7171.html
Sent from the Apache Kylin mailing list archive at Nabble.com.