Posted to user@kylin.apache.org by "quzhengpeng@hetrone.com" <qu...@hetrone.com> on 2016/12/06 08:34:26 UTC
Re: Re: Cube Dup key found
Hi,
I use Sqoop 1.4.6 to load data from MySQL into Hive. The orders table has its own key, but something seems to be wrong in Kylin. How do I add the key of the lookup table (my orders table)?
Hi,
This error occurs because a dimension table has more than one record for the key '5847,ufenqi,2016-11-11' when the fact table joins on it. You can avoid it by adding key columns to the join condition.
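To illustrate the kind of check that fails here, the following is a minimal Python sketch, not Kylin's actual code. It builds an in-memory map from the join key to the row and raises as soon as a second row arrives with the same key. The function name `build_lookup` and the column positions are assumptions for illustration; the two rows are the ones reported as value1 and value2 in the error message.

```python
def build_lookup(rows, key_indexes):
    """Map each join key to its row; fail if a key appears twice."""
    lookup = {}
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        if key in lookup:
            raise ValueError(f"Dup key found, key={list(key)}")
        lookup[key] = row
    return lookup


# The two rows reported as value1 and value2 in the error message.
ROWS = [
    ("2615", "product", "5847", "ufenqi", "2014-09-09 23:23:31.0",
     "338800", "170", "10", "2016-11-11", "2099-12-31"),
    ("3635", "product", "5847", "ufenqi", "2014-09-11 22:51:06.0",
     "336800", "170", "10", "2016-11-11", "2099-12-31"),
]

# Assumed positions of the three join columns within each row.
JOIN_KEY = (2, 3, 8)

try:
    build_lookup(ROWS, JOIN_KEY)
    result = "ok"
except ValueError as err:
    result = str(err)
print(result)
```

Both rows share the same three key values, so the second one cannot be stored under a key that is already taken.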
On 2016-12-06, at 3:20 PM, quzhengpeng@hetrone.com wrote:
Hi,
I have two tables, users and orders. One user can make many orders, so their relation is one-to-many.
I created the model with an inner join between users and orders.
Finally I built the cube and it raised a Dup key error. How can I build the cube?
java.lang.IllegalStateException: Dup key found, key=[5847,ufenqi,2016-11-11], value1=[2615,product,5847,ufenqi,2014-09-09 23:23:31.0,338800,170,10,2016-11-11,2099-12-31], value2=[3635,product,5847,ufenqi,2014-09-11 22:51:06.0,336800,170,10,2016-11-11,2099-12-31]
at org.apache.kylin.dict.lookup.LookupTable.initRow(LookupTable.java:85)
at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:68)
at org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79)
at org.apache.kylin.dict.lookup.LookupTable.<init>(LookupTable.java:56)
at org.apache.kylin.dict.lookup.LookupStringTable.<init>(LookupStringTable.java:65)
at org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:674)
at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:60)
at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
result code:2
Re: Re: Cube Dup key found
Posted by Billy Liu <bi...@apache.org>.
In your case, the orders table should be the fact table, and the users table should be
the lookup table. Could you give that a try?
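A rough Python sketch of why swapping the roles can help, under the assumption that each user key appears only once in the users table: the lookup map is then built from users, and many orders can point to the same entry without conflict. The user row below is made up for illustration; the order rows reuse values from the error message.

```python
# Hypothetical users row: (user_id, user_modify, create_time, start_dt, end_dt).
USERS = [
    (5847, "ufenqi", "2014-09-01 10:00:00.0", "2016-11-11", "2099-12-31"),
]

# Orders reduced to (order_id, user_id, user_modify, start_dt).
ORDERS = [
    (2615, 5847, "ufenqi", "2016-11-11"),
    (3635, 5847, "ufenqi", "2016-11-11"),
]

# Lookup side: one entry per user key (user_id, user_modify, start_dt).
user_by_key = {}
for user in USERS:
    key = (user[0], user[1], user[3])
    assert key not in user_by_key, f"Dup key found, key={key}"
    user_by_key[key] = user

# Fact side: every order finds its single matching user (inner join).
joined = [
    (order[0], user_by_key[order[1:4]])
    for order in ORDERS
    if order[1:4] in user_by_key
]
print(len(joined))
```

Repeated keys on the fact side are harmless in this sketch; only the lookup side has to be unique.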
2016-12-06 17:51 GMT+08:00 quzhengpeng@hetrone.com <qu...@hetrone.com>:
> Hi ,
> hive> desc uc_users;
> OK
> *user_id int*
> *user_modify string*
> create_time string
> *start_dt string*
> end_dt string
>
> hive> desc oc_orders;
> OK
> *order_id int*
> *order_modify string*
> user_id int
> user_modify string
> create_time string
> order_money int
> status int
> pay_status int
> *start_dt string*
> end_dt string
>
> This is the structure. In the RDBMS, user_id and user_modify form the
> foreign key; the underlined columns are the primary key and the red ones
> are my join condition. But now I cannot find an additional column to use in
> the join step. The key (5847,ufenqi,2016-11-11) matches multiple rows in
> the orders table, which is logically correct. I don't know how to keep
> every record in my lookup table unique for the key (5847,ufenqi,2016-11-11).
Re: Re: Cube Dup key found
Posted by "quzhengpeng@hetrone.com" <qu...@hetrone.com>.
Hi ,
hive> desc uc_users;
OK
user_id int
user_modify string
create_time string
start_dt string
end_dt string
hive> desc oc_orders;
OK
order_id int
order_modify string
user_id int
user_modify string
create_time string
order_money int
status int
pay_status int
start_dt string
end_dt string
This is the structure. In the RDBMS, user_id and user_modify form the foreign key; the underlined columns are the primary key and the red ones are my join condition. But now I cannot find an additional column to use in the join step. The key (5847,ufenqi,2016-11-11) matches multiple rows in the orders table, which is logically correct. I don't know how to keep every record in my lookup table unique for the key (5847,ufenqi,2016-11-11).
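One way to see which key values are repeated is a GROUP BY / HAVING query on the join columns. The sketch below runs such a query from Python against an in-memory SQLite table that stands in for the Hive table oc_orders. The reduced column set and the third row are assumptions for illustration, and the same SELECT would need to be checked against Hive before use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oc_orders ("
    "order_id INTEGER, user_id INTEGER, user_modify TEXT, start_dt TEXT)"
)
conn.executemany(
    "INSERT INTO oc_orders VALUES (?, ?, ?, ?)",
    [
        (2615, 5847, "ufenqi", "2016-11-11"),
        (3635, 5847, "ufenqi", "2016-11-11"),
        (4001, 6000, "ufenqi", "2016-11-11"),  # made-up row with a unique key
    ],
)

# List every join key that occurs in more than one row.
duplicates = conn.execute(
    "SELECT user_id, user_modify, start_dt, COUNT(*) AS cnt "
    "FROM oc_orders "
    "GROUP BY user_id, user_modify, start_dt "
    "HAVING COUNT(*) > 1"
).fetchall()
conn.close()
print(duplicates)
```

Any key returned by this query would trigger the Dup key error if the table were used as a lookup table with that join condition.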
Re: Cube Dup key found
Posted by Mars Xu <xu...@gmail.com>.
When you join the fact table and the lookup table in Kylin, you need to define the join key columns, such as ordered, username and date, right? In your lookup table (the orders table), it seems there is more than one record where orderedid=5847, username=ufenqi, date=2016-11-11. You need to add another column to the join condition that makes sense for your business, such as order_seqno.
Keep every record in your lookup table unique.
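As a small Python sketch of this advice, the helper below counts how often each key occurs for two candidate join keys. The helper name `duplicate_keys` is hypothetical, the rows are reduced versions of the two rows from the error message, and order_id is only assumed to be a suitable distinguishing column.

```python
from collections import Counter

# Orders reduced to (order_id, user_id, user_modify, start_dt).
ORDERS = [
    (2615, 5847, "ufenqi", "2016-11-11"),
    (3635, 5847, "ufenqi", "2016-11-11"),
]


def duplicate_keys(rows, key_indexes):
    """Return the key values that occur in more than one row."""
    counts = Counter(tuple(row[i] for i in key_indexes) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Current join key: user_id, user_modify, start_dt.
print(duplicate_keys(ORDERS, (1, 2, 3)))
# Join key extended with order_id (assumed to be the distinguishing column).
print(duplicate_keys(ORDERS, (0, 1, 2, 3)))
```

An extended key only helps if the fact table also carries the extra column, since it has to appear on both sides of the join.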