Posted to issues@kylin.apache.org by "huang song (Jira)" <ji...@apache.org> on 2022/07/11 09:00:00 UTC

[jira] [Commented] (KYLIN-5208) kylin 4.0.1: building a cube with a one-to-many join between two tables fails with "Dup key found, key=venue_id"

    [ https://issues.apache.org/jira/browse/KYLIN-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564879#comment-17564879 ] 

huang song commented on KYLIN-5208:
-----------------------------------

After some investigation I found that when building a cube with a one-to-many join, the join column of the dimension (lookup) table must not contain duplicate values. In other words, the lookup table has to be the side where each join-key value appears exactly once; the table with multiple rows per join key can only be the fact table. The error above occurred because my lookup table had multiple rows with the same join-key value.
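The stack trace below shows the failure originating in CubeSnapshotBuilder.checkDupKey, which rejects a lookup-table snapshot when the join key repeats. The following is a minimal illustrative sketch of that uniqueness check, not Kylin's actual code; the B_VENUE_OPEN_TIME rows are hypothetical sample data:

```python
from collections import Counter

def find_dup_keys(rows, key_col):
    """Return join-key values that appear more than once.

    Mirrors in spirit the check Kylin's CubeSnapshotBuilder.checkDupKey
    performs before building a lookup-table snapshot; this is a sketch,
    not Kylin source code.
    """
    counts = Counter(row[key_col] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)

# Hypothetical B_VENUE_OPEN_TIME rows: VENUE_ID 101 appears twice, so
# the table cannot serve as a lookup (snapshot) table keyed on VENUE_ID.
rows = [
    {"VENUE_ID": 101, "OPEN_TIME": "09:00"},
    {"VENUE_ID": 101, "OPEN_TIME": "18:00"},
    {"VENUE_ID": 102, "OPEN_TIME": "10:00"},
]
print(find_dup_keys(rows, "VENUE_ID"))  # -> [101]
```

Running an equivalent duplicate-count query against the dimension table before triggering the build is a quick way to confirm whether it can be used as a lookup table.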

> kylin 4.0.1: building a cube with a one-to-many join between two tables fails with "Dup key found, key=venue_id"
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-5208
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5208
>             Project: Kylin
>          Issue Type: Bug
>          Components: Spark Engine
>            Reporter: huang song
>            Priority: Blocker
>
> 2022-07-08 10:44:56,379 INFO  [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr                                  Dload  Upload   Total   Spent    Left  Speed
> 2022-07-08 10:44:56,379 INFO  [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr 
> ================================================================
>  at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:308)
>  at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:94)
>  ... 4 more
> }
> RetryInfo{
>     overrideConf : {spark.executor.memory=6143MB, spark.executor.memoryOverhead=1228MB},
>     throwable : java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
>  at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:97)
>  at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Failed to build lookup table B_VENUE_OPEN_TIME snapshot for Dup key found, key= VENUE_ID
>  at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.$anonfun$checkDupKey$1(CubeSnapshotBuilder.scala:200)
>  at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.$anonfun$checkDupKey$1$adapted(CubeSnapshotBuilder.scala:190)
>  at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
>  at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
>  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
>  at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:190)
>  at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
>  at org.apache.kylin.engine.spark.job.ParentSourceChooser.$anonfun$decideSources$1(ParentSourceChooser.scala:71)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)