Posted to issues@kylin.apache.org by "huang song (Jira)" <ji...@apache.org> on 2022/07/11 09:00:00 UTC
[jira] [Commented] (KYLIN-5208) kylin 4.0.1: building a cube with a one-to-many join between 2 tables fails with "Dup key found, key=venue_id"
[ https://issues.apache.org/jira/browse/KYLIN-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564879#comment-17564879 ]
huang song commented on KYLIN-5208:
-----------------------------------
After some investigation I realized that when building a cube with a one-to-many join, the join key of the dimension (lookup) table must not contain duplicate values. In other words, only a table whose join key is unique can serve as the lookup table; a table with multiple rows per join-key value can only be the fact table. The error above occurred because my lookup table contained multiple rows with the same join-key value.
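The constraint described above can be verified before triggering a build. Below is a minimal, illustrative Python sketch of the uniqueness check Kylin's snapshot builder effectively performs on a lookup table (the table contents and column values are hypothetical examples, not from the actual issue):

```python
# Sketch of the rule Kylin enforces when snapshotting a lookup table:
# the join key on the dimension side of a one-to-many join must be unique.
# Rows below are hypothetical sample data for illustration only.
from collections import Counter

def find_dup_keys(rows, key):
    """Return join-key values that appear more than once in a lookup table."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)

# A lookup table like B_VENUE_OPEN_TIME with several rows per VENUE_ID
# cannot be snapshotted; Kylin fails with "Dup key found, key=VENUE_ID".
lookup_rows = [
    {"VENUE_ID": 1, "OPEN_TIME": "08:00"},
    {"VENUE_ID": 1, "OPEN_TIME": "18:00"},  # duplicate join key
    {"VENUE_ID": 2, "OPEN_TIME": "09:00"},
]

dups = find_dup_keys(lookup_rows, "VENUE_ID")
if dups:
    # Mirrors the build failure: a table with duplicate join keys
    # must be modeled as the fact table instead of a lookup table.
    print(f"Dup key found, key=VENUE_ID values={dups}")
```

Running an equivalent duplicate check (e.g. a GROUP BY ... HAVING COUNT(*) > 1 query) against the lookup table in Hive before defining the model would catch this misconfiguration early.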
> kylin 4.0.1: building a cube with a one-to-many join between 2 tables fails with "Dup key found, key=venue_id"
> ---------------------------------------------------------------------------------------------------------------
>
> Key: KYLIN-5208
> URL: https://issues.apache.org/jira/browse/KYLIN-5208
> Project: Kylin
> Issue Type: Bug
> Components: Spark Engine
> Reporter: huang song
> Priority: Blocker
>
> 2022-07-08 10:44:56,379 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr Dload Upload Total Spent Left Speed
> 2022-07-08 10:44:56,379 INFO [pool-1-thread-1] cluster.SchedulerInfoCmdHelper : stderr
> ================================================================
> at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:308)
> at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:94)
> ... 4 more
> }
> RetryInfo{
> overrideConf : {spark.executor.memory=6143MB, spark.executor.memoryOverhead=1228MB},
> throwable : java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
> at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:97)
> at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Failed to build lookup table B_VENUE_OPEN_TIME snapshot for Dup key found, key= VENUE_ID
> at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.$anonfun$checkDupKey$1(CubeSnapshotBuilder.scala:200)
> at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.$anonfun$checkDupKey$1$adapted(CubeSnapshotBuilder.scala:190)
> at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
> at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
> at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:190)
> at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
> at org.apache.kylin.engine.spark.job.ParentSourceChooser.$anonfun$decideSources$1(ParentSourceChooser.scala:71)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)