Posted to issues@kylin.apache.org by "Yaqian Zhang (Jira)" <ji...@apache.org> on 2021/11/09 09:21:00 UTC

[jira] [Assigned] (KYLIN-5067) CubeBuildJob build unnecessary snapshot

     [ https://issues.apache.org/jira/browse/KYLIN-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yaqian Zhang reassigned KYLIN-5067:
-----------------------------------

    Assignee: Xiaoxiang Yu

> CubeBuildJob build unnecessary snapshot 
> ----------------------------------------
>
>                 Key: KYLIN-5067
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5067
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v4.0.0-beta
>            Reporter: Xiaoxiang Yu
>            Assignee: Xiaoxiang Yu
>            Priority: Major
>             Fix For: v4.0.1
>
>
> In the TPC-H benchmark, query 13 contains a 'left outer join' whose right table's join key (o_custkey) is not unique. This causes the build job to fail with the following exception.
>  
> {code:java}
> java.lang.RuntimeException: Error execute org.apache.kylin.engine.spark.job.CubeBuildJob
> 	at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:96)
> 	at org.apache.spark.application.JobWorker$$anon$2.run(JobWorker.scala:55)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Failed to build lookup table V_ORDERS snapshot for Dup key found, key= O_CUSTKEY
> 	at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:198)
> 	at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder$$anonfun$checkDupKey$1.apply(CubeSnapshotBuilder.scala:190)
> 	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> 	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
> 	at org.apache.kylin.engine.spark.builder.CubeSnapshotBuilder.checkDupKey(CubeSnapshotBuilder.scala:189)
> 	at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideFlatTableSource(ParentSourceChooser.scala:83)
> 	at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:71)
> 	at org.apache.kylin.engine.spark.job.ParentSourceChooser$$anonfun$decideSources$1.apply(ParentSourceChooser.scala:66)
> 	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
> 	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> 	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> 	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> 	at org.apache.kylin.engine.spark.job.ParentSourceChooser.decideSources(ParentSourceChooser.scala:66)
> 	at org.apache.kylin.engine.spark.job.CubeBuildJob.doExecute(CubeBuildJob.java:178)
> 	at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:304)
> 	at org.apache.kylin.engine.spark.application.SparkApplication.execute(SparkApplication.java:93)
> 	... 4 more
> {code}
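
For readers unfamiliar with the failure mode, the following is a minimal sketch of a duplicate-key check of the kind named in the stack trace above. It is not Kylin's actual CubeSnapshotBuilder.checkDupKey code: the class name DupKeyCheckSketch, the helper findDuplicateKeys, and the in-memory key list are hypothetical and exist only for illustration. Only the exception message format is copied from the trace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not code from the Kylin repository.
public class DupKeyCheckSketch {

    // Returns each join-key value that occurs more than once,
    // in the order in which its first repetition is seen.
    public static List<Integer> findDuplicateKeys(List<Integer> joinKeys) {
        Set<Integer> seen = new HashSet<>();
        Set<Integer> reported = new HashSet<>();
        List<Integer> duplicates = new ArrayList<>();
        for (Integer key : joinKeys) {
            // seen.add returns false when the key was already present
            if (!seen.add(key) && reported.add(key)) {
                duplicates.add(key);
            }
        }
        return duplicates;
    }

    // Throws if the join key column is not unique, mirroring the
    // message format shown in the stack trace.
    public static void checkDupKey(String table, String column, List<Integer> joinKeys) {
        if (!findDuplicateKeys(joinKeys).isEmpty()) {
            throw new IllegalStateException("Failed to build lookup table " + table
                    + " snapshot for Dup key found, key= " + column);
        }
    }

    public static void main(String[] args) {
        // In TPC-H one customer can place many orders, so o_custkey repeats.
        List<Integer> oCustkey = Arrays.asList(101, 102, 101, 103);
        try {
            checkDupKey("V_ORDERS", "O_CUSTKEY", oCustkey);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this sketch, any table whose join key repeats fails the check, which is why a non-unique right-side key such as o_custkey stops the build whenever a snapshot of that table is attempted.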



--
This message was sent by Atlassian Jira
(v8.20.1#820001)