Posted to issues@kylin.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2024/04/12 08:25:00 UTC

[jira] [Commented] (KYLIN-5828) During multi-jobs concurrent building, the flat table may use inconsistent global dictionaries, resulting in incorrect count distinct query results.

    [ https://issues.apache.org/jira/browse/KYLIN-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836484#comment-17836484 ] 

ASF subversion and git services commented on KYLIN-5828:
--------------------------------------------------------

Commit c61dc4189968f29f8e56e98f72a16f100e9d6e2b in kylin's branch refs/heads/kylin5 from huangsheng
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=c61dc41899 ]

KYLIN-5828 Concurrently dict v2 jobs lead to abnormal encoding result


> During multi-jobs concurrent building, the flat table may use inconsistent global dictionaries, resulting in incorrect count distinct query results.
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-5828
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5828
>             Project: Kylin
>          Issue Type: Bug
>          Components: Storage - Parquet
>            Reporter: Zhimin Wu
>            Assignee: Zhimin Wu
>            Priority: Major
>
> *Root Cause*
> When multiple jobs build concurrently against the same global dictionary, nothing guarantees that the flat table encoding step uses one consistent dictionary version. If another job expands the dictionary while encoding is in progress, some flat table partitions mistakenly read partition files from the new dictionary version. Because values are distributed across partitions differently in that version, the lookups cannot find the correct dictionary entries, the affected rows of the flat table's encoded column are set to 0, and the count distinct result is wrong.
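
The failure mode described above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not Kylin's implementation: the modulo bucketing, the bucket counts (2 before expansion, 3 after), and the names bucket_of, build_dictionary and encode are all hypothetical and chosen for illustration. The only behaviour taken from the issue is that a failed lookup yields the encoded value 0.

```python
# Toy model of a bucketed global dictionary (all names are hypothetical).

def bucket_of(value, num_buckets):
    # Assumed routing rule: a value lives in bucket (value mod bucket count).
    return value % num_buckets

def build_dictionary(values, num_buckets):
    # One dict per bucket "file"; ids start at 1 so that 0 can mean "not found".
    buckets = [dict() for _ in range(num_buckets)]
    for next_id, v in enumerate(sorted(set(values)), start=1):
        buckets[bucket_of(v, num_buckets)][v] = next_id
    return buckets

def encode(values, routing_buckets, dictionary):
    # Route each value with `routing_buckets`, then look it up in `dictionary`.
    # When the two disagree, most lookups hit the wrong bucket file and miss.
    encoded = []
    for v in values:
        b = bucket_of(v, routing_buckets)
        bucket = dictionary[b] if b < len(dictionary) else {}
        encoded.append(bucket.get(v, 0))  # a miss is encoded as 0
    return encoded

values = list(range(1, 21))          # 20 distinct raw values
v1 = build_dictionary(values, 2)     # dictionary version seen when the job started
v2 = build_dictionary(values, 3)     # version written by a concurrent job that expanded it

consistent = encode(values, 2, v1)   # routing and bucket files from the same version
mixed = encode(values, 2, v2)        # routed for v1, but reading v2 bucket files

print("distinct (consistent):", len(set(consistent)))
print("distinct (mixed):     ", len(set(mixed)))
print("rows encoded as 0:    ", mixed.count(0))
```

In the mixed case many distinct raw values collapse to the single encoded value 0, so a count distinct computed over the encoded column undercounts. Pinning one dictionary version for the whole encoding step, as in the consistent case, avoids this in the toy model.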



--
This message was sent by Atlassian Jira
(v8.20.10#820010)