Posted to commits@hudi.apache.org by "Sagar Sumit (Jira)" <ji...@apache.org> on 2022/04/04 02:15:00 UTC

[jira] [Closed] (HUDI-3776) Fix BloomIndex incorrectly using ColStats to lookup records locations

     [ https://issues.apache.org/jira/browse/HUDI-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sagar Sumit closed HUDI-3776.
-----------------------------
    Resolution: Fixed

> Fix BloomIndex incorrectly using ColStats to lookup records locations
> ---------------------------------------------------------------------
>
>                 Key: HUDI-3776
>                 URL: https://issues.apache.org/jira/browse/HUDI-3776
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Alexey Kudinkin
>            Assignee: Sagar Sumit
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.11.0
>
>
> Currently, BloomIndex relies solely on Column Stats (ColStats) to look up record locations. This is incorrect, because the ColStats state might not be complete at any given moment. Instead, we should use it on a best-effort basis (without assuming that it contains any records at all), and for those files that are not found in ColStats we should read the required information from the files directly.
> You can search the code for "HUDI-3776" to find the exact code location this relates to.
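
The best-effort lookup described above can be sketched roughly as follows. This is an illustrative Python sketch, not Hudi's actual implementation: the function name `candidate_files_for_key`, the `col_stats` dictionary, and the `read_range_from_file` callback are all hypothetical stand-ins. It assumes that ColStats maps a file name to an inclusive (min key, max key) range and that the same range can be read from the file itself when ColStats has no entry.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical representation: inclusive (min_record_key, max_record_key).
KeyRange = Tuple[str, str]


def candidate_files_for_key(
    record_key: str,
    all_files: Iterable[str],
    col_stats: Dict[str, KeyRange],
    read_range_from_file: Callable[[str], Optional[KeyRange]],
) -> List[str]:
    """Return the files that may contain record_key.

    col_stats is consulted on a best-effort basis: a missing entry never
    excludes a file, it only triggers a fallback to the file itself.
    """
    candidates = []
    for file_name in all_files:
        key_range = col_stats.get(file_name)
        if key_range is None:
            # Not covered by ColStats (the index may be incomplete):
            # fall back to reading the range from the file directly.
            key_range = read_range_from_file(file_name)
        if key_range is None:
            # No range available anywhere: keep the file, since pruning
            # it could lose a record location.
            candidates.append(file_name)
            continue
        low, high = key_range
        if low <= record_key <= high:
            candidates.append(file_name)
    return candidates


# Usage: ColStats only knows about two of the four files.
col_stats = {"f1.parquet": ("a", "f"), "f2.parquet": ("g", "m")}
file_footers = {"f3.parquet": ("c", "k")}  # stand-in for reading the file
result = candidate_files_for_key(
    "d",
    ["f1.parquet", "f2.parquet", "f3.parquet", "f4.parquet"],
    col_stats,
    file_footers.get,
)
print(result)  # ['f1.parquet', 'f3.parquet', 'f4.parquet']
```

The point of the sketch is that a missing ColStats entry is treated as "unknown" rather than "no match", so an incomplete index can only reduce pruning, never drop a file that might hold the record.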



--
This message was sent by Atlassian Jira
(v8.20.1#820001)