Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/12/22 06:12:00 UTC

[jira] [Work logged] (HIVE-26884) Iceberg: V2 Vectorization returns wrong results with deletes

     [ https://issues.apache.org/jira/browse/HIVE-26884?focusedWorklogId=835189&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-835189 ]

ASF GitHub Bot logged work on HIVE-26884:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Dec/22 06:11
            Start Date: 22/Dec/22 06:11
    Worklog Time Spent: 10m 
      Work Description: ayushtkn opened a new pull request, #3890:
URL: https://github.com/apache/hive/pull/3890

   ### What changes were proposed in this pull request?
   
   Fix Row Number calculation in Parquet Vectorized V2 reads.
   
   ### Why are the changes needed?
   
   When a vectorized V2 read spans multiple Parquet blocks and some of them are filtered out, the wrong row positions are calculated, which breaks the PositionalDeleteFilter.
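
To illustrate the failure mode, here is a minimal standalone sketch. It is not the code from this PR or from Hive; the class and method names are hypothetical, and the block sizes are made up. It assumes positional deletes refer to a row's position within the whole file, so a block's starting position must count the rows of skipped blocks too.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Hive or Iceberg.
public class RowPositionSketch {

    /**
     * Returns the file-level position of the first row of every kept block.
     * rowCounts[i] is the number of rows in block i of the file;
     * kept[i] says whether block i survived filtering and will be read.
     */
    public static long[] startPositions(long[] rowCounts, boolean[] kept) {
        long[] starts = new long[rowCounts.length];
        int n = 0;
        long position = 0;
        for (int i = 0; i < rowCounts.length; i++) {
            if (kept[i]) {
                starts[n++] = position;
            }
            // Advance past the block even when it is skipped, so later
            // blocks keep their true offset within the file.
            position += rowCounts[i];
        }
        return Arrays.copyOf(starts, n);
    }

    /** Faulty variant: the position advances only over blocks that are read. */
    public static long[] naiveStartPositions(long[] rowCounts, boolean[] kept) {
        long[] starts = new long[rowCounts.length];
        int n = 0;
        long position = 0;
        for (int i = 0; i < rowCounts.length; i++) {
            if (kept[i]) {
                starts[n++] = position;
                position += rowCounts[i];
            }
        }
        return Arrays.copyOf(starts, n);
    }

    public static void main(String[] args) {
        long[] rowCounts = {100, 50, 200};   // made-up block sizes
        boolean[] kept = {true, false, true}; // the middle block is filtered out
        System.out.println(Arrays.toString(startPositions(rowCounts, kept)));      // [0, 150]
        System.out.println(Arrays.toString(naiveStartPositions(rowCounts, kept))); // [0, 100]
    }
}
```

In this example the faulty variant places the third block at position 100 instead of 150, so a positional delete recorded for file position 150 would be applied to the wrong row.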
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Parquet V2 reads now return correct results.
   
   ### How was this patch tested?
   
   Tested in a local environment with a subset of data from the actual environment. (Further testing is in progress.)




Issue Time Tracking
-------------------

            Worklog Id:     (was: 835189)
    Remaining Estimate: 0h
            Time Spent: 10m

> Iceberg: V2 Vectorization returns wrong results with deletes
> ------------------------------------------------------------
>
>                 Key: HIVE-26884
>                 URL: https://issues.apache.org/jira/browse/HIVE-26884
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In Iceberg V2 reads with delete files, if some Parquet blocks are skipped, the row number calculation goes wrong. The computed row numbers then no longer match the row positions in the delete filter, which leads to wrong results.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)