Posted to issues@hive.apache.org by "Ádám Szita (Jira)" <ji...@apache.org> on 2022/11/15 09:48:00 UTC

[jira] [Updated] (HIVE-26137) Optimized transfer of Iceberg residual expressions from AM to execution

     [ https://issues.apache.org/jira/browse/HIVE-26137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ádám Szita updated HIVE-26137:
------------------------------
    Component/s: Iceberg integration

> Optimized transfer of Iceberg residual expressions from AM to execution
> -----------------------------------------------------------------------
>
>                 Key: HIVE-26137
>                 URL: https://issues.apache.org/jira/browse/HIVE-26137
>             Project: Hive
>          Issue Type: Improvement
>          Components: Iceberg integration
>            Reporter: Ádám Szita
>            Assignee: Ádám Szita
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-2
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> HIVE-25967 introduced a hack to prevent Iceberg filter expressions from being serialized into splits. This temporary fix avoided OOM problems on the Tez AM side, but it also prevented predicate pushdown from working on the execution side.
> This ticket intends to incorporate the long-term solution. It turns out that the file scan tasks created by Iceberg don't actually contain a "residual" expression, but rather the complete, original one. The expression becomes residual only when it is evaluated against each task's partition value, which happens only on the execution side. This means that the original filter is the same expression for all splits in the Tez AM, so we can transfer it once via the job conf instead.
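
The idea described above can be illustrated with a minimal, self-contained sketch. It is not the Hive or Iceberg implementation: java.util.Properties stands in for the job conf, and the class FilterViaJobConf, the key "example.iceberg.filter", and the toy Filter type are all hypothetical names chosen for illustration. The point is only that the shared filter is stored once on the AM side, while the residual is computed per split on the execution side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.Properties;

// Hypothetical sketch; none of these names are Hive or Iceberg APIs.
final class FilterViaJobConf {
    // Hypothetical config key under which the shared filter is stored once.
    static final String FILTER_KEY = "example.iceberg.filter";

    /** Toy filter meaning: part = partEquals AND value > valueGreaterThan. */
    static final class Filter implements Serializable {
        private static final long serialVersionUID = 1L;
        final int partEquals;
        final int valueGreaterThan;

        Filter(int partEquals, int valueGreaterThan) {
            this.partEquals = partEquals;
            this.valueGreaterThan = valueGreaterThan;
        }
    }

    /** AM side: serialize the original filter once into the conf stand-in. */
    static void storeFilter(Properties conf, Filter filter) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(filter);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        conf.setProperty(FILTER_KEY, Base64.getEncoder().encodeToString(bytes.toByteArray()));
    }

    /** Execution side: read the shared filter back from the conf stand-in. */
    static Filter loadFilter(Properties conf) {
        byte[] raw = Base64.getDecoder().decode(conf.getProperty(FILTER_KEY));
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (Filter) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Execution side: the filter becomes residual only against a split's partition value. */
    static String residual(Filter filter, int splitPartitionValue) {
        if (splitPartitionValue != filter.partEquals) {
            return "false"; // the partition predicate rules out this split entirely
        }
        return "value > " + filter.valueGreaterThan; // partition predicate satisfied, rest remains
    }
}
```

In this sketch the splits themselves carry no filter at all, only their partition value, so the AM-side payload does not grow with the number of splits.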



--
This message was sent by Atlassian Jira
(v8.20.10#820010)