Posted to issues@hive.apache.org by "Ádám Szita (Jira)" <ji...@apache.org> on 2022/11/15 09:48:00 UTC

[jira] [Updated] (HIVE-26110) Bulk insert into partitioned table creates lots of files in iceberg

     [ https://issues.apache.org/jira/browse/HIVE-26110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ádám Szita updated HIVE-26110:
------------------------------
    Component/s: Iceberg integration

> Bulk insert into partitioned table creates lots of files in iceberg
> -------------------------------------------------------------------
>
>                 Key: HIVE-26110
>                 URL: https://issues.apache.org/jira/browse/HIVE-26110
>             Project: Hive
>          Issue Type: Bug
>          Components: Iceberg integration
>            Reporter: Rajesh Balamohan
>            Assignee: Ádám Szita
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-2
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> For example, create the web_returns table from TPC-DS in Iceberg format and try to copy the data over from the regular table, with something like "insert into web_returns_iceberg select * from web_returns".
> This inserts the data correctly, but it leaves a lot of files in each partition. IMO, the dynamic sort optimisation isn't working properly, so records are not grouped in the final phase.
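
The file-count effect described in the report can be illustrated with a small, self-contained simulation. This is only a sketch under simplifying assumptions, not Hive's or Iceberg's actual writer code: it assumes each writer task emits one file per partition it receives rows for, and the function name count_files and its parameters are hypothetical.

```python
from collections import defaultdict


def count_files(rows, num_writers, cluster_by_partition):
    """Return {partition: number of files} for a simulated bulk insert.

    Simplified model (an assumption, not Hive internals): every
    (writer, partition) pair that receives at least one row yields one file.
    """
    writers_per_partition = defaultdict(set)
    for i, (partition, _payload) in enumerate(rows):
        if cluster_by_partition:
            # Rows are grouped by partition key before the final write,
            # so each partition is handled by exactly one writer.
            writer = hash(partition) % num_writers
        else:
            # Rows reach writers in arrival order, so every writer
            # ends up seeing rows from every partition.
            writer = i % num_writers
        writers_per_partition[partition].add(writer)
    return {p: len(w) for p, w in writers_per_partition.items()}


# 5 partitions (e.g. 5 distinct partition-key values), 100 rows each, interleaved.
rows = [(day, n) for n in range(100) for day in range(5)]

unclustered = count_files(rows, num_writers=8, cluster_by_partition=False)
clustered = count_files(rows, num_writers=8, cluster_by_partition=True)

print("files per partition without grouping:", unclustered)
print("files per partition with grouping:   ", clustered)
```

In this model the file count per partition grows with the number of writers when rows are not grouped, and stays at one per partition when they are, which matches the symptom the reporter describes.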



--
This message was sent by Atlassian Jira
(v8.20.10#820010)