Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/04/05 09:02:00 UTC

[jira] [Work logged] (HIVE-26110) Bulk insert into partitioned table creates lots of files in iceberg

     [ https://issues.apache.org/jira/browse/HIVE-26110?focusedWorklogId=752722&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-752722 ]

ASF GitHub Bot logged work on HIVE-26110:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Apr/22 09:01
            Start Date: 05/Apr/22 09:01
    Worklog Time Spent: 10m 
      Work Description: szlta merged PR #3174:
URL: https://github.com/apache/hive/pull/3174




Issue Time Tracking
-------------------

    Worklog Id:     (was: 752722)
    Time Spent: 1h  (was: 50m)

> Bulk insert into partitioned table creates lots of files in iceberg
> -------------------------------------------------------------------
>
>                 Key: HIVE-26110
>                 URL: https://issues.apache.org/jira/browse/HIVE-26110
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> For example, create the TPC-DS web_returns table in Iceberg format and copy the data over from the regular table, with something like "insert into web_returns_iceberg select * from web_returns".
> This inserts the data correctly, but it leaves a large number of files in each partition. IMO, the dynamic sort optimisation isn't working properly, so records are not grouped in the final phase.
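
The file-count effect described above can be illustrated with a small simulation. This is only a sketch and not Hive or Iceberg code: it assumes each writer task opens one output file per partition value it encounters, and the names count_files, round_robin and by_partition are hypothetical helpers made up for the illustration.

```python
def count_files(task_inputs):
    """Total files written, assuming one file per distinct partition per task."""
    return sum(len({partition for partition, _row in records})
               for records in task_inputs)


def round_robin(records, num_tasks):
    """Ungrouped case: records are dealt to tasks regardless of partition."""
    tasks = [[] for _ in range(num_tasks)]
    for i, record in enumerate(records):
        tasks[i % num_tasks].append(record)
    return tasks


def by_partition(records, num_tasks):
    """Grouped case: every record of a partition is routed to the same task."""
    tasks = [[] for _ in range(num_tasks)]
    for record in records:
        partition = record[0]
        tasks[partition % num_tasks].append(record)
    return tasks


# 10 partitions with 6 rows each, interleaved as they might arrive from a scan.
records = [(partition, row) for row in range(6) for partition in range(10)]

ungrouped_files = count_files(round_robin(records, num_tasks=3))
grouped_files = count_files(by_partition(records, num_tasks=3))
print(ungrouped_files, grouped_files)
```

In this toy setup the ungrouped routing lets every task see every partition, so the file count scales with tasks times partitions, while the grouped routing writes one file per partition. Real file counts in Hive also depend on file size targets and task parallelism, which the sketch ignores.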



--
This message was sent by Atlassian Jira
(v8.20.1#820001)