Posted to dev@hive.apache.org by "Marta Kuczora (Jira)" <ji...@apache.org> on 2020/02/21 10:59:00 UTC

[jira] [Created] (HIVE-22918) Investigate empty bucket file creation for ACID tables

Marta Kuczora created HIVE-22918:
------------------------------------

             Summary: Investigate empty bucket file creation for ACID tables
                 Key: HIVE-22918
                 URL: https://issues.apache.org/jira/browse/HIVE-22918
             Project: Hive
          Issue Type: Task
    Affects Versions: 4.0.0
            Reporter: Marta Kuczora
            Assignee: Marton Bod


When we create an insert-only bucketed table with 5 buckets and insert only one row into it, Hive creates empty files for the other 4 buckets. The same logic exists in the code for ACID tables, but when I checked the table's final directory after the insert, I found that only 1 file had been created. When I debugged this, I found that the empty files are created in the staging directory outside the delta directory, so the move task does not copy them to the final directory. This behavior seems broken, but I am not sure whether we really need the empty files in this case.

This Jira is about investigating whether we need these empty files for ACID tables and, if we do, fixing the code so that they are created for ACID tables as well.
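
To make the observed behavior concrete, here is a small Python model of it. This is only an illustrative sketch, not Hive code: the function names, the path layout, and the "non-negative hash modulo bucket count" rule are assumptions made for the example.

```python
# Hypothetical model of the behavior described in this ticket; not Hive code.

def bucket_for(hash_code, num_buckets):
    # Assumption: the bucket id is the non-negative hash modulo the bucket count.
    return (hash_code & 0x7FFFFFFF) % num_buckets


def buckets_without_rows(row_hashes, num_buckets):
    # Buckets that receive no row; these are the ones that would get empty files.
    used = {bucket_for(h, num_buckets) for h in row_hashes}
    return [b for b in range(num_buckets) if b not in used]


def files_moved(staged_paths, delta_dir):
    # Assumption: the move step only picks up files located under the delta directory.
    prefix = delta_dir.rstrip("/") + "/"
    return [p for p in staged_paths if p.startswith(prefix)]


# One row (made-up hash value 7) inserted into a table with 5 buckets.
empty = buckets_without_rows([7], 5)

# Made-up staging layout: the data file is under the delta directory,
# the empty bucket files are created next to it, outside the delta directory.
staged = ["staging/delta_1/bucket_2"] + ["staging/empty_bucket_%d" % b for b in empty]
moved = files_moved(staged, "staging/delta_1")
```

In this model, `empty` contains the 4 buckets without rows and `moved` contains only the single data file, which matches the single file seen in the final directory.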



--
This message was sent by Atlassian Jira
(v8.3.4#803005)