Posted to issues@hive.apache.org by "Marton Bod (Jira)" <ji...@apache.org> on 2020/02/03 16:25:00 UTC

[jira] [Updated] (HIVE-22819) Refactor Hive::listFilesCreatedByQuery to make it faster for object stores

     [ https://issues.apache.org/jira/browse/HIVE-22819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Bod updated HIVE-22819:
------------------------------
    Attachment: HIVE-22819.1.patch
        Status: Patch Available  (was: Open)

> Refactor Hive::listFilesCreatedByQuery to make it faster for object stores
> --------------------------------------------------------------------------
>
>                 Key: HIVE-22819
>                 URL: https://issues.apache.org/jira/browse/HIVE-22819
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Marton Bod
>            Assignee: Marton Bod
>            Priority: Major
>         Attachments: HIVE-22819.1.patch
>
>
> Hive::listFilesCreatedByQuery does an exists(), an isDir() and then a listing call. This can be expensive on object stores, where each of these calls is a separate remote request. We should instead list the files in the directory directly. We would have to handle an exception if the directory does not exist, but issuing a single call to the object store would most likely still be more performant.
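
The idea in the description (list first, treat a missing directory as an exceptional case) can be sketched as follows. This is only an illustration and is not taken from the attached patch: it uses java.nio.file on a local file system as a stand-in for the Hadoop FileSystem API, and the class and method names (DirectListing, listFilesOrEmpty) are hypothetical. In Hadoop, the corresponding listing call would be FileSystem.listStatus(Path), which throws FileNotFoundException when the path does not exist.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.NotDirectoryException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive: shows "list directly, handle the
// exception" instead of exists() + isDirectory() + list.
public class DirectListing {

    // Returns the entries of dir, or an empty list if dir is missing or is
    // not a directory. Only one file system call is issued on the happy path.
    static List<Path> listFilesOrEmpty(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                files.add(entry);
            }
        } catch (NoSuchFileException | NotDirectoryException e) {
            // Assumption: a missing or non-directory path means "no files",
            // which is what the separate exists()/isDir() checks guarded against.
            return new ArrayList<>();
        }
        return files;
    }
}
```

Calling listFilesOrEmpty on a path that does not exist returns an empty list rather than failing, so callers need no pre-checks. Whether an empty result is the right behavior for a missing directory in Hive itself is an assumption here; the patch may handle that case differently.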



--
This message was sent by Atlassian Jira
(v8.3.4#803005)