Posted to issues@hive.apache.org by "Mustafa İman (Jira)" <ji...@apache.org> on 2020/11/13 01:25:00 UTC

[jira] [Updated] (HIVE-24380) NullScanTaskDispatcher should liststatus in parallel

     [ https://issues.apache.org/jira/browse/HIVE-24380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mustafa İman updated HIVE-24380:
--------------------------------
    Description: NullScanTaskDispatcher calls listStatus on hundreds of partition directories in the case of external tables. This is a big problem in cloud installations where directory listings are served by an object store such as S3. We can do this in parallel.  (was: NullScanTaskDispatcher should query listStatus in parallel, as it might take a long time to go through hundreds of partitions serially. This is a big problem in cloud installations where directory listings are served by an object store such as S3.)

> NullScanTaskDispatcher should liststatus in parallel
> ----------------------------------------------------
>
>                 Key: HIVE-24380
>                 URL: https://issues.apache.org/jira/browse/HIVE-24380
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Mustafa İman
>            Assignee: Mustafa İman
>            Priority: Major
>
> NullScanTaskDispatcher calls listStatus on hundreds of partition directories in the case of external tables. This is a big problem in cloud installations where directory listings are served by an object store such as S3. We can do this in parallel.
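
For illustration only, here is a minimal sketch of the general idea of issuing directory listings in parallel instead of one after another. It is not the patch attached to this issue and does not use Hive or Hadoop classes: java.nio.file.Files.list stands in for Hadoop's FileSystem.listStatus, and the names ParallelListing, listOne and listAll are hypothetical. The thread count is an assumed tuning parameter.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hive.
class ParallelListing {

    // Lists a single directory. Assumption: this stands in for
    // FileSystem.listStatus(Path), which is the slow call on an object store.
    static List<Path> listOne(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Submits one listing task per directory to a fixed-size pool, then
    // collects the results in the same order the directories were given.
    static Map<Path, List<Path>> listAll(List<Path> dirs, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<Path, Future<List<Path>>> pending = new LinkedHashMap<>();
            for (Path dir : dirs) {
                Callable<List<Path>> task = () -> listOne(dir);
                pending.put(dir, pool.submit(task));
            }
            Map<Path, List<Path>> result = new LinkedHashMap<>();
            for (Map.Entry<Path, Future<List<Path>>> e : pending.entrySet()) {
                // get() blocks until that listing finishes and rethrows
                // any failure wrapped in an ExecutionException.
                result.put(e.getKey(), e.getValue().get());
            }
            return result;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With a serial loop the total time is roughly the sum of the individual listing latencies. With a pool of N threads, up to N listings are in flight at once, which matters most when each call is a remote round trip.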


