You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org> on 2011/03/21 18:53:06 UTC

[jira] [Commented] (HIVE-2051) getInputSummary() to call FileSystem.getContentSummary() in parallel

    [ https://issues.apache.org/jira/browse/HIVE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009237#comment-13009237 ] 

Joydeep Sen Sarma commented on HIVE-2051:
-----------------------------------------

+1. will commit after running tests.

> getInputSummary() to call FileSystem.getContentSummary() in parallel
> --------------------------------------------------------------------
>
>                 Key: HIVE-2051
>                 URL: https://issues.apache.org/jira/browse/HIVE-2051
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>            Priority: Minor
>         Attachments: HIVE-2051.1.patch, HIVE-2051.2.patch, HIVE-2051.3.patch, HIVE-2051.4.patch, HIVE-2051.5.patch
>
>
> getInputSummary() now call FileSystem.getContentSummary() one by one, which can be extremely slow when the number of input paths are huge. By calling those functions in parallel, we can cut latency in most cases.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira