Posted to dev@hive.apache.org by "He Yongqiang (JIRA)" <ji...@apache.org> on 2011/03/08 23:13:59 UTC

[jira] Commented: (HIVE-2030) isEmptyPath() to use ContentSummary cache

    [ https://issues.apache.org/jira/browse/HIVE-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004253#comment-13004253 ] 

He Yongqiang commented on HIVE-2030:
------------------------------------

The ContentSummary is not guaranteed to be populated. Even if it is, it seems this information is not passed to the child process. (So this is not empty only when executing in local mode.)
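
To illustrate the concern, here is a minimal sketch of a cache lookup that falls back to a direct lookup when no summary is cached. It is not the code in HIVE-2030.1.patch and does not use the Hadoop or Hive APIs. The class EmptyPathCheck, the map of cached file counts, and the fallback function are all hypothetical stand-ins, assumed here only to show why an unpopulated cache (for example, in a child process that never received it) must not be treated as "path is empty".

```java
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical sketch: not Hive's Utilities.isEmptyPath() and not the attached patch.
class EmptyPathCheck {
    // Assumed stand-in for a ContentSummary cache: path -> number of files under it.
    private final Map<String, Long> cachedFileCounts;
    // Assumed stand-in for the DFS namenode call that counts files under a path.
    private final ToLongFunction<String> namenodeFileCount;
    // Counts how often the fallback (the expensive call) was actually used.
    private int fallbackCalls = 0;

    EmptyPathCheck(Map<String, Long> cachedFileCounts,
                   ToLongFunction<String> namenodeFileCount) {
        this.cachedFileCounts = cachedFileCounts;
        this.namenodeFileCount = namenodeFileCount;
    }

    boolean isEmptyPath(String path) {
        Long cached = cachedFileCounts.get(path);
        if (cached != null) {
            // Cache hit: answer without contacting the namenode.
            return cached == 0L;
        }
        // Cache miss: the summary was never populated (or never reached this
        // process), so fall back to the direct lookup instead of guessing.
        fallbackCalls++;
        return namenodeFileCount.applyAsLong(path) == 0L;
    }

    int getFallbackCalls() {
        return fallbackCalls;
    }
}
```

With a cache that holds an entry for one partition path only, a call for that path is answered from the map, while a call for any other path goes through the fallback. If the cache is empty, as it would be in a process that never received it, every call takes the fallback path and nothing is saved.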

> isEmptyPath() to use ContentSummary cache
> -----------------------------------------
>
>                 Key: HIVE-2030
>                 URL: https://issues.apache.org/jira/browse/HIVE-2030
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>            Priority: Minor
>         Attachments: HIVE-2030.1.patch
>
>
> addInputPaths() calls isEmptyPath() for every input path. Currently, every call is a DFS namenode call. By making isEmptyPath() use the cached ContentSummary, we should be able to avoid some namenode calls and reduce latency in the case of multiple partitions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira