Posted to dev@hive.apache.org by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org> on 2010/12/16 00:59:01 UTC

[jira] Commented: (HIVE-1852) Reduce unnecessary DFSClient.rename() calls

    [ https://issues.apache.org/jira/browse/HIVE-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971893#action_12971893 ] 

Joydeep Sen Sarma commented on HIVE-1852:
-----------------------------------------

Are you sure this is ok? It seems we have changed the semantics: the old code takes each file from underneath the dir and moves it into the final location, while the new code moves the directory itself underneath the final location. So there's one extra level of directory in the new code that's not there in the old code. Also, the semantics in terms of collisions change because of this: if we create a subdir, then collisions that may occur in the old code may not occur in the new code (because of the rename).
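
To make the concern concrete, here is a rough sketch on a local temp directory, not the actual patch or the Hive/DFSClient code. The helper hdfs_like_rename() is a hypothetical stand-in that models the behavior described above (renaming onto an existing directory puts the source underneath it), and the file names and the tmp_out/final directory names are made up for illustration.

```python
import os
import shutil
import tempfile


def hdfs_like_rename(src, dst):
    """Hypothetical model of the rename semantics discussed in the comment:
    if dst is an existing directory, src ends up underneath it."""
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    if os.path.exists(dst):
        return False  # collision: the destination name is already taken
    os.rename(src, dst)
    return True


def move_per_file(src_dir, final_dir):
    """Old approach: one rename per file. Returns the names that collided."""
    collisions = []
    for name in sorted(os.listdir(src_dir)):
        ok = hdfs_like_rename(os.path.join(src_dir, name),
                              os.path.join(final_dir, name))
        if not ok:
            collisions.append(name)
    return collisions


def move_whole_dir(src_dir, final_dir):
    """New approach: a single rename of the whole directory."""
    return hdfs_like_rename(src_dir, final_dir)


def listing(root):
    """All files under root, as sorted paths relative to root."""
    found = []
    for dirpath, _, files in os.walk(root):
        for f in files:
            rel = os.path.relpath(os.path.join(dirpath, f), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def demo(mover):
    """Run one mover against a final dir that already holds a same-named file."""
    base = tempfile.mkdtemp()
    try:
        tmp = os.path.join(base, "tmp_out")
        final = os.path.join(base, "final")
        os.makedirs(tmp)
        os.makedirs(final)
        for name in ("000000_0", "000001_0"):
            open(os.path.join(tmp, name), "w").close()
        # Pre-existing file in the final location with a clashing name.
        open(os.path.join(final, "000000_0"), "w").close()
        return mover(tmp, final), listing(final)
    finally:
        shutil.rmtree(base)


if __name__ == "__main__":
    print("per-file :", demo(move_per_file))
    print("whole dir:", demo(move_whole_dir))
```

In this sketch the per-file move reports a collision on the clashing name and leaves the files directly in final, while the whole-directory move reports no collision and leaves everything one level down under final/tmp_out.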

> Reduce unnecessary DFSClient.rename() calls
> -------------------------------------------
>
>                 Key: HIVE-1852
>                 URL: https://issues.apache.org/jira/browse/HIVE-1852
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1852.patch
>
>
> On the Hive client side (MoveTask etc.), DFSClient.rename() is called for every file inside a directory. This is very expensive for a large directory on a busy DFS namenode. We should replace it with a single rename() call on the whole directory.
