Posted to common-issues@hadoop.apache.org by "Rajesh Balamohan (JIRA)" <ji...@apache.org> on 2016/05/23 08:55:12 UTC
[jira] [Commented] (HADOOP-13164) Optimize S3AFileSystem::deleteUnnecessaryFakeDirectories
[ https://issues.apache.org/jira/browse/HADOOP-13164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296117#comment-15296117 ]
Rajesh Balamohan commented on HADOOP-13164:
-------------------------------------------
Instead of optimizing deleteUnnecessaryFakeDirectories to reduce the number of calls to S3, we first need to understand whether it is mandatory to invoke it at all from S3*OutputStream.close() / S3AFileSystem.innerCopyFromLocalFile / S3AFileSystem.innerRename. Thoughts?
> Optimize S3AFileSystem::deleteUnnecessaryFakeDirectories
> --------------------------------------------------------
>
> Key: HADOOP-13164
> URL: https://issues.apache.org/jira/browse/HADOOP-13164
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.8.0
> Reporter: Rajesh Balamohan
> Priority: Minor
>
> https://github.com/apache/hadoop/blob/27c4e90efce04e1b1302f668b5eb22412e00d033/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L1224
> deleteUnnecessaryFakeDirectories is invoked in S3AFileSystem during rename and on output stream close() to purge any fake directories. Depending on how deeply the folder structure is nested, this can take much longer, because it invokes getFileStatus multiple times. Instead, it should break out of the loop once a non-empty directory is encountered.
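
The early-exit idea in the description can be illustrated with a small, self-contained sketch. This is not the S3AFileSystem code: Store, Status, probe, and cleanup are hypothetical names, the "bucket" is an in-memory sorted set of keys, and a fake directory is modelled as a key ending in "/". The sketch also assumes that a non-empty directory without a marker implies its ancestors hold no markers either, which is what would make stopping early safe.

```java
import java.util.TreeSet;

public class FakeDirCleanupSketch {
    /** Result of a simulated status lookup for a directory path. */
    enum Status { MARKER, NON_EMPTY, MISSING }

    /** Hypothetical in-memory stand-in for a bucket; not the S3A API. */
    static final class Store {
        final TreeSet<String> keys = new TreeSet<>();
        int probes = 0; // counts simulated getFileStatus-style calls

        Status probe(String dir) {
            probes++;
            String marker = dir + "/";
            if (keys.contains(marker)) {
                return Status.MARKER; // fake directory object exists
            }
            // Any key under this directory sorts immediately after the marker name.
            String next = keys.higher(marker);
            return (next != null && next.startsWith(marker))
                    ? Status.NON_EMPTY : Status.MISSING;
        }
    }

    /**
     * Walks from the parent of fileKey up towards the root, deleting marker
     * keys. With stopEarly set, the walk ends at the first non-empty
     * directory that has no marker (the optimization proposed in the issue).
     * Returns the number of markers deleted.
     */
    static int cleanup(Store store, String fileKey, boolean stopEarly) {
        int deleted = 0;
        int slash = fileKey.lastIndexOf('/');
        while (slash > 0) {
            String dir = fileKey.substring(0, slash);
            Status status = store.probe(dir);
            if (status == Status.MARKER) {
                store.keys.remove(dir + "/");
                deleted++;
            } else if (status == Status.NON_EMPTY && stopEarly) {
                break; // assumed: ancestors above this point hold no markers
            }
            slash = dir.lastIndexOf('/');
        }
        return deleted;
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.keys.add("a/b/c/d/");         // marker left by an earlier mkdir
        store.keys.add("a/b/other.txt");    // unrelated sibling file
        store.keys.add("a/b/c/d/file.txt"); // newly written file
        int deleted = cleanup(store, "a/b/c/d/file.txt", true);
        System.out.println("deleted=" + deleted + " probes=" + store.probes);
    }
}
```

In this model the run above prints "deleted=1 probes=2". With stopEarly set to false the same tree needs 4 probes, one per ancestor, for the same single deletion.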