Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/07/27 15:45:00 UTC
[jira] [Commented] (HADOOP-17157) S3A rename operation not the same with HDFS
[ https://issues.apache.org/jira/browse/HADOOP-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165809#comment-17165809 ]
Steve Loughran commented on HADOOP-17157:
-----------------------------------------
Thank you for running and enhancing the tests; always appreciated.
rename() in FileSystem is the trouble spot in our lives, especially that bit about empty directories, which was more of an accident/misunderstanding (mv does that, POSIX does not: https://pubs.opengroup.org/onlinepubs/009695399/functions/rename.html).
That HDFS behaviour you see holds if-and-only-if the destination is empty.
Regarding both that filesystem spec and the s3a behaviour: yes, we may be wrong, at least as far as empty directories are concerned.
* the bit of the spec needs review/cleanup. It's the bit we are scared of
* I don't really want to change s3a as the general consensus is that HDFS is broken
FileContext's rename() doesn't do bad things on empty dest directories. I'll have to look @ s3a now to see what it does.
What to do *properly* here
HADOOP-11452 looks at making rename/3 public and specified; it has never been finished. See the discussion there for what I'd like now.
HDDS-2112 covers ozone/hdfs mismatch
I think I'd like to see that async rename I've discussed there, but if you want to take up HADOOP-11452 and finish it off....
(ps: thank you for running the tests. Always appreciated)
What now:
# I'd recommend you look at org.apache.hadoop.fs.contract.ContractOptions and see the options there, and which filesystems do what. I think S3A copies POSIX.
# if it gets things hopelessly wrong there, that's an issue
# if it doesn't copy HDFS's considered-wrong policy: I don't feel too bad.
Our filesystem.md docs clearly need improving on this section. I think the big issue is that the author of that bit of spec didn't fully understand HDFS and was scared to look into the details. I speak as that individual.
If you want to review and clarify it, that would be gladly welcomed.
> S3A rename operation not the same with HDFS
> -------------------------------------------
>
> Key: HADOOP-17157
> URL: https://issues.apache.org/jira/browse/HADOOP-17157
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Reporter: Jiajia Li
> Priority: Major
>
> When I run the test ITestS3ADeleteManyFiles, I change the [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/ITestS3ADeleteManyFiles.java#L97]
> to
> {code}
> fs.mkdirs(finalDir);
> {code}
> So before the rename operation, "finalParent/final" has been created.
> But after the rename operation, all the files will be moved from "srcParent/src" to "finalParent/final"
> So this is not the same as the HDFS rename operation:
> HDFS rename includes the calculation of the destination path. If the destination exists and is a directory, the final destination of the rename becomes the destination + the filename of the source path.
> let dest = if (isDir(FS, src) and d != src) :
> d + [filename(src)]
> else :
> d
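
To make the two outcomes discussed in this thread concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It follows the prose of the quoted rule (the check is on whether the destination is an existing directory), which is an assumption on my part, since the quoted pseudocode tests isDir(FS, src). The class name RenameDestSketch, its method names, and the use of a Set of strings as a stand-in for the filesystem's directories are all hypothetical and are not part of any Hadoop API.

```java
import java.util.HashSet;
import java.util.Set;

public class RenameDestSketch {

    /** Last path element, e.g. "src" for "/srcParent/src". */
    static String filename(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /**
     * Destination as described in the issue text: if dest is an existing
     * directory (and is not the source itself), the source ends up under it.
     * "dirs" is a hypothetical stand-in for the set of existing directories.
     */
    static String resolveUnderExistingDir(Set<String> dirs, String src, String dest) {
        if (dirs.contains(dest) && !dest.equals(src)) {
            String sep = dest.endsWith("/") ? "" : "/";
            return dest + sep + filename(src);
        }
        return dest;
    }

    /**
     * Outcome the reporter observed in the modified S3A test: the destination
     * path is used as given, even though it already exists as an empty directory.
     */
    static String resolveAsGiven(String src, String dest) {
        return dest;
    }

    public static void main(String[] args) {
        Set<String> dirs = new HashSet<>();
        dirs.add("/srcParent/src");
        dirs.add("/finalParent/final"); // created beforehand, as in the modified test

        String src = "/srcParent/src";
        String dest = "/finalParent/final";

        // Prints /finalParent/final/src
        System.out.println(resolveUnderExistingDir(dirs, src, dest));
        // Prints /finalParent/final
        System.out.println(resolveAsGiven(src, dest));
    }
}
```

The only point of the sketch is that the same (src, dest) pair yields two different final paths depending on which rule applies; it says nothing about how either filesystem implements the move.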