Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/07/27 15:45:00 UTC

[jira] [Commented] (HADOOP-17157) S3A rename operation not the same with HDFS

    [ https://issues.apache.org/jira/browse/HADOOP-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165809#comment-17165809 ] 

Steve Loughran commented on HADOOP-17157:
-----------------------------------------

thank you for running with and enhancing the tests - always appreciated

rename in FileSystem is the troublespot in our lives, especially that bit about empty directories, which was more of an accident/misunderstanding (mv does that, POSIX does not: https://pubs.opengroup.org/onlinepubs/009695399/functions/rename.html).

That HDFS behaviour you see holds if-and-only-if the destination is empty. 

regarding both that filesystem spec and the s3a behaviour, yes, we may be wrong, at least as far as empty directories are concerned.

* the bit of the spec needs review/cleanup. It's the bit we are scared of
* I don't really want to change s3a as the general consensus is that HDFS is broken

FileContext's rename() doesn't do bad things on empty dest directories. I'll have to look @ s3a now to see what it does.

What to do *properly* here

HADOOP-11452 looks at making rename/3 public and specified; it has never been finished. See the discussion there for what I'd like now.

HDDS-2112 covers ozone/hdfs mismatch


I think I'd like to see that async rename I've discussed there, but if you want to take up HADOOP-11452 and finish it off....

(ps: thank you for running the tests. Always appreciated)

What now:

# I'd recommend you look at org.apache.hadoop.fs.contract.ContractOptions and see the options there, and which filesystems do what. I think S3A copies posix. 
# if it gets things hopelessly wrong there, that's an issue
# if it doesn't copy HDFS's considered-wrong policy: I don't feel too bad. 

Our filesystem.md docs clearly need improving on this section. I think the big issue is that the author of that bit of spec didn't fully understand HDFS and was scared to look into the details. I speak as that individual.

If you want to review and clarify it, that would be gladly welcomed.




> S3A rename operation not the same with HDFS
> -------------------------------------------
>
>                 Key: HADOOP-17157
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17157
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>            Reporter: Jiajia Li
>            Priority: Major
>
> When I run the test ITestS3ADeleteManyFiles, I change the line at [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/ITestS3ADeleteManyFiles.java#L97]
> to
> {code}
> fs.mkdirs(finalDir);
> {code}
> So before the rename operation, "finalParent/final" has already been created.
> But after the rename operation, all the files are moved from "srcParent/src" to "finalParent/final".
> So this is not the same as the HDFS rename operation:
> HDFS rename includes the calculation of the destination path. If the destination exists and is a directory, the final destination of the rename becomes the destination + the filename of the source path.
> let dest = if (isDir(FS, src) and d != src) :
>         d + [filename(src)]
>     else :
>         d
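
For anyone following along, the destination calculation quoted above can be sketched in plain Java. This is only an illustration, not Hadoop code: the class RenameDestSketch, its methods, and the use of a Set of strings to stand in for "directories that exist" are all hypothetical. It follows the prose reading (the check is on whether the destination is an existing directory), which is an assumption, since the quoted pseudocode tests isDir(FS, src) instead.

```java
import java.util.Set;

/** Hypothetical sketch of the HDFS-style destination calculation; not Hadoop code. */
public class RenameDestSketch {

    /** Last path element, e.g. "/srcParent/src" gives "src". */
    public static String filename(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /**
     * If d is an existing directory and differs from src, the source is
     * placed underneath it; otherwise d itself is the final destination.
     * Assumption: existingDirs models the directories present in the filesystem.
     */
    public static String resolveDest(Set<String> existingDirs, String src, String d) {
        if (existingDirs.contains(d) && !d.equals(src)) {
            // Avoid a doubled separator when d already ends in "/" (e.g. the root).
            return d.endsWith("/") ? d + filename(src) : d + "/" + filename(src);
        }
        return d;
    }

    public static void main(String[] args) {
        // Destination directory created beforehand, as in the modified test.
        Set<String> withDest = Set.of("/srcParent/src", "/finalParent", "/finalParent/final");
        System.out.println(resolveDest(withDest, "/srcParent/src", "/finalParent/final"));

        // Destination does not exist yet, as in the unmodified test.
        Set<String> withoutDest = Set.of("/srcParent/src", "/finalParent");
        System.out.println(resolveDest(withoutDest, "/srcParent/src", "/finalParent/final"));
    }
}
```

Under this reading, pre-creating "finalParent/final" makes the HDFS-style destination "finalParent/final/src", whereas the report above is that S3A moved the files to "finalParent/final" itself.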


