Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2021/11/21 11:47:00 UTC

[jira] [Commented] (HADOOP-18019) S3AFileSystem.s3GetFileStatus() doesn't find dir markers on minio

    [ https://issues.apache.org/jira/browse/HADOOP-18019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447019#comment-17447019 ] 

Steve Loughran commented on HADOOP-18019:
-----------------------------------------

Actually, if you are going to check out Hadoop locally, could you try this PR: https://github.com/apache/hadoop/pull/3534

It cuts out S3Guard, so it will stop any attempts to talk to DynamoDB. The testing.md file covers how to test against third-party stores ... point to a different large file for the seek tests; the encryption and S3 Select test suites are skipped automatically.

> S3AFileSystem.s3GetFileStatus() doesn't find dir markers on minio
> -----------------------------------------------------------------
>
>                 Key: HADOOP-18019
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18019
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.3.0, 3.3.1, 3.3.2
>         Environment: minio s3-compatible storage
>            Reporter: Ruslan Dautkhanov
>            Priority: Major
>
> Repro code:
> {code:java}
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.Path
>
> val conf = new Configuration()
> conf.set("fs.s3a.endpoint", "http://127.0.0.1:9000")
> conf.set("fs.s3a.path.style.access", "true")
> conf.set("fs.s3a.access.key", "user_access_key")
> conf.set("fs.s3a.secret.key", "password")
> val path = new Path("s3a://comcast-test")
> val fs = path.getFileSystem(conf)
> // Create an empty directory, then probe for it
> fs.mkdirs(new Path("/testdelta/_delta_log"))
> fs.getFileStatus(new Path("/testdelta/_delta_log")){code}
> Fails with *FileNotFoundException* on Minio. The same code works against real S3.
> It also works with Minio on Hadoop 3.2 and earlier versions.
> It fails only on Hadoop 3.3 and newer branches.
> The reason, as discovered by [~sadikovi], is actually a more fundamental one: Minio does not have empty directories (sort of); see [https://github.com/minio/minio/issues/2423].
> This works in Hadoop 3.2 because of the infamous "Is this necessary?" block of code
> [https://github.com/apache/hadoop/blob/branch-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2204-L2223]
> that was removed in Hadoop 3.3:
> [https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2179]
> Its removal causes the regression.
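
The regression described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the S3A code: `ToyStore`, `existsWithMarkerHead` and `existsWithListOnly` are hypothetical names, and the sketch assumes (1) that the removed block amounted to an extra HEAD request on `key + "/"`, and (2) that the Minio-like store answers HEAD for an empty directory marker but does not return that marker from a LIST on the same prefix. Neither assumption is confirmed by this message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: none of these names come from hadoop-aws. */
final class DirMarkerProbeSketch {

    /** Hypothetical in-memory store; hidesMarkersInList is the ASSUMED Minio-like quirk. */
    static final class ToyStore {
        private final Set<String> keys = new TreeSet<>();
        private final boolean hidesMarkersInList;

        ToyStore(boolean hidesMarkersInList) {
            this.hidesMarkersInList = hidesMarkersInList;
        }

        void put(String key) {
            keys.add(key);
        }

        /** HEAD: true if an object with exactly this key exists. */
        boolean head(String key) {
            return keys.contains(key);
        }

        /** LIST: keys under the prefix; may omit the marker object equal to the prefix. */
        List<String> list(String prefix) {
            List<String> out = new ArrayList<>();
            for (String k : keys) {
                if (!k.startsWith(prefix)) {
                    continue;
                }
                if (hidesMarkersInList && k.equals(prefix)) {
                    continue;
                }
                out.add(k);
            }
            return out;
        }
    }

    /** Probe order with an explicit HEAD on key + "/" (assumed shape of the older logic). */
    static boolean existsWithMarkerHead(ToyStore s, String key) {
        return s.head(key) || s.head(key + "/") || !s.list(key + "/").isEmpty();
    }

    /** Probe order without the marker HEAD (assumed shape of the newer logic). */
    static boolean existsWithListOnly(ToyStore s, String key) {
        return s.head(key) || !s.list(key + "/").isEmpty();
    }

    /** Simulates mkdirs by writing a "key/" marker object, then runs both probes. */
    static boolean[] mkdirThenProbe(boolean hidesMarkersInList) {
        ToyStore s = new ToyStore(hidesMarkersInList);
        s.put("testdelta/_delta_log/");
        return new boolean[] {
            existsWithMarkerHead(s, "testdelta/_delta_log"),
            existsWithListOnly(s, "testdelta/_delta_log")
        };
    }

    public static void main(String[] args) {
        boolean[] s3Like = mkdirThenProbe(false);
        boolean[] minioLike = mkdirThenProbe(true);
        System.out.println("S3-like:    " + s3Like[0] + " " + s3Like[1]);
        System.out.println("Minio-like: " + minioLike[0] + " " + minioLike[1]);
    }
}
```

In this model both probe orders find the directory on the S3-like store, while on the Minio-like store only the order that includes the marker HEAD finds it. That matches the symptom in the report, but only to the extent that the two assumptions hold.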


