Posted to common-issues@hadoop.apache.org by "Ruslan Dautkhanov (Jira)" <ji...@apache.org> on 2021/11/19 21:19:00 UTC

[jira] [Updated] (HADOOP-18019) Hadoop 3.3 regression in hadoop/fs/s3a/S3AFileSystem.s3GetFileStatus()

     [ https://issues.apache.org/jira/browse/HADOOP-18019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ruslan Dautkhanov updated HADOOP-18019:
---------------------------------------
    Description: 
Repro code:
{code:scala}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Point S3A at a local Minio endpoint
val conf = new Configuration()
conf.set("fs.s3a.endpoint", "http://127.0.0.1:9000")
conf.set("fs.s3a.path.style.access", "true")
conf.set("fs.s3a.access.key", "user_access_key")
conf.set("fs.s3a.secret.key", "password")

val path = new Path("s3a://comcast-test")
val fs = path.getFileSystem(conf)
fs.mkdirs(new Path("/testdelta/_delta_log"))
// Throws FileNotFoundException on Hadoop 3.3 with Minio
fs.getFileStatus(new Path("/testdelta/_delta_log"))
{code}
This fails with *FileNotFoundException* on Minio. The same code works against real S3.
It also works with Minio on Hadoop 3.2 and earlier versions.

It fails only on Hadoop 3.3 and newer branches.

The underlying cause, as discovered by [~sadikovi], is more fundamental: Minio does not have empty directories (sort of); see [https://github.com/minio/minio/issues/2423].

This works in Hadoop 3.2 because of the infamous "Is this necessary?" block of code:
[https://github.com/apache/hadoop/blob/branch-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2204-L2223]

That block was removed in Hadoop 3.3:
[https://github.com/apache/hadoop/blob/branch-3.3.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L2179]

Its removal causes the regression.
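
To illustrate how removing one existence probe could turn a visible directory into a FileNotFoundException, here is a toy model in plain Java. It is only a sketch under stated assumptions and is not Hadoop's or Minio's actual code. {{MarkerProbeSketch}}, {{ToyStore}}, {{hidesEmptyMarkers}} and {{probeMarker}} are hypothetical names invented for this example. The sketch assumes that the removed block amounted to an extra lookup of the key with a trailing slash, and that a store "without empty directories" can be approximated by one whose listing does not return a directory marker for its own prefix.

```java
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

/** Toy model only: NOT Hadoop's S3AFileSystem and NOT Minio's implementation. */
class MarkerProbeSketch {

  enum Status { FILE, DIRECTORY, NOT_FOUND }

  /** Hypothetical in-memory object store holding object keys only. */
  static final class ToyStore {
    private final TreeSet<String> keys = new TreeSet<>();
    // Assumption: when true, a marker object "dir/" is left out of a listing
    // of its own prefix, standing in for "no empty directories".
    private final boolean hidesEmptyMarkers;

    ToyStore(boolean hidesEmptyMarkers) {
      this.hidesEmptyMarkers = hidesEmptyMarkers;
    }

    void put(String key) {
      keys.add(key);
    }

    /** Exact-key existence check, modelling a HEAD request. */
    boolean head(String key) {
      return keys.contains(key);
    }

    /** Prefix listing, modelling a LIST request. */
    List<String> list(String prefix) {
      return keys.stream()
          .filter(k -> k.startsWith(prefix))
          .filter(k -> !(hidesEmptyMarkers && k.equals(prefix)))
          .collect(Collectors.toList());
    }
  }

  /** Models mkdirs as writing a marker object named "key/". */
  static void mkdirs(ToyStore store, String key) {
    store.put(key + "/");
  }

  /**
   * Resolves a key to a status. With probeMarker set, the marker "key/" is
   * looked up directly before falling back to a listing (assumed older flow).
   * Without it, only the plain key and the listing are checked.
   */
  static Status getStatus(ToyStore store, String key, boolean probeMarker) {
    if (store.head(key)) {
      return Status.FILE;
    }
    if (probeMarker && store.head(key + "/")) {
      return Status.DIRECTORY;
    }
    if (!store.list(key + "/").isEmpty()) {
      return Status.DIRECTORY;
    }
    return Status.NOT_FOUND;
  }

  public static void main(String[] args) {
    String dir = "testdelta/_delta_log";
    for (boolean hides : new boolean[] {false, true}) {
      ToyStore store = new ToyStore(hides);
      mkdirs(store, dir);
      System.out.println("hidesEmptyMarkers=" + hides
          + " withMarkerProbe=" + getStatus(store, dir, true)
          + " withoutMarkerProbe=" + getStatus(store, dir, false));
    }
  }
}
```

In this model the empty directory is reported as NOT_FOUND only when the store hides the marker from listings and the direct marker probe is skipped. Every other combination resolves it as a directory, which matches the pattern described above: real S3 works on both versions, Minio only on the older one.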


> Hadoop 3.3 regression in hadoop/fs/s3a/S3AFileSystem.s3GetFileStatus()
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-18019
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18019
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.3.0, 3.3.1, 3.3.2
>            Reporter: Ruslan Dautkhanov
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
