Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/07/16 14:13:00 UTC

[jira] [Commented] (HADOOP-17134) S3AFileSystem.listLocatedStatus(file) does a LIST even with S3Guard

    [ https://issues.apache.org/jira/browse/HADOOP-17134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159234#comment-17159234 ] 

Steve Loughran commented on HADOOP-17134:
-----------------------------------------

Not sure we need to worry about this. It's the specific operation we are optimising away from, because in the production code we've seen, it is only ever called against directories.

We could fix it by replicating the relevant code from innerGetFileStatus that looks in S3Guard for the file, and doing that first. But that would add a DDB call to every directory listing, which is the path we are now optimising for.
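
To make the trade-off concrete, here is a rough Java sketch of the "probe S3Guard first" idea. It is a toy model, not the S3AFileSystem code: the class ProbeFirstSketch, its in-memory store map and the two counters are all hypothetical stand-ins for the DynamoDB-backed metadata store and the S3 client.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of probing the metadata store before listing; not the real S3A code. */
final class ProbeFirstSketch {
    /** Hypothetical metadata store: path -> true for a file, false for a directory. */
    final Map<String, Boolean> store = new HashMap<>();
    int storeCalls = 0; // simulated DDB round trips
    int s3Lists = 0;    // simulated S3 LIST requests

    /** Looks up the record of the path itself (one simulated DDB call). */
    Boolean getRecord(String path) {
        storeCalls++;
        return store.get(path);
    }

    /** Returns the direct children recorded under the path (one simulated DDB call). */
    List<String> listChildren(String path) {
        storeCalls++;
        String prefix = path + "/";
        List<String> children = new ArrayList<>();
        for (String key : store.keySet()) {
            if (key.startsWith(prefix) && key.indexOf('/', prefix.length()) < 0) {
                children.add(key);
            }
        }
        return children;
    }

    /** Probe for a file record first; only list when the path is not a known file. */
    List<String> listLocatedStatus(String path) {
        if (Boolean.TRUE.equals(getRecord(path))) {
            return Collections.singletonList(path); // known file: no LIST needed
        }
        List<String> children = listChildren(path);
        if (children.isEmpty()) {
            s3Lists++; // the store had nothing, so a LIST would still be issued
        }
        return children;
    }

    public static void main(String[] args) {
        ProbeFirstSketch fs = new ProbeFirstSketch();
        fs.store.put("/data", false);
        fs.store.put("/data/part-0000", true);
        fs.listLocatedStatus("/data");
        System.out.println("store calls for a directory listing: " + fs.storeCalls);
    }
}
```

In this model a file costs one store lookup and no LIST, while a directory listing costs two store lookups instead of one, which is the extra call mentioned above.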

> S3AFileSystem.listLocatedStatus(file) does a LIST even with S3Guard
> -------------------------------------------------------------------
>
>                 Key: HADOOP-17134
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17134
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> This is minor and we may want to WONTFIX; noticed during work on directory markers.
> If you call listLocatedStatus(file) then a LIST call is always made to S3, even when S3Guard is present and has the record to say "this is a file".
> Does this matter enough to fix?
> # The HADOOP-16465 work moved the list before falling back to getFileStatus.
> # That listing calls s3guard.listChildren(path) to list the children,
> # which only returns the children of a path, not a record of the path itself,
> # so we get an empty list back, triggering the LIST.
> # It's only after that LIST fails that we fall back to getFileStatus and hence look for the actual file record.
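
The sequence in the description can be traced with a small Java model. This is only a sketch under assumptions: ListFlowSketch, its in-memory store and the recorded call names are hypothetical, and the simulated LIST simply returns nothing for a file path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the call order described in the issue; not the real S3A code. */
final class ListFlowSketch {
    /** Hypothetical metadata store: path -> true for a file, false for a directory. */
    final Map<String, Boolean> store = new HashMap<>();
    /** Simulated remote calls, in the order they were made. */
    final List<String> calls = new ArrayList<>();

    /** Returns only entries directly under the path, never the path's own record. */
    List<String> listChildren(String path) {
        calls.add("s3guard.listChildren");
        String prefix = path + "/";
        List<String> children = new ArrayList<>();
        for (String key : store.keySet()) {
            if (key.startsWith(prefix) && key.indexOf('/', prefix.length()) < 0) {
                children.add(key);
            }
        }
        return children;
    }

    /** Simulated S3 LIST under "path/"; a file has nothing beneath it. */
    List<String> s3List(String path) {
        calls.add("S3 LIST");
        return new ArrayList<>();
    }

    /** Looks up the record of the path itself. */
    Boolean getFileStatus(String path) {
        calls.add("getFileStatus");
        return store.get(path);
    }

    /** Listing first, then LIST, and only then the fallback to getFileStatus. */
    List<String> listLocatedStatus(String path) {
        List<String> children = listChildren(path);
        if (!children.isEmpty()) {
            return children;
        }
        List<String> listed = s3List(path);
        if (!listed.isEmpty()) {
            return listed;
        }
        List<String> result = new ArrayList<>();
        if (Boolean.TRUE.equals(getFileStatus(path))) {
            result.add(path);
        }
        return result;
    }

    public static void main(String[] args) {
        ListFlowSketch fs = new ListFlowSketch();
        fs.store.put("/data/part-0000", true);
        fs.listLocatedStatus("/data/part-0000");
        System.out.println(fs.calls);
    }
}
```

For a file path the recorded calls are the children lookup, then the LIST, then getFileStatus, even though the store already holds the file's record.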


