Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2016/12/29 16:01:58 UTC

[jira] [Commented] (HADOOP-13926) S3Guard: Improve listLocatedStatus

    [ https://issues.apache.org/jira/browse/HADOOP-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15785576#comment-15785576 ] 

Steve Loughran commented on HADOOP-13926:
-----------------------------------------

List calls that return iterators should ideally be designed to iterate over buckets containing millions of files without significant memory or startup costs. You can see the performance difference by listing the landsat bucket: on 2.8 the iterator works, while on 2.7 the code blocks for so long that tests time out, simply because there are too many blobs to list in the treewalk.

We need to make sure that this call and related ones (and, implicitly, S3Guard) can handle paths with a few million child entries.
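
To illustrate the kind of incremental listing meant here, below is a minimal, self-contained Java sketch of an iterator that fetches one page of results at a time and only when the caller asks for more. It is not the S3A or S3Guard code: PagedListing, Page, LazyIterator and fakeStore are hypothetical names, and the in-memory fakeStore merely stands in for a paged listing request against a blob store. Hadoop's real listing calls return a RemoteIterator, whose methods can throw IOException; a plain java.util.Iterator is used here only to keep the example runnable on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Sketch only: walk a paged listing while holding a single page in memory. */
class PagedListing {

  /** One page of keys plus the token for the next page (null when finished). */
  public static final class Page {
    final List<String> keys;
    final String nextToken;

    Page(List<String> keys, String nextToken) {
      this.keys = keys;
      this.nextToken = nextToken;
    }
  }

  /** Fetches the next page only once the current one is exhausted. */
  public static final class LazyIterator implements Iterator<String> {
    // Hypothetical stand-in for a "list next page" request to the store.
    private final Function<String, Page> fetcher;
    private Iterator<String> current = Collections.emptyIterator();
    private String token = null;
    private boolean started = false;
    private int pagesFetched = 0;

    public LazyIterator(Function<String, Page> fetcher) {
      this.fetcher = fetcher;
    }

    @Override
    public boolean hasNext() {
      // Loop rather than branch once, so empty pages are skipped.
      while (!current.hasNext()) {
        if (started && token == null) {
          return false; // last page consumed
        }
        Page page = fetcher.apply(token);
        started = true;
        pagesFetched++;
        token = page.nextToken;
        current = page.keys.iterator();
      }
      return true;
    }

    @Override
    public String next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return current.next();
    }

    /** Number of page requests issued so far. */
    public int pagesFetched() {
      return pagesFetched;
    }
  }

  /** In-memory fake store serving {@code total} keys in pages of {@code pageSize}. */
  public static Function<String, Page> fakeStore(int total, int pageSize) {
    return token -> {
      int start = (token == null) ? 0 : Integer.parseInt(token);
      int end = Math.min(start + pageSize, total);
      List<String> keys = new ArrayList<>(end - start);
      for (int i = start; i < end; i++) {
        keys.add("key-" + i);
      }
      return new Page(keys, end < total ? Integer.toString(end) : null);
    };
  }
}
```

With this shape, constructing the iterator issues no requests at all, and memory use is bounded by the page size rather than by the number of entries under the path. A treewalk that collects every entry before returning has neither property.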

> S3Guard: Improve listLocatedStatus
> ----------------------------------
>
>                 Key: HADOOP-13926
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13926
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Rajesh Balamohan
>            Priority: Minor
>         Attachments: HADOOP-13926.wip.proto.branch-13345.1.patch
>
>
> Need to check if {{listLocatedStatus}} can make use of metastore's listChildren feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
