Posted to common-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/10/01 06:57:00 UTC

[jira] [Work logged] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem

     [ https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=493316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493316 ]

ASF GitHub Bot logged work on HADOOP-17281:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Oct/20 06:56
            Start Date: 01/Oct/20 06:56
    Worklog Time Spent: 10m 
      Work Description: mukund-thakur opened a new pull request #2354:
URL: https://github.com/apache/hadoop/pull/2354


   Ran the new test against a bucket in ap-south-1.
   
   Output:
   `(ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listFiles() api with batch size of 10 including 10ms of processing time for each file: 12,223,848,028 nS
   2020-10-01 12:19:28,811 [JUnit-testMultiPagesListingPerformanceAndCorrectness] INFO  contract.ContractTestUtils (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listStatus() api with batch size of 10 including 10ms of processing time for each file: 15,988,037,357 nS
   2020-10-01 12:19:41,050 [JUnit-testMultiPagesListingPerformanceAndCorrectness] INFO  contract.ContractTestUtils (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listStatusIterator() api with batch size of 10 including 10ms of processing time for each file: 12,214,813,052 nS`
   
   The logs show that listStatusIterator() (about 12.21 s) and listFiles() (about 12.22 s) take essentially the same time, and both are faster than listStatus() (about 15.99 s).
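
   One plausible reading of these numbers is that an iterator-based listing lets page fetches overlap with per-file processing, whereas an array-returning call has to fetch every page before the caller can start. The sketch below illustrates that idea only. It uses no Hadoop classes and is not the S3A implementation: `fetchPage`, `listAll`, `listIterator`, and the latency values are hypothetical stand-ins, and the remote call is simulated with a sleep.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletableFuture;

public class PrefetchListingSketch {

    // Hypothetical stand-in for one remote "list page" request; the sleep mimics network latency.
    static List<String> fetchPage(int page, int pageSize, int total, long latencyMs) {
        sleep(latencyMs);
        List<String> names = new ArrayList<>();
        int start = page * pageSize;
        for (int i = start; i < Math.min(start + pageSize, total); i++) {
            names.add("file-" + i);
        }
        return names;
    }

    static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Eager, array-style listing: every page is fetched before anything is returned.
    static List<String> listAll(int pageSize, int total, long latencyMs) {
        List<String> all = new ArrayList<>();
        for (int page = 0; page * pageSize < total; page++) {
            all.addAll(fetchPage(page, pageSize, total, latencyMs));
        }
        return all;
    }

    // Lazy listing: while the caller consumes one page, the next page is fetched in the background.
    static Iterator<String> listIterator(int pageSize, int total, long latencyMs) {
        return new Iterator<String>() {
            private int nextPage = 0;
            private CompletableFuture<List<String>> pending = startFetch();
            private Iterator<String> current = Collections.emptyIterator();

            // Starts fetching the next page asynchronously, or returns null when no pages remain.
            private CompletableFuture<List<String>> startFetch() {
                if (nextPage * pageSize >= total) {
                    return null;
                }
                final int page = nextPage++;
                return CompletableFuture.supplyAsync(
                        () -> fetchPage(page, pageSize, total, latencyMs));
            }

            @Override
            public boolean hasNext() {
                while (!current.hasNext() && pending != null) {
                    current = pending.join().iterator(); // wait for the page already in flight
                    pending = startFetch();              // immediately request the one after it
                }
                return current.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    static long timeMs(Runnable task) {
        long t0 = System.nanoTime();
        task.run();
        return (System.nanoTime() - t0) / 1_000_000;
    }

    public static void main(String[] args) {
        final int total = 50;
        final int pageSize = 10;
        final long latencyMs = 40; // simulated cost of one page request
        final long workMs = 2;     // simulated processing time per file

        long eager = timeMs(() -> {
            for (String name : listAll(pageSize, total, latencyMs)) {
                sleep(workMs);
            }
        });
        long lazy = timeMs(() -> {
            Iterator<String> it = listIterator(pageSize, total, latencyMs);
            while (it.hasNext()) {
                it.next();
                sleep(workMs);
            }
        });
        System.out.println("eager listing + processing: " + eager + " ms");
        System.out.println("lazy listing + processing:  " + lazy + " ms");
    }
}
```

   In this simulation the lazy variant usually finishes sooner, because the time spent processing one page hides part of the latency of fetching the next. The exact figures depend on the machine and on the chosen sleep values.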


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 493316)
    Remaining Estimate: 0h
            Time Spent: 10m

> Implement FileSystem.listStatusIterator() in S3AFileSystem
> ----------------------------------------------------------
>
>                 Key: HADOOP-17281
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17281
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Mukund Thakur
>            Assignee: Mukund Thakur
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, S3AFileSystem implements only the listStatus() API, which returns an array. Once listStatusIterator() is implemented, clients can benefit from the asynchronous listing added recently in
> https://issues.apache.org/jira/browse/HADOOP-17074 by processing files while iterating over them.
>  
> CC [~stevel]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)