Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/20 14:19:21 UTC

[GitHub] [iceberg] steveloughran opened a new issue #2124: Improve HadoopCatalog performance/scalability

steveloughran opened a new issue #2124:
URL: https://github.com/apache/iceberg/issues/2124


   By moving HadoopCatalog onto `listStatusIterator()` list API calls, filesystem clients which do paged listings (hdfs, webhdfs, s3a, and soon abfs) can prefetch the next page of results while HadoopCatalog examines the current entries. This overlap matters most on object stores, where each page of a listing is a separate remote request. (Not that you should be using HadoopCatalog on S3; ABFS is a different matter.)
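
   As a rough illustration of the overlap described above, here is a plain-Java model of a paged listing whose next page is fetched in the background while the caller works through the current one. It is a sketch only, not Hadoop's or Iceberg's code: `PrefetchingListing`, `PrefetchingIterator`, `listAll` and the page-fetcher function are hypothetical names made up for this example, and the in-memory list stands in for a remote store. In Hadoop itself the corresponding entry point is `FileSystem.listStatusIterator(Path)`, which returns a `RemoteIterator<FileStatus>`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

// Hypothetical model of a paged listing with prefetch; not a Hadoop or Iceberg API.
class PrefetchingListing {

  // Iterates over pages supplied by pageFetcher; an empty page marks the end of the listing.
  static final class PrefetchingIterator<T> implements Iterator<T> {
    private final IntFunction<List<T>> pageFetcher;
    private Iterator<T> current;
    private CompletableFuture<List<T>> next;
    private int nextPage;

    PrefetchingIterator(IntFunction<List<T>> pageFetcher) {
      this.pageFetcher = pageFetcher;
      // The first page is fetched synchronously, then page 1 is requested in the background.
      this.current = pageFetcher.apply(0).iterator();
      this.nextPage = 1;
      this.next = fetchAsync(nextPage);
    }

    private CompletableFuture<List<T>> fetchAsync(int page) {
      return CompletableFuture.supplyAsync(() -> pageFetcher.apply(page));
    }

    @Override
    public boolean hasNext() {
      while (!current.hasNext()) {
        if (next == null) {
          return false;
        }
        // Wait for the page that was requested while the caller processed the previous one.
        List<T> page = next.join();
        if (page.isEmpty()) {
          next = null;
          return false;
        }
        current = page.iterator();
        nextPage++;
        next = fetchAsync(nextPage); // start the following fetch before returning entries
      }
      return true;
    }

    @Override
    public T next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return current.next();
    }
  }

  // Lists all entries through the prefetching iterator, pageSize entries per simulated request.
  static List<String> listAll(List<String> entries, int pageSize) {
    IntFunction<List<String>> fetcher = page -> {
      int from = page * pageSize;
      if (from >= entries.size()) {
        return Collections.emptyList();
      }
      return entries.subList(from, Math.min(from + pageSize, entries.size()));
    };
    List<String> seen = new ArrayList<>();
    Iterator<String> it = new PrefetchingIterator<>(fetcher);
    while (it.hasNext()) {
      seen.add(it.next());
    }
    return seen;
  }

  public static void main(String[] args) {
    System.out.println(listAll(Arrays.asList("db1", "db2", "db3", "db4", "db5"), 2));
  }
}
```

   The caller's loop is the same as for a plain iterator; the gain comes entirely from the fetch of page N+1 running while the entries of page N are being examined. A client that returns the whole listing as one array gives no such opportunity, because nothing can be processed until the last page has arrived.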


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on issue #2124: Improve HadoopCatalog performance/scalability

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on issue #2124:
URL: https://github.com/apache/iceberg/issues/2124#issuecomment-769290824


   Thanks for submitting a fix for this, @steveloughran. It seems this issue is resolved, but I am wondering whether we should apply the same change to other places that perform listings as well, such as `RemoveOrphanFilesAction`.

