Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/25 13:29:01 UTC

[GitHub] [beam] steveniemitz opened a new issue, #22431: [Bug]: GcsFileSystem thrashes HTTP connections when checking of a file exists

steveniemitz opened a new issue, #22431:
URL: https://github.com/apache/beam/issues/22431

   ### What happened?
   
   Due to an idiosyncrasy of how the Google API client's batch API works, batched calls do not use a pooled HTTP connection (more precisely, they do not leave the connection in a state where it can be returned to the pool), so each batch requires a new connection. For operations that issue many getObject calls (match, etc.), this can leave a significant number of sockets in TIME_WAIT and may even lead to socket/fd exhaustion.
   
   Because of how the FileSystem interacts with GcsUtil, getObjects is generally called with only a single element, so we can optimize that case by routing it to the single-object API, which does pool connections correctly.
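
A minimal sketch of the proposed dispatch, assuming a hypothetical `ObjectClient` interface with a single-object call and a batch call. None of these names come from Beam's `GcsUtil` or the Google API client; they only illustrate sending a one-element request down the single-object path and falling back to the batch path otherwise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SingleObjectFastPath {

  /** Hypothetical stand-in for a storage client; NOT the real GcsUtil or Google API client. */
  interface ObjectClient {
    /** Single-object lookup (assumed to reuse a pooled connection). */
    String getOne(String path);

    /** Batched lookup (assumed to open a new connection per batch). */
    List<String> getBatch(List<String> paths);
  }

  /** Routes a one-element request to the single-object call; larger requests use the batch call. */
  static List<String> getObjects(ObjectClient client, List<String> paths) {
    if (paths.isEmpty()) {
      return Collections.emptyList();
    }
    if (paths.size() == 1) {
      return Collections.singletonList(client.getOne(paths.get(0)));
    }
    return client.getBatch(paths);
  }

  /** Fake client that only counts which path was taken. */
  static final class CountingClient implements ObjectClient {
    int singleCalls = 0;
    int batchCalls = 0;

    @Override
    public String getOne(String path) {
      singleCalls++;
      return "meta:" + path;
    }

    @Override
    public List<String> getBatch(List<String> paths) {
      batchCalls++;
      List<String> out = new ArrayList<>();
      for (String p : paths) {
        out.add("meta:" + p);
      }
      return out;
    }
  }

  public static void main(String[] args) {
    CountingClient client = new CountingClient();
    getObjects(client, Collections.singletonList("gs://bucket/a"));
    getObjects(client, Arrays.asList("gs://bucket/a", "gs://bucket/b"));
    System.out.println("single=" + client.singleCalls + " batch=" + client.batchCalls);
  }
}
```

With the counting fake, the one-element request increments only `singleCalls`, and the two-element request increments only `batchCalls`. Whether the real fix takes this exact shape is not stated in the issue.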
   
   ### Issue Priority
   
   Priority: 2
   
   ### Issue Component
   
   Component: io-java-gcp


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lukecwik closed issue #22431: [Bug]: GcsFileSystem thrashes HTTP connections when checking of a file exists

Posted by GitBox <gi...@apache.org>.
lukecwik closed issue #22431: [Bug]: GcsFileSystem thrashes HTTP connections when checking of a file exists
URL: https://github.com/apache/beam/issues/22431

