You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/11/10 10:21:16 UTC

[GitHub] [beam] phoerious opened a new pull request #15931: Make S3 streaming more efficient

phoerious opened a new pull request #15931:
URL: https://github.com/apache/beam/pull/15931


   I tinkered around with the built-in Boto3-based S3 integration in the Python SDK, since it turned out to be rather inefficient, which comes mostly from the fact that it creates a new range request for each chunk (inefficient even with `Connection: Keep-Alive`) and also relies on huge 16MB buffers (which isn't always optimal for latency in general or client apps which do their own buffering).
   
   This PR modifies the Boto3 client to keep track of a previously opened HTTP stream instead of creating a new connection/range request for each read.
   
   This implementation is more cache efficient also for the server, since it allocates only one long-lived stream for the whole S3 object instead of requesting a new range with each read.
   
   Here are a few benchmarks for reading a WARC file from our Ceph cluster and iterating all records with FastWARC. For comparison I chose the default buffer size (16MB) and a much smaller one (65536 bytes, FastWARCs default internal buffer size).
   
       old:
         5518.28 records/s (large buffer)
         760.86 records/s  (small buffer)
   
       new:
         6156.54 records/s (large buffer)
         9292.72 records/s (small buffer)
   
   With a small buffer, it's about 1.7x faster, with the large buffer it's still a bit faster, but not much, because the overhead of allocating and discarding large byte strings starts kicking in.
   
   The new S3 client also has a new optional constructor parameter for setting an initial offset. I don't know how else to expose this functionality, but it's way more efficient than opening a stream from the start and then seeking to the desired position. This is meant mainly for use cases where you implement your own `FileBasedSource`. Unfortunately, `FileBasedSource.open_file()` provides no interface for opening a file at a certain initial offset, but you can work around it by constructing your own `Client` and `S3IO` instances for `s3://` URLs.
   
   Disclaimer: There is no issue report for this, so I can't put it in the title and I also don't know whom to ping for review.
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   `ValidatesRunner` compliance status (on master branch)
   --------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>Lang</th>
         <th>ULR</th>
         <th>Dataflow</th>
         <th>Flink</th>
         <th>Samza</th>
         <th>Spark</th>
         <th>Twister2</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>Go</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon">
           </a>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Samza/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Samza/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
       </tr>
       <tr>
         <td>Java</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_ULR/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_ULR/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon?subject=V1">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Streaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Streaming/lastCompletedBuild/badge/icon?subject=V1+Streaming">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon?subject=V1+Java+11">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/badge/icon?subject=V2">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2_Streaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2_Streaming/lastCompletedBuild/badge/icon?subject=V2+Streaming">
           </a><br>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon?subject=Java+8">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon?subject=Java+11">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon?subject=Portable">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon?subject=Portable+Streaming">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Samza/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Samza/lastCompletedBuild/badge/icon?subject=Portable">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon?subject=Portable">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon?subject=Structured+Streaming">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon">
           </a>
         </td>
       </tr>
       <tr>
         <td>Python</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon?subject=V1">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon?subject=V2">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon?subject=ValCont">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon?subject=Portable">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Samza/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Samza/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
       </tr>
       <tr>
         <td>XLang</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Samza/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
       </tr>
     </tbody>
   </table>
   
   Examples testing status on various runners
   --------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>Lang</th>
         <th>ULR</th>
         <th>Dataflow</th>
         <th>Flink</th>
         <th>Samza</th>
         <th>Spark</th>
         <th>Twister2</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>Go</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
       <tr>
         <td>Java</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Cron/lastCompletedBuild/badge/icon?subject=V1">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Java11_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Java11_Cron/lastCompletedBuild/badge/icon?subject=V1+Java11">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_Examples_Dataflow_V2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_Examples_Dataflow_V2/lastCompletedBuild/badge/icon?subject=V2">
           </a><br>
         </td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
       <tr>
         <td>Python</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
       <tr>
         <td>XLang</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
     </tbody>
   </table>
   
   Post-Commit SDK/Transform Integration Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>Go</th>
         <th>Java</th>
         <th>Python</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon?subject=3.6">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon?subject=3.7">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/badge/icon?subject=3.8">
           </a>
         </td>
       </tr>
     </tbody>
   </table>
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>---</th>
         <th>Java</th>
         <th>Python</th>
         <th>Go</th>
         <th>Website</th>
         <th>Whitespace</th>
         <th>Typescript</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>Non-portable</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon">
           </a><br>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon?subject=Tests">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon?subject=Lint">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/badge/icon?subject=Docker">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/badge/icon?subject=Docs">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
       </tr>
       <tr>
         <td>Portable</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_GoPortable_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_GoPortable_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
     </tbody>
   </table>
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aromanenko-dev commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-991642841


   @aaltay Tbh, I'm not well aware who supports S3 in Python SDK. Maybe @mosche you could take a look?
   
   @phoerious Thanks for contribution! It would be better to create a PR from a feature branch than from master/main. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768518897



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       👍 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2af84ce) into [master](https://codecov.io/gh/apache/beam/commit/ffb18c79127e85faa9dea7104c5d3e145fdfaf9a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffb18c7) will **increase** coverage by `37.47%`.
   > The diff coverage is `9.09%`.
   
   > :exclamation: Current head 2af84ce differs from pull request most recent head c6c15ac. Consider uploading reports for the commit c6c15ac to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #15931       +/-   ##
   ===========================================
   + Coverage   46.10%   83.57%   +37.47%     
   ===========================================
     Files         196      447      +251     
     Lines       19498    61526    +42028     
   ===========================================
   + Hits         8990    51422    +42432     
   - Misses       9538    10104      +566     
   + Partials      970        0      -970     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `11.72% <9.09%> (ø)` | |
   | [...kg/beam/core/runtime/xlangx/expansionx/download.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUveGxhbmd4L2V4cGFuc2lvbngvZG93bmxvYWQuZ28=) | | |
   | [sdks/go/pkg/beam/testing/teststream/teststream.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90ZXN0aW5nL3Rlc3RzdHJlYW0vdGVzdHN0cmVhbS5nbw==) | | |
   | [sdks/go/pkg/beam/runners/direct/buffer.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RpcmVjdC9idWZmZXIuZ28=) | | |
   | [sdks/go/pkg/beam/core/runtime/graphx/xlang.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZ3JhcGh4L3hsYW5nLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/reshuffle.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9yZXNodWZmbGUuZ28=) | | |
   | [sdks/go/pkg/beam/transforms/filter/filter.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90cmFuc2Zvcm1zL2ZpbHRlci9maWx0ZXIuZ28=) | | |
   | [sdks/go/pkg/beam/io/databaseio/mapper.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pby9kYXRhYmFzZWlvL21hcHBlci5nbw==) | | |
   | [sdks/go/pkg/beam/util/gcsx/gcs.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS91dGlsL2djc3gvZ2NzLmdv) | | |
   | [...dks/go/pkg/beam/core/runtime/harness/monitoring.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9tb25pdG9yaW5nLmdv) | | |
   | ... and [634 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ffb18c7...c6c15ac](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d47ba9) into [master](https://codecov.io/gh/apache/beam/commit/67bcf1e16e3fdf68cdea7a4b42b9c003e4b8948c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67bcf1e) will **increase** coverage by `37.54%`.
   > The diff coverage is `13.33%`.
   
   > :exclamation: Current head 7d47ba9 differs from pull request most recent head 34fedd0. Consider uploading reports for the commit 34fedd0 to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #15931       +/-   ##
   ===========================================
   + Coverage   46.13%   83.68%   +37.54%     
   ===========================================
     Files         197      447      +250     
     Lines       19519    61717    +42198     
   ===========================================
   + Hits         9006    51645    +42639     
   - Misses       9542    10072      +530     
   + Partials      971        0      -971     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `15.07% <11.36%> (ø)` | |
   | [sdks/python/apache\_beam/utils/retry.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvcmV0cnkucHk=) | `85.43% <100.00%> (ø)` | |
   | [sdks/go/pkg/beam/core/runtime/xlangx/expand.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUveGxhbmd4L2V4cGFuZC5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/harness/datamgr.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9kYXRhbWdyLmdv) | | |
   | [sdks/go/pkg/beam/core/graph/scope.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL3Njb3BlLmdv) | | |
   | [sdks/go/pkg/beam/encoding.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9lbmNvZGluZy5nbw==) | | |
   | [sdks/go/pkg/beam/core/util/protox/protox.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3V0aWwvcHJvdG94L3Byb3RveC5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/coderx/string.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvY29kZXJ4L3N0cmluZy5nbw==) | | |
   | [sdks/go/pkg/beam/core/graph/coder/varint.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL3ZhcmludC5nbw==) | | |
   | [...pkg/beam/core/runtime/xlangx/expansionx/process.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUveGxhbmd4L2V4cGFuc2lvbngvcHJvY2Vzcy5nbw==) | | |
   | ... and [636 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [67bcf1e...34fedd0](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (05d1610) into [master](https://codecov.io/gh/apache/beam/commit/8da177f64d314cf72e89a51e51fb0915f706a784?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8da177f) will **decrease** coverage by `0.02%`.
   > The diff coverage is `8.33%`.
   
   > :exclamation: Current head 05d1610 differs from pull request most recent head 2af84ce. Consider uploading reports for the commit 2af84ce to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   - Coverage   83.53%   83.50%   -0.03%     
   ==========================================
     Files         445      445              
     Lines       61385    61401      +16     
   ==========================================
   - Hits        51276    51273       -3     
   - Misses      10109    10128      +19     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `11.76% <8.33%> (-0.74%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `94.02% <0.00%> (-2.24%)` | :arrow_down: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [sdks/python/apache\_beam/transforms/combiners.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb21iaW5lcnMucHk=) | `93.15% <0.00%> (-0.40%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.25% <0.00%> (-0.17%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.51% <0.00%> (-0.13%)` | :arrow_down: |
   | [sdks/python/apache\_beam/transforms/util.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy91dGlsLnB5) | `96.02% <0.00%> (+0.16%)` | :arrow_up: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `86.36% <0.00%> (+0.50%)` | :arrow_up: |
   | [...pache\_beam/runners/interactive/interactive\_beam.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9iZWFtLnB5) | `76.58% <0.00%> (+0.97%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8da177f...2af84ce](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c6c15ac) into [master](https://codecov.io/gh/apache/beam/commit/ffb18c79127e85faa9dea7104c5d3e145fdfaf9a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffb18c7) will **increase** coverage by `37.52%`.
   > The diff coverage is `15.62%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #15931       +/-   ##
   ===========================================
   + Coverage   46.10%   83.63%   +37.52%     
   ===========================================
     Files         196      447      +251     
     Lines       19498    61710    +42212     
   ===========================================
   + Hits         8990    51611    +42621     
   - Misses       9538    10099      +561     
   + Partials      970        0      -970     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `13.10% <15.62%> (ø)` | |
   | [sdks/go/pkg/beam/core/graph/graph.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2dyYXBoLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/fullvalue.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9mdWxsdmFsdWUuZ28=) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/decode.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9kZWNvZGUuZ28=) | | |
   | [sdks/go/pkg/beam/transforms/stats/max.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90cmFuc2Zvcm1zL3N0YXRzL21heC5nbw==) | | |
   | [sdks/go/pkg/beam/util/harnessopts/sampler.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS91dGlsL2hhcm5lc3NvcHRzL3NhbXBsZXIuZ28=) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/pardo.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9wYXJkby5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/input.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9pbnB1dC5nbw==) | | |
   | [.../pkg/beam/runners/dataflow/dataflowlib/messages.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RhdGFmbG93L2RhdGFmbG93bGliL21lc3NhZ2VzLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/coderx/int.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvY29kZXJ4L2ludC5nbw==) | | |
   | ... and [634 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ffb18c7...c6c15ac](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-996892709


   With this PR itself, I don't think there are any open issues at this point except that I cannot add tests for the changes without writing a whole new test suite with a more or less complete Boto3 stub client. The existing tests do not cover the actual S3Client, but only its interface via a mock S3Client.
   
   As for the problems addressed by my changes:
   
   Initial problem: The S3 client is way too slow, because it opens a new connection for each range request. This is very inefficient for both the client and the server. The new implementation tries to keep the stream open for sequential reads (at least 2x faster, but more like 10-20x in a real-world scenario).
   
   Second problem that I fixed just now with my second (and so far unreviewed) commit: The exception clauses were designed only for HTTP errors and did not work for errors on lower levels, such as TCP connection issues.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-997107847


   > I already did the local testing part. I tested sequential and non-sequential reads (including repeated reads from the same range) and got the expected data back.
   > 
   > I also tried closing the internal HTTP stream mid-air to trigger a retry and also tried to connect to a non-existent host (also triggering retries) as well as requesting a non-existent object (4xx client error, no retries).
   
   Great and thank you.
   
   @mosche - Does this change looks good to you except for the test comments?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (34fedd0) into [master](https://codecov.io/gh/apache/beam/commit/67bcf1e16e3fdf68cdea7a4b42b9c003e4b8948c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67bcf1e) will **increase** coverage by `37.54%`.
   > The diff coverage is `11.62%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #15931       +/-   ##
   ===========================================
   + Coverage   46.13%   83.68%   +37.54%     
   ===========================================
     Files         197      447      +250     
     Lines       19519    61715    +42196     
   ===========================================
   + Hits         9006    51649    +42643     
   - Misses       9542    10066      +524     
   + Partials      971        0      -971     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `14.51% <9.52%> (ø)` | |
   | [sdks/python/apache\_beam/utils/retry.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvcmV0cnkucHk=) | `85.43% <100.00%> (ø)` | |
   | [sdks/go/pkg/beam/core/graph/scope.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL3Njb3BlLmdv) | | |
   | [sdks/go/pkg/beam/util/shimx/generate.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS91dGlsL3NoaW14L2dlbmVyYXRlLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/fullvalue.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9mdWxsdmFsdWUuZ28=) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/multiplex.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9tdWx0aXBsZXguZ28=) | | |
   | [.../pkg/beam/runners/dataflow/dataflowlib/messages.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RhdGFmbG93L2RhdGFmbG93bGliL21lc3NhZ2VzLmdv) | | |
   | [sdks/go/pkg/beam/core/graph/coder/bool.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL2Jvb2wuZ28=) | | |
   | [sdks/go/pkg/beam/transforms/stats/sum\_switch.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90cmFuc2Zvcm1zL3N0YXRzL3N1bV9zd2l0Y2guZ28=) | | |
   | [sdks/go/pkg/beam/runners/direct/gbk.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RpcmVjdC9nYmsuZ28=) | | |
   | ... and [636 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [67bcf1e...34fedd0](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (05d1610) into [master](https://codecov.io/gh/apache/beam/commit/8da177f64d314cf72e89a51e51fb0915f706a784?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8da177f) will **decrease** coverage by `0.02%`.
   > The diff coverage is `8.33%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   - Coverage   83.53%   83.50%   -0.03%     
   ==========================================
     Files         445      445              
     Lines       61385    61401      +16     
   ==========================================
   - Hits        51275    51273       -2     
   - Misses      10110    10128      +18     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `11.76% <8.33%> (-0.74%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `94.02% <0.00%> (-2.24%)` | :arrow_down: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [sdks/python/apache\_beam/transforms/combiners.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb21iaW5lcnMucHk=) | `93.15% <0.00%> (-0.40%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.25% <0.00%> (-0.17%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.51% <0.00%> (-0.13%)` | :arrow_down: |
   | [sdks/python/apache\_beam/transforms/util.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy91dGlsLnB5) | `96.02% <0.00%> (+0.16%)` | :arrow_up: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `86.36% <0.00%> (+0.50%)` | :arrow_up: |
   | [...pache\_beam/runners/interactive/interactive\_beam.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9iZWFtLnB5) | `76.58% <0.00%> (+0.97%)` | :arrow_up: |
   | [sdks/python/apache\_beam/utils/interactive\_utils.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaW50ZXJhY3RpdmVfdXRpbHMucHk=) | `95.12% <0.00%> (+2.43%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8da177f...05d1610](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768496977



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):

Review comment:
       Is there a guide somewhere how I can build and test the Python SDK? The `./gradlew buildPython` job fails with totally non-descriptive virtualenv errors on my machine. Hence I have to rely on your CI to debug stuff. That's why there were two variable name inconsistencies, which I overlooked when I copied the code from my own project to the Beam SDK code.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-996886898


   > @aaltay How do you want to proceed?
   
   I was trying to find a reviewer initially but I did not follow the PR after that. Could you give me a quick summary what are the open issues now?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-997838212


   I rebased the changes to the current master branch and removed a (now) unused import.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay merged pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay merged pull request #15931:
URL: https://github.com/apache/beam/pull/15931


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768514689



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):

Review comment:
       There's also a `local-env-setup.sh` that should help




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-993377086


   Thanks for the comments. I fixed a bunch of stuff and added some comments myself.
   
   As for the (pseudo-)statefulness, this is absolute essential, but as commented above, I don't think it's an issue. In fact, in a real-world scenario, the speed gain by making the client stateful is way higher than the raw client benchmark above suggests (more like 20-60x). Without keeping the connection open for sequential reads, the client is so slow that I can hardly imagine anyone actually using it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-993377086


   Thanks for the comments. I fixed a bunch of stuff and added some comments myself.
   
   As for the (pseudo-)statefulness, this is absolute essential, but as commented above, I don't think it's an issue. In fact, in a real-world scenario, the speed gain by making the client stateful is way higher than the raw client benchmark above suggests (more like 20-60x). Without keeping the connection open and reading sequentially, the client is so slow that I can hardly imagine anyone actually using it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768491540



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):
+    """Opens a stream object starting at the given position.
+
+    Args:
+      request: (GetRequest) request
+      start: (int) start offset
+    Returns:
+      (Stream) Boto3 stream object.
+    """
+
+    if self._download_request and (
+        start != self._download_pos or
+        request.bucket != self._download_request.bucket or
+        request.object != self._download_request.object):
+      self._download_stream.close()
+      self._download_stream = None
+
+    # noinspection PyProtectedMember
+    if not self._download_stream or self._download_stream._raw_stream.closed:
+      try:
+        self._download_stream = self.client.get_object(
+            Bucket=request.bucket,
+            Key=request.object,
+            Range='bytes={}-'.format(start + self._download_offset))['Body']
+        self._download_request = request
+        self._download_pos = start
+      except Exception as e:
+        message = e.response['Error'].get(
+            'Message', e.response['Error'].get('Code', ''))
+        code = e.response['ResponseMetadata']['HTTPStatusCode']
+        raise messages.S3ClientError(message, code)
+
+    return self._download_stream
+
   def get_range(self, request, start, end):
     r"""Retrieves an object's contents.
 
       Args:
         request: (GetRequest) request
+        start: (int) start offset
+        end: (int) end offset
       Returns:
         (bytes) The response message.
       """
-    try:
-      boto_response = self.client.get_object(
-          Bucket=request.bucket,
-          Key=request.object,
-          Range='bytes={}-{}'.format(start, end - 1))
-    except Exception as e:
-      message = e.response['Error']['Message']
-      code = e.response['ResponseMetadata']['HTTPStatusCode']
-      raise messages.S3ClientError(message, code)
-
-    return boto_response['Body'].read()  # A bytes object
+    for i in range(self._retries):

Review comment:
       Good point. Done.
   
   The main reason for the retry here is that long-lived connections tend to close intermittently if the reads do not occur at a fast and constant rate (I regularly have issues with that when reading and processing records from a WARC file), so I want to retry at least once. The first retry should occur pretty much instantly, so I opted to give `with_exponential_backoff` a very small initial delay.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4831f68) into [master](https://codecov.io/gh/apache/beam/commit/ffb18c79127e85faa9dea7104c5d3e145fdfaf9a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffb18c7) will **increase** coverage by `9.01%`.
   > The diff coverage is `13.79%`.
   
   > :exclamation: Current head 4831f68 differs from pull request most recent head dd68537. Consider uploading reports for the commit dd68537 to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.64%   83.65%   +9.01%     
   ==========================================
     Files         643      447     -196     
     Lines       81183    61698   -19485     
   ==========================================
   - Hits        60595    51611    -8984     
   + Misses      19618    10087    -9531     
   + Partials      970        0     -970     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `13.53% <13.79%> (+1.03%)` | :arrow_up: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [sdks/python/apache\_beam/runners/direct/executor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvZXhlY3V0b3IucHk=) | `96.46% <0.00%> (-0.55%)` | :arrow_down: |
   | [...kg/beam/core/runtime/graphx/schema/logicaltypes.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZ3JhcGh4L3NjaGVtYS9sb2dpY2FsdHlwZXMuZ28=) | | |
   | [sdks/go/pkg/beam/internal/errors/errors.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pbnRlcm5hbC9lcnJvcnMvZXJyb3JzLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/util.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy91dGlsLmdv) | | |
   | [sdks/go/pkg/beam/core/metrics/sampler.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL21ldHJpY3Mvc2FtcGxlci5nbw==) | | |
   | [sdks/go/pkg/beam/core/graph/bind.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2JpbmQuZ28=) | | |
   | [sdks/go/pkg/beam/runners/direct/direct.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RpcmVjdC9kaXJlY3QuZ28=) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/reshuffle.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9yZXNodWZmbGUuZ28=) | | |
   | ... and [192 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ffb18c7...dd68537](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4831f68) into [master](https://codecov.io/gh/apache/beam/commit/18fa2a1d81f91ab35b2c93f957d00b943eb0350e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (18fa2a1) will **increase** coverage by `8.99%`.
   > The diff coverage is `13.79%`.
   
   > :exclamation: Current head 4831f68 differs from pull request most recent head 7d47ba9. Consider uploading reports for the commit 7d47ba9 to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.65%   83.65%   +8.99%     
   ==========================================
     Files         644      447     -197     
     Lines       81230    61698   -19532     
   ==========================================
   - Hits        60640    51611    -9029     
   + Misses      19619    10087    -9532     
   + Partials      971        0     -971     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `13.53% <13.79%> (+1.03%)` | :arrow_up: |
   | [sdks/python/apache\_beam/ml/gcp/cloud\_dlp.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vbWwvZ2NwL2Nsb3VkX2RscC5weQ==) | `58.33% <0.00%> (-16.14%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `88.90% <0.00%> (-0.16%)` | :arrow_down: |
   | [...s/python/apache\_beam/io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=) | `87.58% <0.00%> (-0.15%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/gcp/bigquery.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5LnB5) | `62.81% <0.00%> (-0.05%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/gcp/bigquery\_tools.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X3Rvb2xzLnB5) | `82.91% <0.00%> (-0.03%)` | :arrow_down: |
   | [sdks/python/apache\_beam/transforms/util.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy91dGlsLnB5) | `95.86% <0.00%> (ø)` | |
   | [sdks/go/pkg/beam/core/runtime/exec/util.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy91dGlsLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/encode.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9lbmNvZGUuZ28=) | | |
   | [sdks/go/pkg/beam/core/graph/coder/double.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL2RvdWJsZS5nbw==) | | |
   | ... and [201 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [18fa2a1...7d47ba9](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-997751434


   @aaltay the lack of the most fundamental test coverage is shocking, but I agree that should not block this.
   Otherwise this looks good to me given my limited knowledge of the Python SDK 👍 Thanks @phoerious!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aromanenko-dev commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-999758474


   Sorry if I missed that but does this PR is linked to any Jira issue ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on pull request #15931: [BEAM-13585] Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-1003313792


   Done. Sorry for the delay.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aromanenko-dev commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-991642841


   @aaltay Tbh, I'm not well aware who supports S3 in Python SDK. Maybe @mosche you could take a look?
   
   @phoerious Thanks for contribution! It would be better to create a PR from a feature branch than from master/main. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c6c15ac) into [master](https://codecov.io/gh/apache/beam/commit/ffb18c79127e85faa9dea7104c5d3e145fdfaf9a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffb18c7) will **increase** coverage by `37.52%`.
   > The diff coverage is `15.62%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #15931       +/-   ##
   ===========================================
   + Coverage   46.10%   83.63%   +37.52%     
   ===========================================
     Files         196      447      +251     
     Lines       19498    61710    +42212     
   ===========================================
   + Hits         8990    51611    +42621     
   - Misses       9538    10099      +561     
   + Partials      970        0      -970     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `13.10% <15.62%> (ø)` | |
   | [sdks/go/pkg/beam/internal/errors/errors.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pbnRlcm5hbC9lcnJvcnMvZXJyb3JzLmdv) | | |
   | [sdks/go/pkg/beam/io/databaseio/writer.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pby9kYXRhYmFzZWlvL3dyaXRlci5nbw==) | | |
   | [sdks/go/pkg/beam/transforms/stats/count.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90cmFuc2Zvcm1zL3N0YXRzL2NvdW50Lmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/pardo.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9wYXJkby5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/symbols.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvc3ltYm9scy5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/datasink.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9kYXRhc2luay5nbw==) | | |
   | [...dks/go/pkg/beam/core/runtime/harness/monitoring.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9tb25pdG9yaW5nLmdv) | | |
   | [sdks/go/pkg/beam/io/filesystem/memfs/memory.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pby9maWxlc3lzdGVtL21lbWZzL21lbW9yeS5nbw==) | | |
   | [...s/go/pkg/beam/core/runtime/harness/sampler\_hook.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9zYW1wbGVyX2hvb2suZ28=) | | |
   | ... and [634 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ffb18c7...c6c15ac](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d47ba9) into [master](https://codecov.io/gh/apache/beam/commit/18fa2a1d81f91ab35b2c93f957d00b943eb0350e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (18fa2a1) will **increase** coverage by `9.02%`.
   > The diff coverage is `13.33%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.65%   83.68%   +9.02%     
   ==========================================
     Files         644      447     -197     
     Lines       81230    61717   -19513     
   ==========================================
   - Hits        60640    51645    -8995     
   + Misses      19619    10072    -9547     
   + Partials      971        0     -971     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `15.07% <11.36%> (+2.57%)` | :arrow_up: |
   | [sdks/python/apache\_beam/utils/retry.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvcmV0cnkucHk=) | `85.43% <100.00%> (ø)` | |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.39% <0.00%> (ø)` | |
   | [sdks/go/pkg/beam/core/runtime/options.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvb3B0aW9ucy5nbw==) | | |
   | [sdks/go/pkg/beam/io/databaseio/database.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pby9kYXRhYmFzZWlvL2RhdGFiYXNlLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/xlangx/expand.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUveGxhbmd4L2V4cGFuZC5nbw==) | | |
   | [sdks/go/pkg/beam/io/textio/sdf.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pby90ZXh0aW8vc2RmLmdv) | | |
   | [sdks/go/pkg/beam/io/textio/textio.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pby90ZXh0aW8vdGV4dGlvLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/symbols.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvc3ltYm9scy5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/sdf.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9zZGYuZ28=) | | |
   | ... and [195 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [18fa2a1...7d47ba9](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d47ba9) into [master](https://codecov.io/gh/apache/beam/commit/18fa2a1d81f91ab35b2c93f957d00b943eb0350e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (18fa2a1) will **increase** coverage by `9.02%`.
   > The diff coverage is `13.33%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.65%   83.68%   +9.02%     
   ==========================================
     Files         644      447     -197     
     Lines       81230    61717   -19513     
   ==========================================
   - Hits        60640    51645    -8995     
   + Misses      19619    10072    -9547     
   + Partials      971        0     -971     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `15.07% <11.36%> (+2.57%)` | :arrow_up: |
   | [sdks/python/apache\_beam/utils/retry.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvcmV0cnkucHk=) | `85.43% <100.00%> (ø)` | |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.39% <0.00%> (ø)` | |
   | [sdks/go/pkg/beam/core/runtime/exec/multiplex.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9tdWx0aXBsZXguZ28=) | | |
   | [sdks/go/pkg/beam/core/graph/coder/map.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL21hcC5nbw==) | | |
   | [sdks/go/pkg/beam/transforms/stats/count.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90cmFuc2Zvcm1zL3N0YXRzL2NvdW50Lmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/graphx/serialize.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZ3JhcGh4L3NlcmlhbGl6ZS5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/graphx/cogbk.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZ3JhcGh4L2NvZ2JrLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/harness/session.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9zZXNzaW9uLmdv) | | |
   | [sdks/go/pkg/beam/artifact/materialize.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9hcnRpZmFjdC9tYXRlcmlhbGl6ZS5nbw==) | | |
   | ... and [195 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [18fa2a1...7d47ba9](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2af84ce) into [master](https://codecov.io/gh/apache/beam/commit/8da177f64d314cf72e89a51e51fb0915f706a784?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8da177f) will **increase** coverage by `0.04%`.
   > The diff coverage is `9.09%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   83.53%   83.57%   +0.04%     
   ==========================================
     Files         445      447       +2     
     Lines       61385    61526     +141     
   ==========================================
   + Hits        51276    51422     +146     
   + Misses      10109    10104       -5     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `11.72% <9.09%> (-0.78%)` | :arrow_down: |
   | [sdks/python/apache\_beam/internal/pickler.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvcGlja2xlci5weQ==) | `77.27% <0.00%> (-9.59%)` | :arrow_down: |
   | [sdks/python/apache\_beam/utils/interactive\_utils.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaW50ZXJhY3RpdmVfdXRpbHMucHk=) | `87.80% <0.00%> (-7.32%)` | :arrow_down: |
   | [...eam/portability/api/beam\_expansion\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fZXhwYW5zaW9uX2FwaV9wYjJfZ3JwYy5weQ==) | `57.89% <0.00%> (-4.02%)` | :arrow_down: |
   | [...eam/portability/api/beam\_provision\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fcHJvdmlzaW9uX2FwaV9wYjJfZ3JwYy5weQ==) | `73.68% <0.00%> (-2.51%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `94.02% <0.00%> (-2.24%)` | :arrow_down: |
   | [...e\_beam/portability/api/beam\_runner\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fcnVubmVyX2FwaV9wYjJfZ3JwYy5weQ==) | `78.94% <0.00%> (-2.01%)` | :arrow_down: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [...ache\_beam/portability/api/beam\_job\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fam9iX2FwaV9wYjJfZ3JwYy5weQ==) | `56.04% <0.00%> (-0.95%)` | :arrow_down: |
   | [...beam/portability/api/beam\_artifact\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fYXJ0aWZhY3RfYXBpX3BiMl9ncnBjLnB5) | `56.04% <0.00%> (-0.95%)` | :arrow_down: |
   | ... and [43 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8da177f...2af84ce](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-998235555


   Run Python_PVR_Flink PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768514890



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       If an instance is used to process two connections simultaneously, the client should reset the connection before each range request and effectively fall back to the old behaviour. The same happens when you read non-sequentially. The only issue I can see here if the client is used in a non-thread-safe way, but I don't want to add mutex locks unless I know that this is definitely a scenario that can occur.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768504881



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       if an instance would be shared to process multiple bundles in parallel, the result really becomes unpredictable. but it sounds like that's not the case




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2af84ce) into [master](https://codecov.io/gh/apache/beam/commit/8da177f64d314cf72e89a51e51fb0915f706a784?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8da177f) will **increase** coverage by `0.04%`.
   > The diff coverage is `9.09%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   83.53%   83.57%   +0.04%     
   ==========================================
     Files         445      447       +2     
     Lines       61385    61526     +141     
   ==========================================
   + Hits        51276    51422     +146     
   + Misses      10109    10104       -5     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `11.72% <9.09%> (-0.78%)` | :arrow_down: |
   | [sdks/python/apache\_beam/internal/pickler.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvcGlja2xlci5weQ==) | `77.27% <0.00%> (-9.59%)` | :arrow_down: |
   | [sdks/python/apache\_beam/utils/interactive\_utils.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaW50ZXJhY3RpdmVfdXRpbHMucHk=) | `87.80% <0.00%> (-7.32%)` | :arrow_down: |
   | [...eam/portability/api/beam\_expansion\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fZXhwYW5zaW9uX2FwaV9wYjJfZ3JwYy5weQ==) | `57.89% <0.00%> (-4.02%)` | :arrow_down: |
   | [...eam/portability/api/beam\_provision\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fcHJvdmlzaW9uX2FwaV9wYjJfZ3JwYy5weQ==) | `73.68% <0.00%> (-2.51%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `94.02% <0.00%> (-2.24%)` | :arrow_down: |
   | [...e\_beam/portability/api/beam\_runner\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fcnVubmVyX2FwaV9wYjJfZ3JwYy5weQ==) | `78.94% <0.00%> (-2.01%)` | :arrow_down: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [...ache\_beam/portability/api/beam\_job\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fam9iX2FwaV9wYjJfZ3JwYy5weQ==) | `56.04% <0.00%> (-0.95%)` | :arrow_down: |
   | [...beam/portability/api/beam\_artifact\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fYXJ0aWZhY3RfYXBpX3BiMl9ncnBjLnB5) | `56.04% <0.00%> (-0.95%)` | :arrow_down: |
   | ... and [43 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8da177f...2af84ce](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-996907500


   > With this PR itself, I don't think there are any open issues at this point except that I cannot add tests for the changes without writing a whole new test suite with a more or less complete Boto3 stub client. The existing tests do not cover the actual S3Client, but only its interface via a mock S3Client.
   > 
   > As for the problems addressed by my changes:
   > 
   > Initial problem: The S3 client is way too slow, because it opens a new connection for each range request. This is very inefficient for both the client and the server. The new implementation tries to keep the stream open for sequential reads (at least 2x faster, but more like 10-20x in a real-world scenario).
   > 
   > Second problem that I fixed just now with my second (and so far unreviewed) commit: The exception clauses were designed only for HTTP errors and did not work for errors on lower levels, such as TCP connection issues.
   
   Got it. 
   
   My suggestion would be:
   - Locally test your changes. Manually try a few configs and report results. Try to break, try some edge cases etc. If all good, let's merge. (It would have been much better to write at least some tests, but I understand your time is limited and asking a whole suite might be a lot to ask.)
   - File a jira for a new test suite with as much information as possible so that we can track. Or even make it a starter task for new people.
   
   Thank you for your contribution.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4831f68) into [master](https://codecov.io/gh/apache/beam/commit/ffb18c79127e85faa9dea7104c5d3e145fdfaf9a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffb18c7) will **increase** coverage by `9.01%`.
   > The diff coverage is `13.79%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.64%   83.65%   +9.01%     
   ==========================================
     Files         643      447     -196     
     Lines       81183    61698   -19485     
   ==========================================
   - Hits        60595    51611    -8984     
   + Misses      19618    10087    -9531     
   + Partials      970        0     -970     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `13.53% <13.79%> (+1.03%)` | :arrow_up: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [sdks/python/apache\_beam/runners/direct/executor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvZXhlY3V0b3IucHk=) | `96.46% <0.00%> (-0.55%)` | :arrow_down: |
   | [sdks/go/pkg/beam/core/graph/coder/varint.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL3ZhcmludC5nbw==) | | |
   | [sdks/go/pkg/beam/core/util/reflectx/structs.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3V0aWwvcmVmbGVjdHgvc3RydWN0cy5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/pipelinex/util.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvcGlwZWxpbmV4L3V0aWwuZ28=) | | |
   | [sdks/go/pkg/beam/pipeline.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9waXBlbGluZS5nbw==) | | |
   | [...o/pkg/beam/io/rtrackers/offsetrange/offsetrange.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pby9ydHJhY2tlcnMvb2Zmc2V0cmFuZ2Uvb2Zmc2V0cmFuZ2UuZ28=) | | |
   | [sdks/go/pkg/beam/core/graph/coder/int.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL2ludC5nbw==) | | |
   | [sdks/go/pkg/beam/core/metrics/store.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL21ldHJpY3Mvc3RvcmUuZ28=) | | |
   | ... and [192 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ffb18c7...4831f68](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-991449710


   @aromanenko-dev - do you know who could review this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768427938



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       The following lines add some local mutable state to the client. I'm not familiar with the Python SDK unfortunately, so not sure if this is legitimate. In any case, I'd suggest to document this in the pydocs of the client.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768512310



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):

Review comment:
       Thanks. I'll test if I can get it running.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768493618



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       I haven't had any issues with that on a Flink backend. I don't know the SDK in detail, but I don't think it creates any issues, because the entire behaviour of the client is still idempotent (just less efficient if you don't read sequentially).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-991449710


   @aromanenko-dev - do you know who could review this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-993800246


   Yeah, the class itself isn't used in the tests. The tests only hit the API of a mock S3Client, which is kind of superfluous.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768493618



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       I haven't had any issues with that on a Flink backend. I don't know the SDK in detail, but I don't think it creates any issues, because the entire behaviour of the client is still idempotent (just less efficient if you don't read sequentially). The user should not be aware of the statefulness, because it does not matter for the API.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768514890



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       If an instance is used to process two connections simultaneously, the client should reset the connection before each range request and effectively fall back to the old behaviour. The same happens when you read non-sequentially. The only issue I can see here is if the client is used in a non-thread-safe way, but I don't want to add mutex locks unless I know that this is definitely a scenario that can occur.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4831f68) into [master](https://codecov.io/gh/apache/beam/commit/ffb18c79127e85faa9dea7104c5d3e145fdfaf9a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffb18c7) will **increase** coverage by `9.01%`.
   > The diff coverage is `13.79%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.64%   83.65%   +9.01%     
   ==========================================
     Files         643      447     -196     
     Lines       81183    61698   -19485     
   ==========================================
   - Hits        60595    51611    -8984     
   + Misses      19618    10087    -9531     
   + Partials      970        0     -970     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `13.53% <13.79%> (+1.03%)` | :arrow_up: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [sdks/python/apache\_beam/runners/direct/executor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvZXhlY3V0b3IucHk=) | `96.46% <0.00%> (-0.55%)` | :arrow_down: |
   | [sdks/go/pkg/beam/core/runtime/graphx/coder.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZ3JhcGh4L2NvZGVyLmdv) | | |
   | [sdks/go/pkg/beam/core/funcx/output.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2Z1bmN4L291dHB1dC5nbw==) | | |
   | [sdks/go/pkg/beam/util/harnessopts/sampler.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS91dGlsL2hhcm5lc3NvcHRzL3NhbXBsZXIuZ28=) | | |
   | [sdks/go/pkg/beam/core/util/reflectx/util.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3V0aWwvcmVmbGVjdHgvdXRpbC5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/xlangx/payload.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUveGxhbmd4L3BheWxvYWQuZ28=) | | |
   | [sdks/go/pkg/beam/core/graph/edge.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2VkZ2UuZ28=) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/input.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9pbnB1dC5nbw==) | | |
   | ... and [192 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ffb18c7...4831f68](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c6c15ac) into [master](https://codecov.io/gh/apache/beam/commit/ffb18c79127e85faa9dea7104c5d3e145fdfaf9a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffb18c7) will **increase** coverage by `8.99%`.
   > The diff coverage is `15.62%`.
   
   > :exclamation: Current head c6c15ac differs from pull request most recent head f19b4da. Consider uploading reports for the commit f19b4da to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.64%   83.63%   +8.99%     
   ==========================================
     Files         643      447     -196     
     Lines       81183    61710   -19473     
   ==========================================
   - Hits        60595    51611    -8984     
   + Misses      19618    10099    -9519     
   + Partials      970        0     -970     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `13.10% <15.62%> (+0.60%)` | :arrow_up: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [sdks/python/apache\_beam/runners/direct/executor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvZXhlY3V0b3IucHk=) | `96.46% <0.00%> (-0.55%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.39% <0.00%> (-0.13%)` | :arrow_down: |
   | [sdks/go/pkg/beam/core/funcx/fn.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2Z1bmN4L2ZuLmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/types.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvdHlwZXMuZ28=) | | |
   | [sdks/go/pkg/beam/runner.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXIuZ28=) | | |
   | [sdks/go/pkg/beam/core/runtime/pipelinex/util.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvcGlwZWxpbmV4L3V0aWwuZ28=) | | |
   | [sdks/go/pkg/beam/xlang.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS94bGFuZy5nbw==) | | |
   | [sdks/go/pkg/beam/core/util/reflectx/call.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3V0aWwvcmVmbGVjdHgvY2FsbC5nbw==) | | |
   | ... and [194 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ffb18c7...f19b4da](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-993538373


   All right, I fixed up and simplified a bunch of things and I tested everything manually (sequential and non-sequential reads) and it seems to be working fine. I don't think, however, that I can realistically add a test suite for this without mocking up an entire Boto3 session client.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (34fedd0) into [master](https://codecov.io/gh/apache/beam/commit/67bcf1e16e3fdf68cdea7a4b42b9c003e4b8948c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67bcf1e) will **increase** coverage by `37.54%`.
   > The diff coverage is `11.62%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #15931       +/-   ##
   ===========================================
   + Coverage   46.13%   83.68%   +37.54%     
   ===========================================
     Files         197      447      +250     
     Lines       19519    61715    +42196     
   ===========================================
   + Hits         9006    51649    +42643     
   - Misses       9542    10066      +524     
   + Partials      971        0      -971     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `14.51% <9.52%> (ø)` | |
   | [sdks/python/apache\_beam/utils/retry.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvcmV0cnkucHk=) | `85.43% <100.00%> (ø)` | |
   | [sdks/go/pkg/beam/core/runtime/exec/window.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy93aW5kb3cuZ28=) | | |
   | [sdks/go/pkg/beam/core/runtime/graphx/tree.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZ3JhcGh4L3RyZWUuZ28=) | | |
   | [sdks/go/pkg/beam/partition.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9wYXJ0aXRpb24uZ28=) | | |
   | [...ks/go/pkg/beam/core/runtime/harness/cache\_hooks.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9jYWNoZV9ob29rcy5nbw==) | | |
   | [sdks/go/pkg/beam/transforms/top/top.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90cmFuc2Zvcm1zL3RvcC90b3AuZ28=) | | |
   | [sdks/go/pkg/beam/core/graph/coder/time.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL3RpbWUuZ28=) | | |
   | [sdks/go/pkg/beam/core/metrics/dumper.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL21ldHJpY3MvZHVtcGVyLmdv) | | |
   | [sdks/go/pkg/beam/transforms/filter/filter.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90cmFuc2Zvcm1zL2ZpbHRlci9maWx0ZXIuZ28=) | | |
   | ... and [636 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [67bcf1e...34fedd0](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-999734590


   Thank you @phoerious 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-993786270


   @phoerious The boto stubber might help for testing. Sorry, didn't notice the client was entirely untested. First time looking into the python code base myself 😱 
   https://botocore.amazonaws.com/v1/documentation/api/latest/reference/stubber.html
   
   @aaltay How do you want to proceed?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-993538373






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4831f68) into [master](https://codecov.io/gh/apache/beam/commit/ffb18c79127e85faa9dea7104c5d3e145fdfaf9a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffb18c7) will **increase** coverage by `9.01%`.
   > The diff coverage is `13.79%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.64%   83.65%   +9.01%     
   ==========================================
     Files         643      447     -196     
     Lines       81183    61698   -19485     
   ==========================================
   - Hits        60595    51611    -8984     
   + Misses      19618    10087    -9531     
   + Partials      970        0     -970     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `13.53% <13.79%> (+1.03%)` | :arrow_up: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [sdks/python/apache\_beam/runners/direct/executor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvZXhlY3V0b3IucHk=) | `96.46% <0.00%> (-0.55%)` | :arrow_down: |
   | [sdks/go/pkg/beam/transforms/stats/min.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90cmFuc2Zvcm1zL3N0YXRzL21pbi5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/coderx/float.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvY29kZXJ4L2Zsb2F0Lmdv) | | |
   | [sdks/go/pkg/beam/core/graph/coder/varint.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL3ZhcmludC5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/graphx/dataflow.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZ3JhcGh4L2RhdGFmbG93Lmdv) | | |
   | [sdks/go/pkg/beam/util/starcgenx/starcgenx.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS91dGlsL3N0YXJjZ2VueC9zdGFyY2dlbnguZ28=) | | |
   | [...o/pkg/beam/runners/dataflow/dataflowlib/execute.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RhdGFmbG93L2RhdGFmbG93bGliL2V4ZWN1dGUuZ28=) | | |
   | [...o/pkg/beam/runners/dataflow/dataflowlib/metrics.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RhdGFmbG93L2RhdGFmbG93bGliL21ldHJpY3MuZ28=) | | |
   | ... and [192 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ffb18c7...4831f68](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-998236073


   Re-running the tests, waiting for all the pass. Please ping me once they are all green.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-996920229


   I already did the local testing part. I tested sequential and non-sequential reads (including repeated reads from the same range) and got the expected data back.
   
   I also tried closing the internal HTTP stream mid-air to trigger a retry and also tried to connect to a non-existent host (also triggering retries) as well as requesting a non-existent object (4xx client error, no retries).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] aaltay commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
aaltay commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-999945083


   > Sorry if I missed that but does this PR is linked to any Jira issue ?
   
   No. And I should have asked for it before merging it.
   
   @phoerious - Could you file a jira if one does not exist, and update the PR title to include the jira issue?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-999408176


   @aaltay all green (except the Codecov test, obviously). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768427938



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       This adds some local mutable state to the client. I'm not familiar with the Python SDK unfortunately, so not sure if this is legitimate. In any case, I'd suggest to document this in the pydocs of the client.

##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -24,16 +24,18 @@
   # pylint: disable=wrong-import-order, wrong-import-position
   # pylint: disable=ungrouped-imports
   import boto3
+  import botocore.exceptions as boto_exception
 
 except ImportError:
   boto3 = None
+  boto_exception = None
 
 
 class Client(object):
   """
   Wrapper for boto3 library
   """
-  def __init__(self, options):
+  def __init__(self, options, download_offset=0):

Review comment:
       @phoerious Could you add some documentation to explain usage of `download_offset` as it's currently not used. I'm actually wondering a bit if it's necessary at all. If you use `get_range` and skip over some bytes using `start` >> `_download_pos`, the stream would be closed and reopened starting from `start` without any need for an offset

##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):

Review comment:
       Could you add some pytests for get_stream? Even though fairly trivial, it would be good to also carefully test state management here.

##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):
+    """Opens a stream object starting at the given position.
+
+    Args:
+      request: (GetRequest) request
+      start: (int) start offset
+    Returns:
+      (Stream) Boto3 stream object.
+    """
+
+    if self._download_request and (
+        start != self._download_pos or
+        request.bucket != self._download_request.bucket or
+        request.object != self._download_request.object):
+      self._download_stream.close()
+      self._download_stream = None
+
+    # noinspection PyProtectedMember
+    if not self._download_stream or self._download_stream._raw_stream.closed:
+      try:
+        self._download_stream = self.client.get_object(
+            Bucket=request.bucket,
+            Key=request.object,
+            Range='bytes={}-'.format(start + self._download_offset))['Body']
+        self._download_request = request
+        self._download_pos = start
+      except Exception as e:
+        message = e.response['Error'].get(
+            'Message', e.response['Error'].get('Code', ''))
+        code = e.response['ResponseMetadata']['HTTPStatusCode']
+        raise messages.S3ClientError(message, code)
+
+    return self._download_stream
+
   def get_range(self, request, start, end):
     r"""Retrieves an object's contents.
 
       Args:
         request: (GetRequest) request
+        start: (int) start offset
+        end: (int) end offset
       Returns:
         (bytes) The response message.
       """
-    try:
-      boto_response = self.client.get_object(
-          Bucket=request.bucket,
-          Key=request.object,
-          Range='bytes={}-{}'.format(start, end - 1))
-    except Exception as e:
-      message = e.response['Error']['Message']
-      code = e.response['ResponseMetadata']['HTTPStatusCode']
-      raise messages.S3ClientError(message, code)
-
-    return boto_response['Body'].read()  # A bytes object
+    for i in range(self._retries):

Review comment:
       Where is `_retries` initialized? Also, imho, the retry behavior is rather important and it would be good to cover it in tests. Also, any retry should always use some kind of exponential backoff strategy. In `utils` you can find `@retry.with_exponential_backoff`, please have a look.

##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):
+    """Opens a stream object starting at the given position.
+
+    Args:
+      request: (GetRequest) request
+      start: (int) start offset
+    Returns:
+      (Stream) Boto3 stream object.
+    """
+
+    if self._download_request and (
+        start != self._download_pos or
+        request.bucket != self._download_request.bucket or
+        request.object != self._download_request.object):
+      self._download_stream.close()
+      self._download_stream = None
+
+    # noinspection PyProtectedMember
+    if not self._download_stream or self._download_stream._raw_stream.closed:
+      try:
+        self._download_stream = self.client.get_object(
+            Bucket=request.bucket,
+            Key=request.object,
+            Range='bytes={}-'.format(start + self._download_offset))['Body']
+        self._download_request = request
+        self._download_pos = start
+      except Exception as e:
+        message = e.response['Error'].get(
+            'Message', e.response['Error'].get('Code', ''))
+        code = e.response['ResponseMetadata']['HTTPStatusCode']
+        raise messages.S3ClientError(message, code)
+
+    return self._download_stream
+
   def get_range(self, request, start, end):
     r"""Retrieves an object's contents.
 
       Args:
         request: (GetRequest) request
+        start: (int) start offset
+        end: (int) end offset

Review comment:
       Just based on `end - 1` below, for clarity
   ```suggestion
           end: (int) end offset (exclusive)
   ```

##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):
+    """Opens a stream object starting at the given position.
+
+    Args:
+      request: (GetRequest) request
+      start: (int) start offset
+    Returns:
+      (Stream) Boto3 stream object.
+    """
+
+    if self._download_request and (
+        start != self._download_pos or
+        request.bucket != self._download_request.bucket or
+        request.object != self._download_request.object):
+      self._download_stream.close()
+      self._download_stream = None
+
+    # noinspection PyProtectedMember
+    if not self._download_stream or self._download_stream._raw_stream.closed:
+      try:
+        self._download_stream = self.client.get_object(
+            Bucket=request.bucket,
+            Key=request.object,
+            Range='bytes={}-'.format(start + self._download_offset))['Body']
+        self._download_request = request
+        self._download_pos = start
+      except Exception as e:
+        message = e.response['Error'].get(
+            'Message', e.response['Error'].get('Code', ''))
+        code = e.response['ResponseMetadata']['HTTPStatusCode']
+        raise messages.S3ClientError(message, code)
+
+    return self._download_stream
+
   def get_range(self, request, start, end):
     r"""Retrieves an object's contents.
 
       Args:
         request: (GetRequest) request
+        start: (int) start offset
+        end: (int) end offset
       Returns:
         (bytes) The response message.
       """
-    try:
-      boto_response = self.client.get_object(
-          Bucket=request.bucket,
-          Key=request.object,
-          Range='bytes={}-{}'.format(start, end - 1))
-    except Exception as e:
-      message = e.response['Error']['Message']
-      code = e.response['ResponseMetadata']['HTTPStatusCode']
-      raise messages.S3ClientError(message, code)
-
-    return boto_response['Body'].read()  # A bytes object
+    for i in range(self._retries):
+      try:
+        stream = self.get_stream(request, start)
+        data = stream.read(end - start)
+        self._pos += len(data)

Review comment:
       ```suggestion
           self._download_pos += len(data)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768492019



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -24,16 +24,18 @@
   # pylint: disable=wrong-import-order, wrong-import-position
   # pylint: disable=ungrouped-imports
   import boto3
+  import botocore.exceptions as boto_exception
 
 except ImportError:
   boto3 = None
+  boto_exception = None
 
 
 class Client(object):
   """
   Wrapper for boto3 library
   """
-  def __init__(self, options):
+  def __init__(self, options, download_offset=0):

Review comment:
       It's not really needed. This is a leftover from a previous design and I removed it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768501639



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):

Review comment:
       I struggled myself yesterday when trying to set things up, so unfortunately I'm not of much help ...
   Here's some guidelines though https://cwiki.apache.org/confluence/display/BEAM/Python+Tips




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768507027



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       i'm certainly not opposed to a stateful client here, i just lack the context if this can be safely done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (05d1610) into [master](https://codecov.io/gh/apache/beam/commit/8da177f64d314cf72e89a51e51fb0915f706a784?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8da177f) will **decrease** coverage by `0.02%`.
   > The diff coverage is `8.33%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   - Coverage   83.53%   83.50%   -0.03%     
   ==========================================
     Files         445      445              
     Lines       61385    61401      +16     
   ==========================================
   - Hits        51275    51273       -2     
   - Misses      10110    10128      +18     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `11.76% <8.33%> (-0.74%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `94.02% <0.00%> (-2.24%)` | :arrow_down: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [sdks/python/apache\_beam/transforms/combiners.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb21iaW5lcnMucHk=) | `93.15% <0.00%> (-0.40%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.25% <0.00%> (-0.17%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.51% <0.00%> (-0.13%)` | :arrow_down: |
   | [sdks/python/apache\_beam/transforms/util.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy91dGlsLnB5) | `96.02% <0.00%> (+0.16%)` | :arrow_up: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `86.36% <0.00%> (+0.50%)` | :arrow_up: |
   | [...pache\_beam/runners/interactive/interactive\_beam.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9iZWFtLnB5) | `76.58% <0.00%> (+0.97%)` | :arrow_up: |
   | [sdks/python/apache\_beam/utils/interactive\_utils.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaW50ZXJhY3RpdmVfdXRpbHMucHk=) | `95.12% <0.00%> (+2.43%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8da177f...05d1610](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mosche commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
mosche commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768518244



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -67,6 +69,11 @@ def __init__(self, options):
         aws_secret_access_key=secret_access_key,
         aws_session_token=session_token)
 
+    self._download_request = None

Review comment:
       well, these blocks are by no means atomic... there's all kinds of possible race conditions if this code should run concurrently




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on a change in pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on a change in pull request #15931:
URL: https://github.com/apache/beam/pull/15931#discussion_r768538855



##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):

Review comment:
       I got the tests to run, but I am honestly ad a loss how to add proper tests for that. All the existing tests are on a mock S3 client. The real S3Client class is entirely untested and cannot be tested effectively without a running S3 backend server.

##########
File path: sdks/python/apache_beam/io/aws/clients/s3/boto3_client.py
##########
@@ -92,27 +99,65 @@ def get_object_metadata(self, request):
         boto_response['ContentLength'],
         boto_response['ContentType'])
 
+    item.size = max(item.size - self._download_offset, 0)
     return item
 
+  def get_stream(self, request, start):

Review comment:
       I got the tests to run, but I am honestly at a loss how to add proper tests for that. All the existing tests are on a mock S3 client. The real S3Client class is entirely untested and cannot be tested effectively without a running S3 backend server.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c6c15ac) into [master](https://codecov.io/gh/apache/beam/commit/ffb18c79127e85faa9dea7104c5d3e145fdfaf9a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffb18c7) will **increase** coverage by `8.99%`.
   > The diff coverage is `15.62%`.
   
   > :exclamation: Current head c6c15ac differs from pull request most recent head 4831f68. Consider uploading reports for the commit 4831f68 to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.64%   83.63%   +8.99%     
   ==========================================
     Files         643      447     -196     
     Lines       81183    61710   -19473     
   ==========================================
   - Hits        60595    51611    -8984     
   + Misses      19618    10099    -9519     
   + Partials      970        0     -970     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `13.10% <15.62%> (+0.60%)` | :arrow_up: |
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `90.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [sdks/python/apache\_beam/runners/direct/executor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvZXhlY3V0b3IucHk=) | `96.46% <0.00%> (-0.55%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.39% <0.00%> (-0.13%)` | :arrow_down: |
   | [sdks/go/pkg/beam/internal/errors/errors.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pbnRlcm5hbC9lcnJvcnMvZXJyb3JzLmdv) | | |
   | [sdks/go/pkg/beam/io/databaseio/writer.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9pby9kYXRhYmFzZWlvL3dyaXRlci5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/symbols.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvc3ltYm9scy5nbw==) | | |
   | [sdks/go/pkg/beam/transforms/stats/count.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90cmFuc2Zvcm1zL3N0YXRzL2NvdW50Lmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/pardo.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9wYXJkby5nbw==) | | |
   | [...s/go/pkg/beam/core/runtime/harness/sampler\_hook.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9zYW1wbGVyX2hvb2suZ28=) | | |
   | ... and [194 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [ffb18c7...4831f68](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (34fedd0) into [master](https://codecov.io/gh/apache/beam/commit/67bcf1e16e3fdf68cdea7a4b42b9c003e4b8948c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67bcf1e) will **increase** coverage by `37.54%`.
   > The diff coverage is `11.62%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #15931       +/-   ##
   ===========================================
   + Coverage   46.13%   83.68%   +37.54%     
   ===========================================
     Files         197      447      +250     
     Lines       19519    61715    +42196     
   ===========================================
   + Hits         9006    51649    +42643     
   - Misses       9542    10066      +524     
   + Partials      971        0      -971     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `14.51% <9.52%> (ø)` | |
   | [sdks/python/apache\_beam/utils/retry.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvcmV0cnkucHk=) | `85.43% <100.00%> (ø)` | |
   | [sdks/go/pkg/beam/core/runtime/exec/pcollection.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9wY29sbGVjdGlvbi5nbw==) | | |
   | [sdks/go/pkg/beam/beam.shims.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9iZWFtLnNoaW1zLmdv) | | |
   | [sdks/go/pkg/beam/testing/passert/passert.shims.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS90ZXN0aW5nL3Bhc3NlcnQvcGFzc2VydC5zaGltcy5nbw==) | | |
   | [sdks/go/pkg/beam/core/graph/coder/int.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL2ludC5nbw==) | | |
   | [sdks/go/pkg/beam/core/graph/coder/windows.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL3dpbmRvd3MuZ28=) | | |
   | [sdks/go/pkg/beam/core/graph/coder/time.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL3RpbWUuZ28=) | | |
   | [.../go/pkg/beam/core/graph/coder/testutil/testutil.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL2dyYXBoL2NvZGVyL3Rlc3R1dGlsL3Rlc3R1dGlsLmdv) | | |
   | [sdks/go/pkg/beam/pcollection.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9wY29sbGVjdGlvbi5nbw==) | | |
   | ... and [636 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [67bcf1e...34fedd0](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-965013224


   # [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#15931](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (34fedd0) into [master](https://codecov.io/gh/apache/beam/commit/67bcf1e16e3fdf68cdea7a4b42b9c003e4b8948c?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67bcf1e) will **increase** coverage by `9.02%`.
   > The diff coverage is `11.62%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/15931/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #15931      +/-   ##
   ==========================================
   + Coverage   74.66%   83.68%   +9.02%     
   ==========================================
     Files         644      447     -197     
     Lines       81230    61715   -19515     
   ==========================================
   - Hits        60648    51649    -8999     
   + Misses      19611    10066    -9545     
   + Partials      971        0     -971     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/io/aws/clients/s3/boto3\_client.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXdzL2NsaWVudHMvczMvYm90bzNfY2xpZW50LnB5) | `14.51% <9.52%> (+2.01%)` | :arrow_up: |
   | [sdks/python/apache\_beam/utils/retry.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvcmV0cnkucHk=) | `85.43% <100.00%> (ø)` | |
   | [...pache\_beam/runners/interactive/interactive\_beam.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9iZWFtLnB5) | `75.60% <0.00%> (-0.98%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.85% <0.00%> (-0.51%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.39% <0.00%> (-0.13%)` | :arrow_down: |
   | [...s/go/pkg/beam/core/runtime/harness/sampler\_hook.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9zYW1wbGVyX2hvb2suZ28=) | | |
   | [sdks/go/pkg/beam/core/util/protox/any.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3V0aWwvcHJvdG94L2FueS5nbw==) | | |
   | [sdks/go/pkg/beam/core/util/symtab/symtab.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3V0aWwvc3ltdGFiL3N5bXRhYi5nbw==) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/emit.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9lbWl0Lmdv) | | |
   | [sdks/go/pkg/beam/core/runtime/exec/fn\_arity.go](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvZXhlYy9mbl9hcml0eS5nbw==) | | |
   | ... and [195 more](https://codecov.io/gh/apache/beam/pull/15931/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [67bcf1e...34fedd0](https://codecov.io/gh/apache/beam/pull/15931?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] phoerious commented on pull request #15931: Make S3 streaming more efficient

Posted by GitBox <gi...@apache.org>.
phoerious commented on pull request #15931:
URL: https://github.com/apache/beam/pull/15931#issuecomment-996858759


   I added one more commit that adjusts the general exception / error handling. Previously, the code threw an (unrelated) exception for any error that is not an HTTP error, such as a connection or intermittent read error. Since those are errors you want to retry, I adjusted the `except` clauses and the retry exception filter accordingly (and even if you didn't care about those errors, you wouldn't want to raise a `KeyError`). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org