Posted to github@beam.apache.org by "lostluck (via GitHub)" <gi...@apache.org> on 2023/02/13 19:49:08 UTC

[GitHub] [beam] lostluck opened a new issue, #25454: [Feature Request]: Python Datastore IO Query Read should retry

lostluck opened a new issue, #25454:
URL: https://github.com/apache/beam/issues/25454

   ### What would you like to happen?
   
   The Python SDK Datastore IO query read does not currently retry on retryable RPC/HTTP errors, in particular DEADLINE_EXCEEDED.
   
   Per the [Datastore documentation](https://cloud.google.com/datastore/docs/concepts/errors), DEADLINE_EXCEEDED errors should be retried using exponential backoff.
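
To make the request concrete, here is a minimal sketch of retrying with exponential backoff. It is not the existing Beam or Datastore client code: `DeadlineExceeded` and `call_with_backoff` are hypothetical names, the exception class stands in for whatever retryable error the client actually raises, and the delay values are arbitrary assumptions.

```python
import random
import time


class DeadlineExceeded(Exception):
    """Hypothetical stand-in for a retryable DEADLINE_EXCEEDED error."""


def call_with_backoff(fn, max_attempts=5, initial_delay=1.0, max_delay=32.0,
                      sleep=time.sleep):
    """Call fn(), retrying on DeadlineExceeded with exponential backoff.

    The delay ceiling doubles after each failure (capped at max_delay), and
    the actual sleep is a random value below that ceiling to add jitter.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except DeadlineExceeded:
            if attempt == max_attempts:
                # Out of attempts: surface the error to the caller.
                raise
            sleep(random.uniform(0, delay))
            delay = min(delay * 2, max_delay)
```

The `sleep` parameter is only there so the wait can be replaced in tests; a real implementation would more likely reuse whatever retry helper the write path already uses.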
   
   https://github.com/apache/beam/blob/v2.44.0/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py#L304
   
   Writes, at least, already retry, but the same should apply to reads. https://github.com/apache/beam/blob/v2.44.0/sdks/python/apache_beam/io/gcp/datastore/v1new/datastoreio.py#L397
   
   ----
   
   It occurs to me that the retry would need to be done safely, so that data already read and processed is not redundantly re-emitted. This may complicate the implementation of this resilience improvement.
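
One possible shape for this, sketched under the assumption that the query can be fetched page by page and resumed from a cursor: each page is fetched completely before any of its items are emitted, and a retry restarts only from the last cursor that was fully processed. The names `read_with_resume` and `fetch_page`, and the local `DeadlineExceeded` class, are hypothetical and not part of Beam or the Datastore client.

```python
import time


class DeadlineExceeded(Exception):
    """Hypothetical stand-in for a retryable DEADLINE_EXCEEDED error."""


def read_with_resume(fetch_page, max_retries=5, initial_delay=1.0,
                     sleep=time.sleep):
    """Yield each item once, retrying failed page fetches from the last cursor.

    fetch_page(cursor) is a hypothetical callable returning
    (items, next_cursor); a next_cursor of None means the query is exhausted.
    """
    cursor = None   # None means "start from the beginning"
    failures = 0    # consecutive failures for the current page
    while True:
        try:
            items, next_cursor = fetch_page(cursor)
        except DeadlineExceeded:
            failures += 1
            if failures > max_retries:
                raise
            # Exponential backoff: 1x, 2x, 4x, ... the initial delay.
            sleep(initial_delay * 2 ** (failures - 1))
            continue  # retry the same cursor; nothing from this page was emitted
        failures = 0
        yield from items
        cursor = next_cursor
        if cursor is None:
            return
```

Because the cursor only advances after a page has been emitted, a failure never causes earlier pages to be read again. Whether this maps cleanly onto the way the existing read transform iterates over query results is an open question.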
   
   ### Issue Priority
   
   Priority: 3 (nice-to-have improvement)
   
   ### Issue Components
   
   - [X] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [X] Component: IO connector
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org