You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/08/30 22:58:47 UTC

[GitHub] [beam] yixiaoshen opened a new pull request, #22966: [#22478]: Add read_time support to Google Firestore connector

yixiaoshen opened a new pull request, #22966:
URL: https://github.com/apache/beam/pull/22966

   Allows FirestoreIO read functions to be configured with a read_time, this allows the Beam workload to issue all its Firestore reads using the same read_time and get the consistent read result from Firestore across all its requests.
   
   addresses #22478
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/get-started-contributing/#make-the-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1232258702

   Note that FirestoreV1IT never ran before because the test code expected 2 environment variables that are not set in the tests environments. This PR also changed FirestoreV1IT code so that it can run without requiring the env vars. I've added coverage for consistent reads in FirestoreV1IT and tested it in a private GCP project to verify the correctness of the changes. However, I have to disable the test in PostCommit because as of now Firestore and Datastore database cannot coexist in the same GCP project and the test project already runs Datastore IT tests. I will need to defer the enablement of FirestoreV1IT in a later PR which requires setting up a new test project.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] ahmedabu98 commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
ahmedabu98 commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1271734615

   Also breaking V2 runner tests: https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV2/2164/testReport/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1232305609

   R: @pcostell


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1239929420

   R: @pabloem


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1232278428

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @apilloud for label java.
   R: @damccorm for label build.
   R: @ahmedabu98 for label io.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on a diff in pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on code in PR #22966:
URL: https://github.com/apache/beam/pull/22966#discussion_r964301019


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java:
##########
@@ -484,6 +575,7 @@ public final void processElement(ProcessContext c) throws Exception {
 
         try {
           RequestT request = nextPageToken == null ? element : setPageToken(element, nextPageToken);

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] pcostell commented on a diff in pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
pcostell commented on code in PR #22966:
URL: https://github.com/apache/beam/pull/22966#discussion_r964230851


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java:
##########
@@ -484,6 +575,7 @@ public final void processElement(ProcessContext c) throws Exception {
 
         try {
           RequestT request = nextPageToken == null ? element : setPageToken(element, nextPageToken);

Review Comment:
   It seems like the pattern for other setX(RequestT, X) is not not support nullable and conditionally call it in the base. Perhaps we should do that as well?



##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java:
##########
@@ -125,6 +134,13 @@ protected RunQueryRequest setStartFrom(
       return element.toBuilder().setStructuredQuery(builder.build()).build();
     }
 
+    @Override
+    protected RunQueryRequest setReadTime(RunQueryRequest element, @Nullable Instant readTime) {
+      return readTime == null
+          ? element

Review Comment:
   nit: We mix and match element / request for the variable names, can we choose just one?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1244250865

   R: @chamikaramj


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on a diff in pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on code in PR #22966:
URL: https://github.com/apache/beam/pull/22966#discussion_r970236111


##########
sdks/java/io/google-cloud-platform/build.gradle:
##########
@@ -196,6 +196,8 @@ task integrationTest(type: Test, dependsOn: processTestResources) {
   exclude '**/BigQueryIOStorageWriteIT.class'
   exclude '**/BigQueryToTableIT.class'
   exclude '**/BigQueryIOJsonTest.class'
+  // TODO Firestore IT test needs to run in a separate project from Datastore.
+  exclude '**/FirestoreV1IT.class'

Review Comment:
   I have locally run FirestoreV1IT test against a different GCP project to make sure the code behaves correctly. I'm currently working on another change to enable this test, hopefully in the same GCP project by using a new Firestore feature that's not launched yet but we can internally allowlist this test project to use that. I expect to have that ready in a few days, but I don't want it to block this PR because FirestoreV1IT has never run in the past because it was incorrectly written and it's an unrelated problem from read_time support.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on a diff in pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on code in PR #22966:
URL: https://github.com/apache/beam/pull/22966#discussion_r964300806


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java:
##########
@@ -125,6 +134,13 @@ protected RunQueryRequest setStartFrom(
       return element.toBuilder().setStructuredQuery(builder.build()).build();
     }
 
+    @Override
+    protected RunQueryRequest setReadTime(RunQueryRequest element, @Nullable Instant readTime) {
+      return readTime == null
+          ? element

Review Comment:
   Done, most places use 'element' so I change the rest also to element.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj commented on a diff in pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on code in PR #22966:
URL: https://github.com/apache/beam/pull/22966#discussion_r980624162


##########
sdks/java/io/google-cloud-platform/build.gradle:
##########
@@ -196,6 +196,8 @@ task integrationTest(type: Test, dependsOn: processTestResources) {
   exclude '**/BigQueryIOStorageWriteIT.class'
   exclude '**/BigQueryToTableIT.class'
   exclude '**/BigQueryIOJsonTest.class'
+  // TODO Firestore IT test needs to run in a separate project from Datastore.
+  exclude '**/FirestoreV1IT.class'

Review Comment:
   +1  for enabling the tests soon. But we don't have to block the PR on that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1232306212

   Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1271792576

   > Also breaking V2 runner tests: https://ci-beam.apache.org/job/beam_PostCommit_Java_DataflowV2/2164/testReport/
   
   fix is waiting to be merged https://github.com/apache/beam/pull/23322


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] codecov[bot] commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1232951426

   # [Codecov](https://codecov.io/gh/apache/beam/pull/22966?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#22966](https://codecov.io/gh/apache/beam/pull/22966?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8b1fcd3) into [master](https://codecov.io/gh/apache/beam/commit/3ede5b76e48b41e89bc67541ea5044ebe704e905?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3ede5b7) will **decrease** coverage by `0.00%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #22966      +/-   ##
   ==========================================
   - Coverage   73.69%   73.68%   -0.01%     
   ==========================================
     Files         713      713              
     Lines       94988    94988              
   ==========================================
   - Hits        69998    69994       -4     
   - Misses      23689    23693       +4     
     Partials     1301     1301              
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | python | `83.51% <ø> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/22966?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/internal/metrics/metric.py](https://codecov.io/gh/apache/beam/pull/22966/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW50ZXJuYWwvbWV0cmljcy9tZXRyaWMucHk=) | `93.00% <0.00%> (-1.00%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/22966/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.30% <0.00%> (-0.38%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/22966/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `88.94% <0.00%> (-0.16%)` | :arrow_down: |
   | [...n/apache\_beam/ml/gcp/recommendations\_ai\_test\_it.py](https://codecov.io/gh/apache/beam/pull/22966/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vbWwvZ2NwL3JlY29tbWVuZGF0aW9uc19haV90ZXN0X2l0LnB5) | `75.51% <0.00%> (+2.04%)` | :arrow_up: |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] pabloem commented on a diff in pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
pabloem commented on code in PR #22966:
URL: https://github.com/apache/beam/pull/22966#discussion_r965359778


##########
sdks/java/io/google-cloud-platform/build.gradle:
##########
@@ -196,6 +196,8 @@ task integrationTest(type: Test, dependsOn: processTestResources) {
   exclude '**/BigQueryIOStorageWriteIT.class'
   exclude '**/BigQueryToTableIT.class'
   exclude '**/BigQueryIOJsonTest.class'
+  // TODO Firestore IT test needs to run in a separate project from Datastore.
+  exclude '**/FirestoreV1IT.class'

Review Comment:
   thanks for adding these. what can we do to ensure these run somewhere?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] chamikaramj merged pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
chamikaramj merged PR #22966:
URL: https://github.com/apache/beam/pull/22966


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on a diff in pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on code in PR #22966:
URL: https://github.com/apache/beam/pull/22966#discussion_r980714897


##########
sdks/java/io/google-cloud-platform/build.gradle:
##########
@@ -196,6 +196,8 @@ task integrationTest(type: Test, dependsOn: processTestResources) {
   exclude '**/BigQueryIOStorageWriteIT.class'
   exclude '**/BigQueryToTableIT.class'
   exclude '**/BigQueryIOJsonTest.class'
+  // TODO Firestore IT test needs to run in a separate project from Datastore.
+  exclude '**/FirestoreV1IT.class'

Review Comment:
   #23322 is out for review and will fix and enable FirestoreV1IT



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] yixiaoshen commented on pull request #22966: [#22478]: Add read_time support to Google Firestore connector

Posted by GitBox <gi...@apache.org>.
yixiaoshen commented on PR #22966:
URL: https://github.com/apache/beam/pull/22966#issuecomment-1239906236

   R: @chamikaramj


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org