You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/06/14 07:37:56 UTC

[GitHub] [beam] BenWhitehead opened a new pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

BenWhitehead opened a new pull request #15005:
URL: https://github.com/apache/beam/pull/15005


   Entry point for accessing Firestore V1 read methods is `FirestoreIO.v1().read()`.
   
   Currently supported read RPC methods:
   * `PartitionQuery`
   * `RunQuery`
   * `ListCollectionIds`
   * `ListDocuments`
   * `BatchGetDocuments`
   
   ### Unit Tests
   
   No external dependencies are needed for this suite
   
   A large suite of unit tests have been added to cover most branches and error
   scenarios in the various components. Test for input validation and bounds
   checking are also included in this suite.
   
   ### Integration Tests
   
   Integration tests for each type of RPC is present in
   `org.apache.beam.sdk.io.gcp.firestore.it.FirestoreV1IT`. All of these tests
   leverage `TestPipeline` and verify the expected Documents/Collections are all
   operated on during the test.
   
   ### Reviewers
   R: @chamikaramj
   R: @jlara310 (Firestore Technical Writer)
   R: @cynthiachi (Firestore Dataplane)
   R: @danthev (Firestore SRE)
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   `ValidatesRunner` compliance status (on master branch)
   --------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>Lang</th>
         <th>ULR</th>
         <th>Dataflow</th>
         <th>Flink</th>
         <th>Samza</th>
         <th>Spark</th>
         <th>Twister2</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>Go</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon">
           </a>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
       </tr>
       <tr>
         <td>Java</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_ULR/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_ULR/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon?subject=V1">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Streaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Streaming/lastCompletedBuild/badge/icon?subject=V1+Streaming">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon?subject=V1+Java+11">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/badge/icon?subject=V2">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2_Streaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2_Streaming/lastCompletedBuild/badge/icon?subject=V2+Streaming">
           </a><br>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon?subject=Java+8">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon?subject=Java+11">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon?subject=Portable">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon?subject=Portable+Streaming">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Samza/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Samza/lastCompletedBuild/badge/icon?subject=Portable">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon?subject=Portable">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon?subject=Structured+Streaming">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon">
           </a>
         </td>
       </tr>
       <tr>
         <td>Python</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon?subject=V1">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon?subject=V2">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon?subject=ValCont">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon?subject=Portable">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
       </tr>
       <tr>
         <td>XLang</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
       </tr>
     </tbody>
   </table>
   
   Examples testing status on various runners
   --------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>Lang</th>
         <th>ULR</th>
         <th>Dataflow</th>
         <th>Flink</th>
         <th>Samza</th>
         <th>Spark</th>
         <th>Twister2</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>Go</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
       <tr>
         <td>Java</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Cron/lastCompletedBuild/badge/icon?subject=V1">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Java11_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Java11_Cron/lastCompletedBuild/badge/icon?subject=V1+Java11">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java_Examples_Dataflow_V2/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java_Examples_Dataflow_V2/lastCompletedBuild/badge/icon?subject=V2">
           </a><br>
         </td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
       <tr>
         <td>Python</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
       <tr>
         <td>XLang</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
     </tbody>
   </table>
   
   Post-Commit SDK/Transform Integration Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>Go</th>
         <th>Java</th>
         <th>Python</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon?subject=3.6">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon?subject=3.7">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/badge/icon?subject=3.8">
           </a>
         </td>
       </tr>
     </tbody>
   </table>
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   <table>
     <thead>
       <tr>
         <th>---</th>
         <th>Java</th>
         <th>Python</th>
         <th>Go</th>
         <th>Website</th>
         <th>Whitespace</th>
         <th>Typescript</th>
       </tr>
     </thead>
     <tbody>
       <tr>
         <td>Non-portable</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon">
           </a><br>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon?subject=Tests">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon?subject=Lint">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/badge/icon?subject=Docker">
           </a><br>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/badge/icon?subject=Docs">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
       </tr>
       <tr>
         <td>Portable</td>
         <td>---</td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>
           <a href="https://ci-beam.apache.org/job/beam_PreCommit_GoPortable_Cron/lastCompletedBuild/">
             <img alt="Build Status" src="https://ci-beam.apache.org/job/beam_PreCommit_GoPortable_Cron/lastCompletedBuild/badge/icon">
           </a>
         </td>
         <td>---</td>
         <td>---</td>
         <td>---</td>
       </tr>
     </tbody>
   </table>
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r666369015



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {

Review comment:
       Is there any existing convention for a class that groups a set of `DoFn`s?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-879485470


   Run Java PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-870065497


   Run Java PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-865204056


   Run Java PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] jlara310 commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
jlara310 commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r662686011



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, it has the ability
+ * to parallelize execution of multiple queries for specific sub-ranges of the full results.
+ *
+ * <p>ListDocuments should only ever be used if the use of <a target="_blank" rel="noopener

Review comment:
       Re-write in imperative mood:
   
   You should only ever use ListDocuments if you need...

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, it has the ability

Review comment:
       if at all possible, it ->
   
   if at all possible, because it

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -700,6 +723,142 @@ public long nextBackOffMillis() {
     }
   }
 
+  /**
+   * This class implements a backoff algorithm similar to that of {@link
+   * org.apache.beam.sdk.util.FluentBackoff} with a key differences:
+   *
+   * <ol>
+   *   <li>A set of status code numbers may be specified to have a graceful evaluation
+   *   <li>Gracefully evaluated status code numbers will increment a decaying counter to ensure if
+   *       the graceful status code numbers occur more than once in the previous 60 seconds the
+   *       regular backoff behavior will kick in.
+   *   <li>The random number generator used to induce jitter is provided via constructor parameter
+   *       rather than using {@link Math#random()}}
+   * </ol>
+   *
+   * The primary motivation for creating this implementation is to support streamed responses from
+   * Firestore. In the case of RunQuery and BatchGet the results are returned via stream. The result
+   * stream has a maximum lifetime of 60 seconds before it will be broken and an UNAVAILABLE status
+   * code will be raised. Give this UNAVAILABLE is expected for streams this class allows for

Review comment:
       Typo? "Give this UNAVAILABLE is expected for streams this class"
   
   "Given that this UNAVAILABLE is expected for streams, this class"

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -700,6 +723,142 @@ public long nextBackOffMillis() {
     }
   }
 
+  /**
+   * This class implements a backoff algorithm similar to that of {@link
+   * org.apache.beam.sdk.util.FluentBackoff} with a key differences:
+   *
+   * <ol>
+   *   <li>A set of status code numbers may be specified to have a graceful evaluation
+   *   <li>Gracefully evaluated status code numbers will increment a decaying counter to ensure if
+   *       the graceful status code numbers occur more than once in the previous 60 seconds the
+   *       regular backoff behavior will kick in.
+   *   <li>The random number generator used to induce jitter is provided via constructor parameter
+   *       rather than using {@link Math#random()}}
+   * </ol>
+   *
+   * The primary motivation for creating this implementation is to support streamed responses from
+   * Firestore. In the case of RunQuery and BatchGet the results are returned via stream. The result
+   * stream has a maximum lifetime of 60 seconds before it will be broken and an UNAVAILABLE status
+   * code will be raised. Give this UNAVAILABLE is expected for streams this class allows for
+   * defining a set of status code numbers which are give a grace count of 1 before backoff kicks

Review comment:
       "which are give a grace count" -> which are given a grace count

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -115,25 +118,43 @@
             filteringDistributionFactory);
     writeRampUp =
         new WriteRampUp(500.0 / options.getHintMaxNumWorkers(), filteringDistributionFactory);
-    // maxRetries is an inclusive value, we want exclusive since we are tracking all attempts
-    fb =
-        FluentBackoff.DEFAULT
-            .withMaxRetries(options.getMaxAttempts() - 1)
-            .withInitialBackoff(options.getInitialBackoff());
     counters = new WeakHashMap<>();
     computeCounters = (Context c) -> O11y.create(c, counterFactory, filteringDistributionFactory);
   }
 
   @Override
   public RpcWriteAttemptImpl newWriteAttempt(Context context) {
     return new RpcWriteAttemptImpl(
-        context, counters.computeIfAbsent(context, computeCounters), fb.backoff(), sleeper);
+        context,
+        counters.computeIfAbsent(context, computeCounters),
+        new StatusCodeAwareBackoff(
+            random,
+            options.getMaxAttempts(),
+            options.getThrottleDuration(),
+            Collections.emptySet()),
+        sleeper);
   }
 
   @Override
   public RpcReadAttemptImpl newReadAttempt(Context context) {
+    Set<Integer> graceStatusCodeNumbers = Collections.emptySet();
+    // When reading results from a RunQuery or BatchGet the stream returning the results has a
+    //   maximum lifetime of 60 seconds at which point it will be broken with a n UNAVAILABLE

Review comment:
       "a n UNAVAILABLE"
   Typo?
   
   -> "an UNAVAILABLE"

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -700,6 +723,142 @@ public long nextBackOffMillis() {
     }
   }
 
+  /**
+   * This class implements a backoff algorithm similar to that of {@link
+   * org.apache.beam.sdk.util.FluentBackoff} with a key differences:

Review comment:
       "a key differences" -> "some key differences"




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-869845855






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r663224976



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreDoFn.java
##########
@@ -46,6 +46,22 @@
   @StartBundle
   public abstract void startBundle(DoFn<InT, OutT>.StartBundleContext context) throws Exception;
 
+  abstract static class NonWindowAwareDoFn<InT, OutT> extends FirestoreDoFn<InT, OutT> {

Review comment:
       Please add a comment describing why we we need this class.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {

Review comment:
       Agree. I would have expected PartitionQuery to be a utility that help Beam read transform to parallelize better instead being a part of the public API.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreDoFn.java
##########
@@ -46,6 +46,22 @@
   @StartBundle
   public abstract void startBundle(DoFn<InT, OutT>.StartBundleContext context) throws Exception;
 
+  abstract static class NonWindowAwareDoFn<InT, OutT> extends FirestoreDoFn<InT, OutT> {
+    /**
+     * {@link ProcessContext#element() context.element()} must be non-null, otherwise a
+     * NullPointerException will be thrown.

Review comment:
       I'm not sure if this means whether Firestore Read does not support Windowing or not but please note that Windowing is a key feature of Beam and all sources are expected to support that.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, becuase it has the
+ * ability to parallelize execution of multiple queries for specific sub-ranges of the full results.
+ *
+ * <p>You should only ever use ListDocuments if the use of <a target="_blank" rel="noopener
+ * noreferrer"
+ * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">{@code
+ * show_missing}</a> is needed to access a document. RunQuery and PartitionQuery will always be
+ * faster if the use of {@code show_missing} is not needed.
+ *
+ * <p><b>Example Usage</b>
+ *
+ * <pre>{@code
+ * PCollection<PartitionQueryRequest> partitionQueryRequests = ...;

Review comment:
       Probably add short descriptions to each of these request/response types (and links for further details).

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       I would also try to reduce the the implementation to one or few SDF-based sources to make implementation simpler.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {
+      PCollection<RunQueryRequest> queries =
+          input
+              .apply(
+                  "PartitionQuery",
+                  ParDo.of(
+                      new PartitionQueryFn(
+                          clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+              .apply("expand queries", ParDo.of(new PartitionQueryResponseToRunQueryRequest()));
+      if (nameOnlyQuery) {
+        queries =
+            queries.apply(
+                "set name only query",
+                MapElements.via(
+                    new SimpleFunction<RunQueryRequest, RunQueryRequest>() {
+                      @Override
+                      public RunQueryRequest apply(RunQueryRequest input) {
+                        RunQueryRequest.Builder builder = input.toBuilder();
+                        builder
+                            .getStructuredQueryBuilder()
+                            .setSelect(
+                                Projection.newBuilder()
+                                    .addFields(
+                                        FieldReference.newBuilder()
+                                            .setFieldPath("__name__")
+                                            .build())
+                                    .build());
+                        return builder.build();
+                      }
+                    }));
+      }
+      return queries
+          .apply(Reshuffle.viaRandomKey())
+          .apply(new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions));
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+    }
+
+    /**
+     * A type safe builder for {@link PartitionQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#partitionQuery()
+     * @see FirestoreV1.PartitionQuery
+     * @see PartitionQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<PartitionQueryRequest>,
+            PCollection<RunQueryResponse>,
+            PartitionQuery,
+            FirestoreV1.PartitionQuery.Builder> {
+
+      private boolean nameOnlyQuery = false;
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions,
+          boolean nameOnlyQuery) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+        this.nameOnlyQuery = nameOnlyQuery;
+      }
+
+      @Override
+      public PartitionQuery build() {
+        return genericBuild();
+      }
+
+      /**
+       * Update produced queries to only retrieve their {@code __name__} thereby not retrieving any
+       * fields and reducing resource requirements.
+       *
+       * @return this builder
+       */
+      public Builder withNameOnlyQuery() {
+        this.nameOnlyQuery = true;
+        return this;
+      }
+
+      @Override
+      PartitionQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new PartitionQuery(
+            clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+      }
+    }
+
+    /**
+     * DoFn which contains the logic necessary to turn a {@link PartitionQueryRequest} and {@link
+     * PartitionQueryResponse} pair into {@code N} {@link RunQueryRequest}.
+     */
+    static final class PartitionQueryResponseToRunQueryRequest
+        extends DoFn<PartitionQueryPair, RunQueryRequest> {
+
+      /**
+       * When fetching cursors that span multiple pages it is expected (per <a
+       * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">
+       * PartitionQueryRequest.page_token</a>) for the client to sort the cursors before processing
+       * them to define the sub-queries. So here we're defining a Comparator which will sort Cursors

Review comment:
       Can such a order change while a Beam pipeline is reading a given dataset ?

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {

Review comment:
       Usually "*Fn" notation is used to name DoFn implementations.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());

Review comment:
       Is this a global lock ? How would this adapt when Beam parallelize reading across workers ?

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       Beam sources help Beam pipeline parallelize. I think we have to update following source Fn classes to use SDF to parallelize better. This help support features such as dynamic work rebalancing. Without proper parallelization Beam pipelines that use Firestore source could run into stragglers (which is an issue many Dataflow customers run into without dynamic work rebalancing is not available).
   See here for more details on SDF: https://beam.apache.org/documentation/programming-guide/#splittable-dofns

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, becuase it has the
+ * ability to parallelize execution of multiple queries for specific sub-ranges of the full results.
+ *
+ * <p>You should only ever use ListDocuments if the use of <a target="_blank" rel="noopener
+ * noreferrer"
+ * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">{@code
+ * show_missing}</a> is needed to access a document. RunQuery and PartitionQuery will always be
+ * faster if the use of {@code show_missing} is not needed.
+ *
+ * <p><b>Example Usage</b>
+ *
+ * <pre>{@code
+ * PCollection<PartitionQueryRequest> partitionQueryRequests = ...;
+ * PCollection<RunQueryResponse> partitionQueryResponses = partitionQueryRequests
+ *     .apply(FirestoreIO.v1().read().partitionQuery().build());
+ * }</pre>
+ *
+ * <pre>{@code
+ * PCollection<RunQueryRequest> runQueryRequests = ...;

Review comment:
       Do you think all these types of PCollections should be in the public API ? Are end users expected to use all of these ?
   
   I'm wondering if we can somehow simplify the public API by allowing users to get a certain (root) type of PCollection and providing utility functions to convert from there.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r668372449



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       Ben could you add a TODO with a Jira to support progress reporting and dynamic work rebalancing in the future ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] nehsyc commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
nehsyc commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r667241197



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       Limited parallelism (by the number of partitions) would be the main concern w.r.t. performance, especially when the processing time for each partitioned query can vary a lot. If the static partitioning done by Datastore is fairly good in terms of balanced load breakdown, dynamic sharding would become less needed there. It'd be good anyway to document the known drawback of the manual partitioning.
   
   Once Dataflow detects the limited parallelism, it will not upsize more than that to avoid wasting resources. All those happen automatically when Dataflow fails splitting the tasks (or bundles) for a short period of time (minutes).
   
   BTW Ben mentioned another point related to autoscaling, throttling during the read. Read might fail and get retried which could cause Dataflow to upsize the worker pool. Reporting throttling signals like what's done in the sink back to Dataflow would potentially solve the problem.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r654656702



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -700,6 +723,142 @@ public long nextBackOffMillis() {
     }
   }
 
+  /**
+   * This class implements a backoff algorithm similar to that of {@link
+   * org.apache.beam.sdk.util.FluentBackoff} with a could key differences:
+   *
+   * <ol>
+   *   <li>A set of status code numbers may be specified to have a graceful evaluation
+   *   <li>Gracefully evaluated status code numbers will increment a decaying counter to ensure if
+   *       the graceful status code numbers occur more than once in the previous 60 seconds the
+   *       regular backoff behavior will kick in.
+   *   <li>The random number generator used to induce jitter is provided via constructor parameter
+   *       rather than using {@link Math#random()}}
+   * </ol>
+   *
+   * The primary motivation for creating this implementation is to support streamed responses from
+   * Firestore. In the case of RunQuery and BatchGet the results are returned via stream. The result
+   * stream has a maximum lifetime of 60 seconds before it will be broken and an UNAVAILABLE status
+   * code will be raised. Give this UNAVAILABLE is expected for streams this class allows for
+   * defining a set of status code numbers which are give a grace count of 1 before backoff kicks
+   * in. When backoff does kick in, it is implemented using the same calculations as {@link
+   * org.apache.beam.sdk.util.FluentBackoff}.
+   */
+  static final class StatusCodeAwareBackoff {
+    private static final double RANDOMIZATION_FACTOR = 0.5;

Review comment:
       It does, here is the declaration site https://github.com/apache/beam/blob/0c01636fc8610414859d946cb93eabc904123fc8/sdks/java/core/src/main/java/org/apache/beam/sdk/util/FluentBackoff.java#L36 and the use site https://github.com/apache/beam/blob/0c01636fc8610414859d946cb93eabc904123fc8/sdks/java/core/src/main/java/org/apache/beam/sdk/util/FluentBackoff.java#L198-L199

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,632 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      PartitionQueryResponse.Builder aggregate = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        try {
+          PartitionQueryRequest request = setPageToken(element, aggregate);
+          attempt.recordRequestStart(clock.instant());
+          PartitionQueryPagedResponse pagedResponse =
+              firestoreStub.partitionQueryPagedCallable().call(request);
+          for (PartitionQueryPage page : pagedResponse.iteratePages()) {
+            attempt.recordRequestSuccessful(clock.instant());
+            PartitionQueryResponse response = page.getResponse();
+            if (aggregate == null) {
+              aggregate = response.toBuilder();
+            } else {
+              aggregate.addAllPartitions(response.getPartitionsList());
+              if (page.hasNextPage()) {
+                aggregate.setNextPageToken(response.getNextPageToken());
+              } else {
+                aggregate.clearNextPageToken();
+              }
+            }
+            if (page.hasNextPage()) {
+              attempt.recordRequestStart(clock.instant());
+            }
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+      if (aggregate != null) {
+        context.output(new PartitionQueryPair(element, aggregate.build()));
+      }
+    }
+
+    private PartitionQueryRequest setPageToken(
+        PartitionQueryRequest request,
+        @edu.umd.cs.findbugs.annotations.Nullable PartitionQueryResponse.Builder aggregate) {
+      if (aggregate != null && aggregate.getNextPageToken() != null) {
+        return request.toBuilder().setPageToken(aggregate.getNextPageToken()).build();
+      }
+      return request;
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListDocumentsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListDocumentsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListDocumentsRequest,
+          ListDocumentsPagedResponse,
+          ListDocumentsPage,
+          ListDocumentsResponse> {
+
+    ListDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListDocuments;
+    }
+
+    @Override
+    protected UnaryCallable<ListDocumentsRequest, ListDocumentsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listDocumentsPagedCallable();
+    }
+
+    @Override
+    protected ListDocumentsRequest setPageToken(
+        ListDocumentsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListCollectionIdsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListCollectionIdsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListCollectionIdsRequest,
+          ListCollectionIdsPagedResponse,
+          ListCollectionIdsPage,
+          ListCollectionIdsResponse> {
+
+    ListCollectionIdsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListCollectionIds;
+    }
+
+    @Override
+    protected UnaryCallable<ListCollectionIdsRequest, ListCollectionIdsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listCollectionIdsPagedCallable();
+    }
+
+    @Override
+    protected ListCollectionIdsRequest setPageToken(
+        ListCollectionIdsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link BatchGetDocumentsRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class BatchGetDocumentsFn
+      extends StreamingFirestoreV1ReadFn<BatchGetDocumentsRequest, BatchGetDocumentsResponse> {
+
+    BatchGetDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.BatchGetDocuments;
+    }
+
+    @Override
+    protected ServerStreamingCallable<BatchGetDocumentsRequest, BatchGetDocumentsResponse>
+        getCallable(FirestoreStub firestoreStub) {
+      return firestoreStub.batchGetDocumentsCallable();
+    }
+
+    @Override
+    protected BatchGetDocumentsRequest setStartFrom(
+        BatchGetDocumentsRequest originalRequest, BatchGetDocumentsResponse mostRecentResponse) {
+      int startIndex = -1;
+      ProtocolStringList documentsList = originalRequest.getDocumentsList();
+      String missing = mostRecentResponse.getMissing();
+      String foundName =
+          mostRecentResponse.hasFound() ? mostRecentResponse.getFound().getName() : null;
+      // we only scan until the second to last originalRequest. If the final element were to be
+      // reached
+      // the full request would be complete and we wouldn't be in this scenario
+      int maxIndex = documentsList.size() - 2;
+      for (int i = 0; i <= maxIndex; i++) {

Review comment:
       We potentially could, but that'd require a refactor to the `org.apache.beam.sdk.io.gcp.firestore.FirestoreV1ReadFn.StreamingFirestoreV1ReadFn` to track number of elements received (which is shared with `RunQuery`). I'm not sure it'd provide much material value though.
   
   Even for a list of 10m document names a linear scan for the 2nd to last value can happen in ~120ms which compared to an RPC is negligible.
   
   <details>
     <summary>Scan perf test</summary>
     
   ##### Test
   ```java
     @Test
     public void scanMicroBench() {
       List<String> strings = IntStream.rangeClosed(1, 10_000_000)
           .mapToObj(i -> String.format("projects/p1/databases/d1/documents/c1/doc2/c2/doc%d", i))
           .collect(Collectors.toList());
   
       List<String> searchValues = List.of(
           "projects/p1/databases/d1/documents/c1/doc2/c2/doc100",
           "projects/p1/databases/d1/documents/c1/doc2/c2/doc10000",
           "projects/p1/databases/d1/documents/c1/doc2/c2/doc1000000",
           "projects/p1/databases/d1/documents/c1/doc2/c2/doc9999999"
       );
   
       for (String searchValue : searchValues) {
         var sw = Stopwatch.createStarted();
         for (int i = 0; i < strings.size(); i++) {
           String s = strings.get(i);
           if (searchValue.equals(s)) {
             break;
           }
         }
         var stop = sw.stop();
         System.out.printf("search[%s] %s%n", searchValue, stop.toString());
       }
     }
   ```
   
   ##### Output
   ```
   search[projects/p1/databases/d1/documents/c1/doc2/c2/doc100] 18.72 μs
   search[projects/p1/databases/d1/documents/c1/doc2/c2/doc10000] 683.9 μs
   search[projects/p1/databases/d1/documents/c1/doc2/c2/doc1000000] 19.83 ms
   search[projects/p1/databases/d1/documents/c1/doc2/c2/doc9999999] 127.4 ms
   ```
   
   </details>

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,632 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      PartitionQueryResponse.Builder aggregate = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        try {
+          PartitionQueryRequest request = setPageToken(element, aggregate);
+          attempt.recordRequestStart(clock.instant());
+          PartitionQueryPagedResponse pagedResponse =
+              firestoreStub.partitionQueryPagedCallable().call(request);
+          for (PartitionQueryPage page : pagedResponse.iteratePages()) {
+            attempt.recordRequestSuccessful(clock.instant());
+            PartitionQueryResponse response = page.getResponse();
+            if (aggregate == null) {
+              aggregate = response.toBuilder();
+            } else {
+              aggregate.addAllPartitions(response.getPartitionsList());
+              if (page.hasNextPage()) {
+                aggregate.setNextPageToken(response.getNextPageToken());
+              } else {
+                aggregate.clearNextPageToken();
+              }
+            }
+            if (page.hasNextPage()) {
+              attempt.recordRequestStart(clock.instant());
+            }
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+      if (aggregate != null) {
+        context.output(new PartitionQueryPair(element, aggregate.build()));
+      }
+    }
+
+    private PartitionQueryRequest setPageToken(
+        PartitionQueryRequest request,
+        @edu.umd.cs.findbugs.annotations.Nullable PartitionQueryResponse.Builder aggregate) {
+      if (aggregate != null && aggregate.getNextPageToken() != null) {
+        return request.toBuilder().setPageToken(aggregate.getNextPageToken()).build();
+      }
+      return request;
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListDocumentsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListDocumentsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListDocumentsRequest,
+          ListDocumentsPagedResponse,
+          ListDocumentsPage,
+          ListDocumentsResponse> {
+
+    ListDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListDocuments;
+    }
+
+    @Override
+    protected UnaryCallable<ListDocumentsRequest, ListDocumentsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listDocumentsPagedCallable();
+    }
+
+    @Override
+    protected ListDocumentsRequest setPageToken(
+        ListDocumentsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListCollectionIdsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListCollectionIdsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListCollectionIdsRequest,
+          ListCollectionIdsPagedResponse,
+          ListCollectionIdsPage,
+          ListCollectionIdsResponse> {
+
+    ListCollectionIdsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListCollectionIds;
+    }
+
+    @Override
+    protected UnaryCallable<ListCollectionIdsRequest, ListCollectionIdsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listCollectionIdsPagedCallable();
+    }
+
+    @Override
+    protected ListCollectionIdsRequest setPageToken(
+        ListCollectionIdsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link BatchGetDocumentsRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class BatchGetDocumentsFn
+      extends StreamingFirestoreV1ReadFn<BatchGetDocumentsRequest, BatchGetDocumentsResponse> {
+
+    BatchGetDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.BatchGetDocuments;
+    }
+
+    @Override
+    protected ServerStreamingCallable<BatchGetDocumentsRequest, BatchGetDocumentsResponse>
+        getCallable(FirestoreStub firestoreStub) {
+      return firestoreStub.batchGetDocumentsCallable();
+    }
+
+    @Override
+    protected BatchGetDocumentsRequest setStartFrom(
+        BatchGetDocumentsRequest originalRequest, BatchGetDocumentsResponse mostRecentResponse) {
+      int startIndex = -1;
+      ProtocolStringList documentsList = originalRequest.getDocumentsList();
+      String missing = mostRecentResponse.getMissing();
+      String foundName =
+          mostRecentResponse.hasFound() ? mostRecentResponse.getFound().getName() : null;
+      // we only scan until the second to last originalRequest. If the final element were to be
+      // reached
+      // the full request would be complete and we wouldn't be in this scenario
+      int maxIndex = documentsList.size() - 2;
+      for (int i = 0; i <= maxIndex; i++) {
+        String docName = documentsList.get(i);
+        if (docName.equals(missing) || docName.equals(foundName)) {
+          startIndex = i;
+          break;
+        }
+      }
+      if (0 <= startIndex) {
+        BatchGetDocumentsRequest.Builder builder = originalRequest.toBuilder().clearDocuments();
+        documentsList.stream()
+            .skip(startIndex + 1) // start from the next entry from the one we found
+            .forEach(builder::addDocuments);
+        return builder.build();
+      }
+      // unable to find a match, return the original request
+      return originalRequest;

Review comment:
       I can update it to throw an error. What information would like to see included in the error message?
   
   Something like the following?
   ```suggestion
         throw new IllegalStateException(
             String.format(
                 "Unable to determine BatchGet resumption point. Most recently received doc __name__ '%s'",
                 foundName != null ? foundName : missing));
       }
   
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-879498454


   Run Java PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-879465730






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] danthev commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
danthev commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r655638240



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,632 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      PartitionQueryResponse.Builder aggregate = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        try {
+          PartitionQueryRequest request = setPageToken(element, aggregate);
+          attempt.recordRequestStart(clock.instant());
+          PartitionQueryPagedResponse pagedResponse =
+              firestoreStub.partitionQueryPagedCallable().call(request);
+          for (PartitionQueryPage page : pagedResponse.iteratePages()) {
+            attempt.recordRequestSuccessful(clock.instant());
+            PartitionQueryResponse response = page.getResponse();
+            if (aggregate == null) {
+              aggregate = response.toBuilder();
+            } else {
+              aggregate.addAllPartitions(response.getPartitionsList());
+              if (page.hasNextPage()) {
+                aggregate.setNextPageToken(response.getNextPageToken());
+              } else {
+                aggregate.clearNextPageToken();
+              }
+            }
+            if (page.hasNextPage()) {
+              attempt.recordRequestStart(clock.instant());
+            }
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+      if (aggregate != null) {
+        context.output(new PartitionQueryPair(element, aggregate.build()));
+      }
+    }
+
+    private PartitionQueryRequest setPageToken(
+        PartitionQueryRequest request,
+        @edu.umd.cs.findbugs.annotations.Nullable PartitionQueryResponse.Builder aggregate) {
+      if (aggregate != null && aggregate.getNextPageToken() != null) {
+        return request.toBuilder().setPageToken(aggregate.getNextPageToken()).build();
+      }
+      return request;
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListDocumentsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListDocumentsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListDocumentsRequest,
+          ListDocumentsPagedResponse,
+          ListDocumentsPage,
+          ListDocumentsResponse> {
+
+    ListDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListDocuments;
+    }
+
+    @Override
+    protected UnaryCallable<ListDocumentsRequest, ListDocumentsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listDocumentsPagedCallable();
+    }
+
+    @Override
+    protected ListDocumentsRequest setPageToken(
+        ListDocumentsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListCollectionIdsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListCollectionIdsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListCollectionIdsRequest,
+          ListCollectionIdsPagedResponse,
+          ListCollectionIdsPage,
+          ListCollectionIdsResponse> {
+
+    ListCollectionIdsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListCollectionIds;
+    }
+
+    @Override
+    protected UnaryCallable<ListCollectionIdsRequest, ListCollectionIdsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listCollectionIdsPagedCallable();
+    }
+
+    @Override
+    protected ListCollectionIdsRequest setPageToken(
+        ListCollectionIdsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link BatchGetDocumentsRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class BatchGetDocumentsFn
+      extends StreamingFirestoreV1ReadFn<BatchGetDocumentsRequest, BatchGetDocumentsResponse> {
+
+    BatchGetDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.BatchGetDocuments;
+    }
+
+    @Override
+    protected ServerStreamingCallable<BatchGetDocumentsRequest, BatchGetDocumentsResponse>
+        getCallable(FirestoreStub firestoreStub) {
+      return firestoreStub.batchGetDocumentsCallable();
+    }
+
+    @Override
+    protected BatchGetDocumentsRequest setStartFrom(
+        BatchGetDocumentsRequest originalRequest, BatchGetDocumentsResponse mostRecentResponse) {
+      int startIndex = -1;
+      ProtocolStringList documentsList = originalRequest.getDocumentsList();
+      String missing = mostRecentResponse.getMissing();
+      String foundName =
+          mostRecentResponse.hasFound() ? mostRecentResponse.getFound().getName() : null;
+      // we only scan until the second to last originalRequest. If the final element were to be
+      // reached
+      // the full request would be complete and we wouldn't be in this scenario
+      int maxIndex = documentsList.size() - 2;
+      for (int i = 0; i <= maxIndex; i++) {
+        String docName = documentsList.get(i);
+        if (docName.equals(missing) || docName.equals(foundName)) {
+          startIndex = i;
+          break;
+        }
+      }
+      if (0 <= startIndex) {
+        BatchGetDocumentsRequest.Builder builder = originalRequest.toBuilder().clearDocuments();
+        documentsList.stream()
+            .skip(startIndex + 1) // start from the next entry from the one we found
+            .forEach(builder::addDocuments);
+        return builder.build();
+      }
+      // unable to find a match, return the original request
+      return originalRequest;

Review comment:
       Yes, that sounds good.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-865204056


   Run Java PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r655741368



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {
+      PCollection<RunQueryRequest> queries =
+          input
+              .apply(
+                  "PartitionQuery",
+                  ParDo.of(
+                      new PartitionQueryFn(
+                          clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+              .apply("expand queries", ParDo.of(new PartitionQueryResponseToRunQueryRequest()));
+      if (nameOnlyQuery) {
+        queries =
+            queries.apply(
+                "set name only query",
+                MapElements.via(
+                    new SimpleFunction<RunQueryRequest, RunQueryRequest>() {
+                      @Override
+                      public RunQueryRequest apply(RunQueryRequest input) {
+                        RunQueryRequest.Builder builder = input.toBuilder();
+                        builder
+                            .getStructuredQueryBuilder()
+                            .setSelect(
+                                Projection.newBuilder()
+                                    .addFields(
+                                        FieldReference.newBuilder()
+                                            .setFieldPath("__name__")
+                                            .build())
+                                    .build());
+                        return builder.build();
+                      }
+                    }));
+      }
+      return queries
+          .apply(Reshuffle.viaRandomKey())
+          .apply(new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions));
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+    }
+
+    /**
+     * A type safe builder for {@link PartitionQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#partitionQuery()
+     * @see FirestoreV1.PartitionQuery
+     * @see PartitionQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<PartitionQueryRequest>,
+            PCollection<RunQueryResponse>,
+            PartitionQuery,
+            FirestoreV1.PartitionQuery.Builder> {
+
+      private boolean nameOnlyQuery = false;
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions,
+          boolean nameOnlyQuery) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+        this.nameOnlyQuery = nameOnlyQuery;
+      }
+
+      @Override
+      public PartitionQuery build() {
+        return genericBuild();
+      }
+
+      /**
+       * Update produced queries to only retrieve their {@code __name__} thereby not retrieving any
+       * fields and reducing resource requirements.
+       *
+       * @return this builder
+       */
+      public Builder withNameOnlyQuery() {
+        this.nameOnlyQuery = true;
+        return this;
+      }
+
+      @Override
+      PartitionQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new PartitionQuery(
+            clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+      }
+    }
+
+    /**
+     * DoFn which contains the logic necessary to turn a {@link PartitionQueryRequest} and {@link
+     * PartitionQueryResponse} pair into {@code N} {@link RunQueryRequest}.
+     */
+    static final class PartitionQueryResponseToRunQueryRequest
+        extends DoFn<PartitionQueryPair, RunQueryRequest> {
+
+      /**
+       * When fetching cursors that span multiple pages it is expected (per <a
+       * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">
+       * PartitionQueryRequest.page_token</a>) for the client to sort the cursors before processing
+       * them to define the sub-queries. So here we're defining a Comparator which will sort Cursors
+       * by the first reference value present, then comparing the reference values
+       * lexicographically.
+       */
+      static final Comparator<Cursor> CURSOR_REFERENCE_VALUE_COMPARATOR;
+
+      static {
+        Function<Cursor, Optional<Value>> firstReferenceValue =
+            (Cursor c) ->
+                c.getValuesList().stream()
+                    .filter(
+                        v -> {
+                          String referenceValue = v.getReferenceValue();
+                          return referenceValue != null && !referenceValue.isEmpty();
+                        })
+                    .findFirst();
+        Function<String, String[]> stringToPath = (String s) -> s.split("/");
+        // compare references by their path segments rather than as a whole string to ensure
+        // per path segment comparison is taken into account.
+        Comparator<String[]> pathWiseCompare =
+            (String[] path1, String[] path2) -> {
+              int minLength = Math.min(path1.length, path2.length);
+              for (int i = 0; i < minLength; i++) {
+                String pathSegment1 = path1[i];
+                String pathSegment2 = path2[i];
+                int compare = pathSegment1.compareTo(pathSegment2);
+                if (compare != 0) {
+                  return compare;
+                }
+              }
+              if (path1.length == path2.length) {
+                return 0;
+              } else if (minLength == path1.length) {
+                return -1;
+              } else {
+                return 1;
+              }
+            };
+
+        // Sort those cursors which have no firstReferenceValue at the bottom of the list
+        CURSOR_REFERENCE_VALUE_COMPARATOR =
+            Comparator.comparing(
+                firstReferenceValue,
+                (o1, o2) -> {
+                  if (o1.isPresent() && o2.isPresent()) {
+                    return pathWiseCompare.compare(
+                        stringToPath.apply(o1.get().getReferenceValue()),
+                        stringToPath.apply(o2.get().getReferenceValue()));
+                  } else if (o1.isPresent()) {
+                    return -1;
+                  } else {
+                    return 1;
+                  }
+                });
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        PartitionQueryPair pair = c.element();

Review comment:
       Unit test added in addition to existing integration test.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r666368212



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       I did attempt to provide a SDF for query when implementing this class, however today the API for `PartitionQuery` does not allow dynamic splitting, it is a purely pre-process type of operation. Due to this fact, I had to add the additional `PartitionQueryFn` which could perform the pre-processing and yield the computed queries.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r668230908



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {

Review comment:
       Given that PartitionQuery is a purely ahead of time operation, it is a transform that happens in the pipeline itself rather than per worker. By doing this, all of the subqueries are resolved and checkpointed before any RunQuery processing is performed thereby allowing the beam runner to absorb the burden of enqueuing all of the individual queries.
   
   We do want PartitionQuery to be part of the public API for composition reasons, and for our power customers who do generate their own PartitionQueryRequests or their processing jobs. This is less error prone than attempting to wrap the PartitionQuery api and offer all of its features rather than simply allowing a pass through of the RPC Proto.
   
   I have pushed a commit where I decoupled PartitionQuery from RunQuery. This change makes the PartitionQuery seem a bit more utility like in that it is now an explicitly added pre-step:
   
   ##### Before
   ```java
      PCollection<PartitionQueryRequest> partitionQueryRequests = ...;
      PCollection<RunQueryResponse> runQueryResponses = partitionQueryRequests
          .apply(FirestoreIO.v1().read().partitionQuery().build());
   ```
   
   ##### After
   ```java
      PCollection<PartitionQueryRequest> partitionQueryRequests = ...;
      PCollection<RunQueryRequest> runQueryRequests = partitionQueryRequests
          .apply(FirestoreIO.v1().read().partitionQuery().build());
      PCollection<RunQueryResponse> runQueryResponses = runQueryRequests
          .apply(FirestoreIO.v1().read().runQuery().build());
   ```
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r668234834



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, becuase it has the
+ * ability to parallelize execution of multiple queries for specific sub-ranges of the full results.
+ *
+ * <p>You should only ever use ListDocuments if the use of <a target="_blank" rel="noopener
+ * noreferrer"
+ * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">{@code
+ * show_missing}</a> is needed to access a document. RunQuery and PartitionQuery will always be
+ * faster if the use of {@code show_missing} is not needed.
+ *
+ * <p><b>Example Usage</b>
+ *
+ * <pre>{@code
+ * PCollection<PartitionQueryRequest> partitionQueryRequests = ...;
+ * PCollection<RunQueryResponse> partitionQueryResponses = partitionQueryRequests
+ *     .apply(FirestoreIO.v1().read().partitionQuery().build());
+ * }</pre>
+ *
+ * <pre>{@code
+ * PCollection<RunQueryRequest> runQueryRequests = ...;

Review comment:
       Each of the `PTransform`s off of `FirestoreIO.v1().read()` represent an individual RPC which Firestore supports for access of data. Each of them has at least one differentiating feature from other similar methods and is justified in being present.
   
   1. BatchGet is currently the only way to get documents by their id. Some customers do external id management which is then coordinated across several systems.
   2. ListCollections is currently the only way in which you can enumerate the collections of a document.
   3. ListDocuments is currently the only way in which you can access documents which have sub collections but no properties themselves (via `show_missing`)
   4. RunQuery is the primary and most performant way of fetching document by some criteria.
   5. PartitionQuery works in conjunction with RunQuery, today only CollectionGroup queries are support for partitioning but more query types are intended to be supported in the future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-864263876


   CodeCov might be gone after their recent compromise. Attached are two zip files with code coverage reports from my workstation -- one with unit and IT tests, one with only unit tests.
   
   [beam-coverage-report_unit-and-it.tar.gz](https://github.com/apache/beam/files/6679276/beam-coverage-report_unit-and-it.tar.gz)
   [beam-coverage-report_unit.tar.gz](https://github.com/apache/beam/files/6679277/beam-coverage-report_unit.tar.gz)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] danthev commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
danthev commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r655626971



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {
+      PCollection<RunQueryRequest> queries =
+          input
+              .apply(
+                  "PartitionQuery",
+                  ParDo.of(
+                      new PartitionQueryFn(
+                          clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+              .apply("expand queries", ParDo.of(new PartitionQueryResponseToRunQueryRequest()));
+      if (nameOnlyQuery) {
+        queries =
+            queries.apply(
+                "set name only query",
+                MapElements.via(
+                    new SimpleFunction<RunQueryRequest, RunQueryRequest>() {
+                      @Override
+                      public RunQueryRequest apply(RunQueryRequest input) {
+                        RunQueryRequest.Builder builder = input.toBuilder();
+                        builder
+                            .getStructuredQueryBuilder()
+                            .setSelect(
+                                Projection.newBuilder()
+                                    .addFields(
+                                        FieldReference.newBuilder()
+                                            .setFieldPath("__name__")
+                                            .build())
+                                    .build());
+                        return builder.build();
+                      }
+                    }));
+      }
+      return queries
+          .apply(Reshuffle.viaRandomKey())
+          .apply(new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions));
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+    }
+
+    /**
+     * A type safe builder for {@link PartitionQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#partitionQuery()
+     * @see FirestoreV1.PartitionQuery
+     * @see PartitionQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<PartitionQueryRequest>,
+            PCollection<RunQueryResponse>,
+            PartitionQuery,
+            FirestoreV1.PartitionQuery.Builder> {
+
+      private boolean nameOnlyQuery = false;
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions,
+          boolean nameOnlyQuery) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+        this.nameOnlyQuery = nameOnlyQuery;
+      }
+
+      @Override
+      public PartitionQuery build() {
+        return genericBuild();
+      }
+
+      /**
+       * Update produced queries to only retrieve their {@code __name__} thereby not retrieving any
+       * fields and reducing resource requirements.
+       *
+       * @return this builder
+       */
+      public Builder withNameOnlyQuery() {
+        this.nameOnlyQuery = true;
+        return this;
+      }
+
+      @Override
+      PartitionQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new PartitionQuery(
+            clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+      }
+    }
+
+    /**
+     * DoFn which contains the logic necessary to turn a {@link PartitionQueryRequest} and {@link
+     * PartitionQueryResponse} pair into {@code N} {@link RunQueryRequest}.
+     */
+    static final class PartitionQueryResponseToRunQueryRequest
+        extends DoFn<PartitionQueryPair, RunQueryRequest> {
+
+      /**
+       * When fetching cursors that span multiple pages it is expected (per <a
+       * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">
+       * PartitionQueryRequest.page_token</a>) for the client to sort the cursors before processing
+       * them to define the sub-queries. So here we're defining a Comparator which will sort Cursors
+       * by the first reference value present, then comparing the reference values
+       * lexicographically.
+       */
+      static final Comparator<Cursor> CURSOR_REFERENCE_VALUE_COMPARATOR;
+
+      static {
+        Function<Cursor, Optional<Value>> firstReferenceValue =
+            (Cursor c) ->
+                c.getValuesList().stream()
+                    .filter(
+                        v -> {
+                          String referenceValue = v.getReferenceValue();
+                          return referenceValue != null && !referenceValue.isEmpty();
+                        })
+                    .findFirst();
+        Function<String, String[]> stringToPath = (String s) -> s.split("/");
+        // compare references by their path segments rather than as a whole string to ensure
+        // per path segment comparison is taken into account.
+        Comparator<String[]> pathWiseCompare =
+            (String[] path1, String[] path2) -> {
+              int minLength = Math.min(path1.length, path2.length);
+              for (int i = 0; i < minLength; i++) {
+                String pathSegment1 = path1[i];
+                String pathSegment2 = path2[i];
+                int compare = pathSegment1.compareTo(pathSegment2);
+                if (compare != 0) {
+                  return compare;
+                }
+              }
+              if (path1.length == path2.length) {
+                return 0;
+              } else if (minLength == path1.length) {
+                return -1;
+              } else {
+                return 1;
+              }
+            };
+
+        // Sort those cursors which have no firstReferenceValue at the bottom of the list
+        CURSOR_REFERENCE_VALUE_COMPARATOR =
+            Comparator.comparing(
+                firstReferenceValue,
+                (o1, o2) -> {
+                  if (o1.isPresent() && o2.isPresent()) {
+                    return pathWiseCompare.compare(
+                        stringToPath.apply(o1.get().getReferenceValue()),
+                        stringToPath.apply(o2.get().getReferenceValue()));
+                  } else if (o1.isPresent()) {
+                    return -1;
+                  } else {
+                    return 1;
+                  }
+                });
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        PartitionQueryPair pair = c.element();

Review comment:
       It seems this logic isn't tested, just the comparator itself. Can you add one that calls `runFunction`?

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,632 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      PartitionQueryResponse.Builder aggregate = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        try {
+          PartitionQueryRequest request = setPageToken(element, aggregate);
+          attempt.recordRequestStart(clock.instant());
+          PartitionQueryPagedResponse pagedResponse =
+              firestoreStub.partitionQueryPagedCallable().call(request);
+          for (PartitionQueryPage page : pagedResponse.iteratePages()) {
+            attempt.recordRequestSuccessful(clock.instant());
+            PartitionQueryResponse response = page.getResponse();
+            if (aggregate == null) {
+              aggregate = response.toBuilder();
+            } else {
+              aggregate.addAllPartitions(response.getPartitionsList());
+              if (page.hasNextPage()) {
+                aggregate.setNextPageToken(response.getNextPageToken());
+              } else {
+                aggregate.clearNextPageToken();
+              }
+            }
+            if (page.hasNextPage()) {
+              attempt.recordRequestStart(clock.instant());
+            }
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+      if (aggregate != null) {
+        context.output(new PartitionQueryPair(element, aggregate.build()));
+      }
+    }
+
+    private PartitionQueryRequest setPageToken(
+        PartitionQueryRequest request,
+        @edu.umd.cs.findbugs.annotations.Nullable PartitionQueryResponse.Builder aggregate) {
+      if (aggregate != null && aggregate.getNextPageToken() != null) {
+        return request.toBuilder().setPageToken(aggregate.getNextPageToken()).build();
+      }
+      return request;
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListDocumentsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListDocumentsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListDocumentsRequest,
+          ListDocumentsPagedResponse,
+          ListDocumentsPage,
+          ListDocumentsResponse> {
+
+    ListDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListDocuments;
+    }
+
+    @Override
+    protected UnaryCallable<ListDocumentsRequest, ListDocumentsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listDocumentsPagedCallable();
+    }
+
+    @Override
+    protected ListDocumentsRequest setPageToken(
+        ListDocumentsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListCollectionIdsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListCollectionIdsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListCollectionIdsRequest,
+          ListCollectionIdsPagedResponse,
+          ListCollectionIdsPage,
+          ListCollectionIdsResponse> {
+
+    ListCollectionIdsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListCollectionIds;
+    }
+
+    @Override
+    protected UnaryCallable<ListCollectionIdsRequest, ListCollectionIdsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listCollectionIdsPagedCallable();
+    }
+
+    @Override
+    protected ListCollectionIdsRequest setPageToken(
+        ListCollectionIdsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link BatchGetDocumentsRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class BatchGetDocumentsFn
+      extends StreamingFirestoreV1ReadFn<BatchGetDocumentsRequest, BatchGetDocumentsResponse> {
+
+    BatchGetDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.BatchGetDocuments;
+    }
+
+    @Override
+    protected ServerStreamingCallable<BatchGetDocumentsRequest, BatchGetDocumentsResponse>
+        getCallable(FirestoreStub firestoreStub) {
+      return firestoreStub.batchGetDocumentsCallable();
+    }
+
+    @Override
+    protected BatchGetDocumentsRequest setStartFrom(
+        BatchGetDocumentsRequest originalRequest, BatchGetDocumentsResponse mostRecentResponse) {
+      int startIndex = -1;
+      ProtocolStringList documentsList = originalRequest.getDocumentsList();
+      String missing = mostRecentResponse.getMissing();
+      String foundName =
+          mostRecentResponse.hasFound() ? mostRecentResponse.getFound().getName() : null;
+      // we only scan until the second to last originalRequest. If the final element were to be
+      // reached
+      // the full request would be complete and we wouldn't be in this scenario
+      int maxIndex = documentsList.size() - 2;
+      for (int i = 0; i <= maxIndex; i++) {
+        String docName = documentsList.get(i);
+        if (docName.equals(missing) || docName.equals(foundName)) {
+          startIndex = i;
+          break;
+        }
+      }
+      if (0 <= startIndex) {
+        BatchGetDocumentsRequest.Builder builder = originalRequest.toBuilder().clearDocuments();
+        documentsList.stream()
+            .skip(startIndex + 1) // start from the next entry from the one we found
+            .forEach(builder::addDocuments);
+        return builder.build();
+      }
+      // unable to find a match, return the original request
+      return originalRequest;

Review comment:
       Yes, that sounds good.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r666439356



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       @nehsyc do you have any concerns here related to Dataflow autoscaling ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] danthev commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
danthev commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r655626971



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {
+      PCollection<RunQueryRequest> queries =
+          input
+              .apply(
+                  "PartitionQuery",
+                  ParDo.of(
+                      new PartitionQueryFn(
+                          clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+              .apply("expand queries", ParDo.of(new PartitionQueryResponseToRunQueryRequest()));
+      if (nameOnlyQuery) {
+        queries =
+            queries.apply(
+                "set name only query",
+                MapElements.via(
+                    new SimpleFunction<RunQueryRequest, RunQueryRequest>() {
+                      @Override
+                      public RunQueryRequest apply(RunQueryRequest input) {
+                        RunQueryRequest.Builder builder = input.toBuilder();
+                        builder
+                            .getStructuredQueryBuilder()
+                            .setSelect(
+                                Projection.newBuilder()
+                                    .addFields(
+                                        FieldReference.newBuilder()
+                                            .setFieldPath("__name__")
+                                            .build())
+                                    .build());
+                        return builder.build();
+                      }
+                    }));
+      }
+      return queries
+          .apply(Reshuffle.viaRandomKey())
+          .apply(new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions));
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+    }
+
+    /**
+     * A type safe builder for {@link PartitionQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#partitionQuery()
+     * @see FirestoreV1.PartitionQuery
+     * @see PartitionQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<PartitionQueryRequest>,
+            PCollection<RunQueryResponse>,
+            PartitionQuery,
+            FirestoreV1.PartitionQuery.Builder> {
+
+      private boolean nameOnlyQuery = false;
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions,
+          boolean nameOnlyQuery) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+        this.nameOnlyQuery = nameOnlyQuery;
+      }
+
+      @Override
+      public PartitionQuery build() {
+        return genericBuild();
+      }
+
+      /**
+       * Update produced queries to only retrieve their {@code __name__} thereby not retrieving any
+       * fields and reducing resource requirements.
+       *
+       * @return this builder
+       */
+      public Builder withNameOnlyQuery() {
+        this.nameOnlyQuery = true;
+        return this;
+      }
+
+      @Override
+      PartitionQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new PartitionQuery(
+            clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+      }
+    }
+
+    /**
+     * DoFn which contains the logic necessary to turn a {@link PartitionQueryRequest} and {@link
+     * PartitionQueryResponse} pair into {@code N} {@link RunQueryRequest}.
+     */
+    static final class PartitionQueryResponseToRunQueryRequest
+        extends DoFn<PartitionQueryPair, RunQueryRequest> {
+
+      /**
+       * When fetching cursors that span multiple pages it is expected (per <a
+       * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">
+       * PartitionQueryRequest.page_token</a>) for the client to sort the cursors before processing
+       * them to define the sub-queries. So here we're defining a Comparator which will sort Cursors
+       * by the first reference value present, then comparing the reference values
+       * lexicographically.
+       */
+      static final Comparator<Cursor> CURSOR_REFERENCE_VALUE_COMPARATOR;
+
+      static {
+        Function<Cursor, Optional<Value>> firstReferenceValue =
+            (Cursor c) ->
+                c.getValuesList().stream()
+                    .filter(
+                        v -> {
+                          String referenceValue = v.getReferenceValue();
+                          return referenceValue != null && !referenceValue.isEmpty();
+                        })
+                    .findFirst();
+        Function<String, String[]> stringToPath = (String s) -> s.split("/");
+        // compare references by their path segments rather than as a whole string to ensure
+        // per path segment comparison is taken into account.
+        Comparator<String[]> pathWiseCompare =
+            (String[] path1, String[] path2) -> {
+              int minLength = Math.min(path1.length, path2.length);
+              for (int i = 0; i < minLength; i++) {
+                String pathSegment1 = path1[i];
+                String pathSegment2 = path2[i];
+                int compare = pathSegment1.compareTo(pathSegment2);
+                if (compare != 0) {
+                  return compare;
+                }
+              }
+              if (path1.length == path2.length) {
+                return 0;
+              } else if (minLength == path1.length) {
+                return -1;
+              } else {
+                return 1;
+              }
+            };
+
+        // Sort those cursors which have no firstReferenceValue at the bottom of the list
+        CURSOR_REFERENCE_VALUE_COMPARATOR =
+            Comparator.comparing(
+                firstReferenceValue,
+                (o1, o2) -> {
+                  if (o1.isPresent() && o2.isPresent()) {
+                    return pathWiseCompare.compare(
+                        stringToPath.apply(o1.get().getReferenceValue()),
+                        stringToPath.apply(o2.get().getReferenceValue()));
+                  } else if (o1.isPresent()) {
+                    return -1;
+                  } else {
+                    return 1;
+                  }
+                });
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        PartitionQueryPair pair = c.element();

Review comment:
       It seems this logic isn't tested, just the comparator itself. Can you add one that calls `runFunction`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r655741368



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {
+      PCollection<RunQueryRequest> queries =
+          input
+              .apply(
+                  "PartitionQuery",
+                  ParDo.of(
+                      new PartitionQueryFn(
+                          clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+              .apply("expand queries", ParDo.of(new PartitionQueryResponseToRunQueryRequest()));
+      if (nameOnlyQuery) {
+        queries =
+            queries.apply(
+                "set name only query",
+                MapElements.via(
+                    new SimpleFunction<RunQueryRequest, RunQueryRequest>() {
+                      @Override
+                      public RunQueryRequest apply(RunQueryRequest input) {
+                        RunQueryRequest.Builder builder = input.toBuilder();
+                        builder
+                            .getStructuredQueryBuilder()
+                            .setSelect(
+                                Projection.newBuilder()
+                                    .addFields(
+                                        FieldReference.newBuilder()
+                                            .setFieldPath("__name__")
+                                            .build())
+                                    .build());
+                        return builder.build();
+                      }
+                    }));
+      }
+      return queries
+          .apply(Reshuffle.viaRandomKey())
+          .apply(new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions));
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+    }
+
+    /**
+     * A type safe builder for {@link PartitionQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#partitionQuery()
+     * @see FirestoreV1.PartitionQuery
+     * @see PartitionQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<PartitionQueryRequest>,
+            PCollection<RunQueryResponse>,
+            PartitionQuery,
+            FirestoreV1.PartitionQuery.Builder> {
+
+      private boolean nameOnlyQuery = false;
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions,
+          boolean nameOnlyQuery) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+        this.nameOnlyQuery = nameOnlyQuery;
+      }
+
+      @Override
+      public PartitionQuery build() {
+        return genericBuild();
+      }
+
+      /**
+       * Update produced queries to only retrieve their {@code __name__} thereby not retrieving any
+       * fields and reducing resource requirements.
+       *
+       * @return this builder
+       */
+      public Builder withNameOnlyQuery() {
+        this.nameOnlyQuery = true;
+        return this;
+      }
+
+      @Override
+      PartitionQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new PartitionQuery(
+            clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+      }
+    }
+
+    /**
+     * DoFn which contains the logic necessary to turn a {@link PartitionQueryRequest} and {@link
+     * PartitionQueryResponse} pair into {@code N} {@link RunQueryRequest}.
+     */
+    static final class PartitionQueryResponseToRunQueryRequest
+        extends DoFn<PartitionQueryPair, RunQueryRequest> {
+
+      /**
+       * When fetching cursors that span multiple pages it is expected (per <a
+       * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">
+       * PartitionQueryRequest.page_token</a>) for the client to sort the cursors before processing
+       * them to define the sub-queries. So here we're defining a Comparator which will sort Cursors
+       * by the first reference value present, then comparing the reference values
+       * lexicographically.
+       */
+      static final Comparator<Cursor> CURSOR_REFERENCE_VALUE_COMPARATOR;
+
+      static {
+        Function<Cursor, Optional<Value>> firstReferenceValue =
+            (Cursor c) ->
+                c.getValuesList().stream()
+                    .filter(
+                        v -> {
+                          String referenceValue = v.getReferenceValue();
+                          return referenceValue != null && !referenceValue.isEmpty();
+                        })
+                    .findFirst();
+        Function<String, String[]> stringToPath = (String s) -> s.split("/");
+        // compare references by their path segments rather than as a whole string to ensure
+        // per path segment comparison is taken into account.
+        Comparator<String[]> pathWiseCompare =
+            (String[] path1, String[] path2) -> {
+              int minLength = Math.min(path1.length, path2.length);
+              for (int i = 0; i < minLength; i++) {
+                String pathSegment1 = path1[i];
+                String pathSegment2 = path2[i];
+                int compare = pathSegment1.compareTo(pathSegment2);
+                if (compare != 0) {
+                  return compare;
+                }
+              }
+              if (path1.length == path2.length) {
+                return 0;
+              } else if (minLength == path1.length) {
+                return -1;
+              } else {
+                return 1;
+              }
+            };
+
+        // Sort those cursors which have no firstReferenceValue at the bottom of the list
+        CURSOR_REFERENCE_VALUE_COMPARATOR =
+            Comparator.comparing(
+                firstReferenceValue,
+                (o1, o2) -> {
+                  if (o1.isPresent() && o2.isPresent()) {
+                    return pathWiseCompare.compare(
+                        stringToPath.apply(o1.get().getReferenceValue()),
+                        stringToPath.apply(o2.get().getReferenceValue()));
+                  } else if (o1.isPresent()) {
+                    return -1;
+                  } else {
+                    return 1;
+                  }
+                });
+      }
+
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        PartitionQueryPair pair = c.element();

Review comment:
       Unit test added in addition to existing integration test.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r663224976



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreDoFn.java
##########
@@ -46,6 +46,22 @@
   @StartBundle
   public abstract void startBundle(DoFn<InT, OutT>.StartBundleContext context) throws Exception;
 
+  abstract static class NonWindowAwareDoFn<InT, OutT> extends FirestoreDoFn<InT, OutT> {

Review comment:
       Please add a comment describing why we we need this class.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {

Review comment:
       Agree. I would have expected PartitionQuery to be a utility that help Beam read transform to parallelize better instead being a part of the public API.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreDoFn.java
##########
@@ -46,6 +46,22 @@
   @StartBundle
   public abstract void startBundle(DoFn<InT, OutT>.StartBundleContext context) throws Exception;
 
+  abstract static class NonWindowAwareDoFn<InT, OutT> extends FirestoreDoFn<InT, OutT> {
+    /**
+     * {@link ProcessContext#element() context.element()} must be non-null, otherwise a
+     * NullPointerException will be thrown.

Review comment:
       I'm not sure if this means whether Firestore Read does not support Windowing or not but please note that Windowing is a key feature of Beam and all sources are expected to support that.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, becuase it has the
+ * ability to parallelize execution of multiple queries for specific sub-ranges of the full results.
+ *
+ * <p>You should only ever use ListDocuments if the use of <a target="_blank" rel="noopener
+ * noreferrer"
+ * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">{@code
+ * show_missing}</a> is needed to access a document. RunQuery and PartitionQuery will always be
+ * faster if the use of {@code show_missing} is not needed.
+ *
+ * <p><b>Example Usage</b>
+ *
+ * <pre>{@code
+ * PCollection<PartitionQueryRequest> partitionQueryRequests = ...;

Review comment:
       Probably add short descriptions to each of these request/response types (and links for further details).

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       I would also try to reduce the the implementation to one or few SDF-based sources to make implementation simpler.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {
+      PCollection<RunQueryRequest> queries =
+          input
+              .apply(
+                  "PartitionQuery",
+                  ParDo.of(
+                      new PartitionQueryFn(
+                          clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+              .apply("expand queries", ParDo.of(new PartitionQueryResponseToRunQueryRequest()));
+      if (nameOnlyQuery) {
+        queries =
+            queries.apply(
+                "set name only query",
+                MapElements.via(
+                    new SimpleFunction<RunQueryRequest, RunQueryRequest>() {
+                      @Override
+                      public RunQueryRequest apply(RunQueryRequest input) {
+                        RunQueryRequest.Builder builder = input.toBuilder();
+                        builder
+                            .getStructuredQueryBuilder()
+                            .setSelect(
+                                Projection.newBuilder()
+                                    .addFields(
+                                        FieldReference.newBuilder()
+                                            .setFieldPath("__name__")
+                                            .build())
+                                    .build());
+                        return builder.build();
+                      }
+                    }));
+      }
+      return queries
+          .apply(Reshuffle.viaRandomKey())
+          .apply(new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions));
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+    }
+
+    /**
+     * A type safe builder for {@link PartitionQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#partitionQuery()
+     * @see FirestoreV1.PartitionQuery
+     * @see PartitionQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<PartitionQueryRequest>,
+            PCollection<RunQueryResponse>,
+            PartitionQuery,
+            FirestoreV1.PartitionQuery.Builder> {
+
+      private boolean nameOnlyQuery = false;
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions,
+          boolean nameOnlyQuery) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+        this.nameOnlyQuery = nameOnlyQuery;
+      }
+
+      @Override
+      public PartitionQuery build() {
+        return genericBuild();
+      }
+
+      /**
+       * Update produced queries to only retrieve their {@code __name__} thereby not retrieving any
+       * fields and reducing resource requirements.
+       *
+       * @return this builder
+       */
+      public Builder withNameOnlyQuery() {
+        this.nameOnlyQuery = true;
+        return this;
+      }
+
+      @Override
+      PartitionQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new PartitionQuery(
+            clock, firestoreStatefulComponentFactory, rpcQosOptions, nameOnlyQuery);
+      }
+    }
+
+    /**
+     * DoFn which contains the logic necessary to turn a {@link PartitionQueryRequest} and {@link
+     * PartitionQueryResponse} pair into {@code N} {@link RunQueryRequest}.
+     */
+    static final class PartitionQueryResponseToRunQueryRequest
+        extends DoFn<PartitionQueryPair, RunQueryRequest> {
+
+      /**
+       * When fetching cursors that span multiple pages it is expected (per <a
+       * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">
+       * PartitionQueryRequest.page_token</a>) for the client to sort the cursors before processing
+       * them to define the sub-queries. So here we're defining a Comparator which will sort Cursors

Review comment:
       Can such a order change while a Beam pipeline is reading a given dataset ?

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {

Review comment:
       Usually "*Fn" notation is used to name DoFn implementations.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());

Review comment:
       Is this a global lock ? How would this adapt when Beam parallelize reading across workers ?

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       Beam sources help Beam pipeline parallelize. I think we have to update following source Fn classes to use SDF to parallelize better. This help support features such as dynamic work rebalancing. Without proper parallelization Beam pipelines that use Firestore source could run into stragglers (which is an issue many Dataflow customers run into without dynamic work rebalancing is not available).
   See here for more details on SDF: https://beam.apache.org/documentation/programming-guide/#splittable-dofns

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, becuase it has the
+ * ability to parallelize execution of multiple queries for specific sub-ranges of the full results.
+ *
+ * <p>You should only ever use ListDocuments if the use of <a target="_blank" rel="noopener
+ * noreferrer"
+ * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">{@code
+ * show_missing}</a> is needed to access a document. RunQuery and PartitionQuery will always be
+ * faster if the use of {@code show_missing} is not needed.
+ *
+ * <p><b>Example Usage</b>
+ *
+ * <pre>{@code
+ * PCollection<PartitionQueryRequest> partitionQueryRequests = ...;
+ * PCollection<RunQueryResponse> partitionQueryResponses = partitionQueryRequests
+ *     .apply(FirestoreIO.v1().read().partitionQuery().build());
+ * }</pre>
+ *
+ * <pre>{@code
+ * PCollection<RunQueryRequest> runQueryRequests = ...;

Review comment:
       Do you think all these types of PCollections should be in the public API ? Are end users expected to use all of these ?
   
   I'm wondering if we can somehow simplify the public API by allowing users to get a certain (root) type of PCollection and providing utility functions to convert from there.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r668896523



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       Done, and added as TODOs in on the class
   * BEAM-12605 Add dynamic work rebalancing to Firestore Query Source
   * BEAM-12606 Add support for progress reporting to Firestore Query Source




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] danthev commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
danthev commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r652221313



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,632 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      PartitionQueryResponse.Builder aggregate = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        try {
+          PartitionQueryRequest request = setPageToken(element, aggregate);
+          attempt.recordRequestStart(clock.instant());
+          PartitionQueryPagedResponse pagedResponse =
+              firestoreStub.partitionQueryPagedCallable().call(request);
+          for (PartitionQueryPage page : pagedResponse.iteratePages()) {
+            attempt.recordRequestSuccessful(clock.instant());
+            PartitionQueryResponse response = page.getResponse();
+            if (aggregate == null) {
+              aggregate = response.toBuilder();
+            } else {
+              aggregate.addAllPartitions(response.getPartitionsList());
+              if (page.hasNextPage()) {
+                aggregate.setNextPageToken(response.getNextPageToken());
+              } else {
+                aggregate.clearNextPageToken();
+              }
+            }
+            if (page.hasNextPage()) {
+              attempt.recordRequestStart(clock.instant());
+            }
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+      if (aggregate != null) {
+        context.output(new PartitionQueryPair(element, aggregate.build()));
+      }
+    }
+
+    private PartitionQueryRequest setPageToken(
+        PartitionQueryRequest request,
+        @edu.umd.cs.findbugs.annotations.Nullable PartitionQueryResponse.Builder aggregate) {
+      if (aggregate != null && aggregate.getNextPageToken() != null) {
+        return request.toBuilder().setPageToken(aggregate.getNextPageToken()).build();
+      }
+      return request;
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListDocumentsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListDocumentsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListDocumentsRequest,
+          ListDocumentsPagedResponse,
+          ListDocumentsPage,
+          ListDocumentsResponse> {
+
+    ListDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListDocuments;
+    }
+
+    @Override
+    protected UnaryCallable<ListDocumentsRequest, ListDocumentsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listDocumentsPagedCallable();
+    }
+
+    @Override
+    protected ListDocumentsRequest setPageToken(
+        ListDocumentsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListCollectionIdsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListCollectionIdsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListCollectionIdsRequest,
+          ListCollectionIdsPagedResponse,
+          ListCollectionIdsPage,
+          ListCollectionIdsResponse> {
+
+    ListCollectionIdsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListCollectionIds;
+    }
+
+    @Override
+    protected UnaryCallable<ListCollectionIdsRequest, ListCollectionIdsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listCollectionIdsPagedCallable();
+    }
+
+    @Override
+    protected ListCollectionIdsRequest setPageToken(
+        ListCollectionIdsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link BatchGetDocumentsRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class BatchGetDocumentsFn
+      extends StreamingFirestoreV1ReadFn<BatchGetDocumentsRequest, BatchGetDocumentsResponse> {
+
+    BatchGetDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.BatchGetDocuments;
+    }
+
+    @Override
+    protected ServerStreamingCallable<BatchGetDocumentsRequest, BatchGetDocumentsResponse>
+        getCallable(FirestoreStub firestoreStub) {
+      return firestoreStub.batchGetDocumentsCallable();
+    }
+
+    @Override
+    protected BatchGetDocumentsRequest setStartFrom(
+        BatchGetDocumentsRequest originalRequest, BatchGetDocumentsResponse mostRecentResponse) {
+      int startIndex = -1;
+      ProtocolStringList documentsList = originalRequest.getDocumentsList();
+      String missing = mostRecentResponse.getMissing();
+      String foundName =
+          mostRecentResponse.hasFound() ? mostRecentResponse.getFound().getName() : null;
+      // we only scan until the second to last originalRequest. If the final element were to be
+      // reached
+      // the full request would be complete and we wouldn't be in this scenario
+      int maxIndex = documentsList.size() - 2;
+      for (int i = 0; i <= maxIndex; i++) {

Review comment:
       I don't know much about what the response will look like. But given that you skip everything up to `startIndex`, it seems safe to assume this is always sequential? If so, can you advance an object-level counter instead of iterating over all document names every time?

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -700,6 +723,142 @@ public long nextBackOffMillis() {
     }
   }
 
+  /**
+   * This class implements a backoff algorithm similar to that of {@link
+   * org.apache.beam.sdk.util.FluentBackoff} with a could key differences:

Review comment:
       Typo

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -700,6 +723,142 @@ public long nextBackOffMillis() {
     }
   }
 
+  /**
+   * This class implements a backoff algorithm similar to that of {@link
+   * org.apache.beam.sdk.util.FluentBackoff} with a could key differences:
+   *
+   * <ol>
+   *   <li>A set of status code numbers may be specified to have a graceful evaluation
+   *   <li>Gracefully evaluated status code numbers will increment a decaying counter to ensure if
+   *       the graceful status code numbers occur more than once in the previous 60 seconds the
+   *       regular backoff behavior will kick in.
+   *   <li>The random number generator used to induce jitter is provided via constructor parameter
+   *       rather than using {@link Math#random()}}
+   * </ol>
+   *
+   * The primary motivation for creating this implementation is to support streamed responses from
+   * Firestore. In the case of RunQuery and BatchGet the results are returned via stream. The result
+   * stream has a maximum lifetime of 60 seconds before it will be broken and an UNAVAILABLE status
+   * code will be raised. Give this UNAVAILABLE is expected for streams this class allows for
+   * defining a set of status code numbers which are give a grace count of 1 before backoff kicks
+   * in. When backoff does kick in, it is implemented using the same calculations as {@link
+   * org.apache.beam.sdk.util.FluentBackoff}.
+   */
+  static final class StatusCodeAwareBackoff {
+    private static final double RANDOMIZATION_FACTOR = 0.5;

Review comment:
       This seems like a lot, does `FluentBackoff` have that much fuzziness as well?

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,632 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      PartitionQueryResponse.Builder aggregate = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        try {
+          PartitionQueryRequest request = setPageToken(element, aggregate);
+          attempt.recordRequestStart(clock.instant());
+          PartitionQueryPagedResponse pagedResponse =
+              firestoreStub.partitionQueryPagedCallable().call(request);
+          for (PartitionQueryPage page : pagedResponse.iteratePages()) {
+            attempt.recordRequestSuccessful(clock.instant());
+            PartitionQueryResponse response = page.getResponse();
+            if (aggregate == null) {
+              aggregate = response.toBuilder();
+            } else {
+              aggregate.addAllPartitions(response.getPartitionsList());
+              if (page.hasNextPage()) {
+                aggregate.setNextPageToken(response.getNextPageToken());
+              } else {
+                aggregate.clearNextPageToken();
+              }
+            }
+            if (page.hasNextPage()) {
+              attempt.recordRequestStart(clock.instant());
+            }
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+      if (aggregate != null) {
+        context.output(new PartitionQueryPair(element, aggregate.build()));
+      }
+    }
+
+    private PartitionQueryRequest setPageToken(
+        PartitionQueryRequest request,
+        @edu.umd.cs.findbugs.annotations.Nullable PartitionQueryResponse.Builder aggregate) {
+      if (aggregate != null && aggregate.getNextPageToken() != null) {
+        return request.toBuilder().setPageToken(aggregate.getNextPageToken()).build();
+      }
+      return request;
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListDocumentsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListDocumentsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListDocumentsRequest,
+          ListDocumentsPagedResponse,
+          ListDocumentsPage,
+          ListDocumentsResponse> {
+
+    ListDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListDocuments;
+    }
+
+    @Override
+    protected UnaryCallable<ListDocumentsRequest, ListDocumentsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listDocumentsPagedCallable();
+    }
+
+    @Override
+    protected ListDocumentsRequest setPageToken(
+        ListDocumentsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListCollectionIdsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListCollectionIdsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListCollectionIdsRequest,
+          ListCollectionIdsPagedResponse,
+          ListCollectionIdsPage,
+          ListCollectionIdsResponse> {
+
+    ListCollectionIdsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListCollectionIds;
+    }
+
+    @Override
+    protected UnaryCallable<ListCollectionIdsRequest, ListCollectionIdsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listCollectionIdsPagedCallable();
+    }
+
+    @Override
+    protected ListCollectionIdsRequest setPageToken(
+        ListCollectionIdsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link BatchGetDocumentsRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class BatchGetDocumentsFn
+      extends StreamingFirestoreV1ReadFn<BatchGetDocumentsRequest, BatchGetDocumentsResponse> {
+
+    BatchGetDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.BatchGetDocuments;
+    }
+
+    @Override
+    protected ServerStreamingCallable<BatchGetDocumentsRequest, BatchGetDocumentsResponse>
+        getCallable(FirestoreStub firestoreStub) {
+      return firestoreStub.batchGetDocumentsCallable();
+    }
+
+    @Override
+    protected BatchGetDocumentsRequest setStartFrom(
+        BatchGetDocumentsRequest originalRequest, BatchGetDocumentsResponse mostRecentResponse) {
+      int startIndex = -1;
+      ProtocolStringList documentsList = originalRequest.getDocumentsList();
+      String missing = mostRecentResponse.getMissing();
+      String foundName =
+          mostRecentResponse.hasFound() ? mostRecentResponse.getFound().getName() : null;
+      // we only scan until the second to last originalRequest. If the final element were to be
+      // reached
+      // the full request would be complete and we wouldn't be in this scenario
+      int maxIndex = documentsList.size() - 2;
+      for (int i = 0; i <= maxIndex; i++) {
+        String docName = documentsList.get(i);
+        if (docName.equals(missing) || docName.equals(foundName)) {
+          startIndex = i;
+          break;
+        }
+      }
+      if (0 <= startIndex) {
+        BatchGetDocumentsRequest.Builder builder = originalRequest.toBuilder().clearDocuments();
+        documentsList.stream()
+            .skip(startIndex + 1) // start from the next entry from the one we found
+            .forEach(builder::addDocuments);
+        return builder.build();
+      }
+      // unable to find a match, return the original request
+      return originalRequest;

Review comment:
       Will this only ever happen on the first response with the transaction field set? Maybe you can add a comment here to make this clearer. If there's any other way this could fail I prefer an exception.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,632 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      PartitionQueryResponse.Builder aggregate = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        try {
+          PartitionQueryRequest request = setPageToken(element, aggregate);
+          attempt.recordRequestStart(clock.instant());
+          PartitionQueryPagedResponse pagedResponse =
+              firestoreStub.partitionQueryPagedCallable().call(request);
+          for (PartitionQueryPage page : pagedResponse.iteratePages()) {
+            attempt.recordRequestSuccessful(clock.instant());
+            PartitionQueryResponse response = page.getResponse();
+            if (aggregate == null) {
+              aggregate = response.toBuilder();
+            } else {
+              aggregate.addAllPartitions(response.getPartitionsList());
+              if (page.hasNextPage()) {
+                aggregate.setNextPageToken(response.getNextPageToken());
+              } else {
+                aggregate.clearNextPageToken();
+              }
+            }
+            if (page.hasNextPage()) {
+              attempt.recordRequestStart(clock.instant());
+            }
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+      if (aggregate != null) {
+        context.output(new PartitionQueryPair(element, aggregate.build()));
+      }
+    }
+
+    private PartitionQueryRequest setPageToken(
+        PartitionQueryRequest request,
+        @edu.umd.cs.findbugs.annotations.Nullable PartitionQueryResponse.Builder aggregate) {

Review comment:
       Wrong import

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,632 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      PartitionQueryResponse.Builder aggregate = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        try {
+          PartitionQueryRequest request = setPageToken(element, aggregate);
+          attempt.recordRequestStart(clock.instant());
+          PartitionQueryPagedResponse pagedResponse =
+              firestoreStub.partitionQueryPagedCallable().call(request);
+          for (PartitionQueryPage page : pagedResponse.iteratePages()) {
+            attempt.recordRequestSuccessful(clock.instant());
+            PartitionQueryResponse response = page.getResponse();
+            if (aggregate == null) {
+              aggregate = response.toBuilder();
+            } else {
+              aggregate.addAllPartitions(response.getPartitionsList());
+              if (page.hasNextPage()) {
+                aggregate.setNextPageToken(response.getNextPageToken());
+              } else {
+                aggregate.clearNextPageToken();
+              }
+            }
+            if (page.hasNextPage()) {
+              attempt.recordRequestStart(clock.instant());
+            }
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+      if (aggregate != null) {
+        context.output(new PartitionQueryPair(element, aggregate.build()));
+      }
+    }
+
+    private PartitionQueryRequest setPageToken(
+        PartitionQueryRequest request,
+        @edu.umd.cs.findbugs.annotations.Nullable PartitionQueryResponse.Builder aggregate) {
+      if (aggregate != null && aggregate.getNextPageToken() != null) {
+        return request.toBuilder().setPageToken(aggregate.getNextPageToken()).build();
+      }
+      return request;
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListDocumentsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListDocumentsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListDocumentsRequest,
+          ListDocumentsPagedResponse,
+          ListDocumentsPage,
+          ListDocumentsResponse> {
+
+    ListDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListDocuments;
+    }
+
+    @Override
+    protected UnaryCallable<ListDocumentsRequest, ListDocumentsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listDocumentsPagedCallable();
+    }
+
+    @Override
+    protected ListDocumentsRequest setPageToken(
+        ListDocumentsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link ListCollectionIdsRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, the response from each page will be output to
+   * the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class ListCollectionIdsFn
+      extends PaginatedFirestoreV1ReadFn<
+          ListCollectionIdsRequest,
+          ListCollectionIdsPagedResponse,
+          ListCollectionIdsPage,
+          ListCollectionIdsResponse> {
+
+    ListCollectionIdsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.ListCollectionIds;
+    }
+
+    @Override
+    protected UnaryCallable<ListCollectionIdsRequest, ListCollectionIdsPagedResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.listCollectionIdsPagedCallable();
+    }
+
+    @Override
+    protected ListCollectionIdsRequest setPageToken(
+        ListCollectionIdsRequest request, String nextPageToken) {
+      return request.toBuilder().setPageToken(nextPageToken).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link BatchGetDocumentsRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class BatchGetDocumentsFn
+      extends StreamingFirestoreV1ReadFn<BatchGetDocumentsRequest, BatchGetDocumentsResponse> {
+
+    BatchGetDocumentsFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.BatchGetDocuments;
+    }
+
+    @Override
+    protected ServerStreamingCallable<BatchGetDocumentsRequest, BatchGetDocumentsResponse>
+        getCallable(FirestoreStub firestoreStub) {
+      return firestoreStub.batchGetDocumentsCallable();
+    }
+
+    @Override
+    protected BatchGetDocumentsRequest setStartFrom(
+        BatchGetDocumentsRequest originalRequest, BatchGetDocumentsResponse mostRecentResponse) {
+      int startIndex = -1;
+      ProtocolStringList documentsList = originalRequest.getDocumentsList();
+      String missing = mostRecentResponse.getMissing();
+      String foundName =
+          mostRecentResponse.hasFound() ? mostRecentResponse.getFound().getName() : null;
+      // we only scan until the second to last originalRequest. If the final element were to be
+      // reached
+      // the full request would be complete and we wouldn't be in this scenario
+      int maxIndex = documentsList.size() - 2;
+      for (int i = 0; i <= maxIndex; i++) {
+        String docName = documentsList.get(i);
+        if (docName.equals(missing) || docName.equals(foundName)) {
+          startIndex = i;
+          break;
+        }
+      }
+      if (0 <= startIndex) {
+        BatchGetDocumentsRequest.Builder builder = originalRequest.toBuilder().clearDocuments();
+        documentsList.stream()
+            .skip(startIndex + 1) // start from the next entry from the one we found
+            .forEach(builder::addDocuments);
+        return builder.build();
+      }
+      // unable to find a match, return the original request
+      return originalRequest;
+    }
+  }
+
+  /**
+   * {@link DoFn} Providing support for a Read type RPC operation which uses a Stream rather than
+   * pagination. Each response from the stream will be output to the next stage of the pipeline.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   *
+   * @param <InT> Request type
+   * @param <OutT> Response type
+   */
+  private abstract static class StreamingFirestoreV1ReadFn<
+          InT extends Message, OutT extends Message>
+      extends BaseFirestoreV1ReadFn<InT, OutT> {
+
+    protected StreamingFirestoreV1ReadFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    protected abstract ServerStreamingCallable<InT, OutT> getCallable(FirestoreStub firestoreStub);
+
+    protected abstract InT setStartFrom(InT element, OutT out);
+
+    @Override
+    public final void processElement(ProcessContext c) throws Exception {
+      @SuppressWarnings(
+          "nullness") // for some reason requireNonNull thinks its parameter but be non-null...
+      final InT element = requireNonNull(c.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      OutT lastReceivedValue = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        Instant start = clock.instant();
+        try {
+          InT request =
+              lastReceivedValue == null ? element : setStartFrom(element, lastReceivedValue);
+          attempt.recordRequestStart(start);
+          ServerStream<OutT> serverStream = getCallable(firestoreStub).call(request);
+          attempt.recordRequestSuccessful(clock.instant());
+          for (OutT out : serverStream) {
+            lastReceivedValue = out;
+            attempt.recordStreamValue(clock.instant());
+            c.output(out);
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+    }
+  }
+
+  /**
+   * {@link DoFn} Providing support for a Read type RPC operation which uses pagination rather than
+   * a Stream.
+   *
+   * @param <RequestT> Request type
+   * @param <ResponseT> Response type
+   */
+  @SuppressWarnings({
+    // errorchecker doesn't like the second ? on PagedResponse, seemingly because of different
+    // recursion depth limits; 3 on the found vs 4 on the required.
+    // The second ? is the type of collection the paged response uses to hold all responses if
+    // trying to expand all pages to a single collection. We are emitting a single page at at time
+    // while tracking read progress so we can resume if an error has occurred and we still have
+    // attempt budget available.
+    "type.argument.type.incompatible"
+  })
+  private abstract static class PaginatedFirestoreV1ReadFn<
+          RequestT extends Message,
+          PagedResponseT extends AbstractPagedListResponse<RequestT, ResponseT, ?, PageT, ?>,
+          PageT extends AbstractPage<RequestT, ResponseT, ?, PageT>,
+          ResponseT extends Message>
+      extends BaseFirestoreV1ReadFn<RequestT, ResponseT> {
+
+    protected PaginatedFirestoreV1ReadFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    protected abstract UnaryCallable<RequestT, PagedResponseT> getCallable(
+        FirestoreStub firestoreStub);
+
+    protected abstract RequestT setPageToken(RequestT request, String nextPageToken);
+
+    @Override
+    public final void processElement(ProcessContext c) throws Exception {
+      @SuppressWarnings(
+          "nullness") // for some reason requireNonNull thinks its parameter but be non-null...
+      final RequestT element = requireNonNull(c.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());
+      String nextPageToken = null;
+      while (true) {
+        if (!attempt.awaitSafeToProceed(clock.instant())) {
+          continue;
+        }
+
+        try {
+          RequestT request = nextPageToken == null ? element : setPageToken(element, nextPageToken);
+          attempt.recordRequestStart(clock.instant());
+          PagedResponseT pagedResponse = getCallable(firestoreStub).call(request);
+          for (PageT page : pagedResponse.iteratePages()) {
+            ResponseT response = page.getResponse();
+            attempt.recordRequestSuccessful(clock.instant());
+            c.output(response);
+            if (page.hasNextPage()) {
+              nextPageToken = page.getNextPageToken();
+              attempt.recordRequestStart(clock.instant());
+            }
+          }
+          attempt.completeSuccess();
+          break;
+        } catch (RuntimeException exception) {
+          Instant end = clock.instant();
+          attempt.recordRequestFailed(end);
+          attempt.checkCanRetry(end, exception);
+        }
+      }
+    }
+  }
+
+  /**
+   * Base class for all {@link org.apache.beam.sdk.transforms.DoFn DoFn}s which provide access to
+   * RPCs from the Cloud Firestore V1 API.
+   *
+   * <p>This class takes care of common lifecycle elements and transient state management for
+   * subclasses allowing subclasses to provide the minimal implementation for {@link
+   * NonWindowAwareDoFn#processElement(DoFn.ProcessContext)}}
+   *
+   * @param <InT> The type of element coming into this {@link DoFn}
+   * @param <OutT> The type of element output from this {@link DoFn}
+   */
+  abstract static class BaseFirestoreV1ReadFn<InT, OutT> extends NonWindowAwareDoFn<InT, OutT>
+      implements HasRpcAttemptContext {
+
+    protected final JodaClock clock;
+    protected final FirestoreStatefulComponentFactory firestoreStatefulComponentFactory;
+    protected final RpcQosOptions rpcQosOptions;
+
+    // transient running state information, not important to any possible checkpointing
+    protected transient FirestoreStub firestoreStub;
+    protected transient RpcQos rpcQos;
+    protected transient String projectId;
+
+    @SuppressWarnings(
+        "initialization.fields.uninitialized") // allow transient fields to be managed by component
+    // lifecycle
+    protected BaseFirestoreV1ReadFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      this.clock = requireNonNull(clock, "clock must be non null");
+      this.firestoreStatefulComponentFactory =
+          requireNonNull(firestoreStatefulComponentFactory, "firestoreFactory must be non null");
+      this.rpcQosOptions = requireNonNull(rpcQosOptions, "rpcQosOptions must be non null");
+    }
+
+    /** {@inheritDoc} */
+    @Override
+    public void setup() {
+      rpcQos = firestoreStatefulComponentFactory.getRpcQos(rpcQosOptions);
+    }
+
+    /** {@inheritDoc} */
+    @Override
+    public final void startBundle(StartBundleContext c) {
+      String project = c.getPipelineOptions().as(GcpOptions.class).getProject();
+      projectId =
+          requireNonNull(project, "project must be defined on GcpOptions of PipelineOptions");
+      firestoreStub = firestoreStatefulComponentFactory.getFirestoreStub(c.getPipelineOptions());
+    }
+
+    /** {@inheritDoc} */
+    @SuppressWarnings("nullness") // allow clearing transient fields
+    @Override
+    public void finishBundle() throws Exception {
+      projectId = null;
+      firestoreStub.close();
+    }
+
+    /** {@inheritDoc} */
+    @Override
+    public final void populateDisplayData(
+        @edu.umd.cs.findbugs.annotations.NonNull DisplayData.Builder builder) {

Review comment:
       Wrong import




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r668061600



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn

Review comment:
       On both the read and write path, if the workers is throttling for some backoff it is reported via `Metrics.counter("throttlingMs").inc(throttleDuration.getMillis())`. In the case of read, the only time a throttle would would occur is if RPCs are failing and backoff kicks in. In the case of write, in addition or RPC failure backoff, a worker can be throttled as part of write ramp up, or if RPC are not failing, but writes within those RPCs are failing and the client side adaptive throttler kicks in to try and bring the error rate down.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreDoFn.java
##########
@@ -46,6 +46,22 @@
   @StartBundle
   public abstract void startBundle(DoFn<InT, OutT>.StartBundleContext context) throws Exception;
 
+  abstract static class NonWindowAwareDoFn<InT, OutT> extends FirestoreDoFn<InT, OutT> {

Review comment:
       Done, and I've renamed it to a more accurate name.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, becuase it has the
+ * ability to parallelize execution of multiple queries for specific sub-ranges of the full results.
+ *
+ * <p>You should only ever use ListDocuments if the use of <a target="_blank" rel="noopener
+ * noreferrer"
+ * href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">{@code
+ * show_missing}</a> is needed to access a document. RunQuery and PartitionQuery will always be
+ * faster if the use of {@code show_missing} is not needed.
+ *
+ * <p><b>Example Usage</b>
+ *
+ * <pre>{@code
+ * PCollection<PartitionQueryRequest> partitionQueryRequests = ...;

Review comment:
       I've updated the javadocs to use links instead of just text, and also pulled the code samples up into the table which outlines the behavior and links to the class with full rpc cross links.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r666366199



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1ReadFn.java
##########
@@ -0,0 +1,633 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.firestore;
+
+import static java.util.Objects.requireNonNull;
+
+import com.google.api.gax.paging.AbstractPage;
+import com.google.api.gax.paging.AbstractPagedListResponse;
+import com.google.api.gax.rpc.ServerStream;
+import com.google.api.gax.rpc.ServerStreamingCallable;
+import com.google.api.gax.rpc.UnaryCallable;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListCollectionIdsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPage;
+import com.google.cloud.firestore.v1.FirestoreClient.ListDocumentsPagedResponse;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPage;
+import com.google.cloud.firestore.v1.FirestoreClient.PartitionQueryPagedResponse;
+import com.google.cloud.firestore.v1.stub.FirestoreStub;
+import com.google.firestore.v1.BatchGetDocumentsRequest;
+import com.google.firestore.v1.BatchGetDocumentsResponse;
+import com.google.firestore.v1.Cursor;
+import com.google.firestore.v1.ListCollectionIdsRequest;
+import com.google.firestore.v1.ListCollectionIdsResponse;
+import com.google.firestore.v1.ListDocumentsRequest;
+import com.google.firestore.v1.ListDocumentsResponse;
+import com.google.firestore.v1.PartitionQueryRequest;
+import com.google.firestore.v1.PartitionQueryResponse;
+import com.google.firestore.v1.RunQueryRequest;
+import com.google.firestore.v1.RunQueryResponse;
+import com.google.firestore.v1.StructuredQuery;
+import com.google.firestore.v1.StructuredQuery.Direction;
+import com.google.firestore.v1.StructuredQuery.FieldReference;
+import com.google.firestore.v1.StructuredQuery.Order;
+import com.google.firestore.v1.Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.ProtocolStringList;
+import java.io.Serializable;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreDoFn.NonWindowAwareDoFn;
+import org.apache.beam.sdk.io.gcp.firestore.FirestoreV1Fn.HasRpcAttemptContext;
+import org.apache.beam.sdk.io.gcp.firestore.RpcQos.RpcAttempt.Context;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.compatqual.NullableDecl;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Instant;
+
+/**
+ * A collection of {@link org.apache.beam.sdk.transforms.DoFn DoFn}s for each of the supported read
+ * RPC methods from the Cloud Firestore V1 API.
+ */
+final class FirestoreV1ReadFn {
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link RunQueryRequest}s.
+   *
+   * <p>This Fn uses a stream to obtain responses, each response from the stream will be output to
+   * the next stage of the pipeline. Each response from the stream represents an individual document
+   * with the associated metadata.
+   *
+   * <p>If an error is encountered while reading from the stream, the stream will attempt to resume
+   * rather than starting over. The restarting of the stream will continue within the scope of the
+   * completion of the request (meaning any possibility of resumption is contingent upon an attempt
+   * being available in the Qos budget).
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class RunQueryFn
+      extends StreamingFirestoreV1ReadFn<RunQueryRequest, RunQueryResponse> {
+
+    RunQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.RunQuery;
+    }
+
+    @Override
+    protected ServerStreamingCallable<RunQueryRequest, RunQueryResponse> getCallable(
+        FirestoreStub firestoreStub) {
+      return firestoreStub.runQueryCallable();
+    }
+
+    @Override
+    protected RunQueryRequest setStartFrom(
+        RunQueryRequest element, RunQueryResponse runQueryResponse) {
+      StructuredQuery query = element.getStructuredQuery();
+      StructuredQuery.Builder builder;
+      List<Order> orderByList = query.getOrderByList();
+      // if the orderByList is empty that means the default sort of "__name__ ASC" will be used
+      // Before we can set the cursor to the last document name read, we need to explicitly add
+      // the order of "__name__ ASC" because a cursor value must map to an order by
+      if (orderByList.isEmpty()) {
+        builder =
+            query
+                .toBuilder()
+                .addOrderBy(
+                    Order.newBuilder()
+                        .setField(FieldReference.newBuilder().setFieldPath("__name__").build())
+                        .setDirection(Direction.ASCENDING)
+                        .build())
+                .setStartAt(
+                    Cursor.newBuilder()
+                        .setBefore(false)
+                        .addValues(
+                            Value.newBuilder()
+                                .setReferenceValue(runQueryResponse.getDocument().getName())
+                                .build()));
+      } else {
+        Cursor.Builder cursor = Cursor.newBuilder().setBefore(false);
+        Map<String, Value> fieldsMap = runQueryResponse.getDocument().getFieldsMap();
+        for (Order order : orderByList) {
+          String fieldPath = order.getField().getFieldPath();
+          Value value = fieldsMap.get(fieldPath);
+          if (value != null) {
+            cursor.addValues(value);
+          } else if ("__name__".equals(fieldPath)) {
+            cursor.addValues(
+                Value.newBuilder()
+                    .setReferenceValue(runQueryResponse.getDocument().getName())
+                    .build());
+          }
+        }
+        builder = query.toBuilder().setStartAt(cursor.build());
+      }
+      return element.toBuilder().setStructuredQuery(builder.build()).build();
+    }
+  }
+
+  /**
+   * {@link DoFn} for Firestore V1 {@link PartitionQueryRequest}s.
+   *
+   * <p>This Fn uses pagination to obtain responses, all pages will be aggregated before being
+   * emitted to the next stage of the pipeline. Aggregation of pages is necessary as the next step
+   * of pairing of cursors to create N queries must first sort all cursors. See <a target="_blank"
+   * rel="noopener noreferrer"
+   * href="https://cloud.google.com/firestore/docs/reference/rest/v1/projects.databases.documents/partitionQuery#request-body">{@code
+   * pageToken}s</a> documentation for details.
+   *
+   * <p>All request quality-of-service is managed via the instance of {@link RpcQos} associated with
+   * the lifecycle of this Fn.
+   */
+  static final class PartitionQueryFn
+      extends BaseFirestoreV1ReadFn<PartitionQueryRequest, PartitionQueryPair> {
+
+    public PartitionQueryFn(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public Context getRpcAttemptContext() {
+      return FirestoreV1Fn.V1FnRpcAttemptContext.PartitionQuery;
+    }
+
+    @Override
+    public void processElement(ProcessContext context) throws Exception {
+      @SuppressWarnings("nullness")
+      final PartitionQueryRequest element =
+          requireNonNull(context.element(), "c.element() must be non null");
+
+      RpcQos.RpcReadAttempt attempt = rpcQos.newReadAttempt(getRpcAttemptContext());

Review comment:
       No this is not a global lock, it is something that is instantiated per request attempt on each worker. Each attempt encapsulates the logic for number of RPC attempts, backoff, retry ability as well as success/failure and rpc duration metrics. `RpcQos` however is managed from `@Setup` instead of `@StartBundle` so that the rpc metrics can be tracked cross bundle.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] cynthiachi commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
cynthiachi commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r663255598



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -189,6 +533,725 @@ private Write() {}
     }
   }
 
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListCollectionIdsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * ListCollectionIdsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListCollectionIds.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listCollectionIds()
+   * @see FirestoreV1.ListCollectionIds.Builder
+   * @see ListCollectionIdsRequest
+   * @see ListCollectionIdsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+   */
+  public static final class ListCollectionIds
+      extends Transform<
+          PCollection<ListCollectionIdsRequest>,
+          PCollection<String>,
+          ListCollectionIds,
+          ListCollectionIds.Builder> {
+
+    private ListCollectionIds(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<String> expand(PCollection<ListCollectionIdsRequest> input) {
+      return input
+          .apply(
+              "listCollectionIds",
+              ParDo.of(
+                  new ListCollectionIdsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new FlattenListCollectionIdsResponse()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListCollectionIds} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listCollectionIds() listCollectionIds()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listCollectionIds()
+     * @see FirestoreV1.ListCollectionIds
+     * @see ListCollectionIdsRequest
+     * @see ListCollectionIdsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListCollectionIds">google.firestore.v1.Firestore.ListCollectionIds</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsRequest">google.firestore.v1.ListCollectionIdsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListCollectionIdsResponse">google.firestore.v1.ListCollectionIdsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListCollectionIdsRequest>,
+            PCollection<String>,
+            ListCollectionIds,
+            ListCollectionIds.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListCollectionIds build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListCollectionIds buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListCollectionIds(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * ListDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link ListDocumentsResponse}{@code
+   * >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#listDocuments() listDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link ListDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#listDocuments()
+   * @see FirestoreV1.ListDocuments.Builder
+   * @see ListDocumentsRequest
+   * @see ListDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+   */
+  public static final class ListDocuments
+      extends Transform<
+          PCollection<ListDocumentsRequest>,
+          PCollection<Document>,
+          ListDocuments,
+          ListDocuments.Builder> {
+
+    private ListDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<Document> expand(PCollection<ListDocumentsRequest> input) {
+      return input
+          .apply(
+              "listDocuments",
+              ParDo.of(
+                  new ListDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(ParDo.of(new ListDocumentsResponseToDocument()))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link ListDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#listDocuments() listDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#listDocuments()
+     * @see FirestoreV1.ListDocuments
+     * @see ListDocumentsRequest
+     * @see ListDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.ListDocuments">google.firestore.v1.Firestore.ListDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsRequest">google.firestore.v1.ListDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.ListDocumentsResponse">google.firestore.v1.ListDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<ListDocumentsRequest>,
+            PCollection<Document>,
+            ListDocuments,
+            ListDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public ListDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      ListDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new ListDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * RunQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>} which
+   * will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#runQuery() runQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link RunQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#runQuery()
+   * @see FirestoreV1.RunQuery.Builder
+   * @see RunQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+   */
+  public static final class RunQuery
+      extends Transform<
+          PCollection<RunQueryRequest>, PCollection<RunQueryResponse>, RunQuery, RunQuery.Builder> {
+
+    private RunQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<RunQueryRequest> input) {
+      return input
+          .apply(
+              "runQuery",
+              ParDo.of(new RunQueryFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link RunQuery} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#runQuery() runQuery()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#runQuery()
+     * @see FirestoreV1.RunQuery
+     * @see RunQueryRequest
+     * @see RunQueryResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.RunQuery">google.firestore.v1.Firestore.RunQuery</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryRequest">google.firestore.v1.RunQueryRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.RunQueryResponse">google.firestore.v1.RunQueryResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<RunQueryRequest>,
+            PCollection<RunQueryResponse>,
+            RunQuery,
+            RunQuery.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      private Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public RunQuery build() {
+        return genericBuild();
+      }
+
+      @Override
+      RunQuery buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new RunQuery(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * BatchGetDocumentsRequest}{@code >, }{@link PTransform}{@code <}{@link
+   * BatchGetDocumentsResponse}{@code >>} which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link BatchGetDocuments.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#batchGetDocuments()
+   * @see FirestoreV1.BatchGetDocuments.Builder
+   * @see BatchGetDocumentsRequest
+   * @see BatchGetDocumentsResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+   */
+  public static final class BatchGetDocuments
+      extends Transform<
+          PCollection<BatchGetDocumentsRequest>,
+          PCollection<BatchGetDocumentsResponse>,
+          BatchGetDocuments,
+          BatchGetDocuments.Builder> {
+
+    private BatchGetDocuments(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    @Override
+    public PCollection<BatchGetDocumentsResponse> expand(
+        PCollection<BatchGetDocumentsRequest> input) {
+      return input
+          .apply(
+              "batchGetDocuments",
+              ParDo.of(
+                  new BatchGetDocumentsFn(clock, firestoreStatefulComponentFactory, rpcQosOptions)))
+          .apply(Reshuffle.viaRandomKey());
+    }
+
+    @Override
+    public Builder toBuilder() {
+      return new Builder(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+    }
+
+    /**
+     * A type safe builder for {@link BatchGetDocuments} allowing configuration and instantiation.
+     *
+     * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible
+     * via {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+     * FirestoreV1.Read#batchGetDocuments() batchGetDocuments()}.
+     *
+     * <p>
+     *
+     * @see FirestoreIO#v1()
+     * @see FirestoreV1#read()
+     * @see FirestoreV1.Read#batchGetDocuments()
+     * @see FirestoreV1.BatchGetDocuments
+     * @see BatchGetDocumentsRequest
+     * @see BatchGetDocumentsResponse
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.BatchGetDocuments">google.firestore.v1.Firestore.BatchGetDocuments</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsRequest">google.firestore.v1.BatchGetDocumentsRequest</a>
+     * @see <a target="_blank" rel="noopener noreferrer"
+     *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.BatchGetDocumentsResponse">google.firestore.v1.BatchGetDocumentsResponse</a>
+     */
+    public static final class Builder
+        extends Transform.Builder<
+            PCollection<BatchGetDocumentsRequest>,
+            PCollection<BatchGetDocumentsResponse>,
+            BatchGetDocuments,
+            BatchGetDocuments.Builder> {
+
+      private Builder() {
+        super();
+      }
+
+      public Builder(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+
+      @Override
+      public BatchGetDocuments build() {
+        return genericBuild();
+      }
+
+      @Override
+      BatchGetDocuments buildSafe(
+          JodaClock clock,
+          FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+          RpcQosOptions rpcQosOptions) {
+        return new BatchGetDocuments(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      }
+    }
+  }
+
+  /**
+   * Concrete class representing a {@link PTransform}{@code <}{@link PCollection}{@code <}{@link
+   * PartitionQueryRequest}{@code >, }{@link PTransform}{@code <}{@link RunQueryResponse}{@code >>}
+   * which will read from Firestore.
+   *
+   * <p>This class is part of the Firestore Connector DSL, it has a type safe builder accessible via
+   * {@link FirestoreIO#v1()}{@code .}{@link FirestoreV1#read() read()}{@code .}{@link
+   * FirestoreV1.Read#partitionQuery() partitionQuery()}.
+   *
+   * <p>All request quality-of-service for an instance of this PTransform is scoped to the worker
+   * and configured via {@link PartitionQuery.Builder#withRpcQosOptions(RpcQosOptions)}.
+   *
+   * @see FirestoreIO#v1()
+   * @see FirestoreV1#read()
+   * @see FirestoreV1.Read#partitionQuery()
+   * @see FirestoreV1.PartitionQuery.Builder
+   * @see PartitionQueryRequest
+   * @see RunQueryResponse
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.Firestore.PartitionQuery">google.firestore.v1.Firestore.PartitionQuery</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryRequest">google.firestore.v1.PartitionQueryRequest</a>
+   * @see <a target="_blank" rel="noopener noreferrer"
+   *     href="https://cloud.google.com/firestore/docs/reference/rpc/google.firestore.v1#google.firestore.v1.PartitionQueryResponse">google.firestore.v1.PartitionQueryResponse</a>
+   */
+  public static final class PartitionQuery
+      extends Transform<
+          PCollection<PartitionQueryRequest>,
+          PCollection<RunQueryResponse>,
+          PartitionQuery,
+          PartitionQuery.Builder> {
+
+    private final boolean nameOnlyQuery;
+
+    private PartitionQuery(
+        JodaClock clock,
+        FirestoreStatefulComponentFactory firestoreStatefulComponentFactory,
+        RpcQosOptions rpcQosOptions,
+        boolean nameOnlyQuery) {
+      super(clock, firestoreStatefulComponentFactory, rpcQosOptions);
+      this.nameOnlyQuery = nameOnlyQuery;
+    }
+
+    @Override
+    public PCollection<RunQueryResponse> expand(PCollection<PartitionQueryRequest> input) {

Review comment:
       I'm not fully understanding when this would be used. I would have expected each worker to separately deal with a RunQueryRequest, though maybe there's Beam magic going on here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-869845855


   Run Java PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on pull request #15005:
URL: https://github.com/apache/beam/pull/15005#issuecomment-879516414


   Run Java PreCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] BenWhitehead commented on a change in pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
BenWhitehead commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r666363059



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreDoFn.java
##########
@@ -46,6 +46,22 @@
   @StartBundle
   public abstract void startBundle(DoFn<InT, OutT>.StartBundleContext context) throws Exception;
 
+  abstract static class NonWindowAwareDoFn<InT, OutT> extends FirestoreDoFn<InT, OutT> {
+    /**
+     * {@link ProcessContext#element() context.element()} must be non-null, otherwise a
+     * NullPointerException will be thrown.

Review comment:
       Apologies, I will need to rename this class. I'm leaning toward the new name being `ImplicitlyWindowedDoFn`, because output will exclusively use `DoFn.WindowedContext#output(OutputT)` whereas the `WindowAwareDoFn` may also output during `@FinishBundle` and must explicitly provide the window via `DoFn.FinishBundleContext#output(OutputT, org.joda.time.Instant, org.apache.beam.sdk.transforms.windowing.BoundedWindow)`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] chamikaramj merged pull request #15005: [BEAM-8376] Google Cloud Firestore Connector - Add Firestore v1 Read Operations

Posted by GitBox <gi...@apache.org>.
chamikaramj merged pull request #15005:
URL: https://github.com/apache/beam/pull/15005


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org