You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/07/02 03:36:00 UTC

[jira] [Work logged] (BEAM-8376) Add FirestoreIO connector to Java SDK

     [ https://issues.apache.org/jira/browse/BEAM-8376?focusedWorklogId=617968&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-617968 ]

ASF GitHub Bot logged work on BEAM-8376:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Jul/21 03:35
            Start Date: 02/Jul/21 03:35
    Worklog Time Spent: 10m 
      Work Description: jlara310 commented on a change in pull request #15005:
URL: https://github.com/apache/beam/pull/15005#discussion_r662686011



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, it has the ability
+ * to parallelize execution of multiple queries for specific sub-ranges of the full results.
+ *
+ * <p>ListDocuments should only ever be used if the use of <a target="_blank" rel="noopener

Review comment:
       Re-write in imperative mood:
   
   You should only ever use ListDocuments if you need...

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/FirestoreV1.java
##########
@@ -59,6 +89,80 @@
  *
  * <h3>Operations</h3>
  *
+ * <h4>Read</h4>
+ *
+ * <p>The currently supported read operations and their execution behavior are as follows:
+ *
+ * <table>
+ *   <tbody>
+ *     <tr>
+ *       <th>RPC</th>
+ *       <th>Execution Behavior</th>
+ *     </tr>
+ *     <tr>
+ *       <td>PartitionQuery</td>
+ *       <td>Parallel Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>RunQuery</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>BatchGet</td>
+ *       <td>Sequential Streaming</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListCollectionIds</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *     <tr>
+ *       <td>ListDocuments</td>
+ *       <td>Sequential Paginated</td>
+ *     </tr>
+ *   </tbody>
+ * </table>
+ *
+ * <p>PartitionQuery should be preferred over other options if at all possible, it has the ability

Review comment:
       if at all possible, it ->
   
   if at all possible, because it

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -700,6 +723,142 @@ public long nextBackOffMillis() {
     }
   }
 
+  /**
+   * This class implements a backoff algorithm similar to that of {@link
+   * org.apache.beam.sdk.util.FluentBackoff} with a key differences:
+   *
+   * <ol>
+   *   <li>A set of status code numbers may be specified to have a graceful evaluation
+   *   <li>Gracefully evaluated status code numbers will increment a decaying counter to ensure if
+   *       the graceful status code numbers occur more than once in the previous 60 seconds the
+   *       regular backoff behavior will kick in.
+   *   <li>The random number generator used to induce jitter is provided via constructor parameter
+   *       rather than using {@link Math#random()}}
+   * </ol>
+   *
+   * The primary motivation for creating this implementation is to support streamed responses from
+   * Firestore. In the case of RunQuery and BatchGet the results are returned via stream. The result
+   * stream has a maximum lifetime of 60 seconds before it will be broken and an UNAVAILABLE status
+   * code will be raised. Give this UNAVAILABLE is expected for streams this class allows for

Review comment:
       Typo? "Give this UNAVAILABLE is expected for streams this class"
   
   "Given that this UNAVAILABLE is expected for streams, this class"

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -700,6 +723,142 @@ public long nextBackOffMillis() {
     }
   }
 
+  /**
+   * This class implements a backoff algorithm similar to that of {@link
+   * org.apache.beam.sdk.util.FluentBackoff} with a key differences:
+   *
+   * <ol>
+   *   <li>A set of status code numbers may be specified to have a graceful evaluation
+   *   <li>Gracefully evaluated status code numbers will increment a decaying counter to ensure if
+   *       the graceful status code numbers occur more than once in the previous 60 seconds the
+   *       regular backoff behavior will kick in.
+   *   <li>The random number generator used to induce jitter is provided via constructor parameter
+   *       rather than using {@link Math#random()}}
+   * </ol>
+   *
+   * The primary motivation for creating this implementation is to support streamed responses from
+   * Firestore. In the case of RunQuery and BatchGet the results are returned via stream. The result
+   * stream has a maximum lifetime of 60 seconds before it will be broken and an UNAVAILABLE status
+   * code will be raised. Give this UNAVAILABLE is expected for streams this class allows for
+   * defining a set of status code numbers which are give a grace count of 1 before backoff kicks

Review comment:
       "which are give a grace count" -> which are given a grace count

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -115,25 +118,43 @@
             filteringDistributionFactory);
     writeRampUp =
         new WriteRampUp(500.0 / options.getHintMaxNumWorkers(), filteringDistributionFactory);
-    // maxRetries is an inclusive value, we want exclusive since we are tracking all attempts
-    fb =
-        FluentBackoff.DEFAULT
-            .withMaxRetries(options.getMaxAttempts() - 1)
-            .withInitialBackoff(options.getInitialBackoff());
     counters = new WeakHashMap<>();
     computeCounters = (Context c) -> O11y.create(c, counterFactory, filteringDistributionFactory);
   }
 
   @Override
   public RpcWriteAttemptImpl newWriteAttempt(Context context) {
     return new RpcWriteAttemptImpl(
-        context, counters.computeIfAbsent(context, computeCounters), fb.backoff(), sleeper);
+        context,
+        counters.computeIfAbsent(context, computeCounters),
+        new StatusCodeAwareBackoff(
+            random,
+            options.getMaxAttempts(),
+            options.getThrottleDuration(),
+            Collections.emptySet()),
+        sleeper);
   }
 
   @Override
   public RpcReadAttemptImpl newReadAttempt(Context context) {
+    Set<Integer> graceStatusCodeNumbers = Collections.emptySet();
+    // When reading results from a RunQuery or BatchGet the stream returning the results has a
+    //   maximum lifetime of 60 seconds at which point it will be broken with a n UNAVAILABLE

Review comment:
       "a n UNAVAILABLE"
   Typo?
   
   -> "an UNAVAILABLE"

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/firestore/RpcQosImpl.java
##########
@@ -700,6 +723,142 @@ public long nextBackOffMillis() {
     }
   }
 
+  /**
+   * This class implements a backoff algorithm similar to that of {@link
+   * org.apache.beam.sdk.util.FluentBackoff} with a key differences:

Review comment:
       "a key differences" -> "some key differences"




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 617968)
    Time Spent: 37h  (was: 36h 50m)

> Add FirestoreIO connector to Java SDK
> -------------------------------------
>
>                 Key: BEAM-8376
>                 URL: https://issues.apache.org/jira/browse/BEAM-8376
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-java-gcp
>            Reporter: Stefan Djelekar
>            Priority: P3
>          Time Spent: 37h
>  Remaining Estimate: 0h
>
> Motivation:
> There is no Firestore connector for Java SDK at the moment.
> Having it will enhance the integrations with database options on the Google Cloud Platform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)