Posted to github@beam.apache.org by "reuvenlax (via GitHub)" <gi...@apache.org> on 2023/03/07 05:46:16 UTC

[GitHub] [beam] reuvenlax commented on a diff in pull request #25723: #25722 Add option to propagate successful storage-api writes

reuvenlax commented on code in PR #25723:
URL: https://github.com/apache/beam/pull/25723#discussion_r1127368552


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StorageApiWriteUnshardedRecords.java:
##########
@@ -558,7 +574,23 @@ long flush(
               appendFailures.inc();
               return RetryType.RETRY_ALL_OPERATIONS;
             },
-            c -> recordsAppended.inc(c.protoRows.getSerializedRowsCount()),
+            c -> {
+              recordsAppended.inc(c.protoRows.getSerializedRowsCount());
+              if (successfulRowsReceiver != null) {
+                for (ByteString rowBytes : c.protoRows.getSerializedRowsList()) {
+                  try {
+                    TableRow row =

Review Comment:
   Some people have uses for this beyond just Wait.on (e.g. writing a row somewhere else only after the first write is done), and the simplest semantics is to return the entire row.
   
   Agreed about the cost, which is why I force users to opt into this. I think we should also output a collection of metadata (e.g. one element for every append to the API, along with some statistics such as the number of rows in that append); this metadata collection can also be used in Wait.on.
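
To make the two outputs discussed above concrete, here is a minimal plain-Java sketch of the idea. It is not Beam code and not the implementation in this PR: `AppendSketch`, `AppendMetadata`, `onAppendSuccess`, and the `byte[]` row representation are hypothetical names chosen for illustration. The only parts taken from the diff are the nullable `successfulRowsReceiver` and the pattern of iterating over the serialized rows of a successful append.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical model of a writer that reports successful appends. Not Beam code. */
final class AppendSketch {

  /** Hypothetical per-append metadata: one element per successful append. */
  static final class AppendMetadata {
    final long appendIndex; // position of this append in the sequence of appends
    final int rowCount;     // number of rows carried by this append

    AppendMetadata(long appendIndex, int rowCount) {
      this.appendIndex = appendIndex;
      this.rowCount = rowCount;
    }
  }

  // Null unless the user opted in; full-row propagation is the expensive path.
  private final Consumer<byte[]> successfulRowsReceiver;
  private final List<AppendMetadata> metadata = new ArrayList<>();
  private long appendCount = 0;

  AppendSketch(Consumer<byte[]> successfulRowsReceiver) {
    this.successfulRowsReceiver = successfulRowsReceiver;
  }

  /** Invoked once an append has been acknowledged as successful. */
  void onAppendSuccess(List<byte[]> serializedRows) {
    // Cheap output: always record one small metadata element per append.
    metadata.add(new AppendMetadata(appendCount++, serializedRows.size()));

    // Costly output: emit every row, but only when the user asked for it.
    if (successfulRowsReceiver != null) {
      for (byte[] row : serializedRows) {
        successfulRowsReceiver.accept(row);
      }
    }
  }

  List<AppendMetadata> metadata() {
    return metadata;
  }
}
```

In this sketch the metadata list grows with the number of appends rather than the number of rows, so a consumer that only needs a completion signal (the Wait.on case) could rely on it without paying for full-row propagation.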


