Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/08/18 14:00:53 UTC

[GitHub] [beam] piotr-szuberski opened a new pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

piotr-szuberski opened a new pull request #12611:
URL: https://github.com/apache/beam/pull/12611


   What is left:
   - unit test for translating row to mutation
   - add array and struct types support
   - support for transactions? I doubt it's doable.
   
   Questions:
   Does Beam have a Spanner instance running for testing, so that I can write a Gradle task as well? I assume it doesn't, as I couldn't find one.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716736973


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...283c611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/6bf56f92b34f7c15b752c46eca19489a604c4775?el=desc) will **decrease** coverage by `42.21%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master   #12611       +/-   ##
   ===========================================
   - Coverage   82.51%   40.30%   -42.22%     
   ===========================================
     Files         455      456        +1     
     Lines       54867    53926      -941     
   ===========================================
   - Hits        45272    21733    -23539     
   - Misses       9595    32193    +22598     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...python/apache\_beam/examples/complete/distribopt.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvY29tcGxldGUvZGlzdHJpYm9wdC5weQ==) | `0.00% <0.00%> (-98.59%)` | :arrow_down: |
   | [...dks/python/apache\_beam/transforms/create\_source.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jcmVhdGVfc291cmNlLnB5) | `0.00% <0.00%> (-98.19%)` | :arrow_down: |
   | [...on/apache\_beam/runners/direct/helper\_transforms.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvaGVscGVyX3RyYW5zZm9ybXMucHk=) | `0.00% <0.00%> (-98.15%)` | :arrow_down: |
   | [...e\_beam/runners/interactive/testing/mock\_ipython.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS90ZXN0aW5nL21vY2tfaXB5dGhvbi5weQ==) | `7.14% <0.00%> (-92.86%)` | :arrow_down: |
   | [.../examples/snippets/transforms/elementwise/pardo.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9lbGVtZW50d2lzZS9wYXJkby5weQ==) | `11.36% <0.00%> (-88.64%)` | :arrow_down: |
   | [sdks/python/apache\_beam/typehints/opcodes.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL29wY29kZXMucHk=) | `0.00% <0.00%> (-87.92%)` | :arrow_down: |
   | [...s/snippets/transforms/aggregation/combineperkey.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9hZ2dyZWdhdGlvbi9jb21iaW5lcGVya2V5LnB5) | `11.95% <0.00%> (-86.96%)` | :arrow_down: |
   | [...xamples/snippets/transforms/elementwise/flatmap.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9lbGVtZW50d2lzZS9mbGF0bWFwLnB5) | `14.28% <0.00%> (-85.72%)` | :arrow_down: |
   | [...mples/snippets/transforms/elementwise/partition.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9lbGVtZW50d2lzZS9wYXJ0aXRpb24ucHk=) | `11.90% <0.00%> (-85.72%)` | :arrow_down: |
   | ... and [294 more](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [6bf56f9...1c43284](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726125340


   Run Python 3.7 PostCommit





[GitHub] [beam] nielm commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
nielm commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501067536



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language
+    int count = 0;
+    while (valueBuilder == null && count < fields.size()) {
+      valueBuilder = getFirstStructValue(struct, fields.get(count), schema);
+      ++count;
+    }
+    for (int i = count; i < fields.size(); ++i) {
+      valueBuilder = getStructValue(valueBuilder, struct, fields.get(i));
+    }
+    return valueBuilder != null ? valueBuilder.build() : Row.withSchema(schema).build();
+  }
+
+  public static Struct translateRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              structBuilder
+                  .set(column)
+                  .to(
+                      beamTypeToSpannerType(field.getType()),
+                      translateRowToStruct(row.getRow(column)));
+              break;
+            case ARRAY:
+              addArrayToStruct(structBuilder, row, field);
+              break;
+            case ITERABLE:
+              addIterableToStruct(structBuilder, row, field);
+              break;
+            case FLOAT:
+              structBuilder.set(column).to(row.getFloat(column).doubleValue());
+              break;
+            case DOUBLE:
+              structBuilder.set(column).to(row.getDouble(column));
+              break;
+            case DECIMAL:
+              structBuilder.set(column).to(row.getDecimal(column).doubleValue());

Review comment:
       Spanner now supports NUMERIC, which may be a better conversion target than a double?
   
   > @nielm Could you take a look at this thread? [#12611 (comment)](https://github.com/apache/beam/pull/12611#discussion_r480429441)
   
   Done.
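
To illustrate the concern behind this review comment, here is a minimal standalone sketch (not code from the PR) of what is lost when a `BigDecimal` is narrowed with `doubleValue()`, as the `DECIMAL` branch in the quoted diff does. The class name `DecimalPrecisionDemo` and the helper `roundTripThroughDouble` are hypothetical and exist only for this example. It uses only `java.math.BigDecimal`, so it says nothing about how the Spanner client itself binds NUMERIC values.

```java
import java.math.BigDecimal;

public class DecimalPrecisionDemo {
  // Hypothetical helper: narrows to double (as the DECIMAL case in the diff does)
  // and widens back, so the result can be compared with the original value.
  static BigDecimal roundTripThroughDouble(BigDecimal value) {
    return new BigDecimal(value.doubleValue());
  }

  public static void main(String[] args) {
    // 29 significant digits, far more than a 64-bit double can represent exactly.
    BigDecimal original = new BigDecimal("12345678901234567890.123456789");
    BigDecimal viaDouble = roundTripThroughDouble(original);

    System.out.println("original:   " + original.toPlainString());
    System.out.println("via double: " + viaDouble.toPlainString());
    // compareTo ignores scale, so this checks numeric equality only.
    System.out.println("lossless:   " + (original.compareTo(viaDouble) == 0));
  }
}
```

For this input the last line prints `lossless:   false`, because a double carries roughly 15 to 17 significant decimal digits. Spanner's NUMERIC type stores exact decimals (precision 38, scale 9), so binding the `BigDecimal` directly, assuming the client-library version in use accepts one, would avoid this narrowing.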








[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-727010594









[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686493626









[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501379245



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       It seems like it should be possible to analyze the query and determine the output schema; SqlTransform and JdbcIO both do this.







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.05%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.29%   +0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53884     +158     
   ==========================================
   + Hits        21619    21714      +95     
   - Misses      32107    32170      +63     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   | [sdks/python/apache\_beam/portability/python\_urns.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvcHl0aG9uX3VybnMucHk=) | `100.00% <0.00%> (ø)` | |
   | [...s/python/apache\_beam/io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=) | `23.36% <0.00%> (ø)` | |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `28.92% <0.00%> (+0.24%)` | :arrow_up: |
   | [sdks/python/apache\_beam/dataframe/frames.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL2ZyYW1lcy5weQ==) | `43.73% <0.00%> (+0.76%)` | :arrow_up: |
   | [sdks/python/apache\_beam/transforms/core.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb3JlLnB5) | `39.22% <0.00%> (+0.94%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...b0c23a2](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) (1c43284) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [02a1cd2...7178fc2](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686447747


   @TheNeuralBit I've updated it a bit.
   - Checking schema equality is redundant because a mismatch will throw an exception with a good message anyway (class cast failure or unknown column). Also, it's possible to just add row.addFieldValues(Map<String, Object> values) and depend on the casts following the schema.
   - I managed to remove the code duplication between addArray and addIterable with somewhat ugly casts (they need @SuppressWarnings("unchecked")), but I don't think it can be easily achieved otherwise.
   - Nothing comes to mind for removing the duplication between the addIterableToMutationBuilder and addIterableToStructBuilder methods. They operate on unrelated classes (Struct.Builder and Mutation.WriteBuilder). Maybe my Java knowledge is insufficient here. I could make an interface that simulates setInt64Array, setStructArray, etc., but that would be even more boilerplate.
   - I unified the APIs of both Python Spanner connectors a bit. Not everything could be mapped 1:1, but the corresponding keywords and the positions of positional arguments were changed to match.





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726976356


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726035603


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-688837483


   Run Java PreCommit





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-717413326


   Run Python 3.7 PostCommit





[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501379245



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       It seems like it should be possible to analyze the query and determine the output schema; SqlTransform and JdbcIO both do this.
   
   I got a similar response from my internal queries, though: it doesn't look like there's a good way to do this with the Spanner client.







[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r482063484



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language

Review comment:
       NullableCoder is not a standard coder, as was mentioned here: https://issues.apache.org/jira/browse/BEAM-10529?jql=project%20%3D%20BEAM%20AND%20text%20~%20%22nullable%20python%22
   So I suppose the only way to support null values is not to set them.
   I noticed this when I tried to read a null field from a Spanner table, but I may be wrong.
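
A minimal sketch of the "don't set null values" workaround described above, written in Python for illustration only. The actual translation in this PR lives in the Java StructUtils class; the helper name `non_null_fields` and the sample record are hypothetical.

```python
from typing import Any, Dict, Optional


def non_null_fields(record: Dict[str, Optional[Any]]) -> Dict[str, Any]:
  """Returns only the fields whose value is not None.

  Fields holding None are skipped instead of being set explicitly, so the
  row never needs to encode a null value.
  """
  return {name: value for name, value in record.items() if value is not None}


# Hypothetical record read from a table; 'name' is NULL in the source row.
row_values = non_null_fields({'id': 1, 'name': None, 'score': 0})
```

Note that the check is `is not None` rather than truthiness, so legitimate values such as `0` or an empty string are still set.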







[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r502593104



##########
File path: sdks/java/io/google-cloud-platform/expansion-service/build.gradle
##########
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: 'org.apache.beam.module'
+apply plugin: 'application'
+mainClassName = "org.apache.beam.sdk.expansion.service.ExpansionService"
+
+applyJavaNature(
+        enableChecker: true,
+        automaticModuleName: 'org.apache.beam.sdk.io.gcp.expansion.service',
+        exportJavadoc: false,
+        validateShadowJar: false,
+        shadowClosure: {},
+)
+
+description = "Apache Beam :: SDKs :: Java :: IO :: Google Cloud Platform :: Expansion Service"
+ext.summary = "Expansion service serving Spanner Java IO"

Review comment:
       "GCP Java IOs". Currently just Spanner but this will pick up any IOs that are made external.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationUtils.java
##########
@@ -34,4 +49,237 @@ public static boolean isPointDelete(Mutation m) {
         && Iterables.isEmpty(m.getKeySet().getRanges())
         && Iterables.size(m.getKeySet().getKeys()) == 1;
   }
+
+  /**
+   * Utility function to convert row to mutation.
+   *
+   * @return function that can convert row to mutation
+   */
+  public static SerializableFunction<Row, Mutation> beamRowToMutationFn() {

Review comment:
       Please add some documentation about the mapping from Mutation to Beam schemas; I think this would be the appropriate place. You might duplicate the doc somewhere in the Python code as well, or just reference this.

##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,483 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are several ways to setup cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Install Java runtime in the computer from where the pipeline is constructed
+    and make sure that 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) or
+  build (when running from a Beam Git clone) an expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you startup your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Startup your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)

Review comment:
       Now that we've dropped support for Python 2 you could use the class syntax here if you want:
   ```py
   class WriteToSpannerSchema(typing.NamedTuple):
     instance_id: str
     database_id: str
     ...
   ```
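
For illustration, the full tuple-based definition from the diff could be written with the class syntax roughly as follows. This is only a sketch: it assumes Python 3 and replaces `unicode` with `str`, while the field names and their order are copied from the diff above.

```python
import typing
from typing import Optional


class WriteToSpannerSchema(typing.NamedTuple):
  """Configuration payload for the write transform (class-syntax variant)."""
  instance_id: str
  database_id: str
  project_id: Optional[str]
  max_batch_size_bytes: Optional[int]
  max_number_mutations: Optional[int]
  max_number_rows: Optional[int]
  grouping_factor: Optional[int]
  host: Optional[str]
  emulator_host: Optional[str]
  commit_deadline: Optional[int]
  max_cumulative_backoff: Optional[int]


# No defaults are declared, so every field must still be passed explicitly,
# exactly as with the functional typing.NamedTuple(...) form.
config = WriteToSpannerSchema(
    instance_id='your_instance',
    database_id='existing_database',
    project_id=None,
    max_batch_size_bytes=None,
    max_number_mutations=None,
    max_number_rows=None,
    grouping_factor=None,
    host=None,
    emulator_host=None,
    commit_deadline=None,
    max_cumulative_backoff=None,
)
```

The resulting type behaves the same as the functional form; the class syntax mainly makes the field list easier to read and annotate.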

##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,483 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are several ways to setup cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Install Java runtime in the computer from where the pipeline is constructed
+    and make sure that 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) or
+  build (when running from a Beam Git clone) an expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you startup your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Startup your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner:
+  """
+  A PTransform which writes mutations to the specified instance's database
+  via Spanner.
+
+  This transform receives rows defined as NamedTuple or as List[NamedTuple]
+  in case of delete operation. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: ExampleRow(n, str(n)))
+              .with_output_types(ExampleRow)
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').insert('your_table'))
+
+  In addition you can pass List[ExampleRow] to delete transform::
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: [ExampleRow(n, str(n)),
+              ExampleRow(n * 2, str(n * 2))])
+              .with_output_types(List[ExampleRow])
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').delete('your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a write operation to Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param max_batch_size_bytes: Specifies the batch size limit (max number of
+        bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+    :param max_number_mutations: Specifies the cell mutation limit (maximum
+        number of mutated cells per batch). Default value is 5000.
+    :param max_number_rows: Specifies the row mutation limit (maximum number of
+        mutated rows per batch). Default value is 500.
+    :param grouping_factor: Specifies the multiple of max mutation (in terms
+        of both bytes per batch and cells per batch) that is used to select a
+        set of mutations to sort by key for batching. This sort uses local
+        memory on the workers, so using large values can cause out of memory
+        errors. Default value is 1000.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param commit_deadline: Specifies the deadline for the Commit API call.
+        Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+        until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+        are reported with logging and counters. Pass seconds as value.
+    :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+        time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+        (15min). If the mutations still have not been written after this time,
+        they are treated as a failure, and handled according to the setting of
+        failure_mode. Pass seconds as value.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.config = NamedTupleBasedPayloadBuilder(
+        WriteToSpannerSchema(
+            project_id=project_id,
+            instance_id=instance_id,
+            database_id=database_id,
+            max_batch_size_bytes=max_batch_size_bytes,
+            max_number_mutations=max_number_mutations,
+            max_number_rows=max_number_rows,
+            grouping_factor=grouping_factor,
+            host=host,
+            emulator_host=emulator_host,
+            commit_deadline=commit_deadline,
+            max_cumulative_backoff=max_cumulative_backoff,
+        ),
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def insert(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT, table)
+
+  def delete(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.DELETE, table)
+
+  def update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.UPDATE, table)
+
+  def replace(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.REPLACE, table)
+
+  def insert_or_update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT_OR_UPDATE, table)
+
+
+ReadFromSpannerSchema = NamedTuple(
+    'ReadFromSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('schema', bytes),
+        ('sql', Optional[unicode]),
+        ('table', Optional[unicode]),
+        ('project_id', Optional[unicode]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('batching', Optional[bool]),
+        ('timestamp_bound_mode', Optional[unicode]),
+        ('read_timestamp', Optional[unicode]),
+        ('exact_staleness', Optional[int]),
+        ('time_unit', Optional[unicode]),
+    ],
+)
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the rows it returns in order to
+  provide the schema. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  URN = 'beam:external:java:spanner:read:v1'
+
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: An SQL query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A Spanner table. When provided, all columns from row_type
+        will be selected in the query. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most time_unit stale.
+    :param read_timestamp: Timestamp in string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for exact_staleness. Possible values:
+        NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS, HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    assert (sql or table) and not (sql and table)
+    TimeUnit.verify_param(time_unit)
+    TimestampBoundMode.verify_param(timestamp_bound_mode)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      assert staleness_value and time_unit and (
+          timestamp_bound_mode == TimestampBoundMode.MAX_STALENESS or
+          timestamp_bound_mode == TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode == TimestampBoundMode.MIN_READ_TIMESTAMP\
+             or timestamp_bound_mode == TimestampBoundMode.READ_TIMESTAMP
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        self.URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=timestamp_bound_mode,
+                read_timestamp=read_timestamp,
+                exact_staleness=exact_staleness,
+                time_unit=time_unit,
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class Operation:
+  INSERT = 'INSERT'
+  DELETE = 'DELETE'
+  UPDATE = 'UPDATE'
+  REPLACE = 'REPLACE'
+  INSERT_OR_UPDATE = 'INSERT_OR_UPDATE'
+
+
+class TimeUnit:
+  NANOSECONDS = 'NANOSECONDS'
+  MICROSECONDS = 'MICROSECONDS'
+  MILLISECONDS = 'MILLISECONDS'
+  SECONDS = 'SECONDS'
+  HOURS = 'HOURS'
+  DAYS = 'DAYS'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimeUnit, param):
+      raise RuntimeError(
+          'Invalid param for TimeUnit: {}'.format(param))
+
+
+class TimestampBoundMode:
+  MAX_STALENESS = 'MAX_STALENESS'
+  EXACT_STALENESS = 'EXACT_STALENESS'
+  READ_TIMESTAMP = 'READ_TIMESTAMP'
+  MIN_READ_TIMESTAMP = 'MIN_READ_TIMESTAMP'
+  STRONG = 'STRONG'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimestampBoundMode, param):
+      raise RuntimeError(
+          'Invalid param for TimestampBoundMode: {}'.format(param))
+
+
+class WriteToSpannerTransform(PTransform):
+  URN = 'beam:external:java:spanner:write:v1'
+
+  def __init__(self, config, expansion_service, operation, table):
+    super(WriteToSpannerTransform, self).__init__()
+    self.config = config
+    self.expansion_service = expansion_service
+    self.operation = operation
+    self.table = table
+
+  def expand(self, row_pcoll):
+    return (
+        row_pcoll
+        | RowToMutation(self.operation, self.table)
+        | ExternalTransform(self.URN, self.config, self.expansion_service))
+
+
+class RowToMutation(PTransform):
+  def __init__(self, operation, table):
+    super(RowToMutation, self).__init__()
+    self.operation = operation
+    self.table = table
+
+  def expand(self, pcoll):
+    is_delete = self.operation == Operation.DELETE
+    mutation_name = 'Mutation_%s_%s' % (
+        self.operation, str(uuid.uuid4()).replace('-', ''))
+
+    # There is an error when pcoll.element_type is List[row_type] so pass
+    # a list of inner element types to NamedTuple explicitly.
+    is_list = hasattr(pcoll.element_type, 'inner_type')
+    row_type = pcoll.element_type.inner_type if is_list else pcoll.element_type
+    mutation_type = NamedTuple(
+        mutation_name,
+        [
+            ('operation', unicode),
+            ('table', unicode),
+            ('keyset', List[row_type]) if is_delete else ('row', row_type),

Review comment:
       We should make sure this works when schemas are specified via [`beam.Row`](https://beam.apache.org/releases/pydoc/current/apache_beam.pvalue.html#apache_beam.pvalue.Row) as well. Right now I think this will only work with the `NamedTuple` style.
   
   You could use `element_type = named_tuple_from_schema(schema_from_element_type(pcoll.element_type))` to make sure element_type is a NamedTuple that you can use here (it might be worth adding a convenience function for that pattern).
   https://github.com/apache/beam/blob/a66454bc3af5afeba38e7275bf5e5156c2468e0d/sdks/python/apache_beam/typehints/schemas.py#L273
   https://github.com/apache/beam/blob/a66454bc3af5afeba38e7275bf5e5156c2468e0d/sdks/python/apache_beam/typehints/schemas.py#L282
   
   
   

##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,483 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are several ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide it as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner:
+  """
+  A PTransform which writes mutations to the specified instance's database
+  via Spanner.
+
+  This transform receives rows defined as NamedTuple or as List[NamedTuple]
+  in case of delete operation. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: ExampleRow(n, str(n)))
+              .with_output_types(ExampleRow)
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').insert('your_table'))
+
+  In addition you can pass List[ExampleRow] to delete transform::
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: [ExampleRow(n, str(n)),
+              ExampleRow(n * 2, str(n * 2))])
+              .with_output_types(List[ExampleRow])
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').delete('your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a write operation to Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param max_batch_size_bytes: Specifies the batch size limit (max number of
+        bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+    :param max_number_mutations: Specifies the cell mutation limit (maximum
+        number of mutated cells per batch). Default value is 5000.
+    :param max_number_rows: Specifies the row mutation limit (maximum number of
+        mutated rows per batch). Default value is 500.
+    :param grouping_factor: Specifies the multiple of max mutation (in terms
+        of both bytes per batch and cells per batch) that is used to select a
+        set of mutations to sort by key for batching. This sort uses local
+        memory on the workers, so using large values can cause out of memory
+        errors. Default value is 1000.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param commit_deadline: Specifies the deadline for the Commit API call.
+        Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+        until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+        are reported with logging and counters. Pass seconds as value.
+    :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+        time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+        (15min). If the mutations still have not been written after this time,
+        they are treated as a failure, and handled according to the setting of
+        failure_mode. Pass seconds as value.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.config = NamedTupleBasedPayloadBuilder(
+        WriteToSpannerSchema(
+            project_id=project_id,
+            instance_id=instance_id,
+            database_id=database_id,
+            max_batch_size_bytes=max_batch_size_bytes,
+            max_number_mutations=max_number_mutations,
+            max_number_rows=max_number_rows,
+            grouping_factor=grouping_factor,
+            host=host,
+            emulator_host=emulator_host,
+            commit_deadline=commit_deadline,
+            max_cumulative_backoff=max_cumulative_backoff,
+        ),
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def insert(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT, table)
+
+  def delete(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.DELETE, table)
+
+  def update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.UPDATE, table)
+
+  def replace(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.REPLACE, table)
+
+  def insert_or_update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT_OR_UPDATE, table)
+
+
+ReadFromSpannerSchema = NamedTuple(
+    'ReadFromSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('schema', bytes),
+        ('sql', Optional[unicode]),
+        ('table', Optional[unicode]),
+        ('project_id', Optional[unicode]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('batching', Optional[bool]),
+        ('timestamp_bound_mode', Optional[unicode]),
+        ('read_timestamp', Optional[unicode]),
+        ('exact_staleness', Optional[int]),
+        ('time_unit', Optional[unicode]),
+    ],
+)
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the rows it returns in order to
+  provide the schema. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  URN = 'beam:external:java:spanner:read:v1'
+
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: An SQL query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A Spanner table. When provided, all columns from row_type
+        will be selected in the query. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most time_unit stale.
+    :param read_timestamp: Timestamp in string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for exact_staleness. Possible values:
+        NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS, HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    assert (sql or table) and not (sql and table)
+    TimeUnit.verify_param(time_unit)
+    TimestampBoundMode.verify_param(timestamp_bound_mode)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      assert staleness_value and time_unit and (
+          timestamp_bound_mode == TimestampBoundMode.MAX_STALENESS or
+          timestamp_bound_mode == TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode == TimestampBoundMode.MIN_READ_TIMESTAMP\
+             or timestamp_bound_mode == TimestampBoundMode.READ_TIMESTAMP
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        self.URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=timestamp_bound_mode,
+                read_timestamp=read_timestamp,
+                exact_staleness=exact_staleness,
+                time_unit=time_unit,
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class Operation:
+  INSERT = 'INSERT'
+  DELETE = 'DELETE'
+  UPDATE = 'UPDATE'
+  REPLACE = 'REPLACE'
+  INSERT_OR_UPDATE = 'INSERT_OR_UPDATE'
+
+
+class TimeUnit:
+  NANOSECONDS = 'NANOSECONDS'
+  MICROSECONDS = 'MICROSECONDS'
+  MILLISECONDS = 'MILLISECONDS'
+  SECONDS = 'SECONDS'
+  HOURS = 'HOURS'
+  DAYS = 'DAYS'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimeUnit, param):
+      raise RuntimeError(
+          'Invalid param for TimeUnit: {}'.format(param))
+
+
+class TimestampBoundMode:
+  MAX_STALENESS = 'MAX_STALENESS'
+  EXACT_STALENESS = 'EXACT_STALENESS'
+  READ_TIMESTAMP = 'READ_TIMESTAMP'
+  MIN_READ_TIMESTAMP = 'MIN_READ_TIMESTAMP'
+  STRONG = 'STRONG'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimestampBoundMode, param):
+      raise RuntimeError(
+          'Invalid param for TimestampBoundMode: {}'.format(param))
+
+
+class WriteToSpannerTransform(PTransform):
+  URN = 'beam:external:java:spanner:write:v1'
+
+  def __init__(self, config, expansion_service, operation, table):
+    super(WriteToSpannerTransform, self).__init__()
+    self.config = config
+    self.expansion_service = expansion_service
+    self.operation = operation
+    self.table = table
+
+  def expand(self, row_pcoll):
+    return (
+        row_pcoll
+        | RowToMutation(self.operation, self.table)
+        | ExternalTransform(self.URN, self.config, self.expansion_service))
+
+
+class RowToMutation(PTransform):
+  def __init__(self, operation, table):
+    super(RowToMutation, self).__init__()
+    self.operation = operation
+    self.table = table
+
+  def expand(self, pcoll):
+    is_delete = self.operation == Operation.DELETE
+    mutation_name = 'Mutation_%s_%s' % (
+        self.operation, str(uuid.uuid4()).replace('-', ''))
+
+    # There is an error when pcoll.element_type is List[row_type] so pass
+    # a list of inner element types to NamedTuple explicitly.
+    is_list = hasattr(pcoll.element_type, 'inner_type')
+    row_type = pcoll.element_type.inner_type if is_list else pcoll.element_type

Review comment:
       Could you explain the issue here? 
   
   Also the logic between `is_list` and `is_delete` is pretty confusing. Could this be simplified by only allowing lists for delete operations?
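
   One possible shape for that simplification, as a rough sketch only: accept a list element type exclusively for delete and reject it everywhere else. The helper name `resolve_row_type` is hypothetical and not part of the PR, and the sketch inspects standard `typing` generics rather than Beam's internal type hints (which expose `inner_type`), so it would need adapting before it could replace the `is_list` check.

```python
from typing import List, NamedTuple, get_args, get_origin


def resolve_row_type(element_type, is_delete):
  """Returns the row type; a List[...] element type is valid only for delete.

  Hypothetical helper: it works on plain typing generics, not Beam typehints.
  """
  is_list = get_origin(element_type) is list
  if is_delete:
    if not is_list:
      raise TypeError(
          'delete expects List[row_type], got %r' % (element_type, ))
    # List[X] carries exactly one type argument, the row type.
    (row_type, ) = get_args(element_type)
    return row_type
  if is_list:
    raise TypeError(
        'only delete accepts a list, got %r' % (element_type, ))
  return element_type


class ExampleRow(NamedTuple):
  id: int
  name: str


delete_row_type = resolve_row_type(List[ExampleRow], is_delete=True)
insert_row_type = resolve_row_type(ExampleRow, is_delete=False)
```

   With a single check like this, `is_delete` alone decides whether the mutation gets a `keyset` or a `row` field, and a mismatched element type fails at pipeline construction time.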




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716871424









[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53824      +98     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32150      +43     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...6e9d3ba](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r483525931



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,503 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option has the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide it as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    instantiating the Spanner transforms provided in this module.
+
+  Flink users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):
+  """
+  A PTransform which writes mutations to the specified instance's database
+  via Spanner.
+
+  This transform receives Mutations defined as NamedTuples, which are created
+  via the utility class MutationCreator. A Mutation needs to know which row
+  type it wraps. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+    coders.registry.register_coder(ExampleRow, coders.RowCoder)
+
+    mutation_creator = MutationCreator('table', ExampleRow, 'ExampleMutation')

Review comment:
       That way we lose the ability to mix different kinds of mutations. I can't imagine any sane use of mixed insert/delete, since the order is not guaranteed, so I agree that dropping this assumption is justified.
   
   Since rows will always be mapped to mutations beforehand, it would be good to do that mapping inside WriteToSpanner. How about an API like this?
   ```
   pc.with_output_types(CustomRow) | WriteToSpanner(...).insert(table)
   pc.with_output_types(CustomRow) | WriteToSpanner(...).delete(table)
   pc.with_output_types(List[CustomRow]) | WriteToSpanner(...).delete(table)
   ```
   It's not consistent with ReadFromSpanner(...), but I think it's better than forcing the user to call RowToMutation each time.
   To be more consistent, I could do something like `ReadFromSpanner(...).from_table(table)` and `ReadFromSpanner(...).from_sql(sql_query)`.
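
A rough, Beam-free sketch of how the fluent `insert`/`delete` shape proposed above might behave. Everything here (`WriteToSpannerSketch`, `WriteConfig`, the `config` property) is a hypothetical stand-in for illustration, not the API in this PR; a real version would return an external PTransform rather than a plain object.

```python
from typing import NamedTuple, Optional


class WriteConfig(NamedTuple):
    """Hypothetical, immutable description of one write (not Beam's API)."""
    instance_id: str
    database_id: str
    operation: Optional[str] = None
    table: Optional[str] = None


class WriteToSpannerSketch:
    """Stand-in for the proposed fluent API."""

    def __init__(self, instance_id, database_id):
        self._config = WriteConfig(instance_id, database_id)

    def _with_operation(self, operation, table):
        # Return a new object so a partially configured writer can be reused
        # for several operations without one call affecting another.
        new = WriteToSpannerSketch(
            self._config.instance_id, self._config.database_id)
        new._config = self._config._replace(operation=operation, table=table)
        return new

    def insert(self, table):
        return self._with_operation('insert', table)

    def delete(self, table):
        return self._with_operation('delete', table)

    @property
    def config(self):
        return self._config


# One connection-level configuration, specialised per mutation kind.
base = WriteToSpannerSketch('my-instance', 'my-db')
insert_write = base.insert('users')
delete_write = base.delete('users')
```

Returning a fresh object from each call, rather than mutating `self`, is one possible design choice: it keeps `base` reusable, which matters if the same connection settings feed several writes.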







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba24b08...b76a62a](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.05%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.29%   +0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53884     +158     
   ==========================================
   + Hits        21619    21714      +95     
   - Misses      32107    32170      +63     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   | [sdks/python/apache\_beam/portability/python\_urns.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvcHl0aG9uX3VybnMucHk=) | `100.00% <0.00%> (ø)` | |
   | [...s/python/apache\_beam/io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=) | `23.36% <0.00%> (ø)` | |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `28.92% <0.00%> (+0.24%)` | :arrow_up: |
   | [sdks/python/apache\_beam/dataframe/frames.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL2ZyYW1lcy5weQ==) | `43.73% <0.00%> (+0.76%)` | :arrow_up: |
   | [sdks/python/apache\_beam/transforms/core.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb3JlLnB5) | `39.22% <0.00%> (+0.94%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...b0c23a2](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686441745


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501677086



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language
+    int count = 0;
+    while (valueBuilder == null && count < fields.size()) {
+      valueBuilder = getFirstStructValue(struct, fields.get(count), schema);
+      ++count;
+    }
+    for (int i = count; i < fields.size(); ++i) {
+      valueBuilder = getStructValue(valueBuilder, struct, fields.get(i));
+    }
+    return valueBuilder != null ? valueBuilder.build() : Row.withSchema(schema).build();
+  }
+
+  public static Struct translateRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              structBuilder
+                  .set(column)
+                  .to(
+                      beamTypeToSpannerType(field.getType()),
+                      translateRowToStruct(row.getRow(column)));
+              break;
+            case ARRAY:
+              addArrayToStruct(structBuilder, row, field);
+              break;
+            case ITERABLE:
+              addIterableToStruct(structBuilder, row, field);
+              break;
+            case FLOAT:
+              structBuilder.set(column).to(row.getFloat(column).doubleValue());
+              break;
+            case DOUBLE:
+              structBuilder.set(column).to(row.getDouble(column));
+              break;
+            case DECIMAL:
+              structBuilder.set(column).to(row.getDecimal(column).doubleValue());

Review comment:
       Great, that's quite a new feature in Spanner, as far as I can see! Thanks! Done.
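
For context, the line under review converts a DECIMAL field with `doubleValue()`. The suggestion being replied to is not quoted in this message, so the following is only a minimal Python illustration, under the assumption that the concern was precision loss when a decimal passes through a 64-bit float. The variable names are illustrative.

```python
from decimal import Decimal

# A decimal value with more significant digits than a double can hold.
exact = Decimal('0.1234567890123456789012345678')

# Round-trip through a 64-bit float, analogous to calling doubleValue().
via_double = Decimal(float(exact))

# Round-trip through text keeps every digit.
lossless = Decimal(str(exact))

print(via_double == exact)  # the float round-trip changes the value
print(lossless == exact)    # the text round-trip does not
```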







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/ba89e76f5daecd496cafb2861dac5ef69480a973?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.26%   40.29%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53780    53884     +104     
   ==========================================
   + Hits        21655    21714      +59     
   - Misses      32125    32170      +45     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba89e76...c24e8cb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/ba89e76f5daecd496cafb2861dac5ef69480a973?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.26%   40.30%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53780    53926     +146     
   ==========================================
   + Hits        21655    21733      +78     
   - Misses      32125    32193      +68     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...s/python/apache\_beam/testing/synthetic\_pipeline.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9zeW50aGV0aWNfcGlwZWxpbmUucHk=) | `23.45% <0.00%> (+2.52%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba89e76...c24e8cb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501677086



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language
+    int count = 0;
+    while (valueBuilder == null && count < fields.size()) {
+      valueBuilder = getFirstStructValue(struct, fields.get(count), schema);
+      ++count;
+    }
+    for (int i = count; i < fields.size(); ++i) {
+      valueBuilder = getStructValue(valueBuilder, struct, fields.get(i));
+    }
+    return valueBuilder != null ? valueBuilder.build() : Row.withSchema(schema).build();
+  }
+
+  public static Struct translateRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              structBuilder
+                  .set(column)
+                  .to(
+                      beamTypeToSpannerType(field.getType()),
+                      translateRowToStruct(row.getRow(column)));
+              break;
+            case ARRAY:
+              addArrayToStruct(structBuilder, row, field);
+              break;
+            case ITERABLE:
+              addIterableToStruct(structBuilder, row, field);
+              break;
+            case FLOAT:
+              structBuilder.set(column).to(row.getFloat(column).doubleValue());
+              break;
+            case DOUBLE:
+              structBuilder.set(column).to(row.getDouble(column));
+              break;
+            case DECIMAL:
+              structBuilder.set(column).to(row.getDecimal(column).doubleValue());

Review comment:
       Great, I think it wasn't available when I wrote that code. Thanks!
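
For context on the DECIMAL branch quoted above, here is a minimal sketch of why `row.getDecimal(column).doubleValue()` can lose information. It uses only `java.math`, and the class and method names are hypothetical, not part of this PR. Whether the Spanner client version in use accepts a `BigDecimal` directly for NUMERIC columns is an assumption that should be checked against the client docs.

```java
import java.math.BigDecimal;

// Hypothetical illustration class; not part of the Beam PR.
final class DecimalPrecisionDemo {
  /** Converts via double, as the DECIMAL branch does, then back to an exact BigDecimal. */
  static BigDecimal roundTripThroughDouble(BigDecimal value) {
    return new BigDecimal(value.doubleValue());
  }

  /** Returns true when the double round trip preserves the exact decimal value. */
  static boolean survivesDoubleConversion(BigDecimal value) {
    return roundTripThroughDouble(value).compareTo(value) == 0;
  }

  public static void main(String[] args) {
    BigDecimal exact = new BigDecimal("0.5"); // exactly representable in binary floating point
    BigDecimal lossy = new BigDecimal("0.1"); // not exactly representable as a double
    System.out.println(survivesDoubleConversion(exact)); // prints true
    System.out.println(survivesDoubleConversion(lossy)); // prints false
  }
}
```

A column type that stores the decimal value itself would avoid this round-trip error, which is presumably the appeal of the NUMERIC suggestion in this thread.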







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10131][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-676516510


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-709965751









[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) (1c43284) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [02a1cd2...2e1ad01](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.27%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53819      +93     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32145      +38     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...55ee406](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] TheNeuralBit commented on pull request #12611: [BEAM-10131][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-675641020


   No worries, I'm happy to help review :) It might take me a few days to get to it, though.
   
   Regarding testing: we could consider adding a Spanner instance to apache-beam-testing for integration testing; I'd suggest raising it on dev@ if you want to pursue it. I also just came across https://cloud.google.com/spanner/docs/emulator, which could be a good option too. It's a Docker container that starts up an in-memory version of Spanner to test against.
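
As a rough sketch of how a test could opt into the emulator: Google's Spanner client libraries consult the `SPANNER_EMULATOR_HOST` environment variable, so a test harness can check it before deciding whether to run. The helper below is hypothetical (the class and method names are not part of Beam or the Spanner client) and uses only the Java standard library.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration; not part of Beam or the Spanner client.
final class SpannerEmulatorEndpoint {
  // Environment variable the Spanner client libraries consult for the emulator address.
  static final String ENV_VAR = "SPANNER_EMULATOR_HOST";

  /** Returns the emulator "host:port" if configured in the given environment, else empty. */
  static Optional<String> fromEnvironment(Map<String, String> env) {
    String value = env.get(ENV_VAR);
    if (value == null || value.trim().isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(value.trim());
  }

  public static void main(String[] args) {
    Optional<String> endpoint = fromEnvironment(System.getenv());
    System.out.println(
        endpoint
            .map(e -> "Testing against emulator at " + e)
            .orElse("No emulator configured; integration tests should be skipped"));
  }
}
```

Passing the environment in as a map keeps the check easy to unit test without touching the real process environment.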





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686502850


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...5feb3d9](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba24b08...8f64294](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] nielm commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
nielm commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501067027



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);

Review comment:
       Spanner now supports NUMERIC which may be a better conversion?







[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501676146



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       Thank you @nielm! I thought about the LIMIT approach, but then I ran into the same arguments against it.
   
   It appears that a JDBC client for Spanner exists: https://cloud.google.com/spanner/docs/jdbc-drivers. I'll try to figure out whether I can use it.
   
   There is also ResultSetMetadata in Spanner's REST API, which is represented as a JSON object: https://cloud.google.com/spanner/docs/reference/rest/v1/ResultSetMetadata. In the end, though, it still requires fetching at least part of the data.
   
   I would leave that for another PR, as it would presumably require moving SchemaUtils from io/jdbc to some more general place (extensions/sql?). As far as I can see, the STRUCT type is represented as a String, as mentioned here:
   ```
   The Cloud Spanner STRUCT data type is mapped to a SQL VARCHAR data type, accessible through 
   this driver as String types. All other types have appropriate mappings.
   ```
   So it may not be the best option.
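
To make the STRUCT concern concrete, here is a minimal sketch of schema inference from JDBC column metadata. The mapping and names are illustrative assumptions only; this is not Beam's SchemaUtils. The returned strings mirror Beam `Schema.TypeName` values, but the code depends only on `java.sql.Types`.

```java
import java.sql.Types;

// Hypothetical sketch; the mapping and names are illustrative, not Beam's SchemaUtils.
final class JdbcTypeNameSketch {
  /** Maps a java.sql.Types constant to an illustrative Beam-style type name. */
  static String beamTypeNameFor(int jdbcType) {
    switch (jdbcType) {
      case Types.BIGINT:
        return "INT64";
      case Types.DOUBLE:
        return "DOUBLE";
      case Types.BOOLEAN:
        return "BOOLEAN";
      case Types.VARCHAR:
      case Types.NVARCHAR:
        return "STRING";
      case Types.BINARY:
        return "BYTES";
      case Types.TIMESTAMP:
        return "DATETIME";
      case Types.ARRAY:
        return "ARRAY";
      default:
        throw new IllegalArgumentException("Unsupported JDBC type: " + jdbcType);
    }
  }

  public static void main(String[] args) {
    System.out.println(beamTypeNameFor(Types.BIGINT)); // prints INT64
    // A STRUCT column surfaced by the driver as VARCHAR looks identical to a real string column.
    System.out.println(beamTypeNameFor(Types.VARCHAR)); // prints STRING
  }
}
```

Since the driver reports STRUCT columns as VARCHAR, any metadata-based inference along these lines would yield STRING for them, so the nested field layout could not be recovered this way.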







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686576737


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request head (`spanner-xlang@55ee406`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-head-commit).
   > The diff coverage is `n/a`.
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53824      +98     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32150      +43     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...dd827fb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501676146



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       Thank you @nielm! I considered the LIMIT approach too, but then ran into the same arguments against it.
   
   It appears there is a JDBC client for Spanner: https://cloud.google.com/spanner/docs/jdbc-drivers. I'll try to figure out whether I can use it.
   
   There is also ResultSetMetadata in Spanner's REST API, which is a JSON object: https://cloud.google.com/spanner/docs/reference/rest/v1/ResultSetMetadata. But at the end of the day it requires fetching the data, at least partially.
   
   I would leave it for another PR, though, as it would presumably require moving SchemaUtils from io/jdbc to some more general place (extensions/sql?). As far as I can see, the Struct type is mapped to String/Varchar, as mentioned in the FAQ, so it may not be the best option:
   ```
   The Cloud Spanner STRUCT data type is mapped to a SQL VARCHAR data type, accessible through 
   this driver as String types. All other types have appropriate mappings.
   ```
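
To make the ResultSetMetadata idea concrete, here is a minimal Python sketch (not part of this PR) of how column names and types might be read from the `rowType` of a ResultSetMetadata JSON payload. The sample payload, the `TYPE_CODE_TO_PYTHON` mapping, and the `fields_from_metadata` helper are all hypothetical and chosen only for illustration. A real implementation would also have to handle ARRAY and STRUCT, and obtaining the metadata still requires issuing a query, as noted above.

```python
import datetime
import json

# Hypothetical mapping from Spanner type codes to Python types (illustrative,
# not exhaustive; ARRAY and STRUCT are deliberately left out).
TYPE_CODE_TO_PYTHON = {
    'BOOL': bool,
    'INT64': int,
    'FLOAT64': float,
    'STRING': str,
    'BYTES': bytes,
    'TIMESTAMP': datetime.datetime,
    'DATE': datetime.date,
}


def fields_from_metadata(metadata):
  """Returns [(column_name, python_type)] from a ResultSetMetadata dict."""
  fields = []
  for field in metadata['rowType']['fields']:
    code = field['type']['code']
    if code not in TYPE_CODE_TO_PYTHON:
      raise ValueError('Unsupported Spanner type code: %s' % code)
    fields.append((field['name'], TYPE_CODE_TO_PYTHON[code]))
  return fields


# Made-up payload in the shape of the REST ResultSetMetadata resource.
SAMPLE_METADATA = json.loads("""
{
  "rowType": {
    "fields": [
      {"name": "id", "type": {"code": "INT64"}},
      {"name": "name", "type": {"code": "STRING"}}
    ]
  }
}
""")

print(fields_from_metadata(SAMPLE_METADATA))
```

The resulting list of pairs has the same shape as the field list passed to `typing.NamedTuple`, so it could in principle feed a generated row type instead of an explicit schema parameter.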







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716740271


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) (1c43284) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [02a1cd2...9efa8c7](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r483316706



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerTransformRegistrar.java
##########
@@ -0,0 +1,287 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.service.AutoService;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.TimestampBound;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.model.pipeline.v1.SchemaApi;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.expansion.ExternalTransformRegistrar;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.SchemaTranslation;
+import org.apache.beam.sdk.transforms.ExternalTransformBuilder;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.grpc.v1p26p0.com.google.protobuf.InvalidProtocolBufferException;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Duration;
+
+/**
+ * Exposes {@link SpannerIO.WriteRows} and {@link SpannerIO.ReadRows} as an external transform for
+ * cross-language usage.
+ */
+@Experimental(Kind.PORTABILITY)
+@AutoService(ExternalTransformRegistrar.class)
+public class SpannerTransformRegistrar implements ExternalTransformRegistrar {
+  public static final String WRITE_URN = "beam:external:java:spanner:write:v1";
+  public static final String READ_URN = "beam:external:java:spanner:read:v1";
+
+  @Override
+  public Map<String, ExternalTransformBuilder<?, ?, ?>> knownBuilderInstances() {
+    return ImmutableMap.of(WRITE_URN, new WriteBuilder(), READ_URN, new ReadBuilder());
+  }
+
+  public abstract static class CrossLanguageConfiguration {
+    String instanceId;
+    String databaseId;
+    String projectId;
+    @Nullable String host;
+    @Nullable String emulatorHost;
+
+    public void setInstanceId(String instanceId) {
+      this.instanceId = instanceId;
+    }
+
+    public void setDatabaseId(String databaseId) {
+      this.databaseId = databaseId;
+    }
+
+    public void setProjectId(String projectId) {
+      this.projectId = projectId;
+    }
+
+    public void setHost(@Nullable String host) {
+      this.host = host;
+    }
+
+    public void setEmulatorHost(@Nullable String emulatorHost) {
+      this.emulatorHost = emulatorHost;
+    }
+  }
+
+  @Experimental(Kind.PORTABILITY)
+  public static class ReadBuilder
+      implements ExternalTransformBuilder<ReadBuilder.Configuration, PBegin, PCollection<Row>> {
+
+    public static class Configuration extends CrossLanguageConfiguration {
+      // TODO: BEAM-10851 Come up with something to determine schema without this explicit parameter
+      private Schema schema;
+      private @Nullable String sql;
+      private @Nullable String table;
+      private @Nullable Boolean batching;
+      private @Nullable String timestampBoundMode;
+      private @Nullable String readTimestamp;
+      private @Nullable String timeUnit;
+      private @Nullable Long exactStaleness;
+
+      public void setSql(@Nullable String sql) {
+        this.sql = sql;
+      }
+
+      public void setTable(@Nullable String table) {
+        this.table = table;
+      }
+
+      public void setBatching(@Nullable Boolean batching) {
+        this.batching = batching;
+      }
+
+      public void setTimestampBoundMode(@Nullable String timestampBoundMode) {
+        this.timestampBoundMode = timestampBoundMode;
+      }
+
+      public void setSchema(byte[] schema) throws InvalidProtocolBufferException {
+        this.schema = SchemaTranslation.schemaFromProto(SchemaApi.Schema.parseFrom(schema));
+      }
+
+      public void setReadTimestamp(@Nullable String readTimestamp) {
+        this.readTimestamp = readTimestamp;
+      }
+
+      public void setTimeUnit(@Nullable String timeUnit) {
+        this.timeUnit = timeUnit;
+      }
+
+      public void setExactStaleness(@Nullable Long exactStaleness) {
+        this.exactStaleness = exactStaleness;
+      }
+
+      private TimestampBound getTimestampBound() {
+        if (timestampBoundMode == null) {
+          return null;
+        }
+
+        TimestampBound.Mode mode = TimestampBound.Mode.valueOf(timestampBoundMode);
+        if (mode == TimestampBound.Mode.MAX_STALENESS
+            || mode == TimestampBound.Mode.EXACT_STALENESS) {
+          checkArgument(
+              exactStaleness != null,
+              "Staleness value cannot be null when MAX_STALENESS or EXACT_STALENESS mode is selected");
+          checkArgument(
+              timeUnit != null,
+              "Time unit cannot be null when MAX_STALENESS or EXACT_STALENESS mode is selected");
+        }
+        if (mode == TimestampBound.Mode.READ_TIMESTAMP
+            || mode == TimestampBound.Mode.MIN_READ_TIMESTAMP) {
+          checkArgument(
+              readTimestamp != null,
+              "Timestamp cannot be null when READ_TIMESTAMP or MIN_READ_TIMESTAMP mode is selected");
+        }
+        switch (mode) {
+          case STRONG:
+            return TimestampBound.strong();
+          case MAX_STALENESS:
+            return TimestampBound.ofMaxStaleness(exactStaleness, TimeUnit.valueOf(timeUnit));
+          case EXACT_STALENESS:
+            return TimestampBound.ofExactStaleness(exactStaleness, TimeUnit.valueOf(timeUnit));
+          case READ_TIMESTAMP:
+            return TimestampBound.ofReadTimestamp(Timestamp.parseTimestamp(readTimestamp));
+          case MIN_READ_TIMESTAMP:
+            return TimestampBound.ofMinReadTimestamp(Timestamp.parseTimestamp(readTimestamp));
+          default:
+            throw new RuntimeException("Unknown timestamp bound mode: " + mode);
+        }
+      }
+
+      public ReadOperation getReadOperation() {
+        checkArgument(
+            sql == null || table == null,
+            "Query and table params are mutually exclusive. Set just one of them.");
+        if (sql != null) {
+          return ReadOperation.create().withQuery(sql);
+        }
+        return ReadOperation.create().withTable(table).withColumns(schema.getFieldNames());
+      }
+    }
+
+    @Override
+    public PTransform<PBegin, PCollection<Row>> buildExternal(Configuration configuration) {
+      SpannerIO.Read readTransform =
+          SpannerIO.read()
+              .withProjectId(configuration.projectId)
+              .withDatabaseId(configuration.databaseId)
+              .withInstanceId(configuration.instanceId)
+              .withReadOperation(configuration.getReadOperation());
+
+      if (configuration.host != null) {
+        readTransform = readTransform.withHost(configuration.host);
+      }
+      if (configuration.emulatorHost != null) {
+        readTransform = readTransform.withEmulatorHost(configuration.emulatorHost);
+      }
+      if (configuration.getTimestampBound() != null) {
+        readTransform = readTransform.withTimestampBound(configuration.getTimestampBound());
+      }
+      if (configuration.batching != null) {
+        readTransform = readTransform.withBatching(configuration.batching);
+      }
+
+      return new SpannerIO.ReadRows(readTransform, configuration.schema);
+    }
+  }
+
+  @Experimental(Kind.PORTABILITY)
+  public static class WriteBuilder
+      implements ExternalTransformBuilder<WriteBuilder.Configuration, PCollection<Row>, PDone> {
+
+    public static class Configuration extends CrossLanguageConfiguration {

Review comment:
       FYI, now with https://github.com/apache/beam/pull/12481 it's possible to use schema inference for these configuration objects, so you should be able to use `@AutoValue` and `AutoValueSchema`, which could save a lot of boilerplate. There's an example here:
   
   https://github.com/apache/beam/blob/89a2d17624a5f2f445b7199fe7a61ec0eca8205a/sdks/java/expansion-service/src/test/java/org/apache/beam/sdk/expansion/service/ExpansionServiceTest.java#L305-L315
   
   (it's fine to leave it as-is, just letting you know)

##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,503 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are several ways to setup cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Install Java runtime in the computer from where the pipeline is constructed
+    and make sure that 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) or
+  build (when running from a Beam Git clone) a expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you startup your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Startup your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):
+  """
+  A PTransform which writes mutations to the specified instance's database
+  via Spanner.
+
+  This transform receives Mutations defined as NamedTuple which are created
+  via utility class MutationCreator. Mutation needs to know what row type does
+  it wrap. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+    coders.registry.register_coder(ExampleRow, coders.RowCoder)
+
+    mutation_creator = MutationCreator('table', ExampleRow, 'ExampleMutation')

Review comment:
       Overall I think it makes a lot of sense to use Rows for the Mutations, with a nested Row for the data, but this API is pretty tricky. Could you look into adding a separate PTransform (or multiple PTransforms) for converting the Rows to mutations? I think an API like this should be possible:
   
   ```py
   pc = ...  # some PCollection with a schema
   
   (pc | RowToMutation.insert('table')
       | WriteToSpanner(...))
   
   # OR
   
   (pc | RowToMutation.insertOrUpdate('table')
       | WriteToSpanner(...))
   
   # OR
   
   (pc | RowToMutation.delete('table')
       | WriteToSpanner(...))
   ```
   
   The PTransform would be able to look at the `element_type` of the input PCollection in its `expand` method and create a mutation type that wraps it. There aren't a lot of examples of logic like this in the Python SDK (yet); the only one I know of is here: https://github.com/apache/beam/blob/cfa448d121297398312d09c531258a72b413488b/sdks/python/apache_beam/dataframe/schemas.py#L50-L55
   
   That way the user wouldn't need to pass the type they're planning on using to MutationCreator. What do you think of that?
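
The type-wrapping step described above can be sketched without Beam, using only `typing.NamedTuple`. This is an illustration under assumptions, not the PR's implementation: `make_mutation_type`, `to_insert_mutation`, and the `operation`/`table`/`row` field names are hypothetical. In an actual `RowToMutation` transform, the row type would presumably come from the input PCollection's `element_type` inside `expand`, and the generated type would be registered with a `RowCoder`.

```python
import typing

ExampleRow = typing.NamedTuple('ExampleRow', [('id', int), ('name', str)])


def make_mutation_type(row_type):
  """Builds a NamedTuple type that wraps `row_type` as a nested row."""
  return typing.NamedTuple(
      row_type.__name__ + 'Mutation',
      [('operation', str), ('table', str), ('row', row_type)])


def to_insert_mutation(table, row):
  """Wraps `row` in an INSERT mutation for `table`."""
  # A real transform would build the wrapper type once, not per element.
  mutation_type = make_mutation_type(type(row))
  return mutation_type(operation='INSERT', table=table, row=row)


mutation = to_insert_mutation('table', ExampleRow(id=1, name='beam'))
print(mutation)
```

Because the wrapper type is derived from the row type, the user only supplies the table name, which is the point of the proposed `RowToMutation.insert('table')` style.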
   

##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,504 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are several ways to setup cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Install Java runtime in the computer from where the pipeline is constructed
+    and make sure that 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) or
+  build (when running from a Beam Git clone) a expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you startup your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Startup your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('batch_size_bytes', Optional[int]),
+        ('max_num_mutations', Optional[int]),
+        ('max_num_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):

Review comment:
       Yeah that makes sense. There's definitely still value in adding this even if we end up preferring the native Python one, since we can use it from the Go SDK in the future.







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-727022314


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r523017744



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,635 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.26.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/

Review comment:
       I agree - it refers to all the existing xlang transforms, so it'll be done in another PR?







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-687192502


   Run Python 3.8 PostCommit





[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r524468664



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,380 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+import org.joda.time.ReadableDateTime;
+
+final class StructUtils {
+  public static Row structToBeamRow(Struct struct, Schema schema) {
+    Map<String, Object> structValues =
+        schema.getFields().stream()
+            .collect(
+                HashMap::new,
+                (map, field) -> {
+                  @Nullable Object structValue = getStructValue(struct, field);
+                  if (structValue == null) {
+                    throw new NullPointerException("Null struct value at field " + field.getName());
+                  }
+                  map.put(field.getName(), structValue);
+                },
+                Map::putAll);
+    return Row.withSchema(schema).withFieldValues(structValues).build();
+  }
+
+  public static Struct beamRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              @Nullable Row subRow = row.getRow(column);
+              if (subRow == null) {
+                throw new NullPointerException(String.format("Null subRow at '%s' column", column));
+              }

Review comment:
       Ack, ok







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-717285241


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r483525931



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,503 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):
+  """
+  A PTransform which writes mutations to the specified instance's database
+  via Spanner.
+
+  This transform receives Mutations defined as NamedTuple which are created
+  via utility class MutationCreator. Mutation needs to know what row type does
+  it wrap. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+    coders.registry.register_coder(ExampleRow, coders.RowCoder)
+
+    mutation_creator = MutationCreator('table', ExampleRow, 'ExampleMutation')

Review comment:
       That way we lose the possibility of mixing different kinds of mutations. I can't imagine any sane use of mixed insert/delete since the order is not guaranteed, so I agree that removing this assumption is justified.
   
   Since we would always map rows to mutations before writing, it would be good to do that mapping inside WriteToSpanner. How about an API like this?
   ```
   WriteToSpanner(...).insert(table)
   WriteToSpanner(...).delete(table)
   ```
   It's not consistent with ReadFromSpanner(...), but I think it's better than forcing the user to call RowToMutation each time.
   To be more consistent, I could do something like `ReadFromSpanner(...).from_table(table)` and `ReadFromSpanner(...).from_sql(sql_query)`.
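
A minimal, Beam-free sketch of the builder-style API floated in the comment above. Everything here is an assumption made for illustration: `WriteToSpannerSketch`, its `operation` and `table` attributes, and the copy-on-configure behaviour are hypothetical and are not the API in this PR.

```python
import copy


class WriteToSpannerSketch(object):
  """Hypothetical stand-in for a write transform; not part of Beam."""
  def __init__(self, instance_id, database_id, project_id=None):
    self.instance_id = instance_id
    self.database_id = database_id
    self.project_id = project_id
    # Filled in by insert()/delete(); None means "not configured yet".
    self.operation = None
    self.table = None

  def _with_operation(self, operation, table):
    # Return a configured copy so the unconfigured object stays reusable.
    configured = copy.copy(self)
    configured.operation = operation
    configured.table = table
    return configured

  def insert(self, table):
    return self._with_operation('insert', table)

  def delete(self, table):
    return self._with_operation('delete', table)


base = WriteToSpannerSketch('my-instance', 'my-database')
insert_users = base.insert('users')
delete_users = base.delete('users')
```

With this shape each transform instance carries exactly one kind of mutation, which matches the point above that mixing inserts and deletes in one write has no guaranteed order anyway.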







[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501676146



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       Thank you @nielm! I thought about the LIMIT approach, but then I came to the same arguments against it.
   
   It appears there is a JDBC client for Spanner: https://cloud.google.com/spanner/docs/jdbc-drivers . I'll try to figure out whether I can use it.
   
   There is also ResultSetMetadata, a JSON object in Spanner's REST API: https://cloud.google.com/spanner/docs/reference/rest/v1/ResultSetMetadata , but at the end of the day it requires fetching at least part of the data.
   
   But I would leave it for another PR, as it would presumably require moving SchemaUtils from io/jdbc to some more general place (extensions/sql?). As far as I can see, the Struct type is represented as String, as mentioned in the FAQ:
   ```
   The Cloud Spanner STRUCT data type is mapped to a SQL VARCHAR data type, accessible through this driver as String types. All other types have appropriate mappings.
   ```
   So it may not be the best option.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language
+    int count = 0;
+    while (valueBuilder == null && count < fields.size()) {
+      valueBuilder = getFirstStructValue(struct, fields.get(count), schema);
+      ++count;
+    }
+    for (int i = count; i < fields.size(); ++i) {
+      valueBuilder = getStructValue(valueBuilder, struct, fields.get(i));
+    }
+    return valueBuilder != null ? valueBuilder.build() : Row.withSchema(schema).build();
+  }
+
+  public static Struct translateRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              structBuilder
+                  .set(column)
+                  .to(
+                      beamTypeToSpannerType(field.getType()),
+                      translateRowToStruct(row.getRow(column)));
+              break;
+            case ARRAY:
+              addArrayToStruct(structBuilder, row, field);
+              break;
+            case ITERABLE:
+              addIterableToStruct(structBuilder, row, field);
+              break;
+            case FLOAT:
+              structBuilder.set(column).to(row.getFloat(column).doubleValue());
+              break;
+            case DOUBLE:
+              structBuilder.set(column).to(row.getDouble(column));
+              break;
+            case DECIMAL:
+              structBuilder.set(column).to(row.getDecimal(column).doubleValue());

Review comment:
       Great, I think it wasn't available when I wrote that code. Thanks!
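
The DECIMAL branch in the hunk above goes through `doubleValue()`. As a rough illustration of what that conversion can cost, here is a small sketch that uses Python's `decimal.Decimal` and `float` as stand-ins for Java's `BigDecimal` and `double` (an assumption made only for this example; the sample values are made up):

```python
from decimal import Decimal

# A value with more significant digits than a 64-bit float can hold.
exact = Decimal('12345678901234567890.123456789')

# Analogous to BigDecimal.doubleValue(): squeeze the value into a double.
as_double = float(exact)

# Converting back exposes the digits that were dropped along the way.
round_tripped = Decimal(as_double)
precision_lost = round_tripped != exact

# Values that a double can represent exactly survive the round trip.
exactly_representable = Decimal(float(Decimal('0.5'))) == Decimal('0.5')
```

Whether that loss matters depends on the column's intended precision; the sketch only shows that the round trip is not lossless in general.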

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language
+    int count = 0;
+    while (valueBuilder == null && count < fields.size()) {
+      valueBuilder = getFirstStructValue(struct, fields.get(count), schema);
+      ++count;
+    }
+    for (int i = count; i < fields.size(); ++i) {
+      valueBuilder = getStructValue(valueBuilder, struct, fields.get(i));
+    }
+    return valueBuilder != null ? valueBuilder.build() : Row.withSchema(schema).build();
+  }
+
+  public static Struct translateRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              structBuilder
+                  .set(column)
+                  .to(
+                      beamTypeToSpannerType(field.getType()),
+                      translateRowToStruct(row.getRow(column)));
+              break;
+            case ARRAY:
+              addArrayToStruct(structBuilder, row, field);
+              break;
+            case ITERABLE:
+              addIterableToStruct(structBuilder, row, field);
+              break;
+            case FLOAT:
+              structBuilder.set(column).to(row.getFloat(column).doubleValue());
+              break;
+            case DOUBLE:
+              structBuilder.set(column).to(row.getDouble(column));
+              break;
+            case DECIMAL:
+              structBuilder.set(column).to(row.getDecimal(column).doubleValue());

Review comment:
       Great, quite a new thing in Spanner as I can see! Thanks! Done.







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726829353


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-688837568


   Run Python 3.8 PostCommit





[GitHub] [beam] chamikaramj commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-689900820


   cc: @allenpradeep @nielm 





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-717282468


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [84db719...537aa7f](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [84db719...356788f](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-712244810


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.05%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.29%   +0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53884     +158     
   ==========================================
   + Hits        21619    21714      +95     
   - Misses      32107    32170      +63     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   | [sdks/python/apache\_beam/portability/python\_urns.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvcHl0aG9uX3VybnMucHk=) | `100.00% <0.00%> (ø)` | |
   | [...s/python/apache\_beam/io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=) | `23.36% <0.00%> (ø)` | |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `28.92% <0.00%> (+0.24%)` | :arrow_up: |
   | [sdks/python/apache\_beam/dataframe/frames.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL2ZyYW1lcy5weQ==) | `43.73% <0.00%> (+0.76%)` | :arrow_up: |
   | [sdks/python/apache\_beam/transforms/core.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb3JlLnB5) | `39.22% <0.00%> (+0.94%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...b0c23a2](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r482073899



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       I'd really like to do it in this PR, but the only approach that comes to mind is the one you suggested: perform the read request with the client and then read the schema from the result. The obvious disadvantage is that the Spanner query will be executed twice. From what I found, appending a limit of 1 row to the query does not improve performance, so that is not a workable option for huge result sets.
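
To make the trade-off concrete, here is a minimal Python sketch of the "run the query once for the schema, then again for the data" approach. Everything in it is hypothetical: `CountingClient` is a stand-in that only counts executions, and `infer_schema` and `read_rows` are illustrative names, not Spanner or Beam APIs.

```python
class CountingClient:
    """Hypothetical stand-in for a database client; counts executed queries."""

    def __init__(self, rows):
        self._rows = rows
        self.executions = 0

    def execute(self, sql):
        self.executions += 1
        return iter(self._rows)


def infer_schema(client, sql):
    """Run the query and derive column names and types from the first row."""
    first = next(client.execute(sql), None)
    if first is None:
        return {}
    return {name: type(value).__name__ for name, value in first.items()}


def read_rows(client, sql):
    schema = infer_schema(client, sql)  # first execution, schema only
    rows = list(client.execute(sql))    # second execution, actual data
    return schema, rows


client = CountingClient([{"Key": 3}, {"Key": 4}])
schema, rows = read_rows(client, "SELECT Key FROM T")
```

In this sketch the counter ends at two executions for a single logical read, which is the cost described in the comment above.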







[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r482079860



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {

Review comment:
       There are unrelated classes that need the same operations performed on them.
   One pair is Key and Mutation, another is Row.Builder and Row.FieldValueBuilder. In Python there is duck typing, so it's easy, but in Java I don't know how to reduce the repeated code. Maybe I should use more setValue(Object obj) methods and rely on casts instead of returning the proper type every time. I'll try it.
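
As an illustration of the duck-typing remark, a small Python sketch follows. The two builder classes are hypothetical stand-ins (they are not the Spanner Key or Mutation builders), and `set_value` and `copy_fields` are names invented for this example.

```python
class FakeKeyBuilder:
    """Hypothetical stand-in for a key builder: keeps values in order."""

    def __init__(self):
        self.values = []

    def set_value(self, column, value):
        self.values.append(value)
        return self


class FakeMutationBuilder:
    """Hypothetical stand-in for a mutation builder: keeps column/value pairs."""

    def __init__(self):
        self.columns = {}

    def set_value(self, column, value):
        self.columns[column] = value
        return self


def copy_fields(record, builder):
    """Works with any builder that exposes set_value(column, value)."""
    for column, value in record.items():
        builder.set_value(column, value)
    return builder


record = {"id": 1, "name": "beam"}
key = copy_fields(record, FakeKeyBuilder())
mutation = copy_fields(record, FakeMutationBuilder())
```

One helper serves both unrelated classes because only the method name matters. In Java the closest equivalent would usually be a shared interface or a small adapter around each builder.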







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-727154152


   Run Python 3.7 PostCommit





[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r524471761



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,635 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are several ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.26.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink users can use the built-in expansion service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/

Review comment:
       Yeah it can be done in another PR. Filed BEAM-11269 to track this.







[GitHub] [beam] piotr-szuberski edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-687108482


   @chamikaramj Brian asked me to ask you to continue the review, as he is going OOO this week. I'd be grateful :)
   If you don't have time before Thursday, this PR can wait; there is no rush.
   I've changed the API of WriteToSpanner to use WriteToSpanner(config).insert(table) etc. instead of MutationCreator.
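
The shape of the call described above, `WriteToSpanner(config).insert(table)`, can be sketched in plain Python as follows. This is not the wrapper from the PR: every class and field name here (`SpannerConfigSketch`, `WriteSpec`, `WriteToSpannerSketch`) is a hypothetical stand-in, and the `update` method is an assumption about what "etc." might cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpannerConfigSketch:
    """Hypothetical connection settings; field names are assumptions."""
    project_id: str
    instance_id: str
    database_id: str


@dataclass(frozen=True)
class WriteSpec:
    """What a configured write would carry: the operation, table and config."""
    operation: str
    table: str
    config: SpannerConfigSketch


class WriteToSpannerSketch:
    """Hypothetical fluent entry point: config first, then the operation."""

    def __init__(self, config):
        self._config = config

    def insert(self, table):
        return WriteSpec("insert", table, self._config)

    def update(self, table):  # assumed sibling of insert()
        return WriteSpec("update", table, self._config)


config = SpannerConfigSketch("my-project", "my-instance", "my-database")
spec = WriteToSpannerSketch(config).insert("Singers")
```

The idea is that the shared configuration is stated once and the operation is picked by method name, so callers do not need to build a separate mutation-creator object.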





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686482934


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [e4c95f2...2750244](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r482074113



##########
File path: sdks/java/io/google-cloud-platform/expansion-service/build.gradle
##########
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: 'org.apache.beam.module'
+apply plugin: 'application'
+mainClassName = "org.apache.beam.sdk.expansion.service.ExpansionService"
+
+applyJavaNature(
+        enableChecker: true,
+        automaticModuleName: 'org.apache.beam.sdk.io.gcp.expansion.service',
+        exportJavadoc: false,
+        validateShadowJar: false,
+        shadowClosure: {},
+)
+
+task runService(type: Exec) {
+    dependsOn shadowJar
+    executable 'sh'
+    args '-c', 'java -jar /Users/piotr/beam/sdks/java/io/google-cloud-platform/expansion-service/build/libs/beam-sdks-java-io-google-cloud-platform-expansion-service-2.24.0-SNAPSHOT.jar 8097'
+}

Review comment:
       My fault, I'll remove this. Sorry for that.







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-717180946


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10131][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-676422822


   > Regarding testing: we could consider adding a Spanner instance to apache-beam-testing for integration testing; I'd suggest raising it on dev@ if you want to pursue it. I also just came across https://cloud.google.com/spanner/docs/emulator, which could be a good option too. It's a Docker container that starts up an in-memory version of Spanner to test against.
   
   @TheNeuralBit Great advice as always! I tried to find something like this emulator on Docker Hub, but without success. I managed to use this emulator successfully; it is much better supported than localstack is for AWS.
   
   I am almost certain that the Schema doesn't have to be sent as a proto in Read, but I couldn't come up with an alternative.
   
   Another issue is representing the Mutation. For now it's a Row containing 4 fields: operation, table, rows and key_set. It works quite well, but I wonder whether I can do it better.
   
   FYI - I'll be OOO next week, so there is absolutely no rush :)
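
   A minimal sketch of the Row layout described above, for illustration only. The four field names (operation, table, rows, key_set) come from this comment; ExampleRow, ExampleMutation and the insert/delete helpers are hypothetical names and are not taken from the PR:

```python
from typing import List, NamedTuple, Optional


class ExampleRow(NamedTuple):
  # Hypothetical user row type; the field names are illustrative only.
  id: int
  name: str


class ExampleMutation(NamedTuple):
  # The four fields named in the comment above.
  operation: str  # e.g. 'INSERT' or 'DELETE'
  table: str
  rows: Optional[List[ExampleRow]]  # populated for write operations
  key_set: Optional[List[ExampleRow]]  # populated for deletes


def insert(table: str, rows: List[ExampleRow]) -> ExampleMutation:
  # A write carries rows and leaves key_set empty.
  return ExampleMutation('INSERT', table, rows, None)


def delete(table: str, keys: List[ExampleRow]) -> ExampleMutation:
  # A delete carries keys and leaves rows empty.
  return ExampleMutation('DELETE', table, None, keys)


mutations = [
    insert('users', [ExampleRow(1, 'alice')]),
    delete('users', [ExampleRow(1, 'alice')]),
]
```

   Keeping rows and key_set as separate optional fields lets a single schema cover both writes and deletes, at the cost of one field always being empty.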





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-727010594


   Run Java PreCommit





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686475344


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [e4c95f2...fd94c76](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r483525931



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,503 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Install Java runtime in the computer from where the pipeline is constructed
+    and make sure that 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) or
+  build (when running from a Beam Git clone) an expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you startup your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Startup your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):
+  """
+  A PTransform which writes mutations to the specified instance's database
+  via Spanner.
+
+  This transform receives Mutations defined as NamedTuple which are created
+  via utility class MutationCreator. Mutation needs to know what row type does
+  it wrap. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+    coders.registry.register_coder(ExampleRow, coders.RowCoder)
+
+    mutation_creator = MutationCreator('table', ExampleRow, 'ExampleMutation')

Review comment:
       That way we lose the possibility of mixing different kinds of mutations. I can't imagine any sane use of mixed insert/delete, as the order is not guaranteed. Since we will always map rows to mutations beforehand, it would be good to enclose the row-to-mutation mapping inside WriteToSpanner. How about an API like this?
   ```
   WriteToSpanner(...).insert(table)
   WriteToSpanner(...).delete(table)
   ```
   It's not consistent with ReadFromSpanner(...) but I think it's better than forcing the user to call RowToMutation each time.
   To be more consistent I could do something like `ReadFromSpanner(...).from_table(table)` and `ReadFromSpanner(...).from_sql(sql_query)`
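
   The proposal above can be sketched in plain Python, under stated assumptions. WriteToSpannerSketch, ExampleRow, _bind and to_mutation are hypothetical names invented for this illustration; this is not the Beam transform and does not reflect how the PR implements it. It only shows the idea of binding the operation and table on the write object so that the row-to-mutation mapping is hidden from the user:

```python
from typing import NamedTuple


class ExampleRow(NamedTuple):
  # Hypothetical row type used only for this sketch.
  id: int
  name: str


class WriteToSpannerSketch(object):
  """Hypothetical stand-in for the proposed API; not the Beam transform."""
  def __init__(self, instance_id, database_id, operation=None, table=None):
    self.instance_id = instance_id
    self.database_id = database_id
    self.operation = operation
    self.table = table

  def _bind(self, operation, table):
    # Return a new object so the unbound configuration can be reused.
    return WriteToSpannerSketch(
        self.instance_id, self.database_id, operation, table)

  def insert(self, table):
    return self._bind('INSERT', table)

  def delete(self, table):
    return self._bind('DELETE', table)

  def to_mutation(self, row):
    # The mapping lives inside the write object, so callers never build
    # mutations themselves.
    if self.operation is None:
      raise ValueError('Call insert(table) or delete(table) first.')
    return {
        'operation': self.operation,
        'table': self.table,
        'row': dict(row._asdict()),
    }


write = WriteToSpannerSketch('my-instance', 'my-db').insert('users')
mutation = write.to_mutation(ExampleRow(1, 'alice'))
```

   Because each write object is bound to exactly one operation, mixing inserts and deletes in a single write is ruled out by construction, which matches the trade-off described above.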







[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r483574800



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerTransformRegistrar.java
##########
@@ -0,0 +1,287 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.service.AutoService;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.TimestampBound;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.model.pipeline.v1.SchemaApi;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.expansion.ExternalTransformRegistrar;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.SchemaTranslation;
+import org.apache.beam.sdk.transforms.ExternalTransformBuilder;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.grpc.v1p26p0.com.google.protobuf.InvalidProtocolBufferException;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Duration;
+
+/**
+ * Exposes {@link SpannerIO.WriteRows} and {@link SpannerIO.ReadRows} as an external transform for
+ * cross-language usage.
+ */
+@Experimental(Kind.PORTABILITY)
+@AutoService(ExternalTransformRegistrar.class)
+public class SpannerTransformRegistrar implements ExternalTransformRegistrar {
+  public static final String WRITE_URN = "beam:external:java:spanner:write:v1";
+  public static final String READ_URN = "beam:external:java:spanner:read:v1";
+
+  @Override
+  public Map<String, ExternalTransformBuilder<?, ?, ?>> knownBuilderInstances() {
+    return ImmutableMap.of(WRITE_URN, new WriteBuilder(), READ_URN, new ReadBuilder());
+  }
+
+  public abstract static class CrossLanguageConfiguration {
+    String instanceId;
+    String databaseId;
+    String projectId;
+    @Nullable String host;
+    @Nullable String emulatorHost;
+
+    public void setInstanceId(String instanceId) {
+      this.instanceId = instanceId;
+    }
+
+    public void setDatabaseId(String databaseId) {
+      this.databaseId = databaseId;
+    }
+
+    public void setProjectId(String projectId) {
+      this.projectId = projectId;
+    }
+
+    public void setHost(@Nullable String host) {
+      this.host = host;
+    }
+
+    public void setEmulatorHost(@Nullable String emulatorHost) {
+      this.emulatorHost = emulatorHost;
+    }
+  }
+
+  @Experimental(Kind.PORTABILITY)
+  public static class ReadBuilder
+      implements ExternalTransformBuilder<ReadBuilder.Configuration, PBegin, PCollection<Row>> {
+
+    public static class Configuration extends CrossLanguageConfiguration {
+      // TODO: BEAM-10851 Come up with something to determine schema without this explicit parameter
+      private Schema schema;
+      private @Nullable String sql;
+      private @Nullable String table;
+      private @Nullable Boolean batching;
+      private @Nullable String timestampBoundMode;
+      private @Nullable String readTimestamp;
+      private @Nullable String timeUnit;
+      private @Nullable Long exactStaleness;
+
+      public void setSql(@Nullable String sql) {
+        this.sql = sql;
+      }
+
+      public void setTable(@Nullable String table) {
+        this.table = table;
+      }
+
+      public void setBatching(@Nullable Boolean batching) {
+        this.batching = batching;
+      }
+
+      public void setTimestampBoundMode(@Nullable String timestampBoundMode) {
+        this.timestampBoundMode = timestampBoundMode;
+      }
+
+      public void setSchema(byte[] schema) throws InvalidProtocolBufferException {
+        this.schema = SchemaTranslation.schemaFromProto(SchemaApi.Schema.parseFrom(schema));
+      }
+
+      public void setReadTimestamp(@Nullable String readTimestamp) {
+        this.readTimestamp = readTimestamp;
+      }
+
+      public void setTimeUnit(@Nullable String timeUnit) {
+        this.timeUnit = timeUnit;
+      }
+
+      public void setExactStaleness(@Nullable Long exactStaleness) {
+        this.exactStaleness = exactStaleness;
+      }
+
+      private TimestampBound getTimestampBound() {
+        if (timestampBoundMode == null) {
+          return null;
+        }
+
+        TimestampBound.Mode mode = TimestampBound.Mode.valueOf(timestampBoundMode);
+        if (mode == TimestampBound.Mode.MAX_STALENESS
+            || mode == TimestampBound.Mode.EXACT_STALENESS) {
+          checkArgument(
+              exactStaleness != null,
+              "Staleness value cannot be null when MAX_STALENESS or EXACT_STALENESS mode is selected");
+          checkArgument(
+              timeUnit != null,
+              "Time unit cannot be null when MAX_STALENESS or EXACT_STALENESS mode is selected");
+        }
+        if (mode == TimestampBound.Mode.READ_TIMESTAMP
+            || mode == TimestampBound.Mode.MIN_READ_TIMESTAMP) {
+          checkArgument(
+              readTimestamp != null,
+              "Timestamp cannot be null when READ_TIMESTAMP or MIN_READ_TIMESTAMP mode is selected");
+        }
+        switch (mode) {
+          case STRONG:
+            return TimestampBound.strong();
+          case MAX_STALENESS:
+            return TimestampBound.ofMaxStaleness(exactStaleness, TimeUnit.valueOf(timeUnit));
+          case EXACT_STALENESS:
+            return TimestampBound.ofExactStaleness(exactStaleness, TimeUnit.valueOf(timeUnit));
+          case READ_TIMESTAMP:
+            return TimestampBound.ofReadTimestamp(Timestamp.parseTimestamp(readTimestamp));
+          case MIN_READ_TIMESTAMP:
+            return TimestampBound.ofMinReadTimestamp(Timestamp.parseTimestamp(readTimestamp));
+          default:
+            throw new RuntimeException("Unknown timestamp bound mode: " + mode);
+        }
+      }
+
+      public ReadOperation getReadOperation() {
+        checkArgument(
+            sql == null || table == null,
+            "Query and table params are mutually exclusive. Set just one of them.");
+        if (sql != null) {
+          return ReadOperation.create().withQuery(sql);
+        }
+        return ReadOperation.create().withTable(table).withColumns(schema.getFieldNames());
+      }
+    }
+
+    @Override
+    public PTransform<PBegin, PCollection<Row>> buildExternal(Configuration configuration) {
+      SpannerIO.Read readTransform =
+          SpannerIO.read()
+              .withProjectId(configuration.projectId)
+              .withDatabaseId(configuration.databaseId)
+              .withInstanceId(configuration.instanceId)
+              .withReadOperation(configuration.getReadOperation());
+
+      if (configuration.host != null) {
+        readTransform = readTransform.withHost(configuration.host);
+      }
+      if (configuration.emulatorHost != null) {
+        readTransform = readTransform.withEmulatorHost(configuration.emulatorHost);
+      }
+      if (configuration.getTimestampBound() != null) {
+        readTransform = readTransform.withTimestampBound(configuration.getTimestampBound());
+      }
+      if (configuration.batching != null) {
+        readTransform = readTransform.withBatching(configuration.batching);
+      }
+
+      return new SpannerIO.ReadRows(readTransform, configuration.schema);
+    }
+  }
+
+  @Experimental(Kind.PORTABILITY)
+  public static class WriteBuilder
+      implements ExternalTransformBuilder<WriteBuilder.Configuration, PCollection<Row>, PDone> {
+
+    public static class Configuration extends CrossLanguageConfiguration {

Review comment:
       Many setters have some logic inside them, and AutoValue does not allow inheritance, so there would be some code duplication in the getter methods. As this is just a cosmetic issue and I have other things to do, I'll leave it as it is for now.







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716871424


   Run Website_Stage_GCS PreCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686475344


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686493626


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r522054029



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,662 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.26.0 and later.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Install Java runtime in the computer from where the pipeline is constructed
+    and make sure that 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) or
+  build (when running from a Beam Git clone) an expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide its
+  address as a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initializing the Spanner transforms provided in this module.
+
+  Flink users can use the built-in expansion service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import uuid
+from enum import Enum
+from enum import auto
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_from_schema
+from apache_beam.typehints.schemas import named_tuple_to_schema
+from apache_beam.typehints.schemas import schema_from_element_type
+
+__all__ = [
+    'ReadFromSpanner',
+    'SpannerDelete',
+    'SpannerInsert',
+    'SpannerInsertOrUpdate',
+    'SpannerReplace',
+    'SpannerUpdate',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+_READ_URN = 'beam:external:java:spanner:read:v1'
+_WRITE_URN = 'beam:external:java:spanner:write:v1'
+
+
+class TimeUnit(Enum):
+  NANOSECONDS = auto()
+  MICROSECONDS = auto()
+  MILLISECONDS = auto()
+  SECONDS = auto()
+  HOURS = auto()
+  DAYS = auto()
+
+
+class TimestampBoundMode(Enum):
+  MAX_STALENESS = auto()
+  EXACT_STALENESS = auto()
+  READ_TIMESTAMP = auto()
+  MIN_READ_TIMESTAMP = auto()
+  STRONG = auto()
+
+
+class ReadFromSpannerSchema(NamedTuple):
+  instance_id: unicode
+  database_id: unicode
+  schema: bytes
+  sql: Optional[unicode]
+  table: Optional[unicode]
+  project_id: Optional[unicode]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  batching: Optional[bool]
+  timestamp_bound_mode: Optional[unicode]
+  read_timestamp: Optional[unicode]
+  exact_staleness: Optional[int]
+  time_unit: Optional[unicode]
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the rows it returns in order to provide
+  the schema. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+              timestamp_bound_mode=TimestampBoundMode.MAX_STALENESS,
+              exact_staleness=3,
+              time_unit=TimeUnit.HOURS,
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: An SQL query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A Spanner table. When provided, all columns from row_type
+        will be selected in the query. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Passed as TimestampBoundMode enum. Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most exact_staleness (expressed
+        in time_unit) stale.
+    :param read_timestamp: Timestamp as a string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for exact_staleness, passed as TimeUnit enum.
+        Possible values: NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS,
+        HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    # Exactly one of sql and table must be provided.
+    assert (sql or table) and not (sql and table)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      # Staleness and its unit must be set together, and only for the two
+      # staleness-based timestamp bound modes.
+      assert staleness_value and time_unit and timestamp_bound_mode in (
+          TimestampBoundMode.MAX_STALENESS,
+          TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode in (
+          TimestampBoundMode.MIN_READ_TIMESTAMP,
+          TimestampBoundMode.READ_TIMESTAMP)
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        _READ_URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=_get_enum_name(timestamp_bound_mode),
+                read_timestamp=read_timestamp,
+                exact_staleness=staleness_value,
+                time_unit=_get_enum_name(time_unit),
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class WriteToSpannerSchema(NamedTuple):
+  project_id: unicode
+  instance_id: unicode
+  database_id: unicode
+  max_batch_size_bytes: Optional[int]
+  max_number_mutations: Optional[int]
+  max_number_rows: Optional[int]
+  grouping_factor: Optional[int]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  commit_deadline: Optional[int]
+  max_cumulative_backoff: Optional[int]
+
+
+_CLASS_DOC = \
+  """
+  A PTransform which writes {0} mutations to the specified Spanner table.
+
+  This transform receives rows defined as NamedTuple. Example::
+
+    {1} = typing.NamedTuple('{1}',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: {1}(n, str(n)))
+              .with_output_types({2})
+          | 'Write to Spanner' >> Spanner{3}(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id',
+              table='your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+
+_INIT_DOC = \
+  """
+  Initializes {} operation to a Spanner table.
+
+  :param project_id: Specifies the Cloud Spanner project.
+  :param instance_id: Specifies the Cloud Spanner instance.
+  :param database_id: Specifies the Cloud Spanner database.
+  :param table: Specifies the Cloud Spanner table.
+  :param max_batch_size_bytes: Specifies the batch size limit (max number of
+      bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+  :param max_number_mutations: Specifies the cell mutation limit (maximum
+      number of mutated cells per batch). Default value is 5000.
+  :param max_number_rows: Specifies the row mutation limit (maximum number of
+      mutated rows per batch). Default value is 500.
+  :param grouping_factor: Specifies the multiple of max mutation (in terms
+      of both bytes per batch and cells per batch) that is used to select a
+      set of mutations to sort by key for batching. This sort uses local
+      memory on the workers, so using large values can cause out of memory
+      errors. Default value is 1000.
+  :param host: Specifies the Cloud Spanner host.
+  :param emulator_host: Specifies Spanner emulator host.
+  :param commit_deadline: Specifies the deadline for the Commit API call.
+      Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+      until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+      are reported with logging and counters. Pass seconds as value.
+  :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+      time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+      (15min). If the mutations still have not been written after this time,
+      they are treated as a failure, and handled according to the setting of
+      failure_mode. Pass seconds as value.
+  :param expansion_service: The address (host:port) of the ExpansionService.
+  """
+
+
+def _add_doc(value, *args):
+  def _doc(obj):
+    obj.__doc__ = value.format(*args)
+    return obj
+
+  return _doc
+
+
+@_add_doc(_CLASS_DOC, 'delete', 'ExampleKey', 'List[ExampleKey]', 'Delete')
+class SpannerDelete(PTransform):
+  @_add_doc(_INIT_DOC, 'a delete')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.DELETE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(_CLASS_DOC, 'insert', 'ExampleRow', 'ExampleRow', 'Insert')
+class SpannerInsert(PTransform):
+  @_add_doc(_INIT_DOC, 'an insert')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.INSERT, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(_CLASS_DOC, 'replace', 'ExampleRow', 'ExampleRow', 'Replace')
+class SpannerReplace(PTransform):
+  @_add_doc(_INIT_DOC, 'a replace')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.REPLACE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(
+    _CLASS_DOC,
+    'insert-or-update',
+    'ExampleRow',
+    'ExampleRow',
+    'InsertOrUpdate')
+class SpannerInsertOrUpdate(PTransform):
+  @_add_doc(_INIT_DOC, 'an insert-or-update')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.INSERT_OR_UPDATE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(_CLASS_DOC, 'update', 'ExampleRow', 'ExampleRow', 'Update')
+class SpannerUpdate(PTransform):
+  @_add_doc(_INIT_DOC, 'an update')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.UPDATE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+def _apply_write_transform(pbegin, to_mutation, params, expansion_service):
+  return (
+      pbegin
+      | to_mutation
+      | ExternalTransform(
+          _WRITE_URN, NamedTupleBasedPayloadBuilder(params), expansion_service))
+
+
+class _RowToMutation(PTransform):

Review comment:
       Done. I've replaced keysets with keys. It's horrible to distinguish List<Row> from Row at runtime; with keys only it's way cleaner.
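
To illustrate the point about keysets versus keys, here is a minimal Python sketch. It is an analogue of the runtime check, not the PR's Java code. `ExampleKey`, `normalize_keys`, and `key_only` are hypothetical names introduced only for this illustration, and the sketch assumes a delete element could arrive either as a single key row or as a list of key rows.

```python
from typing import List, NamedTuple, Union


class ExampleKey(NamedTuple):
  """Hypothetical key row with a single primary-key column."""
  id: int


def normalize_keys(
    element: Union[ExampleKey, List[ExampleKey]]) -> List[ExampleKey]:
  """Hypothetical helper accepting one key or a list of keys (a 'keyset')."""
  # A NamedTuple instance is itself a tuple, so a generic "is this a
  # sequence?" check cannot tell one key from a collection of keys.
  # The code has to special-case list explicitly.
  if isinstance(element, list):
    return list(element)
  return [element]


def key_only(element: ExampleKey) -> ExampleKey:
  """With keys only, every element has one shape and needs no branching."""
  return element
```

With the keys-only design every element has a single, statically known type, so the branching in `normalize_keys` is not needed at all.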




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-723916260


   @TheNeuralBit I promise that this is the last big review from me. I've only recently realized how much work I've made you do. Better late than never, I guess!





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r505693635



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,483 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide its
+  address as a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initializing the Spanner transforms provided in this module.
+
+  Flink users can use the built-in expansion service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner:
+  """
+  A PTransform which writes mutations to the specified instance's database
+  via Spanner.
+
+  This transform receives rows defined as NamedTuple or as List[NamedTuple]
+  in case of delete operation. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: ExampleRow(n, str(n)))
+              .with_output_types(ExampleRow)
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').insert('your_table'))
+
+  In addition, you can pass List[ExampleRow] to the delete transform::
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: [ExampleRow(n, str(n)),
+              ExampleRow(n * 2, str(n * 2))])
+              .with_output_types(List[ExampleRow])
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').delete('your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a write operation to Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param max_batch_size_bytes: Specifies the batch size limit (max number of
+        bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+    :param max_number_mutations: Specifies the cell mutation limit (maximum
+        number of mutated cells per batch). Default value is 5000.
+    :param max_number_rows: Specifies the row mutation limit (maximum number of
+        mutated rows per batch). Default value is 500.
+    :param grouping_factor: Specifies the multiple of max mutation (in terms
+        of both bytes per batch and cells per batch) that is used to select a
+        set of mutations to sort by key for batching. This sort uses local
+        memory on the workers, so using large values can cause out of memory
+        errors. Default value is 1000.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param commit_deadline: Specifies the deadline for the Commit API call.
+        Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+        until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+        are reported with logging and counters. Pass seconds as value.
+    :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+        time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+        (15min). If the mutations still have not been written after this time,
+        they are treated as a failure, and handled according to the setting of
+        failure_mode. Pass seconds as value.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.config = NamedTupleBasedPayloadBuilder(
+        WriteToSpannerSchema(
+            project_id=project_id,
+            instance_id=instance_id,
+            database_id=database_id,
+            max_batch_size_bytes=max_batch_size_bytes,
+            max_number_mutations=max_number_mutations,
+            max_number_rows=max_number_rows,
+            grouping_factor=grouping_factor,
+            host=host,
+            emulator_host=emulator_host,
+            commit_deadline=commit_deadline,
+            max_cumulative_backoff=max_cumulative_backoff,
+        ),
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def insert(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT, table)
+
+  def delete(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.DELETE, table)
+
+  def update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.UPDATE, table)
+
+  def replace(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.REPLACE, table)
+
+  def insert_or_update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT_OR_UPDATE, table)
+
+
+ReadFromSpannerSchema = NamedTuple(
+    'ReadFromSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('schema', bytes),
+        ('sql', Optional[unicode]),
+        ('table', Optional[unicode]),
+        ('project_id', Optional[unicode]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('batching', Optional[bool]),
+        ('timestamp_bound_mode', Optional[unicode]),
+        ('read_timestamp', Optional[unicode]),
+        ('exact_staleness', Optional[int]),
+        ('time_unit', Optional[unicode]),
+    ],
+)
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the rows it returns in order to provide
+  the schema. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  URN = 'beam:external:java:spanner:read:v1'
+
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: An SQL query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A Spanner table. When provided, all columns from row_type
+        will be selected in the query. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most exact_staleness stale.
+    :param read_timestamp: Timestamp as a string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for exact_staleness. Possible values:
+        NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS, HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    assert (sql or table) and not (sql and table)
+    TimeUnit.verify_param(time_unit)
+    TimestampBoundMode.verify_param(timestamp_bound_mode)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      assert staleness_value and time_unit and (
+          timestamp_bound_mode == TimestampBoundMode.MAX_STALENESS or
+          timestamp_bound_mode == TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode == TimestampBoundMode.MIN_READ_TIMESTAMP\
+             or timestamp_bound_mode == TimestampBoundMode.READ_TIMESTAMP
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        self.URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=timestamp_bound_mode,
+                read_timestamp=read_timestamp,
+                exact_staleness=exact_staleness,
+                time_unit=time_unit,
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class Operation:
+  INSERT = 'INSERT'
+  DELETE = 'DELETE'
+  UPDATE = 'UPDATE'
+  REPLACE = 'REPLACE'
+  INSERT_OR_UPDATE = 'INSERT_OR_UPDATE'
+
+
+class TimeUnit:
+  NANOSECONDS = 'NANOSECONDS'
+  MICROSECONDS = 'MICROSECONDS'
+  MILLISECONDS = 'MILLISECONDS'
+  SECONDS = 'SECONDS'
+  HOURS = 'HOURS'
+  DAYS = 'DAYS'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimeUnit, param):
+      raise RuntimeError(
+          'Invalid param for TimeUnit: {}'.format(param))
+
+
+class TimestampBoundMode:
+  MAX_STALENESS = 'MAX_STALENESS'
+  EXACT_STALENESS = 'EXACT_STALENESS'
+  READ_TIMESTAMP = 'READ_TIMESTAMP'
+  MIN_READ_TIMESTAMP = 'MIN_READ_TIMESTAMP'
+  STRONG = 'STRONG'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimestampBoundMode, param):
+      raise RuntimeError(
+          'Invalid param for TimestampBoundMode: {}'.format(param))
+
+
+class WriteToSpannerTransform(PTransform):
+  URN = 'beam:external:java:spanner:write:v1'
+
+  def __init__(self, config, expansion_service, operation, table):
+    super(WriteToSpannerTransform, self).__init__()
+    self.config = config
+    self.expansion_service = expansion_service
+    self.operation = operation
+    self.table = table
+
+  def expand(self, row_pcoll):
+    return (
+        row_pcoll
+        | RowToMutation(self.operation, self.table)
+        | ExternalTransform(self.URN, self.config, self.expansion_service))
+
+
+class RowToMutation(PTransform):
+  def __init__(self, operation, table):
+    super(RowToMutation, self).__init__()
+    self.operation = operation
+    self.table = table
+
+  def expand(self, pcoll):
+    is_delete = self.operation == Operation.DELETE
+    mutation_name = 'Mutation_%s_%s' % (
+        self.operation, str(uuid.uuid4()).replace('-', ''))
+
+    # There is an error when pcoll.element_type is List[row_type] so pass
+    # a list of inner element types to NamedTuple explicitly.
+    is_list = hasattr(pcoll.element_type, 'inner_type')
+    row_type = pcoll.element_type.inner_type if is_list else pcoll.element_type
+    mutation_type = NamedTuple(
+        mutation_name,
+        [
+            ('operation', unicode),
+            ('table', unicode),
+            ('keyset', List[row_type]) if is_delete else ('row', row_type),

Review comment:
       Done. I'm not sure whether you meant for that convenience to be added to schemas.py; I'm leaving it in spanner.py for now.
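
       For reference, the argument checks in the `ReadFromSpanner.__init__` hunk quoted above (exactly one of `sql` and `table`; `exact_staleness` and `time_unit` set together and only with a staleness mode; `read_timestamp` only with a read-timestamp mode) could be written with explicit grouping, since `and` binds tighter than `or` in Python. This is only a minimal sketch, not the code in the PR: the helper name `validate_read_args`, the two mode sets, and the use of `ValueError` instead of `assert` are assumptions made for illustration, and modes are passed as plain strings.

```python
from typing import Optional

# Assumed groupings of the mode names listed in the docstring above.
STALENESS_MODES = {'MAX_STALENESS', 'EXACT_STALENESS'}
READ_TIMESTAMP_MODES = {'READ_TIMESTAMP', 'MIN_READ_TIMESTAMP'}


def validate_read_args(
    sql: Optional[str] = None,
    table: Optional[str] = None,
    timestamp_bound_mode: Optional[str] = None,
    read_timestamp: Optional[str] = None,
    exact_staleness: Optional[int] = None,
    time_unit: Optional[str] = None) -> None:
  """Hypothetical helper: raises ValueError on inconsistent arguments."""
  # Exactly one of sql / table must be provided.
  if bool(sql) == bool(table):
    raise ValueError('Provide exactly one of sql or table.')

  # Staleness settings come as a pair and require a staleness mode.
  if exact_staleness is not None or time_unit is not None:
    if exact_staleness is None or time_unit is None:
      raise ValueError('exact_staleness and time_unit must be set together.')
    if timestamp_bound_mode not in STALENESS_MODES:
      raise ValueError(
          'Staleness requires MAX_STALENESS or EXACT_STALENESS mode.')

  # A read timestamp only makes sense with a read-timestamp mode.
  if (read_timestamp is not None and
      timestamp_bound_mode not in READ_TIMESTAMP_MODES):
    raise ValueError(
        'read_timestamp requires READ_TIMESTAMP or MIN_READ_TIMESTAMP mode.')
```

       Raising an exception rather than asserting keeps the checks active when Python runs with `-O`, which strips `assert` statements.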




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-704255878


   @nielm Could you take a look at this thread? https://github.com/apache/beam/pull/12611#discussion_r480429441





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686447747


   @TheNeuralBit I've improved it a bit.
   - Checking schema equality is redundant because a mismatch will throw an exception with a good message anyway (class cast failure or unknown column).
   - I managed to unify the duplicated addArray and addIterable code with somewhat ugly casts (this needed @SuppressWarnings("unchecked")), but I don't think it can easily be achieved otherwise.
   - Nothing comes to mind for removing the duplication between the addIterableToMutationBuilder and addIterableToStructBuilder methods. They operate on unrelated classes (Struct.Builder and Mutation.WriteBuilder). Maybe my Java knowledge is insufficient here. I could write an interface that simulates .setInt64Array, setStructArray etc., but that would be even more boilerplate.





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-697267984


   @TheNeuralBit @nielm @allenpradeep  ping





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [e4c95f2...f385921](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r482081597



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,504 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During pipeline construction, the Python
+  SDK will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option has the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide its
+  address as a parameter when using the transforms provided in this module.
+
+  This option has the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    instantiating the Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('batch_size_bytes', Optional[int]),
+        ('max_num_mutations', Optional[int]),
+        ('max_num_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):

Review comment:
       I can try to make the API compliant with the native one. I think it'd be valuable for Beam to compare the performance of both IOs and then decide which one to leave.







[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r522054029



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,662 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During pipeline construction, the Python
+  SDK will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.26.0 and later.
+
+  This option has the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide its
+  address as a parameter when using the transforms provided in this module.
+
+  This option has the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    instantiating the Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import uuid
+from enum import Enum
+from enum import auto
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_from_schema
+from apache_beam.typehints.schemas import named_tuple_to_schema
+from apache_beam.typehints.schemas import schema_from_element_type
+
+__all__ = [
+    'ReadFromSpanner',
+    'SpannerDelete',
+    'SpannerInsert',
+    'SpannerInsertOrUpdate',
+    'SpannerReplace',
+    'SpannerUpdate',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+_READ_URN = 'beam:external:java:spanner:read:v1'
+_WRITE_URN = 'beam:external:java:spanner:write:v1'
+
+
+class TimeUnit(Enum):
+  NANOSECONDS = auto()
+  MICROSECONDS = auto()
+  MILLISECONDS = auto()
+  SECONDS = auto()
+  HOURS = auto()
+  DAYS = auto()
+
+
+class TimestampBoundMode(Enum):
+  MAX_STALENESS = auto()
+  EXACT_STALENESS = auto()
+  READ_TIMESTAMP = auto()
+  MIN_READ_TIMESTAMP = auto()
+  STRONG = auto()
+
+
+class ReadFromSpannerSchema(NamedTuple):
+  instance_id: unicode
+  database_id: unicode
+  schema: bytes
+  sql: Optional[unicode]
+  table: Optional[unicode]
+  project_id: Optional[unicode]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  batching: Optional[bool]
+  timestamp_bound_mode: Optional[unicode]
+  read_timestamp: Optional[unicode]
+  exact_staleness: Optional[int]
+  time_unit: Optional[unicode]
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the rows it returns in order to
+  provide the schema. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+              timestamp_bound_mode=TimestampBoundMode.MAX_STALENESS,
+              exact_staleness=3,
+              time_unit=TimeUnit.HOURS,
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: An SQL query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A Spanner table. When provided, all columns from row_type
+        will be selected in the query. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Passed as TimestampBoundMode enum. Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most exact_staleness stale.
+    :param read_timestamp: Timestamp as a string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for exact_staleness, passed as TimeUnit enum.
+        Possible values: NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS,
+        HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    assert (sql or table) and not (sql and table)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      assert staleness_value and time_unit and (
+          timestamp_bound_mode is TimestampBoundMode.MAX_STALENESS or
+          timestamp_bound_mode is TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode is TimestampBoundMode.MIN_READ_TIMESTAMP\
+             or timestamp_bound_mode is TimestampBoundMode.READ_TIMESTAMP
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        _READ_URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=_get_enum_name(timestamp_bound_mode),
+                read_timestamp=read_timestamp,
+                exact_staleness=exact_staleness,
+                time_unit=_get_enum_name(time_unit),
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class WriteToSpannerSchema(NamedTuple):
+  project_id: unicode
+  instance_id: unicode
+  database_id: unicode
+  max_batch_size_bytes: Optional[int]
+  max_number_mutations: Optional[int]
+  max_number_rows: Optional[int]
+  grouping_factor: Optional[int]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  commit_deadline: Optional[int]
+  max_cumulative_backoff: Optional[int]
+
+
+_CLASS_DOC = \
+  """
+  A PTransform which writes {0} mutations to the specified Spanner table.
+
+  This transform receives rows defined as NamedTuple. Example::
+
+    {1} = typing.NamedTuple('{1}',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: {1}(n, str(n)))
+              .with_output_types({2})
+          | 'Write to Spanner' >> Spanner{3}(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id',
+              table='your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+
+_INIT_DOC = \
+  """
+  Initializes {} operation to a Spanner table.
+
+  :param project_id: Specifies the Cloud Spanner project.
+  :param instance_id: Specifies the Cloud Spanner instance.
+  :param database_id: Specifies the Cloud Spanner database.
+  :param table: Specifies the Cloud Spanner table.
+  :param max_batch_size_bytes: Specifies the batch size limit (max number of
+      bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+  :param max_number_mutations: Specifies the cell mutation limit (maximum
+      number of mutated cells per batch). Default value is 5000.
+  :param max_number_rows: Specifies the row mutation limit (maximum number of
+      mutated rows per batch). Default value is 500.
+  :param grouping_factor: Specifies the multiple of max mutation (in terms
+      of both bytes per batch and cells per batch) that is used to select a
+      set of mutations to sort by key for batching. This sort uses local
+      memory on the workers, so using large values can cause out of memory
+      errors. Default value is 1000.
+  :param host: Specifies the Cloud Spanner host.
+  :param emulator_host: Specifies Spanner emulator host.
+  :param commit_deadline: Specifies the deadline for the Commit API call.
+      Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+      until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+      are reported with logging and counters. Pass seconds as value.
+  :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+      time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+      (15min). If the mutations still have not been written after this time,
+      they are treated as a failure, and handled according to the setting of
+      failure_mode. Pass seconds as value.
+  :param expansion_service: The address (host:port) of the ExpansionService.
+  """
+
+
+def _add_doc(value, *args):
+  def _doc(obj):
+    obj.__doc__ = value.format(*args)
+    return obj
+
+  return _doc
+
+
+@_add_doc(_CLASS_DOC, 'delete', 'ExampleKey', 'List[ExampleKey]', 'Delete')
+class SpannerDelete(PTransform):
+  @_add_doc(_INIT_DOC, 'a delete')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.DELETE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(_CLASS_DOC, 'insert', 'ExampleRow', 'ExampleRow', 'Insert')
+class SpannerInsert(PTransform):
+  @_add_doc(_INIT_DOC, 'an insert')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.INSERT, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(_CLASS_DOC, 'replace', 'ExampleRow', 'ExampleRow', 'Replace')
+class SpannerReplace(PTransform):
+  @_add_doc(_INIT_DOC, 'a replace')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.REPLACE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(
+    _CLASS_DOC,
+    'insert-or-update',
+    'ExampleRow',
+    'ExampleRow',
+    'InsertOrUpdate')
+class SpannerInsertOrUpdate(PTransform):
+  @_add_doc(_INIT_DOC, 'an insert-or-update')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.INSERT_OR_UPDATE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(_CLASS_DOC, 'update', 'ExampleRow', 'ExampleRow', 'Update')
+class SpannerUpdate(PTransform):
+  @_add_doc(_INIT_DOC, 'an update')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.UPDATE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+def _apply_write_transform(pbegin, to_mutation, params, expansion_service):
+  return (
+      pbegin
+      | to_mutation
+      | ExternalTransform(
+          _WRITE_URN, NamedTupleBasedPayloadBuilder(params), expansion_service))
+
+
+class _RowToMutation(PTransform):

Review comment:
       Done. I've replaced keysets with keys - it's horrible to distinguish List<Row> from Row in runtime. With rows/keys only it's way cleaner.
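
A side note on the `sql` / `table` validation in `ReadFromSpanner`: in Python `and` binds tighter than `or`, so an "exactly one of" check needs explicit parentheses around the `or` part. The sketch below is standalone and only illustrates the precedence; the helper names `exactly_one` and `unparenthesized` are made up for this example and are not part of the PR.

```python
def exactly_one(sql=None, table=None):
    """Hypothetical helper: True only when exactly one of sql/table is set."""
    return bool(sql or table) and not (sql and table)


def unparenthesized(sql=None, table=None):
    """Same expression without the inner parentheses.

    Python parses this as: sql or (table and not (sql and table)),
    so a truthy sql short-circuits the whole check.
    """
    return bool(sql or table and not (sql and table))


if __name__ == '__main__':
    both = dict(sql='SELECT 1', table='some_table')
    print(exactly_one(**both))      # rejected: both arguments are set
    print(unparenthesized(**both))  # accepted, because sql alone is truthy
```

The same applies to the staleness check: `a and b and mode is X or mode is Y` groups as `(a and b and mode is X) or (mode is Y)`, so the two mode comparisons should be wrapped in parentheses as well.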




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.27%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53819      +93     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32145      +38     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...55ee406](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) (1c43284) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [02a1cd2...7eb4576](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r505657026



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,483 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the machine where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide it as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner:
+  """
+  A PTransform which writes mutations to the specified instance's database
+  via Spanner.
+
+  This transform receives rows defined as NamedTuple or as List[NamedTuple]
+  in case of delete operation. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: ExampleRow(n, str(n)))
+              .with_output_types(ExampleRow)
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').insert('your_table'))
+
+  In addition you can pass List[ExampleRow] to delete transform::
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: [ExampleRow(n, str(n)),
+              ExampleRow(n * 2, str(n * 2))])
+              .with_output_types(List[ExampleRow])
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').delete('your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a write operation to Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param max_batch_size_bytes: Specifies the batch size limit (max number of
+        bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+    :param max_number_mutations: Specifies the cell mutation limit (maximum
+        number of mutated cells per batch). Default value is 5000.
+    :param max_number_rows: Specifies the row mutation limit (maximum number of
+        mutated rows per batch). Default value is 500.
+    :param grouping_factor: Specifies the multiple of max mutation (in terms
+        of both bytes per batch and cells per batch) that is used to select a
+        set of mutations to sort by key for batching. This sort uses local
+        memory on the workers, so using large values can cause out of memory
+        errors. Default value is 1000.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param commit_deadline: Specifies the deadline for the Commit API call.
+        Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+        until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+        are reported with logging and counters. Pass seconds as value.
+    :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+        time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+        (15min). If the mutations still have not been written after this time,
+        they are treated as a failure, and handled according to the setting of
+        failure_mode. Pass seconds as value.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.config = NamedTupleBasedPayloadBuilder(
+        WriteToSpannerSchema(
+            project_id=project_id,
+            instance_id=instance_id,
+            database_id=database_id,
+            max_batch_size_bytes=max_batch_size_bytes,
+            max_number_mutations=max_number_mutations,
+            max_number_rows=max_number_rows,
+            grouping_factor=grouping_factor,
+            host=host,
+            emulator_host=emulator_host,
+            commit_deadline=commit_deadline,
+            max_cumulative_backoff=max_cumulative_backoff,
+        ),
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def insert(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT, table)
+
+  def delete(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.DELETE, table)
+
+  def update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.UPDATE, table)
+
+  def replace(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.REPLACE, table)
+
+  def insert_or_update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT_OR_UPDATE, table)
+
+
+ReadFromSpannerSchema = NamedTuple(
+    'ReadFromSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('schema', bytes),
+        ('sql', Optional[unicode]),
+        ('table', Optional[unicode]),
+        ('project_id', Optional[unicode]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('batching', Optional[bool]),
+        ('timestamp_bound_mode', Optional[unicode]),
+        ('read_timestamp', Optional[unicode]),
+        ('exact_staleness', Optional[int]),
+        ('time_unit', Optional[unicode]),
+    ],
+)
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the row it returns in order to provide
+  the schema. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  URN = 'beam:external:java:spanner:read:v1'
+
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: An sql query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A spanner table. When provided all columns from row_type
+        will be selected to query. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most time_unit stale.
+    :param read_timestamp: Timestamp as a string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for exact_staleness. Possible values:
+        NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS, HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    assert (sql or table) and not (sql and table)
+    TimeUnit.verify_param(time_unit)
+    TimestampBoundMode.verify_param(timestamp_bound_mode)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      assert staleness_value and time_unit and (
+          timestamp_bound_mode == TimestampBoundMode.MAX_STALENESS or
+          timestamp_bound_mode == TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode == TimestampBoundMode.MIN_READ_TIMESTAMP\
+             or timestamp_bound_mode == TimestampBoundMode.READ_TIMESTAMP
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        self.URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=timestamp_bound_mode,
+                read_timestamp=read_timestamp,
+                exact_staleness=exact_staleness,
+                time_unit=time_unit,
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class Operation:
+  INSERT = 'INSERT'
+  DELETE = 'DELETE'
+  UPDATE = 'UPDATE'
+  REPLACE = 'REPLACE'
+  INSERT_OR_UPDATE = 'INSERT_OR_UPDATE'
+
+
+class TimeUnit:
+  NANOSECONDS = 'NANOSECONDS'
+  MICROSECONDS = 'MICROSECONDS'
+  MILLISECONDS = 'MILLISECONDS'
+  SECONDS = 'SECONDS'
+  HOURS = 'HOURS'
+  DAYS = 'DAYS'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimeUnit, param):
+      raise RuntimeError(
+          'Invalid param for TimeUnit: {}'.format(param))
+
+
+class TimestampBoundMode:
+  MAX_STALENESS = 'MAX_STALENESS'
+  EXACT_STALENESS = 'EXACT_STALENESS'
+  READ_TIMESTAMP = 'READ_TIMESTAMP'
+  MIN_READ_TIMESTAMP = 'MIN_READ_TIMESTAMP'
+  STRONG = 'STRONG'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimestampBoundMode, param):
+      raise RuntimeError(
+          'Invalid param for TimestampBoundMode: {}'.format(param))
+
+
+class WriteToSpannerTransform(PTransform):
+  URN = 'beam:external:java:spanner:write:v1'
+
+  def __init__(self, config, expansion_service, operation, table):
+    super(WriteToSpannerTransform, self).__init__()
+    self.config = config
+    self.expansion_service = expansion_service
+    self.operation = operation
+    self.table = table
+
+  def expand(self, row_pcoll):
+    return (
+        row_pcoll
+        | RowToMutation(self.operation, self.table)
+        | ExternalTransform(self.URN, self.config, self.expansion_service))
+
+
+class RowToMutation(PTransform):
+  def __init__(self, operation, table):
+    super(RowToMutation, self).__init__()
+    self.operation = operation
+    self.table = table
+
+  def expand(self, pcoll):
+    is_delete = self.operation == Operation.DELETE
+    mutation_name = 'Mutation_%s_%s' % (
+        self.operation, str(uuid.uuid4()).replace('-', ''))
+
+    # There is an error when pcoll.element_type is List[row_type] so pass
+    # a list of inner element types to NamedTuple explicitly.
+    is_list = hasattr(pcoll.element_type, 'inner_type')
+    row_type = pcoll.element_type.inner_type if is_list else pcoll.element_type

Review comment:
       I've enforced lists only for the delete operation.
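
The argument checks in the quoted `ReadFromSpanner.__init__` depend on how Python groups `and` and `or`, which is easy to get wrong in a single `assert`. A minimal sketch of the same rules written out step by step, assuming a hypothetical helper `validate_read_args` (it is not part of the PR) and plain strings for the bound modes:

```python
from typing import Optional

# Bound modes that need a staleness value, and modes that need a timestamp.
STALENESS_MODES = {'MAX_STALENESS', 'EXACT_STALENESS'}
TIMESTAMP_MODES = {'READ_TIMESTAMP', 'MIN_READ_TIMESTAMP'}


def validate_read_args(
    sql: Optional[str] = None,
    table: Optional[str] = None,
    timestamp_bound_mode: Optional[str] = None,
    read_timestamp: Optional[str] = None,
    exact_staleness: Optional[int] = None,
    time_unit: Optional[str] = None,
) -> None:
  """Hypothetical helper; raises ValueError on inconsistent arguments."""
  # Exactly one of sql/table must be set (explicit XOR on truthiness).
  if bool(sql) == bool(table):
    raise ValueError('Set exactly one of sql or table.')

  # Staleness arguments must come as a pair, with a staleness bound mode.
  if exact_staleness is not None or time_unit is not None:
    if exact_staleness is None or time_unit is None:
      raise ValueError('exact_staleness and time_unit must be set together.')
    if timestamp_bound_mode not in STALENESS_MODES:
      raise ValueError('Staleness requires MAX_STALENESS or EXACT_STALENESS.')

  # A read timestamp only makes sense with a timestamp bound mode.
  if (read_timestamp is not None and
      timestamp_bound_mode not in TIMESTAMP_MODES):
    raise ValueError(
        'read_timestamp requires READ_TIMESTAMP or MIN_READ_TIMESTAMP.')
```

Raising `ValueError` instead of using `assert` is a design choice of this sketch: the checks then survive `python -O`, which strips assertions.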




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r523024788



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,301 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+@SuppressWarnings({
+  "nullness" // TODO(https://issues.apache.org/jira/browse/BEAM-10402)
+})

Review comment:
       Done. It was quite painful, as all of the row getters return a @Nullable value. In particular, checkNotNull doesn't work with the checker, and it isn't even possible to check for null in a helper function (only `if (var == null) { throw new NullPointerException("Null var"); }` seems to work).
   
   It doesn't even work in chained calls, as in this example:
   ```
   @Nullable Object var = new Object();
   if (var != null) {
     someObject.doSth().doChained(var); // checker doesn't understand that var is checked for nullness
   }
   ```
   So it's quite unfriendly. In general I'm really excited about dealing with the NPE problem, but for now it adds much more complexity and reduces contributor friendliness. I guess it's worth it, though, especially once the checker gets smarter and works with the Guava checks and chained calls (if that's even possible?).







[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501676146



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       Thank you @nielm! I thought about the LIMIT approach, but then I found the same arguments against it.
   
   It appears there is a JDBC client for Spanner: https://cloud.google.com/spanner/docs/jdbc-drivers . I'll try to figure out if I can use it.
   
   There is also ResultSetMetadata in Spanner's REST API, which extends a JSON object: https://cloud.google.com/spanner/docs/reference/rest/v1/ResultSetMetadata . But at the end of the day it requires fetching at least part of the data.
   
   I would leave it for another PR, though, as it would presumably require moving SchemaUtils from io/jdbc to some more general place (extensions/sql?). As far as I can see, the Struct type is represented as a String, as mentioned here:
   ```
   The Cloud Spanner STRUCT data type is mapped to a SQL VARCHAR data type, accessible through this driver as String types. All other types have appropriate mappings.
   ```
   So it may not be the best option.
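
With schema inference deferred, the Python wrapper relies on the caller to supply a row type, and the quoted docstring says that all columns of that row type are selected when a table is given. A minimal sketch of what that could look like, assuming a hypothetical helper `build_table_query` and an example row type (neither is taken from the PR, and the Java side may build its read differently):

```python
from typing import NamedTuple, Optional


class ExampleRow(NamedTuple):
  # Example row type; field names are assumed to match the column names.
  id: int
  name: Optional[str]


def build_table_query(row_type, table):
  """Hypothetical helper: select exactly the row type's fields from table."""
  # NamedTuple classes expose their field names, in order, as _fields.
  columns = ', '.join(row_type._fields)
  return 'SELECT {} FROM {}'.format(columns, table)


query = build_table_query(ExampleRow, 'some_table')
# query == 'SELECT id, name FROM some_table'
```

The point of the sketch is that the user-provided row type fixes the column list up front, so no data has to be fetched to learn the schema.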







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-709461345


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-710247484









[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-710027902


   I've solved the docs problem by adding an annotation (decorator) that modifies the `__doc__` property. I'm not sure whether it's a good solution (whether users of Beam will see the doc info on hover, or whether it will be exported to the website properly). If not, I can revert it.
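
For readers unfamiliar with the technique, here is a minimal sketch of a decorator that rewrites `__doc__` so several transforms can share one parameter description. The names `with_doc`, `_COMMON_PARAMS`, and `InsertExample` are hypothetical and are not the ones used in the PR:

```python
def with_doc(doc):
  """Hypothetical decorator that replaces the decorated object's docstring."""
  def decorator(obj):
    # __doc__ is writable on functions and, in Python 3, on classes.
    obj.__doc__ = doc
    return obj

  return decorator


# Shared parameter documentation, appended to each transform's summary.
_COMMON_PARAMS = """

    :param project_id: Specifies the Cloud Spanner project.
    :param instance_id: Specifies the Cloud Spanner instance.
"""


@with_doc('Inserts rows into a Spanner table.' + _COMMON_PARAMS)
class InsertExample(object):
  pass
```

Tools that import the module and read `__doc__` at runtime, such as `help(InsertExample)`, see the assembled text. Tools that only parse the source without importing it would likely not, which is the concern raised in the comment above.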





[GitHub] [beam] chamikaramj commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-693695238


   In addition to Brian's review, @allenpradeep or @nielm, can you briefly look at the Java SpannerIO changes here?





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-728248437


   > Looks good, merging now. Thanks for all your work on this @piotr-szuberski :)
   
   Thank you too for your reviews! :)





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-677656948


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/ba89e76f5daecd496cafb2861dac5ef69480a973?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.26%   40.30%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53780    53926     +146     
   ==========================================
   + Hits        21655    21733      +78     
   - Misses      32125    32193      +68     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...s/python/apache\_beam/testing/synthetic\_pipeline.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9zeW50aGV0aWNfcGlwZWxpbmUucHk=) | `23.45% <0.00%> (+2.52%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba89e76...c24e8cb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] chamikaramj commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r492364337



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,504 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('batch_size_bytes', Optional[int]),
+        ('max_num_mutations', Optional[int]),
+        ('max_num_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):

Review comment:
       It probably makes sense to converge on one implementation. I'd prefer the Java implementation (hence cross-language) since it's been around longer and is used by many users. We have to make sure that the cross-language version works for all runners before the native version can be removed. For example, the cross-language version will not work for current production Dataflow (Runner v1), and we have to confirm that it works adequately for Dataflow Runner v2.







[GitHub] [beam] piotr-szuberski edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686447747


   @TheNeuralBit I've upgraded it a bit.
   - Checking schema equality is redundant because an exception with a good message will be thrown anyway (class cast failure or unknown column). Also, it's possible to just add row.addFieldValues(Map<String, Object> values) and depend on the casts following the schema.
   - I managed to remove the code duplication between addArray and addIterable with some slightly ugly casts (which needed @SuppressWarnings("unchecked")), but I don't think it can be easily achieved otherwise.
   - Nothing comes to mind for removing the duplication in the addIterableToMutationBuilder and addIterableToStructBuilder methods. These are unrelated classes (Struct.Builder and Mutation.WriteBuilder). Maybe my Java knowledge is insufficient here. I could make an interface that simulates .setInt64Array, setStructArray, etc., but it would be even more boilerplate.
   - I unified the APIs of both Python Spanner connectors a bit. Not everything could be done 1:1, but the corresponding keywords and the positions of positional arguments were changed.
   - Nulls now run with no problems. I had used ImmutableMap.Builder, which doesn't allow null values; I changed it to a plain HashMap and now it's OK.





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r482070325



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;
+    }
+
+    @Override
+    public PCollection<Row> expand(PBegin input) {
+      return input
+          .apply(read)
+          .apply(
+              MapElements.into(TypeDescriptor.of(Row.class))
+                  .via(
+                      new SerializableFunction<Struct, Row>() {
+                        @Override
+                        public Row apply(Struct struct) {
+                          return StructUtils.translateStructToRow(struct, schema);
+                        }
+                      }))
+          .setRowSchema(schema)
+          .setCoder(RowCoder.of(schema));

Review comment:
       For some reason it was not obvious to me. Done.







[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r523092073



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,380 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+import org.joda.time.ReadableDateTime;
+
+final class StructUtils {
+  public static Row structToBeamRow(Struct struct, Schema schema) {
+    Map<String, Object> structValues =
+        schema.getFields().stream()
+            .collect(
+                HashMap::new,
+                (map, field) -> {
+                  @Nullable Object structValue = getStructValue(struct, field);
+                  if (structValue == null) {
+                    throw new NullPointerException("Null struct value at field " + field.getName());
+                  }
+                  map.put(field.getName(), structValue);
+                },
+                Map::putAll);
+    return Row.withSchema(schema).withFieldValues(structValues).build();
+  }
+
+  public static Struct beamRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              @Nullable Row subRow = row.getRow(column);
+              if (subRow == null) {
+                throw new NullPointerException(String.format("Null subRow at '%s' column", column));
+              }

Review comment:
       Let's make these null checks use `org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull`; that will be a bit more concise, and it avoids writing out the explicit `throw new NullPointerException(...)`.







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [e4c95f2...a120f31](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501677086



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language
+    int count = 0;
+    while (valueBuilder == null && count < fields.size()) {
+      valueBuilder = getFirstStructValue(struct, fields.get(count), schema);
+      ++count;
+    }
+    for (int i = count; i < fields.size(); ++i) {
+      valueBuilder = getStructValue(valueBuilder, struct, fields.get(i));
+    }
+    return valueBuilder != null ? valueBuilder.build() : Row.withSchema(schema).build();
+  }
+
+  public static Struct translateRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              structBuilder
+                  .set(column)
+                  .to(
+                      beamTypeToSpannerType(field.getType()),
+                      translateRowToStruct(row.getRow(column)));
+              break;
+            case ARRAY:
+              addArrayToStruct(structBuilder, row, field);
+              break;
+            case ITERABLE:
+              addIterableToStruct(structBuilder, row, field);
+              break;
+            case FLOAT:
+              structBuilder.set(column).to(row.getFloat(column).doubleValue());
+              break;
+            case DOUBLE:
+              structBuilder.set(column).to(row.getDouble(column));
+              break;
+            case DECIMAL:
+              structBuilder.set(column).to(row.getDecimal(column).doubleValue());

Review comment:
       Great, that's quite a new thing in Spanner as far as I can see! Thanks!







[GitHub] [beam] piotr-szuberski edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-710027902


   I've solved the docs problem by adding a decorator that modifies the `__doc__` attribute. I'm not sure whether it's a good solution (whether Beam users will see the doc info on hover, and whether it will be exported to the website properly). If not, I can revert it.
   I tried running the pydocs command and the generated HTML looks good.
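
The idea of patching `__doc__` can be sketched as follows. This is a minimal illustration and not the code from the PR: the decorator name `append_to_doc`, the `SHARED_NOTE` text, and the `ExampleTransform` class are all hypothetical. Because the docstring is rewritten at class-definition time, tools that read `__doc__` at runtime (for example `help()`) see the combined text; whether an IDE that parses the source statically shows it on hover is a separate question.

```python
def append_to_doc(extra):
  """Hypothetical decorator that appends `extra` to an object's docstring."""
  def decorator(obj):
    # __doc__ is writable on classes and functions, so the text is patched
    # once, when the decorated object is defined.
    obj.__doc__ = (obj.__doc__ or '') + '\n\n' + extra
    return obj
  return decorator


# Hypothetical shared text; not taken from the PR.
SHARED_NOTE = 'Experimental; no backwards compatibility guarantees.'


@append_to_doc(SHARED_NOTE)
class ExampleTransform(object):
  """A placeholder class standing in for a real transform."""
```

Calling `help(ExampleTransform)` would then show the original docstring followed by the shared note.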





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/ba89e76f5daecd496cafb2861dac5ef69480a973?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.26%   40.30%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53780    53926     +146     
   ==========================================
   + Hits        21655    21733      +78     
   - Misses      32125    32193      +68     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...s/python/apache\_beam/testing/synthetic\_pipeline.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9zeW50aGV0aWNfcGlwZWxpbmUucHk=) | `23.45% <0.00%> (+2.52%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba89e76...c24e8cb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-727022314


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686482934


   Run Python 3.8 PostCommit





[GitHub] [beam] nielm commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
nielm commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501066733



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       I don't see any good solution here...
   When reading an entire table, it would be possible to read the table's schema first and determine the column types, but this does not work for a query, as the query's output columns may not correspond to table columns.
   
   Adding `LIMIT 1` would only work for simple queries; anything with joins, `GROUP BY`, or `ORDER BY` requires the majority of the query to be executed before a single row is returned.
   
   So the only solution I can see is for the caller to specify the row Schema, as you do here.
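
From the Python side, "the caller specifies the row schema" could look roughly like the sketch below. It is an assumption-laden illustration: `ExampleRow` and `describe_schema` are hypothetical names, only the standard `typing` module is used, and how the row type is actually handed to the read transform is not shown here.

```python
import typing


class ExampleRow(typing.NamedTuple):
  """Hypothetical row type declaring the columns a query is expected to return."""
  id: int
  name: str


def describe_schema(row_type):
  """Returns (column, type) pairs in declaration order for a NamedTuple type."""
  hints = typing.get_type_hints(row_type)
  return [(field, hints[field]) for field in row_type._fields]
```

The column names and types are known up front from the declared row type, so no part of the query has to be executed to discover them.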







[GitHub] [beam] chamikaramj commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r492364337



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,504 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide it as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('batch_size_bytes', Optional[int]),
+        ('max_num_mutations', Optional[int]),
+        ('max_num_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):

Review comment:
       It probably makes sense to converge on one implementation. I'd prefer the Java implementation (hence cross-language), since it has been around longer and is used by many users. We have to make sure that the cross-language version works for all runners before the native version can be removed. For example, the cross-language version will not work on current production Dataflow (Runner v1), and we have to confirm that it works adequately on Dataflow Runner v2.







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [84db719...46e0f1a](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-709456631


   My greatest concern about the current changes is that there is a lot of duplicated documentation and constructor-parameter code. I'm not sure whether anything can be done about it. I've researched the docs side a bit and didn't find any decent solution.
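
One possible way to reduce the duplicated parameter docs, offered only as a sketch: keep the shared text in a single string and append it to each wrapper's docstring with a small class decorator. The names `COMMON_ARGS_DOC`, `with_common_args_doc`, `SpannerInsertSketch` and `SpannerDeleteSketch` are hypothetical and not part of Beam, and the parameter descriptions are placeholders. It assumes the docs tool reads `__doc__` at import time, and it does not address the duplicated constructor parameters.

```python
# Shared parameter documentation, written once.
# The field names come from the config NamedTuple in this PR; the
# descriptions are placeholders, not the real Beam docs.
COMMON_ARGS_DOC = """
  :param host: (placeholder) Spanner host to connect to.
  :param emulator_host: (placeholder) Spanner emulator host, if any.
  :param commit_deadline: (placeholder) Commit deadline.
"""


def with_common_args_doc(cls):
  """Class decorator that appends the shared parameter docs to cls.__doc__."""
  cls.__doc__ = (cls.__doc__ or '').rstrip() + '\n' + COMMON_ARGS_DOC
  return cls


@with_common_args_doc
class SpannerInsertSketch(object):
  """Hypothetical wrapper that inserts rows."""


@with_common_args_doc
class SpannerDeleteSketch(object):
  """Hypothetical wrapper that deletes rows."""
```

With this in place, `help(SpannerInsertSketch)` and `help(SpannerDeleteSketch)` both show the shared `:param` entries after their own summary line.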





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba24b08...a5af894](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-717413326


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726117163


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686447747


   @TheNeuralBit I've updated it a bit.
   - Checking schema equality is redundant because it will throw an exception with a good message anyway (class cast failure or unknown column). Also, it's possible to just add row.addFieldValues(Map<String, Object> values) and depend on the casts following the schema.
   - I managed to remove the duplication between addArray and addIterable with some slightly ugly casts (they need @SuppressWarnings("unchecked")), but I don't think it can easily be achieved otherwise.
   - Nothing comes to mind for removing the duplication between the addIterableToMutationBuilder and addIterableToStructBuilder methods. They operate on unrelated classes (Struct.Builder and Mutation.WriteBuilder). Maybe my Java knowledge is insufficient here. I could make an interface that simulates .setInt64Array, setStructArray etc., but it would be even more boilerplate.





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...78c40e7](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726984319


   Run Java PreCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) (1c43284) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [02a1cd2...aa73ae1](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] TheNeuralBit commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-706381403


   I think most of my comments in that review are actually not relevant any more if we go down the path of separate xlang transforms per operation.





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716736973


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200









[GitHub] [beam] TheNeuralBit commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-705258493


   Sorry for dropping the ball on this, @piotr-szuberski. I'll look over the changes to the Python API this week.





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/ba89e76f5daecd496cafb2861dac5ef69480a973?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.26%   40.30%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53780    53926     +146     
   ==========================================
   + Hits        21655    21733      +78     
   - Misses      32125    32193      +68     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...s/python/apache\_beam/testing/synthetic\_pipeline.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9zeW50aGV0aWNfcGlwZWxpbmUucHk=) | `23.45% <0.00%> (+2.52%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba89e76...c24e8cb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.27%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53819      +93     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32145      +38     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...55ee406](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-677454675


   Run PythonDocker PreCommit





[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r520210174



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerTransformRegistrar.java
##########
@@ -0,0 +1,287 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.service.AutoService;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.TimestampBound;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.model.pipeline.v1.SchemaApi;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.expansion.ExternalTransformRegistrar;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.SchemaTranslation;
+import org.apache.beam.sdk.transforms.ExternalTransformBuilder;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.grpc.v1p26p0.com.google.protobuf.InvalidProtocolBufferException;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Duration;
+
+/**
+ * Exposes {@link SpannerIO.WriteRows} and {@link SpannerIO.ReadRows} as an external transform for
+ * cross-language usage.
+ */
+@Experimental(Kind.PORTABILITY)
+@AutoService(ExternalTransformRegistrar.class)
+public class SpannerTransformRegistrar implements ExternalTransformRegistrar {
+  public static final String WRITE_URN = "beam:external:java:spanner:write:v1";
+  public static final String READ_URN = "beam:external:java:spanner:read:v1";
+
+  @Override
+  public Map<String, ExternalTransformBuilder<?, ?, ?>> knownBuilderInstances() {
+    return ImmutableMap.of(WRITE_URN, new WriteBuilder(), READ_URN, new ReadBuilder());

Review comment:
       What I had in mind was that there would be a separate URN for each possible write operation like `beam:external:java:spanner:delete`, `beam:external:java:spanner:insert_or_update`, ...
   
   Rather than accepting mutations encoded as rows to go over the xlang boundary, each of these transforms would take just a `PCollection<Row>` of the actual data as input (or a `PCollection<List<Row>>` representing the keyset in the Delete case). Then Python doesn't even need to have a concept of mutations.
   
   There's definitely value in a generic `beam:external:java:spanner:write` transform which accepts arbitrary mutations, but I think we should leave that for future work.
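   
   A minimal sketch of such a URN scheme, written in Python for illustration only. The operation names, the `:v1` suffix, and the `write_urn` helper are assumptions, not existing Beam API:

```python
# Hypothetical: one URN per Spanner write operation instead of a single
# generic write URN. None of these names exist in Beam today.
_URN_PREFIX = 'beam:external:java:spanner'
_WRITE_OPERATIONS = (
    'insert', 'update', 'replace', 'insert_or_update', 'delete')


def write_urn(operation, version='v1'):
  """Returns the URN of the external transform for one write operation."""
  if operation not in _WRITE_OPERATIONS:
    raise ValueError('Unsupported Spanner write operation: %r' % operation)
  return '%s:%s:%s' % (_URN_PREFIX, operation, version)


print(write_urn('insert_or_update'))
```

   Each Python wrapper would then pick its own URN, and the Java registrar would map each URN to a builder that wraps incoming rows in the matching mutation type.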

##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,662 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.26.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the machine where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) or
+  build (when running from a Beam Git clone) an expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide its
+  address as a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    instantiating the Spanner transforms provided in this module.
+
+  Flink users can use the built-in expansion service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import uuid
+from enum import Enum
+from enum import auto
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_from_schema
+from apache_beam.typehints.schemas import named_tuple_to_schema
+from apache_beam.typehints.schemas import schema_from_element_type
+
+__all__ = [
+    'ReadFromSpanner',
+    'SpannerDelete',
+    'SpannerInsert',
+    'SpannerInsertOrUpdate',
+    'SpannerReplace',
+    'SpannerUpdate',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+_READ_URN = 'beam:external:java:spanner:read:v1'
+_WRITE_URN = 'beam:external:java:spanner:write:v1'
+
+
+class TimeUnit(Enum):
+  NANOSECONDS = auto()
+  MICROSECONDS = auto()
+  MILLISECONDS = auto()
+  SECONDS = auto()
+  HOURS = auto()
+  DAYS = auto()
+
+
+class TimestampBoundMode(Enum):
+  MAX_STALENESS = auto()
+  EXACT_STALENESS = auto()
+  READ_TIMESTAMP = auto()
+  MIN_READ_TIMESTAMP = auto()
+  STRONG = auto()
+
+
+class ReadFromSpannerSchema(NamedTuple):
+  instance_id: unicode
+  database_id: unicode
+  schema: bytes
+  sql: Optional[unicode]
+  table: Optional[unicode]
+  project_id: Optional[unicode]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  batching: Optional[bool]
+  timestamp_bound_mode: Optional[unicode]
+  read_timestamp: Optional[unicode]
+  exact_staleness: Optional[int]
+  time_unit: Optional[unicode]
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the rows it returns, from which the
+  schema is derived. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+              timestamp_bound_mode=TimestampBoundMode.MAX_STALENESS,
+              exact_staleness=3,
+              time_unit=TimeUnit.HOURS,
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: A SQL query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A Spanner table. When provided, all columns from row_type
+        will be selected in the query. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Passed as TimestampBoundMode enum. Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most exact_staleness stale.
+    :param read_timestamp: Timestamp in string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for exact_staleness, passed as TimeUnit enum.
+        Possible values: NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS,
+        HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    assert (sql or table) and not (sql and table)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      assert staleness_value and time_unit and timestamp_bound_mode in (
+          TimestampBoundMode.MAX_STALENESS,
+          TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode is TimestampBoundMode.MIN_READ_TIMESTAMP\
+             or timestamp_bound_mode is TimestampBoundMode.READ_TIMESTAMP
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        _READ_URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=_get_enum_name(timestamp_bound_mode),
+                read_timestamp=read_timestamp,
+                exact_staleness=staleness_value,
+                time_unit=_get_enum_name(time_unit),
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class WriteToSpannerSchema(NamedTuple):
+  project_id: unicode
+  instance_id: unicode
+  database_id: unicode
+  max_batch_size_bytes: Optional[int]
+  max_number_mutations: Optional[int]
+  max_number_rows: Optional[int]
+  grouping_factor: Optional[int]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  commit_deadline: Optional[int]
+  max_cumulative_backoff: Optional[int]
+
+
+_CLASS_DOC = \
+  """
+  A PTransform which writes {0} mutations to the specified Spanner table.
+
+  This transform receives rows defined as NamedTuple. Example::
+
+    {1} = typing.NamedTuple('{1}',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: {1}(n, str(n)))
+              .with_output_types({2})
+          | 'Write to Spanner' >> Spanner{3}(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id',
+              table='your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+
+_INIT_DOC = \
+  """
+  Initializes {} operation to a Spanner table.
+
+  :param project_id: Specifies the Cloud Spanner project.
+  :param instance_id: Specifies the Cloud Spanner instance.
+  :param database_id: Specifies the Cloud Spanner database.
+  :param table: Specifies the Cloud Spanner table.
+  :param max_batch_size_bytes: Specifies the batch size limit (max number of
+      bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+  :param max_number_mutations: Specifies the cell mutation limit (maximum
+      number of mutated cells per batch). Default value is 5000.
+  :param max_number_rows: Specifies the row mutation limit (maximum number of
+      mutated rows per batch). Default value is 500.
+  :param grouping_factor: Specifies the multiple of max mutation (in terms
+      of both bytes per batch and cells per batch) that is used to select a
+      set of mutations to sort by key for batching. This sort uses local
+      memory on the workers, so using large values can cause out of memory
+      errors. Default value is 1000.
+  :param host: Specifies the Cloud Spanner host.
+  :param emulator_host: Specifies Spanner emulator host.
+  :param commit_deadline: Specifies the deadline for the Commit API call.
+      Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+      until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+      are reported with logging and counters. Pass seconds as value.
+  :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+      time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+      (15min). If the mutations still have not been written after this time,
+      they are treated as a failure, and handled according to the setting of
+      failure_mode. Pass seconds as value.
+  :param expansion_service: The address (host:port) of the ExpansionService.
+  """
+
+
+def _add_doc(value, *args):
+  def _doc(obj):
+    obj.__doc__ = value.format(*args)
+    return obj
+
+  return _doc
+
+
+@_add_doc(_CLASS_DOC, 'delete', 'ExampleKey', 'List[ExampleKey]', 'Delete')
+class SpannerDelete(PTransform):
+  @_add_doc(_INIT_DOC, 'a delete')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.DELETE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(_CLASS_DOC, 'insert', 'ExampleRow', 'ExampleRow', 'Insert')
+class SpannerInsert(PTransform):
+  @_add_doc(_INIT_DOC, 'an insert')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.INSERT, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(_CLASS_DOC, 'replace', 'ExampleRow', 'ExampleRow', 'Replace')
+class SpannerReplace(PTransform):
+  @_add_doc(_INIT_DOC, 'a replace')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.REPLACE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(
+    _CLASS_DOC,
+    'insert-or-update',
+    'ExampleRow',
+    'ExampleRow',
+    'InsertOrUpdate')
+class SpannerInsertOrUpdate(PTransform):
+  @_add_doc(_INIT_DOC, 'an insert-or-update')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.INSERT_OR_UPDATE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+@_add_doc(_CLASS_DOC, 'update', 'ExampleRow', 'ExampleRow', 'Update')
+class SpannerUpdate(PTransform):
+  @_add_doc(_INIT_DOC, 'an update')
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      table,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    super().__init__()
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.table = table
+    self.params = WriteToSpannerSchema(
+        project_id=project_id,
+        instance_id=instance_id,
+        database_id=database_id,
+        max_batch_size_bytes=max_batch_size_bytes,
+        max_number_mutations=max_number_mutations,
+        max_number_rows=max_number_rows,
+        grouping_factor=grouping_factor,
+        host=host,
+        emulator_host=emulator_host,
+        commit_deadline=commit_deadline,
+        max_cumulative_backoff=max_cumulative_backoff,
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def expand(self, pbegin):
+    return _apply_write_transform(
+        pbegin,
+        _RowToMutation(_Operation.UPDATE, self.table),
+        self.params,
+        self.expansion_service)
+
+
+def _apply_write_transform(pbegin, to_mutation, params, expansion_service):
+  return (
+      pbegin
+      | to_mutation
+      | ExternalTransform(
+          _WRITE_URN, NamedTupleBasedPayloadBuilder(params), expansion_service))
+
+
+class _RowToMutation(PTransform):

Review comment:
       As pointed out in the last comment, I think it would be preferable if Python didn't even need to have a concept of Mutations (for now). Instead, it would just send the Rows (or Keysets) over to Java, which can wrap them in mutations for use in SpannerIO.
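   
   A rough sketch of what that could look like from the Python side, using only plain data structures. The `SpannerWriteConfig` tuple, its `table` field, the URN format, and `external_write_spec` are hypothetical names for illustration; the real payload would go through `NamedTupleBasedPayloadBuilder`:

```python
from typing import NamedTuple


class SpannerWriteConfig(NamedTuple):
  """Hypothetical payload: the table travels to Java with the config."""
  project_id: str
  instance_id: str
  database_id: str
  table: str


def external_write_spec(operation, config):
  """Returns the (urn, payload) pair an external write would be built from.

  Python only names the operation; wrapping rows (or keysets) in mutations
  is assumed to happen in the Java builder registered for the URN.
  """
  urn = 'beam:external:java:spanner:%s:v1' % operation
  return urn, config._asdict()


config = SpannerWriteConfig('my_project', 'my_instance', 'my_db', 'my_table')
urn, payload = external_write_spec('delete', config)
print(urn, payload['table'])
```

   With the table name in the payload, the Python transform no longer needs a `_RowToMutation` step before the external transform.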

##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,662 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.26.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the machine where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) or
+  build (when running from a Beam Git clone) an expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide its
+  address as a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    instantiating the Spanner transforms provided in this module.
+
+  Flink users can use the built-in expansion service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import uuid
+from enum import Enum
+from enum import auto
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_from_schema
+from apache_beam.typehints.schemas import named_tuple_to_schema
+from apache_beam.typehints.schemas import schema_from_element_type
+
+__all__ = [
+    'ReadFromSpanner',
+    'SpannerDelete',
+    'SpannerInsert',
+    'SpannerInsertOrUpdate',
+    'SpannerReplace',
+    'SpannerUpdate',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+_READ_URN = 'beam:external:java:spanner:read:v1'
+_WRITE_URN = 'beam:external:java:spanner:write:v1'
+
+
+class TimeUnit(Enum):
+  NANOSECONDS = auto()
+  MICROSECONDS = auto()
+  MILLISECONDS = auto()
+  SECONDS = auto()
+  HOURS = auto()
+  DAYS = auto()
+
+
+class TimestampBoundMode(Enum):
+  MAX_STALENESS = auto()
+  EXACT_STALENESS = auto()
+  READ_TIMESTAMP = auto()
+  MIN_READ_TIMESTAMP = auto()
+  STRONG = auto()
+
+
+class ReadFromSpannerSchema(NamedTuple):
+  instance_id: unicode
+  database_id: unicode
+  schema: bytes
+  sql: Optional[unicode]
+  table: Optional[unicode]
+  project_id: Optional[unicode]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  batching: Optional[bool]
+  timestamp_bound_mode: Optional[unicode]
+  read_timestamp: Optional[unicode]
+  exact_staleness: Optional[int]
+  time_unit: Optional[unicode]
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the rows it returns, from which the
+  schema is derived. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+              timestamp_bound_mode=TimestampBoundMode.MAX_STALENESS,
+              exact_staleness=3,
+              time_unit=TimeUnit.HOURS,
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: A SQL query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A Spanner table. When provided, all columns from row_type
+        will be selected in the query. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Passed as TimestampBoundMode enum. Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most exact_staleness stale.
+    :param read_timestamp: Timestamp in string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for exact_staleness, passed as TimeUnit enum.
+        Possible values: NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS,
+        HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    assert (sql or table) and not (sql and table)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      assert staleness_value and time_unit and (
+          timestamp_bound_mode is TimestampBoundMode.MAX_STALENESS or
+          timestamp_bound_mode is TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode is TimestampBoundMode.MIN_READ_TIMESTAMP\
+             or timestamp_bound_mode is TimestampBoundMode.READ_TIMESTAMP
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        _READ_URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=_get_enum_name(timestamp_bound_mode),
+                read_timestamp=read_timestamp,
+                exact_staleness=exact_staleness,
+                time_unit=_get_enum_name(time_unit),
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class WriteToSpannerSchema(NamedTuple):
+  project_id: unicode
+  instance_id: unicode
+  database_id: unicode
+  max_batch_size_bytes: Optional[int]
+  max_number_mutations: Optional[int]
+  max_number_rows: Optional[int]
+  grouping_factor: Optional[int]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  commit_deadline: Optional[int]
+  max_cumulative_backoff: Optional[int]
+
+
+_CLASS_DOC = \
+  """
+  A PTransform which writes {0} mutations to the specified Spanner table.
+
+  This transform receives rows defined as NamedTuple. Example::
+
+    {1} = typing.NamedTuple('{1}',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: {1}(n, str(n)))
+              .with_output_types({2})
+          | 'Write to Spanner' >> Spanner{3}(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id',
+              table='your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+
+_INIT_DOC = \
+  """
+  Initializes {} operation to a Spanner table.
+
+  :param project_id: Specifies the Cloud Spanner project.
+  :param instance_id: Specifies the Cloud Spanner instance.
+  :param database_id: Specifies the Cloud Spanner database.
+  :param table: Specifies the Cloud Spanner table.
+  :param max_batch_size_bytes: Specifies the batch size limit (max number of
+      bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+  :param max_number_mutations: Specifies the cell mutation limit (maximum
+      number of mutated cells per batch). Default value is 5000.
+  :param max_number_rows: Specifies the row mutation limit (maximum number of
+      mutated rows per batch). Default value is 500.
+  :param grouping_factor: Specifies the multiple of max mutation (in terms
+      of both bytes per batch and cells per batch) that is used to select a
+      set of mutations to sort by key for batching. This sort uses local
+      memory on the workers, so using large values can cause out of memory
+      errors. Default value is 1000.
+  :param host: Specifies the Cloud Spanner host.
+  :param emulator_host: Specifies Spanner emulator host.
+  :param commit_deadline: Specifies the deadline for the Commit API call.
+      Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+      until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+      are reported with logging and counters. Pass seconds as value.
+  :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+      time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+      (15min). If the mutations still have not been written after this time,
+      they are treated as a failure, and handled according to the setting of
+      failure_mode. Pass seconds as value.
+  :param expansion_service: The address (host:port) of the ExpansionService.
+  """
+
+
+def _add_doc(value, *args):
+  def _doc(obj):
+    obj.__doc__ = value.format(*args)
+    return obj
+
+  return _doc
+
+
+@_add_doc(_CLASS_DOC, 'delete', 'ExampleKey', 'List[ExampleKey]', 'Delete')

Review comment:
       nice job keeping this concise :+1: 
   
   My only nit is that it would be a bit more readable if you used keyword args and had named parameters in the docstring template.
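
To illustrate the suggestion, here is a minimal sketch of the decorator with named template fields and keyword arguments. The field names (`operation`, `row_type`, `type_hint`, `transform_suffix`), the shortened template text, and the empty `SpannerDelete` stand-in class are assumptions made for this example; they are not taken from the PR.

```python
# Shortened, hypothetical template: named fields replace {0}..{3}.
_CLASS_DOC_TEMPLATE = """
A PTransform which writes {operation} mutations to the specified Spanner table.

Rows are instances of {row_type}; the input is hinted as {type_hint} and
written with Spanner{transform_suffix}.
"""


def _add_doc(template, **fields):
  """Returns a decorator that sets __doc__ from a named-field template."""
  def _doc(obj):
    obj.__doc__ = template.format(**fields)
    return obj

  return _doc


# Stand-in class; the real transform would extend ExternalTransform.
@_add_doc(
    _CLASS_DOC_TEMPLATE,
    operation='delete',
    row_type='ExampleKey',
    type_hint='List[ExampleKey]',
    transform_suffix='Delete')
class SpannerDelete(object):
  pass
```

With named fields, each call site documents which value fills which slot, and `str.format` raises a `KeyError` if a field is missing instead of silently shifting positional arguments.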

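For readers following the argument checks in `ReadFromSpanner.__init__` in the hunk above, the rules stated in the docstring (exactly one of `sql` or `table`; `exact_staleness` and `time_unit` only together and only with a staleness mode; `read_timestamp` only with a read-timestamp mode) can be sketched as a standalone helper. This is an illustration under assumptions, not the PR's code: `validate_read_args` is a hypothetical name, the enum below is a local stand-in for the PR's `TimestampBoundMode`, and the sketch raises `ValueError` where the PR uses `assert`.

```python
from enum import Enum


class TimestampBoundMode(Enum):
  """Local stand-in for the enum referenced in the diff."""
  STRONG = 1
  READ_TIMESTAMP = 2
  MIN_READ_TIMESTAMP = 3
  EXACT_STALENESS = 4
  MAX_STALENESS = 5


_STALENESS_MODES = (
    TimestampBoundMode.EXACT_STALENESS, TimestampBoundMode.MAX_STALENESS)
_READ_TIMESTAMP_MODES = (
    TimestampBoundMode.READ_TIMESTAMP, TimestampBoundMode.MIN_READ_TIMESTAMP)


def validate_read_args(
    sql=None,
    table=None,
    timestamp_bound_mode=None,
    read_timestamp=None,
    exact_staleness=None,
    time_unit=None):
  """Hypothetical helper: raises ValueError on inconsistent arguments."""
  # Exactly one of sql/table must be provided.
  if bool(sql) == bool(table):
    raise ValueError('Set exactly one of sql or table.')

  # Staleness value and unit must come together, with a staleness mode.
  if exact_staleness is not None or time_unit is not None:
    if exact_staleness is None or time_unit is None:
      raise ValueError('exact_staleness and time_unit must be set together.')
    if timestamp_bound_mode not in _STALENESS_MODES:
      raise ValueError('Staleness requires EXACT_STALENESS or MAX_STALENESS.')

  # A read timestamp only makes sense with a read-timestamp mode.
  if (read_timestamp is not None and
      timestamp_bound_mode not in _READ_TIMESTAMP_MODES):
    raise ValueError(
        'read_timestamp requires READ_TIMESTAMP or MIN_READ_TIMESTAMP.')
```

Grouping each rule into its own explicit branch avoids relying on the relative precedence of `and` and `or`, and the checks still run when Python is started with `-O`, which strips `assert` statements.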



----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] TheNeuralBit commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-728234466


   Looks good, merging now. Thanks for all your work on this @piotr-szuberski :)





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba24b08...e07de53](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) (1c43284) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [02a1cd2...1602c26](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   











[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716890864


   @TheNeuralBit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...78c40e7](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686441745


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53824      +98     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32150      +43     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...13ac9de](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/ba89e76f5daecd496cafb2861dac5ef69480a973?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `56.73%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.26%   40.30%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53780    53926     +146     
   ==========================================
   + Hits        21655    21733      +78     
   - Misses      32125    32193      +68     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...s/python/apache\_beam/testing/synthetic\_pipeline.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9zeW50aGV0aWNfcGlwZWxpbmUucHk=) | `23.45% <0.00%> (+2.52%)` | :arrow_up: |
   
   ------
   
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba89e76...c24e8cb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-717180946









[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...33a9b22](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r488314492



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerTransformRegistrar.java
##########
@@ -0,0 +1,287 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.service.AutoService;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.TimestampBound;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.model.pipeline.v1.SchemaApi;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.expansion.ExternalTransformRegistrar;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.SchemaTranslation;
+import org.apache.beam.sdk.transforms.ExternalTransformBuilder;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.grpc.v1p26p0.com.google.protobuf.InvalidProtocolBufferException;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Duration;
+
+/**
+ * Exposes {@link SpannerIO.WriteRows} and {@link SpannerIO.ReadRows} as an external transform for
+ * cross-language usage.
+ */
+@Experimental(Kind.PORTABILITY)
+@AutoService(ExternalTransformRegistrar.class)
+public class SpannerTransformRegistrar implements ExternalTransformRegistrar {
+  public static final String WRITE_URN = "beam:external:java:spanner:write:v1";
+  public static final String READ_URN = "beam:external:java:spanner:read:v1";
+
+  @Override
+  public Map<String, ExternalTransformBuilder<?, ?, ?>> knownBuilderInstances() {
+    return ImmutableMap.of(WRITE_URN, new WriteBuilder(), READ_URN, new ReadBuilder());
+  }
+
+  public abstract static class CrossLanguageConfiguration {
+    String instanceId;
+    String databaseId;
+    String projectId;
+    @Nullable String host;
+    @Nullable String emulatorHost;
+
+    public void setInstanceId(String instanceId) {
+      this.instanceId = instanceId;
+    }
+
+    public void setDatabaseId(String databaseId) {
+      this.databaseId = databaseId;
+    }
+
+    public void setProjectId(String projectId) {
+      this.projectId = projectId;
+    }
+
+    public void setHost(@Nullable String host) {
+      this.host = host;
+    }
+
+    public void setEmulatorHost(@Nullable String emulatorHost) {
+      this.emulatorHost = emulatorHost;
+    }
+  }
+
+  @Experimental(Kind.PORTABILITY)
+  public static class ReadBuilder
+      implements ExternalTransformBuilder<ReadBuilder.Configuration, PBegin, PCollection<Row>> {
+
+    public static class Configuration extends CrossLanguageConfiguration {
+      // TODO: BEAM-10851 Come up with something to determine schema without this explicit parameter
+      private Schema schema;
+      private @Nullable String sql;
+      private @Nullable String table;
+      private @Nullable Boolean batching;
+      private @Nullable String timestampBoundMode;
+      private @Nullable String readTimestamp;
+      private @Nullable String timeUnit;
+      private @Nullable Long exactStaleness;
+
+      public void setSql(@Nullable String sql) {
+        this.sql = sql;
+      }
+
+      public void setTable(@Nullable String table) {
+        this.table = table;
+      }
+
+      public void setBatching(@Nullable Boolean batching) {
+        this.batching = batching;
+      }
+
+      public void setTimestampBoundMode(@Nullable String timestampBoundMode) {
+        this.timestampBoundMode = timestampBoundMode;
+      }
+
+      public void setSchema(byte[] schema) throws InvalidProtocolBufferException {
+        this.schema = SchemaTranslation.schemaFromProto(SchemaApi.Schema.parseFrom(schema));
+      }
+
+      public void setReadTimestamp(@Nullable String readTimestamp) {
+        this.readTimestamp = readTimestamp;
+      }
+
+      public void setTimeUnit(@Nullable String timeUnit) {
+        this.timeUnit = timeUnit;
+      }
+
+      public void setExactStaleness(@Nullable Long exactStaleness) {
+        this.exactStaleness = exactStaleness;
+      }
+
+      private TimestampBound getTimestampBound() {
+        if (timestampBoundMode == null) {
+          return null;
+        }
+
+        TimestampBound.Mode mode = TimestampBound.Mode.valueOf(timestampBoundMode);
+        if (mode == TimestampBound.Mode.MAX_STALENESS
+            || mode == TimestampBound.Mode.EXACT_STALENESS) {
+          checkArgument(
+              exactStaleness != null,
+              "Staleness value cannot be null when MAX_STALENESS or EXACT_STALENESS mode is selected");
+          checkArgument(
+              timeUnit != null,
+              "Time unit cannot be null when MAX_STALENESS or EXACT_STALENESS mode is selected");
+        }
+        if (mode == TimestampBound.Mode.READ_TIMESTAMP
+            || mode == TimestampBound.Mode.MIN_READ_TIMESTAMP) {
+          checkArgument(
+              readTimestamp != null,
+              "Timestamp cannot be null when READ_TIMESTAMP or MIN_READ_TIMESTAMP mode is selected");
+        }
+        switch (mode) {
+          case STRONG:
+            return TimestampBound.strong();
+          case MAX_STALENESS:
+            return TimestampBound.ofMaxStaleness(exactStaleness, TimeUnit.valueOf(timeUnit));
+          case EXACT_STALENESS:
+            return TimestampBound.ofExactStaleness(exactStaleness, TimeUnit.valueOf(timeUnit));
+          case READ_TIMESTAMP:
+            return TimestampBound.ofReadTimestamp(Timestamp.parseTimestamp(readTimestamp));
+          case MIN_READ_TIMESTAMP:
+            return TimestampBound.ofMinReadTimestamp(Timestamp.parseTimestamp(readTimestamp));
+          default:
+            throw new RuntimeException("Unknown timestamp bound mode: " + mode);
+        }
+      }
+
+      public ReadOperation getReadOperation() {
+        checkArgument(
+            sql == null || table == null,
+            "Query and table params are mutually exclusive. Set just one of them.");
+        if (sql != null) {
+          return ReadOperation.create().withQuery(sql);
+        }
+        return ReadOperation.create().withTable(table).withColumns(schema.getFieldNames());
+      }
+    }
+
+    @Override
+    public PTransform<PBegin, PCollection<Row>> buildExternal(Configuration configuration) {
+      SpannerIO.Read readTransform =
+          SpannerIO.read()
+              .withProjectId(configuration.projectId)
+              .withDatabaseId(configuration.databaseId)
+              .withInstanceId(configuration.instanceId)
+              .withReadOperation(configuration.getReadOperation());
+
+      if (configuration.host != null) {
+        readTransform = readTransform.withHost(configuration.host);
+      }
+      if (configuration.emulatorHost != null) {
+        readTransform = readTransform.withEmulatorHost(configuration.emulatorHost);
+      }
+      if (configuration.getTimestampBound() != null) {
+        readTransform = readTransform.withTimestampBound(configuration.getTimestampBound());
+      }
+      if (configuration.batching != null) {
+        readTransform = readTransform.withBatching(configuration.batching);
+      }
+
+      return new SpannerIO.ReadRows(readTransform, configuration.schema);
+    }
+  }
+
+  @Experimental(Kind.PORTABILITY)
+  public static class WriteBuilder
+      implements ExternalTransformBuilder<WriteBuilder.Configuration, PCollection<Row>, PDone> {
+
+    public static class Configuration extends CrossLanguageConfiguration {

Review comment:
       Sounds good







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/6bf56f92b34f7c15b752c46eca19489a604c4775?el=desc) will **decrease** coverage by `0.06%`.
   > The diff coverage is `56.73%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.51%   82.44%   -0.07%     
   ==========================================
     Files         455      456       +1     
     Lines       54867    54975     +108     
   ==========================================
   + Hits        45272    45324      +52     
   - Misses       9595     9651      +56     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [sdks/python/apache\_beam/io/source\_test\_utils.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vc291cmNlX3Rlc3RfdXRpbHMucHk=) | `88.28% <0.00%> (-1.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/gcp/bigquery\_tools.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X3Rvb2xzLnB5) | `87.79% <0.00%> (-0.57%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.40%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   
   ------
   
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [6bf56f9...1c43284](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686493626









[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716839061


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10131][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-675497655


   @TheNeuralBit I know it's merciless of me to hand you such a big PR to review, but I think you're the person most up to date on rows and schemas :) A few unit tests and TODOs are still left, but overall I think it's almost complete. The integration tests work well on FlinkRunner.





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...78c40e7](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...cfeb645](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r483301387



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language

Review comment:
       Hm, so it _should_ be supported. RowCoder encodes nulls for top-level fields separately, so there's no need for NullableCoder. NullableCoder is only used when a nullable type sits inside a container type, e.g. `ARRAY<NULLABLE INT>`. That container case wasn't supported in Python until recently; https://github.com/apache/beam/pull/12426 should have fixed it, though.
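
To make the point about top-level nulls concrete, here is a minimal sketch in plain Java. It assumes a simplified layout (field count, a null bitmap, then only the non-null values) and is meant as an illustration of the idea, not as Beam's actual RowCoder wire format. `NullBitmapRowSketch`, `encode`, and `decode` are hypothetical names introduced for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Simplified illustration only; NOT Beam's real RowCoder format. */
class NullBitmapRowSketch {

  /** Writes: field count, null bitmap, then the non-null values in field order. */
  static byte[] encode(List<Long> fields) {
    // Record which top-level fields are null, separately from the values.
    BitSet nulls = new BitSet(fields.size());
    for (int i = 0; i < fields.size(); i++) {
      if (fields.get(i) == null) {
        nulls.set(i);
      }
    }
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeInt(fields.size());
      byte[] bitmap = nulls.toByteArray();
      out.writeInt(bitmap.length);
      out.write(bitmap);
      for (Long value : fields) {
        if (value != null) {
          // The per-field value encoding never sees a null,
          // so no nullable wrapper is needed for top-level fields.
          out.writeLong(value);
        }
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads the layout produced by encode and restores nulls from the bitmap. */
  static List<Long> decode(byte[] encoded) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded));
      int fieldCount = in.readInt();
      byte[] bitmap = new byte[in.readInt()];
      in.readFully(bitmap);
      BitSet nulls = BitSet.valueOf(bitmap);
      List<Long> fields = new ArrayList<>();
      for (int i = 0; i < fieldCount; i++) {
        if (nulls.get(i)) {
          fields.add(null);
        } else {
          fields.add(in.readLong());
        }
      }
      return fields;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    List<Long> row = Arrays.asList(1L, null, 3L);
    System.out.println(decode(encode(row)).equals(row));
  }
}
```

In this sketch a null top-level field costs one bit in the bitmap and nothing in the value section. A null inside a container such as `ARRAY<NULLABLE INT>` is not covered by the row-level bitmap, which is presumably why that case needs its own nullable element handling.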







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [e4c95f2...2440f07](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-676576861









[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-712094472


   I've just run pydocs command and the generated html looks good.








[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r505658098



##########
File path: sdks/java/io/google-cloud-platform/expansion-service/build.gradle
##########
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: 'org.apache.beam.module'
+apply plugin: 'application'
+mainClassName = "org.apache.beam.sdk.expansion.service.ExpansionService"
+
+applyJavaNature(
+        enableChecker: true,
+        automaticModuleName: 'org.apache.beam.sdk.io.gcp.expansion.service',
+        exportJavadoc: false,
+        validateShadowJar: false,
+        shadowClosure: {},
+)
+
+description = "Apache Beam :: SDKs :: Java :: IO :: Google Cloud Platform :: Expansion Service"
+ext.summary = "Expansion service serving Spanner Java IO"

Review comment:
       Done.










[GitHub] [beam] chamikaramj commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r492364337



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,504 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('batch_size_bytes', Optional[int]),
+        ('max_num_mutations', Optional[int]),
+        ('max_num_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):

Review comment:
       It probably makes sense to converge on one implementation. I'd prefer the Java implementation (hence cross-language) since it has been around for longer and is used by many users. We have to make sure that the cross-language version works for all runners before the native version can be removed. For example, the cross-language version will not work on current production Dataflow (Runner v1), and we have to confirm that it works adequately on Dataflow Runner v2.







[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r522053018



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,662 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.26.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import uuid
+from enum import Enum
+from enum import auto
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_from_schema
+from apache_beam.typehints.schemas import named_tuple_to_schema
+from apache_beam.typehints.schemas import schema_from_element_type
+
+__all__ = [
+    'ReadFromSpanner',
+    'SpannerDelete',
+    'SpannerInsert',
+    'SpannerInsertOrUpdate',
+    'SpannerReplace',
+    'SpannerUpdate',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+_READ_URN = 'beam:external:java:spanner:read:v1'
+_WRITE_URN = 'beam:external:java:spanner:write:v1'
+
+
+class TimeUnit(Enum):
+  NANOSECONDS = auto()
+  MICROSECONDS = auto()
+  MILLISECONDS = auto()
+  SECONDS = auto()
+  HOURS = auto()
+  DAYS = auto()
+
+
+class TimestampBoundMode(Enum):
+  MAX_STALENESS = auto()
+  EXACT_STALENESS = auto()
+  READ_TIMESTAMP = auto()
+  MIN_READ_TIMESTAMP = auto()
+  STRONG = auto()
+
+
+class ReadFromSpannerSchema(NamedTuple):
+  instance_id: unicode
+  database_id: unicode
+  schema: bytes
+  sql: Optional[unicode]
+  table: Optional[unicode]
+  project_id: Optional[unicode]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  batching: Optional[bool]
+  timestamp_bound_mode: Optional[unicode]
+  read_timestamp: Optional[unicode]
+  exact_staleness: Optional[int]
+  time_unit: Optional[unicode]
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the row it returns in order to provide
+  the schema. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+              timestamp_bound_mode=TimestampBoundMode.MAX_STALENESS,
+              exact_staleness=3,
+              time_unit=TimeUnit.HOURS,
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: A SQL query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A Spanner table. When provided, all columns from row_type
+        will be selected in the query. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Passed as TimestampBoundMode enum. Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most time_unit stale.
+    :param read_timestamp: Timestamp in string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for staleness_value passed as TimeUnit enum.
+        Possible values: NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS,
+        HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    assert (sql or table) and not (sql and table)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      assert staleness_value and time_unit and (
+          timestamp_bound_mode is TimestampBoundMode.MAX_STALENESS or
+          timestamp_bound_mode is TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode is TimestampBoundMode.MIN_READ_TIMESTAMP\
+             or timestamp_bound_mode is TimestampBoundMode.READ_TIMESTAMP
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        _READ_URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=_get_enum_name(timestamp_bound_mode),
+                read_timestamp=read_timestamp,
+                exact_staleness=exact_staleness,
+                time_unit=_get_enum_name(time_unit),
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class WriteToSpannerSchema(NamedTuple):
+  project_id: unicode
+  instance_id: unicode
+  database_id: unicode
+  max_batch_size_bytes: Optional[int]
+  max_number_mutations: Optional[int]
+  max_number_rows: Optional[int]
+  grouping_factor: Optional[int]
+  host: Optional[unicode]
+  emulator_host: Optional[unicode]
+  commit_deadline: Optional[int]
+  max_cumulative_backoff: Optional[int]
+
+
+_CLASS_DOC = \
+  """
+  A PTransform which writes {0} mutations to the specified Spanner table.
+
+  This transform receives rows defined as NamedTuple. Example::
+
+    {1} = typing.NamedTuple('{1}',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: {1}(n, str(n)))
+              .with_output_types({2})
+          | 'Write to Spanner' >> Spanner{3}(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id',
+              table='your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+
+_INIT_DOC = \
+  """
+  Initializes {} operation to a Spanner table.
+
+  :param project_id: Specifies the Cloud Spanner project.
+  :param instance_id: Specifies the Cloud Spanner instance.
+  :param database_id: Specifies the Cloud Spanner database.
+  :param table: Specifies the Cloud Spanner table.
+  :param max_batch_size_bytes: Specifies the batch size limit (max number of
+      bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+  :param max_number_mutations: Specifies the cell mutation limit (maximum
+      number of mutated cells per batch). Default value is 5000.
+  :param max_number_rows: Specifies the row mutation limit (maximum number of
+      mutated rows per batch). Default value is 500.
+  :param grouping_factor: Specifies the multiple of max mutation (in terms
+      of both bytes per batch and cells per batch) that is used to select a
+      set of mutations to sort by key for batching. This sort uses local
+      memory on the workers, so using large values can cause out of memory
+      errors. Default value is 1000.
+  :param host: Specifies the Cloud Spanner host.
+  :param emulator_host: Specifies Spanner emulator host.
+  :param commit_deadline: Specifies the deadline for the Commit API call.
+      Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+      until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+      are reported with logging and counters. Pass seconds as value.
+  :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+      time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+      (15min). If the mutations still have not been written after this time,
+      they are treated as a failure, and handled according to the setting of
+      failure_mode. Pass seconds as value.
+  :param expansion_service: The address (host:port) of the ExpansionService.
+  """
+
+
+def _add_doc(value, *args):
+  def _doc(obj):
+    obj.__doc__ = value.format(*args)
+    return obj
+
+  return _doc
+
+
+@_add_doc(_CLASS_DOC, 'delete', 'ExampleKey', 'List[ExampleKey]', 'Delete')

Review comment:
       Done
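
For the `sql`/`table` argument check in `ReadFromSpanner.__init__` quoted in the hunk above: Python's `and` binds tighter than `or`, so an exactly-one-of check needs explicit grouping. The following is a minimal standalone sketch of that point; `validate_read_args` and `ungrouped_check` are hypothetical helpers written for illustration and are not part of Beam.

```python
def ungrouped_check(sql=None, table=None):
  # Without parentheses this is parsed as:
  #   sql or (table and not (sql and table))
  # so it is truthy whenever sql is set, even if table is set as well.
  return bool(sql or table and not (sql and table))


def validate_read_args(sql=None, table=None):
  """Raises ValueError unless exactly one of sql / table is provided."""
  has_sql = bool(sql)
  has_table = bool(table)
  # Equal flags mean either both or neither argument was given.
  if has_sql == has_table:
    raise ValueError('Provide exactly one of sql or table.')
  return 'sql' if has_sql else 'table'
```

Raising `ValueError` rather than using `assert` is a design choice in this sketch: assertions are skipped when Python runs with `-O`, so user-facing argument validation is usually safer as an explicit exception.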




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] codecov[bot] commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request head (`spanner-xlang@fed767a`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-head-commit).
   > The diff coverage is `n/a`.
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-712094472


   I've just run the pydocs command and the generated HTML looks good.





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726035603


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r505693055



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationUtils.java
##########
@@ -34,4 +49,237 @@ public static boolean isPointDelete(Mutation m) {
         && Iterables.isEmpty(m.getKeySet().getRanges())
         && Iterables.size(m.getKeySet().getKeys()) == 1;
   }
+
+  /**
+   * Utility function to convert row to mutation.
+   *
+   * @return function that can convert row to mutation
+   */
+  public static SerializableFunction<Row, Mutation> beamRowToMutationFn() {

Review comment:
       Done. I hope it's not too brief.







[GitHub] [beam] piotr-szuberski edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-709456631


   @TheNeuralBit My greatest concern about the current changes is that there is a lot of duplicated documentation and constructor-parameter code. I'm not sure whether anything can be done about it. I've researched the subject of docs a bit and didn't find any decent solution.
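
One possible way to reduce the docstring duplication mentioned here is to keep a single template and fill it in with a small decorator, along the lines of the `_add_doc` helper that appears in this PR's `spanner.py`. The sketch below is a simplified, standalone illustration; the names `add_doc`, `_INIT_DOC_TEMPLATE`, `insert_init` and `delete_init` are hypothetical and not taken from Beam.

```python
_INIT_DOC_TEMPLATE = """
Initializes {} operation to a Spanner table.

:param project_id: Specifies the Cloud Spanner project.
"""


def add_doc(template, *args):
  """Returns a decorator that sets __doc__ from a str.format template."""
  def decorator(obj):
    # Positional fields ({} or {0}, {1}, ...) are filled from args.
    obj.__doc__ = template.format(*args)
    return obj

  return decorator


@add_doc(_INIT_DOC_TEMPLATE, 'an insert')
def insert_init(project_id):
  return ('insert', project_id)


@add_doc(_INIT_DOC_TEMPLATE, 'a delete')
def delete_init(project_id):
  return ('delete', project_id)
```

A caveat with this approach: because the template goes through `str.format`, any literal braces in the shared text have to be doubled (`{{` and `}}`), otherwise they are read as replacement fields.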





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-677656948


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53824      +98     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32150      +43     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...dd827fb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.05%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.29%   +0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53884     +158     
   ==========================================
   + Hits        21619    21714      +95     
   - Misses      32107    32170      +63     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   | [sdks/python/apache\_beam/portability/python\_urns.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvcHl0aG9uX3VybnMucHk=) | `100.00% <0.00%> (ø)` | |
   | [...s/python/apache\_beam/io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=) | `23.36% <0.00%> (ø)` | |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `28.92% <0.00%> (+0.24%)` | :arrow_up: |
   | [sdks/python/apache\_beam/dataframe/frames.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL2ZyYW1lcy5weQ==) | `43.73% <0.00%> (+0.76%)` | :arrow_up: |
   | [sdks/python/apache\_beam/transforms/core.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb3JlLnB5) | `39.22% <0.00%> (+0.94%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...70c3073](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726117163


   Run Python 3.7 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...33a9b22](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.27%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53819      +93     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32145      +38     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...dd827fb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...5feb3d9](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10131][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-676423925









[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-710247484


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-717339318


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r505611080



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,483 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During pipeline construction, the Python
+  SDK connects to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer from which the pipeline is
+    constructed and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide its
+  address as a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner:
+  """
+  A PTransform which writes mutations to the specified instance's database
+  via Spanner.
+
+  This transform receives rows defined as NamedTuple, or as List[NamedTuple]
+  for the delete operation. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: ExampleRow(n, str(n)))
+              .with_output_types(ExampleRow)
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').insert('your_table'))
+
+  In addition you can pass List[ExampleRow] to delete transform::
+
+    with Pipeline() as p:
+      _ = (
+          p
+          | 'Impulse' >> beam.Impulse()
+          | 'Generate' >> beam.FlatMap(lambda x: range(num_rows))
+          | 'To row' >> beam.Map(lambda n: [ExampleRow(n, str(n)),
+              ExampleRow(n * 2, str(n * 2))])
+              .with_output_types(List[ExampleRow])
+          | 'Write to Spanner' >> WriteToSpanner(
+              instance_id='your_instance',
+              database_id='existing_database',
+              project_id='your_project_id').delete('your_table'))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      max_batch_size_bytes=None,
+      max_number_mutations=None,
+      max_number_rows=None,
+      grouping_factor=None,
+      host=None,
+      emulator_host=None,
+      commit_deadline=None,
+      max_cumulative_backoff=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a write operation to Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param max_batch_size_bytes: Specifies the batch size limit (max number of
+        bytes mutated per batch). Default value is 1048576 bytes = 1MB.
+    :param max_number_mutations: Specifies the cell mutation limit (maximum
+        number of mutated cells per batch). Default value is 5000.
+    :param max_number_rows: Specifies the row mutation limit (maximum number of
+        mutated rows per batch). Default value is 500.
+    :param grouping_factor: Specifies the multiple of max mutation (in terms
+        of both bytes per batch and cells per batch) that is used to select a
+        set of mutations to sort by key for batching. This sort uses local
+        memory on the workers, so using large values can cause out of memory
+        errors. Default value is 1000.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param commit_deadline: Specifies the deadline for the Commit API call.
+        Default is 15 secs. DEADLINE_EXCEEDED errors will prompt a backoff/retry
+        until the value of commit_deadline is reached. DEADLINE_EXCEEDED errors
+        are reported with logging and counters. Pass seconds as value.
+    :param max_cumulative_backoff: Specifies the maximum cumulative backoff
+        time when retrying after DEADLINE_EXCEEDED errors. Default is 900s
+        (15min). If the mutations still have not been written after this time,
+        they are treated as a failure, and handled according to the setting of
+        failure_mode. Pass seconds as value.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    max_cumulative_backoff = int(
+        max_cumulative_backoff) if max_cumulative_backoff else None
+    commit_deadline = int(commit_deadline) if commit_deadline else None
+    self.config = NamedTupleBasedPayloadBuilder(
+        WriteToSpannerSchema(
+            project_id=project_id,
+            instance_id=instance_id,
+            database_id=database_id,
+            max_batch_size_bytes=max_batch_size_bytes,
+            max_number_mutations=max_number_mutations,
+            max_number_rows=max_number_rows,
+            grouping_factor=grouping_factor,
+            host=host,
+            emulator_host=emulator_host,
+            commit_deadline=commit_deadline,
+            max_cumulative_backoff=max_cumulative_backoff,
+        ),
+    )
+    self.expansion_service = expansion_service or default_io_expansion_service()
+
+  def insert(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT, table)
+
+  def delete(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.DELETE, table)
+
+  def update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.UPDATE, table)
+
+  def replace(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.REPLACE, table)
+
+  def insert_or_update(self, table):
+    return WriteToSpannerTransform(
+        self.config, self.expansion_service, Operation.INSERT_OR_UPDATE, table)
+
+
+ReadFromSpannerSchema = NamedTuple(
+    'ReadFromSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('schema', bytes),
+        ('sql', Optional[unicode]),
+        ('table', Optional[unicode]),
+        ('project_id', Optional[unicode]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('batching', Optional[bool]),
+        ('timestamp_bound_mode', Optional[unicode]),
+        ('read_timestamp', Optional[unicode]),
+        ('exact_staleness', Optional[int]),
+        ('time_unit', Optional[unicode]),
+    ],
+)
+
+
+class ReadFromSpanner(ExternalTransform):
+  """
+  A PTransform which reads from the specified Spanner instance's database.
+
+  This transform requires the type of the row it returns in order to provide
+  the schema. Example::
+
+    ExampleRow = typing.NamedTuple('ExampleRow',
+                                   [('id', int), ('name', unicode)])
+
+    with Pipeline() as p:
+      result = (
+          p
+          | ReadFromSpanner(
+              instance_id='your_instance_id',
+              database_id='your_database_id',
+              project_id='your_project_id',
+              row_type=ExampleRow,
+              sql='SELECT * FROM some_table',
+          ).with_output_types(ExampleRow))
+
+  Experimental; no backwards compatibility guarantees.
+  """
+  URN = 'beam:external:java:spanner:read:v1'
+
+  def __init__(
+      self,
+      project_id,
+      instance_id,
+      database_id,
+      row_type=None,
+      sql=None,
+      table=None,
+      host=None,
+      emulator_host=None,
+      batching=None,
+      timestamp_bound_mode=None,
+      read_timestamp=None,
+      exact_staleness=None,
+      time_unit=None,
+      expansion_service=None,
+  ):
+    """
+    Initializes a read operation from Spanner.
+
+    :param project_id: Specifies the Cloud Spanner project.
+    :param instance_id: Specifies the Cloud Spanner instance.
+    :param database_id: Specifies the Cloud Spanner database.
+    :param row_type: Row type that fits the given query or table. Passed as
+        NamedTuple, e.g. NamedTuple('name', [('row_name', unicode)])
+    :param sql: A SQL query to execute. Its results must fit the
+        provided row_type. Don't use when table is set.
+    :param table: A Spanner table. When provided, all columns from row_type
+        will be selected. Don't use when sql is set.
+    :param batching: By default Batch API is used to read data from Cloud
+        Spanner. It is useful to disable batching when the underlying query
+        is not root-partitionable.
+    :param host: Specifies the Cloud Spanner host.
+    :param emulator_host: Specifies Spanner emulator host.
+    :param timestamp_bound_mode: Defines how Cloud Spanner will choose a
+        timestamp for a read-only transaction or a single read/query.
+        Possible values:
+        STRONG: A timestamp bound that will perform reads and queries at a
+        timestamp where all previously committed transactions are visible.
+        READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at the given timestamp.
+        MIN_READ_TIMESTAMP: Returns a timestamp bound that will perform reads
+        and queries at a timestamp chosen to be at least given timestamp value.
+        EXACT_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at an exact staleness. The timestamp is chosen soon after the
+        read is started.
+        MAX_STALENESS: Returns a timestamp bound that will perform reads and
+        queries at a timestamp chosen to be at most time_unit stale.
+    :param read_timestamp: Timestamp in string. Use only when
+        timestamp_bound_mode is set to READ_TIMESTAMP or MIN_READ_TIMESTAMP.
+    :param exact_staleness: Staleness value as int. Use only when
+        timestamp_bound_mode is set to EXACT_STALENESS or MAX_STALENESS.
+        time_unit has to be set along with this param.
+    :param time_unit: Time unit for exact_staleness. Possible values:
+        NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS, HOURS, DAYS.
+    :param expansion_service: The address (host:port) of the ExpansionService.
+    """
+    assert row_type
+    assert (sql or table) and not (sql and table)
+    TimeUnit.verify_param(time_unit)
+    TimestampBoundMode.verify_param(timestamp_bound_mode)
+    staleness_value = int(exact_staleness) if exact_staleness else None
+
+    if staleness_value or time_unit:
+      assert staleness_value and time_unit and (
+          timestamp_bound_mode == TimestampBoundMode.MAX_STALENESS or
+          timestamp_bound_mode == TimestampBoundMode.EXACT_STALENESS)
+
+    if read_timestamp:
+      assert timestamp_bound_mode == TimestampBoundMode.MIN_READ_TIMESTAMP\
+             or timestamp_bound_mode == TimestampBoundMode.READ_TIMESTAMP
+
+    coders.registry.register_coder(row_type, coders.RowCoder)
+
+    super(ReadFromSpanner, self).__init__(
+        self.URN,
+        NamedTupleBasedPayloadBuilder(
+            ReadFromSpannerSchema(
+                instance_id=instance_id,
+                database_id=database_id,
+                sql=sql,
+                table=table,
+                schema=named_tuple_to_schema(row_type).SerializeToString(),
+                project_id=project_id,
+                host=host,
+                emulator_host=emulator_host,
+                batching=batching,
+                timestamp_bound_mode=timestamp_bound_mode,
+                read_timestamp=read_timestamp,
+                exact_staleness=staleness_value,
+                time_unit=time_unit,
+            ),
+        ),
+        expansion_service or default_io_expansion_service(),
+    )
+
+
+class Operation:
+  INSERT = 'INSERT'
+  DELETE = 'DELETE'
+  UPDATE = 'UPDATE'
+  REPLACE = 'REPLACE'
+  INSERT_OR_UPDATE = 'INSERT_OR_UPDATE'
+
+
+class TimeUnit:
+  NANOSECONDS = 'NANOSECONDS'
+  MICROSECONDS = 'MICROSECONDS'
+  MILLISECONDS = 'MILLISECONDS'
+  SECONDS = 'SECONDS'
+  HOURS = 'HOURS'
+  DAYS = 'DAYS'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimeUnit, param):
+      raise RuntimeError(
+          'Invalid param for TimeUnit: {}'.format(param))
+
+
+class TimestampBoundMode:
+  MAX_STALENESS = 'MAX_STALENESS'
+  EXACT_STALENESS = 'EXACT_STALENESS'
+  READ_TIMESTAMP = 'READ_TIMESTAMP'
+  MIN_READ_TIMESTAMP = 'MIN_READ_TIMESTAMP'
+  STRONG = 'STRONG'
+
+  @staticmethod
+  def verify_param(param):
+    if param and not hasattr(TimestampBoundMode, param):
+      raise RuntimeError(
+          'Invalid param for TimestampBoundMode: {}'.format(param))
+
+
+class WriteToSpannerTransform(PTransform):
+  URN = 'beam:external:java:spanner:write:v1'
+
+  def __init__(self, config, expansion_service, operation, table):
+    super(WriteToSpannerTransform, self).__init__()
+    self.config = config
+    self.expansion_service = expansion_service
+    self.operation = operation
+    self.table = table
+
+  def expand(self, row_pcoll):
+    return (
+        row_pcoll
+        | RowToMutation(self.operation, self.table)
+        | ExternalTransform(self.URN, self.config, self.expansion_service))
+
+
+class RowToMutation(PTransform):
+  def __init__(self, operation, table):
+    super(RowToMutation, self).__init__()
+    self.operation = operation
+    self.table = table
+
+  def expand(self, pcoll):
+    is_delete = self.operation == Operation.DELETE
+    mutation_name = 'Mutation_%s_%s' % (
+        self.operation, str(uuid.uuid4()).replace('-', ''))
+
+    # There is an error when pcoll.element_type is List[row_type] so pass
+    # a list of inner element types to NamedTuple explicitly.
+    is_list = hasattr(pcoll.element_type, 'inner_type')
+    row_type = pcoll.element_type.inner_type if is_list else pcoll.element_type

Review comment:
       The issue was that the type returned from `pcoll.element_type`, `List[SpannerTestKey]`, is not compatible with `typing.List[SpannerTestKey]` and caused a confusing error message:
   ```
   Exception: NamedTuple('Name', [(f0, t0), (f1, t1), ...]); each t must be a type Got List[SpannerTestKey].
   ```
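
The unwrapping that works around this can be illustrated in isolation. The following is a minimal sketch, not the PR's code: `FakeListHint` is a hypothetical stand-in for whatever list hint `pcoll.element_type` returns (assumed only to expose an `inner_type` attribute, as the diff above relies on), and the `f_string` field of `SpannerTestKey` is invented for illustration.

```python
import typing


class FakeListHint(object):
  """Hypothetical stand-in for a list type hint exposing inner_type."""
  def __init__(self, inner_type):
    self.inner_type = inner_type


def unwrap_row_type(element_type):
  """Returns (is_list, row_type), mirroring the check in RowToMutation."""
  is_list = hasattr(element_type, 'inner_type')
  row_type = element_type.inner_type if is_list else element_type
  return is_list, row_type


# The field name below is an assumption made for this example only.
SpannerTestKey = typing.NamedTuple('SpannerTestKey', [('f_string', str)])

# A plain row type passes through unchanged.
plain = unwrap_row_type(SpannerTestKey)

# A list hint is unwrapped, so typing.List can be rebuilt explicitly from a
# real type that typing.NamedTuple accepts as a field type.
is_list, row_type = unwrap_row_type(FakeListHint(SpannerTestKey))
rebuilt = typing.List[row_type]
```

Rebuilding `typing.List[row_type]` from the inner type means the field type handed to `NamedTuple` is a genuine `typing` construct rather than the hint object itself.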




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-710517731


   Run Python 3.8 PostCommit





[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r522470928



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,301 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+@SuppressWarnings({
+  "nullness" // TODO(https://issues.apache.org/jira/browse/BEAM-10402)
+})

Review comment:
       Could you try to address any lingering nullness errors here and in the other files that have it suppressed? If there are any intractable issues we could consider smaller `@SuppressWarnings` blocks around a few functions, but in general we should make sure that new classes pass the null checker.

##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,635 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.26.0 and later.
+
+  This option requires the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the machine where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide it as
+  a parameter when using the transforms provided in this module.
+
+  This option requires the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    instantiating the Spanner transforms provided in this module.
+
+  Flink users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/

Review comment:
       This information is getting duplicated across a lot of docstrings. It looks like https://github.com/apache/beam/pull/13317 will actually add similar information to the programming guide. I think we should rewrite all these docstrings to refer to that once it's complete.







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) (1c43284) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [02a1cd2...3a153ae](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-710517731












[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r483315059



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       I can reach out to the Spanner team to see if there's a good way to do this; I'll let you know if I learn anything. For now we can just plan on a Jira and a TODO.







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...70c3073](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726253827


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10131][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-677454675


   Run PythonDocker PreCommit





[GitHub] [beam] TheNeuralBit merged pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit merged pull request #12611:
URL: https://github.com/apache/beam/pull/12611


   





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-717282468


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r482075489



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language
+    int count = 0;
+    while (valueBuilder == null && count < fields.size()) {
+      valueBuilder = getFirstStructValue(struct, fields.get(count), schema);
+      ++count;
+    }
+    for (int i = count; i < fields.size(); ++i) {
+      valueBuilder = getStructValue(valueBuilder, struct, fields.get(i));
+    }
+    return valueBuilder != null ? valueBuilder.build() : Row.withSchema(schema).build();
+  }
+
+  public static Struct translateRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              structBuilder
+                  .set(column)
+                  .to(
+                      beamTypeToSpannerType(field.getType()),
+                      translateRowToStruct(row.getRow(column)));
+              break;
+            case ARRAY:
+              addArrayToStruct(structBuilder, row, field);
+              break;
+            case ITERABLE:
+              addIterableToStruct(structBuilder, row, field);
+              break;
+            case FLOAT:
+              structBuilder.set(column).to(row.getFloat(column).doubleValue());
+              break;
+            case DOUBLE:
+              structBuilder.set(column).to(row.getDouble(column));
+              break;
+            case DECIMAL:
+              structBuilder.set(column).to(row.getDecimal(column).doubleValue());

Review comment:
       I agree. At first I didn't include decimals, but this conversion is definitely lossy.
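
To illustrate the loss discussed in this comment, here is a minimal standalone sketch. It is not part of the PR, and the names `DecimalLossDemo` and `survivesDoubleRoundTrip` are hypothetical. It checks whether a `BigDecimal` keeps its value after passing through `doubleValue()`, the call used in the `DECIMAL` case of the diff above. It assumes the input is within the finite range of a `double`.

```java
import java.math.BigDecimal;

// Hypothetical demo class; not part of Beam or the Spanner client.
public class DecimalLossDemo {

  // Returns true only if converting to double and back yields the same numeric value.
  // new BigDecimal(double) reproduces the exact binary value of the double, so any
  // rounding introduced by doubleValue() shows up in the comparison.
  // Assumption: value fits in the finite double range (otherwise the constructor throws).
  static boolean survivesDoubleRoundTrip(BigDecimal value) {
    BigDecimal roundTripped = new BigDecimal(value.doubleValue());
    return roundTripped.compareTo(value) == 0;
  }

  public static void main(String[] args) {
    // 0.5 is a power of two, so a double represents it exactly.
    BigDecimal exact = new BigDecimal("0.5");
    // 29 significant digits, more than a 64-bit double can hold.
    BigDecimal wide = new BigDecimal("12345678901234567890.123456789");
    System.out.println(exact + " survives: " + survivesDoubleRoundTrip(exact));
    System.out.println(wide + " survives: " + survivesDoubleRoundTrip(wide));
  }
}
```

A check like this could back a unit test that pins down which decimal values a row-to-struct translation preserves.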







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-727145037


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-727145037


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-687108482


   @chamikaramj Brian asked me to ask you for further review, as he is going OOO this week.
   I've changed the API of WriteToSpanner to use WriteToSpanner(config).insert(table) etc. instead of MutationCreator.





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/ba89e76f5daecd496cafb2861dac5ef69480a973?el=desc) will **increase** coverage by `0.03%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.26%   40.30%   +0.03%     
   ==========================================
     Files         455      456       +1     
     Lines       53780    53926     +146     
   ==========================================
   + Hits        21655    21733      +78     
   - Misses      32125    32193      +68     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...s/python/apache\_beam/testing/synthetic\_pipeline.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9zeW50aGV0aWNfcGlwZWxpbmUucHk=) | `23.45% <0.00%> (+2.52%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [ba89e76...c24e8cb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...33a9b22](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53824      +98     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32150      +43     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...33a9b22](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716801835


   Run Python 3.8 PostCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) (1c43284) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) (3d6cc0e) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [02a1cd2...d05e0e4](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/3d6cc0ed9ed537229b27b5dbe73288f21b0e351c?el=desc) will **decrease** coverage by `0.04%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.48%   82.44%   -0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       54876    54975      +99     
   ==========================================
   + Hits        45266    45324      +58     
   - Misses       9610     9651      +41     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...eam/runners/interactive/interactive\_environment.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9lbnZpcm9ubWVudC5weQ==) | `89.45% <0.00%> (-0.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.27%)` | :arrow_down: |
   | [...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==) | `89.47% <0.00%> (-0.16%)` | :arrow_down: |
   | [...runners/interactive/display/pcoll\_visualization.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9kaXNwbGF5L3Bjb2xsX3Zpc3VhbGl6YXRpb24ucHk=) | `85.26% <0.00%> (-0.08%)` | :arrow_down: |
   | [...beam/runners/portability/local\_job\_service\_main.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9sb2NhbF9qb2Jfc2VydmljZV9tYWluLnB5) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=) | `89.20% <0.00%> (+0.44%)` | :arrow_up: |
   | [.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5) | `98.24% <0.00%> (+1.75%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [e4c95f2...656add8](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-687192168


   Run Java PreCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...5feb3d9](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r505657264



##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,483 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During pipeline construction, the Python
+  SDK connects to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are two ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option has the following prerequisite before running the Beam
+  pipeline.
+
+  * Install a Java runtime on the computer where the pipeline is constructed
+    and make sure that the 'java' command is available.
+
+  In this option, the Python SDK will either download (for a released Beam
+  version) or build (when running from a Beam Git clone) an expansion service
+  jar and use that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: Specify a custom expansion service*
+
+  In this option, you start up your own expansion service and provide its
+  address as a parameter when using the transforms provided in this module.
+
+  This option has the following prerequisites before running the Beam
+  pipeline.
+
+  * Start up your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink users can use the built-in expansion service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import Map
+from apache_beam import PTransform
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('max_batch_size_bytes', Optional[int]),
+        ('max_number_mutations', Optional[int]),
+        ('max_number_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)

Review comment:
       Great!
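
For context on the hunk above: `WriteToSpannerSchema` is a plain `typing.NamedTuple` whose `Optional[...]` fields mark the settings a caller may leave unset. Below is a minimal standalone sketch of that pattern. The names `ExampleWriteConfig` and `optional_fields` are illustrative and not part of the PR, `str` stands in for `unicode`, and the snippet does not use Beam at all.

```python
import typing
from typing import NamedTuple, Optional

# Hypothetical, trimmed-down config in the style of WriteToSpannerSchema,
# built with the same functional NamedTuple form the PR uses.
ExampleWriteConfig = NamedTuple(
    'ExampleWriteConfig',
    [
        ('instance_id', str),
        ('database_id', str),
        ('project_id', Optional[str]),
        ('max_batch_size_bytes', Optional[int]),
    ],
)


def optional_fields(schema):
  """Returns the sorted names of fields annotated as Optional[...]."""
  hints = typing.get_type_hints(schema)
  return sorted(
      name for name, hint in hints.items()
      if type(None) in typing.get_args(hint))


# The functional form defines no defaults, so unset options are passed
# explicitly as None.
config = ExampleWriteConfig(
    instance_id='my-instance',
    database_id='my-database',
    project_id=None,
    max_batch_size_bytes=None,
)
```

Here `optional_fields(ExampleWriteConfig)` picks out `project_id` and `max_batch_size_bytes`, which is the distinction a payload builder can use to decide which fields are nullable in the resulting schema.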







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.05%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.29%   +0.05%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53884     +158     
   ==========================================
   + Hits        21619    21714      +95     
   - Misses      32107    32170      +63     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   | [sdks/python/apache\_beam/portability/python\_urns.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvcHl0aG9uX3VybnMucHk=) | `100.00% <0.00%> (ø)` | |
   | [...s/python/apache\_beam/io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=) | `23.36% <0.00%> (ø)` | |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `28.92% <0.00%> (+0.24%)` | :arrow_up: |
   | [sdks/python/apache\_beam/dataframe/frames.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL2ZyYW1lcy5weQ==) | `43.73% <0.00%> (+0.76%)` | :arrow_up: |
   | [sdks/python/apache\_beam/transforms/core.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb3JlLnB5) | `39.22% <0.00%> (+0.94%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...b0c23a2](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/6bf56f92b34f7c15b752c46eca19489a604c4775?el=desc) will **decrease** coverage by `0.06%`.
   > The diff coverage is `56.73%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   - Coverage   82.51%   82.44%   -0.07%     
   ==========================================
     Files         455      456       +1     
     Lines       54867    54975     +108     
   ==========================================
   + Hits        45272    45324      +52     
   - Misses       9595     9651      +56     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `56.73% <56.73%> (ø)` | |
   | [sdks/python/apache\_beam/io/source\_test\_utils.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vc291cmNlX3Rlc3RfdXRpbHMucHk=) | `88.28% <0.00%> (-1.36%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/gcp/bigquery\_tools.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X3Rvb2xzLnB5) | `87.79% <0.00%> (-0.57%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `94.07% <0.00%> (-0.40%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5) | `83.75% <0.00%> (-0.29%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [6bf56f9...1c43284](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-686447747


   @TheNeuralBit I've reworked it a bit.
   - Checking schema equality is redundant, because a mismatch throws an exception with a clear message anyway (class cast failure or unknown column). It's also possible to just add row.addFieldValues(Map<String, Object> values) and rely on the casts that follow the schema.
   - I removed the code duplication between addArray and addIterable at the cost of some slightly ugly casts (they need @SuppressWarnings("unchecked")), but I don't think it can easily be achieved otherwise.
   - I can't think of a way to remove the duplication between the addIterableToMutationBuilder and addIterableToStructBuilder methods. They operate on unrelated classes (Struct.Builder and Mutation.WriteBuilder). Maybe my Java knowledge is insufficient here. I could write an interface that mimics .setInt64Array, .setStructArray, etc., but that would be even more boilerplate.
   - I aligned the APIs of the two Python Spanner connectors. Not everything could be matched 1:1, but the corresponding keyword names and the order of positional arguments were changed to agree.
   - Nulls now come through with no problems. I had used ImmutableMap.Builder, which doesn't allow null values; I changed it to a plain HashMap and now it's OK.





[GitHub] [beam] piotr-szuberski edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-687108482


   @chamikaramj Brian asked me to ask you to continue the review, as he is going OOO this week. I'd be grateful :)
   I've changed the WriteToSpanner API to use WriteToSpanner(config).insert(table), etc., instead of MutationCreator.
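
The call shape described above, `WriteToSpanner(config).insert(table)`, can be sketched roughly as follows. Everything in this snippet is a hypothetical stand-in: `SketchSpannerConfig` and `SketchWriteToSpanner` are not the classes from the PR, the `update` and `delete` methods are an assumption about what "etc." covers, and each method returns a plain dict rather than a Beam transform.

```python
from typing import NamedTuple, Optional


class SketchSpannerConfig(NamedTuple):
  """Hypothetical connection settings; not the schema from the PR."""
  instance_id: str
  database_id: str
  project_id: Optional[str] = None


class SketchWriteToSpanner(object):
  """Hypothetical stand-in that shows the call shape only.

  Each method returns a dict describing the write instead of building
  a Beam transform.
  """
  def __init__(self, config):
    self._config = config

  def _operation(self, name, table):
    # Combine the shared connection settings with the chosen operation.
    description = dict(self._config._asdict())
    description.update(operation=name, table=table)
    return description

  def insert(self, table):
    return self._operation('insert', table)

  def update(self, table):
    return self._operation('update', table)

  def delete(self, table):
    return self._operation('delete', table)


config = SketchSpannerConfig(instance_id='my-instance', database_id='my-db')
write = SketchWriteToSpanner(config).insert('users')
```

The point of this shape is that connection settings are given once and the mutation kind is chosen by method name, so the caller does not have to construct a separate mutation-creating object.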





[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726932495


   Run Python 3.7 PostCommit





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r522052915



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerTransformRegistrar.java
##########
@@ -0,0 +1,287 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.auto.service.AutoService;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.TimestampBound;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.model.pipeline.v1.SchemaApi;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.expansion.ExternalTransformRegistrar;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.SchemaTranslation;
+import org.apache.beam.sdk.transforms.ExternalTransformBuilder;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.grpc.v1p26p0.com.google.protobuf.InvalidProtocolBufferException;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableMap;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Duration;
+
+/**
+ * Exposes {@link SpannerIO.WriteRows} and {@link SpannerIO.ReadRows} as an external transform for
+ * cross-language usage.
+ */
+@Experimental(Kind.PORTABILITY)
+@AutoService(ExternalTransformRegistrar.class)
+public class SpannerTransformRegistrar implements ExternalTransformRegistrar {
+  public static final String WRITE_URN = "beam:external:java:spanner:write:v1";
+  public static final String READ_URN = "beam:external:java:spanner:read:v1";
+
+  @Override
+  public Map<String, ExternalTransformBuilder<?, ?, ?>> knownBuilderInstances() {
+    return ImmutableMap.of(WRITE_URN, new WriteBuilder(), READ_URN, new ReadBuilder());

Review comment:
       That makes sense. Done.
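
As a language-neutral illustration of what the registrar in the hunk above provides, here is a small Python sketch of a URN-to-builder lookup. The two URN strings are taken from the hunk; `build_write`, `build_read`, `KNOWN_BUILDERS` and `build_transform` are hypothetical names, and the builders return placeholder tuples rather than real transforms.

```python
WRITE_URN = 'beam:external:java:spanner:write:v1'
READ_URN = 'beam:external:java:spanner:read:v1'


def build_write(config):
  # Placeholder: a real builder would construct the write transform.
  return ('write', config)


def build_read(config):
  # Placeholder: a real builder would construct the read transform.
  return ('read', config)


# Mirrors the idea behind knownBuilderInstances(): one builder per URN.
KNOWN_BUILDERS = {WRITE_URN: build_write, READ_URN: build_read}


def build_transform(urn, config):
  """Looks up the builder registered for `urn` and applies it to `config`."""
  try:
    builder = KNOWN_BUILDERS[urn]
  except KeyError:
    raise ValueError('Unknown transform URN: %r' % urn)
  return builder(config)
```

A caller that only knows the URN and a configuration payload can then obtain the matching transform without importing the builder directly.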







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10131][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-676576861


   Run PythonDocker PreCommit





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r482080324



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);

Review comment:
       Maybe we could just skip this check and let it crash when the types don't match?







[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716740271









[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...f7db356](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   








[GitHub] [beam] piotr-szuberski removed a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski removed a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-726125340









[GitHub] [beam] TheNeuralBit commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r477614300



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language

Review comment:
       What is the issue here? Nullable fields should be supported in cross-language pipelines.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;
+    }
+
+    @Override
+    public PCollection<Row> expand(PBegin input) {
+      return input
+          .apply(read)
+          .apply(
+              MapElements.into(TypeDescriptor.of(Row.class))
+                  .via(
+                      new SerializableFunction<Struct, Row>() {
+                        @Override
+                        public Row apply(Struct struct) {
+                          return StructUtils.translateStructToRow(struct, schema);
+                        }
+                      }))
+          .setRowSchema(schema)
+          .setCoder(RowCoder.of(schema));

Review comment:
       ```suggestion
             .setRowSchema(schema);
   ```
   
   `setRowSchema` already does the equivalent of `setCoder(RowCoder.of(schema))`, so the explicit `setCoder` call is redundant.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);

Review comment:
       Another reason to get the schema eagerly at pipeline construction time: this check is an expensive operation to perform for every Struct that we read.
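
To illustrate the cost argument, here is a minimal sketch in plain Python (standard library only, not Beam or Spanner code) of checking a schema once when the reader is constructed instead of once per record. The names `Schema`, `check_schemas_match`, and `RowReader` are hypothetical and only stand in for the idea; they are not part of this PR.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical schema representation: ordered (column name, type name) pairs.
Schema = List[Tuple[str, str]]


def check_schemas_match(expected: Schema, actual: Schema) -> None:
    """Raise ValueError if the schemas differ in length, names, or types."""
    if len(expected) != len(actual):
        raise ValueError("field count mismatch")
    for exp, act in zip(expected, actual):
        if exp != act:
            raise ValueError(f"field mismatch: {exp} vs {act}")


class RowReader:
    """Validates the schema once at construction, then converts records."""

    def __init__(self, expected: Schema, source_schema: Schema) -> None:
        # Runs a single time, not once per record.
        check_schemas_match(expected, source_schema)
        self._names = [name for name, _ in expected]

    def convert(self, records: Iterable[tuple]) -> Iterator[Dict[str, object]]:
        # The per-record path does no schema comparison at all.
        for record in records:
            yield dict(zip(self._names, record))


reader = RowReader([("Key", "INT64")], [("Key", "INT64")])
rows = list(reader.convert([(1,), (2,)]))
```

With this shape a mismatch fails fast when the reader is built, and the per-record loop stays a cheap name-to-value zip.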

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       We could also punt on this question and file a Jira issue with a TODO here. I recognize this is a little out of scope for BEAM-10139 and BEAM-10140.

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       It would be really great if `SpannerIO.ReadRows` could determine the schema at pipeline construction time so the user doesn't have to specify it. In `SpannerIO.Read#expand` we require the user to specify either a query or a list of columns: https://github.com/apache/beam/blob/2872e37d801b489ecbb2c0d6a2a70430d8ba91e9/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java#L656-L671
   
   In both cases we're very close to a schema. We just need to analyze the query and/or get the output types for the projected columns. I looked into it a little bit, but I'm not quite sure of the best way to use the Spanner client to look up the schema. The only thing I could figure out was to start a read and look at the type of `ResultSet#getCurrentRowAsStruct`, which seems less than ideal.
   
   CC @nielm who's done some work with SpannerIO recently - do you have any suggestions for a way to determine the types of the Structs that SpannerIO.Read will produce?
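
For the list-of-columns case only, the idea can be sketched in plain Python, assuming the table's column metadata is somehow available at construction time. How to obtain that metadata from Spanner is exactly the open question above, so the `table` dictionary, the `Field` tuple, and `schema_for_columns` are all hypothetical; the query case is not covered.

```python
from typing import Dict, List, NamedTuple


class Field(NamedTuple):
    """Hypothetical column description: name, type name, nullability."""
    name: str
    type_name: str
    nullable: bool


def schema_for_columns(
    table_columns: Dict[str, Field], projected: List[str]
) -> List[Field]:
    """Pick the projected columns, in projection order, from table metadata."""
    missing = [c for c in projected if c not in table_columns]
    if missing:
        raise ValueError(f"unknown columns: {missing}")
    return [table_columns[c] for c in projected]


# Assumed metadata for a two-column table.
table = {
    "Key": Field("Key", "INT64", False),
    "Name": Field("Name", "STRING", True),
}
schema = schema_for_columns(table, ["Name", "Key"])
```

The result follows the order of the projection rather than the table, which is what a reader of the projected columns would presumably expect.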

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {

Review comment:
       Woo, this is a hefty class for type conversions! It does seem like there's a lot of duplicated logic. What's preventing us from combining more of it?
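
One common way to fold this kind of duplication is a lookup table of converters shared by the scalar and array paths. The sketch below is plain Python, not the Java in this PR, and the type names and the `CONVERTERS`, `convert_value`, and `convert_array` helpers are hypothetical simplifications of the real Beam and Spanner type enums.

```python
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical type names mapped to one converter each.
CONVERTERS: Dict[str, Callable[[Any], Any]] = {
    "INT64": int,
    "FLOAT64": float,
    "STRING": str,
    "BOOL": bool,
}


def convert_value(type_name: str, value: Any) -> Any:
    """Convert one scalar; None passes through so null columns stay null."""
    if value is None:
        return None
    try:
        converter = CONVERTERS[type_name]
    except KeyError:
        raise ValueError(f"unsupported type: {type_name}") from None
    return converter(value)


def convert_array(type_name: str, values: Iterable[Any]) -> List[Any]:
    """Arrays reuse the scalar path instead of repeating a per-type switch."""
    return [convert_value(type_name, v) for v in values]
```

Adding a type then means adding one table entry, and unsupported types fail in a single place.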

##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language
+    int count = 0;
+    while (valueBuilder == null && count < fields.size()) {
+      valueBuilder = getFirstStructValue(struct, fields.get(count), schema);
+      ++count;
+    }
+    for (int i = count; i < fields.size(); ++i) {
+      valueBuilder = getStructValue(valueBuilder, struct, fields.get(i));
+    }
+    return valueBuilder != null ? valueBuilder.build() : Row.withSchema(schema).build();
+  }
+
+  public static Struct translateRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              structBuilder
+                  .set(column)
+                  .to(
+                      beamTypeToSpannerType(field.getType()),
+                      translateRowToStruct(row.getRow(column)));
+              break;
+            case ARRAY:
+              addArrayToStruct(structBuilder, row, field);
+              break;
+            case ITERABLE:
+              addIterableToStruct(structBuilder, row, field);
+              break;
+            case FLOAT:
+              structBuilder.set(column).to(row.getFloat(column).doubleValue());
+              break;
+            case DOUBLE:
+              structBuilder.set(column).to(row.getDouble(column));
+              break;
+            case DECIMAL:
+              structBuilder.set(column).to(row.getDecimal(column).doubleValue());

Review comment:
       This is lossy, isn't it? I think we should just refuse to convert DECIMAL since Spanner doesn't have a corresponding type: https://cloud.google.com/spanner/docs/data-types#allowable_types
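
To make the precision concern concrete, here is a minimal Python sketch (standard library only) of narrowing a high-precision decimal to a 64-bit float, which is the same kind of narrowing `BigDecimal.doubleValue()` performs in Java. The sample value and the helper name `to_float64_and_back` are made up for illustration.

```python
from decimal import Decimal


def to_float64_and_back(value: Decimal) -> Decimal:
    """Narrow a decimal to a 64-bit float, then widen it again exactly."""
    return Decimal(float(value))


original = Decimal("12345678901234567890.123456789")
round_tripped = to_float64_and_back(original)

# A double carries roughly 15 to 17 significant decimal digits, so the
# 29-digit input cannot survive the round trip unchanged.
print(original == round_tripped)  # False
```

Values that happen to be exactly representable in binary, such as 0.5, do survive, which is why this kind of loss tends to go unnoticed in small tests.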

##########
File path: sdks/java/io/google-cloud-platform/expansion-service/build.gradle
##########
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+apply plugin: 'org.apache.beam.module'
+apply plugin: 'application'
+mainClassName = "org.apache.beam.sdk.expansion.service.ExpansionService"
+
+applyJavaNature(
+        enableChecker: true,
+        automaticModuleName: 'org.apache.beam.sdk.io.gcp.expansion.service',
+        exportJavadoc: false,
+        validateShadowJar: false,
+        shadowClosure: {},
+)
+
+task runService(type: Exec) {
+    dependsOn shadowJar
+    executable 'sh'
+    args '-c', 'java -jar /Users/piotr/beam/sdks/java/io/google-cloud-platform/expansion-service/build/libs/beam-sdks-java-io-google-cloud-platform-expansion-service-2.24.0-SNAPSHOT.jar 8097'
+}

Review comment:
       Looks like this was just there for testing?

##########
File path: sdks/python/apache_beam/io/gcp/spanner.py
##########
@@ -0,0 +1,504 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are several ways to set up cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Install Java runtime in the computer from where the pipeline is constructed
+    and make sure that 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) or
+  build (when running from a Beam Git clone) an expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you startup your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Startup your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+    initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+    'WriteToSpanner',
+    'ReadFromSpanner',
+    'MutationCreator',
+    'TimestampBoundMode',
+    'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+      'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+    'WriteToSpannerSchema',
+    [
+        ('instance_id', unicode),
+        ('database_id', unicode),
+        ('project_id', Optional[unicode]),
+        ('batch_size_bytes', Optional[int]),
+        ('max_num_mutations', Optional[int]),
+        ('max_num_rows', Optional[int]),
+        ('grouping_factor', Optional[int]),
+        ('host', Optional[unicode]),
+        ('emulator_host', Optional[unicode]),
+        ('commit_deadline', Optional[int]),
+        ('max_cumulative_backoff', Optional[int]),
+    ],
+)
+
+
+class WriteToSpanner(ExternalTransform):

Review comment:
       It looks like there's already a native SpannerIO in the Python SDK in [apache_beam/io/gcp/experimental/spannerio.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/experimental/spannerio.py). Are we planning on removing that one? Should the API for this one be compatible with that one?







[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `58.51%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53825      +99     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32151      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `58.51% <58.51%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...78c40e7](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r523132998



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,380 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+import org.joda.time.ReadableDateTime;
+
+final class StructUtils {
+  public static Row structToBeamRow(Struct struct, Schema schema) {
+    Map<String, Object> structValues =
+        schema.getFields().stream()
+            .collect(
+                HashMap::new,
+                (map, field) -> {
+                  @Nullable Object structValue = getStructValue(struct, field);
+                  if (structValue == null) {
+                    throw new NullPointerException("Null struct value at field " + field.getName());
+                  }
+                  map.put(field.getName(), structValue);
+                },
+                Map::putAll);
+    return Row.withSchema(schema).withFieldValues(structValues).build();
+  }
+
+  public static Struct beamRowToStruct(Row row) {
+    Struct.Builder structBuilder = Struct.newBuilder();
+    List<Schema.Field> fields = row.getSchema().getFields();
+    fields.forEach(
+        field -> {
+          String column = field.getName();
+          switch (field.getType().getTypeName()) {
+            case ROW:
+              @Nullable Row subRow = row.getRow(column);
+              if (subRow == null) {
+                throw new NullPointerException(String.format("Null subRow at '%s' column", column));
+              }

Review comment:
       Guava's checkNotNull doesn't work here: the nullness checker doesn't treat fields checked this way as non-null. It also isn't known to throw, so I get missing return statement errors in the switch statements. For now I'd leave the explicit check that throws an NPE.
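
The pattern described in this comment, an explicit null check that throws so the value can be treated as non-null afterwards, can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the PR's code: `NullCheckSketch`, `requireColumn`, and `describe` are hypothetical names, and a `Map` stands in for a Beam `Row`.

```java
import java.util.HashMap;
import java.util.Map;

final class NullCheckSketch {
  // Hypothetical helper: returns the value stored for a column,
  // or throws an NPE that names the offending column.
  static Object requireColumn(Map<String, Object> row, String column) {
    Object value = row.get(column);
    if (value == null) {
      throw new NullPointerException(String.format("Null value at '%s' column", column));
    }
    // Past the explicit check, the value is known to be non-null.
    return value;
  }

  // Hypothetical caller: uses the checked value without further null handling.
  static String describe(Map<String, Object> row, String column) {
    Object value = requireColumn(row, column);
    return column + "=" + value;
  }

  public static void main(String[] args) {
    Map<String, Object> row = new HashMap<>();
    row.put("id", 1L);
    row.put("name", null);
    System.out.println(describe(row, "id"));
    try {
      describe(row, "name");
    } catch (NullPointerException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Because the throw sits directly in the method body, both the compiler's flow analysis and a nullness checker can see that execution only continues with a non-null value.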







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-717180857


   Run Python 3.8 PostCommit





[GitHub] [beam] piotr-szuberski edited a comment on pull request #12611: [BEAM-10131][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-676422822


   > Regarding testing: we could consider adding a spanner instance to apache-beam-testing for integration testing, I'd suggest raising it on dev@ if you want to pursue it. I also just came across https://cloud.google.com/spanner/docs/emulator which could be a good option too. It's a Docker container that starts up an in-memory version of Spanner to test against.
   
   @TheNeuralBit Great advice as always! I tried to find something like this emulator on Docker Hub, but without success. I got this emulator working; its support is much better than what localstack offers for AWS.
   
   A few comments about this PR:
   
   I am almost certain that the Schema doesn't have to be sent as a proto in Read, but I couldn't come up with anything else.
   
   Another issue is representing the Mutation. For now it's a Row containing 4 fields: operation, table, rows and key_set. It works quite well, but I wonder whether it could be done better.
   
   I removed SpannerWriteResult and return PDone for now, as I don't see a way to keep it without adding Spanner dependencies to java.core. Because of that, the failure mode is FAIL_FAST and I didn't include it in the configuration params.
   
   Transactions are not supported because they require a PTransform to be transferred. I suppose it's doable, though, and it could be a good future improvement.
   
   FYI - I'll be OOO next week, so there is absolutely no rush :)
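
The Mutation representation mentioned above, a row with the four fields operation, table, rows and key_set, can be pictured with the following sketch. Only the four field names come from the comment; the class `MutationRowSketch`, the use of a `Map` in place of a Beam `Row`, the value types, and the idea that `rows` carries written rows while `key_set` carries keys for deletes are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MutationRowSketch {
  // Field names taken from the comment above; everything else is hypothetical.
  static final List<String> FIELDS = Arrays.asList("operation", "table", "rows", "key_set");

  // Builds a map with the four fields in a fixed order (a stand-in for a Beam Row).
  static Map<String, Object> mutationRow(
      String operation,
      String table,
      List<Map<String, Object>> rows,
      List<List<Object>> keySet) {
    Map<String, Object> mutation = new LinkedHashMap<>();
    mutation.put("operation", operation);
    mutation.put("table", table);
    mutation.put("rows", rows);
    mutation.put("key_set", keySet);
    return mutation;
  }

  public static void main(String[] args) {
    Map<String, Object> user = new LinkedHashMap<>();
    user.put("id", 1L);
    user.put("name", "alice");

    // Assumed shape of a write: rows are filled, key_set is empty.
    Map<String, Object> insert =
        mutationRow("INSERT", "users", Collections.singletonList(user), Collections.emptyList());

    // Assumed shape of a delete: rows are empty, key_set lists the keys.
    Map<String, Object> delete =
        mutationRow(
            "DELETE",
            "users",
            Collections.emptyList(),
            Collections.singletonList(Arrays.<Object>asList(1L)));

    System.out.println(insert);
    System.out.println(delete);
  }
}
```

A single flat shape like this keeps the cross-language payload uniform across operations, at the cost of one field being unused for any given operation.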





[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r488422134



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java
##########
@@ -0,0 +1,545 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.spanner;
+
+import static java.util.stream.Collectors.toList;
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import com.google.cloud.ByteArray;
+import com.google.cloud.Timestamp;
+import com.google.cloud.spanner.Struct;
+import com.google.cloud.spanner.Type;
+import java.math.BigDecimal;
+import java.util.List;
+import java.util.stream.StreamSupport;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.Instant;
+
+final class StructUtils {
+  public static Row translateStructToRow(Struct struct, Schema schema) {
+    checkForSchemasEquality(schema.getFields(), struct.getType().getStructFields(), false);
+
+    List<Schema.Field> fields = schema.getFields();
+    Row.FieldValueBuilder valueBuilder = null;
+    // TODO: Remove this null-checking once nullable fields are supported in cross-language

Review comment:
       I'm not sure where my message has gone, but I wrote that nulls come through with no problems. I had just used ImmutableMap, which does not allow null values. Replacing it with java.util.HashMap solved the issue.
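
The difference described here can be reproduced with the JDK alone. In this sketch `Map.of` is used as a stand-in for Guava's `ImmutableMap` (an assumption made to keep the example dependency-free; both reject null values), and `NullValueMapSketch` and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class NullValueMapSketch {
  // Returns true if building an immutable JDK map with a null value throws an NPE.
  static boolean immutableMapRejectsNull() {
    try {
      Map.of("nullable_field", (Object) null);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  // A HashMap stores the key with a null value instead of throwing.
  static Map<String, Object> hashMapWithNull() {
    Map<String, Object> values = new HashMap<>();
    values.put("nullable_field", null);
    return values;
  }

  public static void main(String[] args) {
    System.out.println("immutable map rejects null: " + immutableMapRejectsNull());
    Map<String, Object> values = hashMapWithNull();
    System.out.println("HashMap contains key: " + values.containsKey("nullable_field"));
    System.out.println("HashMap value: " + values.get("nullable_field"));
  }
}
```

With a `HashMap`, `containsKey` distinguishes a field that is present with a null value from a field that is absent, which `get` alone cannot do.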







[GitHub] [beam] piotr-szuberski commented on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
piotr-szuberski commented on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-716871303


   Run Python PreCommit





[GitHub] [beam] codecov[bot] edited a comment on pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

Posted by GitBox <gi...@apache.org>.
codecov[bot] edited a comment on pull request #12611:
URL: https://github.com/apache/beam/pull/12611#issuecomment-683611200


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=h1) Report
   > Merging [#12611](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=desc) into [master](https://codecov.io/gh/apache/beam/commit/0335ba5403e5071a1376a3a4101d14b03cf90b8d?el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `59.13%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/beam/pull/12611/graphs/tree.svg?width=650&height=150&src=pr&token=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #12611      +/-   ##
   ==========================================
   + Coverage   40.23%   40.26%   +0.02%     
   ==========================================
     Files         455      456       +1     
     Lines       53726    53824      +98     
   ==========================================
   + Hits        21619    21674      +55     
   - Misses      32107    32150      +43     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [sdks/python/apache\_beam/io/gcp/spanner.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL3NwYW5uZXIucHk=) | `59.13% <59.13%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/12611/diff?src=pr&el=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `41.17% <0.00%> (-1.58%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=footer). Last update [0335ba5...dd827fb](https://codecov.io/gh/apache/beam/pull/12611?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

