Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/10/08 12:21:35 UTC

[GitHub] [beam] piotr-szuberski commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501676146



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
       return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       Thank you @nielm! I considered the LIMIT approach as well, but I ran into the same arguments against it.
   
   It appears there is a JDBC driver for Spanner: https://cloud.google.com/spanner/docs/jdbc-drivers . I'll try to figure out whether I can use it.
   
   Spanner's REST API also has ResultSetMetadata, which is a JSON object: https://cloud.google.com/spanner/docs/reference/rest/v1/ResultSetMetadata . In the end, though, obtaining it still requires fetching at least part of the data.
   
   But I would leave that for another PR, since it would presumably require moving SchemaUtils from io/jdbc to a more general place (extensions/sql?). Also, as far as I can see, the JDBC driver represents the Struct type as a String, as mentioned here:
   ```
   The Cloud Spanner STRUCT data type is mapped to a SQL VARCHAR data type, accessible through this driver as String types. All other types have appropriate mappings.
   ```
   So it may not be the best option.
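   
   For reference, a rough sketch of what reading column names and types from JDBC metadata could look like. It uses only the standard `java.sql.ResultSetMetaData` and `java.sql.Types` APIs. The class `JdbcSchemaSketch`, its method names, and the type-name strings are hypothetical and exist only for illustration; this is not the existing SchemaUtils code and it does not build a Beam `Schema`.
   
   ```java
   import java.sql.ResultSetMetaData;
   import java.sql.SQLException;
   import java.sql.Types;
   import java.util.LinkedHashMap;
   import java.util.Map;
   
   /** Hypothetical helper for illustration only; not part of Beam or of this PR. */
   final class JdbcSchemaSketch {
   
     /** Maps a java.sql.Types code to a coarse, illustrative field-type name. */
     static String fieldTypeName(int sqlType) {
       switch (sqlType) {
         case Types.BIGINT:
           return "INT64";
         case Types.DOUBLE:
           return "DOUBLE";
         case Types.BOOLEAN:
           return "BOOLEAN";
         case Types.VARCHAR:
         case Types.NVARCHAR:
           // Per the driver docs quoted above, Spanner STRUCT columns are also
           // exposed as VARCHAR, so their nested structure is lost at this point.
           return "STRING";
         default:
           return "UNSUPPORTED(" + sqlType + ")";
       }
     }
   
     /** Collects column name -> type name, in column order, from query metadata. */
     static Map<String, String> describeColumns(ResultSetMetaData metadata) throws SQLException {
       Map<String, String> columns = new LinkedHashMap<>();
       // JDBC column indices are 1-based.
       for (int i = 1; i <= metadata.getColumnCount(); i++) {
         columns.put(metadata.getColumnName(i), fieldTypeName(metadata.getColumnType(i)));
       }
       return columns;
     }
   }
   ```
   
   The VARCHAR branch is where the STRUCT limitation shows up: a STRUCT column cannot be told apart from a plain string column by its type code alone.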




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org