Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/13 16:27:02 UTC

[GitHub] [beam] damondouglas opened a new pull request, #22262: Implement PubsubSchemaTransformWriteConfiguration

damondouglas opened a new pull request, #22262:
URL: https://github.com/apache/beam/pull/22262

   This PR addresses #21412 with a PubsubSchemaTransformWriteConfiguration implementation.  Its design goal is a like-for-like replication of the PubsubSchemaIOTransform write configuration details.  Once this PR is approved and merged, the plan is to implement the final PubsubSchemaTransformWriteProvider.
   
   Because the beam_PreCommit_Java tests were failing repeatedly, I validated this PR by running the following before submission:
   
   ```
   ./gradlew rat
   ./gradlew spotlessCheck
   ./gradlew sdks:java:io:google-cloud-platform:check
   ./gradlew sdks:java:io:google-cloud-platform:checkStyleMain
   ```
   
   I would like to request the following to review this PR:
   R: @pabloem 
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [x] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
    - ~Update `CHANGES.md` with noteworthy changes.~
    - ~If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).~
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] pabloem merged pull request #22262: Implement PubsubSchemaTransformWriteConfiguration

Posted by GitBox <gi...@apache.org>.
pabloem merged PR #22262:
URL: https://github.com/apache/beam/pull/22262




[GitHub] [beam] pabloem commented on a diff in pull request #22262: Implement PubsubSchemaTransformWriteConfiguration

Posted by GitBox <gi...@apache.org>.
pabloem commented on code in PR #22262:
URL: https://github.com/apache/beam/pull/22262#discussion_r940684555


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaTransformWriteConfiguration.java:
##########
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.pubsub;
+
+import com.google.auto.value.AutoValue;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.schemas.AutoValueSchema;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
+
+/**
+ * Configuration for writing to Pub/Sub.
+ *
+ * <p><b>Internal only:</b> This class is actively being worked on, and it will likely change. We
+ * provide no backwards compatibility guarantees, and it should not be implemented outside the Beam
+ * repository.
+ */
+@Experimental
+@DefaultSchema(AutoValueSchema.class)
+@AutoValue
+public abstract class PubsubSchemaTransformWriteConfiguration {
+
+  /** The expected schema of the Pub/Sub message. */
+  public abstract Schema getDataSchema();

Review Comment:
   Since we're writing, won't the schema be known from the upstream transform?
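
The reviewer's question can be illustrated with a minimal sketch. It assumes a write step whose input already carries a schema, so a configured schema becomes optional and, when present, only serves as a consistency check. `UpstreamSchemaSketch`, `SimpleSchema`, and `resolveWriteSchema` are hypothetical stand-ins written for this example; they are not Beam classes and do not appear in the PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-ins for illustration only; these are not Beam classes. */
final class UpstreamSchemaSketch {

  /** Minimal schema model: an ordered list of field names. */
  static final class SimpleSchema {
    final List<String> fieldNames;

    SimpleSchema(String... fieldNames) {
      this.fieldNames = Arrays.asList(fieldNames);
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof SimpleSchema && fieldNames.equals(((SimpleSchema) o).fieldNames);
    }

    @Override
    public int hashCode() {
      return fieldNames.hashCode();
    }
  }

  /**
   * Picks the schema a write step would serialize with.
   *
   * <p>configuredSchema may be null, in which case the upstream schema is used as-is. If both
   * are present they must agree; a mismatch is reported instead of silently ignored.
   */
  static SimpleSchema resolveWriteSchema(
      SimpleSchema configuredSchema, SimpleSchema upstreamSchema) {
    Objects.requireNonNull(upstreamSchema, "upstream input must carry a schema");
    if (configuredSchema != null && !configuredSchema.equals(upstreamSchema)) {
      throw new IllegalArgumentException(
          "Configured schema "
              + configuredSchema.fieldNames
              + " does not match upstream schema "
              + upstreamSchema.fieldNames);
    }
    return upstreamSchema;
  }
}
```

Under this assumption, a configured schema adds no information the writer lacks, which is the trade-off the comment raises. Whether the real transform should keep `getDataSchema()` is left open by the thread.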





[GitHub] [beam] pabloem commented on a diff in pull request #22262: Implement PubsubSchemaTransformWriteConfiguration

Posted by GitBox <gi...@apache.org>.
pabloem commented on code in PR #22262:
URL: https://github.com/apache/beam/pull/22262#discussion_r940684331


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaTransformWriteConfiguration.java:
##########
@@ -0,0 +1,132 @@
+package org.apache.beam.sdk.io.gcp.pubsub;
+
+import com.google.auto.value.AutoValue;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.schemas.AutoValueSchema;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
+
+/**
+ * Configuration for writing to Pub/Sub.
+ *
+ * <p><b>Internal only:</b> This class is actively being worked on, and it will likely change. We
+ * provide no backwards compatibility guarantees, and it should not be implemented outside the Beam
+ * repository.
+ */
+@Experimental
+@DefaultSchema(AutoValueSchema.class)
+@AutoValue
+public abstract class PubsubSchemaTransformWriteConfiguration {
+
+  /** The expected schema of the Pub/Sub message. */
+  public abstract Schema getDataSchema();
+
+  /**
+   * The topic to which to write Pub/Sub messages.
+   *
+   * <p>See {@link PubsubIO.PubsubTopic#fromPath(String)} for more details on the format of the
+   * topic string.
+   */
+  public abstract String getTopic();
+
+  /**
+   * The expected format of the Pub/Sub message.
+   *
+   * <p>Used to retrieve the {@link org.apache.beam.sdk.schemas.io.payloads.PayloadSerializer} from
+   * {@link org.apache.beam.sdk.schemas.io.payloads.PayloadSerializers}.
+   */
+  @Nullable
+  public abstract String getFormat();
+
+  /** Used by the ProtoPayloadSerializerProvider when serializing to a Pub/Sub message. */
+  @Nullable
+  public abstract String getProtoClass();
+
+  /** Used by the ThriftPayloadSerializerProvider when serializing to a Pub/Sub message. */
+  @Nullable
+  public abstract String getThriftClass();
+
+  /** Used by the ThriftPayloadSerializerProvider when serializing to a Pub/Sub message. */
+  @Nullable
+  public abstract String getThriftProtocolFactoryClass();

Review Comment:
   I'm wondering if we should just avoid using these. Since these transforms will be mostly configured by commands/UIs, I don't expect users to fiddle with serialization format too much. I would think Format would only be JSON or Avro. Thoughts?
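
If the format were limited to JSON and Avro as suggested, the free-form string could be validated up front. The following is a minimal sketch of that idea, not code from the PR: `PayloadFormatSketch` and `parse` are hypothetical names, and rejecting a missing format (rather than applying a default) is an assumption made only for this example.

```java
import java.util.Locale;

/** Hypothetical helper, not part of Beam: the only formats this sketch accepts. */
enum PayloadFormatSketch {
  JSON,
  AVRO;

  /** Parses a user-supplied format string, ignoring case and surrounding whitespace. */
  static PayloadFormatSketch parse(String format) {
    // Assumption: a missing format is an error here; the real transform may choose a default.
    if (format == null || format.trim().isEmpty()) {
      throw new IllegalArgumentException("format must be set; expected one of: json, avro");
    }
    switch (format.trim().toUpperCase(Locale.ROOT)) {
      case "JSON":
        return JSON;
      case "AVRO":
        return AVRO;
      default:
        throw new IllegalArgumentException(
            "Unsupported format '" + format + "'; expected one of: json, avro");
    }
  }
}
```

With a closed set like this, the proto and Thrift class-name properties would have no remaining use, which matches the direction of the comment.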





[GitHub] [beam] pabloem commented on pull request #22262: Implement PubsubSchemaTransformWriteConfiguration

Posted by GitBox <gi...@apache.org>.
pabloem commented on PR #22262:
URL: https://github.com/apache/beam/pull/22262#issuecomment-1213477651

   LGTM, sorry about the delay!

