Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2023/01/13 11:39:52 UTC

[GitHub] [beam] aromanenko-dev opened a new pull request, #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

aromanenko-dev opened a new pull request, #24992:
URL: https://github.com/apache/beam/pull/24992

   DON'T MERGE!!!
   
   #24878 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1386983514

   Run Java_GCP_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1396666654

   retest this please




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1397047477

   Run Java PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1434894102

   @Abacn Interesting about checkStyle, I may try to do something similar for this. Thanks




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1440239953

   I'm merging this one since the failed checks are unrelated and all others are green.




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1396896891

   Run Java PostCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1396665755

   Run SQL_Java17 PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1385614039

   Run SQL PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1404997448

   @mosche 
   > Wondering, does it make sense to also deprecate the Avro stuff in core in the same PR?
   
   I wanted to do this in a separate PR once this one is merged. Do you think it'd be better to do it here as well?




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1398460707

   Run Flink ValidatesRunner




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1408377562

   Run Java_Pulsar_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1092205875


##########
runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/SparkSessionFactory.java:
##########
@@ -283,5 +283,13 @@ public void registerClasses(Kryo kryo) {
       kryo.register(TupleTag.class);
       kryo.register(TupleTagList.class);
     }
+
+    private void tryToRegister(Kryo kryo, String className) {
+      try {
+        kryo.register(Class.forName(className));
+      } catch (ClassNotFoundException e) {
+        LOG.warn("Class {} was not found on classpath", className);

Review Comment:
   Makes sense, done





[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1385247961

   Run SQL_Java11 PreCommit




[GitHub] [beam] aromanenko-dev commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1090407133


##########
runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/SparkSessionFactory.java:
##########
@@ -70,6 +68,8 @@
 import org.apache.beam.sdk.coders.VarIntCoder;
 import org.apache.beam.sdk.coders.VarLongCoder;
 import org.apache.beam.sdk.coders.VoidCoder;
+import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;

Review Comment:
   Makes sense, done





[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1401894965

   Run Spark ValidatesRunner




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1401900352

   Run Spark Runner Nexmark Tests




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1406841234

   Run Java PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1437247518

   Run Whitespace PreCommit




[GitHub] [beam] Abacn commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1438676537

   Thanks @aromanenko-dev for pinning. I think we can proceed given that those failures are understood to be unrelated. I also opened #25566.




[GitHub] [beam] olehborysevych commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
olehborysevych commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1381781681

   @aromanenko-dev I'm looking into the failing Example check. Sorry for the inconvenience.




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1401895618

   Run Dataflow ValidatesRunner




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1402154733

   R: @mosche @reuvenlax @lukecwik 
   CC: @kennknowles 
   
   Please take a look and let me know if you have any comments, objections, or recommendations for testing that should be done before accepting this PR.




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1431656386

   Run Java PostCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1436793857

   Run Java_GCP_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1439899744

   retest this please




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1440047835

   Run Java_GCP_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1396691260

   Run Java_Pulsar_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1401896459

   Run Direct ValidatesRunner




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1401902240

   Run Java Dataflow V2 ValidatesRunner




[GitHub] [beam] aromanenko-dev commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1088816185


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/Providers.java:
##########
@@ -42,12 +42,30 @@ private Providers() {}
   public static <T extends Identifyable> Map<String, T> loadProviders(Class<T> klass) {
     Map<String, T> providers = new HashMap<>();
     for (T provider : ServiceLoader.load(klass)) {
-      checkArgument(
-          !providers.containsKey(provider.identifier()),
-          "Duplicate providers exist with identifier `%s` for class %s.",
-          provider.identifier(),
-          klass);
-      providers.put(provider.identifier(), provider);
+      // Avro provider is treated as a special case until two providers may exist: in "core"

Review Comment:
   I updated the comment to make it clearer.
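
The quoted hunk is cut off after the first line of the new comment, so the code that replaced the `checkArgument` duplicate check is not shown here. Purely as a hypothetical sketch of the idea, a loader could tolerate duplicates for one identifier while still rejecting all others. The class `ProviderLoadingSketch`, the nested `Identifiable` and `Named` types, the `tolerated` identifier and the `preferred` predicate are all assumptions made for illustration; none of them is taken from the PR.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

public class ProviderLoadingSketch {

  /** Hypothetical stand-in for the provider interface used by the loader. */
  public interface Identifiable {
    String identifier();
  }

  /** Hypothetical provider that records which module it came from. */
  public static final class Named implements Identifiable {
    public final String id;
    public final String origin;

    public Named(String id, String origin) {
      this.id = id;
      this.origin = origin;
    }

    @Override
    public String identifier() {
      return id;
    }
  }

  /**
   * Collects providers by identifier. A duplicate is an error unless its
   * identifier equals {@code tolerated}; in that case the provider accepted
   * by {@code preferred} wins (assumed policy, not the PR's).
   */
  public static <T extends Identifiable> Map<String, T> loadProviders(
      Iterable<T> candidates, String tolerated, Predicate<? super T> preferred) {
    Map<String, T> providers = new HashMap<>();
    for (T provider : candidates) {
      String id = provider.identifier();
      if (!providers.containsKey(id)) {
        providers.put(id, provider);
      } else if (id.equals(tolerated)) {
        // Two providers may share this identifier during the migration.
        if (preferred.test(provider)) {
          providers.put(id, provider);
        }
      } else {
        throw new IllegalArgumentException(
            "Duplicate providers exist with identifier `" + id + "`");
      }
    }
    return providers;
  }

  public static void main(String[] args) {
    Named fromCore = new Named("avro", "core");
    Named fromExtension = new Named("avro", "extension");
    Map<String, Named> loaded =
        loadProviders(
            Arrays.asList(fromCore, fromExtension),
            "avro",
            p -> p.origin.equals("extension"));
    System.out.println(loaded.get("avro").origin);
  }
}
```

In the real code the candidates would come from `ServiceLoader.load(klass)`, as in the hunk above; the sketch takes an `Iterable` so it runs without any service registration.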





[GitHub] [beam] mosche commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "mosche (via GitHub)" <gi...@apache.org>.
mosche commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1090480964


##########
runners/core-construction-java/build.gradle:
##########
@@ -73,5 +74,6 @@ dependencies {
   testImplementation library.java.jackson_dataformat_yaml
   testImplementation project(path: ":model:fn-execution", configuration: "shadow")
   testImplementation project(path: ":sdks:java:core", configuration: "testRuntimeMigration")
+  testImplementation project(path: ":sdks:java:extensions:avro", configuration: "testRuntimeMigration")

Review Comment:
   implementation should be enough





[GitHub] [beam] Abacn commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1439047368

   Run SQL_Java11 PreCommit




[GitHub] [beam] Abacn commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1430081991

   Hi @aromanenko-dev and @mosche, just wondering what the status of this PR is now? I also have a PR that is close to merge (#24274), and it also makes some changes to Avro.




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1385249752

   Run Java_Kafka_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1385246606

   Run SQL PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1398181872

   Run Java PostCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1401903142

   Run Spark StructuredStreaming ValidatesRunner




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1440048150

   Run Java_Kafka_IO_Direct PreCommit




[GitHub] [beam] mosche commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "mosche (via GitHub)" <gi...@apache.org>.
mosche commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1090481857


##########
runners/google-cloud-dataflow-java/build.gradle:
##########
@@ -115,6 +116,7 @@ dependencies {
   testImplementation library.java.junit
   testImplementation project(path: ":sdks:java:io:google-cloud-platform", configuration: "testRuntimeMigration")
   testImplementation project(path: ":sdks:java:core", configuration: "shadowTest")
+  testImplementation project(path: ":sdks:java:extensions:avro", configuration: "testRuntimeMigration")

Review Comment:
   just implementation?





[GitHub] [beam] mosche commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "mosche (via GitHub)" <gi...@apache.org>.
mosche commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1090484150


##########
sdks/java/io/parquet/build.gradle:
##########
@@ -52,6 +53,7 @@ dependencies {
   provided library.java.hadoop_common
   testImplementation library.java.hadoop_client
   testImplementation project(path: ":sdks:java:core", configuration: "shadowTest")
+  testImplementation project(path: ":sdks:java:extensions:avro", configuration: "testRuntimeMigration")

Review Comment:
   just implementation?





[GitHub] [beam] mosche commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "mosche (via GitHub)" <gi...@apache.org>.
mosche commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1090465327


##########
runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/SparkSessionFactory.java:
##########
@@ -283,5 +283,13 @@ public void registerClasses(Kryo kryo) {
       kryo.register(TupleTag.class);
       kryo.register(TupleTagList.class);
     }
+
+    private void tryToRegister(Kryo kryo, String className) {
+      try {
+        kryo.register(Class.forName(className));
+      } catch (ClassNotFoundException e) {
+        LOG.warn("Class {} was not found on classpath", className);

Review Comment:
   Nit, just log these as info? It's kind of expected that these are missing.
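
A minimal, self-contained sketch of the optional-registration idea discussed above, without Kryo: look a class up by name and treat its absence as a normal, info-level event. `OptionalClassLookup` and `tryLoad` are hypothetical names for illustration, and `java.util.logging` stands in for the logger used in the PR.

```java
import java.util.Optional;
import java.util.logging.Level;
import java.util.logging.Logger;

final class OptionalClassLookup {
  private static final Logger LOG = Logger.getLogger(OptionalClassLookup.class.getName());

  /** Returns the class if it is on the classpath, or empty if it is not. */
  static Optional<Class<?>> tryLoad(String className) {
    try {
      Class<?> clazz = Class.forName(className);
      return Optional.of(clazz);
    } catch (ClassNotFoundException e) {
      // A missing optional class is expected, so this is logged at info rather than warn.
      LOG.log(Level.INFO, "Class {0} was not found on classpath", className);
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    System.out.println(tryLoad("java.lang.String").isPresent());
    System.out.println(tryLoad("com.example.DoesNotExist").isPresent());
  }
}
```

A caller could then register only the classes that are present, e.g. `tryLoad(name).ifPresent(kryo::register)`, assuming a `register(Class)` method as in the diff above.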





[GitHub] [beam] aromanenko-dev commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1088817034


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/Providers.java:
##########
@@ -42,12 +42,30 @@ private Providers() {}
   public static <T extends Identifyable> Map<String, T> loadProviders(Class<T> klass) {
     Map<String, T> providers = new HashMap<>();
     for (T provider : ServiceLoader.load(klass)) {
-      checkArgument(
-          !providers.containsKey(provider.identifier()),
-          "Duplicate providers exist with identifier `%s` for class %s.",
-          provider.identifier(),
-          klass);
-      providers.put(provider.identifier(), provider);
+      // Avro provider is treated as a special case until two providers may exist: in "core"
+      // (deprecated) and in "extensions/avro" (actual).
+      if (provider.identifier().equals("avro")) {
+        // Avro provider from "extensions/avro" must have a priority.
+        if (provider
+            .toString()
+            .startsWith(
+                "org.apache.beam.sdk.extensions.avro.schemas.io.payloads.AvroPayloadSerializerProvider")) {
+          // Use AvroPayloadSerializerProvider from extensions/avro by any case.
+          providers.put(provider.identifier(), provider);
+        } else {
+          // Load Avro provider from "core" if it was not loaded from Avro extension before.
+          if (!providers.containsKey(provider.identifier())) {

Review Comment:
   Good catch, done





[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1411966180

   @mosche IMHO, it's overkill for this case, but you have a point. So I reverted `CountingSource.java`, and this issue may be addressed in #25252.




[GitHub] [beam] mosche commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "mosche (via GitHub)" <gi...@apache.org>.
mosche commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1088706540


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingSource.java:
##########
@@ -481,11 +489,45 @@ public long getSplitBacklogBytes() {
     }
   }
 
+  public static class CounterMarkCoder extends CustomCoder<CounterMark> {

Review Comment:
   Sure, of course, but that's not the point I'm making. Switching coders is sketchy and can cause trouble if the encoded bytes are persisted somehow.
   Assuming somebody migrates a pipeline that uses checkpoints containing an Avro-encoded `CounterMark` (unlikely, but possible), switching the coder here without staying binary compatible will cause issues when migrating to the new Beam version.
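
To make the compatibility concern concrete, here is a rough sketch of what reading previously persisted bytes involves. Avro's binary encoding stores a long as a zigzag-encoded, base-128 varint, so a replacement coder only stays compatible if it can decode exactly that. `AvroLongReader` and its methods are hypothetical names, and treating a persisted mark as a plain sequence of such longs is an assumption, not something verified against Beam.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class AvroLongReader {
  /** Reads one zigzag varint long, the layout Avro uses for long values. */
  static long readLong(InputStream in) throws IOException {
    long raw = 0;
    int shift = 0;
    while (true) {
      int b = in.read();
      if (b < 0) {
        throw new EOFException("stream ended inside a varint");
      }
      // The low 7 bits carry data, least significant group first.
      raw |= (long) (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        break; // high bit clear: last byte of this value
      }
      shift += 7;
      if (shift > 63) {
        throw new IOException("varint is longer than 10 bytes");
      }
    }
    // Undo zigzag: even values are non-negative, odd values are negative.
    return (raw >>> 1) ^ -(raw & 1);
  }

  /** Convenience wrapper for decoding a single long from a byte array. */
  static long decode(byte[] bytes) {
    try {
      return readLong(new ByteArrayInputStream(bytes));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

If a new coder cannot round-trip bytes like these, any state written by the old coder becomes unreadable after the upgrade.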





[GitHub] [beam] aromanenko-dev commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1090407836


##########
sdks/java/io/kafka/build.gradle:
##########
@@ -90,6 +91,7 @@ dependencies {
   provided library.java.everit_json_schema
   testImplementation project(path: ":sdks:java:core", configuration: "shadowTest")
   testImplementation project(":sdks:java:io:synthetic")
+  testImplementation project(path: ":sdks:java:extensions:avro", configuration: "testRuntimeMigration")

Review Comment:
   Yes, done





[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1438266971

   @Abacn Do you think we have to wait for a fix before merging this one?




[GitHub] [beam] Abacn commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1431750213

   > @Abacn Thanks for the ping. I think this PR is almost ready to be merged. The next step will be deprecating the Avro classes in "sdks/java/core", but before that I think we need to sync them with "extensions/avro". WDYT?
   > 
   
   Yeah, sounds good to me. As part of the deprecation, I am wondering whether we can add a Checkstyle guard that prevents new changes in the Beam repo from referring to Avro in core, as we do for vendored dependencies (e.g. guava, grpc).
   
   




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1396665464

   Run Java PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1401901637

   Run Dataflow Runner V2 Nexmark Tests




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1401894575

   Run Java PostCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1383944878

   Run Java_Kafka_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1397024060

   Run Java PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1398460256

   Run Direct ValidatesRunner




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1396690898

   Run Java PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1401896159

   Run Flink ValidatesRunner




[GitHub] [beam] Abacn commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1110343492


##########
CHANGES.md:
##########
@@ -60,6 +60,7 @@
   container was based upon Debian 11.
 * RunInference PTransform will accept model paths as SideInputs in Python SDK. ([#24042](https://github.com/apache/beam/issues/24042))
 * RunInference supports ONNX runtime in Python SDK ([#22972](https://github.com/apache/beam/issues/22972))
+* Java SDK modules migrated to use `:sdks:java:extensions:avro` ([#24748](https://github.com/apache/beam/issues/24748))  

Review Comment:
   ```suggestion
   * Java SDK modules migrated to use `:sdks:java:extensions:avro` ([#24748](https://github.com/apache/beam/issues/24748))
   ```





[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1431380213

   Run Java_GCP_IO_Direct PreCommit




[GitHub] [beam] Abacn commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1435250338

   Run Java_GCP_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1436796091

   Run SQL_Java11 PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1436796368

   Run SQL_Java11 PreCommit




[GitHub] [beam] Abacn commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1437451980

   The Java11/Java17 test failures are due to #23210; see the investigation in #25566.




[GitHub] [beam] aromanenko-dev commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1088817251


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/Providers.java:
##########
@@ -42,12 +42,30 @@ private Providers() {}
   public static <T extends Identifyable> Map<String, T> loadProviders(Class<T> klass) {
     Map<String, T> providers = new HashMap<>();
     for (T provider : ServiceLoader.load(klass)) {
-      checkArgument(
-          !providers.containsKey(provider.identifier()),
-          "Duplicate providers exist with identifier `%s` for class %s.",
-          provider.identifier(),
-          klass);
-      providers.put(provider.identifier(), provider);
+      // Avro provider is treated as a special case until two providers may exist: in "core"
+      // (deprecated) and in "extensions/avro" (actual).
+      if (provider.identifier().equals("avro")) {
+        // Avro provider from "extensions/avro" must have a priority.
+        if (provider
+            .toString()
+            .startsWith(
+                "org.apache.beam.sdk.extensions.avro.schemas.io.payloads.AvroPayloadSerializerProvider")) {
+          // Use AvroPayloadSerializerProvider from extensions/avro by any case.
+          providers.put(provider.identifier(), provider);
+        } else {
+          // Load Avro provider from "core" if it was not loaded from Avro extension before.
+          if (!providers.containsKey(provider.identifier())) {
+            providers.put(provider.identifier(), provider);
+          }
+        }
+      } else {
+        checkArgument(

Review Comment:
   Agree, done





[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1384154956

   Run SQL_Java17 PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1398459769

   Run Spark ValidatesRunner




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1398460435

   Run Dataflow ValidatesRunner




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1397026157

   Run Java PreCommit




[GitHub] [beam] mosche commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "mosche (via GitHub)" <gi...@apache.org>.
mosche commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1411492609

   > We can keep it "as is" for now, but it will be a breaking change anyway once Avro is dropped from "core". Do you see any other options?
   
   @aromanenko-dev You don't need Avro to produce Avro-compatible bytes. It's a well-defined format with a detailed spec. Of course, this could quickly become a hassle, but luckily this case is trivial. Avro uses [varint zigzag encoding for longs](https://avro.apache.org/docs/1.8.2/spec.html#binary_encoding), the same encoding Protobuf uses for [signed ints](https://developers.google.com/protocol-buffers/docs/encoding?csw=1#signed-ints). And Avro records are just a concatenation of their fields, without any further additions.
   
   Here's an example using Protobuf's `CodedOutputStream`:
   ```Java
   //import org.apache.beam.vendor.grpc.v1p48p1.com.google.protobuf.CodedOutputStream
   CodedOutputStream cos = CodedOutputStream.newInstance(outputStream);
   cos.writeSInt64NoTag(mark.getLastEmitted()); // signed int64 with varint zigzag encoding
   cos.writeSInt64NoTag(mark.getStartTime().getMillis()); // signed int64 with varint zigzag encoding
   cos.flush();
   ```
   
   Of course, that would require some additional tests to verify compatibility...
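
For readers without the vendored Protobuf classes at hand, the same byte layout can be sketched with the JDK alone. This is an illustration only: `AvroLongSketch`, `encodeLong`, and `encodeMark` are hypothetical names, and the field order (last emitted value, then start time in milliseconds) is taken from the snippet above rather than checked against the Avro schema that was actually used.

```java
import java.io.ByteArrayOutputStream;

final class AvroLongSketch {
  /** Encodes a long as a zigzag varint, the layout Avro uses for long values. */
  static byte[] encodeLong(long value) {
    // Zigzag maps small negative and positive values to small unsigned values.
    long zigzag = (value << 1) ^ (value >> 63);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Emit 7 bits per byte, least significant group first; the high bit marks continuation.
    while ((zigzag & ~0x7FL) != 0) {
      out.write((int) ((zigzag & 0x7F) | 0x80));
      zigzag >>>= 7;
    }
    out.write((int) zigzag);
    return out.toByteArray();
  }

  /** Concatenates two encoded longs, mirroring a record with two long fields. */
  static byte[] encodeMark(long lastEmitted, long startTimeMillis) {
    byte[] first = encodeLong(lastEmitted);
    byte[] second = encodeLong(startTimeMillis);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(first, 0, first.length);
    out.write(second, 0, second.length);
    return out.toByteArray();
  }
}
```

A compatibility test would encode the same values with the old Avro-based coder and compare the resulting byte arrays with `Arrays.equals`.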




[GitHub] [beam] aromanenko-dev commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1087985441


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingSource.java:
##########
@@ -481,11 +489,45 @@ public long getSplitBacklogBytes() {
     }
   }
 
+  public static class CounterMarkCoder extends CustomCoder<CounterMark> {

Review Comment:
   Anyway, we should stop using Avro in `core`.





[GitHub] [beam] mosche commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "mosche (via GitHub)" <gi...@apache.org>.
mosche commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1086378332


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingSource.java:
##########
@@ -481,11 +489,45 @@ public long getSplitBacklogBytes() {
     }
   }
 
+  public static class CounterMarkCoder extends CustomCoder<CounterMark> {

Review Comment:
   Is this encoding byte-compatible with the previous Avro-based encoding? If not, it might cause issues if the encoded value is ever persisted (unlikely, but possible), e.g. in a snapshot.



##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/Providers.java:
##########
@@ -42,12 +42,30 @@ private Providers() {}
   public static <T extends Identifyable> Map<String, T> loadProviders(Class<T> klass) {
     Map<String, T> providers = new HashMap<>();
     for (T provider : ServiceLoader.load(klass)) {
-      checkArgument(
-          !providers.containsKey(provider.identifier()),
-          "Duplicate providers exist with identifier `%s` for class %s.",
-          provider.identifier(),
-          klass);
-      providers.put(provider.identifier(), provider);
+      // Avro provider is treated as a special case until two providers may exist: in "core"
+      // (deprecated) and in "extensions/avro" (actual).
+      if (provider.identifier().equals("avro")) {
+        // Avro provider from "extensions/avro" must have a priority.
+        if (provider
+            .toString()
+            .startsWith(
+                "org.apache.beam.sdk.extensions.avro.schemas.io.payloads.AvroPayloadSerializerProvider")) {
+          // Use AvroPayloadSerializerProvider from extensions/avro by any case.
+          providers.put(provider.identifier(), provider);
+        } else {
+          // Load Avro provider from "core" if it was not loaded from Avro extension before.
+          if (!providers.containsKey(provider.identifier())) {

Review Comment:
   Please use `putIfAbsent` instead



##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/Providers.java:
##########
@@ -42,12 +42,30 @@ private Providers() {}
   public static <T extends Identifyable> Map<String, T> loadProviders(Class<T> klass) {
     Map<String, T> providers = new HashMap<>();
     for (T provider : ServiceLoader.load(klass)) {
-      checkArgument(
-          !providers.containsKey(provider.identifier()),
-          "Duplicate providers exist with identifier `%s` for class %s.",
-          provider.identifier(),
-          klass);
-      providers.put(provider.identifier(), provider);
+      // Avro provider is treated as a special case until two providers may exist: in "core"
+      // (deprecated) and in "extensions/avro" (actual).
+      if (provider.identifier().equals("avro")) {
+        // Avro provider from "extensions/avro" must have a priority.
+        if (provider

Review Comment:
   Please don't rely on the default `toString()` for this and use `getClass().getName()` instead.



##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/Providers.java:
##########
@@ -42,12 +42,30 @@ private Providers() {}
   public static <T extends Identifyable> Map<String, T> loadProviders(Class<T> klass) {
     Map<String, T> providers = new HashMap<>();
     for (T provider : ServiceLoader.load(klass)) {
-      checkArgument(
-          !providers.containsKey(provider.identifier()),
-          "Duplicate providers exist with identifier `%s` for class %s.",
-          provider.identifier(),
-          klass);
-      providers.put(provider.identifier(), provider);
+      // Avro provider is treated as a special case until two providers may exist: in "core"
+      // (deprecated) and in "extensions/avro" (actual).
+      if (provider.identifier().equals("avro")) {
+        // Avro provider from "extensions/avro" must have a priority.
+        if (provider
+            .toString()
+            .startsWith(
+                "org.apache.beam.sdk.extensions.avro.schemas.io.payloads.AvroPayloadSerializerProvider")) {
+          // Use AvroPayloadSerializerProvider from extensions/avro by any case.
+          providers.put(provider.identifier(), provider);
+        } else {
+          // Load Avro provider from "core" if it was not loaded from Avro extension before.
+          if (!providers.containsKey(provider.identifier())) {
+            providers.put(provider.identifier(), provider);
+          }
+        }
+      } else {
+        checkArgument(

Review Comment:
   Maybe better `checkState` instead?
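
The review suggestions so far for `Providers.java` (`putIfAbsent`, `getClass().getName()`, and a state check rather than an argument check) can be sketched together as follows. This is a simplified illustration, not the code that was merged: the provider classes are hypothetical stand-ins, a plain `Iterable` replaces `ServiceLoader`, and a thrown `IllegalStateException` stands in for Guava's `checkState`, which throws the same exception type.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ProviderPrioritySketch {
  interface Identifyable {
    String identifier();
  }

  // Hypothetical stand-ins for the "core" and "extensions/avro" providers.
  static final class CoreAvroProvider implements Identifyable {
    public String identifier() { return "avro"; }
  }

  static final class ExtensionAvroProvider implements Identifyable {
    public String identifier() { return "avro"; }
  }

  static final class JsonProvider implements Identifyable {
    public String identifier() { return "json"; }
  }

  static <T extends Identifyable> Map<String, T> select(
      Iterable<T> loaded, String preferredAvroClassName) {
    Map<String, T> providers = new HashMap<>();
    for (T provider : loaded) {
      String id = provider.identifier();
      if (id.equals("avro")) {
        // Compare the class name explicitly instead of relying on the default toString().
        if (provider.getClass().getName().equals(preferredAvroClassName)) {
          providers.put(id, provider); // the preferred provider always wins
        } else {
          providers.putIfAbsent(id, provider); // fallback only if nothing is registered yet
        }
      } else {
        if (providers.containsKey(id)) {
          throw new IllegalStateException("Duplicate providers exist with identifier " + id);
        }
        providers.put(id, provider);
      }
    }
    return providers;
  }

  public static void main(String[] args) {
    List<Identifyable> loaded =
        Arrays.asList(new ExtensionAvroProvider(), new CoreAvroProvider(), new JsonProvider());
    Map<String, Identifyable> result = select(loaded, ExtensionAvroProvider.class.getName());
    System.out.println(result.get("avro").getClass().getSimpleName());
  }
}
```

With this shape, the result does not depend on the order in which the two Avro providers are discovered.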



##########
sdks/java/io/kafka/build.gradle:
##########
@@ -90,6 +91,7 @@ dependencies {
   provided library.java.everit_json_schema
   testImplementation project(path: ":sdks:java:core", configuration: "shadowTest")
   testImplementation project(":sdks:java:io:synthetic")
+  testImplementation project(path: ":sdks:java:extensions:avro", configuration: "testRuntimeMigration")

Review Comment:
   Shouldn't `implementation` be enough?



##########
runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/SparkSessionFactory.java:
##########
@@ -70,6 +68,8 @@
 import org.apache.beam.sdk.coders.VarIntCoder;
 import org.apache.beam.sdk.coders.VarLongCoder;
 import org.apache.beam.sdk.coders.VoidCoder;
+import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;

Review Comment:
   How about only registering these if they exist on the classpath? So we can skip adding the dependency to the runner by default.
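
   A rough sketch of the idea, assuming the optional classes are looked up by name at runtime so that no compile-time dependency is needed. The helper class and method names are hypothetical and not part of the runner.

   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;

   public class OptionalRegistration {
     /** Returns true if the named class can be found, without initializing it. */
     public static boolean isOnClasspath(String className) {
       try {
         Class.forName(className, false, OptionalRegistration.class.getClassLoader());
         return true;
       } catch (ClassNotFoundException | LinkageError e) {
         return false;
       }
     }

     /** Resolves only those optional classes that are actually available. */
     public static List<Class<?>> classesToRegister(List<String> optionalClassNames) {
       List<Class<?>> available = new ArrayList<>();
       for (String name : optionalClassNames) {
         try {
           available.add(
               Class.forName(name, false, OptionalRegistration.class.getClassLoader()));
         } catch (ClassNotFoundException | LinkageError e) {
           // Not on the classpath: skip it instead of failing.
         }
       }
       return available;
     }

     public static void main(String[] args) {
       // The second name is a deliberately nonexistent placeholder.
       List<String> names = Arrays.asList("java.lang.String", "com.example.missing.NoSuchClass");
       System.out.println(classesToRegister(names));
     }
   }
   ```

   The trade-off is that a class name held in a string is not checked by the compiler, so a rename in the extension would go unnoticed without a test.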



##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/Providers.java:
##########
@@ -42,12 +42,30 @@ private Providers() {}
   public static <T extends Identifyable> Map<String, T> loadProviders(Class<T> klass) {
     Map<String, T> providers = new HashMap<>();
     for (T provider : ServiceLoader.load(klass)) {
-      checkArgument(
-          !providers.containsKey(provider.identifier()),
-          "Duplicate providers exist with identifier `%s` for class %s.",
-          provider.identifier(),
-          klass);
-      providers.put(provider.identifier(), provider);
+      // Avro provider is treated as a special case until two providers may exist: in "core"

Review Comment:
   ```suggestion
         // Avro provider is treated as a special case as two providers may exist: in "core"
   ```



##########
sdks/java/extensions/avro/src/test/java/org/apache/beam/sdk/extensions/avro/coders/CoderRegistryTest.java:
##########
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.avro.coders;
+
+import static org.junit.Assert.assertEquals;
+
+import com.google.auto.service.AutoService;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.io.Serializable;
+import java.util.List;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.CoderException;
+import org.apache.beam.sdk.coders.CoderProvider;
+import org.apache.beam.sdk.coders.CoderProviderRegistrar;
+import org.apache.beam.sdk.coders.CoderProviders;
+import org.apache.beam.sdk.coders.CoderRegistry;
+import org.apache.beam.sdk.coders.CustomCoder;
+import org.apache.beam.sdk.coders.DefaultCoder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.testing.ExpectedLogs;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.ExpectedException;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for CoderRegistry and AvroCoder. */
+@RunWith(JUnit4.class)
+public class CoderRegistryTest {
+

Review Comment:
   I'm wondering what the purpose of this test is. As far as I can see, it doesn't test anything Avro-specific, just the general behavior of `CoderRegistry`. Do we need this test at all?



##########
sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingSource.java:
##########
@@ -481,11 +489,45 @@ public long getSplitBacklogBytes() {
     }
   }
 
+  public static class CounterMarkCoder extends CustomCoder<CounterMark> {

Review Comment:
   Though, considering `GenerateSequence` is the recommended alternative, I'm not sure we need to worry much about this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "mosche (via GitHub)" <gi...@apache.org>.
mosche commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1406193060

   > I wanted to do this in a separate PR once this one is merged. Do you think it'd be better to do here as well?
   
   @aromanenko-dev I suppose that's fine. Even in the worst case, if it ends up in another release, there's no harm. As is, the change is fairly transparent. 👍 Though it would be good to document usage of the new module in the changelog.




[GitHub] [beam] aromanenko-dev commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1088816577


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/Providers.java:
##########
@@ -42,12 +42,30 @@ private Providers() {}
   public static <T extends Identifyable> Map<String, T> loadProviders(Class<T> klass) {
     Map<String, T> providers = new HashMap<>();
     for (T provider : ServiceLoader.load(klass)) {
-      checkArgument(
-          !providers.containsKey(provider.identifier()),
-          "Duplicate providers exist with identifier `%s` for class %s.",
-          provider.identifier(),
-          klass);
-      providers.put(provider.identifier(), provider);
+      // Avro provider is treated as a special case until two providers may exist: in "core"
+      // (deprecated) and in "extensions/avro" (actual).
+      if (provider.identifier().equals("avro")) {
+        // Avro provider from "extensions/avro" must have a priority.
+        if (provider

Review Comment:
   Good catch, done





[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1431640265

   @Abacn Thanks for the ping. I think this PR is almost ready to be merged. The next step will be to deprecate the Avro classes in "sdks/java/core", but before that I think we need to sync them with "extensions/avro". WDYT?
   
   PS: I also opened another PR, #25216, to test "extensions/avro" against multiple Avro versions, but it's still WIP.




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1436796941

   Run SQL_Java17 PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1440048467

   Run Java_HCatalog_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1386836298

   Run Java_Pulsar_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1396665977

   Run SQL_Java11 PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1397263725

   Run Java PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1386983152

   Run Java PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1397349452

   Run Java_GCP_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1410720243

   @mosche Thanks, review comments are addressed.
   
   > Though, still flagging the coder replacement of CountingSource as potential issue. I can't think of use cases where one would checkpoint this, but who knows ..
   
   We can keep it as is for now, but it will be a breaking change anyway once Avro is dropped from "core". Do you see any other options?
   
   > Do you want to move ahead and merge or get more people to ack?
   
   Thanks for the review! Before merging, if possible, I'd be happy for someone else to take another look at this PR to make sure we didn't miss anything.
   Ping @kennknowles @reuvenlax @lukecwik 




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [WIP][Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by GitBox <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1384220132

   Run Java_GCP_IO_Direct PreCommit




[GitHub] [beam] Abacn commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1431584600

   Java_GCP_IO_Direct is failing due to a `testTriggeredFileLoadsWithTempTablesToExistingNullSchemaTable[1]` timeout; this flaky test is a known issue, #25207.




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1436794299

   Run Portable_Python PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1436798811

   Run Java_SingleStore_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1434949902

   Run Java_Kafka_IO_Direct PreCommit




[GitHub] [beam] aromanenko-dev merged pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev merged PR #24992:
URL: https://github.com/apache/beam/pull/24992




[GitHub] [beam] mosche commented on pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "mosche (via GitHub)" <gi...@apache.org>.
mosche commented on PR #24992:
URL: https://github.com/apache/beam/pull/24992#issuecomment-1403291892

   @aromanenko-dev Don't forget to enable publishing of the extension
   ```
   applyJavaNature(
       automaticModuleName: 'org.apache.beam.sdk.extensions.avro',
       disableLintWarnings: ['rawtypes'], // Avro-generated test code has raw-type errors
       publish: false, 
       exportJavadoc: false,
   )
   ```




[GitHub] [beam] aromanenko-dev commented on a diff in pull request #24992: [Avro] Use "extensions/avro" instead of avro from"core" in Java SDK modules

Posted by "aromanenko-dev (via GitHub)" <gi...@apache.org>.
aromanenko-dev commented on code in PR #24992:
URL: https://github.com/apache/beam/pull/24992#discussion_r1088106032


##########
sdks/java/extensions/avro/src/test/java/org/apache/beam/sdk/extensions/avro/coders/CoderRegistryTest.java:
##########
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.avro.coders;
+
+import static org.junit.Assert.assertEquals;
+
+import com.google.auto.service.AutoService;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.io.Serializable;
+import java.util.List;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.CoderException;
+import org.apache.beam.sdk.coders.CoderProvider;
+import org.apache.beam.sdk.coders.CoderProviderRegistrar;
+import org.apache.beam.sdk.coders.CoderProviders;
+import org.apache.beam.sdk.coders.CoderRegistry;
+import org.apache.beam.sdk.coders.CustomCoder;
+import org.apache.beam.sdk.coders.DefaultCoder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.testing.ExpectedLogs;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.ExpectedException;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for CoderRegistry and AvroCoder. */
+@RunWith(JUnit4.class)
+public class CoderRegistryTest {
+

Review Comment:
   There is a small part that checks that `AvroCoder` is properly loaded with `CoderRegistry` when using `@DefaultCoder(AvroCoder.class)`. I agree that it's not Avro-specific, so we can probably just remove it.


