You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@avro.apache.org by GitBox <gi...@apache.org> on 2022/05/05 21:42:06 UTC

[GitHub] [avro] ashley-taylor opened a new pull request, #1680: AVRO-3126 Add support for java records

ashley-taylor opened a new pull request, #1680:
URL: https://github.com/apache/avro/pull/1680

   To implement this feature, I used a `CustomEncoding`. Added two new methods to pass the `DatumReader`/`DatumWriter`, allowing other `CustomEncoding` to not have to define an entire custom schema but use a hybrid approach.
   
   With this, I also change `AvroEncode` to be able to specify at a class level. Feels generically helpful to be able to instrument the class instead of all fields referencing the class.
   
   Used reflection to read the `isRecord` method from the `class` class so that it will work on older versions of java.
   
   For records, fields can not be modified or accessed with unsafe offsets reads. Instead, the constructor must be invoked, and reads must be done using either method/field reflection.
   
   To identify the appropriate constructor, I use the order of the fields. Since it is a record and can not be extended, reading the direct fields from the record class is safe.
   
   I have not written tests or updated documentation yet. I am seeking some feedback on the approach first.
   
   
   
   
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Avro Jira](https://issues.apache.org/jira/browse/AVRO/AVRO-3126) issues
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](https://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes how to use it.
     - All the public functions and the classes in the PR contain Javadoc that explain what it does
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3126: Add support for java records [avro]

Posted by "JoeryH (via GitHub)" <gi...@apache.org>.
JoeryH commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1877350661

   Just wanted to share that record support for Avro would be really nice to have!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1150917499

   I know but JIRA offers me 5 users with "Ashley Taylor" as display name when I enter this id:
   ![ashley-taylor-avro-contributor](https://user-images.githubusercontent.com/232002/172819964-028ecd5a-eec6-4629-9b06-ff5396bb8f9e.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1136921297

   We should not merge this PR for `branch-1.11` because it will definitely cause problems with the release!
   I guess we will have to use JDK 17 for the release process and `-release 8` for all old modules. Actually no, this is a module with just test cases, so it does not need to be released at all!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ashley-taylor commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
ashley-taylor commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1150857990

   > > Do you have a JIRA account so we can assign this JIRA to you and ensure that it's properly taken care of?
   > 
   > I tried to add Ashley to the Contributors role in JIRA but there are 5 JIRA users with that (display) name and I am not sure which one to add.
   
   ashley-taylor 
   
   is my jira username


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ashley-taylor commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
ashley-taylor commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1136612469

   PR is ready for review. 
   I cannot test the record logic directly with supporting java 8.
   One way I can see doing this is to add another maven module that has java 17 tests only, which is skipped for lower versions. I'm happy to add this module, but I want to ensure I'm not missing something obvious before doing that.
   If PR is acceptable, I will squash the commits into a single commit.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1149569046

   @ashley-taylor  The PR should be reviewed and merged by a committer who knows better the Java SDK.
   I review most of the PRs for all SDKs but my main expertise is in the Rust SDK.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ashley-taylor commented on pull request #1680: AVRO-3126: Add support for java records

Posted by "ashley-taylor (via GitHub)" <gi...@apache.org>.
ashley-taylor commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1610702023

   I have made some additional Java 17 improvements that are based on this and I wondering if I should add this branch. 
   If the classes are sealed it will automatically add the equivalent of the `@union` annotation, so polymorphism is automatic.
   
   https://github.com/ashley-taylor/avro/pull/1/files
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1136757541

   I've approved the workflow run!
   
   The new CI related code could be simplified. I will do it soon!
   
   I'll let the Java developers to comment on the Java changes!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] carl-mastrangelo commented on pull request #1680: AVRO-3126: Add support for java records

Posted by "carl-mastrangelo (via GitHub)" <gi...@apache.org>.
carl-mastrangelo commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1610417926

   I would like also like to express my interest in this getting merged.   We use Avro with our Apache Beam classes, and being able to use Record classes would make things more seamless.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ashley-taylor commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
ashley-taylor commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1136726185

   @jklamer I have added a java17 module.
   I have updated the GitHub workflow to invoke the module hopefully. But unfortunately, PR doesn't currently have the approval to run them.
   
   Also built on top of this branch, I have added polymorphic support.
   https://github.com/ashley-taylor/avro/compare/master...ashley-taylor:polymorphic?expand=1
   Which I plan on opening if this one gets accepted.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on a diff in pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
martin-g commented on code in PR #1680:
URL: https://github.com/apache/avro/pull/1680#discussion_r882566640


##########
lang/java/build.sh:
##########
@@ -35,6 +35,11 @@ main() {
         # Test the modules that depend on hadoop using Hadoop 2
         mvn -B test -Phadoop2
         ;;
+      testJava17)
+        mvn -DdisableJava17=false -B test
+        # Test the modules that depend on hadoop using Hadoop 2
+        mvn -B test -Phadoop2

Review Comment:
   IMO we can execute just `mvn -DdisableJava17=false -B test` at https://github.com/apache/avro/pull/1680/commits/50977227b1500dd73763c99e7962a00f18d7ec6d#diff-a0e5c0a6a98abf33065a9bfa4dd160ca920278cbdbfabed88cb317a10df9f294R83 and remove `testJava17` from here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ashley-taylor commented on a diff in pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
ashley-taylor commented on code in PR #1680:
URL: https://github.com/apache/avro/pull/1680#discussion_r883228655


##########
lang/java/build.sh:
##########
@@ -35,6 +35,11 @@ main() {
         # Test the modules that depend on hadoop using Hadoop 2
         mvn -B test -Phadoop2
         ;;
+      testJava17)
+        mvn -DdisableJava17=false -B test
+        # Test the modules that depend on hadoop using Hadoop 2
+        mvn -B test -Phadoop2

Review Comment:
   Happy to make that change. Just want to put rationale before doing that. Might change your mind 
   
   With the change to make the module included by default but skip the compile/tests.
   
   running `mvn -DdisableJava17=false -B test` will run all the tests in every module. Can add more `mvn` args to try and narrow the rerun down to only what is needed. But figured in the main test run for java17+ enable the module preventing the need to redo all the other work. This way it only adds a few seconds to the build. As opposed to redoing the step that takes minutes.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ashley-taylor commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
ashley-taylor commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1136914683

   @martin-g I noticed a bug in how I had conditionally included the java17 module with a profile.
   The version of the java17 module would not increment automatically when performing a release, causing a manual headache.
   
   I changed it always to be included but have the compile and test plugins disabled so that it won't complain about the version of java.
   Annoyingly I can't put any logic in the maven properties, so the property to enable it has to be named backwards `disableJava17`.
   This way, the module will be included for lining and release purposes but not actually run the code unless configured to set the flag.
   Changed the build.sh script to have a new option to run in java17 mode and add an expression to execute one of the two test steps conditionally.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] RyanSkraba commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
RyanSkraba commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1150097580

   Hello!  Thanks for your contribution and for your patience!  I've set this fix version for the next major release of Avro (1.12.0, later this year).  I know record support would be a great feature to get in.
   
   Do you have a JIRA account so we can assign this JIRA to you and ensure that it's properly taken care of?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on a diff in pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
martin-g commented on code in PR #1680:
URL: https://github.com/apache/avro/pull/1680#discussion_r877998855


##########
lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java:
##########
@@ -61,6 +65,20 @@ public class SpecificData extends GenericData {
   private static final Class<?>[] NO_ARG = new Class[] {};
   private static final Class<?>[] SCHEMA_ARG = new Class[] { Schema.class };
 
+  private static final Method IS_RECORD;

Review Comment:
   ```suggestion
     private static final Method IS_RECORD_METHOD;
   ```



##########
lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectDatumReader.java:
##########
@@ -70,6 +70,21 @@ public ReflectDatumReader(ReflectData data) {
     super(data);
   }
 
+  /** Called to read data. */
+  protected Object read(Object old, Schema expected, ResolvingDecoder in) throws IOException {

Review Comment:
   ```suggestion
     @Override
     protected Object read(Object old, Schema expected, ResolvingDecoder in) throws IOException {
   ```



##########
lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectRecordEncoding.java:
##########
@@ -0,0 +1,140 @@
+package org.apache.avro.reflect;
+
+import java.io.IOException;
+import java.lang.reflect.Constructor;
+import java.lang.reflect.Field;
+import java.lang.reflect.Method;
+import java.lang.reflect.Modifier;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import org.apache.avro.AvroMissingFieldException;
+import org.apache.avro.AvroRuntimeException;
+import org.apache.avro.Schema;
+import org.apache.avro.io.Decoder;
+import org.apache.avro.io.Encoder;
+import org.apache.avro.io.ResolvingDecoder;
+
+public class ReflectRecordEncoding extends CustomEncoding {
+
+  private final List<FieldWriter> writer;
+  private final Constructor<?> constructor;
+  private List<FieldReader> reader;
+
+  public ReflectRecordEncoding(Schema schema, Class<?> type) {
+    this.schema = schema;
+
+    this.writer = schema.getFields().stream().map(field -> {
+      try {
+        Method method = type.getMethod(field.name());
+        return new FieldWriter(method, field.schema());
+      } catch (ReflectiveOperationException e) {
+        throw new AvroMissingFieldException("Field does not exist", field);
+      }
+
+    }).collect(Collectors.toList());
+
+    // order of this matches default constructor find order mapping
+
+    Field[] fields = type.getDeclaredFields();
+
+    List<Class<?>> parameterTypes = new ArrayList<>(fields.length);
+
+    Map<String, Integer> offsets = new HashMap<>();
+
+    // need to know offset for mapping
+    for (Field field : fields) {
+      if (Modifier.isStatic(field.getModifiers())) {
+        continue;
+      }
+      offsets.put(field.getName(), parameterTypes.size());
+      parameterTypes.add(field.getType());
+    }
+
+    try {
+      this.constructor = type.getDeclaredConstructor(parameterTypes.toArray(Class[]::new));
+
+    } catch (NoSuchMethodException e) {
+      throw new AvroRuntimeException(e);
+    }
+
+    this.reader = schema.getFields().stream().map(field -> {
+      int offset = offsets.get(field.name());
+      return new FieldReader(offset, field.schema());
+
+    }).collect(Collectors.toList());
+
+  }
+
+  @Override
+  protected void write(Object datum, Encoder out) throws IOException {
+    throw new UnsupportedOperationException("No writer specified");
+  }
+
+  @Override
+  protected void write(Object datum, Encoder out, ReflectDatumWriter writer) throws IOException {
+    for (FieldWriter field : this.writer) {
+      field.write(datum, out, writer);
+    }
+  }
+
+  @Override
+  protected Object read(Object reuse, Decoder in) throws IOException {
+    throw new UnsupportedOperationException("No writer specified");

Review Comment:
   ```suggestion
       throw new UnsupportedOperationException("No reader specified");
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] jklamer commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
jklamer commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1136672762

   I would love the testing in java 17 module if nothing else but to see some examples of the workflow/use. Lots of really cool stuff here that Im still trying to get my head around!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ashley-taylor commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
ashley-taylor commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1150929259

   > I know but JIRA offers me 5 users with "Ashley Taylor" as display name when I enter this id: ![ashley-taylor-avro-contributor](https://user-images.githubusercontent.com/232002/172819964-028ecd5a-eec6-4629-9b06-ff5396bb8f9e.png)
   
   I don't know how you did it. But have correctly assigned to the correct one. My name is clearly too common.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1150948803

   I did nothing! :-)
   Someone else did it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ashley-taylor commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
ashley-taylor commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1149418059

   @martin-g @jklamer 
   Have added some more tests and squashed the commits.
   Is there anything I can do to assist in getting this PR merged?
   
   Thanks 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ashley-taylor commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
ashley-taylor commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1150455325

   Hi @RyanSkraba thanks for the update
   
   Have added a comment on the Jira ticket
   
   I also have another [Branch](https://github.com/ashley-taylor/avro/pull/1) to address [AVRO-1568](https://issues.apache.org/jira/browse/AVRO-1568) Polymorphism that I have built on top of this branch. Looking to open that PR after this one.
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on pull request #1680: AVRO-3126 Add support for java records

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #1680:
URL: https://github.com/apache/avro/pull/1680#issuecomment-1150760605

   > Do you have a JIRA account so we can assign this JIRA to you and ensure that it's properly taken care of?
   
   I tried to add Ashley to the Contributors role in JIRA but there are 5 JIRA users with that (display) name and I am not sure which one to add.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org