Posted to dev@avro.apache.org by "horizonzy (via GitHub)" <gi...@apache.org> on 2023/11/09 09:27:10 UTC

[PR] Reflect data get fields support not order by property name. [avro]

horizonzy opened a new pull request, #2581:
URL: https://github.com/apache/avro/pull/2581

   In some cases, we want the fields discovered by reflection to keep their original declaration order.
   
   
   ```
   class Pojo {
     String f2;
     String f1;
   }
   ```
   A client creates a Pojo and serializes it as JSON:
   ```
   {"f2":"a", "f1":"b"}
   ```
   
   The client then creates the schema using ReflectData, which sorts the fields by name.
   
   The schema:
   ```
   [{"name":"f1", "type":"string"},
   {"name":"f2", "type":"string"}]
   ```
   
   The client then pushes the payload and the schema to the server. The server stores the JSON payload in a key-value format and stores the schema in the schema registry.
   
   Another client then wants to read the data. The server reads the key-value data and decodes it according to the schema from the schema registry. Because the schema's field order does not match the original, the decoded JSON looks like this:
   ```
   {"f1":"b", "f2":"a"}
   ```
   
   I want the JSON field order to match the original order, so ReflectData should be able to create the schema without sorting the fields.
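
The ordering difference described above can be reproduced with plain Java reflection. The sketch below is not Avro code: `FieldOrderDemo`, `declaredOrder`, and `sortedOrder` are hypothetical names chosen for illustration. It only mirrors the idea that sorting by field name gives the same result on every JVM, while `Class.getDeclaredFields()` documents no particular order.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class FieldOrderDemo {
  // Same shape as the Pojo in the description: f2 is declared before f1.
  static class Pojo {
    String f2;
    String f1;
  }

  // Names in whatever order the JVM reports them; no ordering is guaranteed.
  static List<String> declaredOrder(Class<?> type) {
    return names(type.getDeclaredFields());
  }

  // Names sorted alphabetically, which is the same on every JVM.
  static List<String> sortedOrder(Class<?> type) {
    Field[] fields = type.getDeclaredFields(); // a fresh array, safe to sort
    Arrays.sort(fields, Comparator.comparing(Field::getName));
    return names(fields);
  }

  private static List<String> names(Field[] fields) {
    List<String> result = new ArrayList<>();
    for (Field field : fields) {
      // Skip compiler-generated and static fields (an assumption of this sketch).
      if (!field.isSynthetic() && !Modifier.isStatic(field.getModifiers())) {
        result.add(field.getName());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println("declared: " + declaredOrder(Pojo.class));
    System.out.println("sorted:   " + sortedOrder(Pojo.class));
  }
}
```

Only the sorted list is safe to compare position by position; the declared list may or may not start with `f2`, depending on the JVM.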


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3138: Reflect data get fields support not order by property name [avro]

Posted by "RyanSkraba (via GitHub)" <gi...@apache.org>.
RyanSkraba commented on PR #2581:
URL: https://github.com/apache/avro/pull/2581#issuecomment-1803900618

   Hello!  I've added AVRO-3138 as the relevant JIRA for this feature.  There's a bit of discussion about *why* we sort the field names.  It looks like you're implementing "A JVM system property ... triggers the old behaviour", but there are other possibilities for specifying this.
   
   In this case, it's _possible_ for your test to fail depending on the JVM being used to run it.




Re: [PR] Reflect data get fields support not order by property name. [avro]

Posted by "horizonzy (via GitHub)" <gi...@apache.org>.
horizonzy commented on PR #2581:
URL: https://github.com/apache/avro/pull/2581#issuecomment-1803496229

   @nielsbasjes cc




Re: [PR] AVRO-3138: Reflect data get fields support not order by property name [avro]

Posted by "horizonzy (via GitHub)" <gi...@apache.org>.
horizonzy commented on PR #2581:
URL: https://github.com/apache/avro/pull/2581#issuecomment-1805178620

   > Hello! I've added [AVRO-3138](https://issues.apache.org/jira/browse/AVRO-3138) as the relevant JIRA for this feature. There's a bit of discussion about _why_ we sort the field names. It looks like you're implementing "A JVM system property ... triggers the old behaviour", but there are other possibilities for specifying this.
   > 
   > In this case, it's _possible_ for your test to fail depending on the JVM being used to run it.
   
   I have gone through the context. Most projects use Java reflection to get the fields without sorting them. So if a user combines Avro with other projects and the data passes between them, they may hit the same problem I did.




Re: [PR] AVRO-3138: Reflect data get fields support not order by property name [avro]

Posted by "horizonzy (via GitHub)" <gi...@apache.org>.
horizonzy commented on code in PR #2581:
URL: https://github.com/apache/avro/pull/2581#discussion_r1391288030


##########
lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java:
##########
@@ -845,20 +859,25 @@ public static Schema makeNullable(Schema schema) {
 
   private static final ConcurrentMap<Class<?>, Field[]> FIELDS_CACHE = new ConcurrentHashMap<>();
 
+  private static final ConcurrentMap<Class<?>, Field[]> NATIVE_FIELDS_CACHE = new ConcurrentHashMap<>();
+
   // Return of this class and its superclasses to serialize.
-  private static Field[] getCachedFields(Class<?> recordClass) {
-    return MapUtil.computeIfAbsent(FIELDS_CACHE, recordClass, rc -> getFields(rc, true));
+  private static Field[] getCachedFields(Class<?> recordClass, boolean orderBy) {
+    return MapUtil.computeIfAbsent(orderBy ? FIELDS_CACHE : NATIVE_FIELDS_CACHE, recordClass,
+        rc -> getFields(rc, true, orderBy));
   }
 
-  private static Field[] getFields(Class<?> recordClass, boolean excludeJava) {
+  private static Field[] getFields(Class<?> recordClass, boolean excludeJava, boolean orderBy) {

Review Comment:
   addressed.



##########
lang/java/avro/src/test/java/org/apache/avro/reflect/TestReflectData.java:
##########
@@ -76,6 +78,25 @@ void genericProtocol() {
     assertThat(existsArgument.schema(), equalTo(Schema.create(Schema.Type.STRING)));
   }
 
+  @Test
+  void fieldsOrder() {
+    Schema schema = ReflectData.get().getSchema(Meta.class);
+    List<Schema.Field> fields = schema.getFields();
+    assertEquals(fields.size(), 4);
+    assertEquals(fields.get(0).name(), "f1");
+    assertEquals(fields.get(1).name(), "f2");
+    assertEquals(fields.get(2).name(), "f3");
+    assertEquals(fields.get(3).name(), "f4");
+
+    schema = new ReflectData(false).getSchema(Meta.class);
+    fields = schema.getFields();
+    assertEquals(fields.size(), 4);
+    assertEquals(fields.get(0).name(), "f1");
+    assertEquals(fields.get(1).name(), "f4");
+    assertEquals(fields.get(2).name(), "f2");
+    assertEquals(fields.get(3).name(), "f3");

Review Comment:
   addressed





Re: [PR] AVRO-3138: Reflect data get fields support not order by property name [avro]

Posted by "horizonzy (via GitHub)" <gi...@apache.org>.
horizonzy commented on PR #2581:
URL: https://github.com/apache/avro/pull/2581#issuecomment-1805561373

   @RyanSkraba I now use a final property to control the behavior.




Re: [PR] AVRO-3138: Reflect data get fields support not order by property name [avro]

Posted by "opwvhk (via GitHub)" <gi...@apache.org>.
opwvhk commented on code in PR #2581:
URL: https://github.com/apache/avro/pull/2581#discussion_r1391267880


##########
lang/java/avro/src/test/java/org/apache/avro/reflect/TestReflectData.java:
##########
@@ -76,6 +78,25 @@ void genericProtocol() {
     assertThat(existsArgument.schema(), equalTo(Schema.create(Schema.Type.STRING)));
   }
 
+  @Test
+  void fieldsOrder() {
+    Schema schema = ReflectData.get().getSchema(Meta.class);
+    List<Schema.Field> fields = schema.getFields();
+    assertEquals(fields.size(), 4);
+    assertEquals(fields.get(0).name(), "f1");
+    assertEquals(fields.get(1).name(), "f2");
+    assertEquals(fields.get(2).name(), "f3");
+    assertEquals(fields.get(3).name(), "f4");
+
+    schema = new ReflectData(false).getSchema(Meta.class);
+    fields = schema.getFields();
+    assertEquals(fields.size(), 4);
+    assertEquals(fields.get(0).name(), "f1");
+    assertEquals(fields.get(1).name(), "f4");
+    assertEquals(fields.get(2).name(), "f2");
+    assertEquals(fields.get(3).name(), "f3");

Review Comment:
   This order depends on the JVM in use. Can we please ensure this test is only enabled for JVMs where the test is known to succeed?
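
One way to keep such a test independent of the JVM is to assert only on which field names are present when the unsorted mode is used, and to keep position-by-position assertions for the sorted mode. The helper below is a minimal sketch in plain Java; `FieldNameAssertions` and `sameNamesIgnoringOrder` are hypothetical names, not part of Avro or of this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FieldNameAssertions {
  // True when both lists contain the same names (including duplicates), in any order.
  static boolean sameNamesIgnoringOrder(List<String> actual, List<String> expected) {
    List<String> sortedActual = new ArrayList<>(actual);
    List<String> sortedExpected = new ArrayList<>(expected);
    Collections.sort(sortedActual);
    Collections.sort(sortedExpected);
    return sortedActual.equals(sortedExpected);
  }

  public static void main(String[] args) {
    // The order seen on one JVM versus the order expected by the test author.
    List<String> seen = Arrays.asList("f1", "f4", "f2", "f3");
    List<String> expected = Arrays.asList("f1", "f2", "f3", "f4");
    System.out.println(sameNamesIgnoringOrder(seen, expected));
  }
}
```

This trades precision for portability: it no longer proves that declaration order is preserved, only that no field is lost or added.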





Re: [PR] AVRO-3138: Reflect data get fields support not order by property name [avro]

Posted by "horizonzy (via GitHub)" <gi...@apache.org>.
horizonzy commented on PR #2581:
URL: https://github.com/apache/avro/pull/2581#issuecomment-1805180247

   I would like to provide a configuration option to control this instead of JVM system properties.




Re: [PR] Reflect data get fields support not order by property name. [avro]

Posted by "KalleOlaviNiemitalo (via GitHub)" <gi...@apache.org>.
KalleOlaviNiemitalo commented on PR #2581:
URL: https://github.com/apache/avro/pull/2581#issuecomment-1803470581

   I wonder if this should be an annotation on the class, instead.  A Java code generator could then automatically add the annotation if needed.
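
A class-level opt-in along these lines might look like the sketch below. The annotation `@KeepJvmFieldOrder` and the helper `shouldSortFields` are purely hypothetical; Avro does not define them, and the PR under discussion uses a constructor flag instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class FieldOrderAnnotationSketch {
  // Hypothetical marker: classes carrying it ask for fields in JVM-reported order.
  @Retention(RetentionPolicy.RUNTIME) // must be visible to reflection at runtime
  @Target(ElementType.TYPE)
  @interface KeepJvmFieldOrder {
  }

  @KeepJvmFieldOrder
  static class Event {
    String f2;
    String f1;
  }

  static class Plain {
    String f2;
    String f1;
  }

  // Sorting stays the default; only annotated classes opt out.
  static boolean shouldSortFields(Class<?> type) {
    return !type.isAnnotationPresent(KeepJvmFieldOrder.class);
  }

  public static void main(String[] args) {
    System.out.println("sort Event? " + shouldSortFields(Event.class));
    System.out.println("sort Plain? " + shouldSortFields(Plain.class));
  }
}
```

A per-class marker keeps the decision next to the type it affects, so a code generator could emit it where needed without changing how other classes are handled.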




Re: [PR] AVRO-3138: Reflect data get fields support not order by property name [avro]

Posted by "opwvhk (via GitHub)" <gi...@apache.org>.
opwvhk commented on code in PR #2581:
URL: https://github.com/apache/avro/pull/2581#discussion_r1391265121


##########
lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java:
##########
@@ -845,20 +859,25 @@ public static Schema makeNullable(Schema schema) {
 
   private static final ConcurrentMap<Class<?>, Field[]> FIELDS_CACHE = new ConcurrentHashMap<>();
 
+  private static final ConcurrentMap<Class<?>, Field[]> NATIVE_FIELDS_CACHE = new ConcurrentHashMap<>();
+
   // Return of this class and its superclasses to serialize.
-  private static Field[] getCachedFields(Class<?> recordClass) {
-    return MapUtil.computeIfAbsent(FIELDS_CACHE, recordClass, rc -> getFields(rc, true));
+  private static Field[] getCachedFields(Class<?> recordClass, boolean orderBy) {
+    return MapUtil.computeIfAbsent(orderBy ? FIELDS_CACHE : NATIVE_FIELDS_CACHE, recordClass,
+        rc -> getFields(rc, true, orderBy));
   }
 
-  private static Field[] getFields(Class<?> recordClass, boolean excludeJava) {
+  private static Field[] getFields(Class<?> recordClass, boolean excludeJava, boolean orderBy) {

Review Comment:
   ```suggestion
     private static Field[] getFields(Class<?> recordClass, boolean excludeJava, boolean useDeterministicFieldOrder) {
   ```
   
   As `orderBy` is a boolean, it's not a standard "order by X". Can we please rename it?
   
   More than a name like `sortFields`, the name `useDeterministicFieldOrder` captures the fact that a JVM is free to scramble the field order as it sees fit.
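
To illustrate how the suggested name reads at the call site, here is a simplified, standalone sketch of the two-cache lookup from the diff. It is an approximation under several assumptions: it uses `ConcurrentHashMap.computeIfAbsent` directly rather than Avro's `MapUtil`, it only looks at the fields a class declares itself (no superclasses, no `excludeJava` handling), and the names `FieldCacheSketch`, `SORTED_CACHE`, `JVM_ORDER_CACHE`, and `loadFields` are made up for this example.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class FieldCacheSketch {
  // One cache per ordering mode, so a class can be looked up both ways.
  private static final ConcurrentMap<Class<?>, Field[]> SORTED_CACHE = new ConcurrentHashMap<>();
  private static final ConcurrentMap<Class<?>, Field[]> JVM_ORDER_CACHE = new ConcurrentHashMap<>();

  static class Sample {
    int b;
    int a;
  }

  static Field[] getCachedFields(Class<?> recordClass, boolean useDeterministicFieldOrder) {
    ConcurrentMap<Class<?>, Field[]> cache = useDeterministicFieldOrder ? SORTED_CACHE : JVM_ORDER_CACHE;
    return cache.computeIfAbsent(recordClass, rc -> loadFields(rc, useDeterministicFieldOrder));
  }

  private static Field[] loadFields(Class<?> recordClass, boolean useDeterministicFieldOrder) {
    // Drop compiler-generated fields; keep the rest in JVM-reported order.
    Field[] fields = Arrays.stream(recordClass.getDeclaredFields())
        .filter(field -> !field.isSynthetic())
        .toArray(Field[]::new);
    if (useDeterministicFieldOrder) {
      // Sorting by name removes any dependence on the JVM's field order.
      Arrays.sort(fields, Comparator.comparing(Field::getName));
    }
    return fields;
  }

  public static void main(String[] args) {
    for (Field field : getCachedFields(Sample.class, true)) {
      System.out.println(field.getName());
    }
  }
}
```

With the boolean named this way, `getCachedFields(Sample.class, false)` signals that the caller accepts whatever order the JVM reports, rather than implying some alternative sort key.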


