You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2022/10/17 22:09:48 UTC

[GitHub] [gobblin] srramach commented on a diff in pull request #3581: [GOBBLIN-1726] Avro 1.9 upgrade of Gobblin OSS

srramach commented on code in PR #3581:
URL: https://github.com/apache/gobblin/pull/3581#discussion_r997451873


##########
gobblin-core-base/src/main/java/org/apache/gobblin/converter/filter/AvroSchemaFieldRemover.java:
##########
@@ -119,14 +121,17 @@ private Schema removeFieldsFromRecords(Schema schema, Map<String, Schema> schema
       if (!this.shouldRemove(field)) {
         Field newField;
         if (this.children.containsKey(field.name())) {
-          newField = new Field(field.name(), this.children.get(field.name()).removeFields(field.schema(), schemaMap),
-              field.doc(), field.defaultValue());
+          newField = AvroCompatibilityHelper.createSchemaField(field.name(),
+              this.children.get(field.name()).removeFields(field.schema(), schemaMap),
+              field.doc(), AvroUtils.getCompatibleDefaultValue(field));
         } else {
-          newField = new Field(field.name(), DO_NOTHING_INSTANCE.removeFields(field.schema(), schemaMap), field.doc(),
-              field.defaultValue());
+          newField = AvroCompatibilityHelper.createSchemaField(field.name(),
+              DO_NOTHING_INSTANCE.removeFields(field.schema(), schemaMap), field.doc(),
+              AvroUtils.getCompatibleDefaultValue(field));
         }
-        for (Map.Entry<String, JsonNode> stringJsonNodeEntry : field.getJsonProps().entrySet()) {
-          newField.addProp(stringJsonNodeEntry.getKey(), stringJsonNodeEntry.getValue());
+        // Avro 1.9 compatible change - replaced deprecated public api getJsonProps with getObjectProps
+        for (Map.Entry<String, Object> objectEntry : field.getObjectProps().entrySet()) {

Review Comment:
   getObjectProps() doesn't exist in Avro 1.7.7, so this is not Avro version agnostic. We did add a new API in AvroCompatibilityHelper to getAllPropNames(), so you should be able to use the AvroCompatibilityHelper here too.



##########
gobblin-core/src/test/java/org/apache/gobblin/recordaccess/AvroGenericRecordAccessorTest.java:
##########
@@ -139,6 +139,11 @@ public void testGetStringArrayUtf8() throws IOException {
 
   @Test
   public void testGetMultiConvertsStrings() throws IOException {
+    // The below error is due to invalid avro data. Type with "null" union must have "null" first and then

Review Comment:
   This comment is not strictly accurate (as far as Avro is concerned). A union with "null" doesn't _have_ to have null as the first type. It's usually intended to signify optional-ness, and usually people do want to put null first, so that they can set "null" as the default value, but it's not required.
   
   What _is_ required (as per Avro) is that the default value must have the same type as the first entry in the union. So, for example, this is correct:
   ```
   "type": ["null", "string"],
   "default": null
   ```
   
   Whereas this is wrong:
   ```
   // default is null, but must be a string, since that's the first type in the union
   "type": ["string", "null"],
   "default": null
   ```
   
   This is also wrong:
   ```
   // default is a string ("null"), not null, which is the first type in the union
   "type": ["null", "string"],
   "default": "null"
   ```
   
   And this is totally correct, though usually rare to see:
   ```
   "type": ["string", "null"]
   "default": "foobar"
   ```
   
   So what you really want to say in this comment is that since the default value is `null`, you want to ensure that that's the first type in the union.



##########
gobblin-core-base/src/main/java/org/apache/gobblin/converter/filter/AvroSchemaFieldRemover.java:
##########
@@ -119,14 +121,17 @@ private Schema removeFieldsFromRecords(Schema schema, Map<String, Schema> schema
       if (!this.shouldRemove(field)) {
         Field newField;
         if (this.children.containsKey(field.name())) {
-          newField = new Field(field.name(), this.children.get(field.name()).removeFields(field.schema(), schemaMap),
-              field.doc(), field.defaultValue());
+          newField = AvroCompatibilityHelper.createSchemaField(field.name(),
+              this.children.get(field.name()).removeFields(field.schema(), schemaMap),
+              field.doc(), AvroUtils.getCompatibleDefaultValue(field));
         } else {
-          newField = new Field(field.name(), DO_NOTHING_INSTANCE.removeFields(field.schema(), schemaMap), field.doc(),
-              field.defaultValue());
+          newField = AvroCompatibilityHelper.createSchemaField(field.name(),
+              DO_NOTHING_INSTANCE.removeFields(field.schema(), schemaMap), field.doc(),
+              AvroUtils.getCompatibleDefaultValue(field));
         }
-        for (Map.Entry<String, JsonNode> stringJsonNodeEntry : field.getJsonProps().entrySet()) {
-          newField.addProp(stringJsonNodeEntry.getKey(), stringJsonNodeEntry.getValue());
+        // Avro 1.9 compatible change - replaced deprecated public api getJsonProps with getObjectProps
+        for (Map.Entry<String, Object> objectEntry : field.getObjectProps().entrySet()) {
+          newField.addProp(objectEntry.getKey(), objectEntry.getValue());

Review Comment:
   Same for addProp(). Use the AvroCompatibilityHelper instead.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org