Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/05/09 11:41:49 UTC

[GitHub] [hudi] nleena123 opened a new issue, #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

nleena123 opened a new issue, #5539:
URL: https://github.com/apache/hudi/issues/5539

   Hi Team,
   I am getting the exception below. Our source is a Kafka topic, and we are running the Hudi code in Azure Databricks.
   Our Hudi jobs are failing with this exception. Can anyone help me with this?
   
   ERROR:
   Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]
   Steps to reproduce the behavior:
   
   
   **Expected behavior**
   
   A clear and concise description of what you expected to happen.
   
   **Environment Description**
   Azure Databricks
   Using the dependencies below:
   
   org.apache.spark:spark-avro_2.12:3.1.1
   
   org.apache.hudi:hudi-spark3-bundle_2.12:0.8.0
   
   org.apache.hudi:hudi-utilities-bundle_2.12:0.8.0
   
   * Hudi version :0.8.0
   
   * Spark version :3.2.1
   
   * Storage (HDFS/S3/GCS..) : azure storage 
   
   * Running on Docker? (yes/no) :
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] pratyakshsharma commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1122267396

   I do not see the cdc_ts field in this schema. Where does the field in the error you shared come from?




[GitHub] [hudi] codope commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
codope commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1329343439

   Closing due to inactivity. Please reopen with more details as requested in https://github.com/apache/hudi/issues/5539#issuecomment-1125852964




[GitHub] [hudi] nleena123 commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
nleena123 commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1122032972

   @pratyakshsharma 
   
   {"subject":"async.messaging.interaction.survey-value","version":4,"id":92358,"schema":"{\"type\":\"record\",\"name\":\"InteractionSurvey\",\"namespace\":\"com.optum.lpgc\",\"fields\":[{\"name\":\"survey_type\",\"type\":[\"null\",\"string\"]},{\"name\":\"survey_status\",\"type\":[\"null\",\"string\"]},{\"name\":\"interaction_id\",\"type\":\"string\"},{\"name\":\"survey_id\",\"type\":[\"null\",\"string\"]},{\"name\":\"survey_data\",\"type\":{\"type\":\"array\",\"items\":{\"type\":\"record\",\"name\":\"SurveyData\",\"fields\":[{\"name\":\"question\",\"type\":\"string\"},{\"name\":\"answer\",\"type\":\"string\"},{\"name\":\"questionId\",\"type\":\"string\"},{\"name\":\"answerId\",\"type\":\"string\"},{\"name\":\"questionType\",\"type\":\"string\"},{\"name\":\"questionFormat\",\"type\":\"string\"},{\"name\":\"isValidAnswer\",\"type\":\"string\"},{\"name\":\"answerSeq\",\"type\":\"string\"}]}}},{\"name\":\"create_datetime\",\"type\":\"string\"}]}"}
   
   Please find the schema above.




[GitHub] [hudi] codope closed issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
codope closed issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]
URL: https://github.com/apache/hudi/issues/5539




[GitHub] [hudi] nleena123 commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
nleena123 commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1122307710

   We are adding the new field in our Hudi code with the code below:
   		**Schema.Field f = new Schema.Field(CDC_TS, Schema.createUnion(optionalString), CDC_TS, "null");**
   
   static GenericRecord getDataWithCDC_TS(GenericRecord record, String ts) {
   
   		Schema base = record.getSchema();
   		List<Schema.Field> baseFields = base.getFields().stream()
   				.map(field -> {
   					Schema.Field f = new Schema.Field(field.name(), field.schema(), field.doc(), field.defaultVal());
   					field.aliases().forEach(f::addAlias);
   					return f;
   				})
   				.collect(Collectors.toList());
   		List<Schema.Field> newFields = new ArrayList<>(baseFields);
   		ArrayList<Schema> optionalString = new ArrayList<>();
   		optionalString.add(Schema.create(Schema.Type.NULL));
   		optionalString.add(Schema.create(Schema.Type.STRING));
   		
   		Schema.Field f = new Schema.Field(CDC_TS, Schema.createUnion(optionalString), CDC_TS, "null");
   		if (!baseFields.contains(f))
   			newFields.add(f);
   
   
   		Schema newSchema = Schema.createRecord(
   				base.getName(),
   				"New schema by adding a new field,CDC_TS",
   				base.getNamespace(), base.isError(),
   				newFields);
   
   		GenericRecord newRecord = new GenericData.Record(newSchema);
   
   		base.getFields().stream().forEach(field -> newRecord.put(field.name(), record.get(field.name())));
   		newRecord.put(CDC_TS, ts);
   
   		return newRecord;
   
   	}




[GitHub] [hudi] pratyakshsharma commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1125852964

   This is an Avro version mismatch issue. Can you check your dependency graph to see whether an Avro version < 1.8 is coming from somewhere?
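
The dependency-graph check suggested here can be complemented by a runtime check that reports which jar a class was actually loaded from. The following is a minimal sketch using only standard Java reflection; the class name `ClassOrigin` and the method `describe` are hypothetical names chosen for illustration, and the reported version depends on what the jar's manifest declares, so it may come back as unknown.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /** Returns a one-line description of where the named class was loaded from. */
    static String describe(String className) {
        try {
            // Load without running static initializers; we only want metadata.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());

            // The code source is null for classes loaded by the bootstrap loader.
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            String location = (source == null || source.getLocation() == null)
                    ? "<bootstrap or unknown>"
                    : source.getLocation().toString();

            // The implementation version comes from the jar manifest, if present.
            Package pkg = cls.getPackage();
            String version = (pkg == null || pkg.getImplementationVersion() == null)
                    ? "<unknown>"
                    : pkg.getImplementationVersion();

            return className + " loaded from " + location
                    + ", implementation version " + version;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not found on the classpath";
        }
    }

    public static void main(String[] args) {
        // Run this inside the failing job to see which Avro jar wins at runtime.
        System.out.println(describe("org.apache.avro.Schema"));
    }
}
```

Running `describe("org.apache.avro.Schema")` from inside the job (rather than locally) matters here, because the cluster runtime may put its own Avro jar ahead of the one bundled with the application.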




[GitHub] [hudi] nsivabalan commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1302885245

   @nleena123: would you mind closing the issue if you are not looking for any further assistance?
   




[GitHub] [hudi] codope commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
codope commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1160588809

   @nleena123 Any update on this? Is this still an issue?




[GitHub] [hudi] pratyakshsharma commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1122407861

   Check this Jira: https://issues.apache.org/jira/browse/AVRO-2509. You probably need to set your default value to null and not "null". In a union schema, the default value should correspond to the first schema type. "null" is a string, so it corresponds to the string type rather than the null type, which is probably the reason you are getting this error.
   
   Another link - https://stackoverflow.com/questions/57094629/getting-schema-deserialisation-error-invalid-default-for-field
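
The rule described in this comment can be illustrated with a deliberately simplified model. This is a sketch, not Avro's validation code: the class `UnionDefaultCheck`, the `JSON_NULL` marker, and the method `isValidDefault` are all hypothetical, and only the `null` and `string` branch types are modelled. It shows why the four-character string `"null"` is rejected for a `["null","string"]` union while a real JSON null is accepted.

```java
import java.util.Arrays;
import java.util.List;

public class UnionDefaultCheck {

    /** Stand-in for a JSON null default; distinct from the string "null". */
    static final Object JSON_NULL = new Object();

    /**
     * Simplified model of the rule: a union field's default must be a
     * valid value for the FIRST branch of the union.
     */
    static boolean isValidDefault(List<String> unionBranches, Object defaultValue) {
        String first = unionBranches.get(0);
        if ("null".equals(first)) {
            return defaultValue == JSON_NULL;
        }
        if ("string".equals(first)) {
            return defaultValue instanceof String;
        }
        throw new IllegalArgumentException("branch type not modelled: " + first);
    }

    public static void main(String[] args) {
        List<String> optionalString = Arrays.asList("null", "string");

        // The string "null" is a string value, so it does not match the null branch.
        System.out.println(isValidDefault(optionalString, "null"));      // false

        // A real JSON null matches the first branch.
        System.out.println(isValidDefault(optionalString, JSON_NULL));   // true

        // With the branches reversed, a string default (even "null") is valid.
        System.out.println(isValidDefault(Arrays.asList("string", "null"), "null")); // true
    }
}
```

In the Avro Java API itself, to my knowledge, a JSON null default is expressed with `JsonProperties.NULL_VALUE` when using the `Object`-typed `Schema.Field` constructor, whereas a plain Java `null` there means "no default"; this is worth verifying against the Avro version actually on the classpath.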




[GitHub] [hudi] pratyakshsharma commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1121097433

   Hi @nleena123 Can you share the schema you are using?




[GitHub] [hudi] nleena123 commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
nleena123 commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1123208199

   @pratyakshsharma 
   
   Error_log : before changing the CDC_TS default value  "null" to null .
   [logs_before_default_value.txt](https://github.com/apache/hudi/files/8667011/logs_before_default_value.txt)
   
   Error_log : After changing the CDC_TS default value  "null" to null .
   [logs_after_default_value.txt](https://github.com/apache/hudi/files/8667010/logs_after_default_value.txt)
   
   




[GitHub] [hudi] nleena123 commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
nleena123 commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1122570839

   I followed the resolution steps above and changed the code to:
   
   Schema.Field f = new Schema.Field(CDC_TS, Schema.createUnion(optionalString), CDC_TS, null);
   
   After changing the default value from "null" to null, I got the exception below:
   Job aborted due to stage failure. Caused by: NoSuchMethodError: org.apache.avro.Schema$Field.<init>(Ljava/lang/String;Lorg/apache/avro/Schema;Ljava/lang/String;Lorg/codehaus/jackson/JsonNode;)V
   at com.optum.df.hudi.sources.DFAvroKafkaSource.getDataWithCDC_TS(DFAvroKafkaSource.java:153)
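
The `NoSuchMethodError` above names a `Schema.Field` constructor whose last parameter is `org.codehaus.jackson.JsonNode`. One plausible reading (an assumption, not confirmed in this thread) is that the job was compiled against an Avro release that has this overload and is running against one that does not. A minimal sketch for listing the constructors that are actually available at runtime, using only standard reflection; the class name `ConstructorProbe` and the method `constructorsOf` are hypothetical:

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

public class ConstructorProbe {

    /**
     * Returns the signatures of all constructors declared by the named class,
     * or an empty list if the class cannot be loaded.
     */
    static List<String> constructorsOf(String className) {
        List<String> signatures = new ArrayList<>();
        try {
            // Load without initializing; we only inspect the class structure.
            Class<?> cls = Class.forName(className, false,
                    ConstructorProbe.class.getClassLoader());
            for (Constructor<?> c : cls.getDeclaredConstructors()) {
                signatures.add(c.toString());
            }
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its parameter types) is not available: report nothing.
        }
        return signatures;
    }

    public static void main(String[] args) {
        // Nested classes use '$' in their binary name.
        for (String signature : constructorsOf("org.apache.avro.Schema$Field")) {
            System.out.println(signature);
        }
    }
}
```

If the printed list contains no constructor taking `org.codehaus.jackson.JsonNode`, the literal `null` argument was most likely bound to that overload at compile time, and recompiling against the runtime Avro version (or passing an explicitly typed default) would be the direction to investigate.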
   




[GitHub] [hudi] pratyakshsharma commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1122694230

   Please share the entire error log.




[GitHub] [hudi] nsivabalan commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1229360880

   Not sure if it's already resolved, but in the case of a union with null as the default, null has to be the first entry in the union.




[GitHub] [hudi] pratyakshsharma commented on issue #5539: [SUPPORT]Job aborted due to stage failure. Caused by: AvroTypeException: Invalid default for field CDC_TS: "null" not a ["null","string"]

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on issue #5539:
URL: https://github.com/apache/hudi/issues/5539#issuecomment-1247835150

   @nleena123 Any updates here?

