Posted to commits@hudi.apache.org by "Selvaraj Periyasamy (Jira)" <ji...@apache.org> on 2020/06/27 04:51:00 UTC
[jira] [Created] (HUDI-1057) optional int32 is not a group
Selvaraj Periyasamy created HUDI-1057:
-----------------------------------------
Summary: optional int32 is not a group
Key: HUDI-1057
URL: https://issues.apache.org/jira/browse/HUDI-1057
Project: Apache Hudi
Issue Type: Bug
Reporter: Selvaraj Periyasamy
I am using Hudi 0.5.0 and writing to a COW table using Spark. Consecutive writes fail with the error below.
Caused by: java.lang.ClassCastException: optional int32 trli_sequence_number_list is not a group
    at org.apache.parquet.schema.Type.asGroupType(Type.java:202)
    at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:206)
    at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:112)
    at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:79)
    at org.apache.parquet.avro.AvroRecordMaterializer.<init>(AvroRecordMaterializer.java:33)
    at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:132)
    at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:175)
    at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:149)
    at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:125)
    at org.apache.hudi.func.ParquetReaderIterator.hasNext(ParquetReaderIterator.java:47)
    at org.apache.hudi.common.util.queue.IteratorBasedQueueProducer.produce(IteratorBasedQueueProducer.java:44)
    at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$0(BoundedInMemoryExecutor.java:91)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    ... 4 more
The corresponding schema for that column is below.
{
"name" : "trli_sequence_number_list",
"type" : [ {
"type" : "array",
"items" : [ "string", "null" ]
}, "null" ]
},
I have multiple columns with an array data type.
{
"name" : "rli_invoice_number_list",
"type" : [ {
"type" : "array",
"items" : [ "string", "null" ]
}, "null" ]
}, {
"name" : "trli_sequence_number_list",
"type" : [ {
"type" : "array",
"items" : [ "string", "null" ]
}, "null" ]
},
Is there a way to avoid this error?
Similarly, there is another error in the log from the same run.
Caused by: java.lang.ClassCastException: optional binary app_application_list (UTF8) is not a group
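For anyone trying to reproduce this, the array-typed columns mentioned above can be listed from the Avro schema with a short script. This is only a rough sketch for narrowing down which columns share the shape shown earlier, not a fix. It assumes the two field definitions from this report are wrapped in a record; the record name `example_record` and the helper names `is_array_type` and `array_field_names` are made up for illustration.

```python
import json

# The two field definitions from the report, wrapped in a record so the
# fragment parses as standalone JSON. "example_record" is a placeholder name.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "example_record",
  "fields": [
    {"name": "rli_invoice_number_list",
     "type": [{"type": "array", "items": ["string", "null"]}, "null"]},
    {"name": "trli_sequence_number_list",
     "type": [{"type": "array", "items": ["string", "null"]}, "null"]}
  ]
}
"""


def is_array_type(avro_type):
    """Return True if avro_type is an Avro array or a union containing one."""
    if isinstance(avro_type, dict):
        return avro_type.get("type") == "array"
    if isinstance(avro_type, list):  # an Avro union is written as a JSON list
        return any(is_array_type(branch) for branch in avro_type)
    return False


def array_field_names(schema):
    """Return the names of top-level record fields that are array-typed."""
    return [f["name"] for f in schema.get("fields", []) if is_array_type(f["type"])]


schema = json.loads(SCHEMA_JSON)
array_fields = array_field_names(schema)
print(array_fields)
```

Running the same helper against the full table schema would show every column declared as a nullable array, which can then be compared with the column names in the ClassCastException messages.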
--
This message was sent by Atlassian Jira
(v8.3.4#803005)