Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/10/08 09:11:11 UTC

[GitHub] [hudi] gubinjie opened a new issue, #6894: [SUPPORT]Error running child : java.lang.NoSuchMethodError: org.apache.parquet.avro.AvroSchemaConverter.convert(Lorg/apache/parquet/schema/MessageType;)Lorg/apache/hudi/org/apache/avro/Schema;

gubinjie opened a new issue, #6894:
URL: https://github.com/apache/hudi/issues/6894

   CDH 6.3.2
   Hudi 0.10.1
   
   When querying a Hudi table through Hive, I get the following error:
   select * from hudi_flink_tyc_company_rt where name = '3213'
   
   `2022-10-08 16:30:27,365 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
   2022-10-08 16:30:27,661 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
   2022-10-08 16:30:27,819 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: HoodieCombineRealtimeFileSplit{realtimeFileSplits=[HoodieRealtimeFileSplit{DataPath=hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/5652dad0-9e32-43f5-99c4-eff0a89c6a79_0-1-0_20220929181835942.parquet, deltaLogPaths=[hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/.5652dad0-9e32-43f5-99c4-eff0a89c6a79_20220929181835942.log.1_0-1-0], maxCommitTime='20220929190955221', basePath='hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db'}, HoodieRealtimeFileSplit{DataPath=hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/6f026a25-797e-4a8b-9382-b426b94fd034_0-1-0_20220929181835942.parquet, deltaLogPaths=[hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/.6f026a25-797e-4a8b-9382-b426b94fd034_20220929181835942.log.1_0-1-0], maxCommitTime='20220929190955221', basePath='hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db'}, HoodieRealtimeFileSplit{DataPath=hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/4f90e72d-d205-4640-975f-09ebb2ad136a_0-1-0_20220929180105887.parquet, deltaLogPaths=[], maxCommitTime='20220929190955221', basePath='hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db'}, HoodieRealtimeFileSplit{DataPath=hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/b72b41e5-7bd9-4a87-a91d-86a368a2f7b7_0-1-0_20220929181835942.parquet, deltaLogPaths=[], maxCommitTime='20220929190955221', basePath='hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db'}]}InputFormatClass: org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat
   
   2022-10-08 16:30:27,873 INFO [main] org.apache.hadoop.hive.conf.HiveConf: Found configuration file null
   2022-10-08 16:30:27,980 INFO [main] org.apache.hadoop.hive.ql.exec.SerializationUtilities: Deserializing MapWork using kryo
   2022-10-08 16:30:28,110 INFO [main] org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat: Before adding Hoodie columns, Projections :_hoodie_commit_time,_hoodie_commit_seqno,_hoodie_record_key,_hoodie_partition_path,_hoodie_file_name,company_id,company_name,legal_person_name,establish_time,reg_capital,reg_status,reg_number,org_number,credit_code,reg_location,phone_num,province_code,city_code,district_code,province,city,district,company_type,tax_code,category_code_std,social_security_staff_num,update_time, Ids :0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26
   2022-10-08 16:30:28,110 INFO [main] org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat: Creating record reader with readCols :_hoodie_commit_time,_hoodie_commit_seqno,_hoodie_record_key,_hoodie_partition_path,_hoodie_file_name,company_id,company_name,legal_person_name,establish_time,reg_capital,reg_status,reg_number,org_number,credit_code,reg_location,phone_num,province_code,city_code,district_code,province,city,district,company_type,tax_code,category_code_std,social_security_staff_num,update_time, Ids :0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26
   2022-10-08 16:30:28,361 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
   2022-10-08 16:30:28,366 ERROR [main] org.apache.parquet.hadoop.ParquetRecordReader: Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
   2022-10-08 16:30:28,390 INFO [main] org.apache.parquet.hadoop.InternalParquetRecordReader: RecordReader initialized will read a total of 44225 records.
   2022-10-08 16:30:28,390 INFO [main] org.apache.parquet.hadoop.InternalParquetRecordReader: at row 0. reading next block
   2022-10-08 16:30:28,412 INFO [main] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
   2022-10-08 16:30:28,413 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.gz]
   2022-10-08 16:30:28,418 INFO [main] org.apache.parquet.hadoop.InternalParquetRecordReader: block read in memory in 28 ms. row count = 44225
   2022-10-08 16:30:28,565 INFO [main] org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader: Enabling merged reading of realtime records for split HoodieRealtimeFileSplit{DataPath=hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/5652dad0-9e32-43f5-99c4-eff0a89c6a79_0-1-0_20220929181835942.parquet, deltaLogPaths=[hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/.5652dad0-9e32-43f5-99c4-eff0a89c6a79_20220929181835942.log.1_0-1-0], maxCommitTime='20220929190955221', basePath='hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db'}
   2022-10-08 16:30:28,566 INFO [main] org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader: cfg ==> _hoodie_commit_time,_hoodie_commit_seqno,_hoodie_record_key,_hoodie_partition_path,_hoodie_file_name,company_id,company_name,legal_person_name,establish_time,reg_capital,reg_status,reg_number,org_number,credit_code,reg_location,phone_num,province_code,city_code,district_code,province,city,district,company_type,tax_code,category_code_std,social_security_staff_num,update_time
   2022-10-08 16:30:28,566 INFO [main] org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader: columnIds ==> 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26
   2022-10-08 16:30:28,566 INFO [main] org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader: partitioningColumns ==> 
   2022-10-08 16:30:28,574 INFO [main] org.apache.hudi.common.table.HoodieTableMetaClient: Loading HoodieTableMetaClient from hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db
   2022-10-08 16:30:28,586 INFO [main] org.apache.hudi.common.table.HoodieTableConfig: Loading table properties from hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/.hoodie/hoodie.properties
   2022-10-08 16:30:28,589 INFO [main] org.apache.hudi.common.table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db
   2022-10-08 16:30:28,590 INFO [main] org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader: usesCustomPayload ==> true
   2022-10-08 16:30:28,590 INFO [main] org.apache.hudi.common.table.HoodieTableMetaClient: Loading HoodieTableMetaClient from hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db
   2022-10-08 16:30:28,592 INFO [main] org.apache.hudi.common.table.HoodieTableConfig: Loading table properties from hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/.hoodie/hoodie.properties
   2022-10-08 16:30:28,594 INFO [main] org.apache.hudi.common.table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db
   2022-10-08 16:30:28,609 ERROR [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoSuchMethodError: org.apache.parquet.avro.AvroSchemaConverter.convert(Lorg/apache/parquet/schema/MessageType;)Lorg/apache/hudi/org/apache/avro/Schema;
   	at org.apache.hudi.common.util.ParquetUtils.readAvroSchema(ParquetUtils.java:217)
   	at org.apache.hudi.io.storage.HoodieParquetReader.getSchema(HoodieParquetReader.java:71)
   	at org.apache.hudi.hadoop.utils.HoodieRealtimeRecordReaderUtils.readSchema(HoodieRealtimeRecordReaderUtils.java:72)
   	at org.apache.hudi.hadoop.InputSplitUtils.getBaseFileSchema(InputSplitUtils.java:70)
   	at org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader.init(AbstractRealtimeRecordReader.java:87)
   	at org.apache.hudi.hadoop.realtime.AbstractRealtimeRecordReader.<init>(AbstractRealtimeRecordReader.java:67)
   	at org.apache.hudi.hadoop.realtime.RealtimeCompactedRecordReader.<init>(RealtimeCompactedRecordReader.java:62)
   	at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.constructRecordReader(HoodieRealtimeRecordReader.java:70)
   	at org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader.<init>(HoodieRealtimeRecordReader.java:47)
   	at org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat.getRecordReader(HoodieParquetRealtimeInputFormat.java:323)
   	at org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat$HoodieCombineFileInputFormatShim.getRecordReader(HoodieCombineHiveInputFormat.java:974)
   	at org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat.getRecordReader(HoodieCombineHiveInputFormat.java:555)
   	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:175)
   	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444)
   	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
   	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
   	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
   
   2022-10-08 16:30:28,713 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
   2022-10-08 16:30:28,714 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped.
   2022-10-08 16:30:28,714 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete.`
   
   Hive table structure:
   CREATE EXTERNAL TABLE `paat_ods_hudi.paat_hudi_flink_tyc_company_rt`(
     `_hoodie_commit_time` string COMMENT '',
     `_hoodie_commit_seqno` string COMMENT '',
     `_hoodie_record_key` string COMMENT '',
     `_hoodie_partition_path` string COMMENT '',
     `_hoodie_file_name` string COMMENT '',
     `company_id` string COMMENT '',
     `company_name` string COMMENT '',
     `legal_person_name` string COMMENT '',
     `establish_time` string COMMENT '',
     `reg_capital` string COMMENT '',
     `reg_status` string COMMENT '',
     `reg_number` string COMMENT '',
     `org_number` string COMMENT '',
     `credit_code` string COMMENT '',
     `reg_location` string COMMENT '',
     `phone_num` string COMMENT '',
     `province_code` string COMMENT '',
     `city_code` string COMMENT '',
     `district_code` string COMMENT '',
     `province` string COMMENT '',
     `city` string COMMENT '',
     `district` string COMMENT '',
     `company_type` string COMMENT '',
     `tax_code` string COMMENT '',
     `category_code_std` string COMMENT '',
     `social_security_staff_num` string COMMENT '',
     `update_time` bigint COMMENT '')
   ROW FORMAT SERDE
     'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
   WITH SERDEPROPERTIES (
     'hoodie.query.as.ro.table'='false',
     'path'='hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db/')
   STORED AS INPUTFORMAT
     'org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat'
   OUTPUTFORMAT
     'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
   LOCATION
     'hdfs://paat-dev/user/hudi/warehouse/paat_ods_hudi.db'
   TBLPROPERTIES (
     'last_commit_time_sync'='20220929181835978',
     'spark.sql.sources.provider'='hudi',
     'spark.sql.sources.schema.numParts'='1',
     'spark.sql.sources.schema.part.0'='{"type":"struct","fields":[{"name":"_hoodie_commit_time","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_commit_seqno","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_record_key","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_partition_path","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_file_name","type":"string","nullable":true,"metadata":{}},{"name":"company_id","type":"string","nullable":false,"metadata":{}},{"name":"company_name","type":"string","nullable":true,"metadata":{}},{"name":"legal_person_name","type":"string","nullable":true,"metadata":{}},{"name":"establish_time","type":"string","nullable":true,"metadata":{}},{"name":"reg_capital","type":"string","nullable":true,"metadata":{}},{"name":"reg_status","type":"string","nullable":true,"metadata":{}},{"name":"reg_number","type":"string","nullable":true,"metadata":{}},{"name":"org_number","type":"string","nullable":true,"metadata":{}},{"name":"credit_code","type":"string","nullable":true,"metadata":{}},{"name":"reg_location","type":"string","nullable":true,"metadata":{}},{"name":"phone_num","type":"string","nullable":true,"metadata":{}},{"name":"province_code","type":"string","nullable":true,"metadata":{}},{"name":"city_code","type":"string","nullable":true,"metadata":{}},{"name":"district_code","type":"string","nullable":true,"metadata":{}},{"name":"province","type":"string","nullable":true,"metadata":{}},{"name":"city","type":"string","nullable":true,"metadata":{}},{"name":"district","type":"string","nullable":true,"metadata":{}},{"name":"company_type","type":"string","nullable":true,"metadata":{}},{"name":"tax_code","type":"string","nullable":true,"metadata":{}},{"name":"category_code_std","type":"string","nullable":true,"metadata":{}},{"name":"social_security_staff_num","type":"string","nullable":true,"metadata":{}},{"name":"update_time","type":"timestamp","nullable":true,"metadata":{}}]}',
     'transient_lastDdlTime'='1664444316')
   
   I have already put hudi-hadoop-mr-bundle-0.10.1.jar into Hive's /etc/hive/auxlib directory. Is there another JAR that is still missing?
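   One way to narrow this down (a hedged diagnostic sketch, not part of the original report; the class and method names are taken from the stack trace above, while the class name `CheckParquetAvro` is hypothetical) is to check, on the same classpath the failing map task uses, which JAR actually provides `org.apache.parquet.avro.AvroSchemaConverter` and which `convert(...)` overloads it exposes. If no overload returns `org.apache.hudi.org.apache.avro.Schema`, that would be consistent with the cluster's own parquet-avro shadowing the classes shipped inside hudi-hadoop-mr-bundle rather than the bundle being missing altogether.
   
   ```java
   // Hypothetical diagnostic (not from the original issue): locate the JAR that
   // actually loads AvroSchemaConverter and list its convert(...) signatures.
   import java.lang.reflect.Method;
   
   public class CheckParquetAvro {
       public static void main(String[] args) throws Exception {
           Class<?> c = Class.forName("org.apache.parquet.avro.AvroSchemaConverter");
           // Which JAR won on the classpath (e.g. the cluster's parquet-avro vs. the Hudi bundle)
           System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
           for (Method m : c.getDeclaredMethods()) {
               if (m.getName().equals("convert")) {
                   // The stack trace expects a convert(MessageType) overload returning
                   // org.apache.hudi.org.apache.avro.Schema (Avro as relocated by the Hudi
                   // bundle); if only org.apache.avro.Schema appears here, the unshaded
                   // parquet-avro is being picked up first.
                   System.out.println(m);
               }
           }
       }
   }
   ```
   
   Running this with the same classpath as the Hive/YARN task (auxlib JARs included) should show whether the bundle is absent or merely losing out to another parquet-avro.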
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] gubinjie closed issue #6894: [SUPPORT]Error running child : java.lang.NoSuchMethodError: org.apache.parquet.avro.AvroSchemaConverter.convert(Lorg/apache/parquet/schema/MessageType;)Lorg/apache/hudi/org/apache/avro/Schema;

Posted by GitBox <gi...@apache.org>.
gubinjie closed issue #6894: [SUPPORT]Error running child : java.lang.NoSuchMethodError: org.apache.parquet.avro.AvroSchemaConverter.convert(Lorg/apache/parquet/schema/MessageType;)Lorg/apache/hudi/org/apache/avro/Schema;
URL: https://github.com/apache/hudi/issues/6894




[GitHub] [hudi] shursulei commented on issue #6894: [SUPPORT]Error running child : java.lang.NoSuchMethodError: org.apache.parquet.avro.AvroSchemaConverter.convert(Lorg/apache/parquet/schema/MessageType;)Lorg/apache/hudi/org/apache/avro/Schema;

Posted by "shursulei (via GitHub)" <gi...@apache.org>.
shursulei commented on issue #6894:
URL: https://github.com/apache/hudi/issues/6894#issuecomment-1441234619

   I have solved this problem. It was caused by the version of Flink and the version of Hudi (they did not match).
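   If the root cause is indeed a version mismatch between the component that writes the table and the bundle Hive reads it with, one quick sanity check (a hypothetical sketch, not from this thread; the class name `PrintHudiVersion` is made up) is to print which Hudi classes each side actually loads and what release they come from, then keep the Flink bundle and hudi-hadoop-mr-bundle on the same release.
   
   ```java
   // Hypothetical check (not from the thread): print the source JAR and, when the
   // manifest carries it, the implementation version of the Hudi classes on the classpath.
   public class PrintHudiVersion {
       public static void main(String[] args) throws Exception {
           Class<?> c = Class.forName("org.apache.hudi.common.table.HoodieTableMetaClient");
           System.out.println("loaded from: " + c.getProtectionDomain().getCodeSource().getLocation());
           Package p = c.getPackage();
           // getImplementationVersion() can be null if the JAR manifest does not set it
           System.out.println("implementation version: " + (p != null ? p.getImplementationVersion() : "unknown"));
       }
   }
   ```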




[GitHub] [hudi] shursulei commented on issue #6894: [SUPPORT]Error running child : java.lang.NoSuchMethodError: org.apache.parquet.avro.AvroSchemaConverter.convert(Lorg/apache/parquet/schema/MessageType;)Lorg/apache/hudi/org/apache/avro/Schema;

Posted by "shursulei (via GitHub)" <gi...@apache.org>.
shursulei commented on issue #6894:
URL: https://github.com/apache/hudi/issues/6894#issuecomment-1439691903

   How did you solve this problem? I am using a Flink sink to write to Hudi and I have the same problem.

