You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/09/05 03:26:51 UTC

[GitHub] [hudi] dik111 opened a new issue, #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8

dik111 opened a new issue, #6588:
URL: https://github.com/apache/hudi/issues/6588

   **_Tips before filing an issue_**
   
   - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
   
   - Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.
   
   - If you have triaged this as a bug, then file an [issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.
   
   **Describe the problem you faced**
   
   I use spark to query hudi data, but it throws an exception Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1.use Flink to insert data
   2.use spark to query data
   
   **Expected behavior**
   
   A clear and concise description of what you expected to happen.
   
   **Environment Description**
   
   * Hudi version : 0.12.0
   
   * Spark version : 2.4.4
   
   * Hive version : 3.0.0
   
   * Hadoop version : 3.1.0
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   Here is my spark code
   ``` scala
       System.setProperty("HADOOP_USER_NAME", "hdfs")
       val spark:SparkSession = {
         SparkSession.builder()
           .appName(this.getClass.getSimpleName.stripSuffix("$"))
           .master("local[2]")
           .config("spark.serializer","org.apache.spark.serializer.KryoSerializer")
           .getOrCreate()
       }
   
       val tablePath = "hdfs://hdp03:8020/warehouse/tablespace/managed/hive/test.db/pt_sfy_sfy_oe_order_lines_all_hudi_0904"
   
   
       val dataFrame = spark.read.format("org.apache.hudi").load(tablePath)
       dataFrame.createOrReplaceTempView("pt_sfy_sfy_oe_order_lines_all_hudi_0904")
       spark.sql("select * from pt_sfy_sfy_oe_order_lines_all_hudi_0904 ").show()
   
       spark.stop()
   ```
   
   **Stacktrace**
   
   ```
   00:34  WARN: [kryo] Unable to load class org.apache.hudi.org.apache.avro.util.Utf8 with kryo's ClassLoader. Retrying with current..
   22/09/05 11:17:58 ERROR AbstractHoodieLogRecordReader: Got exception when reading log file
   com.esotericsoftware.kryo.KryoException: Unable to find class: org.apache.hudi.org.apache.avro.util.Utf8
   Serialization trace:
   orderingVal (org.apache.hudi.common.model.DeleteRecord)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
   	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693)
   	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
   	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
   	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:391)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:302)
   	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
   	at org.apache.hudi.common.util.SerializationUtils$KryoSerializerInstance.deserialize(SerializationUtils.java:104)
   	at org.apache.hudi.common.util.SerializationUtils.deserialize(SerializationUtils.java:78)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.deserialize(HoodieDeleteBlock.java:106)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.getRecordsToDelete(HoodieDeleteBlock.java:91)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:473)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:343)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
   	at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
   	at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:196)
   	at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:124)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   	at org.apache.spark.scheduler.Task.run(Task.scala:123)
   	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8
   	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
   	at java.lang.Class.forName0(Native Method)
   	at java.lang.Class.forName(Class.java:348)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
   	... 43 more
   22/09/05 11:17:58 ERROR Executor: Exception in task 3.0 in stage 5.0 (TID 18)
   org.apache.hudi.exception.HoodieException: Exception when reading log file 
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:352)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
   	at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
   	at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:196)
   	at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:124)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   	at org.apache.spark.scheduler.Task.run(Task.scala:123)
   	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: org.apache.hudi.org.apache.avro.util.Utf8
   Serialization trace:
   orderingVal (org.apache.hudi.common.model.DeleteRecord)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
   	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693)
   	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
   	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
   	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:391)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:302)
   	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
   	at org.apache.hudi.common.util.SerializationUtils$KryoSerializerInstance.deserialize(SerializationUtils.java:104)
   	at org.apache.hudi.common.util.SerializationUtils.deserialize(SerializationUtils.java:78)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.deserialize(HoodieDeleteBlock.java:106)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.getRecordsToDelete(HoodieDeleteBlock.java:91)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:473)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:343)
   	... 29 more
   Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8
   	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
   	at java.lang.Class.forName0(Native Method)
   	at java.lang.Class.forName(Class.java:348)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
   	... 43 more
   22/09/05 11:17:58 WARN TaskSetManager: Lost task 3.0 in stage 5.0 (TID 18, localhost, executor driver): org.apache.hudi.exception.HoodieException: Exception when reading log file 
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:352)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
   	at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
   	at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:196)
   	at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:124)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   	at org.apache.spark.scheduler.Task.run(Task.scala:123)
   	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: org.apache.hudi.org.apache.avro.util.Utf8
   Serialization trace:
   orderingVal (org.apache.hudi.common.model.DeleteRecord)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
   	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693)
   	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
   	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
   	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:391)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:302)
   	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
   	at org.apache.hudi.common.util.SerializationUtils$KryoSerializerInstance.deserialize(SerializationUtils.java:104)
   	at org.apache.hudi.common.util.SerializationUtils.deserialize(SerializationUtils.java:78)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.deserialize(HoodieDeleteBlock.java:106)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.getRecordsToDelete(HoodieDeleteBlock.java:91)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:473)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:343)
   	... 29 more
   Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8
   	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
   	at java.lang.Class.forName0(Native Method)
   	at java.lang.Class.forName(Class.java:348)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
   	... 43 more
   
   22/09/05 11:17:58 ERROR TaskSetManager: Task 3 in stage 5.0 failed 1 times; aborting job
   22/09/05 11:17:58 INFO TaskSchedulerImpl: Removed TaskSet 5.0, whose tasks have all completed, from pool 
   22/09/05 11:17:58 INFO TaskSchedulerImpl: Cancelling stage 5
   22/09/05 11:17:58 INFO TaskSchedulerImpl: Killing all running tasks in stage 5: Stage cancelled
   22/09/05 11:17:58 INFO DAGScheduler: ResultStage 5 (show at HudiSparkDemo.scala:81) failed in 28.179 s due to Job aborted due to stage failure: Task 3 in stage 5.0 failed 1 times, most recent failure: Lost task 3.0 in stage 5.0 (TID 18, localhost, executor driver): org.apache.hudi.exception.HoodieException: Exception when reading log file 
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:352)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
   	at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
   	at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:196)
   	at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:124)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   	at org.apache.spark.scheduler.Task.run(Task.scala:123)
   	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: org.apache.hudi.org.apache.avro.util.Utf8
   Serialization trace:
   orderingVal (org.apache.hudi.common.model.DeleteRecord)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
   	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693)
   	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
   	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
   	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:391)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:302)
   	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
   	at org.apache.hudi.common.util.SerializationUtils$KryoSerializerInstance.deserialize(SerializationUtils.java:104)
   	at org.apache.hudi.common.util.SerializationUtils.deserialize(SerializationUtils.java:78)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.deserialize(HoodieDeleteBlock.java:106)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.getRecordsToDelete(HoodieDeleteBlock.java:91)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:473)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:343)
   	... 29 more
   Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8
   	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
   	at java.lang.Class.forName0(Native Method)
   	at java.lang.Class.forName(Class.java:348)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
   	... 43 more
   
   Driver stacktrace:
   22/09/05 11:17:58 INFO DAGScheduler: Job 5 failed: show at HudiSparkDemo.scala:81, took 28.189034 s
   Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 5.0 failed 1 times, most recent failure: Lost task 3.0 in stage 5.0 (TID 18, localhost, executor driver): org.apache.hudi.exception.HoodieException: Exception when reading log file 
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:352)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:110)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:103)
   	at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:324)
   	at org.apache.hudi.HoodieMergeOnReadRDD$.scanLog(HoodieMergeOnReadRDD.scala:402)
   	at org.apache.hudi.HoodieMergeOnReadRDD$LogFileIterator.<init>(HoodieMergeOnReadRDD.scala:196)
   	at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:124)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   	at org.apache.spark.scheduler.Task.run(Task.scala:123)
   	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: org.apache.hudi.org.apache.avro.util.Utf8
   Serialization trace:
   orderingVal (org.apache.hudi.common.model.DeleteRecord)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
   	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693)
   	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
   	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
   	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:391)
   	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:302)
   	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
   	at org.apache.hudi.common.util.SerializationUtils$KryoSerializerInstance.deserialize(SerializationUtils.java:104)
   	at org.apache.hudi.common.util.SerializationUtils.deserialize(SerializationUtils.java:78)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.deserialize(HoodieDeleteBlock.java:106)
   	at org.apache.hudi.common.table.log.block.HoodieDeleteBlock.getRecordsToDelete(HoodieDeleteBlock.java:91)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.processQueuedBlocksForInstant(AbstractHoodieLogRecordReader.java:473)
   	at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:343)
   	... 29 more
   Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8
   	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
   	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
   	at java.lang.Class.forName0(Native Method)
   	at java.lang.Class.forName(Class.java:348)
   	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
   	... 43 more
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on issue #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #6588:
URL: https://github.com/apache/hudi/issues/6588#issuecomment-1236524891

   what jar did you use for spark, you can open the spark jar with command:
   ```shell
   vim xxx.jar
   ```
   
   and search about the missing clazz to see if the jar includes it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] dik111 commented on issue #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8

Posted by GitBox <gi...@apache.org>.
dik111 commented on issue #6588:
URL: https://github.com/apache/hudi/issues/6588#issuecomment-1236580669

   I solved this problem by adding the following configuration in Packaging/Hudi-spark-bundle/pom.xml
   ```
   ...
    <include>org.apache.avro:avro</include>
   ...
   ...
   <relocation>
     <pattern>org.apache.avro.</pattern>
     <shadedPattern>org.apache.hudi.org.apache.avro.</shadedPattern>
   </relocation>
   ...
   ...
       <dependency>
         <groupId>org.apache.avro</groupId>
         <artifactId>avro</artifactId>
         <version>1.8.2</version>
         <scope>compile</scope>
       </dependency>
   ...
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on issue #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #6588:
URL: https://github.com/apache/hudi/issues/6588#issuecomment-1236516899

   Seems that spark bundle jar does not contain the shaded avro clazz.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on issue #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #6588:
URL: https://github.com/apache/hudi/issues/6588#issuecomment-1347638244

   @xushiyan Did you know the background that the spark-bundle does not include avro as a dependency ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] dik111 closed issue #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8

Posted by GitBox <gi...@apache.org>.
dik111 closed issue #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8
URL: https://github.com/apache/hudi/issues/6588


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] dik111 commented on issue #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8

Posted by GitBox <gi...@apache.org>.
dik111 commented on issue #6588:
URL: https://github.com/apache/hudi/issues/6588#issuecomment-1236519071

   > Seems that spark bundle jar does not contain the shaded avro clazz.
   
   What should I do about it ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] daihw commented on issue #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8

Posted by GitBox <gi...@apache.org>.
daihw commented on issue #6588:
URL: https://github.com/apache/hudi/issues/6588#issuecomment-1347627637

   > I solved this problem by adding the following configuration in Packaging/Hudi-spark-bundle/pom.xml
   > 
   > ```
   > ...
   >  <include>org.apache.avro:avro</include>
   > ...
   > ...
   > <relocation>
   >   <pattern>org.apache.avro.</pattern>
   >   <shadedPattern>org.apache.hudi.org.apache.avro.</shadedPattern>
   > </relocation>
   > ...
   > ...
   >     <dependency>
   >       <groupId>org.apache.avro</groupId>
   >       <artifactId>avro</artifactId>
   >       <version>1.8.2</version>
   >       <scope>compile</scope>
   >     </dependency>
   > ...
   > ```
   
   hi,i got the same promblem as you ,when I repaired it according to your method,I encountered a new problem and could not use Spark to insert data,the error message is 
   java.lang.ClassCastException: org.apache.hudi.org.apache.avro.Schema$RecordSchema cannot be cast to org.apache.avro.Schema
           at org.apache.spark.SparkConf$$anonfun$registerAvroSchemas$1.apply(SparkConf.scala:221)
           at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
           at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
           at org.apache.spark.SparkConf.registerAvroSchemas(SparkConf.scala:221)
           at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:280)
           at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.run(InsertIntoHoodieTableCommand.scala:101)
           at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand.run(InsertIntoHoodieTableCommand.scala:60)
           at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
           at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
           at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
           at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
           at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
           at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
           at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
           at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
           at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
           at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
           at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)
           at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
           at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
           at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
           at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
           at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
           at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
           at java.security.AccessController.doPrivileged(Native Method)
           at javax.security.auth.Subject.doAs(Subject.java:422)
           at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
           at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   165739 [HiveServer2-Background-Pool: Thread-133] ERROR org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation  - Error running hive query: 
   org.apache.hive.service.cli.HiveSQLException: java.lang.ClassCastException: org.apache.hudi.org.apache.avro.Schema$RecordSchema cannot be cast to org.apache.avro.Schema
           at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:269)
           at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
           at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
           at java.security.AccessController.doPrivileged(Native Method)
           at javax.security.auth.Subject.doAs(Subject.java:422)
           at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
           at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   Error: java.lang.ClassCastException: org.apache.hudi.org.apache.avro.Schema$RecordSchema cannot be cast to org.apache.avro.Schema (state=,code=0)
   
   can you help me ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on issue #6588: [SUPPORT]Caused by: java.lang.ClassNotFoundException: org.apache.hudi.org.apache.avro.util.Utf8

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #6588:
URL: https://github.com/apache/hudi/issues/6588#issuecomment-1347633205

   It seems you did not shade the avro correctly for your spark bundle jar


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org