Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/06 12:23:42 UTC

[GitHub] [hudi] sunke38 opened a new issue, #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

sunke38 opened a new issue, #5765:
URL: https://github.com/apache/hudi/issues/5765

   I use Spark SQL to insert records into Hudi. It works for a short time, but after a while it throws "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()".
   
   Steps to reproduce the behavior:
   
   I wrote a Scala function to build the insert SQL:
   ```
   
    // Imports needed by this snippet:
    // import org.apache.spark.sql.{Row, SparkSession}
    // import org.apache.spark.sql.types._

    private def write2Table(row: Row)(implicit sparkSession: SparkSession): Unit = {

      // Render each field as "<literal> as <name>", quoting string/date/timestamp values
      // and emitting "null as <name>" for empty values.
      val filed = row.schema.fields.map { field =>
        if (row.getString(row.fieldIndex(field.name)).isEmpty) {
          s"null as ${field.name}"
        } else {
          field.dataType match {
            case StringType    => s"'${row.getAs[String](field.name)}' as ${field.name}"
            case BooleanType   => s"${row.getAs[Boolean](field.name)} as ${field.name}"
            case ByteType      => s"${row.getAs[Byte](field.name)} as ${field.name}"
            case ShortType     => s"${row.getAs[Short](field.name)} as ${field.name}"
            case IntegerType   => s"${row.getAs[Int](field.name)} as ${field.name}"
            case LongType      => s"${row.getAs[Long](field.name)} as ${field.name}"
            case FloatType     => s"${row.getAs[Float](field.name)} as ${field.name}"
            case DoubleType    => s"${row.getAs[Double](field.name)} as ${field.name}"
            case DateType      => s"'${row.getAs[String](field.name)}' as ${field.name}"
            case TimestampType => s"'${row.getAs[String](field.name)}' as ${field.name}"
          }
        }
      }.mkString(",")

      val insertSql = s"insert into ${row.getAs("database")}.${row.getAs("table")}_cow select ${filed};"
      try {
        println(s"inserting into ${row.getAs("table")}_cow;")
        sparkSession.sql(insertSql)
      } catch {
        case ex: Throwable =>
          println(row.prettyJson)
          println(insertSql)
          throw ex
      }
    }
   ```
   Then I call it in foreachRDD() of a DStream:
   ```
    saveRdd.foreachRDD { rdd =>
      rdd.collect().foreach { x =>
        val row = x._1
        chackAndCreateTable(row)

        if (x._2.equals("INSERT")) {
          write2Table(row)
        }
      }
    }
   ```
   **Expected behavior**
   
   The inserts should keep succeeding; instead, after a while they fail with the NoSuchMethodError above.
   
   **Environment Description**
   
   Hudi version : 0.11
   
   Spark version : 3.2.1
   
   Hadoop version : 3.2.2
   
   Storage (HDFS/S3/GCS..) : HDFS
   
   Running on Docker? (yes/no) : no
   
   
   Here is my config.

   Spark session code:
   ```
         .appName("SparkHudi")
         .master("spark://hadoop203:7077")
         .config("spark.sql.warehouse.dir","/user/hive/warehouse")
         .config("spark.serialize","org.apache.spark.serializer.KryoSerializer")
         .config("spark.sql.extensions","org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog","org.apache.spark.sql.hudi.catalog.HoodieCatalog")
         .config("spark.sql.legacy.exponentLiteralAsDecimal.enabled",true)
         .enableHiveSupport()
         .config("hive.metastore.uris","thrift://10.10.9.203:9083")
         .getOrCreate()
   
   ```
   
   spark-submit:
   ```
   spark-submit   --jars /home/kadm/module/hudi-0.11/packaging/hudi-spark-bundle/target/hudi-spark3.2-bundle_2.12-0.11.0.jar  --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1,org.apache.spark:spark-avro_2.12:3.2.1,org.apache.kafka:kafka-clients:3.1.0  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'   --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'   --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog'  --conf "spark.driver.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5445"   --master spark://hadoop203:7077 SparkHudi-1.0-SNAPSHOT-shaded.jar
   ```
   
   **Stacktrace**
   
   ```
   22/06/06 09:47:13 ERROR Javalin: Exception occurred while servicing http-request
   java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;
   	at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.updateInputStreamStatistics(FSDataInputStreamWrapper.java:249)
   	at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.close(FSDataInputStreamWrapper.java:296)
   	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.closeStreams(HFileBlock.java:1825)
   	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFilePreadReader.close(HFilePreadReader.java:107)
   	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.close(HFileReaderImpl.java:1421)
   	at org.apache.hudi.io.storage.HoodieHFileReader.close(HoodieHFileReader.java:218)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.closeReader(HoodieBackedTableMetadata.java:574)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.close(HoodieBackedTableMetadata.java:567)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.close(HoodieBackedTableMetadata.java:554)
   	at org.apache.hudi.metadata.HoodieMetadataFileSystemView.close(HoodieMetadataFileSystemView.java:83)
   	at org.apache.hudi.common.table.view.FileSystemViewManager.clearFileSystemView(FileSystemViewManager.java:86)
   	at org.apache.hudi.timeline.service.handlers.FileSliceHandler.refreshTable(FileSliceHandler.java:118)
   	at org.apache.hudi.timeline.service.RequestHandler.lambda$registerFileSlicesAPI$19(RequestHandler.java:390)
   	at org.apache.hudi.timeline.service.RequestHandler$ViewHandler.handle(RequestHandler.java:501)
   	at io.javalin.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:22)
   	at io.javalin.Javalin.lambda$addHandler$0(Javalin.java:606)
   	at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:46)
   	at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:17)
   	at io.javalin.core.JavalinServlet$service$1.invoke(JavalinServlet.kt:143)
   	at io.javalin.core.JavalinServlet$service$2.invoke(JavalinServlet.kt:41)
   	at io.javalin.core.JavalinServlet.service(JavalinServlet.kt:107)
   	at io.javalin.core.util.JettyServerUtil$initialize$httpHandler$1.doHandle(JettyServerUtil.kt:72)
   	at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
   	at org.apache.hudi.org.apache.jetty.servlet.ServletHandler.doScope(ServletHandler.java:482)
   	at org.apache.hudi.org.apache.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668)
   	at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
   	at org.apache.hudi.org.apache.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
   	at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
   	at org.apache.hudi.org.apache.jetty.server.handler.HandlerList.handle(HandlerList.java:61)
   	at org.apache.hudi.org.apache.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
   	at org.apache.hudi.org.apache.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
   	at org.apache.hudi.org.apache.jetty.server.Server.handle(Server.java:502)
   	at org.apache.hudi.org.apache.jetty.server.HttpChannel.handle(HttpChannel.java:370)
   	at org.apache.hudi.org.apache.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
   	at org.apache.hudi.org.apache.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
   	at org.apache.hudi.org.apache.jetty.io.FillInterest.fillable(FillInterest.java:103)
   	at org.apache.hudi.org.apache.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
   	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
   	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
   	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
   	at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
   	at org.apache.hudi.org.apache.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:367)
   	at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782)
   	at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918)
   	at java.lang.Thread.run(Thread.java:748)
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] RoderickAdriance commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by "RoderickAdriance (via GitHub)" <gi...@apache.org>.
RoderickAdriance commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1443359569

   Check your steps carefully and recompile HBase 2.4:
   modify the pom file to update the Hadoop version from 2.7 to 3.2 or higher.




[GitHub] [hudi] XuQianJin-Stars commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
XuQianJin-Stars commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1154628087

   > @sunke38 @RoderickAdriance @XuQianJin-Stars : do you happened to have any hbase jars in your class path. If not, we should not see this issue in my understanding. CC @yihua
   
   Hi @nsivabalan, the problem on my side has been solved; it turned out to be a jar package conflict.




[GitHub] [hudi] RoderickAdriance commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
RoderickAdriance commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1148112582

   22/06/06 22:15:27 ERROR Javalin: Exception occurred while servicing http-request
   	java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;
   		at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.updateInputStreamStatistics(FSDataInputStreamWrapper.java:249)
   		at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.close(FSDataInputStreamWrapper.java:296)
   		at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.closeStreams(HFileBlock.java:1825)
   		at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFilePreadReader.close(HFilePreadReader.java:107)
   		at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.close(HFileReaderImpl.java:1421)
   		at org.apache.hudi.io.storage.HoodieHFileReader.close(HoodieHFileReader.java:218)
   		at org.apache.hudi.metadata.HoodieBackedTableMetadata.closeReader(HoodieBackedTableMetadata.java:574)
   		at org.apache.hudi.metadata.HoodieBackedTableMetadata.close(HoodieBackedTableMetadata.java:567)
   		at org.apache.hudi.metadata.HoodieBackedTableMetadata.close(HoodieBackedTableMetadata.java:554)
   		at org.apache.hudi.metadata.HoodieMetadataFileSystemView.close(HoodieMetadataFileSystemView.java:83)
   		at org.apache.hudi.common.table.view.FileSystemViewManager.clearFileSystemView(FileSystemViewManager.java:86)
   		at org.apache.hudi.timeline.service.handlers.FileSliceHandler.refreshTable(FileSliceHandler.java:118)
   		at org.apache.hudi.timeline.service.RequestHandler.lambda$registerFileSlicesAPI$19(RequestHandler.java:390)
   		at org.apache.hudi.timeline.service.RequestHandler$ViewHandler.handle(RequestHandler.java:501)
   		at io.javalin.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:22)
   		at io.javalin.Javalin.lambda$addHandler$0(Javalin.java:606)
   		at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:46)
   		at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:17)
   		at io.javalin.core.JavalinServlet$service$1.invoke(JavalinServlet.kt:143)
   		at io.javalin.core.JavalinServlet$service$2.invoke(JavalinServlet.kt:41)
   		at io.javalin.core.JavalinServlet.service(JavalinServlet.kt:107)
   		at io.javalin.core.util.JettyServerUtil$initialize$httpHandler$1.doHandle(JettyServerUtil.kt:72)
   		at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
   		at org.apache.hudi.org.apache.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
   		at org.apache.hudi.org.apache.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668)
   		at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
   		at org.apache.hudi.org.apache.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
   		at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
   		at org.apache.hudi.org.apache.jetty.server.handler.HandlerList.handle(HandlerList.java:61)
   		at org.apache.hudi.org.apache.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
   		at org.apache.hudi.org.apache.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
   		at org.apache.hudi.org.apache.jetty.server.Server.handle(Server.java:502)
   		at org.apache.hudi.org.apache.jetty.server.HttpChannel.handle(HttpChannel.java:370)
   		at org.apache.hudi.org.apache.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
   		at org.apache.hudi.org.apache.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
   		at org.apache.hudi.org.apache.jetty.io.FillInterest.fillable(FillInterest.java:103)
   		at org.apache.hudi.org.apache.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
   		at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
   		at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
   		at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
   		at org.apache.hudi.org.apache.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
   		at org.apache.hudi.org.apache.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
   		at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
   		at org.apache.hudi.org.apache.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
   		at java.lang.Thread.run(Thread.java:748)
   	22/06/06 22:15:27 ERROR HoodieDeltaStreamer: Got error running delta sync once. Shutting down
   	org.apache.hudi.exception.HoodieRemoteException: status code: 500, reason phrase: Server Error
   		at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.refresh(RemoteHoodieTableFileSystemView.java:420)
   		at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.sync(RemoteHoodieTableFileSystemView.java:484)
   		at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.sync(PriorityBasedFileSystemView.java:257)
   		at org.apache.hudi.table.HoodieSparkTable.create(HoodieSparkTable.java:92)
   		at org.apache.hudi.table.HoodieSparkTable.create(HoodieSparkTable.java:67)
   		at org.apache.hudi.client.SparkRDDWriteClient.createTable(SparkRDDWriteClient.java:129)
   		at org.apache.hudi.client.BaseHoodieWriteClient.scheduleTableServiceInternal(BaseHoodieWriteClient.java:1352)
   		at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:864)
   		at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:837)
   		at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:891)
   		at org.apache.hudi.client.BaseHoodieWriteClient.autoCleanOnCommit(BaseHoodieWriteClient.java:614)
   		at org.apache.hudi.client.BaseHoodieWriteClient.postCommit(BaseHoodieWriteClient.java:533)
   		at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:236)
   		at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:122)
   		at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:622)
   		at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:331)
   		at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$2(HoodieDeltaStreamer.java:200)
   		at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
   		at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:198)
   		at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:549)
   		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   		at java.lang.reflect.Method.invoke(Method.java:498)
   		at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
   		at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
   		at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
   		at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
   		at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
   		at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
   		at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
   		at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   	Caused by: org.apache.http.client.HttpResponseException: status code: 500, reason phrase: Server Error
   		at org.apache.http.impl.client.AbstractResponseHandler.handleResponse(AbstractResponseHandler.java:70)
   		at org.apache.http.client.fluent.Response.handleResponse(Response.java:90)
   		at org.apache.http.client.fluent.Response.returnContent(Response.java:97)
   		at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.executeRequest(RemoteHoodieTableFileSystemView.java:179)
   		at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.refresh(RemoteHoodieTableFileSystemView.java:418)
   		... 31 more




[GitHub] [hudi] shuai-xu commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
shuai-xu commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1176973738

   @yihua, this problem is caused by the hbase 2.4.9 jars in the Maven repositories being compiled against hadoop-2.7. The quick fix is to compile HBase against Hadoop 3.x, `mvn install` it locally, and then compile Hudi.




[GitHub] [hudi] melin commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
melin commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1156597594

   HBase relies on hadoop-hdfs-client 2.10, so if you use Hadoop 3 (hadoop-client-api) this error occurs.
   Why does the error occur even when the HBase index is not used?
   
   
   




[GitHub] [hudi] yihua commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1226572795

   @nsivabalan We can document the workaround, yet it's still not ideal for users relying on releases.  I'll check if we can fix it by the dependency management within Hudi.




[GitHub] [hudi] xzwDavid commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by "xzwDavid (via GitHub)" <gi...@apache.org>.
xzwDavid commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1732532455

   > Same issue, loading data from Spark-native parquet into MOR table. 5 pieces (loads) completed succesfully, failed on 6-th. Data and structure is the same. Possibly some compaction/cleaning happens and invoke this problem?
   > 
   > incDF.filter("ins_flg='y'") .write .format("org.apache.hudi") .option(DataSourceWriteOptions.OPERATION_OPT_KEY, DataSourceWriteOptions.BULK_INSERT_OPERATION_OPT_VAL) .option(TABLE_TYPE_OPT_KEY, "MERGE_ON_READ") .option(RECORDKEY_FIELD_OPT_KEY, "rn") .option(PRECOMBINE_FIELD_OPT_KEY, "ss_date_time") .option(HIVE_SUPPORT_TIMESTAMP_TYPE.key, "true") .option("hoodie.index.type", "SIMPLE") .option(TABLE_NAME, "store_sales") .mode(SaveMode.Append) .save("/tmp/bench_hudi/store_sales")
   > 
   > UPD: yes, looks like compaction/cleaner works here (I have 2 writes per cycle and hoodie.cleaner.commits.retained = 10 by default). I added following options to disable cleaner in my test cycle, but error still appears on 10-th commit. "hoodie.keep.min.commits" -> "40", "hoodie.keep.max.commits" -> "50", "hoodie.cleaner.commits.retained" -> "30", "hoodie.clean.automatic" -> "false"
   
   Hi,
   I encountered the same issue. Did you manage to fix it?
   




[GitHub] [hudi] nsivabalan commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1229344548

   Yes, I get it. Even if we put in a fix, only new versions of Hudi would have it, not the existing ones. So it's better to document it.
   




[GitHub] [hudi] dohongdayi commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
dohongdayi commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1179915480

   I resolved this by packaging a new version of hbase 2.4.9 against our Hadoop 3 version with the following command:
   
   `mvn clean install -Denforcer.skip  -DskipTests -Dhadoop.profile=3.0 -Psite-install-step`
   
   After that, I changed `hbase.version` in Hudi's pom.xml and packaged Hudi again.




[GitHub] [hudi] waywtdcc commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by "waywtdcc (via GitHub)" <gi...@apache.org>.
waywtdcc commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1441257445

   @nsivabalan Hello, why is this possible? After compiling, I found that the method reference of HdfsDataInputStream.getReadStatistics has not changed




[GitHub] [hudi] RoderickAdriance commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
RoderickAdriance commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1175839617

   @yihua If I use Hadoop 3 and Spark 2, this problem is resolved.
   So I think this problem is caused by an incompatibility between the Hudi jar and the Spark 3 packages.




[GitHub] [hudi] RoderickAdriance commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
RoderickAdriance commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1175836187

   @yihua If I use Hadoop 3 and Spark 2, this problem is resolved.
   So I think the HFile classes are not compatible with Spark 3.




[GitHub] [hudi] sunke38 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
sunke38 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1149511966

   @nsivabalan: I don't have any HBase dependencies in my pom; below are all of my Hadoop dependencies. What confuses me is that the same error shows up when I run the insert in spark-shell. I did nothing related to HBase for spark-shell, and I also don't see any HBase jar in spark/jars.
   
   spark-shell:
   ```
   spark-shell   --jars /home/qqq/module/hudi-0.11/packaging/hudi-spark-bundle/target/hudi-spark3.2-bundle_2.12-0.11.0.jar   --packages org.apache.spark:spark-avro_2.12:3.2.1  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'   --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'   --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog'
   ```
   
   dependencies for code:
   ```xml
   <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-client</artifactId>
               <version>${hadoop.version}</version>
               <exclusions>
                   <exclusion>
                       <artifactId>commons-cli</artifactId>
                       <groupId>commons-cli</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>commons-codec</artifactId>
                       <groupId>commons-codec</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>jakarta.activation-api</artifactId>
                       <groupId>jakarta.activation</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>protobuf-java</artifactId>
                       <groupId>com.google.protobuf</groupId>
                   </exclusion>
               </exclusions>
           </dependency>
           <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-client-api</artifactId>
               <version>${hadoop.version}</version>
               <exclusions>
                   <exclusion>
                       <artifactId>snappy-java</artifactId>
                       <groupId>org.xerial.snappy</groupId>
                   </exclusion>
               </exclusions>
           </dependency>
           <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-common</artifactId>
               <version>${hadoop.version}</version>
               <exclusions>
                   <exclusion>
                       <artifactId>stax2-api</artifactId>
                       <groupId>org.codehaus.woodstox</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>avro</artifactId>
                       <groupId>org.apache.avro</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>snappy-java</artifactId>
                       <groupId>org.xerial.snappy</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>audience-annotations</artifactId>
                       <groupId>org.apache.yetus</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>commons-lang3</artifactId>
                       <groupId>org.apache.commons</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>jackson-core-asl</artifactId>
                       <groupId>org.codehaus.jackson</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>commons-logging</artifactId>
                       <groupId>commons-logging</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>commons-cli</artifactId>
                       <groupId>commons-cli</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>zookeeper</artifactId>
                       <groupId>org.apache.zookeeper</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>commons-text</artifactId>
                       <groupId>org.apache.commons</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>jackson-mapper-asl</artifactId>
                       <groupId>org.codehaus.jackson</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>jaxb-api</artifactId>
                       <groupId>javax.xml.bind</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>protobuf-java</artifactId>
                       <groupId>com.google.protobuf</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>commons-codec</artifactId>
                       <groupId>commons-codec</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>commons-math3</artifactId>
                       <groupId>org.apache.commons</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>slf4j-api</artifactId>
                       <groupId>org.slf4j</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>nimbus-jose-jwt</artifactId>
                       <groupId>com.nimbusds</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>jakarta.activation-api</artifactId>
                       <groupId>jakarta.activation</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>commons-io</artifactId>
                       <groupId>commons-io</groupId>
                   </exclusion>
               </exclusions>
           </dependency>
           <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-hdfs</artifactId>
               <version>${hadoop.version}</version>
               <exclusions>
                   <exclusion>
                       <artifactId>commons-cli</artifactId>
                       <groupId>commons-cli</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>commons-codec</artifactId>
                       <groupId>commons-codec</groupId>
                   </exclusion>
                   <exclusion>
                       <artifactId>protobuf-java</artifactId>
                       <groupId>com.google.protobuf</groupId>
                   </exclusion>
               </exclusions>
           </dependency>
           <dependency>
               <groupId>org.apache.hadoop</groupId>
               <artifactId>hadoop-hdfs-client</artifactId>
               <version>${hadoop.version}</version>
           </dependency>
   ```




[GitHub] [hudi] chenlianguu commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
chenlianguu commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1178670277

   I have met the same problem; can anyone explain this?
   ![WeChat screenshot 20220708154909](https://user-images.githubusercontent.com/20611410/177944061-8d198ffd-dff1-406e-b773-58dea3c24677.png)




[GitHub] [hudi] codope commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
codope commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1160562283

   Should be resolved by https://github.com/apache/hudi/pull/5882
   @Humphrey0822 Can you try out that patch?




[GitHub] [hudi] nsivabalan commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1229344627

   Closing the issue as the workaround has been suggested. Feel free to re-open or create a new one.
   Thanks!




[GitHub] [hudi] 15663671003 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
15663671003 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1225911236

   > I created a ticket to track the fix: [HUDI-4341](https://issues.apache.org/jira/browse/HUDI-4341).
   
   Will the next version fix this problem? It bothers newbies like me.




[GitHub] [hudi] yihua commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1169540234

   I created a ticket to track the fix: HUDI-4341.




[GitHub] [hudi] nsivabalan closed issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
nsivabalan closed issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"
URL: https://github.com/apache/hudi/issues/5765




[GitHub] [hudi] 15663671003 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
15663671003 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1225336035

   > Version 0.12 has been released. Is this bug fixed?
   
   With spark-3.2.2 and hudi-0.12.0, it is not fixed.




[GitHub] [hudi] yihua commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1258096397

   The workaround to mitigate this problem is captured in the FAQ: #6756.




[GitHub] [hudi] RoderickAdriance commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
RoderickAdriance commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1148087948

   I have this problem too, when I use the Hudi DeltaStreamer tool to extract data from MySQL to HDFS.




[GitHub] [hudi] nsivabalan commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1149239341

   @sunke38 @RoderickAdriance @XuQianJin-Stars: do you happen to have any hbase jars in your classpath? If not, we should not see this issue, in my understanding.
   CC @yihua 




[GitHub] [hudi] pavels1983 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
pavels1983 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1149783227

   Same issue, loading data into a MOR table. 5 parts completed successfully, then it failed on the 6th. The data and structure are the same.
   Possibly some compaction happens and triggers this problem?
   
   incDF.filter("ins_flg='y'")
   .write
   .format("org.apache.hudi")
   .option(DataSourceWriteOptions.OPERATION_OPT_KEY, DataSourceWriteOptions.BULK_INSERT_OPERATION_OPT_VAL)
   .option(TABLE_TYPE_OPT_KEY, "MERGE_ON_READ")
   .option(RECORDKEY_FIELD_OPT_KEY, "rn")
   .option(PRECOMBINE_FIELD_OPT_KEY, "ss_date_time")
   .option(HIVE_SUPPORT_TIMESTAMP_TYPE.key, "true")
   .option("hoodie.index.type", "SIMPLE")
   .option(TABLE_NAME, "store_sales")
   .mode(SaveMode.Append)
   .save("/tmp/bench_hudi/store_sales")




[GitHub] [hudi] RoderickAdriance commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
RoderickAdriance commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1155870786

   This problem occurs after I execute the command many times:
   
   bin/spark-submit --master local[2]  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer  \
   jars/hudi-utilities-bundle_2.12-0.11.0.jar \
   --table-type COPY_ON_WRITE  \
   --props  /dolphinscheduler/obeiadmin/resources/Logistics_screen/data_ingestion/mysql-cnb_order-jdbc-source.properties  \
   --source-class org.apache.hudi.utilities.sources.JdbcSource  \
   --source-ordering-field ts \
   --target-base-path /user/hive/warehouse/bwdhmosaas.db/ods_cnb_order  \
   --target-table ods_cnb_order \
   --transformer-class org.apache.hudi.utilities.transform.SqlQueryBasedTransformer
   
   [error.txt](https://github.com/apache/hudi/files/8904887/error.txt)
   




[GitHub] [hudi] Humphrey0822 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
Humphrey0822 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1159935458

   > > > @sunke38 @RoderickAdriance @XuQianJin-Stars : do you happened to have any hbase jars in your class path. If not, we should not see this issue in my understanding. CC @yihua
   > > 
   > > 
   > > hi @nsivabalan The problem on my side has been solved, and it is found that it is a jar package conflict problem.
   > 
   > Hi @XuQianJin-Stars May I ask for detail how you solve this,plz? I am very confuse that where is localtion of jar package conflict. Thank you
   
   @yihua @nsivabalan @XuQianJin-Stars Which version of Hudi is planned to fix this issue?




[GitHub] [hudi] ruby-box commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
ruby-box commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1167115979

   @codope I also have the same issue; I applied the patch mentioned above (#5882) and tested it, but the same error occurs.
   ![image](https://user-images.githubusercontent.com/23192355/175904089-ae92adbf-625d-4f0e-87a1-c8bb07216f61.png)
   
   Spark version : 3.2.1
   Hadoop version : 3.2.2 (on-premise hadoop)
   Storage (HDFS/S3/GCS..) : HDFS
   Running on Docker? (yes/no) : no
   




[GitHub] [hudi] jiangbiao910 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
jiangbiao910 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1217724942

   > I created a ticket to track the fix: [HUDI-4341](https://issues.apache.org/jira/browse/HUDI-4341).
   
   Version 0.12 has been released. Is this bug fixed?




[GitHub] [hudi] nsivabalan commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1216145403

   @yihua: do you think we can document the solution proposed by @dohongdayi above in an FAQ?
   ```
   I resolved this by my own, by packaging a new version of hbase 2.4.9 with our Hadoop 3 version with the following command:
   
   mvn clean install -Denforcer.skip -DskipTests -Dhadoop.profile=3.0 -Psite-install-step
   
   then, changed hbase.defaults.for.version in hudi-common/src/main/resources/hbase-site.xml
   
   after that, changed hbase.version in pom.xml of Hudi, used versions-maven-plugin to create a new Hudi version, and package Hudi again.
   ```
   




[GitHub] [hudi] yihua commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
yihua commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1169531895

   @melin HFile is used as the base file format in metadata table under `<base_path>/.hoodie/metadata`.  The metadata table is a MOR table and the HFile only appears after compaction in the metadata table.
   
   It looks like the current way of packaging HFile classes is not compatible with Hadoop 3.
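
   A quick way to see where this HFile comes from, as a sketch assuming you are in spark-shell (`spark` in scope) and substituting your own table base path: list the metadata table under `<base_path>/.hoodie/metadata` and look for `.hfile` base files, which should only appear once the metadata table has compacted.

   ```scala
   // Sketch only: the base path below is a placeholder, and `spark` is assumed to be
   // the spark-shell session. Lists HFile base files of the Hudi metadata table.
   import org.apache.hadoop.fs.{FileSystem, Path}

   val basePath     = new Path("/tmp/hudi/my_table")          // placeholder Hudi table base path
   val metadataPath = new Path(basePath, ".hoodie/metadata")  // metadata table location per the comment above
   val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)

   if (fs.exists(metadataPath)) {
     val it = fs.listFiles(metadataPath, true)
     while (it.hasNext) {
       val f = it.next()
       // HFile base files only exist after the (MOR) metadata table has compacted;
       // reading them is what goes through the shaded HBase classes in the stack traces above.
       if (f.getPath.getName.endsWith(".hfile")) println(f.getPath)
     }
   }
   ```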




[GitHub] [hudi] jiezi2026 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
jiezi2026 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1193472473

   We also encountered the same problem with hudi-0.11.1 & spark-3.2.1; our current temporary workaround is to set hoodie.metadata.enable=false (a minimal sketch follows).
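
   As a minimal sketch of that workaround for a plain DataFrame write (the DataFrame `df`, table name, key/precombine fields, and base path below are placeholders, not taken from this thread):

   ```scala
   import org.apache.spark.sql.SaveMode

   df.write
     .format("hudi")
     .option("hoodie.table.name", "my_table")                  // placeholder table name
     .option("hoodie.datasource.write.recordkey.field", "id")  // placeholder record key field
     .option("hoodie.datasource.write.precombine.field", "ts") // placeholder precombine field
     .option("hoodie.metadata.enable", "false")                // workaround: don't build/read the metadata table
     .mode(SaveMode.Append)
     .save("/tmp/hudi/my_table")                               // placeholder base path
   ```

   With the metadata table disabled, Hudi does not open its HFile base files, so the shaded HBase reader that throws the NoSuchMethodError is not exercised; the trade-off is losing the metadata-table file listings.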




[GitHub] [hudi] abyssnlp commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
abyssnlp commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1193108491

   Ran into the same issue when using deltastreamer to read from Kafka and write to HDFS. 
   
   Spark: 3.2.1
   Hadoop: 3.2.2
   
    ```
    Caused by: org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20220723125828383
   	at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:64)
   	at org.apache.hudi.table.action.deltacommit.SparkUpsertDeltaCommitActionExecutor.execute(SparkUpsertDeltaCommitActionExecutor.java:46)
   	at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsert(HoodieSparkMergeOnReadTable.java:89)
   	at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsert(HoodieSparkMergeOnReadTable.java:76)
   	at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:155)
   	at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:586)
   	at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:333)
   	at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:679)
   	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()Lorg/apache/hadoop/hdfs/DFSInputStream$ReadStatistics;
   	at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.updateInputStreamStatistics(FSDataInputStreamWrapper.java:249)
   	at org.apache.hudi.org.apache.hadoop.hbase.io.FSDataInputStreamWrapper.close(FSDataInputStreamWrapper.java:296)
   	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.closeStreams(HFileBlock.java:1825)
   	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFilePreadReader.close(HFilePreadReader.java:107)
   	at org.apache.hudi.org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.close(HFileReaderImpl.java:1421)
   	at org.apache.hudi.io.storage.HoodieHFileReader.close(HoodieHFileReader.java:218)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.closeReader(HoodieBackedTableMetadata.java:588)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.close(HoodieBackedTableMetadata.java:571)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$0(HoodieBackedTableMetadata.java:232)
   	at java.util.HashMap.forEach(HashMap.java:1289)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:207)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:140)
   	at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:281)
   	at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:111)
   	at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:313)
   	at org.apache.hudi.index.bloom.HoodieGlobalBloomIndex.loadColumnRangesFromFiles(HoodieGlobalBloomIndex.java:62)
   	at org.apache.hudi.index.bloom.HoodieBloomIndex.getBloomIndexFileInfoForPartitions(HoodieBloomIndex.java:151)
   	at org.apache.hudi.index.bloom.HoodieBloomIndex.lookupIndex(HoodieBloomIndex.java:125)
   	at org.apache.hudi.index.bloom.HoodieBloomIndex.tagLocation(HoodieBloomIndex.java:91)
   	at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:49)
   	at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:32)
   	at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:53)
   	... 11 more`



[GitHub] [hudi] jiangbiao910 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
jiangbiao910 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1217725080

   Version 0.12 has been released. Is this bug fixed?



[GitHub] [hudi] sunke38 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
sunke38 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1154755086

   
   
   
   > > @sunke38 @RoderickAdriance @XuQianJin-Stars : do you happened to have any hbase jars in your class path. If not, we should not see this issue in my understanding. CC @yihua
   > 
   > hi @nsivabalan The problem on my side has been solved, and it is found that it is a jar package conflict problem.
   
   Hi Xu, may I ask for details on how you solved this, please? I am quite confused about where the conflicting jar is located. Thank you
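
   (One way to pinpoint such a conflict, as a minimal diagnostic sketch: paste the following into a spark-shell launched with the same classpath as the failing job, and it prints which jar actually serves the HDFS client class named in the NoSuchMethodError.)
   ```
   // Ask the JVM where org.apache.hadoop.hdfs.client.HdfsDataInputStream is loaded from;
   // a shaded or mismatched hadoop/hbase jar showing up here would explain the conflict.
   val clazz = Class.forName("org.apache.hadoop.hdfs.client.HdfsDataInputStream")
   val location = Option(clazz.getProtectionDomain.getCodeSource).map(_.getLocation)
   println(location.getOrElse("loaded from the boot classpath"))
   ```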



[GitHub] [hudi] XuQianJin-Stars commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by GitBox <gi...@apache.org>.
XuQianJin-Stars commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1148185043

   I also encountered this problem.



[GitHub] [hudi] dachn commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

Posted by "dachn (via GitHub)" <gi...@apache.org>.
dachn commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1588977319

   I found that this problem is thrown when Hudi picks up Hadoop 3.3.1 in my environment.
   ![image](https://github.com/apache/hudi/assets/46547576/a6235426-97b4-41d8-9f67-fb27a523b4f9)
   ![49DBB81F-979D-4511-A229-562AB6946B8A](https://github.com/apache/hudi/assets/46547576/deb13ad3-624e-4a3c-812f-130355b1a0c4)
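
   (As a minimal check, run in the same spark-shell or job classpath, the snippet below prints which Hadoop client version is actually picked up at runtime.)
   ```
   // Print the Hadoop client version present on the current classpath.
   import org.apache.hadoop.util.VersionInfo
   println(s"Hadoop client on the classpath: ${VersionInfo.getVersion}")
   ```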
   



Re: [I] [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()" [hudi]

Posted by "Loaimohamed79 (via GitHub)" <gi...@apache.org>.
Loaimohamed79 commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1868497750

   **I solved it by using Spark 3.4.0 and Hadoop 3.2.4, but now I have a new error:**
   23/12/24 12:24:42 WARN DFSPropertiesConfiguration: Cannot find HUDI_CONF_DIR, please set it as the dir of hudi-defaults.conf
   23/12/24 12:24:42 WARN DFSPropertiesConfiguration: Properties file file:/etc/hudi/conf/hudi-defaults.conf not found. Ignoring to load props file
   23/12/24 12:24:42 WARN HoodieSparkSqlWriter$: Choosing BULK_INSERT as the operation type since auto record key generation is applicable
   23/12/24 12:24:49 WARN AutoRecordKeyGenerationUtils$: Precombine field ts will be ignored with auto record key generation enabled
   23/12/24 12:24:55 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
   23/12/24 12:25:01 WARN WriteMarkersFactory: Timeline-server-based markers are not supported for HDFS: base path hdfs://localhost:9000/hudi/data/test5.  Falling back to direct markers.
   23/12/24 12:25:01 WARN WriteMarkersFactory: Timeline-server-based markers are not supported for HDFS: base path hdfs://localhost:9000/hudi/data/test5.  Falling back to direct markers.
   23/12/24 12:25:01 WARN WriteMarkersFactory: Timeline-server-based markers are not supported for HDFS: base path hdfs://localhost:9000/hudi/data/test5.  Falling back to direct markers.
   23/12/24 12:25:01 WARN WriteMarkersFactory: Timeline-server-based markers are not supported for HDFS: base path hdfs://localhost:9000/hudi/data/test5.  Falling back to direct markers.
   23/12/24 12:25:02 WARN WriteMarkersFactory: Timeline-server-based markers are not supported for HDFS: base path hdfs://localhost:9000/hudi/data/test5.  Falling back to direct markers.
   23/12/24 12:25:03 WARN WriteMarkersFactory: Timeline-server-based markers are not supported for HDFS: base path hdfs://localhost:9000/hudi/data/test5.  Falling back to direct markers.
   # WARNING: Unable to attach Serviceability Agent. Unable to attach even with module exceptions: [org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed.]
   23/12/24 12:25:06 WARN DataStreamer: DataStreamer Exception
   java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   23/12/24 12:25:06 WARN DFSClient: Error while syncing
   java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   23/12/24 12:25:06 ERROR BaseSparkCommitActionExecutor: Error upserting bucketType UPDATE for partition :0
   org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.1_0-0-0
   	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   	at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   23/12/24 12:25:06 WARN BlockManager: Putting block rdd_48_0 failed due to exception org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0.
   23/12/24 12:25:06 WARN BlockManager: Block rdd_48_0 could not be removed as it was not found on disk or in memory
   23/12/24 12:25:06 ERROR Executor: Exception in task 0.0 in stage 15.0 (TID 19)
   org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:342)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.1_0-0-0
   	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   	at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   	... 29 more
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   23/12/24 12:25:06 WARN TaskSetManager: Lost task 0.0 in stage 15.0 (TID 19) (192.168.1.171 executor driver): org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:342)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.1_0-0-0
   	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   	at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   	... 29 more
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   
   23/12/24 12:25:06 ERROR TaskSetManager: Task 0 in stage 15.0 failed 1 times; aborting job
   23/12/24 12:25:06 ERROR AppendDataExec: Data source write support org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite@465cf872 is aborting.
   23/12/24 12:25:06 ERROR DataSourceInternalWriterHelper: Commit 20231224122443495 aborted 
   23/12/24 12:25:09 WARN HoodieLogFormatWriter: Remote Exception, attempting to handle or recover lease
   org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to APPEND_FILE /hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.1_0-0-0 for DFSClient_NONMAPREDUCE_1489527491_15 on 127.0.0.1 because DFSClient_NONMAPREDUCE_1489527491_15 is already the current lease holder.
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2669)
   	at org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:124)
   	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2753)
   	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:840)
   	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:503)
   	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:549)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:518)
   	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
   	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
   	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2960)
   
   	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612)
   	at org.apache.hadoop.ipc.Client.call(Client.java:1558)
   	at org.apache.hadoop.ipc.Client.call(Client.java:1455)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
   	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
   	at com.sun.proxy.$Proxy39.append(Unknown Source)
   	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.append(ClientNamenodeProtocolTranslatorPB.java:413)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
   	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
   	at com.sun.proxy.$Proxy40.append(Unknown Source)
   	at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1382)
   	at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1404)
   	at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1473)
   	at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1443)
   	at org.apache.hadoop.hdfs.DistributedFileSystem$5.doCall(DistributedFileSystem.java:446)
   	at org.apache.hadoop.hdfs.DistributedFileSystem$5.doCall(DistributedFileSystem.java:442)
   	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
   	at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:454)
   	at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:423)
   	at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1470)
   	at org.apache.hudi.common.fs.HoodieWrapperFileSystem.append(HoodieWrapperFileSystem.java:523)
   	at org.apache.hudi.common.table.log.HoodieLogFormatWriter.getOutputStream(HoodieLogFormatWriter.java:101)
   	at org.apache.hudi.common.table.log.HoodieLogFormatWriter.appendBlocks(HoodieLogFormatWriter.java:144)
   	at org.apache.hudi.common.table.log.HoodieLogFormatWriter.appendBlock(HoodieLogFormatWriter.java:135)
   	at org.apache.hudi.table.action.rollback.BaseRollbackHelper.lambda$maybeDeleteAndCollectStats$309309f3$1(BaseRollbackHelper.java:141)
   	at org.apache.hudi.client.common.HoodieSparkEngineContext.lambda$flatMap$7d470b86$1(HoodieSparkEngineContext.java:150)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$flatMap$1(JavaRDDLike.scala:125)
   	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
   	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
   	at scala.collection.Iterator.foreach(Iterator.scala:943)
   	at scala.collection.Iterator.foreach$(Iterator.scala:943)
   	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
   	at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
   	at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
   	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
   	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
   	at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
   	at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
   	at scala.collection.AbstractIterator.to(Iterator.scala:1431)
   	at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:358)
   	at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:358)
   	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1431)
   	at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:345)
   	at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:339)
   	at scala.collection.AbstractIterator.toArray(Iterator.scala:1431)
   	at org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1019)
   	at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2303)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   23/12/24 12:25:09 WARN HoodieLogFormatWriter: Another task executor writing to the same log file(HoodieLogFile{pathStr='hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.1_0-0-0', fileLen=0}. Rolling over
   23/12/24 12:25:11 WARN DataStreamer: DataStreamer Exception
   java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   23/12/24 12:25:11 WARN DFSClient: Error while syncing
   java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   23/12/24 12:25:11 ERROR BaseSparkCommitActionExecutor: Error upserting bucketType UPDATE for partition :0
   org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.2_1-0-1
   	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   	at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   23/12/24 12:25:11 WARN BlockManager: Putting block rdd_85_0 failed due to exception org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0.
   23/12/24 12:25:11 WARN BlockManager: Block rdd_85_0 could not be removed as it was not found on disk or in memory
   23/12/24 12:25:11 ERROR Executor: Exception in task 0.0 in stage 31.0 (TID 57)
   org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:342)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.2_1-0-1
   	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   	at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   	... 29 more
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   23/12/24 12:25:11 WARN TaskSetManager: Lost task 0.0 in stage 31.0 (TID 57) (192.168.1.171 executor driver): org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:342)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.2_1-0-1
   	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   	at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   	... 29 more
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   
   23/12/24 12:25:11 ERROR TaskSetManager: Task 0 in stage 31.0 failed 1 times; aborting job
   23/12/24 12:25:11 ERROR AppendDataExec: Data source write support org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite@465cf872 failed to abort.
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/backupenv/spark-3.4.0-bin-hadoop3/python/pyspark/sql/readwriter.py", line 1398, in save
       self._jwrite.save(path)
     File "/home/backupenv/spark-3.4.0-bin-hadoop3/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1322, in __call__
     File "/home/backupenv/spark-3.4.0-bin-hadoop3/python/pyspark/errors/exceptions/captured.py", line 169, in deco
       return f(*a, **kw)
     File "/home/backupenv/spark-3.4.0-bin-hadoop3/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py", line 326, in get_return_value
   py4j.protocol.Py4JJavaError: An error occurred while calling o69.save.
   : org.apache.spark.SparkException: Writing job failed.
   	at org.apache.spark.sql.errors.QueryExecutionErrors$.writingJobFailedError(QueryExecutionErrors.scala:916)
   	at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2(WriteToDataSourceV2Exec.scala:434)
   	at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2$(WriteToDataSourceV2Exec.scala:382)
   	at org.apache.spark.sql.execution.datasources.v2.AppendDataExec.writeWithV2(WriteToDataSourceV2Exec.scala:248)
   	at org.apache.spark.sql.execution.datasources.v2.V2ExistingTableWriteExec.run(WriteToDataSourceV2Exec.scala:360)
   	at org.apache.spark.sql.execution.datasources.v2.V2ExistingTableWriteExec.run$(WriteToDataSourceV2Exec.scala:359)
   	at org.apache.spark.sql.execution.datasources.v2.AppendDataExec.run(WriteToDataSourceV2Exec.scala:248)
   	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
   	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
   	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:98)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:118)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:195)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:103)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:512)
   	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:104)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:512)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:31)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:488)
   	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
   	at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:856)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:311)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:247)
   	at org.apache.hudi.commit.DatasetBulkInsertCommitActionExecutor.doExecute(DatasetBulkInsertCommitActionExecutor.java:81)
   	at org.apache.hudi.commit.BaseDatasetBulkInsertCommitActionExecutor.execute(BaseDatasetBulkInsertCommitActionExecutor.java:102)
   	at org.apache.hudi.HoodieSparkSqlWriter$.bulkInsertAsRow(HoodieSparkSqlWriter.scala:910)
   	at org.apache.hudi.HoodieSparkSqlWriter$.writeInternal(HoodieSparkSqlWriter.scala:409)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:132)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:150)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:47)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:98)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:118)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:195)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:103)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
   	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:512)
   	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:104)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:512)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:31)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
   	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
   	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
   	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:488)
   	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
   	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
   	at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:856)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:387)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:360)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:239)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
   	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieException: Failed to update metadata
   	at org.apache.hudi.internal.DataSourceInternalWriterHelper.commit(DataSourceInternalWriterHelper.java:92)
   	at org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite.commit(HoodieDataSourceInternalBatchWrite.java:92)
   	at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2(WriteToDataSourceV2Exec.scala:422)
   	... 79 more
   	Suppressed: org.apache.hudi.exception.HoodieRollbackException: Failed to rollback hdfs://localhost:9000/hudi/data/test5 commits 20231224122443495
   		at org.apache.hudi.client.BaseHoodieTableServiceClient.rollback(BaseHoodieTableServiceClient.java:1064)
   		at org.apache.hudi.client.BaseHoodieTableServiceClient.rollback(BaseHoodieTableServiceClient.java:1011)
   		at org.apache.hudi.client.BaseHoodieWriteClient.rollback(BaseHoodieWriteClient.java:771)
   		at org.apache.hudi.internal.DataSourceInternalWriterHelper.abort(DataSourceInternalWriterHelper.java:100)
   		at org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite.abort(HoodieDataSourceInternalBatchWrite.java:97)
   		at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2(WriteToDataSourceV2Exec.scala:429)
   		... 79 more
   	Caused by: org.apache.hudi.exception.HoodieException: Failed to apply rollbacks in metadata
   		at org.apache.hudi.table.action.BaseActionExecutor.writeTableMetadata(BaseActionExecutor.java:110)
   		at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.finishRollback(BaseRollbackActionExecutor.java:255)
   		at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.runRollback(BaseRollbackActionExecutor.java:117)
   		at org.apache.hudi.table.action.rollback.BaseRollbackActionExecutor.execute(BaseRollbackActionExecutor.java:138)
   		at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.rollback(HoodieSparkCopyOnWriteTable.java:298)
   		at org.apache.hudi.client.BaseHoodieTableServiceClient.rollback(BaseHoodieTableServiceClient.java:1047)
   		... 84 more
   	Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 31.0 failed 1 times, most recent failure: Lost task 0.0 in stage 31.0 (TID 57) (192.168.1.171 executor driver): org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:342)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.2_1-0-1
   	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   	at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   	... 29 more
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   
   Driver stacktrace:
   		at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2785)
   		at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2721)
   		at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2720)
   		at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
   		at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
   		at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
   		at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2720)
   		at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1206)
   		at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1206)
   		at scala.Option.foreach(Option.scala:407)
   		at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1206)
   		at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2984)
   		at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2923)
   		at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2912)
   		at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   		at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:971)
   		at org.apache.spark.SparkContext.runJob(SparkContext.scala:2263)
   		at org.apache.spark.SparkContext.runJob(SparkContext.scala:2284)
   		at org.apache.spark.SparkContext.runJob(SparkContext.scala:2303)
   		at org.apache.spark.SparkContext.runJob(SparkContext.scala:2328)
   		at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1019)
   		at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   		at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
   		at org.apache.spark.rdd.RDD.withScope(RDD.scala:405)
   		at org.apache.spark.rdd.RDD.collect(RDD.scala:1018)
   		at org.apache.spark.api.java.JavaRDDLike.collect(JavaRDDLike.scala:362)
   		at org.apache.spark.api.java.JavaRDDLike.collect$(JavaRDDLike.scala:361)
   		at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
   		at org.apache.hudi.data.HoodieJavaRDD.collectAsList(HoodieJavaRDD.java:177)
   		at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.setCommitMetadata(BaseSparkCommitActionExecutor.java:289)
   		at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.autoCommit(BaseCommitActionExecutor.java:197)
   		at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.commitOnAutoCommit(BaseCommitActionExecutor.java:183)
   		at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.updateIndexAndCommitIfNeeded(BaseSparkCommitActionExecutor.java:279)
   		at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:184)
   		at org.apache.hudi.table.action.deltacommit.SparkUpsertPreppedDeltaCommitActionExecutor.execute(SparkUpsertPreppedDeltaCommitActionExecutor.java:44)
   		at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:126)
   		at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:88)
   		at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:156)
   		at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:63)
   		at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.commitInternal(HoodieBackedTableMetadataWriter.java:1132)
   		at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.commit(SparkHoodieBackedTableMetadataWriter.java:117)
   		at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.processAndCommit(HoodieBackedTableMetadataWriter.java:855)
   		at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.update(HoodieBackedTableMetadataWriter.java:1032)
   		at org.apache.hudi.table.action.BaseActionExecutor.writeTableMetadata(BaseActionExecutor.java:105)
   		... 89 more
   	Caused by: org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
   		at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:342)
   		at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   		at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   		at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   		at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   		at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   		at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   		at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   		at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   		at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   		at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   		at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   		at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   		at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   		at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   		at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   		at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   		at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   		at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   		at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   		at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   		at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   		at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   		at org.apache.spark.scheduler.Task.run(Task.scala:139)
   		at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   		at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   		at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   		... 1 more
   	Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.2_1-0-1
   		at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   		at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   		at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   		at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   		... 29 more
   	Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   		at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   		at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   		at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   		at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   		at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   		at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   Caused by: org.apache.hudi.exception.HoodieException: Failed to update metadata
   	at org.apache.hudi.client.BaseHoodieWriteClient.writeTableMetadata(BaseHoodieWriteClient.java:367)
   	at org.apache.hudi.client.BaseHoodieWriteClient.commit(BaseHoodieWriteClient.java:285)
   	at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:236)
   	at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:211)
   	at org.apache.hudi.internal.DataSourceInternalWriterHelper.commit(DataSourceInternalWriterHelper.java:89)
   	... 81 more
   Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 15.0 failed 1 times, most recent failure: Lost task 0.0 in stage 15.0 (TID 19) (192.168.1.171 executor driver): org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:342)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:750)
   Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.1_0-0-0
   	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   	at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   	... 29 more
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
   
   Driver stacktrace:
   	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2785)
   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2721)
   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2720)
   	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
   	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
   	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
   	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2720)
   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1206)
   	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1206)
   	at scala.Option.foreach(Option.scala:407)
   	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1206)
   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2984)
   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2923)
   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2912)
   	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:971)
   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2263)
   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2284)
   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2303)
   	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2328)
   	at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1019)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
   	at org.apache.spark.rdd.RDD.withScope(RDD.scala:405)
   	at org.apache.spark.rdd.RDD.collect(RDD.scala:1018)
   	at org.apache.spark.api.java.JavaRDDLike.collect(JavaRDDLike.scala:362)
   	at org.apache.spark.api.java.JavaRDDLike.collect$(JavaRDDLike.scala:361)
   	at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:45)
   	at org.apache.hudi.data.HoodieJavaRDD.collectAsList(HoodieJavaRDD.java:177)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.setCommitMetadata(BaseSparkCommitActionExecutor.java:289)
   	at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.autoCommit(BaseCommitActionExecutor.java:197)
   	at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.commitOnAutoCommit(BaseCommitActionExecutor.java:183)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.updateIndexAndCommitIfNeeded(BaseSparkCommitActionExecutor.java:279)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:184)
   	at org.apache.hudi.table.action.deltacommit.SparkUpsertPreppedDeltaCommitActionExecutor.execute(SparkUpsertPreppedDeltaCommitActionExecutor.java:44)
   	at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:126)
   	at org.apache.hudi.table.HoodieSparkMergeOnReadTable.upsertPrepped(HoodieSparkMergeOnReadTable.java:88)
   	at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:156)
   	at org.apache.hudi.client.SparkRDDWriteClient.upsertPreppedRecords(SparkRDDWriteClient.java:63)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.commitInternal(HoodieBackedTableMetadataWriter.java:1132)
   	at org.apache.hudi.metadata.SparkHoodieBackedTableMetadataWriter.commit(SparkHoodieBackedTableMetadataWriter.java:117)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.processAndCommit(HoodieBackedTableMetadataWriter.java:855)
   	at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.update(HoodieBackedTableMetadataWriter.java:910)
   	at org.apache.hudi.client.BaseHoodieWriteClient.writeTableMetadata(BaseHoodieWriteClient.java:362)
   	... 85 more
   Caused by: org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:342)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$mapPartitionsAsRDD$a3ab3c4$1(BaseSparkCommitActionExecutor.java:257)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:905)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:905)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:377)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1552)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1462)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1526)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1349)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:375)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:326)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
   	at org.apache.spark.scheduler.Task.run(Task.scala:139)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	... 1 more
   Caused by: org.apache.hudi.exception.HoodieAppendException: Failed while appending records to hdfs://localhost:9000/hudi/data/test5/.hoodie/metadata/files/.files-0000-0_00000000000000010.log.1_0-0-0
   	at org.apache.hudi.io.HoodieAppendHandle.appendDataAndDeleteBlocks(HoodieAppendHandle.java:487)
   	at org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:450)
   	at org.apache.hudi.table.action.deltacommit.BaseSparkDeltaCommitActionExecutor.handleUpdate(BaseSparkDeltaCommitActionExecutor.java:83)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:335)
   	... 29 more
   Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]], original=[DatanodeInfoWithStorage[127.0.0.1:9866,DS-56d60cd9-579e-4d47-942a-1d1cd47448c9,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
   	at org.apache.hadoop.hdfs.DataStreamer.findNewDatanode(DataStreamer.java:1352)
   	at org.apache.hadoop.hdfs.DataStreamer.addDatanode2ExistingPipeline(DataStreamer.java:1420)
   	at org.apache.hadoop.hdfs.DataStreamer.handleDatanodeReplacement(DataStreamer.java:1646)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1547)
   	at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1529)
   	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:717)
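
   For context on the root cause repeated above: the HDFS client cannot replace a bad datanode in the write pipeline, which is typical of a single-datanode (or under-replicated) cluster, because the DEFAULT replacement policy insists on finding another datanode. Below is a minimal sketch of relaxing that policy from the Spark side, assuming `spark` is the active SparkSession and that running without pipeline recovery is acceptable (e.g. a local test HDFS); the property name comes directly from the IOException message.

   ```
   // Hedged sketch: relax the HDFS datanode-replacement policy for a single-datanode setup.
   // "NEVER" tells the client not to look for a replacement datanode when the pipeline fails;
   // only use this where reduced write durability is acceptable.
   spark.sparkContext.hadoopConfiguration
     .set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER")
   // The same property can instead be set in hdfs-site.xml on the client machines.
   ```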
   



Re: [I] [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()" [hudi]

Posted by "daishuhuaimu (via GitHub)" <gi...@apache.org>.
daishuhuaimu commented on issue #5765:
URL: https://github.com/apache/hudi/issues/5765#issuecomment-1990674913

   On spark-3.3.4 with hudi-0.14.0 this is still not fixed. Setting hoodie.metadata.enable=false can be used as a temporary workaround.
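
   A minimal sketch of that workaround, assuming writes go through Spark SQL or the DataFrame API (`df` and the target path are illustrative; the path is the one from the trace above). Disabling the metadata table sidesteps the failing metadata-table append at the cost of its file-listing optimizations:

   ```
   // Hedged sketch: disable the Hudi metadata table as a temporary workaround.
   // For a Spark SQL session:
   spark.sql("set hoodie.metadata.enable=false")

   // Or per write with the DataFrame API (df and the path are illustrative):
   df.write.format("hudi")
     .option("hoodie.metadata.enable", "false")
     .mode("append")
     .save("hdfs://localhost:9000/hudi/data/test5")
   ```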

