Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/09/09 02:35:07 UTC

[GitHub] [hudi] melin opened a new issue #3624: Failed to delete the partition table record

melin opened a new issue #3624:
URL: https://github.com/apache/hudi/issues/3624


   ```
   CREATE TABLE `bigdata`.`test_hudi_dt` (
     `_hoodie_commit_time` STRING,
     `_hoodie_commit_seqno` STRING,
     `_hoodie_record_key` STRING,
     `_hoodie_partition_path` STRING,
     `_hoodie_file_name` STRING,
     `id` INT COMMENT '',
     `name` STRING COMMENT '',
     `price` DOUBLE COMMENT '',
     `ds` STRING COMMENT '')
   USING hudi
   OPTIONS (
     `hoodie.payload.ordering.field` 'ds',
     `hoodie.datasource.write.precombine.field` 'ds',
     `hoodie.metadata.enable` 'true',
     `hoodie.parquet.compression.codec` 'zstd',
     `primaryKey` 'id',
     `type` 'cow',
     `hoodie.payload.event.time.field` 'ds')
   PARTITIONED BY (ds)
   TBLPROPERTIES (
     'last_commit_time_sync' = '20210908145419');
   
   insert into table test_hudi_dt values(3, 'xxx', 22, '2021-05-06');
   delete from test_hudi_dt where id=3 and ds='2021-05-06';
   
   ```
   
   
   ```
   21/09/09 10:17:31 INFO TaskSetManager: Starting task 0.0 in stage 2107.0 (TID 105280) (hadoop-test-dn2-9-14, executor 20, partition 0, NODE_LOCAL, 4260 bytes) taskResourceAssignments Map()
   21/09/09 10:17:31 INFO BlockManagerInfo: Added broadcast_1396_piece0 in memory on hadoop-test-dn2-9-14:40075 (size: 221.7 KiB, free: 911.9 MiB)
   21/09/09 10:17:31 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 523 to 10.10.9.14:49000
   21/09/09 10:17:31 WARN TaskSetManager: Lost task 0.0 in stage 2107.0 (TID 105280) (hadoop-test-dn2-9-14 executor 20): org.apache.hudi.exception.HoodieUpsertException: Error upserting bucketType UPDATE for partition :0
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:305)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.lambda$execute$ecf5068c$1(BaseSparkCommitActionExecutor.java:156)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1(JavaRDDLike.scala:102)
   	at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsWithIndex$1$adapted(JavaRDDLike.scala:102)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2(RDD.scala:915)
   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndex$2$adapted(RDD.scala:915)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   	at org.apache.spark.rdd.RDD.$anonfun$getOrCompute$1(RDD.scala:386)
   	at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1440)
   	at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1350)
   	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1414)
   	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1237)
   	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:335)
   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   	at org.apache.spark.scheduler.Task.run(Task.scala:131)
   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:498)
   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:501)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.avro.SchemaParseException: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'StructType': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
    at [Source: (String)"StructType()"; line: 1, column: 11]
   	at org.apache.avro.Schema$Parser.parse(Schema.java:1432)
   	at org.apache.avro.Schema$Parser.parse(Schema.java:1418)
   	at org.apache.hudi.io.HoodieWriteHandle.getSpecifiedTableSchema(HoodieWriteHandle.java:131)
   	at org.apache.hudi.io.HoodieWriteHandle.lambda$new$0(HoodieWriteHandle.java:114)
   	at org.apache.hudi.common.util.Option.orElseGet(Option.java:116)
   	at org.apache.hudi.io.HoodieWriteHandle.<init>(HoodieWriteHandle.java:114)
   	at org.apache.hudi.io.HoodieWriteHandle.<init>(HoodieWriteHandle.java:104)
   	at org.apache.hudi.io.HoodieMergeHandle.<init>(HoodieMergeHandle.java:121)
   	at org.apache.hudi.io.HoodieMergeHandle.<init>(HoodieMergeHandle.java:114)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.getUpdateHandle(BaseSparkCommitActionExecutor.java:353)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpdate(BaseSparkCommitActionExecutor.java:324)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.handleUpsertPartition(BaseSparkCommitActionExecutor.java:298)
   	... 28 more
   Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'StructType': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
    at [Source: (String)"StructType()"; line: 1, column: 11]
   	at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:2337)
   	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:720)
   	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._reportInvalidToken(ReaderBasedJsonParser.java:2903)
   	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1949)
   	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:781)
   	at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2902)
   	at org.apache.avro.Schema$Parser.parse(Schema.java:1430)
   	... 39 more
   
   ```
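
The innermost exception in the trace above says that the literal string `StructType()` was handed to a JSON parser where an Avro schema was expected. The sketch below only illustrates that last step with Python's standard `json` module. It does not reproduce the Hudi code path and does not explain why the schema string was empty. The helper name `looks_like_json_schema` and the sample record schema are hypothetical and are not taken from Hudi or Avro.

```python
import json


def looks_like_json_schema(text: str) -> bool:
    """Return True if `text` parses as JSON.

    An Avro schema string has to be valid JSON before it can be a valid
    schema, so this is a necessary check, not a sufficient one.
    """
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# The exact source string reported in the stack trace.
schema_from_trace = "StructType()"

# Hypothetical example of what a JSON-encoded Avro record schema looks like.
sample_avro_schema = (
    '{"type": "record", "name": "example_record", '
    '"fields": [{"name": "id", "type": "int"}]}'
)

print(looks_like_json_schema(schema_from_trace))
print(looks_like_json_schema(sample_avro_schema))
```

Under these assumptions, the string from the trace fails at the JSON level before any Avro-specific validation could run, which matches the `JsonParseException` wrapped by `SchemaParseException` above.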
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] xushiyan commented on issue #3624: Failed to delete the partition table record from spark-sql

Posted by GitBox <gi...@apache.org>.
xushiyan commented on issue #3624:
URL: https://github.com/apache/hudi/issues/3624#issuecomment-1001120416


   Closing due to inactivity.





[GitHub] [hudi] xushiyan closed issue #3624: Failed to delete the partition table record from spark-sql

Posted by GitBox <gi...@apache.org>.
xushiyan closed issue #3624:
URL: https://github.com/apache/hudi/issues/3624


   





[GitHub] [hudi] xushiyan commented on issue #3624: Failed to delete the partition table record

Posted by GitBox <gi...@apache.org>.
xushiyan commented on issue #3624:
URL: https://github.com/apache/hudi/issues/3624#issuecomment-928837505


   @melin could you provide more version info, such as the Hudi and Spark versions? Also, is this running on cloud or in another environment?
   
   > The reason for this error is that a partition was deleted. If you insert data into the deleted partition again, there is no problem executing the delete.
   
   This sounds odd. From the example you provided, the partition dropped is `2021-05-05`, not `2021-05-06`, but the error came from deleting a record in `2021-05-06`? Let me know if I got this right. Thanks.
   
   




