Posted to commits@hudi.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/03/17 14:54:00 UTC

[jira] [Updated] (HUDI-5949) Check the write operation configured by user for better troubleshooting

     [ https://issues.apache.org/jira/browse/HUDI-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-5949:
---------------------------------
    Labels: pull-request-available  (was: )

> Check the write operation configured by user for better troubleshooting
> -----------------------------------------------------------------------
>
>                 Key: HUDI-5949
>                 URL: https://issues.apache.org/jira/browse/HUDI-5949
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: configs
>            Reporter: Wechar
>            Priority: Blocker
>              Labels: pull-request-available
>
>  *Background:*
> We found that inserting data into a Hudi table via Spark can fail with *HoodieException: (Part -) field not found in record. Acceptable fields were :[uuid, name, price]*
> {code:bash}
>   ......
> 	at org.apache.hudi.index.simple.HoodieSimpleIndex.fetchRecordLocationsForAffectedPartitions(HoodieSimpleIndex.java:142)
> 	at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocationInternal(HoodieSimpleIndex.java:113)
> 	at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocation(HoodieSimpleIndex.java:91)
> 	at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:51)
> 	at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:34)
> 	at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:53)
> 	... 52 more
> Caused by: org.apache.hudi.exception.HoodieException: (Part -) field not found in record. Acceptable fields were :[uuid, name, price]
> 	at org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:530)
> 	at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$write$11(HoodieSparkSqlWriter.scala:305)
> 	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
> 	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
> 	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
> 	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
> 	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:131)
> 	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
> 	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1509)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20230317222153522
> 	at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:64)
> {code}
> *Steps to Reproduce:*
> {code:sql}
> -- 1. create a table without preCombineKey
> CREATE TABLE default.test_hudi_default (
>   uuid int,
>   name string,
>   price double
> ) USING hudi;
> -- 2. config write operation to upsert
> set hoodie.datasource.write.operation=upsert;
> -- 3. insert data and exception occurs
> insert into default.test_hudi_default select 1, 'name1', 1.1;
> {code}
> *Root Cause:*
> Hudi does not support upsert on a table without a preCombineKey, but this exception message may confuse users, since it says nothing about the write operation or the missing preCombine field.
> *Improvement:*
> We can validate the user-configured write operation up front and throw a more specific exception, so that users immediately understand what is wrong.
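> For illustration, the proposed check could look like the following minimal sketch in plain Java. This is not actual Hudi code: the `validate` method, its parameters, and the message wording are all hypothetical; it only demonstrates failing fast with an actionable message when `upsert` is configured against a table that defines no preCombine field.

```java
import java.util.List;

public class WriteOperationCheck {

    // Hypothetical pre-flight check: if the user configured "upsert" but the
    // table schema contains no preCombine field, return an actionable error
    // message instead of letting the write fail later with the generic
    // "field not found in record" exception. Returns null when the
    // configuration is consistent.
    static String validate(String operation, String preCombineField, List<String> fields) {
        if ("upsert".equalsIgnoreCase(operation)
                && (preCombineField == null || !fields.contains(preCombineField))) {
            return "Write operation 'upsert' requires a preCombine field, but the table "
                 + "defines none. Table fields: " + fields
                 + ". Use operation 'insert', or recreate the table with a preCombineKey.";
        }
        return null;
    }

    public static void main(String[] args) {
        // Mirrors the reproduction above: table (uuid, name, price), no preCombineKey,
        // and hoodie.datasource.write.operation set to upsert.
        String msg = validate("upsert", null, List.of("uuid", "name", "price"));
        System.out.println(msg);
    }
}
```

> Running the check against a consistent configuration (operation `insert`, or `upsert` with a preCombine field present in the schema) returns null and lets the write proceed.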



--
This message was sent by Atlassian Jira
(v8.20.10#820010)