Posted to commits@hudi.apache.org by "Wechar (Jira)" <ji...@apache.org> on 2023/03/17 14:39:00 UTC

[jira] [Created] (HUDI-5949) Check the write operation configured by the user for better troubleshooting

Wechar created HUDI-5949:
----------------------------

             Summary: Check the write operation configured by the user for better troubleshooting
                 Key: HUDI-5949
                 URL: https://issues.apache.org/jira/browse/HUDI-5949
             Project: Apache Hudi
          Issue Type: Improvement
          Components: configs
            Reporter: Wechar


*Background:*

We found that inserting data through Spark-Hudi can fail with *HoodieException: (Part -) field not found in record. Acceptable fields were :[uuid, name, price]*:
{code:bash}
org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20230317222153522
	at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:64)
	......
	at org.apache.hudi.index.simple.HoodieSimpleIndex.fetchRecordLocationsForAffectedPartitions(HoodieSimpleIndex.java:142)
	at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocationInternal(HoodieSimpleIndex.java:113)
	at org.apache.hudi.index.simple.HoodieSimpleIndex.tagLocation(HoodieSimpleIndex.java:91)
	at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:51)
	at org.apache.hudi.table.action.commit.HoodieWriteHelper.tag(HoodieWriteHelper.java:34)
	at org.apache.hudi.table.action.commit.BaseWriteHelper.write(BaseWriteHelper.java:53)
	... 52 more
Caused by: org.apache.hudi.exception.HoodieException: (Part -) field not found in record. Acceptable fields were :[uuid, name, price]
	at org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:530)
	at org.apache.hudi.HoodieSparkSqlWriter$.$anonfun$write$11(HoodieSparkSqlWriter.scala:305)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:194)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62)
	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1509)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}
*Steps to Reproduce:*
{code:sql}
-- 1. create a table without preCombineKey
CREATE TABLE default.test_hudi_default (
  uuid int,
  name string,
  price double
) USING hudi;

-- 2. set the write operation to upsert
set hoodie.datasource.write.operation=upsert;

-- 3. insert data; the exception above occurs
insert into default.test_hudi_default select 1, 'name1', 1.1;
{code}

*Root Cause:*
Hudi does not support the upsert operation on a table that has no preCombineKey. Judging from the trace above, the writer still tries to read a preCombine field from each record ({{HoodieAvroUtils.getNestedFieldVal}}), and since no such field exists among [uuid, name, price], the generic "field not found" exception surfaces and may confuse users.
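
For contrast, here is a minimal sketch of an upsert that works because the table does define a preCombine field. It uses the Spark DataSource path rather than SQL; the {{ts}} column, record key, table name, and base path below are assumptions for illustration only.
{code:java}
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;

// Sketch only: upsert succeeds when a preCombine field is configured.
// The ts column, record key, table name, and basePath are assumed here.
public class UpsertWithPreCombineExample {
  public static void write(Dataset<Row> df, String basePath) {
    df.write().format("hudi")
        .option("hoodie.table.name", "test_hudi_precombine")
        .option("hoodie.datasource.write.recordkey.field", "uuid")
        .option("hoodie.datasource.write.precombine.field", "ts")
        .option("hoodie.datasource.write.operation", "upsert")
        .mode(SaveMode.Append)
        .save(basePath);
  }
}
{code}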

*Improvement:*
We can validate the user-configured write operation before writing and throw a more specific exception, which will help users understand what is wrong immediately.
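
A minimal sketch of such a check (the helper name and its placement are hypothetical, not Hudi's actual API; the real validation would live wherever the write config is resolved, e.g. around {{HoodieSparkSqlWriter}}):
{code:java}
import org.apache.hudi.common.model.WriteOperationType;
import org.apache.hudi.exception.HoodieException;

// Hypothetical helper; the class and method names are illustrative only.
final class WriteOperationValidator {
  static void validate(WriteOperationType operation, String preCombineField) {
    // Fail fast with an actionable message instead of the opaque
    // "field not found in record" error thrown deep inside index tagging.
    if (operation == WriteOperationType.UPSERT
        && (preCombineField == null || preCombineField.isEmpty())) {
      throw new HoodieException(
          "hoodie.datasource.write.operation is set to 'upsert', but the table defines no "
              + "preCombineKey. Recreate the table with a preCombineField, or use 'insert'.");
    }
  }
}
{code}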



--
This message was sent by Atlassian Jira
(v8.20.10#820010)