Posted to commits@hudi.apache.org by "Raymond Xu (Jira)" <ji...@apache.org> on 2022/09/07 09:07:00 UTC
[jira] [Updated] (HUDI-4330) NPE when trying to upsert into a dataset with no Meta Fields

     [ https://issues.apache.org/jira/browse/HUDI-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-4330:
-----------------------------
    Sprint:   (was: 2022/09/05)

> NPE when trying to upsert into a dataset with no Meta Fields
> ------------------------------------------------------------
>
>                 Key: HUDI-4330
>                 URL: https://issues.apache.org/jira/browse/HUDI-4330
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Alexey Kudinkin
>            Assignee: Raymond Xu
>            Priority: Critical
>             Fix For: 0.12.1
>
>
> When trying to upsert into a dataset with Meta Fields disabled, you will encounter an obscure NPE like the one below:
> {code:java}
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 25 in stage 20.0 failed 4 times, most recent failure: Lost task 25.3 in stage 20.0 (TID 4110) (ip-172-31-20-53.us-west-2.compute.internal executor 7): java.lang.RuntimeException: org.apache.hudi.exception.HoodieIndexException: Error checking bloom filter index.
>         at org.apache.hudi.client.utils.LazyIterableIterator.next(LazyIterableIterator.java:121)
>         at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:46)
>         at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
>         at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
>         at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:513)
>         at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
>         at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:140)
>         at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
>         at org.apache.spark.scheduler.Task.run(Task.scala:131)
>         at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
>         at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:750)
> Caused by: org.apache.hudi.exception.HoodieIndexException: Error checking bloom filter index.
>         at org.apache.hudi.index.bloom.HoodieBloomIndexCheckFunction$LazyKeyCheckIterator.computeNext(HoodieBloomIndexCheckFunction.java:110)
>         at org.apache.hudi.index.bloom.HoodieBloomIndexCheckFunction$LazyKeyCheckIterator.computeNext(HoodieBloomIndexCheckFunction.java:60)
>         at org.apache.hudi.client.utils.LazyIterableIterator.next(LazyIterableIterator.java:119)
>         ... 16 more
> Caused by: java.lang.NullPointerException
>         at org.apache.hudi.io.HoodieKeyLookupHandle.addKey(HoodieKeyLookupHandle.java:88)
>         at org.apache.hudi.index.bloom.HoodieBloomIndexCheckFunction$LazyKeyCheckIterator.computeNext(HoodieBloomIndexCheckFunction.java:92)
>         ... 18 more
> {code}
> Instead, we could fail with an explicit error that states why this happened (meta fields disabled -> no bloom filter created -> unable to do upserts).
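
The kind of explicit failure suggested above could look roughly like the following. This is a minimal, self-contained sketch and not the change made for this issue: the class MetaFieldsGuard and the method validateUpsert are hypothetical, the write config is modeled as a plain Map rather than Hudi's own config classes, and the fallback values used when a key is absent are assumptions of the sketch. Only the two config key names are taken from Hudi.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hudi: fails fast with a descriptive message
// instead of letting a bloom-index lookup hit a NullPointerException later.
final class MetaFieldsGuard {
    // Hudi config keys; everything else in this class is illustrative.
    static final String POPULATE_META_FIELDS = "hoodie.populate.meta.fields";
    static final String INDEX_TYPE = "hoodie.index.type";

    static void validateUpsert(Map<String, String> props) {
        // Assumption of this sketch: meta fields count as enabled when the key is absent.
        boolean metaFieldsEnabled =
            Boolean.parseBoolean(props.getOrDefault(POPULATE_META_FIELDS, "true"));
        // Assumption of this sketch: a missing index type is treated as BLOOM.
        String indexType = props.getOrDefault(INDEX_TYPE, "BLOOM");
        boolean bloomBased = indexType.equalsIgnoreCase("BLOOM")
            || indexType.equalsIgnoreCase("GLOBAL_BLOOM");

        if (!metaFieldsEnabled && bloomBased) {
            // Spell out the causal chain from the issue description.
            throw new IllegalStateException(
                "Cannot upsert with index type " + indexType + ": "
                + POPULATE_META_FIELDS + "=false, so no bloom filter was written "
                + "to the base files and keys cannot be looked up.");
        }
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(POPULATE_META_FIELDS, "false");
        props.put(INDEX_TYPE, "BLOOM");
        try {
            validateUpsert(props);
        } catch (IllegalStateException e) {
            // Print the explicit message rather than a bare NPE stack trace.
            System.out.println(e.getMessage());
        }
    }
}
```

Running such a check before the index lookup starts would surface the misconfiguration on the driver, rather than as a task failure that is retried four times on executors.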



--
This message was sent by Atlassian Jira
(v8.20.10#820010)