You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by "Alexey Kudinkin (Jira)" <ji...@apache.org> on 2022/03/04 20:33:00 UTC
[jira] [Created] (HUDI-3561) Task Failed to Serialize due to ConcurrentModificationException
Alexey Kudinkin created HUDI-3561:
-------------------------------------
Summary: Task Failed to Serialize due to ConcurrentModificationException
Key: HUDI-3561
URL: https://issues.apache.org/jira/browse/HUDI-3561
Project: Apache Hudi
Issue Type: Bug
Reporter: Alexey Kudinkin
Occasionally tests are observed failing with ConcurrentModificationException while iterating over some Map that is being serialized as part of the Spark closure serialization.
In this particular case "Test Call run_clustering Procedure By Table" (TestCallProcedure) was failing:
{code:java}
- Test Call run_clustering Procedure By Table *** FAILED ***
java.util.concurrent.CompletionException: org.apache.spark.SparkException: Task not serializable
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
at java.lang.Thread.run(Thread.java:750)
...
Cause: org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:403)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:393)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:850)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:849)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.mapPartitionsWithIndex(RDD.scala:849)
...
Cause: java.util.ConcurrentModificationException:
at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
at java.util.HashSet.writeObject(HashSet.java:287)
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1154)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
...
{code}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)