Posted to commits@hudi.apache.org by "Alexey Kudinkin (Jira)" <ji...@apache.org> on 2022/03/04 20:33:00 UTC

[jira] [Created] (HUDI-3561) Task Failed to Serialize due to ConcurrentModificationException

Alexey Kudinkin created HUDI-3561:
-------------------------------------

             Summary: Task Failed to Serialize due to ConcurrentModificationException
                 Key: HUDI-3561
                 URL: https://issues.apache.org/jira/browse/HUDI-3561
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Alexey Kudinkin


Occasionally, tests fail with a ConcurrentModificationException thrown while iterating over a Map that is being serialized as part of Spark closure serialization.

In this particular case, "Test Call run_clustering Procedure By Table" (in TestCallProcedure) failed:

{code:java}
- Test Call run_clustering Procedure By Table *** FAILED ***
  java.util.concurrent.CompletionException: org.apache.spark.SparkException: Task not serializable
  at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
  at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
  at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
  at java.lang.Thread.run(Thread.java:750)
  ...
  Cause: org.apache.spark.SparkException: Task not serializable
  at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:403)
  at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:393)
  at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
  at org.apache.spark.SparkContext.clean(SparkContext.scala:2326)
  at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:850)
  at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:849)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
  at org.apache.spark.rdd.RDD.mapPartitionsWithIndex(RDD.scala:849)
  ...
  Cause: java.util.ConcurrentModificationException:
  at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
  at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
  at java.util.HashSet.writeObject(HashSet.java:287)
  at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1154)
  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
  at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
  ...
 {code}
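For context on the bottom frames of the trace: iterators over a LinkedHashMap (the backing store of a LinkedHashSet) are fail-fast, so any structural modification of the map while an iterator over it is live makes the next iterator step throw ConcurrentModificationException. A minimal, single-threaded sketch of that mechanism (not Hudi code, illustrative names only):

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

public class FailFastDemo {
    // Returns true when mutating the set mid-iteration triggers the
    // fail-fast ConcurrentModificationException seen in the trace.
    static boolean mutationDuringIterationThrows() {
        Set<String> set = new LinkedHashSet<>();
        set.add("a");
        set.add("b");
        Iterator<String> it = set.iterator();
        it.next();          // start iterating
        set.add("c");       // structural modification while the iterator is live
        try {
            it.next();      // the fail-fast modCount check fires here
            return false;
        } catch (java.util.ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(mutationDuringIterationThrows()); // prints "true"
    }
}
```

In the failing test the mutation presumably comes from another thread touching the map while Spark's closure serializer iterates it, which is why the failure is intermittent.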
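The trace enters via HashSet.writeObject because Java serialization iterates the set's backing map while writing each element, and a concurrent writer mutating the set mid-serialization trips the same fail-fast check. The sketch below (hypothetical class names, not Hudi code) reproduces that deterministically by letting an element mutate its owning set from its own writeObject, standing in for the racing thread:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.LinkedHashSet;
import java.util.Set;

public class SerializationCmeDemo {
    // Hypothetical element type whose writeObject mutates the set that owns
    // it -- a stand-in for a concurrent writer racing the serializer.
    static class MutatesOwner implements Serializable {
        static Set<Object> owner; // static, so it is not itself serialized
        private void writeObject(ObjectOutputStream out) throws IOException {
            // Structural change while HashSet.writeObject is iterating:
            owner.add("added-mid-serialization");
            out.defaultWriteObject();
        }
    }

    // Returns true iff serializing the set throws ConcurrentModificationException.
    static boolean serializationThrowsCme() throws IOException {
        Set<Object> set = new LinkedHashSet<>();
        MutatesOwner.owner = set;
        set.add(new MutatesOwner()); // serialized first (insertion order), triggers the mutation
        set.add("second");           // guarantees the iterator still has a next step
        ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream());
        try {
            oos.writeObject(set); // HashSet.writeObject iterates the backing LinkedHashMap...
            return false;
        } catch (java.util.ConcurrentModificationException e) {
            return true;          // ...and the mid-iteration add trips the fail-fast check
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(serializationThrowsCme()); // prints "true"
    }
}
```

This yields the same bottom frames as the report (ConcurrentModificationException out of LinkedHashMap's iterator inside HashSet.writeObject), which Spark's ClosureCleaner then surfaces as "Task not serializable".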

--
This message was sent by Atlassian Jira
(v8.20.1#820001)