Posted to issues@spark.apache.org by "Saisai Shao (Jira)" <ji...@apache.org> on 2020/02/14 07:14:00 UTC

[jira] [Comment Edited] (SPARK-30586) NPE in LiveRDDDistribution (AppStatusListener)

    [ https://issues.apache.org/jira/browse/SPARK-30586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036734#comment-17036734 ] 

Saisai Shao edited comment on SPARK-30586 at 2/14/20 7:13 AM:
--------------------------------------------------------------

We also hit the same issue. It seems the code doesn't check whether the string is null before calling String intern, so the NPE is thrown from Guava. My first thought is to add a null check in {{weakIntern}}. Still investigating how this could happen; it might be due to a lost or out-of-order Spark listener event.

CC [~vanzin]
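
A minimal sketch of what such a null check could look like (only an illustration against Guava's interner API; the real helper is {{LiveEntityHelpers.weakIntern}} in {{LiveEntity.scala}}, and the object name below is made up):

{code:scala}
import com.google.common.collect.Interners
// Note: Spark 2.4.x shades Guava as org.spark_project.guava; the plain
// import is used here only so the sketch stands alone.

// Hypothetical stand-in for org.apache.spark.status.LiveEntityHelpers.
object WeakInternSketch {
  private val stringInterner = Interners.newWeakInterner[String]()

  // Interner.intern calls Preconditions.checkNotNull, so a null name
  // (e.g. from a lost or out-of-order listener event) fails with the
  // NPE reported in this issue. Guard before interning.
  def weakIntern(s: String): String =
    if (s == null) null else stringInterner.intern(s)
}
{code}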


was (Author: jerryshao):
We also hit the same issue. It seems the code doesn't check whether the string is null before calling String intern, so the NPE is thrown from Guava. My first thought is to add a null check in {{weakIntern}}. Still investigating how this could happen; it might be due to a lost or out-of-order Spark listener event.

> NPE in LiveRDDDistribution (AppStatusListener)
> ----------------------------------------------
>
>                 Key: SPARK-30586
>                 URL: https://issues.apache.org/jira/browse/SPARK-30586
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4
>         Environment: A Hadoop cluster consisting of Centos 7.4 machines.
>            Reporter: Jan Van den bosch
>            Priority: Major
>
> We've been noticing a great amount of NullPointerExceptions in our long-running Spark job driver logs:
> {noformat}
> 20/01/17 23:40:12 ERROR AsyncEventQueue: Listener AppStatusListener threw an exception
> java.lang.NullPointerException
>         at org.spark_project.guava.base.Preconditions.checkNotNull(Preconditions.java:191)
>         at org.spark_project.guava.collect.MapMakerInternalMap.putIfAbsent(MapMakerInternalMap.java:3507)
>         at org.spark_project.guava.collect.Interners$WeakInterner.intern(Interners.java:85)
>         at org.apache.spark.status.LiveEntityHelpers$.weakIntern(LiveEntity.scala:603)
>         at org.apache.spark.status.LiveRDDDistribution.toApi(LiveEntity.scala:486)
>         at org.apache.spark.status.LiveRDD$$anonfun$2.apply(LiveEntity.scala:548)
>         at org.apache.spark.status.LiveRDD$$anonfun$2.apply(LiveEntity.scala:548)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>         at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:139)
>         at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:139)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>         at scala.collection.mutable.HashMap$$anon$2.foreach(HashMap.scala:139)
>         at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>         at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>         at org.apache.spark.status.LiveRDD.doUpdate(LiveEntity.scala:548)
>         at org.apache.spark.status.LiveEntity.write(LiveEntity.scala:49)
>         at org.apache.spark.status.AppStatusListener.org$apache$spark$status$AppStatusListener$$update(AppStatusListener.scala:991)
>         at org.apache.spark.status.AppStatusListener.org$apache$spark$status$AppStatusListener$$maybeUpdate(AppStatusListener.scala:997)
>         at org.apache.spark.status.AppStatusListener$$anonfun$onExecutorMetricsUpdate$2.apply(AppStatusListener.scala:764)
>         at org.apache.spark.status.AppStatusListener$$anonfun$onExecutorMetricsUpdate$2.apply(AppStatusListener.scala:764)
>         at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:139)
>         at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:139)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>         at scala.collection.mutable.HashMap$$anon$2.foreach(HashMap.scala:139)
>         at org.apache.spark.status.AppStatusListener.org$apache$spark$status$AppStatusListener$$flush(AppStatusListener.scala:788)
>         at org.apache.spark.status.AppStatusListener.onExecutorMetricsUpdate(AppStatusListener.scala:764)
>         at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:59)
>         at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
>         at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
>         at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:91)
>         at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:92)
>         at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply$mcJ$sp(AsyncEventQueue.scala:92)
>         at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:87)
>         at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:87)
>         at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
>         at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:87)
>         at org.apache.spark.scheduler.AsyncEventQueue$$anon$1$$anonfun$run$1.apply$mcV$sp(AsyncEventQueue.scala:83)
>         at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1302)
>         at org.apache.spark.scheduler.AsyncEventQueue$$anon$1.run(AsyncEventQueue.scala:82)
> {noformat}
> Symptoms of a Spark app that made us investigate the logs in the first place include:
>  * slower execution of submitted jobs
>  * jobs remaining "Active Jobs" in the Spark UI even though they should have completed days ago
>  * these jobs could not be killed from the Spark UI (the page refreshes but the jobs remained there)
>  * stages for these jobs could not be examined in the Spark UI because it returned an error instead.


