Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2021/11/17 04:38:00 UTC
[jira] [Commented] (SPARK-37350) EventLoggingListener keep logging errors after hdfs restart all datanodes
[ https://issues.apache.org/jira/browse/SPARK-37350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444937#comment-17444937 ]
Hyukjin Kwon commented on SPARK-37350:
--------------------------------------
Spark 2.4.x is EOL. Can you check whether the same issue persists in Spark 3.x?
> EventLoggingListener keep logging errors after hdfs restart all datanodes
> -------------------------------------------------------------------------
>
> Key: SPARK-37350
> URL: https://issues.apache.org/jira/browse/SPARK-37350
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.0
> Environment: Spark-2.4.0、Hadoop-3.0.0、Hive-2.1.1
> Reporter: Shefron Yudy
> Priority: Major
>
> I saw this error in the SparkThriftServer process's log after I restarted all of HDFS's datanodes. The log is as follows:
> {code:java}
> 2021-11-16 13:52:11,044 ERROR [spark-listener-group-eventLog] scheduler.AsyncEventQueue:Listener EventLoggingListener threw an exception
> java.io.IOException: All datanodes [DatanodeInfoWithStorage[10.121.23.101:1019,DS-90cb8066-8e5c-443f-804b-20c3ad01851b,DISK]] are bad. Aborting...
>     at org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1561)
>     at org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1495)
>     at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1481)
>     at org.apache.hadoop.hdfs.DataStreamer.processDatanodeErrorOrExternalError(DataStreamer.java:1256)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:667)
> {code}
> The event log works normally again once I restart the SparkThriftServer. I suggest that the EventLoggingListener's DFS writer and hadoopDataStream reconnect after all datanodes stop and later come back up.
> {code:java}
> /** Log the event as JSON. */
> private def logEvent(event: SparkListenerEvent, flushLogger: Boolean = false) {
>   val eventJson = JsonProtocol.sparkEventToJson(event)
>   // scalastyle:off println
>   writer.foreach(_.println(compact(render(eventJson))))
>   // scalastyle:on println
>   if (flushLogger) {
>     writer.foreach(_.flush())
>     hadoopDataStream.foreach(_.hflush())
>   }
>   if (testing) {
>     loggedEvents += eventJson
>   }
> }
> {code}
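> The suggested reconnection could be sketched as a retry wrapper around the event write: on an IOException from the dead datanode pipeline, rebuild the underlying writer and retry the write once, instead of failing on every subsequent event. This is only a hypothetical sketch, not Spark's actual code; {{ReconnectingEventWriter}} and the {{openWriter}} factory are assumed names.
> {code:java}
> import java.io.IOException
>
> /** Hypothetical sketch of a self-healing event writer (not Spark's API).
>  *  `openWriter` builds a fresh write function over a new output stream;
>  *  on an IOException (e.g. "All datanodes ... are bad") the old pipeline
>  *  is discarded, a new writer is opened, and the write is retried once. */
> class ReconnectingEventWriter(openWriter: () => (String => Unit)) {
>   private var write: String => Unit = openWriter()
>
>   def logEvent(eventJson: String): Unit = {
>     try {
>       write(eventJson)
>     } catch {
>       case _: IOException =>
>         // Datanode restart killed the pipeline; reopen and retry once.
>         write = openWriter()
>         write(eventJson)
>     }
>   }
> }
> {code}
> A real fix would also need to close the broken stream and bound the retries, so a genuinely unavailable HDFS does not block the listener queue.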
--
This message was sent by Atlassian Jira
(v8.20.1#820001)