Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2020/09/05 02:05:00 UTC

[jira] [Assigned] (SPARK-32803) Catch InterruptedException when resolving rack in SparkRackResolver

     [ https://issues.apache.org/jira/browse/SPARK-32803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-32803:
------------------------------------

    Assignee: Apache Spark

> Catch InterruptedException when resolving rack in SparkRackResolver
> --------------------------------------------------------------------
>
>                 Key: SPARK-32803
>                 URL: https://issues.apache.org/jira/browse/SPARK-32803
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.0
>            Reporter: ulysses you
>            Assignee: Apache Spark
>            Priority: Major
>
> In YARN mode, inserting into a Hive table can leave dirty data behind when the Spark application is killed. The error message is:
> ```
> java.io.IOException: java.lang.InterruptedException
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:607)
>     at org.apache.hadoop.util.Shell.run(Shell.java:507)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
>     at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:251)
>     at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:188)
>     at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:119)
>     at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:101)
>     at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:81)
>     at org.apache.spark.scheduler.cluster.YarnScheduler.getRackForHost(YarnScheduler.scala:37)
>     at org.apache.spark.scheduler.TaskSetManager$$anonfun$addPendingTask$1.apply(TaskSetManager.scala:235)
>     at org.apache.spark.scheduler.TaskSetManager$$anonfun$addPendingTask$1.apply(TaskSetManager.scala:216)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>     at org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:216)
>     at org.apache.spark.scheduler.TaskSetManager$$anonfun$1.apply$mcVI$sp(TaskSetManager.scala:188)
>     at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
>     at org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:187)
>     at org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:250)
>     at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:208)
>     at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1215)
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:1071)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:1074)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:1073)
>     at scala.collection.immutable.List.foreach(List.scala:392)
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:1073)
>     at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1014)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2069)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2061)
>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2050)
>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
> ```
> The reason is:
> 1. `CachedDNSToSwitchMapping` may execute a shell command to resolve a hostname; when the Spark application is killed, that command is interrupted and we get an `InterruptedException`.
> 2. `DAGSchedulerEventProcessLoop` ignores the `InterruptedException`, so the `FileFormatWriter` action can't see the failure.
> 3. `FileFormatWriter` therefore does not abort the commit and leaves dirty data behind.
>  
> So we should catch the `InterruptedException` and rethrow it as a `SparkException`, as sketched below.
>  
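> A minimal sketch of the fix, in Scala, assuming a hypothetical `doResolve` placeholder for the underlying Hadoop rack-resolution call (the actual patch may differ):
> ```
> import org.apache.spark.SparkException
>
> object RackResolverSketch {
>   // Hypothetical stand-in for the Hadoop call that may shell out to a
>   // topology script and get interrupted when the application is killed.
>   private def doResolve(hostName: String): String = "/default-rack"
>
>   def resolveRack(hostName: String): String = {
>     try {
>       doResolve(hostName)
>     } catch {
>       case e: InterruptedException =>
>         // Restore the interrupt flag, then rethrow as SparkException so
>         // callers such as FileFormatWriter see a real failure and abort
>         // the commit instead of leaving dirty data.
>         Thread.currentThread().interrupt()
>         throw new SparkException(s"Interrupted while resolving rack for $hostName", e)
>     }
>   }
> }
> ```
> Rethrowing as `SparkException` (rather than letting the interrupt be swallowed) lets the scheduler propagate the failure, which is what allows the write path to abort cleanly.
>  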



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org