Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2016/10/03 18:19:20 UTC

[jira] [Commented] (SPARK-17742) Spark Launcher does not get failed state in Listener

    [ https://issues.apache.org/jira/browse/SPARK-17742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543048#comment-15543048 ] 

Marcelo Vanzin commented on SPARK-17742:
----------------------------------------

This code in "LocalSchedulerBackend" causes the issue you're seeing:

{code}
  override def stop() {
    stop(SparkAppHandle.State.FINISHED)
  }
{code}

That code runs both when you explicitly stop a SparkContext and when the VM shuts down via a shutdown hook. You could remove that line (and similar lines in other backends) and change the logic so that if the child process exits with a 0 exit code, the app is reported as successful. But you need to make sure it still works with YARN (both client and cluster mode), since the logic to report app state is different there.
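The proposed change can be sketched as a small exit-code-to-state mapping. This is a hypothetical, self-contained illustration (the class and method names are made up; the real change would live in the scheduler backends and the launcher protocol), not the actual Spark code:

{code}
// Hypothetical sketch: derive the final launcher state from the child
// process's exit code instead of unconditionally reporting FINISHED.
public class ExitStateSketch {
    enum State { FINISHED, FAILED }

    // 0 means success; any other exit code (e.g. the reporter's
    // System.exit(-1)) would be surfaced to listeners as FAILED.
    static State finalState(int exitCode) {
        return exitCode == 0 ? State.FINISHED : State.FAILED;
    }

    public static void main(String[] args) {
        System.out.println(finalState(0));
        System.out.println(finalState(-1));
    }
}
{code}

With stateChanged listeners, this would deliver a terminal FAILED state for any nonzero exit instead of the FINISHED state the shutdown hook currently reports.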

> Spark Launcher does not get failed state in Listener 
> -----------------------------------------------------
>
>                 Key: SPARK-17742
>                 URL: https://issues.apache.org/jira/browse/SPARK-17742
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.0.0
>            Reporter: Aseem Bansal
>
> I tried to launch an application using the code below. This is dummy code to reproduce the problem. I tried exiting Spark with status -1, throwing an exception, etc., but in no case did the listener give me a failed state. If a Spark job returns -1 or throws an exception from its main method, it should be considered a failure.
> {code}
> package com.example;
> import org.apache.spark.launcher.SparkAppHandle;
> import org.apache.spark.launcher.SparkLauncher;
> import java.io.IOException;
> public class Main2 {
>     public static void main(String[] args) throws IOException, InterruptedException {
>         SparkLauncher launcher = new SparkLauncher()
>                 .setSparkHome("/opt/spark2")
>                 .setAppResource("/home/aseem/projects/testsparkjob/build/libs/testsparkjob-1.0-SNAPSHOT.jar")
>                 .setMainClass("com.example.Main")
>                 .setMaster("local[2]");
>         launcher.startApplication(new MyListener());
>         Thread.sleep(1000 * 60);
>     }
> }
> class MyListener implements SparkAppHandle.Listener {
>     @Override
>     public void stateChanged(SparkAppHandle handle) {
>         System.out.println("state changed " + handle.getState());
>     }
>     @Override
>     public void infoChanged(SparkAppHandle handle) {
>         System.out.println("info changed " + handle.getState());
>     }
> }
> {code}
> The spark job is 
> {code}
> package com.example;
> import org.apache.spark.sql.SparkSession;
> import java.io.IOException;
> public class Main {
>     public static void main(String[] args) throws IOException {
>         SparkSession sparkSession = SparkSession
>                 .builder()
>                 .appName("" + System.currentTimeMillis())
>                 .getOrCreate();
>         try {
>             for (int i = 0; i < 15; i++) {
>                 Thread.sleep(1000);
>                 System.out.println("sleeping 1");
>             }
>         } catch (InterruptedException e) {
>             e.printStackTrace();
>         }
> //        sparkSession.stop();
>         System.exit(-1);
>     }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org