Posted to commits@zeppelin.apache.org by zj...@apache.org on 2019/10/09 05:45:00 UTC

[zeppelin] branch master updated: [WIP] [ZEPPELIN-4350]. Paragraph pending for a long time when interpreter process fails to launch

This is an automated email from the ASF dual-hosted git repository.

zjffdu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/zeppelin.git


The following commit(s) were added to refs/heads/master by this push:
     new f4c9916  [WIP] [ZEPPELIN-4350]. Paragraph pending for a long time when interpreter process fails to launch
f4c9916 is described below

commit f4c991631e953953142a431b4af8c5eac6500aae
Author: Jeff Zhang <zj...@apache.org>
AuthorDate: Wed Sep 25 17:03:24 2019 +0800

    [WIP] [ZEPPELIN-4350]. Paragraph pending for a long time when interpreter process fails to launch
    
    ### What is this PR for?
    Before this PR, a paragraph stays in the pending state until the timeout expires when the interpreter process fails to launch. After this PR, the paragraph finishes with an error status as soon as the interpreter process fails to launch.
    
    ### What type of PR is it?
    [ Improvement ]
    
    ### Todos
    * [ ] - Task
    
    ### What is the Jira issue?
    * https://issues.apache.org/jira/browse/ZEPPELIN-4350
    
    ### How should this be tested?
    * CI pass
    
    ### Screenshots (if appropriate)
    ![Untitled](https://user-images.githubusercontent.com/164491/65587098-dd84fc00-dfb7-11e9-81ca-92cbb378e20c.gif)
    
    ### Questions:
    * Do the license files need an update? No
    * Are there breaking changes for older versions? No
    * Does this need documentation? No
    
    Author: Jeff Zhang <zj...@apache.org>
    
    Closes #3460 from zjffdu/ZEPPELIN-4350 and squashes the following commits:
    
    be6c674ad [Jeff Zhang] [ZEPPELIN-4350]. Paragraph pending for a long time when interpreter process fails to launch
---
 .../interpreter/remote/RemoteInterpreterManagedProcess.java      | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterManagedProcess.java b/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterManagedProcess.java
index ccac30b..3a538df 100644
--- a/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterManagedProcess.java
+++ b/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterManagedProcess.java
@@ -231,13 +231,16 @@ public class RemoteInterpreterManagedProcess extends RemoteInterpreterProcess {
     @Override
     public void onProcessComplete(int exitValue) {
       LOGGER.warn("Process is exited with exit value " + exitValue);
+      if (env.getOrDefault("ZEPPELIN_SPARK_YARN_CLUSTER", "false").equals("false")) {
+        // don't call notify in yarn-cluster mode
+        synchronized (this) {
+          notify();
+        }
+      }
       // For yarn-cluster mode, client process will exit with exit value 0
       // after submitting spark app. So don't move to TERMINATED state when exitValue is 0.
       if (exitValue != 0) {
         transition(State.TERMINATED);
-        synchronized (this) {
-          notify();
-        }
       } else {
         transition(State.COMPLETED);
       }
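
The hunk above moves the `notify()` call so that a thread waiting for the interpreter process is woken whenever the process exits, not only on a non-zero exit value. As a rough illustration of that wait/notify pattern, here is a minimal, self-contained sketch. It is not the Zeppelin implementation: the class `LaunchWaitSketch` and the methods `awaitStart` and `onProcessStarted` are hypothetical names chosen for this example, and the yarn-cluster special case from the diff is deliberately left out.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a launcher that waits on a child process. */
final class LaunchWaitSketch {
  private boolean started = false; // set when the child reports it is ready
  private boolean exited = false;  // set when the child process has exited

  /** Called by a process watcher when the child exits (compare onProcessComplete). */
  public synchronized void onProcessComplete(int exitValue) {
    exited = true;
    // Wake the waiter for any exit value, so it does not sit until the timeout.
    notifyAll();
  }

  /** Called when the child reports a successful start (assumed hook). */
  public synchronized void onProcessStarted() {
    started = true;
    notifyAll();
  }

  /** Blocks until the child started, exited, or the timeout elapsed; true only if started. */
  public synchronized boolean awaitStart(long timeoutMillis) throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    // Loop on the flags: this handles spurious wakeups and an exit that
    // happened before the waiter arrived.
    while (!started && !exited) {
      long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
      if (remainingMillis <= 0) {
        break; // timed out; note that wait(0) would block forever
      }
      wait(remainingMillis);
    }
    return started;
  }

  public static void main(String[] args) throws InterruptedException {
    LaunchWaitSketch launcher = new LaunchWaitSketch();
    // Simulate a child process that dies shortly after being launched.
    Thread watcher = new Thread(() -> {
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      launcher.onProcessComplete(1);
    });
    watcher.start();
    boolean ok = launcher.awaitStart(10_000);
    watcher.join();
    System.out.println("started=" + ok);
  }
}
```

In this sketch the waiter returns shortly after the simulated exit instead of blocking for the full ten seconds, which mirrors the behaviour the commit message describes. Guarding the wait with the `started` and `exited` flags is a design choice of the sketch, not something taken from the diff.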