Posted to commits@kyuubi.apache.org by ch...@apache.org on 2023/05/15 16:36:41 UTC

[kyuubi] branch master updated: [KYUUBI #4840] Return cached appInfo iif both op and app are terminated

This is an automated email from the ASF dual-hosted git repository.

chengpan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kyuubi.git


The following commit(s) were added to refs/heads/master by this push:
     new 66dfe805d [KYUUBI #4840] Return cached appInfo iif both op and app are terminated
66dfe805d is described below

commit 66dfe805d2ed21b8c6414bc1a5a4e1a57120df3a
Author: Cheng Pan <ch...@apache.org>
AuthorDate: Tue May 16 00:36:27 2023 +0800

    [KYUUBI #4840] Return cached appInfo iif both op and app are terminated
    
    ### _Why are the changes needed?_
    
    When a batch job fails, the op state may change to ERROR before engine_state has been updated. With the current implementation, currentApplicationInfo then always returns the outdated cached appInfo instead of fetching it from the application manager, so the client cannot see the correct app status.
    
    ```
    mysql> select state, engine_state, peer_instance_closed from metadata;
    +-------+--------------+----------------------+
    | state | engine_state | peer_instance_closed |
    +-------+--------------+----------------------+
    | ERROR | RUNNING      |                    0 |
    | ERROR | RUNNING      |                    0 |
    +-------+--------------+----------------------+
    ```
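
    The guard introduced by this patch can be sketched in isolation as follows. This is a minimal illustration, not the Kyuubi implementation: `AppState`, `AppInfo`, `resolveAppInfo` and the `fetch` callback are hypothetical stand-ins for `ApplicationState`, `ApplicationInfo`, `currentApplicationInfo` and the application manager lookup.

    ```scala
    object AppInfoSketch {
      // Simplified stand-in for Kyuubi's ApplicationState (assumption, not the real enum).
      object AppState extends Enumeration {
        val Pending, Running, Finished, Failed, Killed = Value
        def isTerminated(s: Value): Boolean = s == Finished || s == Failed || s == Killed
      }

      // Simplified stand-in for Kyuubi's ApplicationInfo (assumption).
      final case class AppInfo(id: String, state: AppState.Value)

      // Hypothetical helper: return the cached info only if both the operation
      // and the cached application state are terminal; otherwise fetch again.
      def resolveAppInfo(
          opTerminal: Boolean,
          cached: Option[AppInfo],
          fetch: () => Option[AppInfo]): Option[AppInfo] = {
        if (opTerminal && cached.map(_.state).exists(AppState.isTerminated)) cached
        else fetch()
      }
    }
    ```

    With the rows shown above (op state ERROR, cached engine state RUNNING), the old `nonEmpty` check would return the stale cached value, whereas this guard falls through to `fetch()`.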
    
    ### _How was this patch tested?_
    - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
    
    - [ ] Add screenshots for manual tests if appropriate
    
    - [x] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before making a pull request
    
    Closes #4840 from pan3793/appInfo.
    
    Closes #4840
    
    f5dddc9db [Cheng Pan] close on error
    7c5767e11 [Cheng Pan] Return cached appInfo iif both op and app are terminated
    
    Authored-by: Cheng Pan <ch...@apache.org>
    Signed-off-by: Cheng Pan <ch...@apache.org>
---
 .../scala/org/apache/kyuubi/operation/BatchJobSubmission.scala     | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/BatchJobSubmission.scala b/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/BatchJobSubmission.scala
index 96022614e..e58539ff7 100644
--- a/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/BatchJobSubmission.scala
+++ b/kyuubi-server/src/main/scala/org/apache/kyuubi/operation/BatchJobSubmission.scala
@@ -101,8 +101,10 @@ class BatchJobSubmission(
     }
   }
 
-  override protected def currentApplicationInfo(): Option[ApplicationInfo] = {
-    if (isTerminal(state) && _applicationInfo.nonEmpty) return _applicationInfo
+  override def currentApplicationInfo(): Option[ApplicationInfo] = {
+    if (isTerminal(state) && _applicationInfo.map(_.state).exists(ApplicationState.isTerminated)) {
+      return _applicationInfo
+    }
     val applicationInfo =
       applicationManager.getApplicationInfo(
         builder.clusterManager(),
@@ -267,6 +269,7 @@ class BatchJobSubmission(
       }
     } finally {
       builder.close()
+      updateApplicationInfoMetadataIfNeeded()
       cleanupUploadedResourceIfNeeded()
     }
   }