Posted to issues@livy.apache.org by "James Chen (Jira)" <ji...@apache.org> on 2021/04/15 12:10:00 UTC

[jira] [Updated] (LIVY-852) Livy unable to recover upon losing connection with Zookeeper

     [ https://issues.apache.org/jira/browse/LIVY-852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Chen updated LIVY-852:
----------------------------
    Description: 
We've noticed that LIVY-732 appears to have changed Livy's behavior when it loses its connection to ZooKeeper. Before that pull request, Livy would exit with an exit code of 1 on losing the connection, allowing it to be restarted. Now Livy continues to run but returns a 404 for any request to the REST API:

<html>
 <head>
 <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
 <title>Error 404 </title>
 </head>
 <body>
 <h2>HTTP ERROR: 404</h2>
 <p>Problem accessing /sessions. Reason:
 <pre> Not Found</pre></p>
 <hr /><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.3.24.v20180605</a><hr/>
 </body>
 </html>

The direct cause of this change in behavior appears to be that the UnhandledErrorListener now throws a LivyUncaughtException instead of calling System.exit(1). See line 74 of server/src/main/scala/org/apache/livy/server/recovery/ZooKeeperStateStore.scala and line 72 of server/src/main/scala/org/apache/livy/server/recovery/ZooKeeperManager.scala at [https://github.com/apache/incubator-livy/pull/267/files].

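As a rough illustration of why throwing might not stop the process: on the JVM, an exception that escapes a callback running on a non-main thread ends at most that thread, whereas System.exit(1) stops the whole process. The sketch below shows only this general JVM behavior. The class name, thread name, and message are hypothetical, and whether Livy's listener runs on such a thread is an assumption, not something confirmed here.

```java
import java.util.concurrent.atomic.AtomicReference;

public class BackgroundThrowDemo {
    // Runs the callback on a separate thread, as a client library's event
    // thread might, and returns the exception that escaped it (null if none).
    public static Throwable runOnBackgroundThread(Runnable callback) {
        AtomicReference<Throwable> escaped = new AtomicReference<>();
        Thread worker = new Thread(callback, "hypothetical-event-thread");
        // Record the uncaught exception instead of letting the default handler print it.
        worker.setUncaughtExceptionHandler((thread, error) -> escaped.set(error));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return escaped.get();
    }

    public static void main(String[] args) {
        Throwable escaped = runOnBackgroundThread(() -> {
            throw new IllegalStateException("simulated unhandled ZooKeeper error");
        });
        // The worker thread has died, but this line still runs: the JVM did not exit.
        System.out.println("background thread failed with: " + escaped.getMessage());
    }
}
```

If something similar happens in Livy, the server process would stay up after the error with nothing left to act on it, which would be consistent with the symptoms described here.
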
Overall, this change appears to be undesirable: Livy becomes completely unresponsive after ZooKeeper reconnects (no log or error messages are printed after the uncaught exception is thrown) and has to be checked and restarted manually. On the other hand, System.exit(1) seems to be a roundabout way of fixing the issue, and registering a ConnectionStateListener instead of an UnhandledErrorListener might be better.

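To make the ConnectionStateListener idea more concrete, here is a minimal sketch of the decision such a listener could make. In real code this logic would sit in Curator's ConnectionStateListener.stateChanged(CuratorFramework, ConnectionState) and be registered through client.getConnectionStateListenable().addListener(...). The enum below is only a stand-in for Curator's ConnectionState so that the sketch runs without Curator on the classpath. The Action values and the policy itself are assumptions for illustration, not something Livy implements.

```java
public class ConnectionPolicySketch {
    // Stand-in for org.apache.curator.framework.state.ConnectionState,
    // redeclared here so the sketch compiles without Curator.
    public enum ConnectionState { CONNECTED, SUSPENDED, RECONNECTED, LOST, READ_ONLY }

    // Hypothetical actions a server could take; not part of Livy or Curator.
    public enum Action { CONTINUE, PAUSE_STATE_STORE_ACCESS, EXIT_FOR_RESTART }

    // Assumed policy: tolerate a temporary suspension, carry on once
    // reconnected, and give up only when the session is reported as lost.
    public static Action actionFor(ConnectionState newState) {
        switch (newState) {
            case SUSPENDED:
                return Action.PAUSE_STATE_STORE_ACCESS;
            case LOST:
                return Action.EXIT_FOR_RESTART;
            default:
                // CONNECTED, RECONNECTED and READ_ONLY fall through to here.
                return Action.CONTINUE;
        }
    }

    public static void main(String[] args) {
        for (ConnectionState state : ConnectionState.values()) {
            System.out.println(state + " -> " + actionFor(state));
        }
    }
}
```

Compared with exiting from an UnhandledErrorListener, this would let the server tell a short suspension apart from a lost session before deciding to exit.
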
It would be good to figure out whether this line should be reverted to System.exit(1), or whether there is a better way of handling this issue.

> Livy unable to recover upon losing connection with Zookeeper
> ------------------------------------------------------------
>
>                 Key: LIVY-852
>                 URL: https://issues.apache.org/jira/browse/LIVY-852
>             Project: Livy
>          Issue Type: Bug
>          Components: Server
>    Affects Versions: 0.6.0
>            Reporter: James Chen
>            Priority: Major
>


