Posted to users@zeppelin.apache.org by linxi zeng <li...@gmail.com> on 2015/09/07 14:44:05 UTC

Scheduler already terminated Exception

hi, moon:

After changing some settings and restarting the interpreter, the scheduler of
the interpreter is terminated, and the RemoteInterpreterServer process
should be stopped too. But if the RemoteInterpreterServer does not shut down
as expected, an exception with the message "Scheduler already terminated" is
thrown when we run paragraphs using this interpreter (such as spark). Restarting
the Zeppelin server then seems to be the only way to solve the problem.

This problem has already happened several times, but I still have no idea how
to reproduce it reliably. I was wondering whether we could restart the
RemoteInterpreterServer when we catch this exception?

Do you have any idea how to solve this problem?
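
To make the restart-on-exception idea concrete, here is a minimal sketch in plain Java. It is not Zeppelin code: SketchScheduler, SketchInterpreter and submitWithRestart are hypothetical names invented for illustration, and the only detail taken from this report is the RuntimeException message "Scheduler already terminated". The sketch submits once, and if that specific error comes back it replaces the terminated scheduler and retries a single time.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RestartOnTerminatedSketch {

    static final String TERMINATED = "Scheduler already terminated";

    /** Hypothetical stand-in for a scheduler that rejects jobs once terminated. */
    static class SketchScheduler {
        private volatile boolean terminated = false;

        void terminate() {
            terminated = true;
        }

        String submit(String job) {
            if (terminated) {
                throw new RuntimeException(TERMINATED);
            }
            return "ran: " + job;
        }
    }

    /** Hypothetical holder that can swap a terminated scheduler for a fresh one. */
    static class SketchInterpreter {
        private SketchScheduler scheduler = new SketchScheduler();
        final AtomicInteger restarts = new AtomicInteger();

        SketchScheduler scheduler() {
            return scheduler;
        }

        void restart() {
            scheduler = new SketchScheduler();
            restarts.incrementAndGet();
        }

        /** Submit once; on the "terminated" error, restart and retry a single time. */
        String submitWithRestart(String job) {
            try {
                return scheduler.submit(job);
            } catch (RuntimeException e) {
                if (!TERMINATED.equals(e.getMessage())) {
                    throw e; // unrelated failure: do not hide it behind a restart
                }
                restart();
                return scheduler.submit(job);
            }
        }
    }

    public static void main(String[] args) {
        SketchInterpreter interpreter = new SketchInterpreter();
        interpreter.scheduler().terminate(); // simulate the stale state after a failed restart
        System.out.println(interpreter.submitWithRestart("select count(*)")); // prints "ran: select count(*)"
        System.out.println("restarts = " + interpreter.restarts.get());       // prints "restarts = 1"
    }
}
```

The retry is limited to one attempt so that a scheduler which cannot be recreated still surfaces the error instead of looping.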


By the way, the detailed error info is as follows:

 INFO [2015-09-06 10:21:47,487] ({qtp1633200777-7462} NotebookServer.java[onMessage]:112) - RECEIVE << RUN_PARAGRAPH
 INFO [2015-09-06 10:21:47,493] ({qtp1633200777-7462} NotebookServer.java[broadcast]:264) - SEND >> NOTE
ERROR [2015-09-06 10:21:47,495] ({qtp1633200777-7462} NotebookServer.java[runParagraph]:640) - Exception from run
java.lang.RuntimeException: Scheduler already terminated
        at org.apache.zeppelin.scheduler.RemoteScheduler.submit(RemoteScheduler.java:124)
        at org.apache.zeppelin.notebook.Note.run(Note.java:282)
        at org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:638)
        at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:137)
        at org.apache.zeppelin.socket.NotebookSocket.onMessage(NotebookSocket.java:56)
        at org.eclipse.jetty.websocket.WebSocketConnectionRFC6455$WSFrameHandler.onFrame(WebSocketConnectionRFC6455.java:835)
        at org.eclipse.jetty.websocket.WebSocketParserRFC6455.parseNext(WebSocketParserRFC6455.java:349)
        at org.eclipse.jetty.websocket.WebSocketConnectionRFC6455.handle(WebSocketConnectionRFC6455.java:225)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Thread.java:745)

Re: Scheduler already terminated Exception

Posted by Sourav Mazumder <so...@gmail.com>.
Hi Moon,

I can suggest another approach to reproduce this.

1. Create a Spark interpreter with little executor memory (say 128 MB).

2. Using this interpreter, try to do something memory intensive. Say you try
to load a 20 GB data set and then run a select count(*). This will
eventually kill the executor process, and I generally get a RemoteInterpreter
not found/Connection refused error.

3. Now try to rerun the same paragraph executing select count(*). You
will get the scheduler terminated error.

Regards,
Sourav




On Thu, Sep 17, 2015 at 5:25 AM, linxi zeng <li...@gmail.com> wrote:

> Actually, there is a way to reproduce the problem (maybe not a very
> suitable example):
> (1) Modify dereference() in *RemoteInterpreterProcess.java* like this:
>
> diff --git a/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcess.java b/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcess.java
> index 534af27..e02b16a 100644
> --- a/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcess.java
> +++ b/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcess.java
> @@ -146,7 +146,8 @@ public class RemoteInterpreterProcess implements ExecuteResultHandler {
>    public int dereference() {
>      synchronized (referenceCount) {
>        int r = referenceCount.decrementAndGet();
> -      if (r == 0) {
> +      //if (r == 0) {
> +      if (false) {
>          logger.info("shutdown interpreter process");
>          remoteInterpreterEventPoller.shutdown();
>
>
> (2) Restart this interpreter in the interpreter settings
>
> [image: inline image 1]
>
> (3) Run a spark paragraph:
>
> [image: inline image 2]
>
>
>
> 2015-09-09 23:13 GMT+08:00 moon soo Lee <mo...@apache.org>:
>
>> If there is some way to reproduce the problem, it will help a lot.
>> Let me investigate this problem further.
>>
>> I'm working on improving the interpreter process restart.
>>
>> https://github.com/Leemoonsoo/incubator-zeppelin/commit/3200b9aac26d394a67d496c3b209eb3cda046c4a
>> Once I know how to reproduce the "Scheduler already terminated" exception,
>> I'll make a pull request together with this improvement.
>>
>> Thanks,
>> moon

Re: Scheduler already terminated Exception

Posted by linxi zeng <li...@gmail.com>.
Actually, there is a way to reproduce the problem (maybe not a very
suitable example):
(1) Modify dereference() in *RemoteInterpreterProcess.java* like this:

diff --git a/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcess.java b/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcess.java
index 534af27..e02b16a 100644
--- a/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcess.java
+++ b/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcess.java
@@ -146,7 +146,8 @@ public class RemoteInterpreterProcess implements ExecuteResultHandler {
   public int dereference() {
     synchronized (referenceCount) {
       int r = referenceCount.decrementAndGet();
-      if (r == 0) {
+      //if (r == 0) {
+      if (false) {
         logger.info("shutdown interpreter process");
         remoteInterpreterEventPoller.shutdown();


(2) Restart this interpreter in the interpreter settings

[image: inline image 1]

(3) Run a spark paragraph:

[image: inline image 2]
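
To show why disabling the `r == 0` check leaves the process alive, here is a small standalone simulation in Java. It is a simplified sketch and not the real RemoteInterpreterProcess: SketchProcess, shutdownCheckEnabled and isRunning are hypothetical names, and "shutdown" is reduced to flipping a boolean. Only the reference-counting shape (decrementAndGet followed by a zero check) is taken from the diff above.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class DereferenceSketch {

    /** Hypothetical, simplified stand-in for a reference-counted interpreter process. */
    static class SketchProcess {
        private final AtomicInteger referenceCount = new AtomicInteger();
        private final boolean shutdownCheckEnabled;
        private boolean running = false;

        /** Pass false to mimic the patched "if (false)" branch from the diff. */
        SketchProcess(boolean shutdownCheckEnabled) {
            this.shutdownCheckEnabled = shutdownCheckEnabled;
        }

        /** The first reference starts the (simulated) process. */
        synchronized int reference() {
            int r = referenceCount.incrementAndGet();
            if (r == 1) {
                running = true;
            }
            return r;
        }

        /** The last dereference stops the process, unless the check is disabled. */
        synchronized int dereference() {
            int r = referenceCount.decrementAndGet();
            if (shutdownCheckEnabled && r == 0) {
                running = false;
            }
            return r;
        }

        synchronized boolean isRunning() {
            return running;
        }
    }

    public static void main(String[] args) {
        SketchProcess normal = new SketchProcess(true);
        normal.reference();
        normal.dereference();
        System.out.println("normal running after restart:  " + normal.isRunning());  // false

        SketchProcess patched = new SketchProcess(false);
        patched.reference();
        patched.dereference();
        System.out.println("patched running after restart: " + patched.isRunning()); // true
    }
}
```

In the patched case the count reaches zero but the process keeps running, which corresponds to the "RemoteInterpreterServer didn't shut down as expected" condition from the original report.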



2015-09-09 23:13 GMT+08:00 moon soo Lee <mo...@apache.org>:

> If there is some way to reproduce the problem, it will help a lot.
> Let me investigate this problem further.
>
> I'm working on improving the interpreter process restart.
>
> https://github.com/Leemoonsoo/incubator-zeppelin/commit/3200b9aac26d394a67d496c3b209eb3cda046c4a
> Once I know how to reproduce the "Scheduler already terminated" exception,
> I'll make a pull request together with this improvement.
>
> Thanks,
> moon

Re: Scheduler already terminated Exception

Posted by moon soo Lee <mo...@apache.org>.
If there is some way to reproduce the problem, it will help a lot.
Let me investigate this problem further.

I'm working on improving the interpreter process restart.

https://github.com/Leemoonsoo/incubator-zeppelin/commit/3200b9aac26d394a67d496c3b209eb3cda046c4a
Once I know how to reproduce the "Scheduler already terminated" exception, I'll
make a pull request together with this improvement.

Thanks,
moon

