Posted to dev@zeppelin.apache.org by "Philipp Dallig (Jira)" <ji...@apache.org> on 2022/05/13 13:27:00 UTC

[jira] [Created] (ZEPPELIN-5737) Deadlock during Interpreter Creation

Philipp Dallig created ZEPPELIN-5737:
----------------------------------------

             Summary: Deadlock during Interpreter Creation
                 Key: ZEPPELIN-5737
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5737
             Project: Zeppelin
          Issue Type: Bug
          Components: Interpreters
    Affects Versions: 0.10.1
            Reporter: Philipp Dallig


I encountered the following deadlock when starting the Python interpreter.
The deadlock is easy to trigger: while the interpreter is still starting, stop it via the REST API.

{code}
Found one Java-level deadlock:
=============================
"Thread-29":
  waiting to lock monitor 0x00007fde240084d8 (object 0x00000000804c7120, a org.apache.zeppelin.interpreter.LazyOpenInterpreter),
  which is held by "FIFOScheduler-interpreter_1515166446-Worker-1"
"FIFOScheduler-interpreter_1515166446-Worker-1":
  waiting to lock monitor 0x00007fde20242928 (object 0x00000000804941e0, a org.apache.zeppelin.interpreter.InterpreterGroup),
  which is held by "pool-3-thread-8"
"pool-3-thread-8":
  waiting to lock monitor 0x00007fde202429d8 (object 0x00000000804c71b8, a org.apache.zeppelin.spark.PySparkInterpreter),
  which is held by "FIFOScheduler-interpreter_1515166446-Worker-1"

Java stack information for the threads listed above:
===================================================
"Thread-29":
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:63)
        - waiting to lock <0x00000000804c7120> (a org.apache.zeppelin.interpreter.LazyOpenInterpreter)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.cancel(LazyOpenInterpreter.java:118)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.lambda$cancel$2(RemoteInterpreterServer.java:950)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$$Lambda$2428/1999550584.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:748)
"FIFOScheduler-interpreter_1515166446-Worker-1":
        at org.apache.zeppelin.interpreter.Interpreter.getInterpreterInTheSameSessionByClassName(Interpreter.java:293)
        - waiting to lock <0x00000000804941e0> (a org.apache.zeppelin.interpreter.InterpreterGroup)
        at org.apache.zeppelin.interpreter.Interpreter.getInterpreterInTheSameSessionByClassName(Interpreter.java:333)
        at org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:57)
        - locked <0x00000000804bc9f8> (a org.apache.zeppelin.spark.IPySparkInterpreter)
        at org.apache.zeppelin.python.PythonInterpreter.open(PythonInterpreter.java:91)
        at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:94)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
        - locked <0x00000000804c71b8> (a org.apache.zeppelin.spark.PySparkInterpreter)
        - locked <0x00000000804c7120> (a org.apache.zeppelin.interpreter.LazyOpenInterpreter)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:861)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:769)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
        at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132)
        at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:42)
        at org.apache.zeppelin.scheduler.FIFOScheduler$$Lambda$268/1225679228.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
"pool-3-thread-8":
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.isOpen(LazyOpenInterpreter.java:100)
        - waiting to lock <0x00000000804c71b8> (a org.apache.zeppelin.spark.PySparkInterpreter)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.close(RemoteInterpreterServer.java:496)
        - locked <0x00000000804941e0> (a org.apache.zeppelin.interpreter.InterpreterGroup)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$close.getResult(RemoteInterpreterService.java:1757)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$close.getResult(RemoteInterpreterService.java:1736)
        at org.apache.zeppelin.shaded.org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
        at org.apache.zeppelin.shaded.org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
        at org.apache.zeppelin.shaded.org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Found 1 deadlock.
{code}
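
Reading the dump, the core of the cycle is a lock-order inversion between two threads: the scheduler worker holds the PySparkInterpreter monitor (inside open()) and waits for the InterpreterGroup monitor, while the Thrift thread holds the InterpreterGroup monitor (inside close()) and waits for the PySparkInterpreter monitor. "Thread-29" is only queued behind the worker on the LazyOpenInterpreter monitor. The following is a minimal sketch of that pattern and of one common remedy (acquiring the monitors in the same order on every path). It is not Zeppelin code: the class LockOrderSketch, its fields and its methods are hypothetical stand-ins, and whether a consistent lock order is the right fix for Zeppelin itself is an assumption that would need checking against the real code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LockOrderSketch {
    // Hypothetical stand-ins for the InterpreterGroup and PySparkInterpreter monitors.
    private final Object group = new Object();
    private final Object interpreter = new Object();
    private final AtomicInteger completed = new AtomicInteger();

    // Models the open() path. In the dump this path took the interpreter
    // monitor first and the group monitor second; here the order is
    // group first, interpreter second, the same as close().
    public void open() {
        synchronized (group) {
            synchronized (interpreter) {
                completed.incrementAndGet();
            }
        }
    }

    // Models the close() path: group first, interpreter second (as in the dump).
    public void close() {
        synchronized (group) {
            synchronized (interpreter) {
                completed.incrementAndGet();
            }
        }
    }

    // Runs open() and close() concurrently and returns how many calls finished.
    // The timeouts make a hang surface as a TimeoutException instead of blocking forever.
    public static int runBoth(int iterations) throws Exception {
        LockOrderSketch sketch = new LockOrderSketch();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> opener = pool.submit(() -> {
                for (int i = 0; i < iterations; i++) {
                    sketch.open();
                }
            });
            Future<?> closer = pool.submit(() -> {
                for (int i = 0; i < iterations; i++) {
                    sketch.close();
                }
            });
            opener.get(10, TimeUnit.SECONDS);
            closer.get(10, TimeUnit.SECONDS);
        } finally {
            pool.shutdownNow();
        }
        return sketch.completed.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("completed calls: " + runBoth(10_000));
    }
}
```

Swapping the two synchronized blocks in open() so that it takes the interpreter monitor first reproduces the inversion seen in the dump; under contention the two loops can then block each other, and the get() calls time out.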




--
This message was sent by Atlassian Jira
(v8.20.7#820007)