Posted to dev@zeppelin.apache.org by "SunShun (Jira)" <ji...@apache.org> on 2022/01/14 09:33:00 UTC

[jira] [Created] (ZEPPELIN-5634) Memory hasn't been released after restarting spark interpreter from notebook page

SunShun created ZEPPELIN-5634:
---------------------------------

             Summary: Memory hasn't been released after restarting spark interpreter from notebook page
                 Key: ZEPPELIN-5634
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5634
             Project: Zeppelin
          Issue Type: Improvement
    Affects Versions: 0.10.0, 0.9.0
         Environment: Hadoop 2.7.2

Spark 2.4.2 / 2.4.8 (scala 2.11)

Zeppelin 0.10 / master branch
            Reporter: SunShun


Each time a paragraph is run in a new Spark notebook, it consumes some memory inside the Spark interpreter process (JVM).

However, when the interpreter is restarted from the notebook page, the used memory is not released (especially with Spark 2.x), and even more memory is consumed when the paragraph is run again, because a new Spark interpreter is created in the backend.

The memory is only freed after restarting the interpreters from all open notebooks; only then is the current JVM of the Spark interpreter killed.

If more notebooks are open, this can even cause an OOM error, as there is not enough memory left to serve them.

I ran an experiment to observe this problem, using jmap to measure the used memory inside the JVM.
{quote}Given that available memory for driver is 1GB;

When 1st notebook runs, used memory = 455M;

When 2nd notebook runs, used memory = 610M;

When the 2nd notebook is restarted, used memory = 608M (almost no change);

When the 2nd notebook is run again, used memory = 770M.
{quote}
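For reference, a minimal sketch of how such a measurement can be taken with jmap. The PID below is a placeholder (the real one must be looked up, e.g. via `pgrep -f RemoteInterpreterServer`), and the class-histogram form is only one of several ways to inspect retained heap:

```shell
#!/bin/sh
# Hypothetical PID of the Spark interpreter JVM; replace with the real one.
PID=${PID:-12345}

if command -v jmap >/dev/null 2>&1; then
  # Class histogram of live objects: shows which classes retain heap
  # across interpreter restarts (forces a full GC first).
  report=$(jmap -histo:live "$PID" 2>&1 | head -n 20)
else
  report="jmap not found: a JDK is required on PATH"
fi
printf '%s\n' "$report"
```

Comparing two histograms taken before and after a restart from the notebook page should show which interpreter-related classes are not being collected.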
Would it be possible to release the memory when restarting from the notebook page? It would help mitigate the chance of OOM errors on the Spark driver side.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)