Posted to dev@zeppelin.apache.org by egorklimov <gi...@git.apache.org> on 2018/07/30 17:10:26 UTC

[GitHub] zeppelin pull request #3102: [ZEPPELIN-3671] Add info about running interpre...

GitHub user egorklimov opened a pull request:

    https://github.com/apache/zeppelin/pull/3102

    [ZEPPELIN-3671] Add info about running interpreters to API and JMX

    ### What is this PR for?
    It would be nice to be able to get the PID of a running interpreter, and the group of paragraphs running under that interpreter, through the API and JMX.
    
    This makes it easy to check CPU load, memory load, and so on.
    This PR adds:
    * API method to get info about running interpreters (`/api/interpreter/running`)
    * API method to get info about running paragraphs grouped by interpreters (`/api/notebook/jobmanager/running`)
    * A few JMX methods that return the same information as the API.
    * A template for simple analysis of running paragraphs using the API.
    
    ### What type of PR is it?
    Improvement
    
    ### What is the Jira issue?
    Issue on Jira https://issues.apache.org/jira/browse/ZEPPELIN-3671
    
    ### How should this be tested?
    * CI is green
    
    ### Screenshots (if appropriate)
    Example of tables built on response data:
    * Interpreters:
    ![intp-table-example](https://user-images.githubusercontent.com/6136993/43404050-3e123f3c-941f-11e8-9223-056cd0d96adf.png)
    * Paragraphs:
    ![running-paragraphs-info](https://user-images.githubusercontent.com/6136993/43404049-3df6ee58-941f-11e8-9e7e-d60a1859a7de.png)
    * Count of paragraphs belonging to each interpreter:
    ![paragraph-example](https://user-images.githubusercontent.com/6136993/43404047-3dd8e6ba-941f-11e8-89a0-658c4686ef96.png)
    
    Example of CPU and memory load analysis:
    * Pie chart for memory load
    ![memory-example](https://user-images.githubusercontent.com/6136993/43404053-3e6d7726-941f-11e8-88a4-5d872bf8efaf.png)
    * Pie chart for CPU load
    ![cpu-example](https://user-images.githubusercontent.com/6136993/43404051-3e2f5c20-941f-11e8-9aa8-2749b843180a.png)
    
    [Tree of processes associated with running](http://localhost:8080/#/notebook/2DNG4YSTH/paragraph/20180727-122427_2143015301?asIframe)
    
    The same data using JMX (viewed in jconsole):
    
    * Running interpreters:
    ![jconsole2](https://user-images.githubusercontent.com/6136993/43408918-2a31b306-942b-11e8-9a51-f2cf78f52a8c.png)
    * Running paragraphs
    ![jconsole-example](https://user-images.githubusercontent.com/6136993/43408919-2a550478-942b-11e8-854d-7281615f1b17.png)
    
    
    ### Questions:
    * Do the license files need an update? No
    * Are there breaking changes for older versions? No
    * Does this need documentation? Yes, the API docs should be updated
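
The two REST endpoints listed in the PR description could be queried along the following lines. This is only a sketch: the endpoints are the ones proposed in this PR, the base URL `http://localhost:8080` is an assumption, and `running_endpoint_url` and `fetch_running` are hypothetical helper names. The response schema is not shown in the thread, so the body is printed as-is rather than parsed.

```shell
# Assumed base URL; adjust host and port to match your Zeppelin server.
ZEPPELIN_URL="${ZEPPELIN_URL:-http://localhost:8080}"

# Hypothetical helper: build the full URL for one of the endpoints proposed in this PR.
running_endpoint_url() {
  case "$1" in
    interpreters) printf '%s/api/interpreter/running\n' "${ZEPPELIN_URL}" ;;
    paragraphs)   printf '%s/api/notebook/jobmanager/running\n' "${ZEPPELIN_URL}" ;;
    *) echo "unknown kind: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: fetch the raw response body for the chosen endpoint.
# --fail makes curl exit non-zero on HTTP errors instead of printing the error page.
fetch_running() {
  curl --silent --show-error --fail "$(running_endpoint_url "$1")"
}
```

With a server running, `fetch_running interpreters` or `fetch_running paragraphs` would print whatever the endpoint returns.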


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/TinkoffCreditSystems/zeppelin DW-17571

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/3102.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3102
    
----
commit 47b6fd4df9d5a7088a77b60479946d50ad1fff71
Author: egorklimov <kl...@...>
Date:   2018-07-24T13:28:43Z

    MBean register fixed

commit 2fe7e965d1475541e25696f38e5d71d606458054
Author: egorklimov <kl...@...>
Date:   2018-07-24T16:22:20Z

    Running statistics functions added

commit 86a4308e86541f0685f7ea9484f5ddad038fad39
Author: egorklimov <kl...@...>
Date:   2018-07-27T17:44:54Z

    Bugs fixed

commit 2547443ab166dde30007d80a73aa3d113f976c0c
Author: Egor Klimov <kl...@...>
Date:   2018-07-30T12:00:10Z

    Add template1

commit 3119311c1d06957245f9716d7181020686871665
Author: Egor Klimov <kl...@...>
Date:   2018-07-30T12:01:39Z

    Add template 2

commit beaa547d5e79efebc381f470524a2cd53ff13f6f
Author: Egor Klimov <kl...@...>
Date:   2018-07-30T12:11:15Z

    Bug fixed

commit 3f8ce54fcff5d675628f982c9ca3b701b99a5441
Author: Egor Klimov <kl...@...>
Date:   2018-07-30T13:16:38Z

    Admin template updated

commit a321aee0588b58fc8e91697383aade6567115fe2
Author: Egor Klimov <kl...@...>
Date:   2018-07-30T13:18:14Z

    Interpreter template updated

commit e071378c64c294d2ff6ce0f0ff1daba02de16576
Author: egorklimov <kl...@...>
Date:   2018-07-30T15:54:15Z

    Docs updated, bug fixed

----


---

[GitHub] zeppelin pull request #3102: [WIP][ZEPPELIN-3671] Add info about running int...

Posted by egorklimov <gi...@git.apache.org>.
Github user egorklimov closed the pull request at:

    https://github.com/apache/zeppelin/pull/3102


---

[GitHub] zeppelin issue #3102: [WIP][ZEPPELIN-3671] Add info about running interprete...

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/3102
  
    Thanks @egorklimov, this is an interesting feature.
    This assumes every interpreter process generates a PID file, but that assumption does not always hold. There are at least two exceptions for now. One is Spark's yarn-cluster mode, where the interpreter runs on a remote node of the YARN cluster. The other is running the interpreter in a container, which is on our roadmap. Do you have any ideas on how to handle these two scenarios?


---

[GitHub] zeppelin issue #3102: [WIP][ZEPPELIN-3671] Add info about running interprete...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/zeppelin/pull/3102
  
    cool!


---

[GitHub] zeppelin issue #3102: [WIP][ZEPPELIN-3671] Add info about running interprete...

Posted by egorklimov <gi...@git.apache.org>.
Github user egorklimov commented on the issue:

    https://github.com/apache/zeppelin/pull/3102
  
    No tests added yet.


---

[GitHub] zeppelin issue #3102: [WIP][ZEPPELIN-3671] Add info about running interprete...

Posted by egorklimov <gi...@git.apache.org>.
Github user egorklimov commented on the issue:

    https://github.com/apache/zeppelin/pull/3102
  
    There are some problems processing the `run` folder.


---

[GitHub] zeppelin issue #3102: [WIP][ZEPPELIN-3671] Add info about running interprete...

Posted by egorklimov <gi...@git.apache.org>.
Github user egorklimov commented on the issue:

    https://github.com/apache/zeppelin/pull/3102
  
    @zjffdu If I'm not mistaken, according to `interpreter.sh`:
    ```
    HOSTNAME=$(hostname)
    ZEPPELIN_SERVER=org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
    
    INTERPRETER_ID=$(basename "${INTERPRETER_DIR}")
    ZEPPELIN_PID="${ZEPPELIN_PID_DIR}/zeppelin-interpreter-${INTERPRETER_ID}-${ZEPPELIN_IDENT_STRING}-${HOSTNAME}-${PORT}.pid"
    ZEPPELIN_LOGFILE="${ZEPPELIN_LOG_DIR}/zeppelin-interpreter-${INTERPRETER_SETTING_NAME}-"
    ``` 
    The PID file is generated every time (except when the interpreter process fails to start). In the YARN scenario, though, it will be hard to analyze CPU and memory load because resources are consumed on each node; maybe someone will add one more template for that case.
    
    For containers, I suppose we could write other information to the run folder.
    
    Right now my tests fail in the third Travis job, and I'm trying to fix that scenario ;)
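
The file-name pattern in the `interpreter.sh` excerpt suggests one way to collect interpreter PIDs: scan the PID directory for matching files. The sketch below assumes that every matching file holds a single PID and that the directory is passed as an argument. `list_interpreter_pids` is a hypothetical helper, not part of Zeppelin or of this PR.

```shell
# Hypothetical helper, not part of Zeppelin: print "<pid-file-name> <pid>" for
# every interpreter PID file in the given directory.
# Relies on the zeppelin-interpreter-*.pid naming shown in the interpreter.sh excerpt.
# The body runs in a subshell, so its variables do not leak into the caller.
list_interpreter_pids() (
  pid_dir="$1"
  for pid_file in "${pid_dir}"/zeppelin-interpreter-*.pid; do
    # If nothing matches, the glob stays literal; skip that non-existent path.
    [ -f "${pid_file}" ] || continue
    # Assumes the file holds a single PID; strip any whitespace around it.
    pid=$(tr -d '[:space:]' < "${pid_file}")
    printf '%s %s\n' "$(basename "${pid_file}")" "${pid}"
  done
)
```

For example, `list_interpreter_pids "${ZEPPELIN_PID_DIR}"` would print one line per locally started interpreter. Each PID could then be passed to a tool such as `ps -o pcpu=,rss= -p <pid>` to read CPU and memory usage. As noted earlier in the thread, this does not cover interpreters running on remote YARN nodes or in containers.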


---