Posted to yarn-issues@hadoop.apache.org by "Zian Chen (JIRA)" <ji...@apache.org> on 2018/08/08 22:08:00 UTC

[jira] [Comment Edited] (YARN-8523) Interactive docker shell

    [ https://issues.apache.org/jira/browse/YARN-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573946#comment-16573946 ] 

Zian Chen edited comment on YARN-8523 at 8/8/18 10:07 PM:
----------------------------------------------------------

[~eyang], thanks for raising this feature. It is very useful for live debugging and container diagnosis. We can add a series of interactive commands to let users debug more effectively, such as tail -f on the container log, container resource usage, etc.

To handle the NodeManager restart scenario, we can register an event listener for the restart or shutdown signal on the NodeManager web socket and respond in the xterm.js terminal accordingly (for example, print an NM restart/shutdown message to the user), then retry the connection several times after the typical NM restart interval.

Again, if the NM hits an unexpected issue and cannot resume its service, that is something the interactive docker shell cannot solve by itself. In that case we should just give the user a reasonable alert message describing the current situation (for example, "retry failed with timeout, please check the NM log for more information").
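
To make the restart handling above concrete, here is a rough client-side sketch of the retry loop. Everything in it is an assumption for illustration only: the names (RetryPolicy, reconnectWithRetries, delayBeforeAttempt), the delay values, and the message texts are hypothetical and are not part of YARN, the NodeManager, or xterm.js. The connect and notify callbacks are injected so the sketch does not assume any particular web socket endpoint or terminal API.

```typescript
// Hypothetical sketch: none of these names come from YARN or xterm.js.
type Connect = () => Promise<void>;       // resolves once the NM web socket is open again
type Notify = (message: string) => void;  // shows a line to the user, e.g. in the terminal

interface RetryPolicy {
  initialDelayMs: number; // wait about one typical NM restart interval first
  retryDelayMs: number;   // pause between later attempts
  maxAttempts: number;    // give up after this many attempts
}

// Delay to wait before a given attempt (attempt numbers start at 1).
function delayBeforeAttempt(policy: RetryPolicy, attempt: number): number {
  return attempt === 1 ? policy.initialDelayMs : policy.retryDelayMs;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Intended to be called when the web socket closes or signals an NM restart/shutdown.
// Returns true if the connection came back, false if every attempt failed.
async function reconnectWithRetries(
  connect: Connect,
  notify: Notify,
  policy: RetryPolicy,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<boolean> {
  notify("NodeManager connection lost; it may be restarting or shutting down.");
  for (let attempt = 1; attempt <= policy.maxAttempts; attempt++) {
    await wait(delayBeforeAttempt(policy, attempt));
    try {
      await connect();
      notify(`Reconnected on attempt ${attempt}.`);
      return true;
    } catch {
      notify(`Reconnect attempt ${attempt} of ${policy.maxAttempts} failed.`);
    }
  }
  // Nothing more the shell can do by itself; point the user at the NM log.
  notify("Retry failed with timeout; please check the NM log for more information.");
  return false;
}
```

In a browser, connect could wrap a new WebSocket and resolve on its open event, and notify could write the message into the xterm.js terminal. Both are left out here because the actual NM endpoint and terminal wiring are not defined yet.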

I think passing commands through the NM web socket and reusing the container-executor security check would be a good prototype to build first, without taking on the burden of handling the root daemon by carving out another secure channel.


> Interactive docker shell
> ------------------------
>
>                 Key: YARN-8523
>                 URL: https://issues.apache.org/jira/browse/YARN-8523
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Priority: Major
>              Labels: Docker
>
> Some applications might require interactive Unix command execution to carry out operations.  Container-executor can interface with docker exec to debug or analyze docker containers while the application is running.  It would be nice to support an API to invoke docker exec to perform Unix commands and report the output back to the application master.  The application master can distribute and aggregate execution of the commands and record the results in the application master log file.


