Posted to yarn-issues@hadoop.apache.org by "Eric Yang (JIRA)" <ji...@apache.org> on 2018/08/09 21:52:00 UTC

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

    [ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575445#comment-16575445 ] 

Eric Yang commented on YARN-8569:
---------------------------------

Sysfs is a pseudo file system provided by the Linux kernel to expose system-related information to user space.  YARN can follow the same idea to export cluster information to containers.  The proposal is to expose cluster information at:

{code}
/hadoop/yarn/fs/cluster.json
{code}

This file basically contains the runtime information about the deployed application and gets updated when state changes happen.  The file is replicated from HDFS to the host system in the application's appcache.
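
A minimal sketch of how a container process might consume this file (the JSON schema and key names below are assumptions for illustration; this proposal does not define the final format):

{code}
import json

# Path proposed above; mounted into the container by YARN.
CLUSTER_INFO = "/hadoop/yarn/fs/cluster.json"

with open(CLUSTER_INFO) as f:
    cluster = json.load(f)

# Assumed layout: {"components": {"ps": ["ps0.example.com:2222", ...],
#                                 "worker": ["worker0.example.com:2222", ...]}}
for component, hosts in cluster.get("components", {}).items():
    print(component, ",".join(hosts))
{code}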

> Create an interface to provide cluster information to application
> -----------------------------------------------------------------
>
>                 Key: YARN-8569
>                 URL: https://issues.apache.org/jira/browse/YARN-8569
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Priority: Major
>              Labels: Docker
>
> Some programs require container hostnames to be known for the application to run.  For example, distributed TensorFlow requires a launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>      --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
>      --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
>      --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>      --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
>      --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
>      --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>      --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
>      --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
>      --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>      --ps_hosts=ps0.example.com:2222,ps1.example.com:2222 \
>      --worker_hosts=worker0.example.com:2222,worker1.example.com:2222 \
>      --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell or the YARN services launch_command.  In addition, the dynamic parameters do not work with the YARN flex command.  This is the classic pain point for application developers attempting to automate passing system environment settings as parameters to the end-user application.
> It would be great if the YARN Docker integration could provide a simple option to expose the hostnames of the YARN service via a mounted file.  The file content gets updated when a flex command is performed.  This allows application developers to consume system environment settings via a standard interface, as sketched below.  It is like /proc/devices for Linux, but for Hadoop.  This may involve updating a file in the distributed cache and allowing the file to be mounted via container-executor.
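> A hypothetical wrapper (the cluster.json path, JSON schema, and JOB_NAME/TASK_INDEX environment variables are assumptions for illustration, not part of this proposal) showing how the hard-coded launch_command above could instead be derived from the mounted file:
> {code}
> import json, os, subprocess
>
> # Read the cluster information file mounted by YARN (path assumed).
> with open("/hadoop/yarn/fs/cluster.json") as f:
>     cluster = json.load(f)["components"]   # assumed key layout
>
> ps_hosts = ",".join(cluster["ps"])
> worker_hosts = ",".join(cluster["worker"])
>
> # Per-container identity, e.g. injected by the service spec (assumed names).
> job_name = os.environ["JOB_NAME"]
> task_index = os.environ["TASK_INDEX"]
>
> subprocess.run(["python", "trainer.py",
>                 "--ps_hosts=" + ps_hosts,
>                 "--worker_hosts=" + worker_hosts,
>                 "--job_name=" + job_name,
>                 "--task_index=" + task_index],
>                check=True)
> {code}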


