Posted to user@spark.apache.org by sd wang <pi...@gmail.com> on 2018/01/22 07:28:20 UTC

run spark job in yarn cluster mode as specified user

Hi Advisers,
When submitting a Spark job in YARN cluster mode, the job is executed as
the "yarn" user. Is there any parameter that can change this user? I
tried setting HADOOP_USER_NAME, but it did not work. I'm using Spark 2.2.
Thanks for any help!

Re: run spark job in yarn cluster mode as specified user

Posted by Jörn Franke <jo...@gmail.com>.
Configure Kerberos
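With Kerberos enabled, YARN runs each container as the authenticated
submitting user instead of the NodeManager user. A minimal sketch of the
core switch in core-site.xml (a full secure setup also needs principals
and keytabs for the HDFS and YARN daemons, plus the
LinuxContainerExecutor):

  <!-- core-site.xml: switch from simple to Kerberos authentication;
       this alone is not a complete secure-cluster configuration -->
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>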

> On 22. Jan 2018, at 08:28, sd wang <pi...@gmail.com> wrote:
> 
> Hi Advisers,
> When submitting a Spark job in YARN cluster mode, the job is executed as the "yarn" user. Is there any parameter that can change this user? I tried setting HADOOP_USER_NAME, but it did not work. I'm using Spark 2.2.
> Thanks for any help!

Re: run spark job in yarn cluster mode as specified user

Posted by sd wang <pi...@gmail.com>.
Thanks!
I finally made this work. Besides the LinuxContainerExecutor setting and
the cache directory permissions, the following parameter also needs to
be set to the specified user:
yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user
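For reference, in yarn-site.xml this looks roughly as follows (the user
name here is just an illustration; the default value is "nobody", which
matches the "run as user is nobody" error earlier in this thread):

  <property>
    <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user</name>
    <!-- illustrative value; the default is "nobody" -->
    <value>test_user</value>
  </property>

If the Hadoop version supports it, setting
yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users to
false instead makes containers run as each requesting user rather than
this single local user.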

Thanks.

2018-01-22 22:44 GMT+08:00 Margusja <ma...@roo.ee>:

> Hi
>
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor requires
> the user to exist on each node and the right permissions set on the
> necessary directories.
>
> Br
> Margus
>
>
> On 22 Jan 2018, at 13:41, sd wang <pi...@gmail.com> wrote:
>
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
>
>
>

Re: run spark job in yarn cluster mode as specified user

Posted by Margusja <ma...@roo.ee>.
Hi

org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor requires the user to exist on each node and the right permissions set on the necessary directories.
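Concretely, the pieces that usually have to line up (a sketch; the group
name is illustrative): the container-executor binary must be owned by
root with the NodeManager group and setuid permissions (typically
root:yarn, mode 6050), and yarn-site.xml must name that same group:

  <!-- yarn-site.xml: group the setuid container-executor binary is
       restricted to; must match the binary's group ownership -->
  <property>
    <name>yarn.nodemanager.linux-container-executor.group</name>
    <value>yarn</value>
  </property>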

Br
Margus


> On 22 Jan 2018, at 13:41, sd wang <pi...@gmail.com> wrote:
> 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor


Re: run spark job in yarn cluster mode as specified user

Posted by sd wang <pi...@gmail.com>.
Hi Margus,
Appreciate your help!
It seems this parameter is related to the CGroups functionality.
I am using CDH without Kerberos. I set the parameter:
yarn.nodemanager.container-executor.class=org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor

Then I ran the Spark job again and hit the problem below. Any points I missed?
Thanks again !
... ...
Diagnostics: Application application_1516614010938_0003 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is nobody
main : requested yarn user is ses_test
Can't create directory /data/yarn/nm/usercache/test_user/appcache/application_1516614010938_0003 - Permission denied
Can't create directory /data01/yarn/nm/usercache/test_user/appcache/application_1516614010938_0003 - Permission denied
Did not create any app directories
... ...



2018-01-22 15:36 GMT+08:00 Margusja <ma...@roo.ee>:

> Hi
>
> One way to get it is to use the YARN configuration parameter
> yarn.nodemanager.container-executor.class.
> By default it is
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.
>
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor gives
> you the user who executes the script.
>
> Br
> Margus
>
>
>
> On 22 Jan 2018, at 09:28, sd wang <pi...@gmail.com> wrote:
>
> Hi Advisers,
> When submitting a Spark job in YARN cluster mode, the job is executed as
> the "yarn" user. Is there any parameter that can change this user? I
> tried setting HADOOP_USER_NAME, but it did not work. I'm using Spark 2.2.
> Thanks for any help!
>
>
>

Re: run spark job in yarn cluster mode as specified user

Posted by Margusja <ma...@roo.ee>.
Hi

One way to get it is to use the YARN configuration parameter yarn.nodemanager.container-executor.class.
By default it is org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.

org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor gives you the user who executes the script.
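In yarn-site.xml that would be, roughly:

  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>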

Br
Margus



> On 22 Jan 2018, at 09:28, sd wang <pi...@gmail.com> wrote:
> 
> Hi Advisers,
> When submitting a Spark job in YARN cluster mode, the job is executed as the "yarn" user. Is there any parameter that can change this user? I tried setting HADOOP_USER_NAME, but it did not work. I'm using Spark 2.2.
> Thanks for any help!