Posted to issues@spark.apache.org by "Tomasz Früboes (JIRA)" <ji...@apache.org> on 2015/08/01 00:14:05 UTC

[jira] [Commented] (SPARK-7791) Set user for executors in standalone-mode

    [ https://issues.apache.org/jira/browse/SPARK-7791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649921#comment-14649921 ] 

Tomasz Früboes commented on SPARK-7791:
---------------------------------------

For us the final solution was to leave standalone mode and set up YARN (and then use Spark on YARN). YARN can be configured to use the LinuxContainerExecutor, which can be configured to set the proper user IDs. No idea how Mesos works, but maybe something similar can be done there?
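For reference, the relevant YARN settings look roughly like this. This is only an illustrative sketch (property names are from the Hadoop documentation; the group value is an example, and a working setup additionally requires a correctly installed setuid container-executor binary and container-executor.cfg):

```xml
<!-- yarn-site.xml: switch the NodeManager to the LinuxContainerExecutor,
     which launches containers as the submitting user rather than the
     NodeManager's own user. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <!-- Example group; must match the group configured in container-executor.cfg -->
  <value>hadoop</value>
</property>
```

With this in place, files written by executors are owned by the user who submitted the application, which avoids the root-owned `_temporary` files described below.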

> Set user for executors in standalone-mode
> -----------------------------------------
>
>                 Key: SPARK-7791
>                 URL: https://issues.apache.org/jira/browse/SPARK-7791
>             Project: Spark
>          Issue Type: Wish
>          Components: Spark Core
>            Reporter: Tomasz Früboes
>
> I'm opening this following a discussion in https://www.mail-archive.com/user@spark.apache.org/msg28633.html
>  Our setup was the following: Spark (1.3.1, prebuilt for Hadoop 2.6, also 2.4) was installed in standalone mode and started manually from the root account. Everything worked properly apart from operations such as
> rdd.saveAsPickleFile(ofile)
> which ended with an exception:
> py4j.protocol.Py4JJavaError: An error occurred while calling o27.save.
> : java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus{path=file:/mnt/lustre/bigdata/med_home/tmp/test19EE/namesAndAges.parquet2/_temporary/0/task_201505191540_0009_r_000001/part-r-00002.parquet; isDirectory=false; length=534; replication=1; blocksize=33554432; modification_time=1432042832000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to file:/mnt/lustre/bigdata/med_home/tmp/test19EE/namesAndAges.parquet2/part-r-00002.parquet
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:346)
> (files created in _temporary were owned by user root). It would be great if Spark could set the user for the executor in standalone mode as well. Setting SPARK_USER has no effect here.
> BTW it may be a good idea to add a warning (e.g. during Spark startup) that running from the root account is not a very healthy idea. E.g. mapping this function
> def test(x):
>     f = open('/etc/testTMF.txt', 'w')
>     return 0
> over an RDD creates a file in /etc/ (surprisingly, calls like f.write("text") end with an exception)
> Thanks,
>   Tomasz



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org