
Posted to mapreduce-issues@hadoop.apache.org by "Martin Gerlach (JIRA)" <ji...@apache.org> on 2012/08/15 17:42:38 UTC

[jira] [Created] (MAPREDUCE-4557) With default settings, log aggregation service creates aggregated log dirs with group ownership not matching JH server run-as user

Martin Gerlach created MAPREDUCE-4557:
-----------------------------------------

Summary: With default settings, log aggregation service creates aggregated log dirs with group ownership not matching JH server run-as user
Key: MAPREDUCE-4557
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4557
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.0.0-alpha
Environment: CDH 4.0.1 (ZK, HDFS, YARN, HBase) managed by Cloudera Manager 4.0.3
Reporter: Martin Gerlach
Priority: Minor

In order to read aggregated logs, JH server, running as mapred:hadoop by default, tries to access hdfs://<host:port>/tmp/logs/<user>/logs/<appId>/...

NodeManager runs as yarn:hadoop by default, but it initially creates /tmp/logs as user yarn and leaves the group unchanged. E.g., if /tmp has ownership hdfs:supergroup, /tmp/logs will have ownership yarn:supergroup.

Upon running a job, LogAggregationService creates /tmp/logs/<username> as the user who submitted the job and again leaves the group unchanged, e.g., /tmp/logs/<user> will have ownership <user>:supergroup and permissions 750.

As a result, the JH server, which runs as user and group mapred:hadoop by default, cannot access the aggregated logs.
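
To make the failure concrete, here is a minimal sketch of a POSIX-style owner/group/other check applied to the user directory described above. It is a toy model, not Hadoop code: the `Entry` class, the helper functions, and the submitting user "alice" are hypothetical names used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Toy stand-in for a directory's ownership and mode (not a Hadoop class)."""
    owner: str
    group: str
    mode: int  # octal permission bits, e.g. 0o750


def permitted_bits(entry, user, groups):
    """Return the rwx bits that apply to `user`: owner first, then group, then other."""
    if user == entry.owner:
        return (entry.mode >> 6) & 0o7
    if entry.group in groups:
        return (entry.mode >> 3) & 0o7
    return entry.mode & 0o7


def can_list(entry, user, groups):
    """Listing a directory needs both read (4) and execute (1)."""
    return (permitted_bits(entry, user, groups) & 0o5) == 0o5


# /tmp/logs/<user> as reported: <user>:supergroup, mode 750 ("alice" is hypothetical).
reported = Entry(owner="alice", group="supergroup", mode=0o750)
# The same directory if its group were hadoop instead.
with_hadoop_group = Entry(owner="alice", group="hadoop", mode=0o750)

# The JH server runs as mapred with group hadoop.
print(can_list(reported, "mapred", {"hadoop"}))           # False
print(can_list(with_hadoop_group, "mapred", {"hadoop"}))  # True
```

With group supergroup, mapred falls into the "other" class, which has no bits set under 750, so the listing is denied. With group hadoop, the group bits (r-x) apply and the listing succeeds.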

I'm not sure what a good way of fixing this would be.

There does not seem to be a way to fix this behavior through the configuration. While run-as groups can be specified, they do not seem to affect the created directories.

LogAggregationService should probably use the NodeManager's run-as user AND group (which default to yarn:hadoop) to create /tmp/logs rather than leave the group unchanged.

On the other hand, the user and app dirs are probably best created with the group left unchanged, so that they pick up the group of /tmp/logs (i.e., hadoop).
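
The proposal above can be sketched with a toy model of how a new directory gets its group. It assumes the rule documented for HDFS that a new directory takes its owner from the creating user and its group from the parent directory. `ToyFs`, `Dir`, and the user "alice" are hypothetical and do not correspond to any Hadoop API.

```python
from dataclasses import dataclass


@dataclass
class Dir:
    owner: str
    group: str


class ToyFs:
    """Toy namespace: a new directory inherits the group of its parent."""

    def __init__(self):
        # Starting point from the report: /tmp is hdfs:supergroup.
        self.dirs = {"/tmp": Dir("hdfs", "supergroup")}

    def mkdir(self, path, user):
        parent = path.rsplit("/", 1)[0]
        self.dirs[path] = Dir(user, self.dirs[parent].group)

    def chgrp(self, path, group):
        self.dirs[path].group = group

    def ownership(self, path):
        d = self.dirs[path]
        return f"{d.owner}:{d.group}"


# Behaviour as reported: the group is never changed.
current = ToyFs()
current.mkdir("/tmp/logs", "yarn")
current.mkdir("/tmp/logs/alice", "alice")
print(current.ownership("/tmp/logs"))        # yarn:supergroup
print(current.ownership("/tmp/logs/alice"))  # alice:supergroup

# Proposed behaviour: /tmp/logs gets the NodeManager's group, user dirs inherit it.
proposed = ToyFs()
proposed.mkdir("/tmp/logs", "yarn")
proposed.chgrp("/tmp/logs", "hadoop")
proposed.mkdir("/tmp/logs/alice", "alice")
print(proposed.ownership("/tmp/logs"))        # yarn:hadoop
print(proposed.ownership("/tmp/logs/alice"))  # alice:hadoop
```

In this model only the top-level directory needs an explicit group. The user and app dirs then end up in group hadoop by leaving their group unchanged.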

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-4557) With default settings, log aggregation service creates aggregated log dirs with ownership not matching JH server run-as user and group

Posted by "Martin Gerlach (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Gerlach updated MAPREDUCE-4557:
--------------------------------------

    Summary: With default settings, log aggregation service creates aggregated log dirs with ownership not matching JH server run-as user and group  (was: With default settings, log aggregation service creates aggregated log dirs with group ownership not matching JH server run-as user)
    
