Posted to mapreduce-user@hadoop.apache.org by Shay Rojansky <ro...@roji.org> on 2015/05/19 17:37:27 UTC

Log aggregation ownership issue with non-HDFS setup

Hi.

I'm trying to set up YARN log aggregation on a cluster using a shared NFS
filesystem (no HDFS). The issue is that the user directories that get
created under yarn.nodemanager.remote-app-log-dir are owned by the node
manager owner, and not by the submitting user (who therefore can't access
their logs). I've also tried running the node manager as root, to no avail.

Looking into the sources, it seems that the LogWriter creates the file
within a UserGroupInformation.doAs block; I'm guessing that under HDFS that
means "impersonating" the submitting user, and the file gets created with
their ownership. However, if yarn.nodemanager.remote-app-log-dir happens to
be on a file:/// filesystem and not HDFS, this doesn't happen.
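
To illustrate what I suspect is going on, here is a minimal sketch in plain
JDK code (no Hadoop classes; the class name LocalOwnerDemo is made up for
this example). It only shows the general OS behaviour: a file created
through the local filesystem is owned by the user running the JVM. My
assumption, which I have not verified against the Hadoop sources, is that
doAs does not change the OS-level user of the node manager process, so the
same thing would happen to the aggregated log files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LocalOwnerDemo {

    // Creates a scratch file in the default temp directory and returns the
    // name of its owner as reported by the operating system.
    static String ownerOfNewFile() {
        try {
            Path file = Files.createTempFile("owner-demo", ".log");
            try {
                return Files.getOwner(file).getName();
            } finally {
                // Clean up the scratch file whether or not the lookup worked.
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The JVM's own user, for comparison with the file owner below.
        System.out.println("JVM user (user.name): " + System.getProperty("user.name"));
        System.out.println("Owner of new file:    " + ownerOfNewFile());
    }
}
```

If that assumption holds, running the node manager as a different OS user
would only change who owns the files, not make them owned by the submitter.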

Can anyone confirm this, and/or suggest a workaround? Is it maybe possible
to access the aggregated logs via some sort of REST API?

Thanks,

Shay