Posted to common-user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2015/04/17 21:00:54 UTC

Error in YARN localization with Active Directory user -- inconsistent directory name escapement

We have a Cloudera 5.3 cluster running on CentOS 6 that is Kerberos-enabled and uses an external AD domain controller as the KDC.  We are able to authenticate, browse HDFS, etc.  However, YARN fails during localization, apparently because it gets confused by the \ character in the local user name.

Our AD authentication on the nodes goes through sssd, which is configured to map AD users onto the form domain\username.  For example, our test user has a Kerberos principal of rpdmuserAD@OFFICE.DATALEVR.COM, which maps onto the CentOS user "office\rpdmuserAD".  We have no problem validating that user with PAM, logging in as that user, su-ing to that user, etc.

However, when we attempt to run a YARN application master, the localization step fails when setting up the local cache directory for the AM.  The error that comes out of the RM logs:
2015-04-17 12:47:09 INFO net.redpoint.yarnapp.Client[0]: monitorApplication: ApplicationReport: appId=1, state=FAILED, progress=0.0, finalStatus=FAILED, diagnostics='Application application_1429295486450_0001 failed 1 times due to AM Container for appattempt_1429295486450_0001_000001 exited with  exitCode: -1000 due to: Application application_1429295486450_0001 initialization failed (exitCode=255) with output: main : command provided 0
main : user is OFFICE\rpdmuserad
main : requested yarn user is office\rpdmuserAD
org.apache.hadoop.util.DiskChecker$DiskErrorException: Cannot create directory: /data/yarn/nm/usercache/office%5CrpdmuserAD/appcache/application_1429295486450_0001/filecache/10
                at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:105)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.download(ContainerLocalizer.java:199)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:241)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:347)
.Failing this attempt.. Failing the application.'

However, when we look on the node launching the AM, we see this:
[root@rpb-cdh-kerb-2 ~]# cd /data/yarn/nm/usercache
[root@rpb-cdh-kerb-2 usercache]# ls -l
drwxr-s--- 4 OFFICE\rpdmuserad yarn 4096 Apr 17 12:10 office\rpdmuserAD

The \ character appears to be treated differently in different places.  Something creates the directory as "office\rpdmuserAD", but something else later tries to use it as "office%5CrpdmuserAD".  I'm not sure where or why URL escaping converts the \ to %5C, or why it is not applied consistently.
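To illustrate the mismatch, here is a minimal sketch in Python (not Hadoop's Java). It assumes that one code path percent-encodes the directory path as a URI while another uses the raw string. That is only a guess at the mechanism, not a trace of the actual YARN code; the variable names are made up for the example.

```python
from urllib.parse import quote, unquote

USERCACHE = "/data/yarn/nm/usercache"  # directory taken from the log above
local_user = "office\\rpdmuserAD"      # contains one literal backslash

# The path as a plain filesystem string: this is the name that exists on disk.
raw_path = f"{USERCACHE}/{local_user}"

# The same path after URL-style percent-encoding: "\" (0x5C) becomes "%5C",
# while "/" is left alone because it is in quote()'s default safe set.
encoded_path = quote(raw_path)

print(raw_path)
print(encoded_path)

# Decoding restores the on-disk name, so a consumer that skips the decode
# step ends up looking for a directory that was never created.
assert unquote(encoded_path) == raw_path
```

If something like this is happening, the fix would be to decode the path (or never encode it) before touching the filesystem.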

Is this a known issue?  Any fixes available?  Are we simply not allowed to map local usernames this way?

I should also mention, for the sake of completeness, our auth_to_local rule is set up to map user@OFFICE.DATALEVER.COM to OFFICE\user:
RULE:[1:$1@$0](^.*@OFFICE\.DATALEVER\.COM$)s/^(.*)@OFFICE\.DATALEVER\.COM$/office\\$1/g
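
For anyone reading the rule, here is a rough Python approximation of what it does to a single-component principal. The function name map_principal is hypothetical and this is not Hadoop's own rule evaluator; it assumes the \\ in the replacement yields one literal backslash, which matches the local user name we observe.

```python
import re
from typing import Optional

# (^.*@OFFICE\.DATALEVER\.COM$): the rule only applies if this filter matches.
REALM_FILTER = re.compile(r"^.*@OFFICE\.DATALEVER\.COM$")
# s/^(.*)@OFFICE\.DATALEVER\.COM$/office\\$1/g: the sed-style substitution.
SED_PATTERN = re.compile(r"^(.*)@OFFICE\.DATALEVER\.COM$")


def map_principal(user: str, realm: str) -> Optional[str]:
    """Approximate the auth_to_local rule for a one-component principal."""
    candidate = f"{user}@{realm}"  # [1:$1@$0] builds "user@REALM"
    if not REALM_FILTER.match(candidate):
        return None  # the rule does not apply to this principal
    # In the template, "\\" is a literal backslash and "\1" is the captured user.
    return SED_PATTERN.sub(r"office\\\1", candidate)


print(map_principal("rpdmuserAD", "OFFICE.DATALEVER.COM"))
```

A principal from any other realm falls through (returns None here), which corresponds to Hadoop moving on to the next rule.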

Thanks
John Lilley


RE: Error in YARN localization with Active Directory user -- inconsistent directory name escapement

Posted by John Lilley <jo...@redpoint.net>.
Follow-up: this is indeed a YARN bug. I've filed a JIRA, which has garnered a lot of attention and a patch.
john


