Posted to yarn-issues@hadoop.apache.org by "Varun Vasudev (JIRA)" <ji...@apache.org> on 2014/08/18 15:39:18 UTC

[jira] [Assigned] (YARN-2426) NodeManager is not able to use WebHDFS token properly to talk to WebHDFS while localizing

     [ https://issues.apache.org/jira/browse/YARN-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Vasudev reassigned YARN-2426:
-----------------------------------

    Assignee: Varun Vasudev

> NodeManager is not able to use WebHDFS token properly to talk to WebHDFS while localizing
> -----------------------------------------------------------------------------------------
>
>                 Key: YARN-2426
>                 URL: https://issues.apache.org/jira/browse/YARN-2426
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager, resourcemanager, webapp
>    Affects Versions: 2.6.0
>         Environment: Hadoop Kerberos (secure) cluster with LinuxContainerExecutor enabled
> SPNEGO is enabled for the new YARN RM web services for application submission.
> kinit is run with the -C option (to specify the cache path).
> Before executing, we run export KRB5CCNAME=<path provided with -C option>.
> There is no Kerberos ticket in the default KRB5 cache path, which is /tmp.
>            Reporter: Karam Singh
>            Assignee: Varun Vasudev
>
> Encountered this issue while using the new YARN RM web service (WS) for application submission, on a single-node cluster, when submitting a Distributed Shell application through the RM WS.
> For this we need to pass a custom script and the AppMaster jar, along with a webhdfs token, to the NodeManager for localization.
> The Distributed Shell application was failing because the node failed to localize the AppMaster jar.
> The following is the NM log while localizing the AppMaster jar:
> {code}
> 2014-08-18 01:53:52,434 INFO  authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(114)) - Authorization successful for testing (auth:TOKEN) for protocol=interface org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB
> 2014-08-18 01:53:52,757 INFO  localizer.ResourceLocalizationService (ResourceLocalizationService.java:update(1011)) - DEBUG: FAILED { webhdfs://<NAMENODEHOST>:<NAMENODEHTTPPORT>/user/<JARPATH>, 1408352019488, FILE, null }, Authentication required
> 2014-08-18 01:53:52,758 INFO  localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource webhdfs://<NAMENODEHOST>:<NAMENODEHTTPPORT>/user/<JARPATH>(-><NM_LOCAL_DIR>/usercache/<APP_USER>/appcache/application_1408351986532_0001/filecache/10/DshellAppMaster.jar) transitioned from DOWNLOADING to FAILED
> 2014-08-18 01:53:52,758 INFO  container.Container (ContainerImpl.java:handle(999)) - Container container_1408351986532_0001_01_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
> {code}  
> This is similar to what we get when we try to access webhdfs in a secure (Kerberos) cluster without doing kinit.
> Whereas curl -i -k -s 'http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/<JAR_PATH>?op=listStatus&delegation=<same webhdfs token used in app submission structure>'
> works properly.
> I also tried using http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/hadoopqa/<JAR_PATH> in the app submission object instead of the webhdfs:// URI format.
> Then the NodeManager fails to localize, as there is no FileSystem for the http scheme:
> {code}
> 2014-08-18 02:03:31,343 INFO  authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(114)) - Authorization successful for testing (auth:TOKEN) for protocol=interface org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB
> 2014-08-18 02:03:31,583 INFO  localizer.ResourceLocalizationService (ResourceLocalizationService.java:update(1011)) - DEBUG: FAILED { http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/<JAR_PATH> 1408352576841, FILE, null }, No FileSystem for scheme: http
> 2014-08-18 02:03:31,583 INFO  localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/<JAR_PATH>(-><NM_LOCAL_DIR>/usercache/<APP_USER>/appcache/application_1408352544163_0002/filecache/11/DshellAppMaster.jar) transitioned from DOWNLOADING to FAILED
> {code}
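
For reference, the curl check and the webhdfs:// URI in the submission object point at the same resource. A minimal sketch of that mapping, assuming the standard WebHDFS REST layout (/webhdfs/v1/<path>?op=...&delegation=<token>). The helper name, host, port, and token value are hypothetical placeholders, not values from this cluster:

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit


def webhdfs_to_rest_url(uri, op, delegation_token=None, scheme="http"):
    """Map a webhdfs:// URI to the equivalent WebHDFS REST URL.

    Hypothetical helper for illustration only; it is not part of Hadoop.
    """
    parts = urlsplit(uri)
    if parts.scheme != "webhdfs":
        # Mirrors the second failure above: only webhdfs:// is accepted here.
        raise ValueError("expected a webhdfs:// URI, got scheme %r" % parts.scheme)
    query = {"op": op}
    if delegation_token is not None:
        # The delegation query parameter carries the token, as in the curl check.
        query["delegation"] = delegation_token
    path = "/webhdfs/v1" + quote(parts.path)
    return urlunsplit((scheme, parts.netloc, path, urlencode(query), ""))


# Placeholder host, port, path, and token.
print(webhdfs_to_rest_url(
    "webhdfs://nn.example.com:50070/user/hadoopqa/app.jar",
    "LISTSTATUS",
    "TOKEN123",
))
```

The point of the sketch is only that a request built this way carries the token in the URL itself, which is why the curl check succeeds without a Kerberos ticket.
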
> Now do kinit without providing the -C option for the KRB5 cache path, so the ticket goes to the default KRB5 cache in /tmp.
> Again submit the same application object to the YARN WS, with webhdfs:// URI format paths and the webhdfs token.
> This time the NM is able to download the jar and the custom shell script, and the application runs fine.
> It looks like the following is happening:
> webhdfs is trying to look for a krb ticket in the NM while localizing.
> 1. In the first case there was no krb ticket in the default cache, so the application failed while localizing the AppMaster jar.
> 2. In the second case kinit had already been done and a krb ticket was present in /tmp (the default KRB5 cache), so the AppMaster jar was localized successfully.
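
As an illustration of the two cases above: a Kerberos client usually takes its ticket cache location from KRB5CCNAME and otherwise falls back to a default. A minimal sketch, assuming the common MIT Kerberos default on Linux of FILE:/tmp/krb5cc_<uid> (krb5.conf can override this). resolve_ccache is a hypothetical helper, not a Hadoop or Kerberos API:

```python
import os


def resolve_ccache(env=None, uid=None):
    """Return the credential cache a Kerberos client would likely consult.

    Assumption: when KRB5CCNAME is unset, the cache is FILE:/tmp/krb5cc_<uid>.
    """
    env = os.environ if env is None else env
    uid = os.getuid() if uid is None else uid
    explicit = env.get("KRB5CCNAME")
    if explicit:
        # Case 1 for the submitting shell: the ticket lives in a custom path.
        return explicit
    # A process without KRB5CCNAME in its environment falls back to the default.
    return "FILE:/tmp/krb5cc_%d" % uid


# The submitting shell exported KRB5CCNAME (placeholder path and uid).
print(resolve_ccache({"KRB5CCNAME": "/home/user/custom_cc"}, uid=1000))
# A process that did not inherit KRB5CCNAME looks in /tmp instead.
print(resolve_ccache({}, uid=1000))
```

A process that does not inherit KRB5CCNAME, which may be the situation of the NodeManager daemon here, would resolve to the /tmp default. That would be consistent with localization failing in case 1 and succeeding in case 2, and with the delegation token not being what authenticates the request.
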



--
This message was sent by Atlassian JIRA
(v6.2#6252)