Posted to common-user@hadoop.apache.org by Rod Paulk <rm...@gmail.com> on 2013/07/31 23:28:27 UTC
YARN with local filesystem
I am having an issue running YARN MapReduce from 2.0.5-alpha (BigTop-0.6.0)
on the local filesystem instead of HDFS. The job fails with an error saying
the appTokens file is missing, yet the file does exist after the job fails.
I saw similar issues noted in YARN-917, YARN-513, and YARN-993. When I
switch to HDFS, the jobs run fine.
In core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>file:///</value>
</property>
In mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
NodeManager log from the failed job:
2013-07-29 16:13:06,549 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
Start request for container_1375138534137_0003_01_000001 by user rpaulk
2013-07-29 16:13:06,549 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
Creating a new application reference for app application_1375138534137_0003
2013-07-29 16:13:06,549 INFO
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=rpaulk
IP=172.20.130.215 OPERATION=Start Container Request
TARGET=ContainerManageImpl RESULT=SUCCESS
APPID=application_1375138534137_0003
CONTAINERID=container_1375138534137_0003_01_000001
2013-07-29 16:13:06,551 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1375138534137_0003 transitioned from NEW to INITING
2013-07-29 16:13:06,551 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Adding container_1375138534137_0003_01_000001 to application
application_1375138534137_0003
2013-07-29 16:13:06,554 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Application application_1375138534137_0003 transitioned from INITING to
RUNNING
2013-07-29 16:13:06,555 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1375138534137_0003_01_000001 transitioned from NEW to
LOCALIZING
*2013-07-29 16:13:06,555 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
transitioned from INIT to DOWNLOADING*
2013-07-29 16:13:06,556 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.jar
transitioned from INIT to DOWNLOADING
2013-07-29 16:13:06,556 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.splitmetainfo
transitioned from INIT to DOWNLOADING
2013-07-29 16:13:06,556 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.split
transitioned from INIT to DOWNLOADING
2013-07-29 16:13:06,556 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.xml
transitioned from INIT to DOWNLOADING
2013-07-29 16:13:06,556 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Created localizer for container_1375138534137_0003_01_000001
2013-07-29 16:13:06,559 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
Writing credentials to the nmPrivate file
/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1375138534137_0003_01_000001.tokens.
Credentials list:
2013-07-29 16:13:06,560 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Initializing user rpaulk
2013-07-29 16:13:06,564 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying
from
/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1375138534137_0003_01_000001.tokens
to
/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003/container_1375138534137_0003_01_000001.tokens
2013-07-29 16:13:06,564 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
to
/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003
=
file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003
*2013-07-29 16:13:06,646 ERROR
org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:rpaulk (auth:SIMPLE) cause:java.io.FileNotFoundException: File
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
does not exist*
2013-07-29 16:13:06,648 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
DEBUG: FAILED {
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens,
1375139459000, FILE, null }
RemoteTrace:
java.io.FileNotFoundException: File
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
does not exist
at
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
at
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:395)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
at
org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51)
at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284)
at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:280)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
at LocalTrace:
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl:
File
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
does not exist
at
org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
at
org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:819)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:491)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:218)
at
org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
at
org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
2013-07-29 16:13:06,650 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1375138534137_0003_01_000001 transitioned from
LOCALIZING to LOCALIZATION_FAILED
*2013-07-29 16:13:06,650 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
transitioned from DOWNLOADING to INIT*
2013-07-29 16:13:06,650 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.jar
transitioned from DOWNLOADING to INIT
2013-07-29 16:13:06,650 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.splitmetainfo
transitioned from DOWNLOADING to INIT
2013-07-29 16:13:06,650 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.split
transitioned from DOWNLOADING to INIT
2013-07-29 16:13:06,650 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
Resource
file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.xml
transitioned from DOWNLOADING to INIT
2013-07-29 16:13:06,652 WARN
org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=rpaulk
OPERATION=Container Finished - Failed TARGET=ContainerImpl
RESULT=FAILURE DESCRIPTION=Container failed with state:
LOCALIZATION_FAILED APPID=application_1375138534137_0003
CONTAINERID=container_1375138534137_0003_01_000001
2013-07-29 16:13:06,652 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
Container container_1375138534137_0003_01_000001 transitioned from
LOCALIZATION_FAILED to DONE
2013-07-29 16:13:06,652 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
Removing container_1375138534137_0003_01_000001 from application
application_1375138534137_0003
2013-07-29 16:13:06,652 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
Considering container container_1375138534137_0003_01_000001 for
log-aggregation
2013-07-29 16:13:06,652 INFO
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
Deleting absolute path :
/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003/container_1375138534137_0003_01_000001
Re: YARN with local filesystem
Posted by Rod Paulk <rm...@gmail.com>.
I was able to run the example by submitting the job as the yarn user.
For example, the following completes successfully:
sudo -u yarn yarn org.apache.hadoop.examples.RandomWriter /tmp/random-out
whereas the same command fails when run as the local user rpaulk:
yarn org.apache.hadoop.examples.RandomWriter /tmp/random-out
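The difference between the two runs suggests, though nothing in this thread confirms it, that the NodeManager (running as the yarn user) may be unable to read files under rpaulk's staging directory on the local filesystem. A minimal sketch for checking that is below. The helper names show_path_perms and readable_by are hypothetical and not part of Hadoop; readable_by assumes sudo rights and that the NodeManager runs as an OS user named yarn.

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop command): print the mode and owner of a
# path and of every parent directory, so a directory that other users
# cannot traverse (for example mode 700) stands out.
show_path_perms() {
    p=$1
    while :; do
        ls -ld "$p"
        # Stop at the filesystem root, or at "." for relative paths.
        case $p in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
}

# Hypothetical helper: report whether a given OS user can read a file.
# Requires sudo rights; the user name is an assumption about the cluster.
readable_by() {
    user=$1
    file=$2
    if sudo -u "$user" test -r "$file"; then
        echo "$user can read $file"
    else
        echo "$user cannot read $file"
    fi
}
```

For instance, show_path_perms could be pointed at the appTokens path from the error message, and readable_by yarn at the same path. If some directory on the way is not traversable by yarn, that would be consistent with the FileNotFoundException above even though the file exists.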
> transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,650 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.xml
> transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,652 WARN
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=rpaulk
> OPERATION=Container Finished - Failed TARGET=ContainerImpl
> RESULT=FAILURE DESCRIPTION=Container failed with state:
> LOCALIZATION_FAILED APPID=application_1375138534137_0003
> CONTAINERID=container_1375138534137_0003_01_000001
>
> 2013-07-29 16:13:06,652 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1375138534137_0003_01_000001 transitioned from
> LOCALIZATION_FAILED to DONE
>
> 2013-07-29 16:13:06,652 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Removing container_1375138534137_0003_01_000001 from application
> application_1375138534137_0003
>
> 2013-07-29 16:13:06,652 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
> Considering container container_1375138534137_0003_01_000001 for
> log-aggregation
>
> 2013-07-29 16:13:06,652 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
> Deleting absolute path :
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003/container_1375138534137_0003_01_000001
>
Re: YARN with local filesystem
Posted by Rod Paulk <rm...@gmail.com>.
I was able to run the example by submitting the job as the yarn user.
For example, the following completes successfully:
sudo -u yarn yarn org.apache.hadoop.examples.RandomWriter /tmp/random-out
Whereas the same job fails when submitted as the local user rpaulk:
yarn org.apache.hadoop.examples.RandomWriter /tmp/random-out
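Since the only difference between the two runs is the submitting account, one thing worth comparing is what each account can actually reach under the staging directory. The sketch below is a small shell helper (check_path_access is a made-up name, not a Hadoop tool) that walks an absolute path one component at a time and reports the first component that is missing, not searchable, or not readable for whoever runs it. Whether permissions are really the cause here is an assumption; nothing in this thread confirms it.

```shell
# Hypothetical helper, not part of Hadoop: walk an absolute path from the
# root down and report the first component the current user cannot use.
check_path_access() {
    target=$1
    rest=${target#/}
    current=""
    while [ -n "$rest" ]; do
        part=${rest%%/*}            # next component, up to the first "/"
        case $rest in
            */*) rest=${rest#*/} ;; # drop that component and its "/"
            *)   rest="" ;;         # last component reached
        esac
        [ -z "$part" ] && continue  # skip empty parts produced by "//"
        current="$current/$part"
        if [ ! -e "$current" ]; then
            echo "missing: $current"
            return 1
        fi
        # A directory must be searchable (x bit) for lookups beneath it.
        if [ -d "$current" ] && [ ! -x "$current" ]; then
            echo "not searchable: $current"
            return 1
        fi
    done
    if [ ! -r "$current" ]; then
        echo "not readable: $current"
        return 1
    fi
    echo "ok: $current"
}
```

Running it once as rpaulk and once as yarn against the appTokens path from the log would show whether the two accounts see the staging directory differently. Note that root bypasses these checks, so it should be run as the accounts in question.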
On Wed, Jul 31, 2013 at 2:28 PM, Rod Paulk <rm...@gmail.com> wrote:
> I am having an issue running 2.0.5-alpha (BigTop-0.6.0) YARN-MapReduce on
> the local filesystem instead of HDFS. The appTokens file that the error
> states is missing, does exist after the job fails. I saw other 'similar'
> issues noted in YARN-917, YARN-513, YARN-993. When I switch to HDFS, the
> jobs run fine.
>
> In core-site.xml
> <property>
> <name>fs.defaultFS</name>
> <value>file:///</value>
> </property>
>
> In mapred-site.xml
> <property>
> <name>mapreduce.framework.name</name>
> <value>yarn</value>
> </property>
>
> 2013-07-29 16:13:06,549 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
> Start request for container_1375138534137_0003_01_000001 by user rpaulk
>
> 2013-07-29 16:13:06,549 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
> Creating a new application reference for app application_1375138534137_0003
>
> 2013-07-29 16:13:06,549 INFO
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=rpaulk
> IP=172.20.130.215 OPERATION=Start Container Request
> TARGET=ContainerManageImpl RESULT=SUCCESS
> APPID=application_1375138534137_0003
> CONTAINERID=container_1375138534137_0003_01_000001
>
> 2013-07-29 16:13:06,551 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Application application_1375138534137_0003 transitioned from NEW to INITING
>
> 2013-07-29 16:13:06,551 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Adding container_1375138534137_0003_01_000001 to application
> application_1375138534137_0003
>
> 2013-07-29 16:13:06,554 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Application application_1375138534137_0003 transitioned from INITING to
> RUNNING
>
> 2013-07-29 16:13:06,555 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1375138534137_0003_01_000001 transitioned from NEW to
> LOCALIZING
>
> *2013-07-29 16:13:06,555 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
> transitioned from INIT to DOWNLOADING*
>
> 2013-07-29 16:13:06,556 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.jar
> transitioned from INIT to DOWNLOADING
>
> 2013-07-29 16:13:06,556 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.splitmetainfo
> transitioned from INIT to DOWNLOADING
>
> 2013-07-29 16:13:06,556 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.split
> transitioned from INIT to DOWNLOADING
>
> 2013-07-29 16:13:06,556 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.xml
> transitioned from INIT to DOWNLOADING
>
> 2013-07-29 16:13:06,556 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> Created localizer for container_1375138534137_0003_01_000001
>
> 2013-07-29 16:13:06,559 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> Writing credentials to the nmPrivate file
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1375138534137_0003_01_000001.tokens.
> Credentials list:
>
> 2013-07-29 16:13:06,560 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
> Initializing user rpaulk
>
> 2013-07-29 16:13:06,564 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying
> from
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/nmPrivate/container_1375138534137_0003_01_000001.tokens
> to
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003/container_1375138534137_0003_01_000001.tokens
>
> 2013-07-29 16:13:06,564 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: CWD set
> to
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003
> =
> file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003
>
> *2013-07-29 16:13:06,646 ERROR
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:rpaulk (auth:SIMPLE) cause:java.io.FileNotFoundException: File
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
> does not exist*
>
> 2013-07-29 16:13:06,648 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
> DEBUG: FAILED {
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens,
> 1375139459000, FILE, null }
>
> RemoteTrace:
>
> java.io.FileNotFoundException: File
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
> does not exist
>
> at
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:492)
>
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:395)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51)
>
> at
> org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284)
>
> at
> org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:396)
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:280)
>
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51)
>
> at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>
> at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>
> at java.lang.Thread.run(Thread.java:662)
>
> at LocalTrace:
>
> org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl:
> File
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
> does not exist
>
> at
> org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217)
>
> at
> org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147)
>
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:819)
>
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:491)
>
> at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:218)
>
> at
> org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46)
>
> at
> org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolService$2.callBlockingMethod(LocalizationProtocol.java:57)
>
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
>
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
>
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
>
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:396)
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
>
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
>
> 2013-07-29 16:13:06,650 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1375138534137_0003_01_000001 transitioned from
> LOCALIZING to LOCALIZATION_FAILED
>
> *2013-07-29 16:13:06,650 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/appTokens
> transitioned from DOWNLOADING to INIT*
>
> 2013-07-29 16:13:06,650 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.jar
> transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,650 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.splitmetainfo
> transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,650 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.split
> transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,650 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
> Resource
> file:/nas/scratch/localfs-1/hadoop-yarn/staging/rpaulk/.staging/job_1375138534137_0003/job.xml
> transitioned from DOWNLOADING to INIT
>
> 2013-07-29 16:13:06,652 WARN
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=rpaulk
> OPERATION=Container Finished - Failed TARGET=ContainerImpl
> RESULT=FAILURE DESCRIPTION=Container failed with state:
> LOCALIZATION_FAILED APPID=application_1375138534137_0003
> CONTAINERID=container_1375138534137_0003_01_000001
>
> 2013-07-29 16:13:06,652 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container:
> Container container_1375138534137_0003_01_000001 transitioned from
> LOCALIZATION_FAILED to DONE
>
> 2013-07-29 16:13:06,652 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Removing container_1375138534137_0003_01_000001 from application
> application_1375138534137_0003
>
> 2013-07-29 16:13:06,652 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
> Considering container container_1375138534137_0003_01_000001 for
> log-aggregation
>
> 2013-07-29 16:13:06,652 INFO
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor:
> Deleting absolute path :
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rpaulk/appcache/application_1375138534137_0003/container_1375138534137_0003_01_000001
>