Posted to mapreduce-user@hadoop.apache.org by Arun C Murthy <ac...@hortonworks.com> on 2012/01/04 08:42:33 UTC

Re: Exception from Yarn Launch Container

Bing,

Are you using the released version of hadoop-0.23? If so, you might want to upgrade to the latest build off branch-0.23 (i.e. hadoop-0.23.1-SNAPSHOT), which has the fix for MAPREDUCE-3537.

Arun

On Dec 29, 2011, at 12:27 AM, Bing Jiang wrote:

> Hi, I use YARN as the resource manager to deploy my run-time computing system. I followed
>> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
>> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
> as a guide, and I ran into the issue below.
> 
> yarn-nodemanager-**.log:
> ....
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_1325062142731_0006_01_000001 to application application_1325062142731_0006
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ApplicationLocalizationEvent.EventType: INIT_APPLICATION_RESOURCES
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationInitedEvent.EventType: APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Processing application_1325062142731_0006 of type APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1325062142731_0006 transitioned from INITING to RUNNING
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerAppStartedEvent.EventType: APPLICATION_STARTED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerInitEvent.EventType: INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Processing container_1325062142731_0006_01_000001 of type INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1325062142731_0006_01_000001 transitioned from NEW to LOCALIZED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType: LAUNCH_CONTAINER
> 2011-12-29 15:49:16,287 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent.EventType: CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Processing container_1325062142731_0006_01_000001 of type CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1325062142731_0006_01_000001 transitioned from LOCALIZED to RUNNING
> 2011-12-29 15:49:16,288 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerStartMonitoringEvent.EventType: START_MONITORING_CONTAINER
> 2011-12-29 15:49:16,289 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Failed to launch container
> java.io.FileNotFoundException: File /tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006 does not exist
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
>     at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:815)
>     at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
>     at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
>     at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:700)
>     at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:697)
>     at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
>     at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:697)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:123)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:237)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:67)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerExitEvent.EventType: CONTAINER_EXITED_WITH_FAILURE
> 2011-12-29 15:49:16,290 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Processing container_1325062142731_0006_01_000001 of type CONTAINER_EXITED_WITH_FAILURE
> 2011-12-29 15:49:16,290 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1325062142731_0006_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType: CLEANUP_CONTAINER
> 2011-12-29 15:49:16,290 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1325062142731_0006_01_000001
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Marking container container_1325062142731_0006_01_000001 as inactive
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Getting pid for container container_1325062142731_0006_01_000001 to kill from pid file /tmp/nm-local-dir/nmPrivate/container_1325062142731_0006_01_000001.pid
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Accessing pid for container container_1325062142731_0006_01_000001 from pid file /tmp/nm-local-dir/nmPrivate/container_1325062142731_0006_01_000001.pid
> 2011-12-29 15:49:16,307 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ContainerLocalizationCleanupEvent.EventType: CLEANUP_CONTAINER_RESOURCES
> 
> 
> 
> -- 
> Bing Jiang
> Tel:(86)134-2619-1361
> National Research Center for Intelligent Computing Systems
> Institute of Computing technology
> Graduate University of Chinese Academy of Science
> 


Re: Exception from Yarn Launch Container

Posted by "real great.." <gr...@gmail.com>.
@raghavendra: I think you should have started a new thread.
Anyway, a Google search should lead you to the exact svn repository.
Cheers. :)

2012/1/6 raghavendhra rahul <ra...@gmail.com>

> Hi,
> Can I know where to get the Hadoop 0.23.1 release?
>


-- 
Regards,
R.V.

Re: Exception from Yarn Launch Container

Posted by raghavendhra rahul <ra...@gmail.com>.
Hi,
Can I know where to get the Hadoop 0.23.1 release?


Re: Exception from Yarn Launch Container

Posted by Bing Jiang <ji...@gmail.com>.
Arun,

To get to the bottom of this, I traced back through the source code. In
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor I found:

@Override
public int launchContainer(Container container,
    Path nmPrivateContainerScriptPath, Path nmPrivateTokensPath,
    String userName, String appId, Path containerWorkDir)
    throws IOException {
  ....
  String[] sLocalDirs = getConf().getStrings(
      YarnConfiguration.NM_LOCAL_DIRS,
      YarnConfiguration.DEFAULT_NM_LOCAL_DIRS);
  for (String sLocalDir : sLocalDirs) {
    Path usersdir = new Path(sLocalDir, ContainerLocalizer.USERCACHE);
    Path userdir = new Path(usersdir, userName);
    Path appCacheDir = new Path(userdir, ContainerLocalizer.APPCACHE);
    Path appDir = new Path(appCacheDir, appIdStr);
    Path containerDir = new Path(appDir, containerIdStr);
    lfs.mkdir(containerDir, null, false);
  }
  ....
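
For reference, the loop above composes one container directory per configured local dir. The sketch below rebuilds that layout with the JDK's java.nio.file API instead of Hadoop's Path class, using the values from the log. The class and method names are hypothetical, and the two directory-name constants are assumed to match ContainerLocalizer.USERCACHE and ContainerLocalizer.APPCACHE because that is how they appear in the logged path.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ContainerDirLayout {
    // Assumed values: taken from the path in the FileNotFoundException,
    // not read from the Hadoop source.
    public static final String USERCACHE = "usercache";
    public static final String APPCACHE = "appcache";

    /** Builds <localDir>/usercache/<user>/appcache/<appId>/<containerId>. */
    public static Path containerDir(String localDir, String user,
                                    String appId, String containerId) {
        return Paths.get(localDir, USERCACHE, user, APPCACHE, appId, containerId);
    }

    public static void main(String[] args) {
        Path dir = containerDir("/tmp/nm-local-dir", "jiangbing",
                "application_1325062142731_0006",
                "container_1325062142731_0006_01_000001");
        // The parent printed here is the application directory that the
        // exception reports as missing.
        System.out.println("container dir: " + dir);
        System.out.println("parent dir:    " + dir.getParent());
    }
}
```

The parent of the container directory is the application directory, which is exactly the path named in the FileNotFoundException.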

Look at the call lfs.mkdir(containerDir, null, false);. According to the
mkdir API, the final argument false means the parent path is not created if
it does not exist. In my Hadoop build, I changed
lfs.mkdir(containerDir, null, false); to lfs.mkdir(containerDir, null, true);
and then my program ran fine.
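
A minimal sketch of that difference, using the JDK's java.nio.file API as a stand-in for Hadoop's FileContext (an analogy only, not the Hadoop call itself). Files.createDirectory plays the role of createParent == false and Files.createDirectories the role of createParent == true. The class name, method names, and directory names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class MkdirParentDemo {
    /** Non-recursive mkdir; returns false when the parent directory is missing. */
    public static boolean mkdirNoParents(Path dir) throws IOException {
        try {
            Files.createDirectory(dir);      // does not create missing parents
            return true;
        } catch (NoSuchFileException e) {    // parent directory does not exist
            return false;
        }
    }

    /** Recursive mkdir; creates any missing parent directories first. */
    public static void mkdirWithParents(Path dir) throws IOException {
        Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("nm-local-dir");
        // Neither the application directory nor the container directory exists yet.
        Path containerDir = root.resolve("application_x").resolve("container_y");

        System.out.println("created without parents: " + mkdirNoParents(containerDir));
        mkdirWithParents(containerDir);
        System.out.println("exists after recursive mkdir: " + Files.isDirectory(containerDir));
    }
}
```

Passing true makes the launch succeed, but it can also hide the fact that an earlier step was expected to create the application directory and did not.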

I have now fetched the Hadoop source code from git, but I still find the
same issue there.

Why is false passed here? Or have I missed something important?

Thanks!


On January 4, 2012 at 3:42 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Bing,
>
>  Are you using the released version of hadoop-0.23? If so, you might want
> to upgrade to latest build off branch-0.23 (i.e. hadoop-0.23.1-SNAPSHOT)
> which has the fix for MAPREDUCE-3537.
>
> Arun
>
> On Dec 29, 2011, at 12:27 AM, Bing Jiang wrote:
>
> Hi, I use Yarn as resource management to deploy my run-time computing
> system. I follow
>
>
>>> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/YARN.html
>>>
>>> http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html
>>>
>> as guide, and I find these issues below.
>
> yarn-nodemanager-**.log:
> ....
> 2011-12-29 15:49:16,250 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application:
> Adding container_1325062142731_0006_01_000001 to application
> application_1325062142731_0006
> 2011-12-29 15:49:16,250 DEBUG
> org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ApplicationLocalizationEvent.EventType:
> INIT_APPLICATION_RESOURCES
> 2011-12-29 15:49:16,250 DEBUG
> org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationInitedEvent.EventType:
> APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Processing application_1325062142731_0006 of type APPLICATION_INITED
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1325062142731_0006 transitioned from INITING to RUNNING
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.event.LogHandlerAppStartedEvent.EventType: APPLICATION_STARTED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerInitEvent.EventType: INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Processing container_1325062142731_0006_01_000001 of type INIT_CONTAINER
> 2011-12-29 15:49:16,250 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1325062142731_0006_01_000001 transitioned from NEW to LOCALIZED
> 2011-12-29 15:49:16,250 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType: LAUNCH_CONTAINER
> 2011-12-29 15:49:16,287 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent.EventType: CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Processing container_1325062142731_0006_01_000001 of type CONTAINER_LAUNCHED
> 2011-12-29 15:49:16,287 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1325062142731_0006_01_000001 transitioned from LOCALIZED to RUNNING
> 2011-12-29 15:49:16,288 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerStartMonitoringEvent.EventType: START_MONITORING_CONTAINER
> 2011-12-29 15:49:16,289 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Failed to launch container
> java.io.FileNotFoundException: File /tmp/nm-local-dir/usercache/jiangbing/appcache/application_1325062142731_0006 does not exist
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
>     at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:815)
>     at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
>     at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
>     at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:700)
>     at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:697)
>     at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
>     at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:697)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:123)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:237)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:67)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerExitEvent.EventType: CONTAINER_EXITED_WITH_FAILURE
> 2011-12-29 15:49:16,290 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Processing container_1325062142731_0006_01_000001 of type CONTAINER_EXITED_WITH_FAILURE
> 2011-12-29 15:49:16,290 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1325062142731_0006_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncherEvent.EventType: CLEANUP_CONTAINER
> 2011-12-29 15:49:16,290 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1325062142731_0006_01_000001
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Marking container container_1325062142731_0006_01_000001 as inactive
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Getting pid for container container_1325062142731_0006_01_000001 to kill from pid file /tmp/nm-local-dir/nmPrivate/container_1325062142731_0006_01_000001.pid
> 2011-12-29 15:49:16,290 DEBUG org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Accessing pid for container container_1325062142731_0006_01_000001 from pid file /tmp/nm-local-dir/nmPrivate/container_1325062142731_0006_01_000001.pid
> 2011-12-29 15:49:16,307 DEBUG org.apache.hadoop.yarn.event.AsyncDispatcher: Dispatching the event org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.event.ContainerLocalizationCleanupEvent.EventType: CLEANUP_CONTAINER_RESOURCES
>
>
>
> --
> Bing Jiang
> Tel:(86)134-2619-1361
> National Research Center for Intelligent Computing Systems
> Institute of Computing Technology
> Graduate University of Chinese Academy of Sciences