Posted to common-user@hadoop.apache.org by Raj V <ra...@yahoo.com> on 2011/10/03 16:37:09 UTC

Fw: pointing mapred.local.dir to a ramdisk

Sending it to the hadoop mailing list - I think this is a hadoop related problem and not related to Cloudera distribution.

Raj


----- Forwarded Message -----
>From: Raj V <ra...@yahoo.com>
>To: CDH Users <cd...@cloudera.org>
>Sent: Friday, September 30, 2011 5:21 PM
>Subject: pointing mapred.local.dir to a ramdisk
>
>
>Hi all
>
>
>I have been trying some experiments to improve performance. One of the experiments involved pointing mapred.local.dir to a RAM disk. To this end I created a 128MB RAM disk (each of my map outputs is smaller than this) but I have not been able to get the task tracker to start.
>
>
>I am running CDH3B3 (hadoop-0.20.2+737) and here is the error message from the task tracker log.
>
>
>Tasktracker logs
>
>
>2011-09-30 16:50:00,689 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>2011-09-30 16:50:00,930 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>2011-09-30 16:50:01,000 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>2011-09-30 16:50:01,023 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
>2011-09-30 16:50:01,024 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
>2011-09-30 16:50:01,024 INFO org.mortbay.log: jetty-6.1.14
>2011-09-30 16:50:02,388 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
>2011-09-30 16:50:02,400 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>2011-09-30 16:50:02,422 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as mapred
>2011-09-30 16:50:02,493 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
>        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213)
>        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>        at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:253)
>        at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:404)
>        at org.apache.hadoop.util.MRAsyncDiskService.moveAndDeleteRelativePath(MRAsyncDiskService.java:255)
>        at org.apache.hadoop.util.MRAsyncDiskService.cleanupAllVolumes(MRAsyncDiskService.java:311)
>        at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:618)
>        at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1351)
>        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3504)
>
>
>2011-09-30 16:50:02,497 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
>/************************************************************
>SHUTDOWN_MSG: Shutting down TaskTracker at HADOOP52-4/10.52.1.5
>
>
>and here is my mapred-site.xml file
>
>
><property>
>    <name>mapred.local.dir</name>
>    <value>/ramdisk1</value>
>  </property>
>
>
>If I have a regular directory on a regular drive such as below - it works. If I don't mount the ramdisk - it works.
>
>
><property>
>    <name>mapred.local.dir</name>
>    <value>/hadoop-dsk0/local,/hadoop-dsk1/local</value>
>  </property>
>
>
>
>
>
>The NullPointerException does not tell me what the error is or how to fix it.
>
>
>From the logs it looks like some disk-based operation failed, but I can't guess which. I must also confess that this is the first time I am using an ext2 file system.
>
>
>Any ideas?
>
>
>
>
>Raj
>
>
>
>
>
>
>
>

Re: pointing mapred.local.dir to a ramdisk

Posted by Raj V <ra...@yahoo.com>.
Vinod

Carefully checked everything again. The permissions are 775 and the owner is hdfs:hadoop. The task tracker creates a directory called toBeDeleted under /ramdisk, so things do not seem to be permission related. The task tracker starts happily if I don't mount the ramdisk and leave everything else the same.



Raj



>________________________________
>From: Vinod Kumar Vavilapalli <vi...@hortonworks.com>
>To: common-user@hadoop.apache.org; Raj V <ra...@yahoo.com>
>Sent: Monday, October 3, 2011 9:07 AM
>Subject: Re: pointing mapred.local.dir to a ramdisk
>
>Must be related to some kind of permissions problems.
>
>It will help if you can paste the corresponding source code for
>FileUtil.copy(). Hard to track it with different versions, so.
>
>Thanks,
>+Vinod
>
>
>On Mon, Oct 3, 2011 at 9:28 PM, Raj V <ra...@yahoo.com> wrote:
>
>> Eric
>>
>> Yes. The owner is hdfs and group is hadoop and the directory is group
>> writable (775). This is the exact same configuration I have when I use real
>> disks. But let me give it a try again to see if I overlooked something.
>> Thanks
>>
>> Raj
>>
>> >________________________________
>> >From: Eric Caspole <er...@amd.com>
>> >To: common-user@hadoop.apache.org
>> >Sent: Monday, October 3, 2011 8:44 AM
>> >Subject: Re: pointing mapred.local.dir to a ramdisk
>> >
>> >Are you sure you have chown'd/chmod'd the ramdisk directory to be
>> writeable by your hadoop user? I have played with this in the past and it
>> should basically work.
>> >
>> >
>>
>
>
>

Re: pointing mapred.local.dir to a ramdisk

Posted by Raj V <ra...@yahoo.com>.
Thanks Joey, Todd,  Vinod , Edward

I have mixed news. The problem of the task tracker not starting was indeed permission related. Under /ramdisk there was a lost+found directory owned by root, even though /ramdisk itself was owned by mapred:hadoop. This was the cause of the problem. Now I will see if I can change the error message to something better than "NullPointerException".
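A note on why a root-owned lost+found could surface as a bare NullPointerException: in Java, File.listFiles() returns null rather than throwing when a directory cannot be read, and code that walks the result without a null check fails in exactly this unhelpful way. Whether FileUtil.copy() in this CDH build does that is an assumption here; the thread does not show the source. The sketch below is a hypothetical pre-flight check (LocalDirPreflight and findInaccessible are names made up for illustration, not part of Hadoop) that lists entries under a local directory which the current user cannot fully access.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class LocalDirPreflight {

    /**
     * Returns the entries directly under localDir that the current user
     * cannot read, write, or (for directories) list. If localDir itself
     * cannot be listed, it is returned as the only problem.
     */
    public static List<File> findInaccessible(File localDir) {
        List<File> problems = new ArrayList<>();
        File[] children = localDir.listFiles();
        if (children == null) {
            // listFiles() signals "not a directory" or "cannot read" with null,
            // not with an exception, so the caller has to check for it.
            problems.add(localDir);
            return problems;
        }
        for (File child : children) {
            boolean ok = child.canRead() && child.canWrite();
            if (child.isDirectory()) {
                // A directory also needs the execute bit and must be listable.
                ok = ok && child.canExecute() && child.list() != null;
            }
            if (!ok) {
                problems.add(child);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Usage: java LocalDirPreflight /ramdisk1 /hadoop-dsk0/local ...
        for (String dir : args) {
            for (File f : findInaccessible(new File(dir))) {
                System.out.println("inaccessible: " + f);
            }
        }
    }
}
```

Run as the task tracker user against each mapred.local.dir entry, a check like this would be expected to flag a lost+found that only root can read. It only inspects one level, which is enough for the case described above.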

Once I saw all my task trackers were UP I started with my TTT ( trusted teragen and terasort :-)).

I ran teragen with a data size of 10GB (100MB records). I have 500 nodes and I wanted 2000 files.

It takes 19 minutes to complete - awfully slow. There is no swapping and the CPU is not pegged, so things look OK. I ran it a couple of times and it takes about 15-19 minutes. It is not the same nodes either. But that could be a problem with something local. We will ignore it for now.

TeraSort does not ever complete.

I get the following errors on majority of the nodes.

org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill0.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:376)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1247)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1155)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:392)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main

I know this indicates lack of space, but I have a df monitoring the disk space on all the nodes and all nodes have loads of disk space available. The ramdisk is never more than 25% full.
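The thread does not establish a cause for this error. One possibility, offered only as an assumption: a directory allocator can reject a directory because its free space is smaller than the size requested for the new file, regardless of how full the directory looks in df. A 128MB ramdisk that is 25% full has roughly 96MB free, which would not satisfy a request of, say, 100MB. The sketch below illustrates that kind of check in plain Java. SpillDirPicker and pick are hypothetical names, and the 100MB request size is illustrative, not a figure from the logs.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class SpillDirPicker {

    /**
     * Returns the first writable directory whose usable space is strictly
     * greater than requestedBytes, or empty if no directory qualifies.
     */
    public static Optional<File> pick(List<File> dirs, long requestedBytes) {
        for (File dir : dirs) {
            if (dir.isDirectory() && dir.canWrite()
                    && dir.getUsableSpace() > requestedBytes) {
                return Optional.of(dir);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative request size only; not taken from the thread.
        long requested = 100L * 1024 * 1024;
        List<File> dirs = new ArrayList<>();
        for (String a : args) {
            dirs.add(new File(a));
        }
        Optional<File> chosen = pick(dirs, requested);
        System.out.println(chosen.isPresent()
                ? "would use " + chosen.get()
                : "no valid local directory");
    }
}
```

If this is what is happening, comparing the free bytes on the ramdisk with the size a map task asks for when it spills would be more telling than the percentage-full figure.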

So once again - any clues?

Raj


>________________________________
>From: Raj V <ra...@yahoo.com>
>To: "common-user@hadoop.apache.org" <co...@hadoop.apache.org>
>Sent: Monday, October 3, 2011 12:31 PM
>Subject: Re: pointing mapred.local.dir to a ramdisk
>
>Joey
>
>Thanks. Will try to upgrade to a newer version and check. I will also change the logs to debug and see if more information is available.
>
>Raj
>
>
>
>>________________________________
>>From: Joey Echeverria <jo...@cloudera.com>
>>To: common-user@hadoop.apache.org; Raj V <ra...@yahoo.com>
>>Sent: Monday, October 3, 2011 11:49 AM
>>Subject: Re: pointing mapred.local.dir to a ramdisk
>>
>>Raj,
>>
>>I just tried this on my CDH3u1 VM, and the ramdisk worked the first
>>time. So, it's possible you've hit a bug in CDH3b3 that was later
>>fixed. Can you enable debug logging in log4j.properties and then
>>repost your task tracker log? I think there might be more details that
>>it will print that will be helpful.
>>
>>-Joey
>>
>>On Mon, Oct 3, 2011 at 2:18 PM, Raj V <ra...@yahoo.com> wrote:
>>> Edward
>>>
>>> I understand the size limitations - but for my experiment the ramdisk I have created is large enough.
>>> I think there will be substantial benefits from putting the intermediate map outputs on a ramdisk - size permitting, of course - but I can't provide any numbers to substantiate my claim given that I can't get it to run.
>>>
>>> -best regards
>>>
>>> Raj
>>>
>>>
>>>
>>>>________________________________
>>>>From: Edward Capriolo <ed...@gmail.com>
>>>>To: common-user@hadoop.apache.org
>>>>Cc: Raj V <ra...@yahoo.com>
>>>>Sent: Monday, October 3, 2011 10:36 AM
>>>>Subject: Re: pointing mapred.local.dir to a ramdisk
>>>>
>>>>This directory can get very large, in many cases I doubt it would fit on a
>>>>ram disk.
>>>>
>>>>Also RAM Disks tend to help most with random read/write, since hadoop is
>>>>doing mostly linear IO you may not see a great benefit from the RAM disk.
>>>>
>>>>
>>>>
>>
>>
>>
>>-- 
>>Joseph Echeverria
>>Cloudera, Inc.
>>443.305.9434
>>
>>
>>
>
>

Re: pointing mapred.local.dir to a ramdisk

Posted by Raj V <ra...@yahoo.com>.
Joey

Thanks. Will try to upgrade to a newer version and check. I will also change the logs to debug and see if more information is available.

Raj




Re: pointing mapred.local.dir to a ramdisk

Posted by Joey Echeverria <jo...@cloudera.com>.
Raj,

I just tried this on my CDH3u1 VM, and the ramdisk worked the first
time. So, it's possible you've hit a bug in CDH3b3 that was later
fixed. Can you enable debug logging in log4j.properties and then
repost your task tracker log? I think there might be more details that
it will print that will be helpful.

-Joey

On Mon, Oct 3, 2011 at 2:18 PM, Raj V <ra...@yahoo.com> wrote:
> Edward
>
> I understand the size limitations - but for my experiment the ramdisk size I have created is large enough.
> I think there will be substantial benefits by putting the intermediate map outputs on a ramdisk - size permitting, ofcourse, but I can't provide any numbers to substantiate my claim  given that I can't get it to run.
>
> -best regards
>
> Raj
>
>
>
>>________________________________
>>From: Edward Capriolo <ed...@gmail.com>
>>To: common-user@hadoop.apache.org
>>Cc: Raj V <ra...@yahoo.com>
>>Sent: Monday, October 3, 2011 10:36 AM
>>Subject: Re: pointing mapred.local.dir to a ramdisk
>>
>>This directory can get very large, in many cases I doubt it would fit on a
>>ram disk.
>>
>>Also RAM Disks tend to help most with random read/write, since hadoop is
>>doing mostly linear IO you may not see a great benefit from the RAM disk.
>>
>>
>>
>>On Mon, Oct 3, 2011 at 12:07 PM, Vinod Kumar Vavilapalli <
>>vinodkv@hortonworks.com> wrote:
>>
>>> Must be related to some kind of permissions problems.
>>>
>>> It will help if you can paste the corresponding source code for
>>> FileUtil.copy(). Hard to track it with different versions, so.
>>>
>>> Thanks,
>>> +Vinod
>>>
>>>
>>> On Mon, Oct 3, 2011 at 9:28 PM, Raj V <ra...@yahoo.com> wrote:
>>>
>>> > Eric
>>> >
>>> > Yes. The owner is hdfs and group is hadoop and the directory is group
>>> > writable(775).  This is tehe exact same configuration I have when I use
>>> real
>>> > disks.But let me give it a try again to see if I overlooked something.
>>> > Thanks
>>> >
>>> > Raj
>>> >



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434

Re: pointing mapred.local.dir to a ramdisk

Posted by Raj V <ra...@yahoo.com>.
Edward

I understand the size limitations, but for my experiment the RAM disk I have created is large enough.
I think there would be substantial benefits from putting the intermediate map outputs on a RAM disk (size permitting, of course), but I can't provide any numbers to substantiate that claim, given that I can't get it to run.

-best regards

Raj



>________________________________
>From: Edward Capriolo <ed...@gmail.com>
>To: common-user@hadoop.apache.org
>Cc: Raj V <ra...@yahoo.com>
>Sent: Monday, October 3, 2011 10:36 AM
>Subject: Re: pointing mapred.local.dir to a ramdisk
>
>This directory can get very large; in many cases I doubt it would fit on a
>RAM disk.
>
>Also, RAM disks tend to help most with random reads and writes. Since Hadoop
>does mostly linear I/O, you may not see a great benefit from the RAM disk.

Re: pointing mapred.local.dir to a ramdisk

Posted by Edward Capriolo <ed...@gmail.com>.
This directory can get very large; in many cases I doubt it would fit on a
RAM disk.

Also, RAM disks tend to help most with random reads and writes. Since Hadoop
does mostly linear I/O, you may not see a great benefit from the RAM disk.
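
To put rough numbers on the size concern, a back-of-the-envelope check can compare the space a local directory offers with what concurrently running tasks might write into it. The sketch below is only an illustration under stated assumptions: the class name `LocalDirCapacity`, the slot count, and the per-task figure are made up for the example, and it uses the plain `java.io.File.getUsableSpace()` call rather than any Hadoop API.

```java
import java.io.File;

// Hypothetical helper, not part of Hadoop.
final class LocalDirCapacity {

    // True if usableBytes can hold concurrentTasks outputs of bytesPerTask each.
    // Assumes the product fits in a long (no overflow check in this sketch).
    static boolean fits(long usableBytes, int concurrentTasks, long bytesPerTask) {
        return (long) concurrentTasks * bytesPerTask <= usableBytes;
    }

    public static void main(String[] args) {
        // Path defaults to the directory named in the thread; pass another as args[0].
        File dir = new File(args.length > 0 ? args[0] : "/ramdisk1");
        long usable = dir.getUsableSpace(); // 0 if the path does not exist

        int slots = 4;                       // assumed number of concurrent tasks
        long perTask = 64L * 1024 * 1024;    // assumed intermediate output per task

        System.out.println(dir + " usable bytes: " + usable);
        System.out.println("fits " + slots + " tasks: " + fits(usable, slots, perTask));
    }
}
```

With these definitions, a 128 MB volume holds one 100 MB output but not two at the same time, which is the kind of limit worth checking before moving intermediate data into RAM.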



On Mon, Oct 3, 2011 at 12:07 PM, Vinod Kumar Vavilapalli <
vinodkv@hortonworks.com> wrote:

> This must be related to some kind of permissions problem.
>
> It would help if you could paste the corresponding source code for
> FileUtil.copy(); it is hard to track down across different versions.
>
> Thanks,
> +Vinod

Re: pointing mapred.local.dir to a ramdisk

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
This must be related to some kind of permissions problem.

It would help if you could paste the corresponding source code for
FileUtil.copy(); it is hard to track down across different versions.

Thanks,
+Vinod
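
One general Java behaviour is worth keeping in mind when reading the trace in the original post: `java.io.File.listFiles()` returns `null` instead of throwing when the path is not a directory or cannot be read, so a recursive copy that iterates over the result without a check fails with a bare `NullPointerException`. Whether that is what happens at `FileUtil.java:213` in this build is an assumption; the thread does not confirm it. On a freshly created ext2 file system, for example, the root-owned `lost+found` directory at the top of the mount could be such an entry for a non-root user. The sketch below (class and method names are hypothetical, not Hadoop code) walks a directory and reports every subdirectory that cannot be listed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic, not Hadoop code.
final class UnlistableFinder {

    // Collects every directory at or under root whose listing comes back null.
    // Note: this sketch does not guard against symbolic-link cycles.
    static List<File> findUnlistable(File root) {
        List<File> bad = new ArrayList<File>();
        walk(root, bad);
        return bad;
    }

    private static void walk(File dir, List<File> bad) {
        if (!dir.isDirectory()) {
            return; // plain files and missing paths have nothing to list
        }
        File[] children = dir.listFiles();
        if (children == null) {
            // Code that skipped this check would throw a
            // NullPointerException when reading children.length.
            bad.add(dir);
            return;
        }
        for (File child : children) {
            walk(child, bad);
        }
    }

    public static void main(String[] args) {
        // Run as the same OS user that starts the TaskTracker.
        File root = new File(args.length > 0 ? args[0] : "/ramdisk1");
        for (File f : findUnlistable(root)) {
            System.out.println("cannot list: " + f);
        }
    }
}
```

If the probe prints anything for the directory configured in mapred.local.dir, fixing the ownership or mode of those entries, or pointing the property at a subdirectory of the mount, would be the next thing to try.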


On Mon, Oct 3, 2011 at 9:28 PM, Raj V <ra...@yahoo.com> wrote:

> Eric
>
> Yes. The owner is hdfs, the group is hadoop, and the directory is group
> writable (775). This is the exact same configuration I have when I use real
> disks. But let me give it a try again to see if I overlooked something.
> Thanks
>
> Raj

Re: pointing mapred.local.dir to a ramdisk

Posted by Raj V <ra...@yahoo.com>.
Eric

Yes. The owner is hdfs, the group is hadoop, and the directory is group writable (775). This is the exact same configuration I have when I use real disks. But let me give it a try again to see if I overlooked something.
Thanks

Raj

>________________________________
>From: Eric Caspole <er...@amd.com>
>To: common-user@hadoop.apache.org
>Sent: Monday, October 3, 2011 8:44 AM
>Subject: Re: pointing mapred.local.dir to a ramdisk
>
>Are you sure you have chown'd/chmod'd the ramdisk directory to be writeable by your hadoop user? I have played with this in the past and it should basically work.

Re: pointing mapred.local.dir to a ramdisk

Posted by Eric Caspole <er...@amd.com>.
Are you sure you have chown'd/chmod'd the ramdisk directory to be  
writeable by your hadoop user? I have played with this in the past  
and it should basically work.
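
A quick way to check this from the JVM's point of view is to run a small probe as the same OS user that starts the TaskTracker (the quoted log reports the tasktracker starting with owner `mapred`). The sketch below is only an illustration: the class name `LocalDirProbe` and the default path are assumptions, and it relies on plain `java.io.File` calls rather than any Hadoop API.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reports why a candidate
// mapred.local.dir entry might be unusable for the current OS user.
final class LocalDirProbe {

    // Returns human-readable problems; an empty list means none were found.
    static List<String> problems(File dir) {
        List<String> out = new ArrayList<String>();
        if (!dir.isDirectory()) {
            out.add(dir + " is not a directory (or does not exist)");
            return out;
        }
        if (!dir.canRead()) {
            out.add(dir + " is not readable");
        }
        if (!dir.canWrite()) {
            out.add(dir + " is not writable");
        }
        if (!dir.canExecute()) {
            out.add(dir + " is not searchable (no execute bit)");
        }
        return out;
    }

    public static void main(String[] args) {
        // mapred.local.dir may hold a comma-separated list of directories.
        String value = args.length > 0 ? args[0] : "/ramdisk1";
        for (String path : value.split(",")) {
            String trimmed = path.trim();
            List<String> found = problems(new File(trimmed));
            System.out.println(trimmed + ": " + (found.isEmpty() ? "OK" : found.toString()));
        }
    }
}
```

Passing the exact value of mapred.local.dir as the first argument checks each configured directory in turn; running it as root would hide permission problems, so the user matters.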


On Oct 3, 2011, at 10:37 AM, Raj V wrote:

> Sending it to the hadoop mailing list - I think this is a hadoop  
> related problem and not related to Cloudera distribution.
>
> Raj
>
>
> ----- Forwarded Message -----
>> From: Raj V <ra...@yahoo.com>
>> To: CDH Users <cd...@cloudera.org>
>> Sent: Friday, September 30, 2011 5:21 PM
>> Subject: pointing mapred.local.dir to a ramdisk
>>
>>
>> Hi all
>>
>>
>> I have been trying some experiments to improve performance. One of
>> the experiments involved pointing mapred.local.dir to a RAM disk.
>> To this end I created a 128MB RAM disk (each of my map outputs
>> is smaller than this), but I have not been able to get the task
>> tracker to start.
>>
>>
>> I am running CDH3B3 (hadoop-0.20.2+737) and here is the error
>> message from the task tracker log.
>>
>>
>> Tasktracker logs
>>
>>
>> 2011-09-30 16:50:00,689 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>> 2011-09-30 16:50:00,930 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
>> 2011-09-30 16:50:01,000 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
>> 2011-09-30 16:50:01,023 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
>> 2011-09-30 16:50:01,024 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
>> 2011-09-30 16:50:01,024 INFO org.mortbay.log: jetty-6.1.14
>> 2011-09-30 16:50:02,388 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
>> 2011-09-30 16:50:02,400 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>> 2011-09-30 16:50:02,422 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as mapred
>> 2011-09-30 16:50:02,493 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.NullPointerException
>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:213)
>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:253)
>>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:404)
>>         at org.apache.hadoop.util.MRAsyncDiskService.moveAndDeleteRelativePath(MRAsyncDiskService.java:255)
>>         at org.apache.hadoop.util.MRAsyncDiskService.cleanupAllVolumes(MRAsyncDiskService.java:311)
>>         at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:618)
>>         at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1351)
>>         at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3504)
>>
>>
>> 2011-09-30 16:50:02,497 INFO org.apache.hadoop.mapred.TaskTracker:  
>> SHUTDOWN_MSG:
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down TaskTracker at HADOOP52-4/10.52.1.5
>>
>>
>> and here is my mapred-site.xml file
>>
>>
>> <property>
>>     <name>mapred.local.dir</name>
>>     <value>/ramdisk1</value>
>>   </property>
>>
>>
>> If I have a regular directory on a regular drive such as below -  
>> it works. If I don't mount the ramdisk - it works.
>>
>>
>> <property>
>>     <name>mapred.local.dir</name>
>>     <value>/hadoop-dsk0/local,/hadoop-dsk1/local</value>
>>   </property>
>>
>>
>>
>>
>>
>> The NullPointerException does not tell me what the error is or how  
>> to fix it.
>>
>>
>> From the logs it looks like some disk-based operation failed, but I
>> can't guess which. I must also confess that this is the first time I
>> am using an ext2 file system.
>>
>>
>> Any ideas?
>>
>>
>>
>>
>> Raj