Posted to common-user@hadoop.apache.org by 姚吉龙 <ge...@gmail.com> on 2013/04/23 09:11:41 UTC

Error about MR task when running 2T data

Hi Everyone

Today I am testing with about 2 TB of data on my cluster, and there are
several failed map and reduce tasks on the same node.
Here is the log:

Map failed:

org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill0.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1298)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:699)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)


Reduce failed:
java.io.IOException: Task: attempt_201304211423_0003_r_000006_0 - The reduce copier failed
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/map_10003.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
    at org.apache.hadoop.mapred.MapOutputFile.getInputFileForWrite(MapOutputFile.java:176)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2742)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2706)


Does this mean something is wrong with the configuration on node5, or is
this normal when testing with multiple TB of data? This is the first time I
have run a job at that scale.
Any suggestions are welcome.


BRs
Geelong


-- 
From Good To Great

Re: Error about MR task when running 2T data

Posted by Geelong Yao <ge...@gmail.com>.
I have set two disks aside for temp files: one under /usr and another under
/sda. But I found that the first one (/usr) is full while /sda has not been
used at all.
Why would this happen, especially when the first path is full?
[image: inline image 1]
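
For what it's worth, what I am aiming for is a mapred.local.dir value that
lists both mounts, roughly like the sketch below (the subdirectory names
here are illustrative, not copied from my actual config):

<property>
  <name>mapred.local.dir</name>
  <!-- Both paths must exist and be writable by the TaskTracker user;
       a path that fails that check is skipped, and everything lands
       on the remaining disk until it fills up. -->
  <value>/usr/hadoop/mapred/local,/sda/hadoop/mapred/local</value>
</property>

Is that the right way to spread the temp files across both disks?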


2013/4/23 Harsh J <ha...@cloudera.com>

> Does your node5 have adequate free space and proper multi-disk
> mapred.local.dir configuration set in it?

-- 
From Good To Great

Re: Error about MR task when running 2T data

Posted by Harsh J <ha...@cloudera.com>.
Does your node5 have adequate free space and proper multi-disk
mapred.local.dir configuration set in it?
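
For example, something like this in mapred-site.xml on node5 (the mount
points below are placeholders for wherever its disks are actually mounted):

<property>
  <name>mapred.local.dir</name>
  <!-- One directory per physical disk. Intermediate files (map spills,
       merged map outputs on the reduce side) are spread across every
       listed directory that still passes the free-space check. -->
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>

<property>
  <!-- Optional guard: stop accepting new tasks once free space in the
       local dirs drops below this many bytes (default 0 = disabled). -->
  <name>mapred.local.dir.minspacestart</name>
  <value>1073741824</value>
</property>

The DiskErrorException in your logs is what gets thrown when none of the
configured directories has room for the file being written, so it is also
worth watching the free space on node5 while the job runs.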

-- 
Harsh J
