Posted to mapreduce-user@hadoop.apache.org by syed kather <in...@gmail.com> on 2012/07/12 20:23:10 UTC

hbase map reduce is taking a lot of time

Team,
     I have written a MapReduce program. The scenario is to emit
<userid,seqid> pairs.

   Total number of users: 825
   Total number of seqids: 6,583,100

   Number of pairs the map phase will emit: 825 * 6,583,100

  I have an HBase table called ObjectSequence, which consists of 6,583,100 rows.

I used TableMapper and TableReducer for my MapReduce program.
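
For reference, here is a minimal sketch of how such a job is typically wired
up with TableMapper (Hadoop 1.x / HBase 0.9x-era APIs, matching the stack
traces below). The column family, the way the 825 user IDs are loaded, and
all class names are assumptions for illustration only; the real job also
writes through a TableReducer, which is omitted here to keep the sketch short.

import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class UserSeqJob {

  // Emits one <userid, seqid> pair per user for every row scanned,
  // i.e. 825 x 6,583,100 intermediate records in total.
  static class UserSeqMapper extends TableMapper<Text, Text> {
    private List<String> userIds;

    @Override
    protected void setup(Context ctx) {
      userIds = loadUserIds(ctx.getConfiguration()); // hypothetical: fetch the 825 user ids
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result columns, Context ctx)
        throws IOException, InterruptedException {
      String seqId = Bytes.toString(row.get()); // assumption: the row key holds the seqid
      for (String userId : userIds) {
        ctx.write(new Text(userId), new Text(seqId));
      }
    }

    private static List<String> loadUserIds(Configuration conf) {
      throw new UnsupportedOperationException("illustrative stub");
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(HBaseConfiguration.create(), "user-seq");
    job.setJarByClass(UserSeqJob.class);

    Scan scan = new Scan();
    scan.setCaching(500);       // fetch rows in batches instead of one RPC per row
    scan.setCacheBlocks(false); // keep full-table MR scans out of the block cache

    TableMapReduceUtil.initTableMapperJob(
        "ObjectSequence", scan, UserSeqMapper.class, Text.class, Text.class, job);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path(args[0]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Note that 825 * 6,583,100 is roughly 5.4 billion map output records; with the
default sort buffer that forces a very large number of spills to
mapred.local.dir, which is consistent with the disk-related failures shown
further down in this message.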


Problem definition:

Processor: i7
Replication Factor: 1
Live Datanodes: 3

Node      Last Contact  Admin State  Configured Capacity (GB)  Used (GB)  Non DFS Used (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks
chethan   1             In Service   28.59                     0.6        25.17              2.82            2.11      9.87           73
shashwat  2             In Service   28.98                     0.87       22.01              6.1             3         21.04          69
syed      0             In Service   28.98                     4.29       18.37              6.32            14.8      21.82          129
When I run the balancer in Hadoop, I see that the blocks are not equally
distributed. Can someone tell me what the reason for this may be?


Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
map     85.71%      7          0        1        6         0       3 / 1
reduce  28.57%      1          0        1        0         0       0 / 0
I see that only 8 tasks in total were allocated. Is there any way to
increase the number of map tasks?
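
With TableInputFormat (which TableMapReduceUtil configures), mapred.map.tasks
does not control the map count: the job gets one map task per region of the
input table, so 7 maps suggests ObjectSequence currently has 7 regions. To
get more maps, the table needs more regions. A minimal sketch, assuming the
0.9x-era HBaseAdmin API (nothing here is from the original program), of
forcing splits to raise the region count:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SplitObjectSequence {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    // Asks the master to split each region of the table at its midpoint;
    // every new region becomes one extra map task on the next job run.
    admin.split("ObjectSequence");
  }
}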

Completed Tasks

Task                             Complete  Status                    Start Time            Finish Time                           Counters
task_201207121836_0007_m_000001  100.00%   UserID: 777 SEQID:415794  12-Jul-2012 21:35:48  12-Jul-2012 21:36:12 (24sec)          16
task_201207121836_0007_m_000002  100.00%   UserID: 777 SEQID:422256  12-Jul-2012 21:35:50  12-Jul-2012 21:36:47 (57sec)          16
task_201207121836_0007_m_000003  100.00%   UserID: 777 SEQID:563544  12-Jul-2012 21:35:50  12-Jul-2012 22:00:08 (24mins, 17sec)  16
task_201207121836_0007_m_000004  100.00%   UserID: 777 SEQID:592918  12-Jul-2012 21:35:50  12-Jul-2012 21:42:09 (6mins, 18sec)   16
task_201207121836_0007_m_000005  100.00%   UserID: 777 SEQID:618121  12-Jul-2012 21:35:50  12-Jul-2012 21:44:34 (8mins, 43sec)   16
task_201207121836_0007_m_000006  100.00%   UserID: 777 SEQID:685810  12-Jul-2012 21:36:12  12-Jul-2012 21:44:18 (8mins, 6sec)    16
Why is the last map task taking nearly 2 hours? Please give me some
suggestions on how to optimize this.
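
One common explanation is region skew: each map task scans exactly one
region, and regions need not hold equal numbers of rows, so a map over an
oversized region keeps running long after the others finish. A minimal
sketch, assuming the 0.9x-era client API (the class name is made up), for
inspecting the region boundaries to check this:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Pair;

public class ShowRegionBoundaries {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(HBaseConfiguration.create(), "ObjectSequence");
    // Start/end row key of every region; with numeric seqid keys, an
    // unusually wide key range usually means an unusually large region.
    Pair<byte[][], byte[][]> keys = table.getStartEndKeys();
    for (int i = 0; i < keys.getFirst().length; i++) {
      System.out.println("region " + i + ": ["
          + Bytes.toStringBinary(keys.getFirst()[i]) + ", "
          + Bytes.toStringBinary(keys.getSecond()[i]) + ")");
    }
  }
}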

Task                             Complete  Status                   Start Time
task_201207121836_0007_m_000000  0.00%     UserID: 482 SEQID:99596  12-Jul-2012 21:35:48

Errors reported for this task attempt:

java.io.IOException: Spill failed
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
	at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:205)
	at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:1)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException:
Could not find any valid local directory for output/spill712.out
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
	at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)

java.lang.RuntimeException: Error while running command to get file
permissions : java.io.IOException: Cannot run program "/bin/ls":
java.io.IOException: error=12, Cannot allocate memory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:475)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
	at org.apache.hadoop.util.Shell.run(Shell.java:182)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
	at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:703)
	at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:443)
	at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
	at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:251)
	at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot
allocate memory
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:164)
	at java.lang.ProcessImpl.start(ProcessImpl.java:81)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:468)
	... 15 more

	at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:468)
	at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:426)
	at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:251)
	at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

java.io.IOException: Spill failed
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1029)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
	at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:205)
	at org.pointcross.SearchPermission.MapReduce.NewObjectMapper.map(NewObjectMapper.java:1)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException:
Could not find any valid local directory for output/spill934.out
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
	at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1392)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)



I see this error for the last task. What may be the reason for this error?

NOTE: When I run an HBase table import, it takes 10 minutes.


Team, please give suggestions on what can be done to solve these issues.


            Thanks and Regards,
        S SYED ABDUL KATHER

Re: hbase map reduce is taking a lot of time

Posted by syed kather <in...@gmail.com>.
Thanks Shashwat. After increasing the disk size, the MR job works fine.
There are 6 mappers in total; each of the first 5 maps handles about 1 lakh
(100,000) records and executes in 20 minutes, but the 6th mapper handles
about 5 lakh (500,000) records and takes nearly 1.5 hours to execute. Can
anyone give me an idea of how to fix this?
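
A sketch of one possible fix, assuming the slow 6th mapper corresponds to a
single oversized region: rebuild the table pre-split at evenly spaced row
keys, so that each region, and therefore each mapper, gets a similar share
of the ~6.58 million rows. The table name, column family, and split points
below are all hypothetical:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitObjectSequence {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    HTableDescriptor desc = new HTableDescriptor("ObjectSequencePresplit");
    desc.addFamily(new HColumnDescriptor("cf")); // family name is an assumption
    // Fixed-width numeric keys split at even intervals, one region per ~1M rows.
    byte[][] splits = {
        Bytes.toBytes("1000000"), Bytes.toBytes("2000000"),
        Bytes.toBytes("3000000"), Bytes.toBytes("4000000"),
        Bytes.toBytes("5000000"), Bytes.toBytes("6000000")
    };
    admin.createTable(desc, splits);
  }
}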
            Thanks and Regards,
        S SYED ABDUL KATHER




Re: hbase map reduce is taking a lot of time

Posted by shashwat shriparv <dw...@gmail.com>.
Hi Syed,

Do give this solution a try:

http://agiletesting.blogspot.in/2011/11/troubleshooting-memory-allocation.html

Regards

∞
Shashwat Shriparv




Re: hbase map reduce is taking a lot of time

Posted by shashwat shriparv <dw...@gmail.com>.
Hi Syed,

The problem is with the disk space. As MapReduce keeps the intermediate
results on the local disk, check whether you have enough disk space, and
also make sure that you have cleared the tmp directory and that it is
writable. Provide more space and try again; otherwise, try with a smaller
number of users and check whether it works.

Regards

∞
Shashwat Shriparv



