Posted to mapreduce-dev@hadoop.apache.org by "Harsh J (JIRA)" <ji...@apache.org> on 2011/07/17 22:51:00 UTC

[jira] [Resolved] (MAPREDUCE-133) Getting errors in reading the output files of a map/reduce job immediately after the job is complete

     [ https://issues.apache.org/jira/browse/MAPREDUCE-133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J resolved MAPREDUCE-133.
-------------------------------

    Resolution: Cannot Reproduce

This doesn't seem to be a problem in 0.20 or trunk anymore.

Running the JobControl tests, or a chained-job test that checks isComplete() before launching the second job, works just fine. Perhaps it was an issue with the way output committing worked in '06, the year this ticket was filed.

This ticket has gone stale. Closing as Cannot Reproduce.
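
For reference, a minimal sketch of that kind of chained-job check, assuming the old org.apache.hadoop.mapred API; the class name and the conf1/conf2 JobConf placeholders are illustrative, not taken from the actual tests:

    import java.io.IOException;

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RunningJob;

    public class ChainedJobCheck {
      // Submit the first job, poll isComplete(), verify it succeeded, then launch
      // the second job against the first job's committed output.
      public static void runChain(JobConf conf1, JobConf conf2)
          throws IOException, InterruptedException {
        JobClient jc = new JobClient(conf1);
        try {
          RunningJob first = jc.submitJob(conf1);
          while (!first.isComplete()) {
            Thread.sleep(1000);            // poll until the first job finishes
          }
          if (!first.isSuccessful()) {
            throw new IOException("first job failed: " + first.getJobID());
          }
          // The first job's output is committed here; the second job reads it as input.
          JobClient.runJob(conf2);         // blocks until the second job completes
        } finally {
          jc.close();
        }
      }
    }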

> Getting errors in reading the output files of a map/reduce job immediately after the job is complete
> ----------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-133
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-133
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Runping Qi
>            Assignee: Owen O'Malley
>
> I have an app that fires up map/reduce jobs sequentially. The output of one job is the input of the next.
> I observed that many map tasks failed due to file read errors:
> java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/runping/runping/docs_store/stage_2/base_docs/part-00186
>     at org.apache.hadoop.dfs.NameNode.open(NameNode.java:130)
>     at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:585)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
>     at org.apache.hadoop.ipc.Client.call(Client.java:303)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
>     at org.apache.hadoop.dfs.$Proxy1.open(Unknown Source)
>     at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:315)
>     at org.apache.hadoop.dfs.DFSClient$DFSInputStream.<init>(DFSClient.java:302)
>     at org.apache.hadoop.dfs.DFSClient.open(DFSClient.java:95)
>     at org.apache.hadoop.dfs.DistributedFileSystem.openRaw(DistributedFileSystem.java:78)
>     at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
>     at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:220)
>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:146)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:234)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:226)
>     at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:36)
>     at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:53)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:105)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:709)
> Those tasks succeeded on the second or third try.
> After inserting a 10-second sleep between consecutive jobs, the problem disappeared.
> Here is my code to detect whether a job is completed:
>       // jc is a JobClient and job is the JobConf for this step.
>       RunningJob running = null;
>       boolean success = false;
>       try {
>         running = jc.submitJob(job);
>         String jobId = running.getJobID();
>         System.out.println("start job:\t" + jobId);
>         while (!running.isComplete()) {      // poll until the job finishes
>           try {
>             Thread.sleep(1000);
>           } catch (InterruptedException e) {}
>           running = jc.getJob(jobId);        // refresh the job status
>         }
>         success = running.isSuccessful();
>       } finally {
>         if (!success && (running != null)) {
>           running.killJob();
>         }
>         jc.close();
>       }
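
For comparison, the submit-and-wait step above can also be written with the blocking JobClient.runJob() call, which waits for the job to finish and throws an IOException if it fails. A sketch assuming the old org.apache.hadoop.mapred API, with the JobConf variable as a placeholder:

    import java.io.IOException;

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RunningJob;

    public class BlockingSubmit {
      public static void runStep(JobConf job) throws IOException {
        // runJob() submits the job and blocks until it completes;
        // it throws an IOException if the job does not succeed.
        RunningJob result = JobClient.runJob(job);
        System.out.println("finished job:\t" + result.getJobID());
      }
    }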

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira