Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/05/01 19:43:47 UTC
[jira] Updated: (HADOOP-164) Getting errors in reading the output
files of a map/reduce job immediately after the job is complete
[ http://issues.apache.org/jira/browse/HADOOP-164?page=all ]
Doug Cutting updated HADOOP-164:
--------------------------------
Component: mapred
> Getting errors in reading the output files of a map/reduce job immediately after the job is complete
> ----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-164
> URL: http://issues.apache.org/jira/browse/HADOOP-164
> Project: Hadoop
> Type: Bug
> Components: mapred
> Reporter: Runping Qi
>
> I have an app that fires up map/reduce jobs sequentially. The output of one job is the input of the next.
> I observe that many map tasks fail due to file read errors:
> java.rmi.RemoteException: java.io.IOException: Cannot open filename /user/runping/runping/docs_store/stage_2/base_docs/part-00186
>         at org.apache.hadoop.dfs.NameNode.open(NameNode.java:130)
>         at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:237)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:216)
>         at org.apache.hadoop.ipc.Client.call(Client.java:303)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
>         at org.apache.hadoop.dfs.$Proxy1.open(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:315)
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.(DFSClient.java:302)
>         at org.apache.hadoop.dfs.DFSClient.open(DFSClient.java:95)
>         at org.apache.hadoop.dfs.DistributedFileSystem.openRaw(DistributedFileSystem.java:78)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.(FSDataInputStream.java:46)
>         at org.apache.hadoop.fs.FSDataInputStream.(FSDataInputStream.java:220)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:146)
>         at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:234)
>         at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:226)
>         at org.apache.hadoop.mapred.SequenceFileRecordReader.(SequenceFileRecordReader.java:36)
>         at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:53)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:105)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:709)
> Those tasks succeeded on the second or third try.
> After inserting a 10 second sleep between consecutive jobs, the problem disappears.
> Here is my code to detect whether a job is completed:
> boolean success = false;
> RunningJob running = null;
> try {
>     running = jc.submitJob(job);
>     String jobId = running.getJobID();
>     System.out.println("start job:\t" + jobId);
>     while (!running.isComplete()) {
>         try {
>             Thread.sleep(1000);
>         } catch (InterruptedException e) {}
>         running = jc.getJob(jobId);
>     }
>     success = running.isSuccessful();
> } finally {
>     if (!success && (running != null)) {
>         running.killJob();
>     }
>     jc.close();
> }
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira