Posted to mapreduce-user@hadoop.apache.org by Jacob R Rideout <ap...@jacobrideout.net> on 2011/01/25 22:42:21 UTC
Distributed Cache problem
Hello all:
We've had an intermittent issue on our cluster when using the distributed cache:
11/01/25 13:46:19 INFO mapred.JobClient: Task Id : attempt_201101071032_13017_r_000030_2, Status : FAILED
java.io.FileNotFoundException: /hadoop.data.1/tmp/mapred/local/taskTracker/archive/hdfs/data/lookup/ipclass/classification_regex.txt/classification_regex.txt (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(Unknown Source)
    at java.io.FileInputStream.<init>(Unknown Source)
    at com.returnpath.tusko.IPClassifier.loadRules(IPClassifier.java:725)
    at com.returnpath.tusko.IPClassification$Reduce.loadClassifier(IPClassification.java:193)
    at com.returnpath.tusko.IPClassification$Reduce.setup(IPClassification.java:313)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
The failure occurs where our code calls
"DistributedCache.getLocalCacheFiles(job)" and eventually opens one of the
returned paths with "new FileInputStream(path.toString());"
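
For anyone trying to narrow this down, here is a minimal sketch of a more defensive open step. It is not our actual IPClassifier code: the class name CacheFileOpener and its methods are hypothetical, and it assumes the local cache paths have already been converted to strings (for example via Path.toString() on each entry returned by getLocalCacheFiles). It only uses the JDK, and it reports every candidate path that was checked instead of failing on the first one, which may make it easier to see what the task actually found on disk.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of Hadoop or of the original job code.
class CacheFileOpener {

    // Returns the first candidate whose file name matches and that exists
    // as a readable regular file. If none qualifies, throws an exception
    // listing every candidate that was checked and what was found there.
    static Path findReadable(List<String> localPaths, String fileName)
            throws FileNotFoundException {
        StringBuilder report = new StringBuilder();
        for (String candidate : localPaths) {
            Path p = Paths.get(candidate);
            Path name = p.getFileName();
            boolean nameMatches = name != null && name.toString().equals(fileName);
            boolean readable = Files.isRegularFile(p) && Files.isReadable(p);
            if (nameMatches && readable) {
                return p;
            }
            report.append("\n  ").append(candidate)
                  .append(" (nameMatches=").append(nameMatches)
                  .append(", exists=").append(Files.exists(p))
                  .append(", readable=").append(readable).append(")");
        }
        throw new FileNotFoundException(
                "No readable local copy of " + fileName + "; checked:" + report);
    }

    // Opens the located file; the caller is responsible for closing the stream.
    static InputStream open(List<String> localPaths, String fileName)
            throws IOException {
        return Files.newInputStream(findReadable(localPaths, fileName));
    }
}
```

In a reducer's setup method this could be called as CacheFileOpener.open(paths, "classification_regex.txt"), where paths is the hypothetical list of stringified cache paths. This does not fix the underlying problem; it only makes the failure message more informative.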
When the issue has occurred, we've been able to work around it by either
1) restarting the task-trackers, or
2) deleting and recreating the file to be cached on hdfs.
Does anyone have any idea what the root cause of the problem might be?
Thank you,
Jacob Rideout