Posted to common-user@hadoop.apache.org by Barry Haddow <bh...@inf.ed.ac.uk> on 2008/09/08 15:33:39 UTC
race condition in SequenceFileOutputFormat.getReaders() ?
Hi
Using the Nutch tools, I see fairly frequent crashes in
SequenceFileOutputFormat.getReaders(), with stack traces like the one below.
What appears to be happening is that a temporary file inside the
generate-temp-1220879127849 directory exists when getReaders() lists the
contents of the directory, but has been deleted by the time getReaders()
tries to open it.
Since I'm using Nutch, I'm seeing this on Hadoop version 0.15, but the code
for getReaders() doesn't seem to have changed in 0.18. Is this a known problem?
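To make the suspected sequence concrete, here is a minimal sketch of the list-then-open pattern on the local filesystem, using java.nio rather than the Hadoop API. It is not the getReaders() code. The class TolerantListing, its method names, and the rule that names starting with "_" or "." are temporary are all assumptions made for illustration. It shows one possible mitigation: filter out temporary-looking names at listing time, and tolerate files that vanish before they are opened.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not part of Hadoop or Nutch.
class TolerantListing {

    // Assumption: names starting with "_" or "." mark in-progress or hidden files.
    static boolean looksTemporary(Path p) {
        String name = p.getFileName().toString();
        return name.startsWith("_") || name.startsWith(".");
    }

    // Step 1: list the directory, keeping regular files that do not look temporary.
    static List<Path> listCandidates(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path p : entries) {
                if (Files.isRegularFile(p) && !looksTemporary(p)) {
                    result.add(p);
                }
            }
        }
        Collections.sort(result); // deterministic order for the demo
        return result;
    }

    // Step 2: open each listed file. A file may have been deleted since step 1,
    // so a missing file is skipped instead of aborting the whole operation.
    static List<String> openAll(List<Path> listed) throws IOException {
        List<String> contents = new ArrayList<>();
        for (Path p : listed) {
            try {
                contents.add(new String(Files.readAllBytes(p), StandardCharsets.UTF_8));
            } catch (NoSuchFileException e) {
                // Deleted between listing and open: ignore and carry on.
            }
        }
        return contents;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("generate-temp-demo");
        Files.write(dir.resolve("part-00000"), "a".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("part-00001"), "b".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("_task_tmp"), "c".getBytes(StandardCharsets.UTF_8));

        List<Path> listed = listCandidates(dir);
        // Simulate another process deleting a file after the listing was taken.
        Files.delete(dir.resolve("part-00001"));
        System.out.println(openAll(listed));
    }
}
```

Whether silently skipping a vanished file is acceptable depends on the caller. If every listed file is expected to be final output, failing loudly may be the better choice, and filtering by name alone would be the safer half of this sketch.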
regards
Barry
2008-09-08 14:07:04,429 FATAL crawl.Generator - Generator: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Cannot open filename /tmp/hadoop-bhaddow/mapred/temp/generate-temp-1220879127849/_task_200809081337_0019_r_000004_1
at org.apache.hadoop.dfs.NameNode.open(NameNode.java:238)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
at org.apache.hadoop.ipc.Client.call(Client.java:482)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
at org.apache.hadoop.dfs.$Proxy0.open(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at org.apache.hadoop.dfs.$Proxy0.open(Unknown Source)
at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:848)
at org.apache.hadoop.dfs.DFSClient$DFSInputStream.<init>(DFSClient.java:840)
at org.apache.hadoop.dfs.DFSClient.open(DFSClient.java:285)
at org.apache.hadoop.dfs.DistributedFileSystem.open(DistributedFileSystem.java:114)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1356)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344)
at org.apache.hadoop.mapred.SequenceFileOutputFormat.getReaders(SequenceFileOutputFormat.java:87)
at org.apache.nutch.crawl.Generator.generate(Generator.java:443)
at org.apache.nutch.crawl.Generator.run(Generator.java:580)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:54)
at org.apache.nutch.crawl.Generator.main(Generator.java:543)