Posted to common-user@hadoop.apache.org by Barry Haddow <bh...@inf.ed.ac.uk> on 2008/09/08 15:33:39 UTC

race condition in SequenceFileOutputFormat.getReaders() ?

Hi 

Using the nutch tools I see fairly frequent crashes in 
SequenceFileOutputFormat.getReaders() with stack traces like the one below. 
What appears to be happening is that there's a temporary file inside the 
generate-temp-1220879127849 directory which exists when getReaders() lists 
the contents of the directory, but has been deleted by the time getReaders() 
tries to open it. 

Since I'm using nutch, this is in Hadoop version 0.15, but the code for 
getReaders() doesn't seem to have changed in 0.18. Is this a known problem?
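
To make the suspected sequence concrete, here is a minimal sketch of the 
list-then-open pattern together with one defensive way of handling it. It is 
not the Hadoop code: it uses java.nio.file on a local directory as a stand-in 
for the DFS, and the class TolerantReaders and its methods listCandidates() 
and openExisting() are hypothetical names made up for illustration. Skipping 
names that start with "_" or "." is also an assumption, based only on the 
temporary file name in the trace below.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop or Nutch.
class TolerantReaders {

    // Step 1: list the directory once, skipping entries whose names
    // look temporary (assumed here to start with "_" or ".").
    static List<Path> listCandidates(Path dir) throws IOException {
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (name.startsWith("_") || name.startsWith(".")) {
                    continue;
                }
                if (Files.isRegularFile(entry)) {
                    candidates.add(entry);
                }
            }
        }
        return candidates;
    }

    // Step 2: open each listed file. A file may have been deleted since
    // step 1, so a missing file is skipped instead of aborting the call.
    // The caller is responsible for closing the returned streams.
    static List<InputStream> openExisting(List<Path> candidates) throws IOException {
        List<InputStream> streams = new ArrayList<>();
        for (Path candidate : candidates) {
            try {
                streams.add(Files.newInputStream(candidate));
            } catch (NoSuchFileException e) {
                // Deleted between listing and opening: ignore it.
            }
        }
        return streams;
    }
}
```

The name filter alone would already avoid the file in my trace, and catching 
the missing-file case covers anything else that disappears between the two 
steps. Whether silently skipping a vanished file is acceptable depends on 
the caller, so treat this as a sketch rather than a proposed patch.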

regards
Barry

2008-09-08 14:07:04,429 FATAL crawl.Generator - Generator: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Cannot open filename /tmp/hadoop-bhaddow/mapred/temp/generate-temp-1220879127849/_task_200809081337_0019_r_000004_1
    at org.apache.hadoop.dfs.NameNode.open(NameNode.java:238)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

    at org.apache.hadoop.ipc.Client.call(Client.java:482)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
    at org.apache.hadoop.dfs.$Proxy0.open(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.dfs.$Proxy0.open(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:848)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.<init>(DFSClient.java:840)
    at org.apache.hadoop.dfs.DFSClient.open(DFSClient.java:285)
    at org.apache.hadoop.dfs.DistributedFileSystem.open(DistributedFileSystem.java:114)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1356)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344)
    at org.apache.hadoop.mapred.SequenceFileOutputFormat.getReaders(SequenceFileOutputFormat.java:87)
    at org.apache.nutch.crawl.Generator.generate(Generator.java:443)
    at org.apache.nutch.crawl.Generator.run(Generator.java:580)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:54)
    at org.apache.nutch.crawl.Generator.main(Generator.java:543)
