Posted to user@nutch.apache.org by kaveh minooie <ka...@plutoz.com> on 2012/02/02 01:32:23 UTC

org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException:

I apologize in advance if this is strictly a Hadoop problem, but I hit 
it when I am parsing in Nutch.

I am running Nutch 1.4 over Hadoop 0.20.203 on 7 computers (7 datanodes, 
one of which is also the namenode and a tasktracker), and I usually get 
this after a few hours of parsing:

12/02/01 16:01:38 INFO mapred.JobClient: Task Id : attempt_201201301344_0025_r_000029_3, Status : FAILED
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /user/crawler/mycrawl/segments/20120201031744/parse_text/part-00029/data for DFSClient_attempt_201201301344_0025_r_000029_3 on client 10.0.0.16, because this file is already being created by DFSClient_attempt_201201301344_0025_r_000029_0 on 10.0.0.17
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1210)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1123)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:551)
	at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)

	at org.apache.hadoop.ipc.Client.call(Client.java:1030)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
	at $Proxy2.create(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
	at $Proxy2.create(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2862)
	at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:539)
	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:206)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:528)
	at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.<init>(SequenceFile.java:1074)
	at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:397)
	at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:354)
	at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:427)
	at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:157)
	at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:134)
	at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:92)
	at org.apache.nutch.parse.ParseOutputFormat.getRecordWriter(ParseOutputFormat.java:110)
	at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:447)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:489)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:419)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.Child.main(Child.java:253)

I get this multiple times during one parse run. It seems similar to the 
issue described here:

https://issues.apache.org/jira/browse/HADOOP-5268

but, as I said, I am using 0.20.X and I am still getting it.
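
To make the conflict in the exception message easier to see, here is a 
minimal Python sketch that pulls the two client names out of the message 
text. The helper name lease_conflict and the regular expression are 
assumptions made for illustration only; they are not part of Hadoop or 
Nutch. On the message quoted above, it suggests that the failing attempt 
(_3) and the client still holding the file (_0) are attempts of the same 
reduce task.

```python
import re

# Assumed pattern for the client names in the message above:
# DFSClient_<attempt prefix>_<attempt number>
ATTEMPT_RE = re.compile(r"DFSClient_(attempt_\d+_\d+_[mr]_\d+)_(\d+)")


def lease_conflict(message):
    """Return (attempt_prefix, requesting_attempt, holding_attempt).

    Returns None when the message does not name two clients that share
    the same attempt prefix (i.e. the same task).
    """
    found = ATTEMPT_RE.findall(message)
    if len(found) < 2:
        return None
    (req_prefix, req_no), (hold_prefix, hold_no) = found[0], found[1]
    if req_prefix != hold_prefix:
        return None
    return req_prefix, int(req_no), int(hold_no)


# Exception text copied from the log above.
message = (
    "failed to create file "
    "/user/crawler/mycrawl/segments/20120201031744/parse_text/part-00029/data "
    "for DFSClient_attempt_201201301344_0025_r_000029_3 on client 10.0.0.16, "
    "because this file is already being created by "
    "DFSClient_attempt_201201301344_0025_r_000029_0 on 10.0.0.17"
)

print(lease_conflict(message))
```

This only reads the log line; it does not change anything on the cluster.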

Thanks,
-- 
Kaveh Minooie

www.plutoz.com