Posted to dev@nutch.apache.org by "Cosmin Lehene (JIRA)" <ji...@apache.org> on 2009/04/01 21:25:13 UTC

[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19

    [ https://issues.apache.org/jira/browse/NUTCH-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12694703#action_12694703 ] 

Cosmin Lehene commented on NUTCH-692:
-------------------------------------

The AlreadyBeingCreatedException appears when a reduce task fails on its first attempt and leaves its output files open when the next attempt starts. I have a patch for it: with the patch applied, the reduce task won't stop with an AlreadyBeingCreatedException on the second run. However, the initial failure is sometimes caused by other bugs, one of them being the regexp match hang caused by a Java regex bug. So even if you no longer get the AlreadyBeingCreatedException, you still need to deal with the regexp infinite loop.
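
For the regexp hang, one generic way to keep a runaway match from stalling a task is to put a time budget on it. The following is only a minimal sketch in plain Java, not the patch mentioned above and not code from Nutch or Hadoop. The names BoundedRegex, DeadlineCharSequence, RegexTimeoutException and findWithTimeout are hypothetical and exist only for this illustration. It relies on the fact that java.util.regex reads its input through CharSequence.charAt, so a wrapper can check a deadline on every character access and abort the match by throwing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Nutch or Hadoop.
final class BoundedRegex {

    /** Thrown when a match runs past its time budget. */
    static final class RegexTimeoutException extends RuntimeException {
        RegexTimeoutException(String message) {
            super(message);
        }
    }

    /** CharSequence wrapper that checks a deadline on every character access. */
    static final class DeadlineCharSequence implements CharSequence {
        private final CharSequence inner;
        private final long deadlineNanos;

        DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
            this.inner = inner;
            this.deadlineNanos = deadlineNanos;
        }

        @Override
        public char charAt(int index) {
            // The regex engine calls charAt while it matches and backtracks,
            // so this check runs throughout a long match.
            if (System.nanoTime() - deadlineNanos >= 0) {
                throw new RegexTimeoutException("regex match exceeded its time budget");
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            return new DeadlineCharSequence(inner.subSequence(start, end), deadlineNanos);
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }

    /**
     * Runs pattern.matcher(input).find() under a time budget.
     * Throws RegexTimeoutException if the budget is used up mid-match.
     */
    static boolean findWithTimeout(Pattern pattern, CharSequence input, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        Matcher matcher = pattern.matcher(new DeadlineCharSequence(input, deadline));
        return matcher.find();
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("wor\\w+");
        System.out.println(findWithTimeout(pattern, "hello world", 1000));
        try {
            // A budget of 0 ms is already spent, so the first charAt aborts the match.
            findWithTimeout(pattern, "hello world", 0);
        } catch (RegexTimeoutException e) {
            System.out.println("aborted: " + e.getMessage());
        }
    }
}
```

A caller would catch RegexTimeoutException and skip or log the offending input instead of letting the task hang until it is killed. This only bounds the time spent in a single match; it does not fix the underlying regex bug, and it does nothing about output files left open by a failed attempt.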

> AlreadyBeingCreatedException with Hadoop 0.19
> ---------------------------------------------
>
>                 Key: NUTCH-692
>                 URL: https://issues.apache.org/jira/browse/NUTCH-692
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Julien Nioche
>
> I have been using the SVN version of Nutch on an EC2 cluster and got some AlreadyBeingCreatedExceptions during the reduce phase of a parse. For some reason one of my tasks crashed, and then I ran into this AlreadyBeingCreatedException when other nodes tried to pick it up.
> There was recently a discussion on the Hadoop user list about similar issues with Hadoop 0.19 (see http://markmail.org/search/after+upgrade+to+0%2E19%2E0). I have not tried using 0.18.2 yet but will do so if the problems persist with 0.19.
> I was wondering whether anyone else had experienced the same problem. Do you think 0.19 is stable enough to use for Nutch 1.0?
> I will be running a crawl on a very large cluster in the next couple of weeks and will confirm this issue then.
> J.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.