Posted to dev@nutch.apache.org by "Sami Siren (JIRA)" <ji...@apache.org> on 2006/07/23 20:22:15 UTC

[jira] Commented: (NUTCH-266) hadoop bug when doing updatedb

    [ http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12422929 ] 
            
Sami Siren commented on NUTCH-266:
----------------------------------

I finally found the time to set up an environment with Cygwin and try this out. I can confirm that the hadoop.jar version shipped with Nutch produces these errors.

I then tested Nutch with the Hadoop nightly jar, and everything worked fine.

Can someone try the Hadoop nightly jar with Nutch and see if it works for you? Nightly builds of Hadoop are available from
http://people.apache.org/dist/lucene/hadoop/nightly/

Just extract the archive, take hadoop-nightly.jar from it, and use it to replace the Hadoop jar in your Nutch installation.
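
A minimal sketch of that jar swap for a Cygwin shell, offered as an illustration only. It assumes the bundled jar sits in the lib directory of the Nutch installation and matches hadoop-*.jar; the function name, the backup directory name, and the paths in the usage line are hypothetical, so check them against your own layout first.

```shell
# Hypothetical helper: swap the Hadoop jar bundled with Nutch for a nightly build.
# Assumption: the bundled jar lives in "$nutch_home/lib" and matches hadoop-*.jar.
replace_hadoop_jar() {
  nutch_home=$1
  nightly_jar=$2
  # Backups go outside lib/ so the old jar is not picked up alongside the new one.
  backup_dir="$nutch_home/hadoop-jar-backup"

  [ -f "$nightly_jar" ] || { echo "nightly jar not found: $nightly_jar" >&2; return 1; }
  [ -d "$nutch_home/lib" ] || { echo "no lib directory under: $nutch_home" >&2; return 1; }

  mkdir -p "$backup_dir" || return 1

  # Move every bundled Hadoop jar aside rather than deleting it,
  # so the original can be restored if the nightly build misbehaves.
  for old in "$nutch_home"/lib/hadoop-*.jar; do
    [ -f "$old" ] || continue
    mv "$old" "$backup_dir/" || return 1
  done

  # Put the nightly jar in place of the bundled one.
  cp "$nightly_jar" "$nutch_home/lib/" || return 1
}
```

Usage, with placeholder paths: replace_hadoop_jar /path/to/nutch /path/to/extracted/hadoop-nightly.jar. To go back to the bundled version, move the jar from the backup directory into lib and delete the nightly jar.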

Thanks.

> hadoop bug when doing updatedb
> ------------------------------
>
>                 Key: NUTCH-266
>                 URL: http://issues.apache.org/jira/browse/NUTCH-266
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8-dev
>         Environment: windows xp, JDK 1.4.2_04
>            Reporter: Eugen Kochuev
>
> I constantly get the following error message
> 060508 230637 Running job: job_pbhn3t
> 060508 230637 c:/nutch/crawl-20060508230625/crawldb/current/part-00000/data:0+245
> 060508 230637 c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_fetch/part-00000/data:0+296
> 060508 230637 c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_parse/part-00000:0+5258
> 060508 230637 job_pbhn3t
> java.io.IOException: Target /tmp/hadoop/mapred/local/reduce_qnd5sx/map_qjp7tf.out already exists
>         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:162)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:62)
>         at org.apache.hadoop.fs.LocalFileSystem.renameRaw(LocalFileSystem.java:191)
>         at org.apache.hadoop.fs.FileSystem.rename(FileSystem.java:306)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:101)
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:341)
>         at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:54)
>         at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] Commented: (NUTCH-266) hadoop bug when doing updatedb

Posted by Sami Siren <ss...@gmail.com>.
>
> Are you planning to update Hadoop to trunk/ ? I'd rather be careful 
> with that - I'm not sure if it's still compatible with Java 1.4, 
> besides being unreleased/unstable ...
>
I'm not planning an upgrade; I just want to know whether it resolves the issue.
We can then decide what the best thing to do is.

--
 Sami Siren



Re: [jira] Commented: (NUTCH-266) hadoop bug when doing updatedb

Posted by Andrzej Bialecki <ab...@getopt.org>.
Sami Siren (JIRA) wrote:
>     [ http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12422929 ] 
>             
> Sami Siren commented on NUTCH-266:
> ----------------------------------
>
> I finally found the time to set up an environment with Cygwin and try this out. I can confirm that the hadoop.jar version shipped with Nutch produces these errors.
>
> I then tested Nutch with the Hadoop nightly jar, and everything worked fine.
>
> Can someone try the Hadoop nightly jar with Nutch and see if it works for you? Nightly builds of Hadoop are available from
> http://people.apache.org/dist/lucene/hadoop/nightly/
>
>   


Are you planning to update Hadoop to trunk/ ? I'd rather be careful with 
that - I'm not sure if it's still compatible with Java 1.4, besides 
being unreleased/unstable ...

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com