Posted to user@nutch.apache.org by EM <em...@cpuedge.com> on 2005/09/01 03:42:02 UTC

RE: Analyser error

I did it with 0.6 (late versions) without the analyzer part and it all went
fine.

It's no big deal, I'll figure something out. Just wanted to let you know.

-----Original Message-----
From: Piotr Kosiorowski [mailto:pkosiorowski@gmail.com] 
Sent: Wednesday, August 31, 2005 2:56 PM
To: nutch-user@lucene.apache.org
Subject: Re: Analyser error

I have never done it this way (creating webdb content based on segments
only), so I do not know if it works. I do not have time at the moment to
test it myself, sorry.
Regards
Piotr

EM wrote:
> The problem is still there, maybe I'm doing something wrong?
> 
> 1. 'rm -r db' 
> 2. 'mkdir db'
> 3. 'bin/nutch admin db -create'
> 4. I'll then updatedb db from a fetched segment, this should fill it up with
> links?
> 5. 'bin/nutch analyze db 7'
> And it fails here with three 'tmp<something>' directories and webdb.new
> 
> 
> 
> -----Original Message-----
> From: Piotr Kosiorowski [mailto:pkosiorowski@gmail.com] 
> Sent: Tuesday, August 30, 2005 3:07 PM
> To: nutch-user@lucene.apache.org
> Subject: Re: Analyser error
> 
> It looks like you have temporary results from a previous run (probably
> killed or terminated unsuccessfully). It should be safe to remove the
> db\webdb.new directory and start again.
> Regards
> Piotr
> EM wrote:
> 
>>What does it mean if the bin/nutch analyze db 7 fails with:
>>
>>
>>050830 024914 Target pages from init(): 27419
>>050830 024914 Processing pagesByURL: Sorted 27419 instructions in 0.172 seconds.
>>050830 024914 Processing pagesByURL: Sorted 159412.79069767444 instructions/second
>>Finished at Tue Aug 30 02:49:14 EDT 2005
>>Exception in thread "main" java.io.IOException: already exists: db\webdb.new\pagesByURL
>>        at org.apache.nutch.io.MapFile$Writer.<init>(MapFile.java:86)
>>        at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:549)
>>        at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
>>        at org.apache.nutch.tools.DistributedAnalysisTool.completeRound(DistributedAnalysisTool.java:562)
>>        at org.apache.nutch.tools.LinkAnalysisTool.iterate(LinkAnalysisTool.java:60)
>>        at org.apache.nutch.tools.LinkAnalysisTool.main(LinkAnalysisTool.java:81)
>>
>>
> 
> 
> 
> 
>
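
For reference, the cleanup Piotr suggests in the quoted reply (removing the
leftover webdb.new directory before re-running the analysis) could be
sketched roughly as below. The function name remove_stale_webdb is
hypothetical and not part of Nutch, and the forward-slash path assumes a
Unix-like shell such as Cygwin; the thread itself shows the path as
db\webdb.new. The 'tmp<something>' directories are deliberately left alone
because their exact names are not given in the thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): remove the webdb.new directory
# that an interrupted run may have left inside the given db directory.
remove_stale_webdb() {
    db_dir="${1:-}"
    if [ -z "$db_dir" ]; then
        echo "usage: remove_stale_webdb <db-dir>" >&2
        return 1
    fi
    stale="$db_dir/webdb.new"
    if [ -d "$stale" ]; then
        echo "Removing stale directory: $stale"
        rm -r "$stale"
    else
        echo "No stale directory at: $stale"
    fi
}

# Usage, followed by the command taken from the thread:
#   remove_stale_webdb db
#   bin/nutch analyze db 7
```

Only the webdb.new directory is touched; the rest of the db directory is left
as it was, so the analysis can be retried against the existing data.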