Posted to user@nutch.apache.org by carmmello <ca...@globo.com> on 2005/10/12 16:21:02 UTC

Db - error in updating

I have tried three times to build an index (twice with the intranet method and once with the full web method, which produced the error message below). Each time, the database update fails with an error stating that some URLs are out of order.

I did not encounter this kind of problem with previous versions.



051012 110415 Finishing update
051012 110559 Processing pagesByURL: Sorted 2702317 instructions in 103.48 seconds.
051012 110559 Processing pagesByURL: Sorted 26114.389253962116 instructions/second
Exception in thread "main" java.io.IOException: key out of order: http://www.ino.com/ after http://wwwcgi.ci.boulder.co.us/calendar.pl
    at org.apache.nutch.io.MapFile$Writer.checkKey(MapFile.java:134)
    at org.apache.nutch.io.MapFile$Writer.append(MapFile.java:120)
    at org.apache.nutch.db.WebDBWriter$PagesByURLProcessor.mergeEdits(WebDBWriter.java:736)
    at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:557)
    at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
    at org.apache.nutch.tools.UpdateDatabaseTool.close(UpdateDatabaseTool.java:321)
    at org.apache.nutch.tools.UpdateDatabaseTool.main(UpdateDatabaseTool.java:371)

[root@localhost nutch-0.7.1]# 
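
To show what I think the message means, here is a minimal sketch of an ordering check of this kind. It is not the Nutch code: the class name SortedKeyCheck and its methods are made up for illustration, and I am assuming the keys are compared as plain strings, character by character, and that a key is rejected only when it sorts strictly before the previous one.

```java
import java.io.IOException;

// Hypothetical illustration only; this is NOT the actual MapFile.Writer code.
public class SortedKeyCheck {
    private String lastKey = null;

    // Assumption: a key is out of order if it sorts strictly before the previous key.
    public static boolean isOutOfOrder(String previous, String next) {
        return previous.compareTo(next) > 0;
    }

    // Accepts keys only in non-decreasing order, as a sorted file writer would.
    public void append(String key) throws IOException {
        if (lastKey != null && isOutOfOrder(lastKey, key)) {
            throw new IOException("key out of order: " + key + " after " + lastKey);
        }
        lastKey = key;
    }

    public static void main(String[] args) {
        SortedKeyCheck writer = new SortedKeyCheck();
        try {
            writer.append("http://wwwcgi.ci.boulder.co.us/calendar.pl");
            // '.' (46) sorts before 'c' (99), so this key belongs earlier.
            writer.append("http://www.ino.com/");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under that assumption, the two URLs share the prefix "http://www" and differ at the next character, where '.' sorts before 'c', so http://www.ino.com/ should have been written before the wwwcgi URL. That suggests the entries reached the writer unsorted, but I cannot tell why.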

Any help?

Thanks