Posted to dev@nutch.apache.org by Egor Chernodarov <eg...@zarinsk.dem.ru> on 2005/04/19 09:01:42 UTC

Killed crawl process and corrupted segment

Hello, all!

I killed a crawl process and now I want to use the data it had already
fetched. I tried "nutch segread -fix -dir dbdir/segments", but the process
hangs: there is no further output and CPU usage stays high.
Last output:
----------------------
run java in /usr/local/jdk1.4.2/
expr: syntax error
050417 023311 No NutchFileSystem indicated, so defaulting to local fs.
050417 023311 loading file:/usr/local/nutch-0.6/conf/nutch-default.xml
050417 023311 loading file:/usr/local/nutch-0.6/conf/nutch-site.xml
----------------------


Then I tried "nutch mergesegs -dir dbname/segments/ -i -ds" and the
process hung again (I believe that mergesegs also fixes corrupted
segments. Is that right?). Last output:
----------------------
run java in /usr/local/jdk1.4.2/
expr: syntax error
050418 021900 No NutchFileSystem indicated, so defaulting to local fs.
050418 021900 * Opening 5 segments:
050418 021900 loading file:/usr/local/nutch-0.6/conf/nutch-default.xml
050418 021900 loading file:/usr/local/nutch-0.6/conf/nutch-site.xml
050418 021900  - segment 20050414133251: 337 records.
050418 021900  - segment 20050414134026: 5589 records.
050418 021901  - segment 20050414135514: 42022 records.
050418 021902  - segment 20050414150300: 216240 records.
----------------------
The process hangs on the fifth and biggest segment (size is 6695M). ;-/

What can I do with the corrupted segment?
Maybe this problem is solved in the latest nutch CVS? I use the nutch-0.6
distribution (14-Jan-2005) and linux-sun-jdk1.4.2p6 on FreeBSD4.1 with 4 GB RAM.

Any help will be appreciated.


-- 
Best regards,
 Chernodarov Egor
 egor@zarinsk.dem.ru