Posted to user@nutch.apache.org by alexis artes <al...@yahoo.com> on 2006/04/24 08:04:53 UTC
deletable files
Hi,
I am using NutchWAX, which uses Nutch v0.7,
together with Heritrix and WERA for a web archive
system.
Since we are archiving the websites that we crawl,
storage is a concern. Which files inside the Index
folder can be deleted? By trial and error, I found
that search and retrieval in WERA still work
without the following folders:
webdb, segment-*-indexs, segment-*-parse_data, and
segment-*-fetcher.
Could someone advise me whether what I am doing is
correct?
Best Regards,
Alexis Artes