Posted to dev@nutch.apache.org by Jay Pound <we...@poundwebhosting.com> on 2005/08/01 20:18:00 UTC
nutch prune
How do I write the queries file for pruning my database so that it keeps
only .com, .edu, .org, .us, etc., i.e. only US sites?
Thanks,
Jay Pound
Re: nutch prune
Posted by Matthias Jaekle <ja...@eventax.de>.
Hi Jay,
I think with the current version you can only prune segments.
We once wrote a class to prune the db.
Maybe you could use it and add a function that deletes pages according to
the urlfilter. I have attached our class.
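As a rough illustration of the kind of check such a function could apply (this is not the attached class and does not use any Nutch API), here is a minimal Java sketch that decides whether a URL's host ends in one of a few allowed top-level domains. The class name TldAllowList, the method keep, and the allow-list contents are assumptions made up for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Nutch: decides whether a page URL
// should be kept, based only on the top-level domain of its host.
class TldAllowList {
    // Assumed allow-list taken from the question (.com, .edu, .org, .us).
    private static final Set<String> ALLOWED =
        new HashSet<>(Arrays.asList("com", "edu", "org", "us"));

    // Returns true if the URL's host ends in an allowed TLD.
    // Malformed URLs and URLs without a host are treated as "do not keep".
    static boolean keep(String url) {
        try {
            String host = new URI(url).getHost();
            if (host == null) {
                return false;
            }
            int dot = host.lastIndexOf('.');
            if (dot < 0) {
                return false; // no dot, so no TLD to compare
            }
            String tld = host.substring(dot + 1).toLowerCase(Locale.ROOT);
            return ALLOWED.contains(tld);
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.example.com/index.html",
            "http://www.example.de/",
            "http://lib.example.edu/catalog"
        };
        for (String u : samples) {
            System.out.println(u + " -> " + (keep(u) ? "keep" : "prune"));
        }
    }
}
```

A db pruning loop could call keep(url) for each page and delete the ones for which it returns false. Note that a TLD check is only an approximation of "US sites", since .com and .org hosts can be located anywhere.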
Matthias
--
http://www.eventax.com - eventax GmbH
http://www.umkreisfinder.de - The search engine for local information and events
Jay Pound wrote:
> How do I write my queries file for pruning my database, to only .com .edu
> .org .us etc... only us sites?
> Thanks,
> Jay Pound