Posted to user@nutch.apache.org by Aled Jones <Al...@comtec-europe.co.uk> on 2006/09/01 13:13:57 UTC

Remove unwanted urls

Hi there

Could someone give me some advice on using the prune index tool? I want
a command that removes all URLs ending in "/" or "index.html".

Cheers
Aled


