Posted to user@nutch.apache.org by Amna Waqar <am...@gmail.com> on 2011/02/25 05:45:43 UTC

help with deleting the docs

Hello everybody,
I want to check the language of each document, and if lang != ur I want to
delete that document before it can be accessed. I have used ur.ngp to detect
the language of the document, and the plugin is working fine. Now I want to
delete the non-Urdu documents. Where can I do this? What should the code look
like, and where should it be placed? My idea is to edit the index-basic plugin
so that it checks the parse data for lang and returns null if lang != ur, but
that only keeps the document from being passed to the index. How can I delete
the non-Urdu documents?
Regards,
Amna Waqar

Re: help with deleting the docs

Posted by Markus Jelsma <ma...@openindex.io>.
Hi,

You'd need an indexing filter to do this. There are several recent threads on 
this subject.
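
As a rough sketch of the decision such a filter would make (not the actual
Nutch code): as far as I know, the IndexingFilter contract in Nutch 1.x treats
a null return from filter() as "discard this document", so the core of the
plugin is a language check that either hands the document back or returns
null. The class and method names below (UrduOnlyGate, keep, filter) are
hypothetical, a plain Map stands in for the real document type, and the
language code is assumed to arrive as a plain string from wherever your
language plugin stores it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, NOT part of Nutch: it only illustrates the
// "return the document to keep it, return null to drop it" decision.
class UrduOnlyGate {
    // Language code we want to keep (assumption: ISO 639-1 code "ur").
    static final String WANTED_LANG = "ur";

    // True when the detected language means the document should be kept.
    static boolean keep(String lang) {
        return lang != null && WANTED_LANG.equalsIgnoreCase(lang.trim());
    }

    // Mirrors the shape of an indexing filter: the same document comes back
    // when it is kept, null when it should be dropped. The Map is a
    // placeholder for the real document class.
    static Map<String, String> filter(Map<String, String> doc, String detectedLang) {
        return keep(detectedLang) ? doc : null;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("url", "http://example.org/page");

        System.out.println("ur kept: " + (filter(doc, "ur") != null));
        System.out.println("en kept: " + (filter(doc, "en") != null));
        System.out.println("unknown kept: " + (filter(doc, null) != null));
    }
}
```

In a real plugin the same check would live inside your own IndexingFilter
implementation rather than in index-basic. Note that this only keeps the
document out of the index; it does not remove the page from the crawl data.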

Cheers,

On Friday 25 February 2011 05:45:43 Amna Waqar wrote:
> Hello everybody,
> I want to check the language of each document, and if lang != ur I want to
> delete that document before it can be accessed. I have used ur.ngp to detect
> the language of the document, and the plugin is working fine. Now I want to
> delete the non-Urdu documents. Where can I do this? What should the code look
> like, and where should it be placed? My idea is to edit the index-basic plugin
> so that it checks the parse data for lang and returns null if lang != ur, but
> that only keeps the document from being passed to the index. How can I delete
> the non-Urdu documents?
> Regards,
> Amna Waqar

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350