Posted to user@nutch.apache.org by suxiaoke04 <su...@126.com> on 2006/09/13 04:37:42 UTC

How can I modify the crawler?

   I want to build a topic-based search engine by modifying Nutch. For example, if I define a "computer" topic, I want searches to return only information about computers. I can't find the right place in Fetcher.java to insert my own code. Could you tell me how to modify the Fetcher and the parser? Thanks.
 
 

Re: How can I modify the crawler?

Posted by Ernesto De Santis <de...@yahoo.com.ar>.
Hi suxiaoke

I do something similar; I call it a category.

You need to do it in two steps:

- index your topic.
- search filtering by topic.

In my approach, I built a plugin to index the category. In that plugin,
the category is resolved by rules applied to the URL. In your case,
you know best how to decide the topic's values.
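
As a rough illustration of "rules applied to the URL", here is a minimal
sketch in plain Java. It is not the actual plugin code: the class name
UrlTopicResolver, the regex rules and the "other" fallback are all
hypothetical. The idea is that an indexing plugin would call something like
resolve(url) for each page and store the result in a "topic" field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: maps a URL to a topic using ordered regex rules. */
class UrlTopicResolver {

    /** One rule: if the pattern is found in the URL, the topic applies. */
    private static final class Rule {
        final Pattern pattern;
        final String topic;

        Rule(Pattern pattern, String topic) {
            this.pattern = pattern;
            this.topic = topic;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final String defaultTopic;

    /** defaultTopic is returned when no rule matches (an assumption of this sketch). */
    UrlTopicResolver(String defaultTopic) {
        this.defaultTopic = defaultTopic;
    }

    /** Registers a rule; rules are checked in the order they were added. */
    UrlTopicResolver addRule(String regex, String topic) {
        rules.add(new Rule(Pattern.compile(regex, Pattern.CASE_INSENSITIVE), topic));
        return this;
    }

    /** Returns the topic of the first rule whose pattern occurs in the URL. */
    String resolve(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.topic;
            }
        }
        return defaultTopic;
    }
}
```

With a rule such as addRule("/computers?/", "computer"), a URL like
http://example.com/computers/cpu.html would resolve to "computer". The first
matching rule wins, so more specific rules should be added first.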

Then searching on it is very easy.
In your search code, add something like this:

query.addRequiredTerm(aTopicValue, "topic");

Good luck
Ernesto.


suxiaoke04 wrote:
>    I want to build a topic-based search engine by modifying Nutch. For example, if I define a "computer" topic, I want searches to return only information about computers. I can't find the right place in Fetcher.java to insert my own code. Could you tell me how to modify the Fetcher and the parser? Thanks.