Posted to user@nutch.apache.org by Yulio Aleman Jimenez <yu...@uci.cu> on 2016/04/29 22:47:32 UTC

Prioritize links in the fetching step

Hi. 

I'm using Nutch 1.9 with Solr 4.10 in a local environment. 
I need a way to prioritize some links in the fetching step by filtering the new links discovered in previous crawls according to some criteria, for example the file extension of the resource. The goal is to have images, documents, etc. fetched before HTML pages during the crawl.
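
To make the criterion concrete, here is a minimal sketch in plain Java of the kind of extension-based weighting I have in mind. It does not use any Nutch API: the class name, the extension list and the boost values are only assumptions for illustration, and I do not know yet which Nutch extension point (if any) should call something like this.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ExtensionPriority {
    // Hypothetical boost factors; extensions and values are illustrative only.
    private static final Map<String, Float> BOOSTS = new HashMap<String, Float>();
    static {
        BOOSTS.put("jpg", 2.0f);
        BOOSTS.put("png", 2.0f);
        BOOSTS.put("gif", 2.0f);
        BOOSTS.put("pdf", 1.5f);
        BOOSTS.put("doc", 1.5f);
        BOOSTS.put("odt", 1.5f);
    }

    // Returns the lower-cased extension of the URL path, or "" if there is none.
    public static String extensionOf(String url) {
        String path;
        try {
            path = URI.create(url).getPath();
        } catch (IllegalArgumentException e) {
            return ""; // malformed URL: treat as having no extension
        }
        if (path == null) {
            return "";
        }
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // The dot must be in the last path segment and not be the final character.
        if (dot <= slash || dot == path.length() - 1) {
            return "";
        }
        return path.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Boost for a URL: 1.0 (neutral) when the extension is not in the table.
    public static float boostFor(String url) {
        Float boost = BOOSTS.get(extensionOf(url));
        return boost == null ? 1.0f : boost.floatValue();
    }

    // Value a crawler could sort by: higher means "fetch earlier".
    public static float sortValue(String url, float initialScore) {
        return initialScore * boostFor(url);
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://example.com/index.html",
            "http://example.com/img/photo.JPG?size=1",
            "http://example.com/docs/report.pdf"
        };
        for (String url : urls) {
            System.out.println(url + " -> " + sortValue(url, 1.0f));
        }
    }
}
```

With these assumed values, an image or document URL would get a higher sort value than an HTML page with the same initial score, which is the ordering I am looking for.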

Is there any property in nutch-site.xml, or any plugin, that can do this? If not, how could I do it?

I welcome any suggestions, or source code snippets for creating a new Nutch plugin.

Best regards 

-- 
Ing. Yulio Aleman Jimenez 
Dpto. Soluciones Informáticas para Internet. CIDI 
Universidad de las Ciencias Informáticas (UCI) 
----------------------------------------------------------------------------------------------------------------------------------- 
"Men may die, BUT NEVER THEIR IDEAS"

