Posted to user@nutch.apache.org by Roannel Fernández Hernández <ro...@uci.cu> on 2017/08/23 15:05:05 UTC

Exchange documents in indexing job

Hi folks: 

Is there a way in Nutch to send documents to a particular index writer according to the values of certain fields? 

Let me explain. My documents have a field called "mimetype". I want to send only the documents whose value for this field is "text/plain" to Solr, and send the documents whose value is "text/html" to RabbitMQ. How can I do that? 

Regards 

The @universidad_uci is Fidel. We young people will not fail.
#HastaSiempreComandante
#HastalaVictoriaSiempre

RE: Exchange documents in indexing job

Posted by Yossi Tamari <yo...@pipl.com>.
I don't see a good way to do it in configuration, but it should be very easy to override the write method in each of the two index writer plugins so that it checks the MIME type and decides whether or not to call super.write.
(One terrible way to do it with configuration only would be to configure just one of the indexers and use mimetype-filter to keep only the matching type, index, and then reconfigure for the other indexer, change mimetype-filter.txt to the other MIME type, and index again...)
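
To make the override idea concrete, here is a minimal, self-contained sketch of the pattern. It deliberately does not use the real Nutch classes: Doc and BaseWriter are hypothetical stand-ins for the Nutch document type and for an existing index writer plugin, and MimeTypeFilteringWriter is an illustrative name. In an actual plugin you would extend the real Solr or RabbitMQ writer class and read the field from the real document object, so check the signatures against your Nutch version.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MimeTypeRouting {

    /** Hypothetical stand-in for the Nutch document: a simple bag of field values. */
    static class Doc {
        private final Map<String, String> fields = new HashMap<>();

        Doc with(String name, String value) {
            fields.put(name, value);
            return this;
        }

        String get(String name) {
            return fields.get(name);
        }
    }

    /** Hypothetical stand-in for an existing index writer (e.g. the Solr or RabbitMQ one). */
    static class BaseWriter {
        final List<Doc> written = new ArrayList<>();

        public void write(Doc doc) {
            written.add(doc); // a real writer would send the document to its backend
        }
    }

    /** Overrides write so that only documents with the accepted MIME type are forwarded. */
    static class MimeTypeFilteringWriter extends BaseWriter {
        private final String acceptedMimeType;

        MimeTypeFilteringWriter(String acceptedMimeType) {
            this.acceptedMimeType = acceptedMimeType;
        }

        @Override
        public void write(Doc doc) {
            // equals(null) is false, so documents without a "mimetype" field are skipped
            if (acceptedMimeType.equals(doc.get("mimetype"))) {
                super.write(doc);
            }
        }
    }

    /** Offers two sample documents to both writers; returns how many each one accepted. */
    static int[] demo() {
        BaseWriter solrLike = new MimeTypeFilteringWriter("text/plain");
        BaseWriter rabbitLike = new MimeTypeFilteringWriter("text/html");

        Doc plain = new Doc().with("mimetype", "text/plain");
        Doc html = new Doc().with("mimetype", "text/html");

        // Every configured writer sees every document, so each must filter for itself.
        for (Doc doc : new Doc[] { plain, html }) {
            solrLike.write(doc);
            rabbitLike.write(doc);
        }
        return new int[] { solrLike.written.size(), rabbitLike.written.size() };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("solr-like: " + counts[0] + ", rabbitmq-like: " + counts[1]);
    }
}
```

The filter lives in each writer because both writers are offered every document; each subclass simply declines the documents it does not want. In a real deployment the accepted MIME type would presumably come from the plugin's configuration rather than a constructor argument.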
