Posted to dev@nutch.apache.org by "Roannel Fernández Hernández (JIRA)" <ji...@apache.org> on 2017/08/23 14:41:00 UTC

[jira] [Commented] (NUTCH-1480) SolrIndexer to write to multiple servers.

    [ https://issues.apache.org/jira/browse/NUTCH-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138433#comment-16138433 ] 

Roannel Fernández Hernández commented on NUTCH-1480:
----------------------------------------------------

I’m testing a solution that uses this file [1] to configure the index writers. In this XML file, each "writer" tag holds the parameters used by that writer and a mapping for every field of the Nutch documents. With this approach we can have as many field mappings as we need, not only for the Solr index writer but for every index writer we have. We will also be able to define several different configurations, even for the same IndexWriter class. The solution applies to all types of index writers, not just the Solr index writer.

The structure of [1] is described in [2].
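
As a rough sketch of the idea only: the fragment below shows two configurations of the same IndexWriter class, each with its own parameters and its own field mapping. The element names, attribute names, ids, parameter names, field names and URLs are illustrative assumptions and may not match [1] or [2]; the real template and schema are authoritative.

```xml
<!-- Hypothetical sketch; names and layout are assumptions, not copied from [1]. -->
<writers>
  <!-- First configuration of the Solr index writer -->
  <writer id="solr_primary" class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
    <parameters>
      <!-- Assumed parameter name and placeholder URL -->
      <param name="url" value="http://localhost:8983/solr/core1"/>
    </parameters>
    <mapping>
      <!-- Assumed mapping syntax: Nutch document field to index field -->
      <field source="content" dest="text"/>
      <field source="title" dest="title"/>
    </mapping>
  </writer>

  <!-- Second configuration of the same class, with different parameters
       and a different field mapping -->
  <writer id="solr_secondary" class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
    <parameters>
      <param name="url" value="http://localhost:8984/solr/core2"/>
    </parameters>
    <mapping>
      <field source="content" dest="body"/>
    </mapping>
  </writer>
</writers>
```

With a layout along these lines, sending documents to multiple servers would amount to listing one "writer" entry per target.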

[1] https://github.com/r0ann3l/nutch/blob/NUTCH-1480/conf/index-writers.xml.template
[2] https://github.com/r0ann3l/nutch/blob/NUTCH-1480/conf/index-writers.xsd

> SolrIndexer to write to multiple servers.
> -----------------------------------------
>
>                 Key: NUTCH-1480
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1480
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Minor
>         Attachments: adding-support-for-sharding-indexer-for-solr.patch, NUTCH-1480-1.6.1.patch
>
>
> SolrUtils should return an array of SolrServers and read the SolrUrl as a comma-delimited list of URLs using Configuration.getString(). SolrWriter should be able to handle this list of SolrServers.
> This is useful if you want to send documents to multiple servers if no replication is available or if you want to send documents to multiple NOCs.
> edit:
> This does not replace NUTCH-1377 but complements it. With NUTCH-1377 this issue allows you to index to multiple SolrCloud clusters at the same time.


