Posted to dev@lucene.apache.org by "Paris, Dan" <da...@cgi.com> on 2017/01/06 11:01:48 UTC

Solr Dedupe Issue

Hi Solr Dev,

I'm attempting to get dedupe working in Solr 6.3.0 but am experiencing some issues.

The updateRequestProcessorChain for dedupe doesn't appear to be working.

We are running Solr 6.3.0 in cloud mode. It receives data from a NiFi flow through a "PutSolrContentStream" processor with the following configuration:

[two inline images of the processor configuration, not included in this plain text version]

When I attempt to use the dedupe capability of Solr as per the documentation <https://cwiki.apache.org/confluence/display/solr/De-Duplication>, duplicates are still indexed. The NiFi flow continually consumes a hard-coded JSON document from a lightweight Spring Boot server. The document contains a "signature" string and a "content" string (these are just placeholder fields for demo purposes). New copies of this document are continually being created in the Solr collection, whereas I would expect to see no change.
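
To make the expected behaviour concrete, the following is a minimal Python sketch of signature-based overwriting. It is not Solr's implementation: the `signature` helper, the `Collection` class, and the field values are hypothetical, and `hashlib.md5` only stands in for whichever signature class the chain is configured with. It assumes the signature is computed over the "signature" and "content" fields and used as the document key.

```python
import hashlib


def signature(doc, fields):
    """Hash the chosen fields of a document into a hex digest (stand-in for a Solr signature class)."""
    h = hashlib.md5()
    for name in fields:
        h.update(str(doc.get(name, "")).encode("utf-8"))
    return h.hexdigest()


class Collection:
    """Toy stand-in for a collection whose unique key is the computed signature."""

    def __init__(self, fields):
        self.fields = fields
        self.docs = {}

    def add(self, doc):
        # Identical field values give an identical signature, so the
        # earlier copy is overwritten instead of a new document being added.
        self.docs[signature(doc, self.fields)] = doc


collection = Collection(fields=["signature", "content"])
doc = {"signature": "abc123", "content": "placeholder text"}  # hypothetical values

# Simulate the flow posting the same hard-coded document repeatedly.
for _ in range(5):
    collection.add(dict(doc))

print(len(collection.docs))  # prints 1
```

With dedupe working as expected, repeated posts of the same document would leave the document count unchanged, as in the sketch. What we observe instead is the count growing with every post.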

solrconfig.xml and schema.xml are attached.

Would you be able to provide some assistance?

Thanks in advance,
Dan

Dan Paris | Leading Engineer
250 Brook Drive, Reading, RG2 6UA | United Kingdom
M:  +44 7920783573
dan.paris@cgi.com | www.cgi.com
Registered in England & Wales (registered number 947968)
Registered Office: 250 Brook Drive, Green Park, Reading RG2 6UA, United Kingdom