Posted to users@solr.apache.org by ufuk yılmaz <uy...@vivaldi.net.INVALID> on 2021/04/19 10:25:08 UTC
Save a new document only when certain fields change
I have a requirement to index a new document only when certain fields change. I implemented it using Solr's de-duplication feature: https://solr.apache.org/guide/8_4/de-duplication.html
An example document:
{
  "userID": "userid",
  "userName": "usernameA",
  "userLoginCount": 123,
  "date": "2020-03-03T13:41:01.104Z"
}
These documents are sent to Solr regularly to be indexed. The requirement is to index a new document if the user changes their username, and otherwise to update the "userLoginCount" and "date" fields of the existing document. My configuration is:
<updateRequestProcessorChain name="dedupe" processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">id</str>
    <bool name="overwriteDupes">true</bool>
    <str name="fields">userID,userName</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
The following document causes a new document to be indexed, because its userName differs:
{
  "userID": "userid",
  "userName": "usernameB",
  "userLoginCount": 128,
  "date": "2020-04-03T13:41:01.104Z"
}
This works nicely, with one problem. If a user changes their username from usernameA to usernameB and later back to usernameA, the older usernameA document is updated instead of a new one being created, because the signature computed from userID and userName is the same as before. What I am trying to capture is the history of a user's username changes over time, preserving the older states, so in this case there should be 3 documents.
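To make the behavior above concrete, here is a minimal Python sketch that mimics the dedupe chain with a plain dictionary. It is only an illustration, not how Solr is implemented: MD5 stands in for Lookup3Signature, the `signature` and `index` helpers are hypothetical names, and the third update (login count 130, dated 2020-05-03) is made up to represent the switch back to usernameA.

```python
import hashlib


def signature(doc, fields=("userID", "userName")):
    """Hash the configured fields. MD5 is a stand-in here, not Lookup3."""
    h = hashlib.md5()
    for name in fields:
        h.update(str(doc[name]).encode("utf-8"))
    return h.hexdigest()


def index(store, doc):
    """Mimic signatureField=id with overwriteDupes=true:
    the signature becomes the id, so an equal signature overwrites."""
    doc_id = signature(doc)
    store[doc_id] = dict(doc, id=doc_id)
    return doc_id


store = {}
updates = [
    {"userID": "userid", "userName": "usernameA",
     "userLoginCount": 123, "date": "2020-03-03T13:41:01.104Z"},
    {"userID": "userid", "userName": "usernameB",
     "userLoginCount": 128, "date": "2020-04-03T13:41:01.104Z"},
    # Hypothetical third update: the user switches back to usernameA.
    {"userID": "userid", "userName": "usernameA",
     "userLoginCount": 130, "date": "2020-05-03T13:41:01.104Z"},
]
ids = [index(store, doc) for doc in updates]

# The first and third updates share a signature, so only 2 documents remain.
print(len(store), ids[0] == ids[2])
```

The third update lands on the same id as the first one, so the original usernameA document is overwritten and the store ends up with 2 documents rather than the 3 I want.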
Is there a way to achieve this in Solr, or should I find a solution outside Solr?
--ufuk yilmaz