Posted to solr-user@lucene.apache.org by christopher palm <cp...@gmail.com> on 2014/01/25 19:21:53 UTC
How to handle multiple sub second updates to same SOLR Document
I have a scenario where the same SOLR document is being updated several
times within a few ms of each other due to how the source system is sending
in field updates on the document.
The problem I am trying to solve is that the order of these updates isn’t
guaranteed once the multi threaded SOLRJ client starts sending them to
SOLR, and older updates are overlaying the newer updates on the same
document.
I would like to use timestamp-based versioning so that an older document
change won’t be sent into SOLR, but I didn’t see any automated way of doing
this based on the document timestamp.
Is there a good way to handle this scenario in SOLR 4.6?
It seems that we would also have to soft auto commit at a sub-second
interval. Is that even possible?
Thanks,
Chris
Re: How to handle multiple sub second updates to same SOLR Document
Posted by Bram Van Dam <br...@intix.eu>.
On 01/25/2014 07:21 PM, christopher palm wrote:
> The problem I am trying to solve is that the order of these updates isn’t
> guaranteed once the multi threaded SOLRJ client starts sending them to
> SOLR, and older updates are overlaying the newer updates on the same
> document.
Don't do that. There is no way to guarantee what your updates will look
like. We deal with this by keeping a list of document IDs that are
currently being updated. If the ID is already in the list, do nothing.
If it isn't, go ahead. Some synchronization required :-)
If you can think of a better way, I'd love to hear it, but I haven't
found one.
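A minimal sketch of the "list of document IDs currently being updated" idea, assuming a Java client. The class name InFlightGuard and the method runIfIdle are hypothetical and not part of SolrJ; the update itself is passed in as a Runnable, so no Solr call is shown here.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of SolrJ: tracks which document IDs
// currently have an update in progress.
final class InFlightGuard {
    // Thread-safe set, so add() doubles as an atomic "check and claim".
    private final Set<String> inFlight = ConcurrentHashMap.newKeySet();

    // Runs the update only if no other update for this ID is in progress.
    // Returns true if the update ran, false if it was skipped.
    boolean runIfIdle(String docId, Runnable update) {
        if (!inFlight.add(docId)) {
            return false; // ID is already in the list: do nothing
        }
        try {
            update.run(); // e.g. the client call that sends the document
            return true;
        } finally {
            inFlight.remove(docId); // release the ID even if the update throws
        }
    }
}
```

Note that a skipped update is simply dropped in this sketch; whether the caller should retry it or discard it depends on the source system.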
Re: How to handle multiple sub second updates to same SOLR Document
Posted by Elisabeth Benoit <el...@gmail.com>.
yutz
Sent from my iPhone
On 26 Jan 2014 at 06:13, Shalin Shekhar Mangar <sh...@gmail.com> wrote:
> There is no timestamp versioning as such in Solr but there is a new
> document based versioning which will allow you to specify your own
> (externally assigned) versions.
>
> See the "Document Centric Versioning Constraints" section at
> https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
>
> Sub-second soft auto commit can be expensive but it is hard to say if
> it will be too expensive for your use-case. You must benchmark it
> yourself.
Re: How to handle multiple sub second updates to same SOLR Document
Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
There is no timestamp versioning as such in Solr but there is a new
document based versioning which will allow you to specify your own
(externally assigned) versions.
See the "Document Centric Versioning Constraints" section at
https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
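The idea behind externally assigned versions can be illustrated without Solr. The following is only a sketch of the "newer version wins" rule, using an in-memory map as a stand-in for the index; the names LatestWinsStore, offer and bodyOf are hypothetical, and the version is assumed to be a source-system timestamp in milliseconds. It is not how Solr implements the feature.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: an in-memory stand-in for the index, keyed by document ID.
final class LatestWinsStore {
    // Hypothetical pairing of a document body with its externally assigned version.
    static final class Versioned {
        final long version;
        final String body;
        Versioned(long version, String body) {
            this.version = version;
            this.body = body;
        }
    }

    private final Map<String, Versioned> docs = new ConcurrentHashMap<>();

    // Applies the update only if its version is strictly newer than the stored one.
    // Returns true if the update was applied, false if it was stale.
    boolean offer(String docId, long version, String body) {
        Versioned candidate = new Versioned(version, body);
        // merge() is atomic per key, so concurrent offers cannot interleave.
        Versioned winner = docs.merge(docId, candidate,
                (current, incoming) -> incoming.version > current.version ? incoming : current);
        return winner == candidate;
    }

    // Returns the stored body, or null if the document is unknown.
    String bodyOf(String docId) {
        Versioned v = docs.get(docId);
        return v == null ? null : v.body;
    }
}
```

With this rule, the arrival order of two updates no longer matters: whichever carries the higher version ends up stored, and the other is rejected.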
Sub-second soft auto commit can be expensive but it is hard to say if
it will be too expensive for your use-case. You must benchmark it
yourself.
--
Regards,
Shalin Shekhar Mangar.