Posted to solr-user@lucene.apache.org by Andy <an...@yahoo.com> on 2010/11/01 22:04:24 UTC
Which is faster -- delete or update?
My documents have a "down_vote" field. Every time a user votes down a document, I increment the "down_vote" field in my database and also re-index the document to Solr to reflect the new down_vote value.
During searches, I want to restrict the results to documents with, say, fewer than 3 down votes. There are 2 ways to implement that:
1) When a user down-votes a document, check whether the total down votes have reached 3. If they have, delete the document from the Solr index.
2) When a user down-votes a document, update the document in the Solr index to reflect the new down_vote value, even if the total is already 3 or more. At query time, add an "fq" to restrict results to documents with fewer than 3 down votes.
Which approach is better? Is it faster to delete a document from the index or to update the document to reflect the new down_vote value?
Thanks.
Andy
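For concreteness, here is a minimal Python sketch of the decision each approach makes. It assumes down_vote is an integer field and that the threshold is 3, as in the question; the function names are hypothetical and nothing here talks to Solr. Since Solr range queries in square brackets are inclusive, "fewer than 3" becomes the range up to 2.

```python
from urllib.parse import urlencode

MAX_DOWN_VOTES = 3  # documents with this many down votes or more are hidden


def should_delete(total_down_votes, max_down_votes=MAX_DOWN_VOTES):
    """Approach 1: decide at vote time whether the document leaves the index."""
    return total_down_votes >= max_down_votes


def build_search_params(user_query, max_down_votes=MAX_DOWN_VOTES):
    """Approach 2: keep the document and filter at query time with fq."""
    # Inclusive range, so "fewer than 3" on an integer field means up to 2.
    fq = "down_vote:[* TO %d]" % (max_down_votes - 1)
    return urlencode({"q": user_query, "fq": fq})


if __name__ == "__main__":
    print(should_delete(3))
    print(build_search_params("solr faceting"))
```

One thing to keep in mind with approach 2: a range filter like this only matches documents that actually have a down_vote value, so documents with no votes would need the field indexed as 0.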
Re: Which is faster -- delete or update?
Posted by Jonathan Rochkind <ro...@jhu.edu>.
The actual time it takes to delete or update the document is unlikely to
make a difference to you.
What might make a difference to you is the time it takes to actually
finalize the commit, and the time it takes to re-warm your indexes after
a commit, and especially the time it takes to run any warming queries
you have set in newSearcher. Most of these probably won't differ between
delete and update, but any of them could be a problem either way. One way
to find out: try it and measure it.
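A rough sketch of what "measure it" could look like in Python. The helper below just times a callable repeatedly; the callable itself is an assumption you would supply, and to be meaningful it should cover the whole cycle described above (send the delete or update, commit, then run a first query against the new searcher), not only the delete or update request.

```python
import statistics
import time


def time_operation(operation, repeats=20):
    """Run operation() several times and summarize the wall-clock time."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    # The median is less sensitive than the mean to a single slow warm-up run.
    return {"median": statistics.median(samples), "max": max(samples)}


if __name__ == "__main__":
    # Placeholder workload; replace with your own delete+commit+query
    # and update+commit+query functions (hypothetical, not shown here).
    print(time_operation(lambda: sum(range(10000)), repeats=5))
```

Comparing the two summaries on your own index and hardware tells you more than any general rule will.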
Whether you do a delete or an update, if you're planning on making
changes to your index more often than, oh, every 10 or 20 minutes,
you may run into trouble. Solr isn't so good at frequent changes to the
index like that. I haven't looked at it myself, but the Solr patches
that get called "near real-time" seem like they're intended to deal with
this, among other things, and allow frequent commits without killing
performance or RAM usage.
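Until something like that is available to you, one possible workaround is to buffer the vote changes on the application side and push them to Solr in batches. The following is only a sketch under that assumption: UpdateBatcher and the flush callback are hypothetical names, the callback is where your own code would send the updates and commit, and the 600 second default simply mirrors the 10 minute figure above.

```python
import time


class UpdateBatcher:
    """Collect per-document changes and hand them over at most once per interval."""

    def __init__(self, flush, min_interval_seconds=600, clock=time.monotonic):
        self._flush = flush                # called with {doc_id: down_votes}
        self._min_interval = min_interval_seconds
        self._clock = clock                # injectable so it can be tested
        self._pending = {}
        self._last_flush = clock()

    def record(self, doc_id, down_votes):
        # Last write wins: several votes on one document become one update.
        self._pending[doc_id] = down_votes
        self.maybe_flush()

    def maybe_flush(self):
        """Flush pending changes if the interval has elapsed; return True if flushed."""
        if not self._pending:
            return False
        if self._clock() - self._last_flush < self._min_interval:
            return False
        batch, self._pending = self._pending, {}
        self._flush(batch)
        self._last_flush = self._clock()
        return True
```

Note that this only flushes when record() or maybe_flush() is called, so a real setup would also call maybe_flush() from a timer. The cost is that a down vote can take up to one interval to show up in search results.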
I am not sure how/if other people are effectively dealing with
user-generated content that needs to be included in the index for
filtering and searching against. Would be very curious if anyone has any
successful strategies to share. Another example would be user-generated
tagging.
Erick Erickson wrote:
> Just deleting a document is faster because all that really happens
> is the document is marked as deleted. An update is really
> a delete followed by an add of the same document, so by definition
> an update will be slower...
>
> But... does it really make a difference? How often do you expect this to
> happen? Peter Karich added a note while I was typing this, and he
> makes some cogent points.
>
> I'm starting to think that I don't care about better unless and until my
> users notice (or I have a reasonable expectation that they #will# notice).
> I'm far more interested in simpler code that I can maintain than I am in
> shaving off another 4 milliseconds from the response time. That gives
> me more chance to put in cool new features that the user will notice...
>
> Best
> Erick
Re: Which is faster -- delete or update?
Posted by Erick Erickson <er...@gmail.com>.
Just deleting a document is faster because all that really happens
is the document is marked as deleted. An update is really
a delete followed by an add of the same document, so by definition
an update will be slower...
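To make the difference concrete, here is a small Python sketch that builds the two XML update messages. It assumes the standard Solr XML update format and a unique key field named "id"; the other field names are hypothetical, and the code only builds strings without sending anything.

```python
import xml.etree.ElementTree as ET


def delete_message(doc_id):
    """A delete only needs the unique key of the document."""
    root = ET.Element("delete")
    ET.SubElement(root, "id").text = str(doc_id)
    return ET.tostring(root, encoding="unicode")


def add_message(fields):
    """An update is a re-add, so every field of the document is sent again."""
    root = ET.Element("add")
    doc = ET.SubElement(root, "doc")
    for name, value in fields.items():
        ET.SubElement(doc, "field", name=name).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(delete_message("doc42"))
    print(add_message({"id": "doc42", "title": "Some title", "down_vote": 3}))
```

The delete carries just the key, while the update has to resend the whole document even though only down_vote changed.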
But... does it really make a difference? How often do you expect this to
happen? Peter Karich added a note while I was typing this, and he
makes some cogent points.
I'm starting to think that I don't care about better unless and until my
users notice (or I have a reasonable expectation that they #will# notice).
I'm far more interested in simpler code that I can maintain than I am in
shaving off another 4 milliseconds from the response time. That gives
me more chance to put in cool new features that the user will notice...
Best
Erick
Re: Which is faster -- delete or update?
Posted by Peter Karich <pe...@yahoo.de>.
From the user perspective I wouldn't delete it, because the down-voting
could be by mistake or spam or something, and up-voting can then
resurrect it.
It could also be wise to keep the docs so you can see which content (from
which users?) gets down-voted, to find spam accounts.
From the dev perspective you should benchmark it, if really necessary.
(I guess updating is more expensive, because I think it is
delete+completely-new-add)
Regards,
Peter.