Posted to solr-user@lucene.apache.org by Sebastian Riemer <s....@littera.eu> on 2017/02/15 18:30:56 UTC

Atomic updates to increase single field bulk updates?

Dear Solr users,

When updating documents in bulk (e.g. 40,000 documents at once) and changing only the value of a single Boolean flag, I currently re-index all 40,000 objects in full. However, obtaining all the relevant information for each object from the database is relatively expensive.

I now wonder whether, in this situation, it would be a good idea to implement a single-field update routine using atomic updates. In that case, I could skip the lookups in the relational database entirely, since the only information needed would be the new value of that Boolean flag and the list of those 40,000 document IDs.
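
To make the idea concrete, here is a minimal sketch of how such a single-field update could be built from nothing but the ID list and the new flag value. It uses the "set" modifier of Solr's JSON atomic update syntax. The field name is_available, the unique key name id, the doc-N IDs and the batch size of 1,000 are assumptions made for illustration only.

```python
import json
from typing import Any, Dict, Iterator, List


def atomic_flag_updates(doc_ids: List[str], field: str, value: bool) -> List[Dict[str, Any]]:
    """Build one atomic-update document per ID, setting a single field."""
    # "id" is assumed to be the uniqueKey field of the schema.
    return [{"id": doc_id, field: {"set": value}} for doc_id in doc_ids]


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical IDs; in practice this is the list of the 40,000 affected documents.
doc_ids = [f"doc-{n}" for n in range(40000)]

# One JSON request body per batch of 1,000 documents (batch size is an assumption).
batches = [
    json.dumps(atomic_flag_updates(chunk, "is_available", True))
    for chunk in batched(doc_ids, 1000)
]
```

Each string in batches would then be sent as the body of an HTTP POST to the collection's /update handler with Content-Type application/json, followed by a commit. No database lookup is involved at any point.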

I am aware of the requirements to use atomic updates, but as I understood, those would not have a big impact on performance and only a slight increase in index size?

What is your opinion on that?

Thanks for your input, have a nice evening!

Sebastian


Re: Atomic updates to increase single field bulk updates?

Posted by Erick Erickson <er...@gmail.com>.
Well, "it depends". An atomic update first has to go out to disk and
decompress the original stored fields in 16K blocks,
then overlay the atomic update on the uncompressed doc, then re-index
the doc. That happens 40K times in your example.
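
As a rough conceptual model of the overlay step (not Solr's actual code), the work per document looks something like the sketch below. The function name apply_atomic_update, the field names and the sample document are all made up for illustration, and only the "set" modifier is handled.

```python
from typing import Any, Dict


def apply_atomic_update(stored_doc: Dict[str, Any],
                        update: Dict[str, Any],
                        key: str = "id") -> Dict[str, Any]:
    """Overlay "set" operations on a copy of the stored document.

    Conceptual model only: the returned full document is what would
    then have to be re-indexed in place of the old one.
    """
    merged = dict(stored_doc)  # stands in for the decompressed stored fields
    for field, op in update.items():
        if field == key:
            continue  # the key only identifies the document to update
        if isinstance(op, dict) and "set" in op:
            merged[field] = op["set"]
        else:
            raise ValueError(f"unsupported update for field {field!r}")
    return merged


# Hypothetical stored document and single-field update.
stored = {"id": "doc-1", "title": "Some title", "is_available": True}
update = {"id": "doc-1", "is_available": {"set": False}}
full_doc = apply_atomic_update(stored, update)
```

The point of the sketch is that the whole document still gets rebuilt and re-indexed; only the work of assembling it moves from the client to Solr.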

So yes, the stream going to Solr will be smaller if you do atomic
updates, but the processing on Solr will be heavier.

Plus, if you're not storing all the fields anyway, storing them just
for atomic updates adds some load to the system, as the index on disk
is bigger, so merges take more I/O and the like.

However, you state that "the process of obtaining all relevant
information for each object from the database is one of relatively
high cost." so likely the extra work on Solr's part is worth it to
you.

Best,
Erick

On Fri, Feb 17, 2017 at 2:36 AM, Bram Van Dam <br...@intix.eu> wrote:
>> I am aware of the requirements to use atomic updates, but as I understood, those would not have a big impact on performance and only a slight increase in index size?
>
> AFAIK there won't be a difference in index size between atomic updates
> and full updates, as the end result is the same.
>
> But you will probably see a performance increase because you'll only
> have to send 40000 boolean flags instead of 40000 full documents.
>
> Using atomic updates sounds like a good idea to me.
>
>  - Bram
>

Re: Atomic updates to increase single field bulk updates?

Posted by Bram Van Dam <br...@intix.eu>.
> I am aware of the requirements to use atomic updates, but as I understood, those would not have a big impact on performance and only a slight increase in index size?

AFAIK there won't be a difference in index size between atomic updates
and full updates, as the end result is the same.

But you will probably see a performance increase because you'll only
have to send 40000 boolean flags instead of 40000 full documents.

Using atomic updates sounds like a good idea to me.

 - Bram