Posted to users@solr.apache.org by Marius Grigaitis <ma...@home24.de.INVALID> on 2022/06/08 09:35:35 UTC

Solr indexing performance tips

Hi All,

Our Solr is bottlenecking on write performance (it uses lots of CPU and writes
queue up). I'm looking for tips on what to investigate to figure out whether we
can squeeze more write performance out of it without changing the setup too
drastically.

Here's the setup:
* Solr 8.2 (I know, it could be upgraded to 8.11 or later, but I haven't seen
anything in the changelogs that looks likely to change performance
significantly)
* Replica setup (one node is responsible for indexing, the other nodes
replicate every 10 minutes). The indexing node runs on 8 cores and 16 GB of RAM.
* 9 different cores. Each weighs around 100 MB on disk and holds
approximately 90k documents.
* Updating is performed using the update method in batches of 1000, with around
9 processes in parallel (split by core).
* The setup is pretty straightforward: nothing exotic in the processing chain,
mostly default configuration, just some field types defined in the core
configuration, and only a few IDs are stored.

The problem:
* It currently takes around 3 hours to process updates for all documents on
a machine with 2 cores and 8 GB of RAM.

My gut-feeling assumptions:
* Since the total index size is quite small (around 1 GB or less on disk) and
the machine doing the indexing is quite powerful, I would assume indexing
(updating) should be quite fast (probably <10 minutes for all documents).
However, it takes far longer than that, so we are probably doing something
wrong.

What am I looking for:
* Ideas about what might cause this?
* Common things to check / tips for write performance, maybe some reading
material?
* Ways to measure write performance, e.g. monitoring that shows what actually
takes time on the Solr side, so we can narrow things down?

Sorry if I misused some Solr term.

Thank you in advance for your tips and insights.

Marius

Re: Solr indexing performance tips

Posted by Marius Grigaitis <ma...@home24.de.INVALID>.
Hi Vincenzo,

Yes.

On Thu, Jun 16, 2022 at 12:39 PM Vincenzo D'Amore <v....@gmail.com>
wrote:

> Hi Marius, if I have understood correctly you have a deleteByQuery for each
> document, am I right?
>
> On Thu, 16 Jun 2022 at 11:04, Marius Grigaitis
> <ma...@home24.de.invalid> wrote:
>
> > Just a followup on the topic.
> >
> > * We checked settings on solr, seem quite default (especially on merge,
> > commit strategies, etc)
> > * We commit every 10 minutes
> > * Added NewRelic to the Solr instance to gather more data and graphs
> >
> > In the end what caught our eye is a few deleteByQuery lines in stacks of
> > running threads while Solr is overloaded. We temporarily removed
> > deleteByQuery and it had around 10x performance improvement on indexing
> > speed.
> >
> > How are we using deleteByQuery?
> >
> > update(add=[{uid: foo-123, sku: 123, ...}, {uid: bar-124, sku: 124} ...],
> > deleteByQuery=["sku: 123 AND uid != foo-123", "sku: 123 AND uid !=
> > bar-124"])
> >
> > UID is the uniqueKey for the index. We do this because "foo" or "bar"
> could
> > change and we no longer want the previous document present.
> >
> > Ideally we should probably change our uniqueKey to be `sku` in this case
> > and we would no longer need deleteByQuery but what could be interesting
> is
> > why deleteByQuery causes such performance bottleneck as well as how we
> > could potentially optimize it if we wanted to keep it?
> >
> > Marius
> >
> > On Wed, Jun 8, 2022 at 8:41 PM David Hastings <
> > hastings.recursive@gmail.com>
> > wrote:
> >
> > > > * Do NOT commit after each batch of 1000 docs. Instead, commit as
> > seldom
> > > as your requirements allows, e.g. try commitWithin=60000 to commit
> every
> > > minute
> > >
> > > this is the big one.  commit after the entire process is done or on a
> > > timer, if you don't need NRT searching, rarely does anyone ever need
> > that.
> > > the commit is a heavy operation and takes about the same time if you
> are
> > > committing 1000 documents or 100k documents.
> > >
> > > On Wed, Jun 8, 2022 at 10:40 AM Jan Høydahl <ja...@cominvent.com>
> > wrote:
> > >
> > > > * Go multi threaded for each core as Shawn says. Try e.g. 2, 3 and 4
> > > > threads
> > > > * Experiment with different batch sizes, e.g. try 500 and 2000 -
> > depends
> > > > on your docs what is optimal
> > > > * Do NOT commit after each batch of 1000 docs. Instead, commit as
> > seldom
> > > > as your requirements allows, e.g. try commitWithin=60000 to commit
> > every
> > > > minute
> > > >
> > > > Tip: Try to push Solr metrics to DataDog or some other service, where
> > you
> > > > can see a dashboard with stats on requests/sec, RAM, CPU, threads, GC
> > etc
> > > > which may answer your last question.
> > > >
> > > > Jan
> > > >
> > > > > 8. jun. 2022 kl. 14:06 skrev Shawn Heisey <ap...@elyograg.org>:
> > > > >
> > > > > On 6/8/2022 3:35 AM, Marius Grigaitis wrote:
> > > > >> * 9 different cores. Each weighs around ~100 MB on disk and has
> > > > >> approximately 90k documents inside each.
> > > > >> * Updating is performed using update method in batches of 1000,
> > > around 9
> > > > >> processes in parallel (split by core)
> > > > >
> > > > > This means that indexing within each Solr core is single-threaded.
> > The
> > > > way to increase indexing speed is to index in parallel with multiple
> > > > threads or processes per index.  If you can increase the CPU power
> > > > available on the Solr server when you increase the number of
> > > > processes/threads sending data to Solr, that might help.
> > > > >
> > > > > Thanks,
> > > > > Shawn
> > > > >
> > > >
> > > >
> > >
> >
> --
> Vincenzo D'Amore
>

Re: Solr indexing performance tips

Posted by Vincenzo D'Amore <v....@gmail.com>.
Hi Marius, if I have understood correctly you have a deleteByQuery for each
document, am I right?

On Thu, 16 Jun 2022 at 11:04, Marius Grigaitis
<ma...@home24.de.invalid> wrote:

> Just a followup on the topic.
>
> * We checked settings on solr, seem quite default (especially on merge,
> commit strategies, etc)
> * We commit every 10 minutes
> * Added NewRelic to the Solr instance to gather more data and graphs
>
> In the end what caught our eye is a few deleteByQuery lines in stacks of
> running threads while Solr is overloaded. We temporarily removed
> deleteByQuery and it had around 10x performance improvement on indexing
> speed.
>
> How are we using deleteByQuery?
>
> update(add=[{uid: foo-123, sku: 123, ...}, {uid: bar-124, sku: 124} ...],
> deleteByQuery=["sku: 123 AND uid != foo-123", "sku: 123 AND uid !=
> bar-124"])
>
> UID is the uniqueKey for the index. We do this because "foo" or "bar" could
> change and we no longer want the previous document present.
>
> Ideally we should probably change our uniqueKey to be `sku` in this case
> and we would no longer need deleteByQuery but what could be interesting is
> why deleteByQuery causes such performance bottleneck as well as how we
> could potentially optimize it if we wanted to keep it?
>
> Marius
>
> On Wed, Jun 8, 2022 at 8:41 PM David Hastings <
> hastings.recursive@gmail.com>
> wrote:
>
> > > * Do NOT commit after each batch of 1000 docs. Instead, commit as
> seldom
> > as your requirements allows, e.g. try commitWithin=60000 to commit every
> > minute
> >
> > this is the big one.  commit after the entire process is done or on a
> > timer, if you don't need NRT searching, rarely does anyone ever need
> that.
> > the commit is a heavy operation and takes about the same time if you are
> > committing 1000 documents or 100k documents.
> >
> > On Wed, Jun 8, 2022 at 10:40 AM Jan Høydahl <ja...@cominvent.com>
> wrote:
> >
> > > * Go multi threaded for each core as Shawn says. Try e.g. 2, 3 and 4
> > > threads
> > > * Experiment with different batch sizes, e.g. try 500 and 2000 -
> depends
> > > on your docs what is optimal
> > > * Do NOT commit after each batch of 1000 docs. Instead, commit as
> seldom
> > > as your requirements allows, e.g. try commitWithin=60000 to commit
> every
> > > minute
> > >
> > > Tip: Try to push Solr metrics to DataDog or some other service, where
> you
> > > can see a dashboard with stats on requests/sec, RAM, CPU, threads, GC
> etc
> > > which may answer your last question.
> > >
> > > Jan
> > >
> > > > 8. jun. 2022 kl. 14:06 skrev Shawn Heisey <ap...@elyograg.org>:
> > > >
> > > > On 6/8/2022 3:35 AM, Marius Grigaitis wrote:
> > > >> * 9 different cores. Each weighs around ~100 MB on disk and has
> > > >> approximately 90k documents inside each.
> > > >> * Updating is performed using update method in batches of 1000,
> > around 9
> > > >> processes in parallel (split by core)
> > > >
> > > > This means that indexing within each Solr core is single-threaded.
> The
> > > way to increase indexing speed is to index in parallel with multiple
> > > threads or processes per index.  If you can increase the CPU power
> > > available on the Solr server when you increase the number of
> > > processes/threads sending data to Solr, that might help.
> > > >
> > > > Thanks,
> > > > Shawn
> > > >
> > >
> > >
> >
>
-- 
Vincenzo D'Amore

Re: Solr indexing performance tips

Posted by Marius Grigaitis <ma...@home24.de.INVALID>.
I think there are (or were) technical reasons behind it, and that's something
to figure out. It's also more complicated than that; I simplified it here.
E.g. the uniqueKey is actually a composition of two IDs, and the relationship
between them is important for grouping purposes.

I agree with you that switching to sku might make sense, though.

On Thu, Jun 16, 2022, 20:07 Vincenzo D'Amore <v....@gmail.com> wrote:

> May I ask why you haven't used the sku as (primary key)? Do you need to
> have more versions of the same sku?
> For my understanding, if you can have the sku as primary key, almost all
> deleteByQuery are useless.
>
> On Thu, Jun 16, 2022 at 4:38 PM Shawn Heisey <ap...@elyograg.org> wrote:
>
> > On 6/16/22 02:59, Marius Grigaitis wrote:
> > > In the end what caught our eye is a few deleteByQuery lines in stacks
> of
> > > running threads while Solr is overloaded. We temporarily removed
> > > deleteByQuery and it had around 10x performance improvement on indexing
> > > speed.
> >
> > I do not understand all the low-level interactions.  But I have seen
> > deleteByQuery cause some major problems.  It seems to create a blocking
> > situation where Lucene waits for things to complete before it actually
> > does the delete, and anything sent AFTER the delete waits for the
> > delete.  Imagine this situation:
> >
> > 1) Ongoing indexing begins a segment merge, one that will take 15
> > minutes to complete.
> > 2) A deleteByQuery is sent.
> > 3) More index changes are sent.
> >
> > What happens in this situation is that step 2 will wait for the merge to
> > complete, and step 3 will wait for step 2 to complete.  I have seen
> > automatic segment merges that take a lot longer than 15 minutes.
> >
> > If step 2 is changed to query for ID and then use deleteById, then steps
> > 2 and 3 will run concurrently with the merge.
> >
> > It took a lot of headscratching to figure out why my indexing process
> > sometimes stalled for LONG time spans.
> >
> > Thanks,
> > Shawn
> >
> >
>
> --
> Vincenzo D'Amore
>

Re: Solr indexing performance tips

Posted by Vincenzo D'Amore <v....@gmail.com>.
May I ask why you haven't used the sku as the primary key (uniqueKey)? Do you
need to keep more than one version of the same sku?
From my understanding, if you can make the sku the primary key, almost all of
the deleteByQuery calls become unnecessary.

On Thu, Jun 16, 2022 at 4:38 PM Shawn Heisey <ap...@elyograg.org> wrote:

> On 6/16/22 02:59, Marius Grigaitis wrote:
> > In the end what caught our eye is a few deleteByQuery lines in stacks of
> > running threads while Solr is overloaded. We temporarily removed
> > deleteByQuery and it had around 10x performance improvement on indexing
> > speed.
>
> I do not understand all the low-level interactions.  But I have seen
> deleteByQuery cause some major problems.  It seems to create a blocking
> situation where Lucene waits for things to complete before it actually
> does the delete, and anything sent AFTER the delete waits for the
> delete.  Imagine this situation:
>
> 1) Ongoing indexing begins a segment merge, one that will take 15
> minutes to complete.
> 2) A deleteByQuery is sent.
> 3) More index changes are sent.
>
> What happens in this situation is that step 2 will wait for the merge to
> complete, and step 3 will wait for step 2 to complete.  I have seen
> automatic segment merges that take a lot longer than 15 minutes.
>
> If step 2 is changed to query for ID and then use deleteById, then steps
> 2 and 3 will run concurrently with the merge.
>
> It took a lot of headscratching to figure out why my indexing process
> sometimes stalled for LONG time spans.
>
> Thanks,
> Shawn
>
>

-- 
Vincenzo D'Amore

Re: Solr indexing performance tips

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/16/22 02:59, Marius Grigaitis wrote:
> In the end what caught our eye is a few deleteByQuery lines in stacks of
> running threads while Solr is overloaded. We temporarily removed
> deleteByQuery and it had around 10x performance improvement on indexing
> speed.

I do not understand all the low-level interactions.  But I have seen 
deleteByQuery cause some major problems.  It seems to create a blocking 
situation where Lucene waits for things to complete before it actually 
does the delete, and anything sent AFTER the delete waits for the 
delete.  Imagine this situation:

1) Ongoing indexing begins a segment merge, one that will take 15 
minutes to complete.
2) A deleteByQuery is sent.
3) More index changes are sent.

What happens in this situation is that step 2 will wait for the merge to 
complete, and step 3 will wait for step 2 to complete.  I have seen 
automatic segment merges that take a lot longer than 15 minutes.

If step 2 is changed to query for ID and then use deleteById, then steps 
2 and 3 will run concurrently with the merge.
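
For illustration, a rough SolrJ sketch of that query-then-deleteById approach
could look like the code below. It is not code from this thread: the class and
helper names are invented, the field names come from the sku/uid example
earlier in the thread, and the "uid != foo-123" pseudo-syntax is written as a
negated clause ("-uid:foo-123"), since Solr's query syntax has no "!=".

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public final class StaleDocCleanup {
    private StaleDocCleanup() {}

    // Delete all documents for a sku except the one just (re)indexed,
    // without using deleteByQuery.
    public static void deleteStale(SolrClient client, String sku, String keepUid) throws Exception {
        // 1) Query for the uniqueKey values the deleteByQuery would have matched.
        SolrQuery q = new SolrQuery("sku:" + sku + " AND -uid:" + keepUid);
        q.setFields("uid");
        q.setRows(1000); // assumes only a handful of documents per sku
        QueryResponse rsp = client.query(q);

        List<String> ids = new ArrayList<>();
        for (SolrDocument doc : rsp.getResults()) {
            ids.add((String) doc.getFieldValue("uid"));
        }

        // 2) Delete by id; unlike deleteByQuery, this does not have to wait
        //    for an in-progress segment merge to finish.
        if (!ids.isEmpty()) {
            client.deleteById(ids);
        }
    }
}

Whether this is practical depends on how many documents each delete query
matches, but it keeps ordinary indexing flowing while merges run.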

It took a lot of headscratching to figure out why my indexing process 
sometimes stalled for LONG time spans.

Thanks,
Shawn


Re: Solr indexing performance tips

Posted by Jan Høydahl <ja...@cominvent.com>.
Interesting find. I have seen other reports of very slow deleteByQuery before, so it should be used sparingly, and under no circumstances should you bombard Solr with multiple deleteByQuery requests on each update.

It sounds like a better plan to switch to a truly unique ID like SKU. Or, if you know the previous ID, use delete-by-id instead, which is much faster. While you're stuck with deleteByQuery, it would probably also be more efficient to collapse multiple deleteByQuery requests into one, i.e. (("sku: 123 AND uid != foo-123") OR ("sku: 124 AND uid != bar-124") ...) as one query, rather than individual ones. And try to batch them, say 100 at a time once in a while, rather than sending many small requests.
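
As a very rough sketch (not anyone's actual code from this thread: the
Deletion holder is made up, and the "uid != x" pseudo-syntax is written as a
negated "-uid:x" clause, which is how Solr's query syntax expresses it),
collapsing a batch of those per-document deletes into a single deleteByQuery
could look like:

import java.util.List;
import java.util.stream.Collectors;
import org.apache.solr.client.solrj.SolrClient;

public final class CollapsedDelete {
    // One (sku, uid-to-keep) pair per freshly indexed document.
    public record Deletion(String sku, String keepUid) {}

    // Issue one deleteByQuery for the whole batch instead of one per document.
    public static void deleteBatch(SolrClient client, List<Deletion> batch) throws Exception {
        String query = batch.stream()
            .map(d -> "(sku:" + d.sku() + " AND -uid:" + d.keepUid() + ")")
            .collect(Collectors.joining(" OR "));
        client.deleteByQuery(query);
    }
}

That still pays the deleteByQuery cost discussed elsewhere in this thread, just
far less often.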

Jan

> 16. jun. 2022 kl. 10:59 skrev Marius Grigaitis <ma...@home24.de.INVALID>:
> 
> Just a followup on the topic.
> 
> * We checked settings on solr, seem quite default (especially on merge,
> commit strategies, etc)
> * We commit every 10 minutes
> * Added NewRelic to the Solr instance to gather more data and graphs
> 
> In the end what caught our eye is a few deleteByQuery lines in stacks of
> running threads while Solr is overloaded. We temporarily removed
> deleteByQuery and it had around 10x performance improvement on indexing
> speed.
> 
> How are we using deleteByQuery?
> 
> update(add=[{uid: foo-123, sku: 123, ...}, {uid: bar-124, sku: 124} ...],
> deleteByQuery=["sku: 123 AND uid != foo-123", "sku: 123 AND uid !=
> bar-124"])
> 
> UID is the uniqueKey for the index. We do this because "foo" or "bar" could
> change and we no longer want the previous document present.
> 
> Ideally we should probably change our uniqueKey to be `sku` in this case
> and we would no longer need deleteByQuery but what could be interesting is
> why deleteByQuery causes such performance bottleneck as well as how we
> could potentially optimize it if we wanted to keep it?
> 
> Marius
> 
> On Wed, Jun 8, 2022 at 8:41 PM David Hastings <ha...@gmail.com>
> wrote:
> 
>>> * Do NOT commit after each batch of 1000 docs. Instead, commit as seldom
>> as your requirements allows, e.g. try commitWithin=60000 to commit every
>> minute
>> 
>> this is the big one.  commit after the entire process is done or on a
>> timer, if you don't need NRT searching, rarely does anyone ever need that.
>> the commit is a heavy operation and takes about the same time if you are
>> committing 1000 documents or 100k documents.
>> 
>> On Wed, Jun 8, 2022 at 10:40 AM Jan Høydahl <ja...@cominvent.com> wrote:
>> 
>>> * Go multi threaded for each core as Shawn says. Try e.g. 2, 3 and 4
>>> threads
>>> * Experiment with different batch sizes, e.g. try 500 and 2000 - depends
>>> on your docs what is optimal
>>> * Do NOT commit after each batch of 1000 docs. Instead, commit as seldom
>>> as your requirements allows, e.g. try commitWithin=60000 to commit every
>>> minute
>>> 
>>> Tip: Try to push Solr metrics to DataDog or some other service, where you
>>> can see a dashboard with stats on requests/sec, RAM, CPU, threads, GC etc
>>> which may answer your last question.
>>> 
>>> Jan
>>> 
>>>> 8. jun. 2022 kl. 14:06 skrev Shawn Heisey <ap...@elyograg.org>:
>>>> 
>>>> On 6/8/2022 3:35 AM, Marius Grigaitis wrote:
>>>>> * 9 different cores. Each weighs around ~100 MB on disk and has
>>>>> approximately 90k documents inside each.
>>>>> * Updating is performed using update method in batches of 1000,
>> around 9
>>>>> processes in parallel (split by core)
>>>> 
>>>> This means that indexing within each Solr core is single-threaded.  The
>>> way to increase indexing speed is to index in parallel with multiple
>>> threads or processes per index.  If you can increase the CPU power
>>> available on the Solr server when you increase the number of
>>> processes/threads sending data to Solr, that might help.
>>>> 
>>>> Thanks,
>>>> Shawn
>>>> 
>>> 
>>> 
>> 


Re: Solr indexing performance tips

Posted by Marius Grigaitis <ma...@home24.de.INVALID>.
Just a followup on the topic.

* We checked the settings on Solr; they seem quite default (especially the
merge and commit strategies, etc.)
* We commit every 10 minutes
* Added NewRelic to the Solr instance to gather more data and graphs

In the end, what caught our eye was a few deleteByQuery calls in the stack
traces of running threads while Solr was overloaded. We temporarily removed
deleteByQuery and saw around a 10x improvement in indexing speed.

How are we using deleteByQuery?

update(add=[{uid: foo-123, sku: 123, ...}, {uid: bar-124, sku: 124} ...],
deleteByQuery=["sku: 123 AND uid != foo-123", "sku: 123 AND uid !=
bar-124"])

UID is the uniqueKey for the index. We do this because "foo" or "bar" could
change and we no longer want the previous document present.
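
In SolrJ terms, the pattern is roughly as below. This is a simplified
reconstruction, not our actual client code, and the "!=" above is written as a
negated clause, since real Solr query syntax has no "!=".

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public final class UpdateWithCleanup {
    public static void send(SolrClient client) throws Exception {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("uid", "foo-123");  // uniqueKey
        doc.addField("sku", "123");
        // ... other fields ...

        UpdateRequest req = new UpdateRequest();
        req.add(doc);
        // Remove any older document for the same sku that has a different uid.
        req.deleteByQuery("sku:123 AND -uid:foo-123");
        req.process(client);
    }
}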

Ideally, we should probably change our uniqueKey to `sku` in this case, and we
would no longer need deleteByQuery. But what could be interesting is why
deleteByQuery causes such a performance bottleneck, as well as how we could
potentially optimize it if we wanted to keep it.

Marius

On Wed, Jun 8, 2022 at 8:41 PM David Hastings <ha...@gmail.com>
wrote:

> > * Do NOT commit after each batch of 1000 docs. Instead, commit as seldom
> as your requirements allows, e.g. try commitWithin=60000 to commit every
> minute
>
> this is the big one.  commit after the entire process is done or on a
> timer, if you don't need NRT searching, rarely does anyone ever need that.
> the commit is a heavy operation and takes about the same time if you are
> committing 1000 documents or 100k documents.
>
> On Wed, Jun 8, 2022 at 10:40 AM Jan Høydahl <ja...@cominvent.com> wrote:
>
> > * Go multi threaded for each core as Shawn says. Try e.g. 2, 3 and 4
> > threads
> > * Experiment with different batch sizes, e.g. try 500 and 2000 - depends
> > on your docs what is optimal
> > * Do NOT commit after each batch of 1000 docs. Instead, commit as seldom
> > as your requirements allows, e.g. try commitWithin=60000 to commit every
> > minute
> >
> > Tip: Try to push Solr metrics to DataDog or some other service, where you
> > can see a dashboard with stats on requests/sec, RAM, CPU, threads, GC etc
> > which may answer your last question.
> >
> > Jan
> >
> > > 8. jun. 2022 kl. 14:06 skrev Shawn Heisey <ap...@elyograg.org>:
> > >
> > > On 6/8/2022 3:35 AM, Marius Grigaitis wrote:
> > >> * 9 different cores. Each weighs around ~100 MB on disk and has
> > >> approximately 90k documents inside each.
> > >> * Updating is performed using update method in batches of 1000,
> around 9
> > >> processes in parallel (split by core)
> > >
> > > This means that indexing within each Solr core is single-threaded.  The
> > way to increase indexing speed is to index in parallel with multiple
> > threads or processes per index.  If you can increase the CPU power
> > available on the Solr server when you increase the number of
> > processes/threads sending data to Solr, that might help.
> > >
> > > Thanks,
> > > Shawn
> > >
> >
> >
>

Re: Solr indexing performance tips

Posted by David Hastings <ha...@gmail.com>.
> * Do NOT commit after each batch of 1000 docs. Instead, commit as seldom
as your requirements allows, e.g. try commitWithin=60000 to commit every
minute

This is the big one. Commit after the entire process is done, or on a timer,
if you don't need NRT searching (and rarely does anyone ever need that).
The commit is a heavy operation and takes about the same time whether you are
committing 1000 documents or 100k documents.

On Wed, Jun 8, 2022 at 10:40 AM Jan Høydahl <ja...@cominvent.com> wrote:

> * Go multi threaded for each core as Shawn says. Try e.g. 2, 3 and 4
> threads
> * Experiment with different batch sizes, e.g. try 500 and 2000 - depends
> on your docs what is optimal
> * Do NOT commit after each batch of 1000 docs. Instead, commit as seldom
> as your requirements allows, e.g. try commitWithin=60000 to commit every
> minute
>
> Tip: Try to push Solr metrics to DataDog or some other service, where you
> can see a dashboard with stats on requests/sec, RAM, CPU, threads, GC etc
> which may answer your last question.
>
> Jan
>
> > 8. jun. 2022 kl. 14:06 skrev Shawn Heisey <ap...@elyograg.org>:
> >
> > On 6/8/2022 3:35 AM, Marius Grigaitis wrote:
> >> * 9 different cores. Each weighs around ~100 MB on disk and has
> >> approximately 90k documents inside each.
> >> * Updating is performed using update method in batches of 1000, around 9
> >> processes in parallel (split by core)
> >
> > This means that indexing within each Solr core is single-threaded.  The
> way to increase indexing speed is to index in parallel with multiple
> threads or processes per index.  If you can increase the CPU power
> available on the Solr server when you increase the number of
> processes/threads sending data to Solr, that might help.
> >
> > Thanks,
> > Shawn
> >
>
>

Re: Solr indexing performance tips

Posted by Jan Høydahl <ja...@cominvent.com>.
* Go multi-threaded for each core, as Shawn says. Try e.g. 2, 3 and 4 threads
* Experiment with different batch sizes, e.g. try 500 and 2000 - what is optimal depends on your docs
* Do NOT commit after each batch of 1000 docs. Instead, commit as seldom as your requirements allow, e.g. try commitWithin=60000 to commit every minute (a minimal sketch follows this list)
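
A minimal sketch of the commitWithin part with SolrJ; the client would be e.g.
an HttpSolrClient pointed at the core, and the batch source is a placeholder:

import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

public final class CommitWithinExample {
    // Send one batch with commitWithin instead of an explicit commit per batch.
    public static void indexBatch(SolrClient client, List<SolrInputDocument> batch) throws Exception {
        client.add(batch, 60_000);  // commit within 60 seconds (value is in milliseconds)
    }
}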

Tip: Try to push Solr metrics to DataDog or some other service, where you can see a dashboard with stats on requests/sec, RAM, CPU, threads, GC, etc., which may answer your last question.

Jan

> 8. jun. 2022 kl. 14:06 skrev Shawn Heisey <ap...@elyograg.org>:
> 
> On 6/8/2022 3:35 AM, Marius Grigaitis wrote:
>> * 9 different cores. Each weighs around ~100 MB on disk and has
>> approximately 90k documents inside each.
>> * Updating is performed using update method in batches of 1000, around 9
>> processes in parallel (split by core)
> 
> This means that indexing within each Solr core is single-threaded.  The way to increase indexing speed is to index in parallel with multiple threads or processes per index.  If you can increase the CPU power available on the Solr server when you increase the number of processes/threads sending data to Solr, that might help.
> 
> Thanks,
> Shawn
> 


Re: Solr indexing performance tips

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/8/2022 3:35 AM, Marius Grigaitis wrote:
> * 9 different cores. Each weighs around ~100 MB on disk and has
> approximately 90k documents inside each.
> * Updating is performed using update method in batches of 1000, around 9
> processes in parallel (split by core)

This means that indexing within each Solr core is single-threaded.  The 
way to increase indexing speed is to index in parallel with multiple 
threads or processes per index.  If you can increase the CPU power 
available on the Solr server when you increase the number of 
processes/threads sending data to Solr, that might help.
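
As a hedged sketch of that multi-threaded approach (the thread count, URL,
class name, and batch source below are placeholders, not anything from this
thread), indexing one core from several threads with SolrJ could look like:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public final class ParallelCoreIndexer {
    public static void main(String[] args) throws Exception {
        List<List<SolrInputDocument>> batches = loadBatches();  // placeholder source of 1000-doc batches
        try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
            ExecutorService pool = Executors.newFixedThreadPool(4);  // try 2, 3, 4 threads per core
            for (List<SolrInputDocument> batch : batches) {
                pool.submit(() -> {
                    try {
                        client.add(batch);  // HttpSolrClient can be shared across threads
                    } catch (Exception e) {
                        e.printStackTrace();  // placeholder error handling
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            client.commit();  // or rely on commitWithin, as suggested elsewhere in the thread
        }
    }

    private static List<List<SolrInputDocument>> loadBatches() {
        return List.of();  // placeholder: fetch batches from the source system
    }
}

SolrJ's ConcurrentUpdateSolrClient does something similar internally (it queues
documents and sends them with several threads), so that may also be worth a
look.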

Thanks,
Shawn