Posted to solr-user@lucene.apache.org by KNitin <ni...@gmail.com> on 2015/12/05 01:55:21 UTC

Max indexing threads & RamBuffered size

Hi,

The max indexing threads setting in solrconfig.xml is 8 by default. Does
this mean only 8 concurrent indexing threads are allowed per collection,
or per core?

RAM buffer size: this seems to be set to 64 MB. If we have a beefier machine
that can take more load, can we set this to a higher limit, say 1 or 2 GB?
What would be the downside of doing so (apart from commits taking longer)?

Thanks in advance!
Nitin

Re: Max indexing threads & RamBuffered size

Posted by KNitin <ni...@gmail.com>.
Thanks Erick. I will profile and check it out.

On Saturday, December 5, 2015, Erick Erickson <er...@gmail.com>
wrote:

> bq: What adds a bottleneck in the indexing flow? Is it the buffering and
> flushing out to disk?
>
> It Depends (tm). What do the Solr logs show when an update fails or times
> out?
>
> You pretty much have to put a profiler on the Solr instance to see where
> it's spending the time, but timeouts are very often caused by:
> 1> having a very large heap
> 2> hitting a stop-the-world garbage collection that exceeds your timeouts.
>
> Best,
> Erick

Re: Max indexing threads & RamBuffered size

Posted by Erick Erickson <er...@gmail.com>.
bq: What adds a bottleneck in the indexing flow? Is it the buffering and
flushing out to disk?

It Depends (tm). What do the Solr logs show when an update fails or times
out?

You pretty much have to put a profiler on the Solr instance to see where it's
spending the time, but timeouts are very often caused by:
1> having a very large heap
2> hitting a stop-the-world garbage collection that exceeds your timeouts.

Best,
Erick

On Sat, Dec 5, 2015 at 8:07 PM, KNitin <ni...@gmail.com> wrote:
> I have an extremely large indexing load (4-5 MB per document, with over
> 100M documents). I have auto commit configured to flush to disk every
> 20 seconds (with openSearcher set to false). Even with that, updates
> sometimes fail or time out. The goal is to improve indexing throughput,
> so I am experimenting to see whether tweaking any of these settings
> speeds things up.
>
> What adds a bottleneck in the indexing flow? Is it the buffering and
> flushing out to disk?

Re: Max indexing threads & RamBuffered size

Posted by KNitin <ni...@gmail.com>.
I have an extremely large indexing load (4-5 MB per document, with over
100M documents). I have auto commit configured to flush to disk every
20 seconds (with openSearcher set to false). Even with that, updates
sometimes fail or time out. The goal is to improve indexing throughput,
so I am experimenting to see whether tweaking any of these settings
speeds things up.

What adds a bottleneck in the indexing flow? Is it the buffering and
flushing out to disk?
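
A 20-second hard commit that does not open a new searcher, as described
above, would typically look something like the fragment below in
solrconfig.xml. This is a sketch rather than the poster's actual
configuration, which is not shown in the thread; the updateHandler class
attribute is assumed to be the stock one.

```xml
<!-- Sketch only: assumed stock update handler, not the poster's real config -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- Hard commit at most 20 seconds after the first uncommitted update (milliseconds) -->
    <maxTime>20000</maxTime>
    <!-- Flush to disk without opening a new searcher, so no search-side warming cost -->
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
```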

On Sat, Dec 5, 2015 at 11:15 AM, Erick Erickson <er...@gmail.com>
wrote:

> I'm pretty sure that max indexing threads is per core, but I just looked
> and it's not supported in Solr 5.3 and above, so I wouldn't worry about
> it at all.
>
> I've never seen much benefit from bumping the RAM buffer size past 128M
> or maybe 256M. It is just how much memory is filled up before the
> buffer is flushed to disk. Unless you have very high indexing loads or
> really long autocommit times, you'll rarely hit it anyway, since this
> memory is also flushed when you do any flavor of hard commit.
>
> Best,
> Erick

Re: Max indexing threads & RamBuffered size

Posted by Erick Erickson <er...@gmail.com>.
I'm pretty sure that max indexing threads is per core, but I just looked
and it's not supported in Solr 5.3 and above, so I wouldn't worry about
it at all.

I've never seen much benefit from bumping the RAM buffer size past 128M
or maybe 256M. It is just how much memory is filled up before the
buffer is flushed to disk. Unless you have very high indexing loads or
really long autocommit times, you'll rarely hit it anyway, since this
memory is also flushed when you do any flavor of hard commit.
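
For reference, the buffer discussed above is set with ramBufferSizeMB
inside the indexConfig section of solrconfig.xml. A minimal sketch,
assuming an otherwise stock configuration; the value 128 is only
illustrative, taken from the range mentioned above, and is not a tuned
recommendation for any particular workload.

```xml
<!-- Sketch only: 128 is an illustrative value, not a recommendation -->
<indexConfig>
  <!-- Flush buffered documents to a new segment once the in-memory
       indexing buffer reaches roughly this many megabytes -->
  <ramBufferSizeMB>128</ramBufferSizeMB>
</indexConfig>
```

Since a hard commit also flushes this buffer, a short autoCommit interval
may mean the limit is rarely reached, whatever value is set here.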

Best,
Erick

On Fri, Dec 4, 2015 at 4:55 PM, KNitin <ni...@gmail.com> wrote:
> Hi,
>
> The max indexing threads setting in solrconfig.xml is 8 by default. Does
> this mean only 8 concurrent indexing threads are allowed per collection,
> or per core?
>
> RAM buffer size: this seems to be set to 64 MB. If we have a beefier machine
> that can take more load, can we set this to a higher limit, say 1 or 2 GB?
> What would be the downside of doing so (apart from commits taking longer)?
>
> Thanks in advance!
> Nitin