Posted to users@solr.apache.org by Dominique Bejean <do...@eolya.fr> on 2022/10/06 07:57:42 UTC

Advice in order to optimise resource usage of a huge server

Hi,

One of our customers has huge servers

   - Bare-metal
   - 64 CPUs
   - 512 GB RAM
   - 6x2TB disks in RAID 6 (so 2TB of disk space available)


I think the best way to optimize resource usage of these servers is to
install several Solr instances.

I imagine 2 scenarios to be tested according to data volumes, update rate,
request volume, ...

Do not configure disks in RAID 6, but leave 6 standard volumes (more disk
space, more I/O available).
Install 3 or 6 Solr instances, each one using 1 or 2 disk volumes.

Obviously, replicate shards and verify that replicas of a shard are not
located on the same physical server.

What I am not sure about is how MMapDirectory will work with several Solr
instances. Will off-heap memory be correctly managed and shared between
several Solr instances?

Thank you for your advice.

Dominique

Re: Advice in order to optimise resource usage of a huge server

Posted by James Greene <ja...@jamesaustingreene.com>.
A reason for sharding on a single server is the ~2.1 billion (2^31 - 1)
max docs per core limitation.

On Thu, Oct 6, 2022, 12:51 PM Dave <ha...@gmail.com> wrote:

> I know these machines. Sharding is kind of useless. Set the ssd tb drives
> up in fastest raid read available, 31 xms xmx, one solr instance. Buy back
> up ssd drives when you burn one out and it fails over to the master server.
> Multiple solr instances on one machine makes little sense unless they have
> different purposes like a ml instance and a text highlighting instance but
> even then you get no performance improvement
>
>
> > On Oct 6, 2022, at 12:21 PM, Shawn Heisey <ap...@elyograg.org> wrote:
> >
> > On 10/6/22 01:57, Dominique Bejean wrote:
> >> One of our customer have huge servers
> >>
> >>    - Bar-metal
> >>    - 64 CPU
> >>    - 512 Gb RAM
> >>    - 6x2Tb disk in RAID 6 (so 2Tb disk space available)
> >>
> >>
> >> I think the best way to optimize resources usage of these servers is to
> >> install several Solr instances.
> >
> > That is not what I would do.
> >
> >> Do not configure disks in RAID 6 but, leave 6 standard volumes (more
> space
> >> disk, more I/O available)
> >> Install 3 or 6 solr instances each one using 1 ou 2 disk volumes
> >
> > RAID10 will get you the best performance.  Six 2TB drives in RAID10 has
> 6TB of total space.  The ONLY disadvantage that RAID10 has is that you pay
> for twice the usable storage.  Disks are relatively cheap, though hard to
> get in quantity these days.  I would recommend going with the largest
> stripe size your hardware can support.  1MB is typically where that maxes
> out.
> >
> > Any use of RAID5 or RAID6 has two major issues:  1) A serious
> performance problem that also affects reads if there are ANY writes
> happening.  2) If a disk fails, performance across the board is terrible.
> When the bad disk is replaced, performance is REALLY terrible as long as a
> rebuild is happening, and I have seen a RAID5/6 rebuild take 24 to 48 hours
> with 2TB disks on a busy array.  It would take even longer with larger
> disks.
> >
> >> What I am not sure is how MMapDirectory will work with several Solr
> >> instances. Will off heap memory correctly managed and shared between
> >> several Solr instances ?
> >
> > With symlinks or multiple mount points in the solr home, you can have a
> single instance handle indexes on multiple storage devices.  One instance
> has less overhead, particularly in memory, than multiple instances. Off
> heap memory for the disk cache should function as expected with multiple
> instances or one instances.
> >
> > Thanks,
> > Shawn
> >
>

Re: Advice in order to optimise resource usage of a huge server

Posted by Dave <ha...@gmail.com>.
I know these machines. Sharding is kind of useless. Set the SSD TB drives up in the fastest RAID configuration for reads available, 31 GB Xms/Xmx, one Solr instance. Buy backup SSD drives for when you burn one out and it fails over to the master server. Multiple Solr instances on one machine make little sense unless they have different purposes, like an ML instance and a text highlighting instance, but even then you get no performance improvement.


> On Oct 6, 2022, at 12:21 PM, Shawn Heisey <ap...@elyograg.org> wrote:
> 
> On 10/6/22 01:57, Dominique Bejean wrote:
>> One of our customer have huge servers
>> 
>>    - Bar-metal
>>    - 64 CPU
>>    - 512 Gb RAM
>>    - 6x2Tb disk in RAID 6 (so 2Tb disk space available)
>> 
>> 
>> I think the best way to optimize resources usage of these servers is to
>> install several Solr instances.
> 
> That is not what I would do.
> 
>> Do not configure disks in RAID 6 but, leave 6 standard volumes (more space
>> disk, more I/O available)
>> Install 3 or 6 solr instances each one using 1 ou 2 disk volumes
> 
> RAID10 will get you the best performance.  Six 2TB drives in RAID10 has 6TB of total space.  The ONLY disadvantage that RAID10 has is that you pay for twice the usable storage.  Disks are relatively cheap, though hard to get in quantity these days.  I would recommend going with the largest stripe size your hardware can support.  1MB is typically where that maxes out.
> 
> Any use of RAID5 or RAID6 has two major issues:  1) A serious performance problem that also affects reads if there are ANY writes happening.  2) If a disk fails, performance across the board is terrible.  When the bad disk is replaced, performance is REALLY terrible as long as a rebuild is happening, and I have seen a RAID5/6 rebuild take 24 to 48 hours with 2TB disks on a busy array.  It would take even longer with larger disks.
> 
>> What I am not sure is how MMapDirectory will work with several Solr
>> instances. Will off heap memory correctly managed and shared between
>> several Solr instances ?
> 
> With symlinks or multiple mount points in the solr home, you can have a single instance handle indexes on multiple storage devices.  One instance has less overhead, particularly in memory, than multiple instances. Off heap memory for the disk cache should function as expected with multiple instances or one instances.
> 
> Thanks,
> Shawn
> 

Re: Advice in order to optimise resource usage of a huge server

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/6/22 01:57, Dominique Bejean wrote:
> One of our customer have huge servers
>
>     - Bar-metal
>     - 64 CPU
>     - 512 Gb RAM
>     - 6x2Tb disk in RAID 6 (so 2Tb disk space available)
>
>
> I think the best way to optimize resources usage of these servers is to
> install several Solr instances.

That is not what I would do.

> Do not configure disks in RAID 6 but, leave 6 standard volumes (more space
> disk, more I/O available)
> Install 3 or 6 solr instances each one using 1 ou 2 disk volumes

RAID10 will get you the best performance.  Six 2TB drives in RAID10 have
6TB of total space.  The ONLY disadvantage that RAID10 has is that you
pay for twice the usable storage.  Disks are relatively cheap, though 
hard to get in quantity these days.  I would recommend going with the 
largest stripe size your hardware can support.  1MB is typically where 
that maxes out.

Any use of RAID5 or RAID6 has two major issues:  1) A serious 
performance problem that also affects reads if there are ANY writes 
happening.  2) If a disk fails, performance across the board is 
terrible.  When the bad disk is replaced, performance is REALLY terrible 
as long as a rebuild is happening, and I have seen a RAID5/6 rebuild 
take 24 to 48 hours with 2TB disks on a busy array.  It would take even 
longer with larger disks.

> What I am not sure is how MMapDirectory will work with several Solr
> instances. Will off heap memory correctly managed and shared between
> several Solr instances ?

With symlinks or multiple mount points in the solr home, you can have a 
single instance handle indexes on multiple storage devices.  One 
instance has less overhead, particularly in memory, than multiple 
instances. Off-heap memory for the disk cache should function as expected
with multiple instances or a single instance.
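
As an illustration (hypothetical paths, and assuming the usual core.properties
mechanism), a core's index data can be pointed at a separate mount either with
a symlink or with the dataDir property:

    # Solr home on /var/solr, index data for one core on a second disk
    mkdir -p /mnt/disk2/solr/products_shard1_data
    ln -s /mnt/disk2/solr/products_shard1_data \
          /var/solr/data/products_shard1_replica_n1/data

    # or, equivalently, in that core's core.properties:
    #   dataDir=/mnt/disk2/solr/products_shard1_data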

Thanks,
Shawn


Re: Advice in order to optimise resource usage of a huge server

Posted by Gus Heck <gu...@gmail.com>.
The ideal JVM size will be influenced by the latency sensitivity of the
application. Large JVMs are mostly good if you need to hold large data
objects in memory; otherwise they fill up with large numbers of small
objects, and that leads to long GC pauses (GC time relates to the number,
not the size, of the objects). Testing that properly measures latency,
including some longer runs, together with a clear picture of your latency
requirements, is important for predictable results in production. Without
information on how many machines are available, the size of the corpus, the
nature of the application and the requirements, I think it's hard to make
solid recommendations regarding cluster layout. There are of course other
issues and costs to managing many instances (ZK overhead eventually becomes
a problem if this leads you into thousands of replicas). But the smallest
sufficient JVM, plus some sort of safety margin to defend against
growth/change/attack, is usually what you want, unless that leads to very
large numbers of nodes or difficulties managing complexity. Lots of
trade-offs.

On Fri, Oct 7, 2022 at 9:19 AM Shawn Heisey <ap...@elyograg.org> wrote:

> On 10/6/22 15:54, Dominique Bejean wrote:
> > We are starting to investigate performance issues with a new customer.
> > There are several bad practices (commit, sharding, replicas count and
> > types, heap size, ...), that can explain these issues and we will work on
> > it in the next few days. I agree we need to better understand specific
> > usage and make some tests after fixing the bad practices.
> >
> > Anyway, one of the specific aspects is these huge servers, so I am trying
> > to see what is the best way to use all these ressources.
> >
> >
> > * Why do you want to split it up at all?
> >
> > Because one of the bad practices is a huge heap size (80 Gb). I am pretty
> > sure this heap size is not required and anyway it doesn't respect the
> 31Gb
> > limit. After determining the best heap size, if this size is near 31Gb, I
> > imagine it is better to have several Solr JVMs with less heap size. For
> > instance 2 Solr JVMs with 20 Gb each or 4 Solr JVMs with 10 Gb each.
>
> IMHO, lowering the heap size requirement is the ONLY reason to run more
> than one Solr instance on a server.  But I would go with 2 JVMs at 20GB
> each rather than 4 at 11GB.  Reduce the number of moving parts that you
> must track and manage.  The minimum heap requirement of two JVMs each
> with half the data will be a little bit larger than the minimum
> requirement of one JVM with all the data, due to JVM overhead.  I don't
> have a number for you on how much overhead there is for each JVM.
>
> > * MMapDirectory JVM sharing
> >
> > This point is the main reason for my message. If several Solr JVMs are
> > running on one server, will MMapDirectory work fine or will the JVMs
> fight
> > with each other in order to use off heap memory ?
>
> There would be little or no difference in the competition for disk cache
> memory with one JVM or several.
>
> > Storage configuration is the second point that I would like to
> investigate
> > in order to better share disk resources.
> > Instead have one single RAID 6 volume, isn't it better to have one
> distinct
> > not RAID volume per Solr node (if multiple Solr nodes are running on the
> > server) or multiple not RAID volumes use by a single Solr JVM (if only
> one
> > Solr node is running on the server) ?
>
> It is true that if you have each instance running on its own disk that
> what a single instance does will have zero effect on another instance.
>
> But RAID10 can mean even better performance than one mirror set for each
> instance.  Here's some "back of the envelope" calculations for you.
> Let's assume that each of the drives has a sustained throughput of 125
> megabytes per second.  Most modern SATA disks can exceed that, and
> high-RPM enterprise SAS disks are faster.  SSD beats them all by a wide
> margin.
>
> If you move to an 8-drive RAID10 array with slower disks like I just
> described, then the array has access to a potential data write rate of
> 500 MB/s, as the array consists of four mirror sets with the volume
> striped across them.  A well-designed RAID controller can potentially
> have an even higher read rate than 500 MB/s, by taking advantage of the
> fact that every bit of data actually exists on two drives, not just
> one.  The highest possible data rates will not always happen, but the
> average throughput will very often exceed the single disk rate of 125 MB/s.
>
> There is another important consideration that applies no matter how the
> storage is arranged:  If the machine has sufficient spare memory, most
> of the data that Lucene needs will be sitting in the disk cache at all
> times and transfer at incredible speed, and the amount of data that is
> actually read from disk will be relatively small.  This is the secret to
> stellar Solr performance:  Lots of memory beyond what is needed by
> program heaps, so that disk accesses are not needed very often.
>
> Thanks,
> Shawn
>
>

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)

Re: Advice in order to optimise resource usage of a huge server

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/6/22 15:54, Dominique Bejean wrote:
> We are starting to investigate performance issues with a new customer.
> There are several bad practices (commit, sharding, replicas count and
> types, heap size, ...), that can explain these issues and we will work on
> it in the next few days. I agree we need to better understand specific
> usage and make some tests after fixing the bad practices.
>
> Anyway, one of the specific aspects is these huge servers, so I am trying
> to see what is the best way to use all these ressources.
>
>
> * Why do you want to split it up at all?
>
> Because one of the bad practices is a huge heap size (80 Gb). I am pretty
> sure this heap size is not required and anyway it doesn't respect the 31Gb
> limit. After determining the best heap size, if this size is near 31Gb, I
> imagine it is better to have several Solr JVMs with less heap size. For
> instance 2 Solr JVMs with 20 Gb each or 4 Solr JVMs with 10 Gb each.

IMHO, lowering the heap size requirement is the ONLY reason to run more 
than one Solr instance on a server.  But I would go with 2 JVMs at 20GB 
each rather than 4 at 11GB.  Reduce the number of moving parts that you 
must track and manage.  The minimum heap requirement of two JVMs each 
with half the data will be a little bit larger than the minimum 
requirement of one JVM with all the data, due to JVM overhead.  I don't 
have a number for you on how much overhead there is for each JVM.
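
(For reference, a minimal sketch of how the per-instance heap is usually set,
assuming the stock solr.in.sh from the service installer; the values are only
examples:)

    # /etc/default/solr.in.sh -- one copy per Solr instance
    SOLR_HEAP="20g"                      # sets both -Xms and -Xmx
    # SOLR_JAVA_MEM="-Xms20g -Xmx20g"    # alternative if finer control is needed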

> * MMapDirectory JVM sharing
>
> This point is the main reason for my message. If several Solr JVMs are
> running on one server, will MMapDirectory work fine or will the JVMs fight
> with each other in order to use off heap memory ?

There would be little or no difference in the competition for disk cache 
memory with one JVM or several.

> Storage configuration is the second point that I would like to investigate
> in order to better share disk resources.
> Instead have one single RAID 6 volume, isn't it better to have one distinct
> not RAID volume per Solr node (if multiple Solr nodes are running on the
> server) or multiple not RAID volumes use by a single Solr JVM (if only one
> Solr node is running on the server) ?

It is true that if you have each instance running on its own disk, what a
single instance does will have zero effect on another instance.

But RAID10 can mean even better performance than one mirror set for each 
instance.  Here's some "back of the envelope" calculations for you.  
Let's assume that each of the drives has a sustained throughput of 125 
megabytes per second.  Most modern SATA disks can exceed that, and 
high-RPM enterprise SAS disks are faster.  SSD beats them all by a wide 
margin.

If you move to an 8-drive RAID10 array with slower disks like I just 
described, then the array has access to a potential data write rate of 
500 MB/s, as the array consists of four mirror sets with the volume 
striped across them.  A well-designed RAID controller can potentially 
have an even higher read rate than 500 MB/s, by taking advantage of the 
fact that every bit of data actually exists on two drives, not just 
one.  The highest possible data rates will not always happen, but the 
average throughput will very often exceed the single disk rate of 125 MB/s.

There is another important consideration that applies no matter how the 
storage is arranged:  If the machine has sufficient spare memory, most 
of the data that Lucene needs will be sitting in the disk cache at all 
times and transfer at incredible speed, and the amount of data that is 
actually read from disk will be relatively small.  This is the secret to 
stellar Solr performance:  Lots of memory beyond what is needed by 
program heaps, so that disk accesses are not needed very often.

Thanks,
Shawn


Re: Advice in order to optimise resource usage of a huge server

Posted by Dominique Bejean <do...@eolya.fr>.
Hi Dave,

Are you suggesting using the historical Solr master/slave architecture?

In a SolrCloud / SolrJ architecture, this could be achieved by creating only
TLOG replicas, then using FORCELEADER so that the leader is located on a
specific server (the indexing server), and searching only on TLOG replicas
with the parameter "shards.preference=replica.type:TLOG". Is this what you are
suggesting?

Regards

Dominique


Le ven. 7 oct. 2022 à 00:59, Dave <ha...@gmail.com> a écrit :

> You should never index directly into your query servers by the way. Index
> to the indexing server and replicate out to you query servers and tune each
> as needed
>
> > On Oct 6, 2022, at 6:52 PM, Dominique Bejean <do...@eolya.fr>
> wrote:
> >
> > Thank you Dima,
> >
> > Updates are highly multi-threaded batch processes at any time.
> > We won't have all index in RAM cache
> > Disks are SSD
> >
> > Dominique
> >
> >
> >> Le ven. 7 oct. 2022 à 00:28, dmitri maziuk <dm...@gmail.com> a
> >> écrit :
> >>
> >>> On 2022-10-06 4:54 PM, Dominique Bejean wrote:
> >>>
> >>> Storage configuration is the second point that I would like to
> >> investigate
> >>> in order to better share disk resources.
> >>> Instead have one single RAID 6 volume, isn't it better to have one
> >> distinct
> >>> not RAID volume per Solr node (if multiple Solr nodes are running on
> the
> >>> server) or multiple not RAID volumes use by a single Solr JVM (if only
> >> one
> >>> Solr node is running on the server) ?
> >>
> >> The best option is to have the indexes in RAM cache. The 2nd best option
> >> is the 2-level cache w/ RAM + SSD -- that's what you get with ZFS, and
> >> you can use the cheaper HDDs for primary storage. The next one is all
> >> SSDs -- in that case RAID-1(0) may give you better read performance than
> >> a dedicated drive, but probably not enough to notice. There's very
> >> little point in going RAID-5 or 6 on SSDs.
> >>
> >> In terms of performance RAID5/6 on HDDs is likely the worst option, and
> >> a single RAID6 volume is also the works option in terms of flexibility
> >> and maintenance. If your customer doesn't have money to fill those slots
> >> with SSDs, I'd probably go with one small SSD for system + swap, a
> >> 4-disk RAID-10, and a hot spare for it.
> >>
> >> Dima
> >>
> >>
>

Re: Advice in order to optimise resource usage of a huge server

Posted by Dave <ha...@gmail.com>.
You should never index directly into your query servers, by the way. Index to the indexing server and replicate out to your query servers, and tune each as needed.

> On Oct 6, 2022, at 6:52 PM, Dominique Bejean <do...@eolya.fr> wrote:
> 
> Thank you Dima,
> 
> Updates are highly multi-threaded batch processes at any time.
> We won't have all index in RAM cache
> Disks are SSD
> 
> Dominique
> 
> 
>> Le ven. 7 oct. 2022 à 00:28, dmitri maziuk <dm...@gmail.com> a
>> écrit :
>> 
>>> On 2022-10-06 4:54 PM, Dominique Bejean wrote:
>>> 
>>> Storage configuration is the second point that I would like to
>> investigate
>>> in order to better share disk resources.
>>> Instead have one single RAID 6 volume, isn't it better to have one
>> distinct
>>> not RAID volume per Solr node (if multiple Solr nodes are running on the
>>> server) or multiple not RAID volumes use by a single Solr JVM (if only
>> one
>>> Solr node is running on the server) ?
>> 
>> The best option is to have the indexes in RAM cache. The 2nd best option
>> is the 2-level cache w/ RAM + SSD -- that's what you get with ZFS, and
>> you can use the cheaper HDDs for primary storage. The next one is all
>> SSDs -- in that case RAID-1(0) may give you better read performance than
>> a dedicated drive, but probably not enough to notice. There's very
>> little point in going RAID-5 or 6 on SSDs.
>> 
>> In terms of performance RAID5/6 on HDDs is likely the worst option, and
>> a single RAID6 volume is also the works option in terms of flexibility
>> and maintenance. If your customer doesn't have money to fill those slots
>> with SSDs, I'd probably go with one small SSD for system + swap, a
>> 4-disk RAID-10, and a hot spare for it.
>> 
>> Dima
>> 
>> 

Re: Advice in order to optimise resource usage of a huge server

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-10-06 5:52 PM, Dominique Bejean wrote:
> Thank you Dima,
> 
> Updates are highly multi-threaded batch processes at any time.
> We won't have all index in RAM cache
> Disks are SSD
> 

You'd have to benchmark, preferably with your real jobs, on RAID-10 (as per my
previous e-mail) vs JBOD. I suspect you won't see much practical 
difference, but who knows.

Dima



Re: Advice in order to optimise resource usage of a huge server

Posted by Dominique Bejean <do...@eolya.fr>.
Thank you Dima,

Updates are highly multi-threaded batch processes, running at any time.
We won't have the whole index in the RAM cache.
Disks are SSDs.

Dominique


Le ven. 7 oct. 2022 à 00:28, dmitri maziuk <dm...@gmail.com> a
écrit :

> On 2022-10-06 4:54 PM, Dominique Bejean wrote:
>
> > Storage configuration is the second point that I would like to
> investigate
> > in order to better share disk resources.
> > Instead have one single RAID 6 volume, isn't it better to have one
> distinct
> > not RAID volume per Solr node (if multiple Solr nodes are running on the
> > server) or multiple not RAID volumes use by a single Solr JVM (if only
> one
> > Solr node is running on the server) ?
>
> The best option is to have the indexes in RAM cache. The 2nd best option
> is the 2-level cache w/ RAM + SSD -- that's what you get with ZFS, and
> you can use the cheaper HDDs for primary storage. The next one is all
> SSDs -- in that case RAID-1(0) may give you better read performance than
> a dedicated drive, but probably not enough to notice. There's very
> little point in going RAID-5 or 6 on SSDs.
>
> In terms of performance RAID5/6 on HDDs is likely the worst option, and
> a single RAID6 volume is also the works option in terms of flexibility
> and maintenance. If your customer doesn't have money to fill those slots
> with SSDs, I'd probably go with one small SSD for system + swap, a
> 4-disk RAID-10, and a hot spare for it.
>
> Dima
>
>

Re: Advice in order to optimise resource usage of a huge server

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-10-06 4:54 PM, Dominique Bejean wrote:

> Storage configuration is the second point that I would like to investigate
> in order to better share disk resources.
> Instead have one single RAID 6 volume, isn't it better to have one distinct
> not RAID volume per Solr node (if multiple Solr nodes are running on the
> server) or multiple not RAID volumes use by a single Solr JVM (if only one
> Solr node is running on the server) ?

The best option is to have the indexes in RAM cache. The 2nd best option 
is the 2-level cache w/ RAM + SSD -- that's what you get with ZFS, and 
you can use the cheaper HDDs for primary storage. The next one is all 
SSDs -- in that case RAID-1(0) may give you better read performance than 
a dedicated drive, but probably not enough to notice. There's very 
little point in going RAID-5 or 6 on SSDs.

In terms of performance RAID5/6 on HDDs is likely the worst option, and 
a single RAID6 volume is also the worst option in terms of flexibility
and maintenance. If your customer doesn't have money to fill those slots 
with SSDs, I'd probably go with one small SSD for system + swap, a 
4-disk RAID-10, and a hot spare for it.

Dima


Re: Advice in order to optimise resource usage of a huge server

Posted by Walter Underwood <wu...@wunderwood.org>.
Run a GC analyzer on that JVM. I cannot imagine that they need 80 GB of heap. I’ve never run with more than 16 GB, even for a collection with 70 million documents.

Look at the amount of heap used after full collections. Add a safety factor to that, then use that heap size.
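
(If GC logging isn't already enabled, something along these lines in solr.in.sh
would produce a log that a GC analyzer can read; the exact flags depend on the
JVM version, and the path is only an example:)

    # Java 9+ unified logging
    GC_LOG_OPTS="-Xlog:gc*:file=/var/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M"
    # Java 8 equivalent
    # GC_LOG_OPTS="-verbose:gc -Xloggc:/var/solr/logs/solr_gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps"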

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 6, 2022, at 2:54 PM, Dominique Bejean <do...@eolya.fr> wrote:
> 
> Hi,
> 
> Thank you all for your responses. I will try to answer your questions in
> one single message.
> 
> We are starting to investigate performance issues with a new customer.
> There are several bad practices (commit, sharding, replicas count and
> types, heap size, ...), that can explain these issues and we will work on
> it in the next few days. I agree we need to better understand specific
> usage and make some tests after fixing the bad practices.
> 
> Anyway, one of the specific aspects is these huge servers, so I am trying
> to see what is the best way to use all these ressources.
> 
> 
> * Why do you want to split it up at all?
> 
> Because one of the bad practices is a huge heap size (80 Gb). I am pretty
> sure this heap size is not required and anyway it doesn't respect the 31Gb
> limit. After determining the best heap size, if this size is near 31Gb, I
> imagine it is better to have several Solr JVMs with less heap size. For
> instance 2 Solr JVMs with 20 Gb each or 4 Solr JVMs with 10 Gb each.
> 
> According to Walter's response and Mattew's question, that doesn't seem
> like a good idea.
> 
> 
> * MMapDirectory JVM sharing
> 
> This point is the main reason for my message. If several Solr JVMs are
> running on one server, will MMapDirectory work fine or will the JVMs fight
> with each other in order to use off heap memory ?
> 
> According to Shawn's response it should work fine.
> 
> 
> What would the iops look like?
> 
> Not monitored yet.
> Storage configuration is the second point that I would like to investigate
> in order to better share disk resources.
> Instead have one single RAID 6 volume, isn't it better to have one distinct
> not RAID volume per Solr node (if multiple Solr nodes are running on the
> server) or multiple not RAID volumes use by a single Solr JVM (if only one
> Solr node is running on the server) ?
> 
> I note the various suggestions in your answers (ZFS, RAID 10, ...)
> 
> Thank you Dima and Shawn
> 
> 
> Regards
> 
> Dominique
> 
> Le jeu. 6 oct. 2022 à 09:57, Dominique Bejean <do...@eolya.fr> a
> écrit :
> 
>> Hi,
>> 
>> One of our customer have huge servers
>> 
>>   - Bar-metal
>>   - 64 CPU
>>   - 512 Gb RAM
>>   - 6x2Tb disk in RAID 6 (so 2Tb disk space available)
>> 
>> 
>> I think the best way to optimize resources usage of these servers is to
>> install several Solr instances.
>> 
>> I imagine 2 scenarios to be tested according to data volumes, update rate,
>> request volume, ...
>> 
>> Do not configure disks in RAID 6 but, leave 6 standard volumes (more space
>> disk, more I/O available)
>> Install 3 or 6 solr instances each one using 1 ou 2 disk volumes
>> 
>> Obviously, replicate shards and verify replicates of a shard are not
>> located on the same physical server.
>> 
>> What I am not sure is how MMapDirectory will work with several Solr
>> instances. Will off heap memory correctly managed and shared between
>> several Solr instances ?
>> 
>> Thank you for your advice.
>> 
>> Dominique
>> 
>> 
>> 
>> 
>> 


Re: Advice in order to optimise resource usage of a huge server

Posted by Dominique Bejean <do...@eolya.fr>.
Hi,

Thank you all for your responses. I will try to answer your questions in
one single message.

We are starting to investigate performance issues with a new customer.
There are several bad practices (commit, sharding, replica count and
types, heap size, ...) that can explain these issues, and we will work on
them in the next few days. I agree we need to better understand the
specific usage and make some tests after fixing the bad practices.

Anyway, one of the specific aspects is these huge servers, so I am trying
to see what is the best way to use all these resources.


* Why do you want to split it up at all?

Because one of the bad practices is a huge heap size (80 GB). I am pretty
sure this heap size is not required, and in any case it doesn't respect the
~31 GB limit (above which the JVM loses compressed object pointers). After
determining the best heap size, if this size is near 31 GB, I imagine it is
better to have several Solr JVMs with a smaller heap each. For instance,
2 Solr JVMs with 20 GB each or 4 Solr JVMs with 10 GB each.

According to Walter's response and Matthew's question, that doesn't seem
like a good idea.


* MMapDirectory JVM sharing

This point is the main reason for my message. If several Solr JVMs are
running on one server, will MMapDirectory work fine or will the JVMs fight
with each other in order to use off-heap memory?

According to Shawn's response it should work fine.


* What would the iops look like?

Not monitored yet.
Storage configuration is the second point that I would like to investigate
in order to better share disk resources.
Instead of having one single RAID 6 volume, isn't it better to have one
distinct non-RAID volume per Solr node (if multiple Solr nodes are running on
the server), or multiple non-RAID volumes used by a single Solr JVM (if only
one Solr node is running on the server)?

I note the various suggestions in your answers (ZFS, RAID 10, ...)

Thank you Dima and Shawn


Regards

Dominique

Le jeu. 6 oct. 2022 à 09:57, Dominique Bejean <do...@eolya.fr> a
écrit :

> Hi,
>
> One of our customer have huge servers
>
>    - Bar-metal
>    - 64 CPU
>    - 512 Gb RAM
>    - 6x2Tb disk in RAID 6 (so 2Tb disk space available)
>
>
> I think the best way to optimize resources usage of these servers is to
> install several Solr instances.
>
> I imagine 2 scenarios to be tested according to data volumes, update rate,
> request volume, ...
>
> Do not configure disks in RAID 6 but, leave 6 standard volumes (more space
> disk, more I/O available)
> Install 3 or 6 solr instances each one using 1 ou 2 disk volumes
>
> Obviously, replicate shards and verify replicates of a shard are not
> located on the same physical server.
>
> What I am not sure is how MMapDirectory will work with several Solr
> instances. Will off heap memory correctly managed and shared between
> several Solr instances ?
>
> Thank you for your advice.
>
> Dominique
>
>
>
>
>

Re: Advice in order to optimise resource usage of a huge server

Posted by Deepak Goel <de...@gmail.com>.
What would the iops look like?
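
(On Linux, something like iostat from the sysstat package gives a quick
picture, for example:)

    iostat -dxm 5    # per-device r/s, w/s, throughput and %util every 5 seconds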


Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deicool@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home


On Thu, Oct 6, 2022 at 8:49 PM Gus Heck <gu...@gmail.com> wrote:

> It depends... on your data, on your usage, etc. The best answers are
> obtained by testing various configurations, if possible by replaying
> captured query load from production. There is (for all java programs) an
> advantage to staying under 32 GB RAM, but without an idea of the number of
> machines you describe, the size of the corpus (docs and disk) and what your
> expected usage patterns are (both indexing and query) one can't say if you
> need more heap than that, either in one VM or across several VMs.
>
> To understand how "unallocated" memory not assigned to the java heap (or
> other processes) is utilized to improve search performance, this article is
> helpful:
> https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> -Gus
>
> On Thu, Oct 6, 2022 at 8:31 AM matthew sporleder <ms...@gmail.com>
> wrote:
>
> > Why do you want to split it up at all?
> >
> > On Thu, Oct 6, 2022 at 3:58 AM Dominique Bejean
> > <do...@eolya.fr> wrote:
> > >
> > > Hi,
> > >
> > > One of our customer have huge servers
> > >
> > >    - Bar-metal
> > >    - 64 CPU
> > >    - 512 Gb RAM
> > >    - 6x2Tb disk in RAID 6 (so 2Tb disk space available)
> > >
> > >
> > > I think the best way to optimize resources usage of these servers is to
> > > install several Solr instances.
> > >
> > > I imagine 2 scenarios to be tested according to data volumes, update
> > rate,
> > > request volume, ...
> > >
> > > Do not configure disks in RAID 6 but, leave 6 standard volumes (more
> > space
> > > disk, more I/O available)
> > > Install 3 or 6 solr instances each one using 1 ou 2 disk volumes
> > >
> > > Obviously, replicate shards and verify replicates of a shard are not
> > > located on the same physical server.
> > >
> > > What I am not sure is how MMapDirectory will work with several Solr
> > > instances. Will off heap memory correctly managed and shared between
> > > several Solr instances ?
> > >
> > > Thank you for your advice.
> > >
> > > Dominique
> >
>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>

Re: Advice in order to optimise resource usage of a huge server

Posted by Walter Underwood <wu...@wunderwood.org>.
We have kept a 72 CPU machine busy with a single Solr process, so I doubt that multiple processes are needed.

The big question is the size of the index. If it is too big to fit in RAM (OS file buffers), then the system is IO bound and CPU doesn’t really matter. Everything will depend on the speed and capacity of the disk system.

If the index does fit in RAM, then you should be fine.

You may want to spend some effort on reducing index size if it is near the limit.
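
(A rough way to check, with hypothetical paths: compare the on-disk size of the
index data with the memory left over after the heaps:)

    du -sh /var/solr/data    # total index size on disk
    free -h                  # the "available" column approximates usable page cache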

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Oct 6, 2022, at 8:18 AM, Gus Heck <gu...@gmail.com> wrote:
> 
> It depends... on your data, on your usage, etc. The best answers are
> obtained by testing various configurations, if possible by replaying
> captured query load from production. There is (for all java programs) an
> advantage to staying under 32 GB RAM, but without an idea of the number of
> machines you describe, the size of the corpus (docs and disk) and what your
> expected usage patterns are (both indexing and query) one can't say if you
> need more heap than that, either in one VM or across several VMs.
> 
> To understand how "unallocated" memory not assigned to the java heap (or
> other processes) is utilized to improve search performance, this article is
> helpful:
> https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> 
> -Gus
> 
> On Thu, Oct 6, 2022 at 8:31 AM matthew sporleder <ms...@gmail.com>
> wrote:
> 
>> Why do you want to split it up at all?
>> 
>> On Thu, Oct 6, 2022 at 3:58 AM Dominique Bejean
>> <do...@eolya.fr> wrote:
>>> 
>>> Hi,
>>> 
>>> One of our customer have huge servers
>>> 
>>>   - Bar-metal
>>>   - 64 CPU
>>>   - 512 Gb RAM
>>>   - 6x2Tb disk in RAID 6 (so 2Tb disk space available)
>>> 
>>> 
>>> I think the best way to optimize resources usage of these servers is to
>>> install several Solr instances.
>>> 
>>> I imagine 2 scenarios to be tested according to data volumes, update
>> rate,
>>> request volume, ...
>>> 
>>> Do not configure disks in RAID 6 but, leave 6 standard volumes (more
>> space
>>> disk, more I/O available)
>>> Install 3 or 6 solr instances each one using 1 ou 2 disk volumes
>>> 
>>> Obviously, replicate shards and verify replicates of a shard are not
>>> located on the same physical server.
>>> 
>>> What I am not sure is how MMapDirectory will work with several Solr
>>> instances. Will off heap memory correctly managed and shared between
>>> several Solr instances ?
>>> 
>>> Thank you for your advice.
>>> 
>>> Dominique
>> 
> 
> 
> -- 
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)


Re: Advice in order to optimise resource usage of a huge server

Posted by Gus Heck <gu...@gmail.com>.
It depends... on your data, on your usage, etc. The best answers are
obtained by testing various configurations, if possible by replaying
captured query load from production. There is (for all Java programs) an
advantage to staying under a 32 GB heap, but without an idea of the number
of machines you have, the size of the corpus (docs and disk), and what your
expected usage patterns are (both indexing and query), one can't say whether
you need more heap than that, either in one JVM or across several JVMs.

To understand how "unallocated" memory not assigned to the java heap (or
other processes) is utilized to improve search performance, this article is
helpful:
https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

-Gus

On Thu, Oct 6, 2022 at 8:31 AM matthew sporleder <ms...@gmail.com>
wrote:

> Why do you want to split it up at all?
>
> On Thu, Oct 6, 2022 at 3:58 AM Dominique Bejean
> <do...@eolya.fr> wrote:
> >
> > Hi,
> >
> > One of our customer have huge servers
> >
> >    - Bar-metal
> >    - 64 CPU
> >    - 512 Gb RAM
> >    - 6x2Tb disk in RAID 6 (so 2Tb disk space available)
> >
> >
> > I think the best way to optimize resources usage of these servers is to
> > install several Solr instances.
> >
> > I imagine 2 scenarios to be tested according to data volumes, update
> rate,
> > request volume, ...
> >
> > Do not configure disks in RAID 6 but, leave 6 standard volumes (more
> space
> > disk, more I/O available)
> > Install 3 or 6 solr instances each one using 1 ou 2 disk volumes
> >
> > Obviously, replicate shards and verify replicates of a shard are not
> > located on the same physical server.
> >
> > What I am not sure is how MMapDirectory will work with several Solr
> > instances. Will off heap memory correctly managed and shared between
> > several Solr instances ?
> >
> > Thank you for your advice.
> >
> > Dominique
>


-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)

Re: Advice in order to optimise resource usage of a huge server

Posted by matthew sporleder <ms...@gmail.com>.
Why do you want to split it up at all?

On Thu, Oct 6, 2022 at 3:58 AM Dominique Bejean
<do...@eolya.fr> wrote:
>
> Hi,
>
> One of our customer have huge servers
>
>    - Bar-metal
>    - 64 CPU
>    - 512 Gb RAM
>    - 6x2Tb disk in RAID 6 (so 2Tb disk space available)
>
>
> I think the best way to optimize resources usage of these servers is to
> install several Solr instances.
>
> I imagine 2 scenarios to be tested according to data volumes, update rate,
> request volume, ...
>
> Do not configure disks in RAID 6 but, leave 6 standard volumes (more space
> disk, more I/O available)
> Install 3 or 6 solr instances each one using 1 ou 2 disk volumes
>
> Obviously, replicate shards and verify replicates of a shard are not
> located on the same physical server.
>
> What I am not sure is how MMapDirectory will work with several Solr
> instances. Will off heap memory correctly managed and shared between
> several Solr instances ?
>
> Thank you for your advice.
>
> Dominique

Re: Advice in order to optimise resource usage of a huge server

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-10-07 9:34 AM, Dominique Bejean wrote:
> Hi Dima,
> 
> About ZFS, I read this *"In an effort to maximize read/write performance,
> ZFS uses all available space in RAM to create a huge cache"*. How does it
> work with MMapDirectory ? No conflict ?

ZFS uses half of the available RAM for its L1 cache, the ARC (configurable
via a kernel module parameter). I don't know how it works with mmap(),
because its caching module was originally ported from Solaris and was
separate from Linux's normal filesystem caching code. I haven't kept up with
it for a couple of years now and don't know what state it's in.
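
(If ZFS were used here, the ARC ceiling can be lowered so it doesn't compete
with the JVM heaps and mmap'd index data; a sketch with a made-up 64 GiB cap:)

    # /etc/modprobe.d/zfs.conf
    options zfs zfs_arc_max=68719476736    # cap the ARC at 64 GiB (bytes)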

Normally the Linux kernel is supposed to be smart enough to recognize when
the data is in RAM already, e.g. on RAM disk, and then mmap() just sets 
a pointer, but I don't know if that works with ZFS cache. (And that's 
assuming the data is in the cache already.)

I think it's a moot point with 2.5" disk slots -- I assume that's what 
they have? If you need more than 2TB/slot, you have to use SSDs anyway, 
and on SSDs the clever multi-level caching schemes aren't that
beneficial, as SSDs are fast enough already.

Dima


Re: Advice in order to optimise resource usage of a huge server

Posted by Dominique Bejean <do...@eolya.fr>.
Hi Dima,

About ZFS, I read this *"In an effort to maximize read/write performance,
ZFS uses all available space in RAM to create a huge cache"*. How does it
work with MMapDirectory? No conflict?

Regards

Dominique



Le jeu. 6 oct. 2022 à 17:44, dmitri maziuk <dm...@gmail.com> a
écrit :

> On 2022-10-06 2:57 AM, Dominique Bejean wrote:
>
> > Do not configure disks in RAID 6 but, leave 6 standard volumes (more
> space
> > disk, more I/O available)
>
> If they're running linux: throw out the raid controller, replace with
> ZFS on 2 SSDs and 4 spinning rust drives. You're not going to have more
> i/o than your drives and bus can support, but with ZFS's 2-level read
> cache (RAM and SSD) you could get close to saturating the bus. In theory.
>
> You get hot-resizeable storage pool as a bonus.
>
> Dima
>
>
>

Re: Advice in order to optimise resource usage of a huge server

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-10-06 2:57 AM, Dominique Bejean wrote:

> Do not configure disks in RAID 6 but, leave 6 standard volumes (more space
> disk, more I/O available)

If they're running Linux: throw out the RAID controller and replace it with
ZFS on 2 SSDs and 4 spinning rust drives. You're not going to have more
I/O than your drives and bus can support, but with ZFS's 2-level read
cache (RAM and SSD) you could get close to saturating the bus. In theory.

You get a hot-resizeable storage pool as a bonus.

Dima