Posted to users@solr.apache.org by "Saur, Alexandre (ELS-AMS)" <a....@elsevier.com> on 2023/06/21 07:37:30 UTC

Solr 8 replica configuration/optimization

Hi guys,

I have a general question about Solr 8 optimization.

Our current setup for a collection with 5 shards (roughly 30 GB of data in each) is:
5 identical nodes hosting NRT replicas, with a replication factor of 3 and a maximum of 3 shards per node.
Each instance has 64 GB of memory in total, 14 GB of which is allocated to Solr's JVM heap, and 16 vCPUs @ 3.1 GHz with 2 threads per core (an AWS m5.4xlarge instance type).
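For reference, the current layout roughly corresponds to a Collections API create call along these lines (a SolrJ sketch only; the collection, configset and ZooKeeper names are placeholders, not our real ones):

    import java.util.Collections;
    import java.util.Optional;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class CurrentLayout {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
                // 5 shards, 3 NRT replicas each, at most 3 cores per node
                // (15 cores spread over 5 nodes).
                CollectionAdminRequest
                    .createCollection("my_collection", "my_config", 5, 3)
                    .setMaxShardsPerNode(3)
                    .process(client);
            }
        }
    }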

This cluster can survive the loss of two nodes, which matters to us because reliability is a major concern. However, we are starting to see more usage from our users and the indexes are growing, so this configuration will not hold up in the near future and we will need to make some adjustments.

Since near-real-time reads are no longer a priority, we are thinking about using a mix of TLOG and PULL replicas to split the read and write responsibilities. Could someone suggest how we might start experimenting? The main goals are improving both read and write performance while keeping the same reliability.

For example, the initial idea is to have at least 2 TLOG replicas per shard for writing and 3 PULL replicas per shard for reading, along the lines of the sketch below. Would that make sense? And what about memory and CPU configuration - should both replica types use the same instance configuration we currently have?
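To make that concrete, the create call I have in mind would look roughly like this (again a SolrJ sketch with placeholder names and ZooKeeper address):

    import java.util.Collections;
    import java.util.Optional;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;
    import org.apache.solr.common.cloud.Replica;

    public class TlogPullLayout {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
                // 5 shards, 0 NRT replicas, 2 TLOG replicas per shard (one acts as
                // leader and indexes, the other is a warm standby) and 3 PULL
                // replicas per shard that only copy the index and serve reads.
                CollectionAdminRequest
                    .createCollection("my_collection_v2", "my_config", 5, 0, 2, 3)
                    .process(client);

                // PULL replicas could also be added to an existing shard later:
                CollectionAdminRequest
                    .addReplicaToShard("my_collection_v2", "shard1", Replica.Type.PULL)
                    .process(client);
            }
        }
    }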

Any suggestions are welcome!

Cheers,
Alexandre


