Posted to users@solr.apache.org by Nick Vladiceanu <vl...@gmail.com> on 2022/12/05 10:07:45 UTC

Core reload timeout on Solr 9

Hello folks,

We’re running our SolrCloud cluster in Kubernetes. Recently we’ve upgraded from 8.11 to 9.0 (and eventually to 9.1). 

We fully reindexed the collections after the upgrade; everything looks good, no errors, and we've noticed response-time improvements.

We have the following specs:
collection size: 22M docs, ~1.3KB per doc; ~28GB total collection size at this point;
shards: 6 shards, each ~4.7GB; 1 core per node;
nodes: 96 nodes, each with 30Gi of RAM and 16 CPU cores;
heap: 23GB;
JavaOpts: -Dsolr.modules=scripting,analysis-extras,ltr
gcTune: -XX:+UseG1GC -XX:G1HeapRegionSize=16m -XX:MaxGCPauseMillis=300 -XX:InitiatingHeapOccupancyPercent=75 -XX:+UseLargePages -XX:+ParallelRefProcEnabled -XX:ParallelGCThreads=10 -XX:ConcGCThreads=2 -XX:MinHeapFreeRatio=2 -XX:MaxHeapFreeRatio=10


Problem

The problem appears when we try to reload the collection: in sync mode the request times out, and in async mode the task runs forever:

curl “reload” output: https://justpaste.it/ap4d2
ErrorReportingConcurrentUpdateSolrClient stacktrace (appears in the logs of some nodes): https://justpaste.it/aq3dw

There are no issues on a newly created cluster as long as there is no incoming traffic to it. Once we start sending requests to the cluster, collection reload becomes impossible. Other (smaller) collections within the same cluster reload just fine.

In some cases the Old-generation GC kicks in on some node and makes the entire cluster unstable; however, that doesn't happen every time a collection reload times out.

We tried rolling back to 8.11 and everything works normally as it used to: no errors with reload, no other errors in the logs during reload, etc.

We tried the following:
run 9.0, 9.1 on Java 11 and Java 17: same result;
lower cache warming, disable firstSearcher queries: same result;
increase heap size, tune gc: same result;
use apiv1 and apiv2 to issue reload commands (example calls below): no difference;
sync vs async reload: either a forever-running task or a timeout after 180 seconds;
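
For reference, the reload calls look roughly like this (a sketch with placeholder host/collection names; the async request id is arbitrary):

    # v1 API, synchronous (this is the call that times out after 180 seconds):
    curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=my_collection"

    # v1 API, asynchronous; then poll the task status:
    curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=my_collection&async=reload-1"
    curl "http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=reload-1"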

Did anyone face similar issues after upgrading to Solr 9? Could you please advise where we should focus our attention while debugging this behavior? Any other advice/suggestions?

Thank you


Best regards,
Nick Vladiceanu

Re: Core reload timeout on Solr 9

Posted by Nick Vladiceanu <vl...@gmail.com>.
Unfortunately we couldn't find the root cause of this behaviour in Solr 9 and were thus forced to roll back to 8.11.

Has anyone else faced issues similar to the ones mentioned in this thread? Any ideas on how we should proceed?

Thanks


---
Nick Vladiceanu
vladiceanu.n@gmail.com 





Re: Core reload timeout on Solr 9

Posted by Nick Vladiceanu <vl...@gmail.com>.
Tried enabling -Dsolr.http1=true but it didn't help. Still seeing the timeout after 180s (even without sending any traffic to the cluster), and also noticed

	Caused by: java.util.concurrent.TimeoutException: Total timeout 600000 ms elapsed (stack trace here: https://justpaste.it/29bpv)

on some of the nodes. 


Also spotting errors related to:
o.a.s.c.SolrCore java.lang.IllegalArgumentException: Unknown directory: MMapDirectory@/var/solr/data/my_collection_shard3_replica_t1643/data/snapshot_metadata (we do not use snapshots at all) (stack trace: https://justpaste.it/88en6)
CoreIsClosedException o.a.s.u.CommitTracker auto commit error...: https://justpaste.it/bbbms
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error from server at null: https://justpaste.it/5nq7b (this node is a leader)

From time to time we're also observing the following in the logs (TLOG replicas across the board) on multiple nodes:
WARN  (indexFetcher-120-thread-1) [] o.a.s.h.IndexFetcher File _8ux.cfe did not match. expected checksum is 3843994300 and actual is checksum 2148229542. expected length is 542 and actual length is 542
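
One way to verify a suspect replica's segments offline is Lucene's CheckIndex tool (a sketch, assuming a stock install layout; run with Solr stopped on that node):

    java -cp "server/solr-webapp/webapp/WEB-INF/lib/*" org.apache.lucene.index.CheckIndex /var/solr/data/my_collection_shard3_replica_t1643/data/index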





Re: Core reload timeout on Solr 9

Posted by Houston Putman <ho...@apache.org>.
I'm not sure this is the issue, but maybe it's http2 vs http1.

Could you retry with the following set on the cluster?

-Dsolr.http1=true
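
One way to set that (a sketch, assuming the stock solr.in.sh / SOLR_OPTS mechanism):

    SOLR_OPTS="$SOLR_OPTS -Dsolr.http1=true"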




Re: Core reload timeout on Solr 9

Posted by Nick Vladiceanu <vl...@gmail.com>.
It's the last: 96 machines in total. The collection has 6 shards, and each shard has 16 replicas. There is no more than one replica on the same machine.

I do not observe any issues with ZooKeeper; as per the logs and metrics, everything looks good. Should I look for something more specific?

I have also tried different ZooKeeper versions (3.6, 3.7, 3.8); no difference was observed.
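
For example, a quick way to sample ZooKeeper health (a sketch; assumes the mntr four-letter word is whitelisted via 4lw.commands.whitelist):

    echo mntr | nc zookeeper-host 2181 | grep -E "zk_avg_latency|zk_outstanding_requests|zk_num_alive_connections"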

On Thu 19. Jan 2023 at 6:54 PM, Gus Heck <gu...@gmail.com> wrote:

> Just read through this, and don't yet have any concrete ideas better than
> what's been given, but I'm interested to clarify one thing you said:
>
> We are having 6 shards spread across 96 replicas. Each replica is hosted on
> > a dedicated EC2 instance, no more than one replica present on the same
> > machine
> >
>
> Is that implying 6x96 physical machines  ( = 576 pieces of hardware? ) or
> are you overlapping replicas for different shards on the same machine (=
> 576 processes on 96 bits of hardware) ? or overlapping on the same node (96
> processes on 96 bits of hardware)?
>
> The last one is much more common. If you've really got 576 java processes
> running solr, that's a fair bit of communication that needs to happen as
> replicas go up and down.
>
> Have you observed any slowness on zookeeper during these episodes?
>
>
> On Thu, Jan 19, 2023 at 9:58 AM Houston Putman <ho...@apache.org> wrote:
>
> > >
> > > I was wondering, could it be something wrong with the solrconfig.xml
> > > parameters? Perhaps, a combination of parameters does not behave
> stable?
> > Do
> > > you think it makes sense to go with a vanilla solrconfig.xml and
> > introduce
> > > all the custom options one-by-one (i.e. ShardHandlerFactory, etc.)?
> >
> >
> > That is a great idea. (Obviously with the operator you need to keep some
> of
> > the values there that it relies on, but I think everything it uses is
> > vanilla starting with Solr 9)
> >
> > - Houston
> >
> > On Thu, Jan 19, 2023 at 9:43 AM Nick Vladiceanu <vl...@gmail.com>
> > wrote:
> >
> > > Thanks Kevin for looking into it.
> > >
> > > I’ll answer the questions in the original order:
> > > * Pod volume has the correct permissions. Basically, we use emptyDir
> > > provisioned by the solr-operator. All the nodes are having exactly the
> > same
> > > setup. No pods are co-located on the same worker node. No more than one
> > > Solr core is located on the same node.
> > > * We are actively indexing and querying. We do also use partial
> updates.
> > > Since we use TLOG replica types, we have a hard commit of 180s that
> > opens a
> > > new searcher.
> > > * JDK 11 and JDK 17 behaves the same way. We were able to reproduce on
> > > both builds.
> > >
> > > As per directory exceptions, I also cannot understand why it is
> throwing
> > > that Unknown Directory exception. I have logged in into a Solr pod that
> > was
> > > throwing this error and was able to find the exact location on the disk
> > > existing.
> > >
> > > When reload fails, sometimes it might fail on one node, other times
> fail
> > > on multiple nodes at the same time. I was checking all the logs on the
> > k8s
> > > node and on the pod but couldn’t find anything related to the disk,
> > > network, or other errors.
> > >
> > > I was wondering, could it be something wrong with the solrconfig.xml
> > > parameters? Perhaps, a combination of parameters does not behave
> stable?
> > Do
> > > you think it makes sense to go with a vanilla solrconfig.xml and
> > introduce
> > > all the custom options one-by-one (i.e. ShardHandlerFactory, etc.)?
> > >
> > > ---
> > > Nick Vladiceanu
> > > vladiceanu.n@gmail.com
> > >
> > >
> > >
> > >
> > > > On 18. Jan 2023, at 18:41, Kevin Risden <kr...@apache.org> wrote:
> > > >
> > > > So I am going to share some ideas just in case it triggers something
> -
> > I
> > > > have this gut feel that the cores are closing due to an exception of
> > some
> > > > kind. It seems like a lot of the issue is either index corruption or
> > > > "SolrCoreState already closed."
> > > >
> > > > * Does the pod volume have the correct permissions for Solr to
> > > read/write?
> > > > * Are you indexing these nodes or just querying? (asking this if
> these
> > > are
> > > > meant to be read only that would be different than a changing index)
> > > > * Have you taken into account
> > > > https://issues.apache.org/jira/browse/SOLR-16463 by chance if you
> > have a
> > > > custom Docker image? (this might not be necessary since you say it
> > > > reproduces on JDK 11)
> > > >
> > > > I found this part of your update the most intriguing. Why would
> > changing
> > > > the directory factory change this? My understanding is everything
> under
> > > > "/var/solr/data/my_collection_shard3_replica_t1643" should be
> > controlled
> > > by
> > > > Solr both read/write so any directories underneath would be created
> > > > automatically.
> > > >
> > > > directoryFactory:
> > > >> https://solr.apache.org/docs/9_1_0/core/org/apache/solr/core
> > > >> /MMapDirectoryFactory.html throwing the following exception:
> > > >> o.a.s.c.SolrCore java.lang.IllegalArgumentException: Unknown
> > directory:
> > > >> MMapDirectory@
> > >
> /var/solr/data/my_collection_shard3_replica_t1643/data/snapshot_metadata
> > > >> (we do not use snapshots at all) (stack trace
> > > https://justpaste.it/88en6)
> > > >> Switched to
> > > https://solr.apache.org/docs/9_1_0/core/org/apache/solr/core
> > > >> /StandardDirectoryFactory.html; problem solved, no more Unknown
> > > directory
> > > >> exceptions
> > > >> Reload won’t fail on some nodes with Unknown directory exception;
> > > >> Result: reload still timing out, fewer exceptions;
> > > >
> > > >
> > > > My guess is that the reload is going to some node and that one node is
> > > > causing the whole process to timeout. If you find that node then you should
> > > > be able to collect the logs and see. Basically there is some reason the
> > > > cores are closing and it's not good. I would guess the collection reload
> > > > timing out is just a symptom of whatever the bigger underlying cause is.
> > > >
> > > > Kevin Risden
> > > >
> > > >
> > > > On Wed, Dec 21, 2022 at 5:57 AM Nick Vladiceanu <
> > vladiceanu.n@gmail.com>
> > > > wrote:
> > > >
> > > >> yes, it's very unusual. On Solr 8.11 (and previous versions) with the same
> > > >> setup and size of data, reload takes just a few seconds.
> > > >>
> > > >> We are having 6 shards spread across 96 replicas. Each replica is
> > hosted
> > > >> on a dedicated EC2 instance, no more than one replica present on the
> > > same
> > > >> machine (in k8s words, it's one pod per node, one solr replica per
> > pod).
> > > >>
> > > >> I am able to reproduce the reload issue on Solr 9.0 and 9.1. Tried to
> > > >> isolate the underlying node along with the Solr Pod; couldn't identify any
> > > >> issues like high load, iowait, whatsoever. Only issues I see are exceptions
> > > >> in the Solr logs that never recover unless Pods are restarted (we
> use
> > > empty
> > > >> dir and not persistent value, every time a pod is restarted, the
> cores
> > > that
> > > >> were hosted on it are removed and when the pod comes back it gets
> > > allocated
> > > >> to the same shard or another shard, depending on how many replicas
> > other
> > > >> shards have, we try to keep balance in # of replicas)
> > > >>
> > > >> I agree that 23GB of heap is a bit too much and are doing some work
> to
> > > >> optimize it (resizing caches, etc.). We tried to lower the heap to
> > 20GB
> > > >> already and GC performance is better, and in general Solr performs
> > > better.
> > > >> I must mention that we have the same heap size in Solr 8.11 and it
> > > doesn’t
> > > >> cause any issues with the reload. Could it have an impact on Solr 9,
> > > >> somehow?
> > > >>
> > > >> Thank you a lot for sharing your thoughts, especially for explaining
> > GC
> > > >> params and sharing yours, very much appreciated.
> > > >>
> > > >> Do you have any ideas on what we should try more? Here is the digest
> > of
> > > >> what we have tried, but without any success:
> > > >>
> > > >> Zookeeper: Upgrade from 3.6 to 3.7 and 3.8: no impact;
> > > >>
> > > >> DNS: Solr pods joining and communicating over Pod IP instead of Pod
> > Svc
> > > >> DNS name (headless). This was done in order to avoid any potential
> > > issues
> > > >> (even though CoreDNS/nodelocaldns metrics looked Ok) with DNS
> > > resolvers; no
> > > >> impact;
> > > >>
> > > >> Lucene: upgrade to Lucene 9.1.0, 9.2.0, 9.3.0; no impact;
> > > >>
> > > >> Solr Nodes:
> > > >> version:
> > > >> tried Solr 9.0.0 and Solr 9.1.0
> > > >> Result: no difference;
> > > >> Heap:
> > > >> recalculate the Heap size;
> > > >> reduce the size by 3GB (15%) in combination with caches resize (see
> > > below);
> > > >> Result: better performance, no old GC is triggered; cluster more
> > stable;
> > > >> reload still timing out;
> > > >> TLOG and PULL:
> > > >> tested with 3TLOG replicas per shard, the rest 12 replicas PULL;
> > > >> tested all 15 replicas per shard of type TLOG;
> > > >> NRT is not an option at all, didn’t even try to test;
> > > >> Result: better response time with PULL, no impact on reload;
> > > >> Other tunings, including gc; no impact;
> > > >>
> > > >> solrconfig.xml:
> > > >> directoryFactory:
> > > >>
> > > >>
> > >
> >
> https://solr.apache.org/docs/9_1_0/core/org/apache/solr/core/MMapDirectoryFactory.html
> > > >> throwing the following exception:  o.a.s.c.SolrCore
> > > >> java.lang.IllegalArgumentException: Unknown directory:
> MMapDirectory@
> > >
> /var/solr/data/my_collection_shard3_replica_t1643/data/snapshot_metadata
> > > >> (we do not use snapshots at all) (stack trace
> > > https://justpaste.it/88en6)
> > > >> Switched to
> > > >>
> > >
> >
> https://solr.apache.org/docs/9_1_0/core/org/apache/solr/core/StandardDirectoryFactory.html
> > > ;
> > > >> problem solved, no more Unknown directory exceptions
> > > >> Reload won’t fail on some nodes with Unknown directory exception;
> > > >> Result: reload still timing out, fewer exceptions;
> > > >> lockType:
> > > >> Switched between “native” and “simple” lock type;
> > > >> Result: no impact;
> > > >> HttpShardHandlerFactory
> > > >> Increased timeout by 40% for cross-shards communication while doing
> > > >> queries;
> > > >> Result: no impact;
> > > >> filterCache, queryResultCache and documentCache:
> > > >> Limit the size of the caches to megabytes instead of entries (rough
> > > >> solrconfig.xml sketch below the list):
> > > >> filterCache - 1024MB
> > > >> queryResultCache - 1024MB
> > > >> documentCache - 2048MB
> > > >> Result: nodes are more stable during reload, cluster is not
> > > destabilizing,
> > > >> no old GC activity; better response time, less pressure on GC;
> > > >> circuitBreaker:
> > > >> disabling circuitBreaker;
> > > >> Result: no impact;
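> > > >>
> > > >> (rough solrconfig.xml sketch of that RAM-based sizing; the values are ours,
> > > >> and CaffeineCache is the default cache implementation in Solr 9.x:
> > > >>   <filterCache class="solr.CaffeineCache" maxRamMB="1024"/>
> > > >>   <queryResultCache class="solr.CaffeineCache" maxRamMB="1024"/>
> > > >>   <documentCache class="solr.CaffeineCache" maxRamMB="2048"/>
> > > >> )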
> > > >>
> > > >> ---
> > > >> Nick Vladiceanu
> > > >> vladiceanu.n@gmail.com
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>> On 20. Dec 2022, at 15:58, Shawn Heisey <ap...@elyograg.org>
> wrote:
> > > >>>
> > > >>> On 12/20/22 06:34, Nick Vladiceanu wrote:
> > > >>>> Thank you Shawn for sharing, indeed useful information.
> > > >>>> However, I must say that we only used deleteById and never
> > > >> deleteByQuery. We also only rely on the auto segment merging and not
> > > >> issuing optimize command.
> > > >>>
> > > >>> That is very unusual.  I've never seen a core reload take more
> than a
> > > >> few seconds, even when I was dealing with core sizes of double-digit
> > GB.
> > > >> Unless you have hundreds or thousands of replicas for each of your 6
> > > >> shards, it really should complete very quickly.
> > > >>>
> > > >>> Have you been able to determine which Solr cores in the collection
> > are
> > > >> causing the delay, and take a look at those machines?
> > > >>>
> > > >>> Some thoughts:
> > > >>>
> > > >>> When you said 96 nodes, were you talking about Solr instances or
> > > >> servers?  You really should only run one Solr instance per server,
> > > >> especially for a small index like this.
> > > >>>
> > > >>> A 23GB heap seems very excessive for a 4.7GB index that has less
> > than 4
> > > >> million documents.  I'm sure you can reduce that by a lot and
> > encounter
> > > >> smaller GC pauses as a result.  If you can share your GC logs, I
> > should
> > > be
> > > >> able to provide a recommendation.
> > > >>>
> > > >>> I've been looking at what MinHeapFreeRatio and MaxHeapFreeRatio do.
> > > >> Those settings are probably unnecessary.  This is what I currently
> use
> > > for
> > > >> GC tuning on JDK 11 or JDK 17.  This produces EXTREMELY short
> > collection
> > > >> pauses, but I have noticed that throughput-heavy things like
> indexing
> > > run a
> > > >> bit slower, but if the indexing is multi-threaded, I think that it
> > would
> > > >> not be affected a lot.
> > > >>>
> > > >>> GC_TUNE=" \
> > > >>> -XX:+UnlockExperimentalVMOptions \
> > > >>> -XX:+UseZGC \
> > > >>> -XX:+ParallelRefProcEnabled \
> > > >>> -XX:+ExplicitGCInvokesConcurrent \
> > > >>> -XX:+UseStringDeduplication \
> > > >>> -XX:+AlwaysPreTouch \
> > > >>> -XX:+UseNUMA \
> > > >>> "
> > > >>>
> > > >>> ZGC has one unexpected disadvantage.  Using it will disable
> > Compressed
> > > >> OOPs -- meaning that even with a heap smaller than 32GB, it uses 64
> > bit
> > > >> pointers.  This hasn't really impacted me ... the index is so small
> > that
> > > >> with a 1GB heap I have more than enough.  If low pauses are the most
> > > >> important thing you need from GC and you're running at least JDK11,
> I
> > > would
> > > >> strongly recommend ZGC.  It does make indexing slower for me -- a
> full
> > > >> rebuild that takes 10 minutes with G1 takes 11 minutes with ZGC. But
> > > even
> > > >> the worst-case GC pauses are single-digit milliseconds.
> > > >>>
> > > >>> For G1GC, which is still the best option for JDK8, this is what I
> > used
> > > >> to have:
> > > >>>
> > > >>> #GC_TUNE=" \
> > > >>> #  -XX:+UseG1GC \
> > > >>> #  -XX:+ParallelRefProcEnabled \
> > > >>> #  -XX:MaxGCPauseMillis=100 \
> > > >>> #  -XX:+ExplicitGCInvokesConcurrent \
> > > >>> #  -XX:+UseStringDeduplication \
> > > >>> #  -XX:+AlwaysPreTouch \
> > > >>> #  -XX:+UseNUMA \
> > > >>> #"
> > > >>>
> > > >>> Thanks,
> > > >>> Shawn
> > > >>
> > > >>
> > >
> > >
> >
>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>
-- 


*Nick Vladiceanu*
*Mobile:* +(373)-68-388-418
*Email:* vladiceanu.n@gmail.com
*Email*: vladiceanu.n@ase.md
*Facebook:* https://fb.com/nick.vladiceanu

Re: Core reload timeout on Solr 9

Posted by Gus Heck <gu...@gmail.com>.
Just read through this, and don't yet have any concrete ideas better than
what's been given, but I'm interested to clarify one thing you said:

We are having 6 shards spread across 96 replicas. Each replica is hosted on
> a dedicated EC2 instance, no more than one replica present on the same
> machine
>

Is that implying 6x96 physical machines (= 576 pieces of hardware?), or are
you overlapping replicas for different shards on the same machine (= 576
processes on 96 bits of hardware), or overlapping on the same node (96
processes on 96 bits of hardware)?

The last one is much more common. If you've really got 576 java processes
running solr, that's a fair bit of communication that needs to happen as
replicas go up and down.

Have you observed any slowness on zookeeper during these episodes?




-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)

Re: Core reload timeout on Solr 9

Posted by Houston Putman <ho...@apache.org>.
>
> I was wondering, could it be something wrong with the solrconfig.xml
> parameters? Perhaps, a combination of parameters does not behave stable? Do
> you think it makes sense to go with a vanilla solrconfig.xml and introduce
> all the custom options one-by-one (i.e. ShardHandlerFactory, etc.)?


That is a great idea. (Obviously with the operator you need to keep some of
the values there that it relies on, but I think everything it uses is
vanilla starting with Solr 9)

- Houston


Re: Core reload timeout on Solr 9

Posted by Nick Vladiceanu <vl...@gmail.com>.
Thanks Kevin for looking into it.

I’ll answer the questions in the original order:
* Pod volume has the correct permissions. Basically, we use emptyDir provisioned by the solr-operator. All the nodes have exactly the same setup. No pods are co-located on the same worker node. No more than one Solr core is located on the same node.
* We are actively indexing and querying. We also use partial updates. Since we use TLOG replica types, we have a hard commit every 180s that opens a new searcher (roughly the config sketched below).
* JDK 11 and JDK 17 behave the same way. We were able to reproduce on both builds.
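
In solrconfig.xml that commit policy looks roughly like this (a sketch of the standard autoCommit block; maxTime is in milliseconds):

    <autoCommit>
      <maxTime>180000</maxTime>
      <openSearcher>true</openSearcher>
    </autoCommit>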

As for the directory exceptions, I also cannot understand why it is throwing that Unknown Directory exception. I logged in to a Solr pod that was throwing this error and confirmed that the exact location does exist on disk.

When reload fails, sometimes it fails on one node, other times on multiple nodes at the same time. I checked all the logs on the k8s node and on the pod, but couldn't find anything related to disk, network, or other errors.

I was wondering, could it be something wrong with the solrconfig.xml parameters? Perhaps a combination of parameters does not behave stably? Do you think it makes sense to go with a vanilla solrconfig.xml and introduce all the custom options one-by-one (i.e. ShardHandlerFactory, etc.)?

---
Nick Vladiceanu
vladiceanu.n@gmail.com 






Re: Core reload timeout on Solr 9

Posted by Kevin Risden <kr...@apache.org>.
So I am going to share some ideas just in case it triggers something: I
have a gut feeling that the cores are closing due to an exception of some
kind. It seems like a lot of the issue is either index corruption or
"SolrCoreState already closed."

* Does the pod volume have the correct permissions for Solr to read/write?
(A quick check is sketched after this list.)
* Are you indexing these nodes or just querying? (asking because if these
are meant to be read-only, that would be different from a changing index)
* Have you taken into account
https://issues.apache.org/jira/browse/SOLR-16463 by chance if you have a
custom Docker image? (this might not be necessary since you say it
reproduces on JDK 11)
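
A quick first check from inside the cluster (the pod name here is a
placeholder, and /var/solr/data is assumed to be the Solr home):

# <solr-pod> is a placeholder for one of your Solr pods
kubectl exec <solr-pod> -- id                     # uid/gid Solr runs as
kubectl exec <solr-pod> -- ls -ld /var/solr/data  # ownership and permissions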

I found this part of your update the most intriguing. Why would changing
the directory factory change this? My understanding is that everything under
"/var/solr/data/my_collection_shard3_replica_t1643" should be controlled by
Solr for both read and write, so any directories underneath would be created
automatically.

directoryFactory:
> https://solr.apache.org/docs/9_1_0/core/org/apache/solr/core
> /MMapDirectoryFactory.html throwing the following exception:
> o.a.s.c.SolrCore java.lang.IllegalArgumentException: Unknown directory:
> MMapDirectory@/var/solr/data/my_collection_shard3_replica_t1643/data/snapshot_metadata
> (we do not use snapshots at all) (stack trace https://justpaste.it/88en6)
> Switched to https://solr.apache.org/docs/9_1_0/core/org/apache/solr/core
> /StandardDirectoryFactory.html; problem solved, no more Unknown directory
> exceptions
> Reload won’t fail on some nodes with Unknown directory exception;
> Result: reload still timing out, fewer exceptions;


My guess is that the reload is going to some node and that one node is
causing the whole process to time out. If you find that node, then you should
be able to collect the logs and see. Basically there is some reason the
cores are closing, and it's not good. I would guess the collection reload
timing out is just a symptom of whatever the bigger underlying cause is. One
way to locate that node is sketched below.
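
A rough way to find it (the pod label, cluster name, and port below are
placeholders for this setup):

# time a cheap CoreAdmin call on every pod; an outlier, or a pod whose logs
# show "SolrCoreState already closed", is the likely culprit
for pod in $(kubectl get pods -l solr-cloud=my-cluster -o jsonpath='{.items[*].metadata.name}'); do
  t=$(kubectl exec "$pod" -- curl -s -o /dev/null -w '%{time_total}' \
      "http://localhost:8983/solr/admin/cores?action=STATUS")
  echo "$pod: ${t}s"
done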

Kevin Risden



Re: Core reload timeout on Solr 9

Posted by Nick Vladiceanu <vl...@gmail.com>.
yes, it’s very unusual. On Solr 8.11 (and previous versions) with the same setup and size of data, reload takes just a few seconds.

We have 6 shards spread across 96 replicas. Each replica is hosted on a dedicated EC2 instance, and no more than one replica is present on the same machine (in k8s terms, it's one pod per node, one Solr replica per pod).

I am able to reproduce the reload issue on Solr 9.0 and 9.1. I tried to isolate the underlying node along with the Solr pod, but couldn’t identify any issues like high load, iowait, or anything similar. The only issues I see are exceptions in the Solr logs that never recover unless pods are restarted. (We use emptyDir rather than a persistent volume, so every time a pod is restarted the cores that were hosted on it are removed; when the pod comes back it gets allocated to the same shard or another shard, depending on how many replicas the other shards have, as we try to keep the number of replicas balanced.)

I agree that 23GB of heap is a bit too much, and we are doing some work to optimize it (resizing caches, etc.). We already tried lowering the heap to 20GB; GC performance is better, and in general Solr performs better. I must mention that we have the same heap size on Solr 8.11 and it doesn’t cause any issues with the reload. Could it have an impact on Solr 9 somehow?

Thank you a lot for sharing your thoughts, especially for explaining GC params and sharing yours, very much appreciated.

Do you have any ideas on what else we should try? Here is the digest of what we have tried, all without success:

* Zookeeper: upgraded from 3.6 to 3.7 and 3.8; no impact.

* DNS: Solr pods joining and communicating over Pod IP instead of Pod Svc DNS name (headless). This was done to avoid any potential issues with DNS resolvers (even though CoreDNS/nodelocaldns metrics looked OK); no impact.

* Lucene: upgraded to Lucene 9.1.0, 9.2.0, 9.3.0; no impact.

* Solr nodes:
  - version: tried Solr 9.0.0 and Solr 9.1.0. Result: no difference.
  - Heap: recalculated the heap size; reduced it by 3GB (15%) in combination with the cache resize (see below). Result: better performance, no old-generation GC triggered, cluster more stable; reload still timing out.
  - TLOG and PULL: tested with 3 TLOG replicas per shard and the remaining 12 as PULL; also tested all 15 replicas per shard as TLOG. NRT is not an option at all, so we didn’t test it. Result: better response time with PULL, no impact on reload.
  - other tunings, including GC: no impact.

* solrconfig.xml:
  - directoryFactory: https://solr.apache.org/docs/9_1_0/core/org/apache/solr/core/MMapDirectoryFactory.html was throwing the following exception: o.a.s.c.SolrCore java.lang.IllegalArgumentException: Unknown directory: MMapDirectory@/var/solr/data/my_collection_shard3_replica_t1643/data/snapshot_metadata (we do not use snapshots at all) (stack trace https://justpaste.it/88en6). Switched to https://solr.apache.org/docs/9_1_0/core/org/apache/solr/core/StandardDirectoryFactory.html; problem solved, no more Unknown directory exceptions, and reload no longer fails on some nodes with that exception. Result: reload still timing out, fewer exceptions.
  - lockType: switched between “native” and “simple” lock types. Result: no impact.
  - HttpShardHandlerFactory: increased the timeout by 40% for cross-shard communication during queries. Result: no impact.
  - filterCache, queryResultCache and documentCache: limited the size of the caches in megabytes instead of entries (filterCache 1024MB, queryResultCache 1024MB, documentCache 2048MB; a sketch of the fragment follows this list). Result: nodes are more stable during reload, cluster is not destabilizing, no old GC activity; better response time, less pressure on GC.
  - circuitBreaker: disabled the circuitBreaker. Result: no impact.
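
A sketch of the cache fragment we ended up with, written as a heredoc for reference (the maxRamMB values are the ones above; the CaffeineCache class and autowarmCount=0 are assumptions here):

# fragment of the <query> section of solrconfig.xml
cat <<'EOF' > caches-fragment.xml
<filterCache class="solr.CaffeineCache" maxRamMB="1024" autowarmCount="0"/>
<queryResultCache class="solr.CaffeineCache" maxRamMB="1024" autowarmCount="0"/>
<documentCache class="solr.CaffeineCache" maxRamMB="2048"/>
EOF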

---
Nick Vladiceanu
vladiceanu.n@gmail.com 






Re: Core reload timeout on Solr 9

Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/20/22 06:34, Nick Vladiceanu wrote:
> Thank you Shawn for sharing, indeed useful information.
> 
> However, I must say that we only used deleteById and never deleteByQuery. We also rely only on automatic segment merging and never issue the optimize command.

That is very unusual.  I've never seen a core reload take more than a 
few seconds, even when I was dealing with core sizes of double-digit GB. 
  Unless you have hundreds or thousands of replicas for each of your 6 
shards, it really should complete very quickly.

Have you been able to determine which Solr cores in the collection are 
causing the delay, and take a look at those machines?

Some thoughts:

When you said 96 nodes, were you talking about Solr instances or 
servers?  You really should only run one Solr instance per server, 
especially for a small index like this.

A 23GB heap seems very excessive for a 4.7GB index that has less than 4 
million documents.  I'm sure you can reduce that by a lot and encounter 
smaller GC pauses as a result.  If you can share your GC logs, I should 
be able to provide a recommendation.

I've been looking at what MinHeapFreeRatio and MaxHeapFreeRatio do. 
Those settings are probably unnecessary.  This is what I currently use 
for GC tuning on JDK 11 or JDK 17.  This produces EXTREMELY short 
collection pauses, though I have noticed that throughput-heavy things like 
indexing run a bit slower; if the indexing is multi-threaded, I think it 
would not be affected a lot.

GC_TUNE=" \
   -XX:+UnlockExperimentalVMOptions \
   -XX:+UseZGC \
   -XX:+ParallelRefProcEnabled \
   -XX:+ExplicitGCInvokesConcurrent \
   -XX:+UseStringDeduplication \
   -XX:+AlwaysPreTouch \
   -XX:+UseNUMA \
"

ZGC has one unexpected disadvantage.  Using it will disable Compressed 
OOPs -- meaning that even with a heap smaller than 32GB, it uses 64 bit 
pointers.  This hasn't really impacted me ... the index is so small that 
with a 1GB heap I have more than enough.  If low pauses are the most 
important thing you need from GC and you're running at least JDK11, I 
would strongly recommend ZGC.  It does make indexing slower for me -- a 
full rebuild that takes 10 minutes with G1 takes 11 minutes with ZGC. 
But even the worst-case GC pauses are single-digit milliseconds.
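
A quick way to confirm that for a given collector/heap combination, on JDK 11 
or later (the heap size here is just illustrative):

# prints the final value of UseCompressedOops for this flag combination
java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xmx23g \
     -XX:+PrintFlagsFinal -version | grep -i usecompressedoops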

For G1GC, which is still the best option for JDK8, this is what I used 
to have:

#GC_TUNE=" \
#  -XX:+UseG1GC \
#  -XX:+ParallelRefProcEnabled \
#  -XX:MaxGCPauseMillis=100 \
#  -XX:+ExplicitGCInvokesConcurrent \
#  -XX:+UseStringDeduplication \
#  -XX:+AlwaysPreTouch \
#  -XX:+UseNUMA \
#"

Thanks,
Shawn

Re: Core reload timeout on Solr 9

Posted by Nick Vladiceanu <vl...@gmail.com>.
Thank you Shawn for sharing, indeed useful information.

However, I must say that we only used deleteById and never deleteByQuery. We also rely only on automatic segment merging and never issue the optimize command. 


Thanks,
---
Nick Vladiceanu
vladiceanu.n@gmail.com 






Re: Core reload timeout on Solr 9

Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/5/22 03:07, Nick Vladiceanu wrote:
> The problem we face is when we try to reload the collection, in sync mode we’re getting timed out or forever running task if reload executed in async mode:

Are you by chance using deleteByQuery?  If you are, there is a 
possibility that you're running into a problem that dBQ can cause.

Basically, that kind of delete does not mix well with Lucene's built-in 
segment merging.  If a segment merge happens at the same time as a dBQ, 
then all subsequent index updates are put on hold until the merge 
finishes.  Sometimes a merge can take a REALLY long time, especially if 
it is explicitly kicked off as a Solr "optimize" operation.  Large 
segment merges happen without an optimize too, so this can happen even 
if you never use optimize.

It is entirely possible that when this happens in Solr 9.x (which uses 
Lucene 9.x), Lucene prevents the shutdown of the current searcher 
until the merge finishes, whereas in 8.x, shutting down the searcher and 
killing the merge is allowed by Lucene.

If this is what is happening (I do not know enough about Lucene 
internals to say for sure whether it could be the problem), changing a 
delete by query into a query (to gather ID values) followed by a delete by 
ID will almost certainly fix it. A minimal sketch of that pattern is below.
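
A rough sketch, where the collection name and query are placeholders and 
jq is assumed to be available:

# gather IDs with a regular query, then delete by ID instead of deleteByQuery
curl -s "http://localhost:8983/solr/my_collection/select?q=expired:true&fl=id&rows=10000" \
  | jq '{delete: [.response.docs[].id]}' \
  | curl -s -X POST -H 'Content-Type: application/json' \
      --data-binary @- "http://localhost:8983/solr/my_collection/update"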

I ran into issues with combining dBQ with optimize way back in Solr 3.x 
or 4.x, with Solr indexes at a job that I no longer have.  And once I 
discovered the issue, I replaced every usage of deleteByQuery in the 
indexing software, and never had a problem again.  I had never tried an 
index reload while that was happening, though.

Thanks,
Shawn