Posted to solr-user@lucene.apache.org by Ash Ramesh <as...@canva.com> on 2018/08/21 03:55:05 UTC

7.3.1: Query of death - all nodes ran out of memory and had to be shut down

Hi everyone,

We ran into an issue yesterday where all our EC2 machines running Solr
ran out of memory and could not recover on their own. I'll try to break down
what happened here.

*System Architecture:*

- Solr Version: 7.3.1
- Replica Types: TLOG/PULL
- Num Shards: 8 (default hashing mechanism)
- Doc Count: > 20m
- Index Size: 17G
- EC2 Machine Spec: 16 Core | 32G ram | 100G SSD
- Num EC2 Machines: 7+ (scales up and down)
- Max Shards per node (one node per EC2 instance): 8 (some nodes had 4,
some had 8)
- Num TLOG shard replicas: 3 (3 copies of each shard as TLOG)
- Num PULL shard replicas: 3+
- Heap: 4G

*What was run prior to the issue:*

We ran these queries at around 2.55pm.

We ran a batch of deep-paginated queries (offset of 1,000,000) with a
filter query. We set the timeout to 5 seconds, and the queries did time out.
We aren't sure whether this is what caused the unrecoverable failure, but
after reading
https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html#performance-problems-with-deep-paging
we believe it was the cause.

We did not use a cursor.
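
For reference, a minimal sketch of what cursor-based paging could look like instead of a large `start` offset. It uses Solr's `cursorMark` / `nextCursorMark` parameters from the pagination guide linked above. The uniqueKey field name `id`, the page size, the base URL in the usage note, and all function names are assumptions for illustration, not taken from our setup.

```python
import json
import urllib.parse
import urllib.request


def cursor_params(query, filter_query, sort, rows, cursor_mark="*"):
    """Build the request parameters for one cursor-paged select request."""
    return {
        "q": query,
        "fq": filter_query,
        "sort": sort,              # must include the uniqueKey field as a tie-breaker
        "rows": rows,
        "cursorMark": cursor_mark,  # "*" starts a new cursor; no "start" parameter
        "wt": "json",
    }


def fetch_json(base_url, params):
    """Send one select request to a collection URL and decode the JSON reply."""
    url = base_url + "/select?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def iterate_docs(fetch, query, filter_query, sort="id asc", rows=500):
    """Yield every matching document by following nextCursorMark page by page."""
    cursor = "*"
    while True:
        page = fetch(cursor_params(query, filter_query, sort, rows, cursor))
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means there are no more results
            break
        cursor = next_cursor
```

Hypothetical usage: `iterate_docs(lambda p: fetch_json("http://localhost:8983/solr/media", p), "*:*", "some_field:some_value")`. Each request asks only for the next `rows` documents after the cursor, so no request has to rank a million skipped documents first.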

This cluster had been healthy for about a week, but we noticed the
degradation soon after (within 30 minutes of) running the offset queries
mentioned above. We currently use a single-shard collection in production,
but are transitioning to an 8-shard cluster. We hit this issue in a
controlled 8-shard environment, and don't see any issues on our production
(single-shard) cluster. On production the query still timed out (with the
same number of docs etc.) but the cluster didn't go into a crazy state.
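
A rough sketch of why the same offset may cost more on the 8-shard cluster. It assumes the documented behaviour of distributed search, where each shard returns its own top `start + rows` entries and the coordinating node merges all of them. `rows=10` is an assumed page size (the actual value isn't stated above), and the function name is hypothetical. This only illustrates the scale; it does not establish that these queries caused the OOM.

```python
def deep_paging_cost(start, rows, num_shards):
    """Return (entries ranked per shard, entries merged by the coordinating node)."""
    per_shard = start + rows          # each shard must rank its own top start+rows
    merged = per_shard * num_shards   # the coordinator receives one such list per shard
    return per_shard, merged


# Offset from the queries above; rows=10 is an assumption.
single = deep_paging_cost(1_000_000, 10, 1)
sharded = deep_paging_cost(1_000_000, 10, 8)
print("single shard:", single)
print("eight shards:", sharded)
```

Under these assumptions a single request on 8 shards makes one node hold and sort roughly 8 million entries, compared with roughly 1 million on the single-shard collection, and several such requests in flight multiply that further.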

*What Happened:*

- All the EC2 instances started logging OOM errors. None of the nodes were
responsive to new requests.
- We saw that the Heap usage jumped from an average of 2.7G to the max of
4G within a 5 minute window.
- CPU across all 16 cores was at 100%
- We saw that the distributed requests were timing out across all machines.
- We shut down all the machines that only had PULL replicas on them, and the
cluster still didn't 'fix' itself.
- Eventually we shut down Solr on the main node, which had all the master
TLOG replicas. Once restarted, the machine started working again.


*Questions:*
- Did this deep pagination query *DEFINITELY* cause this issue?
- Is each node single threaded? I don't think so, but I'd like to confirm
that.
- Is there any configuration that we could use to avoid this in the future?
- Why could the nodes not recover by themselves? When we ran the same query
on the single shard cluster it failed and didn't spin out of control.

Thanks for all your help. Logs from different timestamps are pasted below.

Regards,
Ash

*Logs:*

Here are some logs we collected. Not sure if they tell us much beyond what
we already know.
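
A small helper for reading the traces below, assuming the `#011` sequences are the syslog daemon's octal escape for a tab character (octal 011 is ASCII 9). The function name is hypothetical. Note that it would also rewrite any literal `#` followed by three octal digits that happens to appear in a message.

```python
import re

# "#" followed by exactly three octal digits, e.g. "#011" for a tab.
_OCTAL_ESCAPE = re.compile(r"#([0-7]{3})")


def unescape_syslog(line):
    """Replace each #ooo octal escape with the control character it encodes."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), line)
```

For example, `unescape_syslog("#011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)")` gives the usual tab-indented stack frame.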

*Time: 2.55pm ~ Requests are failing to complete in time*

> ERROR RequestHandlerBase org.apache.solr.common.SolrException:
> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> this request exceeded:[
> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
> #011at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
> #011at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> #011at org.apache.s...
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>  #011at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>  #011at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>  #011at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>  #011at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>  #011at
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>  #011at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>  #011at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>  #011at
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>  #011at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>  #011at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>  #011at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>  #011at
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>  solr: ERROR RequestHandlerBase org.apache.solr.common.SolrException:
> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> this request exceeded:[
> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>  #011at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>  #011at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>  ERROR RequestHandlerBase org.apache.solr.common.SolrException:
> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> this request exceeded:[
> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
> #011at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
> #011at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> #011at org.apache.s...
>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>  #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>  #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>  #011at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException:
> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> this request exceeded:[
> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
> #011at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
> #011at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)


*Time: 3.00pm ~ Seeing out of memory*

> The request took too long to iterate over terms. Timeout: timeoutAt:
> 440824897486723 (System.nanoTime(): 440825204861492),
> TermsEnum=org.apache.lucene.codecs.blocktree.SegmentTermsEnum@3a9d2efc
>  solr: ERROR RequestHandlerBase java.lang.ClassCastException:
> org.apache.solr.response.BasicResultContext cannot be cast to
> org.apache.solr.common.SolrDocumentList
>  #011at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>  #011at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>  #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>  #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>  #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>  #011at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>  #011at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>  #011at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>  #011at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>  #011at
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>  solr: ERROR HttpSolrCall null:java.lang.ClassCastException:
> org.apache.solr.response.BasicResultContext cannot be cast to
> org.apache.solr.common.SolrDocumentList
>  #011at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>  #011at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>  #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>  #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>  #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>  #011at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>  #011at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>  #011at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>  #011at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>  #011at
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)


*3.15~ Health checks failing, Nodes are unresponsive.*

>  WARN  DistributedUpdateProcessor Error sending update to
> http://10.0.9.47:8983/solr
>  Failed to connect to Solr. May not be online yet. Retrying in 0 seconds...
>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
> trying to proxy request for url:
> http://10.0.10.117:8983/solr/media/admin/ping
> #011at
> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
> #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> #011at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> #011at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> #011at o...
>  ERROR RequestHandlerBase org.apache.solr.common.SolrException:
> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> this request exceeded
> #011at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
> #011at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
> #011at org.apache.solr....
> ERROR ErrorReportingConcurrentUpdateSolrClient error
>  solr: ERROR RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException: Idle timeout expired: 120394/120000
> ms
>  #011at
> org.eclipse.jetty.server.HttpInput$ErrorState.noContent(HttpInput.java:1069)
>  #011at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:309)
>  #011at
> org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:100)
>  #011at
> org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:79)
>  #011at
> org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:88)
>  #011at
> org.apache.solr.common.util.FastInputStream.peek(FastInputStream.java:60)
>  #011at
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
>  #011at
> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>  #011at
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>  #011at
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>  #011at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>  #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>  #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>  #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>  #011at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>  #011at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>  #011at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>  #011at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>  #011at
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>  #011at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>  #011at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>  #011at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>  #011at
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>  Caused by: java.util.concurrent.TimeoutException: Idle timeout expired:
> 120394/120000 ms
>  #011at
> org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
>  #011at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
>  #011at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
>  #011at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>  #011at
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  #011... 1 more
>  #011Suppressed: java.lang.Throwable: HttpInput failure
>  #011#011at org.eclipse.jetty.server.HttpInput.failed(HttpInput.java:816)
>  #011#011at
> org.eclipse.jetty.server.HttpConnection$BlockingReadCallback.failed(HttpConnection.java:639)
>  #011#011at org.eclipse.jetty.io.FillInterest.onFail(FillInterest.java:134)
>  #011#011at
> org.eclipse.jetty.io.AbstractEndPoint.onIdleExpired(AbstractEndPoint.java:398)
>  #011#011... 8 more
>  solr: ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
> trying to proxy request for url:
> http://10.0.10.117:8983/solr/media/admin/ping
>  #011at
> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
> trying to proxy request for url:
> http://10.0.10.117:8983/solr/media/admin/ping
> #011at
> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
> #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> #011at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> #011at org.eclipse.jetty.servlet.ServletHandler.doHandle(Servl...
>  #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>  #011at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>  #011at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>  #011at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>  #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  #011at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>  #011at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>  #011at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>  #011at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>  #011at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>  #011at
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>  #011at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>  #011at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>  #011at
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>  Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to
> 10.0.10.117:8983 [/10.0.10.117] failed: connect timed out
>  #011at
> org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
>  #011at
> org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359)
>  #011at
> org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
>  #011at
> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
>  #011at
> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>  #011at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>  #011at
> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>  #011at
> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>  #011at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>  #011at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>  #011at
> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:619)


*3.40~ Cannot do updates. Everything is broken :(*

> #011at
> org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:664)
> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:532)
> #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> #011at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> #011at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> #011at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> #011at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> #011at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> #011at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Ha...
>  solr: WARN  DistributedUpdateProcessor Error sending update to
> http://10.0.8.157:8983/solr
>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://10.0.8.157:8983/solr/media_shard6_replica_t35:
> Server Error
>  ERROR DistributedUpdateProcessor Setting up to try to start recovery on
> replica core_node36 with url
> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
> leader term
>  message repeated 2 times: []
>  request:
> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>  Remote error message: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120485/120000 ms
>  #011at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>  #011at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>  #011at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>  #011at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>  solr: ERROR DistributedUpdateProcessor Setting up to try to start
> recovery on replica core_node36 with url
> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
> leader term
>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://10.0.8.157:8983/solr/media_shard6_replica_t35:
> Server Error
>  message repeated 2 times: []
>  request:
> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>  Remote error message: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120485/120000 ms
>  #011at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>  #011at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>  #011at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>  #011at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>  solr: WARN  DistributedUpdateProcessor Error sending update to
> http://10.0.8.157:8983/solr
>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://10.0.8.157:8983/solr/media_shard6_replica_t35:
> Server Error
>  message repeated 2 times: []
>  request:
> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>  Remote error message: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120307/120000 ms
>  #011at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>  #011at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>  #011at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>  #011at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>  solr: WARN  DistributedUpdateProcessor Error sending update to
> http://10.0.9.47:8983/solr
>  org.apache.http.NoHttpResponseException: 10.0.9.47:8983 failed to respond
>  #011at
> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
>  #011at
> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>  #011at
> org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
>  #011at
> org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>  #011at
> org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
>  #011at
> org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>  #011at
> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>  #011at
> org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:118)
>  #011at
> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>  #011at
> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>  #011at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>  #011at
> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>  #011at
> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>  #011at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>  #011at
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>  #011at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:347)
>  #011at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>  #011at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>  #011at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>  #011at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>  solr: WARN  DistributedUpdateProcessor Error sending update to
> http://10.0.8.157:8983/solr
>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://10.0.8.157:8983/solr/media_shard2_replica_t20:
> Server Error







Re: 7.3.1: Query of death - all nodes ran out of memory and had to be shut down

Posted by Erick Erickson <er...@gmail.com>.
bq. I meant to ask whether there is a high probability that that could
be the correlated cause for the issue.

Yes, I do tend to be pedantic on occasion, a personal failing ;)


bq. Do you know why Solr itself isn't able to recover or is that to be
expected with allowing such deep pagination.

The general problem isn't deep paging itself, but the fact that in
your case it generates an OOM. After an OOM there is no way to
recover; the state of the program is unknown. This is pretty much true
of all Java programs, which is why there's an OOM-killer script you
can configure. That won't help your situation since you'll probably
dive right back into an OOM, but at least it doesn't continue to try
to work with a program that's in an unknown state.

Best,
Erick

On Tue, Aug 21, 2018 at 2:08 AM, Ere Maijala <er...@helsinki.fi> wrote:
> Hi,
>
> Just my short comment here. It's difficult to say for someone else, but we
> identified deep paging as the definite reason for running out of memory or
> at least grinding to a semi-halt because of long stop-the-world garbage
> collection pauses in an application running on a similar SolrCloud. You can
> often get away without issues as long as you only have a single shard, but
> for the reason Erick mentioned, deep paging in a sharded index is a heavy
> operation.
>
> Regards,
> Ere
>
> Ash Ramesh kirjoitti 21.8.2018 klo 8.09:
>>
>> Hi Erick,
>>
>> Sorry I phrased that the wrong way. I meant to ask whether there is a high
>> probability that that could be the correlated cause for the issue. Do you
>> know why Solr itself isn't able to recover or is that to be expected with
>> allowing such deep pagination. We are going to be removing it going
>> forwards, but want to make sure that we find the root cause.
>>
>> Appreciate your help as always :)
>>
>> Ash
>>
>> On Tue, Aug 21, 2018 at 2:59 PM Erick Erickson <er...@gmail.com>
>> wrote:
>>
>>> Did the large offsets _definitely_ cause the OOM? How do you expect
>>> that to be answerable? It's likely though. To return rows 1,000,000
>>> through 1,000,010 the system has to keep a list of 1,000,010 top
>>> documents. It has to be this way because you don't know (and can't
>>> guess) the score of a doc prior to, well, scoring it. And these very
>>> large structures are kept for every query being processed. Not only
>>> will that chew up memory, it'll chew up CPU cycles as well, as this is
>>> an ordered list.
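
[Editorial sketch, not part of the quoted reply.] The arithmetic above can be made concrete with a small calculation. It assumes each shard has to rank start + rows candidates for an offset query and that the coordinating node then merges the candidates from every shard. The function names are hypothetical. The offset of 1,000,000 and the 8 shards come from the original post. Only entries are counted, since the per-entry memory cost is not stated anywhere in this thread.

```python
def per_shard_candidates(start: int, rows: int) -> int:
    """Entries one shard must rank to answer an offset query: start + rows."""
    return start + rows


def coordinator_candidates(start: int, rows: int, num_shards: int) -> int:
    """Entries the coordinating node receives and merges across all shards."""
    return num_shards * per_shard_candidates(start, rows)


if __name__ == "__main__":
    start, rows = 1_000_000, 10
    # Single-shard collection: one list of 1,000,010 entries per query.
    print(coordinator_candidates(start, rows, num_shards=1))
    # 8-shard collection: eight such lists are merged for every query.
    print(coordinator_candidates(start, rows, num_shards=8))
```

Under these assumptions the 8-shard case handles eight times as many entries per query as the single-shard case, which is in line with the observation that the same query timed out on the single-shard cluster without taking it down.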
>>>
>>> This is an anti-pattern; cursors were invented because this pattern is
>>> very costly (as you're finding out).
>>>
>>> Further, 4G isn't very much memory by modern standards.
>>>
>>> So it's very likely (but not guaranteed) that using cursors will fix
>>> this problem.
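
[Editorial sketch, not part of the quoted reply.] A minimal illustration of cursor-based paging, assuming Solr's documented cursorMark contract: the first request sends cursorMark=*, the sort includes the uniqueKey field as a tiebreaker, each response carries a nextCursorMark, and iteration stops when the returned mark equals the one that was sent. The `fetch` callable is a hypothetical stand-in for whatever HTTP client issues the /select request and returns the parsed JSON response. The field name `id` used as the uniqueKey in the usage note is also an assumption.

```python
from typing import Callable, Dict, Iterator


def cursor_params(query: str, sort: str, rows: int, cursor_mark: str = "*") -> Dict[str, str]:
    """Build request parameters for one cursor page (no 'start' offset is sent)."""
    return {"q": query, "sort": sort, "rows": str(rows), "cursorMark": cursor_mark}


def iterate_with_cursor(
    fetch: Callable[[Dict[str, str]], dict],
    query: str,
    sort: str,
    rows: int = 100,
) -> Iterator[dict]:
    """Yield every matching document, one page at a time, using a cursor."""
    cursor = "*"  # initial cursor value
    while True:
        response = fetch(cursor_params(query, sort, rows, cursor))
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            # The same mark coming back means there are no more results.
            break
        cursor = next_cursor
```

A caller would pass something like sort="score desc, id asc", so that the uniqueKey breaks ties. Each request then asks only for the next `rows` documents after the cursor position instead of re-ranking everything before a large offset.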
>>>
>>> Best,
>>> Erick
>>>
>>>
>>>
>>> On Mon, Aug 20, 2018 at 8:55 PM, Ash Ramesh <as...@canva.com> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> We ran into an issue yesterday where all our ec2 machines, running solr,
>>>> ran out of memory and could not heal themselves. I'll try break down
>>>> what
>>>> happened here.
>>>>
>>>> *System Architecture:*
>>>>
>>>> - Solr Version: 7.3.1
>>>> - Replica Types: TLOG/PULL
>>>> - Num Shards: 8 (default hashing mechanism)
>>>> - Doc Count: > 20m
>>>> - Index Size: 17G
>>>> - EC2 Machine Spec: 16 Core | 32G ram | 100G SSD
>>>> - Num EC2 Machines: 7+ (scales up and down)
>>>> - Max Shards per node (one node per EC2 instance): 8 (some nodes had 4,
>>>> some had 8)
>>>> - Num TLOG shard replicas: 3 (3 copies of each shard as TLOG)
>>>> - Num PULL shard replicas: 3+
>>>> - Heap: 4G
>>>>
>>>> *What was run prior to the issue:*
>>>>
>>>> We ran these queries around 2.55pm
>>>>
>>>> We ran a bunch of deep paginated queries (offset of 1,000,000) with a
>>>> filter query. We set the timeout to 5 seconds and it did timeout. We
>>>
>>> aren't
>>>>
>>>> sure if this is what caused the irrecoverable failure, but by reading
>>>
>>> this
>>>>
>>>> -
>>>>
>>>
>>> https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html#performance-problems-with-deep-paging
>>>>
>>>> , we feel that this was the cause.
>>>>
>>>> We did not use a cursor.
>>>>
>>>> This cluster was healthy for about 1 week, but we noticed the
>>>> degradation
>>>> soon after (within 30min) of running the offset queries mentioned above.
>>>
>>> We
>>>>
>>>> currently use a single sharded collection in production, however are
>>>> transitioning to an 8 shard cluster. We hit this issue in a controlled 8
>>>> sharded environment, but don't notice any issues on our production
>>>
>>> (single
>>>>
>>>> sharded) cluster. On production the query still timed out (with same num
>>>> docs etc.) but didn't go into a crazy state.
>>>>
>>>> *What Happened:*
>>>>
>>>> - All the EC2 instances started logging OOM error. None of the nodes
>>>> were
>>>> responsive to new requests.
>>>> - We saw that the Heap usage jumped from an average of 2.7G to the max
>>>> of
>>>> 4G within a 5 minute window.
>>>> - CPU across all 16 cores was at 100%
>>>> - We saw that the distributed requests were timing out across all
>>>
>>> machines.
>>>>
>>>> - We shutdown all the machines that only had PULL replicas on them and
>>>> it
>>>> still didn't 'fix' itself.
>>>> - Eventually we shut down SOLR on the main node which had all the master
>>>> TLOG replicas. Once restarted, the machine started working again.
>>>>
>>>>
>>>> *Questions:*
>>>> - Did this deep pagination query *DEFINITELY* cause this issue?
>>>> - Is each node single threaded? I don't think so, but I'd like to
>>>> confirm
>>>> that.
>>>> - Is there any configuration that we could use to avoid this in the
>>>
>>> future?
>>>>
>>>> - Why could the nodes not recover by themselves? When we ran the same
>>>
>>> query
>>>>
>>>> on the single shard cluster it failed and didn't spin out of control.
>>>>
>>>> Thanks for all your help, Logs are pasted below from different
>>>
>>> timestamps.
>>>>
>>>>
>>>> Regards,
>>>> Ash
>>>>
>>>> *Logs:*
>>>>
>>>> Here are some logs we collected. Not sure if it tells a lot outside of
>>>
>>> what
>>>>
>>>> we know.
>>>>
>>>> *Time: 2.55pm ~ Requests are failing to complete in time*
>>>>
>>>>> ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to
>>>>> handle
>>>>> this request exceeded:[
>>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>>
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>
>>>>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>> #011at
>>>
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>>
>>>>> #011at org.apache.s...
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>
>>>>>   #011at
>>>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>   #011at
>>>
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>   #011at
>>>
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io
>>>
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>
>>>>>   #011at org.eclipse.jetty.io
>>>
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io
>>>
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>
>>>>>   #011at org.eclipse.jetty.io
>>>
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>
>>>>>   #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>
>>>>>   #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>   solr: ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to
>>>>> handle
>>>>> this request exceeded:[
>>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>
>>>>>   #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>>   ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to
>>>>> handle
>>>>> this request exceeded:[
>>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>>
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>
>>>>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>> #011at
>>>
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>>
>>>>> #011at org.apache.s...
>>>>>   #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>
>>>>>   #011at
>>>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>
>>>>>   ERROR HttpSolrCall null:org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to
>>>>> handle
>>>>> this request exceeded:[
>>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>>
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>
>>>>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>> #011at
>>>
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>
>>>>
>>>>
>>>> *Time: 3.00pm ~ Seeing out of memory*
>>>>
>>>>> The request took too long to iterate over terms. Timeout: timeoutAt:
>>>>> 440824897486723 (System.nanoTime(): 440825204861492),
>>>>> TermsEnum=org.apache.lucene.codecs.blocktree.SegmentTermsEnum@3a9d2efc
>>>>>   solr: ERROR RequestHandlerBase java.lang.ClassCastException:
>>>>> org.apache.solr.response.BasicResultContext cannot be cast to
>>>>> org.apache.solr.common.SolrDocumentList
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>
>>>>>   #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>>   #011at
>>>
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>>
>>>>>   #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>
>>>>>   #011at
>>>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>   #011at
>>>
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io
>>>
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>
>>>>>   #011at org.eclipse.jetty.io
>>>
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>
>>>>>   #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>   solr: ERROR HttpSolrCall null:java.lang.ClassCastException:
>>>>> org.apache.solr.response.BasicResultContext cannot be cast to
>>>>> org.apache.solr.common.SolrDocumentList
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>
>>>>>   #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>>   #011at
>>>
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>>
>>>>>   #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>
>>>>>   #011at
>>>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>   #011at
>>>
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io
>>>
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>
>>>>>   #011at org.eclipse.jetty.io
>>>
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>
>>>>>   #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>
>>>>
>>>>
>>>> *3.15~ Health checks failing, Nodes are unresponsive.*
>>>>
>>>>>   WARN  DistributedUpdateProcessor Error sending update to
>>>>> http://10.0.9.47:8983/solr
>>>>>   Failed to connect to Solr. May not be online yet. Retrying in 0
>>>
>>> seconds...
>>>>>
>>>>>   ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>>>>> trying to proxy request for url:
>>>>> http://10.0.10.117:8983/solr/media/admin/ping
>>>>> #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>>>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>
>>>>> #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>
>>>>> #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>
>>>>> #011at o...
>>>>>   ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to
>>>>> handle
>>>>> this request exceeded
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>>
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>
>>>>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>> #011at
>>>
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>>
>>>>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>> #011at org.apache.solr....
>>>>> ERROR ErrorReportingConcurrentUpdateSolrClient error
>>>>>   solr: ERROR RequestHandlerBase java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>>>> 120394/120000 ms
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.HttpInput$ErrorState.noContent(HttpInput.java:1069)
>>>>>
>>>>>   #011at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:309)
>>>>>   #011at
>>>>>
>>>
>>> org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:100)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:79)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:88)
>>>>>
>>>>>   #011at
>>>>>
>>> org.apache.solr.common.util.FastInputStream.peek(FastInputStream.java:60)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
>>>>>
>>>>>   #011at
>>>>>
>>>>> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>
>>>>>   #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>>   #011at
>>>
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>>
>>>>>   #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>
>>>>>   #011at
>>>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>   #011at
>>>
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io
>>>
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>
>>>>>   #011at org.eclipse.jetty.io
>>>
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>
>>>>>   #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>   Caused by: java.util.concurrent.TimeoutException: Idle timeout
>>>>> expired: 120394/120000 ms
>>>>>   #011at
>>>>> org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
>>>>>   #011at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
>>>>>   #011at
>>>>>
>>>
>>> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
>>>>>
>>>>>   #011at
>>>
>>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>>
>>>>>   #011... 1 more
>>>>>   #011Suppressed: java.lang.Throwable: HttpInput failure
>>>>>   #011#011at
>>>
>>> org.eclipse.jetty.server.HttpInput.failed(HttpInput.java:816)
>>>>>
>>>>>   #011#011at
>>>>>
>>>
>>> org.eclipse.jetty.server.HttpConnection$BlockingReadCallback.failed(HttpConnection.java:639)
>>>>>
>>>>>   #011#011at org.eclipse.jetty.io
>>>
>>> .FillInterest.onFail(FillInterest.java:134)
>>>>>
>>>>>   #011#011at
>>>>> org.eclipse.jetty.io
>>>
>>> .AbstractEndPoint.onIdleExpired(AbstractEndPoint.java:398)
>>>>>
>>>>>   #011#011... 8 more
>>>>>   solr: ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>>>>> trying to proxy request for url:
>>>>> http://10.0.10.117:8983/solr/media/admin/ping
>>>>>   #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>>>>   #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>>>>   ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>>>>> trying to proxy request for url:
>>>>> http://10.0.10.117:8983/solr/media/admin/ping
>>>>> #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>>>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>
>>>>> #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>
>>>>> #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>
>>>>> #011at org.eclipse.jetty.servlet.ServletHandler.doHandle(Servl...
>>>>>   #011at
>>>>>
>>>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>
>>>>>   #011at
>>>>>
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>
>>>>>   #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>   #011at
>>>
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io
>>>
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>
>>>>>   #011at org.eclipse.jetty.io
>>>
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>
>>>>>   #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>
>>>>>   #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>   Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to
>>>>> 10.0.10.117:8983 [/10.0.10.117] failed: connect timed out
>>>>>   #011at
>>>>>
>>>
>>> org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
>>>>>
>>>>>   #011at
>>>>>
>>>
>>> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>>>>>
>>>>>   #011at


Re: 7.3.1: Query of death - all nodes ran out of memory and had to be shut down

Posted by Jan Høydahl <ja...@cominvent.com>.
The solution is to move to cursors, but as a safety net you may also try applying the RequestSanitizerComponent to disallow large offsets; see https://github.com/cominvent/request-sanitizer-component
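
For anyone wanting to see what the cursor approach looks like on the client side, here is a minimal sketch in Python. It relies only on the documented cursor contract: send cursorMark=* on the first request, sort on something that ends with the uniqueKey field, leave out start, and stop once nextCursorMark comes back unchanged. The names iterate_with_cursor and fetch_page, and the uniqueKey being called "id", are assumptions made for illustration; fetch_page stands in for whatever HTTP call you already use against /select.

```python
from typing import Callable, Dict, Iterator, Optional


def iterate_with_cursor(
    fetch_page: Callable[[Dict], Dict],
    query: str,
    unique_key: str = "id",   # assumption: the collection's uniqueKey field
    rows: int = 100,
    fq: Optional[str] = None,
) -> Iterator[Dict]:
    """Yield every matching document using cursorMark instead of start.

    fetch_page is a hypothetical callable that takes a dict of Solr query
    parameters and returns the parsed JSON response as a dict.
    """
    cursor = "*"  # documented starting value for a new cursor
    while True:
        params = {
            "q": query,
            "rows": rows,
            # A cursor requires a sort that includes the uniqueKey field
            # so that the ordering is total.
            "sort": f"{unique_key} asc",
            "cursorMark": cursor,
        }
        if fq is not None:
            params["fq"] = fq

        response = fetch_page(params)
        yield from response["response"]["docs"]

        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            # An unchanged cursor means there is nothing left to read.
            break
        cursor = next_cursor
```

Each request only asks for `rows` documents past the cursor position, so the cost per page should stay roughly flat no matter how far into the result set you are. The trade-off is that you can only walk forward; you cannot jump to an arbitrary page.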

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 21. aug. 2018 kl. 11:08 skrev Ere Maijala <er...@helsinki.fi>:
> 
> Hi,
> 
> Just my short comment here. It's difficult for someone else to say, but we identified deep paging as the definite reason for running out of memory, or at least grinding to a semi-halt because of long stop-the-world garbage collection pauses, in an application running on a similar SolrCloud. You can often get away without issues as long as you only have a single shard, but for the reason Erick mentioned, deep paging in a sharded index is a heavy operation.
> 
> Regards,
> Ere
> 
> Ash Ramesh kirjoitti 21.8.2018 klo 8.09:
>> Hi Erick,
>> Sorry, I phrased that the wrong way. I meant to ask whether there is a high
>> probability that this was the cause of the issue. Do you know why Solr
>> itself isn't able to recover, or is that to be expected when allowing such
>> deep pagination? We are going to remove it going forward, but want to make
>> sure that we find the root cause.
>> Appreciate your help as always :)
>> Ash
>> On Tue, Aug 21, 2018 at 2:59 PM Erick Erickson <er...@gmail.com>
>> wrote:
>>> Did the large offsets _definitely_ cause the OOM? How do you expect
>>> that to be answerable? It's likely though. To return rows 1,000,000
>>> through 1,000,010 the system has to keep a list of 1,000,010 top
>>> documents. It has to be this way because you don't know (and can't
>>> guess) the score of a doc prior to, well, scoring it. And these very
>>> large structures are kept for every query being processed. Not only
>>> will that chew up memory, it'll chew up CPU cycles as well, as this is an
>>> ordered list.
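
To put rough numbers on the quoted paragraph, here is a small Python sketch. It assumes the behaviour described in the deep-paging section of the Solr Reference Guide: every shard has to collect start + rows candidates, and the coordinating node then merges the lists from all shards. The function name deep_paging_entries is made up for illustration, and it counts list entries, not bytes.

```python
def deep_paging_entries(start: int, rows: int, shards: int) -> tuple:
    """Return (entries kept per shard, entries merged by the coordinator)."""
    per_shard = start + rows            # each shard ranks its own top start+rows
    coordinator = per_shard * shards    # coordinator merges every shard's list
    return per_shard, coordinator


# The setup from this thread: offset 1,000,000, 10 rows, 8 shards.
per_shard, coordinator = deep_paging_entries(1_000_000, 10, 8)
print(per_shard, coordinator)
```

That is per query, so a handful of such requests in flight at once multiplies the figure again.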
>>> 
>>> This is an anti-pattern; cursors were invented because this pattern is
>>> very costly (as you're finding out).
>>> 
>>> Further, 4G isn't very much memory by modern standards.
>>> 
>>> So it's very likely (but not guaranteed) that using cursors will fix
>>> this problem.
>>> 
>>> Best,
>>> Erick
>>> 
>>> 
>>> 
>>> On Mon, Aug 20, 2018 at 8:55 PM, Ash Ramesh <as...@canva.com> wrote:
>>>> Hi everyone,
>>>> 
>>>> We ran into an issue yesterday where all our EC2 machines, running Solr,
>>>> ran out of memory and could not heal themselves. I'll try to break down what
>>>> happened here.
>>>> 
>>>> *System Architecture:*
>>>> 
>>>> - Solr Version: 7.3.1
>>>> - Replica Types: TLOG/PULL
>>>> - Num Shards: 8 (default hashing mechanism)
>>>> - Doc Count: > 20m
>>>> - Index Size: 17G
>>>> - EC2 Machine Spec: 16 Core | 32G ram | 100G SSD
>>>> - Num EC2 Machines: 7+ (scales up and down)
>>>> - Max Shards per node (one node per EC2 instance): 8 (some nodes had 4,
>>>> some had 8)
>>>> - Num TLOG shard replicas: 3 (3 copies of each shard as TLOG)
>>>> - Num PULL shard replicas: 3+
>>>> - Heap: 4G
>>>> 
>>>> *What was run prior to the issue:*
>>>> 
>>>> We ran these queries around 2.55pm
>>>> 
>>>> We ran a bunch of deep paginated queries (offset of 1,000,000) with a
>>>> filter query. We set the timeout to 5 seconds and it did time out. We aren't
>>>> sure if this is what caused the irrecoverable failure, but by reading this
>>>> -
>>>> https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html#performance-problems-with-deep-paging
>>>> , we feel that this was the cause.
>>>> 
>>>> We did not use a cursor.
>>>> 
>>>> This cluster was healthy for about 1 week, but we noticed the degradation
>>>> soon after (within 30min of) running the offset queries mentioned above. We
>>>> currently use a single-shard collection in production, but are
>>>> transitioning to an 8-shard cluster. We hit this issue in a controlled
>>>> 8-shard environment, but don't notice any issues on our production
>>>> (single-shard) cluster. On production the query still timed out (with the
>>>> same num docs etc.) but didn't go into a crazy state.
>>>> 
>>>> *What Happened:*
>>>> 
>>>> - All the EC2 instances started logging OOM errors. None of the nodes were
>>>> responsive to new requests.
>>>> - We saw that the heap usage jumped from an average of 2.7G to the max of
>>>> 4G within a 5 minute window.
>>>> - CPU across all 16 cores was at 100%.
>>>> - We saw that the distributed requests were timing out across all machines.
>>>> - We shut down all the machines that only had PULL replicas on them and it
>>>> still didn't 'fix' itself.
>>>> - Eventually we shut down Solr on the main node which had all the master
>>>> TLOG replicas. Once restarted, the machine started working again.
>>>> 
>>>> 
>>>> *Questions:*
>>>> - Did this deep pagination query *DEFINITELY* cause this issue?
>>>> - Is each node single threaded? I don't think so, but I'd like to confirm
>>>> that.
>>>> - Is there any configuration that we could use to avoid this in the future?
>>>> - Why could the nodes not recover by themselves? When we ran the same query
>>>> on the single shard cluster it failed and didn't spin out of control.
>>>> 
>>>> Thanks for all your help. Logs from different timestamps are pasted below.
>>>> 
>>>> Regards,
>>>> Ash
>>>> 
>>>> *Logs:*
>>>> 
>>>> Here are some logs we collected. Not sure if they tell us much beyond what
>>>> we know.
>>>> 
>>>> *Time: 2.55pm ~ Requests are failing to complete in time*
>>>> 
>>>>> ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>>> this request exceeded:[
>>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>> #011at
>>>>> 
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>> #011at
>>>>> 
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>> #011at
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>> #011at org.apache.s...
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>  #011at
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>  #011at
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>  #011at
>>>>> org.eclipse.jetty.io
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>  #011at org.eclipse.jetty.io
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>  #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>  #011at
>>>>> org.eclipse.jetty.io
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>  #011at org.eclipse.jetty.io
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>  #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>  solr: ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>>> this request exceeded:[
>>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>>  ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>>> this request exceeded:[
>>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>> #011at
>>>>> 
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>> #011at
>>>>> 
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>> #011at
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>> #011at org.apache.s...
>>>>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>>> this request exceeded:[
>>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>> #011at
>>>>> 
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>> #011at
>>>>> 
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>> #011at
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>> 
>>>> 
>>>> *Time: 3.00pm ~ Seeing out of memory*
>>>> 
>>>>> The request took too long to iterate over terms. Timeout: timeoutAt:
>>>>> 440824897486723 (System.nanoTime(): 440825204861492),
>>>>> TermsEnum=org.apache.lucene.codecs.blocktree.SegmentTermsEnum@3a9d2efc
>>>>>  solr: ERROR RequestHandlerBase java.lang.ClassCastException:
>>>>> org.apache.solr.response.BasicResultContext cannot be cast to
>>>>> org.apache.solr.common.SolrDocumentList
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>>  #011at
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>  #011at
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>  #011at
>>>>> org.eclipse.jetty.io
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>  #011at org.eclipse.jetty.io
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>  #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>  solr: ERROR HttpSolrCall null:java.lang.ClassCastException:
>>>>> org.apache.solr.response.BasicResultContext cannot be cast to
>>>>> org.apache.solr.common.SolrDocumentList
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>>  #011at
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>  #011at
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>  #011at
>>>>> org.eclipse.jetty.io
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>  #011at org.eclipse.jetty.io
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>  #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>> 
>>>> 
>>>> *3.15~ Health checks failing, Nodes are unresponsive.*
>>>> 
>>>>>  WARN  DistributedUpdateProcessor Error sending update to
>>>>> http://10.0.9.47:8983/solr
>>>>>  Failed to connect to Solr. May not be online yet. Retrying in 0
>>> seconds...
>>>>>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>>>>> trying to proxy request for url:
>>>>> http://10.0.10.117:8983/solr/media/admin/ping
>>>>> #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>>>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>>>> #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>> #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>> #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>> #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>> #011at o...
>>>>>  ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>>> this request exceeded
>>>>> #011at
>>>>> 
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>> #011at
>>>>> 
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>> #011at
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>> #011at org.apache.solr....
>>>>>  ERROR ErrorReportingConcurrentUpdateSolrClient error
>>>>>  solr: ERROR RequestHandlerBase java.io.IOException:
>>>>> java.util.concurrent.TimeoutException: Idle timeout expired:
>>> 120394/120000
>>>>> ms
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.HttpInput$ErrorState.noContent(HttpInput.java:1069)
>>>>>  #011at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:309)
>>>>>  #011at
>>>>> 
>>> org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:100)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:79)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:88)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.common.util.FastInputStream.peek(FastInputStream.java:60)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
>>>>>  #011at
>>>>> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>>  #011at
>>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>  #011at
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>  #011at
>>>>> org.eclipse.jetty.io
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>  #011at org.eclipse.jetty.io
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>  #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>  Caused by: java.util.concurrent.TimeoutException: Idle timeout expired:
>>>>> 120394/120000 ms
>>>>>  #011at
>>>>> org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
>>>>>  #011at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
>>>>>  #011at
>>> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>>  #011... 1 more
>>>>>  #011Suppressed: java.lang.Throwable: HttpInput failure
>>>>>  #011#011at
>>> org.eclipse.jetty.server.HttpInput.failed(HttpInput.java:816)
>>>>>  #011#011at
>>>>> 
>>> org.eclipse.jetty.server.HttpConnection$BlockingReadCallback.failed(HttpConnection.java:639)
>>>>>  #011#011at org.eclipse.jetty.io
>>> .FillInterest.onFail(FillInterest.java:134)
>>>>>  #011#011at
>>>>> org.eclipse.jetty.io
>>> .AbstractEndPoint.onIdleExpired(AbstractEndPoint.java:398)
>>>>>  #011#011... 8 more
>>>>>  solr: ERROR HttpSolrCall null:org.apache.solr.common.SolrException:
>>> Error
>>>>> trying to proxy request for url:
>>>>> http://10.0.10.117:8983/solr/media/admin/ping
>>>>>  #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>>>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>>>>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>>>>> trying to proxy request for url:
>>>>> http://10.0.10.117:8983/solr/media/admin/ping
>>>>> #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>>>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>>>> #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>> #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>> #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>> #011at org.eclipse.jetty.servlet.ServletHandler.doHandle(Servl...
>>>>>  #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>>  #011at
>>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>>  #011at
>>>>> org.eclipse.jetty.io
>>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>>  #011at org.eclipse.jetty.io
>>> .FillInterest.fillable(FillInterest.java:102)
>>>>>  #011at
>>>>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>>  #011at
>>>>> 
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>  Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to
>>>>> 10.0.10.117:8983 [/10.0.10.117] failed: connect timed out
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>>>>>  #011at
>>> org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>>>>>  #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:619)
>>>> 
>>>> 
>>>> *3.40~ Cannot do updates. Everything is broke :(*
>>>> 
>>>>> #011at
>>>>> org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:664)
>>>>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:532)
>>>>> #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>> #011at
>>>>> 
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>> #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>> #011at
>>>>> 
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>> #011at
>>>>> 
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>> #011at
>>>>> 
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>> #011at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Ha...
>>>>>  solr: WARN  DistributedUpdateProcessor Error sending update to
>>>>> http://10.0.8.157:8983/solr
>>>>>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>>> Error from server at
>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35:
>>>>> Server Error
>>>>>  ERROR DistributedUpdateProcessor Setting up to try to start recovery on
>>>>> replica core_node36 with url
>>>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
>>>>> leader term
>>>>>  message repeated 2 times: []
>>>>>  request:
>>>>> 
>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>>>>>  Remote error message: java.util.concurrent.TimeoutException: Idle
>>> timeout
>>>>> expired: 120485/120000 ms
>>>>>  #011at
>>>>> 
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>>>>  #011at
>>>>> 
>>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>  solr: ERROR DistributedUpdateProcessor Setting up to try to start
>>>>> recovery on replica core_node36 with url
>>>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
>>>>> leader term
>>>>>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>>> Error from server at
>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35:
>>>>> Server Error
>>>>>  message repeated 2 times: []
>>>>>  request:
>>>>> 
>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>>>>>  Remote error message: java.util.concurrent.TimeoutException: Idle
>>> timeout
>>>>> expired: 120485/120000 ms
>>>>>  #011at
>>>>> 
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>>>>  #011at
>>>>> 
>>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>  solr: WARN  DistributedUpdateProcessor Error sending update to
>>>>> http://10.0.8.157:8983/solr
>>>>>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>>> Error from server at
>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35:
>>>>> Server Error
>>>>>  message repeated 2 times: []
>>>>>  request:
>>>>> 
>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>>>>>  Remote error message: java.util.concurrent.TimeoutException: Idle
>>> timeout
>>>>> expired: 120307/120000 ms
>>>>>  #011at
>>>>> 
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>>>>  #011at
>>>>> 
>>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>  solr: WARN  DistributedUpdateProcessor Error sending update to
>>>>> http://10.0.9.47:8983/solr
>>>>>  org.apache.http.NoHttpResponseException: 10.0.9.47:8983 failed to
>>> respond
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>>>>>  #011at
>>>>> org.apache.http.impl.io
>>> .AbstractMessageParser.parse(AbstractMessageParser.java:259)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
>>>>>  #011at
>>>>> 
>>> org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>>>>>  #011at
>>>>> 
>>> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:118)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>>>>>  #011at
>>> org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>>>>>  #011at
>>>>> 
>>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:347)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>>>>  #011at
>>>>> 
>>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>>>>  #011at
>>>>> 
>>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>>  #011at
>>>>> 
>>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>>  solr: WARN  DistributedUpdateProcessor Error sending update to
>>>>> http://10.0.8.157:8983/solr
>>>>>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>>> Error from server at
>>> http://10.0.8.157:8983/solr/media_shard2_replica_t20:
>>>>> Server Error
>>>> 
>>> 
>>> 
> 
> -- 
> Ere Maijala
> Kansalliskirjasto / The National Library of Finland
> 


Re: 7.3.1: Query of death - all nodes ran out of memory and had to be shut down

Posted by Ere Maijala <er...@helsinki.fi>.
Hi,

Just a short comment from me. It's difficult to diagnose someone else's
system, but in an application running on a similar SolrCloud we
identified deep paging as the definite reason for running out of memory,
or at least for grinding to a semi-halt because of long stop-the-world
garbage collection pauses. You can often get away without issues as long
as you only have a single shard, but for the reason Erick mentioned, deep
paging in a sharded index is a heavy operation.
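
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It is not Solr code: the function name and the rows=10 page size
are assumptions made for illustration (rows=10 comes from Erick's example
of rows 1,000,000 through 1,000,010), and it only counts candidate
entries, not bytes.

```python
def deep_paging_cost(start: int, rows: int, num_shards: int) -> dict:
    """Rough count of sorted entries a start/rows query has to track."""
    # A shard cannot know which of its documents land on the global page,
    # so it has to collect its own top (start + rows) candidates.
    per_shard = start + rows
    # The node coordinating the request then has every shard's candidate
    # list to merge before keeping only the final `rows` documents.
    at_coordinator = per_shard * num_shards
    return {"per_shard": per_shard, "at_coordinator": at_coordinator}


single = deep_paging_cost(start=1_000_000, rows=10, num_shards=1)
sharded = deep_paging_cost(start=1_000_000, rows=10, num_shards=8)
print(single)   # {'per_shard': 1000010, 'at_coordinator': 1000010}
print(sharded)  # {'per_shard': 1000010, 'at_coordinator': 8000080}
```

These counts apply to every query in flight, so a handful of concurrent
deep-paged requests multiplies them again.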

Regards,
Ere
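
For anyone finding this thread later, below is a minimal sketch of the
cursor approach Erick recommends, using Solr's cursorMark parameter. It
is only an illustration under stated assumptions: the uniqueKey field is
assumed to be called id, the base URL and filter query are placeholders,
and iter_docs / http_fetch are names made up for this example. The sort
must include the uniqueKey field as a tie-breaker, the first request
sends cursorMark=*, and paging stops when Solr returns the same
nextCursorMark that was sent.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator


def http_fetch(base_url: str, params: dict) -> dict:
    """Send one /select request and decode the JSON response."""
    url = f"{base_url}/select?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def iter_docs(
    base_url: str,
    query: str,
    fq: str,
    rows: int = 100,
    fetch: Callable[[str, dict], dict] = http_fetch,
) -> Iterator[dict]:
    """Yield every matching document by following Solr cursor marks."""
    params = {
        "q": query,
        "fq": fq,
        "rows": rows,
        "sort": "id asc",   # assumption: uniqueKey field is named "id"
        "wt": "json",
        "cursorMark": "*",  # "*" asks Solr to start a new cursor
    }
    while True:
        body = fetch(base_url, params)
        yield from body["response"]["docs"]
        next_mark = body["nextCursorMark"]
        # An unchanged cursor mark means there are no more results.
        if next_mark == params["cursorMark"]:
            return
        params["cursorMark"] = next_mark
```

A call would look something like
iter_docs("http://localhost:8983/solr/media", "*:*", "type:image"),
where the host and the filter are hypothetical. No start offset is sent;
each request only asks each shard for its next `rows` documents after the
cursor position.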

Ash Ramesh wrote on 21.8.2018 at 8.09:
> Hi Erick,
> 
> Sorry, I phrased that the wrong way. I meant to ask whether there is a high
> probability that it was the cause of the issue. Do you know why Solr itself
> isn't able to recover, or is that to be expected when allowing such deep
> pagination? We are going to be removing it going forward, but want to make
> sure that we find the root cause.
> 
> Appreciate your help as always :)
> 
> Ash
> 
> On Tue, Aug 21, 2018 at 2:59 PM Erick Erickson <er...@gmail.com>
> wrote:
> 
>> Did the large offsets _definitely_ cause the OOM? How do you expect
>> that to be answerable? It's likely though. To return rows 1,000,000
>> through 1,000,010 the system has to keep a list of 1,000,010 top
>> documents. It has to be this way because you don't know (and can't
>> guess) the score of a doc prior to, well, scoring it. And these very
>> large structures are kept for every query being processed. Not only
>> will that chew up memory, it'll chew up CPU cycles as well, as this is
>> an ordered list.
>>
>> This is an anti-pattern; cursors were invented because this pattern is
>> very costly (as you're finding out).
>>
>> Further, 4G isn't very much memory by modern standards.
>>
>> So it's very likely (but not guaranteed) that using cursors will fix
>> this problem.
>>
>> Best,
>> Erick
>>
>>
>>
>> On Mon, Aug 20, 2018 at 8:55 PM, Ash Ramesh <as...@canva.com> wrote:
>>> Hi everyone,
>>>
>>> We ran into an issue yesterday where all our EC2 machines running Solr
>>> ran out of memory and could not heal themselves. I'll try to break down
>>> what happened here.
>>>
>>> *System Architecture:*
>>>
>>> - Solr Version: 7.3.1
>>> - Replica Types: TLOG/PULL
>>> - Num Shards: 8 (default hashing mechanism)
>>> - Doc Count: > 20m
>>> - Index Size: 17G
>>> - EC2 Machine Spec: 16 Core | 32G ram | 100G SSD
>>> - Num EC2 Machines: 7+ (scales up and down)
>>> - Max Shards per node (one node per EC2 instance): 8 (some nodes had 4,
>>> some had 8)
>>> - Num TLOG shard replicas: 3 (3 copies of each shard as TLOG)
>>> - Num PULL shard replicas: 3+
>>> - Heap: 4G
>>>
>>> *What was run prior to the issue:*
>>>
>>> We ran these queries around 2.55pm
>>>
>>> We ran a bunch of deep paginated queries (offset of 1,000,000) with a
>>> filter query. We set the timeout to 5 seconds and it did time out. We
>>> aren't sure if this is what caused the irrecoverable failure, but after
>>> reading
>>> https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html#performance-problems-with-deep-paging
>>> we feel that this was the cause.
>>>
>>> We did not use a cursor.
>>>
>>> This cluster was healthy for about 1 week, but we noticed the degradation
>>> soon after (within 30 min of) running the offset queries mentioned above.
>>> We currently use a single-shard collection in production, but are
>>> transitioning to an 8-shard cluster. We hit this issue in a controlled
>>> 8-shard environment, but don't notice any issues on our production
>>> (single-shard) cluster. On production the query still timed out (with the
>>> same number of docs etc.) but didn't go into a crazy state.
>>>
>>> *What Happened:*
>>>
>>> - All the EC2 instances started logging OOM errors. None of the nodes were
>>> responsive to new requests.
>>> - We saw that the heap usage jumped from an average of 2.7G to the max of
>>> 4G within a 5-minute window.
>>> - CPU across all 16 cores was at 100%.
>>> - We saw that the distributed requests were timing out across all
>>> machines.
>>> - We shut down all the machines that only had PULL replicas on them and
>>> the cluster still didn't 'fix' itself.
>>> - Eventually we shut down Solr on the main node, which had all the master
>>> TLOG replicas. Once restarted, the machine started working again.
>>>
>>>
>>> *Questions:*
>>> - Did this deep pagination query *DEFINITELY* cause this issue?
>>> - Is each node single threaded? I don't think so, but I'd like to confirm
>>> that.
>>> - Is there any configuration that we could use to avoid this in the
>>> future?
>>> - Why could the nodes not recover by themselves? When we ran the same
>>> query on the single shard cluster it failed and didn't spin out of
>>> control.
>>>
>>> Thanks for all your help. Logs from different timestamps are pasted
>>> below.
>>>
>>> Regards,
>>> Ash
>>>
>>> *Logs:*
>>>
>>> Here are some logs we collected. Not sure if they tell us much beyond
>>> what we already know.
>>>
>>> *Time: 2.55pm ~ Requests are failing to complete in time*
>>>
>>>> ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>> this request exceeded:[
>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>     at org.apache.s...
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>>>     at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>>>     at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>     at java.base/java.lang.Thread.run(Thread.java:844)
>>>>     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>>>     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>>>     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>>>>     at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>     at java.base/java.lang.Thread.run(Thread.java:844)
>>>>   solr: ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>> this request exceeded:[
>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>   ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>> this request exceeded:[
>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>     at org.apache.s...
>>>>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>   ERROR HttpSolrCall null:org.apache.solr.common.SolrException:
>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>> this request exceeded:[
>>>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>>>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>>>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>>>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>>>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>
>>>
>>> *Time: 3.00pm ~ Seeing out of memory*
>>>
>>>> The request took too long to iterate over terms. Timeout: timeoutAt:
>>>> 440824897486723 (System.nanoTime(): 440825204861492),
>>>> TermsEnum=org.apache.lucene.codecs.blocktree.SegmentTermsEnum@3a9d2efc
>>>>   solr: ERROR RequestHandlerBase java.lang.ClassCastException:
>>>> org.apache.solr.response.BasicResultContext cannot be cast to
>>>> org.apache.solr.common.SolrDocumentList
>>>>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>>>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>>>     at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>     at java.base/java.lang.Thread.run(Thread.java:844)
>>>>   solr: ERROR HttpSolrCall null:java.lang.ClassCastException:
>>>> org.apache.solr.response.BasicResultContext cannot be cast to
>>>> org.apache.solr.common.SolrDocumentList
>>>>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>>>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>>>     at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>     at java.base/java.lang.Thread.run(Thread.java:844)
>>>
>>>
>>> *Time: 3.15pm ~ Health checks failing, nodes are unresponsive*
>>>
>>>>   WARN  DistributedUpdateProcessor Error sending update to
>>>> http://10.0.9.47:8983/solr
>>>>   Failed to connect to Solr. May not be online yet. Retrying in 0 seconds...
>>>>   ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>>>> trying to proxy request for url:
>>>> http://10.0.10.117:8983/solr/media/admin/ping
>>>>     at org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>>>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>     at o...
>>>>   ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>>>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>>>> this request exceeded
>>>>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>     at org.apache.solr....
>>>>   ERROR ErrorReportingConcurrentUpdateSolrClient error
>>>>   solr: ERROR RequestHandlerBase java.io.IOException:
>>>> java.util.concurrent.TimeoutException: Idle timeout expired: 120394/120000 ms
>>>>     at org.eclipse.jetty.server.HttpInput$ErrorState.noContent(HttpInput.java:1069)
>>>>     at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:309)
>>>>     at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:100)
>>>>     at org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:79)
>>>>     at org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:88)
>>>>     at org.apache.solr.common.util.FastInputStream.peek(FastInputStream.java:60)
>>>>     at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
>>>>     at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>>>>     at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>>>>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>>>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>>>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>>>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>>>     at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>>>     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>>>     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>>>>     at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>     at java.base/java.lang.Thread.run(Thread.java:844)
>>>>   Caused by: java.util.concurrent.TimeoutException: Idle timeout expired:
>>>> 120394/120000 ms
>>>>     at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
>>>>     at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
>>>>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
>>>>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>>     at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>>>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>     ... 1 more
>>>>     Suppressed: java.lang.Throwable: HttpInput failure
>>>>         at org.eclipse.jetty.server.HttpInput.failed(HttpInput.java:816)
>>>>         at org.eclipse.jetty.server.HttpConnection$BlockingReadCallback.failed(HttpConnection.java:639)
>>>>         at org.eclipse.jetty.io.FillInterest.onFail(FillInterest.java:134)
>>>>         at org.eclipse.jetty.io.AbstractEndPoint.onIdleExpired(AbstractEndPoint.java:398)
>>>>         ... 8 more
>>>>   solr: ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>>>> trying to proxy request for url:
>>>> http://10.0.10.117:8983/solr/media/admin/ping
>>>>     at org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>>>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>>>   ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>>>> trying to proxy request for url:
>>>> http://10.0.10.117:8983/solr/media/admin/ping
>>>>     at org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>>>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(Servl...
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>>>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>>>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>>     at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>>>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>>>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>>>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>>>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>>>     at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>>>     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>>>     at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>>>     at java.base/java.lang.Thread.run(Thread.java:844)
>>>>   Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to
>>>> 10.0.10.117:8983 [/10.0.10.117] failed: connect timed out
>>>>     at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
>>>>     at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359)
>>>>     at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
>>>>     at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
>>>>     at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>>>>     at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>>>>     at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>>>>     at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>>>>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>>>>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>>>>     at org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:619)
>>>
>>>
>>> *Time: 3.40pm ~ Cannot do updates. Everything is broken :(*
>>>
>>>>     at org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:664)
>>>>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:532)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Ha...
>>>>   solr: WARN  DistributedUpdateProcessor Error sending update to
>>>> http://10.0.8.157:8983/solr
>>>>   org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>> Error from server at http://10.0.8.157:8983/solr/media_shard6_replica_t35:
>>>> Server Error
>>>>   ERROR DistributedUpdateProcessor Setting up to try to start recovery on
>>>> replica core_node36 with url
>>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
>>>> leader term
>>>>   message repeated 2 times: []
>>>>   request:
>>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>>>>   Remote error message: java.util.concurrent.TimeoutException: Idle timeout
>>>> expired: 120485/120000 ms
>>>>     at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>>>>     at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>>>     at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>>>     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>>>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>     at java.base/java.lang.Thread.run(Thread.java:844)
>>>>   solr: ERROR DistributedUpdateProcessor Setting up to try to start
>>>> recovery on replica core_node36 with url
>>>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
>>>> leader term
>>>>   org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>> Error from server at
>> http://10.0.8.157:8983/solr/media_shard6_replica_t35:
>>>> Server Error
>>>>   message repeated 2 times: []
>>>>   request:
>>>>
>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>>>>   Remote error message: java.util.concurrent.TimeoutException: Idle
>> timeout
>>>> expired: 120485/120000 ms
>>>>   #011at
>>>>
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>>>>   #011at
>>>>
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>>>   #011at
>>>>
>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>>>   #011at
>>>>
>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>>>   #011at
>>>>
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>   #011at
>>>>
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>   #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>   solr: WARN  DistributedUpdateProcessor Error sending update to
>>>> http://10.0.8.157:8983/solr
>>>>   org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>> Error from server at
>> http://10.0.8.157:8983/solr/media_shard6_replica_t35:
>>>> Server Error
>>>>   message repeated 2 times: []
>>>>   request:
>>>>
>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>>>>   Remote error message: java.util.concurrent.TimeoutException: Idle
>> timeout
>>>> expired: 120307/120000 ms
>>>>   #011at
>>>>
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>>>>   #011at
>>>>
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>>>   #011at
>>>>
>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>>>   #011at
>>>>
>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>>>   #011at
>>>>
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>   #011at
>>>>
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>   #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>   solr: WARN  DistributedUpdateProcessor Error sending update to
>>>> http://10.0.9.47:8983/solr
>>>>   org.apache.http.NoHttpResponseException: 10.0.9.47:8983 failed to
>> respond
>>>>   #011at
>>>>
>> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
>>>>   #011at
>>>>
>> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>>>>   #011at
>>>> org.apache.http.impl.io
>> .AbstractMessageParser.parse(AbstractMessageParser.java:259)
>>>>   #011at
>>>>
>> org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>>>>   #011at
>>>>
>> org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
>>>>   #011at
>>>>
>> org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>>>>   #011at
>>>>
>> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>>>>   #011at
>>>>
>> org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:118)
>>>>   #011at
>>>>
>> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>>>>   #011at
>>>>
>> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>>>>   #011at
>> org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>>>>   #011at
>>>>
>> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>>>>   #011at
>>>>
>> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>>>>   #011at
>>>>
>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>>>>   #011at
>>>>
>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>>>>   #011at
>>>>
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:347)
>>>>   #011at
>>>>
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>>>   #011at
>>>>
>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>>>   #011at
>>>>
>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>>>   #011at
>>>>
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>>>   #011at
>>>>
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>>>   #011at java.base/java.lang.Thread.run(Thread.java:844)
>>>>   solr: WARN  DistributedUpdateProcessor Error sending update to
>>>> http://10.0.8.157:8983/solr
>>>>   org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>> Error from server at
>> http://10.0.8.157:8983/solr/media_shard2_replica_t20:
>>>> Server Error

-- 
Ere Maijala
Kansalliskirjasto / The National Library of Finland


Re: 7.3.1: Query of death - all nodes ran out of memory and had to be shut down

Posted by Ash Ramesh <as...@canva.com>.
Hi Erick,

Sorry, I phrased that the wrong way. I meant to ask whether there is a high
probability that the deep pagination was the cause of the issue. Do you
know why Solr itself wasn't able to recover, or is that to be expected when
allowing such deep pagination? We are going to remove it going
forward, but want to make sure that we find the root cause.

Appreciate your help as always :)

Ash

On Tue, Aug 21, 2018 at 2:59 PM Erick Erickson <er...@gmail.com>
wrote:

> Did the large offsets _definitely_ cause the OOM? How do you expect
> that to be answerable? It's likely though. To return rows 1,000,000
> through 1,000,010 the system has to keep a list of 1,000,010 top
> documents. It has to be this way because you don't know (and can't
> guess) the score of a doc prior to, well, scoring it. And these very
> large structures are kept for every query being processed. Not only
> will that chew up memory, it'll chew up CPU cycles as well, as this is an
> ordered list.
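
A rough back-of-the-envelope sketch of the arithmetic described above, in Python. It assumes rows=10 (the thread only states the offset of 1,000,000) and assumes that in a sharded collection every shard hands its own top (start + rows) candidates to the coordinating node for merging. The function name is made up for illustration, and the counts are ranked entries, not bytes.

```python
def deep_paging_cost(start, rows, num_shards):
    """Estimate how many ranked entries a start/rows request has to hold."""
    # A shard cannot know which of its docs fall in the global window,
    # so it has to rank and return its own top (start + rows) candidates.
    per_shard = start + rows
    # The coordinating node merges every shard's candidate list before it
    # can throw away the first `start` entries.
    merged = per_shard * num_shards
    return per_shard, merged


for shards in (1, 8):
    per_shard, merged = deep_paging_cost(start=1_000_000, rows=10, num_shards=shards)
    print(f"{shards} shard(s): {per_shard:,} per shard, {merged:,} merged per query")
```

Under these assumptions the 8-shard layout has to merge roughly eight times as many entries per query as the single-shard one, which may be part of why the same query only timed out on the single-shard production cluster.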
>
> This is an anti-pattern; cursors were invented because this pattern is
> very costly (as you're finding out).
>
> Further, 4G isn't very much memory by modern standards.
>
> So it's very likely (but not guaranteed) that using cursors will fix
> this problem.
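
For reference, a minimal sketch of what a cursor loop could look like on the client side, in Python. The `cursorMark` request parameter and the `nextCursorMark` response field are the ones from the Solr pagination guide linked in this thread. Everything else is an assumption for illustration: `fetch_page` is a hypothetical callable standing in for whatever client issues the /select request, `fake_solr` is an in-memory stand-in so the example runs without a cluster, and the uniqueKey field is assumed to be `id` (a cursor needs the sort to end with the uniqueKey as a tie-breaker).

```python
def iterate_with_cursor(fetch_page, query, sort, rows=100):
    """Yield every matching doc by following Solr-style cursor marks.

    fetch_page is a hypothetical callable: it takes a dict of request
    parameters and returns the parsed JSON response as a dict.
    """
    cursor = "*"  # "*" asks for the first page
    while True:
        params = {"q": query, "sort": sort, "rows": rows, "cursorMark": cursor}
        response = fetch_page(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged mark means no more results
            break
        cursor = next_cursor


def fake_solr(params):
    """In-memory stand-in for a /select call, so the sketch runs without Solr."""
    docs = [{"id": f"doc{i:02d}"} for i in range(25)]
    # A real cursorMark is an opaque string; this fake simply uses an offset.
    offset = 0 if params["cursorMark"] == "*" else int(params["cursorMark"])
    page = docs[offset:offset + params["rows"]]
    next_mark = str(offset + len(page)) if page else params["cursorMark"]
    return {"response": {"docs": page}, "nextCursorMark": next_mark}


all_docs = list(iterate_with_cursor(fake_solr, "*:*", "score desc, id asc", rows=10))
print(len(all_docs))
```

Each request only asks for `rows` documents after the previous mark, so no request has to rank a million-plus entries just to discard them. The trade-off is that a cursor can only walk forward page by page; it cannot jump straight to an arbitrary offset.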
>
> Best,
> Erick
>
>
>
> On Mon, Aug 20, 2018 at 8:55 PM, Ash Ramesh <as...@canva.com> wrote:
> > Hi everyone,
> >
> > We ran into an issue yesterday where all our ec2 machines, running solr,
> > ran out of memory and could not heal themselves. I'll try break down what
> > happened here.
> >
> > *System Architecture:*
> >
> > - Solr Version: 7.3.1
> > - Replica Types: TLOG/PULL
> > - Num Shards: 8 (default hashing mechanism)
> > - Doc Count: > 20m
> > - Index Size: 17G
> > - EC2 Machine Spec: 16 Core | 32G ram | 100G SSD
> > - Num EC2 Machines: 7+ (scales up and down)
> > - Max Shards per node (one node per EC2 instance): 8 (some nodes had 4,
> > some had 8)
> > - Num TLOG shard replicas: 3 (3 copies of each shard as TLOG)
> > - Num PULL shard replicas: 3+
> > - Heap: 4G
> >
> > *What was run prior to the issue:*
> >
> > We ran these queries around 2.55pm
> >
> > We ran a bunch of deep paginated queries (offset of 1,000,000) with a
> > filter query. We set the timeout to 5 seconds and it did timeout. We
> aren't
> > sure if this is what caused the irrecoverable failure, but by reading
> this
> > -
> >
> https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html#performance-problems-with-deep-paging
> > , we feel that this was the cause.
> >
> > We did not use a cursor.
> >
> > This cluster was healthy for about 1 week, but we noticed the degradation
> > soon after (within 30min) of running the offset queries mentioned above.
> We
> > currently use a single sharded collection in production, however are
> > transitioning to an 8 shard cluster. We hit this issue in a controlled 8
> > sharded environment, but don't notice any issues on our production
> (single
> > sharded) cluster. On production the query still timed out (with same num
> > docs etc.) but didn't go into a crazy state.
> >
> > *What Happened:*
> >
> > - All the EC2 instances started logging OOM error. None of the nodes were
> > responsive to new requests.
> > - We saw that the Heap usage jumped from an average of 2.7G to the max of
> > 4G within a 5 minute window.
> > - CPU across all 16 cores was at 100%
> > - We saw that the distributed requests were timing out across all
> machines.
> > - We shutdown all the machines that only had PULL replicas on them and it
> > still didn't 'fix' itself.
> > - Eventually we shut down SOLR on the main node which had all the master
> > TLOG replicas. Once restarted, the machine started working again.
> >
> >
> > *Questions:*
> > - Did this deep pagination query *DEFINITELY* cause this issue?
> > - Is each node single threaded? I don't think so, but I'd like to confirm
> > that.
> > - Is there any configuration that we could use to avoid this in the
> future?
> > - Why could the nodes not recover by themselves? When we ran the same
> query
> > on the single shard cluster it failed and didn't spin out of control.
> >
> > Thanks for all your help, Logs are pasted below from different
> timestamps.
> >
> > Regards,
> > Ash
> >
> > *Logs:*
> >
> > Here are some logs we collected. Not sure if it tells a lot outside of
> what
> > we know.
> >
> > *Time: 2.55pm ~ Requests are failing to complete in time*
> >
> >> ERROR RequestHandlerBase org.apache.solr.common.SolrException:
> >> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> >> this request exceeded:[
> >> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
> >> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
> >> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
> >> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
> >> #011at
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
> >> #011at
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> >> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> >> #011at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> >> #011at org.apache.s...
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
> >>  #011at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
> >>  #011at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
> >>  #011at
> >>
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
> >>  #011at
> >> org.eclipse.jetty.io
> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
> >>  #011at org.eclipse.jetty.io
> .FillInterest.fillable(FillInterest.java:102)
> >>  #011at
> >> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
> >>  #011at
> >>
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
> >>  #011at
> >> org.eclipse.jetty.io
> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
> >>  #011at org.eclipse.jetty.io
> .FillInterest.fillable(FillInterest.java:102)
> >>  #011at
> >> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >>  solr: ERROR RequestHandlerBase org.apache.solr.common.SolrException:
> >> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> >> this request exceeded:[
> >> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
> >> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
> >> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
> >> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
> >>  #011at
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
> >>  #011at
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> >>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> >>  ERROR RequestHandlerBase org.apache.solr.common.SolrException:
> >> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> >> this request exceeded:[
> >> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
> >> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
> >> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
> >> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
> >> #011at
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
> >> #011at
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> >> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> >> #011at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> >> #011at org.apache.s...
> >>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
> >>  #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> >>  #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> >>  #011at
> >>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
> >>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException:
> >> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> >> this request exceeded:[
> >> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
> >> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
> >> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
> >> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
> >> #011at
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
> >> #011at
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> >> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> >> #011at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> >
> >
> > *Time: 3.00pm ~ Seeing out of memory*
> >
> >> The request took too long to iterate over terms. Timeout: timeoutAt:
> >> 440824897486723 (System.nanoTime(): 440825204861492),
> >> TermsEnum=org.apache.lucene.codecs.blocktree.SegmentTermsEnum@3a9d2efc
> >>  solr: ERROR RequestHandlerBase java.lang.ClassCastException:
> >> org.apache.solr.response.BasicResultContext cannot be cast to
> >> org.apache.solr.common.SolrDocumentList
> >>  #011at
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
> >>  #011at
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> >>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> >>  #011at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> >>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
> >>  #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> >>  #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> >>  #011at
> >>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
> >>  #011at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
> >>  #011at
> >>
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
> >>  #011at
> >> org.eclipse.jetty.io
> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
> >>  #011at org.eclipse.jetty.io
> .FillInterest.fillable(FillInterest.java:102)
> >>  #011at
> >> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >>  solr: ERROR HttpSolrCall null:java.lang.ClassCastException:
> >> org.apache.solr.response.BasicResultContext cannot be cast to
> >> org.apache.solr.common.SolrDocumentList
> >>  #011at
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
> >>  #011at
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> >>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> >>  #011at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> >>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
> >>  #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> >>  #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> >>  #011at
> >>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
> >>  #011at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
> >>  #011at
> >>
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
> >>  #011at
> >> org.eclipse.jetty.io
> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
> >>  #011at org.eclipse.jetty.io
> .FillInterest.fillable(FillInterest.java:102)
> >>  #011at
> >> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >
> >
> > *3.15~ Health checks failing, Nodes are unresponsive.*
> >
> >>  WARN  DistributedUpdateProcessor Error sending update to
> >> http://10.0.9.47:8983/solr
> >>  Failed to connect to Solr. May not be online yet. Retrying in 0
> seconds...
> >>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
> >> trying to proxy request for url:
> >> http://10.0.10.117:8983/solr/media/admin/ping
> >> #011at
> >> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
> >> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
> >> #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> >> #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> >> #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> >> #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> >> #011at o...
> >>  ERROR RequestHandlerBase org.apache.solr.common.SolrException:
> >> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
> >> this request exceeded
> >> #011at
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
> >> #011at
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> >> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> >> #011at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> >> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
> >> #011at org.apache.solr....
> >> ERROR ErrorReportingConcurrentUpdateSolrClient error
> >>  solr: ERROR RequestHandlerBase java.io.IOException:
> >> java.util.concurrent.TimeoutException: Idle timeout expired:
> 120394/120000
> >> ms
> >>  #011at
> >>
> org.eclipse.jetty.server.HttpInput$ErrorState.noContent(HttpInput.java:1069)
> >>  #011at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:309)
> >>  #011at
> >>
> org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:100)
> >>  #011at
> >>
> org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:79)
> >>  #011at
> >>
> org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:88)
> >>  #011at
> >>
> org.apache.solr.common.util.FastInputStream.peek(FastInputStream.java:60)
> >>  #011at
> >>
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
> >>  #011at
> >> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
> >>  #011at
> >>
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
> >>  #011at
> >>
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> >>  #011at
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
> >>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
> >>  #011at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
> >>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
> >>  #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> >>  #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> >>  #011at
> >>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
> >>  #011at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
> >>  #011at
> >>
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
> >>  #011at
> >> org.eclipse.jetty.io
> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
> >>  #011at org.eclipse.jetty.io
> .FillInterest.fillable(FillInterest.java:102)
> >>  #011at
> >> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >>  Caused by: java.util.concurrent.TimeoutException: Idle timeout expired:
> >> 120394/120000 ms
> >>  #011at
> >> org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
> >>  #011at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
> >>  #011at
> >>
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
> >>  #011at
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >>  #011at
> >>
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> >>  #011... 1 more
> >>  #011Suppressed: java.lang.Throwable: HttpInput failure
> >>  #011#011at
> org.eclipse.jetty.server.HttpInput.failed(HttpInput.java:816)
> >>  #011#011at
> >>
> org.eclipse.jetty.server.HttpConnection$BlockingReadCallback.failed(HttpConnection.java:639)
> >>  #011#011at org.eclipse.jetty.io
> .FillInterest.onFail(FillInterest.java:134)
> >>  #011#011at
> >> org.eclipse.jetty.io
> .AbstractEndPoint.onIdleExpired(AbstractEndPoint.java:398)
> >>  #011#011... 8 more
> >>  solr: ERROR HttpSolrCall null:org.apache.solr.common.SolrException:
> Error
> >> trying to proxy request for url:
> >> http://10.0.10.117:8983/solr/media/admin/ping
> >>  #011at
> >> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
> >>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
> >>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
> >> trying to proxy request for url:
> >> http://10.0.10.117:8983/solr/media/admin/ping
> >> #011at
> >> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
> >> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
> >> #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> >> #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> >> #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> >> #011at org.eclipse.jetty.servlet.ServletHandler.doHandle(Servl...
> >>  #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> >>  #011at
> >>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
> >>  #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
> >>  #011at
> >>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at
> >>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> >>  #011at
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> >>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
> >>  #011at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
> >>  #011at
> >>
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
> >>  #011at
> >> org.eclipse.jetty.io
> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
> >>  #011at org.eclipse.jetty.io
> .FillInterest.fillable(FillInterest.java:102)
> >>  #011at
> >> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
> >>  #011at
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >>  Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to
> >> 10.0.10.117:8983 [/10.0.10.117] failed: connect timed out
> >>  #011at
> >>
> org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
> >>  #011at
> >>
> org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359)
> >>  #011at
> >>
> org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
> >>  #011at
> >>
> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
> >>  #011at
> >>
> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
> >>  #011at
> org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
> >>  #011at
> >>
> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
> >>  #011at
> >>
> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
> >>  #011at
> >>
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
> >>  #011at
> >>
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
> >>  #011at
> >> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:619)
> >
> >
> > *3.40~ Cannot do updates. Everything is broke :(*
> >
> >> #011at
> >> org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:664)
> >> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:532)
> >> #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
> >> #011at
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
> >> #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
> >> #011at
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
> >> #011at
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> >> #011at
> >>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> >> #011at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Ha...
> >>  solr: WARN  DistributedUpdateProcessor Error sending update to
> >> http://10.0.8.157:8983/solr
> >>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> >> Error from server at
> http://10.0.8.157:8983/solr/media_shard6_replica_t35:
> >> Server Error
> >>  ERROR DistributedUpdateProcessor Setting up to try to start recovery on
> >> replica core_node36 with url
> >> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
> >> leader term
> >>  message repeated 2 times: []
> >>  request:
> >>
> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
> >>  Remote error message: java.util.concurrent.TimeoutException: Idle
> timeout
> >> expired: 120485/120000 ms
> >>  #011at
> >>
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
> >>  #011at
> >>
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
> >>  #011at
> >>
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
> >>  #011at
> >>
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >>  solr: ERROR DistributedUpdateProcessor Setting up to try to start
> >> recovery on replica core_node36 with url
> >> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
> >> leader term
> >>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> >> Error from server at
> http://10.0.8.157:8983/solr/media_shard6_replica_t35:
> >> Server Error
> >>  message repeated 2 times: []
> >>  request:
> >>
> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
> >>  Remote error message: java.util.concurrent.TimeoutException: Idle
> timeout
> >> expired: 120485/120000 ms
> >>  #011at
> >>
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
> >>  #011at
> >>
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
> >>  #011at
> >>
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
> >>  #011at
> >>
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >>  solr: WARN  DistributedUpdateProcessor Error sending update to
> >> http://10.0.8.157:8983/solr
> >>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> >> Error from server at
> http://10.0.8.157:8983/solr/media_shard6_replica_t35:
> >> Server Error
> >>  message repeated 2 times: []
> >>  request:
> >>
> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
> >>  Remote error message: java.util.concurrent.TimeoutException: Idle
> timeout
> >> expired: 120307/120000 ms
> >>  #011at
> >>
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
> >>  #011at
> >>
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
> >>  #011at
> >>
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
> >>  #011at
> >>
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >>  solr: WARN  DistributedUpdateProcessor Error sending update to
> >> http://10.0.9.47:8983/solr
> >>  org.apache.http.NoHttpResponseException: 10.0.9.47:8983 failed to
> respond
> >>  #011at
> >>
> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
> >>  #011at
> >>
> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
> >>  #011at
> >> org.apache.http.impl.io
> .AbstractMessageParser.parse(AbstractMessageParser.java:259)
> >>  #011at
> >>
> org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
> >>  #011at
> >>
> org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
> >>  #011at
> >>
> org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
> >>  #011at
> >>
> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
> >>  #011at
> >>
> org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:118)
> >>  #011at
> >>
> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
> >>  #011at
> >>
> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
> >>  #011at
> org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
> >>  #011at
> >>
> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
> >>  #011at
> >>
> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
> >>  #011at
> >>
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
> >>  #011at
> >>
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
> >>  #011at
> >>
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:347)
> >>  #011at
> >>
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
> >>  #011at
> >>
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
> >>  #011at
> >>
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
> >>  #011at
> >>
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> >>  #011at java.base/java.lang.Thread.run(Thread.java:844)
> >>  solr: WARN  DistributedUpdateProcessor Error sending update to
> >> http://10.0.8.157:8983/solr
> >>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> >> Error from server at
> http://10.0.8.157:8983/solr/media_shard2_replica_t20:
> >> Server Error
> >


Re: 7.3.1: Query of death - all nodes ran out of memory and had to be shut down

Posted by Erick Erickson <er...@gmail.com>.
Did the large offsets _definitely_ cause the OOM? How do you expect
that to be answerable? It's likely though. To return rows 1,000,000
through 1,000,010 the system has to keep a list of 1,000,010 top
documents. It has to be this way because you don't know (and can't
guess) the score of a doc prior to, well, scoring it. And these very
large structures are kept for every query being processed. Not only
will that chew up memory, it'll chew up CPU cycles as well, since this
is an ordered list.
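
To make that cost concrete, here is a rough Python sketch of offset paging. It is not Solr's actual collector; the function name and the (score, doc_id) input are made up for illustration. It only shows why nothing ahead of the offset can be thrown away until every matching doc has been scored.

```python
import heapq


def page_by_offset(scored_docs, start, rows):
    """Return (page, queue_size) for the requested window, best score first.

    scored_docs: iterable of (score, doc_id) pairs in arbitrary order.
    Illustrative only: mimics a bounded top-N priority queue, not Solr's
    real implementation.
    """
    needed = start + rows   # every candidate up to the end of the page
    heap = []               # min-heap of the best `needed` docs seen so far
    for score, doc_id in scored_docs:
        if len(heap) < needed:
            heapq.heappush(heap, (score, doc_id))
        elif (score, doc_id) > heap[0]:
            # Better than the weakest doc currently kept: swap it in.
            heapq.heapreplace(heap, (score, doc_id))
    ranked = sorted(heap, reverse=True)
    return ranked[start:start + rows], len(heap)
```

With start=1,000,000 and rows=10 the queue holds 1,000,010 entries for each query in flight. As far as I understand the deep paging section of the ref guide, in a sharded collection every shard also has to hand its own top start+rows entries to the coordinating node, which may be part of why the 8 shard setup fared worse than the single shard one.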

This is an anti-pattern; cursors were invented because this pattern is
very costly (as you're finding out).

Further, 4G isn't very much memory by modern standards.

So it's very likely (but not guaranteed) that using cursors will fix
this problem.
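
A minimal sketch of the cursor loop, assuming you supply a `fetch` callable that sends the given parameters to /select and returns the parsed JSON response. That helper, the uniqueKey field name `id`, and the page size are assumptions and not from this thread. The `cursorMark` / `nextCursorMark` parameters and the rule that the sort must include the uniqueKey field as a tie-breaker are described on the pagination page of the ref guide linked in the original message.

```python
def iter_with_cursor(fetch, query, fq=None, rows=100, unique_key="id"):
    """Yield every matching doc by following Solr's cursorMark.

    fetch: hypothetical callable taking a dict of request parameters and
    returning the decoded JSON response as a dict.
    """
    params = {
        "q": query,
        "rows": rows,
        # The sort has to end with the uniqueKey field so ordering is total.
        "sort": f"score desc, {unique_key} asc",
        # "*" asks for the first page; no start offset is sent with a cursor.
        "cursorMark": "*",
    }
    if fq:
        params["fq"] = fq
    while True:
        response = fetch(dict(params))
        for doc in response["response"]["docs"]:
            yield doc
        next_mark = response["nextCursorMark"]
        if next_mark == params["cursorMark"]:
            break  # cursor did not advance: no more results
        params["cursorMark"] = next_mark
```

Each request then only needs a queue of `rows` entries, however deep into the result set the cursor is.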

Best,
Erick



On Mon, Aug 20, 2018 at 8:55 PM, Ash Ramesh <as...@canva.com> wrote:
> Hi everyone,
>
> We ran into an issue yesterday where all our ec2 machines, running solr,
> ran out of memory and could not heal themselves. I'll try break down what
> happened here.
>
> *System Architecture:*
>
> - Solr Version: 7.3.1
> - Replica Types: TLOG/PULL
> - Num Shards: 8 (default hashing mechanism)
> - Doc Count: > 20m
> - Index Size: 17G
> - EC2 Machine Spec: 16 Core | 32G ram | 100G SSD
> - Num EC2 Machines: 7+ (scales up and down)
> - Max Shards per node (one node per EC2 instance): 8 (some nodes had 4,
> some had 8)
> - Num TLOG shard replicas: 3 (3 copies of each shard as TLOG)
> - Num PULL shard replicas: 3+
> - Heap: 4G
>
> *What was run prior to the issue:*
>
> We ran these queries around 2.55pm
>
> We ran a bunch of deep paginated queries (offset of 1,000,000) with a
> filter query. We set the timeout to 5 seconds and it did timeout. We aren't
> sure if this is what caused the irrecoverable failure, but by reading this
> -
> https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html#performance-problems-with-deep-paging
> , we feel that this was the cause.
>
> We did not use a cursor.
>
> This cluster was healthy for about 1 week, but we noticed the degradation
> soon after (within 30min) of running the offset queries mentioned above. We
> currently use a single sharded collection in production, however are
> transitioning to an 8 shard cluster. We hit this issue in a controlled 8
> sharded environment, but don't notice any issues on our production (single
> sharded) cluster. On production the query still timed out (with same num
> docs etc.) but didn't go into a crazy state.
>
> *What Happened:*
>
> - All the EC2 instances started logging OOM error. None of the nodes were
> responsive to new requests.
> - We saw that the Heap usage jumped from an average of 2.7G to the max of
> 4G within a 5 minute window.
> - CPU across all 16 cores was at 100%
> - We saw that the distributed requests were timing out across all machines.
> - We shutdown all the machines that only had PULL replicas on them and it
> still didn't 'fix' itself.
> - Eventually we shut down SOLR on the main node which had all the master
> TLOG replicas. Once restarted, the machine started working again.
>
>
> *Questions:*
> - Did this deep pagination query *DEFINITELY* cause this issue?
> - Is each node single threaded? I don't think so, but I'd like to confirm
> that.
> - Is there any configuration that we could use to avoid this in the future?
> - Why could the nodes not recover by themselves? When we ran the same query
> on the single shard cluster it failed and didn't spin out of control.
>
> Thanks for all your help, Logs are pasted below from different timestamps.
>
> Regards,
> Ash
>
> *Logs:*
>
> Here are some logs we collected. Not sure if it tells a lot outside of what
> we know.
>
> *Time: 2.55pm ~ Requests are failing to complete in time*
>
>> ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>> this request exceeded:[
>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>> #011at
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>> #011at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>> #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>> #011at org.apache.s...
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>  #011at
>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>  #011at
>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>  #011at
>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>  #011at
>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>  #011at
>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>  #011at
>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>  #011at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>  #011at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>  #011at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>>  #011at
>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>  solr: ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>> this request exceeded:[
>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>>  #011at
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>>  #011at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>  ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>> this request exceeded:[
>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>> #011at
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>> #011at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>> #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>> #011at org.apache.s...
>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>  #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>  #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>  #011at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException:
>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>> this request exceeded:[
>> http://10.0.9.204:8983/solr/media_shard1_replica_p57,
>> http://10.0.9.204:8983/solr/media_shard4_replica_p80,
>> http://10.0.9.204:8983/solr/media_shard3_replica_p73,
>> http://10.0.9.204:8983/solr/media_shard2_replica_p68]
>> #011at
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>> #011at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>> #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>
>
> *Time: 3.00pm ~ Seeing out of memory*
>
>> The request took too long to iterate over terms. Timeout: timeoutAt:
>> 440824897486723 (System.nanoTime(): 440825204861492),
>> TermsEnum=org.apache.lucene.codecs.blocktree.SegmentTermsEnum@3a9d2efc
>>  solr: ERROR RequestHandlerBase java.lang.ClassCastException:
>> org.apache.solr.response.BasicResultContext cannot be cast to
>> org.apache.solr.common.SolrDocumentList
>>  #011at
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>>  #011at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>  #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>  #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>  #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>  #011at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>  #011at
>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>  #011at
>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>  #011at
>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>  solr: ERROR HttpSolrCall null:java.lang.ClassCastException:
>> org.apache.solr.response.BasicResultContext cannot be cast to
>> org.apache.solr.common.SolrDocumentList
>>  #011at
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:315)
>>  #011at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>  #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>  #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>  #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>  #011at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>  #011at
>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>  #011at
>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>  #011at
>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>
>
> *3.15~ Health checks failing, Nodes are unresponsive.*
>
>>  WARN  DistributedUpdateProcessor Error sending update to
>> http://10.0.9.47:8983/solr
>>  Failed to connect to Solr. May not be online yet. Retrying in 0 seconds...
>>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>> trying to proxy request for url:
>> http://10.0.10.117:8983/solr/media/admin/ping
>> #011at
>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>> #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>> #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>> #011at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>> #011at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>> #011at o...
>>  ERROR RequestHandlerBase org.apache.solr.common.SolrException:
>> org.apache.solr.client.solrj.SolrServerException: Time allowed to handle
>> this request exceeded
>> #011at
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:410)
>> #011at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>> #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>> #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>> #011at org.apache.solr....
>>  ERROR ErrorReportingConcurrentUpdateSolrClient error
>>  solr: ERROR RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException: Idle timeout expired: 120394/120000
>> ms
>>  #011at
>> org.eclipse.jetty.server.HttpInput$ErrorState.noContent(HttpInput.java:1069)
>>  #011at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:309)
>>  #011at
>> org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:100)
>>  #011at
>> org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:79)
>>  #011at
>> org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:88)
>>  #011at
>> org.apache.solr.common.util.FastInputStream.peek(FastInputStream.java:60)
>>  #011at
>> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
>>  #011at
>> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>>  #011at
>> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>>  #011at
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>>  #011at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>>  #011at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>>  #011at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>>  #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>>  #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>  #011at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>  #011at
>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>  #011at
>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>  #011at
>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>  #011at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>  #011at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>  #011at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>>  #011at
>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>  Caused by: java.util.concurrent.TimeoutException: Idle timeout expired:
>> 120394/120000 ms
>>  #011at
>> org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166)
>>  #011at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50)
>>  #011at
>> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
>>  #011at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>  #011at
>> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>  #011... 1 more
>>  #011Suppressed: java.lang.Throwable: HttpInput failure
>>  #011#011at org.eclipse.jetty.server.HttpInput.failed(HttpInput.java:816)
>>  #011#011at
>> org.eclipse.jetty.server.HttpConnection$BlockingReadCallback.failed(HttpConnection.java:639)
>>  #011#011at org.eclipse.jetty.io.FillInterest.onFail(FillInterest.java:134)
>>  #011#011at
>> org.eclipse.jetty.io.AbstractEndPoint.onIdleExpired(AbstractEndPoint.java:398)
>>  #011#011... 8 more
>>  solr: ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>> trying to proxy request for url:
>> http://10.0.10.117:8983/solr/media/admin/ping
>>  #011at
>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>>  #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>>  ERROR HttpSolrCall null:org.apache.solr.common.SolrException: Error
>> trying to proxy request for url:
>> http://10.0.10.117:8983/solr/media/admin/ping
>> #011at
>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:647)
>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:501)
>> #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>> #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>> #011at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>> #011at org.eclipse.jetty.servlet.ServletHandler.doHandle(Servl...
>>  #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>  #011at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>>  #011at
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>  #011at
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>>  #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>>  #011at
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at
>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>  #011at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>  #011at org.eclipse.jetty.server.Server.handle(Server.java:530)
>>  #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>>  #011at
>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>>  #011at
>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>>  #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>>  #011at
>> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>>  #011at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>>  #011at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>>  #011at
>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>  Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to
>> 10.0.10.117:8983 [/10.0.10.117] failed: connect timed out
>>  #011at
>> org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
>>  #011at
>> org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359)
>>  #011at
>> org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
>>  #011at
>> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
>>  #011at
>> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>>  #011at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>>  #011at
>> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>>  #011at
>> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>>  #011at
>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>>  #011at
>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>>  #011at
>> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:619)
>
>
> *3.40~ Cannot do updates. Everything is broke :(*
>
>> #011at
>> org.apache.solr.servlet.HttpSolrCall.sendError(HttpSolrCall.java:664)
>> #011at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:532)
>> #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>> #011at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>> #011at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>> #011at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>> #011at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>> #011at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>> #011at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Ha...
>>  solr: WARN  DistributedUpdateProcessor Error sending update to
>> http://10.0.8.157:8983/solr
>>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://10.0.8.157:8983/solr/media_shard6_replica_t35:
>> Server Error
>>  ERROR DistributedUpdateProcessor Setting up to try to start recovery on
>> replica core_node36 with url
>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
>> leader term
>>  message repeated 2 times: []
>>  request:
>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>>  Remote error message: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120485/120000 ms
>>  #011at
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>>  #011at
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>  #011at
>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>  #011at
>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>  solr: ERROR DistributedUpdateProcessor Setting up to try to start
>> recovery on replica core_node36 with url
>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/ by increasing
>> leader term
>>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://10.0.8.157:8983/solr/media_shard6_replica_t35:
>> Server Error
>>  message repeated 2 times: []
>>  request:
>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>>  Remote error message: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120485/120000 ms
>>  #011at
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>>  #011at
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>  #011at
>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>  #011at
>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>  solr: WARN  DistributedUpdateProcessor Error sending update to
>> http://10.0.8.157:8983/solr
>>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://10.0.8.157:8983/solr/media_shard6_replica_t35:
>> Server Error
>>  message repeated 2 times: []
>>  request:
>> http://10.0.8.157:8983/solr/media_shard6_replica_t35/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.0.10.117%3A8983%2Fsolr%2Fmedia_shard6_replica_t10%2F&wt=javabin&version=2
>>  Remote error message: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120307/120000 ms
>>  #011at
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:383)
>>  #011at
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>  #011at
>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>  #011at
>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>  solr: WARN  DistributedUpdateProcessor Error sending update to
>> http://10.0.9.47:8983/solr
>>  org.apache.http.NoHttpResponseException: 10.0.9.47:8983 failed to respond
>>  #011at
>> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:141)
>>  #011at
>> org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>>  #011at
>> org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
>>  #011at
>> org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>>  #011at
>> org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
>>  #011at
>> org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>>  #011at
>> org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>>  #011at
>> org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:118)
>>  #011at
>> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>>  #011at
>> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
>>  #011at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>>  #011at
>> org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
>>  #011at
>> org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>>  #011at
>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>>  #011at
>> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>>  #011at
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:347)
>>  #011at
>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:182)
>>  #011at
>> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>>  #011at
>> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
>>  #011at
>> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>>  #011at java.base/java.lang.Thread.run(Thread.java:844)
>>  solr: WARN  DistributedUpdateProcessor Error sending update to
>> http://10.0.8.157:8983/solr
>>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://10.0.8.157:8983/solr/media_shard2_replica_t20:
>> Server Error
>


Re: 7.3.1: Query of death - all nodes ran out of memory and had to be shut down

Posted by Ash Ramesh <as...@canva.com>.
Thank you all :) We have made the necessary changes to mitigate this issue.

On Wed, Aug 22, 2018 at 6:01 AM Shawn Heisey <ap...@elyograg.org> wrote:

> On 8/20/2018 9:55 PM, Ash Ramesh wrote:
> > We ran a bunch of deep paginated queries (offset of 1,000,000) with a
> > filter query. We set the timeout to 5 seconds and it did timeout. We aren't
> > sure if this is what caused the irrecoverable failure, but by reading this
> > -
> > https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html#performance-problems-with-deep-paging
> > , we feel that this was the cause.
>
> Yes, this is most likely the cause.
>
> Since you have eight shards, the problem is even worse than Erick
> described.  Those 1000010 results will be returned by EVERY shard, and
> consolidated on the node that coordinates the query.  So it
> will have roughly eight million results in memory that it must sort.
>
> Unless you're running on Windows, the bin/solr script will configure
> Java to kill itself when OutOfMemoryError occurs.  It does this because
> program behavior after OOME occurs is completely unpredictable, so
> there's a good chance that if it keeps running, it will corrupt the index.
>
> If you're going to be doing queries like this, you need a larger heap.
> There's no way around that.
>
> Thanks,
> Shawn
>
>


Re: 7.3.1: Query of death - all nodes ran out of memory and had to be shut down

Posted by Shawn Heisey <ap...@elyograg.org>.
On 8/20/2018 9:55 PM, Ash Ramesh wrote:
> We ran a bunch of deep paginated queries (offset of 1,000,000) with a
> filter query. We set the timeout to 5 seconds and it did timeout. We aren't
> sure if this is what caused the irrecoverable failure, but by reading this
> -
> https://lucene.apache.org/solr/guide/7_4/pagination-of-results.html#performance-problems-with-deep-paging
> , we feel that this was the cause.

Yes, this is most likely the cause.

Since you have eight shards, the problem is even worse than Erick 
described.  Those 1000010 results will be returned by EVERY shard, and 
consolidated on the node that coordinates the query.  So it 
will have roughly eight million results in memory that it must sort.
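
As a rough illustration of that arithmetic, here is a minimal sketch. It 
assumes each shard hands its top start + rows entries to the coordinating 
node, and that rows=10 (inferred from the 1000010 figure). The function 
name is made up for this example and is not part of Solr.

```python
def coordinator_candidates(num_shards: int, start: int, rows: int) -> int:
    """Entries the coordinating node must merge for one start/rows request.

    Assumes every shard returns its own top (start + rows) entries,
    because any of them could end up on the requested page.
    """
    return num_shards * (start + rows)


# A 3-shard collection.
print(coordinator_candidates(3, 1_000_000, 10))  # 3000030
# An 8-shard collection, the shard count listed in the original post.
print(coordinator_candidates(8, 1_000_000, 10))  # 8000080
```

The count grows linearly with both the offset and the number of shards, 
which is why the same query can be survivable on a single-shard 
collection and much heavier on a sharded one.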

Unless you're running on Windows, the bin/solr script will configure 
Java to kill itself when OutOfMemoryError occurs.  It does this because 
program behavior after OOME occurs is completely unpredictable, so 
there's a good chance that if it keeps running, it will corrupt the index.

If you're going to be doing queries like this, you need a larger heap.  
There's no way around that.
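
If the goal is to walk through a large result set in order, rather than 
jump to an arbitrary offset, the cursor feature described in the 
pagination guide linked above avoids the large start value. The sketch 
below is only an outline under assumptions: `fetch` is a hypothetical 
helper that sends the parameters to a /select handler and returns the 
parsed JSON response, and the sort string must include the collection's 
uniqueKey field.

```python
from typing import Callable, Dict, Iterator, List


def iter_with_cursor(
    fetch: Callable[[Dict[str, str]], dict],
    query: str,
    sort: str,
    rows: int = 100,
) -> Iterator[List[dict]]:
    """Yield pages of documents using Solr's cursorMark parameter.

    `fetch` is a hypothetical caller-supplied function, not a Solr API.
    `sort` must include the uniqueKey field as a tiebreaker, e.g. "id asc".
    """
    cursor = "*"  # "*" asks Solr for the first page
    while True:
        params = {
            "q": query,
            "sort": sort,
            "rows": str(rows),
            "cursorMark": cursor,
            "wt": "json",
        }
        response = fetch(params)
        docs = response["response"]["docs"]
        if docs:
            yield docs
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            break
        cursor = next_cursor
```

With the `requests` library, `fetch` could be something like 
`lambda p: requests.get(select_url, params=p).json()`, where `select_url` 
is a placeholder for your collection's /select endpoint. Each request 
then only asks every shard for `rows` entries past the cursor position 
instead of start + rows.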

Thanks,
Shawn