You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Tom Peters <tp...@synacor.com> on 2018/02/23 04:11:47 UTC

Indexing timeout issues with SolrCloud 7.1

I'm trying to debug why indexing in SolrCloud 7.1 is having so many issues. It will hang most of the time, and timeout the rest.

Here's an example:

    time curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d '{"solr_id":"test_001", "data_type":"test"}'|jq .
    {
      "responseHeader": {
        "status": 0,
        "QTime": 5004
      }
    }
    curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d   0.00s user 0.00s system 0% cpu 5.025 total
    jq .  0.01s user 0.00s system 0% cpu 5.025 total

Here's some of the timeout errors I'm seeing:

    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: Index fetch failed :
    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.c.RecoveryStrategy Error while trying to recover:org.apache.solr.common.SolrException: Replication for recovery failed.


We currently have two separate Solr clusters. Our current in-production cluster which runs on Solr 3.4 and a new ring that I'm trying to bring up which runs on SolrCloud 7.1. I have the exact same code that is indexing to both clusters. The Solr 3.4 indexes fine, but I'm running into lots of issues with SolrCloud 7.1.


Some additional details about the setup:

* 5 nodes solr2-a through solr2-e.
* 5 replicas
* 1 shard
* The servers have 48G of RAM with -Xmx and -Xms set to 16G
* I currently have soft commits at 10m intervals and hard commits (with openSearcher=false) at 1m intervals. I also tried 5m (soft) and 15s (hard) as well.

Any help or pointers would be greatly appreciated. Thanks!


This message and any attachment may contain information that is confidential and/or proprietary. Any use, disclosure, copying, storing, or distribution of this e-mail or any attached file by anyone other than the intended recipient is strictly prohibited. If you have received this message in error, please notify the sender by reply email and delete the message and any attachments. Thank you.

Re: Indexing timeout issues with SolrCloud 7.1

Posted by Rick Leir <rl...@leirtech.com>.
Tom
I think you are saying that all updates fail? Need to do a bit of troubleshooting. How about queries? What else is in the logs?
Cheers -- Rick
-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: Indexing timeout issues with SolrCloud 7.1

Posted by Shawn Heisey <ap...@elyograg.org>.
On 2/23/2018 2:40 PM, Tom Peters wrote:
> I included the last 25 lines from the logs from each of the five nodes during that time period.

Did you perhaps grep for "ERROR" and include the last 25 lines of that? 
This excludes the vast majority of the information from the error,
information that be very important for troubleshooting.  Each error is
potentially dozens of lines long, and only the first line will have
"ERROR" on it.

> I _think_ I'm running into issues with bulking up deleteByQuery. Quick background: we have objects in our system that may have multiple availability windows. So when we index an object, will store it as separate documents each with their own begins and expires date. At index time we don't know if the all of the windows are still valid or not, so we remove all of them with a deleteByQuery (e.g. deleteByQuery=object_id:12345) and then index one or more documents.

The deleteByQuery operation does not coexist well with other
simultaneous operations.  I am not familiar with the details of WHY this
is the case, I only know that it IS the case.  I had some big problems
when I first started out.  Ultimately, I ended up making sure that if a
background optimize operation was underway, that I skipped the update
process, which included deletes.

I strongly recommend replacing deleteByQuery with a two step process
that sends the query as a standard query, retrieves all the uniqueKey
values from that query, and then issues deleteById calls.  This should
result in much smoother operation that can coexist with other
operations, merging in particular.

> I ran an isolated test a number of times where I indexed 1500 documents in this manner (deletes then index). In Solr 3.4, it takes about 15s to complete. In Solr 7.1, it's taking about 5m. If I remove the deleteByQuery, the indexing times are nearly identical.

SolrCloud did not exist in Solr version 3.4.  It was introduced in
version 4.0.  Did you mean version 4.3?  If you really did mean 3.4,
then a direct comparison with SolrCloud on 7.1 is not possible. 
SolrCloud works a little differently.

Thanks,
Shawn


Re: Indexing timeout issues with SolrCloud 7.1

Posted by Tom Peters <tp...@synacor.com>.
Thanks Erick. I found an older mailing list thread online where someone had similar issues to what I was experiencing (http://lucene.472066.n3.nabble.com/SolrCloud-delete-by-query-performance-td4206726.html <http://lucene.472066.n3.nabble.com/SolrCloud-delete-by-query-performance-td4206726.html>).

I decided to try and rewrite our indexing code to use delete by ID as opposed to delete by query (we deployed it today) and it appears to have significantly improved the indexing performance and reliability of the replicas.



> On Feb 26, 2018, at 12:08 AM, Erick Erickson <er...@gmail.com> wrote:
> 
> DBQ is something of a heavyweight action. Basically in order to
> preserve ordering it has to lock out updates while it executes since
> all docs (which may live on all shards) have to be deleted before
> subsequent adds of one of the affected docs is processed. In order to
> do that, things need to be locked.
> 
> Delete-by-id OTOH, can use the normal optimistic locking to insure
> proper ordering. So if object_id is your <uniqueKey>, this may be much
> more robust if you delete-by-id
> 
> Best,
> Erick
> 
> On Sat, Feb 24, 2018 at 1:37 AM, Deepak Goel <de...@gmail.com> wrote:
>> From the error list, i can see multiple errors:
>> 
>> 1. Failure to recover replica
>> 2. Peer sync error
>> 3. Failure to download file
>> 
>> On 24 Feb 2018 03:10, "Tom Peters" <tp...@synacor.com> wrote:
>> 
>> I included the last 25 lines from the logs from each of the five nodes
>> during that time period.
>> 
>> I _think_ I'm running into issues with bulking up deleteByQuery. Quick
>> background: we have objects in our system that may have multiple
>> availability windows. So when we index an object, will store it as separate
>> documents each with their own begins and expires date. At index time we
>> don't know if the all of the windows are still valid or not, so we remove
>> all of them with a deleteByQuery (e.g. deleteByQuery=object_id:12345) and
>> then index one or more documents.
>> 
>> I ran an isolated test a number of times where I indexed 1500 documents in
>> this manner (deletes then index). In Solr 3.4, it takes about 15s to
>> complete. In Solr 7.1, it's taking about 5m. If I remove the deleteByQuery,
>> the indexing times are nearly identical.
>> 
>> When run in normal production mode where we have lots of processes indexing
>> at once (~20 or so), it starts to cause lots of issues (which you see
>> below).
>> 
>> 
>> Please let me know if anything I mentioned is unclear. Thanks!
>> 
>> 
>> 
>> 
>> solr2-a:
>> 2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2672-
>> processing-http:////solr2-b:8080//solr//mycollection_shard1_replica_n1
>> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
>> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
>> ErrorReportingConcurrentUpdateSolrClient error
>> 2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2692-
>> processing-http:////solr2-d:8080//solr//mycollection_shard1_replica_n11
>> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
>> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
>> ErrorReportingConcurrentUpdateSolrClient error
>> 2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2711-
>> processing-http:////solr2-e:8080//solr//mycollection_shard1_replica_n4
>> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
>> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
>> ErrorReportingConcurrentUpdateSolrClient error
>> 2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-b:8080/solr/
>> mycollection_shard1_replica_n1/
>> 2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-d:8080/solr/
>> mycollection_shard1_replica_n11/
>> 2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-e:8080/solr/
>> mycollection_shard1_replica_n4/
>> 2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2712-
>> processing-http:////solr2-e:8080//solr//mycollection_shard1_replica_n4
>> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
>> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
>> ErrorReportingConcurrentUpdateSolrClient error
>> 2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2726-
>> processing-http:////solr2-d:8080//solr//mycollection_shard1_replica_n11
>> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
>> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
>> ErrorReportingConcurrentUpdateSolrClient error
>> 2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2727-
>> processing-http:////solr2-b:8080//solr//mycollection_shard1_replica_n1
>> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
>> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
>> ErrorReportingConcurrentUpdateSolrClient error
>> 2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-b:8080/solr/
>> mycollection_shard1_replica_n1/
>> 2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-d:8080/solr/
>> mycollection_shard1_replica_n11/
>> 2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-e:8080/solr/
>> mycollection_shard1_replica_n4/
>> 2018-02-23 04:09:50.048 ERROR (qtp1595212853-11840) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-d:8080/solr/
>> mycollection_shard1_replica_n11/
>> 2018-02-23 04:09:50.048 ERROR (qtp1595212853-11840) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-e:8080/solr/
>> mycollection_shard1_replica_n4/
>> 2018-02-23 04:09:50.142 ERROR (qtp1595212853-33675) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-b:8080/solr/
>> mycollection_shard1_replica_n1/
>> 2018-02-23 04:09:50.143 ERROR (qtp1595212853-33675) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-d:8080/solr/
>> mycollection_shard1_replica_n11/
>> 2018-02-23 04:09:50.143 ERROR (qtp1595212853-33675) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.u.p.DistributedUpdateProcessor
>> Setting up to try to start recovery on replica http://solr2-e:8080/solr/
>> mycollection_shard1_replica_n4/
>> 2018-02-23 07:54:01.404 ERROR (qtp1595212853-33424) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
>> 2018-02-23 07:54:01.405 ERROR (qtp1595212853-33424) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
>> 2018-02-23 12:24:12.389 ERROR (qtp1595212853-33906) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
>> 2018-02-23 12:24:12.390 ERROR (qtp1595212853-33906) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
>> 2018-02-23 16:27:52.030 ERROR (qtp1595212853-32216) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
>> 2018-02-23 16:27:52.030 ERROR (qtp1595212853-32216) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
>> 2018-02-23 16:30:12.276 ERROR (qtp1595212853-33424) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
>> 2018-02-23 16:30:12.276 ERROR (qtp1595212853-33424) [c:mycollection
>> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
>> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
>> 
>> solr2-b:
>> 2018-02-23 04:08:24.260 ERROR (qtp1595212853-22127) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:08:24.260 ERROR (qtp1595212853-22127) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:08:24.260 ERROR (qtp1595212853-22168) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:08:43.167 ERROR
>> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
>> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
>> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.u.PeerSync PeerSync: core=mycollection_shard1_replica_n1 url=
>> http://solr2-b.vam.be.cmh.mycollection.com:8080/solr  Requested 2 updates
>> from http://solr2-a.vam.be.cmh.mycollection.com:8080/solr/
>> mycollection_shard1_replica_n6/ but retrieved 1
>> 2018-02-23 04:09:00.925 ERROR (qtp1595212853-20968) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120001/120000 ms
>> 2018-02-23 04:09:00.925 ERROR (qtp1595212853-20968) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120001/120000 ms
>> 2018-02-23 04:09:00.941 ERROR (qtp1595212853-22123) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:00.941 ERROR (qtp1595212853-22123) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:01.023 ERROR (qtp1595212853-20597) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:01.024 ERROR (qtp1595212853-20597) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:04.793 ERROR (qtp1595212853-15597) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:04.793 ERROR (qtp1595212853-15597) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:04.964 ERROR (qtp1595212853-22126) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:04.965 ERROR (qtp1595212853-22126) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:13.809 ERROR (qtp1595212853-14929) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:13.825 ERROR (qtp1595212853-14929) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:17.377 ERROR (qtp1595212853-4469) [c:mycollection s:shard1
>> r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120001/120000 ms
>> 2018-02-23 04:09:17.377 ERROR (qtp1595212853-4469) [c:mycollection s:shard1
>> r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120001/120000 ms
>> 2018-02-23 04:09:28.821 ERROR (qtp1595212853-22121) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:28.822 ERROR (qtp1595212853-22121) [c:mycollection
>> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:10:24.643 ERROR
>> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
>> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
>> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.IndexFetcher Error fetching file, doing one
>> retry...:org.apache.solr.common.SolrException: Unable to download
>> tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
>> 2018-02-23 04:10:24.645 ERROR
>> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
>> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
>> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.
>> 1593163207942340608
>> 2018-02-23 04:10:26.466 ERROR
>> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
>> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
>> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.c.RecoveryStrategy Error while trying to
>> recover:org.apache.solr.common.SolrException:
>> Replication for recovery failed.
>> 2018-02-23 04:10:26.466 ERROR
>> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
>> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
>> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
>> 2018-02-23 04:10:26.466 ERROR
>> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
>> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
>> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
>> o.a.s.h.ReplicationHandler Index fetch failed
>> :org.apache.solr.common.SolrException:
>> Unable to download tlog.0000000000000046782.1593163207942340608 completely.
>> Downloaded 0!=87454
>> 
>> solr2-c:
>> 2018-02-23 01:58:43.421 ERROR (updateExecutor-2-thread-63-processing-http://
>> //solr2-a:8080//solr//mycollection_shard1_replica_n6
>> x:mycollection_shard1_replica_n8 r:core_node10 n:solr2-c.vam.be.cmh.
>> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
>> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.u.
>> ErrorReportingConcurrentUpdateSolrClient error
>> 2018-02-23 01:58:43.423 ERROR (qtp1595212853-16434) [c:mycollection
>> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
>> o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.
>> DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
>> exception during distributed update: Read timed out
>> 2018-02-23 01:58:43.424 ERROR (qtp1595212853-16434) [c:mycollection
>> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
>> o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.
>> DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
>> exception during distributed update: Read timed out
>> 2018-02-23 02:53:00.174 ERROR (updateExecutor-2-thread-107-
>> processing-http:////solr2-a:8080//solr//mycollection_shard1_replica_n6
>> x:mycollection_shard1_replica_n8 r:core_node10 n:solr2-c.vam.be.cmh.
>> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
>> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.u.
>> ErrorReportingConcurrentUpdateSolrClient error
>> 2018-02-23 02:53:00.175 ERROR (qtp1595212853-16483) [c:mycollection
>> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
>> o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.
>> DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
>> exception during distributed update: Read timed out
>> 2018-02-23 02:53:00.176 ERROR (qtp1595212853-16483) [c:mycollection
>> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
>> o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.
>> DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
>> exception during distributed update: Read timed out
>> 
>> solr2-d:
>> 2018-02-23 04:08:24.258 ERROR (qtp1595212853-2104) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:08:24.258 ERROR (qtp1595212853-2104) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:08:24.259 ERROR (qtp1595212853-3603) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:08:24.259 ERROR (qtp1595212853-3603) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:00.923 ERROR (qtp1595212853-959) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:09:00.923 ERROR (qtp1595212853-959) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:00.939 ERROR (qtp1595212853-2742) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:09:00.939 ERROR (qtp1595212853-2742) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:01.021 ERROR (qtp1595212853-3532) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120001/120000 ms
>> 2018-02-23 04:09:01.021 ERROR (qtp1595212853-3532) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120001/120000 ms
>> 2018-02-23 04:09:01.317 ERROR (qtp1595212853-958) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:09:01.317 ERROR (qtp1595212853-958) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:01.484 ERROR (qtp1595212853-986) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120001/120000 ms
>> 2018-02-23 04:09:01.484 ERROR (qtp1595212853-986) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120001/120000 ms
>> 2018-02-23 04:09:13.755 ERROR (qtp1595212853-3629) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120001/120000 ms
>> 2018-02-23 04:09:13.755 ERROR (qtp1595212853-3629) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120001/120000 ms
>> 2018-02-23 04:09:17.658 ERROR (qtp1595212853-3631) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:09:17.658 ERROR (qtp1595212853-3631) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:28.786 ERROR (qtp1595212853-979) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120001/120000 ms
>> 2018-02-23 04:09:28.786 ERROR (qtp1595212853-979) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120001/120000 ms
>> 2018-02-23 04:10:23.260 ERROR
>> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
>> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
>> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
>> o.a.s.h.IndexFetcher Error fetching file, doing one
>> retry...:org.apache.solr.common.SolrException: Unable to download
>> tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
>> 2018-02-23 04:10:23.262 ERROR
>> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
>> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
>> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
>> o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.
>> 1593163207942340608
>> 2018-02-23 04:10:23.887 ERROR
>> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
>> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
>> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
>> o.a.s.c.RecoveryStrategy Error while trying to
>> recover:org.apache.solr.common.SolrException:
>> Replication for recovery failed.
>> 2018-02-23 04:10:23.887 ERROR
>> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
>> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
>> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
>> o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
>> 2018-02-23 04:10:23.887 ERROR
>> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
>> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
>> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
>> o.a.s.h.ReplicationHandler Index fetch failed
>> :org.apache.solr.common.SolrException:
>> Unable to download tlog.0000000000000046782.1593163207942340608 completely.
>> Downloaded 0!=87454
>> 
>> solr2-e:
>> 2018-02-23 04:09:00.926 ERROR (qtp1595212853-15314) [c:mycollection
>> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:00.942 ERROR (qtp1595212853-4289) [c:mycollection s:shard1
>> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:09:00.943 ERROR (qtp1595212853-4289) [c:mycollection s:shard1
>> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:01.024 ERROR (qtp1595212853-15308) [c:mycollection
>> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:01.024 ERROR (qtp1595212853-15308) [c:mycollection
>> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:01.328 ERROR (qtp1595212853-14946) [c:mycollection
>> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:01.328 ERROR (qtp1595212853-14946) [c:mycollection
>> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:01.486 ERROR (qtp1595212853-7288) [c:mycollection s:shard1
>> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:09:01.486 ERROR (qtp1595212853-7288) [c:mycollection s:shard1
>> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:13.758 ERROR (qtp1595212853-9090) [c:mycollection s:shard1
>> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:09:13.758 ERROR (qtp1595212853-9090) [c:mycollection s:shard1
>> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:17.377 ERROR (qtp1595212853-9021) [c:mycollection s:shard1
>> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
>> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
>> expired: 120000/120000 ms
>> 2018-02-23 04:09:17.378 ERROR (qtp1595212853-9021) [c:mycollection s:shard1
>> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
>> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
>> timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:29.804 ERROR (qtp1595212853-16156) [c:mycollection
>> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:09:29.827 ERROR (qtp1595212853-16156) [c:mycollection
>> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>> 2018-02-23 04:10:24.945 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.h.IndexFetcher Error fetching file, doing one
>> retry...:org.apache.solr.common.SolrException: Unable to download
>> tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
>> 2018-02-23 04:10:24.946 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.
>> 1593163207942340608
>> 2018-02-23 04:10:26.278 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.c.RecoveryStrategy Error while trying to
>> recover:org.apache.solr.common.SolrException:
>> Replication for recovery failed.
>> 2018-02-23 04:10:26.278 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
>> 2018-02-23 04:10:26.278 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.h.ReplicationHandler Index fetch failed
>> :org.apache.solr.common.SolrException:
>> Unable to download tlog.0000000000000046782.1593163207942340608 completely.
>> Downloaded 0!=87454
>> 2018-02-23 04:12:20.570 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.h.IndexFetcher Error fetching file, doing one
>> retry...:org.apache.solr.common.SolrException: Unable to download
>> tlog.0000000000000046787.1593163366289899520 completely. Downloaded
>> 0!=179060
>> 2018-02-23 04:12:20.571 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046787.
>> 1593163366289899520
>> 2018-02-23 04:12:22.405 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.c.RecoveryStrategy Error while trying to
>> recover:org.apache.solr.common.SolrException:
>> Replication for recovery failed.
>> 2018-02-23 04:12:22.405 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.c.RecoveryStrategy Recovery failed - trying again... (1)
>> 2018-02-23 04:12:22.405 ERROR
>> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
>> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
>> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
>> o.a.s.h.ReplicationHandler Index fetch failed
>> :org.apache.solr.common.SolrException:
>> Unable to download tlog.0000000000000046787.1593163366289899520 completely.
>> Downloaded 0!=179060
>> 
>> 
>>> On Feb 23, 2018, at 4:15 PM, Deepak Goel <de...@gmail.com> wrote:
>>> 
>>> Can you please post all the errors? The current error is only for the node
>>> 'solr-2d'
>>> 
>>> On 23 Feb 2018 09:42, "Tom Peters" <tp...@synacor.com> wrote:
>>> 
>>> I'm trying to debug why indexing in SolrCloud 7.1 is having so many
>> issues.
>>> It will hang most of the time, and timeout the rest.
>>> 
>>> Here's an example:
>>> 
>>>   time curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d
>>> '{"solr_id":"test_001", "data_type":"test"}'|jq .
>>>   {
>>>     "responseHeader": {
>>>       "status": 0,
>>>       "QTime": 5004
>>>     }
>>>   }
>>>   curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d   0.00s
>>> user 0.00s system 0% cpu 5.025 total
>>>   jq .  0.01s user 0.00s system 0% cpu 5.025 total
>>> 
>>> Here's some of the timeout errors I'm seeing:
>>> 
>>>   2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
>>> s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
>>> o.a.s.h.RequestHandlerBase java.io.IOException:
>>> java.util.concurrent.TimeoutException:
>>> Idle timeout expired: 120000/120000 ms
>>>   2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
>>> s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
>>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>>> java.util.concurrent.TimeoutException:
>>> Idle timeout expired: 120000/120000 ms
>>>   2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
>>> processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
>>> s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
>>> r:core_node12 x:mycollection_shard1_replica_n11]
>> o.a.s.h.ReplicationHandler
>>> Index fetch failed :org.apache.solr.common.SolrException: Index fetch
>>> failed :
>>>   2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
>>> processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
>>> s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
>>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.c.RecoveryStrategy
>>> Error while trying to recover:org.apache.solr.common.SolrException:
>>> Replication for recovery failed.
>>> 
>>> 
>>> We currently have two separate Solr clusters. Our current in-production
>>> cluster which runs on Solr 3.4 and a new ring that I'm trying to bring up
>>> which runs on SolrCloud 7.1. I have the exact same code that is indexing
>> to
>>> both clusters. The Solr 3.4 indexes fine, but I'm running into lots of
>>> issues with SolrCloud 7.1.
>>> 
>>> 
>>> Some additional details about the setup:
>>> 
>>> * 5 nodes solr2-a through solr2-e.
>>> * 5 replicas
>>> * 1 shard
>>> * The servers have 48G of RAM with -Xmx and -Xms set to 16G
>>> * I currently have soft commits at 10m intervals and hard commits (with
>>> openSearcher=false) at 1m intervals. I also tried 5m (soft) and 15s (hard)
>>> as well.
>>> 
>>> Any help or pointers would be greatly appreciated. Thanks!
>>> 
>>> 
>>> This message and any attachment may contain information that is
>>> confidential and/or proprietary. Any use, disclosure, copying, storing, or
>>> distribution of this e-mail or any attached file by anyone other than the
>>> intended recipient is strictly prohibited. If you have received this
>>> message in error, please notify the sender by reply email and delete the
>>> message and any attachments. Thank you.
>> 
>> 
>> 
>> This message and any attachment may contain information that is
>> confidential and/or proprietary. Any use, disclosure, copying, storing, or
>> distribution of this e-mail or any attached file by anyone other than the
>> intended recipient is strictly prohibited. If you have received this
>> message in error, please notify the sender by reply email and delete the
>> message and any attachments. Thank you.
> 





This message and any attachment may contain information that is confidential and/or proprietary. Any use, disclosure, copying, storing, or distribution of this e-mail or any attached file by anyone other than the intended recipient is strictly prohibited. If you have received this message in error, please notify the sender by reply email and delete the message and any attachments. Thank you.

Re: Indexing timeout issues with SolrCloud 7.1

Posted by Erick Erickson <er...@gmail.com>.
DBQ is something of a heavyweight action. Basically in order to
preserve ordering it has to lock out updates while it executes since
all docs (which may live on all shards) have to be deleted before
subsequent adds of one of the affected docs is processed. In order to
do that, things need to be locked.

Delete-by-id OTOH, can use the normal optimistic locking to insure
proper ordering. So if object_id is your <uniqueKey>, this may be much
more robust if you delete-by-id

Best,
Erick

On Sat, Feb 24, 2018 at 1:37 AM, Deepak Goel <de...@gmail.com> wrote:
> From the error list, i can see multiple errors:
>
> 1. Failure to recover replica
> 2. Peer sync error
> 3. Failure to download file
>
> On 24 Feb 2018 03:10, "Tom Peters" <tp...@synacor.com> wrote:
>
> I included the last 25 lines from the logs from each of the five nodes
> during that time period.
>
> I _think_ I'm running into issues with bulking up deleteByQuery. Quick
> background: we have objects in our system that may have multiple
> availability windows. So when we index an object, will store it as separate
> documents each with their own begins and expires date. At index time we
> don't know if the all of the windows are still valid or not, so we remove
> all of them with a deleteByQuery (e.g. deleteByQuery=object_id:12345) and
> then index one or more documents.
>
> I ran an isolated test a number of times where I indexed 1500 documents in
> this manner (deletes then index). In Solr 3.4, it takes about 15s to
> complete. In Solr 7.1, it's taking about 5m. If I remove the deleteByQuery,
> the indexing times are nearly identical.
>
> When run in normal production mode where we have lots of processes indexing
> at once (~20 or so), it starts to cause lots of issues (which you see
> below).
>
>
> Please let me know if anything I mentioned is unclear. Thanks!
>
>
>
>
> solr2-a:
> 2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2672-
> processing-http:////solr2-b:8080//solr//mycollection_shard1_replica_n1
> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
> ErrorReportingConcurrentUpdateSolrClient error
> 2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2692-
> processing-http:////solr2-d:8080//solr//mycollection_shard1_replica_n11
> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
> ErrorReportingConcurrentUpdateSolrClient error
> 2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2711-
> processing-http:////solr2-e:8080//solr//mycollection_shard1_replica_n4
> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
> ErrorReportingConcurrentUpdateSolrClient error
> 2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-b:8080/solr/
> mycollection_shard1_replica_n1/
> 2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-d:8080/solr/
> mycollection_shard1_replica_n11/
> 2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-e:8080/solr/
> mycollection_shard1_replica_n4/
> 2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2712-
> processing-http:////solr2-e:8080//solr//mycollection_shard1_replica_n4
> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
> ErrorReportingConcurrentUpdateSolrClient error
> 2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2726-
> processing-http:////solr2-d:8080//solr//mycollection_shard1_replica_n11
> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
> ErrorReportingConcurrentUpdateSolrClient error
> 2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2727-
> processing-http:////solr2-b:8080//solr//mycollection_shard1_replica_n1
> x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
> ErrorReportingConcurrentUpdateSolrClient error
> 2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-b:8080/solr/
> mycollection_shard1_replica_n1/
> 2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-d:8080/solr/
> mycollection_shard1_replica_n11/
> 2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-e:8080/solr/
> mycollection_shard1_replica_n4/
> 2018-02-23 04:09:50.048 ERROR (qtp1595212853-11840) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-d:8080/solr/
> mycollection_shard1_replica_n11/
> 2018-02-23 04:09:50.048 ERROR (qtp1595212853-11840) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-e:8080/solr/
> mycollection_shard1_replica_n4/
> 2018-02-23 04:09:50.142 ERROR (qtp1595212853-33675) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-b:8080/solr/
> mycollection_shard1_replica_n1/
> 2018-02-23 04:09:50.143 ERROR (qtp1595212853-33675) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-d:8080/solr/
> mycollection_shard1_replica_n11/
> 2018-02-23 04:09:50.143 ERROR (qtp1595212853-33675) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.u.p.DistributedUpdateProcessor
> Setting up to try to start recovery on replica http://solr2-e:8080/solr/
> mycollection_shard1_replica_n4/
> 2018-02-23 07:54:01.404 ERROR (qtp1595212853-33424) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
> 2018-02-23 07:54:01.405 ERROR (qtp1595212853-33424) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
> 2018-02-23 12:24:12.389 ERROR (qtp1595212853-33906) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
> 2018-02-23 12:24:12.390 ERROR (qtp1595212853-33906) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
> 2018-02-23 16:27:52.030 ERROR (qtp1595212853-32216) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
> 2018-02-23 16:27:52.030 ERROR (qtp1595212853-32216) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
> 2018-02-23 16:30:12.276 ERROR (qtp1595212853-33424) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.h.RequestHandlerBase java.lang.NullPointerException
> 2018-02-23 16:30:12.276 ERROR (qtp1595212853-33424) [c:mycollection
> s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
> o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
>
> solr2-b:
> 2018-02-23 04:08:24.260 ERROR (qtp1595212853-22127) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:08:24.260 ERROR (qtp1595212853-22127) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:08:24.260 ERROR (qtp1595212853-22168) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:08:43.167 ERROR
> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.u.PeerSync PeerSync: core=mycollection_shard1_replica_n1 url=
> http://solr2-b.vam.be.cmh.mycollection.com:8080/solr  Requested 2 updates
> from http://solr2-a.vam.be.cmh.mycollection.com:8080/solr/
> mycollection_shard1_replica_n6/ but retrieved 1
> 2018-02-23 04:09:00.925 ERROR (qtp1595212853-20968) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120001/120000 ms
> 2018-02-23 04:09:00.925 ERROR (qtp1595212853-20968) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120001/120000 ms
> 2018-02-23 04:09:00.941 ERROR (qtp1595212853-22123) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:00.941 ERROR (qtp1595212853-22123) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:01.023 ERROR (qtp1595212853-20597) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:01.024 ERROR (qtp1595212853-20597) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:04.793 ERROR (qtp1595212853-15597) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:04.793 ERROR (qtp1595212853-15597) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:04.964 ERROR (qtp1595212853-22126) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:04.965 ERROR (qtp1595212853-22126) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:13.809 ERROR (qtp1595212853-14929) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:13.825 ERROR (qtp1595212853-14929) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:17.377 ERROR (qtp1595212853-4469) [c:mycollection s:shard1
> r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120001/120000 ms
> 2018-02-23 04:09:17.377 ERROR (qtp1595212853-4469) [c:mycollection s:shard1
> r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120001/120000 ms
> 2018-02-23 04:09:28.821 ERROR (qtp1595212853-22121) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:28.822 ERROR (qtp1595212853-22121) [c:mycollection
> s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:10:24.643 ERROR
> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.IndexFetcher Error fetching file, doing one
> retry...:org.apache.solr.common.SolrException: Unable to download
> tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
> 2018-02-23 04:10:24.645 ERROR
> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.
> 1593163207942340608
> 2018-02-23 04:10:26.466 ERROR
> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.c.RecoveryStrategy Error while trying to
> recover:org.apache.solr.common.SolrException:
> Replication for recovery failed.
> 2018-02-23 04:10:26.466 ERROR
> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
> 2018-02-23 04:10:26.466 ERROR
> (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
> x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
> [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
> o.a.s.h.ReplicationHandler Index fetch failed
> :org.apache.solr.common.SolrException:
> Unable to download tlog.0000000000000046782.1593163207942340608 completely.
> Downloaded 0!=87454
>
> solr2-c:
> 2018-02-23 01:58:43.421 ERROR (updateExecutor-2-thread-63-processing-http://
> //solr2-a:8080//solr//mycollection_shard1_replica_n6
> x:mycollection_shard1_replica_n8 r:core_node10 n:solr2-c.vam.be.cmh.
> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.u.
> ErrorReportingConcurrentUpdateSolrClient error
> 2018-02-23 01:58:43.423 ERROR (qtp1595212853-16434) [c:mycollection
> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
> o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.
> DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
> exception during distributed update: Read timed out
> 2018-02-23 01:58:43.424 ERROR (qtp1595212853-16434) [c:mycollection
> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
> o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.
> DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
> exception during distributed update: Read timed out
> 2018-02-23 02:53:00.174 ERROR (updateExecutor-2-thread-107-
> processing-http:////solr2-a:8080//solr//mycollection_shard1_replica_n6
> x:mycollection_shard1_replica_n8 r:core_node10 n:solr2-c.vam.be.cmh.
> mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.u.
> ErrorReportingConcurrentUpdateSolrClient error
> 2018-02-23 02:53:00.175 ERROR (qtp1595212853-16483) [c:mycollection
> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
> o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.
> DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
> exception during distributed update: Read timed out
> 2018-02-23 02:53:00.176 ERROR (qtp1595212853-16483) [c:mycollection
> s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
> o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.
> DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
> exception during distributed update: Read timed out
>
> solr2-d:
> 2018-02-23 04:08:24.258 ERROR (qtp1595212853-2104) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:08:24.258 ERROR (qtp1595212853-2104) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:08:24.259 ERROR (qtp1595212853-3603) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:08:24.259 ERROR (qtp1595212853-3603) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:09:00.923 ERROR (qtp1595212853-959) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:09:00.923 ERROR (qtp1595212853-959) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:09:00.939 ERROR (qtp1595212853-2742) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:09:00.939 ERROR (qtp1595212853-2742) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:09:01.021 ERROR (qtp1595212853-3532) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120001/120000 ms
> 2018-02-23 04:09:01.021 ERROR (qtp1595212853-3532) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120001/120000 ms
> 2018-02-23 04:09:01.317 ERROR (qtp1595212853-958) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:09:01.317 ERROR (qtp1595212853-958) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:09:01.484 ERROR (qtp1595212853-986) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120001/120000 ms
> 2018-02-23 04:09:01.484 ERROR (qtp1595212853-986) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120001/120000 ms
> 2018-02-23 04:09:13.755 ERROR (qtp1595212853-3629) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120001/120000 ms
> 2018-02-23 04:09:13.755 ERROR (qtp1595212853-3629) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120001/120000 ms
> 2018-02-23 04:09:17.658 ERROR (qtp1595212853-3631) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:09:17.658 ERROR (qtp1595212853-3631) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:09:28.786 ERROR (qtp1595212853-979) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120001/120000 ms
> 2018-02-23 04:09:28.786 ERROR (qtp1595212853-979) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120001/120000 ms
> 2018-02-23 04:10:23.260 ERROR
> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.h.IndexFetcher Error fetching file, doing one
> retry...:org.apache.solr.common.SolrException: Unable to download
> tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
> 2018-02-23 04:10:23.262 ERROR
> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.
> 1593163207942340608
> 2018-02-23 04:10:23.887 ERROR
> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.c.RecoveryStrategy Error while trying to
> recover:org.apache.solr.common.SolrException:
> Replication for recovery failed.
> 2018-02-23 04:10:23.887 ERROR
> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
> 2018-02-23 04:10:23.887 ERROR
> (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
> x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
> [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.h.ReplicationHandler Index fetch failed
> :org.apache.solr.common.SolrException:
> Unable to download tlog.0000000000000046782.1593163207942340608 completely.
> Downloaded 0!=87454
>
> solr2-e:
> 2018-02-23 04:09:00.926 ERROR (qtp1595212853-15314) [c:mycollection
> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:00.942 ERROR (qtp1595212853-4289) [c:mycollection s:shard1
> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:09:00.943 ERROR (qtp1595212853-4289) [c:mycollection s:shard1
> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:09:01.024 ERROR (qtp1595212853-15308) [c:mycollection
> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:01.024 ERROR (qtp1595212853-15308) [c:mycollection
> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:01.328 ERROR (qtp1595212853-14946) [c:mycollection
> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:01.328 ERROR (qtp1595212853-14946) [c:mycollection
> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:01.486 ERROR (qtp1595212853-7288) [c:mycollection s:shard1
> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:09:01.486 ERROR (qtp1595212853-7288) [c:mycollection s:shard1
> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:09:13.758 ERROR (qtp1595212853-9090) [c:mycollection s:shard1
> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:09:13.758 ERROR (qtp1595212853-9090) [c:mycollection s:shard1
> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:09:17.377 ERROR (qtp1595212853-9021) [c:mycollection s:shard1
> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
> java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
> expired: 120000/120000 ms
> 2018-02-23 04:09:17.378 ERROR (qtp1595212853-9021) [c:mycollection s:shard1
> r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
> null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
> timeout expired: 120000/120000 ms
> 2018-02-23 04:09:29.804 ERROR (qtp1595212853-16156) [c:mycollection
> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:09:29.827 ERROR (qtp1595212853-16156) [c:mycollection
> s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
> 2018-02-23 04:10:24.945 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.h.IndexFetcher Error fetching file, doing one
> retry...:org.apache.solr.common.SolrException: Unable to download
> tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
> 2018-02-23 04:10:24.946 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.
> 1593163207942340608
> 2018-02-23 04:10:26.278 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.c.RecoveryStrategy Error while trying to
> recover:org.apache.solr.common.SolrException:
> Replication for recovery failed.
> 2018-02-23 04:10:26.278 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
> 2018-02-23 04:10:26.278 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.h.ReplicationHandler Index fetch failed
> :org.apache.solr.common.SolrException:
> Unable to download tlog.0000000000000046782.1593163207942340608 completely.
> Downloaded 0!=87454
> 2018-02-23 04:12:20.570 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.h.IndexFetcher Error fetching file, doing one
> retry...:org.apache.solr.common.SolrException: Unable to download
> tlog.0000000000000046787.1593163366289899520 completely. Downloaded
> 0!=179060
> 2018-02-23 04:12:20.571 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046787.
> 1593163366289899520
> 2018-02-23 04:12:22.405 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.c.RecoveryStrategy Error while trying to
> recover:org.apache.solr.common.SolrException:
> Replication for recovery failed.
> 2018-02-23 04:12:22.405 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.c.RecoveryStrategy Recovery failed - trying again... (1)
> 2018-02-23 04:12:22.405 ERROR
> (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
> x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
> [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
> o.a.s.h.ReplicationHandler Index fetch failed
> :org.apache.solr.common.SolrException:
> Unable to download tlog.0000000000000046787.1593163366289899520 completely.
> Downloaded 0!=179060
>
>
>> On Feb 23, 2018, at 4:15 PM, Deepak Goel <de...@gmail.com> wrote:
>>
>> Can you please post all the errors? The current error is only for the node
>> 'solr-2d'
>>
>> On 23 Feb 2018 09:42, "Tom Peters" <tp...@synacor.com> wrote:
>>
>> I'm trying to debug why indexing in SolrCloud 7.1 is having so many
> issues.
>> It will hang most of the time, and timeout the rest.
>>
>> Here's an example:
>>
>>    time curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d
>> '{"solr_id":"test_001", "data_type":"test"}'|jq .
>>    {
>>      "responseHeader": {
>>        "status": 0,
>>        "QTime": 5004
>>      }
>>    }
>>    curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d   0.00s
>> user 0.00s system 0% cpu 5.025 total
>>    jq .  0.01s user 0.00s system 0% cpu 5.025 total
>>
>> Here's some of the timeout errors I'm seeing:
>>
>>    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
>> s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
>> o.a.s.h.RequestHandlerBase java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>>    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
>> s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
>> o.a.s.s.HttpSolrCall null:java.io.IOException:
>> java.util.concurrent.TimeoutException:
>> Idle timeout expired: 120000/120000 ms
>>    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
>> processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
>> s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.h.ReplicationHandler
>> Index fetch failed :org.apache.solr.common.SolrException: Index fetch
>> failed :
>>    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
>> processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
>> s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
>> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.c.RecoveryStrategy
>> Error while trying to recover:org.apache.solr.common.SolrException:
>> Replication for recovery failed.
>>
>>
>> We currently have two separate Solr clusters. Our current in-production
>> cluster which runs on Solr 3.4 and a new ring that I'm trying to bring up
>> which runs on SolrCloud 7.1. I have the exact same code that is indexing
> to
>> both clusters. The Solr 3.4 indexes fine, but I'm running into lots of
>> issues with SolrCloud 7.1.
>>
>>
>> Some additional details about the setup:
>>
>> * 5 nodes solr2-a through solr2-e.
>> * 5 replicas
>> * 1 shard
>> * The servers have 48G of RAM with -Xmx and -Xms set to 16G
>> * I currently have soft commits at 10m intervals and hard commits (with
>> openSearcher=false) at 1m intervals. I also tried 5m (soft) and 15s (hard)
>> as well.
>>
>> Any help or pointers would be greatly appreciated. Thanks!
>>
>>
>> This message and any attachment may contain information that is
>> confidential and/or proprietary. Any use, disclosure, copying, storing, or
>> distribution of this e-mail or any attached file by anyone other than the
>> intended recipient is strictly prohibited. If you have received this
>> message in error, please notify the sender by reply email and delete the
>> message and any attachments. Thank you.
>
>
>
> This message and any attachment may contain information that is
> confidential and/or proprietary. Any use, disclosure, copying, storing, or
> distribution of this e-mail or any attached file by anyone other than the
> intended recipient is strictly prohibited. If you have received this
> message in error, please notify the sender by reply email and delete the
> message and any attachments. Thank you.


Re: Indexing timeout issues with SolrCloud 7.1

Posted by Deepak Goel <de...@gmail.com>.
From the error list, i can see multiple errors:

1. Failure to recover replica
2. Peer sync error
3. Failure to download file

On 24 Feb 2018 03:10, "Tom Peters" <tp...@synacor.com> wrote:

I included the last 25 lines from the logs from each of the five nodes
during that time period.

I _think_ I'm running into issues with bulking up deleteByQuery. Quick
background: we have objects in our system that may have multiple
availability windows. So when we index an object, will store it as separate
documents each with their own begins and expires date. At index time we
don't know if the all of the windows are still valid or not, so we remove
all of them with a deleteByQuery (e.g. deleteByQuery=object_id:12345) and
then index one or more documents.

I ran an isolated test a number of times where I indexed 1500 documents in
this manner (deletes then index). In Solr 3.4, it takes about 15s to
complete. In Solr 7.1, it's taking about 5m. If I remove the deleteByQuery,
the indexing times are nearly identical.

When run in normal production mode where we have lots of processes indexing
at once (~20 or so), it starts to cause lots of issues (which you see
below).


Please let me know if anything I mentioned is unclear. Thanks!




solr2-a:
2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2672-
processing-http:////solr2-b:8080//solr//mycollection_shard1_replica_n1
x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2692-
processing-http:////solr2-d:8080//solr//mycollection_shard1_replica_n11
x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2711-
processing-http:////solr2-e:8080//solr//mycollection_shard1_replica_n4
x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-b:8080/solr/
mycollection_shard1_replica_n1/
2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-d:8080/solr/
mycollection_shard1_replica_n11/
2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-e:8080/solr/
mycollection_shard1_replica_n4/
2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2712-
processing-http:////solr2-e:8080//solr//mycollection_shard1_replica_n4
x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2726-
processing-http:////solr2-d:8080//solr//mycollection_shard1_replica_n11
x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2727-
processing-http:////solr2-b:8080//solr//mycollection_shard1_replica_n1
x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.
mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.
ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-b:8080/solr/
mycollection_shard1_replica_n1/
2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-d:8080/solr/
mycollection_shard1_replica_n11/
2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-e:8080/solr/
mycollection_shard1_replica_n4/
2018-02-23 04:09:50.048 ERROR (qtp1595212853-11840) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-d:8080/solr/
mycollection_shard1_replica_n11/
2018-02-23 04:09:50.048 ERROR (qtp1595212853-11840) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-e:8080/solr/
mycollection_shard1_replica_n4/
2018-02-23 04:09:50.142 ERROR (qtp1595212853-33675) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-b:8080/solr/
mycollection_shard1_replica_n1/
2018-02-23 04:09:50.143 ERROR (qtp1595212853-33675) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-d:8080/solr/
mycollection_shard1_replica_n11/
2018-02-23 04:09:50.143 ERROR (qtp1595212853-33675) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.u.p.DistributedUpdateProcessor
Setting up to try to start recovery on replica http://solr2-e:8080/solr/
mycollection_shard1_replica_n4/
2018-02-23 07:54:01.404 ERROR (qtp1595212853-33424) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.h.RequestHandlerBase java.lang.NullPointerException
2018-02-23 07:54:01.405 ERROR (qtp1595212853-33424) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
2018-02-23 12:24:12.389 ERROR (qtp1595212853-33906) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.h.RequestHandlerBase java.lang.NullPointerException
2018-02-23 12:24:12.390 ERROR (qtp1595212853-33906) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
2018-02-23 16:27:52.030 ERROR (qtp1595212853-32216) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.h.RequestHandlerBase java.lang.NullPointerException
2018-02-23 16:27:52.030 ERROR (qtp1595212853-32216) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
2018-02-23 16:30:12.276 ERROR (qtp1595212853-33424) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.h.RequestHandlerBase java.lang.NullPointerException
2018-02-23 16:30:12.276 ERROR (qtp1595212853-33424) [c:mycollection
s:shard1 r:core_node9 x:mycollection_shard1_replica_n6]
o.a.s.s.HttpSolrCall null:java.lang.NullPointerException

solr2-b:
2018-02-23 04:08:24.260 ERROR (qtp1595212853-22127) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:08:24.260 ERROR (qtp1595212853-22127) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:08:24.260 ERROR (qtp1595212853-22168) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:08:43.167 ERROR
(recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
[c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.u.PeerSync PeerSync: core=mycollection_shard1_replica_n1 url=
http://solr2-b.vam.be.cmh.mycollection.com:8080/solr  Requested 2 updates
from http://solr2-a.vam.be.cmh.mycollection.com:8080/solr/
mycollection_shard1_replica_n6/ but retrieved 1
2018-02-23 04:09:00.925 ERROR (qtp1595212853-20968) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:00.925 ERROR (qtp1595212853-20968) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:00.941 ERROR (qtp1595212853-22123) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:00.941 ERROR (qtp1595212853-22123) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.023 ERROR (qtp1595212853-20597) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.024 ERROR (qtp1595212853-20597) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:04.793 ERROR (qtp1595212853-15597) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:04.793 ERROR (qtp1595212853-15597) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:04.964 ERROR (qtp1595212853-22126) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:04.965 ERROR (qtp1595212853-22126) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:13.809 ERROR (qtp1595212853-14929) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:13.825 ERROR (qtp1595212853-14929) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:17.377 ERROR (qtp1595212853-4469) [c:mycollection s:shard1
r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120001/120000 ms
2018-02-23 04:09:17.377 ERROR (qtp1595212853-4469) [c:mycollection s:shard1
r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120001/120000 ms
2018-02-23 04:09:28.821 ERROR (qtp1595212853-22121) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:28.822 ERROR (qtp1595212853-22121) [c:mycollection
s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:10:24.643 ERROR
(recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
[c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.IndexFetcher Error fetching file, doing one
retry...:org.apache.solr.common.SolrException: Unable to download
tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
2018-02-23 04:10:24.645 ERROR
(recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
[c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.
1593163207942340608
2018-02-23 04:10:26.466 ERROR
(recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
[c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.c.RecoveryStrategy Error while trying to
recover:org.apache.solr.common.SolrException:
Replication for recovery failed.
2018-02-23 04:10:26.466 ERROR
(recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
[c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2018-02-23 04:10:26.466 ERROR
(recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr
x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3)
[c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1]
o.a.s.h.ReplicationHandler Index fetch failed
:org.apache.solr.common.SolrException:
Unable to download tlog.0000000000000046782.1593163207942340608 completely.
Downloaded 0!=87454

solr2-c:
2018-02-23 01:58:43.421 ERROR (updateExecutor-2-thread-63-processing-http://
//solr2-a:8080//solr//mycollection_shard1_replica_n6
x:mycollection_shard1_replica_n8 r:core_node10 n:solr2-c.vam.be.cmh.
mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.u.
ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 01:58:43.423 ERROR (qtp1595212853-16434) [c:mycollection
s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.
DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
exception during distributed update: Read timed out
2018-02-23 01:58:43.424 ERROR (qtp1595212853-16434) [c:mycollection
s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.
DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
exception during distributed update: Read timed out
2018-02-23 02:53:00.174 ERROR (updateExecutor-2-thread-107-
processing-http:////solr2-a:8080//solr//mycollection_shard1_replica_n6
x:mycollection_shard1_replica_n8 r:core_node10 n:solr2-c.vam.be.cmh.
mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection
s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.u.
ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 02:53:00.175 ERROR (qtp1595212853-16483) [c:mycollection
s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.
DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
exception during distributed update: Read timed out
2018-02-23 02:53:00.176 ERROR (qtp1595212853-16483) [c:mycollection
s:shard1 r:core_node10 x:mycollection_shard1_replica_n8]
o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.
DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async
exception during distributed update: Read timed out

solr2-d:
2018-02-23 04:08:24.258 ERROR (qtp1595212853-2104) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:08:24.258 ERROR (qtp1595212853-2104) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:08:24.259 ERROR (qtp1595212853-3603) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:08:24.259 ERROR (qtp1595212853-3603) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:09:00.923 ERROR (qtp1595212853-959) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:09:00.923 ERROR (qtp1595212853-959) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:09:00.939 ERROR (qtp1595212853-2742) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:09:00.939 ERROR (qtp1595212853-2742) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:09:01.021 ERROR (qtp1595212853-3532) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120001/120000 ms
2018-02-23 04:09:01.021 ERROR (qtp1595212853-3532) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120001/120000 ms
2018-02-23 04:09:01.317 ERROR (qtp1595212853-958) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:09:01.317 ERROR (qtp1595212853-958) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:09:01.484 ERROR (qtp1595212853-986) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120001/120000 ms
2018-02-23 04:09:01.484 ERROR (qtp1595212853-986) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120001/120000 ms
2018-02-23 04:09:13.755 ERROR (qtp1595212853-3629) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120001/120000 ms
2018-02-23 04:09:13.755 ERROR (qtp1595212853-3629) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120001/120000 ms
2018-02-23 04:09:17.658 ERROR (qtp1595212853-3631) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:09:17.658 ERROR (qtp1595212853-3631) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:09:28.786 ERROR (qtp1595212853-979) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120001/120000 ms
2018-02-23 04:09:28.786 ERROR (qtp1595212853-979) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120001/120000 ms
2018-02-23 04:10:23.260 ERROR
(recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
[c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
o.a.s.h.IndexFetcher Error fetching file, doing one
retry...:org.apache.solr.common.SolrException: Unable to download
tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
2018-02-23 04:10:23.262 ERROR
(recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
[c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.
1593163207942340608
2018-02-23 04:10:23.887 ERROR
(recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
[c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
o.a.s.c.RecoveryStrategy Error while trying to
recover:org.apache.solr.common.SolrException:
Replication for recovery failed.
2018-02-23 04:10:23.887 ERROR
(recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
[c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2018-02-23 04:10:23.887 ERROR
(recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr
x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12)
[c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
o.a.s.h.ReplicationHandler Index fetch failed
:org.apache.solr.common.SolrException:
Unable to download tlog.0000000000000046782.1593163207942340608 completely.
Downloaded 0!=87454

solr2-e:
2018-02-23 04:09:00.926 ERROR (qtp1595212853-15314) [c:mycollection
s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:00.942 ERROR (qtp1595212853-4289) [c:mycollection s:shard1
r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:09:00.943 ERROR (qtp1595212853-4289) [c:mycollection s:shard1
r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:09:01.024 ERROR (qtp1595212853-15308) [c:mycollection
s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.024 ERROR (qtp1595212853-15308) [c:mycollection
s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.328 ERROR (qtp1595212853-14946) [c:mycollection
s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.328 ERROR (qtp1595212853-14946) [c:mycollection
s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.486 ERROR (qtp1595212853-7288) [c:mycollection s:shard1
r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:09:01.486 ERROR (qtp1595212853-7288) [c:mycollection s:shard1
r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:09:13.758 ERROR (qtp1595212853-9090) [c:mycollection s:shard1
r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:09:13.758 ERROR (qtp1595212853-9090) [c:mycollection s:shard1
r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:09:17.377 ERROR (qtp1595212853-9021) [c:mycollection s:shard1
r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout
expired: 120000/120000 ms
2018-02-23 04:09:17.378 ERROR (qtp1595212853-9021) [c:mycollection s:shard1
r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall
null:java.io.IOException: java.util.concurrent.TimeoutException: Idle
timeout expired: 120000/120000 ms
2018-02-23 04:09:29.804 ERROR (qtp1595212853-16156) [c:mycollection
s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:29.827 ERROR (qtp1595212853-16156) [c:mycollection
s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
2018-02-23 04:10:24.945 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.h.IndexFetcher Error fetching file, doing one
retry...:org.apache.solr.common.SolrException: Unable to download
tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
2018-02-23 04:10:24.946 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.
1593163207942340608
2018-02-23 04:10:26.278 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.c.RecoveryStrategy Error while trying to
recover:org.apache.solr.common.SolrException:
Replication for recovery failed.
2018-02-23 04:10:26.278 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2018-02-23 04:10:26.278 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.h.ReplicationHandler Index fetch failed
:org.apache.solr.common.SolrException:
Unable to download tlog.0000000000000046782.1593163207942340608 completely.
Downloaded 0!=87454
2018-02-23 04:12:20.570 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.h.IndexFetcher Error fetching file, doing one
retry...:org.apache.solr.common.SolrException: Unable to download
tlog.0000000000000046787.1593163366289899520 completely. Downloaded
0!=179060
2018-02-23 04:12:20.571 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046787.
1593163366289899520
2018-02-23 04:12:22.405 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.c.RecoveryStrategy Error while trying to
recover:org.apache.solr.common.SolrException:
Replication for recovery failed.
2018-02-23 04:12:22.405 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.c.RecoveryStrategy Recovery failed - trying again... (1)
2018-02-23 04:12:22.405 ERROR
(recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr
x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7)
[c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4]
o.a.s.h.ReplicationHandler Index fetch failed
:org.apache.solr.common.SolrException:
Unable to download tlog.0000000000000046787.1593163366289899520 completely.
Downloaded 0!=179060


> On Feb 23, 2018, at 4:15 PM, Deepak Goel <de...@gmail.com> wrote:
>
> Can you please post all the errors? The current error is only for the node
> 'solr-2d'
>
> On 23 Feb 2018 09:42, "Tom Peters" <tp...@synacor.com> wrote:
>
> I'm trying to debug why indexing in SolrCloud 7.1 is having so many
issues.
> It will hang most of the time, and timeout the rest.
>
> Here's an example:
>
>    time curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d
> '{"solr_id":"test_001", "data_type":"test"}'|jq .
>    {
>      "responseHeader": {
>        "status": 0,
>        "QTime": 5004
>      }
>    }
>    curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d   0.00s
> user 0.00s system 0% cpu 5.025 total
>    jq .  0.01s user 0.00s system 0% cpu 5.025 total
>
> Here's some of the timeout errors I'm seeing:
>
>    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
> s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
>    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
> s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
>    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
> processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
> s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11]
o.a.s.h.ReplicationHandler
> Index fetch failed :org.apache.solr.common.SolrException: Index fetch
> failed :
>    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
> processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
> s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.c.RecoveryStrategy
> Error while trying to recover:org.apache.solr.common.SolrException:
> Replication for recovery failed.
>
>
> We currently have two separate Solr clusters. Our current in-production
> cluster which runs on Solr 3.4 and a new ring that I'm trying to bring up
> which runs on SolrCloud 7.1. I have the exact same code that is indexing
to
> both clusters. The Solr 3.4 indexes fine, but I'm running into lots of
> issues with SolrCloud 7.1.
>
>
> Some additional details about the setup:
>
> * 5 nodes solr2-a through solr2-e.
> * 5 replicas
> * 1 shard
> * The servers have 48G of RAM with -Xmx and -Xms set to 16G
> * I currently have soft commits at 10m intervals and hard commits (with
> openSearcher=false) at 1m intervals. I also tried 5m (soft) and 15s (hard)
> as well.
>
> Any help or pointers would be greatly appreciated. Thanks!
>
>
> This message and any attachment may contain information that is
> confidential and/or proprietary. Any use, disclosure, copying, storing, or
> distribution of this e-mail or any attached file by anyone other than the
> intended recipient is strictly prohibited. If you have received this
> message in error, please notify the sender by reply email and delete the
> message and any attachments. Thank you.



This message and any attachment may contain information that is
confidential and/or proprietary. Any use, disclosure, copying, storing, or
distribution of this e-mail or any attached file by anyone other than the
intended recipient is strictly prohibited. If you have received this
message in error, please notify the sender by reply email and delete the
message and any attachments. Thank you.

Re: Indexing timeout issues with SolrCloud 7.1

Posted by Tom Peters <tp...@synacor.com>.
I included the last 25 lines from the logs from each of the five nodes during that time period.

I _think_ I'm running into issues with bulking up deleteByQuery. Quick background: we have objects in our system that may have multiple availability windows. So when we index an object, will store it as separate documents each with their own begins and expires date. At index time we don't know if the all of the windows are still valid or not, so we remove all of them with a deleteByQuery (e.g. deleteByQuery=object_id:12345) and then index one or more documents.

I ran an isolated test a number of times where I indexed 1500 documents in this manner (deletes then index). In Solr 3.4, it takes about 15s to complete. In Solr 7.1, it's taking about 5m. If I remove the deleteByQuery, the indexing times are nearly identical.

When run in normal production mode where we have lots of processes indexing at once (~20 or so), it starts to cause lots of issues (which you see below).


Please let me know if anything I mentioned is unclear. Thanks!




solr2-a:
2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2672-processing-http:////solr2-b:8080//solr//mycollection_shard1_replica_n1 x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2692-processing-http:////solr2-d:8080//solr//mycollection_shard1_replica_n11 x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:36.551 ERROR (updateExecutor-2-thread-2711-processing-http:////solr2-e:8080//solr//mycollection_shard1_replica_n4 x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-b:8080/solr/mycollection_shard1_replica_n1/
2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-d:8080/solr/mycollection_shard1_replica_n11/
2018-02-23 04:09:36.552 ERROR (qtp1595212853-32739) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-e:8080/solr/mycollection_shard1_replica_n4/
2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2712-processing-http:////solr2-e:8080//solr//mycollection_shard1_replica_n4 x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2726-processing-http:////solr2-d:8080//solr//mycollection_shard1_replica_n11 x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:38.217 ERROR (updateExecutor-2-thread-2727-processing-http:////solr2-b:8080//solr//mycollection_shard1_replica_n1 x:mycollection_shard1_replica_n6 r:core_node9 n:solr2-a.vam.be.cmh.mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-b:8080/solr/mycollection_shard1_replica_n1/
2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-d:8080/solr/mycollection_shard1_replica_n11/
2018-02-23 04:09:38.218 ERROR (qtp1595212853-32260) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-e:8080/solr/mycollection_shard1_replica_n4/
2018-02-23 04:09:50.048 ERROR (qtp1595212853-11840) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-d:8080/solr/mycollection_shard1_replica_n11/
2018-02-23 04:09:50.048 ERROR (qtp1595212853-11840) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-e:8080/solr/mycollection_shard1_replica_n4/
2018-02-23 04:09:50.142 ERROR (qtp1595212853-33675) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-b:8080/solr/mycollection_shard1_replica_n1/
2018-02-23 04:09:50.143 ERROR (qtp1595212853-33675) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-d:8080/solr/mycollection_shard1_replica_n11/
2018-02-23 04:09:50.143 ERROR (qtp1595212853-33675) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://solr2-e:8080/solr/mycollection_shard1_replica_n4/
2018-02-23 07:54:01.404 ERROR (qtp1595212853-33424) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.h.RequestHandlerBase java.lang.NullPointerException
2018-02-23 07:54:01.405 ERROR (qtp1595212853-33424) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
2018-02-23 12:24:12.389 ERROR (qtp1595212853-33906) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.h.RequestHandlerBase java.lang.NullPointerException
2018-02-23 12:24:12.390 ERROR (qtp1595212853-33906) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
2018-02-23 16:27:52.030 ERROR (qtp1595212853-32216) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.h.RequestHandlerBase java.lang.NullPointerException
2018-02-23 16:27:52.030 ERROR (qtp1595212853-32216) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.s.HttpSolrCall null:java.lang.NullPointerException
2018-02-23 16:30:12.276 ERROR (qtp1595212853-33424) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.h.RequestHandlerBase java.lang.NullPointerException
2018-02-23 16:30:12.276 ERROR (qtp1595212853-33424) [c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] o.a.s.s.HttpSolrCall null:java.lang.NullPointerException

solr2-b:
2018-02-23 04:08:24.260 ERROR (qtp1595212853-22127) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:08:24.260 ERROR (qtp1595212853-22127) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:08:24.260 ERROR (qtp1595212853-22168) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:08:43.167 ERROR (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.u.PeerSync PeerSync: core=mycollection_shard1_replica_n1 url=http://solr2-b.vam.be.cmh.mycollection.com:8080/solr  Requested 2 updates from http://solr2-a.vam.be.cmh.mycollection.com:8080/solr/mycollection_shard1_replica_n6/ but retrieved 1
2018-02-23 04:09:00.925 ERROR (qtp1595212853-20968) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:00.925 ERROR (qtp1595212853-20968) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:00.941 ERROR (qtp1595212853-22123) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:00.941 ERROR (qtp1595212853-22123) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.023 ERROR (qtp1595212853-20597) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.024 ERROR (qtp1595212853-20597) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:04.793 ERROR (qtp1595212853-15597) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:04.793 ERROR (qtp1595212853-15597) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:04.964 ERROR (qtp1595212853-22126) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:04.965 ERROR (qtp1595212853-22126) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:13.809 ERROR (qtp1595212853-14929) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:13.825 ERROR (qtp1595212853-14929) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:17.377 ERROR (qtp1595212853-4469) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:17.377 ERROR (qtp1595212853-4469) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:28.821 ERROR (qtp1595212853-22121) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:28.822 ERROR (qtp1595212853-22121) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:10:24.643 ERROR (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.IndexFetcher Error fetching file, doing one retry...:org.apache.solr.common.SolrException: Unable to download tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
2018-02-23 04:10:24.645 ERROR (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.1593163207942340608
2018-02-23 04:10:26.466 ERROR (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.c.RecoveryStrategy Error while trying to recover:org.apache.solr.common.SolrException: Replication for recovery failed.
2018-02-23 04:10:26.466 ERROR (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2018-02-23 04:10:26.466 ERROR (recoveryExecutor-3-thread-8-processing-n:solr2-b:8080_solr x:mycollection_shard1_replica_n1 s:shard1 c:mycollection r:core_node3) [c:mycollection s:shard1 r:core_node3 x:mycollection_shard1_replica_n1] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: Unable to download tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454

solr2-c:
2018-02-23 01:58:43.421 ERROR (updateExecutor-2-thread-63-processing-http:////solr2-a:8080//solr//mycollection_shard1_replica_n6 x:mycollection_shard1_replica_n8 r:core_node10 n:solr2-c.vam.be.cmh.mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 01:58:43.423 ERROR (qtp1595212853-16434) [c:mycollection s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Read timed out
2018-02-23 01:58:43.424 ERROR (qtp1595212853-16434) [c:mycollection s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Read timed out
2018-02-23 02:53:00.174 ERROR (updateExecutor-2-thread-107-processing-http:////solr2-a:8080//solr//mycollection_shard1_replica_n6 x:mycollection_shard1_replica_n8 r:core_node10 n:solr2-c.vam.be.cmh.mycollection.com:8080_solr s:shard1 c:mycollection) [c:mycollection s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
2018-02-23 02:53:00.175 ERROR (qtp1595212853-16483) [c:mycollection s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.h.RequestHandlerBase org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Read timed out
2018-02-23 02:53:00.176 ERROR (qtp1595212853-16483) [c:mycollection s:shard1 r:core_node10 x:mycollection_shard1_replica_n8] o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Read timed out

solr2-d:
2018-02-23 04:08:24.258 ERROR (qtp1595212853-2104) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:08:24.258 ERROR (qtp1595212853-2104) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:08:24.259 ERROR (qtp1595212853-3603) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:08:24.259 ERROR (qtp1595212853-3603) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:00.923 ERROR (qtp1595212853-959) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:00.923 ERROR (qtp1595212853-959) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:00.939 ERROR (qtp1595212853-2742) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:00.939 ERROR (qtp1595212853-2742) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.021 ERROR (qtp1595212853-3532) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:01.021 ERROR (qtp1595212853-3532) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:01.317 ERROR (qtp1595212853-958) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.317 ERROR (qtp1595212853-958) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.484 ERROR (qtp1595212853-986) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:01.484 ERROR (qtp1595212853-986) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:13.755 ERROR (qtp1595212853-3629) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:13.755 ERROR (qtp1595212853-3629) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:17.658 ERROR (qtp1595212853-3631) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:17.658 ERROR (qtp1595212853-3631) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:28.786 ERROR (qtp1595212853-979) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:09:28.786 ERROR (qtp1595212853-979) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120001/120000 ms
2018-02-23 04:10:23.260 ERROR (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.IndexFetcher Error fetching file, doing one retry...:org.apache.solr.common.SolrException: Unable to download tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
2018-02-23 04:10:23.262 ERROR (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.1593163207942340608
2018-02-23 04:10:23.887 ERROR (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.c.RecoveryStrategy Error while trying to recover:org.apache.solr.common.SolrException: Replication for recovery failed.
2018-02-23 04:10:23.887 ERROR (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2018-02-23 04:10:23.887 ERROR (recoveryExecutor-3-thread-5-processing-n:solr2-d:8080_solr x:mycollection_shard1_replica_n11 s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1 r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: Unable to download tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454

solr2-e:
2018-02-23 04:09:00.926 ERROR (qtp1595212853-15314) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:00.942 ERROR (qtp1595212853-4289) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:00.943 ERROR (qtp1595212853-4289) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.024 ERROR (qtp1595212853-15308) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.024 ERROR (qtp1595212853-15308) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.328 ERROR (qtp1595212853-14946) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.328 ERROR (qtp1595212853-14946) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.486 ERROR (qtp1595212853-7288) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:01.486 ERROR (qtp1595212853-7288) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:13.758 ERROR (qtp1595212853-9090) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:13.758 ERROR (qtp1595212853-9090) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:17.377 ERROR (qtp1595212853-9021) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:17.378 ERROR (qtp1595212853-9021) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:29.804 ERROR (qtp1595212853-16156) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.RequestHandlerBase java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:09:29.827 ERROR (qtp1595212853-16156) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
2018-02-23 04:10:24.945 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.IndexFetcher Error fetching file, doing one retry...:org.apache.solr.common.SolrException: Unable to download tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
2018-02-23 04:10:24.946 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046782.1593163207942340608
2018-02-23 04:10:26.278 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.c.RecoveryStrategy Error while trying to recover:org.apache.solr.common.SolrException: Replication for recovery failed.
2018-02-23 04:10:26.278 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2018-02-23 04:10:26.278 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: Unable to download tlog.0000000000000046782.1593163207942340608 completely. Downloaded 0!=87454
2018-02-23 04:12:20.570 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.IndexFetcher Error fetching file, doing one retry...:org.apache.solr.common.SolrException: Unable to download tlog.0000000000000046787.1593163366289899520 completely. Downloaded 0!=179060
2018-02-23 04:12:20.571 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.IndexFetcher Error deleting file: tlog.0000000000000046787.1593163366289899520
2018-02-23 04:12:22.405 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.c.RecoveryStrategy Error while trying to recover:org.apache.solr.common.SolrException: Replication for recovery failed.
2018-02-23 04:12:22.405 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.c.RecoveryStrategy Recovery failed - trying again... (1)
2018-02-23 04:12:22.405 ERROR (recoveryExecutor-3-thread-6-processing-n:solr2-e:8080_solr x:mycollection_shard1_replica_n4 s:shard1 c:mycollection r:core_node7) [c:mycollection s:shard1 r:core_node7 x:mycollection_shard1_replica_n4] o.a.s.h.ReplicationHandler Index fetch failed :org.apache.solr.common.SolrException: Unable to download tlog.0000000000000046787.1593163366289899520 completely. Downloaded 0!=179060


> On Feb 23, 2018, at 4:15 PM, Deepak Goel <de...@gmail.com> wrote:
> 
> Can you please post all the errors? The current error is only for the node
> 'solr-2d'
> 
> On 23 Feb 2018 09:42, "Tom Peters" <tp...@synacor.com> wrote:
> 
> I'm trying to debug why indexing in SolrCloud 7.1 is having so many issues.
> It will hang most of the time, and timeout the rest.
> 
> Here's an example:
> 
>    time curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d
> '{"solr_id":"test_001", "data_type":"test"}'|jq .
>    {
>      "responseHeader": {
>        "status": 0,
>        "QTime": 5004
>      }
>    }
>    curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d   0.00s
> user 0.00s system 0% cpu 5.025 total
>    jq .  0.01s user 0.00s system 0% cpu 5.025 total
> 
> Here's some of the timeout errors I'm seeing:
> 
>    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
> s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.h.RequestHandlerBase java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
>    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
> s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
> o.a.s.s.HttpSolrCall null:java.io.IOException:
> java.util.concurrent.TimeoutException:
> Idle timeout expired: 120000/120000 ms
>    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
> processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
> s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.ReplicationHandler
> Index fetch failed :org.apache.solr.common.SolrException: Index fetch
> failed :
>    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
> processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
> s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
> r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.c.RecoveryStrategy
> Error while trying to recover:org.apache.solr.common.SolrException:
> Replication for recovery failed.
> 
> 
> We currently have two separate Solr clusters. Our current in-production
> cluster which runs on Solr 3.4 and a new ring that I'm trying to bring up
> which runs on SolrCloud 7.1. I have the exact same code that is indexing to
> both clusters. The Solr 3.4 indexes fine, but I'm running into lots of
> issues with SolrCloud 7.1.
> 
> 
> Some additional details about the setup:
> 
> * 5 nodes solr2-a through solr2-e.
> * 5 replicas
> * 1 shard
> * The servers have 48G of RAM with -Xmx and -Xms set to 16G
> * I currently have soft commits at 10m intervals and hard commits (with
> openSearcher=false) at 1m intervals. I also tried 5m (soft) and 15s (hard)
> as well.
> 
> Any help or pointers would be greatly appreciated. Thanks!
> 
> 
> This message and any attachment may contain information that is
> confidential and/or proprietary. Any use, disclosure, copying, storing, or
> distribution of this e-mail or any attached file by anyone other than the
> intended recipient is strictly prohibited. If you have received this
> message in error, please notify the sender by reply email and delete the
> message and any attachments. Thank you.



This message and any attachment may contain information that is confidential and/or proprietary. Any use, disclosure, copying, storing, or distribution of this e-mail or any attached file by anyone other than the intended recipient is strictly prohibited. If you have received this message in error, please notify the sender by reply email and delete the message and any attachments. Thank you.

Re: Indexing timeout issues with SolrCloud 7.1

Posted by Deepak Goel <de...@gmail.com>.
Can you please post all the errors? The current error is only for the node
'solr-2d'

On 23 Feb 2018 09:42, "Tom Peters" <tp...@synacor.com> wrote:

I'm trying to debug why indexing in SolrCloud 7.1 is having so many issues.
It will hang most of the time, and timeout the rest.

Here's an example:

    time curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d
'{"solr_id":"test_001", "data_type":"test"}'|jq .
    {
      "responseHeader": {
        "status": 0,
        "QTime": 5004
      }
    }
    curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d   0.00s
user 0.00s system 0% cpu 5.025 total
    jq .  0.01s user 0.00s system 0% cpu 5.025 total

Here's some of the timeout errors I'm seeing:

    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
o.a.s.h.RequestHandlerBase java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
    2018-02-23 03:55:02.903 ERROR (qtp1595212853-3607) [c:mycollection
s:shard1 r:core_node12 x:mycollection_shard1_replica_n11]
o.a.s.s.HttpSolrCall null:java.io.IOException:
java.util.concurrent.TimeoutException:
Idle timeout expired: 120000/120000 ms
    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.h.ReplicationHandler
Index fetch failed :org.apache.solr.common.SolrException: Index fetch
failed :
    2018-02-23 03:55:36.517 ERROR (recoveryExecutor-3-thread-4-
processing-n:solr2-d.myhost:8080_solr x:mycollection_shard1_replica_n11
s:shard1 c:mycollection r:core_node12) [c:mycollection s:shard1
r:core_node12 x:mycollection_shard1_replica_n11] o.a.s.c.RecoveryStrategy
Error while trying to recover:org.apache.solr.common.SolrException:
Replication for recovery failed.


We currently have two separate Solr clusters. Our current in-production
cluster which runs on Solr 3.4 and a new ring that I'm trying to bring up
which runs on SolrCloud 7.1. I have the exact same code that is indexing to
both clusters. The Solr 3.4 indexes fine, but I'm running into lots of
issues with SolrCloud 7.1.


Some additional details about the setup:

* 5 nodes solr2-a through solr2-e.
* 5 replicas
* 1 shard
* The servers have 48G of RAM with -Xmx and -Xms set to 16G
* I currently have soft commits at 10m intervals and hard commits (with
openSearcher=false) at 1m intervals. I also tried 5m (soft) and 15s (hard)
as well.

Any help or pointers would be greatly appreciated. Thanks!


This message and any attachment may contain information that is
confidential and/or proprietary. Any use, disclosure, copying, storing, or
distribution of this e-mail or any attached file by anyone other than the
intended recipient is strictly prohibited. If you have received this
message in error, please notify the sender by reply email and delete the
message and any attachments. Thank you.