Posted to solr-user@lucene.apache.org by Webster Homer <we...@sial.com> on 2017/07/04 20:27:50 UTC

cdcr bootstrap errors

We've been using cdcr for a while now. It seems to be pretty fragile.

Currently we're seeing tons of errors like this:
2017-07-04 14:41:27.015 ERROR
(cdcr-bootstrap-status-51-thread-1-processing-n:dfw-pauth-msc02:8983_solr)
[ ] o.a.s.h.CdcrReplicatorManager Exception during bootstrap status request

In this case we have one server throwing the above errors a lot!

The error isn't very informative. What can cause this?

I also see these messages:
2017-07-04 18:59:39.730 WARN
 (cdcr-replicator-122-thread-3-processing-n:dfw-pauth-msc02:8983_solr
x:sial-catalog-gene_shard1_replica1 s:shard1 c:sial-catalog-gene
r:core_node1) [c:sial-catalog-gene s:shard1 r:core_node1
x:sial-catalog-gene_shard1_replica1] o.a.s.h.CdcrReplicator Log reader for
target sial-catalog-gene is not initialised, it will be ignored.

2017-07-04 18:59:39.730 INFO
(cdcr-replicator-122-thread-1-processing-n:dfw-pauth-msc02:8983_solr
x:sial-catalog-gene_shard1_replica1 s:shard1 c:sial-catalog-gene
r:core_node1) [c:sial-catalog-gene s:shard1 r:core_node1
x:sial-catalog-gene_shard1_replica1] o.a.s.h.CdcrReplicator Forwarded 0
updates to target sial-catalog-gene

2017-07-04 18:59:39.975 WARN
(cdcr-replicator-100-thread-3-processing-n:dfw-pauth-msc02:8983_solr) [ ]
o.a.s.h.CdcrReplicator Failed to forward update request to target:
bb-catalog-material java.lang.RuntimeException: Unknown type 17

We are using Solr 6.2.
We have a 2-node cloud with multiple collections, all with 2 shards,
replicating to two Solr clouds running in Google Cloud.
We noticed that some of the prod collections only had data in one of the
shards.

So how do we diagnose this issue?
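
A minimal sketch of one way to start narrowing this down, assuming the
CDCR request handler is registered at /cdcr on each collection and that its
STATUS, QUEUES and ERRORS actions are available in this Solr version. The
host name and the first core name are taken from the logs above; the shard2
core name in the usage note and all helper names are hypothetical. It builds
the CDCR URLs for a collection and a non-distributed count query per core,
so an empty shard shows up directly.

```python
import json
import urllib.request
from urllib.parse import urlencode

# CDCR API actions this sketch assumes are enabled on the source cluster.
CDCR_ACTIONS = ("STATUS", "QUEUES", "ERRORS")


def cdcr_url(base_url, collection, action):
    """Build the URL for one CDCR API action on one collection."""
    if action not in CDCR_ACTIONS:
        raise ValueError("unsupported CDCR action: %s" % action)
    query = urlencode({"action": action, "wt": "json"})
    return "%s/%s/cdcr?%s" % (base_url.rstrip("/"), collection, query)


def core_count_url(base_url, core):
    """Build a match-all query against a single core only (distrib=false)."""
    query = urlencode({"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"})
    return "%s/%s/select?%s" % (base_url.rstrip("/"), core, query)


def empty_cores(counts):
    """Given {core_name: numFound}, return the cores holding no documents."""
    return sorted(core for core, found in counts.items() if found == 0)


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the JSON body (needs a reachable Solr node)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Usage would look roughly like calling fetch_json(cdcr_url(base, collection,
"ERRORS")) on the source, then fetch_json(core_count_url(base, core)) for
each core on the target (for example sial-catalog-gene_shard1_replica1 and
an assumed sial-catalog-gene_shard2_replica1), reading
response["response"]["numFound"] from each and passing the counts to
empty_cores.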

-- 


This message and any attachment are confidential and may be privileged or 
otherwise protected from disclosure. If you are not the intended recipient, 
you must not copy this message or attachment or disclose the contents to 
any other person. If you have received this transmission in error, please 
notify the sender immediately and delete the message and any attachment 
from your system. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not accept liability for any omissions or errors in this 
message which may arise as a result of E-Mail-transmission or for damages 
resulting from any unauthorized changes of the content of this message and 
any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not guarantee that this message is free of viruses and does 
not accept liability for any damages caused by any virus transmitted 
therewith.

Click http://www.emdgroup.com/disclaimer to access the German, French, 
Spanish and Portuguese versions of this disclaimer.

Re: cdcr bootstrap errors

Posted by Webster Homer <we...@sial.com>.
Restarting ZooKeeper on the source cloud seems to have helped.

Re: cdcr bootstrap errors

Posted by Webster Homer <we...@sial.com>.
Another strange error message I'm seeing
2017-07-04 18:59:40.585 WARN
(cdcr-replicator-110-thread-4-processing-n:dfw-pauth-msc02:8983_solr) [ ]
o.a.s.h.CdcrReplicator Failed to forward update request to target:
sial-catalog-product
org.apache.solr.common.SolrException: Could not load collection from ZK:
sial-catalog-product
at org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader.java:1093)
at org.apache.solr.common.cloud.ZkStateReader$LazyCollectionRef.get(ZkStateReader.java:638)
at org.apache.solr.common.cloud.ClusterState.getCollectionOrNull(ClusterState.java:212)
at org.apache.solr.common.cloud.ClusterState.hasCollection(ClusterState.java:114)
at org.apache.solr.client.solrj.impl.CloudSolrClient.getCollectionNames(CloudSolrClient.java:1302)
at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1024)
at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:997)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:166)
at org.apache.solr.handler.CdcrReplicator.sendRequest(CdcrReplicator.java:135)
at org.apache.solr.handler.CdcrReplicator.run(CdcrReplicator.java:115)
at org.apache.solr.handler.CdcrReplicatorScheduler.lambda$null$0(CdcrReplicatorScheduler.java:81)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for
/collections/sial-catalog-product/state.json

So is Zookeeper hosed? How do I tell?
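
A rough sketch of one way to check, assuming the ZooKeeper four-letter-word
commands (ruok, mntr) are enabled on the ensemble, which newer ZooKeeper
releases may require whitelisting for. The host and port passed in are
whatever the ensemble uses; the helper names are hypothetical. A server
that answers "imok" is running, and the mntr counters (latency, outstanding
requests, server state) give a hint whether it is overloaded. Note that a
SessionExpiredException can also come from the client side, for example a
long pause in the Solr JVM or a network interruption that outlasts the
session timeout, so a healthy reply here does not rule that out.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter-word command to a ZooKeeper server, return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def is_ok(ruok_reply):
    """True if the reply to 'ruok' is the expected 'imok'."""
    return ruok_reply.strip() == "imok"


def parse_mntr(text):
    """Parse tab-separated 'mntr' output into a dict of string values."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines without a tab separator
            stats[key.strip()] = value.strip()
    return stats
```

Running is_ok(four_letter_word(host, 2181, "ruok")) against each ensemble
member, then parse_mntr(four_letter_word(host, 2181, "mntr")), shows which
servers respond and what state each one reports; 2181 is the default client
port and is an assumption here.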
