You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by "Yonik Seeley (Jira)" <ji...@apache.org> on 2020/02/11 18:58:00 UTC

[jira] [Resolved] (SOLR-14058) IOOBE in PeerSync

     [ https://issues.apache.org/jira/browse/SOLR-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-14058.
---------------------------------
    Fix Version/s: 8.5
       Resolution: Fixed

> IOOBE in PeerSync
> -----------------
>
>                 Key: SOLR-14058
>                 URL: https://issues.apache.org/jira/browse/SOLR-14058
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 6.2, master (9.0), 8.3
>            Reporter: Yonik Seeley
>            Assignee: Yonik Seeley
>            Priority: Major
>             Fix For: 8.5
>
>         Attachments: SOLR-14058.patch
>
>
> We hit an exception with 8.3 that someone else also hit on stackoverflow:
> https://stackoverflow.com/questions/58891563/problem-in-syncing-replicas-with-solr-8-3-with-zookeeper-3-5-6
> {quote}
> I recently converted a solr 7.x + zookeeper 3.4.14 to solr 8.3 + zk 3.5.6, and depending on how I start the solr nodes I'm geting a sync exception.
> My setup uses 3 zk nodes and 2 solr nodes (let's call it A and B). The collection that has this problem has 1 shard and 2 replicas. I've noticed 2 situations: (1) which works fine and (2) which does not work.
> 1) This works: I start solr node A, and wait until it's replica is elected leader ("green" in the Solr interface 'Cloud'->'Graph') - which takes about 2 min; and only then start solr node B. Both replicas are active and the one in A is the leader.
> 2) This does NOT work: I start solr node A, and a few secs after I star solr node B (that is, before the 'A' replica is elected leader - still "Down" in the solr interface). In this case I get the following exception:
> ERROR (coreZkRegister-1-thread-2-processing-n:192.168.15.20:8986_solr x:alldata_shard1_replica_n1 c:alldata s:shard1 r:core_node3) [c:alldata s:shard1 r:core_node3 x:alldata_shard1_replica_n1] o.a.s.c.SyncStrategy Sync Failed:java.lang.IndexOutOfBoundsException: Index -1 out of bounds for length 99
> It seems that if both solr node are started soon after each other, then ZK cannot elect one as leader. This error only appears in the solr.log of node A, even if I invert the order of starting nodes.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org