Posted to dev@lucene.apache.org by "Nathan Neulinger (JIRA)" <ji...@apache.org> on 2013/10/31 00:54:25 UTC

[jira] [Commented] (SOLR-5407) Strange error condition with cloud replication not working quite right

    [ https://issues.apache.org/jira/browse/SOLR-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809767#comment-13809767 ] 

Nathan Neulinger commented on SOLR-5407:
----------------------------------------

The only error we could find in the logs was this:

09:08:01 	WARN 	PeerSync 	no frame of reference to tell if we've missed updates
09:25:49 	WARN 	Overseer 	
09:25:49 	ERROR 	SolrDispatchFilter 	null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 	ERROR 	SolrDispatchFilter 	null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 	WARN 	OverseerCollectionProcessor 	Overseer cannot talk to ZK
09:25:49 	ERROR 	SolrDispatchFilter 	null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 	ERROR 	SolrDispatchFilter 	null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 	ERROR 	SolrDispatchFilter 	null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:25:49 	ERROR 	SolrDispatchFilter 	null:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /aliases.json
09:26:37 	WARN 	PeerSync 	no frame of reference to tell if we've missed updates



> Strange error condition with cloud replication not working quite right
> ----------------------------------------------------------------------
>
>                 Key: SOLR-5407
>                 URL: https://issues.apache.org/jira/browse/SOLR-5407
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.5
>            Reporter: Nathan Neulinger
>              Labels: cloud, replication
>
> I have a cloud deployment of 4.5 on EC2. The architecture is 3 dedicated ZK nodes and a pair of Solr nodes. I'll apologize in advance that this error report does not have a lot of detail; I'm really hoping that the scenario/description will trigger some "likely" explanation.
> The situation I got into was that the server had decided to fail over, so my app servers were all talking to what should have been the primary for most of the shards/collections, but was actually the replica.
> Here's where it gets odd: no errors were being returned to the client code for any of the searches or document updates, and the current primary server was definitely receiving all of the updates, even though they were being submitted to the inactive/replica node. (Clients were talking to solr-p1, which was not primary at the time, and writes were being passed through to solr-r1, which was primary at the time.)
> All sounds good so far, right? Except the replica server at the time (solr-p1), through which the writes were passing, never got any of those content updates. It had an old, unmodified copy of the index.
> I restarted solr-p1 (was the replica at the time) - no change in behavior. Behavior did not change until I killed and restarted the current primary (solr-r1) to force it to fail over.
> At that point, everything was all happy again and working properly. 
> Until this morning, when one of the developers provisioned a new collection, which happened to put its primary on solr-r1. Again, clients were all pointing at solr-p1. The developer reported that the documents were going into the index, but were not visible on the replica server.
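
One way to confirm the kind of divergence described above is to ask each node for its own local document count and compare the results. The sketch below is not from the original report: it assumes Solr's default port 8983, a placeholder core name, and the standard /select handler with distrib=false (which keeps the query on the core that receives it) and wt=json. The host names solr-p1 and solr-r1 are taken from the description. Matching counts do not prove the indexes are identical, and uncommitted documents will not be counted, so treat this as a rough check only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def local_count_url(host, core, port=8983):
    """Build a URL that asks a single core for its own match-all count.

    distrib=false stops Solr from fanning the query out to other replicas,
    so the answer reflects only the index held by this core.
    The port and core name are assumptions for illustration.
    """
    params = urlencode({"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"})
    return f"http://{host}:{port}/solr/{core}/select?{params}"


def parse_num_found(body):
    """Extract numFound from a Solr JSON select response."""
    return json.loads(body)["response"]["numFound"]


def fetch_local_count(host, core, port=8983, timeout=10):
    """Query one node over HTTP and return its local document count."""
    with urlopen(local_count_url(host, core, port), timeout=timeout) as resp:
        return parse_num_found(resp.read().decode("utf-8"))


def diverged(counts):
    """Return True if the per-node counts are not all the same."""
    return len(set(counts.values())) > 1


def report(nodes, core):
    """Fetch and print each node's local count, then flag any mismatch."""
    counts = {node: fetch_local_count(node, core) for node in nodes}
    for node, count in counts.items():
        print(f"{node}: {count} documents")
    if diverged(counts):
        print("Local counts differ; the replicas may be out of sync.")
    return counts
```

For the setup in this report, a call might look like report(["solr-p1", "solr-r1"], "collection1"), where "collection1" stands in for whatever core name the affected shard actually uses on each node.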



--
This message was sent by Atlassian JIRA
(v6.1#6144)
