Posted to dev@lucene.apache.org by "ludovic Boutros (JIRA)" <ji...@apache.org> on 2014/05/21 18:24:38 UTC
[jira] [Comment Edited] (SOLR-6086) Replica active during Warming

    [ https://issues.apache.org/jira/browse/SOLR-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004856#comment-14004856 ] 

ludovic Boutros edited comment on SOLR-6086 at 5/21/14 4:23 PM:
----------------------------------------------------------------

I checked the differences in the logs and in the code.

The problem occurs when:
- a node is restarted
- Peer Sync fails (for instance because there is no "/get" handler; should it become mandatory?)
- the node is already synced (nothing to replicate)

or:

- a node is restarted and it is the leader (I do not know whether this only happens with a lone leader...)
- the node is already synced (nothing to replicate)

For the first case, I think this is a side effect of the modification in SOLR-4965.

If Peer Sync is successful, the code issues an explicit commit, with a comment explaining why:

{code:title=RecoveryStrategy.java|borderStyle=solid}
            // force open a new searcher
            core.getUpdateHandler().commit(new CommitUpdateCommand(req, false));
{code}

No such commit is issued when Peer Sync fails.
Just adding this line in that case is enough to correct the issue.
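
To make the first case easier to follow, here is a minimal, self-contained sketch of the control flow as I understand it. It is not the actual RecoveryStrategy code: the class RecoverySketch, the Core stand-in, the recover() method and its commitOnFallback flag are all hypothetical names invented for illustration. The sketch also assumes that a replication which actually fetches an index ends up opening a searcher, and it places the extra commit right after the replication attempt; the exact placement is the one in the patch, not necessarily this one.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of the recovery flow; not the real RecoveryStrategy. */
public class RecoverySketch {

    /** Stand-in for a core: only tracks whether a searcher was opened after the restart. */
    static class Core {
        boolean searcherOpened = false;
        final List<String> log = new ArrayList<>();

        // Stand-in for: core.getUpdateHandler().commit(new CommitUpdateCommand(req, false));
        void commitAndOpenSearcher() {
            searcherOpened = true;
            log.add("commit: new searcher opened");
        }
    }

    /**
     * Returns true if a searcher has been opened by the end of recovery.
     *
     * @param peerSyncSucceeded whether Peer Sync worked
     * @param alreadyInSync     whether replication has nothing to fetch
     * @param commitOnFallback  hypothetical switch modelling the proposed fix
     */
    static boolean recover(Core core, boolean peerSyncSucceeded,
                           boolean alreadyInSync, boolean commitOnFallback) {
        if (peerSyncSucceeded) {
            // Existing behaviour: force open a new searcher.
            core.commitAndOpenSearcher();
            return core.searcherOpened;
        }

        // Peer Sync failed: fall back to replication.
        if (alreadyInSync) {
            core.log.add("replication: nothing to fetch");
        } else {
            // Assumption of this sketch: installing a fetched index opens a searcher.
            core.log.add("replication: index fetched");
            core.searcherOpened = true;
        }

        // Proposed fix: issue the same explicit commit on this path as well.
        if (commitOnFallback) {
            core.commitAndOpenSearcher();
        }
        return core.searcherOpened;
    }

    public static void main(String[] args) {
        Core withoutFix = new Core();
        System.out.println("without fix: " + recover(withoutFix, false, true, false));

        Core withFix = new Core();
        System.out.println("with fix:    " + recover(withFix, false, true, true));
    }
}
```

In this model, the case "Peer Sync failed and nothing to replicate" is the only path that finishes without a searcher unless the extra commit is issued.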

Here is a patch containing a test which reproduces the problem, together with the correction (to be applied to branch 4x).

I am working on the second case.


> Replica active during Warming
> -----------------------------
>
>                 Key: SOLR-6086
>                 URL: https://issues.apache.org/jira/browse/SOLR-6086
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.6.1, 4.8.1
>            Reporter: ludovic Boutros
>         Attachments: SOLR-6086.patch
>
>
> At least with Solr 4.6.1, replicas are considered active during the warming process.
> This means that if you restart a replica or create a new one, queries will
> be sent to this replica and will hang until the end of the warming
> process (if cold searchers are not used).
> You cannot add or restart a node silently anymore.
> I think that the fact that the replica is active is not a bad thing.
> But the HttpShardHandler and the CloudSolrServer class should take the warming process into account.
> Currently, I have developed a new, very simple component which checks that a searcher is registered.
> I am also developing custom HttpShardHandler and CloudSolrServer classes which will check the warming process in addition to the ACTIVE status in the cluster state.
> This seems to be more of a workaround than a solution, but that's all I can do in this version.


