Posted to issues@lucene.apache.org by "Lucene/Solr QA (Jira)" <ji...@apache.org> on 2019/10/29 05:51:00 UTC

[jira] [Commented] (SOLR-11431) Leader candidate cannot become leader if replica responds 500 to PeerSync

    [ https://issues.apache.org/jira/browse/SOLR-11431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961698#comment-16961698 ] 

Lucene/Solr QA commented on SOLR-11431:
---------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 23s{color} | {color:red} core in the patch failed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 41s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | solr.update.processor.UpdateRequestProcessorFactoryTest |
|   | solr.core.TestBadConfig |
|   | solr.core.TestCoreContainer |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-11431 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984176/SOLR-11431.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  validatesourcepatterns  |
| uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / c7c0bdf2df8 |
| ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 |
| Default Java | LTS |
| unit | https://builds.apache.org/job/PreCommit-SOLR-Build/589/artifact/out/patch-unit-solr_core.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/589/testReport/ |
| modules | C: solr solr/core U: solr |
| Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/589/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Leader candidate cannot become leader if replica responds 500 to PeerSync
> -------------------------------------------------------------------------
>
>                 Key: SOLR-11431
>                 URL: https://issues.apache.org/jira/browse/SOLR-11431
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 7.0
>            Reporter: Mano Kovacs
>            Priority: Major
>         Attachments: SOLR-11431.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When a leader candidate runs PeerSync against all replicas to download any missing updates, it tolerates failures. It uses the {{cantReachIsSuccess=true}} switch, which treats connection issues, 404, and 503 as success, since replicas being DOWN should not affect the process.
> However, if a replica has disk issues, core initialization might fail, and that results in {{500}} instead of {{503}}. A failing replica like that can prevent any other replica from becoming the leader.
> Proposing either:
> * Accepting {{500}} as "can't reach" so the leader candidate can go on
> or
> * Changing {{SolrCoreInitializationException}} to return {{503}} instead of {{500}}
> * * this might be an API change, however
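
The first proposal in the description can be sketched roughly as below. This is a minimal illustration under stated assumptions, not Solr's actual PeerSync code: the class name PeerSyncStatusSketch, the method countsAsSuccess, and the treat500AsCantReach flag are hypothetical. Only the cantReachIsSuccess switch and the 404/500/503 status codes come from the issue description.

```java
// Hypothetical sketch; names other than cantReachIsSuccess are illustrative only.
class PeerSyncStatusSketch {

    /**
     * Decides whether a replica's PeerSync response should count as a success
     * for the leader candidate.
     *
     * @param httpStatus          HTTP status returned by the replica (ignored if connectionFailed)
     * @param connectionFailed    true if the replica could not be reached at all
     * @param cantReachIsSuccess  the tolerance switch described in the issue
     * @param treat500AsCantReach hypothetical flag modelling the first proposal
     */
    static boolean countsAsSuccess(int httpStatus,
                                   boolean connectionFailed,
                                   boolean cantReachIsSuccess,
                                   boolean treat500AsCantReach) {
        // A normal successful response always counts.
        if (!connectionFailed && httpStatus == 200) {
            return true;
        }
        // Without the tolerance switch, every failure is a failure.
        if (!cantReachIsSuccess) {
            return false;
        }
        // Behaviour described in the issue: connection problems, 404 and 503 are tolerated.
        if (connectionFailed || httpStatus == 404 || httpStatus == 503) {
            return true;
        }
        // Proposed change: also tolerate 500, e.g. from a failed core initialization.
        return treat500AsCantReach && httpStatus == 500;
    }
}
```

With treat500AsCantReach set to false, a replica answering 500 blocks the candidate, which is the behaviour reported here. Setting it to true lets the election proceed. The second proposal would instead leave this logic untouched and change the status the failing replica returns.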



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
