Posted to dev@lucene.apache.org by "denis isack (JIRA)" <ji...@apache.org> on 2015/09/05 23:58:45 UTC

[jira] [Commented] (SOLR-7936) Bogus failure when deleting collections.

    [ https://issues.apache.org/jira/browse/SOLR-7936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732114#comment-14732114 ] 

denis isack commented on SOLR-7936:
-----------------------------------

The sleep just gives other readers (other people or applications accessing the same state, online or from a connected device, if any) time to see the change when they issue the same request. Each time the loop executes, the delay gives other users a chance to pick up the update, or lets any other condition required by the business logic be fulfilled, mostly to avoid concurrency pitfalls.

> Bogus failure when deleting collections.
> ----------------------------------------
>
>                 Key: SOLR-7936
>                 URL: https://issues.apache.org/jira/browse/SOLR-7936
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>
> When looking at the CDCR test failures, we began to wonder whether the problem was
> 1> the cdcr code itself
> 2> the test framework
> 3> Solr
> Some of the failures seem to be "impossible" assuming collection creation/deletion work OK.
> So I wrote a little program to exercise collection creation/deletion outside the test framework by just adding and deleting the same collection over and over again. It started failing regularly in OverseerCollectionMessageHandler.deleteCollection: at about line 780 it would throw the "Could not fully remove collection" exception:
> {code}
>       TimeOut timeout = new TimeOut(30, TimeUnit.SECONDS);
>       boolean removed = false;
>       while (! timeout.hasTimedOut()) {
>         Thread.sleep(100);
>         // WORKS SO FAR IF UNCOMMENTED zkStateReader.updateClusterState();
>         removed = !zkStateReader.getClusterState().hasCollection(collection);
>         if (removed) {
>           Thread.sleep(500); // just a bit of time so it's more likely other
>                              // readers see on return
>           break;
>         }
>       }
>       if (!removed) {
>         throw new SolrException(ErrorCode.SERVER_ERROR,
>             "Could not fully remove collection: " + collection);
>       }
> {code}
> However, the collection is really gone from clusterstate. When I put the updateClusterState() in above, it doesn't seem to fail. Is it as simple as the updateClusterState() call?
> Without the update in place, it failed within 20 reps very regularly. So far, with the update in place we're at 132 and counting. Any comments?
> If this runs 1,000 times tonight, I'll check it in if there are no objections. I don't know what it means for CDCR yet though.
> I'm also suspicious of the 500ms sleep. Anyone have a clue what that's in there for?
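
The behaviour described in the quoted report (a polling loop that times out because it keeps reading a cached view of the state) can be reproduced outside Solr with a small sketch. This is only an illustration under stated assumptions: CachedStateReader, refresh(), waitForRemoval() and demo() are hypothetical names made up for this example, not Solr or ZooKeeper APIs, and a plain in-memory set stands in for the authoritative cluster state.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

class PollUntilRemoved {

    /** Hypothetical stand-in for a reader that serves a cached snapshot of shared state. */
    static class CachedStateReader {
        private final Set<String> source;     // authoritative state (ZooKeeper in the issue)
        private volatile Set<String> cached;  // local snapshot; stale until refresh() is called

        CachedStateReader(Set<String> source) {
            this.source = source;
            this.cached = new HashSet<>(source);
        }

        /** Re-reads the authoritative state into the local snapshot. */
        void refresh() {
            cached = new HashSet<>(source);
        }

        boolean hasCollection(String name) {
            return cached.contains(name);
        }
    }

    /** Polls until the collection is gone from the reader's view or the timeout expires. */
    static boolean waitForRemoval(CachedStateReader reader, String name,
                                  long timeoutMs, boolean refreshEachPass)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() - deadline < 0) {
            Thread.sleep(10);
            if (refreshEachPass) {
                reader.refresh();  // without this, the loop only ever sees the old snapshot
            }
            if (!reader.hasCollection(name)) {
                return true;
            }
        }
        return false;
    }

    /** Deletes a collection from the source, then polls with or without refreshing. */
    static boolean demo(boolean refreshEachPass) {
        Set<String> source = ConcurrentHashMap.newKeySet();
        source.add("c1");
        CachedStateReader reader = new CachedStateReader(source);
        source.remove("c1");  // the collection really is gone from the authoritative state
        try {
            return waitForRemoval(reader, "c1", 200, refreshEachPass);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // prints "false": the stale snapshot never changes
        System.out.println(demo(true));  // prints "true": the refresh picks up the deletion
    }
}
```

In this sketch the snapshot never changes unless refresh() is called, which is a deliberate simplification; whether and when the real reader updates its cached state on its own is exactly the question the report raises.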



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
