Posted to issues@lucene.apache.org by "Erick Erickson (Jira)" <ji...@apache.org> on 2019/10/05 19:53:00 UTC

[jira] [Commented] (SOLR-13709) Race condition on core reload while core is still loading?

    [ https://issues.apache.org/jira/browse/SOLR-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16945143#comment-16945143 ] 

Erick Erickson commented on SOLR-13709:
---------------------------------------

I think I've finally nailed down what's happening here.
The test I'm concentrating on (TestSolrCLIRunExample.testInteractiveSolrCloudExample):
- Is testing the example code with the input "1 50984 testCloudExamplePrompt 2 2 _default"
-- I think the important part is that it's using the _default (schemaless) configset.
- After that, it indexes a bunch of documents, which modify the schema
- the schema modifications cause the cores to reload
- the test deletes the collection _before_ the cores are fully reloaded, and the delete removes the coreDescriptor
- the metrics manager then relies on the coreDescriptor being there and then gets the NPE 'cause it's gone

If I put a short delay in the test before deleting the collection, I can't get it to fail. Not a robust solution.

I'm trying a bunch of runs with a check that the overseer queues are empty before the collection is deleted; we'll see how that works.
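
To illustrate the difference between a fixed delay and waiting on a condition, here is a minimal sketch of a poll-with-timeout helper. It is not the code in the test and not a Solr API: WaitFor and waitUntil are hypothetical names, and the condition (for example, "the overseer queues are empty") is assumed to be supplied by the caller as a BooleanSupplier.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of Solr.
final class WaitFor {
    private WaitFor() {}

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false if the wait timed out.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            // Check first so an already-satisfied condition returns immediately.
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Sleep for the poll interval, but never past the deadline.
            long remainingMillis = Math.max(1, remainingNanos / 1_000_000);
            Thread.sleep(Math.min(pollInterval.toMillis(), remainingMillis));
        }
    }
}
```

A test would call something like WaitFor.waitUntil(queuesAreEmpty, Duration.ofSeconds(30), Duration.ofMillis(100)) before deleting the collection and fail if it returns false, where queuesAreEmpty is a placeholder for whatever check the test can actually make. Unlike a fixed sleep, this returns as soon as the condition holds and reports a timeout explicitly.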

Any better suggestions, [~hossman] [~caomanhdat2]? Anybody?

If I get a few thousand runs of this test without errors, I'll check it in. Again, note that I'm only concentrating on one of the tests at present, so this suite may have other failures.



> Race condition on core reload while core is still loading?
> ----------------------------------------------------------
>
>                 Key: SOLR-13709
>                 URL: https://issues.apache.org/jira/browse/SOLR-13709
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Assignee: Erick Erickson
>            Priority: Major
>         Attachments: apache_Lucene-Solr-Tests-8.x_449.log.txt
>
>
> A recent jenkins failure from {{TestSolrCLIRunExample}} seems to suggest that there may be a race condition when attempting to re-load a SolrCore while the core is currently in the process of (re)loading that can leave the SolrCore in an unusable state.
> Details to follow...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
