Posted to dev@lucene.apache.org by David Smiley <da...@gmail.com> on 2018/04/27 13:34:36 UTC

SolrCloud test fails RE "Can't find resource"

Just thinking out loud here about something spooky...

I've noticed some relatively rare and seemingly random SolrCloud test
failures in which a collection can't be created because an expected
configSet resource isn't found (it isn't in ZooKeeper where it ought to
be).  The initial exception will have "Can't find resource" followed by the
name of the resource -- sometimes it's solrconfig.xml, other times some
language-specific stop-word file or something.  Collection creation fails
because a core/replica won't load: the configSet is faulty, missing a
required file.  I do *not* believe this is some sort of visibility race on
the configSet wherein a sleep/wait before creating the collection would
help, because I've seen an entire test suite (test class file) of many
tests (test methods) all fail for the same reason.  In that case the
MiniSolrCloudCluster was simply initialized in beforeClass and thus ought
to have "_default" ready to be used before the test methods, yet every test
method failed for the same reason.  This is very strange; the code that
uploads the _default configSet seems sound.  Yet there's some sporadic bug,
so I guess *somewhere* there's either some race or an underlying upload
failure that is being ignored.

I think I ought to compile a list of tests that have shown this error and
the date(s) / build info in which it occurred.  Maybe there's a pattern /
something they have in common.
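
A minimal sketch of how such a list could be compiled from saved build
logs.  Everything here is an assumption for illustration: the function
names, the build labels, and the exact shape of the log line (the thread
only says the exception has "Can't find resource" followed by the resource
name, so the pattern accepts the name with or without quotes).

```python
import re
from collections import Counter

# Assumed message shape: the phrase, whitespace, then the resource name,
# optionally wrapped in single quotes. Adjust to the real log format.
PATTERN = re.compile(r"Can't find resource\s+'?([^'\s]+)'?")


def tally_missing_resources(builds):
    """Count occurrences of the error per (build label, resource name).

    `builds` maps a build label (hypothetical, e.g. a job name plus date)
    to the full text of that build's log.
    """
    hits = Counter()
    for build, log_text in builds.items():
        for match in PATTERN.finditer(log_text):
            hits[(build, match.group(1))] += 1
    return hits


def resources_by_build(hits):
    """Group the tally so each build lists the distinct missing resources."""
    grouped = {}
    for build, resource in hits:
        grouped.setdefault(build, set()).add(resource)
    return grouped
```

Grouping by build makes it easier to see whether one build lost several
files at once (which would hint at a failed upload) or whether the same
file goes missing across unrelated builds.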
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com

Re: SolrCloud test fails RE "Can't find resource"

Posted by Erick Erickson <er...@gmail.com>.
NP, I've been swamped lately anyway. If you do see any tests you want
me to beast, just give me their names. It's one of those things that
takes 5 minutes to set up, then a long time doing nothing...

Let me know if/when you want me to give it a whirl....

Erick

On Tue, May 1, 2018 at 11:02 AM, David Smiley <da...@gmail.com> wrote:
> (sorry for my belated response)
> Beasting a zillion times sounds like a decent option.  First I want to
> compile a list to see if there are any apparent patterns though.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: SolrCloud test fails RE "Can't find resource"

Posted by David Smiley <da...@gmail.com>.
(sorry for my belated response)
Beasting a zillion times sounds like a decent option.  First I want to
compile a list to see if there are any apparent patterns though.

On Fri, Apr 27, 2018 at 11:16 AM Erick Erickson <er...@gmail.com>
wrote:

> Do you think making a test that just started up a
> MiniSolrCloudCluster, created a collection then quit and then beasting
> it a zillion times might reproduce? I can help beast such a thing if
> it was available.
>
> I think there are a few root causes that are sporadically showing up
> here and there, any we can track down will help enormously in reducing
> noise.....
>
> Erick

Re: SolrCloud test fails RE "Can't find resource"

Posted by Erick Erickson <er...@gmail.com>.
Do you think a test that just starts up a MiniSolrCloudCluster, creates
a collection, then quits might reproduce it if we beast it a zillion
times? I can help beast such a thing if it were available.

I think there are a few root causes that are sporadically showing up
here and there; any we can track down will help enormously in reducing
noise...
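
For illustration, "beasting" here just means running the same test over
and over and recording which runs fail.  A rough sketch of such a loop,
assuming the test can be launched as an external command that exits
non-zero on failure (the `beast` helper is hypothetical, and the actual
build command for the test is not given in this thread, so it is left to
the caller):

```python
import subprocess
import sys


def beast(command, iterations):
    """Run `command` repeatedly and return the 1-based runs that failed.

    `command` is an argument list as accepted by subprocess.run; a run
    counts as a failure when the process exits with a non-zero status.
    """
    failures = []
    for i in range(1, iterations + 1):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures.append(i)
    return failures


if __name__ == "__main__":
    # Stand-in command that always succeeds; replace it with the real
    # test invocation.
    failed = beast([sys.executable, "-c", "pass"], 5)
    print("failed runs:", failed)
```

Output is discarded here to keep the sketch short; for a sporadic failure
like this one, saving each failing run's log would be more useful.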

Erick

