Posted to dev@lucene.apache.org by "Erick Erickson (JIRA)" <ji...@apache.org> on 2016/12/13 15:49:58 UTC

[jira] [Comment Edited] (SOLR-9843) Fix up DocValuesNotIndexedTest failures

    [ https://issues.apache.org/jira/browse/SOLR-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742595#comment-15742595 ] 

Erick Erickson edited comment on SOLR-9843 at 12/13/16 3:49 PM:
----------------------------------------------------------------

Attaching triage of the output from Jenkins. The difficult bit is that shard3_replica1 apparently gets 2 docs added (2,3) but then only returns 1 of them (id=2). From what I see, the sequence of events is fine.

- fail.txt is the snippet around this test from the failure on Jenkins (Windows 32 bit). I _think_ I've seen Jenkins failures on OS X, but don't have the record now.
- shard3_replica1.txt is all the mentions of shard3_replica1 from fail.txt
- shard_3_searchers.txt shows all of the searchers opening from shard3_replica1.txt.

Here's the sequence in the test:
- the DBQ of *:* happens in @Before
- commit happens in @Before
- the update for docs 2 and 3 going to shard3_replica1 happens
- commit happens
- a new searcher is opened
- a *:* query goes to shard3_replica1
- shard3_replica1 only returns doc 2 (ids=2).

I looked at successful tests and the "expected" thing happens, i.e. a *:* query to shard3_replica1 returns ids=2,3.
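
To make the comparison between the failing and passing runs concrete, here is a minimal sketch of the check being done by hand in the logs: given the ids sent to a replica and the ids a *:* query returned, report which ones are missing. The class and method names (DocIdCheck, missingIds) are hypothetical and are not part of DocValuesNotIndexedTest or of Solr; the ids are the ones from the triage above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the Solr test code.
public class DocIdCheck {

    /** Returns the ids that were indexed but are absent from the query response, sorted. */
    public static Set<String> missingIds(List<String> indexed, List<String> returned) {
        Set<String> missing = new TreeSet<>(indexed);
        missing.removeAll(returned);
        return missing;
    }

    public static void main(String[] args) {
        // Failing run: docs 2 and 3 went to shard3_replica1, but *:* returned only id=2.
        System.out.println(missingIds(Arrays.asList("2", "3"), Arrays.asList("2")));      // prints [3]
        // Passing run: both docs come back, so nothing is missing.
        System.out.println(missingIds(Arrays.asList("2", "3"), Arrays.asList("2", "3"))); // prints []
    }
}
```

A check like this only shows that a doc is missing after the commit and the new searcher; it says nothing about why, which is the open question here.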

So this looks like something that's not just an artifact of this test, nor something reproducible with the seeds. Maybe something fundamental to Solr, maybe something in the test framework. Maybe a race condition. Maybe I'm hallucinating.

I'm starting to think this test exposes some race condition that's been lurking in the code for a while. There's also a question on the user's list about docs not being returned that may be relevant; the title is "empty result set for a sort query" from moscovig, and [~yonik@apache.org] responded to it.


> Fix up DocValuesNotIndexedTest failures
> ---------------------------------------
>
>                 Key: SOLR-9843
>                 URL: https://issues.apache.org/jira/browse/SOLR-9843
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Blocker
>         Attachments: SOLR-9843.patch, fail.txt, shard3_replica1.txt, shard_3_searchers.txt
>
>
> I'll have to do a few iterations on the Jenkins builds since I can't get this to fail locally. Marking as "blocker" since I'll probably have to put some extra code in that I want to be sure is removed before we cut any new releases.


