Posted to dev@lucene.apache.org by Mark Miller <ma...@gmail.com> on 2010/08/01 01:10:13 UTC

Re: Solr Replication Test Case Failure

Still running the tests nonstop here as well - I'll ping the list if I see
it again.

- Mark

On 7/31/10 12:38 PM, Yonik Seeley wrote:
> FYI, I'm now running this in a loop on my ubuntu box, without the
> retry-loop, trying to replicate a failure.
> 
> -Yonik
> http://www.lucidimagination.com
> 
> On Sat, Jul 31, 2010 at 11:52 AM, Yonik Seeley
> <yo...@lucidimagination.com> wrote:
>> OK, can you try to reproduce now?
>> Since the comments indicated that all the commits were to bump up the
>> index version number, I kept them all and just inserted an additional
>> commit in the query retry loop.
>>
>> But actually... there may still be a bug somewhere (even if this fixes
>> the test failures).
>> Each commit should wait for a new searcher to be registered before
>> returning... hence it should be impossible for overlapping warming
>> searchers to be responsible for the failure.  So when the test
>> fails, either the doc add or the commit itself must be failing.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>>
>>
>> On Sat, Jul 31, 2010 at 11:35 AM, Yonik Seeley
>> <yo...@lucidimagination.com> wrote:
>>> Do the logs give any hints?
>>> Downside of only logging SEVERE is that it's much harder to
>>> investigate the cause of any intermittent failures that do happen.
>>>
>>> Looking at this test code, you shouldn't have to wait at all.  The
>>> test disables replication, indexes docs to the slave, commits (and
>>> waits for a new searcher to be registered), and then queries the
>>> slave.
>>>
>>> We should just remove that wait loop.
>>>
>>> Oh... I think I just figured it out while writing this...
>>>
>>>    index(slaveClient, "id", 551, "name", "name = " + 551);
>>>    slaveClient.commit(true, true);
>>>    index(slaveClient, "id", 552, "name", "name = " + 552);
>>>    slaveClient.commit(true, true);
>>>    index(slaveClient, "id", 553, "name", "name = " + 553);
>>>    slaveClient.commit(true, true);
>>>    index(slaveClient, "id", 554, "name", "name = " + 554);
>>>    slaveClient.commit(true, true);
>>>    index(slaveClient, "id", 555, "name", "name = " + 555);
>>>    slaveClient.commit(true, true);
>>>
>>> I bet that last commit can fail because it exceeds the max warming searchers limit.
>>> I'll fix.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>> On Sat, Jul 31, 2010 at 8:41 AM, Mark Miller <ma...@gmail.com> wrote:
>>>>
>>>>
>>>> This looks like it might actually be an issue - at a guess, it fails about
>>>> once every 20 runs.
>>>>
>>>>    [junit] Testsuite: org.apache.solr.handler.TestReplicationHandler
>>>>    [junit] Testcase: testReplicateAfterWrite2Slave(org.apache.solr.handler.TestReplicationHandler): FAILED
>>>>    [junit] expected:<1> but was:<0>
>>>>    [junit] junit.framework.AssertionFailedError: expected:<1> but was:<0>
>>>>    [junit]     at org.apache.solr.handler.TestReplicationHandler.testReplicateAfterWrite2Slave(TestReplicationHandler.java:464)
>>>>    [junit]
>>>>    [junit]
>>>>    [junit] Tests run: 7, Failures: 1, Errors: 0, Time elapsed: 343.909 sec
>>>>
>>>> At first I tried to extend the wait for it, but that's obviously no help
>>>> - in this case the test failed after running for 343 seconds. I've seen it as high as 968 seconds.
>>>>
>>>> - Mark
>>>
>>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 
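
For readers following the thread, the workaround Yonik describes (re-issuing the commit inside the query retry loop, so that one failed commit no longer leaves the last document invisible) might look roughly like the sketch below. Everything in it is an assumption made for illustration: Client, FlakyClient, and queryWithCommitRetry are hypothetical names, not the SolrJ API and not the actual TestReplicationHandler code, and the simulated failure only stands in for whatever makes the real commit fail.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitRetrySketch {

    /** Hypothetical stand-in for the slave client; this is NOT the SolrJ API. */
    interface Client {
        void add(String id);
        void commit() throws Exception; // may fail intermittently
        int count(String id);           // hits visible after the last successful commit
    }

    /** In-memory fake whose first few commits fail, mimicking an intermittent failure. */
    static class FlakyClient implements Client {
        private final List<String> pending = new ArrayList<>();
        private final List<String> visible = new ArrayList<>();
        private int failuresLeft;

        FlakyClient(int failuresLeft) {
            this.failuresLeft = failuresLeft;
        }

        public void add(String id) {
            pending.add(id);
        }

        public void commit() throws Exception {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new Exception("simulated commit failure");
            }
            // A successful commit makes all pending documents visible to queries.
            visible.addAll(pending);
            pending.clear();
        }

        public int count(String id) {
            int n = 0;
            for (String v : visible) {
                if (v.equals(id)) {
                    n++;
                }
            }
            return n;
        }
    }

    /** Commits again on every attempt, then queries; returns the last observed count. */
    static int queryWithCommitRetry(Client client, String id, int expected, int maxAttempts) {
        int found = 0;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                client.commit();
            } catch (Exception e) {
                // Commit failed; the next attempt issues another commit.
            }
            found = client.count(id);
            if (found == expected) {
                break;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        FlakyClient client = new FlakyClient(2); // first two commits fail
        client.add("555");
        System.out.println(queryWithCommitRetry(client, "555", 1, 5));
    }
}
```

As Yonik notes, a loop like this can hide the symptom without explaining it: if the doc add or the commit is really failing, swallowing the exception (as the sketch does) makes that harder to see, so a real test would probably want to log it.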

