Posted to dev@lucene.apache.org by Erick Erickson <er...@gmail.com> on 2019/05/02 00:05:57 UTC

PeerSync and PeerSyncWithLeader tests

Are failing a lot with:

Error Message:
Unexpected exception type, expected SolrException but got org.apache.solr.client.solrj.SolrServerException: Timeout occurred while waiting response from 

Is anyone working on this? The super-simple change would be to expect a different error; is that OK? I’m assuming that all that’s happening is that a different error is being thrown due to some changes on the server side, but I haven’t tried to track down whether that’s the issue or whether it’s deeper than that.

Erick
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
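
A possible shape for the "expect a different error" change is to accept either exception type instead of swapping one for the other. The sketch below is self-contained and uses hypothetical stand-in classes (`FakeSolrException`, `FakeSolrServerException`) and a hypothetical helper `expectAnyOf`; it is not the actual test code, and it would only tolerate the timeout rather than explain it.

```java
import java.util.List;

public class ExpectAnyOf {
    // Hypothetical stand-ins for the two exception types named in the failure message.
    static class FakeSolrException extends RuntimeException {
        FakeSolrException(String msg) { super(msg); }
    }
    static class FakeSolrServerException extends Exception {
        FakeSolrServerException(String msg) { super(msg); }
    }

    @FunctionalInterface
    interface ThrowingRunnable {
        void run() throws Throwable;
    }

    // Runs the action and returns the thrown exception if its type is one of the accepted ones.
    static Throwable expectAnyOf(List<Class<? extends Throwable>> accepted, ThrowingRunnable action) {
        try {
            action.run();
        } catch (Throwable t) {
            for (Class<? extends Throwable> c : accepted) {
                if (c.isInstance(t)) {
                    return t;
                }
            }
            throw new AssertionError("Unexpected exception type: " + t.getClass().getName(), t);
        }
        throw new AssertionError("Expected an exception but none was thrown");
    }

    public static void main(String[] args) {
        List<Class<? extends Throwable>> accepted =
            List.of(FakeSolrException.class, FakeSolrServerException.class);
        Throwable t = expectAnyOf(accepted, () -> {
            throw new FakeSolrServerException("Timeout occurred while waiting response");
        });
        System.out.println("Accepted: " + t.getClass().getSimpleName());
    }
}
```

Accepting both types keeps the test green whichever error surfaces, but it says nothing about why the server side timed out, so it is at best a stopgap.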


Re: PeerSync and PeerSyncWithLeader tests

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
I think this is an issue with Solr 8.1 as well, as I saw some failures
on this test on my local Jenkins.

On Thu, May 2, 2019 at 3:42 PM Mikhail Khludnev <mk...@apache.org> wrote:
>
> Hello, Erick.
> I've looked at one failed job: https://builds.apache.org/job/Lucene-Solr-Tests-8.x/183/consoleFull
> It seems like a severe server-side problem: it hangs waiting for "dependent" updates. It passes on my laptop, so far.
>
>    [junit4]   2> 2825663 INFO  (qtp2139807079-27393) [    x:collection1] o.a.s.u.p.LogUpdateProcessorFactory [collection1]  webapp= path=/update params={update.distrib=FROMLEADER&distrib.inplace.prevversion=6000&wt=javabin&version=2}{} 0 135571
>    [junit4]   2> 2825663 ERROR (qtp2139807079-27393) [    x:collection1] o.a.s.h.RequestHandlerBase java.lang.RuntimeException: java.lang.InterruptedException
>    [junit4]   2> at org.apache.solr.update.VersionBucket.awaitNanos(VersionBucket.java:68)
>    [junit4]   2> at org.apache.solr.update.processor.DistributedUpdateProcessor.doWaitForDependentUpdates(DistributedUpdateProcessor.java:593)
>    [junit4]   2> at org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$waitForDependentUpdates$1(DistributedUpdateProcessor.java:536)
>    [junit4]   2> at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:50)
>    [junit4]   2> at org.apache.solr.update.processor.DistributedUpdateProcessor.waitForDependentUpdates(DistributedUpdateProcessor.java:536)
>    [junit4]   2> at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:327)
>    [junit4]   2> at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:223)
>    [junit4]   2> at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>    [junit4]   2> at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:110)
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev



Re: PeerSync and PeerSyncWithLeader tests

Posted by Mikhail Khludnev <mk...@apache.org>.
Hello, Erick.
I've looked at one failed job:
https://builds.apache.org/job/Lucene-Solr-Tests-8.x/183/consoleFull
It seems like a severe server-side problem: it hangs waiting for
"dependent" updates. It passes on my laptop, so far.

   [junit4]   2> 2825663 INFO  (qtp2139807079-27393) [    x:collection1] o.a.s.u.p.LogUpdateProcessorFactory [collection1]  webapp= path=/update params={update.distrib=FROMLEADER&distrib.inplace.prevversion=6000&wt=javabin&version=2}{} 0 135571
   [junit4]   2> 2825663 ERROR (qtp2139807079-27393) [    x:collection1] o.a.s.h.RequestHandlerBase java.lang.RuntimeException: java.lang.InterruptedException
   [junit4]   2> 	at org.apache.solr.update.VersionBucket.awaitNanos(VersionBucket.java:68)
   [junit4]   2> 	at org.apache.solr.update.processor.DistributedUpdateProcessor.doWaitForDependentUpdates(DistributedUpdateProcessor.java:593)
   [junit4]   2> 	at org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$waitForDependentUpdates$1(DistributedUpdateProcessor.java:536)
   [junit4]   2> 	at org.apache.solr.update.VersionBucket.runWithLock(VersionBucket.java:50)
   [junit4]   2> 	at org.apache.solr.update.processor.DistributedUpdateProcessor.waitForDependentUpdates(DistributedUpdateProcessor.java:536)
   [junit4]   2> 	at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:327)
   [junit4]   2> 	at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:223)
   [junit4]   2> 	at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
   [junit4]   2> 	at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:110)
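
The trace above shows a thread interrupted inside a timed wait while waiting for a "dependent" update. As a rough, self-contained illustration of that general pattern (not Solr's actual `VersionBucket` or `DistributedUpdateProcessor` code), the sketch below waits under a lock until a hypothetical `lastVersion` reaches the expected previous version, gives up after a timeout, and restores the interrupt flag rather than wrapping the `InterruptedException` in a `RuntimeException`. Every class, field, and method name here is an assumption made for the example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class DependentUpdateWait {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition versionChanged = lock.newCondition();
    private long lastVersion; // hypothetical: highest version applied so far

    // Records an applied update and wakes any waiting threads.
    public void applied(long version) {
        lock.lock();
        try {
            lastVersion = Math.max(lastVersion, version);
            versionChanged.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Returns true if prevVersion arrived within the timeout, false on timeout or interrupt.
    public boolean waitForDependent(long prevVersion, long timeoutMillis) {
        long nanosLeft = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (lastVersion < prevVersion) {
                if (nanosLeft <= 0L) {
                    return false; // timed out without seeing the dependent update
                }
                // awaitNanos returns an estimate of the time still remaining
                nanosLeft = versionChanged.awaitNanos(nanosLeft);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        DependentUpdateWait bucket = new DependentUpdateWait();
        Thread writer = new Thread(() -> bucket.applied(6000L));
        writer.start();
        System.out.println("arrived: " + bucket.waitForDependent(6000L, 5000L));
        writer.join();
        System.out.println("missing: " + bucket.waitForDependent(7000L, 50L));
    }
}
```

In this sketch a dependent update that never arrives makes the caller block for the full timeout, which may be the kind of stall that surfaces on the client as a timeout; whether that is what happens in the failing job is not established here.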


-- 
Sincerely yours
Mikhail Khludnev

Re: PeerSync and PeerSyncWithLeader tests

Posted by Mikhail Khludnev <mk...@apache.org>.
Not sure it's significant, but the NPE in the debug log is caused by a null
ZkController:
            log.debug(req.getCore()
                .getCoreContainer().getZkController().getNodeName()
                + " min count to sync to (from most recent searcher view) "
                + searcher.count(new MatchAllDocsQuery()));
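
If the null `ZkController` is indeed the cause, a guard before dereferencing it would avoid the NPE in the debug block. The sketch below is standalone and uses hypothetical stand-in types (`Container`, `ZkCtl`) rather than Solr's classes, so it only illustrates the null check and is not the actual fix.

```java
import java.util.Optional;

public class NullSafeDebug {
    // Hypothetical stand-ins for the chain ...getCoreContainer().getZkController().
    static class ZkCtl {
        private final String nodeName;
        ZkCtl(String nodeName) { this.nodeName = nodeName; }
        String getNodeName() { return nodeName; }
    }
    static class Container {
        private final ZkCtl zk; // may be null, as in the reported NPE
        Container(ZkCtl zk) { this.zk = zk; }
        ZkCtl getZkController() { return zk; }
    }

    // Builds the debug message without dereferencing a null controller.
    static String debugMessage(Container container, long docCount) {
        String node = Optional.ofNullable(container.getZkController())
            .map(ZkCtl::getNodeName)
            .orElse("(no ZkController)");
        return node + " min count to sync to (from most recent searcher view) " + docCount;
    }

    public static void main(String[] args) {
        System.out.println(debugMessage(new Container(new ZkCtl("node1")), 42L));
        System.out.println(debugMessage(new Container(null), 42L));
    }
}
```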

On Thu, May 2, 2019 at 5:46 PM Mikhail Khludnev <mk...@apache.org> wrote:

> FWIW, a failed PeerSync run with debug logging enabled left a pretty odd
> exception in the log:
> 136672 DEBUG (qtp662266656-56) [    x:collection1] o.a.s.c.S.Request [collection1]  webapp= path=/get params={getUpdates=6000...7001&distrib=false&qt=/get&fingerprint=true&onlyIfActive=false&wt=javabin&version=2}
> 136672 DEBUG (qtp662266656-56) [    x:collection1] o.a.s.h.c.RealTimeGetComponent Error in solrcloud_debug block
>           => java.lang.NullPointerException
> at org.apache.solr.handler.component.RealTimeGetComponent.process(RealTimeGetComponent.java:170)
> java.lang.NullPointerException: null
> at org.apache.solr.handler.component.RealTimeGetComponent.process(RealTimeGetComponent.java:170) ~[main/:?]
>
> Looking into the code left me puzzled.
>
> --
> Sincerely yours
> Mikhail Khludnev
>


-- 
Sincerely yours
Mikhail Khludnev

Re: PeerSync and PeerSyncWithLeader tests

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
It seems the PeerSyncTest started failing around the time SOLR-12833
was committed. I am not sure if it is related.
@Andrzej Bialecki, do you have any ideas?

On Thu, May 2, 2019 at 8:16 PM Mikhail Khludnev <mk...@apache.org> wrote:
>
> FWIW, a failed PeerSync run with debug logging enabled left a pretty odd exception in the log:
> 136672 DEBUG (qtp662266656-56) [    x:collection1] o.a.s.c.S.Request [collection1]  webapp= path=/get params={getUpdates=6000...7001&distrib=false&qt=/get&fingerprint=true&onlyIfActive=false&wt=javabin&version=2}
> 136672 DEBUG (qtp662266656-56) [    x:collection1] o.a.s.h.c.RealTimeGetComponent Error in solrcloud_debug block
>           => java.lang.NullPointerException
> at org.apache.solr.handler.component.RealTimeGetComponent.process(RealTimeGetComponent.java:170)
> java.lang.NullPointerException: null
> at org.apache.solr.handler.component.RealTimeGetComponent.process(RealTimeGetComponent.java:170) ~[main/:?]
>
> Looking into the code left me puzzled.
>
>
> --
> Sincerely yours
> Mikhail Khludnev



Re: PeerSync and PeerSyncWithLeader tests

Posted by Mikhail Khludnev <mk...@apache.org>.
FWIW, a failed PeerSync run with debug logging enabled left a pretty odd
exception in the log:
136672 DEBUG (qtp662266656-56) [    x:collection1] o.a.s.c.S.Request [collection1]  webapp= path=/get params={getUpdates=6000...7001&distrib=false&qt=/get&fingerprint=true&onlyIfActive=false&wt=javabin&version=2}
136672 DEBUG (qtp662266656-56) [    x:collection1] o.a.s.h.c.RealTimeGetComponent Error in solrcloud_debug block
          => java.lang.NullPointerException
at org.apache.solr.handler.component.RealTimeGetComponent.process(RealTimeGetComponent.java:170)
java.lang.NullPointerException: null
at org.apache.solr.handler.component.RealTimeGetComponent.process(RealTimeGetComponent.java:170) ~[main/:?]

Looking into the code left me puzzled.

-- 
Sincerely yours
Mikhail Khludnev