Posted to issues@lucene.apache.org by "Chris M. Hostetter (Jira)" <ji...@apache.org> on 2019/10/11 17:55:00 UTC

[jira] [Commented] (SOLR-13778) Windows JDK SSL Test Failure trend: SSLException: Software caused connection abort: recv failed

    [ https://issues.apache.org/jira/browse/SOLR-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949681#comment-16949681 ] 

Chris M. Hostetter commented on SOLR-13778:
-------------------------------------------


I just realized we're seeing a slightly _different_ SSLException from Uwe's java13 windows VMs...

{noformat}
   [junit4]    > Throwable #1: org.apache.solr.client.solrj.SolrServerException: IOException occurred when talking to server at: https://127.0.0.1:55121/solr
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([E2C1EFE3F69FB5C6:35E9A23BE77FFC28]:0)
   [junit4]    >        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:679)
   [junit4]    >        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:265)
   [junit4]    >        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
   [junit4]    >        at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:368)
   [junit4]    >        at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:296)
   [junit4]    >        at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1128)
   [junit4]    >        at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:897)
   [junit4]    >        at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:829)
   [junit4]    >        at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
   [junit4]    >        at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:228)
   [junit4]    >        at org.apache.solr.cloud.MiniSolrCloudCluster.deleteAllCollections(MiniSolrCloudCluster.java:549)
   [junit4]    >        at org.apache.solr.cloud.TestCloudSearcherWarming.tearDown(TestCloudSearcherWarming.java:79)
   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]    >        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]    >        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]    >        at java.base/java.lang.reflect.Method.invoke(Method.java:567)
   [junit4]    >        at java.base/java.lang.Thread.run(Thread.java:830)
   [junit4]    > Caused by: javax.net.ssl.SSLException: An established connection was aborted by the software in your host machine
   [junit4]    >        at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:127)
   [junit4]    >        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
   [junit4]    >        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
   [junit4]    >        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
   [junit4]    >        at java.base/sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1652)
   [junit4]    >        at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1038)
   [junit4]    >        at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
   [junit4]    >        at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
   [junit4]    >        at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
   [junit4]    >        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
   [junit4]    >        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
   [junit4]    >        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
   [junit4]    >        at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
   [junit4]    >        at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
   [junit4]    >        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
   [junit4]    >        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
   [junit4]    >        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
   [junit4]    >        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
   [junit4]    >        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
   [junit4]    >        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
   [junit4]    >        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
   [junit4]    >        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
   [junit4]    >        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
   [junit4]    >        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:564)
   [junit4]    >        ... 47 more
   [junit4]    >        Suppressed: java.net.SocketException: An established connection was aborted by the software in your host machine
   [junit4]    >                at java.base/sun.nio.ch.NioSocketImpl.implWrite(NioSocketImpl.java:421)
   [junit4]    >                at java.base/sun.nio.ch.NioSocketImpl.write(NioSocketImpl.java:441)
   [junit4]    >                at java.base/sun.nio.ch.NioSocketImpl$2.write(NioSocketImpl.java:825)
   [junit4]    >                at java.base/java.net.Socket$SocketOutputStream.write(Socket.java:989)
   [junit4]    >                at java.base/sun.security.ssl.SSLSocketOutputRecord.encodeAlert(SSLSocketOutputRecord.java:82)
   [junit4]    >                at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:355)
   [junit4]    >                ... 69 more
   [junit4]    > Caused by: java.net.SocketException: An established connection was aborted by the software in your host machine
   [junit4]    >        at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:326)
   [junit4]    >        at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:351)
   [junit4]    >        at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:802)
   [junit4]    >        at java.base/java.net.Socket$SocketInputStream.read(Socket.java:919)
   [junit4]    >        at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:450)
   [junit4]    >        at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
   [junit4]    >        at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1409)
   [junit4]    >        at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1022)
   [junit4]    >        ... 65 more
{noformat}
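
For anyone comparing these traces by script, here is a minimal Python sketch that pulls out just the exception header lines ('Throwable #N', 'Caused by', 'Suppressed') from junit4-prefixed output like the trace above. The regular expression and the function name exception_chain are my own assumptions about the log layout, not anything from the Solr build or the report tooling.

```python
import re

# Assumed layout of a header line in junit4 output, e.g.
#   "[junit4]    > Caused by: javax.net.ssl.SSLException: some message"
# Stack frames ("at ...") and "... N more" lines do not match.
HEADER = re.compile(
    r"\[junit4\]\s+>\s+(?P<kind>Throwable #\d+|Caused by|Suppressed): "
    r"(?P<cls>[\w.$]+)(?:: (?P<msg>.*))?$"
)


def exception_chain(lines):
    """Return (kind, exception class, message) for each header line, in order."""
    chain = []
    for line in lines:
        match = HEADER.search(line.rstrip())
        if match:
            chain.append(
                (match.group("kind"), match.group("cls"), match.group("msg") or "")
            )
    return chain
```

Run over the trace above, this should list the SolrServerException first, then the SSLException as 'Caused by', then the SocketException once as 'Suppressed' and once as 'Caused by'.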

As before, these only happen on Windows jobs (and Uwe is the only one running Windows jobs), and the root cause SocketException is only ever reported as part of these SSLExceptions. In this case it shows up as both 'Caused by' and 'Suppressed', which is why the SocketException count below is exactly twice the SSLException count...


{noformat}
$ zgrep -c 'javax.net.ssl.SSLException: An established connection was aborted by the software in your host machine' */*/*/jenkins.log.txt.gz | grep -v ':0$'
thetaphi/Lucene-Solr-8.x-Windows/489/jenkins.log.txt.gz:6
thetaphi/Lucene-Solr-8.x-Windows/491/jenkins.log.txt.gz:6
thetaphi/Lucene-Solr-8.x-Windows/494/jenkins.log.txt.gz:5
thetaphi/Lucene-Solr-8.x-Windows/499/jenkins.log.txt.gz:6
thetaphi/Lucene-Solr-master-Windows/8177/jenkins.log.txt.gz:11
thetaphi/Lucene-Solr-master-Windows/8180/jenkins.log.txt.gz:11
thetaphi/Lucene-Solr-master-Windows/8181/jenkins.log.txt.gz:6
thetaphi/Lucene-Solr-master-Windows/8183/jenkins.log.txt.gz:23

$ zgrep 'javax.net.ssl.SSLException: An established connection was aborted by the software in your host machine' */*/*/jenkins.log.txt.gz | wc -l
74
$ zgrep 'java.net.SocketException: An established connection was aborted by the software in your host machine' */*/*/jenkins.log.txt.gz | wc -l
148
$ zgrep 'java.net.SocketException: An established connection was aborted by the software in your host machine' thetaphi/*Windows/*/jenkins.log.txt.gz | wc -l
148
{noformat}
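
If the zgrep pipeline isn't available (for example on a Windows box), the per-file counts above can be approximated with a short Python sketch. The function name count_matching_lines is hypothetical, and the glob assumes the same */*/*/jenkins.log.txt.gz layout used in the commands above. It does plain substring matching, so it is slightly stricter than grep, which treats the '.' characters in the pattern as wildcards.

```python
import glob
import gzip


def count_matching_lines(paths, needle):
    """Per-file count of lines containing `needle`; files with no hits are omitted."""
    counts = {}
    for path in paths:
        # Jenkins logs may contain odd bytes, so replace undecodable characters.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as log:
            hits = sum(1 for line in log if needle in line)
        if hits:
            counts[path] = hits
    return counts


if __name__ == "__main__":
    needle = ("javax.net.ssl.SSLException: An established connection "
              "was aborted by the software in your host machine")
    # Assumed directory layout: <jenkins>/<job>/<build>/jenkins.log.txt.gz
    paths = sorted(glob.glob("*/*/*/jenkins.log.txt.gz"))
    per_file = count_matching_lines(paths, needle)
    for path, hits in per_file.items():
        print(f"{path}:{hits}")
    print(sum(per_file.values()))
```

The final print is the grand total, i.e. the equivalent of the `zgrep ... | wc -l` form.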

It's probably worth noting for posterity the java-version info for both types of exceptions...

{noformat}
$ zgrep -l 'javax.net.ssl.SSLException: Software caused connection abort: recv failed' thetaphi/*Windows/*/jenkins.log.txt.gz | xargs zgrep -h -F '[java-info]' | grep -v -E 'java version|Test args|Runtime Environment' | sort | uniq -c
      3 [java-info] OpenJDK 64-Bit Server VM (11.0.4+11, AdoptOpenJDK)
      5 [java-info] OpenJDK 64-Bit Server VM (12.0.1+12, AdoptOpenJDK)
$ zgrep -l 'javax.net.ssl.SSLException: An established connection was aborted by the software in your host machine' thetaphi/*Windows/*/jenkins.log.txt.gz | xargs zgrep -h -F '[java-info]' | grep -v -E 'java version|Test args|Runtime Environment' | sort | uniq -c
      5 [java-info] OpenJDK 64-Bit Server VM (13+33, Oracle Corporation)
      3 [java-info] OpenJDK 64-Bit Server VM (14-ea+14-570, Oracle Corporation)
{noformat}
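
The same tally can be sketched in Python for anyone who wants to rerun it without the zgrep/xargs pipeline. The function name jvm_tally is hypothetical. The '[java-info]' marker and the three excluded phrases are copied from the commands above. Lines are stripped of surrounding whitespace before counting, which `sort | uniq -c` does not do.

```python
import gzip
import re
from collections import Counter

# Same exclusions as the `grep -v -E` in the shell pipeline.
SKIP = re.compile(r"java version|Test args|Runtime Environment")


def jvm_tally(paths, needle):
    """Count distinct [java-info] lines across the logs that contain `needle`."""
    tally = Counter()
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as log:
            lines = log.readlines()
        # Equivalent of `zgrep -l`: only consider logs that mention the exception.
        if not any(needle in line for line in lines):
            continue
        for line in lines:
            if "[java-info]" in line and not SKIP.search(line):
                tally[line.strip()] += 1
    return tally
```

Calling it once per exception message over the thetaphi Windows logs should give the two groups of JVM versions shown above.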


----

Uwe: any thoughts on how we should deal with these? ... they currently represent the cause of the biggest chunk of recurring failures in Solr test cases.

> Windows JDK SSL Test Failure trend: SSLException: Software caused connection abort: recv failed
> -----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13778
>                 URL: https://issues.apache.org/jira/browse/SOLR-13778
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Priority: Major
>
> Now that Uwe's jenkins build has been correctly reporting its build results for my [automated reports|http://fucit.org/solr-jenkins-reports/failure-report.html] to pick up, I've noticed a pattern of failures that indicates a definite problem with using SSL on Windows (even with java 11.0.4).
> The symptomatic stack traces all contain...
> {noformat}
> ...
>    [junit4]    > Caused by: javax.net.ssl.SSLException: Software caused connection abort: recv failed
>    [junit4]    >        at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:127)
> ...
>    [junit4]    > Caused by: java.net.SocketException: Software caused connection abort: recv failed
>    [junit4]    >        at java.base/java.net.SocketInputStream.socketRead0(Native Method)
> ...
> {noformat}
> I suspect this may be related to [https://bugs.openjdk.java.net/browse/JDK-8209333] but I have no concrete evidence to back this up.
> I'll post some details of my analysis in comments...


