You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by Rainer Jung <ra...@kippdata.de> on 2015/05/25 07:26:38 UTC

Strange 1000 seconds timeout in AprEndpoint

When testing tcnative I observed a strange 1000 seconds timeout. The 
occurrence of the timeout was likely due to using an old OpenSSL version 
(0.9.8m) in combination with the unit tests that want a newer one, but I 
found the long duration of 1000 seconds strange.

It happens in TestSsl.testSimpleSsl lines 66 and again in line 86 (TC8):

25-May-2015 00:30:27.121 SEVERE [main] 
org.apache.coyote.AbstractProtocol.init Failed to initialize end point 
associated with ProtocolHandler ["http-apr-127.0.0.1-auto-1-40424"]
  java.lang.Exception: Unable to create SSLContext. Check that SSLEngine 
is enabled in the AprLifecycleListener, the AprLifecycleListener has 
initialised correctly and that a valid SSLProtocol has been specified
         at 
org.apache.tomcat.util.net.AprEndpoint.bind(AprEndpoint.java:538)
         at 
org.apache.tomcat.util.net.AbstractEndpoint.init(AbstractEndpoint.java:737)


As a consequence the handshakes hangs:


Testcase: testSimpleSsl took 1,006.091 sec
         Caused an ERROR
Read timed out
java.net.SocketTimeoutException: Read timed out
         at java.net.SocketInputStream.socketRead0(Native Method)
...
         at 
sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
...
org.apache.tomcat.util.net.TestSsl.testSimpleSsl(TestSsl.java:67)


I did a little svn archaelogy and I think the problem goes back to 
revision r1433976 in AprEndpoint where the following code block


-                    // Check timeouts (much less frequently than polling)
-                    if (maintainTime > 1000000L && running) {
-                        maintainTime = 0;
-                        if (socketProperties.getSoTimeout() > 0) {
-                            doTimeout(connectionPollset);
-                        }
-                    }


changed into


+                    // Process socket timeouts
+                    if (getSoTimeout() > 0 && maintain++ > 1000 && 
running) {
+                        // This works and uses only one timeout 
mechanism for everything, but the
+                        // non event poller might be a bit faster by 
using the old maintain.
+                        maintain = 0;
+                        maintain();
                      }

The commit contains many more changes, but I think the problem can be 
localized in this block. In APR maintainTime is in microseconds (it is 
incremented with pollTime which is in microseconds), so 1000000L is one 
second. The new block counts maintain and I think that is incremented 
once per second, so we switch from a 1000000 microsecond = 1 second 
condition to a 1000 second condition.

I'm not sure whether it is better to simply check for maintain++ > 1 (or 
 >= 1) or switch back to summing up pollTime.Needs more investigation.

Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: Strange 1000 seconds timeout in AprEndpoint

Posted by Mark Thomas <ma...@apache.org>.
On 26/05/2015 09:31, Mark Thomas wrote:
> On 25/05/2015 06:26, Rainer Jung wrote:
>> When testing tcnative I observed a strange 1000 seconds timeout. The
>> occurrence of the timeout was likely due to using an old OpenSSL version
>> (0.9.8m) in combination with the unit tests that want a newer one, but I
>> found the long duration of 1000 seconds strange.
>>
>> It happens in TestSsl.testSimpleSsl lines 66 and again in line 86 (TC8):
>>
>> 25-May-2015 00:30:27.121 SEVERE [main]
>> org.apache.coyote.AbstractProtocol.init Failed to initialize end point
>> associated with ProtocolHandler ["http-apr-127.0.0.1-auto-1-40424"]
>>  java.lang.Exception: Unable to create SSLContext. Check that SSLEngine
>> is enabled in the AprLifecycleListener, the AprLifecycleListener has
>> initialised correctly and that a valid SSLProtocol has been specified
>>         at
>> org.apache.tomcat.util.net.AprEndpoint.bind(AprEndpoint.java:538)
>>         at
>> org.apache.tomcat.util.net.AbstractEndpoint.init(AbstractEndpoint.java:737)
>>
>>
>> As a consequence the handshakes hangs:
>>
>>
>> Testcase: testSimpleSsl took 1,006.091 sec
>>         Caused an ERROR
>> Read timed out
>> java.net.SocketTimeoutException: Read timed out
>>         at java.net.SocketInputStream.socketRead0(Native Method)
>> ...
>>         at
>> sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
>> ...
>> org.apache.tomcat.util.net.TestSsl.testSimpleSsl(TestSsl.java:67)
>>
>>
>> I did a little svn archaelogy and I think the problem goes back to
>> revision r1433976 in AprEndpoint where the following code block
>>
>>
>> -                    // Check timeouts (much less frequently than polling)
>> -                    if (maintainTime > 1000000L && running) {
>> -                        maintainTime = 0;
>> -                        if (socketProperties.getSoTimeout() > 0) {
>> -                            doTimeout(connectionPollset);
>> -                        }
>> -                    }
>>
>>
>> changed into
>>
>>
>> +                    // Process socket timeouts
>> +                    if (getSoTimeout() > 0 && maintain++ > 1000 &&
>> running) {
>> +                        // This works and uses only one timeout
>> mechanism for everything, but the
>> +                        // non event poller might be a bit faster by
>> using the old maintain.
>> +                        maintain = 0;
>> +                        maintain();
>>                      }
>>
>> The commit contains many more changes, but I think the problem can be
>> localized in this block. In APR maintainTime is in microseconds (it is
>> incremented with pollTime which is in microseconds), so 1000000L is one
>> second. The new block counts maintain and I think that is incremented
>> once per second, so we switch from a 1000000 microsecond = 1 second
>> condition to a 1000 second condition.
>>
>> I'm not sure whether it is better to simply check for maintain++ > 1 (or
>>> = 1) or switch back to summing up pollTime.Needs more investigation.
> 
> Looking at it now. I think there are some simplifications that can be
> made but I need to look into the more closely.

OK. I think I am done now (but I'm waiting for the CI to confirm there
are no new timeout issues in async etc. that my local testing hasn't
spotted).

Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: Strange 1000 seconds timeout in AprEndpoint

Posted by Mark Thomas <ma...@apache.org>.
On 25/05/2015 06:26, Rainer Jung wrote:
> When testing tcnative I observed a strange 1000 seconds timeout. The
> occurrence of the timeout was likely due to using an old OpenSSL version
> (0.9.8m) in combination with the unit tests that want a newer one, but I
> found the long duration of 1000 seconds strange.
> 
> It happens in TestSsl.testSimpleSsl lines 66 and again in line 86 (TC8):
> 
> 25-May-2015 00:30:27.121 SEVERE [main]
> org.apache.coyote.AbstractProtocol.init Failed to initialize end point
> associated with ProtocolHandler ["http-apr-127.0.0.1-auto-1-40424"]
>  java.lang.Exception: Unable to create SSLContext. Check that SSLEngine
> is enabled in the AprLifecycleListener, the AprLifecycleListener has
> initialised correctly and that a valid SSLProtocol has been specified
>         at
> org.apache.tomcat.util.net.AprEndpoint.bind(AprEndpoint.java:538)
>         at
> org.apache.tomcat.util.net.AbstractEndpoint.init(AbstractEndpoint.java:737)
> 
> 
> As a consequence the handshakes hangs:
> 
> 
> Testcase: testSimpleSsl took 1,006.091 sec
>         Caused an ERROR
> Read timed out
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
> ...
>         at
> sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
> ...
> org.apache.tomcat.util.net.TestSsl.testSimpleSsl(TestSsl.java:67)
> 
> 
> I did a little svn archaelogy and I think the problem goes back to
> revision r1433976 in AprEndpoint where the following code block
> 
> 
> -                    // Check timeouts (much less frequently than polling)
> -                    if (maintainTime > 1000000L && running) {
> -                        maintainTime = 0;
> -                        if (socketProperties.getSoTimeout() > 0) {
> -                            doTimeout(connectionPollset);
> -                        }
> -                    }
> 
> 
> changed into
> 
> 
> +                    // Process socket timeouts
> +                    if (getSoTimeout() > 0 && maintain++ > 1000 &&
> running) {
> +                        // This works and uses only one timeout
> mechanism for everything, but the
> +                        // non event poller might be a bit faster by
> using the old maintain.
> +                        maintain = 0;
> +                        maintain();
>                      }
> 
> The commit contains many more changes, but I think the problem can be
> localized in this block. In APR maintainTime is in microseconds (it is
> incremented with pollTime which is in microseconds), so 1000000L is one
> second. The new block counts maintain and I think that is incremented
> once per second, so we switch from a 1000000 microsecond = 1 second
> condition to a 1000 second condition.
> 
> I'm not sure whether it is better to simply check for maintain++ > 1 (or
>>= 1) or switch back to summing up pollTime.Needs more investigation.

Looking at it now. I think there are some simplifications that can be
made but I need to look into the more closely.

Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org