You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by Wesley Hall <we...@prociss.co.uk> on 2003/08/11 18:30:18 UTC

Bug Report - 4.1.27 Race Condition

Hello,

I have found a small bug during my work embedding the tomcat server into our
application.

When a secure connector is added to an instance of Embedded, and embedded is
startd and stopped several times in quick succession (which is part of our
test framework) a race condition occurs.

Sometimes i am presented with an exception with a root cause of...

Caused by: LifecycleException: Coyote connector has not been started
at org.apache.coyote.tomcat4.CoyoteConnector.stop(CoyoteConnector.java:1199)
at org.apache.catalina.startup.Embedded.stop(Embedded.java:1030)

(NOTE: This exception is thrown after a call to the start method of Embedded
has been made and connectors added)

Other times the test is successful. Sometimes (although a little more rare
than these two outcomes), I recieve an 'Address already in use', which seems
to be the inverse of the exception above, and occurs on a call to start()
after a stop() has already been made.

Placing a Thread.sleep(x) between start/stop attempts seems to increase the
likelyhood of the test being successful and increasing the interval increase
the percentage of successful test runs until upon reaching an interval of
75ms the success rate becomes 100%.

It seems there is a race condition in starting up or shutting down of
Embedded vs The secure connector. Sometimes Embedded.start() will return
before the connector has been started and the same with Embedded.stop().
Unfortunatly, i cannot seem to locate the Coyote source to verify this.

Presumably, this issue is pretty minor as the only circumstance i can see
where someone will be running stops and starts in quick succession is a
contrived test case, but i thought i would bring it to your attention
anyway.

Regards

Wesley I. Hall

P.S. If Mr Bill Barker is reading this, your advice on my https problem
worked like a charm. I owe you a beer! =o)




---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org


RE: Bug Report - 4.1.27 Race Condition

Posted by Wesley Hall <we...@prociss.co.uk>.
Having looked at this problem a little more (assuming this is the same
problem, i had done a little recoding, but the symptoms are the same), i
have some more info.

>
> Unless there's a major unexpected flaw and I missed something, I do
> believe that the shutdown is synchronous if you use the stop method
> (using the shutdown socket is not synchronous, but it's another
> problem).

I am not getting the following stack trace from my ant driven junit tests...

SEVERE: Caught exception trying to unlock accept.
java.net.ConnectException: Connection refused: connect
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:305)
        at
java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:171)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:158)
        at java.net.Socket.connect(Socket.java:452)
        at java.net.Socket.connect(Socket.java:402)
        at java.net.Socket.<init>(Socket.java:309)
        at java.net.Socket.<init>(Socket.java:124)
        at
org.apache.tomcat.util.net.PoolTcpEndpoint.closeServerSocket(PoolTcpEndpoint
.java:326)
        at
org.apache.tomcat.util.net.PoolTcpEndpoint.acceptSocket(PoolTcpEndpoint.java
:397)
        at
org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:529)
        at
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.jav
a:619)
        at java.lang.Thread.run(Thread.java:534)
11-Aug-2003 21:55:06 org.apache.tomcat.util.net.PoolTcpEndpoint
closeServerSocket
SEVERE: Caught exception trying to close socket.
java.lang.NullPointerException
        at
org.apache.tomcat.util.net.PoolTcpEndpoint.closeServerSocket(PoolTcpEndpoint
.java:338)
        at
org.apache.tomcat.util.net.PoolTcpEndpoint.acceptSocket(PoolTcpEndpoint.java
:397)
        at
org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:529)
        at
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.jav
a:619)
        at java.lang.Thread.run(Thread.java:534)

The presence of Thread at the bottom of each traces suggests that there is
some asynchronous code in here somewhere. Is the stop() method delegating
its work to this shutdown socket you mention?

> It's part of the source distribution, in the j-t-connectors directory.

I really cannot find this code, I have an unzipped copy of
jakarta-tomcat-4.1.27-src.zip, in which i cannot seem to find the coyote
code, nor this PoolTcpEndpoint. I seem to have the package
org.apache.catalina.* but not org.apache.tomcat.*. Do i need to download an
earlier copy of the source for these classes?

Thanks for your time

Regards

Wesley I. Hall


RE: Bug Report - 4.1.27 Race Condition

Posted by Wesley Hall <we...@prociss.co.uk>.
Having looked at this problem a little more (assuming this is the same
problem, i had done a little recoding, but the symptoms are the same), i
have some more info.

>
> Unless there's a major unexpected flaw and I missed something, I do
> believe that the shutdown is synchronous if you use the stop method
> (using the shutdown socket is not synchronous, but it's another
> problem).

I am not getting the following stack trace from my ant driven junit tests...

SEVERE: Caught exception trying to unlock accept.
java.net.ConnectException: Connection refused: connect
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:305)
        at
java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:171)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:158)
        at java.net.Socket.connect(Socket.java:452)
        at java.net.Socket.connect(Socket.java:402)
        at java.net.Socket.<init>(Socket.java:309)
        at java.net.Socket.<init>(Socket.java:124)
        at
org.apache.tomcat.util.net.PoolTcpEndpoint.closeServerSocket(PoolTcpEndpoint
.java:326)
        at
org.apache.tomcat.util.net.PoolTcpEndpoint.acceptSocket(PoolTcpEndpoint.java
:397)
        at
org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:529)
        at
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.jav
a:619)
        at java.lang.Thread.run(Thread.java:534)
11-Aug-2003 21:55:06 org.apache.tomcat.util.net.PoolTcpEndpoint
closeServerSocket
SEVERE: Caught exception trying to close socket.
java.lang.NullPointerException
        at
org.apache.tomcat.util.net.PoolTcpEndpoint.closeServerSocket(PoolTcpEndpoint
.java:338)
        at
org.apache.tomcat.util.net.PoolTcpEndpoint.acceptSocket(PoolTcpEndpoint.java
:397)
        at
org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:529)
        at
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.jav
a:619)
        at java.lang.Thread.run(Thread.java:534)

The presence of Thread at the bottom of each traces suggests that there is
some asynchronous code in here somewhere. Is the stop() method delegating
its work to this shutdown socket you mention?

> It's part of the source distribution, in the j-t-connectors directory.

I really cannot find this code, I have an unzipped copy of
jakarta-tomcat-4.1.27-src.zip, in which i cannot seem to find the coyote
code, nor this PoolTcpEndpoint. I seem to have the package
org.apache.catalina.* but not org.apache.tomcat.*. Do i need to download an
earlier copy of the source for these classes?

Thanks for your time

Regards

Wesley I. Hall


---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org


Re: Bug Report - 4.1.27 Race Condition

Posted by Remy Maucherat <re...@apache.org>.
Wesley Hall wrote:
> Hello,
> 
> I have found a small bug during my work embedding the tomcat server into our
> application.
> 
> When a secure connector is added to an instance of Embedded, and embedded is
> startd and stopped several times in quick succession (which is part of our
> test framework) a race condition occurs.
> 
> Sometimes i am presented with an exception with a root cause of...
> 
> Caused by: LifecycleException: Coyote connector has not been started
> at org.apache.coyote.tomcat4.CoyoteConnector.stop(CoyoteConnector.java:1199)
> at org.apache.catalina.startup.Embedded.stop(Embedded.java:1030)
> 
> (NOTE: This exception is thrown after a call to the start method of Embedded
> has been made and connectors added)
> 
> Other times the test is successful. Sometimes (although a little more rare
> than these two outcomes), I recieve an 'Address already in use', which seems
> to be the inverse of the exception above, and occurs on a call to start()
> after a stop() has already been made.
> 
> Placing a Thread.sleep(x) between start/stop attempts seems to increase the
> likelyhood of the test being successful and increasing the interval increase
> the percentage of successful test runs until upon reaching an interval of
> 75ms the success rate becomes 100%.

Unless there's a major unexpected flaw and I missed something, I do 
believe that the shutdown is synchronous if you use the stop method 
(using the shutdown socket is not synchronous, but it's another 
problem). That means that something, somehwere, is being shutdown 
asynchronously related to SSL, be it at the OS level or at the JDK 
level. Given that SSL uses a non trivial amount of resources, which also 
take a significant amount of time to allocate, I'm not completely 
surprised by what you see.

As your next paragraph states, it would be useful to double check in the 
Tomcat code, but I don't expect you would find much. I believe it's 
rather straightforward code.

IMO, you should make a quick pass looking for an obvious flaw, then go 
with a 100ms delay. Unless you have to restart the embedded TC every 2s, 
I think it should be acceptable.

> It seems there is a race condition in starting up or shutting down of
> Embedded vs The secure connector. Sometimes Embedded.start() will return
> before the connector has been started and the same with Embedded.stop().
> Unfortunatly, i cannot seem to locate the Coyote source to verify this.

It's part of the source distribution, in the j-t-connectors directory.

> Presumably, this issue is pretty minor as the only circumstance i can see
> where someone will be running stops and starts in quick succession is a
> contrived test case, but i thought i would bring it to your attention
> anyway.

Remy


Re: Bug Report - 4.1.27 Race Condition

Posted by Remy Maucherat <re...@apache.org>.
Wesley Hall wrote:
> Hello,
> 
> I have found a small bug during my work embedding the tomcat server into our
> application.
> 
> When a secure connector is added to an instance of Embedded, and embedded is
> startd and stopped several times in quick succession (which is part of our
> test framework) a race condition occurs.
> 
> Sometimes i am presented with an exception with a root cause of...
> 
> Caused by: LifecycleException: Coyote connector has not been started
> at org.apache.coyote.tomcat4.CoyoteConnector.stop(CoyoteConnector.java:1199)
> at org.apache.catalina.startup.Embedded.stop(Embedded.java:1030)
> 
> (NOTE: This exception is thrown after a call to the start method of Embedded
> has been made and connectors added)
> 
> Other times the test is successful. Sometimes (although a little more rare
> than these two outcomes), I recieve an 'Address already in use', which seems
> to be the inverse of the exception above, and occurs on a call to start()
> after a stop() has already been made.
> 
> Placing a Thread.sleep(x) between start/stop attempts seems to increase the
> likelyhood of the test being successful and increasing the interval increase
> the percentage of successful test runs until upon reaching an interval of
> 75ms the success rate becomes 100%.

Unless there's a major unexpected flaw and I missed something, I do 
believe that the shutdown is synchronous if you use the stop method 
(using the shutdown socket is not synchronous, but it's another 
problem). That means that something, somehwere, is being shutdown 
asynchronously related to SSL, be it at the OS level or at the JDK 
level. Given that SSL uses a non trivial amount of resources, which also 
take a significant amount of time to allocate, I'm not completely 
surprised by what you see.

As your next paragraph states, it would be useful to double check in the 
Tomcat code, but I don't expect you would find much. I believe it's 
rather straightforward code.

IMO, you should make a quick pass looking for an obvious flaw, then go 
with a 100ms delay. Unless you have to restart the embedded TC every 2s, 
I think it should be acceptable.

> It seems there is a race condition in starting up or shutting down of
> Embedded vs The secure connector. Sometimes Embedded.start() will return
> before the connector has been started and the same with Embedded.stop().
> Unfortunatly, i cannot seem to locate the Coyote source to verify this.

It's part of the source distribution, in the j-t-connectors directory.

> Presumably, this issue is pretty minor as the only circumstance i can see
> where someone will be running stops and starts in quick succession is a
> contrived test case, but i thought i would bring it to your attention
> anyway.

Remy


---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org