You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by ma...@apache.org on 2013/10/07 23:44:09 UTC

svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Author: markt
Date: Mon Oct  7 21:44:09 2013
New Revision: 1530081

URL: http://svn.apache.org/r1530081
Log:
Fix https://issues.apache.org/bugzilla/show_bug.cgi?id=55633 for APR.
If the socket was registered for read and write and only one occurred, the registration for the other event was lost. Ensure that the socket is re-registered with the Poller when necessary.

Modified:
    tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Modified: tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java
URL: http://svn.apache.org/viewvc/tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java?rev=1530081&r1=1530080&r2=1530081&view=diff
==============================================================================
--- tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java (original)
+++ tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java Mon Oct  7 21:44:09 2013
@@ -1881,8 +1881,17 @@ public class AprEndpoint extends Abstrac
                                             ((desc[n*2] & Poll.APR_POLLOUT) == Poll.APR_POLLOUT) &&
                                             !processSocket(desc[n*2+1], SocketStatus.OPEN_WRITE)) {
                                         // Close socket and clear pool
+                                        error = true;
                                         closeSocket(desc[n*2+1]);
                                     }
+                                    if (!error) {
+                                        // If socket was registered for multiple events but
+                                        // only some of the occurred, re-register for the
+                                        // remaining events.
+                                        if (wrapper.pollerFlags != 0) {
+                                            add(desc[n*2+1], 1, wrapper.pollerFlags);
+                                        }
+                                    }
                                 } else {
                                     // Unknown event
                                     getLog().warn(sm.getString(



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 10/10/2013 21:30, Konstantin Preißer wrote:
> Mark,
> 
>> -----Original Message-----
>> From: Mark Thomas [mailto:markt@apache.org]
>> Sent: Thursday, October 10, 2013 10:03 PM
> 
>> OK. Try now. You will probably still see the odd NPE but the other
>> issues should be fixed.
> 
> Thank you very much.
> I tried with 1531087 and can confirm that I still see that NPE, but I don't see any other problems anymore (Tomcat will respond on each TCP connection).
> The thread dump shows 2 ClientPoller threads after the NPE.

Excellent.

Thanks for the feedback.

Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
Mark,

> -----Original Message-----
> From: Mark Thomas [mailto:markt@apache.org]
> Sent: Thursday, October 10, 2013 10:03 PM

> OK. Try now. You will probably still see the odd NPE but the other
> issues should be fixed.

Thank you very much.
I tried with 1531087 and can confirm that I still see that NPE, but I don't see any other problems anymore (Tomcat will respond on each TCP connection).
The thread dump shows 2 ClientPoller threads after the NPE.


Regards,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 10/10/2013 18:43, Mark Thomas wrote:
> On 10/10/2013 18:40, Konstantin Preißer wrote:
>> Hi Mark,
>>
>>> -----Original Message-----
>>> From: Mark Thomas [mailto:markt@apache.org]
>>> Sent: Thursday, October 10, 2013 7:37 PM
>>
>>> Thanks. Can you just check how many poller threads you get on a clean
>>> start? I'd expect 2 in most cases.
>>
>> Yes, I see two "ClientPoller" threads after a clean start.
> 
> OK. I see what is going on now.
> 
> The short NPE stack trace you reported earlier in this thread is
> triggering the Poller to close.
> 
> Most of the problems you are seeing are related to that Poller close. I
> want to spend some time trying to figure out how the NPE happens in the
> first place but fixing the Poller close should be pretty simple.

OK. Try now. You will probably still see the odd NPE but the other
issues should be fixed.

Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 10/10/2013 18:40, Konstantin Preißer wrote:
> Hi Mark,
> 
>> -----Original Message-----
>> From: Mark Thomas [mailto:markt@apache.org]
>> Sent: Thursday, October 10, 2013 7:37 PM
> 
>> Thanks. Can you just check how many poller threads you get on a clean
>> start? I'd expect 2 in most cases.
> 
> Yes, I see two "ClientPoller" threads after a clean start.

OK. I see what is going on now.

The short NPE stack trace you reported earlier in this thread is
triggering the Poller to close.

Most of the problems you are seeing are related to that Poller close. I
want to spend some time trying to figure out how the NPE happens in the
first place but fixing the Poller close should be pretty simple.

Mark




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
Hi Mark,

> -----Original Message-----
> From: Mark Thomas [mailto:markt@apache.org]
> Sent: Thursday, October 10, 2013 7:37 PM

> Thanks. Can you just check how many poller threads you get on a clean
> start? I'd expect 2 in most cases.

Yes, I see two "ClientPoller" threads after a clean start.

Here is a thread dump after clean start:

2013-10-10 19:39:05
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.0-b56 mixed mode):

"http-nio-8784-Acceptor-0" daemon prio=6 tid=0x000000000a826000 nid=0x27e8 runnable [0x000000000c9ce000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
	- locked <0x00000000ef0026b0> (a java.lang.Object)
	at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:714)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8784-ClientPoller-1" daemon prio=6 tid=0x000000000ae73000 nid=0x1bd0 runnable [0x000000000c78f000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:296)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl.java:278)
	at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:159)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000000edcc3748> (a sun.nio.ch.Util$2)
	- locked <0x00000000edcc3738> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000000edcc35e8> (a sun.nio.ch.WindowsSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1074)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8784-ClientPoller-0" daemon prio=6 tid=0x000000000ae72800 nid=0xa78 runnable [0x000000000c30f000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:296)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl.java:278)
	at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:159)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000000edcbf548> (a sun.nio.ch.Util$2)
	- locked <0x00000000edcbf538> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000000edcbf3e8> (a sun.nio.ch.WindowsSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1074)
	at java.lang.Thread.run(Thread.java:724)

"ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon prio=6 tid=0x000000000aae3000 nid=0x19e0 waiting on condition [0x000000000c0ff000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1331)
	at java.lang.Thread.run(Thread.java:724)

"Thread-4" daemon prio=6 tid=0x000000000af7a800 nid=0x1a5c runnable [0x000000000bfdf000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.Iocp.getQueuedCompletionStatus(Native Method)
	at sun.nio.ch.Iocp.access$300(Iocp.java:46)
	at sun.nio.ch.Iocp$EventHandlerTask.run(Iocp.java:333)
	at java.lang.Thread.run(Thread.java:724)

"NioBlockingSelector.BlockPoller-1" daemon prio=6 tid=0x000000000aaa0800 nid=0xcbc runnable [0x0000000009d0e000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:296)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl.java:278)
	at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:159)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000000eeffeb90> (a sun.nio.ch.Util$2)
	- locked <0x00000000eeffeb80> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000000eeffea10> (a sun.nio.ch.WindowsSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:342)

"GC Daemon" daemon prio=2 tid=0x000000000a7e4800 nid=0x1988 in Object.wait() [0x000000000b9df000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000c04031b8> (a sun.misc.GC$LatencyLock)
	at sun.misc.GC$Daemon.run(GC.java:117)
	- locked <0x00000000c04031b8> (a sun.misc.GC$LatencyLock)

"AsyncFileHandlerWriter-242583474" daemon prio=6 tid=0x000000000981b000 nid=0x12d4 waiting on condition [0x0000000009e3f000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c04031c8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
	at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:519)
	at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:682)
	at org.apache.juli.AsyncFileHandler$LoggerThread.run(AsyncFileHandler.java:147)

"Service Thread" daemon prio=6 tid=0x00000000082b4000 nid=0x95c runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00000000082a7000 nid=0x1e8 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x0000000008290800 nid=0x2100 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Attach Listener" daemon prio=10 tid=0x000000000828d800 nid=0x6b0 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x000000000828c000 nid=0xd60 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=8 tid=0x0000000008238800 nid=0x1838 in Object.wait() [0x0000000008f8f000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000c042c3c0> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
	- locked <0x00000000c042c3c0> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

"Reference Handler" daemon prio=10 tid=0x00000000020ea000 nid=0x1fb0 in Object.wait() [0x0000000008b7f000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000c042c3f0> (a java.lang.ref.Reference$Lock)
	at java.lang.Object.wait(Object.java:503)
	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
	- locked <0x00000000c042c3f0> (a java.lang.ref.Reference$Lock)

"main" prio=6 tid=0x0000000001c1f000 nid=0xd04 runnable [0x000000000235e000]
   java.lang.Thread.State: RUNNABLE
	at java.net.DualStackPlainSocketImpl.accept0(Native Method)
	at java.net.DualStackPlainSocketImpl.socketAccept(DualStackPlainSocketImpl.java:131)
	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
	at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:198)
	- locked <0x00000000edcce9b8> (a java.net.SocksSocketImpl)
	at java.net.ServerSocket.implAccept(ServerSocket.java:530)
	at java.net.ServerSocket.accept(ServerSocket.java:498)
	at org.apache.catalina.core.StandardServer.await(StandardServer.java:437)
	at org.apache.catalina.startup.Catalina.await(Catalina.java:746)
	at org.apache.catalina.startup.Catalina.start(Catalina.java:692)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:352)
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:486)

"VM Thread" prio=10 tid=0x0000000008232000 nid=0xa60 runnable 

"GC task thread#0 (ParallelGC)" prio=6 tid=0x0000000002026800 nid=0x1c54 runnable 

"GC task thread#1 (ParallelGC)" prio=6 tid=0x0000000002028000 nid=0x2680 runnable 

"VM Periodic Task Thread" prio=10 tid=0x00000000082bd800 nid=0xe2c waiting on condition 

JNI global references: 219



Regards,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 10/10/2013 18:33, Konstantin Preißer wrote:
> Hi Mark,
> 
>> -----Original Message-----
>> From: Mark Thomas [mailto:markt@apache.org]
>> Sent: Thursday, October 10, 2013 7:11 PM
> 
>>> What is interesting is that after this happens and I telnet to the
>>> server for manually sending a request, then every (2n)th TCP
>>> connection works (Tomcat replies and closes the connection after the
>>> 20s timeout), but every (2n + 1)th connection does not work (there
>>> Tomcat does not respond and is never half-closing the connection).
>>
>> Now that is very interesting. It suggests one of the two Pollers has
>> locked up somehow. Could you take a thread dump when this happens as I
>> can't reproduce it. I'd love to know what that thread is doing.
> 
> Sure, here is a thread dump taken after the problems occured (Note: I modified the default server.xml to remove the AJP connector and for the HTTP connector change the port from 8080 to 8783 - that were the only changes I made).

Thanks. Can you just check how many poller threads you get on a clean
start? I'd expect 2 in most cases.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
Hi Mark,

> -----Original Message-----
> From: Mark Thomas [mailto:markt@apache.org]
> Sent: Thursday, October 10, 2013 7:11 PM

> > What is interesting is that after this happens and I telnet to the
> > server for manually sending a request, then every (2n)th TCP
> > connection works (Tomcat replies and closes the connection after the
> > 20s timeout), but every (2n + 1)th connection does not work (there
> > Tomcat does not respond and is never half-closing the connection).
> 
> Now that is very interesting. It suggests one of the two Pollers has
> locked up somehow. Could you take a thread dump when this happens as I
> can't reproduce it. I'd love to know what that thread is doing.

Sure, here is a thread dump taken after the problems occured (Note: I modified the default server.xml to remove the AJP connector and for the HTTP connector change the port from 8080 to 8783 - that were the only changes I made).


2013-10-10 19:31:49
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.0-b56 mixed mode):

"http-nio-8783-exec-10" daemon prio=6 tid=0x000000000b00b800 nid=0x2294 waiting on condition [0x000000000d76f000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-exec-9" daemon prio=6 tid=0x0000000009c8d800 nid=0x175c waiting on condition [0x000000000ca2e000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-exec-8" daemon prio=6 tid=0x0000000009c8c800 nid=0x2388 waiting on condition [0x000000000d3df000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-exec-7" daemon prio=6 tid=0x0000000009c8c000 nid=0x1770 waiting on condition [0x000000000d54f000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-exec-6" daemon prio=6 tid=0x0000000009c8b000 nid=0x1040 waiting on condition [0x000000000be9e000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-exec-5" daemon prio=6 tid=0x0000000009c8a800 nid=0x2174 waiting on condition [0x000000000d2af000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-exec-4" daemon prio=6 tid=0x0000000009c89800 nid=0x1c98 waiting on condition [0x000000000d1af000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"pool-1-thread-1" prio=6 tid=0x0000000009c88000 nid=0x1940 waiting on condition [0x000000000c40e000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c123def8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"Java2D Disposer" daemon prio=10 tid=0x0000000009c87800 nid=0x1808 in Object.wait() [0x000000000cf8f000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000c18ef110> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
	- locked <0x00000000c18ef110> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
	at sun.java2d.Disposer.run(Disposer.java:145)
	at java.lang.Thread.run(Thread.java:724)

"Timer-0" daemon prio=6 tid=0x0000000009c86800 nid=0x1458 in Object.wait() [0x000000000cddf000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.util.TimerThread.mainLoop(Timer.java:552)
	- locked <0x00000000c123dc80> (a java.util.TaskQueue)
	at java.util.TimerThread.run(Timer.java:505)

"http-nio-8783-exec-3" daemon prio=6 tid=0x000000000b025000 nid=0x22d8 waiting on condition [0x000000000adcf000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-exec-2" daemon prio=6 tid=0x000000000b066800 nid=0x2144 waiting on condition [0x000000000ccbf000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-exec-1" daemon prio=6 tid=0x000000000a8c5000 nid=0x1980 waiting on condition [0x000000000bf9f000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c14cd958> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:104)
	at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:32)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-Acceptor-0" daemon prio=6 tid=0x000000000b061000 nid=0xb3c runnable [0x000000000cb3f000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
	- locked <0x00000000c0d5b690> (a java.lang.Object)
	at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:714)
	at java.lang.Thread.run(Thread.java:724)

"http-nio-8783-ClientPoller-0" daemon prio=6 tid=0x000000000af9b000 nid=0x1628 runnable [0x000000000c7de000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:296)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl.java:278)
	at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:159)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000000c120a300> (a sun.nio.ch.Util$2)
	- locked <0x00000000c120a2f0> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000000c120a0b0> (a sun.nio.ch.WindowsSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1074)
	at java.lang.Thread.run(Thread.java:724)

"ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon prio=6 tid=0x000000000aadb000 nid=0xa08 waiting on condition [0x000000000c65f000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1331)
	at java.lang.Thread.run(Thread.java:724)

"Thread-4" daemon prio=6 tid=0x000000000a87c000 nid=0x520 runnable [0x000000000a2af000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.Iocp.getQueuedCompletionStatus(Native Method)
	at sun.nio.ch.Iocp.access$300(Iocp.java:46)
	at sun.nio.ch.Iocp$EventHandlerTask.run(Iocp.java:333)
	at java.lang.Thread.run(Thread.java:724)

"NioBlockingSelector.BlockPoller-1" daemon prio=6 tid=0x000000000a73b000 nid=0x1c08 runnable [0x0000000009daf000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:296)
	at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl.java:278)
	at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:159)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked <0x00000000c0d59f90> (a sun.nio.ch.Util$2)
	- locked <0x00000000c0d59f80> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000000c0d59e10> (a sun.nio.ch.WindowsSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:342)

"GC Daemon" daemon prio=2 tid=0x0000000009b84000 nid=0x10e8 in Object.wait() [0x000000000ac9f000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000c0590030> (a sun.misc.GC$LatencyLock)
	at sun.misc.GC$Daemon.run(GC.java:117)
	- locked <0x00000000c0590030> (a sun.misc.GC$LatencyLock)

"AsyncFileHandlerWriter-982344655" daemon prio=6 tid=0x0000000009943800 nid=0x2040 waiting on condition [0x0000000009eae000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000c03d5648> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
	at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:519)
	at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:682)
	at org.apache.juli.AsyncFileHandler$LoggerThread.run(AsyncFileHandler.java:147)

"Service Thread" daemon prio=6 tid=0x00000000082f2000 nid=0x2148 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00000000082eb000 nid=0x1348 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00000000082e2000 nid=0x15bc waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Attach Listener" daemon prio=10 tid=0x00000000082df000 nid=0x2268 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00000000082dd800 nid=0x204c runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=8 tid=0x00000000021af000 nid=0x1a34 in Object.wait() [0x00000000090ef000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
	- locked <0x00000000c03e2d90> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

"Reference Handler" daemon prio=10 tid=0x00000000021ab800 nid=0x1020 in Object.wait() [0x0000000008cee000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:503)
	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
	- locked <0x00000000c03e2850> (a java.lang.ref.Reference$Lock)

"main" prio=6 tid=0x000000000220f000 nid=0x33c runnable [0x000000000248e000]
   java.lang.Thread.State: RUNNABLE
	at java.net.DualStackPlainSocketImpl.accept0(Native Method)
	at java.net.DualStackPlainSocketImpl.socketAccept(DualStackPlainSocketImpl.java:131)
	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
	at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:198)
	- locked <0x00000000c18e5c58> (a java.net.SocksSocketImpl)
	at java.net.ServerSocket.implAccept(ServerSocket.java:530)
	at java.net.ServerSocket.accept(ServerSocket.java:498)
	at org.apache.catalina.core.StandardServer.await(StandardServer.java:437)
	at org.apache.catalina.startup.Catalina.await(Catalina.java:746)
	at org.apache.catalina.startup.Catalina.start(Catalina.java:692)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:352)
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:486)

"VM Thread" prio=10 tid=0x0000000008282000 nid=0x1f38 runnable 

"GC task thread#0 (ParallelGC)" prio=6 tid=0x00000000020e6000 nid=0x1b08 runnable 

"GC task thread#1 (ParallelGC)" prio=6 tid=0x00000000020e7800 nid=0x182c runnable 

"VM Periodic Task Thread" prio=10 tid=0x0000000008303800 nid=0x2104 waiting on condition 

JNI global references: 400



Regards,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 10/10/2013 17:42, Konstantin Preißer wrote:
>> -----Original Message----- From: Konstantin Preißer
>> [mailto:kpreisser@apache.org] Sent: Thursday, October 10, 2013 6:02
>> PM
> 
>> After this happens, then when I close all browsers on every machine
>> and open new instances to make new HTTP requests (with IE, Chrome
>> and Firefox), then sometimes Tomcat replies, but sometimes not.
> 
> What is interesting is that after this happens and I telnet to the
> server for manually sending a request, then every (2n)th TCP
> connection works (Tomcat replies and closes the connection after the
> 20s timeout), but every (2n + 1)th connection does not work (there
> Tomcat does not respond and is never half-closing the connection).

Now that is very interesting. It suggests one of the two Pollers has
locked up somehow. Could you take a thread dump when this happens as I
can't reproduce it. I'd love to know what that thread is doing.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
> -----Original Message-----
> From: Konstantin Preißer [mailto:kpreisser@apache.org]
> Sent: Thursday, October 10, 2013 6:02 PM

> After this happens, then when I close all browsers on every machine and
> open new instances to make new HTTP requests (with IE, Chrome and
> Firefox), then sometimes Tomcat replies, but sometimes not.

What is interesting is that after this happens and I telnet to the server for manually sending a request, then every (2n)th TCP connection works (Tomcat replies and closes the connection after the 20s timeout), but every (2n + 1)th connection does not work (there Tomcat does not respond and is never half-closing the connection).


Regards,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
Hi Mark,

> -----Original Message-----
> From: Mark Thomas [mailto:markt@apache.org]
> Sent: Thursday, October 10, 2013 5:22 PM

> I still can't reproduce this. I have Tomcat running in a VM with the
> bandwidth limited to 192Kbps and two separate machines accessing the
> drawing board. Can you try to reproduce this without the TCP forwarder
> in the mix just to make sure there isn't an issue there?
> 
> I do see the odd stray connection but it looks to be something the
> browser is holding on to. As soon as I close the browser, the connection
> drops.

Yes, I reproduced the issue without the TCP forwarder (Firefox connecting directly to Tomcat).

I just tried it again, and it seems the issue is somewhat hard to reproduce - I had to try this (Pressing F5) several minutes until the problems showed up again (the NPE in NioEndpoint.processSocket(), Firefox is unable to make a new HTTP request to Tomcat, and Tomcat closed an existing WebSocket connection from the other machine).

After this happens, then when I close all browsers on every machine and open new instances to make new HTTP requests (with IE, Chrome and Firefox), then sometimes Tomcat replies, but sometimes not.


Regards,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 10/10/2013 14:41, Konstantin Preißer wrote:
> Hi Mark,
> 
>> -----Original Message-----
>> From: Mark Thomas [mailto:markt@apache.org]
>> Sent: Thursday, October 10, 2013 10:53 AM
>> To: Tomcat Developers List
>> Subject: Re: svn commit: r1530081 -
>> /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java
>>
>>>> 2) What is a bit more severe is that sometimes after pressing F5 a lot of
>> times, Firefox cannot establish a Websocket connection or make another
>> HTTP request to Tomcat.
>>
>> I can't reproduce this. It is possible that the fix for 1 also fixed
>> this. Can you test?
>>
>> What I did see was that with a lot of refreshes, Firefox ended up with
>> two connections open when it should only have had one. Closing the
>> browser fixed that.
> 
> Thanks.
> 
> Yes, I can still reproduce this on trunk with r1530937.
> 
> What I did was:
> I started Tomcat on a server (connected over internet, not LAN, so the latency/speed is not as good as in a LAN) and started Tomcat. Then, on two machines I opened the drawboard example and drew something on it so that the PNG images gets big (about 500 KB). Then on both machines I simultaneously pressed F5 (either holding it down for some seconds, or press it manually when Firefox was in the middle of receiving the PNG image).
> After some time where I stopped, the drawboard mentioned 3 connected players instead of 2. Then I closed Firefox on machine A - it reported 2 players. I reopened firefox - 3 players. Then I closed Firefox on machine B - still 2 players. I reopened Firefox on machine B - then it tried to establish a new websocket connection (the .xhtml page was probably loaded from the cache), but after 5 seconds, it displayed "Websocket closed" so it couldn't establish a new connection - at the same time on the other machine the player count went down to 1.
> Then, on machine B I pressed F5 again, but now Firefox couldn't even load the .xhtml page, and on Tomcat the NPE showed up again:
> 
> 10-Oct-2013 15:05:49.249 SEVERE [http-nio-8783-ClientPoller-0] org.apache.tomcat.util.net.NioEndpoint.processSocket Error allocating socket processor
>  java.lang.NullPointerException
> 	at org.apache.tomcat.util.net.NioEndpoint.processSocket(NioEndpoint.java:624)
> 	at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:1163)
> 	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1121)
> 	at java.lang.Thread.run(Thread.java:724)
> 
> (and as I'm writing this, Firefox still seems to wait for a reply from Tomcat.
> 
> After that I have pressed ESC to abort loading and pressed F5 again, but still no new websocket connection.
> 
> Then I took my TCP forwarder tool to see what data is actually sent on the TCP connections from and to Tomcat. This is what happened:
> 1) Firefox opened Connection 1 and sent "GET /examples/websocket/drawboard.xhtml HTTP/1.1" request to Tomcat.
> 2) On Connection 1, Tomcat replied with a correct reply with the contents of the drawboard.xhtml file.
> 3) On Connection 1, Firefox sent "GET /favicon.ico HTTP/1.1" request to Tomcat.
> 4) Firefox established two new TCP connections (3 and 4) to Tomcat.
> 5) On Connection 1, Tomcat replied with the correct contents of "favicon.ico".
> 6) On Connection 2, Firefox sends "GET /examples/websocket/drawboard HTTP/1.1" request to Tomcat with WebSocket headers to open the WebSocket connection.
> 7) Nothing happens.
> 8) After some time Firefox half-closed connections 1 and 3, and Tomcat full-closed them.
> 9) I shutdown Firefox so connection 2 was half-closed by the Client, but Tomcat did not full-close it so it is still active on the TCP forwarder.
> 
> Then I killed the TCP forwarder and opened it and Firefox again, but this time Tomcat even did not response on the "GET /examples/websocket/drawboard.xhtml HTTP/1.1" request. After I closed Firefox, then again the connection where that request was sent was still hold open by Tomcat. After some minutes, the forwarder received a Timeout error on the TCP connection that was still held open by Tomcat (SocketError: TimedOut (10060)).
> 
> 
> Hope this helps,

I still can't reproduce this. I have Tomcat running in a VM with the
bandwidth limited to 192Kbps and two separate machines accessing the
drawing board. Can you try to reproduce this without the TCP forwarder
in the mix just to make sure there isn't an issue there?

I do see the odd stray connection but it looks to be something the
browser is holding on to. As soon as I close the browser, the connection
drops.

Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
Hi Mark,

> -----Original Message-----
> From: Mark Thomas [mailto:markt@apache.org]
> Sent: Thursday, October 10, 2013 10:53 AM
> To: Tomcat Developers List
> Subject: Re: svn commit: r1530081 -
> /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java
> 
> >> 2) What is a bit more severe is that sometimes after pressing F5 a lot of
> times, Firefox cannot establish a Websocket connection or make another
> HTTP request to Tomcat.
> 
> I can't reproduce this. It is possible that the fix for 1 also fixed
> this. Can you test?
> 
> What I did see was that with a lot of refreshes, Firefox ended up with
> two connections open when it should only have had one. Closing the
> browser fixed that.

Thanks.

Yes, I can still reproduce this on trunk with r1530937.

What I did was:
I started Tomcat on a server (connected over internet, not LAN, so the latency/speed is not as good as in a LAN) and started Tomcat. Then, on two machines I opened the drawboard example and drew something on it so that the PNG images gets big (about 500 KB). Then on both machines I simultaneously pressed F5 (either holding it down for some seconds, or press it manually when Firefox was in the middle of receiving the PNG image).
After some time where I stopped, the drawboard mentioned 3 connected players instead of 2. Then I closed Firefox on machine A - it reported 2 players. I reopened firefox - 3 players. Then I closed Firefox on machine B - still 2 players. I reopened Firefox on machine B - then it tried to establish a new websocket connection (the .xhtml page was probably loaded from the cache), but after 5 seconds, it displayed "Websocket closed" so it couldn't establish a new connection - at the same time on the other machine the player count went down to 1.
Then, on machine B I pressed F5 again, but now Firefox couldn't even load the .xhtml page, and on Tomcat the NPE showed up again:

10-Oct-2013 15:05:49.249 SEVERE [http-nio-8783-ClientPoller-0] org.apache.tomcat.util.net.NioEndpoint.processSocket Error allocating socket processor
 java.lang.NullPointerException
	at org.apache.tomcat.util.net.NioEndpoint.processSocket(NioEndpoint.java:624)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:1163)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1121)
	at java.lang.Thread.run(Thread.java:724)

(and as I'm writing this, Firefox still seems to wait for a reply from Tomcat.

After that I have pressed ESC to abort loading and pressed F5 again, but still no new websocket connection.

Then I took my TCP forwarder tool to see what data is actually sent on the TCP connections from and to Tomcat. This is what happened:
1) Firefox opened Connection 1 and sent "GET /examples/websocket/drawboard.xhtml HTTP/1.1" request to Tomcat.
2) On Connection 1, Tomcat replied with a correct reply with the contents of the drawboard.xhtml file.
3) On Connection 1, Firefox sent "GET /favicon.ico HTTP/1.1" request to Tomcat.
4) Firefox established two new TCP connections (3 and 4) to Tomcat.
5) On Connection 1, Tomcat replied with the correct contents of "favicon.ico".
6) On Connection 2, Firefox sends "GET /examples/websocket/drawboard HTTP/1.1" request to Tomcat with WebSocket headers to open the WebSocket connection.
7) Nothing happens.
8) After some time Firefox half-closed connections 1 and 3, and Tomcat full-closed them.
9) I shutdown Firefox so connection 2 was half-closed by the Client, but Tomcat did not full-close it so it is still active on the TCP forwarder.

Then I killed the TCP forwarder and opened it and Firefox again, but this time Tomcat even did not response on the "GET /examples/websocket/drawboard.xhtml HTTP/1.1" request. After I closed Firefox, then again the connection where that request was sent was still hold open by Tomcat. After some minutes, the forwarder received a Timeout error on the TCP connection that was still held open by Tomcat (SocketError: TimedOut (10060)).


Hope this helps,

Regards,
Konstantin Preißer



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 10/10/2013 08:56, Mark Thomas wrote:
> On 09/10/2013 22:48, Konstantin Preißer wrote:
> 
>> Unfortunately, it seems there are some new problems when using the NIO connector  :(
> 
> Not unfortunate at all. This is all very useful feedback.
> 
>> (though I do not think they affect a 8.0.0-RC4 release).
>> I was testing the current trunk with NIO connector on Windows with 64-Bit Java and I tried to open the Drawboard example and there press F5 for several times on different computers.
>>
>> 1) Sometimes I get following exceptions:
>>
>> 09-Oct-2013 23:11:28.069 SEVERE [http-nio-8783-exec-8] org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process Error reading request, ignored
>>  java.lang.IllegalStateException: Message will not be sent because the WebSocket session has been closed
>> 	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:292)
> 
> Fixed (purely on code inspection - I haven't tested this yet).
> 
>> 2) What is a bit more severe is that sometimes after pressing F5 a lot of times, Firefox cannot establish a Websocket connection or make another HTTP request to Tomcat.

I can't reproduce this. It is possible that the fix for 1 also fixed
this. Can you test?

What I did see was that with a lot of refreshes, Firefox ended up with
two connections open when it should only have had one. Closing the
browser fixed that.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 09/10/2013 22:48, Konstantin Preißer wrote:

> Unfortunately, it seems there are some new problems when using the NIO connector  :(

Not unfortunate at all. This is all very useful feedback.

> (though I do not think they affect a 8.0.0-RC4 release).
> I was testing the current trunk with NIO connector on Windows with 64-Bit Java and I tried to open the Drawboard example and there press F5 for several times on different computers.
> 
> 1) Sometimes I get following exceptions:
> 
> 09-Oct-2013 23:11:28.069 SEVERE [http-nio-8783-exec-8] org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process Error reading request, ignored
>  java.lang.IllegalStateException: Message will not be sent because the WebSocket session has been closed
> 	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:292)

Fixed (purely on code inspection - I haven't tested this yet).

> 2) What is a bit more severe is that sometimes after pressing F5 a lot of times, Firefox cannot establish a Websocket connection or make another HTTP request to Tomcat.

That sounds like Firefox has reached its per server connection limit
which suggests Tomcat isn't closing connections properly on some error
condition(s).

> Then I also get such exceptions sporadically:
> 
> 09-Oct-2013 23:19:07.646 SEVERE [http-nio-8783-ClientPoller-0] org.apache.tomcat.util.net.NioEndpoint.processSocket Error allocating socket processor
>  java.lang.NullPointerException
> 	at org.apache.tomcat.util.net.NioEndpoint.processSocket(NioEndpoint.java:624)
> 	at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:1163)
> 	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1121)
> 	at java.lang.Thread.run(Thread.java:724)

That looks like a closed socket hasn't been removed from the Poller.

> or these (which seems strange because there occurs a Timeout exception when sending data, but I don't think Firefox would leave the connection open and stop to read from it):
> 09-Oct-2013 23:14:57.453 SEVERE [http-nio-8783-exec-4] websocket.drawboard.DrawboardEndpoint.onError onError: java.io.IOException: java.util.concurrent.TimeoutException
>  java.io.IOException: java.util.concurrent.TimeoutException
> 	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessageBlock(WsRemoteEndpointImplBase.java:226)
> 	at org.apache.tomcat.websocket.WsSession.sendCloseMessage(WsSession.java:474)
> 	at org.apache.tomcat.websocket.WsSession.onClose(WsSession.java:438)
> 	at org.apache.tomcat.websocket.WsFrameBase.processDataControl(WsFrameBase.java:316)
> 	at org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:270)
> 	at org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:116)
> 	at org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:55)
> 	at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler$WsReadListener.onDataAvailable(WsHttpUpgradeHandler.java:192)
> 	at org.apache.coyote.http11.upgrade.AbstractServletInputStream.onDataAvailable(AbstractServletInputStream.java:169)
> 	at org.apache.coyote.http11.upgrade.AbstractProcessor.upgradeDispatch(AbstractProcessor.java:95)
> 	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:640)
> 	at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:223)
> 	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1595)
> 	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1553)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:724)
> Caused by: java.util.concurrent.TimeoutException
> 	at org.apache.tomcat.websocket.FutureToSendHandler.get(FutureToSendHandler.java:99)
> 	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessageBlock(WsRemoteEndpointImplBase.java:222)
> 	... 16 more

Hmm. That is timing out when trying to send a close message.

Overall, it looks like the connection isn't always closed cleanly when
F5 is pressed several times in quick succession. I'll take a look.

Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
> Then I also get such exceptions sporadically:
> 
> 09-Oct-2013 23:19:07.646 SEVERE [http-nio-8783-ClientPoller-0]
> org.apache.tomcat.util.net.NioEndpoint.processSocket Error allocating
> socket processor
>  java.lang.NullPointerException
> 	at org.apache.tomcat.util.net.NioEndpoint.processSocket(NioEndpoint.java:624)
> 	at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:1163)
> 	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1121)
> 	at java.lang.Thread.run(Thread.java:724)

It seems similar issues happens for APR, as I got these exceptions when doing the same things (drawing onto the canvas until the PNG gets big, then repeatedly pressing F5 from different machines):

Exception in thread "http-apr-8783-exec-2" java.lang.NullPointerException
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2451)
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:2437)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
Exception in thread "http-apr-8783-exec-7" java.lang.NullPointerException
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2451)
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:2437)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)


Regards,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
Hi Mark,

> -----Original Message-----
> From: Mark Thomas [mailto:markt@apache.org]
> Sent: Wednesday, October 9, 2013 3:38 PM
> To: Tomcat Developers List
> Subject: Re: svn commit: r1530081 -
> /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java
> 
> On 09/10/2013 12:26, Mark Thomas wrote:
> > On 09/10/2013 02:41, Konstantin Preißer wrote:
> >
> >> When trying with APR, I keep seeing that Tomcat initates the
> >> connection shutdown.
> >
> > I do still see the error when using APR with the current code. I'll dig
> > into this some more.
> 
> This should be fixed as well now.
> 
> Mark

Thanks a lot for fixing those problems.


Unfortunately, it seems there are some new problems when using the NIO connector  :(   (though I do not think they affect a 8.0.0-RC4 release).
I was testing the current trunk with NIO connector on Windows with 64-Bit Java and I tried to open the Drawboard example and there press F5 for several times on different computers.

1) Sometimes I get following exceptions:

09-Oct-2013 23:11:28.069 SEVERE [http-nio-8783-exec-8] org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process Error reading request, ignored
 java.lang.IllegalStateException: Message will not be sent because the WebSocket session has been closed
	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:292)
	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.endMessage(WsRemoteEndpointImplBase.java:279)
	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase$EndMessageHandler.onResult(WsRemoteEndpointImplBase.java:436)
	at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.clearHandler(WsRemoteEndpointImplServer.java:159)
	at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.doClose(WsRemoteEndpointImplServer.java:121)
	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.close(WsRemoteEndpointImplBase.java:536)
	at org.apache.tomcat.websocket.WsSession.sendCloseMessage(WsSession.java:482)
	at org.apache.tomcat.websocket.WsSession.onClose(WsSession.java:438)
	at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.close(WsHttpUpgradeHandler.java:172)
	at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.access$200(WsHttpUpgradeHandler.java:45)
	at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler$WsReadListener.onDataAvailable(WsHttpUpgradeHandler.java:203)
	at org.apache.coyote.http11.upgrade.AbstractServletInputStream.onDataAvailable(AbstractServletInputStream.java:169)
	at org.apache.coyote.http11.upgrade.AbstractProcessor.upgradeDispatch(AbstractProcessor.java:95)
	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:640)
	at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:223)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1595)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1553)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

The code in WsReadListener.onDataAvailable() seems to catch an IOException which closes the session, therefore causing writing a Close end frame which in turn fails again. So it probably is not much severe.

2) What is a bit more severe is that sometimes after pressing F5 a lot of times, Firefox cannot establish a Websocket connection or make another HTTP request to Tomcat.

Then I also get such exceptions sporadically:

09-Oct-2013 23:19:07.646 SEVERE [http-nio-8783-ClientPoller-0] org.apache.tomcat.util.net.NioEndpoint.processSocket Error allocating socket processor
 java.lang.NullPointerException
	at org.apache.tomcat.util.net.NioEndpoint.processSocket(NioEndpoint.java:624)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:1163)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1121)
	at java.lang.Thread.run(Thread.java:724)

or these (which seems strange because there occurs a Timeout exception when sending data, but I don't think Firefox would leave the connection open and stop to read from it):
09-Oct-2013 23:14:57.453 SEVERE [http-nio-8783-exec-4] websocket.drawboard.DrawboardEndpoint.onError onError: java.io.IOException: java.util.concurrent.TimeoutException
 java.io.IOException: java.util.concurrent.TimeoutException
	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessageBlock(WsRemoteEndpointImplBase.java:226)
	at org.apache.tomcat.websocket.WsSession.sendCloseMessage(WsSession.java:474)
	at org.apache.tomcat.websocket.WsSession.onClose(WsSession.java:438)
	at org.apache.tomcat.websocket.WsFrameBase.processDataControl(WsFrameBase.java:316)
	at org.apache.tomcat.websocket.WsFrameBase.processData(WsFrameBase.java:270)
	at org.apache.tomcat.websocket.WsFrameBase.processInputBuffer(WsFrameBase.java:116)
	at org.apache.tomcat.websocket.server.WsFrameServer.onDataAvailable(WsFrameServer.java:55)
	at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler$WsReadListener.onDataAvailable(WsHttpUpgradeHandler.java:192)
	at org.apache.coyote.http11.upgrade.AbstractServletInputStream.onDataAvailable(AbstractServletInputStream.java:169)
	at org.apache.coyote.http11.upgrade.AbstractProcessor.upgradeDispatch(AbstractProcessor.java:95)
	at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:640)
	at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:223)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1595)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1553)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)
Caused by: java.util.concurrent.TimeoutException
	at org.apache.tomcat.websocket.FutureToSendHandler.get(FutureToSendHandler.java:99)
	at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessageBlock(WsRemoteEndpointImplBase.java:222)
	... 16 more


I will do some more debugging later to see what happens on TCP level when Firefox cannot establish a new Websocket connection.
Those errors seem to only happen with the Drawboard example (and there only if some people already drew something so that the PNG images does not fit into a single 8 KB packet when sending a binary message, but instead has e.g. 200 KB (when it is completely white it only hase about 3 KB)), but not with the Snake example.


Regards,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 09/10/2013 12:26, Mark Thomas wrote:
> On 09/10/2013 02:41, Konstantin Preißer wrote:
> 
>> When trying with APR, I keep seeing that Tomcat initates the
>> connection shutdown.
> 
> I do still see the error when using APR with the current code. I'll dig
> into this some more.

This should be fixed as well now.

Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 09/10/2013 02:41, Konstantin Preißer wrote:
> Mark,
> 
>>> -----Original Message----- From: Mark Thomas
>>> [mailto:markt@apache.org] Sent: Tuesday, October 8, 2013 3:57 PM 
>>> To: Tomcat Developers List Subject: Re: svn commit: r1530081 - 
>>> /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java
>>> 
>>> 
>>> None. I've seem the same problem but haven't got to the bottom of
>>> it yet. What I do know is: - Tomcat doesn't send a close - The
>>> browser closes the connection on its own - Tomcat tries (and
>>> fails) to write some data and that triggers the close on the
>>> Tomcat side
>>> 
>>> I'm wondering if some form of data corruption could be the
>>> trigger.
>>> 
>>> Whatever is going on, I'm fairly sure it is a different issue to
>>> BZ 55633.
>>> 
>>> I need to do some more careful analysis with Wireshark.
>> 
>> OK, thanks.
>> 
>> Note: When I tested this scenario, I got a slightly other result:
>> For me, Tomcat initiated the shutdown of TCP connection. (And
>> another issue: When/After this happens, if the browser tries to
>> establishes a new Websocket connection (with a Upgrade: ... header)
>> then tomcat sometimes shuts down the TCP connection without sending
>> a reply).
> 
> I do now see the behavior which you described, but with the NIO
> connector: There the browser is the one which closes the connection
> (probably due to a data corruption).

With the current 8.0.x code, I can no longer recreate this issue (I am
using Firefox and Chrome) with NIO and have been pausing Chrome.

I also tried suspending Firefox and could not re-create the issue.

> When trying with APR, I keep seeing that Tomcat initates the
> connection shutdown.

I do still see the error when using APR with the current code. I'll dig
into this some more.

> With BIO, the problems don't occur.


Note that if you suspend for too long, you may hit the default write
timeout (20s) and that may trigger a connection close.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
Mark,

> > -----Original Message-----
> > From: Mark Thomas [mailto:markt@apache.org]
> > Sent: Tuesday, October 8, 2013 3:57 PM
> > To: Tomcat Developers List
> > Subject: Re: svn commit: r1530081 -
> > /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java
> >
> >
> > None. I've seem the same problem but haven't got to the bottom of it
> > yet. What I do know is:
> > - Tomcat doesn't send a close
> > - The browser closes the connection on its own
> > - Tomcat tries (and fails) to write some data and that triggers the
> >   close on the Tomcat side
> >
> > I'm wondering if some form of data corruption could be the trigger.
> >
> > Whatever is going on, I'm fairly sure it is a different issue to BZ 55633.
> >
> > I need to do some more careful analysis with Wireshark.
> 
> OK, thanks.
> 
> Note: When I tested this scenario, I got a slightly other result: For me, Tomcat
> initiated the shutdown of TCP connection. (And another issue: When/After
> this happens, if the browser tries to establishes a new Websocket
> connection (with a Upgrade: ... header) then tomcat sometimes shuts down
> the TCP connection without sending a reply).

I do now see the behavior which you described, but with the NIO connector: There the browser is the one which closes the connection (probably due to a data corruption).

E.g. when redirecting IE and make it stop to read incoming data for ~ 30 seconds, then make it to continue reading, IE will display (translated from german): "SCRIPT12152: WebSocket Error: Network Error 12152, The server sent an invalid or unknown reply" and reset the TCP connection.

When I tried this with FF, it first seemed that here Tomcat was the one which initiated a normal TCP shutdown, but on more debugging it seems that Firefox stopped at some point to process the received messages and then sent something like a Websocket Close message so that Tomcat closes the connection (when pausing the Firefox -> Tomcat connection prior to unpausing Tomcat -> Firefox direction, Tomcat continues to send data).

When trying with APR, I keep seeing that Tomcat initates the connection shutdown.

With BIO, the problems don't occur.


Hope this helps,


Regards,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
Hi Mark,

> -----Original Message-----
> From: Mark Thomas [mailto:markt@apache.org]
> Sent: Tuesday, October 8, 2013 3:57 PM
> To: Tomcat Developers List
> Subject: Re: svn commit: r1530081 -
> /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java
> 
> 
> None. I've seem the same problem but haven't got to the bottom of it
> yet. What I do know is:
> - Tomcat doesn't send a close
> - The browser closes the connection on its own
> - Tomcat tries (and fails) to write some data and that triggers the
>   close on the Tomcat side
> 
> I'm wondering if some form of data corruption could be the trigger.
> 
> Whatever is going on, I'm fairly sure it is a different issue to BZ 55633.
> 
> I need to do some more careful analysis with Wireshark.

OK, thanks.

Note: When I tested this scenario, I got a slightly other result: For me, Tomcat initiated the shutdown of TCP connection. (And another issue: When/After this happens, if the browser tries to establishes a new Websocket connection (with a Upgrade: ... header) then tomcat sometimes shuts down the TCP connection without sending a reply).

Some more background:
I wrote a program in .Net/C# that can forward TCP connections by listening on a local TCP port and forwarding incoming TCP connections to a remote host. It logs events (local/remote connection opened, closed, aborted; data received) and it is able to "pause" one side of a connection (e.g. the transfer of data from client to server and from server to client). When pausing, it stops to read from the corresponding side of the TCP connections, and after resuming, continues to read from them.

I used it to debug the problems simulating a browser that stops to read from the connection and then some time later continues to read from it (instead of suspending the browser process).

Now, when I'm pausing the transfer from Tomcat to the browser (so that the other snakes stop moving) and continuing it, I can see that Tomcat initiates the shutdown of the TCP connection, and after that the browser aborts the connection. E.g. when I pause the transfer, then resume it and immediately pause it again, then about 1 second later one snake disappears on the other browser but the browser connected to the forwarder program has its Websocket connection still opened. When I resume the data transfer then the browser receives the connection shutdown and aborts the connection.
If you (or anybody else) have interest I can share the sources (and binaries) of the program to help debugging the problems.


Regards,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Mark Thomas <ma...@apache.org>.
On 08/10/2013 14:50, Konstantin Preißer wrote:
> Hi Mark,
> 
>> -----Original Message-----
>> From: markt@apache.org [mailto:markt@apache.org]
>> Sent: Monday, October 7, 2013 11:44 PM
>> To: dev@tomcat.apache.org
>> Subject: svn commit: r1530081 -
>> /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java
>>
>> Author: markt
>> Date: Mon Oct  7 21:44:09 2013
>> New Revision: 1530081
>>
>> URL: http://svn.apache.org/r1530081
>> Log:
>> Fix https://issues.apache.org/bugzilla/show_bug.cgi?id=55633 for APR.
>> If the socket was registered for read and write and only one occurred, the
>> registration for the other event was lost. Ensure that the socket is re-
>> registered with the Poller when necessary.
> 
> Thanks for the fix.
> 
> Unfortunately, for APR it seems there is an issue with connections being closed by Tomcat (on Windows):
> 
> After resuming firefox.exe, in ~ 50% of the cases Tomcat will close the connection one second after resuming the browser (which normally shouldn't happen). The snake will disappear in the other browser and Firefox prints "WebSocket cosed".
> Any idea about this?

None. I've seem the same problem but haven't got to the bottom of it
yet. What I do know is:
- Tomcat doesn't send a close
- The browser closes the connection on its own
- Tomcat tries (and fails) to write some data and that triggers the
  close on the Tomcat side

I'm wondering if some form of data corruption could be the trigger.

Whatever is going on, I'm fairly sure it is a different issue to BZ 55633.

I need to do some more careful analysis with Wireshark.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


RE: svn commit: r1530081 - /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java

Posted by Konstantin Preißer <kp...@apache.org>.
Hi Mark,

> -----Original Message-----
> From: markt@apache.org [mailto:markt@apache.org]
> Sent: Monday, October 7, 2013 11:44 PM
> To: dev@tomcat.apache.org
> Subject: svn commit: r1530081 -
> /tomcat/trunk/java/org/apache/tomcat/util/net/AprEndpoint.java
> 
> Author: markt
> Date: Mon Oct  7 21:44:09 2013
> New Revision: 1530081
> 
> URL: http://svn.apache.org/r1530081
> Log:
> Fix https://issues.apache.org/bugzilla/show_bug.cgi?id=55633 for APR.
> If the socket was registered for read and write and only one occurred, the
> registration for the other event was lost. Ensure that the socket is re-
> registered with the Poller when necessary.

Thanks for the fix.

Unfortunately, for APR it seems there is an issue with connections being closed by Tomcat (on Windows):

After resuming firefox.exe, in ~ 50% of the cases Tomcat will close the connection one second after resuming the browser (which normally shouldn't happen). The snake will disappear in the other browser and Firefox prints "WebSocket cosed".
Any idea about this?


Thanks,
Konstantin Preißer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org