Posted to dev@tomcat.apache.org by xnslong <gi...@git.apache.org> on 2017/10/31 10:52:52 UTC

[GitHub] tomcat pull request #81: Fix NIO BlockPoller slow bug

GitHub user xnslong opened a pull request:

    https://github.com/apache/tomcat/pull/81

    Fix NIO BlockPoller slow bug

    When I use Tomcat as my server to receive files and transfer files over an unstable network, I found that the traffic always hangs for a long while, with [`zero window`](https://wiki.wireshark.org/TCP%20ZeroWindow) reported by the server. The following steps reproduce the problem.
    
    1. Configure the server's TCP receive window size.
    
    ```bash
    $ cat /proc/sys/net/ipv4/tcp_rmem
    1024	2048	8192
    ```
    
    2. Put the following JSP into the ROOT web application, i.e. at the path `$CATALINA_HOME/webapps/ROOT/consume.jsp`.
    
    ```jsp
    <%@ page import="org.apache.commons.io.IOUtils" %><%
        final ServletInputStream inputStream = request.getInputStream();
        byte[] content = IOUtils.toByteArray(inputStream);
    %>ok
    ```
    
    > `IOUtils` is a class from [commons-io](https://mvnrepository.com/artifact/commons-io/commons-io/2.5).
    
    3. Start Tomcat with the following settings.
    
    ```
    Server version:        Apache Tomcat/8.0.46
    Server built:          Aug 10 2017 10:10:31 UTC
    Server number:         8.0.46.0
    OS Name:               Linux
    OS Version:            3.10.0-229.el7.x86_64
    Architecture:          amd64
    Java Home:             /usr/jdk1.8.0_121/jre
    JVM Version:           1.8.0_121-b13
    JVM Vendor:            Oracle Corporation
    Command line argument: -Xms256m
    Command line argument: -Xmx256m
    Command line argument: -Xmn128m
    Command line argument: -Xss1m
    ```
    
    4. Capture the TCP traffic.
    
    ```
    tcpdump -i any -w tcp.cap port 8080
    ```
    
    5. Start a client that sends data over HTTP in chunks, with the following settings (a minimal client sketch is shown after the settings):
    
    ```
    chunk size                   1024 bytes
    rest between chunks          50 ms
    chunk count per request      100
    parallel requests            100
    total requests               10000
    ```
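
    Any HTTP client that can stream a chunked request body will do; the following is a minimal Java sketch matching the settings above (the server address and the `consume.jsp` path from step 2 are assumptions, and it covers a single request; run many copies in parallel to match the "parallel requests" setting).

    ```java
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Minimal sketch of a slow chunked client: 100 chunks of 1024 bytes,
    // pausing 50 ms between chunks, posted to the consume.jsp from step 2.
    public class SlowChunkedClient {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://172.16.1.5:8080/consume.jsp"); // assumed server address
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setDoOutput(true);
            conn.setRequestMethod("POST");
            conn.setChunkedStreamingMode(1024);        // chunk size: 1024 bytes
            byte[] chunk = new byte[1024];
            try (OutputStream out = conn.getOutputStream()) {
                for (int i = 0; i < 100; i++) {        // chunk count per request: 100
                    out.write(chunk);
                    out.flush();
                    Thread.sleep(50);                  // rest between chunks: 50 ms
                }
            }
            System.out.println("HTTP " + conn.getResponseCode());
            conn.disconnect();
        }
    }
    ```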
    
    6. Then we see a lot of zero window packets, which last for several seconds.
    
    ```
    No.     stream     Time               Source                Destination           ack        win        Length Info
     469817 3201       15:09:04.175309    172.16.1.4          172.16.1.5          1          29696      57     [TCP segment of a reassembled PDU]
     469904 3201       15:09:04.214945    172.16.1.5          172.16.1.4          4238       1024       54     8080→58750 [ACK] Seq=1 Ack=4238 Win=1024 Len=0
     470091 3201       15:09:04.437137    172.16.1.4          172.16.1.5          1          29696      1078   [TCP Window Full] [TCP segment of a reassembled PDU]
     470092 3201       15:09:04.437142    172.16.1.5          172.16.1.4          5262       0          54     [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
     470334 3201       15:09:04.657120    172.16.1.4          172.16.1.5          1          29696      54     [TCP Keep-Alive] 58750→8080 [ACK] Seq=5261 Ack=1 Win=29696 Len=0
     470335 3201       15:09:04.657123    172.16.1.5          172.16.1.4          5262       0          54     [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
     470620 3201       15:09:05.098135    172.16.1.4          172.16.1.5          1          29696      54     [TCP Keep-Alive] 58750→8080 [ACK] Seq=5261 Ack=1 Win=29696 Len=0
     470621 3201       15:09:05.098141    172.16.1.5          172.16.1.4          5262       0          54     [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
     471017 3201       15:09:05.979136    172.16.1.4          172.16.1.5          1          29696      54     [TCP Keep-Alive] 58750→8080 [ACK] Seq=5261 Ack=1 Win=29696 Len=0
     471018 3201       15:09:05.979140    172.16.1.5          172.16.1.4          5262       0          54     [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
     471619 3201       15:09:07.743148    172.16.1.4          172.16.1.5          1          29696      54     [TCP Keep-Alive] 58750→8080 [ACK] Seq=5261 Ack=1 Win=29696 Len=0
     471620 3201       15:09:07.743151    172.16.1.5          172.16.1.4          5262       0          54     [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
     475765 3201       15:09:08.545625    172.16.1.5          172.16.1.4          5262       3072       54     [TCP Window Update] 8080→58750 [ACK] Seq=1 Ack=5262 Win=3072 Len=0
     475781 3201       15:09:08.545815    172.16.1.4          172.16.1.5          1          29696      490    [TCP segment of a reassembled PDU]
     475782 3201       15:09:08.545821    172.16.1.5          172.16.1.4          5698       3072       54     8080→58750 [ACK] Seq=1 Ack=5698 Win=3072 Len=0
     475784 3201       15:09:08.545825    172.16.1.4          172.16.1.5          1          29696      1514   [TCP segment of a reassembled PDU]
    ```
    
    After a lot of study, I found that this is caused by a bug in the `BlockPoller`, which slows down the polling cycle a lot.
    
    The `BlockPoller` always registers events via `SelectionKey`s, which are one-time keys: once the interest operation has been notified by the OS, the key becomes invalid. But the JDK caches the `SelectionKey` for each channel (in fact the class is `AbstractSelectableChannel`) and `Selector`, that is to say, the same channel and selector pair always gets the same `SelectionKey`. The cache is only cleaned up when one of the `Selector.select...()` methods is called. So before the `select()` method is called, we can treat a `SelectionKey` as another identity of the `Channel`.
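
    For reference, this JDK behaviour can be seen with a small standalone program (a sketch, not Tomcat code; the `Pipe` is only used to obtain a selectable channel):

    ```java
    import java.nio.channels.Pipe;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;

    // After cancel(), the channel still returns the same (now invalid) key from
    // keyFor() until the selector performs another select()/selectNow().
    public class CancelledKeyDemo {
        public static void main(String[] args) throws Exception {
            Selector selector = Selector.open();
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);

            SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
            key.cancel();

            // The cancelled key is still cached for this channel/selector pair.
            System.out.println(pipe.source().keyFor(selector) == key); // true
            System.out.println(key.isValid());                         // false

            selector.selectNow(); // clears the cancelled-key set

            // Only now is the stale key gone and a fresh registration possible.
            System.out.println(pipe.source().keyFor(selector));        // null
            SelectionKey fresh = pipe.source().register(selector, SelectionKey.OP_READ);
            System.out.println(fresh.isValid());                       // true
        }
    }
    ```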
    
    So when the `BlockPoller` selects an interested operation for a `SelectionKey`, the worker thread for this `SelectionKey` handles the event and turns the key into the invalid state. If the data is not finished, the worker thread then adds another event to the queue for the interested operations. When the `BlockPoller` finishes the current loop, it enters the next loop by first handling all the events in the queue (a `poll-n-run` loop on the queue), including the event offered by the former worker thread. However, this time the `SelectionKey` obtained for the channel and selector is still the invalid key from the last loop (because the next `Selector.select()` has not been called yet). So the `BlockPoller` rejects the event and wakes up the worker thread, and the worker thread offers yet another event to the queue. If this happens before the queue is empty, the `BlockPoller` has to handle another invalid `SelectionKey` for the same channel within the same loop. So if the queue is large enough, the poller becomes very busy handling invalid `SelectionKey`s for the same channels again and again, which wastes a lot of time and makes the next select operation start very late. As a result, the IO events from the OS cannot be delivered to the worker threads in a timely manner.
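
    To see why draining the queue until it is empty never completes under these conditions, here is a heavily simplified, single-threaded simulation (hypothetical demo code, not Tomcat code): every rejected event re-queues itself, standing in for a worker thread that re-offers an event each time it is woken.

    ```java
    import java.util.concurrent.ConcurrentLinkedQueue;

    // Simplified simulation of the feedback loop: as long as a rejected event is
    // re-offered before the queue runs dry, a drain-until-empty loop never gets
    // back to the select() call.
    public class BusyDrainDemo {
        public static void main(String[] args) {
            ConcurrentLinkedQueue<Runnable> events = new ConcurrentLinkedQueue<>();
            boolean[] keyValid = {false}; // would only become true after select()

            // Stand-in for a worker whose key is still invalid: every time the
            // poller runs this event, it simply re-queues itself.
            Runnable reRegister = new Runnable() {
                @Override
                public void run() {
                    if (!keyValid[0]) {
                        events.offer(this);
                    }
                }
            };
            events.offer(reRegister);

            int handled = 0;
            Runnable r;
            while ((r = events.poll()) != null) { // "drain until empty"
                r.run();
                if (++handled == 1_000_000) {     // cap it so the demo terminates
                    System.out.println("still draining after 1,000,000 events; select() was never reached");
                    break;
                }
            }
        }
    }
    ```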
    
    In fact, the invalid keys should not be handled again and again. This PR proposes a solution: in each selection loop, the `events()` method no longer tries to empty the queue; instead it only reads and executes the events that are already in the queue when the method is entered, while events that arrive later are not handled in this selection loop. So each loop only handles a strictly limited number of events and will not be slow. By the next loop, the invalid key will already have been cleared by the `select()` method, so the event will then be registered with a healthy `SelectionKey`.
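
    Conceptually, the change to the drain loop looks like the following sketch (a simplified illustration of the idea, not the exact patch; `events` stands for the `BlockPoller`'s event queue):

    ```java
    import java.util.Queue;

    // Sketch of the difference between draining the event queue until it is empty
    // and draining only the events that were present when the method was entered.
    final class EventsLoopSketch {

        // Before: keeps running as long as anything is in the queue, so events
        // re-offered for still-invalid keys can starve the next select() call.
        static boolean drainUntilEmpty(Queue<Runnable> events) {
            boolean result = false;
            Runnable r;
            while ((r = events.poll()) != null) {
                r.run();
                result = true;
            }
            return result;
        }

        // After: handle at most the events already queued on entry; later arrivals
        // wait for the next selection loop, by which time select() has flushed the
        // cancelled keys and a fresh SelectionKey can be registered.
        static boolean drainCurrentBatch(Queue<Runnable> events) {
            boolean result = false;
            int size = events.size();
            for (int i = 0; i < size; i++) {
                Runnable r = events.poll();
                if (r == null) {
                    break;
                }
                r.run();
                result = true;
            }
            return result;
        }
    }
    ```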
    
    After this PR, some events added after the `poll-n-run` loop in the `events()` method has started will be delayed to the next selection loop, but the speed of the selection loop is improved a lot under heavy load. I tried the reproduction steps against the code of this PR, and the zero window packets never appeared again.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xnslong/tomcat trunk

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tomcat/pull/81.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #81
    
----
commit b15b0ef5e10b5d8b42c00201ede9d1ebc4725b13
Author: zilong6 <zi...@staff.weibo.com>
Date:   2017-10-31T05:51:14Z

    fix non-stoppable poll loop in the events methods

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] tomcat pull request #81: BUG 61736 - Fix NIO BlockPoller bug for it's too bu...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/tomcat/pull/81


---
