Posted to users@tomcat.apache.org by Jesse Barnum <js...@360works.com> on 2014/02/07 04:11:58 UTC

What is the best connector configuration for thousands of mostly idle users?

Problem summary:
My nio polling threads are using too much CPU time.

Application overview:
My application has between 1,300 and 4,000 users connected at any given time. Each user sends about 200 bytes, waits 30 seconds, then sends about 200 bytes again, and this loops indefinitely for each user.
Each user connects with SSL, and we use a long keepalive to ensure that the HTTP connection doesn't close, so that we don't have to renegotiate SSL.
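
The httpd side of that looks roughly like this (illustrative values, not a verbatim copy of our config):

> # Keep connections open between requests; the timeout must exceed
> # the 30-second gap between client sends:
> KeepAlive On
> MaxKeepAliveRequests 0
> KeepAliveTimeout 60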

Configuration:
Ubuntu 12.04 with Tomcat 7.0.35 and 1.75 GB of RAM.

We are using Apache httpd with SSL and mod_proxy_ajp to forward requests to Tomcat. It has the worker MPM enabled with 500 ThreadsPerChild, so we typically have 3-9 Apache child processes running.

> <IfModule mpm_worker_module>
>     ServerLimit         12
>     ThreadLimit         1000
> 
>     StartServers         1 
>     MinSpareThreads      25
>     MaxRequestsPerChild 0
>     MaxSpareThreads     500
>     ThreadsPerChild     500
>     MaxClients          5000
> </IfModule>
> 


> ProxyPass /WSMRegister ajp://localhost:8009/WSMRegister

We are using the AJP NIO connector on port 8009 on Tomcat with 15 worker threads:

>     <!-- Define an AJP 1.3 Connector on port 8009 -->
>     <Connector port="8009" 
>         protocol="org.apache.coyote.ajp.AjpNioProtocol" 
>         redirectPort="8443"
>         minSpareThreads="1" 
>         maxThreads="15" 
>         scheme="https"
>         secure="true"
>         URIEncoding="UTF-8"
>         proxyName="secure2.360works.com"
>         proxyPort="443" />

Problem detail:
lsof is currently showing 564 open sockets between Apache and Tomcat on port 8009, with 1,352 users connected to Apache.
The two threads consuming the most CPU time in Tomcat are "NioBlockingSelector.BlockPoller-2 / 15" and "ajp-nio-8009-ClientPoller-0 / 25"; between them, they account for 20% of all CPU time for the Tomcat process. A few times a day, our monitoring software reports slow response times, and I'd like to solve this.
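
For reference, the socket count comes from something like this (standard lsof flags):

> # count established connections on the AJP port
> lsof -nP -iTCP:8009 -sTCP:ESTABLISHED | wc -l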

Some guesses at solutions:
I'm guessing that the high CPU usage is because they are polling all 564 open sockets constantly? Would it make sense to reduce the number of open sockets? I didn't configure any maximum, and I don't know how to reduce this number. I'm also concerned that reducing it might negate any benefits by increasing the number of sockets opening and closing between mod_proxy_ajp and the NIO AJP connector.

Maybe it's already running at optimal performance and I just need to throw hardware at it, but it seems like a solvable problem, because the actual worker threads are not doing much at all.

--Jesse Barnum, President, 360Works
http://www.360works.com
Product updates and news on http://facebook.com/360Works
(770) 234-9293
== Don't lose your data! http://360works.com/safetynet/ for FileMaker Server ==




Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Jesse,

On 2/6/14, 10:11 PM, Jesse Barnum wrote:
> Problem summary: My nio polling threads are using too much CPU
> time.

AFAIK, the NIO "poller" doesn't really poll. It's more like an event
processor. It should use very little CPU time.

> Application overview: My application has from 1,300 - 4,000 users
> connected at any given time. Each user sends about 200 bytes, then
> waits 30 seconds, then sends about 200 bytes, and this just loops
> for each user.

You're using standard HTTP with long keepalives, right? It sounds like
this application is ripe for a re-architecture around Websocket or
something similar.

> Each user connects with SSL, and we use a long keepalive to ensure 
> that the HTTP connection doesn't close, so that we don't have to 
> renegotiate SSL.
> 
> Configuration: Ubuntu 12.04 with Tomcat 7.0.35 and 1.75 GB of RAM.
> 
> We are using Apache with SSL and mod_proxy_ajp to forward requests
>  to Tomcat. It has MPM module enabled, with 500 ThreadsPerChild,
> so we typically have from 3-9 Apache instances running.
> 
>> <IfModule mpm_worker_module>
>>     ServerLimit         12
>>     ThreadLimit         1000
>> 
>>     StartServers         1
>>     MinSpareThreads      25
>>     MaxRequestsPerChild 0
>>     MaxSpareThreads     500
>>     ThreadsPerChild     500
>>     MaxClients          5000
>> </IfModule>

If you are expecting long keepalive times with a huge amount of
activity on each one, you might want to switch to the "event" MPM,
which is httpd's version of Tomcat's NIO connector.

You might be able to get away with fewer threads under the event MPM
as well.
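
A minimal sketch of what that might look like (the numbers are just a
starting point, not a recommendation):

<IfModule mpm_event_module>
    ServerLimit          4
    ThreadsPerChild    500
    MaxClients        2000
</IfModule>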

-chris



Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Jesse,

On 2/11/14, 8:24 AM, Jesse Barnum wrote:
> On Feb 11, 2014, at 4:56 AM, André Warnier <aw...@ice-sa.com> wrote:
> 
>> It looks that way. But this mod_proxy parameter (disablereuse,
>> lowercase - I don't know if it matters) is in a section
>> "BalancerMember parameters", and it is not very clear if that
>> applies even if you are not using a balancer, or if it is
>> forwarded to mod_proxy_ajp. Some other options in the same page
>> specify this explicitly, but this one doesn't.
>> 
>> I guess that Mark could answer that.
>> 
>> I think that it would help, in a general sense, if there was a
>> general "translation table" somewhere showing the AJP or other
>> attributes or control parameters which exist, and to what option
>> they correspond in respectively mod_jk and
>> mod_proxy/mod_proxy_ajp. But due to the difficulty of figuring
>> this out by trial and error, probably only the respective
>> developers can do that.
>> 
>>>> But I don't remember (and did not check earlier in the
>>>> thread) if you indicated that you are using mod_proxy_ajp.
>>>> 
>>>> And to answer the previous question : yes, I believe that you
>>>> can keep a long keep-alive in Apache httpd, independently of
>>>> how httpd connects to Tomcat.
> 
> Once we started getting to around 5,000 connected users, our
> Tomcat application became overwhelmed and unresponsive using the
> AJP NIO connector. I've temporarily improved the situation by
> moving to much more powerful hardware. However, it still drops a
> few connections throughout the day.

Are you sure the problem is Tomcat? If your clients have a long
keepalive timeout with httpd and you are not using the event MPM, then
each client is tying-up an HTTP connection (and thread) while not
actually accomplishing anything. You may be starving your own clients
by not having enough connections available to httpd.

> I will experiment with other configurations on a test instance,
> and load it up using the 'ab' Apache Benchmarking tool. I will post
> back my results on this list.

I have it on good authority that ab isn't great at launching
simultaneous requests (high concurrency). Unless you have lots of
servers available to generate load, you might want to try a different
tool. (Though 5000 concurrent requests isn't much, so it will probably
suffice)

-chris



Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Jesse Barnum <js...@360works.com>.
On Feb 11, 2014, at 4:56 AM, André Warnier <aw...@ice-sa.com> wrote:

> It looks that way. But this mod_proxy parameter (disablereuse, lowercase - I don't know if it matters) is in a section "BalancerMember parameters", and it is not very clear if that applies even if you are not using a balancer, or if it is forwarded to mod_proxy_ajp. Some other options in the same page specify this explicitly, but this one doesn't.
> 
> I guess that Mark could answer that.
> 
> I think that it would help, in a general sense, if there was a general "translation table" somewhere showing the AJP or other attributes or control parameters which exist, and to what option they correspond in respectively mod_jk and mod_proxy/mod_proxy_ajp.
> But due to the difficulty of figuring this out by trial and error, probably only the respective developers can do that.
> 
>>> But I don't remember (and did not check earlier in the thread) if you
>>> indicated that you are using mod_proxy_ajp.
>>> 
>>> And to answer the previous question : yes, I believe that you can keep a
>>> long keep-alive in Apache httpd, independently of how httpd connects to
>>> Tomcat.

Once we started getting to around 5,000 connected users, our Tomcat application became overwhelmed and unresponsive using the AJP NIO connector. I've temporarily improved the situation by moving to much more powerful hardware. However, it still drops a few connections throughout the day.

I will experiment with other configurations on a test instance, and load it up using the 'ab' Apache Benchmarking tool. I will post back my results on this list.
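
Probably something along these lines (flags from the ab man page; the URL, file name, and numbers are placeholders):

# 10,000 keep-alive POST requests at a concurrency of 500, ~200-byte body:
ab -n 10000 -c 500 -k -p payload.txt -T 'application/x-www-form-urlencoded' https://secure2.360works.com/WSMRegister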

Some things I would like to try:
* Using APR with SSL and directly serving connections, instead of using Apache as a front-end.
* Setting the org.apache.tomcat.util.net.NioSelectorShared system property to false, and increasing the selectorPool.maxSelectors connector attribute (see the sketch after this list)
* The suggestion made here about using disablereuse with mod_proxy
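
My rough understanding of what the second item would look like (untested guesswork on my part):

# bin/setenv.sh - stop sharing a single selector across all worker threads:
CATALINA_OPTS="$CATALINA_OPTS -Dorg.apache.tomcat.util.net.NioSelectorShared=false"

<!-- server.xml - then raise the per-connector selector pool (default is 200): -->
<Connector port="8009"
    protocol="org.apache.coyote.ajp.AjpNioProtocol"
    maxThreads="200"
    selectorPool.maxSelectors="400" />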

If there are any other suggestions, please let me know so that I can try them out.

--Jesse Barnum, President, 360Works
http://www.360works.com
Product updates and news on http://facebook.com/360Works
(770) 234-9293
== Don't lose your data! http://360works.com/safetynet/ for FileMaker Server ==




Re: What is the best connector configuration for thousands of mostly idle users?

Posted by André Warnier <aw...@ice-sa.com>.
Cédric Couralet wrote:
> 2014-02-10 22:34 GMT+01:00 André Warnier <aw...@ice-sa.com>:
> 
>> Jesse Barnum wrote:
>>
>>> On Feb 10, 2014, at 11:14 AM, Filip Hanik <fi...@hanik.com> wrote:
>>>
>>>  Jesse, mostly idle users and you wish to conserve resources. Use the
>>>> JkOptions +DisableReuse
>>>> on the mod_jk module. This will close connections after the request has
>>>> been completed. Many will tell you this will slow down your system since
>>>> new connections have to be created for each request. Usually, the
>>>> overhead
>>>> of this connection creation on a LAN is worth it. Measure for yourself.
>>>> Then you can go back to the regular blocking AJP connector, that will
>>>> perform a bit better as it doesn't have to do polling.
>>>>
>>>
>>> If I do this, can I keep a long keep-alive time on Apache? I need to
>>> preserve that, because renegotiating SSL connections for every request
>>> grinds the web server to a halt.
>>>
>>> Also, I thought mod_jk and mod_ajp were two different things - how can I
>>> use them both together?
>>>
>>>
>> Reply to the last phrase above :
>>
>> mod_jk and mod_proxy_ajp are indeed two different things, but with a
>> similar purpose :
>> - each of them is a different add-on module to Apache httpd
>> - each one of them can be used as a connector between Apache httpd and
>> Apache Tomcat
>> - you generally use one or the other, not both at the same time
>> - they both connect to the same AJP <Connector> at the Tomcat level
>> - between Apache httpd and Tomcat, they both "speak the same language"
>> (the AJP protocol)
>>
>> One difference is that mod_jk has quite a few more tunable options than
>> the mod_proxy_ajp module.  The JkOptions mentioned above by Filip is one of
>> these mod_jk options.
>>
> 
> I don't know what that JkOptions options does exactly, but from the name,
> isn't it the same as the disableReuse option on mod_proxy?
> http://httpd.apache.org/docs/current/mod/mod_proxy.html#proxypass
> 
> Then the OP could try that.

It looks that way. But this mod_proxy parameter (disablereuse, lowercase - I don't know if 
it matters) is in a section "BalancerMember parameters", and it is not very clear if that 
applies even if you are not using a balancer, or if it is forwarded to mod_proxy_ajp. 
Some other options in the same page specify this explicitly, but this one doesn't.

I guess that Mark could answer that.

I think that it would help, in a general sense, if there was a general "translation table" 
somewhere showing the AJP or other attributes or control parameters which exist, and to 
what option they correspond in respectively mod_jk and mod_proxy/mod_proxy_ajp.
But due to the difficulty of figuring this out by trial and error, probably only the 
respective developers can do that.

> 
> 
>> But I don't remember (and did not check earlier in the thread) if you
>> indicated that you are using mod_proxy_ajp.
>>
>> And to answer the previous question : yes, I believe that you can keep a
>> long keep-alive in Apache httpd, independently of how httpd connects to
>> Tomcat.
>>
>>
>>
>>
> 




Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Cédric Couralet <ce...@gmail.com>.
2014-02-10 22:34 GMT+01:00 André Warnier <aw...@ice-sa.com>:

> Jesse Barnum wrote:
>
>> On Feb 10, 2014, at 11:14 AM, Filip Hanik <fi...@hanik.com> wrote:
>>
>>  Jesse, mostly idle users and you wish to conserve resources. Use the
>>> JkOptions +DisableReuse
>>> on the mod_jk module. This will close connections after the request has
>>> been completed. Many will tell you this will slow down your system since
>>> new connections have to be created for each request. Usually, the
>>> overhead
>>> of this connection creation on a LAN is worth it. Measure for yourself.
>>> Then you can go back to the regular blocking AJP connector, that will
>>> perform a bit better as it doesn't have to do polling.
>>>
>>
>>
>> If I do this, can I keep a long keep-alive time on Apache? I need to
>> preserve that, because renegotiating SSL connections for every request
>> grinds the web server to a halt.
>>
>> Also, I thought mod_jk and mod_ajp were two different things - how can I
>> use them both together?
>>
>>
> Reply to the last phrase above :
>
> mod_jk and mod_proxy_ajp are indeed two different things, but with a
> similar purpose :
> - each of them is a different add-on module to Apache httpd
> - each one of them can be used as a connector between Apache httpd and
> Apache Tomcat
> - you generally use one or the other, not both at the same time
> - they both connect to the same AJP <Connector> at the Tomcat level
> - between Apache httpd and Tomcat, they both "speak the same language"
> (the AJP protocol)
>
> One difference is that mod_jk has quite a few more tunable options than
> the mod_proxy_ajp module.  The JkOptions mentioned above by Filip is one of
> these mod_jk options.
>

I don't know exactly what that JkOptions option does, but from the name,
isn't it the same as the disablereuse parameter on mod_proxy?
http://httpd.apache.org/docs/current/mod/mod_proxy.html#proxypass

Then the OP could try that.
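
That is, something like this (untested):

ProxyPass /WSMRegister ajp://localhost:8009/WSMRegister disablereuse=On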


> But I don't remember (and did not check earlier in the thread) if you
> indicated that you are using mod_proxy_ajp.
>
> And to answer the previous question : yes, I believe that you can keep a
> long keep-alive in Apache httpd, independently of how httpd connects to
> Tomcat.
>
>
>
>

Re: What is the best connector configuration for thousands of mostly idle users?

Posted by André Warnier <aw...@ice-sa.com>.
Jesse Barnum wrote:
> On Feb 10, 2014, at 11:14 AM, Filip Hanik <fi...@hanik.com> wrote:
> 
>> Jesse, mostly idle users and you wish to conserve resources. Use the
>> JkOptions +DisableReuse
>> on the mod_jk module. This will close connections after the request has
>> been completed. Many will tell you this will slow down your system since
>> new connections have to be created for each request. Usually, the overhead
>> of this connection creation on a LAN is worth it. Measure for yourself.
>> Then you can go back to the regular blocking AJP connector, that will
>> perform a bit better as it doesn't have to do polling.
> 
> 
> If I do this, can I keep a long keep-alive time on Apache? I need to preserve that, because renegotiating SSL connections for every request grinds the web server to a halt.
> 
> Also, I thought mod_jk and mod_ajp were two different things - how can I use them both together?
> 

Reply to the last phrase above :

mod_jk and mod_proxy_ajp are indeed two different things, but with a similar purpose :
- each of them is a different add-on module to Apache httpd
- each one of them can be used as a connector between Apache httpd and Apache Tomcat
- you generally use one or the other, not both at the same time
- they both connect to the same AJP <Connector> at the Tomcat level
- between Apache httpd and Tomcat, they both "speak the same language" (the AJP protocol)

One difference is that mod_jk has quite a few more tunable options than the mod_proxy_ajp 
module.  The JkOptions mentioned above by Filip is one of these mod_jk options.
But I don't remember (and did not check earlier in the thread) if you indicated that you 
are using mod_proxy_ajp.

And to answer the previous question : yes, I believe that you can keep a long keep-alive 
in Apache httpd, independently of how httpd connects to Tomcat.



Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Jesse Barnum <js...@360works.com>.
On Feb 10, 2014, at 11:14 AM, Filip Hanik <fi...@hanik.com> wrote:

> Jesse, mostly idle users and you wish to conserve resources. Use the
> JkOptions +DisableReuse
> on the mod_jk module. This will close connections after the request has
> been completed. Many will tell you this will slow down your system since
> new connections have to be created for each request. Usually, the overhead
> of this connection creation on a LAN is worth it. Measure for yourself.
> Then you can go back to the regular blocking AJP connector, that will
> perform a bit better as it doesn't have to do polling.


If I do this, can I keep a long keep-alive time on Apache? I need to preserve that, because renegotiating SSL connections for every request grinds the web server to a halt.

Also, I thought mod_jk and mod_ajp were two different things - how can I use them both together?

--Jesse Barnum, President, 360Works
http://www.360works.com

Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Filip Hanik <fi...@hanik.com>.
Jesse, mostly idle users and you wish to conserve resources. Use the
JkOptions +DisableReuse
on the mod_jk module. This will close connections after the request has
been completed. Many will tell you this will slow down your system since
new connections have to be created for each request. Usually, the overhead
of this connection creation on a LAN is worth it. Measure for yourself.
Then you can go back to the regular blocking AJP connector, that will
perform a bit better as it doesn't have to do polling.
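
A rough sketch of that setup (the worker name and file paths are
illustrative):

# httpd.conf:
LoadModule jk_module modules/mod_jk.so
JkWorkersFile conf/workers.properties
JkOptions +DisableReuse
JkMount /WSMRegister worker1

# conf/workers.properties:
worker.list=worker1
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8009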




On Mon, Feb 10, 2014 at 9:04 AM, Jesse Barnum <js...@360works.com> wrote:

> On Feb 7, 2014, at 1:11 PM, Mark Thomas <ma...@apache.org> wrote:
>
> >>
> >> This is a single core box (sorry, should have mentioned that in the
> configuration details). Would you still expect increasing the worker thread
> count to help?
> >
> > Yes. I'd return it to the default of 200 and let Tomcat manage the pool.
> > It will increase/decrease the thread pool size as necessary. Depending
> > on how long some clients take to send the data, you might need to
> > increase the thread pool beyond 200.
> >
> > Mark
>
> Unfortunately, this has made the problem worse.
>
> We are now getting site failure messages from our monitoring software more
> frequently, and outside of peak hours, and CPU usage is running much higher
> than normal.
>
> Looking at the manager page shows 76 threads busy out of 200, and YourKit
> shows that many threads (I'm assuming 76-1) are stuck at this point:
>
> > ajp-nio-8009-exec-148 [WAITING] CPU time: 0:50
> > sun.misc.Unsafe.park(boolean, long)
> > java.util.concurrent.locks.LockSupport.parkNanos(Object, long)
> > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(int, long)
> > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(int, long)
> > java.util.concurrent.CountDownLatch.await(long, TimeUnit)
> > org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(CountDownLatch, long, TimeUnit)
> > org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(long, TimeUnit)
> > org.apache.tomcat.util.net.NioBlockingSelector.read(ByteBuffer, NioChannel, long)
> > org.apache.tomcat.util.net.NioSelectorPool.read(ByteBuffer, NioChannel, Selector, long, boolean)
> > org.apache.tomcat.util.net.NioSelectorPool.read(ByteBuffer, NioChannel, Selector, long)
> > org.apache.coyote.ajp.AjpNioProcessor.readSocket(byte[], int, int, boolean)
> > org.apache.coyote.ajp.AjpNioProcessor.read(byte[], int, int, boolean)
> > org.apache.coyote.ajp.AjpNioProcessor.readMessage(AjpMessage, boolean)
> > org.apache.coyote.ajp.AjpNioProcessor.receive()
> > org.apache.coyote.ajp.AbstractAjpProcessor.refillReadBuffer()
> > org.apache.coyote.ajp.AbstractAjpProcessor$SocketInputBuffer.doRead(ByteChunk, Request)
> > org.apache.coyote.Request.doRead(ByteChunk)
> > org.apache.catalina.connector.InputBuffer.realReadBytes(byte[], int, int)
> > org.apache.tomcat.util.buf.ByteChunk.substract(byte[], int, int)
> > org.apache.catalina.connector.InputBuffer.read(byte[], int, int)
> > org.apache.catalina.connector.CoyoteInputStream.read(byte[])
> > com.prosc.io.IOUtils.writeInputToOutput(InputStream, OutputStream, int)
>
> Almost all requests to the site are POST operations with small payloads.
> My theory, based on this stack trace, is that all threads are in contention
> for the single selector thread to read the contents of the POST, and that
> as the number of worker threads increases, so does thread contention,
> reducing overall throughput. Please let me know whether this sounds
> accurate to you.
>
> If so, how do I solve this? Here are my ideas, but I'm really not familiar
> enough with the connector configurations to know whether I'm on the right
> track or not:
> * Set 'org.apache.tomcat.util.net.NioSelectorShared' property to false. It
> sounds like this would give each worker thread concurrent access to the
> POST requests, although I can't quite tell from the documentation if that's
> true.
> * Re-write my client application to use multiple GET requests instead of
> single POST requests. This would be a lot of work, and seems like it should
> not be necessary.
> * Ditch the NIO connector and Apache/SSL front-end and move to APR/SSL
> with a whole lot of threads. Also seems like it should not be necessary; I
> thought my use case is exactly what NIO is made for.
>
> I'm open to any other ideas, thank you for all of your help!
>
> --Jesse Barnum, President, 360Works
> http://www.360works.com
>
>

Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Jesse Barnum <js...@360works.com>.
On Feb 7, 2014, at 1:11 PM, Mark Thomas <ma...@apache.org> wrote:

>> 
>> This is a single core box (sorry, should have mentioned that in the configuration details). Would you still expect increasing the worker thread count to help?
> 
> Yes. I'd return it to the default of 200 and let Tomcat manage the pool.
> It will increase/decrease the thread pool size as necessary. Depending
> on how long some clients take to send the data, you might need to
> increase the thread pool beyond 200.
> 
> Mark

Unfortunately, this has made the problem worse.

We are now getting site failure messages from our monitoring software more frequently, and outside of peak hours, and CPU usage is running much higher than normal.

Looking at the manager page shows 76 threads busy out of 200, and YourKit shows that many threads (I'm assuming 76-1) are stuck at this point:

> ajp-nio-8009-exec-148 [WAITING] CPU time: 0:50
> sun.misc.Unsafe.park(boolean, long)
> java.util.concurrent.locks.LockSupport.parkNanos(Object, long)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(int, long)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(int, long)
> java.util.concurrent.CountDownLatch.await(long, TimeUnit)
> org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(CountDownLatch, long, TimeUnit)
> org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(long, TimeUnit)
> org.apache.tomcat.util.net.NioBlockingSelector.read(ByteBuffer, NioChannel, long)
> org.apache.tomcat.util.net.NioSelectorPool.read(ByteBuffer, NioChannel, Selector, long, boolean)
> org.apache.tomcat.util.net.NioSelectorPool.read(ByteBuffer, NioChannel, Selector, long)
> org.apache.coyote.ajp.AjpNioProcessor.readSocket(byte[], int, int, boolean)
> org.apache.coyote.ajp.AjpNioProcessor.read(byte[], int, int, boolean)
> org.apache.coyote.ajp.AjpNioProcessor.readMessage(AjpMessage, boolean)
> org.apache.coyote.ajp.AjpNioProcessor.receive()
> org.apache.coyote.ajp.AbstractAjpProcessor.refillReadBuffer()
> org.apache.coyote.ajp.AbstractAjpProcessor$SocketInputBuffer.doRead(ByteChunk, Request)
> org.apache.coyote.Request.doRead(ByteChunk)
> org.apache.catalina.connector.InputBuffer.realReadBytes(byte[], int, int)
> org.apache.tomcat.util.buf.ByteChunk.substract(byte[], int, int)
> org.apache.catalina.connector.InputBuffer.read(byte[], int, int)
> org.apache.catalina.connector.CoyoteInputStream.read(byte[])
> com.prosc.io.IOUtils.writeInputToOutput(InputStream, OutputStream, int)

Almost all requests to the site are POST operations with small payloads. My theory, based on this stack trace, is that all threads are in contention for the single selector thread to read the contents of the POST, and that as the number of worker threads increases, so does thread contention, reducing overall throughput. Please let me know whether this sounds accurate to you.

If so, how do I solve this? Here are my ideas, but I'm really not familiar enough with the connector configurations to know whether I'm on the right track or not:
* Set the 'org.apache.tomcat.util.net.NioSelectorShared' system property to false. It sounds like this would give each worker thread its own selector for reading POST bodies, instead of contending for the shared one, although I can't quite tell from the documentation if that's true.
* Re-write my client application to use multiple GET requests instead of single POST requests. This would be a lot of work, and seems like it should not be necessary.
* Ditch the NIO connector and Apache/SSL front-end and move to APR/SSL with a whole lot of threads. Also seems like it should not be necessary; I thought my use case is exactly what NIO is made for.

I'm open to any other ideas, thank you for all of your help!

--Jesse Barnum, President, 360Works
http://www.360works.com


Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Mark Thomas <ma...@apache.org>.
On 07/02/2014 17:26, Jesse Barnum wrote:
> On Feb 7, 2014, at 2:38 AM, Mark Thomas <ma...@apache.org> wrote:
> 
>> Jesse Barnum <js...@360works.com> wrote:
>>
>> Thanks for such a well written question. All the relevant information is available and presented clearly and logically.
> 
> Glad I could help. I get error reports from my users all the time like "I installed the update and now it doesn't work", so I know how frustrating that can be :-)
> 
>>
>>> Problem summary:
>>> My nio polling threads are using too much CPU time.
>>
>> Are you sure that is the real problem? It sounds like the occasional slowness is the problem and the polling threads are the suspected - emphasis on the suspected - root cause.
> 
> That's true. I've tried running it with a profiler (YourKit) during peak time, and I think I've found the real culprit - reading from the Coyote InputStream is blocking.
> 
> Out of 98,515 milliseconds spent in javax.servlet.http.HttpServlet.service:
> * 95,801 milliseconds are spent calling org.apache.catalina.connector.CoyoteInputStream.read( byte[] ). Of this time, 81,707 ms are spent waiting for KeyAttachment.awaitLatch.
> * 2,698 milliseconds are spent on business logic and MySQL calls
> * 16 milliseconds are spent closing the input stream and setting HTTP headers
> 
> The last Tomcat code I see is org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch( CountDownLatch, long, TimeUnit ). I don't really know what that means or does, but it seems to be the holdup.
> 
> If you want me to send you the raw profiler information, let me know how to send it and I'll be happy to (although I'm running an old version of YourKit, 9.5.6). I can also send screen shots if that's helpful / easier.
> 
> Here is a typical thread dump while the request is being serviced:
>> Thread name: ajp-nio-8009-exec-19 / ID: 55 / state: TIMED_WAITING 
>> Milliseconds spent waiting: 47,475,371 / blocked: 2,389 
>>     sun.misc.Unsafe.$$YJP$$park(Native Method) 
>>     sun.misc.Unsafe.park(Unsafe.java) 
>>     java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) 
>>     java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033) 
>>     java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326) 
>>     java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282) 
>>     org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(NioEndpoint.java:1571) 
>>     org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(NioEndpoint.java:1573) 
>>     org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:172) 
>>     org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:246) 
>>     org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:227) 
>>     org.apache.coyote.ajp.AjpNioProcessor.readSocket(AjpNioProcessor.java:342) 
>>     org.apache.coyote.ajp.AjpNioProcessor.read(AjpNioProcessor.java:314) 
>>     org.apache.coyote.ajp.AjpNioProcessor.readMessage(AjpNioProcessor.java:406) 
>>     org.apache.coyote.ajp.AjpNioProcessor.receive(AjpNioProcessor.java:375) 
>>     org.apache.coyote.ajp.AbstractAjpProcessor.refillReadBuffer(AbstractAjpProcessor.java:616) 
>>     org.apache.coyote.ajp.AbstractAjpProcessor$SocketInputBuffer.doRead(AbstractAjpProcessor.java:1070) 
>>     org.apache.coyote.Request.doRead(Request.java:422) 
>>     org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:290) 
>>     org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:431) 
>>     org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:315) 
>>     org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:167) 
>>     com.prosc.io.IOUtils.writeInputToOutput(IOUtils.java:49) 
>>     com.prosc.io.IOUtils.inputStreamAsBytes(IOUtils.java:116) 
>>     com.prosc.io.IOUtils.inputStreamAsString(IOUtils.java:136) 
>>     com.prosc.io.IOUtils.inputStreamAsString(IOUtils.java:127) 
>>     com.prosc.licensecheck.LicenseCheck.doPost(LicenseCheck.java:169) 
>>     javax.servlet.http.HttpServlet.service(HttpServlet.java:647) 
>>     javax.servlet.http.HttpServlet.service(HttpServlet.java:728) 
>>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305) 
>>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) 
>>     com.prosc.infrastructure.LogFilter.doFilter(LogFilter.java:22) 
>>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) 
>>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) 
>>     com.prosc.infrastructure.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:38) 
>>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) 
>>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) 
>>     org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) 
>>     org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) 
>>     org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472) 
>>     org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) 
>>     org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) 
>>     org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) 
>>     org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407) 
>>     org.apache.coyote.ajp.AjpNioProcessor.process(AjpNioProcessor.java:184) 
>>     org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) 
>>     org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1680) 
>>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
>>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
>>     java.lang.Thread.run(Thread.java:722)
> 
> So it seems that the worker thread is blocking on something while it's trying to read the POST data…? If my guess is correct, is this a problem that can be solved by reconfiguring?

Correct. Even the NIO HTTP connector has to use blocking IO to read the
request body and write the response.

> This is a single core box (sorry, should have mentioned that in the configuration details). Would you still expect increasing the worker thread count to help?

Yes. I'd return it to the default of 200 and let Tomcat manage the pool.
It will increase/decrease the thread pool size as necessary. Depending
on how long some clients take to send the data, you might need to
increase the thread pool beyond 200.
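
In server.xml terms, something like this (your existing attributes kept,
maxThreads back at the default):

<Connector port="8009"
    protocol="org.apache.coyote.ajp.AjpNioProtocol"
    redirectPort="8443"
    maxThreads="200"
    scheme="https"
    secure="true"
    URIEncoding="UTF-8"
    proxyName="secure2.360works.com"
    proxyPort="443" />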

Mark




Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Jesse Barnum <js...@360works.com>.
On Feb 7, 2014, at 2:38 AM, Mark Thomas <ma...@apache.org> wrote:

> Jesse Barnum <js...@360works.com> wrote:
> 
> Thanks for such a well written question. All the relevant information is available and presented clearly and logically.

Glad I could help. I get error reports from my users all the time like "I installed the update and now it doesn't work", so I know how frustrating that can be :-)

> 
>> Problem summary:
>> My nio polling threads are using too much CPU time.
> 
> Are you sure that is the real problem? It sounds like the occasional slowness is the problem and the polling threads are the suspected - emphasis on the suspected - root cause.

That's true. I've tried running it with a profiler (YourKit) during peak time, and I think I've found the real culprit - reading from the Coyote InputStream is blocking.

Out of 98,515 milliseconds spent in javax.servlet.http.HttpServlet.service:
* 95,801 milliseconds are spent calling org.apache.catalina.connector.CoyoteInputStream.read( byte[] ). Of this time, 81,707 ms are spent waiting for KeyAttachment.awaitLatch.
* 2,698 milliseconds are spent on business logic and MySQL calls
* 16 milliseconds are spent closing the input stream and setting HTTP headers

The last Tomcat code I see is org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch( CountDownLatch, long, TimeUnit ). I don't really know what that means or does, but it seems to be the holdup.

If you want me to send you the raw profiler information, let me know how to send it and I'll be happy to (although I'm running an old version of YourKit, 9.5.6). I can also send screen shots if that's helpful / easier.

Here is a typical thread dump while the request is being serviced:
> Thread name: ajp-nio-8009-exec-19 / ID: 55 / state: TIMED_WAITING 
> Milliseconds spent waiting: 47,475,371 / blocked: 2,389 
>     sun.misc.Unsafe.$$YJP$$park(Native Method) 
>     sun.misc.Unsafe.park(Unsafe.java) 
>     java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) 
>     java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033) 
>     java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326) 
>     java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282) 
>     org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(NioEndpoint.java:1571) 
>     org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(NioEndpoint.java:1573) 
>     org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:172) 
>     org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:246) 
>     org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:227) 
>     org.apache.coyote.ajp.AjpNioProcessor.readSocket(AjpNioProcessor.java:342) 
>     org.apache.coyote.ajp.AjpNioProcessor.read(AjpNioProcessor.java:314) 
>     org.apache.coyote.ajp.AjpNioProcessor.readMessage(AjpNioProcessor.java:406) 
>     org.apache.coyote.ajp.AjpNioProcessor.receive(AjpNioProcessor.java:375) 
>     org.apache.coyote.ajp.AbstractAjpProcessor.refillReadBuffer(AbstractAjpProcessor.java:616) 
>     org.apache.coyote.ajp.AbstractAjpProcessor$SocketInputBuffer.doRead(AbstractAjpProcessor.java:1070) 
>     org.apache.coyote.Request.doRead(Request.java:422) 
>     org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:290) 
>     org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:431) 
>     org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:315) 
>     org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:167) 
>     com.prosc.io.IOUtils.writeInputToOutput(IOUtils.java:49) 
>     com.prosc.io.IOUtils.inputStreamAsBytes(IOUtils.java:116) 
>     com.prosc.io.IOUtils.inputStreamAsString(IOUtils.java:136) 
>     com.prosc.io.IOUtils.inputStreamAsString(IOUtils.java:127) 
>     com.prosc.licensecheck.LicenseCheck.doPost(LicenseCheck.java:169) 
>     javax.servlet.http.HttpServlet.service(HttpServlet.java:647) 
>     javax.servlet.http.HttpServlet.service(HttpServlet.java:728) 
>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305) 
>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) 
>     com.prosc.infrastructure.LogFilter.doFilter(LogFilter.java:22) 
>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) 
>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) 
>     com.prosc.infrastructure.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:38) 
>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) 
>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) 
>     org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) 
>     org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) 
>     org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472) 
>     org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) 
>     org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) 
>     org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) 
>     org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407) 
>     org.apache.coyote.ajp.AjpNioProcessor.process(AjpNioProcessor.java:184) 
>     org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) 
>     org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1680) 
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
>     java.lang.Thread.run(Thread.java:722)

So it seems that the worker thread is blocking on something while it's trying to read the POST data…? If my guess is correct, is this a problem that can be solved by reconfiguring?

> 
>> Application overview:
>> My application has from 1,300 - 4,000 users connected at any given
>> time. Each user sends about 200 bytes, then waits 30 seconds, then
>> sends about 200 bytes, and this just loops for each user.
>> Each user connects with SSL, and we use a long keepalive to ensure that
>> the HTTP connection doesn't close, so that we don't have to renegotiate
>> SSL.
> 
> How long should the application take to process those 200 bytes? I'm wondering what the peak number of concurrent requests, rather than connections, might be.


You're welcome to take a look at the live Apache server status here:
http://venus.360works.com/server-status

At 7:30am this morning, it showed 1,733 connected users, and looking at the 'Req' column which measures response time in milliseconds, it's typically in the 0-7 millisecond range, with some rare outliers in the hundreds of milliseconds.

At 12:07pm today, which is our peak time, for every 3-4 requests in the 0-8 millisecond range, there is a request in the 150-700 millisecond range. 

>> 
>> Problem detail:
>> lsof is currently showing 564 open sockets between Apache and Tomcat on
>> port 8009, with 1,352 users connected to Apache.
>> The two threads consuming the most CPU time in Tomcat are
>> "NioBlockingSelector.BlockPoller-2 / 15" and
>> "ajp-nio-8009-ClientPoller-0 / 25". Between them, they are taking 20%
>> of all CPU time for the Tomcat process. I get a few times a day when
>> our monitoring software reports slow response times, and I'd like to
>> solve this.
> 
> How much actual processor time are those threads taking? The 20% is only relative and it would be helpful to know what it is relative to.


The profiler is showing the selector threads as taking 264 milliseconds during the profiled period of time (1 minute, 26 seconds), so I agree with you; that's not the culprit.

> 
> Do you have access logs available for those times? The number of concurrent requests would be a useful number. When working that out you need to be sure if the time in the access log is the time the request started or the time it finished.
> 
>> Some guesses at solutions:
>> I'm guessing that the high CPU usage is because they are polling all
>> 564 open sockets constantly? Would it make sense to reduce the number
>> of open sockets? I didn't configure any maximum and I don't know how to
>> reduce this number. I'm also concerned that reducing that might negate
>> any benefits by increasing the number of sockets opening and closing
>> between mod_proxy_ajp and the NIO AJP connector.
> 
> Some rough numbers.
> With a peak of 9 httpd children with 500 threads each - assume at least 4000 connected clients.
> With a request every 30 seconds that is roughly 133 requests a second assuming they are perfectly evenly distributed which they won't be.
> With 15 worker threads on Tomcat each one will be handling roughly 9 requests a second, again assuming even distribution.
> That means a request has about 110ms to complete or you run the risk of running out of threads. This includes the time to process the request and a little overhead for Tomcat to recycle the thread.
> I wouldn't be surprised for peak loads to be at least 2-3 times higher due to the uneven distribution of requests in time. That means requests have more like 35-40ms to complete.

That sounds about right. The Apache status page is currently showing 153 requests per second, with 2,912 connected users.

> 
> I suspect that you are running out of worker threads on the Tomcat side. Increasing it from 15 to 30 wouldn't do any harm.

This is a single core box (sorry, should have mentioned that in the configuration details). Would you still expect increasing the worker thread count to help?

> 
> Of course, this is just guess work. You'd need to look at your access logs to be sure.
> 
>> Maybe it's already running at optimal performance and I just need to
>> throw hardware at it, but it seems like a solvable problem, because the
>> actual worker threads are not doing much at all.
> 
> I agree it sounds like configuration at this point.
> 
> Mark
> 
> 
> 


--Jesse Barnum, President, 360Works
http://www.360works.com
Product updates and news on http://facebook.com/360Works
(770) 234-9293
== Don't lose your data! http://360works.com/safetynet/ for FileMaker Server ==


Re: What is the best connector configuration for thousands of mostly idle users?

Posted by Mark Thomas <ma...@apache.org>.
Jesse Barnum <js...@360works.com> wrote:

Thanks for such a well written question. All the relevant information is available and presented clearly and logically.

>Problem summary:
>My nio polling threads are using too much CPU time.

Are you sure that is the real problem? It sounds like the occasional slowness is the problem and the polling threads are the suspected - emphasis on the suspected - root cause.

>Application overview:
>My application has from 1,300 - 4,000 users connected at any given
>time. Each user sends about 200 bytes, then waits 30 seconds, then
>sends about 200 bytes, and this just loops for each user.
>Each user connects with SSL, and we use a long keepalive to ensure that
>the HTTP connection doesn't close, so that we don't have to renegotiate
>SSL.

How long should the application take to process those 200 bytes? I'm wondering what the peak number of concurrent requests, rather than connections, might be.

>Configuration:
>Ubuntu 12.04 with Tomcat 7.0.35 and 1.75 GB of RAM.
>
>We are using Apache with SSL and mod_proxy_ajp to forward requests to
>Tomcat. It has MPM module enabled, with 500 ThreadsPerChild, so we
>typically have from 3-9 Apache instances running.
>
>> <IfModule mpm_worker_module>
>>     ServerLimit         12
>>     ThreadLimit         1000
>> 
>>     StartServers         1 
>>     MinSpareThreads      25
>>     MaxRequestsPerChild 0
>>     MaxSpareThreads     500
>>     ThreadsPerChild     500
>>     MaxClients          5000
>> </IfModule>
>> 
>
>
>> ProxyPass /WSMRegister ajp://localhost:8009/WSMRegister
>
>We are using the AJP NIO connector on port 8009 on Tomcat with 15
>worker threads:
>
>>     <!-- Define an AJP 1.3 Connector on port 8009 -->
>>     <Connector port="8009" 
>>         protocol="org.apache.coyote.ajp.AjpNioProtocol" 
>>         redirectPort="8443"
>>         minSpareThreads="1" 
>>         maxThreads="15" 
>>         scheme="https"
>>         secure="true"
>>         URIEncoding="UTF-8"
>>         proxyName="secure2.360works.com"
>>         proxyPort="443" />
>
>Problem detail:
>lsof is currently showing 564 open sockets between Apache and Tomcat on
>port 8009, with 1,352 users connected to Apache.
>The two threads consuming the most CPU time in Tomcat are
>"NioBlockingSelector.BlockPoller-2 / 15" and
>"ajp-nio-8009-ClientPoller-0 / 25". Between them, they are taking 20%
>of all CPU time for the Tomcat process. I get a few times a day when
>our monitoring software reports slow response times, and I'd like to
>solve this.

How much actual processor time are those threads taking? The 20% is only relative and it would be helpful to know what it is relative to.

Do you have access logs available for those times? The number of concurrent requests would be a useful number. When working that out you need to be sure if the time in the access log is the time the request started or the time it finished.

>Some guesses at solutions:
>I'm guessing that the high CPU usage is because they are polling all
>564 open sockets constantly? Would it make sense to reduce the number
>of open sockets? I didn't configure any maximum and I don't know how to
>reduce this number. I'm also concerned that reducing that might negate
>any benefits by increasing the number of sockets opening and closing
>between mod_proxy_ajp and the NIO AJP connector.

Some rough numbers.
With a peak of 9 httpd children with 500 threads each - assume at least 4000 connected clients.
With a request every 30 seconds that is roughly 133 requests a second assuming they are perfectly evenly distributed which they won't be.
With 15 worker threads on Tomcat each one will be handling roughly 9 requests a second, again assuming even distribution.
That means a request has about 110ms to complete or you run the risk of running out of threads. This includes the time to process the request and a little overhead for Tomcat to recycle the thread.
I wouldn't be surprised for peak loads to be at least 2-3 times higher due to the uneven distribution of requests in time. That means requests have more like 35-40ms to complete.

I suspect that you are running out of worker threads on the Tomcat side. Increasing it from 15 to 30 wouldn't do any harm.

Of course, this is just guess work. You'd need to look at your access logs to be sure.

>Maybe it's already running at optimal performance and I just need to
>throw hardware at it, but it seems like a solvable problem, because the
>actual worker threads are not doing much at all.

I agree it sounds like configuration at this point.

Mark

