Posted to users@tomcat.apache.org by "Darius D." <da...@gmail.com> on 2011/07/18 20:23:19 UTC

APR connector pollTime defaults are strange in tomcat6/7

From the documentation:

TC6:

Duration of a poll call. Lowering this value will slightly decrease latency
of connections being kept alive in some cases, but will use more CPU as more
poll calls are being made. The default value is 2000 (5ms).

TC7:

Duration of a poll call in microseconds. Lowering this value will slightly
decrease latency of connections being kept alive in some cases , but will
use more CPU as more poll calls are being made. The default value is 2000
(2ms). 


The TC6 APR connector default is also specified in microseconds. However, it
does not give 5ms, but a seemingly arbitrary value that depends on the kernel
configuration. On distribution default kernels (Debian, Red Hat...) with
100HZ configs (very common on servers) it gives an epoll timeout of 10ms
(sounds reasonable, but...). The trouble starts on kernels with NO_HZ and
HPET timers, where it actually gives an epoll timeout of 2ms.

The problem is that on reasonably loaded servers the Tomcat Java processes
start to dominate the wakeup and timer interrupt statistics, because each
APR connector thread is woken up ~480 times per second.
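
To make the numbers concrete, here is a minimal sketch of the arithmetic in
Python. It is a simplified model based only on the behaviour described above,
not on the Tomcat, APR, or kernel sources: the function names are made up for
illustration, and the assumption that a tick-based kernel rounds the timeout
up to a whole tick is mine.

```python
import math


def effective_timeout_us(poll_time_us, hz=None):
    """Simplified model of the timeout a poller actually gets.

    With a periodic tick of `hz`, the timeout is assumed to be rounded up
    to a whole number of ticks. hz=None models a tickless (NO_HZ) kernel
    with high-resolution timers, where the requested value is honoured.
    """
    if hz is None:
        return poll_time_us
    tick_us = 1_000_000 // hz
    return math.ceil(poll_time_us / tick_us) * tick_us


def idle_wakeups_per_second(timeout_us):
    """Upper bound on timeout-driven wakeups of one idle poller thread."""
    return 1_000_000 / timeout_us


for label, hz in (("100HZ kernel", 100), ("NO_HZ + HPET", None)):
    t = effective_timeout_us(2000, hz)
    print(f"{label}: timeout {t} us, "
          f"up to {idle_wakeups_per_second(t):.0f} wakeups/s")
```

For the NO_HZ + HPET case the model gives a ceiling of 500 wakeups per second
per thread, which is in line with the ~480 observed.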

pidstat  -t -w -C java 1

will show those threads and the ~480 context switches per second they are
causing.

and you can confirm the reason for those wakeups with:

gdb -batch -ex bt -p 4056      

warning: process 4056 is a cloned process
[Thread debugging using libthread_db enabled]
0x00007f652a096623 in epoll_wait () from /lib/libc.so.6
#0  0x00007f652a096623 in epoll_wait () from /lib/libc.so.6
#1  0x00007f6521147ca3 in ?? () from /usr/lib/libapr-1.so.0
#2  0x00007f6521146908 in apr_pollset_poll () from /usr/lib/libapr-1.so.0
#3  0x00007f652196b2b3 in Java_org_apache_tomcat_jni_Poll_poll
(e=0x40fbf9c8, o=<value optimized out>, pollset=1092883016, timeout=2000,
set=0x7f651b622750,
    remove=1 '\001') at src/poll.c:311


Also, if you run strace -r -p APRconnectorpid, you will see a mass of
epoll_wait calls going on, most of them doing absolutely nothing.


Does Tomcat APR really need pollTime set so low by default? I thought the
timeout was meant for some sort of bookkeeping: if all connections in the FD
set are "idle" and no events come in for the timeout period, you force a
timeout and do the bookkeeping. On a busy system you will get events anyway
because of socket traffic. Also, the connection timeout is 60s by default, so
ending a connection with 2ms precision does not improve latency in any way.

I think the defaults should be increased to something reasonable like 100ms
(pollTime="100000") to avoid unneeded wakeups. Wakeups are bad because they
cause context switches, and context switches pollute caches and TLBs and, on
modern servers, burn electricity by forcing CPUs out of low-power C-states.


P.S. There is a perfect workaround in the latest Tomcat 7: using
protocol="org.apache.coyote.http11.Http11NioProtocol" for HTTP and
protocol="org.apache.coyote.ajp.AjpNioProtocol" for AJP does away with all
the unneeded context switches.




-- 
View this message in context: http://old.nabble.com/APR-connector-pollTime-defaults-are-strange-in-tomcat6-7-tp32085364p32085364.html
Sent from the Tomcat - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: APR connector pollTime defaults are strange in tomcat6/7

Posted by "Darius D." <da...@gmail.com>.
My goal with this thread was to raise awareness of the APR connector pollTime
defaults, as some users will not bother investigating why their servers show
such high numbers of context switches / timer interrupts. There is no
"problem" here, as Tomcat works fine with the defaults.

There should be no harm in setting it to 1000000 microseconds, as the NIO
connectors use 1000ms as the default selectorTimeout (same epoll inside)
and work just fine.





Christopher Schultz-2 wrote:
> 
> Darius,
> 
> On 7/18/2011 2:23 PM, Darius D. wrote:
>> Does Tomcat APR really need pollTime set so low by default? I
>> thought the timeout was meant for some sort of bookkeeping: if all
>> connections in the FD set are "idle" and no events come in for the
>> timeout period, you force a timeout and do the bookkeeping. On a
>> busy system you will get events anyway because of socket traffic.
>> Also, the connection timeout is 60s by default, so ending a
>> connection with 2ms precision does not improve latency in any way.
> 
> Seems like a reasonable question.
> 
>> P.S. There is a perfect workaround in the latest Tomcat 7: using
>> protocol="org.apache.coyote.http11.Http11NioProtocol" for HTTP and
>> protocol="org.apache.coyote.ajp.AjpNioProtocol" for AJP does away
>> with all the unneeded context switches.
> 
> Yes, switching from APR connector to another one certainly does
> alleviate any issues you are experiencing by using the APR connector.
> This isn't really a workaround. :)
> 
> On the other hand, a better "workaround" would be to set these values
> appropriately for your environment. What's stopping you from setting the
> pollTime to, as you suggest, 100000 microseconds? That isn't really a
> workaround, either: it's proper configuration.
> 
> It's probably worth discussing what the defaults should be, but there's
> a perfectly reasonable course of action for you at this point: change
> the configuration.
> 
> -chris



Re: APR connector pollTime defaults are strange in tomcat6/7

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Darius,

On 7/18/2011 2:23 PM, Darius D. wrote:
> Does Tomcat APR really need pollTime set so low by default? I
> thought the timeout was meant for some sort of bookkeeping: if all
> connections in the FD set are "idle" and no events come in for the
> timeout period, you force a timeout and do the bookkeeping. On a
> busy system you will get events anyway because of socket traffic.
> Also, the connection timeout is 60s by default, so ending a
> connection with 2ms precision does not improve latency in any way.

Seems like a reasonable question.

> P.S. There is a perfect workaround in the latest Tomcat 7: using
> protocol="org.apache.coyote.http11.Http11NioProtocol" for HTTP and
> protocol="org.apache.coyote.ajp.AjpNioProtocol" for AJP does away
> with all the unneeded context switches.

Yes, switching from APR connector to another one certainly does
alleviate any issues you are experiencing by using the APR connector.
This isn't really a workaround. :)

On the other hand, a better "workaround" would be to set these values
appropriately for your environment. What's stopping you from setting the
pollTime to, as you suggest, 100000 microseconds? That isn't really a
workaround, either: it's proper configuration.

It's probably worth discussing what the defaults should be, but there's
a perfectly reasonable course of action for you at this point: change
the configuration.

-chris


Re: APR connector pollTime defaults are strange in tomcat6/7

Posted by Marvin Addison <ma...@gmail.com>.
> Does Tomcat APR really need pollTime set so low by default?

Anyone care to comment on this point?  I'm interested in this
discussion as a user of Linux+APR connectors.  While we don't yet run
on a tickless kernel, I'm considering trying to measure the impact on
our systems as well, but some insight on the rationale for the
defaults would be helpful.

M



Re: APR connector pollTime defaults are strange in tomcat6/7

Posted by "Darius D." <da...@gmail.com>.

Darius D. wrote:
> 
> 
> Does Tomcat APR really need pollTime set so low by default? I thought the
> timeout was meant for some sort of bookkeeping: if all connections in the
> FD set are "idle" and no events come in for the timeout period, you force
> a timeout and do the bookkeeping. On a busy system you will get events
> anyway because of socket traffic. Also, the connection timeout is 60s by
> default, so ending a connection with 2ms precision does not improve
> latency in any way.
> 
> I think the defaults should be increased to something reasonable like 100ms
> (pollTime="100000") to avoid unneeded wakeups. Wakeups are bad because they
> cause context switches, and context switches pollute caches and TLBs and,
> on modern servers, burn electricity by forcing CPUs out of low-power
> C-states.
> 
> 


I guess there is no interest in efficiency and reducing overhead with APR
connectors? The overhead is quite substantial. Consider the following: on a
lightly loaded system we were seeing ~1.8k timer interrupts and context
switches with a Linux 2.6.39 kernel and the latest Tomcat 7 + 1.20 TCNative +
APR. And it's easy to see where they are coming from: 3 connectors (AJP 8009,
HTTP, HTTPS), all APR, all with a pollTime of 2000 microseconds. So we were
getting ~500x3 context switches from all those epoll_wait(...,2ms) calls, and
they were just burning CPU and polluting caches.
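
The ~500x3 estimate can be checked with a few lines of Python. This is only a
back-of-the-envelope sketch: it assumes one poller thread per connector that
times out once per pollTime interval when idle, and the helper name is
hypothetical. The 100000 and 1000000 microsecond values are the alternatives
suggested elsewhere in this thread.

```python
def total_idle_wakeups_per_second(poll_time_us, pollers=3):
    """Upper bound on timeout-driven wakeups across all idle pollers.

    Assumes each poller times out at most once per poll_time_us and that
    the kernel honours the requested timeout (NO_HZ + HPET case).
    """
    return pollers * 1_000_000 / poll_time_us


# 2000 us is the default; the other two are values proposed in the thread.
for poll_time_us in (2000, 100_000, 1_000_000):
    rate = total_idle_wakeups_per_second(poll_time_us)
    print(f"pollTime={poll_time_us} us: up to {rate:.0f} wakeups/s")
```

The 2000 microsecond row gives 1500 wakeups per second for three connectors,
which accounts for most of the ~1.8k observed.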

After switching to NIO connectors on the same system under the same load,
context switches and interrupts are down to ~600.
(Note that to reproduce this you need a system with a NO_HZ kernel and HPET
to actually get an epoll_wait timeout of 2000us, instead of the ~1/HZ
minimum (10ms on a 100HZ kernel) that you get on normal kernels.)

I have attached a screenshot from the munin IRQ stats display:
http://old.nabble.com/file/p32115035/irqstats-week.png irqstats-week.png

So the results are pretty obvious.