Posted to users@tomcat.apache.org by Janne Jalkanen <Ja...@ecyrd.com> on 2011/12/31 14:06:38 UTC

File leak in 7.0.23?

Hi all!

I am seeing odd behaviour with 7.0.23, with the tomcat user's open file count increasing slowly, but consistently. Two other instances running the exact same codebase on identical hardware, BUT with Tomcat 7.0.20, are not exhibiting the same behaviour. 7.0.20 is rock solid, 7.0.23 dies due to too many open files every now and then. I've increased ulimit for now, but this is still a bit nasty.

Any suggestions where to look? Should I file a bug?

lsof says

...
java    21299 ubuntu   87r     sock                0,6       0t0  31983006 can't identify protocol
java    21299 ubuntu   88r     sock                0,6       0t0  31983007 can't identify protocol
java    21299 ubuntu   89r     sock                0,6       0t0  31983008 can't identify protocol
java    21299 ubuntu   90r     sock                0,6       0t0  31989046 can't identify protocol
java    21299 ubuntu   91r     sock                0,6       0t0  31986504 can't identify protocol
java    21299 ubuntu   92r     sock                0,6       0t0  31987223 can't identify protocol
...

with a new one every couple of minutes.
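To put a number on "increasing slowly, but consistently", the descriptor count of the process can be compared with its limit. The following is only a minimal sketch, assuming a Linux /proc filesystem. The function names are made up for illustration, and the optional second argument (an alternative /proc root) exists only so the functions can be tried against a fake directory tree.

```shell
#!/bin/sh
# Sketch: report how many file descriptors a process holds and its soft limit.
# Assumes Linux; fd_count and fd_soft_limit are hypothetical helper names.

fd_count() {
    # Each entry under <proc>/<pid>/fd is one open descriptor.
    ls "${2:-/proc}/$1/fd" | wc -l | tr -d ' '
}

fd_soft_limit() {
    # The "Max open files" row of <proc>/<pid>/limits lists the soft limit,
    # the hard limit and the unit, in that order.
    awk '/^Max open files/ { print $4 }' "${2:-/proc}/$1/limits"
}

# Usage (21299 is the PID from the lsof listing above):
#   echo "$(fd_count 21299) of $(fd_soft_limit 21299) descriptors in use"
```

Run from cron or a loop, this would show how quickly the process approaches the limit without waiting for the SocketException.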

java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

/usr/bin/java -Dnop -server -Xmx1024m -Xms128m -XX:MaxPermSize=256m -Dcom.sun.management.jmxremote.port=<redacted> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.endorsed.dirs=/home/ubuntu/tomcat-bin/current/endorsed -classpath /home/ubuntu/tomcat-bin/current/bin/bootstrap.jar:/home/ubuntu/tomcat-bin/current/bin/tomcat-juli.jar -Dcatalina.base=/home/ubuntu/tomcat-run -Dcatalina.home=/home/ubuntu/tomcat-bin/current -Djava.io.tmpdir=/home/ubuntu/tomcat-run/temp org.apache.catalina.startup.Bootstrap start

OS: Ubuntu 10.04 LTS, kernel 2.6.32 smp.

No OOM issues, no errors in log files, until the eventual SocketException when ulimit is reached:

30-Dec-2011 19:05:47 sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop
WARNING: RMI TCP Accept-0: accept loop for ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=41709] throws
java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
        at java.net.ServerSocket.implAccept(ServerSocket.java:462)
        at java.net.ServerSocket.accept(ServerSocket.java:430)
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)        
        at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
        at java.lang.Thread.run(Thread.java:662)

/Janne
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: File leak in 7.0.23?

Posted by Mike Wertheim <mw...@hyperreal.org>.
If a simple test case isn't discovered, I'm happy to test out
potential fixes (by deploying the fixes to one of my production
servers and seeing whether or not the server dies after running for a
day).

On Sun, Jan 1, 2012 at 7:17 AM,  <ma...@apache.org> wrote:
> Janne Jalkanen <Ja...@ecyrd.com> wrote:
>
>>APR + native. Good catch there, I took apr out and I am no longer
>>seeing the FD leak.
>
> OK. Sounds like APR/native has an issue. There was a fair bit of refactoring in 7.0.22.
>
> I'll see if I can reproduce it. A simple test case may help.
>
> Mark
>
>
>


Re: File leak in 7.0.23?

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Janne,

On 1/1/12 12:23 PM, Janne Jalkanen wrote:
>> OK. Sounds like APR/native has an issue. There was a fair bit of
>> refactoring in 7.0.22.
>> 
>> I'll see if I can reproduce it. A simple test case may help.
> 
> Should I open a bug now on the issue tracker?

Yes. That would be the best place to post a simple test case (in the
form of a webapp, please).

-chris


Re: File leak in 7.0.23?

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
> OK. Sounds like APR/native has an issue. There was a fair bit of refactoring in 7.0.22.
> 
> I'll see if I can reproduce it. A simple test case may help.

Should I open a bug now on the issue tracker?

/Janne


Re: File leak in 7.0.23?

Posted by ma...@apache.org.
Janne Jalkanen <Ja...@ecyrd.com> wrote:

>APR + native. Good catch there, I took apr out and I am no longer
>seeing the FD leak. 

OK. Sounds like APR/native has an issue. There was a fair bit of refactoring in 7.0.22.

I'll see if I can reproduce it. A simple test case may help.

Mark





Re: File leak in 7.0.23?

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
APR + native. Good catch there, I took apr out and I am no longer seeing the FD leak. 

I was running APR 1.3.8 with TC-native 1.1.19, so I thought that maybe these old versions were the problem. But I upgraded to APR 1.4.5 and tc-native 1.1.22 (the latest stables from both) and am still seeing the FD leak, whenever APR+native is enabled.

/Janne

On Dec 31, 2011, at 23:05 , Mark Thomas wrote:

> On 31/12/2011 13:38, Janne Jalkanen wrote:
>>> Which Connector are you using?
>> 
>>    <Connector port="8080" protocol="HTTP/1.1" 
>>               connectionTimeout="20000" 
>>               redirectPort="8443" 
>>               URIEncoding="UTF-8"
>>               compression="on"
>>               compressableMimeType="text/html,text/plain,text/css,text/javascript,application/json,application/javascript"
>>               noCompressionUserAgents=".*MSIE 6.*"/>
> 
> Is that APR/native or BIO?
> 
> Mark
> 


Re: File leak in 7.0.23?

Posted by Mark Thomas <ma...@apache.org>.
On 31/12/2011 13:38, Janne Jalkanen wrote:
>> Which Connector are you using?
> 
>     <Connector port="8080" protocol="HTTP/1.1" 
>                connectionTimeout="20000" 
>                redirectPort="8443" 
>                URIEncoding="UTF-8"
>                compression="on"
>                compressableMimeType="text/html,text/plain,text/css,text/javascript,application/json,application/javascript"
>                noCompressionUserAgents=".*MSIE 6.*"/>

Is that APR/native or BIO?
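The Connector element alone does not answer this: in Tomcat 7, protocol="HTTP/1.1" selects the APR/native implementation when the native library has been loaded and falls back to BIO otherwise. One way to check is to look at the startup log. This is a rough sketch only. It assumes the log contains ProtocolHandler lines of the form shown in the comment (recalled from Tomcat 7 and possibly different in other versions), and connector_types is a made-up helper name.

```shell
#!/bin/sh
# Sketch: print the connector implementations named in a Tomcat startup log.
# Assumption: the log contains lines such as
#   INFO: Starting ProtocolHandler ["http-apr-8080"]
# where the middle part (bio, nio or apr) names the implementation.

connector_types() {
    grep -o 'ProtocolHandler \["[a-z]*-[a-z]*-[0-9]*"]' "$1" |
        sed 's/.*\["\(.*\)"]/\1/' | sort -u
}

# Usage (the log path is an assumption based on catalina.base):
#   connector_types /home/ubuntu/tomcat-run/logs/catalina.out
```

A result containing http-apr-8080 would indicate APR/native, and http-bio-8080 would indicate the blocking Java connector.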

Mark



Re: File leak in 7.0.23?

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
> If you take heap dumps, can you see an increase in the number of sockets?

What should I be looking for, assuming that the problem is with the APR connector? Any particular class?

/Janne


Re: File leak in 7.0.23?

Posted by Pid * <pi...@pidster.com>.
On 31 Dec 2011, at 20:05, Janne Jalkanen <Ja...@ecyrd.com> wrote:

>> What if you remove the command-line switch
>> -Dcom.sun.management.jmxremote.port=<redacted>
>> ?
>
> No effect.
>
>> Also, what does
>> netstat -pan | grep <tomcat pid>
>> have to say ?
>
> Nothing unusual (i.e. stuff I wouldn't expect) or different from the other instances: Database connections, listening on 8080, etc.
>
> It looks like 7.0.20 and 7.0.21 constantly have a couple of FDs open in the "can't identify protocol" state; with 7.0.22 the number increases slowly, and with 7.0.23 it increases a bit more rapidly.  The rate of change seems to correlate with the number of requests served by Tomcat.

If you take heap dumps, can you see an increase in the number of sockets?


p


> /Janne


Re: File leak in 7.0.23?

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
> What if you remove the command-line switch
> -Dcom.sun.management.jmxremote.port=<redacted>
> ?

No effect.

> Also, what does
> netstat -pan | grep <tomcat pid>
> have to say ?

Nothing unusual (i.e. stuff I wouldn't expect) or different from the other instances: Database connections, listening on 8080, etc.

It looks like 7.0.20 and 7.0.21 constantly have a couple of FDs open in the "can't identify protocol" state; with 7.0.22 the number increases slowly, and with 7.0.23 it increases a bit more rapidly.  The rate of change seems to correlate with the number of requests served by Tomcat.

/Janne


Re: File leak in 7.0.23?

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
At the moment my testing suggests that 7.0.21 works, whereas 7.0.22 is leaky. Will continue to investigate.

/Janne

On Dec 31, 2011, at 20:18 , Mike Wertheim wrote:

> I'm not sure how useful this comment is, but...  I also recently
> posted about an app that runs fine on Tomcat 7.0.21 and dies a slow
> horrible death on Tomcat 7.0.23.  It would seem that a bug was
> introduced in either 7.0.22 or 7.0.23.
> 
> 
> 
> On Sat, Dec 31, 2011 at 6:48 AM, André Warnier <aw...@ice-sa.com> wrote:
>> Janne Jalkanen wrote:
>>>> 
>>>> When did the problem start occurring and what else has changed?
>>> 
>>> 
>>> Exactly at the time when I upgraded to 7.0.23.  I don't recall making any
>>> other modifications (I would've suspected them first ;-)
>>> 
>>> I will try to downgrade to 7.0.22 and lower to try and see if there's a
>>> difference between Tomcat versions.
>>> 
>>>> Can you stop monitoring with Munin and see if the problem goes away?
>>>> If it does, I would consider that Munin may not be properly closing the
>>>> connections it makes.
>>> 
>>> 
>>> I did, and the problem did not go away. Open file count still increasing.
>>> 
>> What if you remove the command-line switch
>> -Dcom.sun.management.jmxremote.port=<redacted>
>> ?
>> 
>> (just trying to figure out what these sockets are..)
>> 
>> Also, what does
>> netstat -pan | grep <tomcat pid>
>> have to say ?
>> 
>> 


Re: File leak in 7.0.23?

Posted by Pid * <pi...@pidster.com>.
On 31 Dec 2011, at 20:03, Mike Wertheim <mw...@hyperreal.org> wrote:

> Janne's latest email says that 7.0.22 is leaky as well.  So the
> regression most likely happened between 7.0.21 and 7.0.22.
>
> I'm not familiar with the Tomcat code base.  But I wonder how
> difficult it would be for someone to review all of the code changes
> that were checked in between 7.0.21 and 7.0.22, keeping an eye out for
> anything that could be causing a file leak.

Right now & for me: hard, because I'm using an iPhone.

Tomorrow easier: but having a minimal test case for reproducing the
error will make things easier.


p


(p.s. please don't top-post)

> On Sat, Dec 31, 2011 at 11:17 AM, Pid * <pi...@pidster.com> wrote:
>> On 31 Dec 2011, at 18:19, Mike Wertheim <mw...@hyperreal.org> wrote:
>>
>>> I'm not sure how useful this comment is, but...  I also recently
>>> posted about an app that runs fine on Tomcat 7.0.21 and dies a slow
>>> horrible death on Tomcat 7.0.23.  It would seem that a bug was
>>> introduced in either 7.0.22 or 7.0.23.
>>
>> There are now 3 threads with non-obvious issues referencing 7.0.23. I
>> don't yet see any common ground.
>>
>> Following them all and comparing notes isn't unreasonable.
>>
>>
>> p
>>
>>
>>> On Sat, Dec 31, 2011 at 6:48 AM, André Warnier <aw...@ice-sa.com> wrote:
>>>> Janne Jalkanen wrote:
>>>>>>
>>>>>> When did the problem start occurring and what else has changed?
>>>>>
>>>>>
>>>>> Exactly at the time when I upgraded to 7.0.23.  I don't recall making any
>>>>> other modifications (I would've suspected them first ;-)
>>>>>
>>>>> I will try to downgrade to 7.0.22 and lower to try and see if there's a
>>>>> difference between Tomcat versions.
>>>>>
>>>>>> Can you stop monitoring with Munin and see if the problem goes away?
>>>>>> If it does, I would consider that Munin may not be properly closing the
>>>>>> connections it makes.
>>>>>
>>>>>
>>>>> I did, and the problem did not go away. Open file count still increasing.
>>>>>
>>>> What if you remove the command-line switch
>>>> -Dcom.sun.management.jmxremote.port=<redacted>
>>>> ?
>>>>
>>>> (just trying to figure out what these sockets are..)
>>>>
>>>> Also, what does
>>>> netstat -pan | grep <tomcat pid>
>>>> have to say ?
>>>>
>>>>


Re: File leak in 7.0.23?

Posted by Mike Wertheim <mw...@hyperreal.org>.
Janne's latest email says that 7.0.22 is leaky as well.  So the
regression most likely happened between 7.0.21 and 7.0.22.

I'm not familiar with the Tomcat code base.  But I wonder how
difficult it would be for someone to review all of the code changes
that were checked in between 7.0.21 and 7.0.22, keeping an eye out for
anything that could be causing a file leak.



On Sat, Dec 31, 2011 at 11:17 AM, Pid * <pi...@pidster.com> wrote:
> On 31 Dec 2011, at 18:19, Mike Wertheim <mw...@hyperreal.org> wrote:
>
>> I'm not sure how useful this comment is, but...  I also recently
>> posted about an app that runs fine on Tomcat 7.0.21 and dies a slow
>> horrible death on Tomcat 7.0.23.  It would seem that a bug was
>> introduced in either 7.0.22 or 7.0.23.
>
> There are now 3 threads with non-obvious issues referencing 7.0.23. I
> don't yet see any common ground.
>
> Following them all and comparing notes isn't unreasonable.
>
>
> p
>
>
>> On Sat, Dec 31, 2011 at 6:48 AM, André Warnier <aw...@ice-sa.com> wrote:
>>> Janne Jalkanen wrote:
>>>>>
>>>>> When did the problem start occurring and what else has changed?
>>>>
>>>>
>>>> Exactly at the time when I upgraded to 7.0.23.  I don't recall making any
>>>> other modifications (I would've suspected them first ;-)
>>>>
>>>> I will try to downgrade to 7.0.22 and lower to try and see if there's a
>>>> difference between Tomcat versions.
>>>>
>>>>> Can you stop monitoring with Munin and see if the problem goes away?
>>>>> If it does, I would consider that Munin may not be properly closing the
>>>>> connections it makes.
>>>>
>>>>
>>>> I did, and the problem did not go away. Open file count still increasing.
>>>>
>>> What if you remove the command-line switch
>>> -Dcom.sun.management.jmxremote.port=<redacted>
>>> ?
>>>
>>> (just trying to figure out what these sockets are..)
>>>
>>> Also, what does
>>> netstat -pan | grep <tomcat pid>
>>> have to say ?
>>>
>>>


Re: File leak in 7.0.23?

Posted by Pid * <pi...@pidster.com>.
On 31 Dec 2011, at 18:19, Mike Wertheim <mw...@hyperreal.org> wrote:

> I'm not sure how useful this comment is, but...  I also recently
> posted about an app that runs fine on Tomcat 7.0.21 and dies a slow
> horrible death on Tomcat 7.0.23.  It would seem that a bug was
> introduced in either 7.0.22 or 7.0.23.

There are now 3 threads with non-obvious issues referencing 7.0.23. I
don't yet see any common ground.

Following them all and comparing notes isn't unreasonable.


p


> On Sat, Dec 31, 2011 at 6:48 AM, André Warnier <aw...@ice-sa.com> wrote:
>> Janne Jalkanen wrote:
>>>>
>>>> When did the problem start occurring and what else has changed?
>>>
>>>
>>> Exactly at the time when I upgraded to 7.0.23.  I don't recall making any
>>> other modifications (I would've suspected them first ;-)
>>>
>>> I will try to downgrade to 7.0.22 and lower to try and see if there's a
>>> difference between Tomcat versions.
>>>
>>>> Can you stop monitoring with Munin and see if the problem goes away?
>>>> If it does, I would consider that Munin may not be properly closing the
>>>> connections it makes.
>>>
>>>
>>> I did, and the problem did not go away. Open file count still increasing.
>>>
>> What if you remove the command-line switch
>> -Dcom.sun.management.jmxremote.port=<redacted>
>> ?
>>
>> (just trying to figure out what these sockets are..)
>>
>> Also, what does
>> netstat -pan | grep <tomcat pid>
>> have to say ?
>>
>>


Re: File leak in 7.0.23?

Posted by Mike Wertheim <mw...@hyperreal.org>.
I'm not sure how useful this comment is, but...  I also recently
posted about an app that runs fine on Tomcat 7.0.21 and dies a slow
horrible death on Tomcat 7.0.23.  It would seem that a bug was
introduced in either 7.0.22 or 7.0.23.



On Sat, Dec 31, 2011 at 6:48 AM, André Warnier <aw...@ice-sa.com> wrote:
> Janne Jalkanen wrote:
>>>
>>> When did the problem start occurring and what else has changed?
>>
>>
>> Exactly at the time when I upgraded to 7.0.23.  I don't recall making any
>> other modifications (I would've suspected them first ;-)
>>
>> I will try to downgrade to 7.0.22 and lower to try and see if there's a
>> difference between Tomcat versions.
>>
>>> Can you stop monitoring with Munin and see if the problem goes away?
>>> If it does, I would consider that Munin may not be properly closing the
>>> connections it makes.
>>
>>
>> I did, and the problem did not go away. Open file count still increasing.
>>
> What if you remove the command-line switch
> -Dcom.sun.management.jmxremote.port=<redacted>
> ?
>
> (just trying to figure out what these sockets are..)
>
> Also, what does
> netstat -pan | grep <tomcat pid>
> have to say ?
>
>


Re: File leak in 7.0.23?

Posted by André Warnier <aw...@ice-sa.com>.
Janne Jalkanen wrote:
>> When did the problem start occurring and what else has changed?
> 
> Exactly at the time when I upgraded to 7.0.23.  I don't recall making any other modifications (I would've suspected them first ;-)
> 
> I will try to downgrade to 7.0.22 and lower to try and see if there's a difference between Tomcat versions.
> 
>> Can you stop monitoring with Munin and see if the problem goes away?
>> If it does, I would consider that Munin may not be properly closing the
>> connections it makes.
> 
> I did, and the problem did not go away. Open file count still increasing.
> 
What if you remove the command-line switch
-Dcom.sun.management.jmxremote.port=<redacted>
?

(just trying to figure out what these sockets are..)

Also, what does
netstat -pan | grep <tomcat pid>
have to say ?
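A plain grep on the PID returns every matching row, which can be hard to compare between instances. As a sketch only, the output could be reduced to a count per TCP state. This assumes the Linux net-tools column layout noted in the comment, and tcp_states_for_pid is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: summarise TCP socket states for one PID from `netstat -pan` output
# read on stdin. Assumes the Linux net-tools layout, where TCP rows are
#   Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program

tcp_states_for_pid() {
    awk -v pid="$1" '
        $1 ~ /^tcp/ && $7 ~ ("^" pid "/") { count[$6]++ }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Usage (run as root or as the Tomcat user so the PID column is filled in):
#   netstat -pan | tcp_states_for_pid 21299
```

Matching on the PID column rather than the whole line avoids counting rows where the same digits happen to appear in a port number.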



Re: File leak in 7.0.23?

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
> When did the problem start occurring and what else has changed?

Exactly at the time when I upgraded to 7.0.23.  I don't recall making any other modifications (I would've suspected them first ;-)

I will try to downgrade to 7.0.22 and lower to try and see if there's a difference between Tomcat versions.

> Can you stop monitoring with Munin and see if the problem goes away?
> If it does, I would consider that Munin may not be properly closing the
> connections it makes.

I did, and the problem did not go away. Open file count still increasing.

/Janne




Re: File leak in 7.0.23?

Posted by Pid <pi...@pidster.com>.
On 31/12/2011 13:38, Janne Jalkanen wrote:
>> Which Connector are you using?
> 
>     <Connector port="8080" protocol="HTTP/1.1" 
>                connectionTimeout="20000" 
>                redirectPort="8443" 
>                URIEncoding="UTF-8"
>                compression="on"
>                compressableMimeType="text/html,text/plain,text/css,text/javascript,application/json,application/javascript"
>                noCompressionUserAgents=".*MSIE 6.*"/>
> 
> 
>> What is the minimum that is required to reproduce the issue?
> 
> Wish I knew. This is from our production cluster, so I'm a bit hesitant to start experimenting with it. However, there was no configuration change between 7.0.20 and 7.0.23, I just shut down tomcat, unpacked the 7.0.23 tar.gz, changed symlink to point at the new binaries, and restarted.
> 
>>
>>
>>> lsof says
>>
>> Did you supply any options to lsof?
> 
> lsof -nP -u ubuntu|grep 21299|grep "identify protocol"
> 
>> Are you using RMI for anything in your application or is this a
>> reference to the JMX connection?
> 
> I am using RMI only for JMX monitoring. The socket exception times *do* roughly correspond to Munin contacting my application every five minutes for getting monitoring data, which is logical because they would be requiring a set of new file handles.

When did the problem start occurring and what else has changed?

Can you stop monitoring with Munin and see if the problem goes away?
If it does, I would consider that Munin may not be properly closing the
connections it makes.


p

> Ah yes, I also have Terracotta session clustering configured using their Valve. That might be using RMI, but I would be expecting to see a lot more errors in that case.
> 
> /Janne
> 
> 


-- 

[key:62590808]


Re: File leak in 7.0.23?

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
> Which Connector are you using?

    <Connector port="8080" protocol="HTTP/1.1" 
               connectionTimeout="20000" 
               redirectPort="8443" 
               URIEncoding="UTF-8"
               compression="on"
               compressableMimeType="text/html,text/plain,text/css,text/javascript,application/json,application/javascript"
               noCompressionUserAgents=".*MSIE 6.*"/>


> What is the minimum that is required to reproduce the issue?

Wish I knew. This is from our production cluster, so I'm a bit hesitant to start experimenting with it. However, there was no configuration change between 7.0.20 and 7.0.23, I just shut down tomcat, unpacked the 7.0.23 tar.gz, changed symlink to point at the new binaries, and restarted.

> 
> 
>> lsof says
> 
> Did you supply any options to lsof?

lsof -nP -u ubuntu|grep 21299|grep "identify protocol"
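The same filter can be turned into a counter for tracking the growth over time. This is a minimal sketch, not a tested monitoring script: count_unidentified is a made-up name, and it assumes the PID is the second column of the lsof output, as in the listing quoted in this thread.

```shell
#!/bin/sh
# Sketch: count descriptors that lsof reports as "can't identify protocol".
# Reads lsof output on stdin and matches the PID against column 2 only.

count_unidentified() {
    # "can.t" stands in for the apostrophe, which cannot appear unescaped
    # inside a single-quoted awk program.
    awk -v pid="$1" '$2 == pid && /can.t identify protocol/ { n++ }
                     END { print n + 0 }'
}

# Usage: sample once a minute to see how fast the count grows.
#   while sleep 60; do
#       printf '%s %s\n' "$(date +%T)" \
#           "$(lsof -nP -u ubuntu | count_unidentified 21299)"
#   done
```

Comparing column 2 with the PID, instead of grepping for 21299 anywhere on the line, avoids false matches when the same digits show up in another column such as the node number.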

> Are you using RMI for anything in your application or is this a
> reference to the JMX connection?

I am using RMI only for JMX monitoring. The socket exception times *do* roughly correspond to Munin contacting my application every five minutes for getting monitoring data, which is logical because they would be requiring a set of new file handles.

Ah yes, I also have Terracotta session clustering configured using their Valve. That might be using RMI, but I would be expecting to see a lot more errors in that case.

/Janne




Re: File leak in 7.0.23?

Posted by Pid <pi...@pidster.com>.
On 31/12/2011 13:06, Janne Jalkanen wrote:
> Hi all!
> 
> I am seeing odd behaviour with 7.0.23, with the tomcat user's open file count increasing slowly, but consistently. Two other instances running the exact same codebase on identical hardware, BUT with Tomcat 7.0.20, are not exhibiting the same behaviour. 7.0.20 is rock solid, 7.0.23 dies due to too many open files every now and then. I've increased ulimit for now, but this is still a bit nasty.
> 
> Any suggestions where to look? Should I file a bug?

Which Connector are you using?

What is the minimum that is required to reproduce the issue?


> lsof says

Did you supply any options to lsof?

> 
> ...
> java    21299 ubuntu   87r     sock                0,6       0t0  31983006 can't identify protocol
> java    21299 ubuntu   88r     sock                0,6       0t0  31983007 can't identify protocol
> java    21299 ubuntu   89r     sock                0,6       0t0  31983008 can't identify protocol
> java    21299 ubuntu   90r     sock                0,6       0t0  31989046 can't identify protocol
> java    21299 ubuntu   91r     sock                0,6       0t0  31986504 can't identify protocol
> java    21299 ubuntu   92r     sock                0,6       0t0  31987223 can't identify protocol
> ...
> 
> with a new one every couple of minutes.
> 
> java version "1.6.0_26"
> Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
> 
> /usr/bin/java -Dnop -server -Xmx1024m -Xms128m -XX:MaxPermSize=256m -Dcom.sun.management.jmxremote.port=<redacted> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.endorsed.dirs=/home/ubuntu/tomcat-bin/current/endorsed -classpath /home/ubuntu/tomcat-bin/current/bin/bootstrap.jar:/home/ubuntu/tomcat-bin/current/bin/tomcat-juli.jar -Dcatalina.base=/home/ubuntu/tomcat-run -Dcatalina.home=/home/ubuntu/tomcat-bin/current -Djava.io.tmpdir=/home/ubuntu/tomcat-run/temp org.apache.catalina.startup.Bootstrap start
> 
> OS: Ubuntu 10.04 LTS, kernel 2.6.32 smp.
> 
> No OOM issues, no errors in log files, until the eventual SocketException when ulimit is reached:
> 
> 30-Dec-2011 19:05:47 sun.rmi.transport.tcp.TCPTransport$AcceptLoop executeAcceptLoop
> WARNING: RMI TCP Accept-0: accept loop for ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=41709] throws

Are you using RMI for anything in your application or is this a
reference to the JMX connection?

Are you using JMX for something specific?


p

> java.net.SocketException: Too many open files
>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>         at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
>         at java.net.ServerSocket.implAccept(ServerSocket.java:462)
>         at java.net.ServerSocket.accept(ServerSocket.java:430)
>         at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)        
>         at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
>         at java.lang.Thread.run(Thread.java:662)
> 
> /Janne


-- 

[key:62590808]


Re: File leak in 7.0.23?

Posted by Mike Wertheim <mw...@hyperreal.org>.
On Mon, Jan 2, 2012 at 4:31 PM, Konstantin Kolinko
<kn...@gmail.com> wrote:
> 2011/12/31 Janne Jalkanen <Ja...@ecyrd.com>:
>> Hi all!
>>
>> I am seeing odd behaviour with 7.0.23, with the tomcat user's open file count increasing slowly, but consistently. Two other instances running the exact same codebase on identical hardware, BUT with Tomcat 7.0.20, are not exhibiting the same behaviour. 7.0.20 is rock solid, 7.0.23 dies due to too many open files every now and then. I've increased ulimit for now, but this is still a bit nasty.
>>
>> Any suggestions where to look? Should I file a bug?
>>
>> lsof says
>>
>> ...
>> java    21299 ubuntu   87r     sock                0,6       0t0  31983006 can't identify protocol
>> java    21299 ubuntu   88r     sock                0,6       0t0  31983007 can't identify protocol
>> java    21299 ubuntu   89r     sock                0,6       0t0  31983008 can't identify protocol
>> java    21299 ubuntu   90r     sock                0,6       0t0  31989046 can't identify protocol
>> java    21299 ubuntu   91r     sock                0,6       0t0  31986504 can't identify protocol
>> java    21299 ubuntu   92r     sock                0,6       0t0  31987223 can't identify protocol
>> ...
>>
>> with a new one every couple of minutes.
>
> I wonder whether it is possible to match entries in the access log
> against the lsof output, to get a clue about what kinds of requests
> result in the leak:
>
> a) normal dynamic requests
> b) comet or asynchronous requests
> c) requests for static files that are served via sendfile

For my app (which fails on 7.0.23 and uses the APR Connector), there
are no comet requests.  99% of the requests are normal dynamic
requests, and the remaining 1% are static files.



> Maybe there are timeouts, or connections closed on the client side.
>
>
> It should be easy to exclude sendfile:
>
> a) When sendfile is used, it is visible in the access log: a static
> file is requested and its size is logged as 0 (until 7.0.24).
> See https://issues.apache.org/bugzilla/show_bug.cgi?id=52316
>
> b) It is easy to turn off by setting useSendfile="false" on a Connector.
>
>
>> No OOM issues, no errors in log files, until the eventual SocketException when ulimit is reached:
>
> I wonder whether there were some thread deaths. If that happens, the
> ThreadGroup.uncaughtException() method will print the stack trace
> directly to System.err. That is, it will be written to the catalina.out
> file (and not to catalina.log).
>
> Best regards,
> Konstantin Kolinko
>



Re: File leak in 7.0.23?

Posted by Konstantin Kolinko <kn...@gmail.com>.
2011/12/31 Janne Jalkanen <Ja...@ecyrd.com>:
> Hi all!
>
> I am seeing odd behaviour with 7.0.23, with the tomcat user's open file count increasing slowly, but consistently. Two other instances running the exact same codebase on identical hardware, BUT with Tomcat 7.0.20, are not exhibiting the same behaviour. 7.0.20 is rock solid, 7.0.23 dies due to too many open files every now and then. I've increased ulimit for now, but this is still a bit nasty.
>
> Any suggestions where to look? Should I file a bug?
>
> lsof says
>
> ...
> java    21299 ubuntu   87r     sock                0,6       0t0  31983006 can't identify protocol
> java    21299 ubuntu   88r     sock                0,6       0t0  31983007 can't identify protocol
> java    21299 ubuntu   89r     sock                0,6       0t0  31983008 can't identify protocol
> java    21299 ubuntu   90r     sock                0,6       0t0  31989046 can't identify protocol
> java    21299 ubuntu   91r     sock                0,6       0t0  31986504 can't identify protocol
> java    21299 ubuntu   92r     sock                0,6       0t0  31987223 can't identify protocol
> ...
>
> with a new one every couple of minutes.

I wonder whether it is possible to match entries in the access log
against the lsof output, to get a clue about what kinds of requests
result in the leak:

a) normal dynamic requests
b) comet or asynchronous requests
c) requests for static files that are served via sendfile

Maybe there are timeouts, or connections closed on the client side.
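
To line the leak up with the access log, it may help to record how many
unidentified sockets lsof reports at regular intervals, and then look at
the requests logged around the moments the count goes up. Below is a
minimal sketch of the counting part only. The class and method names are
made up for illustration, and it assumes the default lsof column layout
shown in the output above (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE
NAME).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tomcat or lsof.
class LsofLeakCount {

    /**
     * Counts lsof lines whose TYPE column is "sock" and whose NAME
     * column reads "can't identify protocol".
     */
    static int countUnidentifiedSockets(List<String> lsofLines) {
        int count = 0;
        for (String line : lsofLines) {
            String[] cols = line.trim().split("\\s+");
            // Assumed layout: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME...
            if (cols.length >= 9
                    && cols[4].equals("sock")
                    && line.contains("can't identify protocol")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Two lines copied from the lsof output quoted in this thread.
        List<String> sample = Arrays.asList(
            "java    21299 ubuntu   87r     sock                0,6       0t0  31983006 can't identify protocol",
            "java    21299 ubuntu   88r     sock                0,6       0t0  31983007 can't identify protocol");
        System.out.println(countUnidentifiedSockets(sample));
    }
}
```

Feeding it the output of lsof for the Tomcat process once a minute, and
noting the time whenever the number grows, should give a short list of
time windows to search in the access log.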


It should be easy to exclude sendfile:

a) When sendfile is used, it is visible in the access log: a static
file is requested and its size is logged as 0 (until 7.0.24).
See https://issues.apache.org/bugzilla/show_bug.cgi?id=52316

b) It is easy to turn off by setting useSendfile="false" on a Connector.
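
For point a), a rough way to spot candidate sendfile responses is to scan
the access log for successful static-file requests with a zero size. The
sketch below rests on several assumptions: the AccessLogValve is using
the "common" pattern (%h %l %u %t "%r" %s %b), a zero size shows up as
either "0" or "-", and static files can be recognised by a hypothetical
list of extensions. Adjust all three to the actual setup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tomcat.
class SendfileLogCheck {

    // Assumption: "common" access log pattern, %h %l %u %t "%r" %s %b
    private static final Pattern COMMON = Pattern.compile(
        "^\\S+ \\S+ \\S+ \\[[^\\]]+\\] \"\\S+ (\\S+)[^\"]*\" (\\d{3}) (\\S+)$");

    // Assumption: these extensions identify static files in this application.
    private static final Pattern STATIC_PATH = Pattern.compile(
        ".*\\.(css|js|png|gif|jpg|ico)(\\?.*)?$");

    /** Returns true for a 200 response to a static path with a logged size of zero. */
    static boolean looksLikeSendfile(String logLine) {
        Matcher m = COMMON.matcher(logLine.trim());
        if (!m.matches()) {
            return false; // not in the assumed format
        }
        String path = m.group(1);
        String status = m.group(2);
        String bytes = m.group(3);
        boolean zeroSize = bytes.equals("0") || bytes.equals("-");
        // 304 responses legitimately carry no body, so only 200 is considered.
        return status.equals("200") && zeroSize && STATIC_PATH.matcher(path).matches();
    }
}
```

If no lines match while the socket count keeps growing, sendfile is an
unlikely culprit; setting useSendfile="false" as in b) remains the more
direct test.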


> No OOM issues, no errors in log files, until the eventual SocketException when ulimit is reached:

I wonder whether there were some thread deaths. If that happens, the
ThreadGroup.uncaughtException() method will print the stack trace
directly to System.err. That is, it will be written to the catalina.out
file (and not to catalina.log).
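
As a debugging aid, thread deaths can be made easier to find by
installing a default uncaught exception handler, which
ThreadGroup.uncaughtException() consults before falling back to
System.err. This is only a sketch: the class name and logger name are
invented, and where the log record ends up depends on the logging
configuration in use.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of Tomcat.
class ThreadDeathLogging {

    // Assumption: any java.util.logging logger name will do here.
    private static final Logger LOG = Logger.getLogger("threaddeath");

    // Name of the most recent thread that died with an uncaught exception.
    static final AtomicReference<String> lastDeadThread = new AtomicReference<String>();

    /** Routes uncaught exceptions from any thread to the logger. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread t, Throwable e) {
                lastDeadThread.set(t.getName());
                LOG.log(Level.SEVERE, "Thread " + t.getName() + " died", e);
            }
        });
    }

    /** Demonstration: starts a thread that fails, then reports which thread was recorded. */
    static String runAndDie(String name) {
        install();
        Thread t = new Thread(new Runnable() {
            public void run() {
                throw new IllegalStateException("simulated failure");
            }
        }, name);
        t.start();
        try {
            t.join(); // the handler runs on the dying thread before it terminates
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return lastDeadThread.get();
    }
}
```

Without such a handler, checking catalina.out for stack traces as
described above is the way to find out whether any thread died.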

Best regards,
Konstantin Kolinko
