Posted to users@tomcat.apache.org by Rubén Pérez <lq...@yomolo.com> on 2023/08/18 10:28:10 UTC

Re: Possible AbstractProtocol.waitingProcessors leak in Tomcat 9.0.75

This is a response to an existing thread (about a memory leak in recent
versions of Tomcat):

https://www.mail-archive.com/users@tomcat.apache.org/msg141882.html

I haven't found a way to reply publicly as a continuation of that thread.
So here is what I am trying to say:

I started experiencing exactly the same issue when updating from Spring
6.0.7 to 6.0.9, which updated Tomcat from 10.1.5 to 10.1.8. The memory
leak is clearly visible in my monitoring tools. A heap dump reveals many
times more entries in the waitingProcessors map than real active
connections, and we end up with around 8 GB of retained memory full of
those entries.
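
For reference, this is roughly how that collection can be watched on a
live instance without taking a heap dump. It is only a sketch using
reflection, since the field is private (it is actually a Set backed by a
ConcurrentHashMap); the field name matches the Tomcat 10.1.x sources,
and how you get hold of the Connector depends on your setup (embedded
vs. standalone):

    import java.lang.reflect.Field;
    import java.util.Set;

    import org.apache.catalina.connector.Connector;
    import org.apache.coyote.AbstractProtocol;

    // Sketch: count the entries in AbstractProtocol.waitingProcessors
    // for a given Connector. The field is private, hence the reflection
    // (newer JDKs may need --add-opens for this to work).
    public final class WaitingProcessorsProbe {

        public static int count(Connector connector)
                throws ReflectiveOperationException {
            Field field = AbstractProtocol.class
                    .getDeclaredField("waitingProcessors");
            field.setAccessible(true);
            // Assumes the handler is an AbstractProtocol subclass (the default).
            Set<?> waiting = (Set<?>) field.get(connector.getProtocolHandler());
            return waiting.size();
        }
    }

Comparing that count against the number of real active connections over
time should make the leak easy to see.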

I believe I have found a way to reproduce the issue locally: open a
websocket session from a client in Chrome, go to dev-tools, switch the
tab to offline mode, wait more than 50 seconds, then switch it back to
"No throttling". Sometimes I get an error back on the client like:

a["ERROR\nmessage:AMQ229014\\c Did not receive data from /192.168.0.1\\c12720
within the 50000ms connection TTL. The connection will now be
closed.\ncontent-length:0\n\n\u0000"]

And other times I instead get something like c[1002, ""] from Artemis,
followed by an "Invalid frame header" error from Chrome (websockets view
in dev-tools).

Only in the latter case does it look like entries leak into that map.
Maybe it is a coincidence, but that is what I have observed at least
twice.
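
In case it helps to reproduce without Chrome, a bare-bones client along
these lines should behave similarly. This is only a sketch: it assumes a
plain WebSocket endpoint at ws://localhost:8080/ws (a hypothetical URL;
my real setup goes through STOMP over SockJS), and going silent is not a
perfect stand-in for Chrome's offline mode, since the TCP connection
stays healthy underneath:

    import java.net.URI;
    import java.util.concurrent.CountDownLatch;

    import jakarta.websocket.ClientEndpoint;
    import jakarta.websocket.CloseReason;
    import jakarta.websocket.ContainerProvider;
    import jakarta.websocket.OnClose;
    import jakarta.websocket.Session;

    // Sketch: open a session, then send nothing at all so the broker's
    // connection TTL (50s in my case) expires and the server closes us.
    @ClientEndpoint
    public class IdleWsClient {

        private static final CountDownLatch closed = new CountDownLatch(1);

        @OnClose
        public void onClose(Session session, CloseReason reason) {
            System.out.println("closed: " + reason);
            closed.countDown();
        }

        public static void main(String[] args) throws Exception {
            ContainerProvider.getWebSocketContainer().connectToServer(
                    IdleWsClient.class, new URI("ws://localhost:8080/ws"));
            closed.await(); // stay idle until the server drops the connection
        }
    }

The interesting part is then whether waitingProcessors shrinks again
after the server-side close.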

After the error appeared, I waited long enough for the frontend to
reconnect the session, and then I simply quit Chrome.

Again, after forcefully downgrading Tomcat from 10.1.8 to 10.1.5 while
preserving the same Spring version, the issue is gone (confirmed in
production). In fact, I have never managed to get an "Invalid frame
header" error in Chrome again with Tomcat 10.1.5 (in roughly 10
attempts), whereas before I got it in 2 out of 4 attempts.

Is this something already tracked?

Best regards,
Ruben

Re: Possible AbstractProtocol.waitingProcessors leak in Tomcat 9.0.75

Posted by Mark Thomas <ma...@apache.org>.
On 20/08/2023 05:21, Mark Thomas wrote:
> On 18/08/2023 11:28, Rubén Pérez wrote:

<snip/>

>> I started experiencing exactly the same issue when updating from Spring
>> 6.0.7 to 6.0.9, which updated Tomcat from 10.1.5 to 10.1.8. The memory
>> leak is clearly visible in my monitoring tools. A heap dump reveals many
>> times more entries in the waitingProcessors map than real active
>> connections, and we end up with around 8 GB of retained memory full of
>> those entries.
>>
>> I believe I have found a way to reproduce the issue locally: open a
>> websocket session from a client in Chrome, go to dev-tools, switch the
>> tab to offline mode, wait more than 50 seconds, then switch it back to
>> "No throttling". Sometimes I get an error back on the client like:
>>
>> a["ERROR\nmessage:AMQ229014\\c Did not receive data from
>> /192.168.0.1\\c12720
>> within the 50000ms connection TTL. The connection will now be
>> closed.\ncontent-length:0\n\n\u0000"]
>>
>> And other times I instead get something like c[1002, ""] from Artemis,
>> followed by an "Invalid frame header" error from Chrome (websockets
>> view in dev-tools).
>>
>> Only in the latter case does it look like entries leak into that map.
>> Maybe it is a coincidence, but that is what I have observed at least
>> twice.
>>
>> After the error appeared, I waited long enough for the frontend to
>> reconnect the session, and then I simply quit Chrome.
> 
> Thanks for the steps to reproduce. That is helpful. I'll let you know 
> how I get on.

Unfortunately, I didn't get very far. Based on the log messages, it looks
very much like those are application-generated rather than
Tomcat-generated.

At this point I am wondering if this is an application or a Tomcat 
issue. I'm going to need a sample application (ideally as cut down as 
possible) that demonstrates the issue to make progress on this.

Another option is debugging this yourself to figure out what has
changed. I can provide some pointers if this is of interest. Given you
can repeat the issue reasonably reliably, tracking down the commit that
triggered the change isn't too hard.
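
For example, something along these lines, assuming a local clone of the
Apache Tomcat sources and that the release tags (10.1.5, 10.1.8) are
present in your clone:

    git clone https://github.com/apache/tomcat.git
    cd tomcat
    git bisect start 10.1.8 10.1.5   # first known bad tag, last known good tag
    # at each step: build, deploy, run your reproduction, then mark it
    git bisect good                  # or: git bisect bad
    # repeat until git prints the first bad commit
    git bisect reset

That should converge on the triggering commit in a handful of builds.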

>> Again, after forcefully downgrading Tomcat from 10.1.8 to 10.1.5 while
>> preserving the same Spring version, the issue is gone (confirmed in
>> production). In fact, I have never managed to get an "Invalid frame
>> header" error in Chrome again with Tomcat 10.1.5 (in roughly 10
>> attempts), whereas before I got it in 2 out of 4 attempts.
> 
> Could you do some further testing and see if you can narrow down exactly 
> which version (10.1.6, 10.1.7 or 10.1.8) the issue first appears in?
> 
> It would also be helpful to confirm if the issue is still present in 
> 10.1.12.

Answers to the above would still be helpful.

Mark



Re: Possible AbstractProtocol.waitingProcessors leak in Tomcat 9.0.75

Posted by Mark Thomas <ma...@apache.org>.
On 18/08/2023 11:28, Rubén Pérez wrote:
> This is a response to an existing thread (about a memory leak in recent
> versions of Tomcat):
> 
> https://www.mail-archive.com/users@tomcat.apache.org/msg141882.html
> 
> I haven't found a way to reply publicly as a continuation of that thread.

You need to reply to one of the messages from the original thread. That 
is a little tricky (but not impossible) if you weren't subscribed at the 
time.

lists.apache.org should allow you to do that fairly easily.

Anyway...

> So here is what I am trying to say:
> 
> I started experiencing exactly the same issue when updating from Spring
> 6.0.7 to 6.0.9, which updated Tomcat from 10.1.5 to 10.1.8. The memory
> leak is clearly visible in my monitoring tools. A heap dump reveals many
> times more entries in the waitingProcessors map than real active
> connections, and we end up with around 8 GB of retained memory full of
> those entries.
>
> I believe I have found a way to reproduce the issue locally: open a
> websocket session from a client in Chrome, go to dev-tools, switch the
> tab to offline mode, wait more than 50 seconds, then switch it back to
> "No throttling". Sometimes I get an error back on the client like:
>
> a["ERROR\nmessage:AMQ229014\\c Did not receive data from /192.168.0.1\\c12720
> within the 50000ms connection TTL. The connection will now be
> closed.\ncontent-length:0\n\n\u0000"]
>
> And other times I instead get something like c[1002, ""] from Artemis,
> followed by an "Invalid frame header" error from Chrome (websockets view
> in dev-tools).
>
> Only in the latter case does it look like entries leak into that map.
> Maybe it is a coincidence, but that is what I have observed at least
> twice.
>
> After the error appeared, I waited long enough for the frontend to
> reconnect the session, and then I simply quit Chrome.

Thanks for the steps to reproduce. That is helpful. I'll let you know 
how I get on.

> Again, after forcefully downgrading Tomcat from 10.1.8 to 10.1.5 while
> preserving the same Spring version, the issue is gone (confirmed in
> production). In fact, I have never managed to get an "Invalid frame
> header" error in Chrome again with Tomcat 10.1.5 (in roughly 10
> attempts), whereas before I got it in 2 out of 4 attempts.

Could you do some further testing and see if you can narrow down exactly 
which version (10.1.6, 10.1.7 or 10.1.8) the issue first appears in?

It would also be helpful to confirm if the issue is still present in 
10.1.12.

> Is this something already tracked?

Not that I am aware of.

Mark
