Posted to dev@mina.apache.org by Luis Neves <lu...@co.sapo.pt> on 2007/12/21 23:14:28 UTC

memory leak in http codec

Hello all.

I think I'm facing a problem in the MINA HTTP codec filter. I've used it in 
previous applications that only deal with the request body (SOAP web services 
and the like) without issues, but in this application I have requests with 
cookies and query strings, and I think this is the cause of the problem.
One of the lists in the HttpRequestDecodingState seems to keep references to 
request headers... or at least that is how it looks to me.

Dump file (55MB):
<http://netservices.sapo.pt/debug/java_pid15718.hprof.gz>

Screenshot of the HeapAnalyser:
<http://netservices.sapo.pt/debug/mina_http_leak.png>

I'm not sure what the proper fix is.

--
Luis Neves

Re: memory leak in http codec

Posted by Luis Neves <lu...@co.sapo.pt>.
Niklas Therning wrote:

> Well, in any case, I think MINA should be able to handle faulty clients 
> like these more gracefully. OOM isn't acceptable. 

I agree.


 > Please file a JIRA issue and we'll look into it.
> 
> /Niklas

Done... see DIRMINA-505.

--
Luis Neves

Re: memory leak in http codec

Posted by Emmanuel Lecharny <el...@gmail.com>.
Niklas Therning wrote:
> Luis Neves wrote:
>>
>> I still don't know how to properly fix the decoder, but for now this
>> inelegant fix will do.
>> <snip/>
>
> Well, in any case, I think MINA should be able to handle faulty clients 
> like these more gracefully. OOM isn't acceptable. Please file a JIRA 
> issue and we'll look into it.
Just thinking that this problem is not specific to the HttpDecoder. It 
would be very valuable to be able to handle big chunks of data as 
streamed data instead of as byte[], so that the decoder won't be swamped 
by huge packets stored in memory. A good strategy could be to stream to 
disk any packet above a certain size (say, 4 KB) until the decoder has 
finished its work.

That means we have to write a streamed decoder: it should not be such a 
complex task...

We need it in ADS, because we can receive massive pieces of data (like 
pictures: several MB), and we can't afford to hold them in memory until 
the full payload has been received. A rough sketch of the spill-to-disk 
idea follows.

Note: we have known about this for months (years?) but never found time 
to work on it... :/
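
Something along these lines, just to give the flavour (a sketch only: 
SpillingBuffer and the 4 KB threshold are invented for illustration, none 
of this is existing MINA API):

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class SpillingBuffer {
    private static final int THRESHOLD = 4 * 1024; // the 4 KB mentioned above

    private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private File spillFile;         // created lazily once THRESHOLD is crossed
    private OutputStream spillOut;

    /** Accumulates decoded bytes, spilling to a temp file past THRESHOLD. */
    public void append(byte[] data, int off, int len) throws IOException {
        if (spillOut == null && memory.size() + len > THRESHOLD) {
            spillFile = File.createTempFile("mina-decode-", ".tmp");
            spillOut = new FileOutputStream(spillFile);
            memory.writeTo(spillOut); // move what is already buffered to disk
            memory.reset();
        }
        if (spillOut != null) {
            spillOut.write(data, off, len);
        } else {
            memory.write(data, off, len);
        }
    }

    /** To be called once the decoder has finished its work. */
    public void dispose() throws IOException {
        if (spillOut != null) {
            spillOut.close();
            spillFile.delete();
        }
    }
}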


-- 
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: memory leak in http codec

Posted by Niklas Therning <ni...@trillian.se>.
Luis Neves wrote:
> Luis Neves wrote:
>
>
>> I find those "-" and "~" very weird; I doubt that the clients are 
>> sending those headers.
>
> My bad. As it turns out the clients are really sending that garbage.
>
> The main problem seems to be that the decoder gets confused when 
> clients send headers without values, e.g.:
>
> *******************************************
>
> GET /somepath HTTP/1.1
> Accept: */*
> Accept-Encoding: gzip, deflate
> User-Agent: Mozilla/4.0
> Host: foo.bar.com
> Referer:
> Connection: Keep-Alive
>
> *******************************************
>
> Notice the empty Referer header.
> I worked around my problem by putting a limit on the size of every
> collection that holds HTTP headers, and also upper/lower limits on the
> length of every header name/value.
>
> I still don't know how to properly fix the decoder, but for now this
> inelegant fix will do.
>
> As an aside, I also tried the Grizzly HTTP server as a stopgap measure, 
> and although there are no OOM errors with misbehaved requests, it 
> completely freezes and stops serving requests... but that might just be 
> because of my unfamiliarity with the creature.

Well, in any case, I think MINA should be able to handle faulty clients 
like these more gracefully. OOM isn't acceptable. Please file a JIRA 
issue and we'll look into it.

/Niklas


Re: memory leak in http codec

Posted by Luis Neves <lu...@co.sapo.pt>.
Luis Neves wrote:


> I find those "-" and "~" very weird; I doubt that the clients are 
> sending those headers.

My bad. As it turns out the clients are really sending that garbage.

The main problem seems to be that the decoder gets confused when clients send 
headers without values, e.g.:

*******************************************

GET /somepath HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0
Host: foo.bar.com
Referer:
Connection: Keep-Alive

*******************************************

Notice the empty Referer header.
I worked around my problem by putting a limit on the size of every collection
that holds HTTP headers, and also upper/lower limits on the length of every
header name/value. A sketch of the idea is below.

I still don't know how to properly fix the decoder, but for now this inelegant
fix will do.
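
The workaround, roughly (a sketch only: the HeaderLimits name and the
concrete limits are made up, and a real decoder would reject the request
rather than throw a bare runtime exception):

import java.util.List;
import java.util.Map;

public final class HeaderLimits {
    private static final int MAX_HEADERS = 64;        // per request
    private static final int MAX_NAME_LENGTH = 128;   // chars per name
    private static final int MAX_VALUE_LENGTH = 4096; // chars per value

    /** Call before adding a decoded (name, value) pair to the header map. */
    public static void check(Map<String, List<String>> headers,
                             String name, String value) {
        if (headers.size() >= MAX_HEADERS) {
            throw new IllegalStateException("too many headers: " + headers.size());
        }
        if (name.length() == 0 || name.length() > MAX_NAME_LENGTH) {
            throw new IllegalStateException("bad header name length: " + name.length());
        }
        if (value.length() > MAX_VALUE_LENGTH) {
            throw new IllegalStateException("header value too long: " + value.length());
        }
    }
}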

As an aside, I also tried the Grizzly HTTP server as a stopgap measure, and
although there are no OOM errors with misbehaved requests, it completely
freezes and stops serving requests... but that might just be because of my
unfamiliarity with the creature.


--
Luis Neves


Re: memory leak in http codec

Posted by Luis Neves <lu...@co.sapo.pt>.
Luis Neves wrote:

>  I'm running the server with logging set to DEBUG and I would 
> like to know if there is anything more I can do to try to isolate this 
> problem.
> 

I'm still struggling with this issue; I'm forced to restart the server every
10 minutes... The debug logging doesn't seem to point to anything specific (at
least to my eyes).
There are some requests that look weird, e.g.:


DEBUG #org.apache.mina.filter.codec.http.HttpRequestDecodingState$3# 
[NioProcessor-12] - Decoded header: 
{-------=[----:---------------------------------------------------------------------------------------------------------------------------------------------------------], 
Accept=[*/*], Accept-Encoding=[gzip, deflate], Accept-Language=[pt], 
Connection=[Keep-Alive], Host=[wa.sl.pt], User-Agent=[Mozilla/4.0 (compatible; 
MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322; Media 
Center PC 4.0)]}
DEBUG #org.apache.mina.filter.codec.http.HttpRequestDecodingState$3# 
[NioProcessor-12] - Request is HTTP 1/1. Checking for transfer coding
DEBUG #org.apache.mina.filter.codec.http.HttpRequestDecodingState$3# 
[NioProcessor-12] - No entity body for this request

and another one:

DEBUG #org.apache.mina.filter.codec.http.HttpRequestDecodingState$3# 
[NioProcessor-12] - Decoded header: {Accept=[*/*], Accept-Language=[pt], 
Connection=[Keep-Alive], Host=[wa.sl.pt], User-Agent=[Mozilla/4.0 (compatible; 
MSIE 6.0; Windows NT 5.1)], 
~~~~~~~=[~~~~:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~], 
~~~~~~~~~~~~~~~=[~~~~~ ~~~~~~~]}
DEBUG #org.apache.mina.filter.codec.http.HttpRequestDecodingState$3# 
[NioProcessor-12] - Request is HTTP 1/1. Checking for transfer coding
DEBUG #org.apache.mina.filter.codec.http.HttpRequestDecodingState$3# 
[NioProcessor-12] - No entity body for this request

I find those "-" and "~" very weird; I doubt that the clients are sending
those headers. If they are an artifact of MINA, where are those headers
coming from?

Thanks!

--
Luis Neves

Re: memory leak in http codec

Posted by Emmanuel Lecharny <el...@gmail.com>.
Luis Neves wrote:
> Hi,
>
> Emmanuel Lecharny wrote:
>> You have plenty of tools to analyze an OOM, either commercial or free.
>> Just googling for 'memory leak java analyzer' will turn up a lot of
>> them, like
>> http://www.manageability.org/blog/stuff/open-source-profilers-for-java.
>
> I'm aware of those tools; I have already used jhat, the IBM HeapAnalyser 
> and JConsole to locate the memory "black hole".
> My initial assumption that there was a memory leak doesn't seem to be 
> valid.
> Monitoring the heap with JConsole, the behaviour is not consistent with 
> a memory leak: instead of steadily increasing memory over time, what I 
> see is a sudden spike in memory and the subsequent OOM error. The 
> "holder" for most of the used memory is the field:
> private final List<Object> childProducts = new ArrayList<Object>();
> in org.apache.mina.filter.codec.statemachine.DecodingStateMachine
>
>
> My problem is that I don't fully understand the state machine that 
> MINA uses to process HTTP requests, and I can't tell what could cause 
> the addition of thousands of elements to the "childProducts" field... 
> anyway... 
Can you produce a stack trace when you get the OOM? It may help to 
understand where the problem originated. There is a good chance you are 
hitting an infinite loop somewhere; the point is to determine when you 
enter that loop (a typical behaviour when a program shows a sudden 
memory spike).
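
Two generic (non-MINA-specific) ways to capture that: recent Sun JVMs can
be started with -XX:+HeapDumpOnOutOfMemoryError to get a heap dump at the
moment of failure, and a last-resort handler can at least log the thread
and the stack trace, something like the sketch below (note it only fires
for Errors that actually escape a thread; one swallowed by a catch block
in the server's own code never reaches it):

public class OomTrap {
    /** Logs the thread and stack trace of any Error that escapes a thread. */
    public static void install() {
        Thread.setDefaultUncaughtExceptionHandler(
                new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread t, Throwable e) {
                if (e instanceof OutOfMemoryError) {
                    System.err.println("OOM in thread " + t.getName());
                }
                e.printStackTrace();
            }
        });
    }
}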

Hope it helps...

-- 
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org



Re: memory leak in http codec

Posted by Luis Neves <lu...@co.sapo.pt>.
Hi,

Emmanuel Lecharny wrote:
> You have plenty of tools to analyze an OOM, either commercial or free.
> Just googling for 'memory leak java analyzer' will turn up a lot of
> them, like
> http://www.manageability.org/blog/stuff/open-source-profilers-for-java.

I'm aware of those tools; I have already used jhat, the IBM HeapAnalyser and
JConsole to locate the memory "black hole".
My initial assumption that there was a memory leak doesn't seem to be valid.
Monitoring the heap with JConsole, the behaviour is not consistent with a
memory leak: instead of steadily increasing memory over time, what I see is a
sudden spike in memory and the subsequent OOM error. The "holder" for most of
the used memory is the field:

private final List<Object> childProducts = new ArrayList<Object>();

in org.apache.mina.filter.codec.statemachine.DecodingStateMachine


My problem is that I don't fully understand the state machine that MINA uses
to process HTTP requests, and I can't tell what could cause the addition of
thousands of elements to the "childProducts" field... anyway... I'm running
the server with logging set to DEBUG, and I would like to know if there is
anything more I can do to try to isolate this problem. My best guess at the
shape of the failure is sketched below.
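
This is not the real MINA code, just a greatly simplified (and entirely
hypothetical) picture of how a driver loop could fill a list like
childProducts when a state keeps emitting results without consuming any
input, plus the kind of progress check that would turn the OOM into an
immediate error (State and drive are made-up names):

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class NoProgressGuard {

    /** Hypothetical stand-in for one state of the decoding state machine. */
    interface State {
        Object decode(ByteBuffer in);
    }

    static List<Object> drive(State state, ByteBuffer in) {
        List<Object> childProducts = new ArrayList<Object>();
        while (in.hasRemaining()) {
            int before = in.position();
            Object product = state.decode(in);
            if (product != null) {
                childProducts.add(product);
            }
            // Without this check, a state that emits a product while
            // consuming zero bytes loops forever, and childProducts grows
            // until the heap is exhausted.
            if (in.position() == before) {
                throw new IllegalStateException("decoder made no progress");
            }
        }
        return childProducts;
    }
}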

--
Luis Neves


Re: memory leak in http codec

Posted by Emmanuel Lecharny <el...@gmail.com>.
You have plenty of tools to analyze an OOM, either commercial or free.
Just googling for 'memory leak java analyzer' will turn up a lot of
them, like
http://www.manageability.org/blog/stuff/open-source-profilers-for-java.

On Dec 24, 2007 6:58 PM, Luis Neves <lu...@co.sapo.pt> wrote:
> Luis Neves wrote:
> >
> > Hello all.
> >
> > I think I'm facing a problem in the MINA HTTP codec filter. I've used
> > it in previous applications that only deal with the request body
> > (SOAP web services and the like) without issues, but in this application
> > I have requests with cookies and query strings, and I think this is the
> > cause of the problem.
> > One of the lists in the HttpRequestDecodingState seems to keep
> > references to request headers... or at least that is how it looks to me.
>
> I guess I was wrong.
> What seems to be happening is that some of the requests are built in such a
> way that, while trying to decode them, the codec enters some weird loop that
> adds elements to the headers collection, stopping only when there is no more
> memory available.
>
> Is there something I can do to isolate the offending requests?
>
>
> --
> Luis Neves
>



-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: memory leak in http codec

Posted by Luis Neves <lu...@co.sapo.pt>.
Luis Neves wrote:
> 
> Hello all.
> 
> I think I'm facing a problem in the MINA HTTP codec filter. I've used it 
> in previous applications that only deal with the request body (SOAP web 
> services and the like) without issues, but in this application I have 
> requests with cookies and query strings, and I think this is the cause 
> of the problem.
> One of the lists in the HttpRequestDecodingState seems to keep 
> references to request headers... or at least that is how it looks to me.

I guess I was wrong.
What seems to be happening is that some of the requests are built in such a
way that, while trying to decode them, the codec enters some weird loop that
adds elements to the headers collection, stopping only when there is no more
memory available.

Is there something I can do to isolate the offending requests?
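
One thing I'm considering is logging the raw bytes of every request before
the codec gets them, so the suspicious ones can be replayed against a test
server. A sketch of such a filter (class and package names follow the MINA
1.x line and may need adjusting for the trunk; this is not part of the HTTP
codec itself):

import org.apache.mina.common.ByteBuffer;
import org.apache.mina.common.IoFilterAdapter;
import org.apache.mina.common.IoSession;

/** Dumps the raw bytes of every incoming buffer before the codec runs. */
public class RawDumpFilter extends IoFilterAdapter {
    public void messageReceived(NextFilter nextFilter, IoSession session,
                                Object message) throws Exception {
        if (message instanceof ByteBuffer) {
            // getHexDump() does not move the buffer position
            System.err.println("session " + session.getRemoteAddress()
                    + " raw bytes: " + ((ByteBuffer) message).getHexDump());
        }
        nextFilter.messageReceived(session, message);
    }
}

Installed ahead of the codec, with something like
acceptor.getFilterChain().addFirst("raw-dump", new RawDumpFilter()), every
request would end up in the log in replayable form.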


--
Luis Neves