Posted to httpclient-users@hc.apache.org by Suladna <su...@yahoo.com> on 2008/09/04 09:40:46 UTC

My HttpClient gives me a memory leak

Hi

I have used the following tutorial to fetch HTML code from websites into my Java program:

http://svn.apache.org/repos/asf/httpcomponents/httpclient/trunk/module-client/src/examples/org/apache/http/examples/client/ClientExecuteDirect.java

The websites I connect to all start with www.xxxxxxxxx.se * but they have different endings.

So I start by creating one HttpHost and declaring an HttpEntity and an HttpResponse:

final HttpHost target = new HttpHost("www.xxxxxxx.se", 80, "http");
HttpEntity entity = null;
HttpResponse rsp = null;

After that I use a loop to connect to each subsite. This is what I write in the loop:

HttpRequest req = createRequest(urlEnding);
rsp = client.execute(target, req);
entity = rsp.getEntity();
if (entity != null) {
    String[] line = EntityUtils.toString(entity).split("\n"); // this info is used by the program
    entity.consumeContent();
}

It basically works fine, but the problem is that I get a memory leak: after fetching the code from a few hundred pages I get a java.lang.OutOfMemoryError: Java heap space. Is there anything I can do in the loop, besides entity.consumeContent(), to prevent this from happening?

I attach my Java code as a text file. It is just a slight modification of the code in the tutorial.

* This is not the real domain; if anyone wants to know the real domain, please send me a private e-mail.
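One way to keep memory flat in a loop like the one above is to read the response body line by line instead of buffering the whole page with EntityUtils.toString() and splitting it afterwards. The sketch below (class and method names are made up for illustration) works on a plain InputStream; in the loop it would be fed entity.getContent() rather than the canned demo body:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class LineStreamSketch {
    // Reads a body line by line, so only the current line (plus whatever
    // the caller decides to keep) is in memory at once, instead of the
    // whole page twice: once as one big String, once as the split array.
    static List<String> readLines(InputStream body) throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new InputStreamReader(body, "UTF-8"));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line); // or filter here and keep almost nothing
            }
        } finally {
            reader.close();
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for entity.getContent(); a real run would pass the
        // entity's stream instead of this canned body.
        InputStream demo = new ByteArrayInputStream("a\nb\nc".getBytes("UTF-8"));
        System.out.println(readLines(demo)); // prints [a, b, c]
    }
}
```

Closing the reader in the finally block also ensures the underlying stream is released even if a line fails to process.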


      

Re: My HttpClient gives me a memory leak

Posted by Suladna <su...@yahoo.com>.
Hm, I don't think I keep any content, but it is worth a try. Thanks.


--- On Thu, 9/4/08, Tomek Maciejewski <to...@gmail.com> wrote:

[...]

Re: My HttpClient gives me a memory leak

Posted by Tomek Maciejewski <to...@gmail.com>.
Hi Suladna,

What do you do with the content of the site during the 'processing 
html' step? If you download a few hundred sites and keep the content 
(or some kind of parsed object representation) of every site in memory, 
then your declared heap size may simply be too low. You can set the 
maximum heap size of your JVM with the -Xmx argument; e.g. -Xmx512M 
sets the heap size to 512 MB. But I'm not sure that this is the cause 
of your problem.

Tomek
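
To confirm that an -Xmx setting actually took effect, the JVM's limit can be printed with Runtime.maxMemory(). A small sketch (HeapCheck is a made-up name):

```java
public class HeapCheck {
    public static void main(String[] args) {
        // Runtime.maxMemory() reports the most heap the JVM will attempt
        // to use -- the value the -Xmx flag controls.
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap (MB): " + maxMb);
    }
}
```

Running it as `java -Xmx512M HeapCheck` should report a value close to 512 (the exact number varies slightly between JVMs).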


Suladna wrote:
> [...]


---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org


Re: My HttpClient gives me a memory leak

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2008-09-04 at 00:40 -0700, Suladna wrote:
> [...]
>
> HttpRequest req = createRequest(urlEnding);
> rsp = client.execute(target, req);
> entity = rsp.getEntity(); 
> 
> String[] line = EntityUtils.toString(entity).split("\n"); //this info
> is used by the program 

This is an exceptionally bad idea. You are buffering the entire response
content in memory and then making yet another copy by splitting the
string into individual lines. No wonder you are getting OutOfMemoryError
exceptions.

This problem has nothing to do with HttpClient.

Oleg 
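
The two-copies point above can be illustrated with a toy body, no HTTP involved (class name and sizes are arbitrary):

```java
public class CopyCost {
    public static void main(String[] args) {
        // Build a fake response body, then do what the loop in the original
        // post does: hold the whole body as one String (copy #1) and split
        // it into per-line Strings (copy #2). Peak memory is roughly twice
        // the body size for every response processed this way.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100000; i++) {
            sb.append("line ").append(i).append('\n');
        }
        String body = sb.toString();       // copy #1: the entire body
        String[] lines = body.split("\n"); // copy #2: one String per line
        System.out.println(lines.length);  // prints 100000
    }
}
```

Streaming the entity's content and processing each line as it is read avoids both copies.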

