Posted to dev@commons.apache.org by Michael Mastroianni <MM...@choicestream.com> on 2004/06/09 16:43:10 UTC

HttpClient -- possible resource leak?

I have a multi-threaded app that uses HttpClient to download a few thousand URLs at a time. Currently I have one MultiThreadedHttpConnectionManager, which the thread manager creates and passes to each of its worker threads.

Each thread has a queue of URLs and creates a new HttpClient, using the shared connection manager, for each one. I've also tried using a single HttpClient per worker thread, created at construction time, and had no luck.

The worker threads make executeMethod calls, and I notice that I'm leaking a lot of memory (memory usage appears to go up every time I successfully download a page). It seems as if the underlying buffer of the GetMethod is not being cleaned up. I'm calling releaseConnection on the GetMethod in a finally block. The relevant piece of code is below:

    private void SpiderUrlImpl()
    {
        HttpMethod method = new GetMethod(m_sUrl);
        try
        {
            //if(m_State == null)
            //{
                m_State = new HttpState();
                m_State.setCookiePolicy(CookiePolicy.RFC2109);
            //}

            m_client.setState(m_State);
            m_client.setConnectionTimeout(m_timeout);

            method.setFollowRedirects(true);
            method.setStrictMode(false);
            String responseBody = null;

            int iCode    = m_client.executeMethod(method);
            responseBody = method.getResponseBodyAsString();
            Header hLoc  = method.getResponseHeader("Location");

            java.io.FileWriter fw = new java.io.FileWriter(m_sPath + "\\" + m_sFile);
            fw.write(responseBody);
            fw.close();
        }//TODO: LOG STUFF GOES HERE
        catch (org.apache.commons.httpclient.HttpException he)
        {
            System.err.println("Http error connecting to '" + m_sUrl + "'");
            System.err.println(he.getMessage());
        }
        catch (IOException ioe)
        {
            System.err.println("Unable to connect to '" + m_sUrl + "' or print file + '" + m_sPath + "\\" + m_sFile + "'");
            System.err.println(ioe.getMessage());
        }
        catch (Exception eExc)
        {
            System.err.println(eExc.getMessage());
        }
        finally
        {
            method.releaseConnection();
        }
    }
}
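
A side note on the snippet above: the FileWriter is closed only on the success path, so an exception thrown by write() would leave the file handle open. This may or may not be related to the memory growth described. The following is a minimal sketch of closing the writer reliably. It uses only the JDK; the names PageWriter and save are hypothetical and not part of HttpClient, and the try-with-resources form assumes Java 7 or later (on older JDKs, closing in a finally block achieves the same thing).

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of HttpClient: writes a page body to disk
// and closes the file even if the write fails.
final class PageWriter {

    static void save(Path target, String body) throws IOException {
        // try-with-resources closes the writer on both success and failure.
        // UTF-8 is an assumption here; the original code used the platform default.
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write(body);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("page", ".html");
        save(tmp, "<html>example</html>");
        System.out.println("wrote " + Files.size(tmp) + " bytes to " + tmp);
        Files.delete(tmp);
    }
}
```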



---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: HttpClient -- possible resource leak?

Posted by Ortwin Glück <or...@nose.ch>.

Michael Mastroianni wrote:
>     responseBody = method.getResponseBodyAsString();
>
>     java.io.FileWriter fw = new java.io.FileWriter(m_sPath + "\\" + m_sFile);
>     fw.write(responseBody);


Can you try using a stream instead of a string? You would then need to take care
of the character encoding yourself, but you would avoid buffering the whole
response body in memory.
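
The stream approach could look roughly like the sketch below. It uses only java.io so that it runs on its own; the class name StreamCopy and the 8 KB buffer size are assumptions. In the posted code the input would presumably come from method.getResponseBodyAsStream() and the output would be a FileOutputStream, read before releaseConnection() is called. Copying raw bytes also sidesteps the encoding question, since the file receives exactly the bytes the server sent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: copies a stream in fixed-size chunks so that the
// whole body is never held in memory at once.
final class StreamCopy {

    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // buffer size is an arbitrary choice
        long total = 0;
        int n;
        // read() returns -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the response body stream of an executed method.
        InputStream body = new ByteArrayInputStream(
                "hello".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(body, sink);
        System.out.println(copied + " bytes copied");
    }
}
```

Only one buffer's worth of data is held at a time, so memory use no longer depends on the size of the downloaded page.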

-- 
  _________________________________________________________________
  NOSE applied intelligence ag

  ortwin glück                      [www]      http://www.nose.ch
  software engineer
  hardturmstrasse 171               [pgp id]           0x81CF3416
  8005 zürich                       [office]      +41-1-277 57 35
  switzerland                       [fax]         +41-1-277 57 12



RE: HttpClient -- possible resource leak?

Posted by ol...@bluewin.ch.

Michael,

Could you provide us with additional details on the execution environment
of your application?

(1) What version of HttpClient are you using?
(2) What is the JDK version? 
(3) What platform?
(4) How exactly do you measure memory consumption by your application?
(5) Do you set initial and maximum heap size for the JRE?
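
On questions (4) and (5): the process size reported by the operating system can keep growing up to the configured maximum heap (-Xmx) without any leak, so sampling the used heap from inside the JVM tends to be more telling. A minimal sketch using only java.lang.Runtime follows; the class name HeapProbe is hypothetical, and System.gc() is only a hint to the JVM, so the numbers are approximate.

```java
// Hypothetical probe: reports heap usage as seen from inside the JVM.
final class HeapProbe {

    // Bytes of heap currently in use (allocated minus free).
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a hint only; the JVM may ignore it
        System.out.println("used heap (KB): " + usedHeapBytes() / 1024);
        System.out.println("max heap (KB):  " + rt.maxMemory() / 1024);
    }
}
```

Logging usedHeapBytes() after every few hundred downloads would show whether usage keeps climbing after garbage collection or levels off.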

Oleg


>-- Original Message --
>Reply-To: "Jakarta Commons Developers List" <co...@jakarta.apache.org>
>Subject: HttpClient -- possible resource leak?
>Date: Wed, 9 Jun 2004 10:43:10 -0400
>From: "Michael Mastroianni" <MM...@choicestream.com>
>To: <co...@jakarta.apache.org>

