Posted to modules-dev@httpd.apache.org by Sai Jai Ganesh Gurubaran <sg...@cordys.com> on 2006/11/13 09:32:02 UTC

Huge memory consumption while storing the entire response

Hi All,
We are developing a content manager module on Apache 2.0. The module
is hooked in as an output filter (and there are no other filters).

The base code was taken from the Apache 2.0 book (the example that
converts text to upper case).

We notice that memory consumption per httpd process is very high, and
Apache eventually has to be restarted within 25 minutes.

On analysis, we find that the main problem is in the way we store the
complete response for further analysis.

Here is the main code that we use:


[CODE]


//-----------------------------------------------------------------------------
/*
 *	Checks if the input bucket carries text/html chunks of information.
 *	For other than text/html chunks, this filter hands over the same
 *	to the next filter in chain.
 *	For text/html, this breaks the pipe line, accumulates the full response
 */
static apr_status_t testOutFilter_run(ap_filter_t *f,
					apr_bucket_brigade *pbbIn)
{
    request_rec* r = f->r;
    conn_rec* c = r->connection;
    apr_bucket* pbktIn;
    apr_bucket_brigade* pbbOut;
    int flag_EOS = FALSE;

    // To take into account all the MIME Types related to text
    if ( (r->content_type != NULL) && (strstr(r->content_type,"text")==NULL))
    {
		/*
		 * For MIME types other than text type send bucket
		 * brigade as received
		 */
				
		return ap_pass_brigade(f->next,pbbIn);
    }
    else
    {
		if (f->ctx) 
		{
			pbbOut = (apr_bucket_brigade *) f->ctx;
		}
		else 
		{
			pbbOut=apr_brigade_create(r->pool, c->bucket_alloc);
			f->ctx = pbbOut;
			ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r,
					"First BB of :%s",r->uri);
		}

		APR_BRIGADE_FOREACH(pbktIn,pbbIn) 
		{
			apr_bucket *pbktOut;
			
			if(APR_BUCKET_IS_EOS(pbktIn)) 
			{
				apr_bucket *pbktEOS=apr_bucket_eos_create(c->bucket_alloc);
				APR_BRIGADE_INSERT_TAIL(pbbOut,pbktEOS);
				flag_EOS = TRUE;
				continue;
			}
			//Copy the in-buckets to the out-buckets
			apr_bucket_copy(pbktIn,&pbktOut);
			APR_BRIGADE_INSERT_TAIL(pbbOut,pbktOut);
		}

		/* To get the entire content to a pointer to char */
		if(flag_EOS == TRUE)
		{
			// Custom Code goes here
			// ...........
			// ...........
			// ...........
			
			return ap_pass_brigade(f->next,pbbOut);
				
		}
		else
		{
			return OK;
		}
   	}
}


Here are some statistics we gathered:
1. Module with the custom code - Apache memory growth is rapid, and we
need to restart after 25 minutes.
2. Module without the custom code - Apache memory growth is still rapid,
and we get 25-30 minutes at most before we restart.
3. Apache without the response-gathering part - stable for long hours
(per-httpd memory consumption is 44 MB).

Can anyone tell us the best way to gather the entire response without
consuming all the memory?


Any help will be greatly appreciated.
Ganesh


***********************************************************************
The information in this message is confidential and may be legally
privileged. It is intended solely for the addressee. Access to this 
message by anyone else is unauthorized. If you are not the 
intended recipient, any disclosure, copying, or distribution of the 
message, or any action or omission taken by you in reliance on 
it is prohibited and may be unlawful. Please immediately contact 
the sender if you have received this message in error. This email 
does not constitute any commitment from Cordys Holding BV or 
any of its subsidiaries except when expressly agreed in a written 
agreement between the intended recipient and 
Cordys Holding BV or its subsidiaries.
***********************************************************************



Re: Huge memory consumption while storing the entire response

Posted by Brian McQueen <mc...@gmail.com>.
This sounds like the same problem I was asking about.  Please check
the email I sent yesterday.

Brian McQueen


Re: Huge memory consumption while storing the entire response

Posted by Brian McQueen <mc...@gmail.com>.
Sorry, but I just noticed that the messages to which I referred you
actually went to the apreq mailing list:

<ap...@httpd.apache.org>

If you can't find them I'll send them to you directly.


Re: Huge memory consumption while storing the entire response

Posted by Nick Kew <ni...@webthing.com>.
On Mon, 13 Nov 2006 14:02:02 +0530
"Sai Jai Ganesh Gurubaran" <sg...@cordys.com> wrote:


> On analysis, we find that the main problem is in the way we store the
> complete response for further analysis.

Why are you doing that?  Buffering everything in memory is fine
for small docs, but as they grow there's usually a better solution.
My book discusses some much better ways to process text and markup:-)

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/
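
The last reply suggests that buffering the whole response is usually
avoidable. As a rough illustration of that idea, and not the method from
the book or from anyone in this thread, here is a minimal sketch in plain
C that analyses a response chunk by chunk while keeping only a small
fixed-size window between calls. It deliberately uses no httpd or APR
calls: each chunk stands in for the data read from one bucket, and the
names scan_ctx, scan_init, scan_feed and PAT_MAX are hypothetical. The
analysis chosen here (counting occurrences of a pattern, including ones
that straddle a chunk boundary) is only an assumed example of "further
analysis".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAT_MAX 32  /* arbitrary upper bound on pattern length (assumption) */

/* Hypothetical per-response state; in a real filter something like this
 * would hang off f->ctx instead of a brigade holding the whole response. */
typedef struct {
    const char *pat;     /* pattern to search for */
    size_t plen;         /* length of pat, between 1 and PAT_MAX */
    char win[PAT_MAX];   /* most recent bytes seen, at most plen of them */
    size_t win_len;      /* number of valid bytes in win */
    size_t matches;      /* occurrences found so far (overlaps included) */
} scan_ctx;

/* Prepare the state for a new response.
 * Returns 0 on success, -1 if the pattern is empty or too long. */
int scan_init(scan_ctx *ctx, const char *pat)
{
    size_t n = strlen(pat);
    if (n == 0 || n > PAT_MAX) {
        return -1;
    }
    ctx->pat = pat;
    ctx->plen = n;
    ctx->win_len = 0;
    ctx->matches = 0;
    return 0;
}

/* Consume one chunk. The chunk is never stored; only the sliding window
 * of the last plen bytes survives, so memory use does not grow with the
 * size of the response. */
void scan_feed(scan_ctx *ctx, const char *data, size_t len)
{
    size_t i;

    assert(ctx->plen >= 1 && ctx->plen <= PAT_MAX);
    for (i = 0; i < len; i++) {
        if (ctx->win_len == ctx->plen) {
            /* Window is full: drop its oldest byte. */
            memmove(ctx->win, ctx->win + 1, ctx->plen - 1);
            ctx->win_len--;
        }
        ctx->win[ctx->win_len++] = data[i];
        if (ctx->win_len == ctx->plen &&
            memcmp(ctx->win, ctx->pat, ctx->plen) == 0) {
            ctx->matches++;
        }
    }
}
```

In a filter, scan_init would be called when the context is first
created and scan_feed once per data bucket, after which the bucket can
be passed on to the next filter straight away instead of being copied
into a second brigade. Whether this fits depends on the analysis: it
works when the decision needs only bounded look-behind, and it does not
help if the custom code really needs the entire document at once.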