Posted to modperl@perl.apache.org by Pavel Georgiev <pa...@3tera.com> on 2010/03/16 07:26:22 UTC

mod_perl memory

Hi,

I have a perl script running in mod_perl that needs to write a large amount of data to the client, possibly over a long period. The behavior that I observe is that once I print and flush something, the buffer memory is not reclaimed even though I rflush (I know this can't be given back to the OS).

Is that how mod_perl operates and is there a way that I can force it to periodically free the buffer memory, so that I can use that for new buffers instead of taking more from the OS?

Re: mod_perl memory

Posted by ARTHUR GOLDBERG <ar...@cs.nyu.edu>.
Pavel

You're welcome. You are correct about the limitations of  
Apache2::SizeLimit. Processes cannot be 'scrubbed'; rather they should  
be killed and restarted.

Rapid memory growth should be prevented by prohibiting processes from  
ever growing larger than a preset limit. On Unix systems, the system  
call setrlimit sets process resource limits. These limits are  
inherited by children of the process, and can be viewed and set with  
the bash builtin ulimit. Many resources can be limited, but I'm  
focusing on process size, which is controlled by the resource RLIMIT_AS,  
the maximum size of a process's virtual memory (address space) in  
bytes. (Some operating systems control RLIMIT_DATA, the maximum size  
of the process's data segment, but Linux doesn't.)
When a process tries to exceed a resource limit, the system call that  
requested the resource fails and returns an error. The type of error  
depends on which resource's limit is violated (see man page for  
setrlimit). In the case of virtual memory, the RLIMIT_AS can be  
exceeded by any call that asks for additional virtual memory, such as  
brk(2), which sets the end of the data segment. Perl manages memory  
via either the system's malloc or its own malloc. If asking for  
virtual memory fails, then malloc will fail, which will typically  
cause the Perl process to write "Out of Memory!" to STDERR and die.
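
For illustration, here is a minimal sketch of setting the limit
directly from Perl, assuming the BSD::Resource CPAN module is
installed (the 100 MB value is hypothetical):

use BSD::Resource qw(setrlimit getrlimit RLIMIT_AS);

my $limit = 100 * 1024 * 1024;        # 100 MB address-space cap, in bytes
setrlimit(RLIMIT_AS, $limit, $limit)  # set soft and hard limits together
    or die "setrlimit failed: $!";
my ($soft, $hard) = getrlimit(RLIMIT_AS);
print "soft=$soft hard=$hard\n";      # verify what was actually set
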
RLIMIT_AS can be set in many ways. One direct way an Apache/mod_perl  
process can set it is via Apache2::Resource. For example, these  
commands can be added to httpd.conf:

PerlModule Apache2::Resource
# set child memory limit to 100 megabytes
# RLIMIT_AS (address space) will work to limit the size of a process
PerlSetEnv PERL_RLIMIT_AS 100
PerlChildInitHandler Apache2::Resource

The PerlSetEnv line sets the Perl environment variable PERL_RLIMIT_AS.  
The PerlChildInitHandler line directs Apache to load Apache2::Resource  
each time it creates an httpd process. Apache2::Resource then reads  
PERL_RLIMIT_AS and sets the RLIMIT_AS limit to 100 (megabytes). Any  
httpd that tries to grow bigger than 100 MB will fail. (Also,  
PERL_RLIMIT_AS can be set to soft_limit:hard_limit, where soft_limit  
is the limit at which the resource request will fail. At any time the  
soft_limit can be adjusted up to the hard_limit.)
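
For example, a hypothetical soft/hard pair

PerlSetEnv PERL_RLIMIT_AS 100:200

sets a 100 MB soft limit, at which resource requests start to fail,
while leaving room to raise it later, up to a 200 MB hard limit.
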
I recommend against setting this limit for a threaded process, because  
if one Request handler gets the process killed then all threads  
handling requests will fail.
When the process has failed it is difficult to output an error message  
to the web user, because Perl calls die and the process exits.

As I wrote yesterday, failure of a mod_perl process with "Out of  
Memory!", as occurs when the softlimit of RLIMIT_AS is exceeded, does  
not trigger an Apache ErrorDocument 500. A mod_perl process that exits  
(actually CORE::exit() must be called) doesn't trigger an  
ErrorDocument 500 either.

Second, if Apache detects a server error it can redirect to a script  
as discussed in Custom Error Response. It can access the REDIRECT  
environment variables but doesn't know anything else about the HTTP  
Request.
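
For example, with a hypothetical handler script:

ErrorDocument 500 /cgi-bin/server-error.pl
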

At this point I think that the best thing to do is use  
MaxRequestsPerChild and Apache2::SizeLimit to handle most memory  
problems, and simply let processes that blow up die without feedback  
to users. Not ideal, but they should be extremely rare events.

BR
A

On Mar 16, 2010, at 2:31 PM, Pavel Georgiev wrote:

> Thank you both for the quick replies!
>
> Arthur,
>
> Apache2::SizeLimit is no solution for my problem as I'm looking for  
> a way to limit the size each request takes; the fact that I can  
> scrub the process after the request is done (or drop requests if  
> the process reaches some limit, although my understanding is that  
> Apache2::SizeLimit does its job after the request is done) does not  
> help me.
>
> William,
> Let me make sure I'm understanding this right - I'm not using any  
> buffers myself; all I do is sysread() from a unix socket and print(),  
> it's just that I need to print a large amount of data for each request.  
> Are you saying that there is no way to free the memory after I've  
> done print() and rflush()?
>
> BTW thanks for the other suggestions, switching to CGI seems like  
> the only reasonable thing for me, I just want to make sure that this  
> is how mod_perl operates and it is not me who is doing something  
> wrong.
>
> Thanks,
> Pavel
>
> On Mar 16, 2010, at 11:18 AM, ARTHUR GOLDBERG wrote:
>
>> You could use Apache2::SizeLimit ("because size does matter") which  
>> evaluates the size of Apache httpd processes when they complete  
>> HTTP Requests, and kills those that grow too large. (Note that  
>> Apache2::SizeLimit can only be used for non-threaded MPMs, such as  
>> prefork.) Since it operates at the end of a Request, SizeLimit has  
>> the advantage that it doesn't interrupt Request processing and the  
>> disadvantage that it won't prevent a process from becoming  
>> oversized while processing a Request. To reduce the regular load of  
>> Apache2::SizeLimit it can be configured to check the size  
>> intermittently by setting the parameter CHECK_EVERY_N_REQUESTS.  
>> These parameters can be configured in a <Perl> section in  
>> httpd.conf, or a Perl start-up file.
>>
>> That way, if your script allocates too much memory the process will  
>> be killed when it finishes handling the request. The MPM will  
>> eventually start another process if necessary.
>>
>> BR
>> A
>>
>> On Mar 16, 2010, at 9:30 AM, William T wrote:
>>
>>> On Mon, Mar 15, 2010 at 11:26 PM, Pavel Georgiev <pa...@3tera.com>  
>>> wrote:
>>>> I have a perl script running in mod_perl that needs to write a  
>>>> large amount of data to the client, possibly over a long period.  
>>>> The behavior that I observe is that once I print and flush  
>>>> something, the buffer memory is not reclaimed even though I  
>>>> rflush (I know this can't be given back to the OS).
>>>>
>>>> Is that how mod_perl operates and is there a way that I can force  
>>>> it to periodically free the buffer memory, so that I can use that  
>>>> for new buffers instead of taking more from the OS?
>>>
>>> That is how Perl operates.  Mod_Perl is just Perl embedded in the
>>> Apache Process.
>>>
>>> You have a few options:
>>> * Buy more memory. :)
>>> * Delegate resource intensive work to a different process (I would
>>> NOT suggest forking a child in Apache).
>>> * Tie the buffer to a file on disk, or db object, that can be
>>> explicitly reclaimed
>>> * Create a buffer object of a fixed size and loop.
>>> * Use compression on the data stream that you read into a buffer.
>>>
>>> You could also architect your system to mitigate resource usage if  
>>> the
>>> large data serve is not a common operation:
>>> * Proxy those requests to a different server which is optimized to
>>> handle large data serves.
>>> * Execute the large data serves with CGI rather than Mod_Perl.
>>>
>>> I'm sure there are probably other options as well.
>>>
>>> -wjt
>>>
>>
>> Arthur P. Goldberg, PhD
>>
>> Research Scientist in Bioinformatics
>> Plant Systems Biology Laboratory
>> www.virtualplant.org
>>
>> Visiting Academic
>> Computer Science Department
>> Courant Institute of Mathematical Sciences
>> www.cs.nyu.edu/artg
>>
>> artg@cs.nyu.edu
>> New York University
>> 212 995-4918
>> Coruzzi Lab
>> 8th Floor Silver Building
>> 1009 Silver Center
>> 100 Washington Sq East
>> New York NY 10003-6688
>>
>>
>>
>
>


Re: mod_perl memory

Posted by Torsten Förtsch <to...@gmx.net>.
On Friday 19 March 2010 22:07:39 André Warnier wrote:
> In one of your initial posts, you mentioned sending a response with a 
> Content-type "multipart/x-mixed-replace".
> What does that do exactly ?
> A pointer would be fine.
> 
http://en.wikipedia.org/wiki/Push_technology#HTTP_server_push
http://foertsch.name/ModPerl-Tricks/ServerPush.shtml
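
(Roughly: the server keeps the connection open and sends a sequence of
MIME parts, each of which replaces the previous one in the browser. On
the wire it looks something like this:

Content-type: multipart/x-mixed-replace;boundary="BOUND"

--BOUND
Content-type: text/html

<first state of the page>
--BOUND
Content-type: text/html

<replacement page>
...
--BOUND--

with the final --BOUND-- marking the end of the stream.)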

Torsten Förtsch

-- 
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net

Re: mod_perl memory

Posted by André Warnier <aw...@ice-sa.com>.
Pavel Georgiev wrote:
> Thanks, that did the job. I'm currently testing for side effects but it all looks good so far.
> 
Glad someone could help you.
I have been meaning to ask a question, but have been holding back.
In one of your initial posts, you mentioned sending a response with a 
Content-type "multipart/x-mixed-replace".
What does that do exactly?
A pointer would be fine.

Thanks


Re: mod_perl memory

Posted by Pavel Georgiev <pa...@3tera.com>.
Thanks, that did the job. I'm currently testing for side effects but it all looks good so far.

On Mar 18, 2010, at 4:09 AM, Torsten Förtsch wrote:

> On Thursday 18 March 2010 11:54:53 Mårten Svantesson wrote:
>> I have never worked directly with the APR API but in the example above
>> couldn't you prevent the request pool from growing by explicitly reusing
>> the  bucket brigade?
>> 
>> Something like (not tested):
>> 
>> sub {
>>   my ($r)=@_;
>> 
>>   my $ba=$r->connection->bucket_alloc;
>>   my $bb2=APR::Brigade->new($r->pool, $ba);
>>   until( -e '/tmp/stop' ) {
>>     $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
>>     $bb2->insert_tail(APR::Bucket::flush_create $ba);
>>     $r->output_filters->pass_brigade($bb2);
>>     $bb2->cleanup();
>>   }
>> 
>>   $bb2->insert_tail(APR::Bucket::eos_create $ba);
>>   $r->output_filters->pass_brigade($bb2);
>> 
>>   return Apache2::Const::OK;
>> }
>> 
> Thanks for pointing out the obvious. This doesn't grow either.
> 
> Torsten Förtsch
> 
> -- 
> Need professional modperl support? Hire me! (http://foertsch.name)
> 
> Like fantasy? http://kabatinte.net


Re: mod_perl memory

Posted by Dimitri Kollias <me...@gmail.com>.
I ran tests similar to Torsten's and definitely noticed a
difference in memory usage, but didn't see any memory leaks.  Memory
usage is about twice as high when using $r->print, but it does hit its
max fairly quickly.  Memory usage also maxes out fairly quickly using
the bucket brigade method.  Maybe the leak has been fixed?  I'm using
Apache 2.2.21 and mod_perl 2.05.

Thanks,
Dimitri

On Thu, Mar 18, 2010 at 7:09 AM, Torsten Förtsch
<to...@gmx.net>wrote:

> On Thursday 18 March 2010 11:54:53 Mårten Svantesson wrote:
> > I have never worked directly with the APR API but in the example above
> >  couldn't you prevent the request pool from growing by explicitly reusing
> >  the  bucket brigade?
> >
> > Something like (not tested):
> >
> > sub {
> >    my ($r)=@_;
> >
> >    my $ba=$r->connection->bucket_alloc;
> >    my $bb2=APR::Brigade->new($r->pool, $ba);
> >    until( -e '/tmp/stop' ) {
> >      $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
> >      $bb2->insert_tail(APR::Bucket::flush_create $ba);
> >      $r->output_filters->pass_brigade($bb2);
> >      $bb2->cleanup();
> >    }
> >
> >    $bb2->insert_tail(APR::Bucket::eos_create $ba);
> >    $r->output_filters->pass_brigade($bb2);
> >
> >    return Apache2::Const::OK;
> > }
> >
> Thanks for pointing out the obvious. This doesn't grow either.
>
> Torsten Förtsch
>
> --
> Need professional modperl support? Hire me! (http://foertsch.name)
>
> Like fantasy? http://kabatinte.net
>

Re: mod_perl memory

Posted by Torsten Förtsch <to...@gmx.net>.
On Thursday 18 March 2010 11:54:53 Mårten Svantesson wrote:
> I have never worked directly with the APR API but in the example above
>  couldn't you prevent the request pool from growing by explicitly reusing
>  the  bucket brigade?
> 
> Something like (not tested):
> 
> sub {
>    my ($r)=@_;
> 
>    my $ba=$r->connection->bucket_alloc;
>    my $bb2=APR::Brigade->new($r->pool, $ba);
>    until( -e '/tmp/stop' ) {
>      $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
>      $bb2->insert_tail(APR::Bucket::flush_create $ba);
>      $r->output_filters->pass_brigade($bb2);
>      $bb2->cleanup();
>    }
> 
>    $bb2->insert_tail(APR::Bucket::eos_create $ba);
>    $r->output_filters->pass_brigade($bb2);
> 
>    return Apache2::Const::OK;
> }
> 
Thanks for pointing out the obvious. This doesn't grow either.

Torsten Förtsch

-- 
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net

Re: mod_perl memory

Posted by Torsten Förtsch <to...@gmx.net>.
On Thursday 18 March 2010 10:16:07 Torsten Förtsch wrote:
> No, this one does not grow here.
> 
> sub {
> ...
forgot to mention that the function is supposed to be a PerlResponseHandler.
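
For completeness, a hypothetical way to wire such a handler up,
assuming the sub lives in a package My::Push with a handler() function:

<Location /push>
  SetHandler modperl
  PerlResponseHandler My::Push
</Location>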

Torsten Förtsch

-- 
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net

Re: mod_perl memory

Posted by Mårten Svantesson <ma...@travelocitynordic.com>.
Torsten Förtsch wrote:
> On Thursday 18 March 2010 04:13:04 Pavel Georgiev wrote:
>> How would that logic (adding subpools and using them) be applied to my
>>  simplified example:
>>
>> for (;;) {
>>    $request->print("--$this->{boundary}\n");
>>    $request->print("Content-type: text/html; charset=utf-8;\n\n");
>>    $request->print("$data\n\n");
>>    $request->rflush;
>> }
>>
>> Do I need to add an output filter?
>>
> No, this one does not grow here.
> 
> sub {
>   my ($r)=@_;
> 
>   my $ba=$r->connection->bucket_alloc;
>   until( -e '/tmp/stop' ) {
>     my $pool=$r->pool->new;
>     my $bb2=APR::Brigade->new($pool, $ba);
>     $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
>     $bb2->insert_tail(APR::Bucket::flush_create $ba);
>     $r->output_filters->pass_brigade($bb2);
>     $pool->destroy;
>   }
> 
>   my $bb2=APR::Brigade->new($r->pool, $ba);
>   $bb2->insert_tail(APR::Bucket::eos_create $ba);
>   $r->output_filters->pass_brigade($bb2);
> 
>   return Apache2::Const::OK;
> }
> 
> Torsten Förtsch
> 

I have never worked directly with the APR API but in the example above couldn't you prevent the request pool from growing by explicitly reusing the 
bucket brigade?

Something like (not tested):

sub {
   my ($r)=@_;

   my $ba=$r->connection->bucket_alloc;
   my $bb2=APR::Brigade->new($r->pool, $ba);
   until( -e '/tmp/stop' ) {
     $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
     $bb2->insert_tail(APR::Bucket::flush_create $ba);
     $r->output_filters->pass_brigade($bb2);
     $bb2->cleanup();
   }

   $bb2->insert_tail(APR::Bucket::eos_create $ba);
   $r->output_filters->pass_brigade($bb2);

   return Apache2::Const::OK;
}


-- 
   Mårten Svantesson
   Senior Developer
   Travelocity Nordic
   +46 (0)8 505 787 23

Re: mod_perl memory

Posted by Torsten Förtsch <to...@gmx.net>.
On Thursday 18 March 2010 04:13:04 Pavel Georgiev wrote:
> How would that logic (adding subpools and using them) be applied to my
>  simplified example:
> 
> for (;;) {
>    $request->print("--$this->{boundary}\n");
>    $request->print("Content-type: text/html; charset=utf-8;\n\n");
>    $request->print("$data\n\n");
>    $request->rflush;
> }
> 
> Do I need to add an output filter?
> 
No, this one does not grow here.

sub {
  my ($r)=@_;

  my $ba=$r->connection->bucket_alloc;
  until( -e '/tmp/stop' ) {
    my $pool=$r->pool->new;
    my $bb2=APR::Brigade->new($pool, $ba);
    $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
    $bb2->insert_tail(APR::Bucket::flush_create $ba);
    $r->output_filters->pass_brigade($bb2);
    $pool->destroy;
  }

  my $bb2=APR::Brigade->new($r->pool, $ba);
  $bb2->insert_tail(APR::Bucket::eos_create $ba);
  $r->output_filters->pass_brigade($bb2);

  return Apache2::Const::OK;
}

Torsten Förtsch

-- 
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net

Re: mod_perl memory

Posted by Pavel Georgiev <pa...@3tera.com>.
On Mar 17, 2010, at 11:27 AM, Torsten Förtsch wrote:

> On Wednesday 17 March 2010 12:15:15 Torsten Förtsch wrote:
>> On Tuesday 16 March 2010 21:09:33 Pavel Georgiev wrote:
>>> for (<some condition>) {
>>>    $request->print("--$this->{boundary}\n");
>>>    $request->print("Content-type: text/html; charset=utf-8;\n\n");
>>>    $request->print("$data\n\n");
>>>    $request->rflush;
>>> }
>>> 
>>> And the result is endless memory growth in the apache process. Is that
>>> what you had in mind?
>>> 
>> 
>> I can confirm this. I have tried this little handler:
>> 
>> sub {
>>  my $r=shift;
>> 
>>  until( -e "/tmp/stop" ) {
>>    $r->print(("x"x70)."\n");
>>    $r->rflush;
>>  }
>> 
>>  return Apache2::Const::OK;
>> }
>> 
>> The httpd process grows slowly but unlimited. Without the rflush() it
>> grows  slower but still does.
>> 
> Here is a bit more stuff on the bug. It is the pool that grows.
> 
> To show it I use a handler that prints an empty document. I think an empty 
> file shipped by the default handler will do as well.
> 
> Then I add the following filter to the request:
> 
> $r->add_output_filter(sub {
>  my ($f, $bb)=@_;
> 
>  unless( $f->ctx ) {
>    $f->r->headers_out->unset('Content-Length');
>    $f->ctx(1);
>  }
> 
>  my $eos=0;
>  while( my $b=$bb->first ) {
>    $eos++ if( $b->is_eos );
>    $b->delete;
>  }
>  return 0 unless $eos;
> 
>  my $ba=$f->c->bucket_alloc;
>  until( -e '/tmp/stop' ) {
>    my $bb2=APR::Brigade->new($f->c->pool, $ba);
>    $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
>    $bb2->insert_tail(APR::Bucket::flush_create $ba);
>    $f->next->pass_brigade($bb2);
>  }
> 
>  my $bb2=APR::Brigade->new($f->c->pool, $ba);
>  $bb2->insert_tail(APR::Bucket::eos_create $ba);
>  $f->next->pass_brigade($bb2);
> 
>  return 0;
> });
> 
> The filter drops the empty document and emulates our infinite output. With 
> this filter the httpd process still grows. Now I add a subpool to the loop:
> 
> [...]
>  until( -e '/tmp/stop' ) {
>    my $pool=$f->c->pool->new;                       # create a subpool
>    my $bb2=APR::Brigade->new($pool, $ba);           # use the subpool
>    $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
>    $bb2->insert_tail(APR::Bucket::flush_create $ba);
>    $f->next->pass_brigade($bb2);
>    $pool->destroy;                                  # and destroy it
>  }
> [...]
> 
> Now it does not grow.
> 
> Torsten Förtsch
> 
> -- 
> Need professional modperl support? Hire me! (http://foertsch.name)
> 
> Like fantasy? http://kabatinte.net


How would that logic (adding subpools and using them) be applied to my simplified example:

for (;;) {
   $request->print("--$this->{boundary}\n");
   $request->print("Content-type: text/html; charset=utf-8;\n\n");
   $request->print("$data\n\n");
   $request->rflush;
}

Do I need to add an output filter?

Re: mod_perl memory

Posted by Torsten Förtsch <to...@gmx.net>.
On Wednesday 17 March 2010 12:15:15 Torsten Förtsch wrote:
> On Tuesday 16 March 2010 21:09:33 Pavel Georgiev wrote:
> > for (<some condition>) {
> >     $request->print("--$this->{boundary}\n");
> >     $request->print("Content-type: text/html; charset=utf-8;\n\n");
> >     $request->print("$data\n\n");
> >     $request->rflush;
> > }
> > 
> > And the result is endless memory growth in the apache process. Is that
> > what you had in mind?
> > 
> 
> I can confirm this. I have tried this little handler:
> 
> sub {
>   my $r=shift;
> 
>   until( -e "/tmp/stop" ) {
>     $r->print(("x"x70)."\n");
>     $r->rflush;
>   }
> 
>   return Apache2::Const::OK;
> }
> 
> The httpd process grows slowly but unlimited. Without the rflush() it
>  grows  slower but still does.
> 
Here is a bit more stuff on the bug. It is the pool that grows.

To show it I use a handler that prints an empty document. I think an empty 
file shipped by the default handler will do as well.

Then I add the following filter to the request:

$r->add_output_filter(sub {
  my ($f, $bb)=@_;

  unless( $f->ctx ) {
    $f->r->headers_out->unset('Content-Length');
    $f->ctx(1);
  }

  my $eos=0;
  while( my $b=$bb->first ) {
    $eos++ if( $b->is_eos );
    $b->delete;
  }
  return 0 unless $eos;

  my $ba=$f->c->bucket_alloc;
  until( -e '/tmp/stop' ) {
    my $bb2=APR::Brigade->new($f->c->pool, $ba);
    $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
    $bb2->insert_tail(APR::Bucket::flush_create $ba);
    $f->next->pass_brigade($bb2);
  }

  my $bb2=APR::Brigade->new($f->c->pool, $ba);
  $bb2->insert_tail(APR::Bucket::eos_create $ba);
  $f->next->pass_brigade($bb2);

  return 0;
});

The filter drops the empty document and emulates our infinite output. With 
this filter the httpd process still grows. Now I add a subpool to the loop:

[...]
  until( -e '/tmp/stop' ) {
    my $pool=$f->c->pool->new;                       # create a subpool
    my $bb2=APR::Brigade->new($pool, $ba);           # use the subpool
    $bb2->insert_tail(APR::Bucket->new($ba, ("x"x70)."\n"));
    $bb2->insert_tail(APR::Bucket::flush_create $ba);
    $f->next->pass_brigade($bb2);
    $pool->destroy;                                  # and destroy it
  }
[...]

Now it does not grow.

Torsten Förtsch

-- 
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net

Re: mod_perl memory

Posted by Perrin Harkins <ph...@gmail.com>.
2010/3/17 Torsten Förtsch <to...@gmx.net>:
> The httpd process grows slowly but unlimited. Without the rflush() it grows
> slower but still does.
>
> With the rflush() its size increased by 100MB for an output of 220MB.
> Without it it grew by 10MB for an output of 2.3GB.
>
> I'd say it's a bug.

I agree.  This should not cause ongoing process growth.  The data must
be accumulating somewhere.

- Perrin

Re: mod_perl memory

Posted by Salvador Ortiz Garcia <so...@msg.com.mx>.
On 03/17/2010 05:15 AM, Torsten Förtsch wrote:
>   until( -e "/tmp/stop" ) {
>     $r->print(("x"x70)."\n");
>     $r->rflush;
>   }
>

Just for the record:

With mp1 there isn't any memory leak, with or without rflush.
(After 10 mins: output 109GB. Fedora 12's stock perl 5.10.0, Apache 
1.3.42, mod_perl 1.31)

Maybe it's related to mp2's filters.

Regards.

Salvador Ortiz.

Re: mod_perl memory

Posted by Torsten Förtsch <to...@gmx.net>.
On Tuesday 16 March 2010 21:09:33 Pavel Georgiev wrote:
> for (<some condition>) {
>     $request->print("--$this->{boundary}\n");
>     $request->print("Content-type: text/html; charset=utf-8;\n\n");
>     $request->print("$data\n\n");
>     $request->rflush;
> }
> 
> And the result is endless memory growth in the apache process. Is that what
>  you had in mind?
> 
I can confirm this. I have tried this little handler:

sub {
  my $r=shift;

  until( -e "/tmp/stop" ) {
    $r->print(("x"x70)."\n");
    $r->rflush;
  }

  return Apache2::Const::OK;
}

The httpd process grows slowly but without limit. Without the rflush() it 
grows more slowly, but it still grows.

With the rflush() its size increased by 100MB for an output of 220MB.
Without it, it grew by 10MB for an output of 2.3GB.

I'd say it's a bug. 

Torsten Förtsch

-- 
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net

Re: mod_perl memory

Posted by André Warnier <aw...@ice-sa.com>.
André Warnier wrote:
> Pavel Georgiev wrote:
>> Andre,
>>
>> That is what I'm currently doing:
>> $request->content_type("multipart/x-mixed-replace;boundary=\"$this->{boundary}\";"); 
>>
>>
> I don't think so.  What you show above is a multipart message body, 
> which is not the same (and not the same level).
> What you are looking for is a header like
> Transfer-encoding: chunked
> 
> And this chunked transfer encoding happens last in the chain. It is just 
> a way of sending the data to the browser in smaller pieces, like one 
> would do for example to send a whole multi-megabyte video file.
> 
> 
> This link may help :
> 
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6
> and this
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.41
> 
> Also search Google for "mod_perl +chunked"
> 
> The first item in the list contains a phrase :
> "mod_perl can transparently generate chunked encoding on recent versions 
> of Apache"
> 
> I personally don't know how, but it sounds intriguing enough to look 
> further into it.
> Maybe one of the gurus on this list knows more about it, and can give a 
> better opinion than mine as to whether this might help you.
> 
> 
In one of the following hits in Google, I found this demo CGI script:

#!/usr/bin/perl -w
use CGI;
my $cgi = new CGI();
print $cgi->header(
    -type                => 'application/octet-stream',
    -content_disposition => 'attachment; filename=data.raw',
    -connection          => 'close',
    -transfer_encoding   => 'chunked',
);
open(my $fh, '<', '/dev/urandom') || die "$!\n";
for (1..100) {
    my $data;
    read($fh, $data, 1024*1024);
    print $data;
}

That should be easily transposable to mod_perl.

Of course, above they read and send in chunks of 1 MB, which may not be 
the small buffers you are looking for...
;-)


Re: mod_perl memory

Posted by André Warnier <aw...@ice-sa.com>.
Pavel Georgiev wrote:
> Andre,
> 
> That is what I'm currently doing:
> $request->content_type("multipart/x-mixed-replace;boundary=\"$this->{boundary}\";");
> 
I don't think so.  What you show above is a multipart message body, 
which is not the same thing (and not at the same protocol level).
What you are looking for is a header like
Transfer-encoding: chunked

And this chunked transfer encoding happens last in the chain. It is just 
a way of sending the data to the browser in smaller pieces, like one 
would do for example to send a whole multi-megabyte video file.


This link may help :

http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6
and this
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.41

Also search Google for "mod_perl +chunked"

The first item in the list contains a phrase :
"mod_perl can transparently generate chunked encoding on recent versions 
of Apache"

I personally don't know how, but it sounds intriguing enough to look 
further into it.
Maybe one of the gurus on this list knows more about it, and can give a 
better opinion than mine as to whether this might help you.


Re: mod_perl memory

Posted by Pavel Georgiev <pa...@3tera.com>.
Andre,

That is what I'm currently doing:
$request->content_type("multipart/x-mixed-replace;boundary=\"$this->{boundary}\";");

and then each printed chunk looks like this (no length specified):

for (<some condition>) {
    $request->print("--$this->{boundary}\n");
    $request->print("Content-type: text/html; charset=utf-8;\n\n");
    $request->print("$data\n\n");
    $request->rflush;
}

And the result is endless memory growth in the apache process. Is that what you had in mind?

On Mar 16, 2010, at 12:50 PM, André Warnier wrote:

> Pavel Georgiev wrote:
> ...
>> Let me make sure I'm understanding this right - I'm not using any buffers myself,
>  all I do is sysread() from a unix socket and print(),
>  it's just that I need to print a large amount of data for each request.
>> 
> ...
> Taking the issue at the source : can you not arrange to sysread() and/or 
> print() in smaller chunks ?
> There exists something in HTTP named "chunked response encoding" 
> (forgive me for not remembering the precise technical name).  It 
> consists of sending the response to the browser without an overall 
> Content-Length response header, but indicating that the response is 
> chunked.  Then each "chunk" is sent with its own length, and the 
> sequence ends with (if I remember correctly) a last chunk of size zero.
> The browser receives each chunk in turn, and re-assembles them.
> I have never had the problem myself, so I never looked deeply into it.
> But it just seems to me that before going off in more complicated 
> solutions, it might be worth investigating.
> 
> 


Re: mod_perl memory

Posted by André Warnier <aw...@ice-sa.com>.
Pavel Georgiev wrote:
...
> Let me make sure I'm understanding this right - I'm not using any buffers myself,
  all I do is sysread() from a unix socket and print(),
  it's just that I need to print a large amount of data for each request.
> 
...
Taking the issue at the source: can you not arrange to sysread() and/or 
print() in smaller chunks?
There exists something in HTTP named "chunked response encoding" 
(forgive me for not remembering the precise technical name).  It 
consists of sending the response to the browser without an overall 
Content-Length response header, but indicating that the response is 
chunked.  Then each "chunk" is sent with its own length, and the 
sequence ends with (if I remember correctly) a last chunk of size zero.
The browser receives each chunk in turn, and re-assembles them.
I have never had the problem myself, so I never looked deeply into it.
But it just seems to me that before going off in more complicated 
solutions, it might be worth investigating.
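
For what it's worth, the raw wire format looks roughly like this (each
chunk is preceded by its size as a hexadecimal byte count, and a
zero-size chunk terminates the response):

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: text/plain

1a
abcdefghijklmnopqrstuvwxyz
10
1234567890abcdef
0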



Re: mod_perl memory

Posted by Pavel Georgiev <pa...@3tera.com>.
Thank you both for the quick replies!

Arthur,

Apache2::SizeLimit is no solution for my problem as I'm looking for a way to limit the size each request takes; the fact that I can scrub the process after the request is done (or drop requests if the process reaches some limit, although my understanding is that Apache2::SizeLimit does its job after the request is done) does not help me.

William,
Let me make sure I'm understanding this right - I'm not using any buffers myself; all I do is sysread() from a unix socket and print(), it's just that I need to print a large amount of data for each request. Are you saying that there is no way to free the memory after I've done print() and rflush()?

BTW thanks for the other suggestions, switching to CGI seems like the only reasonable thing for me, I just want to make sure that this is how mod_perl operates and it is not me who is doing something wrong.

Thanks,
Pavel

On Mar 16, 2010, at 11:18 AM, ARTHUR GOLDBERG wrote:

> You could use Apache2::SizeLimit ("because size does matter") which evaluates the size of Apache httpd processes when they complete HTTP Requests, and kills those that grow too large. (Note that Apache2::SizeLimit can only be used for non-threaded MPMs, such as prefork.) Since it operates at the end of a Request, SizeLimit has the advantage that it doesn't interrupt Request processing and the disadvantage that it won't prevent a process from becoming oversized while processing a Request. To reduce the regular load of Apache2::SizeLimit it can be configured to check the size intermittently by setting the parameter CHECK_EVERY_N_REQUESTS. These parameters can be configured in a <Perl> section in httpd.conf, or a Perl start-up file.
> 
> That way, if your script allocates too much memory the process will be killed when it finishes handling the request. The MPM will eventually start another process if necessary.
> 
> BR
> A
> 
> On Mar 16, 2010, at 9:30 AM, William T wrote:
> 
>> On Mon, Mar 15, 2010 at 11:26 PM, Pavel Georgiev <pa...@3tera.com> wrote:
>>> I have a perl script running in mod_perl that needs to write a large amount of data to the client, possibly over a long period. The behavior that I observe is that once I print and flush something, the buffer memory is not reclaimed even though I rflush (I know this can't be given back to the OS).
>>> 
>>> Is that how mod_perl operates and is there a way that I can force it to periodically free the buffer memory, so that I can use that for new buffers instead of taking more from the OS?
>> 
>> That is how Perl operates.  Mod_Perl is just Perl embedded in the
>> Apache Process.
>> 
>> You have a few options:
>>  * Buy more memory. :)
>>  * Delegate resource intensive work to a different process (I would
>> NOT suggest forking a child in Apache).
>>  * Tie the buffer to a file on disk, or db object, that can be
>> explicitly reclaimed
>>  * Create a buffer object of a fixed size and loop.
>>  * Use compression on the data stream that you read into a buffer.
>> 
>> You could also architect your system to mitigate resource usage if the
>> large data serve is not a common operation:
>>  * Proxy those requests to a different server which is optimized to
>> handle large data serves.
>>  * Execute the large data serves with CGI rather than Mod_Perl.
>> 
>> I'm sure there are probably other options as well.
>> 
>> -wjt
>> 
> 
> Arthur P. Goldberg, PhD
> 
> Research Scientist in Bioinformatics
> Plant Systems Biology Laboratory
> www.virtualplant.org
> 
> Visiting Academic
> Computer Science Department
> Courant Institute of Mathematical Sciences
> www.cs.nyu.edu/artg
> 
> artg@cs.nyu.edu
> New York University
> 212 995-4918
> Coruzzi Lab
> 8th Floor Silver Building
> 1009 Silver Center
> 100 Washington Sq East
> New York NY 10003-6688
> 
> 
> 


Re: mod_perl memory

Posted by ARTHUR GOLDBERG <ar...@cs.nyu.edu>.
You could use Apache2::SizeLimit ("because size does matter") which  
evaluates the size of Apache httpd processes when they complete HTTP  
Requests, and kills those that grow too large. (Note that  
Apache2::SizeLimit can only be used for non-threaded MPMs, such as  
prefork.) Since it operates at the end of a Request, SizeLimit has the  
advantage that it doesn't interrupt Request processing and the  
disadvantage that it won't prevent a process from becoming oversized  
while processing a Request. To reduce the regular load of  
Apache2::SizeLimit it can be configured to check the size  
intermittently by setting the parameter CHECK_EVERY_N_REQUESTS. These  
parameters can be configured in a <Perl> section in httpd.conf, or a  
Perl start-up file.
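
For example, a hypothetical configuration using the package variables
from the older Apache2::SizeLimit interface (sizes are in KB; names may
differ in newer versions):

<Perl>
  use Apache2::SizeLimit;
  $Apache2::SizeLimit::MAX_PROCESS_SIZE       = 150_000;  # ~150 MB
  $Apache2::SizeLimit::CHECK_EVERY_N_REQUESTS = 10;
</Perl>
PerlCleanupHandler Apache2::SizeLimit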

That way, if your script allocates too much memory the process will be  
killed when it finishes handling the request. The MPM will eventually  
start another process if necessary.

BR
A

On Mar 16, 2010, at 9:30 AM, William T wrote:

> On Mon, Mar 15, 2010 at 11:26 PM, Pavel Georgiev <pa...@3tera.com>  
> wrote:
>> I have a perl script running in mod_perl that needs to write a  
>> large amount of data to the client, possibly over a long period.  
>> The behavior that I observe is that once I print and flush  
>> something, the buffer memory is not reclaimed even though I rflush  
>> (I know this can't be given back to the OS).
>>
>> Is that how mod_perl operates and is there a way that I can force  
>> it to periodically free the buffer memory, so that I can use that  
>> for new buffers instead of taking more from the OS?
>
> That is how Perl operates.  Mod_Perl is just Perl embedded in the
> Apache Process.
>
> You have a few options:
>  * Buy more memory. :)
>  * Delegate resource intensive work to a different process (I would
> NOT suggest forking a child in Apache).
>  * Tie the buffer to a file on disk, or db object, that can be
> explicitly reclaimed
>  * Create a buffer object of a fixed size and loop.
>  * Use compression on the data stream that you read into a buffer.
>
> You could also architect your system to mitigate resource usage if the
> large data serve is not a common operation:
>  * Proxy those requests to a different server which is optimized to
> handle large data serves.
>  * Execute the large data serves with CGI rather than Mod_Perl.
>
> I'm sure there are probably other options as well.
>
> -wjt
>

Arthur P. Goldberg, PhD

Research Scientist in Bioinformatics
Plant Systems Biology Laboratory
www.virtualplant.org

Visiting Academic
Computer Science Department
Courant Institute of Mathematical Sciences
www.cs.nyu.edu/artg

artg@cs.nyu.edu
New York University
212 995-4918
Coruzzi Lab
8th Floor Silver Building
1009 Silver Center
100 Washington Sq East
New York NY 10003-6688




Re: mod_perl memory

Posted by William T <di...@gmail.com>.
On Mon, Mar 15, 2010 at 11:26 PM, Pavel Georgiev <pa...@3tera.com> wrote:
> I have a perl script running in mod_perl that needs to write a large amount of data to the client, possibly over a long period. The behavior that I observe is that once I print and flush something, the buffer memory is not reclaimed even though I rflush (I know this can't be given back to the OS).
>
> Is that how mod_perl operates and is there a way that I can force it to periodically free the buffer memory, so that I can use that for new buffers instead of taking more from the OS?

That is how Perl operates.  Mod_Perl is just Perl embedded in the
Apache Process.

You have a few options:
  * Buy more memory. :)
  * Delegate resource intensive work to a different process (I would
NOT suggest forking a child in Apache).
  * Tie the buffer to a file on disk, or db object, that can be
explicitly reclaimed
  * Create a buffer object of a fixed size and loop (see the sketch
after this list).
  * Use compression on the data stream that you read into a buffer.
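
A rough sketch of the fixed-size-buffer idea (untested; it assumes $r
is the Apache2::RequestRec object and $sock an already-open socket
handle):

sub stream_fixed_buffer {
  my ($r, $sock)=@_;

  my $bufsize = 64 * 1024;            # reuse one 64KB scalar as the buffer
  my $buf = '';
  while( my $n = sysread($sock, $buf, $bufsize) ) {
    # send only the bytes just read; note that, per later posts in this
    # thread, rflush itself can make the request pool grow
    $r->print(substr($buf, 0, $n));
    $r->rflush;
  }
}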

You could also architect your system to mitigate resource usage if the
large data serve is not a common operation:
  * Proxy those requests to a different server which is optimized to
handle large data serves.
  * Execute the large data serves with CGI rather than Mod_Perl.

I'm sure there are probably other options as well.

-wjt