Posted to modperl@perl.apache.org by ben syverson <be...@elsif.org> on 2005/02/10 23:27:28 UTC

[mp1 and mp2] Grokking memory

I don't think I'm getting mod_perl's shared memory scheme yet. I have a 
package that gets loaded in my startup.pl, and it basically does this:

use vars qw(%words);
open FILE ...
while (<FILE>) {
	$words{$_} = 1;
}
close FILE;

...creating a hash of words from a CR-delimited list of words. The hash 
winds up taking up a few megabytes of RAM, but it's absolutely never 
written to, so I figured it would be shared between the mod_perl 
processes. However, each process grows by a few megs...

What's the best way to get around this?

- ben


Re: [mp1 and mp2] Grokking memory

Posted by Steven Lembark <le...@wrkhors.com>.

> ...creating a hash of words from a CR-delimited list of words. The hash
> winds up taking up a few megabytes of RAM, but it's absolutely never
> written to, so I figured it would be shared between the mod_perl
> processes. However, each process grows by a few megs...
>
> What's the best way to get around this?

Depends on how you measure "bigger". Things like top will normally
show you the virtual size ("VIRT"); this includes the shared
portion ("SHR") and will add up to more than the total core used if
programs share memory. If you look at the shared portion for the
processes, does it grow when you load the hash? If so, then the O/S is
sharing the hash until it's written.



-- 
Steven Lembark                                       85-09 90th Street
Workhorse Computing                                Woodhaven, NY 11421
lembark@wrkhors.com                                     1 888 359 3508

Re: [mp1 and mp2] Grokking memory

Posted by "Richard F. Rebel" <rr...@whenu.com>.
Ugh, okay.

Last BSD system I really paid much attention to was a BSDI system, and
that was years ago.

There was a simple way to calculate shared memory between processes at
the time in BSD/OS, but alas, I am sure it's somewhat different from
Linux and I surely don't remember it.

On Thu, 2005-02-10 at 17:43 -0600, ben syverson wrote:
> Arg -- I'm not being specific enough again. Sorry. This is all in 
> FreeBSD, which I know handles memory much differently than Linux. 
> Here's a sample line from top:
> 
>   PID USERNAME        PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
> 91778 nobody            4    0 13496K 12584K select   0:00  0.00%  0.00% httpd
> 
> 
> On Feb 10, 2005, at 5:28 PM, Richard F. Rebel wrote:
> 
> > OHHH, and BTW, when do you load this hash?
> 
> The hash is called in a startup.pl like this:
> My::HashLoader ();
> 
> And HashLoader basically does the code I listed before. So I would 
> assume that it would be shared.
> 
> Thanks again,
> 
> - ben
> 
-- 
Richard F. Rebel <rr...@whenu.com>
WhenU.com

Re: [mp1 and mp2] Grokking memory

Posted by ben syverson <be...@elsif.org>.
Arg -- I'm not being specific enough again. Sorry. This is all in 
FreeBSD, which I know handles memory much differently than Linux. 
Here's a sample line from top:

  PID USERNAME        PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
91778 nobody            4    0 13496K 12584K select   0:00  0.00%  0.00% httpd


On Feb 10, 2005, at 5:28 PM, Richard F. Rebel wrote:

> OHHH, and BTW, when do you load this hash?

The hash is called in a startup.pl like this:
My::HashLoader ();

And HashLoader basically does the code I listed before. So I would 
assume that it would be shared.

Thanks again,

- ben


Re: [mp1 and mp2] Grokking memory

Posted by Perrin Harkins <pe...@elem.com>.
On Thu, 2005-02-10 at 19:33 -0500, Richard F. Rebel wrote:
> Does this report or help illustrate shared COW pages between apache
> processes?

I certainly thought it did, but this work was done by my friend Doug
Steinwand, not by me.  I don't really know much more about it, but it
always seemed to work properly for me.

> I thought that particular part of /proc/<pid>/statm reported
> the pages potentially shared with other processes as they are part of
> dynamically loaded libraries.

I'm not sure the OS makes a distinction in reporting these things.

> When I check top, the SHARE column says 2328, which is exactly 4 (page
> size in KB) x the 'share' column from statm (582).  From what I
> understand so far, this does not represent COW pages shared between
> related processes.

It doesn't?  We tell people that it does in the mod_perl docs.

> Do I have the wrong end of the stick here?  (I'd rather I did, because I
> have been using GTop to test my stuff before releasing).

Does GTop tell you something different?  I would expect it to give the
same answer.

- Perrin


Re: [mp1 and mp2] Grokking memory

Posted by "Richard F. Rebel" <rr...@whenu.com>.
Hi Perrin,

Does this report or help illustrate shared COW pages between apache
processes?  I thought that particular part of /proc/<pid>/statm reported
the pages potentially shared with other processes as they are part of
dynamically loaded libraries.

On my 2.6 kernel:

bash-2.05b$ echo $$
25964
bash-2.05b$ cat /proc/25964/statm
793 449 582 197 0 596 0
bash-2.05b$

According to 'man proc'

       /proc/[number]/statm
              Provides information about memory status in pages.  The
              columns are:
               size       total program size
               resident   resident set size
               share      shared pages
               trs        text (code)
               drs        data/stack
               lrs        library
               dt         dirty pages

Of course the man page isn't all that illuminating.

When I check top, the SHARE column says 2328, which is exactly 4 (page
size in KB) x the 'share' column from statm (582).  From what I
understand so far, this does not represent COW pages shared between
related processes.

Do I have the wrong end of the stick here?  (I'd rather I did, because I
have been using GTop to test my stuff before releasing).
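
The page-to-kilobyte arithmetic above can be checked with a minimal
sketch.  It assumes a 4 KB page size (as on x86 Linux) and only looks at
the first three statm columns; the function name statm_to_kb is made up
for illustration and is not part of Apache::SizeLimit or GTop.

```perl
use strict;
use warnings;

# Convert the first three fields of a /proc/<pid>/statm line
# (reported in pages) into kilobytes.
# $page_kb is an assumption: 4 on x86 Linux; other platforms may differ.
sub statm_to_kb {
    my ($line, $page_kb) = @_;
    $page_kb = 4 unless defined $page_kb;
    my ($size, $resident, $share) = split ' ', $line;
    return {
        size     => $size     * $page_kb,
        resident => $resident * $page_kb,
        share    => $share    * $page_kb,
    };
}

# The statm line quoted in this message.
my $kb = statm_to_kb("793 449 582 197 0 596 0");
printf "size=%dK resident=%dK share=%dK\n",
    $kb->{size}, $kb->{resident}, $kb->{share};
```

For the line quoted above this gives share=2328K, which matches the
SHARE column from top.  It only shows where the number comes from; it
does not settle whether that column counts COW pages.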

Best,

On Thu, 2005-02-10 at 18:32 -0500, Perrin Harkins wrote:
> On Thu, 2005-02-10 at 18:28 -0500, Richard F. Rebel wrote:
> > As far as I know, especially on linux, there is no way to tell exactly
> > how 'shared' your apache processes are, except by using apache+mod_perl
> > with GTop (and its associated apache module).  I certainly don't know
> > of a way to get this figure from the command line.  Maybe someone else
> > on the list does.
> 
> You can read it from /proc.  From Apache::SizeLimit:
> 
> sub linux_size_check {
>     my($size, $resident, $share) = (0, 0, 0);
> 
>     my $file = "/proc/self/statm";
>     if (open my $fh, "<$file") {
>         ($size, $resident, $share) = split /\s/, scalar <$fh>;
>         close $fh;
>     } else {
>         error_log("Fatal Error: couldn't access $file");
>     }
> 
>     # linux on intel x86 has 4KB page size...
>     return ($size * 4, $share * 4);
> }
> 
> - Perrin
> 
-- 
Richard F. Rebel <rr...@whenu.com>
WhenU.com

Re: [mp1 and mp2] Grokking memory

Posted by Perrin Harkins <pe...@elem.com>.
On Thu, 2005-02-10 at 18:28 -0500, Richard F. Rebel wrote:
> As far as I know, especially on linux, there is no way to tell exactly
> how 'shared' your apache processes are, except by using apache+mod_perl
> with GTop (and its associated apache module).  I certainly don't know
> of a way to get this figure from the command line.  Maybe someone else
> on the list does.

You can read it from /proc.  From Apache::SizeLimit:

sub linux_size_check {
    my($size, $resident, $share) = (0, 0, 0);

    my $file = "/proc/self/statm";
    if (open my $fh, "<$file") {
        ($size, $resident, $share) = split /\s/, scalar <$fh>;
        close $fh;
    } else {
        error_log("Fatal Error: couldn't access $file");
    }

    # linux on intel x86 has 4KB page size...
    return ($size * 4, $share * 4);
}

- Perrin


Re: [mp1 and mp2] Grokking memory

Posted by "Richard F. Rebel" <rr...@whenu.com>.
Gosh, I am going to take a crack at this, but it's been a long while.  I
know somewhere this stuff is documented.  On Linux these things are not
terribly well documented anywhere that I have found.

Okay, the VIRT/VSZ/VSS is the virtual memory size of your process.  This
includes all pages shared with other processes.  The way the VM system
works, this is not really a terribly accurate way to tell the memory usage
of your process, though.

The RES figure is the resident size of your process, how much of it is
actually in ram at the current moment.  This figure does not include
Shared Pages of dynamically loaded libraries that are also loaded by
other programs (eg, libc).

SHR/SHARE (on linux at least) is the pages that COULD be shared with
other processes (and usually are, such as libc).

However, all this said, none of these columns clearly demonstrates the
shared COPY ON WRITE (COW) pages.

COW pages are shared between related processes and are an "optimization".
Basically, pages marked as copy-on-write are shared between the
processes until they are written to.  So long as they are not, they
remain shared.  This is the type of shared memory you are talking about
when you mention Apache "sharing" pages between children.  Eg, you load
a big ole hash in the parent, then fork, the children should not
actually copy the memory but share it until it starts getting written
to.
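
A small sketch of that load-then-fork pattern, assuming a Unix-like
system where fork() is available.  The three-word hash stands in for the
real word list, and the pipe is only there so the parent can report what
the child saw.  It shows that the child uses the inherited data without
reloading it; it does not measure how many pages stay shared.  Note that
perl can write to its own bookkeeping (reference counts, cached
conversions) even when a script only reads a variable, so some pages may
still become unshared over time.

```perl
use strict;
use warnings;

# Build the data once in the parent, before any fork.
my %words = map { $_ => 1 } qw(alpha beta gamma);   # stand-in for the word list

pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: reads the hash it inherited; nothing is reloaded here.
    close $reader;
    print {$writer} join(',', sort keys %words), "\n";
    close $writer;
    exit 0;
}

# Parent: collect the child's report and reap the child.
close $writer;
my $seen = <$reader>;
close $reader;
waitpid($pid, 0);
chomp $seen;
print "child saw: $seen\n";
```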

As far as I know, especially on linux, there is no way to tell exactly
how 'shared' your apache processes are, except by using apache+mod_perl
with GTop (and its associated apache module).  I certainly don't know
of a way to get this figure from the command line.  Maybe someone else
on the list does.

I have seen people post ways to 'approximate' how shared your
applications are from ps/top.  Just looking at the 'RES' column and
observing that upon start most of your apache processes should be nearly
the same size.  Over time this will change, and some will end up being
larger.  This could be because of alloc()'s or could be because of COW
pages being written to and becoming un-shared.  So long as the bulk of
them are roughly the same size, you can assume your shared-ness is okay.
If it's of concern, then you might consider making sure MaxRequestsPerChild is
set so that children get restarted every so often, or using apache +
gtop (which is what I have done when I have needed to).

Maybe someone can add some detail to help Ben approximate his
sharedness...

OHHH, and BTW, when do you load this hash?  If it's not done in the
Parent/Master process in the early part of the apache initialization
process, each child would of course load the hash separately after it
forks, and it would not be shared.  This means loading it in the module
initialization phase.  You can force this by putting the code that
loads the hash into your startup.pl file or calling it from there.
Other methods would maybe be BEGIN blocks, or in a module in a place
that would be executed soon as the module is loaded.  Now that I think
about this, I bet this may be your problem if you are sure the data is
not being shared with COW pages.
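
As a rough sketch of loading in the parent: My::HashLoader is the
package name Ben mentioned, but the load() function, its filehandle
argument, and the sample words below are hypothetical.  In a real
startup.pl the handle would be opened on the actual word list file; an
in-memory string is used here so the sketch runs on its own.

```perl
package My::HashLoader;
use strict;
use warnings;
use vars qw(%words);

# Fill %words from an open filehandle, one word per line.
# Returns the number of distinct words loaded so far.
sub load {
    my ($fh) = @_;
    while (my $line = <$fh>) {
        chomp $line;                 # drop the line terminator
        next unless length $line;    # skip empty lines
        $words{$line} = 1;
    }
    return scalar keys %words;
}

package main;

# Stand-in for the real word list file (an assumption for this sketch).
my $list = "apple\nbanana\ncherry\n";
open my $fh, '<', \$list or die "cannot open word list: $!";
my $count = My::HashLoader::load($fh);
close $fh;

# When run from startup.pl, this pid is the parent's, before any fork.
print "loaded $count words in pid $$\n";
```

If the load only happens inside a request handler instead, each child
builds its own copy and nothing is shared.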

Best,


On Thu, 2005-02-10 at 16:51 -0600, ben syverson wrote:
> Hi Richard,
> 
> Sorry -- I should have been more specific:
> 
> On Feb 10, 2005, at 4:40 PM, Richard F. Rebel wrote:
> 
> > How are you detecting that a process is growing by a couple megs?
> 
> Via "top" on the command line.
> 
> > Also, you mention that the processes grow by a couple megs.  By this do
> > you mean that each subsequent fork is a few megs larger than the
> > parents?
> 
> Each fork (and also the parent) winds up being a few megs larger when 
> the hash is loaded. Right now I'm testing with 5 children, and if I 
> comment out the line that loads the module which loads the hash, all 6 
> processes wind up being smaller.
> 
> > One more question, what mpm are you using?
> 
> Old-style pre-fork.
> 
> Thanks!
> 
> - ben
> 
-- 
Richard F. Rebel

cat /dev/null > `tty`

Re: [mp1 and mp2] Grokking memory

Posted by ben syverson <be...@elsif.org>.
Hi Richard,

Sorry -- I should have been more specific:

On Feb 10, 2005, at 4:40 PM, Richard F. Rebel wrote:

> How are you detecting that a process is growing by a couple megs?

Via "top" on the command line.

> Also, you mention that the processes grow by a couple megs.  By this do
> you mean that each subsequent fork is a few megs larger than the
> parents?

Each fork (and also the parent) winds up being a few megs larger when 
the hash is loaded. Right now I'm testing with 5 children, and if I 
comment out the line that loads the module which loads the hash, all 6 
processes wind up being smaller.

> One more question, what mpm are you using?

Old-style pre-fork.

Thanks!

- ben


Re: [mp1 and mp2] Grokking memory

Posted by "Richard F. Rebel" <rr...@whenu.com>.
Hi Ben,

How are you detecting that a process is growing by a couple megs?  Are
you looking at the VSS (virtual set size)?  If you are, on
most un*x-es this figure should remain roughly the same despite shared
segments.

Also, you mention that the processes grow by a couple megs.  By this do
you mean that each subsequent fork is a few megs larger than the
parents?

One more question, what mpm are you using?  If you are using a threaded
mpm, the perl interpreters are cloned and all variables are cloned
between interpreters, that is unless you explicitly share them.  I don't
know what effect this behavior has on copy-on-write pages (aka shared),
but I wouldn't be surprised if this defeats this type of sharing.

Richard F. Rebel


On Thu, 2005-02-10 at 16:27 -0600, ben syverson wrote:
> I don't think I'm getting mod_perl's shared memory scheme yet. I have a 
> package that gets loaded in my startup.pl, and it basically does this:
> 
> use vars qw(%words);
> open FILE ...
> while (<FILE>) {
> 	$words{$_} = 1;
> }
> close FILE;
> 
> ...creating a hash of words from a CR-delimited list of words. The hash 
> winds up taking up a few megabytes of RAM, but it's absolutely never 
> written to, so I figured it would be shared between the mod_perl 
> processes. However, each process grows by a few megs...
> 
> What's the best way to get around this?
> 
> - ben
> 
-- 
Richard F. Rebel

cat /dev/null > `tty`