Posted to modperl@perl.apache.org by Daniel Hanks <ha...@about-inc.com> on 2002/03/22 19:07:58 UTC

'Pinning' the root apache process in memory with mlockall

Recently on this list the idea of 'pinning' or locking the root apache process
in memory has been discussed with some interest. The reason is that some
users have experienced the situation where a server becomes loaded, the root
apache process gets swapped out, and in the process loses some of its shared
memory. Future child processes that are forked also share in the loss of shared
memory, so methods like using GTopLimit to 'recycle' child processes when their
shared memory becomes too low cease to work, because when they come up they are
already too low on shared memory.

On our systems we had attempted this, but it always came down to the same
problem: the root process would lose its shared memory, to the point that any
child process would come up, serve a request, find that it was beyond the
threshold for shared memory, and die. The only remedy was to restart Apache
altogether.

So in scouring the list I found someone mentioning using the mlockall C function
to lock the pages of the core apache process in memory. Some handy .xs code was
provided, so I built a module, Sys::Mman, which wraps mlockall, and makes it 
available to Perl.

We installed this on our servers, and call mlockall right at the end of our
preload stuff, i.e., the end of the 'startup.pl'-style script called from 
httpd.conf. The result has been very encouraging. The core apache process is
then able to maintain all its shared memory, and child processes that are forked
are able to start with high amounts of shared memory, all making for a much
happier system.
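
For reference, the call itself is tiny. Below is only a sketch of what the tail
end of such a startup.pl might look like; the Sys::Mman interface shown (the
exported mlockall() and the MCL_* constants) is the home-grown wrapper described
above, not anything on CPAN, so treat the exact names and return convention as
assumptions:

   # startup.pl (sketch): everything above this point preloads modules,
   # precompiles handlers, opens read-only data, and so on.

   use Sys::Mman qw(mlockall MCL_CURRENT MCL_FUTURE);  # hypothetical export list

   # Lock every page currently mapped by the parent httpd (and any page it
   # maps later) into RAM, so those pages can never be swapped out and then
   # show up as "unshared" to the size-limiting modules.
   mlockall(MCL_CURRENT | MCL_FUTURE)
       or warn "mlockall() failed: $!";   # success/failure convention assumed

   1;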

Now, I have also read that probably better than this would be to ensure that
you never swap at all, by tuning MaxClients and by examining our Perl code to
make it less prone to losing shared memory. We're working on that sort of
tuning, but in a volatile environment like ours, where we serve a very large
amount of data and new code comes out almost daily here and there, locking the
core httpd in memory has been very helpful. I just thought I would let others
on the list know that it is feasible and works well in our environment.
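
To give a feel for that kind of tuning (numbers purely illustrative): on a box
with 1 GB of RAM, where roughly 128 MB is reserved for the OS and other daemons
and each mod_perl child tops out at about 10 MB of unshared memory, MaxClients
would have to stay below (1024 - 128) / 10, i.e. around 89, for the worst case
to fit entirely in real memory.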

If there's enough interest I might put the module up on CPAN, but it's really
very simple. h2xs did most of the work for me. And thanks to Doug MacEachern for
posting the .xs code. It worked like a charm.

FWIW,

-- Dan Hanks
========================================================================
   Daniel Hanks - Systems/Database Administrator
   About Inc., Web Services Division
========================================================================


Re: 'Pinning' the root apache process in memory with mlockall

Posted by Stas Bekman <st...@stason.org>.
Perrin Harkins wrote:
> Stas Bekman wrote:
> 
>> Moreover the memory doesn't get unshared when the parent pages are
>> paged out, it's the reporting tools that report the wrong information
>> and of course mislead the size limiting modules which start killing
>> the processes.
> 
> 
> Apache::SizeLimit just reads /proc on Linux.  Is that going to report a 
> shared page as an unshared page if it has been swapped out?

That's what people report. Try the code here:
http://marc.theaimsgroup.com/?l=apache-modperl&m=101667859909389&w=2
to reproduce the phenomenon in a few easy steps.

> Of course you can avoid these issues if you tune your machine not to 
> swap.  The trick is, you really have to tune it for the worst case, i.e. 
> look at the memory usage while beating it to a pulp with httperf or 
> http_load and tune for that.  That will result in MaxClients and memory 
> limit settings that underutilize the machine when things aren't so busy. 
>  At one point I was thinking of trying to dynamically adjust memory 
> limits to allow processes to get much bigger when things are slow on the 
> machine (giving better performance for the people who are on at that 
> time), but I never thought of a good way to do it.

This can be done in the following way: move the variable that controls
the limit into shared memory. Then run a special monitor process that
adjusts this variable, or let each child process do that in its
cleanup stage.
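
For instance, here is a rough sketch of that idea using Perl's SysV shared
memory builtins (the key, the 4-byte packing and the "limit in MB" convention
are illustrative choices, not part of any existing module):

   use IPC::SysV qw(IPC_CREAT S_IRWXU);

   # the parent creates a tiny segment before forking; children inherit it
   my $key = 0x4d504c31;                      # arbitrary example key
   my $id  = shmget($key, 4, IPC_CREAT | S_IRWXU);
   defined $id or die "shmget: $!";

   # a monitor process (or a child's cleanup handler) adjusts the limit
   shmwrite($id, pack('N', 12), 0, 4) or die "shmwrite: $!";   # e.g. 12 MB

   # each child reads the current limit when deciding whether to exit
   my $buf;
   shmread($id, $buf, 0, 4) or die "shmread: $!";
   my $limit_mb = unpack('N', $buf);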

To dynamically change MaxClients one needs to HUP the server.
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


RE: 'Pinning' the root apache process in memory with mlockall

Posted by Rob Bloodgood <ro...@empire2.com>.
> Stas Bekman wrote:

> > Moreover the memory doesn't get unshared when the parent pages are
> > paged out, it's the reporting tools that report the wrong
> > information and of course mislead the size limiting modules
> > which start killing the processes.
> 
> Apache::SizeLimit just reads /proc on Linux.  Is that going to report a 
> shared page as an unshared page if it has been swapped out?
> 
> Of course you can avoid these issues if you tune your machine not to 
> swap.  The trick is, you really have to tune it for the worst case, i.e. 
> look at the memory usage while beating it to a pulp with httperf or 
> http_load and tune for that.  That will result in MaxClients and memory 
> limit settings that underutilize the machine when things aren't so busy. 
>   At one point I was thinking of trying to dynamically adjust memory 
> limits to allow processes to get much bigger when things are slow on the 
> machine (giving better performance for the people who are on at that 
> time), but I never thought of a good way to do it.

Ooh... neat idea, but then that leads to a logical set of questions:
Is MaxClients something that can be changed at runtime?
If not, would it be possible to see about patches to allow that?
:-)

L8r
Rob

#!/usr/bin/perl -w
use Disclaimer qw/:standard/;
 

Re: 'Pinning' the root apache process in memory with mlockall

Posted by Perrin Harkins <pe...@elem.com>.
Stas Bekman wrote:
> Moreover the memory doesn't get unshared when the parent pages are
> paged out, it's the reporting tools that report the wrong information
> and of course mislead the size limiting modules which start killing
> the processes.

Apache::SizeLimit just reads /proc on Linux.  Is that going to report a 
shared page as an unshared page if it has been swapped out?
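
(On Linux that is presumably /proc/<pid>/statm, whose third field is the count
of resident shared pages; that is exactly the figure that shrinks when pages
get swapped out, because they are no longer resident. A minimal sketch,
assuming the usual statm field order and 4K pages:

   my $pid = shift || $$;
   open my $fh, "<", "/proc/$pid/statm" or die "open /proc/$pid/statm: $!";
   my ($size, $resident, $shared) = split ' ', scalar <$fh>;
   close $fh;

   my $page_kb = 4;   # POSIX::sysconf(POSIX::_SC_PAGESIZE)/1024 would be safer
   printf "size=%dK resident=%dK shared=%dK\n",
          map { $_ * $page_kb } $size, $resident, $shared;

So the limit modules act on whatever that third field says at the moment they
look.)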

Of course you can avoid these issues if you tune your machine not to 
swap.  The trick is, you really have to tune it for the worst case, i.e. 
look at the memory usage while beating it to a pulp with httperf or 
http_load and tune for that.  That will result in MaxClients and memory 
limit settings that underutilize the machine when things aren't so busy. 
  At one point I was thinking of trying to dynamically adjust memory 
limits to allow processes to get much bigger when things are slow on the 
machine (giving better performance for the people who are on at that 
time), but I never thought of a good way to do it.

- Perrin


Re: 'Pinning' the root apache process in memory with mlockall

Posted by Stas Bekman <st...@stason.org>.
Bill Marrs wrote:
> At 10:53 PM 3/22/2002, Stas Bekman wrote:
> 
>> top and libgtop use the same source of information, so it has nothing 
>> to do with these tools.
> 
> 
> 'top' has the ability to display SWAP on a per-process basis (you have 
> to change the defaults to see it, but it's there).

yeah, that's a cool feature!

> I didn't find this per-process SWAP value in GTop.pm anywhere.
> 
> If GTop.pm had it, I could fix GTopLimit's bug.

GTop.pm is Perl glue for the libgtop GNOME C library.
If somebody adds the per-process SWAP feature to libgtop, it will become
available in GTop.pm.
Any takers?


__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: 'Pinning' the root apache process in memory with mlockall

Posted by Bill Marrs <bi...@apocalypse.org>.
At 10:53 PM 3/22/2002, Stas Bekman wrote:
>top and libgtop use the same source of information, so it has nothing to 
>do with these tools.

'top' has the ability to display SWAP on a per-process basis (you have to 
change the defaults to see it, but it's there).

I didn't find this per-process SWAP value in GTop.pm anywhere.

If GTop.pm had it, I could fix GTopLimit's bug.
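
(As far as I can tell, top doesn't get that column from the kernel directly; it
just computes SWAP as total size minus resident size. Until libgtop grows the
feature, a rough approximation from /proc would be something like the sketch
below; note that it also counts pages that were never faulted in, so it is
really an upper bound:

   # approximate top(1)'s per-process SWAP column as (size - resident)
   sub swapped_kb {
       my $pid = shift;
       open my $fh, "<", "/proc/$pid/statm" or return 0;
       my ($size, $resident) = split ' ', scalar <$fh>;
       return ($size - $resident) * 4;    # assuming 4K pages
   }

   printf "pid %d: ~%dK swapped out (or never paged in)\n", $$, swapped_kb($$);

That would at least let GTopLimit subtract the swapped amount before comparing
against the shared-memory floor.)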

-bill


Re: 'Pinning' the root apache process in memory with mlockall

Posted by Stas Bekman <st...@stason.org>.
Bill Marrs wrote:
> Stas,
> 
> Thanks for tracking that down.
> 
> So, the problem is our tools.  For me, that's GTopLimit (but also 
> SizeLimit).
> 
> I would think it must be possible to cajole these two into realizing 
> their error.  "top" seems to know how much a process has swapped.  If 
> GTopLimit could know that, the number could be subtracted from the total 
> used in calculating the amount of sharing (and the new "unshared"), and then 
> this "bug" would be resolved, right?
> 
> I looked, but didn't see anything in GTop.pm that gives swap per 
> process, though.  So, it's not going to be easy.
> 
> I guess I'll turn off my deswapper...
> 
> ...and GTopLimit as well.  for now...
> 
> hmm,  maybe I could just avoid using the share-related trigger values in 
> GTopLimit, and just use the SIZE one.  That would be an acceptable 
> compromise, though not the best.

top and libgtop use the same source of information, so it has nothing to 
do with these tools.


__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: 'Pinning' the root apache process in memory with mlockall

Posted by Bill Marrs <bi...@apocalypse.org>.
Stas,

Thanks for tracking that down.

So, the problem is our tools.  For me, that's GTopLimit (but also SizeLimit).

I would think it must be possible to cajole these two into realizing their 
error.  "top" seems to know how much a process has swapped.  If GTopLimit 
could know that, the number could be subtracted from the total used in 
calculating the amount of sharing (and the new "unshared"), and then this "bug" 
would be resolved, right?

I looked, but didn't see anything in GTop.pm that gives swap per process, 
though.  So, it's not going to be easy.

I guess I'll turn off my deswapper...

...and GTopLimit as well.  for now...

hmm,  maybe I could just avoid using the share-related trigger values in 
GTopLimit, and just use the SIZE one.  That would be an acceptable 
compromise, though not the best.

-bill


Re: 'Pinning' the root apache process in memory with mlockall

Posted by Stas Bekman <st...@stason.org>.
Daniel Hanks wrote:
> On Sat, 23 Mar 2002, Stas Bekman wrote:
>  
> 
>>See the discussion on the dev@httpd.apache.org list,
>>http://marc.theaimsgroup.com/?t=101659730800001&r=1&w=2
>>where it was said that it's a very bad idea to use mlock and variants.
>>Moreover the memory doesn't get unshared when the parent pages are paged
>>out, it's the reporting tools that report the wrong information and of
>>course mislead the size limiting modules which start killing the
>>processes. As a conclusion to this thread I've added the following
>>section to the performance chapter of the guide:
>>
> 
> 
> Are we saying then that libgtop is erroneous in its reporting under these
> circumstances? And in the case of Linux, I'm assuming libgtop just reads
> its info straight from /proc. Is /proc erroneous then?

As people have pointed out, it's not libgtop, it's /proc. You have the
same problem with top(1).

It's not erroneous, it just doesn't reflect the change immediately: /proc
will be updated when the pages in question are accessed, which for
performance reasons doesn't happen right away. But that can be too late
for the processes that are about to be killed.

I've posted the C code to test this earlier this week here:
http://marc.theaimsgroup.com/?l=apache-modperl&m=101667859909389&w=2

You are welcome to run more tests and report back.

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: 'Pinning' the root apache process in memory with mlockall

Posted by Daniel Hanks <ha...@about-inc.com>.
On Sat, 23 Mar 2002, Stas Bekman wrote:
 
> See the discussion on the dev@httpd.apache.org list,
> http://marc.theaimsgroup.com/?t=101659730800001&r=1&w=2
> where it was said that it's a very bad idea to use mlock and variants.
> Moreover the memory doesn't get unshared when the parent pages are paged
> out, it's the reporting tools that report the wrong information and of
> course mislead the size limiting modules which start killing the
> processes. As a conclusion to this thread I've added the following
> section to the performance chapter of the guide:
> 

Are we saying then that libgtop is erroneous in its reporting under these
circumstances? And in the case of Linux, I'm assuming libgtop just reads its
info straight from /proc. Is /proc erroneous then?

-- Dan
========================================================================
   Daniel Hanks - Systems/Database Administrator
   About Inc., Web Services Division
========================================================================


Re: 'Pinning' the root apache process in memory with mlockall

Posted by Stas Bekman <st...@stason.org>.
Ed Grimm wrote:
> Danger: Rant ahead.  Proceed with caution.

[my summary of mlocks discussion removed]

> In the discussion you referred to, all of the people saying this was a
> bad idea were using terms like, "I think".  None of them had the
> situation themselves, so they have a difficult time coming to terms with it.
> None of them had related former experience using this.  Something like
> this really needs to be tested by someone who has the issue, and has the
> ability to do benchmarks with real data streams.  If they find it seems
> to work well, then they should test it on production systems.  Anyone
> else talking about it is simply that much hot air, myself included.  (I
> *could* test it, but I don't have enough of a problem to put a priority
> on it.  If we were waiting for me to get time, we'd be waiting a long
> time.)

Right; until someone comes up with real numbers and real testing, this
is all words. I've summarized the discussions here and on the httpd-dev
list. I've never used mlock() and friends myself, so I can only rely on
other users' experiences, and I cannot do the testing myself now, with
too many things on my plate already. If you think my summary is not so
good (which it definitely can be), you are welcome to improve it. If
someone can do the real testing and share the results with the list,
everybody will benefit.

[the rest of the interesting rant removed]

__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com


Re: 'Pinning' the root apache process in memory with mlockall

Posted by Ed Grimm <ed...@tgape.org>.
Danger: Rant ahead.  Proceed with caution.

On Sat, 23 Mar 2002, Stas Bekman wrote:

> See the discussion on the dev@httpd.apache.org list,
> http://marc.theaimsgroup.com/?t=101659730800001&r=1&w=2 where it was
> said that it's a very bad idea to use mlock and variants. Moreover the
> memory doesn't get unshared when the parent pages are paged out, it's
> the reporting tools that report the wrong information and of course
> mislead the size limiting modules which start killing the
> processes. As a conclusion to this thread I've added the following
> section to the performance chapter of the guide:
> 
> =head3 Potential Drawbacks of Memory Sharing Restriction
> 
> It's very important that the system not be heavily engaged in
> swapping. Some systems do swap pages in and out every so often even
> if they have plenty of real memory available, and that's OK. The
> following applies to conditions where there is hardly any free memory
> available.
> 
> So if the system uses almost all of its real memory (including the
> cache), there is a danger of the parent process's memory pages being
> swapped out (written to a swap device). If this happens, the memory
> usage reporting tools will report all those swapped-out pages as
> non-shared, even though in reality these pages are still shared on
> most OSs. When these pages are swapped back in, the sharing will be

My Solaris 2.6 box, while in this situation, was swapping hard, as
measured by my ears, by iostat, and by top (both iowait and the memory
stats).

Note that mlockall does not restrict memory sharing; it restricts
swapping of a certain portion of memory.  This prevents that memory
from ever being needlessly unshared.

In the discussion you referred to, all of the people saying this was a
bad idea were using terms like, "I think".  None of them had the
situation themselves, so they have a difficult time coming to terms with it.
None of them had related former experience using this.  Something like
this really needs to be tested by someone who has the issue, and has the
ability to do benchmarks with real data streams.  If they find it seems
to work well, then they should test it on production systems.  Anyone
else talking about it is simply that much hot air, myself included.  (I
*could* test it, but I don't have enough of a problem to put a priority
on it.  If we were waiting for me to get time, we'd be waiting a long
time.)

Yes, I agree, it's better to never swap.  But if we take the attitude
that we won't use tools to help us when times are tight, then we might
as well get rid of swap entirely.  Locking memory is all about being
selective about what you will and won't swap.

Yes, I agree, it'd be better to mlock those bits of memory that you
really care about, but that's hard to do when that memory is allocated
by software you didn't write.  (In this case, I'd really like to mlock
all the memory that perl allocated but did not free in BEGIN sections
(including, of course, use blocks).  I would also like to compact that
first, but that could be even more difficult.)

As far as the logic regarding 'let the OS decide' - the admin of the
system has the ability to have a much better understanding of how the
system resources are used.  If I have one section of memory which is
used 95% of the time by 75% of my active processes, I really don't want
that memory to swap out just because another program that'll only run
for a minute wants a bit more memory, if it can take that memory from
anywhere else.  When doing individual page-ins, memory managers tend to
worry only about those processes that they are trying to make runnable
now; they're not going to go and load that page back on to every other
page map that shares it just because they also use it.  So even though
that memory is loaded back in, all those other processes will still
have to swap it back in themselves.

For them to do otherwise would be irresponsible, unless the system
administrator clearly doesn't know how to administer the system, or has
chosen not to.  The OS is supposed to handle the typical case; having
one segment of memory used by dozens of processes actively is not the
typical case.  This does not happen on end-user machines; this only
happens on servers.  Theoretically speaking, servers are run by people
who can analyze and tune; mlock and mlockall are tools available to them
to do such tuning.

> reported back to normal after a certain amount of time. If a big chunk
> of the memory shared with child processes is swapped out, it's most
> likely that C<Apache::SizeLimit> or C<Apache::GTopLimit> will notice
> that the shared memory floor threshold was crossed and as a result
> kill those processes. If many of the parent process's pages are swapped
> out, and a newly created child process already starts with shared
> memory below the limit, it'll be killed immediately after serving a
> single request (assuming that C<$CHECK_EVERY_N_REQUESTS> is set to
> one). This is a very bad situation which will eventually lead to a
> state where the system won't respond at all, as it'll be heavily
> engaged in swapping.

Yes, this is why we want to lock the memory.

Ed



Re: 'Pinning' the root apache process in memory with mlockall

Posted by Stas Bekman <st...@stason.org>.
Daniel Hanks wrote:
> Recently on this list the idea of 'pinning' or locking the root apache process
> in memory has been discussed with some interest. The reason is that some
> users have experienced the situation where a server becomes loaded, the root
> apache process gets swapped out, and in the process loses some of its shared
> memory. Future child processes that are forked also share in the loss of shared
> memory, so methods like using GTopLimit to 'recycle' child processes when their
> shared memory becomes too low cease to work, because when they come up they are
> already too low on shared memory.
> 
> On our systems we had attempted this, but it always came down to the same
> problem: the root process would lose its shared memory, to the point that any
> child process would come up, serve a request, find that it was beyond the
> threshold for shared memory, and die. The only remedy was to restart Apache
> altogether.
> 
> So in scouring the list I found someone mentioning using the mlockall C function
> to lock the pages of the core apache process in memory. Some handy .xs code was
> provided, so I built a module, Sys::Mman, which wraps mlockall, and makes it 
> available to Perl.
> 
> We installed this on our servers, and call mlockall right at the end of our
> preload stuff, i.e., the end of the 'startup.pl'-style script called from 
> httpd.conf. The result has been very encouraging. The core apache process is
> then able to maintain all its shared memory, and child processes that are forked
> are able to start with high amounts of shared memory, all making for a much
> happier system.
> 
> Now, I have also read that probably better than this would be to ensure that
> you never swap at all, by tuning MaxClients and by examining our Perl code to
> make it less prone to losing shared memory. We're working on that sort of
> tuning, but in a volatile environment like ours, where we serve a very large
> amount of data and new code comes out almost daily here and there, locking the
> core httpd in memory has been very helpful. I just thought I would let others
> on the list know that it is feasible and works well in our environment.
> 
> If there's enough interest I might put the module up on CPAN, but it's really
> very simple. h2xs did most of the work for me. And thanks to Doug MacEachern for
> posting the .xs code. It worked like a charm.

See the discussion on the dev@httpd.apache.org list,
http://marc.theaimsgroup.com/?t=101659730800001&r=1&w=2
where it was said that it's a very bad idea to use mlock and variants.
Moreover, the memory doesn't get unshared when the parent pages are paged
out; it's the reporting tools that report the wrong information and of
course mislead the size limiting modules which start killing the
processes. As a conclusion to this thread I've added the following
section to the performance chapter of the guide:

=head3 Potential Drawbacks of Memory Sharing Restriction

It's very important that the system not be heavily engaged in
swapping. Some systems do swap pages in and out every so often even
if they have plenty of real memory available, and that's OK. The
following applies to conditions where there is hardly any free memory
available.

So if the system uses almost all of its real memory (including the
cache), there is a danger of the parent process's memory pages being
swapped out (written to a swap device). If this happens, the memory
usage reporting tools will report all those swapped-out pages as
non-shared, even though in reality these pages are still shared on
most OSs. When these pages are swapped back in, the sharing will be
reported back to normal after a certain amount of time. If a big chunk
of the memory shared with child processes is swapped out, it's most
likely that C<Apache::SizeLimit> or C<Apache::GTopLimit> will notice
that the shared memory floor threshold was crossed and as a result
kill those processes. If many of the parent process's pages are swapped
out, and a newly created child process already starts with shared
memory below the limit, it'll be killed immediately after serving a
single request (assuming that C<$CHECK_EVERY_N_REQUESTS> is set to
one). This is a very bad situation which will eventually lead to a
state where the system won't respond at all, as it'll be heavily
engaged in swapping.

This effect may be more or less severe depending on the memory
manager's implementation, and it certainly varies from OS to OS and
between kernel versions. Therefore you should be aware of this
potential problem and simply try to avoid situations where the system
needs to swap at all, by adding more memory, reducing the number of
child servers, or spreading the load across more machines if reducing
the number of child servers is not an option because of the request
rate demands.
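
For reference, these are the kind of knobs the section above is talking about,
as set from a startup.pl. The threshold values below are only placeholders, and
the variable names should be double-checked against the Apache::GTopLimit
documentation:

   use Apache::GTopLimit;

   # kill a child whose total size exceeds this many KB
   $Apache::GTopLimit::MAX_PROCESS_SIZE        = 40_000;

   # kill a child whose shared memory falls below this floor, in KB; this
   # is the check that misfires when shared pages are merely swapped out
   $Apache::GTopLimit::MIN_PROCESS_SHARED_SIZE = 8_000;

   # check on every request, as assumed in the text above
   $Apache::GTopLimit::CHECK_EVERY_N_REQUESTS  = 1;

plus the handler registration (typically a C<PerlFixupHandler Apache::GTopLimit>
line) in httpd.conf.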


__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com