Posted to modperl@perl.apache.org by Stas Bekman <st...@stason.org> on 2000/06/22 20:31:38 UTC

[RFC] (corrected) Swapping Prevention

Here is yet another almost complete rewrite of the section. Thanks to
Barrie and Ed for the comments.

I took the time to re-read an overview of the Linux kernel memory
management system that I had read long ago, so I hope the section now
carries no mistakes. Remember that many details, such as an explanation
of virtual memory, are skipped.

Comments are welcome.

=head2 Swapping Prevention

Before we delve into the details of swapping, let's refresh our
knowledge of memory components and memory management.

The computer memory is called RAM, which stands for Random Access
Memory.  Reading from and writing to RAM is faster by a few orders of
magnitude than doing the same operations on a hard disk: the former
uses non-movable memory cells, while the latter uses rotating magnetic
media.

On most operating systems swap memory is used as an extension of RAM
and not as a duplication of it. So if your OS is one of those and you
have 128MB of RAM and a 256MB swap partition, you have a total of 384MB
of memory available. You should never count the swap memory when you
decide on the maximum number of processes to be run, and we will show
why in a moment.

Swap memory can be built from a number of hard disk partitions and
swap files formatted to be used as swap memory. When you need more
swap memory you can always extend it on demand as long as you have
some free disk space (for more information see the I<mkswap> and
I<swapon> manpages).

System memory is divided into units called memory pages. Usually the
size of a memory page is between 1KB and 8KB.  So if you have 256MB of
RAM installed on your machine and the page size is 4KB, your system has
65,536 main memory pages to work with, and these pages are fast.  If
you have a 256MB swap partition the system can use another 65,536
memory pages, but they are much slower.
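
The page arithmetic above is easy to check. The following is only a
minimal sketch; C<pages_in> is a name made up for this illustration,
and the sizes are the ones used in the example above. With 4KB pages,
256MB holds exactly 65,536 pages (64K pages).

```perl
use strict;
use warnings;

# pages_in() is a hypothetical helper for this example only.
# It returns the number of whole pages that fit into a region
# of $size_mb megabytes when a page is $page_kb kilobytes.
sub pages_in {
    my ($size_mb, $page_kb) = @_;
    return int($size_mb * 1024 / $page_kb);
}

my $ram_pages  = pages_in(256, 4);    # 256MB of RAM, 4KB pages
my $swap_pages = pages_in(256, 4);    # 256MB of swap, same page size

print "RAM pages:  $ram_pages\n";     # RAM pages:  65536
print "swap pages: $swap_pages\n";    # swap pages: 65536
```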

When the system is started all memory pages are available for use by
the programs (processes).

Unless the program is really small, the process running it uses only a
few segments of the program at any given time, each segment mapped
onto its own memory page. Therefore only a few memory pages need to be
loaded into memory.

When the process needs an additional segment of the program to be
loaded into memory, it asks the system whether the page containing
this segment is already loaded. If the page is not found, an event
known as a I<page fault> occurs, which requires the system to allocate
a free memory page, go to the disk, and read the requested segment
into the allocated memory page.

If a process needs to bring a new page into physical memory and there
are no free physical pages available, the operating system must make
room for this page by discarding another page from physical memory.

If the page to be discarded from physical memory came from an image or
data file and has not been written to, then the page does not need to
be saved. Instead it can simply be discarded, and if the process needs
that page again it can be brought back into memory from the image or
data file.

However, if the page has been modified, the operating system must
preserve the contents of that page so that it can be accessed at a
later time. This type of page is known as a I<dirty page>, and when it
is removed from memory it is saved in a special sort of file called
the swap file. This process is referred to as I<swapping out>.

Accesses to the swap file are very slow relative to the speed of the
processor and physical memory, so the operating system must juggle the
need to write pages to disk with the need to retain them in memory to
be used again.

To reduce the chance that a page that has just been swapped out will
be needed again a moment later, the LRU (least recently used) or a
similar algorithm is used to choose which page to remove.

To summarize the two scenarios: discarding read-only pages incurs
almost no overhead, whereas discarding data pages that have been
written to is expensive, since those pages have to be written to a
swap partition located on the slow disk. Therefore your machine's
overall performance will be much better if there are fewer memory
pages that can become dirty.
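
The difference between the two scenarios can be illustrated with a toy
simulation. This is only a sketch under simplifying assumptions, not
how any real kernel is implemented: C<simulate_lru> is a name invented
for this example, memory holds a fixed number of pages, and each access
is given as a page name plus a flag saying whether it is a write.

```perl
use strict;
use warnings;

# Toy model, not a real kernel algorithm. Physical memory holds
# $frames pages; each access is [page_name, is_write]. Returns the
# number of page faults, clean pages discarded, and dirty pages
# that had to be swapped out.
sub simulate_lru {
    my ($frames, @accesses) = @_;
    my (%dirty, @lru);    # @lru keeps the least recently used page first
    my %stats = (faults => 0, discarded => 0, swapped_out => 0);

    for my $access (@accesses) {
        my ($page, $is_write) = @$access;
        if (exists $dirty{$page}) {
            # Page is resident: drop it from its old position in @lru
            @lru = grep { $_ ne $page } @lru;
        }
        else {
            $stats{faults}++;
            if (@lru >= $frames) {
                my $victim = shift @lru;    # evict the least recently used
                if (delete $dirty{$victim}) { $stats{swapped_out}++ }
                else                        { $stats{discarded}++ }
            }
            $dirty{$page} = 0;              # freshly loaded pages are clean
        }
        $dirty{$page} = 1 if $is_write;
        push @lru, $page;                   # now the most recently used
    }
    return \%stats;
}

# Two page frames; page B is written to, A and C are only read.
my $stats = simulate_lru(2, ['A', 0], ['B', 1], ['C', 0], ['A', 0]);
print "$_: $stats->{$_}\n" for sort keys %stats;
# discarded: 1
# faults: 4
# swapped_out: 1
```

Evicting the clean page A costs nothing beyond the later re-read,
while evicting the dirty page B requires a write to swap first.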

But the problem is that Perl is a language with no strong data types,
which means that both the program code and the program data are seen
as data pages by the OS, since both are mapped to the same memory
pages. Therefore a big chunk of your Perl code becomes dirty when its
variables are modified, and when the pages need to be discarded they
have to be written to the swap partition.

This leads us to two important conclusions about swapping and Perl.

=over 

=item *

Running your system when there is no free main memory available
hinders performance, because processes' memory pages have to be
discarded and then reread from disk again and again.

=item *

Since the majority of the running code is Perl code, in addition to
the overhead of reading the previously discarded pages back in, there
is the overhead of saving the dirty pages to the swap partition.

=back


When the system has to swap memory pages in and out, it slows down and
no longer serves the processes as fast as before. This leads to an
accumulation of processes waiting for their turn to run, which drives
processing demands up, which in turn slows the system down even more
as more memory is required.  This ever worsening spiral will bring the
machine to a halt, unless the resource demand suddenly drops and
allows the processes to catch up with their tasks and go back to
normal memory usage.

In addition it's important to know that, for better performance, most
programs on most modern OSs, and particularly programs written in
Perl, don't return memory pages to the system while they are running.
If some of the memory gets freed it's reused when needed by the same
process, without the additional overhead of asking the system to
allocate new memory pages.  That's why you will observe that Perl
programs grow in size as they run and almost never shrink.

When the process quits it returns its memory pages to the pool of
freely available pages for other processes to use.

This scenario is certainly instructive, and it should now be obvious
that a system that runs a web server should never swap. It's
absolutely normal for your desktop to start swapping. You will notice
it immediately, since things will slow down and sometimes the system
will freeze for short periods. But on a desktop you are in control:
you can stop starting new programs and quit some of the running ones,
thus allowing the system to catch up with the load and go back to
using the RAM.

In the case of the web server you have much less control, since it's
the users who load your machine by issuing requests to your server.
Therefore you should use the C<MaxClients> directive to configure the
server so that the maximum number of possible processes is small
enough. This will ensure that at peak hours the system won't swap.
Remember that swap space is an emergency pool, not a resource to be
used routinely.  If you are low on memory and you badly need it, buy
more or reduce the number of processes to prevent swapping.
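
One rough way to pick a value for C<MaxClients> is to divide the RAM
you are willing to give to the server by the size of a single server
process. The sketch below assumes exactly that simple model;
C<max_clients> is a name made up for this example, all the figures are
hypothetical, and memory shared between processes is ignored. Note
that swap space is deliberately left out of the budget.

```perl
use strict;
use warnings;

# max_clients() is a hypothetical helper for this example only.
# All sizes are in MB. Swap is not counted: only RAM that is not
# reserved for the OS and other services is divided among processes.
sub max_clients {
    my ($ram_mb, $reserved_mb, $per_process_mb) = @_;
    return int(($ram_mb - $reserved_mb) / $per_process_mb);
}

# Assumed figures: 256MB of RAM, 56MB kept for the OS and other
# services, 10MB per server process.
print "MaxClients ", max_clients(256, 56, 10), "\n";    # MaxClients 20
```

Measure the real size of your processes under load before relying on
such an estimate.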

However, sometimes, due to faulty code, a process might start spinning
in an unconstrained loop, consuming all the available RAM and starting
to use swap memory heavily. In such a situation it helps to have a big
emergency pool (i.e. lots of swap memory). But you have to resolve the
problem as soon as possible, since this pool won't last for long. In
the meantime the C<Apache::Resource> module can be handy.

Sometimes calling an undefined subroutine in a module can cause a
tight loop that consumes all the available memory.  Here is a way to
catch such errors.  Define an C<AUTOLOAD> subroutine:

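  use Carp ();
  sub UNIVERSAL::AUTOLOAD {
    my $class = shift;
    # Object destruction also ends up here; ignore it
    return if $UNIVERSAL::AUTOLOAD =~ /::DESTROY$/;
    # cluck() warns with a stack trace that shows the calling line
    Carp::cluck("$class can't $UNIVERSAL::AUTOLOAD!");
  }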

This will produce a nice error in I<error_log>, giving the line number
of the call and the name of the undefined subroutine.

_____________________________________________________________________
Stas Bekman              JAm_pH     --   Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide  http://perl.apache.org/guide 
mailto:stas@stason.org   http://perl.org     http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org