
Posted to dev@httpd.apache.org by David Reid <ab...@dial.pipex.com> on 1999/10/14 20:19:02 UTC

MMAP support for APR

Hi,

Below is a very early overview of what I'm proposing we do to add mmap to
APR.  This is the very basics and I'm open to suggestions.

I'm posting now as I'm off to deliver an aircraft in 3 days' time and will be
totally out of contact (as much by choice as anything else) for 6 days.  My
aim is that, unless someone else writes the code before I get around to it,
most of the issues will have been fleshed out on the list by the time I
return and I can get on with writing the first hash of the code.  As I say,
if anyone has a few spare moments and wants to get on with it, please feel
free.

Once the basics are in place I envisage a set of helper functions being
added to do things like find an ap_mmap_t using a filename, compare 2
ap_mmap_t's and so on.

david


MMAP for APR

The aim is to build as open an API as possible to allow us to do all the
things that 1.3 does with MMAP files but for every platform.

Basic Structure

 pointer to base address of mmap'd area
 size of mmap'd area
 stat details
 filename

Unix
  struct mmap_t {
   char *filename;
   struct stat st;
   void *mm;
   size_t size;
  };

BeOS
  struct mmap_t {
   char *filename;
   struct stat st;
   area_id areaid;
   void *mm;
   size_t size;
  };


Questions?

How do we control the lifetime of an mmap?  We need some way of stopping us
from deleting an mmap while it is still being read.

Similarly are there any platforms that need to restrict access to a certain
number of simultaneous accesses?

Is it worth making the mmap's created into a linked chain within each
application?  This would be useful for "walking the chain" to find a
matching mmap or to generate info on the size of mmaps presently in the
system (thinking ahead to possibly linking with mod_status).

At present I've used the whole stat struct as it's impossible to determine
which fields will be used in the future.  Is this worthwhile or do we simply
make a decision and cherry pick the fields that will be useful?
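
On the lifetime question, one common answer is a reference count: every
reader takes a reference before touching the area, and the area is only torn
down when the last reference is dropped.  The following is a minimal sketch
of that idea, not a proposal for the actual API.  All names here
(counted_map, counted_map_acquire, counted_map_release) are hypothetical, and
free() stands in for whatever the platform uses to unmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; none of these are APR functions. */
struct counted_map {
    void  *mm;        /* base address of the mapped area */
    size_t size;      /* size of the mapped area */
    int    refcount;  /* the owner plus any active readers */
    int    released;  /* set once the backing area has been freed */
};

/* A reader calls this before touching the mapped bytes. */
static void counted_map_acquire(struct counted_map *m)
{
    assert(m->refcount > 0);
    m->refcount++;
}

/* Called by each reader when it is done, and by the owner to "delete".
 * The area is only torn down when the last reference goes away.
 * Returns 1 if this call actually released the area, 0 otherwise. */
static int counted_map_release(struct counted_map *m)
{
    assert(m->refcount > 0);
    if (--m->refcount > 0)
        return 0;
    free(m->mm);      /* stand-in for munmap() or an equivalent */
    m->mm = NULL;
    m->released = 1;
    return 1;
}
```

This sketch is not thread-safe; with threads or multiple processes the count
would need a lock or atomic operations around it.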

API prototypes

ap_status_t ap_create_mmap(ap_mmap_t, char *filename, ap_context_t)

Create a new mmap area and read the file into it.

ap_status_t ap_read_mmap(char * buffer, ap_size_t buflen, ap_size_t startpos,
    ap_size_t len, ap_mmap_t, ap_context_t)

Read a section from an mmap'd file.  Start at startpos and read len bytes.
With startpos set to 0 and len set to -1 try to read entire file, checking
the size of the buffer.
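
To make the intended checks concrete, here is a rough sketch of the read
semantics over an already-mapped area.  It is an illustration only: map_read
and MAP_READ_ALL are hypothetical names, the mapped area is passed as a plain
pointer and size rather than an ap_mmap_t, and MAP_READ_ALL stands in for
"len set to -1" since a size type is usually unsigned.  The sketch also lets
the "read everything" case start at any startpos, which goes slightly beyond
the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for "len set to -1". */
#define MAP_READ_ALL ((size_t)-1)

/* Copy len bytes starting at startpos from the mapped area into buffer.
 * Returns the number of bytes copied, or -1 if the request runs past the
 * mapped area or does not fit in the caller's buffer. */
static long map_read(char *buffer, size_t buflen, size_t startpos,
                     size_t len, const void *mm, size_t mapsize)
{
    assert(buffer != NULL && mm != NULL);
    if (startpos > mapsize)
        return -1;
    if (len == MAP_READ_ALL)
        len = mapsize - startpos;   /* everything from startpos onwards */
    if (len > mapsize - startpos)
        return -1;                  /* would run past the mapped area */
    if (len > buflen)
        return -1;                  /* caller's buffer is too small */
    memcpy(buffer, (const char *)mm + startpos, len);
    return (long)len;
}
```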

ap_send_mmap(ap_socket_t, apint32 startpos, apint32 len, ap_mmap_t,
    ap_context_t)

Send a section of an mmap'd file to a socket.  Start at startpos and read
len bytes.  With startpos set to 0 and len set to -1 try to read entire
file, checking the size of the buffer.

ap_delete_mmap(ap_mmap_t)

Delete the mmap area.
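
For the Unix case, create and delete could look roughly like the sketch
below, which uses the standard POSIX calls open(), fstat(), mmap() and
munmap().  The names file_map, file_map_create and file_map_delete are made
up for illustration, there is no context argument, and plain int return
values stand in for ap_status_t.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical structure mirroring the Unix layout in the proposal. */
struct file_map {
    const char *filename;
    struct stat st;
    void       *mm;
    size_t      size;
};

/* Map the whole file read-only.  Returns 0 on success, -1 on failure.
 * Empty files are treated as a failure because mmap() rejects a
 * zero-length mapping. */
static int file_map_create(struct file_map *m, const char *filename)
{
    int fd;

    assert(m != NULL && filename != NULL);
    fd = open(filename, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &m->st) < 0 || m->st.st_size == 0) {
        close(fd);
        return -1;
    }
    m->size = (size_t)m->st.st_size;
    m->mm = mmap(NULL, m->size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (m->mm == MAP_FAILED) {
        m->mm = NULL;
        return -1;
    }
    m->filename = filename;
    return 0;
}

/* Unmap the area.  Returns 0 on success, -1 on failure. */
static int file_map_delete(struct file_map *m)
{
    int rc;

    assert(m != NULL);
    if (m->mm == NULL)
        return 0;
    rc = munmap(m->mm, m->size);
    m->mm = NULL;
    m->size = 0;
    return rc;
}
```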

Re: MMAP support for APR

Posted by David Reid <ab...@dial.pipex.com>.

I'll pop on over and have a look.  We shouldn't ignore anything that might
be useful.  This is why I wanted to throw this open as there is a wide range
of experience on this forum and a lot of useful ideas out there.  :-)

david

----- Original Message -----
From: Tony Finch <do...@dotat.at>
To: <ne...@apache.org>
Sent: 15 October 1999 02:49
Subject: Re: MMAP support for APR


> "David Reid" <ab...@dial.pipex.com> wrote:
> >
> >How do we control the lifetime of an mmap?  We need some way of stopping
> >us from deleting an mmap while it is still being read.
> >Similarly are there any platforms that need to restrict access to a
> >certain number of simultaneous accesses?
> >Is it worth making the mmap's created into a linked chain within each
> >application?  This would be useful for "walking the chain" to find a
> >matching mmap or to generate info on the size of mmaps presently in the
> >system (thinking ahead to possibly linking with mod_status).
> >At present I've used the whole stat struct as it's impossible to
> >determine which fields will be used in the future.  Is this worthwhile or
> >do we simply make a decision and cherry pick the fields that will be
> >useful?
>
> I'm going to mention Flash again because it's cool.
> http://www.cs.rice.edu/~vivek/flash99/flash.ps.gz
>
> This server does some really cool things with mmap: It maintains a
> cache of chunks of mmapped files that is tuned to the available memory
> in the machine and has an LRU replacement strategy to approximate the
> kernel's page replacement strategy. It uses auxiliary processes to
> peek at the pages of the mmap (and therefore block while they are read
> in) so that the main server doesn't block. If Dean's idea of a
> select-loop-based MPM happens then this could be very handy (even if
> the select loop is hidden inside some userland threading library).
>
> There are some other optimisations: Instead of cacheing struct stats,
> it caches the translation of URI to filename. It's based on thttpd so
> it doesn't have to worry about .htaccess files, but I like the idea of
> being able to avoid traipsing around the filesystem on every request.
> It also caches response headers so that they can be re-used, and does
> some tricks to avoid realignment costs inside the network stack.
>
> >ap_status_t ap_read_mmap(char * buffer, ap_size_t buflen,
> >                         ap_size_t startpos, ap_size_t len,
> >                         ap_mmap_t, ap_context_t)
> >
> >Read a section from an mmap'd file.  Start at startpos and read len
> >bytes. With startpos set to 0 and len set to -1 try to read entire
> >file, checking the size of the buffer.
>
> What's the point of this? Why can't I just slurp bytes straight out of
> memory?
>
> Tony.
> --
> fanf@demon.net -- the .@ person

MMAP Performance (Was: MMAP support for APR)

Posted by Henrik Vendelbo <hv...@bluprints.com>.

----- Original Message ----- 
From: Ryan Bloom <rb...@raleigh.ibm.com>
> I think if you review the conversation, you will find that this API has
> been well researched, and well thought out.  I think the initial problem......

> I do not see APR implementing anything like MMAP on a platform that does 
> not support it natively.  Especially not in version 1.0.  The overriding
> mantra of APR has been "Keep it simple" (well, as far as what we are
> putting in).  Maybe in the future, we will look for ways to implement
> MMAP'in on platforms that don't have it natively.

That sounds like a good approach, since most implementations in the do-it-all-yourself camp have been pretty bulky.

Single-function APIs may be very flexible, but I still believe they tend to perform badly. Since MMAP is very
much a performance API, the performance inherent in the API structure should be taken into consideration. So
far I have only seen one read and one write API function mentioned, so I gather that those are the ones intended. If
this is indeed the case, I suggest that we expand the API, especially when it comes to traversing etc. If the MMAP is
to be accessed using normal memory access, I have no valid suggestions.

\Henrik

Re: Sv: MMAP support for APR

Posted by David Reid <ab...@dial.pipex.com>.

OK, so my trip was cut short, I'm back and I'll start to look at some code
:-)

david
----- Original Message -----
From: Ryan Bloom <rb...@raleigh.ibm.com>
To: <ne...@apache.org>
Sent: 18 October 1999 19:03
Subject: Re: Sv: MMAP support for APR


>
> > > Probably should have made that clearer.  There will be a #define for
> > > APR_HAS_MMAP (or something like it).  But, the ENOTIMPL will be put
> > > in, for the case where stupid programmers try to use functions they
> > > don't have.
> >
> > rad, thanks.
>
> I forget sometimes that not everybody knows what I am thinking.  :)
> Thanks for keeping me on my toes.  I'll add this stuff to the APR Design
> doc that is in CVS.
>
> Ryan
>
> _______________________________________________________________________
> Ryan Bloom rbb@raleigh.ibm.com
> 4205 S Miami Blvd
> RTP, NC 27709 It's a beautiful sight to see good dancers
> doing simple steps.  It's a painful sight to
> see beginners doing complicated patterns.
>

Re: Sv: MMAP support for APR

Posted by Ryan Bloom <rb...@raleigh.ibm.com>.

> > Probably should have made that clearer.  There will be a #define for
> > APR_HAS_MMAP (or something like it).  But, the ENOTIMPL will be put in,
> > for the case where stupid programmers try to use functions they don't
> > have.
> 
> rad, thanks.

I forget sometimes that not everybody knows what I am thinking.  :)
Thanks for keeping me on my toes.  I'll add this stuff to the APR Design
doc that is in CVS.

Ryan

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.

Re: Sv: MMAP support for APR

Posted by Dean Gaudet <dg...@arctic.org>.


On Mon, 18 Oct 1999, Ryan Bloom wrote:

> > > The point of this, is to allow ONE definition for mmapp'ed files, which
> > > will work on any platform that supports them.  If the platform doesn't
> > > support them, the APR functions will return APR_ENOTIMPL, and nothing will
> > > be done, assuming Apache handles that case correctly.
> > > 
> > 
> > feature macros are way better.  ENOTIMPL means there'll be useless code
> > lying around on some platforms.
> > 
> > Dean
> 
> Probably should have made that clearer.  There will be a #define for
> APR_HAS_MMAP (or something like it).  But, the ENOTIMPL will be put in,
> for the case where stupid programmers try to use functions they don't
> have.

rad, thanks.

Dean

Re: Sv: MMAP support for APR

Posted by Ryan Bloom <rb...@raleigh.ibm.com>.

> > The point of this, is to allow ONE definition for mmapp'ed files, which
> > will work on any platform that supports them.  If the platform doesn't
> > support them, the APR functions will return APR_ENOTIMPL, and nothing will
> > be done, assuming Apache handles that case correctly.
> > 
> 
> feature macros are way better.  ENOTIMPL means there'll be useless code
> lying around on some platforms.
> 
> Dean

Probably should have made that clearer.  There will be a #define for
APR_HAS_MMAP (or something like it).  But, the ENOTIMPL will be put in,
for the case where stupid programmers try to use functions they don't
have.
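
As a rough illustration of how the two mechanisms fit together: the feature
macro lets callers compile the mmap path out entirely, while the stub catches
callers who skip the check.  Everything below is a sketch with hypothetical
names (SKETCH_HAS_MMAP, SKETCH_ENOTIMPL, sketch_mmap_create and so on); the
real macro and status values would be whatever APR ends up defining.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values and feature macro, for illustration only. */
#define SKETCH_SUCCESS   0
#define SKETCH_ENOTIMPL  1

#ifndef SKETCH_HAS_MMAP
#define SKETCH_HAS_MMAP 0   /* a configure step would normally set this */
#endif

struct sketch_mmap {
    void  *mm;
    size_t size;
};

#if SKETCH_HAS_MMAP
/* The real implementation would live in platform-specific code. */
int sketch_mmap_create(struct sketch_mmap *m, const char *filename);
#else
/* Stub for platforms without mmap: callers that ignore the feature
 * macro still link, and get a clear "not implemented" status back. */
static int sketch_mmap_create(struct sketch_mmap *m, const char *filename)
{
    (void)m;
    (void)filename;
    return SKETCH_ENOTIMPL;
}
#endif

/* A caller that respects the feature macro: the mmap branch is not even
 * compiled on platforms that lack it, so no dead code is left around. */
static const char *choose_delivery(const char *filename)
{
    assert(filename != NULL);
#if SKETCH_HAS_MMAP
    {
        struct sketch_mmap m;
        if (sketch_mmap_create(&m, filename) == SKETCH_SUCCESS)
            return "mmap";
    }
#endif
    return "read";
}
```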

Ryan

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.

Re: Sv: MMAP support for APR

Posted by Dean Gaudet <dg...@arctic.org>.

On Sun, 17 Oct 1999, Ryan Bloom wrote:

> The point of this, is to allow ONE definition for mmapp'ed files, which
> will work on any platform that supports them.  If the platform doesn't
> support them, the APR functions will return APR_ENOTIMPL, and nothing will
> be done, assuming Apache handles that case correctly.
> 

feature macros are way better.  ENOTIMPL means there'll be useless code
lying around on some platforms.

Dean

Re: Sv: MMAP support for APR

Posted by Ryan Bloom <rb...@raleigh.ibm.com>.

> We should look at what type of operations it will be used for, and then distill which general API functions
> are needed. Whichever way you look at it, the more call instructions the CPU has to execute, the slower
> the code.

I think if you review the conversation, you will find that this API has
been well researched, and well thought out.  I think the initial problem
was that the initial message didn't spell out exactly what we were talking
about.  Jim thought we were talking about shared memory at first, and
David was talking about just mmap'ing files.

The point of this is to allow ONE definition for mmap'ed files, which
will work on any platform that supports them.  If the platform doesn't
support them, the APR functions will return APR_ENOTIMPL, and nothing will
be done, assuming Apache handles that case correctly.

I do not see APR implementing anything like MMAP on a platform that does 
not support it natively.  Especially not in version 1.0.  The overriding
mantra of APR has been "Keep it simple" (well, as far as what we are
putting in).  Maybe in the future, we will look for ways to implement
MMAP'in on platforms that don't have it natively.

Ryan

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.

Sv: MMAP support for APR

Posted by Henrik Vendelbo <hv...@bluprints.com>.

> >ap_status_t ap_read_mmap(char * buffer, ap_size_t buflen,
> >                         ap_size_t startpos, ap_size_t len,
> >                         ap_mmap_t, ap_context_t)
> >
> >Read a section from an mmap'd file.  Start at startpos and read len
> >bytes. With startpos set to 0 and len set to -1 try to read entire
> >file, checking the size of the buffer.
> 
> What's the point of this? Why can't I just slurp bytes straight out of
> memory?

If a platform doesn't support mmap, APR would have to supply its own implementation.
And if a current release of a platform doesn't have it, that is probably because it is not possible to implement
on that platform. On the other hand, the performance cost of calling a function compared to copying from
memory directly is not terrible.
A situation where the cost is terrible, though, is when iterating over the memory. Any MMAP implementation
is already buffered. Certain operations like searching and traversing seem natural to do directly.
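
To illustrate what doing it directly might look like, here is a small sketch
that walks a mapped area in place instead of copying it out through a read
call.  The function name map_count_byte is hypothetical, and the area is
passed as a plain pointer and size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count occurrences of byte c by scanning the mapped area in place.
 * No intermediate buffer and no per-chunk function call into the
 * mmap layer; memchr() works on the mapped bytes themselves. */
static size_t map_count_byte(const void *mm, size_t size, char c)
{
    const char *p = mm;
    const char *end;
    size_t n = 0;

    assert(mm != NULL);
    end = p + size;
    while (p < end && (p = memchr(p, c, (size_t)(end - p))) != NULL) {
        n++;
        p++;    /* continue the search just past the match */
    }
    return n;
}
```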

I suggest that the interface is more than a single API function.

We should look at what type of operations it will be used for, and then distill which general API functions
are needed. Whichever way you look at it, the more call instructions the CPU has to execute, the slower
the code.

\Henrik

Re: MMAP support for APR

Posted by "Jeffrey W. Baker" <jw...@cp.net>.

Tony Finch wrote:
> I'm going to mention Flash again because it's cool.
> http://www.cs.rice.edu/~vivek/flash99/flash.ps.gz
> 
> This server does some really cool things with mmap: It maintains a
> cache of chunks of mmapped files that is tuned to the available memory
> in the machine and has an LRU replacement strategy to approximate the
> kernel's page replacement strategy. It uses auxiliary processes to
> peek at the pages of the mmap (and therefore block while they are read
> in) so that the main server doesn't block. If Dean's idea of a
> select-loop-based MPM happens then this could be very handy (even if
> the select loop is hidden inside some userland threading library).
> 
> There are some other optimisations: Instead of cacheing struct stats,
> it caches the translation of URI to filename. It's based on thttpd so
> it doesn't have to worry about .htaccess files, but I like the idea of
> being able to avoid traipsing around the filesystem on every request.
> It also caches response headers so that they can be re-used, and does
> some tricks to avoid realignment costs inside the network stack.

Isn't that last paragraph substantially the same as the last of the "10x
performance increase" patches sent by SGI?  I believe it cached URI
translations and response headers for static content within the
framework of the current mod_mmap_static.

Jeffrey

Re: MMAP support for APR

Posted by Tony Finch <do...@dotat.at>.

"David Reid" <ab...@dial.pipex.com> wrote:
>
>How do we control the lifetime of an mmap?  We need some way of stopping us
>from deleting an mmap while it is still being read.
>Similarly are there any platforms that need to restrict access to a certain
>number of simultaneous accesses?
>Is it worth making the mmap's created into a linked chain within each
>application?  This would be useful for "walking the chain" to find a
>matching mmap or to generate info on the size of mmaps presently in the
>system (thinking ahead to possibly linking with mod_status).
>At present I've used the whole stat struct as it's impossible to determine
>which fields will be used in the future.  Is this worthwhile or do we simply
>make a decision and cherry pick the fields that will be useful?

I'm going to mention Flash again because it's cool.
http://www.cs.rice.edu/~vivek/flash99/flash.ps.gz

This server does some really cool things with mmap: It maintains a
cache of chunks of mmapped files that is tuned to the available memory
in the machine and has an LRU replacement strategy to approximate the
kernel's page replacement strategy. It uses auxiliary processes to
peek at the pages of the mmap (and therefore block while they are read
in) so that the main server doesn't block. If Dean's idea of a
select-loop-based MPM happens then this could be very handy (even if
the select loop is hidden inside some userland threading library).

There are some other optimisations: Instead of cacheing struct stats,
it caches the translation of URI to filename. It's based on thttpd so
it doesn't have to worry about .htaccess files, but I like the idea of
being able to avoid traipsing around the filesystem on every request.
It also caches response headers so that they can be re-used, and does
some tricks to avoid realignment costs inside the network stack.

>ap_status_t ap_read_mmap(char * buffer, ap_size_t buflen,
>                         ap_size_t startpos, ap_size_t len,
>                         ap_mmap_t, ap_context_t)
>
>Read a section from an mmap'd file.  Start at startpos and read len
>bytes. With startpos set to 0 and len set to -1 try to read entire
>file, checking the size of the buffer.

What's the point of this? Why can't I just slurp bytes straight out of
memory?

Tony.
-- 
fanf@demon.net -- the .@ person