Posted to dev@apr.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2003/09/19 21:56:03 UTC

apr-util/dbm/sdbm page sizes

As a few here already know, we used the 'Standard' page size for our
apr-util SDBM implementation.  This assures that you can modify our
sdbm files from perl and other tools.

However, it's pretty clear that 1024 just doesn't cut it when it comes
to huge objects from ssl caching and other sorts of large blobs.  The
modssl project (1.3) had tweaked sdbm to take larger page sizes, but
this leads to an inflexible implementation (another fixed page size, but
incompatible with other implementations.)

I'd like to see apr-util support alternate sized dbm pages, but without
making fixed assumptions about those sizes (other than this - the 
structure uses shorts for offsets, so the page size does have a cap.)
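
To make that cap concrete: assuming offsets within a page are stored as
signed shorts, as in the classic sdbm page layout, an offset can be as
large as the page size itself, so the page size has to fit in a short.
A minimal sketch of such a check follows. The helper name
sdbm_page_size_ok and the lower bound are assumptions for illustration,
not something taken from apr-util or from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not in apr-util): returns 1 if page_size can be
 * addressed with signed short offsets, 0 otherwise. */
static int sdbm_page_size_ok(long page_size)
{
    /* Assumed floor: room for the entry count plus one key/value
     * offset pair, each stored as a short. */
    const long min_size = 3 * (long)sizeof(short);

    if (page_size < min_size)
        return 0;
    /* The largest offset equals the page size, so it must fit. */
    return page_size <= SHRT_MAX;
}
```

With a 16-bit short this would accept the standard 1024 and reject
anything above 32767.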

The attached patch allows the SDBM to be of any arbitrary data and
directory page sizes.  However, I've not hacked in any method of
declaring what those sizes should be (they are the 'standard' defaults
at the moment.)

What do folks believe is the most rational approach to expanding support 
to create an sdbm with alternate page sizes?  And how best can we embed 
that info into the sdbm so that later accesses to the file use the correct
page size?  My current thought is to embed a NULL-key record at the head
of the file, with a specific known value element containing some identifier
string such as "APR-SDBM" followed by two ints, the data and directory page sizes.
If that record is missing (or created with default sizes), the file is treated 
as a standard sdbm with 1024/4096 sized pages.
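
A rough sketch of what the value of such a record might look like, and
how a reader could fall back to the standard sizes. Everything here is
an assumption for illustration: the function names, the 32-bit
big-endian encoding, and the reading of the two ints as the data page
size followed by the directory block size (1024 and 4096 by default).
None of it comes from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_MAGIC         "APR-SDBM"          /* identifier from the proposal */
#define HDR_MAGIC_LEN     8
#define HDR_LEN           (HDR_MAGIC_LEN + 8) /* magic + two 32-bit ints */
#define DEFAULT_PAGE_SIZE 1024u               /* standard sdbm data page */
#define DEFAULT_DIR_SIZE  4096u               /* standard sdbm directory block */

/* Assumption: ints are stored big-endian so files can move between hosts. */
static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Build the value that would be stored under the NULL key. */
static void hdr_encode(unsigned char out[HDR_LEN],
                       uint32_t page_size, uint32_t dir_size)
{
    memcpy(out, HDR_MAGIC, HDR_MAGIC_LEN);
    put_be32(out + HDR_MAGIC_LEN, page_size);
    put_be32(out + HDR_MAGIC_LEN + 4, dir_size);
}

/* Read the sizes back.  A missing or unrecognized record means a
 * standard sdbm file, so the defaults are returned. */
static void hdr_decode(const unsigned char *val, size_t len,
                       uint32_t *page_size, uint32_t *dir_size)
{
    *page_size = DEFAULT_PAGE_SIZE;
    *dir_size = DEFAULT_DIR_SIZE;
    if (val == NULL || len != HDR_LEN
        || memcmp(val, HDR_MAGIC, HDR_MAGIC_LEN) != 0)
        return;
    *page_size = get_be32(val + HDR_MAGIC_LEN);
    *dir_size = get_be32(val + HDR_MAGIC_LEN + 4);
}
```

A real implementation would still have to validate the decoded sizes
before trusting them.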

Other thoughts, suggestions or observations?

Bill

Re: apr-util/dbm/sdbm page sizes

Posted by Ian Holsman <li...@holsman.net>.
William A. Rowe, Jr. wrote:
> As a few here already know, we used the 'Standard' page size for our
> apr-util SDBM implementation.  This assures that you can modify our
> sdbm files from perl and other tools.
> 
> However, it's pretty clear that 1024 just doesn't cut it when it comes
> to huge objects from ssl caching and other sorts of large blobs.  The
> modssl project (1.3) had tweaked sdbm to take larger page sizes, but
> this leads to an inflexible implementation (another fixed page size, but
> incompatible with other implementations.)
> 
> I'd like to see apr-util support alternate sized dbm pages, but without
> making fixed assumptions about those sizes (other than this - the 
> structure uses short's for offsets, so the page size does have a cap.)
> 
> The attached patch allows the SDBM to be of any arbitrary data and
> directory page sizes.  However, I've not hacked in any method of
> declaring what those sizes should be (they are the 'standard' defaults
> at the moment.)
> 
> What do folks believe is the most rational approach to expanding support 
> to create an sdbm with alternate page sizes?  And how best can we embed 
> that info into the sdbm so that later accesses to the file use the correct
> page size?  My current thought is to embed a NULL-key record at the head
> of the file, with a specific known value element containing some identifier
> string such as "APR-SDBM" followed by two ints, the page and data len.
> If that record is missing (or created with default sizes), the file is treated 
> as a standard sdbm with 1024/4096 sized pages.
> 
> Other thoughts, suggestions or observations?
> 
> Bill
> 
+1. SDBM as currently implemented is also next to useless with a large
number of records.
You'll need to modify the apr_dbm_open call so it can pass a page size (so
you can create the sucker).

I'm still curious why people don't just use Berkeley DB (except for
licensing issues, I guess).

Are there any other BSD-style DBM formats we could use?


Re: [patch] Re: apr-util/dbm/sdbm page sizes

Posted by Greg Stein <gs...@lyra.org>.
On Tue, Oct 21, 2003 at 02:12:12PM -0500, William A. Rowe, Jr. wrote:
> For whatever reason, part of the original patch was incorrect.  Here's
> the update against apr-util HEAD...
> 
> Again there is no storage/recovery of these page size parameters yet,
> and we have yet to define an API for opening/creating an SDBM with
> such alternatives.  It should be obvious in the open call where I've hacked
> in our default values, for anyone to experiment with.
> 
> Immediate beneficiaries include mod_ssl and dav, where pages may need
> to be much larger than the default.

This patch isn't really a benefit for mod_dav. Unfortunately, mod_dav's
pool usage isn't very good. As a result, you'll end up opening a DBM for
each resource that it reports properties for. Since those go into a pool
(and that pool isn't cleared), you effectively have unbounded
memory usage. By default, you can't do a Depth:infinity request, but you
*can* configure it to allow it, or you can simply have a directory with a
gazillion files. Regardless, the old form of apr_sdbm which used
malloc/free would at least toss out memory at apr_dbm_close() time.
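
The difference can be shown with a toy model. This is only a sketch:
toy_pool and the churn_* functions are made-up names, and the "pool"
is a trivial stand-in that releases memory only when cleared, not the
APR pool API.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a pool (NOT the APR pool API): memory handed out
 * here is only released when the whole pool is cleared. */
typedef struct toy_block { struct toy_block *next; } toy_block;
typedef struct { toy_block *head; size_t live_bytes; } toy_pool;

static void *toy_palloc(toy_pool *p, size_t n)
{
    toy_block *b = malloc(sizeof(*b) + n);
    if (b == NULL)
        return NULL;
    b->next = p->head;
    p->head = b;
    p->live_bytes += n;
    return b + 1;               /* payload follows the list header */
}

static void toy_pool_clear(toy_pool *p)
{
    while (p->head != NULL) {
        toy_block *next = p->head->next;
        free(p->head);
        p->head = next;
    }
    p->live_bytes = 0;
}

/* Open and close n databases whose page buffer comes from the pool.
 * Close has nothing it can give back, so the bytes pile up. */
static size_t churn_with_pool(toy_pool *p, size_t page_size, int n)
{
    int i;
    for (i = 0; i < n; i++)
        (void)toy_palloc(p, page_size);
    return p->live_bytes;
}

/* Same workload with malloc/free: each close releases its buffer. */
static size_t churn_with_malloc(size_t page_size, int n)
{
    size_t live = 0;
    int i;
    for (i = 0; i < n; i++) {
        void *buf = malloc(page_size);
        if (buf == NULL)
            break;
        live += page_size;      /* open */
        free(buf);
        live -= page_size;      /* close */
    }
    return live;
}
```

In this model the pool variant's live bytes grow with the number of
resources visited until the pool is cleared, while the malloc/free
variant returns to zero after every close. Larger page sizes make the
first case worse.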

I'd suggest that (for now) the patch stick with malloc/free, pending a
fix to mod_dav's propdb pool usage.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: [patch] Re: apr-util/dbm/sdbm page sizes

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 08:07 AM 10/22/2003, Jim Jagielski wrote:
>++1 on concept and I'll be actually doing some
>testing soon.
>
>One problem is just that mod_ssl uses the default APR
>dbm which is the (non-optimal) sdbm. A cool fix would
>be to allow SSLSessionCache to pick out the underlying
>dbm implementation, since we have those hooks in APR
>anyway. But error detection/correction would be
>interesting because of the linked nature of those
>dbm libs.

+1 on supporting apr_dbm in mod_ssl - but that's sort of OT
on this list :)  Win32 ships with 'just' sdbm so I remain just
as concerned about sdbm.

>As far as storage of that "private" information regarding
>sizes, one thought would be to place them into the
>APR dbm datum types... Looking into that now :)

That could be fine.  However, I'm mostly puzzled by where we
could persist the two block size variables in page 0 of one of
the files (.pag or .dir).  If we found a way (it's non-trivial, because
the sdbm format is trivial and we don't want to touch many
lines of code) then apr_sdbm would know the page sizes once
the file is initially created.



Re: [patch] Re: apr-util/dbm/sdbm page sizes

Posted by Jim Jagielski <ji...@jagunet.com>.
++1 on concept and I'll be actually doing some
testing soon.

One problem is just that mod_ssl uses the default APR
dbm which is the (non-optimal) sdbm. A cool fix would
be to allow SSLSessionCache to pick out the underlying
dbm implementation, since we have those hooks in APR
anyway. But error detection/correction would be
interesting because of the linked nature of those
dbm libs.

As far as storage of that "private" information regarding
sizes, one thought would be to place them into the
APR dbm datum types... Looking into that now :)


[patch] Re: apr-util/dbm/sdbm page sizes

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
For whatever reason, part of the original patch was incorrect.  Here's
the update against apr-util HEAD...

Again there is no storage/recovery of these page size parameters yet,
and we have yet to define an API for opening/creating an SDBM with
such alternatives.  It should be obvious in the open call where I've hacked
in our default values, for anyone to experiment with.

Immediate beneficiaries include mod_ssl and dav, where pages may need
to be much larger than the default.

Bill
