Posted to dev@httpd.apache.org by Nick Kew <ni...@webthing.com> on 2007/03/23 13:24:53 UTC

DAV and lazy evaluation

I'm developing a DBD-based DAV backend.

I've been trying to use lazy evaluation for efficiency.
But there are obstacles in the way.

I've just added an SQL query to my get_resource method
just to determine whether the resource is a collection.
I don't think that should be necessary: not every request
needs that information about every resource.  And it's
very expensive in an operation such as directory listing.
This could be fixed if the collection field of dav_resource
was a method rather than just a field.

I don't want to propose a bunch of tiny changes like this,
but I'm looking towards a possible review of mod_dav.
Meanwhile, anyone BTDT and have insights to share?

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/

Re: DAV and lazy evaluation

Posted by Nick Kew <ni...@webthing.com>.
On Sat, 24 Mar 2007 09:11:46 -0700
"Justin Erenkrantz" <ju...@erenkrantz.com> wrote:

> On 3/23/07, Nick Kew <ni...@webthing.com> wrote:
> >
> > I'm developing a DBD-based DAV backend.
> >
> > I've been trying to use lazy evaluation for efficiency.
> > But there are obstacles in the way.
> >
> > I've just added an SQL query to my get_resource method
> > just to determine whether the resource is a collection.
> > I don't think that should be necessary: not every request
> > needs that information about every resource.  And it's
> > very expensive in an operation such as directory listing.
> > This could be fixed if the collection field of dav_resource
> > was a method rather than just a field.
> 
> Eww - that would be a *major* change to all downstream WebDAV
> providers with no possibilities of caching it (unless you add a
> 'hidden' field - which defeats your purpose entirely! - but still
> there's an inappropriate function-call overhead).  Why can't your SQL
> query return that as part of the initial existence test?

The simple usage case where lazy evaluation would help is
in listing a collection.  In terms of a standard filesystem
backend, lazy evaluation could be opendir/readdir, while
setting the collection flag implies a stat() on each entry.
Substitute child nodes in a database, and the stat() becomes
a database lookup.
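
A minimal sketch of the filesystem analogy above, assuming a POSIX
system. The function names list_names_only and count_subdirs are
hypothetical and are not part of mod_dav or APR. The first walks a
directory with opendir/readdir only; the second adds the per-entry
stat() that an eagerly set collection flag would imply.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Skip the "." and ".." pseudo-entries. */
static int is_dot_entry(const char *name)
{
    return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}

/* Lazy listing: names only, no stat() per entry.
 * Returns the number of entries, or -1 if the directory cannot be opened. */
int list_names_only(const char *dir)
{
    DIR *d;
    struct dirent *e;
    int count = 0;

    assert(dir != NULL);
    d = opendir(dir);
    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        if (!is_dot_entry(e->d_name))
            count++;
    }
    closedir(d);
    return count;
}

/* Eager listing: one stat() per entry to learn whether it is a directory.
 * Returns the number of subdirectories, or -1 on open failure. */
int count_subdirs(const char *dir)
{
    DIR *d;
    struct dirent *e;
    struct stat sb;
    char path[4096];
    int subdirs = 0;

    assert(dir != NULL);
    d = opendir(dir);
    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        int n;
        if (is_dot_entry(e->d_name))
            continue;
        n = snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;               /* path too long for the buffer: skip */
        if (stat(path, &sb) == 0 && S_ISDIR(sb.st_mode))
            subdirs++;
    }
    closedir(d);
    return subdirs;
}
```

In a database-backed provider, the stat() call in the second function
would correspond to one extra query per child node.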

>  There really
> shouldn't be much overhead here.  My hunch without having *any*
> details provided by you whatsoever is that your SQL schema is
> inappropriate - there are too many instances where the caller needs to
> know whether it's a collection or a specific file.  (For a directory
> listing, it'd almost certainly *need* to know whether the entry is
> itself a directory or not.)

In the case of a recursive listing, that is of course necessary,
and lazy evaluation will do the job.  But I'm not convinced it is
or should be necessary for every listing.  If I run 'ls -l' on a
filesystem, I'm asking for that stat() information.  But if I run
plain 'ls', then it can be lazier.

> > I don't want to propose a bunch of tiny changes like this,
> > but I'm looking towards a possible review of mod_dav.

This could, for example, work by making collection an
enum { YES, NO, DONTKNOW } and calling an evaluation function
in the case of DONTKNOW.
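
A rough sketch of that idea, under stated assumptions: the type
lazy_resource, the enum constants, and the functions
lazy_is_collection and demo_eval are all hypothetical, and none of
this reflects the actual dav_resource layout. The evaluation function
stands in for a database lookup and is called at most once per
resource, since the answer is cached in the tri-state field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tri-state replacing a plain boolean collection flag. */
typedef enum { COLL_NO, COLL_YES, COLL_DONTKNOW } coll_state;

typedef struct lazy_resource lazy_resource;

/* Backend callback: returns nonzero if the resource is a collection. */
typedef int (*coll_eval_fn)(const lazy_resource *res);

struct lazy_resource {
    const char  *uri;
    coll_state   collection;  /* COLL_DONTKNOW until first asked */
    coll_eval_fn eval;        /* e.g. a database query in a real backend */
    int          lookups;     /* how often eval ran (illustration only) */
};

/* Resolve the flag on demand and cache the answer in the resource. */
int lazy_is_collection(lazy_resource *res)
{
    assert(res != NULL);
    if (res->collection == COLL_DONTKNOW) {
        assert(res->eval != NULL);
        res->lookups++;
        res->collection = res->eval(res) ? COLL_YES : COLL_NO;
    }
    return res->collection == COLL_YES;
}

/* Stand-in for the expensive lookup: a trailing '/' marks a collection. */
int demo_eval(const lazy_resource *res)
{
    size_t n = strlen(res->uri);
    return n > 0 && res->uri[n - 1] == '/';
}
```

Callers that never ask about the collection status never trigger the
lookup, and existing providers could keep setting YES or NO up front.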

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/

Re: DAV and lazy evaluation

Posted by Greg Marr <gr...@alum.wpi.edu>.
At 12:11 PM 3/24/2007, Justin Erenkrantz wrote:
>>I don't want to propose a bunch of tiny changes like this,
>>but I'm looking towards a possible review of mod_dav.
>>Meanwhile, anyone BTDT and have insights to share?
>
>What is BTDT?  -- justin

Been There, Done That


Re: DAV and lazy evaluation

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On 3/23/07, Nick Kew <ni...@webthing.com> wrote:
>
> I'm developing a DBD-based DAV backend.
>
> I've been trying to use lazy evaluation for efficiency.
> But there are obstacles in the way.
>
> I've just added an SQL query to my get_resource method
> just to determine whether the resource is a collection.
> I don't think that should be necessary: not every request
> needs that information about every resource.  And it's
> very expensive in an operation such as directory listing.
> This could be fixed if the collection field of dav_resource
> was a method rather than just a field.

Eww - that would be a *major* change to all downstream WebDAV
providers with no possibilities of caching it (unless you add a
'hidden' field - which defeats your purpose entirely! - but still
there's an inappropriate function-call overhead).  Why can't your SQL
query return that as part of the initial existence test?  There really
shouldn't be much overhead here.  My hunch without having *any*
details provided by you whatsoever is that your SQL schema is
inappropriate - there are too many instances where the caller needs to
know whether it's a collection or a specific file.  (For a directory
listing, it'd almost certainly *need* to know whether the entry is
itself a directory or not.)

> I don't want to propose a bunch of tiny changes like this,
> but I'm looking towards a possible review of mod_dav.
> Meanwhile, anyone BTDT and have insights to share?

What is BTDT?  -- justin

Re: DAV and lazy evaluation

Posted by Guenter Knauf <fu...@apache.org>.
Hi Nick,

> I'm developing a DBD-based DAV backend.
Do you know about the Catacomb project?
http://catacomb.tigris.org/
They have similar plans, and it would probably make sense to participate there.

From a recent post on the Catacomb mailing list:

Are there any other plans for developing Catacomb in the near future? I could
give you a small roadmap of our development plans:
- mod_dbd and apache 2.2
- full acl support
- advanced versioning
- WebDAV-quota support

greets, Guenter.