Posted to dev@subversion.apache.org by Sander Striker <st...@apache.org> on 2003/06/10 18:44:36 UTC

Speeding up mod_dav_svn

Hi,

While thinking about caching the parsed access file for mod_authz_svn,
a little light went on.  We could keep the fs environment open during
an entire connection, instead of opening/closing it per request.
Typically, a svn client does most of its actions all over one connection
(two at most), so we can probably abuse the connection pool for this.
This could be a major saver.


Sander

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: mod_authz_svn config, WAS: RE: Speeding up mod_dav_svn

Posted by Greg Stein <gs...@lyra.org>.
On Tue, Jun 10, 2003 at 11:25:22PM +0200, Sander Striker wrote:
> > From: Greg Stein [mailto:gstein@lyra.org]
> > Sent: Tuesday, June 10, 2003 11:15 PM
> 
> > On Tue, Jun 10, 2003 at 02:00:28PM -0500, kfogel@collab.net wrote:
> >> "Sander Striker" <st...@apache.org> writes:
> >>> While thinking about caching the parsed access file for mod_authz_svn,
> > 
> > Isn't there a way to get that stuff integrated more tightly with httpd's
> > configuration? It has all the support needed for specifying locations,
> > merging config options, inheritance of config, etc. It kind of sucks to have
> > Yet Another Config Format.
> 
> I know.  But I'm not entirely sure how you'd call back to httpd's authz hooks
> for some arbitrary location.

Ah. Right.  hmm... I won't mention 'subrequest' because then I'd have to
break my own legs (no sense waiting for you to do it).

[ boy, I wish my location tree concept was implemented... ]

> > If you don't want to integrate with Apache's config, then how about using
> > svn_config instead? You're binding against mod_dav_svn, which means that
> > you've got libsvn_subr in your process space anyways. This approach would
> > probably work for svnserve, too.
> 
> I'm using those routines already.  I'm not sure what you are implying.

Brain fart on my part. I saw 'parse_authz_line' and some strchr's and
whatnot, but I didn't read the whole module.  Heh :-)

> > +1 on integrating with httpd.conf
> > +0 on a separate .conf
> 
> Integration isn't going to be a fun enterprise.  For me the most important
> part was to get the code running that makes the calls to check access
> against a repos path.  I'll be working on perfecting that.  If someone wants
> to change the actual is_allowed code I won't stop 'm ;).

Heh. I was just stating opinions :-) ... +0 still means "sure" :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------

Re: Speeding up mod_dav_svn

Posted by "Glenn A. Thompson" <gt...@cdr.net>.
Hey,

Greg Hudson wrote:

>On Wed, 2003-06-11 at 02:18, Glenn A. Thompson wrote:
>
>>Where I got confused is: I was under the impression (falsely I assume)
>>that the uuid came across from the client to the server in every request.
>
>That can't be right.  For an initial checkout, the client has no idea
>what the repository's uuid is.
>
I did say (falsely I assume) :-)
And I'm not advocating its use.  At this point, I'm just trying to 
figure out what it's currently used for.

>
>>I went and looked at mod_dav_svn and I see clearly that using the
>>repository path as the key makes sense.
>
>Did you see my objection that multiple paths could yield the same
>repository?
>
Whoops! I found it in my trash.  Sorry.  I'm not sure how I did that.

Provided we use "fs_path" after the SVNParentPath logic, I don't see how 
any Apache config directives could cause problems.
I do see the symlinks issue.  But all that means is that we would have 
two repos objects for the same repos.  It's a waste but it doesn't hurt 
anything.  It would have to be two different paths on the same 
connection to have an impact.

gat


---------------------------------------------------------------------

Re: Speeding up mod_dav_svn

Posted by Greg Hudson <gh...@MIT.EDU>.
On Wed, 2003-06-11 at 02:18, Glenn A. Thompson wrote:
> Where I got confused is: I was under the impression (falsely I assume) 
> that the uuid came across
> from the client to the server in every request. 

That can't be right.  For an initial checkout, the client has no idea
what the repository's uuid is.

> I went and looked at mod_dav_svn and I see clearly that using the 
> repository path as the key makes sense.

Did you see my objection that multiple paths could yield the same
repository?


---------------------------------------------------------------------

Re: Speeding up mod_dav_svn

Posted by "Glenn A. Thompson" <gt...@cdr.net>.
Hey,

>Lastly, if somebody *does* want to tackle a per-conn FS, then it needs to be
>keyed by the repository path (and stop and think before you say "key it by
>the UUID"). A single connection can quite legally talk to multiple
>repositories on the server, so we need to account for that.
>
OK so I'm going to step into it here.  On IRC I did stop and think and I 
still got it wrong:-)

So I'd like some edu-ma-cation please.
And yes, I understand that you can't get a UUID without having the FS open.
Where I got confused is: I was under the impression (falsely, I assume)
that the uuid came across from the client to the server in every request.
So naively, I assumed you could implement the cache like so:
1. Get the uuid from the request.
2. Look for an existing repos object in a hash keyed by uuid.
3. If no repos object is found, open the repos and get the uuid.  Store
   the repos object in the hash with the uuid as key.
4. A subsequent request would go looking for that uuid, find it, and
   reuse it.

I went and looked at mod_dav_svn and I see clearly that using the 
repository path as the key makes sense.  It's needed to open the FS.  So 
it makes sense that it would be the key.
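
A minimal sketch of the path-keyed variant of that cache, written in
Python purely for illustration.  It is not mod_dav_svn code: the Repos
and Connection classes and the example paths are hypothetical stand-ins
for an opened FS and for whatever per-connection storage the server
provides.

```python
class Repos:
    """Hypothetical stand-in for an opened repository/FS handle."""

    open_count = 0  # how many times a repository was actually opened

    def __init__(self, fs_path):
        Repos.open_count += 1
        self.fs_path = fs_path


class Connection:
    """Hypothetical connection object that owns a per-connection cache."""

    def __init__(self):
        # Keyed by repository path, since the path is what is needed
        # to open the FS in the first place.
        self.repos_by_path = {}

    def get_repos(self, fs_path):
        repos = self.repos_by_path.get(fs_path)
        if repos is None:
            # Cache miss: open once and remember it for later requests.
            repos = Repos(fs_path)
            self.repos_by_path[fs_path] = repos
        return repos

    def close(self):
        # Everything cached goes away together with the connection.
        self.repos_by_path.clear()


conn = Connection()
first = conn.get_repos("/var/svn/projA")   # opens the repository
second = conn.get_repos("/var/svn/projA")  # reuses the cached object
other = conn.get_repos("/var/svn/projB")   # a second repository, same connection
```

Two requests for the same path share one object, while a different
path on the same connection gets its own entry.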

But, I'm wondering: What is the UUID actually used for *today*?  

gat




---------------------------------------------------------------------

RE: Speeding up mod_dav_svn

Posted by Sander Striker <st...@apache.org>.
> From: Greg Hudson [mailto:ghudson@MIT.EDU]
> Sent: Tuesday, June 10, 2003 11:18 PM

> On Tue, 2003-06-10 at 17:14, Greg Stein wrote:
> > Lastly, if somebody *does* want to tackle a per-conn FS, then it needs to be
> > keyed by the repository path (and stop and think before you say "key it by
> > the UUID").
> 
> Keying on the UUID is a problem at the moment, since you have to open
> the DB to get at the UUID.  (Pity we implemented it that way.)

I disagree with that last comment.  The uuid table will prove useful
in the future, I am certain of it.
 
> Keying on the path won't work well if two paths can point to the same
> repository (through symlinks or some Apache configuration equivalent).

Keying off of r->uri - r->path_info would work (or the dir argument
passed to create_config, which is the same).  The only way to screw things
up then is symlinks.  Which would be a bad idea to start with.
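
As a rough illustration of "r->uri - r->path_info" (Python used only
for the sketch; the function name and the example URI and path_info
values are made up, not taken from a real request):

```python
def repos_root_uri(uri, path_info):
    """Return uri with the trailing path_info removed.

    If path_info is empty or is not a suffix of uri, uri is returned
    unchanged.
    """
    if path_info and uri.endswith(path_info):
        return uri[: len(uri) - len(path_info)]
    return uri


# Hypothetical values: two requests into the same configured location
# yield the same cache key regardless of the path inside the repository.
key_a = repos_root_uri("/svn/projA/trunk/README", "/trunk/README")
key_b = repos_root_uri("/svn/projA/branches/1.0", "/branches/1.0")
```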
 
> We could key on the device/inode of the repository directory, but that's
> a Unix-specific concept.  I have no idea what apr_stat() puts into the
> device and inode fields of the apr_finfo_t structure.

Whoa, that's a bit drastic ;)


Sander

---------------------------------------------------------------------

Re: Speeding up mod_dav_svn

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2003-06-10 at 17:14, Greg Stein wrote:
> Lastly, if somebody *does* want to tackle a per-conn FS, then it needs to be
> keyed by the repository path (and stop and think before you say "key it by
> the UUID").

Keying on the UUID is a problem at the moment, since you have to open
the DB to get at the UUID.  (Pity we implemented it that way.)

Keying on the path won't work well if two paths can point to the same
repository (through symlinks or some Apache configuration equivalent).

We could key on the device/inode of the repository directory, but that's
a Unix-specific concept.  I have no idea what apr_stat() puts into the
device and inode fields of the apr_finfo_t structure.
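
A small sketch of the device/inode idea, using Python's os.stat() on a
Unix-like system as an assumption.  It says nothing about what
apr_stat() reports; the function name and directory names are
hypothetical.

```python
import os
import tempfile


def repos_identity(path):
    """Return a (device, inode) pair identifying the directory at path.

    os.stat() follows symlinks, so a symlink to a directory yields the
    same pair as the directory itself.
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "repos")
    alias = os.path.join(tmp, "alias")
    other = os.path.join(tmp, "other")
    os.mkdir(real)
    os.mkdir(other)
    os.symlink(real, alias)  # second path to the same directory

    same_repository = repos_identity(real) == repos_identity(alias)
    different_repository = repos_identity(real) != repos_identity(other)
```

Keyed this way, the two paths collapse into one cache entry, which is
what keying on the path string cannot do.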


---------------------------------------------------------------------

Re: Speeding up mod_dav_svn

Posted by Branko Čibej <br...@xbc.nu>.
Greg Stein wrote:

>Lastly, if somebody *does* want to tackle a per-conn FS, then it needs to be
>keyed by the repository path (and stop and think before you say "key it by
>the UUID"). A single connection can quite legally talk to multiple
>repositories on the server, so we need to account for that.
>
Yes, and do take a look at issue 688. We might be able to stone two
birds with one reefer.

-- 
Brane Čibej   <br...@xbc.nu>   http://www.xbc.nu/brane/


---------------------------------------------------------------------

Re: mod_authz_svn config, WAS: RE: Speeding up mod_dav_svn

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2003-06-10 at 17:25, Sander Striker wrote:
> Integration isn't going to be a fun enterprise.  For me the most important
> part was to get the code running that makes the calls to check access
> against a repos path.  I'll be working on perfecting that.  If someone wants
> to change the actual is_allowed code I won't stop 'm ;).

Another conceivable axis of integration is: put the acl configuration in
the repository directory (using the svn_config syntax, probably). 
Define semantics for the file which make sense for all ra layers.  Make
all ra layers implement the acls.

(I know Ben S. has discussed some of this with me and with you, but I
don't think the idea has hit the list, and I don't know how much of that
idea is reflected in your current code.)


---------------------------------------------------------------------

mod_authz_svn config, WAS: RE: Speeding up mod_dav_svn

Posted by Sander Striker <st...@apache.org>.
> From: Greg Stein [mailto:gstein@lyra.org]
> Sent: Tuesday, June 10, 2003 11:15 PM

> On Tue, Jun 10, 2003 at 02:00:28PM -0500, kfogel@collab.net wrote:
>> "Sander Striker" <st...@apache.org> writes:
>>> While thinking about caching the parsed access file for mod_authz_svn,
> 
> Isn't there a way to get that stuff integrated more tightly with httpd's
> configuration? It has all the support needed for specifying locations,
> merging config options, inheritance of config, etc. It kind of sucks to have
> Yet Another Config Format.

I know.  But I'm not entirely sure how you'd call back to httpd's authz hooks
for some arbitrary location.
 
> If you don't want to integrate with Apache's config, then how about using
> svn_config instead? You're binding against mod_dav_svn, which means that
> you've got libsvn_subr in your process space anyways. This approach would
> probably work for svnserve, too.

I'm using those routines already.  I'm not sure what you are implying.
 
> +1 on integrating with httpd.conf
> +0 on a separate .conf

Integration isn't going to be a fun enterprise.  For me the most important
part was to get the code running that makes the calls to check access
against a repos path.  I'll be working on perfecting that.  If someone wants
to change the actual is_allowed code I won't stop 'm ;).


Sander

---------------------------------------------------------------------

Re: Speeding up mod_dav_svn

Posted by Greg Stein <gs...@lyra.org>.
On Tue, Jun 10, 2003 at 02:00:28PM -0500, kfogel@collab.net wrote:
> "Sander Striker" <st...@apache.org> writes:
> > While thinking about caching the parsed access file for mod_authz_svn,

Isn't there a way to get that stuff integrated more tightly with httpd's
configuration? It has all the support needed for specifying locations,
merging config options, inheritance of config, etc. It kind of sucks to have
Yet Another Config Format.

If you don't want to integrate with Apache's config, then how about using
svn_config instead? You're binding against mod_dav_svn, which means that
you've got libsvn_subr in your process space anyways. This approach would
probably work for svnserve, too.

+1 on integrating with httpd.conf
+0 on a separate .conf

> > a little light went on.  We could keep the fs environment open during
> > an entire connection, instead of opening/closing it per request.
> > Typically, a svn client does most of its actions all over one connection
> > (two at most), so we can probably abuse the connection pool for this.
> > This could be a major saver.
> 
> !
> 
> Hmmm, I wonder, was there some reason we didn't do it that way
> originally?

Each request should be considered stateless. It is very hard to scale the
server when you start loading up the per-connection cost. Note: this only
applies in-between requests -- when a request *is* being processed, an FS
will be open, which matches the footprint of a connection-cached FS. The
trick is to figure out the concurrency here. Depending upon your load, the
connection-cache could keep 10 FS's open, while the per-request peaks at
(say) 2 FS's open at a time.

That said, it is also hard to scale up the server when it opens/closes the
FS on each request :-)

Lastly, if somebody *does* want to tackle a per-conn FS, then it needs to be
keyed by the repository path (and stop and think before you say "key it by
the UUID"). A single connection can quite legally talk to multiple
repositories on the server, so we need to account for that.
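
The footprint comparison above can be made concrete with a toy
calculation (Python, for illustration only).  The load is invented: ten
keep-alive connections, each issuing one request over a (start, end)
interval, and each touching a single repository.

```python
def peak_open_per_request(request_intervals):
    """Peak number of FS handles when the FS is opened per request.

    Each request holds one FS from its start time to its end time.
    """
    events = []
    for start, end in request_intervals:
        events.append((start, 1))
        events.append((end, -1))
    # At equal times, process closes (-1) before opens (+1).
    events.sort(key=lambda e: (e[0], e[1]))

    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical load: one request per connection, at most two overlapping.
requests = [(0, 2), (1, 3), (3, 5), (4, 6), (6, 8),
            (7, 9), (9, 11), (10, 12), (12, 14), (13, 15)]

per_request_peak = peak_open_per_request(requests)

# With a connection cache, every connection that has served a request
# keeps its FS open until the connection closes; assume all ten stay up.
connection_cached_open = len(requests)
```

Under these assumptions the per-request approach peaks at 2 open FS's,
while the connection cache holds 10.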

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------

Re: Speeding up mod_dav_svn

Posted by kf...@collab.net.
"Sander Striker" <st...@apache.org> writes:
> While thinking about caching the parsed access file for mod_authz_svn,
> a little light went on.  We could keep the fs environment open during
> an entire connection, instead of opening/closing it per request.
> Typically, a svn client does most of its actions all over one connection
> (two at most), so we can probably abuse the connection pool for this.
> This could be a major saver.

!

Hmmm, I wonder, was there some reason we didn't do it that way
originally?


---------------------------------------------------------------------