Posted to dev@subversion.apache.org by Garrett Rooney <ro...@electricjellyfish.net> on 2006/02/24 18:03:35 UTC

Prototype Big Checkout Blocker

So several of the potential ideas for dealing with disruptively large
checkouts involved writing custom Apache modules.  Paul already posted
his example of a module that reduces the priority of requests to check
out certain URLs, so I thought it might be nice to explore other
possibilities.

Specifically, I want something that can actually block the request
entirely, not just reduce its priority, and it should NOT interfere
with REPORT requests that are not updates/checkouts/exports.  Also,
eventually it would be nice to be able to extend this to support
stopping overly large replay requests while not interfering with
REPORTs to these URLs that are not resource intensive.

What I've got here is a prototype for mod_limit_svn.  It's implemented
as an Apache input filter (which is apparently a reasonably rare thing
to do; I actually had trouble finding good examples of this stuff,
since most people seem to do their filtering on the output side, not
the input side) that parses the body of REPORT requests looking for
parameters that are problematic.

Right now, all it does is block checkouts of the root of the
repository, but eventually I want to extend it so that the list of
paths it blocks is configurable.  It's also smart enough to avoid
blocking "safe" uses of the update-report, like diff (of course, that
means it's not blocking potentially unsafe things like 'switch', but
we are limited by the information provided by the protocol).
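
To make the checks described above concrete, here is a minimal Python
sketch of the decision, not the module's actual C code.  It assumes the
update-report body carries src-path, dst-path and entry elements in the
"svn:" namespace, and that a fresh checkout is marked by an entry with
start-empty="true".  Those wire-format details, the function name
is_blocked_checkout and the repository URL passed in by the caller are
assumptions for illustration; none of them come from this post.

```python
import xml.etree.ElementTree as ET

SVN_NS = "svn:"  # assumed namespace URI for the report elements


def _tag(name):
    """Build the namespace-qualified tag name ElementTree uses."""
    return "{%s}%s" % (SVN_NS, name)


def is_blocked_checkout(body, repo_root_url):
    """Return True if body looks like a checkout of the repository root.

    Hypothetical helper: the element and attribute names are assumptions
    about the update-report format, not taken from mod_limit_svn.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False  # malformed XML: let the server reject it itself

    # Only update-reports are of interest; other REPORTs pass through.
    if root.tag != _tag("update-report"):
        return False

    children = list(root)

    # A dst-path is assumed to mean diff or switch, which is left alone.
    if any(child.tag == _tag("dst-path") for child in children):
        return False

    # Compare the requested source path with the repository root.
    src = ""
    for child in children:
        if child.tag == _tag("src-path"):
            src = (child.text or "").strip()
    if src.rstrip("/") != repo_root_url.rstrip("/"):
        return False

    # An entry starting from an empty tree is assumed to mean checkout.
    return any(
        child.tag == _tag("entry") and child.get("start-empty") == "true"
        for child in children
    )
```

Under these assumptions a report with a dst-path, or one whose src-path
is a subdirectory, is passed through untouched, which mirrors the
diff-is-allowed, switch-is-not-blocked trade-off described above.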

There are a number of problems with it.  The XML parsing is naive,
with no handling of namespaces at all, so it would be easy to work
around it from the client side by simply changing your namespace
prefix.  The actual parsing of the report XML is not as robust as it
could be.  The error it sends back to the user is pretty cryptic
(although amusing if you're a Seinfeld fan), and that could be
improved.  But in the end, it does seem to block the really painful
checkouts, with the only price paid being a bit of XML parsing
overhead to look for them.
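
The namespace problem can be illustrated with a small Python sketch.
The naive check below is a stand-in for prefix-based matching in
general, not the prototype's actual code, and the "svn:" namespace URI
and both function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def naive_is_update_report(body):
    """Prefix-based check: only matches when the client uses 'S:'."""
    return b"<S:update-report" in body


def ns_aware_is_update_report(body):
    """Namespace-aware check: matches whatever prefix the client picks."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False
    # ElementTree expands prefixes, so the tag is '{namespace-uri}local'.
    return root.tag == "{svn:}update-report"


# The same report written with two different prefixes.
usual = b'<S:update-report xmlns:S="svn:"></S:update-report>'
renamed = b'<X:update-report xmlns:X="svn:"></X:update-report>'
```

With these two bodies, the naive check accepts only the first, while
the namespace-aware check treats both as the same element, so renaming
the prefix no longer gets a request past the filter.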

I don't consider this a "final" solution to this sort of problem, as I
would much prefer something integrated into Subversion itself so that
it can work with svnserve as well as mod_dav_svn, but Apache lets us
do some neat stuff, so I figured having a prototype to play with the
ideas would be interesting.  Also, for users that have this problem
now, this would be something that could be rolled out fairly quickly,
without having to wait for Subversion 1.4.x.

-garrett

Re: Prototype Big Checkout Blocker

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 2/24/06, Justin Erenkrantz <ju...@erenkrantz.com> wrote:
> On 2/24/06, Bruce Elrick <br...@gmail.com> wrote:
> > Indeed, abuse of HTTP response codes if I ever saw it! ;-)
>
> 403 might be slightly better if we're to pick nits. (The server is
> forbidding itself from generating the response.)  But, very few
> programs care what the specific 4xx code is if it isn't 404 or 401.
>
> I do like the status line though.  ;-)  Keeping it distinctive may let
> it be easier for clients to figure out what a 403 in this case means
> when they put it into Google as ra_dav will print the status line in
> the error displayed to the user.  ra_serf is likely to wind up doing
> the same; now, it'd just abort().  ;-)  -- justin

And here's an updated version that includes fixes to all the problems
Justin pointed out to me.  Still needs to be fixed to pull a list of
paths to block from the config file though.

-garrett

Re: Prototype Big Checkout Blocker

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On 2/24/06, Bruce Elrick <br...@gmail.com> wrote:
> Indeed, abuse of HTTP response codes if I ever saw it! ;-)

403 might be slightly better if we're to pick nits. (The server is
forbidding itself from generating the response.)  But, very few
programs care what the specific 4xx code is if it isn't 404 or 401.

I do like the status line though.  ;-)  Keeping it distinctive may let
it be easier for clients to figure out what a 403 in this case means
when they put it into Google as ra_dav will print the status line in
the error displayed to the user.  ra_serf is likely to wind up doing
the same; now, it'd just abort().  ;-)  -- justin


Re: Prototype Big Checkout Blocker

Posted by Bruce Elrick <br...@gmail.com>.
On 2/24/06, Garrett Rooney <ro...@electricjellyfish.net> wrote:
>
>  The error it sends back to the user is pretty cryptic
> (although amusing if you're a Seinfeld fan), and that could be
> improved.  But in the end, it does seem to block the really painful
> checkouts, with the only price paid being a bit of XML parsing
> overhead to look for them.
>
Indeed, abuse of HTTP response codes if I ever saw it! ;-)


--
Bruce Elrick
bruce.elrick@gmail.com