Posted to dev@subversion.apache.org by Garrett Rooney <ro...@electricjellyfish.net> on 2006/04/08 00:27:53 UTC

Limiting access to replay in 1.4

One thing I'd like to see resolved before 1.4 goes out the door is the
question about providing a way to limit access to replay
functionality.  The argument is that replay and svnsync encourage
users to put a rather high amount of load on a system, so we should
provide users with a way to either turn it off, or hopefully just turn
it off for part of the system.  For example you might want to limit
the ability to run replay over the entire repository to specific
users, but allow anyone interested to replay subsets of it (i.e. a
specific branch).

Note that replay is not the only thing vulnerable to this problem,
update can be just as dangerous if applied correctly.  So ideally we'd
want some sort of system that lets us limit access to all sorts of
things, not just replay.

The last time I brought some of this up, I also posted an apache
module that allowed you to keep people from doing silly things like
checking out the root of the repository.  This could fairly easily be
extended to support more things, like replay, but obviously it is
limited to mod_dav_svn servers, and a more generic approach would be
nice.

Alternatively, we could put hook scripts in place to allow users to
control such things.  A pre-replay hook would allow you to keep people
from replaying large chunks of your repository, a pre-update hook
could stop checkouts of really big trees, etc.  But do we really want
to be calling hook scripts for this sort of thing?  Note that there is
some demand for this stuff, as a pre-checkout hook script patch has
been sent in already by one of our users.
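
As a rough illustration of the hook idea, here is a minimal sketch of what
a pre-replay hook might look like.  No such hook exists in Subversion; the
hook name, its calling convention (REPOS USER BASE-PATH) and the user list
are all assumptions invented for this example.

```python
"""Sketch of a hypothetical pre-replay hook (assumed name and arguments)."""
import sys

# Hypothetical policy: only these users may replay the whole repository.
FULL_REPLAY_USERS = {"mirror-admin"}


def replay_allowed(user, base_path):
    """Return True if `user` may replay history rooted at `base_path`."""
    # Treat "", "/" and "//" alike: they all mean the repository root.
    is_root = base_path.strip("/") == ""
    if is_root:
        return user in FULL_REPLAY_USERS
    # Anyone may replay a subset, for example a single branch.
    return True


def main(argv):
    # Assumed calling convention: pre-replay REPOS USER BASE-PATH
    if len(argv) != 4:
        sys.stderr.write("usage: pre-replay REPOS USER BASE-PATH\n")
        return 2
    _repos, user, base_path = argv[1:]
    if replay_allowed(user, base_path):
        return 0
    sys.stderr.write("replay of the entire repository is restricted\n")
    return 1  # non-zero exit rejects the operation, as with existing hooks

# A real hook script would finish with: sys.exit(main(sys.argv))
```

The decision is kept in a plain function so the same policy could be
reused from a hook, an Apache module, or libsvn_repos.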

Finally, we could bake support for this stuff right into libsvn_repos,
but I suspect such a thing would grow into a rather complex
undertaking due to the scope of the problem, and I'm not sure we want
to go there.

I'm not sure what the right solution is, but I'd like to come up with
an acceptable one before we branch 1.4.x.  Personally at the moment
I'm leaning towards either hook scripts or a cleaned up version of my
apache module, but that's just me.

Thoughts?

-garrett

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Limiting access to replay in 1.4

Posted by Jonathan Gilbert <o2...@sneakemail.com>.

At 08:28 PM 07/04/2006 -0700, you wrote:
>On 4/7/06, C. Michael Pilato <cm...@collab.net> wrote:
>> Hook scripts have the feature of being RA-insensitive.  I like that.  Alot.
>
>Hook scripts have the feature of being very very slow.  I hate that.
>Alot.  ;-)
>
>Seriously, I think forcing a shell invocation before any read ops
>would be a serious killer of performance - and we're trying to avoid
>performance quagmires here.  I wouldn't want to run a shell script on
>*every* SVN request off svn.apache.org.  Yikes.  -- justin

I haven't been following this thread exceedingly closely, but could
whatever checking is needed be done more efficiently with a native binary?
I don't think there's any rule that says hooks must be shell scripts :-)

Jonathan

Re: Limiting access to replay in 1.4

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On 4/7/06, Justin Erenkrantz <ju...@erenkrantz.com> wrote:

> My strawman is that we treat this as an authz mode - think Unix
> non-inherited execute perms on dirs.  For simplicity's sake, it probably
> should actually be the *opposite* of x in Unix (if it's set, don't
> allow this dir to be checked out), but so you can see where I'm
> going...like so:
>
> [/]
> * = rx
> admins = rw
>
> [/project]
> * = x
> committers = rwx
>
> [/project/branches]
> * = x
>
> [/*/tags]
> * = x
>
> I forget if mod_authz_svn will expand regexes in the config file...(it
> could, I guess)
>
> The authz checks we already do would do the right thing if we extended
> it to represent these semantics and checked for this when driving the
> update editor - have it send 'absent-directory' to the client...

Ok, so this seems to be a reasonable approach, so I decided to
investigate a bit and see how it would be implemented.

Actually adding a concept of 'x' permissions to the authz format is
simple, you can just reserve some more bits in
svn_repos_authz_access_t for the purpose.
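
To make the "reserve some more bits" idea concrete, here is a small
sketch in Python rather than C.  The class is only a stand-in for
svn_repos_authz_access_t: the numeric values are illustrative, and the
NO_CHECKOUT bit and the parse_access helper are hypothetical names made up
for this example.

```python
from enum import IntFlag


class AuthzAccess(IntFlag):
    """Illustrative stand-in for svn_repos_authz_access_t (values assumed)."""
    NONE = 0
    READ = 1
    WRITE = 2
    RECURSIVE = 4
    # Hypothetical new bit for the strawman 'x' flag: "no checkout here".
    NO_CHECKOUT = 8


def parse_access(spec):
    """Turn an authz value such as 'rx' into a combination of flag bits."""
    letters = {
        "r": AuthzAccess.READ,
        "w": AuthzAccess.WRITE,
        "x": AuthzAccess.NO_CHECKOUT,
    }
    access = AuthzAccess.NONE
    for ch in spec:
        access |= letters[ch]  # raises KeyError on an unknown letter
    return access
```

With this, `AuthzAccess.NO_CHECKOUT in parse_access("rx")` is the kind
of test the update and replay code paths would make before proceeding.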

The more complicated issue is figuring out where to make the call to
svn_repos_authz_check_access.

As usual, with svnserve it's not all that big a deal, we can just
stick them inside the update and replay functions, since authz is
handled inline for that code anyway, and enough information is
available to determine that something is a checkout/export at that
point.

Adding support to mod_authz_svn is more tricky.  We don't actually
know that we're doing a replay or checkout at the point where we do
most of the authz checking, so we'll have to jump through hoops,
attaching an input filter that looks at the REPORT body and figures
out if it's a checkout or replay.  Not impossible (I mean I already
wrote that code in mod_limit_svn), but not simple either.

So before I run down this path further, do people think it's worth
pursuing?  In the meantime I want to investigate the hook script idea,
and see if the performance implications are actually noticeable.

-garrett

RE: Limiting access to replay in 1.4

Posted by Lieven Govaerts <lg...@mobsol.be>.

[..]
> 
> I forget if mod_authz_svn will expand regexes in the config 
> file...(it could, I guess)
> 
This is a feature I'm just about to start on (not regexes, PCRE wildcards).

Lieven.


Re: Limiting access to replay in 1.4

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

On 4/7/06, Garrett Rooney <ro...@electricjellyfish.net> wrote:
> Well, we don't have to execute the hook for each path, we could send
> them in on stdin or something like that.

We have to pay attention to stdin deadlock - but, yah, that should work.

> Or perhaps a subset of the
> information could be useful.  I mean for checkouts or exports the
> important thing is the source path, you don't really have anything
> else.  Do we really NEED every path passed to the reporter to make a
> useful decision?

I think so, depending upon whether we want to force the admin to manually
put in every parent path to block just one dir or require the hook
script to have logic to block all ancestors of those in the excluded
list.

Consider the following case where an admin wants to block a checkout
on just /project/branches/ and /project/tags - the admin wants to
force users to do a checkout of the dir underneath that (i.e. just a
single branch or a tag).  However, the admin also needs to block / and
/project/ because if a malicious user specified /project, they'd route
around the blockage.

Without checking the child paths (sucks for SVN as this degrades to
authz and hooks suck at this) or their parents (sucks for the hook
script as it's generally getting more complicated than just a simple
script), you can accidentally open a hole the admin thought they
closed.
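
The "block the parents too" logic described above can be sketched as
follows.  This is only an illustration of the ancestor expansion a hook
(or the server) would need; the function names are made up for the
example.

```python
import posixpath


def normalise(path):
    """Canonical form of a repository path: leading slash, no trailing one."""
    return posixpath.normpath("/" + path.strip("/"))


def ancestors(path):
    """Yield every proper ancestor of a repository path, ending with '/'."""
    path = normalise(path)
    while path != "/":
        path = posixpath.dirname(path)
        yield path


def blocked_paths(excluded):
    """Expand an exclusion list so it also covers all ancestors."""
    blocked = set()
    for path in excluded:
        blocked.add(normalise(path))
        blocked.update(ancestors(path))
    return blocked


def checkout_allowed(path, excluded):
    """A checkout is refused if its root is excluded or is an ancestor of
    an excluded directory (it would pull the excluded directory in)."""
    return normalise(path) not in blocked_paths(excluded)
```

For the exclusion list /project/branches/ and /project/tags, this blocks
/ and /project as well, while a single branch underneath stays allowed.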

My strawman is that we treat this as an authz mode - think Unix
non-inherited execute perms on dirs.  For simplicity's sake, it probably
should actually be the *opposite* of x in Unix (if it's set, don't
allow this dir to be checked out), but so you can see where I'm
going...like so:

[/]
* = rx
admins = rw

[/project]
* = x
committers = rwx

[/project/branches]
* = x

[/*/tags]
* = x

I forget if mod_authz_svn will expand regexes in the config file...(it
could, I guess)

The authz checks we already do would do the right thing if we extended
it to represent these semantics and checked for this when driving the
update editor - have it send 'absent-directory' to the client...  --
justin

Re: Limiting access to replay in 1.4

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On 4/7/06, Justin Erenkrantz <ju...@erenkrantz.com> wrote:
> On 4/7/06, Garrett Rooney <ro...@electricjellyfish.net> wrote:
> > And I also worry about that.  I'm kind of curious how noticeable things
> > would be though.  I mean it wouldn't actually be EVERY read request,
> > just the significant reports really, but still.  I think it's worth
> > prototyping and benchmarking though, just to see if it's actually
> > noticeable.
>
> Sure, but I'm thinking of the implications of a hook script on the
> update report more than I am about the replay report.  You'd have to
> execute the hook on every path in the update report to see whether it
> is allowed or not (passing all paths on the command-line would
> overwhelm some OS limits on number of arguments).  It's essentially
> another form of authz with the same 'check every path' semantics.

Well, we don't have to execute the hook for each path, we could send
them in on stdin or something like that.  Or perhaps a subset of the
information could be useful.  I mean for checkouts or exports the
important thing is the source path, you don't really have anything
else.  Do we really NEED every path passed to the reporter to make a
useful decision?
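
A minimal sketch of the "send them in on stdin" variant, assuming a hook
that reads one path per line and signals its decision through the exit
code.  The helper name and the stand-in hook are invented for this
example; paths containing newlines would need a different framing.

```python
import subprocess
import sys


def run_path_hook(hook_argv, paths):
    """Run a hook once, feeding it one path per line on stdin.

    Returns (exit_code, stderr_text).  subprocess.run() with input= uses
    communicate() internally, which writes stdin while draining stdout and
    stderr, so a hook that prints before reading cannot deadlock the caller.
    """
    payload = "".join(p + "\n" for p in paths)
    result = subprocess.run(
        hook_argv, input=payload, capture_output=True, text=True
    )
    return result.returncode, result.stderr


# Stand-in hook for the example: reject if any path is the repository root.
DEMO_HOOK = [
    sys.executable,
    "-c",
    "import sys\n"
    "paths = [line.rstrip('\\n') for line in sys.stdin]\n"
    "sys.exit(1 if '/' in paths else 0)\n",
]
```

This costs one process per report rather than one per path, which is the
point of the suggestion; whether that is cheap enough is still a question
for benchmarking.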

-garrett

Re: Limiting access to replay in 1.4

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

On 4/7/06, Garrett Rooney <ro...@electricjellyfish.net> wrote:
> And I also worry about that.  I'm kind of curious how noticeable things
> would be though.  I mean it wouldn't actually be EVERY read request,
> just the significant reports really, but still.  I think it's worth
> prototyping and benchmarking though, just to see if it's actually
> noticeable.

Sure, but I'm thinking of the implications of a hook script on the
update report more than I am about the replay report.  You'd have to
execute the hook on every path in the update report to see whether it
is allowed or not (passing all paths on the command-line would
overwhelm some OS limits on number of arguments).  It's essentially
another form of authz with the same 'check every path' semantics.  --
justin

Re: Limiting access to replay in 1.4

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On 4/7/06, Justin Erenkrantz <ju...@erenkrantz.com> wrote:
> On 4/7/06, C. Michael Pilato <cm...@collab.net> wrote:
> > Hook scripts have the feature of being RA-insensitive.  I like that.  Alot.

I also like that...

> Hook scripts have the feature of being very very slow.  I hate that.  Alot.  ;-)
>
> Seriously, I think forcing a shell invocation before any read ops
> would be a serious killer of performance - and we're trying to avoid
> performance quagmires here.  I wouldn't want to run a shell script on
> *every* SVN request off svn.apache.org.  Yikes.

And I also worry about that.  I'm kind of curious how noticeable things
would be though.  I mean it wouldn't actually be EVERY read request,
just the significant reports really, but still.  I think it's worth
prototyping and benchmarking though, just to see if it's actually
noticeable.

-garrett

Re: Limiting access to replay in 1.4

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

On 4/7/06, C. Michael Pilato <cm...@collab.net> wrote:
> Hook scripts have the feature of being RA-insensitive.  I like that.  Alot.

Hook scripts have the feature of being very very slow.  I hate that.  Alot.  ;-)

Seriously, I think forcing a shell invocation before any read ops
would be a serious killer of performance - and we're trying to avoid
performance quagmires here.  I wouldn't want to run a shell script on
*every* SVN request off svn.apache.org.  Yikes.  -- justin

Re: Limiting access to replay in 1.4

Posted by "C. Michael Pilato" <cm...@collab.net>.

Garrett Rooney wrote:
> One thing I'd like to see resolved before 1.4 goes out the door is the
> question about providing a way to limit access to replay
> functionality.

[...]

> I'm not sure what the right solution is, but I'd like to come up with
> an acceptable one before we branch 1.4.x.  Personally at the moment
> I'm leaning towards either hook scripts or a cleaned up version of my
> apache module, but that's just me.

Hook scripts have the feature of being RA-insensitive.  I like that.  Alot.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Re: Limiting access to replay in 1.4

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On 4/8/06, Max Bowsher <ma...@ukf.net> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Garrett Rooney wrote:
> > One thing I'd like to see resolved before 1.4 goes out the door is the
> > question about providing a way to limit access to replay
> > functionality.  The argument is that replay and svnsync encourage
> > users to put a rather high amount of load on a system, so we should
> > provide users with a way to either turn it off, or hopefully just turn
> > it off for part of the system.
>
> Ouch.
>
> Whilst clearly it's not good for servers to be hammered into the ground
> by scores of users syncing repositories, it would be a very great shame
> if this nice new feature ended up being turned off on the majority of
> significant Subversion installations.

Personally, I'm just looking for a way to keep people from syncing
entire repositories.  I think it's reasonable for people to keep a
mirror of a specific branch, for example, but I don't want just
anybody to be able to mirror all of svn.apache.org/repos/asf, it's
just asking for trouble.  We have enough trouble as it is with people
accidentally checking out the root of repositories and getting way
more than they expected (with the corresponding load on the server
that results from such actions).  Hopefully this feature can help
avoid both the new potential problems (replay) and the old ones (big
checkouts), while leaving a way for admins to turn it on when
appropriate, hopefully on a per-user basis.

> Perhaps we could consider rate-limiting rather than outright blocking of
> the feature as a DoS avoidance strategy?

I'm open to suggestions as to how such a feature would be implemented.

-garrett

Re: Limiting access to replay in 1.4

Posted by Brian Behlendorf <br...@collab.net>.

My overall sense is that if mirroring of large repositories is a broadly 
desirable option, then it needs to use a mechanism that doesn't place all
the responsibility for bootstrapping new mirrors (or personal mirrors) 
entirely on the central server.  Something like a BitTorrent-esque 
protocol where mirrors can help bootstrap new mirrors would be ideal - 
there's probably even ways to do it without major coding, such as putting 
up large but compressed dumpfiles and using existing BT tools to 
distribute them.  Until then, I think installations should be able to 
limit total resources consumed by people doing mirroring - which could be 
0 in Justin's case.  But even without those, there might be reasonable 
ways to limit:

On Wed, 12 Apr 2006, Molle Bestefich wrote:
> I'm going to get picky with the terms, bear with me..
>
> If set up correctly (e.g. per-user-per-host), rate limiting _does_
> prevent DoS, just not DDoS.

You have to carefully define the rate of _what_ is being limited. 
There's little direct correlation between # of connections, # of 
transactions, # of bytes, I/O load, or CPU time; any one of those could 
pose a problem by a user independently of another, and can even be 
unintentional.  There are now commercial Subversion clients that insist on 
doing repository scans on a periodic basis just to let people know when 
there's something new - and people are setting that to once a minute. 
It's the RSS polling problem writ large, and actually poses more of a 
problem for CollabNet's customers than people doing too much mirroring.

I noticed that a similar commercial system implements rate limiting by 
trying to predict how many objects are going to be read or modified by a 
given operation, and then rejecting requested operations if that list is
larger than some configurable amount (and the end-user isn't in a 
permissions class that allows them to avoid that check).  Such a request 
would be rejected, even though an end-user can accomplish the same thing 
by performing the same update or commit operation with two separate 
commits.  I wonder if it's even possible in SVN to calculate that ahead of 
time; you don't want operations to fail halfway through.  But it can block 
that accidental big usage.
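
The up-front check described above might look roughly like this.  It is
a sketch under stated assumptions: the predicted object count is taken as
an input (whether SVN can compute it ahead of time is the open question),
and the function name and the "bulk" permissions class are hypothetical.

```python
def check_operation_size(predicted_objects, limit, user_classes,
                         exempt_classes=frozenset({"bulk"})):
    """Reject an operation before it starts if it looks too large.

    Returns None when the operation may proceed, or an error message.
    """
    # Users in an exempt permissions class skip the check entirely.
    if exempt_classes & set(user_classes):
        return None
    if predicted_objects > limit:
        return ("operation would touch about %d objects; the limit is %d"
                % (predicted_objects, limit))
    return None
```

Because the check runs before any work is done, an oversized request
fails immediately rather than halfway through.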

For non-accidental usage, the problem starts to look like a standard 
quality of service problem - you want the "critical" operations to have 
precedence over operations that have more tolerance for delay.  By 
definition, operations like mirroring have more tolerance for delay than 
operations like commit, and updating a small working copy should take 
precedence over large.  It's why, for example, we have the concept of 
"MinSpareServers" in Apache httpd - you always want, when possible, a pool 
of servers available to immediately handle the fast requests even if other 
threads are consumed handling more difficult requests.

Can we identify "expensive" operations as those consuming over some 
configurable amount of wall-clock time (counting HTTP persistent
connection), and then lower their priority against operations that have 
not yet exceeded that limit?  Go further and limit it based on a 
combination of IP address (to separately identify the anonymous users) and 
login name (to separately identify those named users all coming from the 
same company).  Something persistent so that a quick cancel and restart
doesn't allow someone to work around the limits.  That identification
should time out so that someone isn't penalized indefinitely.  And while 
it may be romantic to allow the long-termers to duke it out for fixed 
resources, we can't have too many there, just as eventually you might run 
out of MinSpareServers headroom if you hit MaxClients in httpd.
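
A minimal sketch of the bookkeeping this would need, assuming clients are
keyed by (IP address, login name) and that the penalty simply expires
after a fixed time.  The class name, thresholds and the two priority
labels are invented for illustration; how the server would actually
schedule "low" priority work is not covered here.

```python
import time


class ExpensiveClientTracker:
    """Remember which clients recently ran long operations (a sketch)."""

    def __init__(self, threshold_secs=30.0, penalty_secs=600.0,
                 clock=time.monotonic):
        self.threshold_secs = threshold_secs  # wall-clock cutoff for "expensive"
        self.penalty_secs = penalty_secs      # how long the demotion lasts
        self.clock = clock
        self._penalised_until = {}            # (ip, login) -> expiry timestamp

    def record(self, ip, login, elapsed_secs):
        """Call when an operation finishes or is cancelled."""
        if elapsed_secs > self.threshold_secs:
            self._penalised_until[(ip, login)] = self.clock() + self.penalty_secs

    def priority(self, ip, login):
        """Return 'low' while the client is penalised, else 'normal'."""
        expiry = self._penalised_until.get((ip, login))
        if expiry is None:
            return "normal"
        if self.clock() >= expiry:
            del self._penalised_until[(ip, login)]  # penalty has timed out
            return "normal"
        return "low"
```

Recording on cancel as well as on completion is what keeps a quick cancel
and restart from resetting the client's standing.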

That would still reward people who update more selectively or are making 
commits, and still allow those with larger updates or big checkouts to 
eventually get what they want.

Conceptually it's a mirror of "make the easy things easy and the hard 
things possible": "make the quick actions quick and the long-term 
operations eventually finish".

 	Brian

Re: Limiting access to replay in 1.4

Posted by Molle Bestefich <mo...@gmail.com>.

> Why not let the server dynamically decide if rate limiting is required to
> share access to itself among connections?

Unless you can make it unusually bright, a problem with a completely
automagic system is that it's too fair.  Consider e.g. a sudden blast
of anonymous users that each get a fair (but extremely minimal) share
of bandwidth.  The developers would then get the same extremely
minimal amount of bandwidth each.  You really don't want to be that
fair :-).

That said, it might be overkill to implement rate limiting rules in Subversion.
I mean, you can just set up a mirror SVN server and allow anonymous
users access to that instead of your main SVN server.  Wrap that
server with an off-the-shelf box that does TCP load balancing.

On top of that, if you only have one line to your ISP, you'll have to
again limit the anonymous SVN box to use max 75% of the available
total bandwidth.  Not sure if that can be done in the same
hypothetical load balancing box or if you'll need another one.

Either way, one or two boxes shouldn't be too much of an expense for
anyone that actually has a SVN bandwidth problem :-).

Re: Limiting access to replay in 1.4

Posted by Jonathan Gilbert <o2...@sneakemail.com>.

At 07:50 AM 19/04/2006 -0400, Marc Sherman wrote:
>Jonathan Gilbert wrote:
>> 
>> This of course doesn't do anything to stem an intentional denial of
>> service attack (apart from forcing such a malicious person to make
>> many short-lived connections rather than just one long one -- if the
>> number of connections from each IP were itself rate-limited, that
>> could potentially deal with non-distributed DoS attacks), but rather
>> prevents accidental requests from blowing up the server and also
>> allows legitimate long-running requests to proceed at a lower speed
>> without preventing anybody else from effectively using the system.
>
>As an svn admin on a private network, I don't care about intentional DOS
>at all; we've got HR procedures to handle that.  What I do care about is
>people accidentally checking out the root of the repository, and then
>going to get coffee and filling up their own disks -- a rate limiter
>doesn't help with that.  I want a way to configure the server to reject
>checkouts of certain parts of the repository.  These checkouts should be
>allowed with a --force command, as this _isn't_ access control, it's
>there to help people avoid making common mistakes.

I guess what is needed is some sort of flag to 'checkout' that can be
passed to hook scripts on the server side. My understanding is that the
Subversion policy is fairly firm that this sort of thing should *not* be a
part of the Subversion binary itself, but at present, it is a deficiency
(perhaps by design) of the hook system that arguments cannot be passed
directly to it from the *client* side.

I still believe that rate control like I described would effectively
mitigate the problem you are worried about. I mean, users make mistakes,
but they're not all going to make that same mistake at the same time, are
they? One user doing a long-running (and thus rate-limited) checkout of,
say, /source/trunk instead of /source/trunk/specificproject will come back
from getting coffee to find it 10% done and going at only 100 k/s instead
of saturating the network and slowing the server down significantly. At
that point, he/she *should* notice that something is amiss and cancel the
operation. Even if they don't, it doesn't stop anybody else from using the
server at near to full speed in the meantime.

Basically, what you're suggesting would be analogous to an FTP server that
would by default refuse to let users initiate downloads of files larger
than a certain size, unless they ran some custom SITE command first.
Perhaps some people would like that feature; I personally would be annoyed
all to hell by it and might even intentionally download a lot of large
files just to get even with the site operator ;-)

Jonathan Gilbert

Re: Limiting access to replay in 1.4

Posted by Russell Hind <rh...@mac.com>.

Marc Sherman wrote:
> 
> As an svn admin on a private network, I don't care about intentional DOS
> at all; we've got HR procedures to handle that.  What I do care about is
> people accidentally checking out the root of the repository, and then
> going to get coffee and filling up their own disks -- a rate limiter
> doesn't help with that.  I want a way to configure the server to reject
> checkouts of certain parts of the repository.  These checkouts should be
> allowed with a --force command, as this _isn't_ access control, it's
> there to help people avoid making common mistakes.
> 

(sorry if something like this has been pointed out, only just seen this 
last message).

trac (http://projects.edgewall.com/trac) have just added the ability to 
download a zip of any svn folder to their web-based bug tracking system 
with svn integration.

They have added a config option, download_paths which allows you to 
limit which paths in the repo can be downloaded as a zip file, e.g.

; Any directory can be downloaded
download_paths = *

; Limit to individual releases and trunk
download_paths = trunk;releases/*

etc.
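
For comparison, a path filter in the style of that download_paths option
could be sketched like this.  The matching rules are an assumption based
only on the two examples above (semicolon-separated shell-style patterns),
not on Trac's actual implementation; note that fnmatch's "*" also matches
"/", so "releases/*" covers everything below releases/.

```python
from fnmatch import fnmatchcase


def parse_download_paths(value):
    """Split a 'trunk;releases/*' style setting into a list of patterns."""
    return [p.strip() for p in value.split(";") if p.strip()]


def download_allowed(path, patterns):
    """Return True if the repository path matches any allowed pattern."""
    path = path.strip("/")
    return any(fnmatchcase(path, pattern) for pattern in patterns)
```

The same shape of check would work for deciding which directories may be
the root of a checkout.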

Cheers

Russell


Re: Limiting access to replay in 1.4

Posted by Marc Sherman <ms...@projectile.ca>.

Jonathan Gilbert wrote:
> 
> This of course doesn't do anything to stem an intentional denial of
> service attack (apart from forcing such a malicious person to make
> many short-lived connections rather than just one long one -- if the
> number of connections from each IP were itself rate-limited, that
> could potentially deal with non-distributed DoS attacks), but rather
> prevents accidental requests from blowing up the server and also
> allows legitimate long-running requests to proceed at a lower speed
> without preventing anybody else from effectively using the system.

As an svn admin on a private network, I don't care about intentional DOS
at all; we've got HR procedures to handle that.  What I do care about is
people accidentally checking out the root of the repository, and then
going to get coffee and filling up their own disks -- a rate limiter
doesn't help with that.  I want a way to configure the server to reject
checkouts of certain parts of the repository.  These checkouts should be
allowed with a --force command, as this _isn't_ access control, it's
there to help people avoid making common mistakes.

- Marc

Re: Limiting access to replay in 1.4

Posted by Jonathan Gilbert <o2...@sneakemail.com>.

At 07:37 PM 12/04/2006 +0200, Molle Bestefich wrote:
>> Why not let the server dynamically decide if rate limiting is required to
>> share access to itself among connections?
>
>Unless you can make it unusually bright, a problem with a completely
>automagic system is that it's too fair.  Consider e.g. a sudden blast
>of anonymous users that each get a fair (but extremely minimal) share
>of bandwidth.  The developers would then get the same extremely
>minimal amount of bandwidth each.  You really don't want to be that
>fair :-).

How about a sort of "lazy" resource limiter, one which only kicks in once a
client has reached a certain level of usage? That way, short queries always
run quickly because they finish before the limitations kick in. Someone
asking for all versions of the entire tree will get the first N
kilobytes/seconds of CPU time/kilobytes of memory at full speed, but then
when it becomes clear that the request isn't anywhere near finishing, the
limiter starts, well, limiting that particular person. Other people doing
short queries at the same time would be unaffected by the limitations.
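
The arithmetic behind such a "lazy" limiter is simple enough to sketch.
The function and parameter names are made up for this example, and only
bandwidth is modelled; CPU time or memory would follow the same pattern.

```python
def allowed_rate(bytes_sent, burst_bytes, full_rate, limited_rate):
    """Bytes per second a request may use, given how much it has sent."""
    return full_rate if bytes_sent < burst_bytes else limited_rate


def transfer_time(total_bytes, burst_bytes, full_rate, limited_rate):
    """Seconds needed to send total_bytes under the lazy limiter."""
    fast = min(total_bytes, burst_bytes)  # portion sent before the limiter kicks in
    slow = total_bytes - fast             # remainder sent at the limited rate
    return fast / full_rate + slow / limited_rate
```

A request that fits inside the initial allowance never sees the limiter
at all, while a request larger than the allowance pays the reduced rate
only on the excess.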

This of course doesn't do anything to stem an intentional denial of service
attack (apart from forcing such a malicious person to make many short-lived
connections rather than just one long one -- if the number of connections
from each IP were itself rate-limited, that could potentially deal with
non-distributed DoS attacks), but rather prevents accidental requests from
blowing up the server and also allows legitimate long-running requests to
proceed at a lower speed without preventing anybody else from effectively
using the system.

I don't think something like this would be too complicated to implement,
though I could be wrong... :-)

Jonathan Gilbert

Re: Limiting access to replay in 1.4

Posted by Molle Bestefich <mo...@gmail.com>.

Max Bowsher wrote:
> Perhaps we could consider rate-limiting rather than outright blocking of
> the feature as a DoS avoidance strategy?

The problem with rate limiting as I see it would probably be to convey
to users that they're being rate-limited.

You might get a ton of "why is SVN so slow!" outcries on the
subversion mailing lists, when in fact SVN is not in itself slow but
rate limiting is in action.  Be that because the server admin
accidentally set things up wrong, or because the user tried to do
something dumb...

Have you got any ideas how to avoid that scenario?

I think that Subversion should convey to the user that this-and-that
rate-limiting rule is being applied to their request.

Re: Limiting access to replay in 1.4

Posted by Greg Hudson <gh...@MIT.EDU>.

On Sat, 2006-04-08 at 16:05 +0100, Max Bowsher wrote:
> Whilst clearly it's not good for servers to be hammered into the ground
> by scores of users syncing repositories, it would be a very great shame
> if this nice new feature ended up being turned off on the majority of
> significant Subversion installations.

Especially since there is already code out there to circumvent the lack
of the feature in a more resource-intensive way.


Re: Limiting access to replay in 1.4

Posted by Molle Bestefich <mo...@gmail.com>.

Justin Erenkrantz wrote:
> Because what they are trying to do shouldn't be allowed as a matter of policy.

I strongly disagree, but then again we're not administering the same
servers so I guess that figures :).

> > To me it sounds just perfect - allow people to do what they want, but
> > don't let them eat more than so-and-so-many-percent bandwidth (... or
> > cpu, or memory, that would be useful too.)
>
> Rate limiting is still a DoS - albeit much slower.  -- justin

I'm going to get picky with the terms, bear with me..

If set up correctly (e.g. per-user-per-host), rate limiting _does_
prevent DoS, just not DDoS.
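
To make "per-user-per-host" concrete, here is a minimal sketch of a
token bucket keyed on (user, host).  It is not anything Subversion or
mod_dav_svn provides; the names TokenBucket and PerClientLimiter, and
the idea of charging a "cost" per request, are assumptions made up for
illustration.

```python
import time


class TokenBucket:
    """Allow `rate` units per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity      # start with a full bucket
        self.updated = now

    def allow(self, cost, now):
        # Refill according to elapsed time, never beyond capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


class PerClientLimiter:
    """Hypothetical limiter: one independent bucket per (user, host) pair."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def allow(self, user, host, cost=1.0):
        now = self.clock()
        key = (user, host)
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = self.buckets[key] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(cost, now)
```

One greedy client drains only its own bucket, so everyone else is
unaffected.  Many distinct clients each get a full bucket, which is
why this does nothing against a real DDoS.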

And rate limiting can even prohibit what we could call "accidental
DDoS-like occurrences", where many users do something stupid like
check out all the tags by accident.

I think preventing DoS but not DDoS is perfectly fine for the vast
majority of users, since DDoS is generally a much harder problem to
solve and a much less frequently occurring one, at least if
"accidental DoS-like occurrences" are excluded from the count.


Re: Limiting access to replay in 1.4

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

On 4/12/06, Molle Bestefich <mo...@gmail.com> wrote:
> You do not provide any arguments as to why this is a problem.
>
> Upon reading your post, I'm left thinking "bah, who are you to decide
> what 'normal users' should do and not do?".

In this case, I don't care about the users - I care about the server.

> Could you please provide some reasoning why you think rate limiting is
> not a good solution?

Because what they are trying to do shouldn't be allowed as a matter of policy.

> To me it sounds just perfect - allow people to do what they want, but
> don't let them eat more than so-and-so-many-percent bandwidth (... or
> cpu, or memory, that would be useful too.)

Rate limiting is still a DoS - albeit much slower.  -- justin


Re: Limiting access to replay in 1.4

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

On 4/12/06, Michael Brouwer <mb...@gmail.com> wrote:
> The main advantage that the replay API has is that it will make
> SVN::Mirror faster and use less resources on the server than it does
> today.
>
> Since there doesn't seem to be an outcry from users today to prevent
> people from being able to mirror their repos with svk, why limit the
> more efficient new API but leave the slower existing one around?

Two things: Very few people use svk in comparison to our bundled
client, and my intent is to block SVN::Mirror as well by restricting
checkouts and updates rooted in certain directories.
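
As a rough sketch of what "rooted in certain directories" could mean
in practice (the function names, the blocked_roots list and the
exempt_users set are all hypothetical, not an existing Subversion or
mod_dav_svn interface):

```python
import posixpath


def normalize(path):
    """Canonicalize a repository path: leading slash, no trailing slash or '..'."""
    return posixpath.normpath("/" + path.strip("/"))


def is_operation_allowed(request_root, blocked_roots, user=None, exempt_users=()):
    """Refuse a checkout/update/replay rooted exactly at a blocked directory.

    Paths below a blocked root (e.g. /tags/1.0 when /tags is blocked) stay
    allowed; only the overly broad root itself is refused.
    """
    if user in exempt_users:
        return True
    root = normalize(request_root)
    return root not in {normalize(p) for p in blocked_roots}
```

With blocked_roots = ["/", "/tags"], a checkout of /trunk or /tags/1.0
goes through, while a checkout of / or /tags is refused unless the
user is on the exempt list.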

While you are free to allow people to poach off your servers, I have
no intention of permitting that behavior on Apache's SVN
servers.  -- justin


Re: Limiting access to replay in 1.4

Posted by Michael Brouwer <mb...@gmail.com>.

SVK already lets you mirror all revisions of a remote repository
today using SVN::Mirror, with or without the replay API.  And this is
exactly how you would use SVK if you were working with the apache
sources (though you might choose to mirror only a subset of the tree
if you don't need all of it).  Once the initial mirror is done users
using svk will typically present a much smaller load on the server
since they only hit the server for commits and mirrors of new
revisions.

The main advantage that the replay API has is that it will make
SVN::Mirror faster and use less resources on the server than it does
today.

Since there doesn't seem to be an outcry from users today to prevent
people from being able to mirror their repos with svk, why limit the
more efficient new API but leave the slower existing one around?

Michael

On 4/12/06, Molle Bestefich <mo...@gmail.com> wrote:
> Justin Erenkrantz wrote:
> > > Perhaps we could consider rate-limiting rather than outright blocking of
> > > the feature as a DoS avoidance strategy?
> >
> > Rate-limiting isn't the problem - it's that the normal user shouldn't
> > be making a local copy of repositories that have 400000+ revisions.
> > If they even *think* Subversion now encourages this, we're going to be
> > opening a Pandora's box.  -- justin
>
> You do not provide any arguments as to why this is a problem.
>
> Upon reading your post, I'm left thinking "bah, who are you to decide
> what 'normal users' should do and not do?".
>
> Could you please provide some reasoning why you think rate limiting is
> not a good solution?
>
> To me it sounds just perfect - allow people to do what they want, but
> don't let them eat more than so-and-so-many-percent bandwidth (... or
> cpu, or memory, that would be useful too.)

Re: Limiting access to replay in 1.4

Posted by Molle Bestefich <mo...@gmail.com>.

Justin Erenkrantz wrote:
> > Perhaps we could consider rate-limiting rather than outright blocking of
> > the feature as a DoS avoidance strategy?
>
> Rate-limiting isn't the problem - it's that the normal user shouldn't
> be making a local copy of repositories that have 400000+ revisions.
> If they even *think* Subversion now encourages this, we're going to be
> opening a Pandora's box.  -- justin

You do not provide any arguments as to why this is a problem.

Upon reading your post, I'm left thinking "bah, who are you to decide
what 'normal users' should do and not do?".

Could you please provide some reasoning why you think rate limiting is
not a good solution?

To me it sounds just perfect - allow people to do what they want, but
don't let them eat more than so-and-so-many-percent bandwidth (... or
cpu, or memory, that would be useful too.)


Re: Limiting access to replay in 1.4

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

On 4/8/06, Max Bowsher <ma...@ukf.net> wrote:
> Whilst clearly it's not good for servers to be hammered into the ground
> by scores of users syncing repositories, it would be a very great shame
> if this nice new feature ended up being turned off on the majority of
> significant Subversion installations.

Encouraging end-users to sync large repositories is generally not a
good idea.  This feature is like handing people a loaded gun and
helping them point it at our heads.  No thanks.  For Apache, we might
open it up on a per-request basis (whitelist of IPs that can execute
it); but there's no way we're opening it up for everyone.

> Perhaps we could consider rate-limiting rather than outright blocking of
> the feature as a DoS avoidance strategy?

Rate-limiting isn't the problem - it's that the normal user shouldn't
be making a local copy of repositories that have 400000+ revisions. 
If they even *think* Subversion now encourages this, we're going to be
opening a Pandora's box.  -- justin


Re: Limiting access to replay in 1.4

Posted by Max Bowsher <ma...@ukf.net>.

Garrett Rooney wrote:
> One thing I'd like to see resolved before 1.4 goes out the door is the
> question about providing a way to limit access to replay
> functionality.  The argument is that replay and svnsync encourage
> users to put a rather high amount of load on a system, so we should
> provide users with a way to either turn it off, or hopefully just turn
> it off for part of the system.

Ouch.

Whilst clearly it's not good for servers to be hammered into the ground
by scores of users syncing repositories, it would be a very great shame
if this nice new feature ended up being turned off on the majority of
significant Subversion installations.

Perhaps we could consider rate-limiting rather than outright blocking of
the feature as a DoS avoidance strategy?

Max.
