Posted to dev@subversion.apache.org by David Anderson <da...@calixo.net> on 2005/08/23 22:42:39 UTC

[PROPOSAL] Proxy support for svn://

Following events that took place at OSCon 2005 and subsequent 
discussions with yks on the dev channel, I am proposing a design for 
proxy support within the in-house Subversion protocol, as well as plans 
for the associated 'svnproxy' daemon.

ABSTRACT
========

Svnproxy is a daemon that aggregates handles to many real Subversion 
repository URLs (all using the svn RA method), and acts as a middle man, 
forwarding and mangling communications back and forth between clients 
operating on a proxy URL and the real svnserve server the proxy URL maps to.

The new 'proxy' capability added to the ra_svn protocol enables proxy 
servers to announce themselves to proxy-savvy clients, who can then 
respond in the manner appropriate to set up the proxy relaying with 
maximum chances of a successful communication with the proxied server.

MOTIVATION
==========

The obvious: having something that can act as a virtual facade to many 
real (and potentially vastly distributed) repositories, with minimal 
visible changes for the end user.

A typical use case would be to set up the proxy server on a gateway 
between a lan and the internet, to forward client operations to many 
different real servers on the lan.  In other words, a classic proxy setup.

Now, here is a slightly more contrived variation of this use case, 
brought to us by the good folks working on svl.  Svl wants to let users 
who are behind a NATing firewall publish their local repository(ies) on 
a public proxy via a tunnel (ssh port forwarding, whatever).  In this 
case, svnproxy would manage the actual proxy negotiation and protocol 
mangling, let svl deal with setting up the tunnel, and basically just 
define proxy maps that redirect to localhost on nonstandard ports to go 
through the tunnel.

STATE OF THE ART
================

At OSCon US 2005, the svl presentation was to include a live 
demonstration in which people would pull stuff from repositories on 
the internet.  However, the OSCon network setup was firewalling the 
ports on which the svnserve processes had been set up to listen. 
Strangely enough though, the official svnserve port was authorized.

So, one Monday afternoon, yks joined #svn-dev and asked how the svn 
protocol could be tortured into setting up a proxy that would proxy 
requests for svn://proxy_server/real-repos-uuid/path/in/repos to 
svn://localhost:1234/path/in/repos.  I was in the middle of dissecting 
the svn protocol at that time, so we discussed the implications of such 
a setup for a bit, and he then went off to "try something with Perl".

And indeed, for all that I know, the svl presentation featured a live 
demo, with people going via a tiny perl script that masqueraded as 
svnserve and did regular expression mangling of the data flowing through 
it to switch URLs.  As far as I can remember, there is even a slide 
about how they hacked the proxy together in a couple of hours :-).

His implementation works in that it has been tested in public by OSCon 
hackers.  However, given the nature of its operation, it is not 
recommended to breathe too heavily on the whole setup, lest it collapse 
into a quantum singularity and destroy the universe.  The two main 
problems it has are:
  - Blind regular expression mangling of the raw data stream.  No 
interpretation of the data, which means that data other than 
command-parameter URLs could get accidentally mangled.
  - Due to the response-request nature of the svn protocol, the proxy has 
to send a greeting where it lists supported protocol versions and 
capabilities before the client tells it which repository it wants 
to access.  In practice, the perl proxy adopts the ostrich tactic: send a 
v2-protocol-only greeting, with edit-pipeline capability, then connect 
to the proxied server once the client has replied and swallow its 
greeting, praying that its settings are compatible with those announced 
to the client.
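
To make the first problem concrete, here is a minimal Python sketch of 
blind rewriting.  It is not the Perl script used at OSCon; the command 
name and the log message are made up for illustration, and the wire 
format is simplified (as far as I know the real protocol also 
length-prefixes strings, which a blind substitution would not fix up 
either).

```python
import re

# Hypothetical URL roots, taken from the example URLs in this mail.
PROXY_ROOT = "svn://proxy_server/real-repos-uuid"
REAL_ROOT = "svn://localhost:1234"


def blind_rewrite(raw: str) -> str:
    """Replace the proxy root wherever it appears, with no parsing at all."""
    return re.sub(re.escape(PROXY_ROOT), REAL_ROOT, raw)


# A URL passed as a command parameter is rewritten, as intended...
command = "( some-command ( svn://proxy_server/real-repos-uuid/path/in/repos ) )"
# ...but so is user data (here a log message) that merely mentions the URL.
log_message = "Moved docs, see svn://proxy_server/real-repos-uuid/path/in/repos"

rewritten_command = blind_rewrite(command)
rewritten_log = blind_rewrite(log_message)
```

The second substitution silently changes what the user wrote, which is 
exactly the kind of accident a parsing proxy avoids.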

The proxy put together for OSCon serves in my opinion as a proof of 
concept that it wouldn't be that difficult to implement proper proxy 
capability right into the svn protocol and client behaviour.

PROPOSAL PART I : THE PROXY CAP
===============================

At first I had launched into a passionate design of ripping the svn 
protocol to shreds and emulating a request-response protocol that would 
be backcompatible, to enable graceful proxying.  Thankfully for my 
sanity (and overall reputation as designer), Greg Hudson stopped me in 
mid-flight and dropped one of his famous One-Liners of Enlightenment: 
"Just implement a 'proxy' capability!"

Based on this initial enlightenment, this is the proposed addition to 
the svn RA protocol: Create a proxy capability :-).

In the following, (C) is a subversion client, (S) is a svnserve process 
serving repositories, and (P) is a svn RA proxy which stands between (C) 
and (S).

  A. The client side
  ------------------

When (C) sees the proxy capability in a server greeting, it knows it is 
talking to a proxy and not a regular server.
It should then send back a request with the proxy capability set (to 
indicate to (P) that it has caught on) and send the repository URL like 
a normal greeting request.

Then, (C) starts over and goes back to expecting the initial server 
greeting, which will this time be the greeting from (S), relayed by (P).
(C) resends the same repository URL as in its first response, and can 
then proceed with whatever it wanted to do in the first place.

From then on, no mention of proxies or anything else comes into play. 
The discussion is a perfectly normal svn RA exchange.
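
The client-side rules above can be sketched in Python as follows.  This 
is only an illustration working on already-parsed greetings; the 
Greeting and Response types, the function name and the assumption that 
the client speaks protocol version 2 with the edit-pipeline capability 
are all mine, not existing Subversion code.

```python
from dataclasses import dataclass


@dataclass
class Greeting:
    min_version: int
    max_version: int
    mechanisms: list
    capabilities: list


@dataclass
class Response:
    version: int
    capabilities: list
    url: str


def client_responses(greetings, url, client_caps=("edit-pipeline",)):
    """Return the client's reply to each greeting, in order.

    A greeting carrying the 'proxy' capability is answered with the proxy
    capability and the URL, after which the client starts over and waits
    for a new greeting.  The first non-proxy greeting is answered normally.
    """
    replies = []
    for greeting in greetings:
        version = min(2, greeting.max_version)  # assumed client maximum: 2
        if "proxy" in greeting.capabilities:
            replies.append(Response(version, ["proxy"], url))
            continue  # go back to expecting an initial greeting
        shared = [cap for cap in client_caps if cap in greeting.capabilities]
        replies.append(Response(version, shared, url))
        break  # normal svn RA exchange from here on
    return replies
```

Note that the same URL is sent in both replies; only the capability 
list differs.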

  B. The server side
  ------------------

The simplest yet: (S) is oblivious that anything murky is going on. As 
far as it is concerned, (P) is a regular svn client requesting stuff and 
getting normal replies.

  C. The proxy sides
  ------------------

A complying proxy server (P) sends a standard greeting to the connecting 
(C), with any protocol versions it feels like supporting, and nothing 
but the proxy capability set.

Upon receiving the reply from (C), (P) dissects the requested URL and 
uses the first element of the path as a key in a proxy map table, in 
order to find the corresponding root URI for (S).  It then connects to 
that (S), and sets its translation engine to transpose from one URL 
space to the other.
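
As a rough Python sketch of that lookup and of the translation in both 
directions (function names and the shape of the proxy map are 
assumptions made for this example, not a description of real code):

```python
from urllib.parse import urlsplit


def to_server_url(client_url, proxy_map):
    """Map a proxy-side URL to the real server URL.

    The first path element is the key into proxy_map; the rest of the
    path is appended to the mapped root.
    """
    path = urlsplit(client_url).path.lstrip("/")
    key, _, rest = path.partition("/")
    if key not in proxy_map:
        raise KeyError("no proxy map entry for %r" % key)
    root = proxy_map[key].rstrip("/")
    return root + "/" + rest if rest else root


def to_client_url(server_url, proxy_host, key, proxy_map):
    """Map a real server URL back into the proxy's URL space."""
    root = proxy_map[key].rstrip("/")
    rest = server_url[len(root):]
    if not server_url.startswith(root) or (rest and not rest.startswith("/")):
        raise ValueError("%r is not under %r" % (server_url, root))
    return "svn://%s/%s%s" % (proxy_host, key, rest)
```

For instance, with a map of {'shroom': 'svn://shroom.badtrip.nl'}, the 
client URL svn://magic.mushroom.server/shroom/spores/ translates to 
svn://shroom.badtrip.nl/spores/ and back again.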

If (C) replied with the proxy cap set (i.e. "I understand you are a proxy 
server, let's do this"), then (P) just pipelines (S)'s greeting back to 
(C) and lets them work it out amongst themselves.

If (C) did not acknowledge the proxy capability, (P) either disconnects it 
with "Error: You need a client that can talk to proxies", or connects to 
(S) anyway and sees if the protocol version and caps (C) selected are 
compatible with what (S) offers.  If so, it forwards (C)'s initial 
response on to (S). (P) essentially becomes a transparent proxy if the 
initial negotiations happen to be compatible.
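
The decision (P) takes after (C)'s first reply could be sketched like 
this in Python.  The function name, the returned labels and the exact 
compatibility test (version within the server's range, client caps a 
subset of the server's caps) are my assumptions about what "compatible" 
would mean here.

```python
def decide(client_version, client_caps, acknowledged_proxy,
           server_min, server_max, server_caps):
    """Pick what the proxy does once the client has replied.

    Returns one of three labels:
      'relay-greeting'   - client is proxy-aware, pass (S)'s greeting on
      'forward-response' - client is not proxy-aware but its choices
                           happen to fit (S), so forward its reply
      'disconnect'       - client is not proxy-aware and does not fit
    """
    if acknowledged_proxy:
        return "relay-greeting"
    version_ok = server_min <= client_version <= server_max
    caps_ok = set(client_caps) <= set(server_caps)
    if version_ok and caps_ok:
        return "forward-response"
    return "disconnect"
```

A stricter proxy could simply skip the compatibility test and always 
disconnect clients that do not acknowledge the proxy capability.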

  D. Example session
  ------------------

(C) wants to commit to svn://magic.mushroom.server/shroom/spores . 
Little does it know that magic.mushroom.server is in fact a world famous 
hallucinogenic version control proxy set up in the Cayman Islands, which 
hides the real location of (S) (somewhere in lower Amsterdam) from the 
rest of the world's police.

(C) connects to (P), receives the following greeting:
  ( success ( 1 2 ( ANONYMOUS ) ( proxy ) ) )

(C) replies with:
  ( 2 ( proxy ) svn://magic.mushroom.server/shroom/spores/ )

(P) rewrites the requested client URL into proxy_map['shroom'] + 
"/spores/", connects to the resulting server, starts to relay 
communications, scrambling URLs as required.

(C) receives a new greeting, from (S) via (P):
  ( success ( 1 2 ( ANONYMOUS CRAM-MD5 ) ( edit-pipeline ) ) )

(C) replies to this new greeting:
  ( 2 ( edit-pipeline ) svn://magic.mushroom.server/shroom/spores/ )

(S) receives (C)'s greeting reply, mangled by (P):
  ( 2 ( edit-pipeline ) svn://shroom.badtrip.nl/spores/ )

And so on and so forth.
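
For anyone who wants to play with the messages in this session, here is 
a small Python sketch of a parser for the simplified notation used 
above: whitespace-separated parentheses, words and numbers.  It is an 
illustration only and does not handle the length-prefixed strings of 
the real wire format.

```python
def parse_items(text):
    """Parse one parenthesised item into nested Python lists.

    Numbers become ints, everything else stays a string.  Raises
    ValueError on unbalanced parentheses or trailing data.
    """
    tokens = text.split()
    pos = 0

    def parse():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == ")":
            raise ValueError("unexpected ')'")
        if token != "(":
            return int(token) if token.isdigit() else token
        items = []
        while True:
            if pos >= len(tokens):
                raise ValueError("missing ')'")
            if tokens[pos] == ")":
                pos += 1
                return items
            items.append(parse())

    result = parse()
    if pos != len(tokens):
        raise ValueError("trailing data after the first item")
    return result


greeting = parse_items("( success ( 1 2 ( ANONYMOUS ) ( proxy ) ) )")
is_proxy = "proxy" in greeting[1][3]
```

With a parsed greeting, spotting the proxy capability is a list lookup 
rather than a pattern match on the raw stream.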

PROPOSAL PART II: THE SVNPROXY DAEMON
=====================================

The svnproxy daemon is an implementation of (P) in the above discussion 
of the proxy capability.  It operates on the svn_ra layer and slightly 
below (connection establishment and such is done manually, as in 
svnserve). Aside from the behaviour described above, it processes 
commands through callbacks in a way much the same as svnserve.  The 
difference is that for most commands it calls a "passthrough" handler, 
which just blindly relays the request and response back and forth 
without changing anything.

For the few commands and responses that have URLs in their parameters, 
the commands are parsed, the URL extracted, mangled into the correct (S) 
or (C) side URL, and the command is then reconstructed and sent on.
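
A possible shape for this dispatch, sketched in Python on already-parsed 
commands.  The table of URL-carrying commands is a placeholder: the 
command name in it is hypothetical, and the real list would have to be 
drawn up from the protocol description.

```python
# Hypothetical table: command name -> indexes of the parameters that hold
# URLs.  Commands not listed here go through the passthrough path.
URL_PARAMS = {"example-url-command": [0]}


def relay(command, translate):
    """Relay one parsed command of the form [name, [param, ...]].

    Parameters listed in URL_PARAMS are passed through translate();
    everything else is forwarded unchanged.
    """
    name, params = command
    indexes = URL_PARAMS.get(name)
    if indexes is None:
        return command  # passthrough handler: relay blindly
    rewritten = list(params)
    for index in indexes:
        rewritten[index] = translate(rewritten[index])
    return [name, rewritten]
```

Because only the listed parameter positions are touched, a URL that 
appears in user data (a log message, a property value) is left alone.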

  A. Configuration
  ----------------

Svnproxy has very little internal configuration.  At a minimum it 
operates off a single configuration file with one section; at most it 
uses one configuration file with two sections, plus a password store.

The file that is always present is the proxy configuration file 
svnproxy.conf, which is composed of one mandatory section and one 
optional section.  The first section defines a name:repository-url map 
which svnproxy uses to establish connections to the various (S) servers 
and to mangle URLs properly.  The second section is optional, and 
defines the access rights to the proxy configuration, in the same way 
that blanket access directives and a password file are defined in 
svnserve.conf.

If the auth section is omitted, all access to the proxy configuration, 
read or write, is denied.  More on what the hell I'm thinking about here 
a little further down.

The second file is - you guessed it - the optional passwd datastore, if 
defined in svnproxy.conf. Same format as passwd.

With this configuration (passed on the commandline like the root of 
repositories to serve is passed to svnserve), svnproxy can initialise 
its proxy map and start servicing requests.
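
To give an idea of what this could look like, here is a Python sketch 
that loads such a file.  The section names ([proxy-map], [auth]) are 
invented for the example, and the auth keys are borrowed from 
svnserve.conf; none of this is a settled file format.

```python
import configparser

# Hypothetical svnproxy.conf contents.
EXAMPLE_CONF = """
[proxy-map]
shroom = svn://shroom.nl/repos

[auth]
anon-access = read
auth-access = write
password-db = passwd
"""


def load_config(text):
    """Return (proxy_map, auth) parsed from the configuration text.

    The [proxy-map] section is mandatory.  If [auth] is missing, all
    access to the proxy configuration is denied.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("proxy-map"):
        raise ValueError("missing mandatory [proxy-map] section")
    proxy_map = dict(parser["proxy-map"])
    if parser.has_section("auth"):
        auth = dict(parser["auth"])
    else:
        auth = {"anon-access": "none", "auth-access": "none"}
    return proxy_map, auth


proxy_map, auth = load_config(EXAMPLE_CONF)
```

The deny-by-default branch mirrors the rule that an omitted auth section 
closes the proxy configuration to everyone.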

  B. On-line proxy map editing
  ----------------------------

One of the requirements of the svl people is that svnproxy should be 
usable as a "repository PA system".  That is, users (authenticated and 
authorized or anonymous, depending on the setting) should be able to 
edit the proxy map of a running svnproxy, not just admins with access to 
the flat file proxy map.

A nice way to do this would be to do it in such a way that we don't need 
to listen for an alternate protocol on a different port and whatnot.

One way to do this is to have svnproxy build a virtual repository when a 
client requests the root URL, with no proxy map name.  In many ways 
similar to the /sys filesystem, svnproxy would construct a repository 
that looks to the client to be totally empty and at r0.  However, all 
the defined proxy maps are present, represented as individual revision 
properties.  Access to this virtual repos is restricted by the 
configured auth access rules.

For example, a map from 'shroom' to 'svn://shroom.nl/repos' would appear 
in this configuration virtual repository as the revprop 'proxy:shroom' 
-> 'svn://shroom.nl/repos'.

Svnproxy only accepts revprop related commands on this virtual 
repository.  All other attempts at repository manipulation result in 
access denied errors.  This allows clients with read access to the 
configuration virtual repository to view the currently active mappings, 
and clients with write access to define new/edit/delete mappings by 
altering revprops.

Modifications made to the virtual repository are written back to the 
flat configuration file, so that the last 'live' configured state is 
restored when svnproxy restarts.
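
The revprop view of the proxy map could be modelled along these lines. 
This Python sketch only covers the bookkeeping; the class and method 
names, the boolean access flags and the save callback are assumptions 
made for the example.

```python
PREFIX = "proxy:"


class VirtualConfigRepo:
    """In-memory proxy map exposed as revision properties of r0."""

    def __init__(self, proxy_map, save):
        self.proxy_map = dict(proxy_map)
        self.save = save  # called with the new map after each change

    def revprop_list(self, can_read):
        """Return the mappings as 'proxy:<name>' -> URL revprops."""
        if not can_read:
            raise PermissionError("access denied")
        return {PREFIX + name: url for name, url in self.proxy_map.items()}

    def change_revprop(self, name, value, can_write):
        """Set a mapping, or delete it when value is None."""
        if not can_write:
            raise PermissionError("access denied")
        if not name.startswith(PREFIX):
            raise ValueError("only %s* revprops are accepted" % PREFIX)
        key = name[len(PREFIX):]
        if value is None:
            self.proxy_map.pop(key, None)
        else:
            self.proxy_map[key] = value
        # Write the live state back so it survives a restart.
        self.save(dict(self.proxy_map))
```

In a real daemon the save callback would rewrite the flat configuration 
file; here it can be any function that accepts a dict.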

A specialised UI for handling proxy remote configuration will be 
written, in the form of a python script, svnproxyctl.py, which connects 
to the proxy configuration repository and messes with revprops, all 
hidden in a nice little interface.  Svl can probably be made to 
integrate some kind of abstraction to configure these proxies (which 
appear to be of some importance to it) in a more intuitive manner.

Of course, if you don't want to let remote people reconfigure your 
lovely proxy, just completely deny access, and svnproxy will deny its 
configuration virtual repository ever existed and dismiss your requests 
as liberal propaganda.

YOUR AD HERE
============

What do people think of all this?  I am relatively satisfied with 
everything, except for the auth done for the virtual configuration 
repository, because I don't feel it offers the level of control that svl 
would feel happy to work with.  I've been toying with fine grained 
credentials such as anon = add-only, or something similar, but I'm not 
sure what svl (and others) would actually need in this department.

Oh, and there's of course the issue that "Your virtual repository is 
Evil, because it opens access to editing unversioned server 
configuration from a client!!"  I believe that with proper auth control 
on the access to the configuration repository, that is no problem. 
Proxy configuration is as open as you make it.

As for the unversionedness...  Well, svnproxy could be made to operate 
off a full svn repository that would contain its (versioned) 
configuration.  But I don't see any real use that would justify this 
degree of added setup complexity.

What do y'all think?

- Dave.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PROPOSAL] Proxy support for svn://

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
David Anderson wrote:
> David Anderson wrote:
> 
>> What do y'all think?
> 
> 
> I'll take this absence of reactions as "we're all elsewhere discussing 
> the colours of better bikesheds." :-)
> 
> If there are no definite -1's in all this, I would like to stress the 
> help this would give the svl people.  Given this, I'd really like to see 
> this proxy support in trunk, at least the protocol capability part. 
> Would at least one committer be okay (+0) to review my (future) patch and 
> commit it?  I'd just like to get some form of assurance that I won't be 
> coding all this for nothing.

Well, personally I don't see the ability to reverse-proxy svn:// 
connections as especially useful, but assuming that it doesn't overly 
complicate the protocol and the codebase I wouldn't object to it either.

Now if you were talking about a write-through caching proxy, that I 
could see being pretty useful (Perforce has such a proxy, for example, 
and it's often used for speeding up read-access to a remote repository, 
as you would see in a branch office of a company), but that seems rather 
more complex than what you're talking about here.

-garrett


Re: [PROPOSAL] Proxy support for svn://

Posted by Greg Hudson <gh...@MIT.EDU>.
On Mon, 2005-08-29 at 02:23 +0200, David Anderson wrote:
> David Anderson wrote:
> > What do y'all think?
> 
> I'll take this absence of reactions as "we're all elsewhere discussing 
> the colours of better bikesheds." :-)

The proposal you sent out was the proposal we discussed on IRC, and I
already approved it there (in the sense of not having any objections,
not in the sense of being excited about).  So I think you can go ahead
and write the patch at this point.



Re: [PROPOSAL] Proxy support for svn://

Posted by David Anderson <da...@calixo.net>.
David Anderson wrote:
> What do y'all think?

I'll take this absence of reactions as "we're all elsewhere discussing 
the colours of better bikesheds." :-)

If there are no definite -1's in all this, I would like to stress the 
help this would give the svl people.  Given this, I'd really like to see 
this proxy support in trunk, at least the protocol capability part. 
Would at least one committer be okay (+0) to review my (future) patch and 
commit it?  I'd just like to get some form of assurance that I won't be 
coding all this for nothing.

- Dave.


Re: [PROPOSAL] Proxy support for svn://

Posted by Daniel Rall <dl...@finemaltcoding.com>.
On Wed, 24 Aug 2005, David Anderson wrote:
...
> Given the clarification I made above, this answers itself: a reverse 
> HTTP proxy could do what I propose, for the http:// and such repository 
> access methods, provided that the proxy also relays non-HTTP requests 
> (DAV/DeltaV).  This is already possible (I believe) by setting up a 
> properly configured Apache web server, no extra design or implementation 
> needed.
... 
>  - Add that for the http:// (mod_dav_svn) methods, a regular 
> non-caching reverse HTTP proxy that is configured to relay DAV/DeltaV 
> chat should (can a dav_svn guru get back to us on this?) accomplish what 
> I am proposing, except for the "dynamic reconfiguration" bit which is an 
> implementation detail for the svn:// proxy server software I'm
> proposing.

Hiya Dave.  Having recently implemented such a DAV reverse proxy for SVN
using Apache httpd and a few key, stock modules, I can confirm that this is
possible, but does impose a number of restrictions (mostly from mod_dav),
including:

- scheme in HTTP Destination header must match (e.g. http)
- port number in HTTP Destination header must match (e.g. 80)
- URI path to repository root must match (e.g. /repos/svn)

Note that this means that just setting up mod_proxy to reverse proxy HTTP 
requests is often INSUFFICIENT for a DAV reverse proxy for SVN.
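
As a rough illustration of these three restrictions, the following 
Python sketch checks whether a public (proxy-side) repository URL and a 
backend URL differ in scheme, port or path.  It reads "must match" as 
"must be identical on both sides", which is an interpretation, and the 
host names in the usage are placeholders.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def destination_mismatches(public_root, backend_root):
    """List which of scheme, port and path differ between the two URLs."""
    public, backend = urlsplit(public_root), urlsplit(backend_root)
    problems = []
    if public.scheme != backend.scheme:
        problems.append("scheme")
    public_port = public.port or DEFAULT_PORTS.get(public.scheme)
    backend_port = backend.port or DEFAULT_PORTS.get(backend.scheme)
    if public_port != backend_port:
        problems.append("port")
    if public.path.rstrip("/") != backend.path.rstrip("/"):
        problems.append("path")
    return problems


ok = destination_mismatches("http://svn.example.com/repos/svn",
                            "http://backend.internal/repos/svn")
bad = destination_mismatches("https://svn.example.com/svn",
                             "http://backend.internal:8080/repos/svn")
```

An empty list means the pair satisfies all three conditions; anything 
else names the parts that would need to be aligned.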


Re: [PROPOSAL] Proxy support for svn://

Posted by David Anderson <da...@calixo.net>.
Jon Bendtsen wrote:
> I believe that you want to add a proxy to speed up SVN requests, or
> support more users. I especially get this from your sentence:
> 
> "A typical use case would be to set up the proxy server on a gateway
> between a lan and the internet, to forward client operations to many
> different real servers on the lan.  In other words, a classic proxy  
> setup."

Right, sorry about that.  That also wasn't as clear as it could have 
been.  I was thinking of *my* "classic" proxy setup, which is actually a 
*reverse* proxy, sitting on port 80 of my public web server and relaying 
certain requests on to other lan-based web servers with no internet 
connectivity.

So, if we want to compare svnproxy and this design to the HTTP world, it 
would actually be a non-caching reverse proxy setup.  Does that clarify 
things a little?

> I believe that your text misses some text about why a normal regular
> HTTP proxy can not do what you want to achieve, or how your solution
> is better.

Given the clarification I made above, this answers itself: a reverse 
HTTP proxy could do what I propose, for the http:// and such repository 
access methods, provided that the proxy also relays non-HTTP requests 
(DAV/DeltaV).  This is already possible (I believe) by setting up a 
properly configured Apache web server, no extra design or implementation 
needed.

As for the svn protocol, it is an in-house, stateful protocol, which is 
not spoken by HTTP proxies and would be (in my opinion) difficult to 
translate into something an HTTP proxy would be okay to relay.  It would 
also require adding a translation layer to translate the HTTP-savvy 
dialect back into stateful svn:// for the server on the remote side to 
understand what the client is talking about.  Fairly complex.

Hence we need new logic and new server software to add reverse proxy 
capability to svn:// .

> Now, the reason that i wrote you directly and not to the list is because
> i only just subscribed the other day, so who am i to give judgement,

I had actually replied back to the list because I thought it was a 
mistake: I sometimes click "reply", which replies to just the post 
author, when I mean to "reply to all".  So I quickly corrected this in 
my reply.  Sorry if reposting your private mail sounded rude, it was not 
intended to look like that.

Now, as for "who am I to give judgement": You are someone subscribed to 
dev@, and are therefore perfectly and completely free and entitled to 
ask questions, comment, shoot down, or otherwise react to whatever 
goes on!  Please, be my guest.  There are no stupid questions, and I'm 
quite prepared to explain things if I refer to parts of the Subversion 
internals you are not familiar with.

In any case, I value anyone's input on this list at the same level; 
nobody gets flamed for "newbieness" or Zark knows what else :-).

So, please, make your concerns and comments public, it can only benefit 
us all.

> however, i still believe that at least writing something in your text 
> about regular HTTP proxies, would benefit your text.

Given this mail's clarification (that my proposal is actually for what 
the HTTP world calls a non-caching reverse proxy, and that normal HTTP 
proxy software can't speak the SVN protocol), I'll:
  - Add an ASSUMPTIONS section clarifying that I am talking about the 
svn:// side only, and
  - Add that for the http:// (mod_dav_svn) methods, a regular 
non-caching reverse HTTP proxy that is configured to relay DAV/DeltaV 
chat should (can a dav_svn guru get back to us on this?) accomplish what 
I am proposing, except for the "dynamic reconfiguration" bit which is an 
implementation detail for the svn:// proxy server software I'm
proposing.

Would that be okay?

- Dave.



Re: [PROPOSAL] Proxy support for svn://

Posted by Jon Bendtsen <jo...@laerdal.dk>.
Den 24. aug 2005 kl. 13:35 skrev David Anderson:

> Jon Bendtsen wrote:
>
>> I think that i miss a section in your text about regular HTTP  
>> proxies.
>>
>
> This proposal is to implement proxy support for the svn:// RA  
> protocol.  I don't mean for it to implement HTTP proxy support, or  
> to add anything to the DAV RA methods, as that would make no sense  
> in the scope of my proposal.  The feature I'm proposing for  
> svnproxy and the svn:// RA method could be implemented for the HTTP  
> methods by adding a support for redirects, and/or setting up a  
> proxy which supports DAV directives to relay the DAV chat, which  
> can probably be done with a properly set up Apache.  That however  
> is a job for someone who groks mod_dav_svn, not a lowly peon in  
> things DAV such as I.
>
> Again, I am NOT proposing to address the issue of "let's get svn to  
> push HTTP/WebDAV/DeltaV through an unsuspecting GET-only HTTP proxy  
> that is blocking my access to a repository on the internet!"
>
> Sorry, I should have noted this in assumptions.  By "proxy" I don't  
> mean "allow the svn to break through HTTP proxies".  I have no idea  
> how you push a stateful protocol on a non-web port through a  
> stateless web proxy.  My proposal is for a different set of problems.

I'm sorry that i was not as clear as i could have been.

I believe that you want to add a proxy to speed up SVN requests, or
support more users. I especially get this from your sentence:

"A typical use case would be to set up the proxy server on a gateway
between a lan and the internet, to forward client operations to many
different real servers on the lan.  In other words, a classic proxy  
setup."

I believe that your text misses some text about why a normal regular
HTTP proxy can not do what you want to achieve, or how your solution
is better.

Now, the reason that i wrote you directly and not to the list is because
i only just subscribed the other day, so who am i to give judgement,
however, i still believe that at least writing something in your text
about regular HTTP proxies, would benefit your text.




JonB


Re: [PROPOSAL] Proxy support for svn://

Posted by David Anderson <da...@calixo.net>.
Jon Bendtsen wrote:
> I think that i miss a section in your text about regular HTTP proxies.

This proposal is to implement proxy support for the svn:// RA protocol. 
I don't mean for it to implement HTTP proxy support, or to add 
anything to the DAV RA methods, as that would make no sense in the scope 
of my proposal.  The feature I'm proposing for svnproxy and the svn:// 
RA method could be implemented for the HTTP methods by adding support 
for redirects, and/or setting up a proxy which supports DAV directives 
to relay the DAV chat, which can probably be done with a properly set up 
Apache.  That however is a job for someone who groks mod_dav_svn, not a 
lowly peon in things DAV such as I.

Again, I am NOT proposing to address the issue of "let's get svn to push 
HTTP/WebDAV/DeltaV through an unsuspecting GET-only HTTP proxy that is 
blocking my access to a repository on the internet!"

Sorry, I should have noted this in assumptions.  By "proxy" I don't mean 
"allow the svn to break through HTTP proxies".  I have no idea how you 
push a stateful protocol on a non-web port through a stateless web 
proxy.  My proposal is for a different set of problems.

- Dave.
