Posted to dev@subversion.apache.org by Greg Stein <gs...@lyra.org> on 2005/11/22 08:08:40 UTC

Re: DAV is complicated and slow?

On Mon, Nov 21, 2005 at 06:35:58PM -0600, Ben Collins-Sussman wrote:
> On 11/21/05, Justin Erenkrantz <ju...@erenkrantz.com> wrote:
> >  It probably
> > *does* work today with httpd 2.2 (as SVN sends the L-M headers on GET).
> 
> Nope, subversion stopped using GET just before 1.0.  The only commands
> that issue GET are 'svn cat' and 'svn diff/merge'.  Checkouts and
> updates are a huge custom REPORT request.
> 
> So, if we really want to use pipelining and caching, we'd have to
> switch back to the old style of checkouts/updates.

If we had HTTP pipelining in the client, then hell yeah: we'd use the
old-style series of GETs and PROPFINDs.

Note that PROPFINDs are *specifically* not cacheable, per the spec.
But the GETs are absolutely cacheable. I designed the URL/resources
many, many moons ago so that we could take specific advantage of the
caching benefit.

The only thing holding us back is the pipelining. With cacheable GETs,
then a server could greatly improve its capacity by throwing a bunch
of caching reverse proxies in front of it to handle the GET load (tho
I bet mod_disk_cache is efficient enough that it can handle serious
load itself without needing to offload to proxies). But even better, a
local network (say, at a corp or school) can put up a caching proxy.
Run your SVN operations thru that sucker, and you can share cached
copies with all your coworkers/associates/friends. First person in the
morning pulls the latest HEAD into the local cache, and then everybody
else gets their "svn up" served from that cache.
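
To make the capacity argument concrete, here is a minimal Python sketch of a shared cache sitting between several clients and an origin server. Everything in it (the Origin and CachingProxy classes, the URL) is hypothetical and only illustrates the idea; it is not how mod_disk_cache or any real proxy works, and it ignores expiry and validation entirely.

```python
class Origin:
    """Stand-in for the repository server; counts how often it is hit."""

    def __init__(self, resources):
        self.resources = resources
        self.hits = 0

    def get(self, url):
        self.hits += 1
        return self.resources[url]


class CachingProxy:
    """Toy shared cache: GET responses are stored, everything else passes through."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def request(self, method, url):
        if method != "GET":
            # Non-GET methods (e.g. PROPFIND) always go to the origin.
            return self.origin.get(url)
        if url not in self.cache:
            self.cache[url] = self.origin.get(url)
        return self.cache[url]


# Hypothetical URL, purely for illustration.
URL = "/repos/example/trunk/README"
origin = Origin({URL: b"hello"})
proxy = CachingProxy(origin)

# Three coworkers fetch the same file through the shared proxy.
bodies = [proxy.request("GET", URL) for _ in range(3)]
origin_hits_after_gets = origin.hits  # only the first GET reached the origin

# PROPFIND is not cached in this model, so each one reaches the origin.
proxy.request("PROPFIND", URL)
proxy.request("PROPFIND", URL)
origin_hits_total = origin.hits
```

In this toy model the three GETs cost the origin a single hit, while every PROPFIND still lands on it.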

I'll also note that a lot of complexity has arisen in ra_dav due to
compatibility concerns, both with Neon compat, and with protocol
compat. ra_dav was the first RA module built and has grown a lot of
"uglies" over time. Next to WC, it is one of the oldest modules.
ra_svn is much cleaner because it learned from those lessons: it has
zero impedance mismatch between svn and the protocol, and it has a
cleaner protocol compatibility mechanism.

DAV brings many feature benefits. There are definite improvements to
be made in the implementation, but the concept is (IMO) still quite
sound. I wouldn't suggest a wholesale move to REPORT requests, but
would (instead) encourage consideration of a pipelining client, thus
bringing in a number of caching benefits on a network-wide scale.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: DAV is complicated and slow?

Posted by Michael Sinz <Mi...@sinz.org>.

On 11/29/05, Joe Orton <jo...@manyfish.co.uk> wrote:
> On Sun, Nov 27, 2005 at 07:41:44AM -0800, Justin Erenkrantz wrote:
> > --On November 27, 2005 10:12:03 AM +0000 Joe Orton <jo...@manyfish.co.uk>
> > wrote:
> >
> > >You can't really pipeline over SSL - if you start sending request N + 1
> > >when the server wants to renegotiate after reading request N, you're
> > >stuffed.
> >
> > We could argue about that one.  =)
>
> Using pipelining over SSL will break any server configuration requiring
> a per-location SSL renegotiation (for a client cert, etc), which is not
> something you want broken by default.  (you can perhaps argue about
> whether this is a problem inherent in the TLS protocol or "merely" in
> the current implementations thereof)

Yuk!  So, to pipeline SSL the client/server would need to know
that there is no difference in the authentication requirements for
all of the paths being passed in during the pipelining operation.
I have never built a system that had such constraints but I now
see how that would be a major issue with SSL pipelining...

The question would be whether there is some way to start pipelining
and, if the authentication requirements call for a separate connection,
have that requested for just those requests (again, yuck!)

--
Michael Sinz               Technology and Engineering Director/Consultant
"Starting Startups"                          mailto:Michael.Sinz@sinz.org
My place on the web                      http://www.sinz.org/Michael.Sinz

Re: DAV is complicated and slow?

Posted by Joe Orton <jo...@manyfish.co.uk>.

On Sun, Nov 27, 2005 at 07:41:44AM -0800, Justin Erenkrantz wrote:
> --On November 27, 2005 10:12:03 AM +0000 Joe Orton <jo...@manyfish.co.uk> 
> wrote:
> 
> >You can't really pipeline over SSL - if you start sending request N + 1
> >when the server wants to renegotiate after reading request N, you're
> >stuffed.
> 
> We could argue about that one.  =)

Using pipelining over SSL will break any server configuration requiring 
a per-location SSL renegotiation (for a client cert, etc), which is not 
something you want broken by default.  (you can perhaps argue about 
whether this is a problem inherent in the TLS protocol or "merely" in 
the current implementations thereof)

joe

Re: DAV is complicated and slow?

Posted by Michael Sinz <Mi...@sinz.org>.

Justin Erenkrantz wrote:
> On Sun, Nov 27, 2005 at 01:52:45PM -0500, Michael Sinz wrote:
> 
>>I have the same problem at one of my clients' offices.  But what is
>>annoying about using SSL?  It is a good excuse to keep everything secure
>>in transit, a generally good thing.
> 
> 
> Read-only operations are slower.  Does checking out from a public
> repository *really* need to be over SSL?

No, it does not - but at least this is an option.  I wish the proxies
knew how to behave better...

> But, my annoyance in this case has more to do with the lack of SSL - one of
> the public SVN repositories I commit to doesn't offer https.  Although,
> this is the same one that's still on Subversion 1.0.6.  *duck*  -- justin

Ahh yes, servers that do not offer SSL.  I would hope that it is just a
read-only server :-)

-- 
Michael Sinz                     Technology and Engineering Director/Consultant
"Starting Startups"                                mailto:michael.sinz@sinz.org
My place on the web                            http://www.sinz.org/Michael.Sinz

Re: DAV is complicated and slow?

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

On Sun, Nov 27, 2005 at 01:52:45PM -0500, Michael Sinz wrote:
> I have the same problem at one of my clients' offices.  But what is
> annoying about using SSL?  It is a good excuse to keep everything secure
> in transit, a generally good thing.

Read-only operations are slower.  Does checking out from a public
repository *really* need to be over SSL?

But, my annoyance in this case has more to do with the lack of SSL - one of
the public SVN repositories I commit to doesn't offer https.  Although,
this is the same one that's still on Subversion 1.0.6.  *duck*  -- justin

Re: DAV is complicated and slow?

Posted by Michael Sinz <Mi...@sinz.org>.

Justin Erenkrantz wrote:
> --On November 27, 2005 10:12:03 AM +0000 Joe Orton <jo...@manyfish.co.uk> 
> wrote:
> 
>> You can't really pipeline over SSL - if you start sending request N + 1
>> when the server wants to renegotiate after reading request N, you're
>> stuffed.
> 
> We could argue about that one.  =)

SSL should not prevent pipelining in any way any more so than any TCP/IP
socket connection might prevent it due to connection loss/etc.

>> Buggy proxies break pipelining, and most importantly, buggy
>> transparent proxies - *which you can't tell are there* - break
>> pipelining.
> 
> Buggy transparent proxies almost certainly break WebDAV.  For example, I 
> can't commit to any WebDAV repositories from my apartment on campus 
> because the campus 'geniuses' installed a buggy transparent proxy that 
> can't be routed-around that doesn't know what to do with 
> REPORT/MERGE/etc.  I have to use SSL here in order to do anything which 
> is very, very annoying.  -- justin

I have the same problem at one of my clients' offices.  But what is
annoying about using SSL?  It is a good excuse to keep everything secure
in transit, a generally good thing.

-- 
Michael Sinz                     Technology and Engineering Director/Consultant
"Starting Startups"                                mailto:michael.sinz@sinz.org
My place on the web                            http://www.sinz.org/Michael.Sinz

Re: DAV is complicated and slow?

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On November 27, 2005 10:12:03 AM +0000 Joe Orton <jo...@manyfish.co.uk> wrote:

> You can't really pipeline over SSL - if you start sending request N + 1
> when the server wants to renegotiate after reading request N, you're
> stuffed.

We could argue about that one.  =)

> Buggy proxies break pipelining, and most importantly, buggy
> transparent proxies - *which you can't tell are there* - break
> pipelining.

Buggy transparent proxies almost certainly break WebDAV.  For example, I can't 
commit to any WebDAV repositories from my apartment on campus because the 
campus 'geniuses' installed a buggy transparent proxy that can't be 
routed-around that doesn't know what to do with REPORT/MERGE/etc.  I have to 
use SSL here in order to do anything which is very, very annoying.  -- justin

Re: DAV is complicated and slow?

Posted by Joe Orton <jo...@manyfish.co.uk>.

On Tue, Nov 22, 2005 at 12:08:40AM -0800, Greg Stein wrote:
> On Mon, Nov 21, 2005 at 06:35:58PM -0600, Ben Collins-Sussman wrote:
> > On 11/21/05, Justin Erenkrantz <ju...@erenkrantz.com> wrote:
> > >  It probably
> > > *does* work today with httpd 2.2 (as SVN sends the L-M headers on GET).
> > 
> > Nope, subversion stopped using GET just before 1.0.  The only commands
> > that issue GET are 'svn cat' and 'svn diff/merge'.  Checkouts and
> > updates are a huge custom REPORT request.
> > 
> > So, if we really want to use pipelining and caching, we'd have to
> > switch back to the old style of checkouts/updates.
> 
> If we had HTTP pipelining in the client, then hell yeah: we'd use the
> old-style series of GETs and PROPFINDs.

A couple of times over the last few years I've taken a few stabs at
supporting async request processing in neon (and hence pipelining); I
never really went the whole hog because I'm afraid that you can never use
pipelining by default in any general-purpose application, so it never 
really seems worth the effort.

You can't really pipeline over SSL - if you start sending request N + 1 
when the server wants to renegotiate after reading request N, you're 
stuffed.  Buggy proxies break pipelining, and most importantly, buggy 
transparent proxies - *which you can't tell are there* - break 
pipelining.
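
The renegotiation problem can be shown with a deliberately simplified Python model. It is not TLS and not neon code: the serve function, the paths, and the rule "a renegotiation fails if later requests are already on the wire" are all assumptions made purely for illustration.

```python
def serve(requests, needs_renegotiation, pipelined):
    """Return the requests answered before the connection breaks down.

    `requests` is an ordered list of paths; `needs_renegotiation` is the set
    of paths for which the toy server insists on renegotiating first.
    """
    answered = []
    for index, path in enumerate(requests):
        more_already_sent = pipelined and index + 1 < len(requests)
        if path in needs_renegotiation and more_already_sent:
            # The server wants a new handshake, but request N + 1 is already
            # sitting in the stream behind request N: give up here.
            break
        answered.append(path)
    return answered


# Hypothetical paths; only /private/b requires a client certificate.
paths = ["/public/a", "/private/b", "/public/c"]
protected = {"/private/b"}

pipelined_result = serve(paths, protected, pipelined=True)
sequential_result = serve(paths, protected, pipelined=False)
```

With one request in flight at a time the toy server can renegotiate between requests, so all three paths are answered; with pipelining only the first one is.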

joe

Re: DAV is complicated and slow?

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On 11/24/05, Greg Stein <gs...@lyra.org> wrote:

> > Of course, there are certainly other reasons that ra_dav is slow (let's
> > make a million PROPFIND requests before we do anything!  ok!  how
> > about we base64 encode all binary data so it's bigger than it has to
>
> Note that the base64 encoding is because we embed the content into a
> big-ass XML response. Switching over to the GET form allows us to send
> the content to the client in its original binary encoding.

True, it would be nice to see what kind of effect that would have on
our speed just from improved bandwidth usage.

> What operation are you referring to which does all the PROPFINDs?

Well, it seems like just about all of them do a bunch, but 'svn list'
is perhaps the best example.

> > be!  great!  etc.) so perhaps it's worth moving towards a custom
> > protocol that uses http, is built with the possibility of being http
> > cache friendly, but doesn't fall victim to the problems we currently
> > have with DAV.
>
> Feel free, but I think there is ample room for improvement by moving
> back towards "plain old HTTP/DAV" than to increase the customization.
> As mentioned before, you could always tunnel the svn protocol, or
> something close to it, over an HTTP pipe. Dunno how cacheable that
> would be; I guess you could always do more protocol design work...

I'm thinking something like the svn protocol, just with the option to
tell the server you'd rather get the file contents/deltas via a
separate GET request, so that the hypothetical http pipelining ra_dav
could take advantage of http caches.

-garrett

Re: DAV is complicated and slow?

Posted by Michael Sinz <Mi...@sinz.org>.

Greg Stein wrote:
> On Wed, Nov 23, 2005 at 08:51:48AM -0800, Garrett Rooney wrote:
> 
>>On 23 Nov 2005 09:21:12 -0600, kfogel@collab.net <kf...@collab.net> wrote:
>>
>>>Greg Hudson <gh...@MIT.EDU> writes:
>>>
>>>>We already have the performance benefit of HTTP pipelining (at the
>>>>expense of giving up on generic HTTP caching), and ra_dav is still much
>>>>slower than ra_svn.
> 
> 
> And will always be slower. Can't do much better than a custom,
> application-specific protocol. No argument there.
> 
> I've been talking about simplifying the system and improving the
> overall capacity through caches. We can use normal GET requests and
> PROPFIND and whatnot rather than custom reports.

While that may be the case, doing many GET operations, even in pipelined
connections does increase server load during non-cached operations.
(Or if there is no cache between you and the server)  Many places are
using HTTP/DAV SVN internally and are getting very good (or at least
very reasonable) performance without killing the server.  Going
back to the GET method may well undo that.

However, I am all for not using XML (and thus BASE64 encoding) when
the server and client can agree on the format.  This would be a reasonably
big win for some repositories (and maybe even many repositories)

-- 
Michael Sinz                     Technology and Engineering Director/Consultant
"Starting Startups"                                mailto:michael.sinz@sinz.org
My place on the web                            http://www.sinz.org/Michael.Sinz

Re: DAV is complicated and slow?

Posted by Greg Stein <gs...@lyra.org>.

On Wed, Nov 23, 2005 at 08:51:48AM -0800, Garrett Rooney wrote:
> On 23 Nov 2005 09:21:12 -0600, kfogel@collab.net <kf...@collab.net> wrote:
> > Greg Hudson <gh...@MIT.EDU> writes:
> > > We already have the performance benefit of HTTP pipelining (at the
> > > expense of giving up on generic HTTP caching), and ra_dav is still much
> > > slower than ra_svn.

And will always be slower. Can't do much better than a custom,
application-specific protocol. No argument there.

I've been talking about simplifying the system and improving the
overall capacity through caches. We can use normal GET requests and
PROPFIND and whatnot rather than custom reports.

>...
> I believe what ghudson means is that we've interleaved the diffs
> and/or file contents with the rest of the data we're getting back,
> just as we would if we had gone with the original scheme + http
> pipelining.  So we've already got the kind of result we would have had
> if neon had pipelining, and it's still slow.

Right. But we've also thrown out the ability to increase capacity
through the use of caches. Certain operations can also be satisfied
much closer to the client, thus reducing some latency. The server can
use caches to improve overall capacity (much faster and less resource-
intensive to serve off disk than to compose diffs to generate a
response).

> Of course, there are certainly other reasons that ra_dav is slow (let's
> make a million PROPFIND requests before we do anything!  ok!  how
> about we base64 encode all binary data so it's bigger than it has to

Note that the base64 encoding is because we embed the content into a
big-ass XML response. Switching over to the GET form allows us to send
the content to the client in its original binary encoding.
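
For a sense of the cost: base64 turns every 3 bytes of input into 4 ASCII characters, so embedded content grows by about a third before any XML wrapping is counted. A small Python check, using random bytes as a stand-in for file content:

```python
import base64
import os

payload = os.urandom(3000)            # stand-in for binary file content
encoded = base64.b64encode(payload)   # what would be embedded in the XML

# Ratio of encoded size to raw size; 4/3 for inputs that are a multiple of 3.
overhead = len(encoded) / len(payload)
```

A plain GET response body would carry just the 3000 raw bytes instead.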

What operation are you referring to which does all the PROPFINDs?

> be!  great!  etc.) so perhaps it's worth moving towards a custom
> protocol that uses http, is built with the possibility of being http
> cache friendly, but doesn't fall victim to the problems we currently
> have with DAV.

Feel free, but I think there is ample room for improvement by moving
back towards "plain old HTTP/DAV" than to increase the customization.
As mentioned before, you could always tunnel the svn protocol, or
something close to it, over an HTTP pipe. Dunno how cacheable that
would be; I guess you could always do more protocol design work...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Serf and Subversion was Re: DAV is complicated and slow?

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On 27 Nov 2005 17:47:42 -0600, kfogel@collab.net <kf...@collab.net> wrote:

> The naming thing is a bit weird... after all, they're both DAV.  We
> just didn't call the first one libsvn_ra_neon :-).

On the other hand, it's not like we had plans for more than one DAV
implementation at the time ;-)

Personally, I think as long as it's done as a compile time option and
the library is presented to the user via the current http:// and
https:// url schemes it doesn't really matter so much what we call the
library.

-garrett

Re: Serf and Subversion was Re: DAV is complicated and slow?

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On 11/28/05, Marc Sherman <ms...@projectile.ca> wrote:
> kfogel@collab.net wrote:
> >
> > Compile-time option sounds good.
> >
> > The naming thing is a bit weird... after all, they're both DAV.  We
> > just didn't call the first one libsvn_ra_neon :-).
>
> Would it make sense to allow a build with both ra implementations, and
> select using URLs?
>
> IE: http+neon://... vs http+serf://...
>
> And then, the compile time option would just control which one was the
> default handler for plain http://...

IMO, no, it's not worth the effort.  If the user has to know the
difference between neon and serf, we have done something horribly
wrong.  The transition should be seamless from the user's perspective.

-garrett

Re: Serf and Subversion was Re: DAV is complicated and slow?

Posted by Michael Sinz <Mi...@sinz.org>.

On 11/28/05, Marc Sherman <ms...@projectile.ca> wrote:
> kfogel@collab.net wrote:
> >
> > Compile-time option sounds good.
> >
> > The naming thing is a bit weird... after all, they're both DAV.  We
> > just didn't call the first one libsvn_ra_neon :-).
>
> Would it make sense to allow a build with both ra implementations, and
> select using URLs?
>
> IE: http+neon://... vs http+serf://...
>
> And then, the compile time option would just control which one was the
> default handler for plain http://...

+1 - I like having both and picking one as the default.

--
Michael Sinz               Technology and Engineering Director/Consultant
"Starting Startups"                          mailto:Michael.Sinz@sinz.org
My place on the web                      http://www.sinz.org/Michael.Sinz

Re: Serf and Subversion

Posted by Michael Brouwer <mb...@gmail.com>.

You could always make libsvn_ra_dav a link to either of the dav
implementations at install time.  Then you could even support building
both, though only linking to the default one.  Replacing the link
would allow switching between the two implementations.
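
A rough Python sketch of that link-swapping idea, run in a scratch directory on a POSIX system. The file names are hypothetical (the real library names were still being debated), and a real install would deal with shared-library versioning that this ignores.

```python
import os
import tempfile

libdir = tempfile.mkdtemp()

# Hypothetical file names standing in for the two built implementations.
for name in ("libsvn_ra_dav_neon.so", "libsvn_ra_dav_serf.so"):
    with open(os.path.join(libdir, name), "w") as handle:
        handle.write(name)

link = os.path.join(libdir, "libsvn_ra_dav.so")


def select_backend(target):
    """Point the generic name at `target`, replacing any existing link."""
    temp_link = link + ".tmp"
    os.symlink(target, temp_link)
    os.replace(temp_link, link)  # renames the link itself over the old one


select_backend("libsvn_ra_dav_neon.so")
first = os.readlink(link)

select_backend("libsvn_ra_dav_serf.so")
second = os.readlink(link)

# Reading through the link now reaches the serf stand-in file.
with open(link) as handle:
    resolved_content = handle.read()
```

Creating the new link under a temporary name and renaming it over the old one avoids a window in which the generic name does not exist.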

Michael


On 28 Nov 2005 15:28:14 -0600, kfogel@collab.net <kf...@collab.net> wrote:
> Justin Erenkrantz <ju...@erenkrantz.com> writes:
> > > Sure -- I'm also proposing that they be kept separate.  All I meant
> > > was, reorganize libsvn_ra_dav/ to have two separate subdirs:
> > >
> > >     libsvn_ra_dav/neon/   <-- for everything currently in libsvn_ra_dav/
> > >     libsvn_ra_dav/serf/
> > >
> > > The above change sounds major, but it's pretty trivial to implement.
> > > The compile-time switch would determine which subdir gets used.
> >
> > I guess we have a gap on just how difficult this would be to implement
> > in the build system.  Are you seeing a short-cut I'm not seeing?  The
> > build system *could* make that choice at autogen.sh, but I'm not
> > seeing an easy way to do it at configure-time because gen_make walks
> > all of the dependencies at autogen-time.
>
> No, I don't see a shortcut, I was just asserting that it's trivial
> without doing any research to back it up.
>
> (Would you like fries with your brutal honesty, sir?)
>
> Improvising here, but this is roughly what I had in mind for
> build.conf:
>
>    Index: build.conf
>    ===================================================================
>    --- build.conf       (revision 17541)
>    +++ build.conf       (working copy)
>    @@ -203,14 +203,22 @@
>     install = lib
>     msvc-static = yes
>
>    -# Accessing repositories via DAV
>    -[libsvn_ra_dav]
>    +# Accessing repositories via DAV, using the Neon HTTP library
>    +[libsvn_ra_dav/neon]
>     type = ra-module
>    -path = subversion/libsvn_ra_dav
>    -install = dav-lib
>    +path = subversion/libsvn_ra_dav/neon
>    +install = dav-neon-lib
>     libs = libsvn_delta libsvn_subr aprutil apriconv apr neon
>     msvc-static = yes
>
>    +# Accessing repositories via DAV, using the Serf HTTP library
>    +[libsvn_ra_dav/serf]
>    +type = ra-module
>    +path = subversion/libsvn_ra_dav/serf
>    +install = dav-serf-lib
>    +libs = libsvn_delta libsvn_subr aprutil apriconv apr serf
>    +msvc-static = yes
>    +
>     # Accessing repositories via SVN
>     [libsvn_ra_svn]
>     type = ra-module
>
> Corresponding changes would have to be made for configure.in, which
> currently refers to "dav-lib".  Am I on a crazy track here, or could
> this work?
>
> Basically the idea is: if the user requests DAV support, we have two
> ways of satisfying it, either via neon or via serf.  If the user
> requests DAV support generically, we decide which is the default.  If
> the user requests exactly one of those two libs, we know they want DAV
> support *and* we know how.  If the user requests both libs, we error
> at configure time -- because we don't support loading them
> simultaneously.
>
> I'm not too up on all our --with-foo flags, so rather than say
> something stupid, I've expressed the idea in general terms above.  I
> guess I'm asking you to see if you can make things work this way; if you
> run into a showstopper obstacle, then okay, it can't be done.  But if
> it's just a matter of some configury and link rearrangement...
>
> If you want, start out by implementing a totally skeleton Serf-based
> DAV implementation, one that just implements the svn_ra API but errors
> on every call.  I'll help try to fit that into my scheme above; if I
> can't do it, I certainly won't complain that you couldn't.
>
> > I think we're in agreement that neon and serf can't *both* be loaded
> > at the same time because they both want control over the http/https
> > scheme.  I don't think an 'agent' system like what ra_svn does makes
> > sense here.
>
> Yes, agreed.
>
> > Plus, if we have a library that has the same name but could have two
> > completely independent implementations, I'm very concerned that it
> > would present difficulties on a user-support issue.
>
> Sure.  We should change *both* their names, to dav-neon and dav-serf
> (adjust appropriately for underscore- and prefix-preferring contexts).
>
> > Like we have libsvn_fs_fs and libsvn_fs_base, I could see
> > libsvn_ra_dav_neon and libsvn_ra_dav_serf.  But, I'd be wary of having
> > the library name not clearly indicate what client library we'd use.
>
> I agree the library name should clearly indicate which library it is.
>
> > I've also been trying to avoid forcing a rename in libsvn_ra_dav as I
> > haven't thought through the consequences on our compatibility rules if
> > we remove that library. -- justin
>
> Let's think those through, then...
>
> Hmmm, did we have this problem when it was libsvn_fs?  No, because
> libsvn_fs became a forwarding library, that (as it happens) can
> actually support both of its forwardees simultaneously.  We could do a
> similar thing, but just refuse to have both forwardees available at
> the same time.  That is, at run time, it looks like a pointless level
> of indirection; but at development/compile time, the indirection
> switches between two real choices.
>
> -Karl
>
> --
> www.collab.net  <>  CollabNet  |  Distributed Development On Demand
>

Re: Serf and Subversion

Posted by Greg Hudson <gh...@MIT.EDU>.

On Mon, 2005-11-28 at 12:23 -0800, Justin Erenkrantz wrote:
> If we do the work in trunk, my expectation would be that 1.4 would still 
> have neon by default with serf a configure-time option, but consider 
> switching for 1.5 if all goes well and it does have the advantages I think 
> it will.  This follows what we did with fsfs and BDB.  -- justin

FSFS was a runtime option in 1.1 and 1.2, and it was a really easy one
to use--just specify it at repository creation time and you're set.

A compile-time option is fine for development on the trunk, but it's not
really appropriate for getting field experience which we could use to
consider switching the default.  I think a use-serf option
in .subversion/config (or servers? not sure) is probably the only
realistic choice for that.
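
As a sketch of what such a runtime switch might look like from the client's side, here is a small Python example that reads an INI-style file. Only the option name use-serf comes from this message; the [global] section, the accepted values, and the pick_http_library helper are assumptions for illustration, not an existing Subversion setting.

```python
import configparser

# Hypothetical file contents: the section the option would live in
# is an assumption, as is the "yes" spelling of the value.
SAMPLE = """
[global]
use-serf = yes
"""


def pick_http_library(config_text):
    """Return "serf" if the hypothetical use-serf option is on, else "neon"."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    use_serf = parser.getboolean("global", "use-serf", fallback=False)
    return "serf" if use_serf else "neon"


chosen = pick_http_library(SAMPLE)   # option present and enabled
default = pick_http_library("")      # option absent: fall back to neon
```

The point of the fallback is that users who never touch the option keep the existing neon behavior, while testers can opt in without rebuilding.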

I'm not saying that we have to collect field experience before switching
the default--substantial in-house testing could probably justify that as
well.  But we wouldn't be following what we did with FSFS.

Re: Serf and Subversion

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On November 28, 2005 3:14:18 PM -0500 David James <ja...@cs.toronto.edu> 
wrote:

> If the new Serf-friendly protocols are developed on trunk, then they
> will be shipped with Subversion 1.4. Will we need to worry about
> binary compatibility for a library and protocol that is still in the
> early stages of development?

We haven't even released 1.3 yet, our past experience is that we release
a new minor roughly every 6 months, and the serf work should be completed
by April.  So, I doubt that'll be a problem unless we rush out
1.4 sooner than that - which I haven't seen any discussion of yet.

If we do the work in trunk, my expectation would be that 1.4 would still 
have neon by default with serf a configure-time option, but consider 
switching for 1.5 if all goes well and it does have the advantages I think 
it will.  This follows what we did with fsfs and BDB.  -- justin

Re: Serf and Subversion

Posted by David James <ja...@cs.toronto.edu>.

On 11/28/05, Greg Stein <gs...@lyra.org> wrote:
> On Sun, Nov 27, 2005 at 10:34:42PM -0600, Peter Samuelson wrote:
> >
> > [Justin Erenkrantz]
> > > The RA layer is exactly intended to allow for this type of
> > > flexibility - so I think we'd be working 'against' the RA layer if we
> > > tried to be too cute.  By and large, my opinion is the ra name means
> > > very little.  I don't think users would really care if we add an
> > > ra_serf as long as it handles http and https.
> >
> > Indeed.  You know, in the experimental stages, it wouldn't hurt to
> > register protocols serf:// and serfs:// instead of http:// and
> > https://, and compile both libraries.  Later you could grow the
> > configure test, and later still, if libsvn_ra_serf ever matures to the
> > point of being able to replace libsvn_ra_dav unconditionally, it
> > wouldn't be hard at all to delete the one and rename the other.
>
> Interesting thought. It would certainly be a bit handier from a
> testing standpoint to have an svn that is still functional for your
> "normal" work.
>
> Assuming that it will end up being built as ra_serf, does it need to
> go into a branch? We use branches to avoid disrupting developers'
> normal work, but given that an explicit switch to ./configure would be
> needed to see this (and especially if a (debug) scheme were used),
> then it seems it would be easier to keep the dev on trunk. [...]

If the new Serf-friendly protocols are developed on trunk, then they
will be shipped with Subversion 1.4. Will we need to worry about
binary compatibility for a library and protocol that is still in the
early stages of development?

Perhaps we should document explicitly that we do not support backwards
compatibility on a binary- or protocol-level for Subversion binaries
compiled using the new "--enable-serf" option?

Cheers,

David

--
David James -- http://www.cs.toronto.edu/~james

Re: Serf and Subversion

Posted by Greg Hudson <gh...@MIT.EDU>.

On Mon, 2005-11-28 at 03:28 -0800, Greg Stein wrote:
> Assuming that it will end up being built as ra_serf, does it need to
> go into a branch? We use branches to avoid disrupting developers'
> normal work, but given that an explicit switch to ./configure would be
> needed to see this (and especially if a (debug) scheme were used),
> then it seems it would be easier to keep the dev on trunk.

I agree.

I will be a bit sad if the default build ever produces a binary which
understands a made-up URL scheme like serf://, but I think it's fine to
have this stuff on the trunk.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Serf and Subversion

Posted by kf...@collab.net.

Justin Erenkrantz <ju...@erenkrantz.com> writes:
> > Sure -- I'm also proposing that they be kept separate.  All I meant
> > was, reorganize libsvn_ra_dav/ to have two separate subdirs:
> >
> >     libsvn_ra_dav/neon/   <-- for everything currently in libsvn_ra_dav/
> >     libsvn_ra_dav/serf/
> >
> > The above change sounds major, but it's pretty trivial to implement.
> > The compile-time switch would determine which subdir gets used.
> 
> I guess we have a gap on just how difficult this would be to implement
> in the build system.  Are you seeing a short-cut I'm not seeing?  The
> build system *could* make that choice at autogen.sh, but I'm not
> seeing an easy way to do it at configure-time because gen_make walks
> all of the dependencies at autogen-time.

No, I don't see a shortcut, I was just asserting that it's trivial
without doing any research to back it up.

(Would you like fries with your brutal honesty, sir?)

Improvising here, but this is roughly what I had in mind for
build.conf:

   Index: build.conf
   ===================================================================
   --- build.conf	(revision 17541)
   +++ build.conf	(working copy)
   @@ -203,14 +203,22 @@
    install = lib
    msvc-static = yes

   -# Accessing repositories via DAV
   -[libsvn_ra_dav]
   +# Accessing repositories via DAV, using the Neon HTTP library
   +[libsvn_ra_dav/neon]
    type = ra-module
   -path = subversion/libsvn_ra_dav
   -install = dav-lib
   +path = subversion/libsvn_ra_dav/neon
   +install = dav-neon-lib
    libs = libsvn_delta libsvn_subr aprutil apriconv apr neon
    msvc-static = yes

   +# Accessing repositories via DAV, using the Serf HTTP library
   +[libsvn_ra_dav/serf]
   +type = ra-module
   +path = subversion/libsvn_ra_dav/serf
   +install = dav-serf-lib
   +libs = libsvn_delta libsvn_subr aprutil apriconv apr serf
   +msvc-static = yes
   +
    # Accessing repositories via SVN
    [libsvn_ra_svn]
    type = ra-module

Corresponding changes would have to be made for configure.in, which
currently refers to "dav-lib".  Am I on a crazy track here, or could
this work?

Basically the idea is: if the user requests DAV support, we have two
ways of satisfying it, either via neon or via serf.  If the user
requests DAV support generically, we decide which is the default.  If
the user requests exactly one of those two libs, we know they want DAV
support *and* we know how.  If the user requests both libs, we error
at configure time -- because we don't support loading them
simultaneously.
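
The selection rule above can be sketched as a small function. This is
only an illustration in Python: the function name, its parameters, and
the choice of "neon" as the default are assumptions, and none of them
correspond to actual configure flags.

```python
def choose_dav_backend(want_dav=False, want_neon=False, want_serf=False,
                       default="neon"):
    """Return the DAV backend to build, or None if DAV was not requested.

    All argument names are hypothetical stand-ins for configure switches.
    """
    if want_neon and want_serf:
        # Both libraries claim http/https, so loading both is unsupported.
        raise ValueError("neon and serf cannot both be selected")
    if want_neon:
        return "neon"   # asking for a library implies asking for DAV
    if want_serf:
        return "serf"
    if want_dav:
        return default  # generic request: we pick the default
    return None
```

Asking for exactly one library implies DAV support, so the generic
switch only matters when neither library is named.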

I'm not too up on all our --with-foo flags, so rather than say
something stupid, I've expressed the idea in general terms above.  I
guess I'm asking you to see if you can make things work this way; if you
run into a showstopper obstacle, then okay, it can't be done.  But if
it's just a matter of some configury and link rearrangement...

If you want, start out by implementing a totally skeleton Serf-based
DAV implementation, one that just implements the svn_ra API but errors
on every call.  I'll help try to fit that into my scheme above; if I
can't do it, I certainly won't complain that you couldn't.

> I think we're in agreement that neon and serf can't *both* be loaded
> at the same time because they both want control over the http/https
> scheme.  I don't think an 'agent' system like what ra_svn does makes
> sense here.

Yes, agreed.

> Plus, if we have a library that has the same name but could have two
> completely independent implementations, I'm very concerned that it
> would present difficulties on a user-support issue.

Sure.  We should change *both* their names, to dav-neon and dav-serf
(adjust appropriately for underscore- and prefix-preferring contexts).

> Like we have libsvn_fs_fs and libsvn_fs_base, I could see
> libsvn_ra_dav_neon and libsvn_ra_dav_serf.  But, I'd be wary of having
> the library name not clearly indicate what client library we'd use.

I agree the library name should clearly indicate which library it is.

> I've also been trying to avoid forcing a rename in libsvn_ra_dav as I
> haven't thought through the consequences on our compatibility rules if
> we remove that library. -- justin

Let's think those through, then...

Hmmm, did we have this problem when it was libsvn_fs?  No, because
libsvn_fs became a forwarding library, that (as it happens) can
actually support both of its forwardees simultaneously.  We could do a
similar thing, but just refuse to have both forwardees available at
the same time.  That is, at run time, it looks like a pointless level
of indirection; but at development/compile time, the indirection
switches between two real choices.
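
A rough sketch of that forwarding idea, in Python rather than C. Every
class and variable name here is hypothetical; the dictionary stands in
for whatever the compile-time switch actually built, and the real
libsvn_fs loader works differently.

```python
class NeonBackend:
    name = "dav-neon"

    def open(self, url):
        return f"neon session for {url}"


class SerfBackend:
    name = "dav-serf"

    def open(self, url):
        return f"serf session for {url}"


# Stand-in for the compile-time switch: exactly one backend is "built".
BUILT_BACKENDS = {"neon": NeonBackend}


def _select_backend(built):
    # Refuse to run with zero or two forwardees available.
    if len(built) != 1:
        raise RuntimeError("exactly one DAV backend must be available")
    (backend_cls,) = built.values()
    return backend_cls()


class RaDav:
    """Forwarding layer: callers see one library, one backend does the work."""

    def __init__(self, built=BUILT_BACKENDS):
        self._impl = _select_backend(built)

    def open(self, url):
        return self._impl.open(url)
```

At run time the wrapper only ever has one target, which is the
"pointless level of indirection" mentioned above; the choice is made
when the set of built backends is decided.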

-Karl

-- 
www.collab.net  <>  CollabNet  |  Distributed Development On Demand


Re: Serf and Subversion

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On November 28, 2005 9:54:44 AM -0600 kfogel@collab.net wrote:

> Sure -- I'm also proposing that they be kept separate.  All I meant
> was, reorganize libsvn_ra_dav/ to have two separate subdirs:
>
>     libsvn_ra_dav/neon/   <-- for everything currently in libsvn_ra_dav/
>     libsvn_ra_dav/serf/
>
> The above change sounds major, but it's pretty trivial to implement.
> The compile-time switch would determine which subdir gets used.

I guess we have a gap on just how difficult this would be to implement in 
the build system.  Are you seeing a short-cut I'm not seeing?  The build 
system *could* make that choice at autogen.sh, but I'm not seeing an easy 
way to do it at configure-time because gen_make walks all of the 
dependencies at autogen-time.

I think we're in agreement that neon and serf can't *both* be loaded at the 
same time because they both want control over the http/https scheme.  I 
don't think an 'agent' system like what ra_svn does makes sense here.

Plus, if we have a library that has the same name but could have two 
completely independent implementations, I'm very concerned that it would 
present difficulties on a user-support issue.

> By putting them both under libsvn_ra_dav/, we'd make the layout
> reflect reality, namely, that we have two different implementations of
> the DAV protocol.

Like we have libsvn_fs_fs and libsvn_fs_base, I could see 
libsvn_ra_dav_neon and libsvn_ra_dav_serf.  But, I'd be wary of having the 
library name not clearly indicate what client library we'd use.  I've also 
been trying to avoid forcing a rename in libsvn_ra_dav as I haven't thought 
through the consequences on our compatibility rules if we remove that
library. -- justin


Re: Serf and Subversion

Posted by kf...@collab.net.

Greg Stein <gs...@lyra.org> writes:
> Assuming that it will end up being built as ra_serf, does it need to
> go into a branch? We use branches to avoid disrupting developers'
> normal work, but given that an explicit switch to ./configure would be
> needed to see this (and especially if a (debug) scheme were used),
> then it seems it would be easier to keep the dev on trunk. I think the
> one decision point *against* using trunk would be if people want to
> see the work [on a branch], and evaluate it before allowing it to
> trunk. And that sort of falls back to, "does anyone think this is a
> bad idea/approach and shouldn't be done?"

+1 on trunk.

In reply to Justin:

> > Is there any way to put both serf and neon implementations separately
> > inside libsvn_ra_dav/?  I'm not proposing that we mix the code, just
> > that we keep the names reflective of the actual situation.
> 
> I don't think so.  Other than having #ifdef's or introducing a
> libsvn_ra_dav meta-API that both serf and neon can hook in to that
> mimics the RA layer for all intents and purposes, I don't see how it'd
> work.  Serf and neon have a similar purpose, but view the world very
> differently - therefore, I think it's best to keep the code as separate
> as possible.

Sure -- I'm also proposing that they be kept separate.  All I meant
was, reorganize libsvn_ra_dav/ to have two separate subdirs:

    libsvn_ra_dav/neon/   <-- for everything currently in libsvn_ra_dav/
    libsvn_ra_dav/serf/

The above change sounds major, but it's pretty trivial to implement.
The compile-time switch would determine which subdir gets used.

I seem to be the only person who thinks this is important, but believe
me, to a new developer, a code tree layout that makes sense is a great
blessing :-).  If I were approaching Subversion for the first time,
and saw libsvn_ra_dav/ (which ironically would implement a *less*
DAV-like DAV, using Neon) and its sibling libsvn_ra_serf/ (which would
implement a *more* DAV-like DAV, using Serf), I'd be pretty confused.

By putting them both under libsvn_ra_dav/, we'd make the layout
reflect reality, namely, that we have two different implementations of
the DAV protocol.

-Karl

-- 
www.collab.net  <>  CollabNet  |  Distributed Development On Demand


Re: Serf and Subversion

Posted by Greg Stein <gs...@lyra.org>.

On Sun, Nov 27, 2005 at 10:34:42PM -0600, Peter Samuelson wrote:
> 
> [Justin Erenkrantz]
> > The RA layer is exactly intended to allow for this type of
> > flexibility - so I think we'd be working 'against' the RA layer if we
> > tried to be too cute.  By and large, my opinion is the ra name means
> > very little.  I don't think users would really care if we add an
> > ra_serf as long as it handles http and https.
> 
> Indeed.  You know, in the experimental stages, it wouldn't hurt to
> register protocols serf:// and serfs:// instead of http:// and
> https://, and compile both libraries.  Later you could grow the
> configure test, and later still, if libsvn_ra_serf ever matures to the
> point of being able to replace libsvn_ra_dav unconditionally, it
> wouldn't be hard at all to delete the one and rename the other.

Interesting thought. It would certainly be a bit handier from a
testing standpoint to have an svn that is still functional for your
"normal" work.

Assuming that it will end up being built as ra_serf, does it need to
go into a branch? We use branches to avoid disrupting developers'
normal work, but given that an explicit switch to ./configure would be
needed to see this (and especially if a (debug) scheme were used),
then it seems it would be easier to keep the dev on trunk. I think the
one decision point *against* using trunk would be if people want to
see the work [on a branch], and evaluate it before allowing it to
trunk. And that sort of falls back to, "does anyone think this is a
bad idea/approach and shouldn't be done?"

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


Re: Serf and Subversion was Re: DAV is complicated and slow?

Posted by Peter Samuelson <pe...@p12n.org>.

[Justin Erenkrantz]
> The RA layer is exactly intended to allow for this type of
> flexibility - so I think we'd be working 'against' the RA layer if we
> tried to be too cute.  By and large, my opinion is the ra name means
> very little.  I don't think users would really care if we add an
> ra_serf as long as it handles http and https.

Indeed.  You know, in the experimental stages, it wouldn't hurt to
register protocols serf:// and serfs:// instead of http:// and
https://, and compile both libraries.  Later you could grow the
configure test, and later still, if libsvn_ra_serf ever matures to the
point of being able to replace libsvn_ra_dav unconditionally, it
wouldn't be hard at all to delete the one and rename the other.
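
To make the scheme idea concrete, here is a minimal Python sketch of a
scheme-to-module lookup in which the experimental library claims
serf:// and serfs:// while http:// and https:// stay with ra_dav. The
table and the function name are assumptions for illustration only, not
the real RA loader.

```python
from urllib.parse import urlsplit

# Hypothetical registry: URL scheme -> RA module that handles it.
RA_SCHEMES = {
    "http": "ra_dav",
    "https": "ra_dav",
    "serf": "ra_serf",    # experimental schemes, as suggested above
    "serfs": "ra_serf",
    "svn": "ra_svn",
    "file": "ra_local",
}


def ra_module_for(url):
    """Return the name of the RA module registered for this URL's scheme."""
    scheme = urlsplit(url).scheme
    try:
        return RA_SCHEMES[scheme]
    except KeyError:
        raise ValueError(f"no RA module handles scheme {scheme!r}")
```

Promoting ra_serf later would then amount to pointing the "http" and
"https" entries at it and dropping the two experimental schemes.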

Re: Serf and Subversion was Re: DAV is complicated and slow?

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

On Sun, Nov 27, 2005 at 05:47:42PM -0600, kfogel@collab.net wrote:
> Compile-time option sounds good.
> 
> The naming thing is a bit weird... after all, they're both DAV.  We
> just didn't call the first one libsvn_ra_neon :-).
> 
> Is there any way to put both serf and neon implementations separately
> inside libsvn_ra_dav/?  I'm not proposing that we mix the code, just
> that we keep the names reflective of the actual situation.

I don't think so.  Other than having #ifdef's or introducing a
libsvn_ra_dav meta-API that both serf and neon can hook in to that
mimics the RA layer for all intents and purposes, I don't see how it'd
work.  Serf and neon have a similar purpose, but view the world very
differently - therefore, I think it's best to keep the code as separate
as possible.

The RA layer is exactly intended to allow for this type of flexibility -
so I think we'd be working 'against' the RA layer if we tried to be too
cute.  By and large, my opinion is the ra name means very little.  I
don't think users would really care if we add an ra_serf as long as it
handles http and https.  -- justin


Re: DAV is complicated and slow?

Posted by Greg Stein <gs...@lyra.org>.

On Thu, Nov 24, 2005 at 01:35:29PM +0100, Peter N. Lundblad wrote:
> On Thu, 24 Nov 2005, Greg Stein wrote:
>...
> > Hopefully, the wc improvements branch will fix perf related to the
> > wcprops. Have to see.
>
> If you're talking about the wc-propcaching branch, making changes to
> wcprops handling isn't on the plans for that branch.  I know too little
> about these beasts to know how much that would improve performance.

Well, you could get some perf by writing 1 file per directory
(containing all the files' wcprops) rather than N files. Note that the
wcprop files are usually quite small: they typically contain just one
key/value pair.

You can't really use the "only write a file if props are present"
because wcprops will always be present for every file when ra_dav is
being used.
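
A minimal sketch of the one-file-per-directory idea, in Python. The
file name and the JSON encoding are assumptions made for illustration
and are not the working copy's actual on-disk format.

```python
import json
from pathlib import Path

# Hypothetical name for the single per-directory wcprops file.
WCPROPS_FILE = "all-wcprops.json"


def write_dir_wcprops(directory, props_by_file):
    """Store the wcprops of every file in a directory in ONE file.

    props_by_file maps a file name to its dict of wcprops.
    """
    target = Path(directory) / WCPROPS_FILE
    target.write_text(json.dumps(props_by_file, sort_keys=True))
    return target


def read_wcprops(directory, filename):
    """Return the wcprops for one file, or an empty dict if it has none."""
    data = json.loads((Path(directory) / WCPROPS_FILE).read_text())
    return data.get(filename, {})
```

With N files in a directory this costs one write instead of N, at the
price of rewriting the whole file whenever any single entry changes.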

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


Re: DAV is complicated and slow?

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.

On Thu, 24 Nov 2005, Greg Stein wrote:

> On Wed, Nov 23, 2005 at 09:36:12AM -0500, Greg Hudson wrote:
> > > And besides, serf already does pipelining (and deflate/gzip and basic
> > > SSL). There are a ton of "friendly" bits that it is lacking, but the
> > > core is there. IMO, it is much more feasible to complete that thing
> > > and hook it into svn, than it is to write a new mirror system.
> >
> > We already have the performance benefit of HTTP pipelining (at the
> > expense of giving up on generic HTTP caching), and ra_dav is still much
> > slower than ra_svn.  I believe this mostly comes from wc-props and other
> > impedance mismatches between svn and DAV.  Perhaps it's possible to fix
> > all of the resulting performance problems in other ways, but for years
> > now, no one who holds that theory has been writing much Subversion code.
>
> Hopefully, the wc improvements branch will fix perf related to the
> wcprops. Have to see.
>
If you're talking about the wc-propcaching branch, making changes to
wcprops handling isn't on the plans for that branch.  I know too little
about these beasts to know how much that would improve performance.

Thanks,
//Peter


Re: Serf and Subversion was Re: DAV is complicated and slow?

Posted by Paul Querna <ch...@force-elite.com>.

Marc Sherman wrote:
> kfogel@collab.net wrote:
>> Compile-time option sounds good.
>>
>> The naming thing is a bit weird... after all, they're both DAV.  We
>> just didn't call the first one libsvn_ra_neon :-).
> 
> Would it make sense to allow a build with both ra implementations, and
> select using URLs?
> 
> I.e.: http+neon://... vs http+serf://...
> 
> And then, the compile time option would just control which one was the
> default handler for plain http://...

That is insane.

This is a user-level detail, and should not be exposed in Working Copies.

In fact, this would make a working copy checked out with http+neon://
unable to be used by someone who only has serf, without using svn
switch, which seems completely bogus.  Yes, you could do some voodoo to
detect this case, but exposing the underlying library used to FETCH a
URL is crazy, and should not be done.

IMO, Make it a compile time choice.

-Paul


Re: Serf and Subversion was Re: DAV is complicated and slow?

Posted by Marc Sherman <ms...@projectile.ca>.

kfogel@collab.net wrote:
> 
> Compile-time option sounds good.
> 
> The naming thing is a bit weird... after all, they're both DAV.  We
> just didn't call the first one libsvn_ra_neon :-).

Would it make sense to allow a build with both ra implementations, and
select using URLs?

I.e.: http+neon://... vs http+serf://...

And then, the compile time option would just control which one was the
default handler for plain http://...

- Marc


Re: Serf and Subversion was Re: DAV is complicated and slow?

Posted by kf...@collab.net.

Justin Erenkrantz <ju...@erenkrantz.com> writes:
> *raises hand*
> 
> As some of you may already know, Google has hired me to work on
> Subversion with my main goal of integrating serf with Subversion.
> 
> Google and I (since, well, I accepted!) both view this as an excellent
> opportunity to finally realize the ideas Greg and I have had for a
> long time about Subversion and serf.  We can finally prove that DAV on
> the client-side doesn't *have* to be complicated and slow.  We do hope
> that we'll be able to improve Subversion client's WebDAV performance
> substantially.

Hear, hear!

Great news, Justin.

> Due to obligations that I need to complete before I head to Google, I
> haven't sat down in detail and sketched out how it will all work out
> yet.  I do know that I intend to create a new ra_serf layer (for lack
> of a better name) that will aim to replace ra_dav with Neon.  The
> code-bases of neon and serf are too different to share the same ra
> layer on the client-side.  (We'd likely make ra_serf/ra_dav a
> configure-time option - you'll get one or the other.)  I also have
> some ideas for some other nifty features that we'll be able to add
> with serf that we can't do with neon as well.

Compile-time option sounds good.

The naming thing is a bit weird... after all, they're both DAV.  We
just didn't call the first one libsvn_ra_neon :-).

Is there any way to put both serf and neon implementations separately
inside libsvn_ra_dav/?  I'm not proposing that we mix the code, just
that we keep the names reflective of the actual situation.

> If you have any general questions, please feel free to ask.  If you
> want details about ra_serf, well, I don't know that just yet - that
> will have to wait until I have the time to answer it in the detail it
> merits.  =)  -- justin

Looking forward to January, and congratulations!

-Karl

-- 
www.collab.net  <>  CollabNet  |  Distributed Development On Demand


Serf and Subversion was Re: DAV is complicated and slow?

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.

--On November 24, 2005 3:05:07 AM -0800 Greg Stein <gs...@lyra.org> wrote:

> In terms of writing code, while I haven't had time personally, I can
> make it possible for others :-)  I have an intern starting in January
> to work on serf and svn integration. I'll make sure he knows that he
> can talk about it whenever he's up for it.

*raises hand*

As some of you may already know, Google has hired me to work on Subversion 
with my main goal of integrating serf with Subversion.

Google and I (since, well, I accepted!) both view this as an excellent 
opportunity to finally realize the ideas Greg and I have had for a long time 
about Subversion and serf.  We can finally prove that DAV on the client-side 
doesn't *have* to be complicated and slow.  We do hope that we'll be able to 
improve Subversion client's WebDAV performance substantially.

For those that aren't aware, Serf is an HTTP client library that is loosely 
modeled after the httpd architecture.  Serf's code base lives at:

<http://svn.webdav.org/repos/projects/serf/>

Serf is an asynchronous library that does pipelining, deflate, SSL, chunking, 
etc. today built on top of APR.  As Greg mentioned in a later post, there is 
some 'friendly' stuff that neon does that serf doesn't do yet - so part of the 
enhancements to serf will be to improve on the accouterments that serf offers.

We believe that serf can offer a better HTTP client platform for us than neon 
does because without pipelining, we're just too slow.  As I said in my earlier 
posts, using custom methods also kills any potential for placing a cache in the 
mix.  Therefore, I have a real strong dislike for custom REPORT methods and 
want to return to a GETy model on the client-side.  (REPORT goes against 
everything that HTTP/REST is about!)
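
As a back-of-the-envelope illustration of why pipelining matters, here
is a deliberately simplified latency model in Python. The function
names and the example numbers are assumptions: one connection, a fixed
round-trip time, a fixed per-request server cost, and no bandwidth
limits. It is not a measurement of neon or serf.

```python
def sequential_time(n_requests, rtt, service_time):
    """Each request waits for the previous response: pay the RTT every time."""
    return n_requests * (rtt + service_time)


def pipelined_time(n_requests, rtt, service_time):
    """Requests are sent back to back: pay the RTT roughly once."""
    return rtt + n_requests * service_time


# Example with made-up numbers: 1000 small GETs, 50 ms RTT, 1 ms per request.
if __name__ == "__main__":
    print(sequential_time(1000, 0.050, 0.001))
    print(pipelined_time(1000, 0.050, 0.001))
```

Under these assumptions the sequential total is dominated by the
round-trip time, while the pipelined total is dominated by the work the
server actually does.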

Due to obligations that I need to complete before I head to Google, I haven't 
sat down in detail and sketched out how it will all work out yet.  I do know 
that I intend to create a new ra_serf layer (for lack of a better name) that 
will aim to replace ra_dav with Neon.  The code-bases of neon and serf are too 
different to share the same ra layer on the client-side.  (We'd likely make 
ra_serf/ra_dav a configure-time option - you'll get one or the other.)  I also 
have some ideas for some other nifty features that we'll be able to add with 
serf that we can't do with neon as well.

With Google's blessing, I plan to do all of the Subversion integration work 
here on dev@svn and have the code reside in a branch until we're ready to 
discuss merging it to trunk.  Of course, that'll mean that I will hammer 
out a design proposal before we create the branch here.  In DannyB fashion, I 
may just throw up a first-cut implementation here to guide the discussion.  =)

My work will be primarily focused on the integration of Subversion and serf, 
but there may be things along the way that I'll also touch on.  Since I'm also 
knowledgeable about httpd and caching, I may also make improvements to 
mod_dav_svn as necessary.  But, that's not my real focus.

I'm excited about this opportunity that Google is giving me to put my time 
where my mouth is.  So, while embarking on a rewrite of ra_dav to go to a 
strict REPORT model is cute, I'm going to have the cycles to put a big 
bulls-eye on that model and do my best to shoot it down.  =)

To give an idea about the estimated timeline, I'll be starting at Google in 
mid-to-late January and will be there until April.  I'll keep an eye on 
dev@svn until then, but I'll be focused on other things until I officially 
start.  For those that care and understand the ins-and-outs of academia, I'm 
advancing to candidacy for my PhD in mid-January.  Needless to say, I'm 
*really* swamped until I pass that hurdle.  (I should be working on my survey 
paper right now!)  The nice thing is that I'll be able to fully dedicate 
myself to this work during my three months at Google.

If you have any general questions, please feel free to ask.  If you want 
details about ra_serf, well, I don't know that just yet - that will have to 
wait until I have the time to answer it in the detail it merits.  =)  -- justin


Re: DAV is complicated and slow?

Posted by Greg Stein <gs...@lyra.org>.

On Wed, Nov 23, 2005 at 09:36:12AM -0500, Greg Hudson wrote:
> On Wed, 2005-11-23 at 02:01 -0800, Greg Stein wrote:
>...
> > The complexity of the software increases, yes, but the user story is
> > greatly simplified. Many places are already running caches of some
> > form. Deploying squid is no big deal, and many people do that. HTTP
> > caching proxies are well known items, and people have a great choice
> > in how to install and configure those.
> 
> I dunno.  If I'm, say, gnome.org, and my servers can't handle the
> Subversion traffic, I think I'm more likely to want to set up a bunch of
> mirrors than to point people at a bunch of Squid caches.

You don't have to tell people *anything* different. You switch the
front end server into a reverse-proxy. Same IP and all, but it can
serve from its own cache, or relay the requests to N backend servers
with their own caches. And if a request isn't satisfiable, then the
server can just invoke the appropriate svn functionality.

This frontend that relays out to N backend servers can be Apache or
other reverse proxies, or even hardware such as a NetScaler.

When I was at CollabNet, we did a bunch of scalability testing and
found that the servers were CPU bound computing diffs. It kind of
sucked at the time because BDB was the only option, and with BDB it was
effectively impossible to have N servers against a single (NFS)
backing storage system. With FSFS, it is rather straight-forward to
have a farm of frontend Apache servers grunting thru diffs/deltas, all
talking to an FSFS repo on a networked storage device (e.g. a NetApp).

>...
> > I'd much rather improve the client than to develop yet another server,
> > with its own host issues related to networking code, security,
> > documentation, logging, monitoring, and performance tweaking.
> 
> Are you imagining the mirror system would be yet another network server?
> Much more likely, it would be one of our existing network servers
> pointed at a mirror maintained by a cron job.

Yup. I assumed a transparent proxy/cache thingy. Try to serve from
cache, or to relay the request back to the "real" server if it can't
be satisfied locally.

You could have a mirror like thing, but you'd still want to relay
write requests.
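
The serve-from-cache-or-relay behaviour can be sketched in a few lines
of Python. The class name and the backend callable are hypothetical,
and the sketch leaves out everything a real HTTP cache has to do
(validation, expiry, Vary handling and so on).

```python
class CachingFrontend:
    """Toy front end: answer GETs from a local cache, relay everything else."""

    def __init__(self, backend):
        # backend is any callable(method, url) -> response body,
        # standing in for the "real" server.
        self._backend = backend
        self._cache = {}

    def handle(self, method, url):
        if method != "GET":
            # Writes (and uncacheable methods such as PROPFIND or REPORT)
            # always go back to the real server.
            return self._backend(method, url)
        if url not in self._cache:
            # Cache miss: relay once, then remember the answer.
            self._cache[url] = self._backend(method, url)
        return self._cache[url]
```

The first client to fetch a URL pays for the trip to the real server;
later GETs for the same URL are answered locally.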

> >  When I
> > look back at svnserve, the original idea was "small and light", but I
> > note that it has grown a ton of functionality since then.
> 
> It has?  It got path-based acls, but that was just moving logic from
> mod_authz_svn (where it never should have been) into libsvn_repos, and
> adding a few calls.

Logging? Threads? Doesn't it have a config file now?

> (I admit to some frustration at the amount of crud involved in having a
> network server which satisfies everyone's needs, which svnserve
> currently does not.  I believe the answer is to encourage the creation
> of better support libraries.  I do not believe the answer is to
> implement everything inside Apache httpd.)

Right. Shove people at httpd rather than building all that into
svnserve itself.

> > And besides, serf already does pipelining (and deflate/gzip and basic
> > SSL). There are a ton of "friendly" bits that it is lacking, but the
> > core is there. IMO, it is much more feasible to complete that thing
> > and hook it into svn, than it is to write a new mirror system.
> 
> We already have the performance benefit of HTTP pipelining (at the
> expense of giving up on generic HTTP caching), and ra_dav is still much
> slower than ra_svn.  I believe this mostly comes from wc-props and other
> impedance mismatches between svn and DAV.  Perhaps it's possible to fix
> all of the resulting performance problems in other ways, but for years
> now, no one who holds that theory has been writing much Subversion code.

Hopefully, the wc improvements branch will fix perf related to the
wcprops. Have to see.

In terms of writing code, while I haven't had time personally, I can
make it possible for others :-)  I have an intern starting in January
to work on serf and svn integration. I'll make sure he knows that he
can talk about it whenever he's up for it.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


Re: DAV is complicated and slow?

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On 23 Nov 2005 09:21:12 -0600, kfogel@collab.net <kf...@collab.net> wrote:
> Greg Hudson <gh...@MIT.EDU> writes:
> > We already have the performance benefit of HTTP pipelining (at the
> > expense of giving up on generic HTTP caching), and ra_dav is still much
> > slower than ra_svn.
>
> We don't do HTTP pipelining in ra_dav/mod_dav_svn right now (Neon
> doesn't handle it); we just use custom reports.

I believe what ghudson means is that we've interleaved the diffs
and/or file contents with the rest of the data we're getting back,
just as we would if we had gone with the original scheme + http
pipelining.  So we've already got the kind of result we would have had
if neon had pipelining, and it's still slow.

Of course, there are certainly other reasons that ra_dav is slow (let's
make a million PROPFIND requests before we do anything!  ok!  how
about we base64 encode all binary data so it's bigger than it has to
be!  great!  etc.) so perhaps it's worth moving towards a custom
protocol that uses http, is built with the possibility of being http
cache friendly, but doesn't fall victim to the problems we currently
have with DAV.
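
For a sense of scale on the base64 point: the encoding emits 4 output
bytes for every 3 input bytes, so payloads grow by about a third. A
quick Python check (the helper name is just for illustration):

```python
import base64


def base64_size(n_bytes):
    """Return the length of the base64 encoding of n_bytes of binary data."""
    return len(base64.b64encode(bytes(n_bytes)))


if __name__ == "__main__":
    raw = 3000
    encoded = base64_size(raw)
    print(raw, encoded, encoded / raw)
```

This counts only the encoding itself; any XML wrapping around the
encoded data adds further overhead on top.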

-garrett


Re: DAV is complicated and slow?

Posted by kf...@collab.net.

Greg Hudson <gh...@MIT.EDU> writes:
> On Wed, 2005-11-23 at 09:21 -0600, kfogel@collab.net wrote:
> > Greg Hudson <gh...@MIT.EDU> writes:
> > > We already have the performance benefit of HTTP pipelining (at the
> > > expense of giving up on generic HTTP caching), and ra_dav is still much
> > > slower than ra_svn.
> > 
> > We don't do HTTP pipelining in ra_dav/mod_dav_svn right now (Neon
> > doesn't handle it); we just use custom reports.
> 
> Right, so we already have the performance benefit we *would* get from
> HTTP pipelining.

Oh, I get it now -- misparsed your original sentence, sorry.


Re: DAV is complicated and slow?

Posted by Greg Hudson <gh...@MIT.EDU>.

On Wed, 2005-11-23 at 09:21 -0600, kfogel@collab.net wrote:
> Greg Hudson <gh...@MIT.EDU> writes:
> > We already have the performance benefit of HTTP pipelining (at the
> > expense of giving up on generic HTTP caching), and ra_dav is still much
> > slower than ra_svn.
> 
> We don't do HTTP pipelining in ra_dav/mod_dav_svn right now (Neon
> doesn't handle it); we just use custom reports.

Right, so we already have the performance benefit we *would* get from
HTTP pipelining.


Re: DAV is complicated and slow?

Posted by kf...@collab.net.

Greg Hudson <gh...@MIT.EDU> writes:
> We already have the performance benefit of HTTP pipelining (at the
> expense of giving up on generic HTTP caching), and ra_dav is still much
> slower than ra_svn.

We don't do HTTP pipelining in ra_dav/mod_dav_svn right now (Neon
doesn't handle it); we just use custom reports.

-Karl

Re: DAV is complicated and slow?

Posted by Greg Hudson <gh...@MIT.EDU>.

On Wed, 2005-11-23 at 02:01 -0800, Greg Stein wrote:
> > So, while there's a certain elegance in being able to use generic HTTP
> > caching to improve performance, that elegance comes at a substantial
> > cost in the complexity of libsvn_ra_dav and--at least with the current
> > architecture--in the performance of the basic functioning of svn over
> > ra_dav.
> 
> The complexity of the software increases, yes, but the user story is
> greatly simplified. Many places are already running caches of some
> form. Deploying squid is no big deal, and many people do that. HTTP
> caching proxies are well known items, and people have a great choice
> in how to install and configure those.

I dunno.  If I'm, say, gnome.org, and my servers can't handle the
Subversion traffic, I think I'm more likely to want to set up a bunch of
mirrors than to point people at a bunch of Squid caches.

> Creating a custom mirroring system is a lot of complexity in its own
> right

With our architecture, not really.  The hardest issue is rev-prop mods,
and it's not all that hard, although we could probably provide a feature
or two to make it easier.

> I'd much rather improve the client than develop yet another server,
> with its own host of issues related to networking code, security,
> documentation, logging, monitoring, and performance tweaking.

Are you imagining the mirror system would be yet another network server?
Much more likely, it would be one of our existing network servers
pointed at a mirror maintained by a cron job.

>  When I
> look back at svnserve, the original idea was "small and light", but I
> note that it has grown a ton of functionality since then.

It has?  It got path-based ACLs, but that was just moving logic from
mod_authz_svn (where it never should have been) into libsvn_repos, and
adding a few calls.

(I admit to some frustration at the amount of crud involved in having a
network server which satisfies everyone's needs, which svnserve
currently does not.  I believe the answer is to encourage the creation
of better support libraries.  I do not believe the answer is to
implement everything inside Apache httpd.)

> And besides, serf already does pipelining (and deflate/gzip and basic
> SSL). There are a ton of "friendly" bits that it is lacking, but the
> core is there. IMO, it is much more feasible to complete that thing
> and hook it into svn, than it is to write a new mirror system.

We already have the performance benefit of HTTP pipelining (at the
expense of giving up on generic HTTP caching), and ra_dav is still much
slower than ra_svn.  I believe this mostly comes from wc-props and other
impedance mismatches between svn and DAV.  Perhaps it's possible to fix
all of the resulting performance problems in other ways, but for years
now, no one who holds that theory has been writing much Subversion code.

Re: DAV is complicated and slow?

Posted by Greg Stein <gs...@lyra.org>.

On Tue, Nov 22, 2005 at 12:34:34PM -0500, Greg Hudson wrote:
>...
> I'll note that better caching benefits could be obtained using a
> Subversion-specific mirroring mechanism (the svn_ra_replay functionality
> is a good start here, as it allows such mirrors to operate in pull
> mode).  You'd get better performance here because the mirror could
> handle arbitrary requests for diffs between versions, while the cache
> could at best only handle the ones it's seen recently.

True. But I'll note that certain diffs will be common. Consider 10
revisions which update a total of 100 files. Those hundred diffs will
be cached, so anybody updating from anything in HEAD-10 to HEAD will
get the cached diffs to bring them up to HEAD.

The trouble is in the one-off diffs for diff/merge requests or for
people that go a long time between updates. Those will have to hit the
server in the HTTP case, whereas your suggested mirror could probably
satisfy them.
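
A toy Python model of that difference, purely as a sketch: the class,
the key shapes, and the numbers are invented for illustration and do not
correspond to real Subversion URLs or to any real proxy.  The cache can
only answer keys it has already fetched from the origin; a full mirror
would never need to go back to the origin at all.

```python
class ResponseCache:
    """Stand-in for an HTTP cache: serves a response locally only if the
    same key has been requested before, otherwise asks the origin."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}
        self.origin_requests = 0

    def get(self, key):
        if key not in self.store:
            self.origin_requests += 1
            self.store[key] = self.origin(key)
        return self.store[key]


def fake_origin(key):
    # Hypothetical origin server: returns a placeholder response body.
    return "response for %r" % (key,)


def simulate():
    cache = ResponseCache(fake_origin)
    head = 100
    # 10 revisions, each touching 10 files: 100 per-revision changes.
    changes = [("file%d_%d" % (rev, i), rev)
               for rev in range(head - 9, head + 1)
               for i in range(10)]
    # First client of the day updates from HEAD-10 and fills the cache.
    for key in changes:
        cache.get(key)
    after_first = cache.origin_requests
    # Second client updates from HEAD-5: everything it needs is cached.
    for key in changes:
        if key[1] > head - 5:
            cache.get(key)
    after_second = cache.origin_requests
    # A one-off diff over an unusual revision range misses the cache.
    cache.get(("file91_0", 37, head))
    after_one_off = cache.origin_requests
    return after_first, after_second, after_one_off
```

In this model the second client adds no load on the origin, while the
one-off request has to go through to it.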

> A mirror might consume more disk space than a functional cache (or it
> might consume less in some cases), but someone setting up a
> high-performance network isn't generally limited by disk space.

Agreed.

> So, while there's a certain elegance in being able to use generic HTTP
> caching to improve performance, that elegance comes at a substantial
> cost in the complexity of libsvn_ra_dav and--at least with the current
> architecture--in the performance of the basic functioning of svn over
> ra_dav.

The complexity of the software increases, yes, but the user story is
greatly simplified. Many places are already running caches of some
form. Deploying squid is no big deal, and many people do that. HTTP
caching proxies are well known items, and people have a great choice
in how to install and configure those.

Creating a custom mirroring system is a lot of complexity in its own
right, so I don't think it saves anything [in the HTTP environment
we're talking about]. Sure, it solves one unique problem, but an HTTP
cache can improve a lot of your browsing experience, too.

I'd much rather improve the client than develop yet another server,
with its own host of issues related to networking code, security,
documentation, logging, monitoring, and performance tweaking. When I
look back at svnserve, the original idea was "small and light", but I
note that it has grown a ton of functionality since then. I would
worry about the same kind of thing affecting any custom mirror/cache
solution.

And besides, serf already does pipelining (and deflate/gzip and basic
SSL). There are a ton of "friendly" bits that it is lacking, but the
core is there. IMO, it is much more feasible to complete that thing
and hook it into svn, than it is to write a new mirror system.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: DAV is complicated and slow?

Posted by Greg Hudson <gh...@MIT.EDU>.

On Tue, 2005-11-22 at 00:08 -0800, Greg Stein wrote:
> On Mon, Nov 21, 2005 at 06:35:58PM -0600, Ben Collins-Sussman wrote:
> > On 11/21/05, Justin Erenkrantz <ju...@erenkrantz.com> wrote:
[Lots of stuff about the potential for using generic HTTP caching for
Subversion]

I'll note that better caching benefits could be obtained using a
Subversion-specific mirroring mechanism (the svn_ra_replay functionality
is a good start here, as it allows such mirrors to operate in pull
mode).  You'd get better performance here because the mirror could
handle arbitrary requests for diffs between versions, while the cache
could at best only handle the ones it's seen recently.

A mirror might consume more disk space than a functional cache (or it
might consume less in some cases), but someone setting up a
high-performance network isn't generally limited by disk space.

So, while there's a certain elegance in being able to use generic HTTP
caching to improve performance, that elegance comes at a substantial
cost in the complexity of libsvn_ra_dav and--at least with the current
architecture--in the performance of the basic functioning of svn over
ra_dav.
