Posted to dev@subversion.apache.org by Greg Stein <gs...@lyra.org> on 2001/03/29 09:27:49 UTC

transaction roots

I've been creating transactions based on the "youngest" revision, and then
applying changes to that. I was wondering if anybody sees a problem with
doing that.

[ this is contrary to libsvn_fs/editor.c creating transactions based on
  arbitrary revisions. ]

Specifically, my thinking with using the youngest revision is simply that a
commit must be made against the youngest revision. If the user wants to
change file REV:PATH, then that file exists in one of two states within the
youngest revision tree:

1) the youngest tree has the same file as REV:PATH. it is okay to change.
2) the youngest tree shows a change since REV:PATH, so the client is out of
   date and needs to be updated.

I don't see that grabbing an old revision is all that helpful.

There are potentially intervening tree changes that could interfere, but (at
least in the DAV case), I can detect those. Recall that I store the ID for
each node. If 3:/FOO/BAR is different from 4:/FOO/BAR because of changes at
the FOO level, then I'll see the ID change and punt.

And note that a user cannot commit against 3:/FOO/BAR anyways, because it is
out of date.

I guess it is conceivable for something like this to happen:

  REV3/
    FOO/
      BAR(4.1)

  REV4/
    FOO/
      BAR(71.6)

  REV5/
    FOO/
      BAR(4.1)

If a user has 3:/FOO/BAR, and the latest rev is 5, then a commit to BAR will
actually succeed since it is "up to date" :-)

Can anybody see a problem with creating transactions based on the latest,
and verifying that the base of each desired change matches that within the
latest tree?
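
The REV3/REV4/REV5 case above can be worked through with a small sketch. It
is written in Python rather than against the real libsvn_fs C API; the
dict-of-revisions model and the names `youngest` and `base_matches_youngest`
are illustrative assumptions, not Subversion functions.

```python
# Toy model: each revision maps a path to the node ID found there.
# The IDs mirror the REV3/REV4/REV5 example; nothing here is libsvn_fs API.
REVISIONS = {
    3: {"/FOO/BAR": "4.1"},
    4: {"/FOO/BAR": "71.6"},
    5: {"/FOO/BAR": "4.1"},
}


def youngest(revisions):
    """Return the highest revision number in the toy repository."""
    return max(revisions)


def base_matches_youngest(revisions, base_rev, path):
    """True if base_rev:path holds the same node ID as youngest:path."""
    base_id = revisions[base_rev].get(path)
    head_id = revisions[youngest(revisions)].get(path)
    return base_id is not None and base_id == head_id


# A client holding 3:/FOO/BAR may commit: rev 5 shows the same node ID.
print(base_matches_youngest(REVISIONS, 3, "/FOO/BAR"))  # True
# A client holding 4:/FOO/BAR is out of date and has to update first.
print(base_matches_youngest(REVISIONS, 4, "/FOO/BAR"))  # False
```

Only the node IDs are compared, so the intervening change in revision 4 is
invisible to the check once revision 5 restores the old ID.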

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: transaction roots

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.
Greg Stein <gs...@lyra.org> writes:
 
> > In the long run, I'd hate to see ra_local relegated to an obsolete
> > test harness.
> 
> Quite true. I hear ya. I bet it will always exist, but I'm hoping it will be
> the black sheep of the family.

Heh.... if I buy into this argument, then maybe I shouldn't care that
ra_local does commits differently than ra_dav.  Seriously!  :)

> > > > [ long (semi-sarcastic :-) spiel about converting ra_local ]
> > 
> > Since it's clear that you're not going to budge, I think you've mostly
> > managed to talk me into rewriting the fs commit-editor to match your
> > model.  It's still a good model, I like it.
> 
> Hmm. I'm hoping changes will come from agreement rather than avoiding an
> "immovable object." I actually feel quite bad when it seems that people
> relent rather than agree. The part that is even worse for me, is that it
> happens enough that I notice. It makes me apprehensive about whether I'm too
> stubborn, too inflexible, etc. I hope that I'm open, but it invariably
> devolves to "prove it is better" which isn't nearly as diplomatic :-(

Well, next week we'll be face-to-face, so you'll have plenty of
opportunity to make me emphatically agree that the ra_dav commit
system is inherently more attractive.  You're not too stubborn --
you're very open and empathic -- but I'm still not persuaded that
ra_dav's system has more technical merit.

Lemme think.  Hm.  I see problems with *both* systems, really.

In your last mail to Jim, you said that the problem was that the
filesystem provides no interface that answers the question:  "I want
to change REV:PATH.  Is that allowed?  Am I up-to-date?"  Therefore,
you're doing the node-rev-id checks yourself.

But I think now that this was the original heart of my objection to
ra_dav's technique: the fs *does* have a function that comes awfully
close.  svn_fs_merge answers the question: "I want to change all these
REV:PATH objects that I've assembled into a tree.  Is that allowed?
Am I up-to-date?"  It's just the *plural* form of the function that
you've been wanting.

So, let me summarize my complaints:

  * I think it's definitely inefficient for ra_local to keep calling
    fs_merge on an ever-growing tree, every time it gets one change.
    Why call a "plural" routine when you really want a "singular" one?
    It's wasteful.

  * I think it's bad that ra_dav is running id-comparison logic on
    individual items, when that logic *already* exists inside the
    filesystem.  (We've already found a few bugs last week in
    fs_merge's logic -- down at the low-level 'id distance computing'
    routines.  I'd like to think that if we fix them, *all* ra layers'
    conflict detection will be equally stable as a result.)


Perhaps we can synthesize a new commit-system that we *both* use, sort
of a compromise?

  * the ra layer ignores the base_rev arg given to replace_root.
    Instead, it just creates a txn based on the youngest rev.

  * we break out the node-id-comparison logic into a shared internal
    func, and use it to create a new fs function call
    svn_fs_merge_item().  (and rename svn_fs_merge() into
    svn_fs_merge_tree()) 

  * as changes (with base_revs) trickle in, the ra layer uses
    fs_merge_item to check for conflicts.

  * finally, we call fs_commit_txn... which internally uses the
    "plural" function to merge against whatever the new 'youngest'
    tree is (in case it changed during the commit.)


This system addresses both of my complaints above... and hopefully
it's close enough to what you're already doing that you'd like it
too.  I think this system has the most Technical Merit. 
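
The four steps of that compromise can be sketched roughly as below. This is a
Python toy, not filesystem code: `merge_item` and `merge_tree` only borrow
their names from the svn_fs_merge_item() / svn_fs_merge_tree() proposed
above, the `Txn` class and the dict-of-revisions model are assumptions, and a
plain ID equality test stands in for the real id-comparison logic.

```python
def ids_conflict(base_id, target_id):
    """Shared comparison: a change conflicts when the node IDs differ."""
    return base_id != target_id


def merge_item(revisions, base_rev, target_rev, path):
    """Singular check: is base_rev:path still current in target_rev?"""
    base_id = revisions[base_rev].get(path)
    target_id = revisions[target_rev].get(path)
    return not ids_conflict(base_id, target_id)


def merge_tree(revisions, changes, target_rev):
    """Plural check: list the changed paths that conflict with target_rev."""
    return [path for path, base_rev in sorted(changes.items())
            if not merge_item(revisions, base_rev, target_rev, path)]


class Txn:
    """A transaction rooted at whatever revision was youngest at creation."""

    def __init__(self, revisions):
        self.revisions = revisions
        self.base_rev = max(revisions)  # the caller's base_rev is ignored
        self.changes = {}

    def add_change(self, path, base_rev):
        """Check one incoming change; record it only if it is up to date."""
        ok = merge_item(self.revisions, base_rev, self.base_rev, path)
        if ok:
            self.changes[path] = base_rev
        return ok

    def commit(self):
        """Re-check every change against the youngest revision at commit."""
        return merge_tree(self.revisions, self.changes, max(self.revisions))


revs = {1: {"/a": "1.1", "/b": "2.1"}}
txn = Txn(revs)
accepted = txn.add_change("/a", 1)    # passes the per-item check
revs[2] = {"/a": "1.2", "/b": "2.1"}  # someone else commits meanwhile
late_conflicts = txn.commit()         # the plural check catches it late
print(accepted, late_conflicts)       # True ['/a']
```

The per-item check gives early feedback as changes trickle in, while the
final plural check covers the window in which youngest moves during the
commit.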


 
> > But hey, it's M2 week, so we're all a little tense.  Nevertheless,
> > Greg, you're the epitome of calmness under stress... a real cool head
> > in these arguments.  Thanks.  :)
> 
> I hear the tense bit :-) ... I've been spending a lot of time in front of
> the computer, and not enough on (ahem) planning/doing other things in life.
> To get both done, it means little sleep :-)

Yeah, my wife is going to kill me.  I need a vacation.  :)

> 
> But I tend to think of them as "discussions." It is impossible for me to be
> angry with somebody for something they believe in. I tend to understand
> others' positions, so I can see where they're coming from; I just tend to
> disagree with those positions :-)

Definitely... no anger here at all.  Just friendly debate.

Re: transaction roots

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Mar 29, 2001 at 08:35:49PM -0600, Ben Collins-Sussman wrote:
> Greg Stein <gs...@lyra.org> writes:
> > Lastly, these two paths are *specifically* why we wanted to avoid ra_local
> > as much as possible. Just get people to use and stick with ra_dav even when
> > they work on their own machine.
> 
> In the long run, I'd hate to see ra_local relegated to an obsolete
> test harness.

Quite true. I hear ya. I bet it will always exist, but I'm hoping it will be
the black sheep of the family.

> I still think setting up Apache/DAV/mod_dav_svn on a
> local box is way too high a barrier to entry for someone who wants to
> create a private repository for personal use...

Somewhat true. And yes, the overhead is there and is quite unavoidable.

> so I predict (I hope?)
> that ra_local will be used just as often as it's used in CVS.

I hope not. The more it is used, the more we need to deal with dual path
maintenance. Although... our code is a hella lot more organized than CVS, so
I doubt we'll have CVS's maintenance issues.

> (But
> that's just an opinion... I'm sure you'll personally write an amazing
> install-script to lower that Apache barrier.  ;) )

I won't say amazing, but my hope is to make it as damn near invisible as
possible. I know that we can get a lot prepackaged, but there will always be
the "what port?" issue. If you have N users on a system, they each need a
different port if they'll be running a private SVN server. And each user
will need to use that port in their URLs.

>...
> > > [ long (semi-sarcastic :-) spiel about converting ra_local ]
> 
> Since it's clear that you're not going to budge, I think you've mostly
> managed to talk me into rewriting the fs commit-editor to match your
> model.  It's still a good model, I like it.

Hmm. I'm hoping changes will come from agreement rather than avoiding an
"immovable object." I actually feel quite bad when it seems that people
relent rather than agree. The part that is even worse for me, is that it
happens enough that I notice. It makes me apprehensive about whether I'm too
stubborn, too inflexible, etc. I hope that I'm open, but it invariably
devolves to "prove it is better" which isn't nearly as diplomatic :-(

>...
> I guess my frustration has been showing.  It's not that I object to
> rewriting code when a better system is discovered;  I've been doing
> that for a year.  I just get frustrated when the rewrite is a result
> of two developers not agreeing on things early on.  If you had told me
> about your algorithm 2 weeks ago, or vice versa, one of us wouldn't
> have to rewrite now.  Nobody's fault, really.  This just falls in the
> same category of that time last summer when I spent 2 weeks writing a
> library that was totally unnecessary -- and could have been prevented
> from the start if people had been communicating better.  Such is life.

I hear ya. I'd say you could blame me for it. I don't really know the
algorithm until I sit down to build it. I tend to iterate a lot more often
than design-up-front. For complex problems, I tend to let them simmer in my
head for a while. Eventually, it gels, and I begin coding. Point is: if you
ask me how something will be built, I rarely know :-) Sure, I can sketch out
some ideas based on the brain-simmer, but it isn't until the code hits the
keyboard that I'm truly sure.

Point is: two weeks ago, I may have been able to describe in general terms,
but I'm not sure that it would have been clear enough for us to meet on a
particular path. So... my fault.

But I can sympathize. I just rewrote the commit stuff last night. I think
I've rewritten the checkout once or twice now.

> But hey, it's M2 week, so we're all a little tense.  Nevertheless,
> Greg, you're the epitome of calmness under stress... a real cool head
> in these arguments.  Thanks.  :)

I hear the tense bit :-) ... I've been spending a lot of time in front of
the computer, and not enough on (ahem) planning/doing other things in life.
To get both done, it means little sleep :-)

But I tend to think of them as "discussions." It is impossible for me to be
angry with somebody for something they believe in. I tend to understand
others' positions, so I can see where they're coming from; I just tend to
disagree with those positions :-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Installation [was Re: transaction roots]

Posted by RADICS Peter <mi...@lbcons.net>.
Just for the record, I'm personally +1 (or even +2) on ra_local.
Right now I'm working on a 486 box with 16 megs of ram, so I'd be very
happy to do svn without apache.  Yes, I know computers are cheap these
days.  So what?  I happen to like this one :)  So please don't force
everyone to use apache for their small repos.  Long live ra_local! :)

(of course cheers to ra_dav as well, but for different reasons)

cheers,
mitch
-- 
// RADICS Peter <mi...@lbcons.net> (http://lbcons.net)
//
// "If human beings don't keep exercising their lips, 
//  their brains start working." -- Ford Prefect

Re: Installation [was Re: transaction roots]

Posted by Daniel Stenberg <da...@haxx.se>.
On Fri, 30 Mar 2001, Greg Stein wrote:

> > 2 - even when we are system administrators, we might just want to have a
> >     bunch of local files in the repository. Requiring Apache and even
> >     featuring apache in the install procedure will make possibly dreadful
> >     collisions and weird setup quirks for people that already have apache
> >     installed and up and running (compared to using SVN on a local
> >     repository)

[snip]

> Why do you suppose there would be conflicts?

I can't say there *will* be, and I surely couldn't argue about apache setup
issues with you! ;-) I'm just saying there's a risk. I don't know what kind
of weird setups and requirements people can have and use for their web
servers that the new one would have to use (or not have to use) as well.

This is not a very strong argument as I don't have any specific examples or
even a possible problematic scenario.

> So... Average Joe can definitely run an SVN Server (oh, sorry, Apache
> server plus SVN :-). The question is whether the machine administrator
> will let these things run continually. But hey... to run one while logged
> in? Surely, an admin would be fine with that.

Right, that's true of course. It could certainly run while the user is logged
in...

> Apache will increase the source footprint, but it doesn't necessarily
> create an insurmountable barrier for Joe User. Just think "port 4734"

I agree. I'm not seeing any insurmountable barriers. I'm only saying that I
see advantages with the local-mode.

-- 
      Daniel Stenberg - http://daniel.haxx.se - +46-705-44 31 77
   ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol

Re: Installation [was Re: transaction roots]

Posted by Greg Stein <gs...@lyra.org>.
On Fri, Mar 30, 2001 at 09:39:50AM +0200, Daniel Stenberg wrote:
> On Fri, 30 Mar 2001, Tripp Lilley wrote:
>...
> I think you're mixing things here. I've used many CVS repositories, and once
> I've checked out the code I've never bothered about the "two houses" dilemma
> as you describe it. A source code repository is usually singular. There's
> only one, be it local or remote.

Agreed. For the cases I've seen/used, a repository is local or remote. Never
both.

> > To that end, I propose that the lowered entry barrier is pretty simple
> > really: when you build RPMs, BSD ports, etc., you have them include a
> > stripped down Apache install with mod_dav and mod_dav_svn all configured,
> > and all set to run on a different port and to use a different set of
> > directories for config, libraries, etc. In that sense, Apache, mod_dav,
> > and mod_dav_svn are all just "part of the installation".

That's my hope/intent. "Oh, yah, there's an Apache server in there. why do
you ask?"

> I'd really not like that. For a number of reasons, but these are the two
> main ones I can immediately think of:
> 
> 1 - we're not always system administrators when we want to run SVN on a few
>     simple files. requiring apache is an immense barrier in that scenario.

And if the SVN install preconfigures it for you? Just another little app
running.

> 2 - even when we are system administrators, we might just want to have a
>     bunch of local files in the repository. Requiring Apache and even
>     featuring apache in the install procedure will make possibly dreadful
>     collisions and weird setup quirks for people that already have apache
>     installed and up and running (compared to using SVN on a local
>     repository)

I have two Apaches running on my system. No conflicts at all. I could run a
third on port 80, should I choose.

Oh. Actually, I have a third. It's running on another interface at port 80.

Why do you suppose there would be conflicts?

> > Anyone who cares about the fact that you're running "another" copy of
> > Apache is someone who cares enough to install it themselves by hand, to
> > their specific liking. Your "weekend warrior" who just wants SVN to
> > manage a small repo for personal use is likely to be just grabbing the
> > RPM and going, anyway.
> 
> What about the "weekend warrior" who uses a machine he can't install servers
> on?

Ah. Now we get to it. You can install Apache on any machine that you want.
Nobody said the thing had to run on port 80. And you can install/run it from
/home/gstein/svn if you'd like.

I run a development Apache 1.3 on 8080 as "gstein". Apache 2.0 (as "gstein")
on 8081. Apache 1.3 on test.webdav.org, port 80. No issues, no problems.

So... Average Joe can definitely run an SVN Server (oh, sorry, Apache server
plus SVN :-). The question is whether the machine administrator will let
these things run continually. But hey... to run one while logged in? Surely,
an admin would be fine with that. For personal SVN servers, we'd simply have
Apache configured to start a single process/thread. If the server happens to
be used by many people, then Apache would just scale up the number of ready
workers. But an admin is not going to complain about a single process
hanging out while a user is logged in.

etc etc

Apache will increase the source footprint, but it doesn't necessarily create
an insurmountable barrier for Joe User. Just think "port 4734"

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Installation [was Re: transaction roots]

Posted by Daniel Stenberg <da...@haxx.se>.
On Fri, 30 Mar 2001, Tripp Lilley wrote:

> > 1 - we're not always system administrators when we want to run SVN on a few
> >     simple files. requiring apache is an immense barrier in that scenario.
>
> I hate to be cavalier, but this is what I envision:
>
> 	rpm -i svn-toto-1.0-1.i386.rpm

Now, I'm not an RPM-knowledgeable person, but doesn't that use hard-coded
paths? If you're not the sysadmin, how can you install files in system
directories?

> > What about the "weekend warrior" who uses a machine he can't install
> > servers on?
>
> A relocatable version of the above kit, installed in the user's home
> directory, listening on an unprivileged port (probably computed based on
> an offset and the user's UID, so it doesn't conflict with another user
> also evaluating SVN).

You're missing my point (my fault, I didn't really spell it out).

I'm not talking about a physical limitation, I'm talking about system
policies. Imagine my account over at my web hotel that hosts a bazillion
domains and users. They frown upon servers started by users. Sure, finding a
non-privileged port is simple enough, but that's not the problem I'd have to
work around here.

Any program that requires a server is out of the question for me to install
there, unless I can get the management's attention and have their permission
or perhaps have them do the job to set it up system-wide.

I don't know how common this scenario is, but I know for sure that I've had
access to many such machines during my days that have imposed similar limits.

> I guess I just feel like the entire "local" versus "remote" access
> dichotomy is "askin' fer trouble". It duplicates code, even when you're
> diligent about sharing code.

Well, any new feature adds code and thus also problems/bugs. I think of
local-mode as a feature. A cool feature actually.

> It adds new places for bugs to hide. It allows users and sysadmins to do
> grievous damage if they're not aware of the distinction

How would that be? How would I cause "grievous damage" to my local repository
if I mistakenly think it is a "remote" one all of a sudden?

-- 
      Daniel Stenberg - http://daniel.haxx.se - +46-705-44 31 77
   ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol

Re: Installation [was Re: transaction roots]

Posted by Tripp Lilley <tl...@perspex.com>.
On Fri, 30 Mar 2001, Daniel Stenberg wrote:

> I think you're mixing things here. I've used many CVS repositories, and once
> I've checked out the code I've never bothered about the "two houses" dilemma
> as you describe it. A source code repository is usually singular. There's
> only one, be it local or remote.

I was speaking mainly of understanding the distinction as a sysadmin
installing a CVS repo and configuring clients for the first time. My
experience with Perforce was simple: you drop the server in a directory,
tell it to listen on a port, then point your client at that port. If you
want SSH, you build a tunnel to that port and point your client to the
local side of the tunnel.

All of this maps to knowledge I already had of how "network" services
work, which made my life as a sysadmin straightforward.

CVS, on the other hand, required me to understand that the toolset could
use the repository locally, through the filesystem, or remotely through
the pserver protocol, or remotely over SSH, which ultimately treated the
access as "local" through the filesystem, piping output around
willy-nilly.

It scared me. I didn't like it. It was veeery bad :)


> 1 - we're not always system administrators when we want to run SVN on a few
>     simple files. requiring apache is an immense barrier in that scenario.

I hate to be cavalier, but this is what I envision:

	rpm -i svn-toto-1.0-1.i386.rpm


> 2 - even when we are system administrators, we might just want to have a
>     bunch of local files in the repository. Requiring Apache and even
>     featuring apache in the install procedure will make possibly dreadful
>     collisions and weird setup quirks for people that already have apache
>     installed and up and running (compared to using SVN on a local
>     repository)

	/opt/svn/
		apache/
			bin/
				httpd
			lib/
				mod_dav.so
				mod_dav_svn.so
	/etc/opt/svn/
		apache/
			httpd.conf
		svn/
			...

	/var/opt/svn/
		...


In this layout, the "Apache" serving SVN is completely self-contained
-under- SVN's directory and config hierarchy. It lives on a separate port,
uses a separate lib/module directory (and that's presuming you don't
statically link mod_dav and mod_dav_svn, though it would make sense to do
so), and so forth.

Note that I'm not suggesting that this should be the -only- way to install
SVN. Of course, them what wants to install it into their existing Apache
install are welcome to! (rpm -i svn-1.0-1.i386.rpm ? :) ). I'm just saying
that the "low-barrier" method would include everything.


> What about the "weekend warrior" who uses a machine he can't install servers
> on?

A relocatable version of the above kit, installed in the user's home
directory, listening on an unprivileged port (probably computed based on
an offset and the user's UID, so it doesn't conflict with another user
also evaluating SVN).
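
One way the "offset plus UID" port could be computed is sketched here in
Python. The function name `personal_port` and the `base` and `span` values are
assumptions picked for illustration; no such scheme is specified anywhere in
this thread.

```python
def personal_port(uid, base=20000, span=40000):
    """Map a numeric user ID to an unprivileged TCP port.

    With these defaults the result stays within 20000..59999, which is
    above the privileged range (below 1024) and below the 65535 maximum.
    """
    if uid < 0:
        raise ValueError("uid must be non-negative")
    return base + (uid % span)


print(personal_port(1000))   # 21000
print(personal_port(41000))  # 21000, the same port as uid 1000
```

The modulo keeps every result in range, at the cost that two UIDs exactly
`span` apart land on the same port, so a real installer would still need to
probe whether the port is free.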


I guess I just feel like the entire "local" versus "remote" access
dichotomy is "askin' fer trouble". It duplicates code, even when you're
diligent about sharing code. It adds new places for bugs to hide. It
allows users and sysadmins to do grievous damage if they're not aware of
the distinction and the implication of the distinction (I might have made
that last one up, but I'm not sure).

It grates on my "elegance" nerve, basically. How much weight you give to
-my- elegance nerve is, of course, a matter for open debate :)

Take care!

-- 
   Joy-Loving * Tripp Lilley  *  http://stargate.eheart.sg505.net/~tlilley/
------------------------------------------------------------------------------
  "Fiber makes you poop." -- From <http://www.pvponline.com/bts_studio.php3>

Re: Installation [was Re: transaction roots]

Posted by Daniel Stenberg <da...@haxx.se>.
On Fri, 30 Mar 2001, Tripp Lilley wrote:

> Just as a side note, I recently installed CVS for the first time (someone
> was holding a gun to my head). One of the things that bugged me was the
> disconnect between "local", "remote", and "external" access to the repo.
>
> To my admittedly small mind, I'd prefer a system that declared itself as
> being "a network app" that happened to work locally through the magic of
> the localhost interface or perhaps domain sockets :) As an administrator
> and user, it helps me to not have to keep track of the difference. As a
> coder / possible contributor, it helps me to not have to think of the
> "two houses, both alike in dignity" that need updating...

I think you're mixing things here. I've used many CVS repositories, and once
I've checked out the code I've never bothered about the "two houses" dilemma
as you describe it. A source code repository is usually singular. There's
only one, be it local or remote.

> To that end, I propose that the lowered entry barrier is pretty simple
> really: when you build RPMs, BSD ports, etc., you have them include a
> stripped down Apache install with mod_dav and mod_dav_svn all configured,
> and all set to run on a different port and to use a different set of
> directories for config, libraries, etc. In that sense, Apache, mod_dav,
> and mod_dav_svn are all just "part of the installation".

I'd really not like that. For a number of reasons, but these are the two
main ones I can immediately think of:

1 - we're not always system administrators when we want to run SVN on a few
    simple files. requiring apache is an immense barrier in that scenario.

2 - even when we are system administrators, we might just want to have a
    bunch of local files in the repository. Requiring Apache and even
    featuring apache in the install procedure will make possibly dreadful
    collisions and weird setup quirks for people that already have apache
    installed and up and running (compared to using SVN on a local
    repository)

> Anyone who cares about the fact that you're running "another" copy of
> Apache is someone who cares enough to install it themselves by hand, to
> their specific liking. Your "weekend warrior" who just wants SVN to
> manage a small repo for personal use is likely to be just grabbing the
> RPM and going, anyway.

What about the "weekend warrior" who uses a machine he can't install servers
on?

-- 
      Daniel Stenberg - http://daniel.haxx.se - +46-705-44 31 77
   ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol

Installation [was Re: transaction roots]

Posted by Tripp Lilley <tl...@perspex.com>.
On 29 Mar 2001, Ben Collins-Sussman wrote:

> In the long run, I'd hate to see ra_local relegated to an obsolete
> test harness.  I still think setting up Apache/DAV/mod_dav_svn on a
> local box is way too high a barrier to entry for someone who wants to
> create a private repository for personal use... so I predict (I hope?)
> that ra_local will be used just as often as it's used in CVS.  (But
> that's just an opinion... I'm sure you'll personally write an amazing
> install-script to lower that Apache barrier.  ;) )

Just as a side note, I recently installed CVS for the first time (someone
was holding a gun to my head). One of the things that bugged me was the
disconnect between "local", "remote", and "external" access to the repo.

To my admittedly small mind, I'd prefer a system that declared itself as
being "a network app" that happened to work locally through the magic of
the localhost interface or perhaps domain sockets :) As an administrator
and user, it helps me to not have to keep track of the difference. As a
coder / possible contributor, it helps me to not have to think of the "two
houses, both alike in dignity" that need updating...

To that end, I propose that the lowered entry barrier is pretty simple
really: when you build RPMs, BSD ports, etc., you have them include a
stripped down Apache install with mod_dav and mod_dav_svn all configured,
and all set to run on a different port and to use a different set of
directories for config, libraries, etc. In that sense, Apache, mod_dav,
and mod_dav_svn are all just "part of the installation".

Anyone who cares about the fact that you're running "another" copy of
Apache is someone who cares enough to install it themselves by hand, to
their specific liking. Your "weekend warrior" who just wants SVN to manage
a small repo for personal use is likely to be just grabbing the RPM and
going, anyway.

Some of these opinions are worth what you paid me for them :) Others are
pure gold. Which is which? ;)

-- 
   Joy-Loving * Tripp Lilley  *  http://stargate.eheart.sg505.net/~tlilley/
------------------------------------------------------------------------------
  "Fiber makes you poop." -- From <http://www.pvponline.com/bts_studio.php3>

Re: transaction roots

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.
Greg Stein <gs...@lyra.org> writes:
 
> Lastly, these two paths are *specifically* why we wanted to avoid ra_local
> as much as possible. Just get people to use and stick with ra_dav even when
> they work on their own machine.

In the long run, I'd hate to see ra_local relegated to an obsolete
test harness.  I still think setting up Apache/DAV/mod_dav_svn on a
local box is way too high a barrier to entry for someone who wants to
create a private repository for personal use... so I predict (I hope?)
that ra_local will be used just as often as it's used in CVS.  (But
that's just an opinion... I'm sure you'll personally write an amazing
install-script to lower that Apache barrier.  ;) )

> ra_local could perform the same algorithm:
> 
> [ description of algorithm... ]

Yah, I understand the algorithm.  It's pretty easy... it's just doing
what fs_merge is doing, only incrementally, and on-the-fly.  

 
> > [ long (semi-sarcastic :-) spiel about converting ra_local ]
> 

Since it's clear that you're not going to budge, I think you've mostly
managed to talk me into rewriting the fs commit-editor to match your
model.  It's still a good model, I like it.  

(And as you said, fs_copy will still be used by real copies and when
building a 'reporter' txn for updates.  fs_merge will still be called
internally when we do our final fs_commit_txn.  So it's not like these
functions aren't still being used!)

I guess my frustration has been showing.  It's not that I object to
rewriting code when a better system is discovered;  I've been doing
that for a year.  I just get frustrated when the rewrite is a result
of two developers not agreeing on things early on.  If you had told me
about your algorithm 2 weeks ago, or vice versa, one of us wouldn't
have to rewrite now.  Nobody's fault, really.  This just falls in the
same category of that time last summer when I spent 2 weeks writing a
library that was totally unnecessary -- and could have been prevented
from the start if people had been communicating better.  Such is life.

But hey, it's M2 week, so we're all a little tense.  Nevertheless,
Greg, you're the epitome of calmness under stress... a real cool head
in these arguments.  Thanks.  :)

Re: transaction roots

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Mar 29, 2001 at 06:32:15PM -0600, Ben Collins-Sussman wrote:
>...
> Basically, if commits are ever broken, I don't want to end up back in
> the CVS mindset: "oh, gotta test this bug (or fix) in the local case
> and the network case *separately*."  
> 
> If there's one *theory* that describes how to do a commit, life gets
> much easier.  We fall less often into that maintainability trap.

Generally, I agree here.

But it is also a matter of degree. Commits over the network are *vastly*
different to begin with. Yes, it would be nice to have the endpoints as
similar as possible, but let's not lose sight that we will have whole
separate classes of bugs between the two paths.

Lastly, these two paths are *specifically* why we wanted to avoid ra_local
as much as possible. Just get people to use and stick with ra_dav even when
they work on their own machine.

> So my objection is that we currently have two different commit
> systems; and I'm arguing that if we have to choose one system,
> ra_local's is better... just because it's using fs_copy() and
> fs_merge().  I think I'd rather depend on the (future) robustness of
> these funcs rather than doing manual node-rev-id comparisons.

I don't see the justification that ra_local's system is better (you state it
above, as if it were obvious). If anything, I could say that ra_dav's is
better because I don't need to use fs_copy and fs_merge :-). Less code to
call, so less potential for problems.

ra_local could perform the same algorithm:

*) create the transaction based on latest
*) for each change to REV:PATH, do:
   - open root REV, fetch ID of PATH
   - (open) root TXN, fetch ID of path
   - compare

[ note the base_revision isn't ignored; it is used to determine REV in the
  above steps ]

You would then eliminate copies and merges, and align ra_local (well, the FS
editor) with ra_dav's algorithm.

You asked, "why do that, when copy and merge were written to do it for you?"
fs_copy is definitely needed because "real" copies do happen. I never said
that fs_merge was needed :-) Just because it exists, doesn't mean that I
should use it, or that it should continue to exist. I always saw it as a
step of the commit function, rather than a "first order" function.

[ trying to pull some threads together here ]

Ben wrote in another thread:
>...
> So we agree here;  we just use differing methods to check for
> conflicts as we go.  You use DAV-ey methods, I use fs_merge().

I use IDs for difference detection. Has nothing to do with DAV. You use IDs
for detection, via the fs_merge call.

>...
> I don't think that there's a theoretical problem with what you're
> doing, as long as your definition of "latest revision" remains fixed
> after you create the transaction.  ("latest revision" shouldn't be a
> moving target).

I don't retarget the transaction. "latest" is the latest revision at the
time the commit started, and remains that way. Of course, if somebody else
creates a revision while the commit is occurring, then svn_fs_commit_txn
will do a merge and potentially throw a conflict at that time (the late
conflict could arise for an item that I previously checked, but became out
of date during the commit process).

> [ long (semi-sarcastic :-) spiel about converting ra_local ]

It isn't as bad as all that. The short clip of an algorithm above is about
all that is needed. Whenever you get a replace_file or replace_dir, you
could just call a "check_latest" function. You know the base_revision (REV)
for the file/dir, and can easily do the comparison.

Note: the check is also quite easy. It is a simple equality check. If the
REV is older than TXN's ("latest" at commit-start), then the user is
obviously out of date. If the REV is *newer*, then the user purposefully
screwed the commit :-). Stopping and restarting the commit is sufficient;
they don't even need to update.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: transaction roots

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.
Greg Stein <gs...@lyra.org> writes:
 
> Now, let's get back to the original question. Is there a *problem* with
> using the latest revision for the transaction root, rather than an arbitrary
> revision (i.e. the one passed to replace_root).

Sorry, Greg, I don't mean to sound so hyper or snotty in that last
message.  :)

Basically, if commits are ever broken, I don't want to end up back in
the CVS mindset: "oh, gotta test this bug (or fix) in the local case
and the network case *separately*."  

If there's one *theory* that describes how to do a commit, life gets
much easier.  We fall less often into that maintainability trap.

So my objection is that we currently have two different commit
systems; and I'm arguing that if we have to choose one system,
ra_local's is better... just because it's using fs_copy() and
fs_merge().  I think I'd rather depend on the (future) robustness of
these funcs rather than doing manual node-rev-id comparisons.

Re: transaction roots

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.
Greg Stein <gs...@lyra.org> writes:
 
> > Why reinvent the wheel?  When you call svn_fs_commit_txn, it will
> > automatically decide if the user is out-of-date.  You don't need to do
> > this check yourself as you build a txn against 'youngest'.
> 
> Why wait for commit_txn? That occurs *after* all the data has been delivered
> to the server. On my poor little 56k modem, that could be 15 minutes later.
> Then, *one* file is out of date, I say "fuck!", update that one file, and
> recommit the whole bloody thing.

Ah, yes, sure.  But note an earlier mail thread: ra_local is planning
to do early conflict detection.  Every time it receives a change, it
will call fs_merge() and potentially reject the whole commit -- all
before the text deltas are sent.  (That's why our commit-driver is
sending postfix textdeltas, after all.)

So we agree here;  we just use differing methods to check for
conflicts as we go.  You use DAV-ey methods, I use fs_merge().

 
> > Just build a txn based on whatever revision the client passes to your
> > replace_root() function.  When you get a replace_dir(rev) and rev !=
> > the parent's rev, then call svn_fs_copy.  That's what ra_local does.
> 
> replace_root? What's that?
> 
> mod_dav_svn does not have a replace_root call, and the replace_root on the
> client is not marshalled over the wire. IOW, the base_revision parameter to
> replace_root is dropped on the floor.

No, but libsvn_ra_local has a replace_root call.  You *could* send
that revision over the wire, and have mod_dav_svn use it to create a
transaction. 

> My question about the transaction roots, and using the latest, also leads to
> the fact that I never need to do an svn_fs_copy. If you want to talk about
> extra work, ra_local is the one doing the work with those copies :-)

Oy, I guess you're right.  The only reason I use fs_copy() is so that
fs_merge() can accurately tell me if my transaction conflicts.

(Of course fs_copy isn't really "work";  it's super cheap.)


> Now, let's get back to the original question. Is there a *problem* with
> using the latest revision for the transaction root, rather than an arbitrary
> revision (i.e. the one passed to replace_root).

I don't think that there's a theoretical problem with what you're
doing, as long as your definition of "latest revision" remains fixed
after you create the transaction.  ("latest revision" shouldn't be a
moving target).

But what bothers me is that we have two totally different systems for
building transactions here.  In my gut, this just seems like bad
software design.  Let's choose a commit design and stick to it.

On the one hand, I could plead:  why on earth have we slaved over
making fs_copy and fs_merge work correctly, if you're not even going
to use them?  Why ignore the interfaces svn_fs provides? 

On the other hand, I could capitulate: I mean, if you refuse to build
a mixed-revision "mirror" of the working copy, then maybe I should
stop doing it also.  Maybe ra_local should switch to your system as
well; maybe it should do manual comparisons of node-rev-ids as it
goes, ignoring the fs's built-in ability to detect conflict copies;
maybe ra_local, too, should ignore the base_revision argument to
replace_root -- so that this editor argument now only has meaning in
one direction; and it wouldn't call fs_copy anymore, since it just
looks at node-rev-ids.  I could do all this, but I wouldn't be happy.

The real issue is that I can't stand the thought of a world where we
discover a commit-merge bug, but it only happens in one ra layer and
not the other.  We shouldn't have two independent commit systems.

Re: transaction roots

Posted by Karl Fogel <kf...@collab.net>.
Ben Collins-Sussman <su...@newton.ch.collab.net> writes:
> But as I mentioned, doesn't it seem odd that one ra layer be
> continually calling fs_merge() during a commit, and another ra layer
> be comparing node-rev-ids during a commit.... both to achieve the
> *exact* same network-optimization effect?   Why have two different
> techniques for this?

I see what you mean, yeah.

I just reached my email saturation point, but let's talk about this
tomorrow.

-K

Re: transaction roots

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.
Karl Fogel <kf...@collab.net> writes:

> Greg Stein <gs...@lyra.org> writes:
> > Why wait for commit_txn? That occurs *after* all the data has been delivered
> > to the server. On my poor little 56k modem, that could be 15 minutes later.
> > Then, *one* file is out of date, I say "fuck!", update that one file, and
> > recommit the whole bloody thing.
> 
> +1 :-)
> 
> I think this is a case where the filesystem has to have code to make a
> guarantee (the merge guarantee, that commits will error with conflict
> rather than commit against out-of-date data), and ra_dav wants to have
> some very similar code in order to achieve a network optimization.
> 
> Yes, the codes are similar, but each exists for a good reason,
> independent of the other.

But as I mentioned, doesn't it seem odd that one ra layer be
continually calling fs_merge() during a commit, and another ra layer
be comparing node-rev-ids during a commit.... both to achieve the
*exact* same network-optimization effect?   Why have two different
techniques for this?

Re: transaction roots

Posted by Karl Fogel <kf...@collab.net>.
Greg Stein <gs...@lyra.org> writes:
> Why wait for commit_txn? That occurs *after* all the data has been delivered
> to the server. On my poor little 56k modem, that could be 15 minutes later.
> Then, *one* file is out of date, I say "fuck!", update that one file, and
> recommit the whole bloody thing.

+1 :-)

I think this is a case where the filesystem has to have code to make a
guarantee (the merge guarantee, that commits will error with conflict
rather than commit against out-of-date data), and ra_dav wants to have
some very similar code in order to achieve a network optimization.

Yes, the codes are similar, but each exists for a good reason,
independent of the other.

> Now, let's get back to the original question. Is there a *problem* with
> using the latest revision for the transaction root, rather than an arbitrary
> revision (i.e. the one passed to replace_root).

I don't see a problem with it.  You're able to make the guarantees you
need.  The one potential problem case you mentioned:

> I guess it is conceivable for something like this to happen:
> 
>   REV3/
>     FOO/
>       BAR(4.1)
> 
>   REV4/
>     FOO/
>       BAR(71.6)
> 
>   REV5/
>     FOO/
>       BAR(4.1)
> 
> If a user has 3:/FOO/BAR, and the latest rev is 5, then a commit to BAR will
> actually succeed since it is "up to date" :-)

...doesn't seem like much of a problem; if the commit succeeds, it's
not like the merge rule has been violated.  If someone got the
revision tree into that odd state, it means they've managed to do a
totally clean reversion back to REV3 for BAR, in which case it's
reasonable to consider someone committing from REV3 as "up-to-date".
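
A small sketch of that case, assuming the check is a plain equality test on
node IDs. The table bar_ids and the helper is_up_to_date are hypothetical
names for illustration, not FS calls:

```python
# Toy model of the REV3/REV4/REV5 example; IDs are plain strings.
bar_ids = {3: "4.1", 4: "71.6", 5: "4.1"}  # revision -> ID of /FOO/BAR
latest = max(bar_ids)


def is_up_to_date(base_rev: int) -> bool:
    """A base is accepted when its ID equals the ID in the latest tree."""
    return bar_ids[base_rev] == bar_ids[latest]


print(is_up_to_date(3))  # True: BAR was reverted to the same node
print(is_up_to_date(4))  # False: 71.6 is no longer what latest holds
```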

Go for it, I say. :-)

-K

Re: transaction roots

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Mar 29, 2001 at 07:50:12AM -0600, Ben Collins-Sussman wrote:
> Greg Stein <gs...@lyra.org> writes:
> > Can anybody see a problem with creating transactions based on the latest,
> > and verifying that the base of each desired change matches that within the
> > latest tree?
> 
> My complaint is that you're making mod_dav_svn do a whole bunch of
> work that is *already* being done by svn_fs_merge() at commit-time.

I'm doing the check sooner, rather than later. I see no problem with that.

> Why reinvent the wheel?  When you call svn_fs_commit_txn, it will
> automatically decide if the user is out-of-date.  You don't need to do
> this check yourself as you build a txn against 'youngest'.

Why wait for commit_txn? That occurs *after* all the data has been delivered
to the server. On my poor little 56k modem, that could be 15 minutes later.
Then, *one* file is out of date, I say "fuck!", update that one file, and
recommit the whole bloody thing.

> Why create a new codepath to test?  There's no need to have two
> different methods of building and committing transactions, depending
> on which ra layer you use.  That's just an ugly, internal
> inconsistency.

Okay, then drop ra_local and we have one code path to test. :-)

> Just build a txn based on whatever revision the client passes to your
> replace_root() function.  When you get a replace_dir(rev) and rev !=
> the parent's rev, then call svn_fs_copy.  That's what ra_local does.

replace_root? What's that?

mod_dav_svn does not have a replace_root call, and the replace_root on the
client is not marshalled over the wire. IOW, the base_revision parameter to
replace_root is dropped on the floor.

My question about the transaction roots, and using the latest, also leads to
the fact that I never need to do an svn_fs_copy. If you want to talk about
extra work, ra_local is the one doing the work with those copies :-)


Now, let's get back to the original question. Is there a *problem* with
using the latest revision for the transaction root, rather than an arbitrary
revision (i.e. the one passed to replace_root).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: transaction roots

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.
(Sorry, resending because my mailer died...)

Greg Stein <gs...@lyra.org> writes:

> Can anybody see a problem with creating transactions based on the latest,
> and verifying that the base of each desired change matches that within the
> latest tree?

My complaint is that you're making mod_dav_svn do a whole bunch of
work that is *already* being done by svn_fs_merge() at commit-time.

Why reinvent the wheel?  When you call svn_fs_commit_txn, it will
automatically decide if the user is out-of-date.  You don't need to do
this check yourself as you build a txn against 'youngest'.

Why create a new codepath to test?  There's no need to have two
different methods of building and committing transactions, depending
on which ra layer you use.  That's just an ugly, internal
inconsistency.  

Just build a txn based on whatever revision the client passes to your
replace_root() function.  When you get a replace_dir(rev) and rev !=
the parent's rev, then call svn_fs_copy.  That's what ra_local does.

Re: transaction roots

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.
Jim Blandy <ji...@zwingli.cygnus.com> writes:

> > But: Greg, were you aware that svn_fs_merge() is not only for trees?
> > It just takes three roots and three paths -- those pairs can result in
> > any kind of object.  So maybe svn_fs_merge() gives you the information
> > you need after all.  (Or maybe not?  Let us know...)
> 
> I think it doesn't, because Greg does his merge with no base node
> whatsoever.  He never ever looks at the base revision.  He stashes
> exactly the information the merge needs from the base revision in
> those wc properties, so he never actually needs the base revision
> itself.
> 
> So any function that operates on three actual nodes isn't useful to
> him; he only needs two nodes, and some administrative info.
> 
> Is that right, Greg?

This is effectively what Greg just said on the phone to me.

So -- we'll probably need to write one new fs routine that both RA
layers can use.

Re: transaction roots

Posted by Jim Blandy <ji...@zwingli.cygnus.com>.
Karl Fogel <kf...@collab.net> writes:
> Jim Blandy <ji...@zwingli.cygnus.com> writes:
> > Ahh, I just thought of the right answer to this question:
> > 
> > In the approach you suggest, you're basically re-implementing all the
> > logic in svn_fs_merge within DAV.  This means that the filesystem has
> > packaged up that logic in a way that's not useful to you.  If you
> > could think of an alternative, easy-to-understand interface that would
> > allow you and the commit process to share the logic, that would be the
> > best outcome.
> > 
> > I guess the logic isn't that complex, so the duplication still isn't a
> > big deal.  It's just finesse points.
> 
> Yeah, good point.
> 
> But: Greg, were you aware that svn_fs_merge() is not only for trees?
> It just takes three roots and three paths -- those pairs can result in
> any kind of object.  So maybe svn_fs_merge() gives you the information
> you need after all.  (Or maybe not?  Let us know...)

I think it doesn't, because Greg does his merge with no base node
whatsoever.  He never ever looks at the base revision.  He stashes
exactly the information the merge needs from the base revision in
those wc properties, so he never actually needs the base revision
itself.

So any function that operates on three actual nodes isn't useful to
him; he only needs two nodes, and some administrative info.

Is that right, Greg?

Re: transaction roots

Posted by Greg Stein <gs...@lyra.org>.
On Fri, Mar 30, 2001 at 03:23:18PM -0600, Ben Collins-Sussman wrote:
> Greg Stein <gs...@lyra.org> writes:
>...
> > Merge happens *after* the fact. I'm checking before even starting a change.
> > To use merge, I'd have to wait for the postfix deltas, change the nodes in
> > my transaction, then attempt the merge to see if it was legal to do the
> > postfix delta in the first place!
> 
> Merge only looks at node-rev-ids and their relationship to one
> another.  It makes no difference what their contents are.  That's why
> I was planning to call merge() after each skeletal change, before any
> textdeltas were sent.

Ah! But how do you generate a new node-rev-id if you don't have a delta? You
would need to actually change the node to an empty string or something
(temporarily; until the txdelta arrived). That would give you the new node
id, which fs_merge can then use for conflict detection.

> But now I'm with you Greg...

:-)

> > Second, if I pass directories to merge, then I'm kinda screwed, no? It will
> > recurse all the way down...
> > 
> 
> Yah, this is inefficient.  :)
> 
> We need that new fs routine.

Yup. We can talk about the form/API next week.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: transaction roots

Posted by Ben Collins-Sussman <su...@newton.ch.collab.net>.
Greg Stein <gs...@lyra.org> writes:

> On Fri, Mar 30, 2001 at 01:40:39PM -0600, Karl Fogel wrote:
> >...
> > But: Greg, were you aware that svn_fs_merge() is not only for trees?
> > It just takes three roots and three paths -- those pairs can result in
> > any kind of object.  So maybe svn_fs_merge() gives you the information
> > you need after all.  (Or maybe not?  Let us know...)
> 
> Merge happens *after* the fact. I'm checking before even starting a change.
> To use merge, I'd have to wait for the postfix deltas, change the nodes in
> my transaction, then attempt the merge to see if it was legal to do the
> postfix delta in the first place!

Merge only looks at node-rev-ids and their relationship to one
another.  It makes no difference what their contents are.  That's why
I was planning to call merge() after each skeletal change, before any
textdeltas were sent.

But now I'm with you Greg...

> 
> Second, if I pass directories to merge, then I'm kinda screwed, no? It will
> recurse all the way down...
> 

Yah, this is inefficient.  :)

We need that new fs routine.

Re: transaction roots

Posted by Karl Fogel <kf...@collab.net>.
Greg Stein <gs...@lyra.org> writes:
> Merge happens *after* the fact. I'm checking before even starting a change.
> To use merge, I'd have to wait for the postfix deltas, change the nodes in
> my transaction, then attempt the merge to see if it was legal to do the
> postfix delta in the first place!
> 
> Second, if I pass directories to merge, then I'm kinda screwed, no? It will
> recurse all the way down...
> 
> Merge is "I did this. Was it okay?" ra_dav/mod_dav_svn is "Can I do this?"
> I much prefer the latter :-)

Gotcha, thanks for the enlightening explanation.

Re: transaction roots

Posted by Greg Stein <gs...@lyra.org>.
On Fri, Mar 30, 2001 at 01:40:39PM -0600, Karl Fogel wrote:
>...
> But: Greg, were you aware that svn_fs_merge() is not only for trees?
> It just takes three roots and three paths -- those pairs can result in
> any kind of object.  So maybe svn_fs_merge() gives you the information
> you need after all.  (Or maybe not?  Let us know...)

Merge happens *after* the fact. I'm checking before even starting a change.
To use merge, I'd have to wait for the postfix deltas, change the nodes in
my transaction, then attempt the merge to see if it was legal to do the
postfix delta in the first place!

Second, if I pass directories to merge, then I'm kinda screwed, no? It will
recurse all the way down...

Merge is "I did this. Was it okay?" ra_dav/mod_dav_svn is "Can I do this?"
I much prefer the latter :-)
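
[ To put rough numbers on the difference, here is a toy sketch. It assumes
  the check is an ID comparison in both cases and only counts how many
  delta bytes go over the wire before a rejection. late_check, early_check
  and the byte sizes are made up for illustration. ]

```python
# Toy model only: compares when the out-of-date check happens.
from typing import Dict, List, Tuple

Change = Tuple[str, str, int]  # (path, base node ID, delta size in bytes)


def late_check(txn_tree: Dict[str, str],
               changes: List[Change]) -> Tuple[bool, int]:
    """Send every delta first, then ask "I did this. Was it okay?"."""
    sent = sum(size for _, _, size in changes)
    ok = all(txn_tree.get(path) == base_id for path, base_id, _ in changes)
    return ok, sent


def early_check(txn_tree: Dict[str, str],
                changes: List[Change]) -> Tuple[bool, int]:
    """Ask "Can I do this?" for every item before sending any delta."""
    for path, base_id, _ in changes:
        if txn_tree.get(path) != base_id:
            return False, 0  # rejected before any delta was sent
    return True, sum(size for _, _, size in changes)


txn_tree = {"/FOO/BAR": "71.6", "/FOO/BAZ": "9.1"}
changes = [("/FOO/BAR", "4.1", 500_000), ("/FOO/BAZ", "9.1", 4_000_000)]

print(late_check(txn_tree, changes))   # (False, 4500000)
print(early_check(txn_tree, changes))  # (False, 0)
```

Both reject the commit because BAR is out of date; the early check just
does so before the deltas are transferred.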

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: transaction roots

Posted by Karl Fogel <kf...@collab.net>.
Jim Blandy <ji...@zwingli.cygnus.com> writes:
> Ahh, I just thought of the right answer to this question:
> 
> In the approach you suggest, you're basically re-implementing all the
> logic in svn_fs_merge within DAV.  This means that the filesystem has
> packaged up that logic in a way that's not useful to you.  If you
> could think of an alternative, easy-to-understand interface that would
> allow you and the commit process to share the logic, that would be the
> best outcome.
> 
> I guess the logic isn't that complex, so the duplication still isn't a
> big deal.  It's just finesse points.

Yeah, good point.

But: Greg, were you aware that svn_fs_merge() is not only for trees?
It just takes three roots and three paths -- those pairs can result in
any kind of object.  So maybe svn_fs_merge() gives you the information
you need after all.  (Or maybe not?  Let us know...)

-K

Re: transaction roots

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Mar 29, 2001 at 10:58:09PM -0500, Jim Blandy wrote:
> Greg Stein <gs...@lyra.org> writes:
> > I've been creating transactions based on the "youngest" revision, and then
> > applying changes to that. I was wondering if anybody sees a problem with
> > doing that.
> 
> Ahh, I just thought of the right answer to this question:
> 
> In the approach you suggest, you're basically re-implementing all the
> logic in svn_fs_merge within DAV.  This means that the filesystem has
> packaged up that logic in a way that's not useful to you.  If you
> could think of an alternative, easy-to-understand interface that would
> allow you and the commit process to share the logic, that would be the
> best outcome.
> 
> I guess the logic isn't that complex, so the duplication still isn't a
> big deal.  It's just finesse points.

Ah. Interesting re-think. Quite right. That is a valid description of the
situation, and yes: finesse will help :-)

[ play bridge, by any chance? ]

[ tossing ideas here; I don't have an immediate answer, but throwing stuff
  out might help somebody key in on the answer... ]

Lessee... at heart, ra_dav has knowledge that a particular client-side item
is *that* node on the server. The server then verifies that the item being
modified is the most recent.

Currently, I refer to the item using (ID, PATH). But I've been massaging the
code, or implementing new code, by referring to that tuple as STABLE-ID.
Consider it as a unique value to refer to a particular instance of PATH in
time. (where instance "I" may be visible in REV(N)..REV(M))

Hmm. Maybe the FS could export something like, "I want to modify STABLE-ID.
Is that the most current? Am I allowed?"

That question is answered within mod_dav_svn by comparing STABLE-ID against
that of the item in TXN (where TXN was created as the latest). Hmm. I was
about to say that it is independent of the TXN, but that isn't true. TXN
could hold something different from LATEST. Okay... so the question is
within the context of TXN.

The editor answers the question by copying STABLE-ID into TXN, then
attempting a merge.

Back to Jim's query: the FS exports items as "FOO in REV". The client seems
to be more interested in "I've got XYZ, can I modify it?" Leading up to how
to represent XYZ such that the question is possible. ra_dav and ra_local
have each manhandled their own approaches to enable that question.
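
[ One possible shape for that question, as a sketch only. StableId and
  can_modify are hypothetical names, not a proposed FS signature. The point
  is that the caller supplies the stashed (ID, PATH) pair and the
  transaction, and never opens the base revision. ]

```python
# Toy model only: a transaction tree is a dict of path -> node ID.
from typing import Dict, NamedTuple


class StableId(NamedTuple):
    """The (ID, PATH) tuple the client stashed for an item."""
    node_id: str
    path: str


def can_modify(txn_tree: Dict[str, str], item: StableId) -> bool:
    """Answer "I've got XYZ, can I modify it?" within the context of TXN."""
    return txn_tree.get(item.path) == item.node_id


txn_tree = {"/FOO/BAR": "71.6"}
print(can_modify(txn_tree, StableId("71.6", "/FOO/BAR")))  # True
print(can_modify(txn_tree, StableId("4.1", "/FOO/BAR")))   # False
```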

Does that help? Ping any particular axons?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: transaction roots

Posted by Jim Blandy <ji...@zwingli.cygnus.com>.
Greg Stein <gs...@lyra.org> writes:
> I've been creating transactions based on the "youngest" revision, and then
> applying changes to that. I was wondering if anybody sees a problem with
> doing that.

Ahh, I just thought of the right answer to this question:

In the approach you suggest, you're basically re-implementing all the
logic in svn_fs_merge within DAV.  This means that the filesystem has
packaged up that logic in a way that's not useful to you.  If you
could think of an alternative, easy-to-understand interface that would
allow you and the commit process to share the logic, that would be the
best outcome.

I guess the logic isn't that complex, so the duplication still isn't a
big deal.  It's just finesse points.

Re: transaction roots

Posted by Jim Blandy <ji...@zwingli.cygnus.com>.
Greg Stein <gs...@lyra.org> writes:
> > In your arrangement, you apply the modifications and do the merge in
> > one pass.  You get better locality, and less intermediate mucking
> > around in the database, but the logic is a bit more complex.
> 
> Hmm. I didn't really think that I was doing a merge, but I think I see what
> you're talking about. I'm taking the latest revision for the transaction
> root, then making the WC's changes. That application of the WC's changes to
> the txn root is effectively merging. Is that what you mean?

Yes.

> Personally, I think it "just happens." All that I see is "make a tree. make
> some changes." The fact that "merging" happens is some kind of neato side
> effect to the process :-). This is why I asked where you saw the
> complexity.

Fair enough.  Two sides of a coin.

Re: transaction roots

Posted by Greg Stein <gs...@lyra.org>.
[ just to clarify for myself; Jim sees complexity, so I'd like to understand
  it better since nobody likes complexity :-) ]

On Fri, Mar 30, 2001 at 05:53:33PM -0500, Jim Blandy wrote:
>...
> In your arrangement, you apply the modifications and do the merge in
> one pass.  You get better locality, and less intermediate mucking
> around in the database, but the logic is a bit more complex.

Hmm. I didn't really think that I was doing a merge, but I think I see what
you're talking about. I'm taking the latest revision for the transaction
root, then making the WC's changes. That application of the WC's changes to
the txn root is effectively merging. Is that what you mean?

Personally, I think it "just happens." All that I see is "make a tree. make
some changes." The fact that "merging" happens is some kind of neato side
effect to the process :-). This is why I asked where you saw the complexity.

[ and, of course, there is always the final merge at commit time in case
  another revision was created while the transaction was being built ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: transaction roots

Posted by Jim Blandy <ji...@zwingli.cygnus.com>.
Greg Stein <gs...@lyra.org> writes:
> On Thu, Mar 29, 2001 at 10:43:27PM -0500, Jim Blandy wrote:
> >...
> > If I were really looking for something to complain about, I'd say that
> > this reminds me of folding together two loops into one loop with a
> > more complex body.  It makes the process harder to follow overall.
> > But we're getting good behaviour out of it, and probably better
> > locality.
> 
> Hmm. I'd be interested to hear more about this. i.e. what part is complex,
> and what part is harder to follow? I'm not feeling that, but it could be
> that I'm too close to what I'm doing, or that I haven't explained it
> adequately.

No, I'm with you --- I think it's pretty clear.  It's just a general
principle kind of thing.

In the commit system Ben, Karl, and I had originally imagined, one
creates a transaction that is initially identical to the WC's base
revision, and then modifies it to be identical to the WC.  Then, you
walk over it and merge that with subsequent commits.

In your arrangement, you apply the modifications and do the merge in
one pass.  You get better locality, and less intermediate mucking
around in the database, but the logic is a bit more complex.
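
To make the comparison concrete, here is a rough sketch of the two
arrangements on a toy model where a tree is a dict of path -> node ID and
the merge is a per-path three-way comparison. The names two_pass, one_pass
and Conflict are hypothetical, and deletions are not modelled:

```python
# Toy model only: neither function is the real FS merge.
from typing import Dict

Tree = Dict[str, str]  # path -> node ID


class Conflict(Exception):
    pass


def two_pass(base: Tree, latest: Tree, changes: Tree) -> Tree:
    """Build a txn from the base, apply the changes, then merge with latest."""
    txn = dict(base)
    txn.update(changes)
    merged: Tree = {}
    for path in set(base) | set(latest) | set(txn):
        b, l, t = base.get(path), latest.get(path), txn.get(path)
        if t == b:
            result = l            # we did not touch it: take latest
        elif l == b:
            result = t            # only we touched it: keep ours
        else:
            raise Conflict(path)  # both sides changed it
        if result is not None:
            merged[path] = result
    return merged


def one_pass(base: Tree, latest: Tree, changes: Tree) -> Tree:
    """Build a txn from latest, checking each change's base as it is applied."""
    txn = dict(latest)
    for path, new_id in changes.items():
        if base.get(path) != latest.get(path):
            raise Conflict(path)
        txn[path] = new_id
    return txn


base = {"/A": "a1", "/B": "b1"}
latest = {"/A": "a1", "/B": "b2", "/C": "c1"}
changes = {"/A": "a2"}

print(two_pass(base, latest, changes) == one_pass(base, latest, changes))  # True
```

On this input both arrangements produce the same tree; a change to /B
would raise Conflict in either one.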

Re: transaction roots

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Mar 29, 2001 at 10:43:27PM -0500, Jim Blandy wrote:
>...
> If I were really looking for something to complain about, I'd say that
> this reminds me of folding together two loops into one loop with a
> more complex body.  It makes the process harder to follow overall.
> But we're getting good behaviour out of it, and probably better
> locality.

Hmm. I'd be interested to hear more about this. i.e. what part is complex,
and what part is harder to follow? I'm not feeling that, but it could be
that I'm too close to what I'm doing, or that I haven't explained it
adequately.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: transaction roots

Posted by Jim Blandy <ji...@zwingli.cygnus.com>.
Greg Stein <gs...@lyra.org> writes:
> I guess it is conceivable for something like this to happen:
> 
>   REV3/
>     FOO/
>       BAR(4.1)
> 
>   REV4/
>     FOO/
>       BAR(71.6)
> 
>   REV5/
>     FOO/
>       BAR(4.1)
> 
> If a user has 3:/FOO/BAR, and the latest rev is 5, then a commit to BAR will
> actually succeed since it is "up to date" :-)

The old way of doing this won't detect this case either.  One would
have to scan every revision between your base and youngest to catch
these; it doesn't seem worth it.

> Can anybody see a problem with creating transactions based on the latest,
> and verifying that the base of each desired change matches that within the
> latest tree?

I don't think so.  I think you're just folding together the working
tree construction phase and the merge phase.  If you have some way of
knowing the base revision of each node being changed, then there's no
point in actually going back to the base revision to find it.

Of course, a merge will still be necessary, to handle other transactions
people have committed while you've been building yours, so this
doesn't simplify the filesystem's needs.  But it's nice to be able to
catch conflicts earlier, and the merge will usually need to do less
work when it does happen.

If our server-side merging were doing something fancier, then your
approach might interfere, but, well, we're not doing anything more
sophisticated.  And Karl has made pretty good arguments that one
shouldn't, anyway.

If I were really looking for something to complain about, I'd say that
this reminds me of folding together two loops into one loop with a
more complex body.  It makes the process harder to follow overall.
But we're getting good behaviour out of it, and probably better
locality.