Posted to dev@subversion.apache.org by Tom Lord <lo...@regexps.com> on 2002/08/10 03:12:00 UTC

what proofs look like (relates to why i find you guys ridiculous to try to work with in spite of the fact we both work on "open source")


This is a _pretty good_ example of a proof:

From: Tom Lord <lo...@regexps.com>
To: arch-dev@regexps.com
Subject: [arch-dev] towards formal patch set semantics
Date: Mon, 8 Jul 2002 14:03:16 -0700 (PDT)




!	 How Whole Tree Patching Works


  This note contains a formalized description of how whole tree
  patch sets handle tree rearrangements.

  There are two limitations: the math of the formalization is "pretty
  good", but could use a little more precision for a final draft
  patch set standard;  only the (interesting) problem of tree
  rearrangements is addressed -- other details of patching aren't
  addressed (yet).

  This is long and rather dry with lots of equations -- so it isn't
  for all readers.  

  On the other hand, there's an important theorem proven here
  ({theorem: rearrangement inventory equivalence}) whose significance
  is that it proves that our patch set format contains enough
  information for exact patching (in spite of not containing complete
  inventories for the `orig' and `mod' trees).

  Additionally, if you want to try formally defining operations such
  as patch set composition, this document contains enough formalism
  for you to work out how to compose tree rearrangements.

* overview

  In whole tree patching, we have three filesystem trees: `orig',
  `mod', and `target'.  

  First, we'll compare `orig' and `mod' to produce a data stream
  called a "patch set" that describes the changes between them.

  Second, we'll "apply" the patch set to the target -- which means
  we want to make the same set of changes to the target tree.

  Two difficulties arise: one, the `target' tree may not be identical
  to the `orig' tree, in which case it isn't a priori clear what it
  means to apply the "same changes"; two, we want to include file tree
  structural rearrangements among the changes we record when computing
  patch sets and replay while applying them.


* the focus of this document is on tree-structure patching

** patch sets have two parts

  A whole tree patch set has two parts:

	*) a set of __individual file changes__
	*) a __tree rearrangement__


** individual file changes

  Patch set contents for individual files are created by programs
  such as `diff' and applied by programs such as `patch'.  
  The details are fairly banal and aren't important to this note.


** tree rearrangements

  Tree rearrangements are based on comparing `orig' and `mod' to see
  how files and directories have been rearranged, and 
  applying that "same" transformation to `target'.

  The problem is defining what "same" means when the target tree
  does not precisely match the original tree.  So, exploring that
  topic is what comes next:


* what's a tree structure (formally)?

  Let's do some intuitive math.

** tag sets

<<<
	DirTags		:=	"the set of directory tags"
	FileTags	:=	"the set of file tags"
	SymlinkTags	:=	"the set of symlink tags"

	AllTags		:=	DirTags union FileTags union SymlinkTags
>>>

  The `StudlyCaps' words are set names.  The `:=' means "is defined to
  be".  The definition is in quotes which means that it's an 
  informal definition -- you're supposed to be able to fill in the
  details for yourself (e.g., you know what a tag string is because
  you are familiar with `arch').

** Tag Set Disjointness Axiom

<<<
	Axiom:

		   DirTags int FileTags
		== DirTags int SymlinkTags
		== FileTags int SymlinkTags
		== -empty-
>>>

  `int' is the set intersection operator. `-empty-' is the
  empty set.  Along with the axiom, we can define some predicates:

*** tag predicates

<<<
	dir_tag?(x)	:=	x member DirTags
	file_tag?(x)	:=	x member FileTags
	symlink_tag?(x)	:=	x member SymlinkTags
>>>
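
  As a rough illustration (not part of `arch'), the tag sets, the
  disjointness axiom and the predicates can be modelled with Python
  sets.  The tag strings below are made up for the example; real tags
  look different.

```python
# Hypothetical tag strings, invented for this sketch.
dir_tags = {"d:src", "d:doc"}
file_tags = {"f:main.c", "f:README"}
symlink_tags = {"l:latest"}

all_tags = dir_tags | file_tags | symlink_tags


def pairwise_disjoint(*sets):
    """True if no element belongs to more than one of the given sets."""
    return sum(len(s) for s in sets) == len(set().union(*sets))


def is_dir_tag(x):
    return x in dir_tags


def is_file_tag(x):
    return x in file_tags


def is_symlink_tag(x):
    return x in symlink_tags


# The axiom, checked for this particular example.
axiom_holds = pairwise_disjoint(dir_tags, file_tags, symlink_tags)
```

  Given the axiom, at most one of the three predicates is true for any
  tag.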


** relative path names

<<<
	RelPaths	:=	"the set of simplified down-relative paths"
>>>

  `Simplified' means that the paths do not contain `./' or `/..'
  components.  `Down-relative' means that the paths do not
  begin with `/' or `..'.
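
  A minimal sketch of a membership test for `RelPaths', in Python.
  The function name is invented here, and rejecting empty components
  (as in `a//b' or a trailing `/') is an extra assumption that the
  definition above does not spell out.

```python
def is_simplified_down_relative(path):
    """Rough membership test for RelPaths (a sketch, see assumptions above)."""
    # Down-relative: must not be empty or start at the root.
    if not path or path.startswith("/"):
        return False
    # Simplified: no "." or ".." components anywhere.  This also rejects a
    # leading "..".  Empty components are rejected as an added assumption.
    return all(c not in ("", ".", "..") for c in path.split("/"))
```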


** so what's a tree (structure)?

  Each of our three trees is characterized by an _inventory_ function
  whose type we approximately know:

<<<
	inventory_orig		:	AllTags <-> RelPaths
	inventory_mod		:	AllTags <-> RelPaths
	inventory_target	:	AllTags <-> RelPaths
>>>

  (The notation means that each `inventory_*' is an invertible
  function between a subset of `AllTags' and a subset of `RelPaths'.)

  The precise domains and ranges of those functions are of 
  interest:

<<<
	orig_tags	:=	Dom (inventory_orig)
	mod_tags	:=	Dom (inventory_mod)
	target_tags	:=	Dom (inventory_target)
	
	orig_paths	:=	Rng (inventory_orig)
	mod_paths	:=	Rng (inventory_mod)
	target_paths	:=	Rng (inventory_target)
	
>>>
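
  One way to picture an inventory is as a Python dict from tags to
  paths.  The sketch below uses made-up tags and paths; `is_invertible'
  and `invert' are names invented for the example.

```python
# Hypothetical inventory: tag -> relative path.
inventory_orig = {"t1": "src", "t2": "src/main.c", "t3": "README"}


def is_invertible(inventory):
    """A dict is invertible when no two tags share a path."""
    return len(set(inventory.values())) == len(inventory)


def invert(inventory):
    """Return the path -> tag mapping, refusing non-invertible input."""
    if not is_invertible(inventory):
        raise ValueError("two tags map to the same path")
    return {path: tag for tag, path in inventory.items()}


orig_tags = set(inventory_orig)            # Dom (inventory_orig)
orig_paths = set(inventory_orig.values())  # Rng (inventory_orig)
```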


** comparing two trees


  Let's start to compare `orig' and `mod' in order to produce a 
  patch set.

*** interesting sets while comparing trees

  A few interesting sets are:

<<<
	removed_tags	:=	Dom (inventory_orig) - Dom (inventory_mod)
	added_tags	:=	Dom (inventory_mod) - Dom (inventory_orig)
	common_tags	:=	Dom (inventory_mod) int Dom (inventory_orig)
>>>

  The sets `removed_tags' and `added_tags' are the deleted and new
  files, symlinks and directories.  The set `common_tags' contains the
  files, symlinks and directories found in both `orig' and `mod'
  (whether moved or not).
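
  With inventories modelled as Python dicts (an assumption of this
  sketch, with invented tags and paths), the three sets fall out of
  ordinary set operations on the keys:

```python
# Hypothetical inventories: tag -> relative path.
inventory_orig = {"t1": "README", "t2": "src/main.c", "t3": "src/old.c"}
inventory_mod = {"t1": "README", "t2": "lib/main.c", "t4": "src/new.c"}

# dict.keys() views support set difference and intersection directly.
removed_tags = inventory_orig.keys() - inventory_mod.keys()
added_tags = inventory_mod.keys() - inventory_orig.keys()
common_tags = inventory_mod.keys() & inventory_orig.keys()
```

  Here `t3' was removed, `t4' was added, and `t1' and `t2' are common
  (even though `t2' changed its path).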

*** lemma: disjointness of tree comparison sets

<<<
	Lemma:
		   removed_tags int added_tags
		== removed_tags int common_tags
		== added_tags int common_tags
		== -empty-

	Proof:

	From the definitions, with A := Dom (inventory_orig) and
	B := Dom (inventory_mod), we know that:

		[i] removed_tags == A - B
		[ii] added_tags == B - A
		[iii] common_tags == A int B

	From that it follows that:

		[iv] removed_tags int B == -empty-	(from [i])
		[v] added_tags int A == -empty-		(from [ii])

		[vi] removed_tags is_subset A		(from [i])
		[vii] added_tags is_subset B		(from [ii])
		[viii] common_tags is_subset A		(from [iii])
		[ix] common_tags is_subset B		(from [iii])

	From that it follows that __each of the following sets
	is `-empty-'__:

		[x] removed_tags int added_tags		(from [iv] + [vii])
		[xi] removed_tags int common_tags	(from [iv] + [ix])
		[xii] added_tags int common_tags	(from [v] + [viii])

	QED
>>>

*** the set of moved files, directories, and symlinks

  Another set of interest is:

<<<
	moved_tags	:= { x | 
			         (x is_in common_tags)
				and
				 (inventory_orig(x) != inventory_mod(x))
			   }
>>>
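
  The set-builder definition translates almost word for word into a
  Python set comprehension.  The inventories are invented for the
  sketch.

```python
# Hypothetical inventories: tag -> relative path.
inventory_orig = {"t1": "README", "t2": "src/main.c", "t3": "src/old.c"}
inventory_mod = {"t1": "README", "t2": "lib/main.c", "t4": "src/new.c"}

common_tags = inventory_orig.keys() & inventory_mod.keys()

# A tag has moved if it is in both trees but names a different path.
moved_tags = {
    x for x in common_tags
    if inventory_orig[x] != inventory_mod[x]
}
```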


*** lemma: disjointedness of interesting tree comparison sets

<<<
	Lemma:
		   removed_tags int added_tags
		== removed_tags int moved_tags
		== added_tags int moved_tags
		== -empty-

	Proof:

	From the definitions, we know that:

		moved_tags is_subset common_tags

	thus this lemma follows trivially from
	the lemma "disjointness of tree comparison sets".

	QED
>>>


*** the two tree-relative sets of interesting tags

  We just constructed three sets of tags: `added_', `removed_', and
  `moved_' tags.  Not all of those tags are present in both trees:
  `orig' contains only the removed and moved tags, and `mod' contains
  only the added and moved tags.  So each tree has its own set:

<<<
	orig_rearrangement_tags	:=	removed_tags union moved_tags
	mod_rearrangement_tags	:=	added_tags union moved_tags
>>>

*** theorem: rearrangement set equivalences

  This theorem is useful because it means that given only
  the two `_rearrangement_tags' sets, we can recover the 
  three `added_/removed_/moved_' sets.

<<<
	Theorem:

	  [I] moved_tags == orig_rearrangement_tags int mod_rearrangement_tags

	  [II] removed_tags == orig_rearrangement_tags - mod_rearrangement_tags
  
	  [III] added_tags == mod_rearrangement_tags - orig_rearrangement_tags

	Proof:

	  With shorthand:

		A := added_tags
		M := moved_tags
		R := removed_tags

		G := orig_rearrangement_tags
		D := mod_rearrangement_tags


	  of [I]...

	   [i] (from defs)

	           (G int D)
		== ((R union M) int (A union M))

  
	
	   [ii] (from set algebra)

		== (A int (R union M)) union (M int (R union M))
		== (A int R) union (A int M) union M

	   [iii] (from {lemma: disjointedness of interesting tree comparison sets})

		== -empty- union -empty- union M
		== M


	  of [II] (and [III] by symmetry):

	   [i] (from defs)

	       	   (G - D)
		== (R union M) - (A union M)

	   [ii] (from set algebra)

		== R - (A union M)
		== (R - A) - M

	   [iii] (from {lemma: disjointedness of interesting tree
	          comparison sets})

		== R - M
		== R

	QED

>>>
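
  The theorem can be spot-checked on a small example.  This is only a
  sanity check in Python, not a substitute for the proof; `compare' and
  `recover' are names invented for the sketch, and the inventories are
  made up.

```python
def compare(inventory_orig, inventory_mod):
    """Return (removed, added, moved) tag sets for two inventories."""
    removed = inventory_orig.keys() - inventory_mod.keys()
    added = inventory_mod.keys() - inventory_orig.keys()
    common = inventory_orig.keys() & inventory_mod.keys()
    moved = {x for x in common if inventory_orig[x] != inventory_mod[x]}
    return removed, added, moved


def recover(orig_rearrangement_tags, mod_rearrangement_tags):
    """Recover (removed, added, moved) via identities [II], [III], [I]."""
    moved = orig_rearrangement_tags & mod_rearrangement_tags
    removed = orig_rearrangement_tags - mod_rearrangement_tags
    added = mod_rearrangement_tags - orig_rearrangement_tags
    return removed, added, moved


# Hypothetical inventories: tag -> relative path.
inventory_orig = {"t1": "README", "t2": "src/main.c", "t3": "src/old.c"}
inventory_mod = {"t1": "README", "t2": "lib/main.c", "t4": "src/new.c"}

removed_tags, added_tags, moved_tags = compare(inventory_orig, inventory_mod)
orig_rearrangement_tags = removed_tags | moved_tags
mod_rearrangement_tags = added_tags | moved_tags

recovered = recover(orig_rearrangement_tags, mod_rearrangement_tags)
```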



*** interesting functions while comparing trees

  Given those sets, we can define some new functions by restricting
  the domains of the `inventory_' functions:

<<<

	rearrangement_inventory_orig :=
		restrict(inventory_orig, orig_rearrangement_tags)

	rearrangement_inventory_mod :=
		restrict(inventory_mod, mod_rearrangement_tags)

>>>
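
  If inventories are modelled as Python dicts (an assumption of this
  sketch), `restrict' is a one-line filter on the keys.  The example
  data is invented.

```python
def restrict(function, domain):
    """Restrict a dict-as-function to the keys that lie in `domain`."""
    return {x: y for x, y in function.items() if x in domain}


# Hypothetical inventories and rearrangement tag sets.
inventory_orig = {"t1": "README", "t2": "src/main.c", "t3": "src/old.c"}
inventory_mod = {"t1": "README", "t2": "lib/main.c", "t4": "src/new.c"}
orig_rearrangement_tags = {"t2", "t3"}  # removed union moved
mod_rearrangement_tags = {"t2", "t4"}   # added union moved

rearrangement_inventory_orig = restrict(inventory_orig, orig_rearrangement_tags)
rearrangement_inventory_mod = restrict(inventory_mod, mod_rearrangement_tags)
```

  The unchanged entry `t1' appears in neither rearrangement inventory.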


*** theorem: rearrangement inventory equivalence

  Here is an interesting and really important equation:

<<<
	Theorem:

		inventory_mod(x)
  
	     == by-cases:

	        -undefined- if x is_in removed_tags
		rearrangement_inventory_mod(x) if x is_in moved_tags
		rearrangement_inventory_mod(x) if x is_in added_tags
		inventory_orig(x) otherwise
  

		inventory_orig(x)
  
	     == by-cases:

	        -undefined- if x is_in added_tags
		rearrangement_inventory_orig(x) if x is_in moved_tags
		rearrangement_inventory_orig(x) if x is_in removed_tags
		inventory_mod(x) otherwise
  

	Proof:

	   [omitted for brevity but simple]
>>>

  
  What that tells us is that if we have `rearrangement_inventory_mod',
  the `added_/removed_/moved_tags' sets, and `inventory_orig', then
  we can reconstruct `inventory_mod'.

  Symmetrically, with `rearrangement_inventory_orig', the `added_/...'
  sets, and `inventory_mod', we can reconstruct `inventory_orig'.

  What that means in English is that if we have a patch set plus a
  copy of either `orig' or `mod', and the patch contains rearrangement
  inventories for the `orig' and `mod' trees, then we have all the
  information we need for forwards or backwards exact tree
  rearrangements.  (We know we have all the information we need
  because the theorem above tells us how to reconstruct all of the
  relevant information there is from just the inventory of the tree we
  have plus the contents of the patch set.)
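
  For the exact case, the by-cases equation can be read as a small
  procedure.  The Python below is a sketch under the assumption that
  inventories are dicts from tag to path; `apply_rearrangement' is a
  name invented here, and it says nothing about inexact patching.

```python
def apply_rearrangement(inventory_have, rearr_have, rearr_want):
    """Rebuild the other tree's inventory from the one we have.

    rearr_have / rearr_want are the rearrangement inventories for the
    tree we have and the tree we want (both carried by the patch set).
    """
    # Tags only in rearr_have are gone from the tree we want.
    removed = rearr_have.keys() - rearr_want.keys()
    result = {}
    for tag, path in inventory_have.items():
        if tag in removed:
            continue                        # -undefined- in the wanted tree
        if tag in rearr_want:
            result[tag] = rearr_want[tag]   # moved: take the new path
        else:
            result[tag] = path              # otherwise: unchanged
    for tag, path in rearr_want.items():
        result.setdefault(tag, path)        # added: not in the tree we have
    return result


# Hypothetical example data.
inventory_orig = {"t1": "README", "t2": "src/main.c", "t3": "src/old.c"}
inventory_mod = {"t1": "README", "t2": "lib/main.c", "t4": "src/new.c"}
rearr_orig = {"t2": "src/main.c", "t3": "src/old.c"}
rearr_mod = {"t2": "lib/main.c", "t4": "src/new.c"}

forwards = apply_rearrangement(inventory_orig, rearr_orig, rearr_mod)
backwards = apply_rearrangement(inventory_mod, rearr_mod, rearr_orig)
```

  Swapping the two rearrangement inventories is all it takes to go
  backwards, which is the symmetry the theorem describes.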


* what goes in a patch set?

  Patch set syntax (described elsewhere) includes `rename' directives
  and `change' directives.  Collectively, these include (as a subset
  of the useful information they contain) specifications
  of the `rearrangement_inventory_orig' and
  `rearrangement_inventory_mod' tables.

  That design is at least _plausible_ since (as we just proved) it is
  sufficient for forwards and backwards exact tree rearrangements.

  Whether or not it's sufficient for inexact patching is a matter
  of subjective judgement.   The semantics of inexact patching 
  haven't been formally (only informally) specified so far.  Is it
  a usefully specified semantics?


* what's missing?

  Still needed are formal models of inexact tree rearrangements and of
  individual file change application, both stated in terms of their
  impact on inventories and file contents.





---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: what proofs look like

Posted by Larry McVoy <lm...@bitmover.com>.
On Sat, Aug 10, 2002 at 12:23:10PM -0700, Tom Lord wrote:
> 	I had hoped to reply to your earlier note in as positive and
> 	constructive a mode as possible, to get past the difficult
> 	garbage, and into the space of making improvements of mutual
> 	benefit.
> 
> Do you think that's possible?

No.  Please take me off the CC list of this thread.  As gently as 
possible, with no ill will intended, I want you to hear that I can't
help you.  I don't have the extra money that you need and I doubt 
that anyone else does.  

[ Delete now if you don't want to know why we didn't take Tom's path ]

It's perhaps worth pointing out that I've been running a company doing
SCM type stuff for 5 years.  We own our IP and we use legal means
to force people to pay us for it.  And we have a good product, many
people think it is better than clearcase. 

Even with all that, the last 5 years have been a non-stop struggle to
scrape together payroll every 2 weeks.  It's a constant source of stress,
there are houses, families, kids, all of whom depend on me finding the
money to keep things going.  It's incredibly hard.  My health sucks
as a result of doing this, you have no idea of the toll it has taken.
And the part that you just can't seem to hear is that there is ABSOLUTELY
NO CHANCE that we would have made 1/100th of the money we have made if
we gave away our software for free.  And there is ABSOLUTELY NO CHANCE
that we would have made 1/100th of the money we have made even if we had
all the good parts of arch, BK, subversion, and clearcase put together
in a GPLed package.  The market simply will not pay for obscure products
unless they have to do so.

You may have a different opinion but what you are finding out is that
your opinion is wrong and that's a painful process.  I'm sorry for you,
I tried to warn you but it's understandable that you didn't listen,
we don't produce anything remotely approximating free software so we're
automatically in the "evil corporate" camp.  What you didn't get is that
I'm you.  I have the same ideals, the same goals, the same dedication,
the same drive to help the world.  I can just hear you saying "if that's
true then BitKeeper would be GPLed, you self serving bastard".  Not so.
My goal was, is, and will remain a goal of providing support for Linus
and Linux.  The difference between me and you is that I have realized 
what it really costs to produce a decent SCM system and then continue
to support and evolve it.  It's a HUGE cost.  Given that my goal was to
help Linus and that I believed that he needed a production quality system,
my choices were to get on the dot com wagon and get VC and/or make it
commercial.  Otherwise it was never going to get finished.  GPL was not
an option.

I chose not to go the VC route because the VC guys don't share my goals.
Their only goal is to make more money.  Which means as soon as they
thought that giving BK away to the open source crowd didn't help them
make more money, they'd put a stop to that.  So I passed on that, turned
down $6M from a top 3 VC firm,  just wasn't worth the risk.

We went it alone, but we had to make it a for profit concern or we'd 
never have gotten to where we are.  And we're nowhere near done.

Yeah, yeah, I can hear you saying "thanks for the BK advertisement"
but that's not the point.  The point is that the goal of helping out the
portion of the free software community with difficult SCM problems FORCED us
into a corporate model.  You can whine all you like about how evil that
is, how I've sold out, whatever, but the reality is that you are begging
for money so you can get to a 1.0 release and we are shipping a tool that
2000+ Linux kernel developers use world wide.  For free.  And it has met
the goal of helping Linus.  BK still sucks, it has tons of problems, but
those problems will get solved precisely because we have a business model.
You don't.  Your business model is charity.  That's not going to work.

I'd be far more impressed with you if you were demonstrating that I
was wrong by showing me how to develop a system that works, in all the
corner cases, and is self supporting through a business model that
somehow works with an open source product.  I'd LOVE to see that.
I hate the idea of not shipping source, it pisses me off to no end.
But it is a fact of life that if we ship it, people abuse it.  And then
we go out of business.  And then the product doesn't get finished.

At any rate, I don't think you are listening to any of this, so just
listen to this one thing: please stop mailing me about this.  Feel 
free to flame me a few more times if that makes you feel good but 
don't expect a response, I've procmailed you into /dev/null.  Sorry,
but I have work to do and this is too much additional stress.  Good
luck, I'd love nothing more than to have you show me a business model
which proves me completely wrong, but until you do, I don't want to 
hear about your problems.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 


Re: what proofs look like

Posted by Tom Lord <lo...@regexps.com>.

I had hoped to reply to your earlier note in as positive and
constructive a mode as possible, to get past the difficult garbage,
and into the space of making improvements of mutual benefit.

I had hoped the three notes I sent a few minutes ago would make that
clear but since they didn't -- I'll say it explicitly.

You said, in part, that you didn't think I would believe the valence
of your stated intentions towards me.  Actually, I _pretty much_ DO
believe you, and if I have doubts, I (usually ;-) try to maintain a
large margin for error in favor of others (so they can change their
mind, if nothing else :-).

So, regarding this:

	I had hoped to reply to your earlier note in as positive and
	constructive a mode as possible, to get past the difficult
	garbage, and into the space of making improvements of mutual
	benefit.

Do you think that's possible?


-t




Re: what proofs look like

Posted by Larry McVoy <lm...@bitmover.com>.
Please keep me out of this discussion, I already apologized for entering it
in the first place, what else do you want?
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 


Re: what proofs look like

Posted by Tom Lord <lo...@regexps.com>.

	> Now, if our model is so far off-base that we won't be able
	> to support some important future feature like repeated
	> merges

That's roughly one of the kinds of thing I'm worried about (one of the
big ones): the momentum that can build up behind businesses and
popular projects and the way that leads to people being pressured by
circumstance to not fix various problems until fixing them is too
costly and everyone is just stuck with them for a long time.  

It's a whole bunch of issues -- repeated merges might be one place to
start looking into them.

Part of the point of the proof (in the context of this list) is not
just "we should be having really careful discussions like this", but
also "this (that one proof) is about as much as i've had time and
resources to work out in presentable format since starting arch" and I
think that's another symptom of the (vaguely defined) larger problem.

-t


Re: what proofs look like

Posted by Greg Hudson <gh...@MIT.EDU>.
(CC list trimmed, again.)

I don't work for CollabNet (I work for MIT, where my job has nothing to
do with Subversion); others can provide better insight into the working
environment there, if they so choose.

> From what I've seen on the dev list, that discussion could have
> reached a level where we were building models and proofs like that one
> i sent you

To make progress, we want good understandings between the people who
write the code.  Most of the people who are writing the code probably do
not think best in terms of set-theoretic formalisms; less formal styles
of communication work better.  If you look at, say,
http://svn.collab.net/repos/svn/trunk/subversion/libsvn_fs/structure,
you can see the kind of communication which our developers can all
understand.  It is at the right level of conciseness.

Formalisms are not necessarily clearer or even more precise than informal
descriptions.  If you're going to prove something non-obvious, a
formalism can be useful, but once you've done your proof, I think we (a
group of mostly non-mathematicians) are better off with informal models.

> But instead, well, it's all "post 1.0"

This project has a goal, which is to provide a compelling CVS
replacement for the open source community.  We are close to realizing
that goal.  Standardizing on a patch set format with other version
control systems is not a requirement for that goal.  Hence, it waits
until after 1.0.  This isn't "technology taking a back seat to
business"; this is basic project management.

Now, if our model is so far off-base that we won't be able to support
some important future feature like repeated merges, that's a reason to
stop and think.  You can see that we're actually doing that in the
"Issue 838 merge should copy-with-history" thread, because we believe
there might be a real problem.

> Regardless, arch is at least a very
> plausible indication that svn has a few fundamentals wrong (and
> perhaps some other fundamentals right).

Not by its mere existence, it isn't.  There are a few steps missing in
this argument.



Re: what proofs look like

Posted by Tom Lord <lo...@regexps.com>.

	> A common patch set format would indeed be nice, but honestly
	> it isn't going to be on the top of anybody's plate for
	> awhile yet.


In my experience, if you set out to work on a common patch set format,
it changes (for the better, I think) how you see other important
aspects of revision control systems.  It's just a conversation
starter.

If we work out a patch set format, that may well (in my opinion) lead
to (at least) slight (but significant) changes to your storage manager
and client side user interface.   

The kinds of storage manager changes I (suspect) might be wanted for
svn are (from what I can tell so far -- more work is needed) of a sort
that will make converting old archives to the new format problematic.
In other words, I'm talking about semantic changes to your database --
not just representation changes.   

Something else that came up more recently on this end is user
interfaces:

Most users of arch find arch's interface to be tedious and awkward in several
places and, in my experience, it makes it a little more difficult than
it ought to be to produce clean (single change) commits.  So that has
to be fixed.   A few have requested an essentially CVS-like front-end
to arch -- something closer to svn.

Well, I can see how that could be done (and, interestingly, it would
leave us with another gratuitous incompatibility problem, as far as I
can tell).    

On the other hand, I think there are other approaches very different
from CVS that will ultimately be easier to learn and use and will make
it natural for developers to produce clean change sets.  Very roughly:
I want to build a client-side "librarian" for working trees that makes
it easier for programmers to have *many* trees at once, do weird
mix-n-match on the fly for compiling and testing those trees, work on
each change in a separate tree and, thus, get clean change sets from
whole-tree commits.  Again, very roughly: my _tentative opinion_ is
that this needs only a radically simplified revision control system
layer, but then some little tools on top of that to help programmers
manage all those trees without getting lost.  Hopelessly vague, I
realize.  Maybe it will make some sense to you.  The only point is
that if you have a slightly different perspective on patch sets, I am
fairly certain it will have positive impact on your view of all other
aspects of the system.

-t


RE: Re: what proofs look like

Posted by Bill Tutt <ra...@lyra.org>.
I started out writing a long reply to this email, and then I realized it
just wasn't worth it. If you don't know it, I'm not working on
Subversion full time either. I have a full time programming job that has
nothing to do with version control and I'm married. I'm just happy that
I've been able to contribute as much as I have been to Subversion. The
kinds of operations/situations that Subversion's data model handles are
very complex. I wish I had more time to spend on such issues, but sadly
there's only one of me to schedule my free time with as opposed to 5+
clones of me each doing a subset of the total amount of interesting
things I'd like to work on. I'm glad in the past you've had time to work
on arch so much. It's a shame you didn't give the Subversion folks a
chance, since the fundamental data model has changed substantially since
its initial conception. 

Greg Hudson commented on how Subversion is using basic project
management to get 1.0 out the door, so I'm not going to bother
commenting on that.

A common patch set format would indeed be nice, but honestly it isn't
going to be on the top of anybody's plate for awhile yet. Such a thing
seems much more likely when Subversion decides to take on the chore of
moving its data model to one that takes distribution into account.
When Subversion is ready for such a thing, it
will need some kind of patch set format for replicating changes. This
will include handling copies correctly. 

Patch set formats in the context of version control systems aren't just
about merging changes into someone's local working copy, it's also about
specifying some sane kind of version related merge semantic. (ala the
merge of a new directory from one branch to another should be a copy
with history instead of just an add) Thinking about how to handle the
multiple branch merge rename issue would probably be a good thing as
well.

Bill



Re: what proofs look like

Posted by Tom Lord <lo...@regexps.com>.

Business plus hacking is a dangerous mixture.  Handle with extreme
caution.

I have lots of interesting replies to reply to today, but I want to
start here (on the svn dev list).  I think what I'm going to say is a
no-brainer and the hard part is figuring out why it didn't happen in
practice and what to do about it now and in the future.

Very abstractly (and roughly speaking), I think one should:


	1) explore and understand a technology domain

	2) develop a widely shared understanding of its nature

	3) develop opinions about what to do with it

	4) be able to explain it clearly and precisely to just about
           anyone;  do so

	5) explore all that carefully

	6) THEN, (if appropriate) integrate it into business plans

Don't I have a (slight) gift for stating the very very obvious as if i
were the only one who knew it?

Taking a broader perspective, steps (1..5) are an activity that
businesses need to fund and support (research and development) if they
want to have a reason to keep existing.  Frankly, looking at the
industry, that'd be a good place to start picking up the mess.

Activity 6, rather than being where the battle is among competing
executive teams, boards, and investment brokers, should really be a
fairly mundane tactical space.   Really, all those business decision
makers need to focus on 1..5 and wear that fact on their sleeves.
That's why they will deserve trust.

If those decision makers aren't personally qualified to grok the
technology .... well, maybe they know people who can help and maybe
that means we just all go a little bit slower.  I mean, they can
already read and write and do arithmetic so they're 3/4 of the way
there (at least); some of them are former engineers; I suspect they
can handle it pretty well, actually.  All it takes is the will to act.

One is losing hard whenever things go in approximate reverse.  As in:


	1) get a so-so picture of a possible cool hack; make some good
	   demos or a neat power-point presentation

	2) figure out how (supposedly) it can drive a business model
	   (or reputation exercise or other kind of power trip)

	3) start the project.   Get at least some people into such
	   a bind that their livelihood depends on a relentless march
	   to completion.


Our interactions look to me like you are losing hard.  Lame replies
about "oooo its shell scripts" and "that's a post 1.0 issue", both of
which you are obviously collectively capable of doing much better
than, are evidence.

In similar situations, with other groups forced into losing hard, I
have formed the impression of a quiet war: engineers basically lie and
hide things from business decision makers who expect this, tolerate
it, count on it, and dance their merry way straight to hell.  I hope
things aren't that bad where you are, but i'd be surprised if they
were much better than that.

What if you got really close to 1.0, then something came along that
caused you to rethink the fundamentals of what you are doing and the
impact it will have if plans proceed?  Do you have the freedom to stop
and think?  For a day?  A week?  Some months?  Or will you lose your
job if you don't make a deadline?  If your company *has* to shut down,
and you are a wage slave, has your employer made provisions to keep
you safe for as long as possible?  Do you have the freedom to bail?

Why do people volunteer for your project, unless it's because it gets a
certain kind of press, sending a perhaps rather misleading message
about the value of volunteering to volunteers?   And how does that
press happen?  No -- volunteers aren't fools.  I almost volunteered
for svn myself, purely on the basis of the technology, and that's
where i'd guess the best volunteers are at.  (Ultimately, I didn't
volunteer personally because of some subtle flaws (imo) with the tech
that were easier to work-around by writing arch than volunteering into
svn and trying to fix them from within.)

We started chatting about patch set formats long ago.  I thought it a
good place to start because, _if_ we reach more or less agreement on
that, then i think there are broader implications for the rest of your
design.   Even if we never get to those broader implications, a common
patch set format won't be a bad thing to have, in my opinion.

RE: Re: what proofs look like

Posted by Bill Tutt <ra...@lyra.org>.
Indeed, I'm glad Tom can find time to write detailed proofs of his work.
Unfortunately, the main theory nuts on Subversion seem to be too busy
with their real-life jobs or school work to come up with detailed
specifications of the operations that Subversion is attempting to solve.

I'm also glad that Tom could find the time to clarify the generic tree
patching concepts that he did, because while most of what his proof
described is fairly self-evident to me, having an actual proof is also
just a nice thing to have, because it gives your gut instincts about
problem spaces a more secure foundation.

In terms of the patch mechanics:
While the patch semantics make sense (barring the non-handling of
copies), existing working copy libraries of various source control
systems aren't really set up to handle the fully generic patch in one
distinct target system checkin.

e.g.: Subversion's working copy model just isn't set up to handle
arbitrary chained working copy operations. 

This is mostly because there just hasn't been nearly as much focus on
improving this part of the system. 

Honestly, I don't think the biggest problem folks will have with tree
merging/patching will be renames/copies. I think the biggest problem
folks will have is getting a given set of changes into this patch
format.

i.e. the typical vendor branch update problem into your local
repository.


Bill
----
Do you want a dangerous fugitive staying in your flat?
No.
Well, don't upset him and he'll be a nice fugitive staying in your flat.
 

> -----Original Message-----
> From: Greg Hudson [mailto:ghudson@MIT.EDU]
> Sent: Friday, August 09, 2002 11:20 PM
> To: dev@subversion.tigris.org
> Cc: Tom Lord
> Subject: Re: what proofs look like
> 
> (Recipients and subject line trimmed.)
> 
> For those who are curious, what Tom proved here is that, given a tree
> delta and an original tree, you can reproduce the modified tree.  Or you
> can reproduce the original tree from the modified tree and the tree
> delta.  For this to work, a tree delta has to include information about
> files which have been added, removed, or moved (arch's model of the
> world doesn't seem to track copies, just moves), but you don't have to
> include information about files which just sat there--except for changes
> to their contents, of course.
> 
> I have no particular insight on how these formalisms are related to why
> Tom finds us ridiculous to try to work with, or why we only work on
> "open source" in quotes.
> 
> 



Re: what proofs look like

Posted by Greg Hudson <gh...@MIT.EDU>.
(Recipients and subject line trimmed.)

For those who are curious, what Tom proved here is that, given a tree
delta and an original tree, you can reproduce the modified tree.  Or you
can reproduce the original tree from the modified tree and the tree
delta.  For this to work, a tree delta has to include information about
files which have been added, removed, or moved (arch's model of the
world doesn't seem to track copies, just moves), but you don't have to
include information about files which just sat there--except for changes
to their contents, of course.

I have no particular insight on how these formalisms are related to why
Tom finds us ridiculous to try to work with, or why we only work on
"open source" in quotes.

