Posted to dev@subversion.apache.org by cm...@collab.net on 2001/11/16 19:34:06 UTC

commit crawler

Hey, Ben.  Was wondering what you thought about us changing the commit
crawler so that it mined information from the working copy first, storing
relevant bits in some applicable in-memory data structure and locking
dirs and such, then blew through that data structure to perform the
actual commit.

Reason I ask is that I'd like to recycle a lot of the logic in that
commit process, but for working-copy-free commits (like `svn cp' is
soon to do).  It'd be great if svn_wc_crawl_local_mods simply
generated some Thing that, when passed to a commit editor driver,
could perform a commit.  That way our working-copy-less commits could
simply manufacture their own Things, and pass those off to the same
driver.

Plus, I think it would help the commit crawler not be such a mess, to
separate the local-mod search from the commit process itself.

Just polling for thoughts here.  Lemme know.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: commit crawler

Posted by Ben Collins-Sussman <su...@collab.net>.
Greg Stein <gs...@lyra.org> writes:

> On Fri, Nov 16, 2001 at 01:34:06PM -0600, cmpilato@collab.net wrote:
> > Hey, Ben.  Was wondering what you thought about us changing the commit
> > crawler so that it mined information from the working copy first, storing
> > relevant bits in some applicable in-memory data structure and locking
> > dirs and such, then blew through that data structure to perform the
> > actual commit.
> 
> Eek. That blows away all concept of streamy. -1

Yeah, eek is right.  -1.


> I would suggest writing a crawler, much like Python's os.path.walk(). It
> crawls over the disk and calls a callback for each entry. You can then use
> different callbacks for status, for committing, for updating, whatever.

This sounds pretty clean to me.  I like this, Greg.  Thanks for the
detailed explanation.


> > Reason I ask is that I'd like to recycle a lot of the logic in that
> > commit process, but for working-copy-free commits (like `svn cp' is
> > soon to do).
> 
> "svn cp" for two URLs would entirely skip all this gobbledygook. I don't
> think you'd use any common routines at all for it. Seems like it would be a
> manual sequence of calls into the commit editor, just like you posted in
> copy-planz.

Yeah, I agree.  I don't understand why you want a level of abstraction
here.


> > Plus, I think it would help the commit crawler not be such a mess, to
> > separate the local-mod search from the commit process itself.
> 
> Agreed. See above. Note that the recent "copy the filesystem" walker as part
> of the wc->wc copy could be tossed in favor of the walk() function and a
> callback to copy a file/dir.
> 
> (I noticed that the code to do the copy was effectively duplicating the
> other filesystem traversal stuff in the WC; no sense duplicating a basic
> function everywhere, leading to possible divergence and maintenance issues)

Ahhhh, yes!  Cool.


Re: commit crawler

Posted by Greg Stein <gs...@lyra.org>.
On Fri, Nov 16, 2001 at 01:34:06PM -0600, cmpilato@collab.net wrote:
> Hey, Ben.  Was wondering what you thought about us changing the commit
> crawler so that it mined information from the working copy first, storing
> relevant bits in some applicable in-memory data structure and locking
> dirs and such, then blew through that data structure to perform the
> actual commit.

Eek. That blows away all concept of streamy. -1

I would suggest writing a crawler, much like Python's os.path.walk(). It
crawls over the disk and calls a callback for each entry. You can then use
different callbacks for status, for committing, for updating, whatever.

The walker would take several flags:

* include unversioned items
* include versioned items missing from the WC
* include .svn files (a raw filesystem walk; overrides the above two)
* callback for dirs in a prefix manner
* callback for dirs in a postfix manner
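
A minimal Python sketch of a walker driven by the flags listed above. It is an illustration only, not Subversion code: `WalkFlags`, `walk`, the `kind` strings, and the `versioned_names` helper (which stands in for whatever reads the administrative area) are all hypothetical names.

```python
import os
from dataclasses import dataclass

ADMIN_DIR = ".svn"  # administrative directory, skipped unless the walk is raw


@dataclass
class WalkFlags:
    include_unversioned: bool = False  # report items not under version control
    include_missing: bool = False      # report versioned items absent from disk
    raw: bool = False                  # raw filesystem walk; overrides the two above
    dirs_prefix: bool = True           # call back for a dir before its contents
    dirs_postfix: bool = False         # call back for a dir after its contents


def walk(path, callback, flags, versioned_names):
    """Call callback(path, kind) for each entry under path.

    kind is one of "dir-pre", "dir-post", "file", or "missing".
    versioned_names(dirpath) is a hypothetical helper returning the names
    recorded as versioned in that directory.
    """
    if flags.dirs_prefix:
        callback(path, "dir-pre")

    on_disk = set(os.listdir(path))
    if flags.raw:
        names = on_disk
    else:
        known = set(versioned_names(path))
        names = {n for n in on_disk
                 if n != ADMIN_DIR and (n in known or flags.include_unversioned)}
        if flags.include_missing:
            names |= known - on_disk

    for name in sorted(names):
        child = os.path.join(path, name)
        if os.path.isdir(child) and not os.path.islink(child):
            walk(child, callback, flags, versioned_names)
        elif os.path.lexists(child):
            callback(child, "file")
        else:
            callback(child, "missing")

    if flags.dirs_postfix:
        callback(path, "dir-post")
```

A status, commit, or update pass would then differ only in the callback and the flags it hands to `walk`.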

Define a single, simple item structure that has just enough information such
that a callback can use it to get any further data. I would imagine this
structure would have a pointer to the "versioning" data. For some of the
walk types, you're going to open the .svn/entries file. That data would go
into item->vsn_info. For a raw filesystem walk, or for unversioned items,
that pointer would be NULL.

(and a simple function call would fill in vsn_info for a given item since
that item structure has enough info to "get back" to the right spot to fetch
the necessary data)
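
One way the item structure and its fill-in function might look, sketched in Python. Only `vsn_info` comes from the description above; the other field names, `fill_vsn_info`, and the `read_entries` helper (a stand-in for parsing the directory's .svn/entries file) are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    dirpath: str                      # directory containing the entry
    name: str                         # entry name within that directory
    is_dir: bool
    vsn_info: Optional[dict] = None   # None for raw walks and unversioned items

    @property
    def path(self):
        return os.path.join(self.dirpath, self.name)


def fill_vsn_info(item, read_entries):
    """Populate item.vsn_info on demand and return it.

    read_entries(dirpath) is a hypothetical helper that returns a mapping of
    entry name -> versioning data for that directory. The item carries its
    directory and name, which is all that is needed to look itself up.
    """
    if item.vsn_info is None:
        item.vsn_info = read_entries(item.dirpath).get(item.name)
    return item.vsn_info
```

Callbacks that never need versioning data pay nothing for it; an unversioned item simply keeps `vsn_info` as `None` after the lookup.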

(the difference between prefix and postfix on dirs: consider that a
copy-tree function wants dirs prefix so it can mkdir; deleting a tree wants
dirs postfix so it can rmdir after the dir is empty)
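
The prefix/postfix distinction can be shown with Python's `os.walk`, whose `topdown` argument selects the order. This is a sketch, assuming trees that contain only regular files and directories (no symlinks); `copy_tree` and `delete_tree` are illustrative names.

```python
import os
import shutil


def copy_tree(src, dst):
    """Prefix order: a directory is visited before its contents,
    so the target directory can be created before files land in it."""
    for dirpath, dirnames, filenames in os.walk(src, topdown=True):
        rel = os.path.relpath(dirpath, src)
        target = os.path.normpath(os.path.join(dst, rel))
        os.mkdir(target)
        for name in filenames:
            shutil.copy2(os.path.join(dirpath, name), os.path.join(target, name))


def delete_tree(root):
    """Postfix order: a directory is visited after its contents,
    so it is already empty by the time rmdir is called on it."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            os.remove(os.path.join(dirpath, name))
        os.rmdir(dirpath)
```

Swapping the two orders breaks both functions: the copy would try to write into directories that do not exist yet, and the delete would call rmdir on directories that still have children.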

> Reason I ask is that I'd like to recycle a lot of the logic in that
> commit process, but for working-copy-free commits (like `svn cp' is
> soon to do).

"svn cp" for two URLs would entirely skip all this gobbledygook. I don't
think you'd use any common routines at all for it. Seems like it would be a
manual sequence of calls into the commit editor, just like you posted in
copy-planz. There isn't any reason to build up a Thing just to execute four
calls which you *know* ahead of time. In fact, you're building the Thing so
that it happens to result in precisely those calls. Skip the whole
intermediate process of a Thing and keep it simple.

>...
> Plus, I think it would help the commit crawler not be such a mess, to
> separate the local-mod search from the commit process itself.

Agreed. See above. Note that the recent "copy the filesystem" walker as part
of the wc->wc copy could be tossed in favor of the walk() function and a
callback to copy a file/dir.

(I noticed that the code to do the copy was effectively duplicating the
other filesystem traversal stuff in the WC; no sense duplicating a basic
function everywhere, leading to possible divergence and maintenance issues)


Anyway... separating the filesystem traversal out of the various WC bits will
help quite a bit. I think there are numerous subtle benefits that are going
to pop up after this change. A nice little dividing line between filesystem
data/representation and control will be established.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/
