
Posted to dev@subversion.apache.org by Jim Blandy <ji...@zwingli.cygnus.com> on 2001/02/01 21:28:33 UTC

Subversion vs. filesystems

Kevin, you're the fellow who was talking about implementing a genuine
mountable filesystem that tracked versions, right?  What I think would
be really useful along those lines would be to implement a filesystem
on top of the *client* library --- you'd get a working directory where
ordinary "rm", "mv", and "cp" would maintain the meta-information
Subversion needs to do commits and updates.  You could just use Emacs
dired to mess around in your working copy, and it'd all Just Work.

Detecting copies is a bit of a challenge, but I think GNU cp is a
"memory mapped" cp: it maps the entire input file into memory, and
then does a single write request to create the new file.  This means
that the write uses the exact same pages as the read.  Perhaps the
filesystem could notice which file the pages were coming from,
recognize that a copy was in progress, and adjust the Subversion
metainformation appropriately.
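
A minimal sketch of the kind of "memory mapped" copy described above,
written in Python with the standard mmap module.  It only illustrates
the map-then-write pattern; it makes no claim about how GNU cp is
actually implemented, and the name mmap_copy is hypothetical.

```python
import mmap
import os


def mmap_copy(src_path, dst_path):
    """Copy src_path to dst_path by mapping the source and writing it once.

    Returns the number of bytes written.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        if size == 0:
            # An empty file cannot be mapped; the empty destination
            # created by open() is already a complete copy.
            return 0
        with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as pages:
            # One write call at this level, whose buffer is the mapped
            # source pages.  The I/O layer underneath may still split it.
            return dst.write(pages)
```

A filesystem that wanted to recognize such a copy would have to see
that the buffer handed to the write belongs to a mapping of another
file; nothing in this sketch does that part.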

Re: Subversion vs. filesystems

Posted by Jim Blandy <ji...@zwingli.cygnus.com>.

Kevin Pilch-Bisson <ke...@pilch-bisson.net> writes:
> That is indeed an interesting idea.  An immediate problem is how
> files are added to the repository: automatically, or with a command.  If
> automatically, then it would need a good way to know what not to include
> automatically.  If with a command then there can still be the problem of
> forgetting to add things.

Yep.  There are a lot of hairy questions here.  But I think it would
be something cool to play with.

Re: Subversion vs. filesystems

Posted by Kevin Pilch-Bisson <ke...@pilch-bisson.net>.

On Thu, Feb 01, 2001 at 04:28:33PM -0500, Jim Blandy wrote:
> 
> Kevin, you're the fellow who was talking about implementing a genuine
> mountable filesystem that tracked versions, right?  What I think would
> be really useful along those lines would be to implement a filesystem
> on top of the *client* library --- you'd get a working directory where
> ordinary "rm", "mv", and "cp" would maintain the meta-information
> Subversion needs to do commits and updates.  You could just use Emacs
> dired to mess around in your working copy, and it'd all Just Work.


That is indeed an interesting idea.  An immediate problem is how
files are added to the repository: automatically, or with a command.  If
automatically, then it would need a good way to know what not to include
automatically.  If with a command then there can still be the problem of
forgetting to add things.
> 
> Detecting copies is a bit of a challenge, but I think GNU cp is a
> "memory mapped" cp: it maps the entire input file into memory, and
> then does a single write request to create the new file.  This means
> that the write uses the exact same pages as the read.  Perhaps the
> filesystem could notice which file the pages were coming from,
> recognize that a copy was in progress, and adjust the Subversion
> metainformation appropriately.

I think I am still some time away from actually starting to implement
something, but thanks for the technical help.  I'm sure it will come in
handy.

-- 
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kevin Pilch-Bisson
kevin@pilch-bisson.net
http://www.pilch-bisson.net
PGP Public Key At http://pgp.pilch-bisson.net

Re: Subversion vs. filesystems

Posted by Branko Čibej <br...@xbc.nu>.

Matthew O. Persico wrote:

> Before you go off and work too hard to re-invent the wheel, maybe you can
> get some good ideas from the ClearCase site and its white papers. After
> all, if I read this correctly, an "installable version-tracking
> filesystem" smells a LOT like ClearCase to me. And I LOVED ClearCase,
> but I had to leave it behind when I changed jobs three years ago. Sigh.

Heh. I use ClearCase at work. But that's just incidental -- I helped 
design and write such a filesystem for another configuration management 
tool. So no, I'm not about to reinvent the wheel, I'm just gonna copy 
it. :-)

-- 
Brane Čibej
    home:   <br...@xbc.nu>             http://www.xbc.nu/brane/
    work:   <br...@hermes.si>   http://www.hermes-softlab.com/
     ACM:   <br...@acm.org>            http://www.acm.org/

Re: Subversion vs. filesystems

Posted by "Matthew O. Persico" <pe...@acedsl.com>.

Branko Čibej wrote:
> 
> Jim Blandy wrote:
> 
> > Kevin, you're the fellow who was talking about implementing a genuine
> > mountable filesystem that tracked versions, right?  What I think would
> > be really useful along those lines would be to implement a filesystem
> > on top of the *client* library --- you'd get a working directory where
> > ordinary "rm", "mv", and "cp" would maintain the meta-information
> > Subversion needs to do commits and updates.  You could just use Emacs
> > dired to mess around in your working copy, and it'd all Just Work.
> 
> That's the logical way to do it, so that you're not limited to mounting
> only from a local directory.
> 
> > Detecting copies is a bit of a challenge, but I think GNU cp is a
> > "memory mapped" cp: it maps the entire input file into memory, and
> > then does a single write request to create the new file.  This means
> > that the write uses the exact same pages as the read.  Perhaps the
> > filesystem could notice which file the pages were coming from,
> > recognize that a copy was in progress, and adjust the Subversion
> > metainformation appropriately.
> 
> This would depend on having GNU cp too much. One of the ways to solve
> this problem is to simply not have automatic commits, or at least
> require an explicit "register" command for new files -- the ancestor
> file could be passed as an argument.
> 
> I'd advise against trying to be too smart about detecting copies. From
> my experience it creates more problems than it solves. For instance, you
> can't distinguish between a "real" copy and a tool creating a backup
> file; and you don't want to put backup files in the repository. A
> .cvsignore-ish approach will work, but then you've just moved the magic
> from a command to a filter -- which is almost certain to bite you someday.
> 
> It would be nice to have such a filesystem be completely transparent,
> but in practice you'll use extra revision control commands anyway, or
> you wouldn't need a versioning filesystem.
> 

Before you go off and work too hard to re-invent the wheel, maybe you can
get some good ideas from the ClearCase site and its white papers. After
all, if I read this correctly, an "installable version-tracking
filesystem" smells a LOT like ClearCase to me. And I LOVED ClearCase,
but I had to leave it behind when I changed jobs three years ago. Sigh.

> --
> Brane Čibej
>     home:   <br...@xbc.nu>             http://www.xbc.nu/brane/
>     work:   <br...@hermes.si>   http://www.hermes-softlab.com/
>      ACM:   <br...@acm.org>            http://www.acm.org/


-- 
Matthew O. Persico
    
"If you were supposed to understand it,
we wouldn't call it code." - FedEx


Re: Subversion vs. filesystems

Posted by Branko Čibej <br...@xbc.nu>.

Jim Blandy wrote:

> Kevin, you're the fellow who was talking about implementing a genuine
> mountable filesystem that tracked versions, right?  What I think would
> be really useful along those lines would be to implement a filesystem
> on top of the *client* library --- you'd get a working directory where
> ordinary "rm", "mv", and "cp" would maintain the meta-information
> Subversion needs to do commits and updates.  You could just use Emacs
> dired to mess around in your working copy, and it'd all Just Work.

That's the logical way to do it, so that you're not limited to mounting 
only from a local directory.

> Detecting copies is a bit of a challenge, but I think GNU cp is a
> "memory mapped" cp: it maps the entire input file into memory, and
> then does a single write request to create the new file.  This means
> that the write uses the exact same pages as the read.  Perhaps the
> filesystem could notice which file the pages were coming from,
> recognize that a copy was in progress, and adjust the Subversion
> metainformation appropriately.

This would depend on having GNU cp too much. One of the ways to solve 
this problem is to simply not have automatic commits, or at least 
require an explicit "register" command for new files -- the ancestor 
file could be passed as an argument.

I'd advise against trying to be too smart about detecting copies. From 
my experience it creates more problems than it solves. For instance, you 
can't distinguish between a "real" copy and a tool creating a backup 
file; and you don't want to put backup files in the repository. A 
.cvsignore-ish approach will work, but then you've just moved the magic 
from a command to a filter -- which is almost certain to bite you someday.
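
A minimal sketch of what a .cvsignore-ish filter could look like,
assuming shell-style patterns matched against file names.  The pattern
list and the function names are hypothetical, not anything Subversion
or CVS defines.

```python
from fnmatch import fnmatchcase

# Hypothetical ignore patterns for editor and tool backup files.
IGNORE_PATTERNS = ["*~", "*.bak", "*.orig", "#*#", ".#*"]


def is_ignored(name, patterns=IGNORE_PATTERNS):
    """Return True if the file name matches any ignore pattern."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def files_to_register(names, patterns=IGNORE_PATTERNS):
    """Keep only the names the filter would let into the repository."""
    return [name for name in names if not is_ignored(name, patterns)]
```

The filter drops "main.c~" and "#main.c#" as intended, but it also
drops a file someone deliberately named "report.orig", which is the
sort of surprise the paragraph above warns about.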

It would be nice to have such a filesystem be completely transparent, 
but in practice you'll use extra revision control commands anyway, or 
you wouldn't need a versioning filesystem.

-- 
Brane Čibej
    home:   <br...@xbc.nu>             http://www.xbc.nu/brane/
    work:   <br...@hermes.si>   http://www.hermes-softlab.com/
     ACM:   <br...@acm.org>            http://www.acm.org/

Re: Subversion vs. filesystems

Posted by Bob Miller <kb...@jogger-egg.com>.

Jim Blandy wrote:

> Detecting copies is a bit of a challenge

Compare the new file's md5 sum with those of existing files.
(As a special case, all empty files aren't equal. (-: )
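
A minimal sketch of that check in Python, assuming the set of existing
files is available as a list of paths.  The function names are
hypothetical, and empty files are never reported as copies of one
another, per the special case above.

```python
import hashlib
import os


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_copy_source(new_path, known_paths):
    """Return the first known file with the same MD5 sum, or None."""
    if os.path.getsize(new_path) == 0:
        return None  # every empty file has the same sum; not a real copy
    new_sum = md5_of(new_path)
    for path in known_paths:
        if path == new_path or os.path.getsize(path) == 0:
            continue
        if md5_of(path) == new_sum:
            return path
    return None
```

This rehashes every known file on each call; a real implementation
would presumably cache the sums alongside the working-copy metadata.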

-- 
                                        K<bob>
kbob@jogger-egg.com, http://www.jogger-egg.com/