Posted to users@subversion.apache.org by "Perry E. Metzger" <pe...@piermont.com> on 2004/02/18 18:53:40 UTC

versioning unversioned metadata + anonsvn strategies....

Several slightly related issues:

0) The documentation does not make it clear if "svnadmin dump" is safe
   to run at any time (i.e. without quiescing the database). I assume
   it is, but the docs might make that clear.
1) Having an ancient fear of binary databases, I'd like to back up my
   subversion database to a text format dump file after every
   commit. However, if I do incrementals, I risk losing changes made
   to unversioned metadata. I also note that the documentation has
   very prominent warnings about the dangers of altering svn:log
   properties, and rightfully so. I also have another reason for
   wanting incrementals to catch all changes (see #2 below).
   What I'm going to ask, then, is why not version at least some of
   the unversioned metadata. If it was done at least to the extent of
   providing semi-fake version numbers for metadata changes, one could
   do nice incremental dumps of the metadata. If it was done so far as
   to actually fully version the metadata, one could be slightly less
   paranoid about log edits, which would be pleasant.
2) I'd like (for an open source project) to run a separate anonsvn
   server that is pretty close to "real time". Right now, on one open
   source project I'm part of, the way this is done is by having a CVS
   post commit script trigger a copy of the ,v files that changed to
   the anoncvs machine, along with nightly rsyncs. Such a strategy is
   clearly impossible with SVN.
   The two strategies that have come to mind for me are to do an
   incremental dump after every commit, copying that to the anon
   repository -- but that would miss property changes (another reason
   I'm bringing up issue #1). The other possibility that came to mind
   is somehow sending the db transaction log files over to the copy
   and replaying them there, but I must confess I have little or no
   idea how to do that.
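
A minimal sketch of the per-commit incremental dump described in point 2,
written as a post-commit hook in Python. The dump directory and the file
naming scheme are assumptions made up for illustration. The only Subversion
command it relies on is "svnadmin dump -r REV --incremental", and the hook
is assumed to receive the repository path and the new revision number as
its first two arguments. A dump taken this way will not reflect any later
change to revision properties such as svn:log, which is the gap raised in
point 1.

```python
#!/usr/bin/env python3
"""post-commit hook sketch: write one incremental dump file per revision."""
import os
import subprocess
import sys

# Hypothetical location for the per-revision dump files.
DUMP_DIR = "/var/backups/svn-incrementals"


def dump_command(repos, rev):
    """Build the svnadmin invocation that dumps exactly one revision."""
    return ["svnadmin", "dump", repos, "-r", str(rev), "--incremental"]


def dump_path(dump_dir, rev):
    """Zero-padded names keep the files in revision order when sorted."""
    return os.path.join(dump_dir, "r%08d.dump" % int(rev))


def main(argv):
    repos, rev = argv[1], argv[2]  # assumed hook arguments: REPOS, then REV
    os.makedirs(DUMP_DIR, exist_ok=True)
    with open(dump_path(DUMP_DIR, rev), "wb") as out:
        subprocess.run(dump_command(repos, rev), stdout=out, check=True)


if __name__ == "__main__" and len(sys.argv) >= 3:
    main(sys.argv)
```

The resulting files could then be copied to the anon machine and fed to
"svnadmin load" there in revision order.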

-- 
Perry E. Metzger		perry@piermont.com

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: versioning unversioned metadata + anonsvn strategies....

Posted by "Perry E. Metzger" <pe...@piermont.com>.
Ben Collins-Sussman <su...@collab.net> writes:
> Yes, this would require major new design to the repository.  At the
> moment, we have exactly one mechanism for "versioning" anything, whether
> it be files, directory structure, or file/dir metadata:  sequential
> revision trees.  
>
> The problem is, the trees themselves have metadata (datestamps, author,
> and log messages).  If you want to version *that* metadata, how do you
> do it?

Most of that metadata one doesn't want to touch. If datestamps,
author, etc. were made read-only, and log messages were somehow
treated specially, we'd probably eliminate the problem.

However, ignoring that for now, is there any sort of hack that could
be added so that (for now) changes to the unversioned properties
showed up in incremental files? That would get rid of a lot of my
immediate issue.

Perry

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by Ben Collins-Sussman <su...@collab.net>.
On Thu, 2004-02-19 at 08:52, kfogel@collab.net wrote:

> Versioning the currently-unversioned metadata is not impossible, but
> would require some pretty significant changes.  Try expanding your
> paragraph above into a concrete technical proposal and you'll start to
> see what I mean.

Yes, this would require major new design to the repository.  At the
moment, we have exactly one mechanism for "versioning" anything, whether
it be files, directory structure, or file/dir metadata:  sequential
revision trees.  

The problem is, the trees themselves have metadata (datestamps, author,
and log messages).  If you want to version *that* metadata, how do you
do it?  We'll need to invent a completely new mechanism for describing
change over time, something orthogonal to revision trees.  And once
you've gone that far, where do you stop?  Surely the new mechanism will
have its *own* metadata (who changed a revision log message? when?
why?)... will that metadata need to be versioned too?  Where do you draw
the line?

That's the reason we punted on this topic years ago, IIRC.



---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by "Perry E. Metzger" <pe...@piermont.com>.
kfogel@collab.net writes:
> "Perry E. Metzger" <pe...@piermont.com> writes:
>>    What I'm going to ask, then, is why not version at least some of
>>    the unversioned metadata. If it was done at least to the extent of
>>    providing semi-fake version numbers for metadata changes, one could
>>    do nice incremental dumps of the metadata. If it was done so far as
>>    to actually fully version the metadata, one could be slightly less
>>    paranoid about log edits, which would be pleasant.
>
> Versioning the currently-unversioned metadata is not impossible, but
> would require some pretty significant changes.  Try expanding your
> paragraph above into a concrete technical proposal and you'll start to
> see what I mean.

I fully agree. I went over about five ways to do it in my mind, and
just the namespace issue for the metadata versions is
disgusting.

> IMHO the benefit/cost ratio is too low to be doing this anytime soon.

Well, doing it for real, so that you could recover old log messages,
is probably not a very high priority.

However, the other problem seems much more real to me. It would be
*very* useful to allow incrementals to somehow include the change in
the metadata with them, so that incrementals could be used to fully
recover the repository. Being able to fully recover the repository --
or to ship incrementals off to mirror servers to keep them up to date
-- is something that *is* important.

Maybe the repository could mark that certain metadata had been changed
between two version numbers and supply it along with the given
incremental even though (technically) it is not part of those versions?

-- 
Perry E. Metzger		perry@piermont.com

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by kf...@collab.net.
"Perry E. Metzger" <pe...@piermont.com> writes:
>    What I'm going to ask, then, is why not version at least some of
>    the unversioned metadata. If it was done at least to the extent of
>    providing semi-fake version numbers for metadata changes, one could
>    do nice incremental dumps of the metadata. If it was done so far as
>    to actually fully version the metadata, one could be slightly less
>    paranoid about log edits, which would be pleasant.

Versioning the currently-unversioned metadata is not impossible, but
would require some pretty significant changes.  Try expanding your
paragraph above into a concrete technical proposal and you'll start to
see what I mean.

IMHO the benefit/cost ratio is too low to be doing this anytime soon.
There are fairly easy ad hoc ways to preserve that history (e.g.,
saving propchange emails), if one really wants to.

Btw, I don't think your "backup every revision" plan is crazy at all.
It is an entirely sane level of paranoia :-).  Incremental dumps seem
like the best option (the repository does not have to be quiescent for
this, as you've already found out), and for revision properties, just
log the changes manually via pre- and post-revprop-change hooks.
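
A rough sketch of what logging revprop changes from a hook might look
like, here as a pre-revprop-change script in Python. The journal path and
the JSON-lines record format are invented for illustration. The argument
order (REPOS, REV, USER, PROPNAME) and the new value arriving on standard
input are assumptions to check against the hook templates shipped with your
Subversion version.

```python
#!/usr/bin/env python3
"""pre-revprop-change hook sketch: journal the new value before it is applied."""
import json
import sys
import time

# Hypothetical journal location; one JSON record per line.
JOURNAL = "/var/backups/svn-revprop-journal.jsonl"


def journal_record(repos, rev, user, propname, new_value, when):
    """Serialize one revision-property change as a single JSON line."""
    return json.dumps({
        "repos": repos,
        "rev": int(rev),
        "user": user,
        "prop": propname,
        "value": new_value,
        "time": when,
    }, sort_keys=True)


def main(argv, stdin):
    repos, rev, user, propname = argv[1:5]
    new_value = stdin.read()  # assumed: the new value is passed on stdin
    record = journal_record(repos, rev, user, propname, new_value, time.time())
    with open(JOURNAL, "a", encoding="utf-8") as journal:
        journal.write(record + "\n")
    return 0  # exit status 0 lets the property change proceed


if __name__ == "__main__" and len(sys.argv) >= 5:
    sys.exit(main(sys.argv, sys.stdin))
```

Replaying such a journal after restoring from incremental dumps would
bring log messages back up to date, at the cost of maintaining one more
file alongside the dumps.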

-Karl

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by Florian Weimer <fw...@deneb.enyo.de>.
Perry E. Metzger wrote:

> 2) I'd like (for an open source project) to run a separate anonsvn
>    server that is pretty close to "real time". Right now, on one open
>    source project I'm part of, the way this is done is by having a CVS
>    post commit script trigger a copy of the ,v files that changed to
>    the anoncvs machine, along with nightly rsyncs. Such a strategy is
>    clearly impossible with SVN.

This problem also occurs if you want to mirror a repository over, say,
WebDAV.  Currently, you don't notice log message changes.

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by Ben Collins-Sussman <su...@collab.net>.
On Wed, 2004-02-18 at 13:51, Perry E. Metzger wrote:

> My point is that users shouldn't have to do that -- the svn docs
> should tell them. :)

I just fixed this in chapter 5 of the book, "Migrating a repository".



---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by John Peacock <jp...@rowman.com>.
Toshio wrote:
> I was under the impression that the on-disk format changed between 4.1
> and 4.2.  If that's correct, I would say a reason not to force people to
> upgrade Berkeley for their own good is that there can be quite a few
> programs on the system beyond apache/subversion that use it and you may
> not want to recompile them all just to evaluate how subversion will
> improve your life over CVS.

Multiple versions of BerkeleyDB can be installed quite independently of each 
other.  I appear to have system libraries of BDB 1.85 and 4.0.14 (thanks to 
Mandrake) in /usr/lib, as well as my own custom installed libraries for 4.0.14, 
4.1.24, and 4.2.52 in /usr/local/BerkeleyDB[something].

I think that there is enough evidence that, for whatever reason, subversion does 
not play well with BDB 4.1 and that it should be actively shunned.  BDB 4.2 is 
the best recommended version, but 4.0 works too.  Since using Subversion with 
Apache requires specific versions of Apache, it is not that much harder to 
require avoiding a specific BDB lib.

John

-- 
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4720 Boston Way
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5747

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by Toshio <to...@tiki-lounge.com>.
On Wed, 2004-02-18 at 15:25, John Peacock wrote:
> Berkeley 4.1 has documented problems and it has been known for some time that it 
> has problems.  Unfortunately, not _everyone_ had problems (I had them myself) so 
> it was very hard to forbid its use with subversion.  In fact, I believe that 4.1 
> was the only version that worked for OS/X (don't quote me).  However, now that 
> 4.2 has been released, there is no reason not to prohibit 4.1 (in the 
> 'configure' code), and require either 4.0.x or 4.2 in order to build the files.
> 
I was under the impression that the on-disk format changed between 4.1
and 4.2.  If that's correct, I would say a reason not to force people to
upgrade Berkeley for their own good is that there can be quite a few
programs on the system beyond apache/subversion that use it and you may
not want to recompile them all just to evaluate how subversion will
improve your life over CVS.

-Toshio
-- 
Toshio <to...@tiki-lounge.com>

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by John Peacock <jp...@rowman.com>.
Perry E. Metzger wrote:

> John Peacock <jp...@rowman.com> writes:
> 
>>There is no reason (AFAICT) to duplicate the database's own mechanisms
>>for maintaining transaction history through so crude a method as
>>performing incremental dumps after every transaction.
> 
> 
> The subversion developers themselves run hot-backup.py in their commit
> script -- which makes a copy of the whole repository!

Which was very important over the years (since various problems have caused 
the database to go belly up).  Early in the development, it made perfect sense; 
at this point in the project, I think it is probably overkill.  It also had the 
useful behavior of purging un-used log files (which 4.2 has made unneeded).

If you are willing to maintain all of the infrastructure needed to have all of 
those backups, and are sufficiently concerned about quick recovery at any time, 
then it is a viable methodology.  Most projects don't justify such a level of 
paranoia.

However, and this is precisely my larger point, hot-backup.py uses the 
BerkeleyDB suggested procedure for performing database backups.  It does not use 
'svnadmin dump --incremental', which would be much harder to use for recovery 
purposes.

John

-- 
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4501 Forbes Boulevard
Suite H
Lanham, MD  20706
301-459-3366 x.5010
fax 301-429-5748

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by "Perry E. Metzger" <pe...@piermont.com>.
John Peacock <jp...@rowman.com> writes:
> There is no reason (AFAICT) to duplicate the database's own mechanisms
> for maintaining transaction history through so crude a method as
> performing incremental dumps after every transaction.

The subversion developers themselves run hot-backup.py in their commit
script -- which makes a copy of the whole repository!

In any case, there are very good reasons to want to get metadata info
in incremental dumps, so even if this particular use of incremental
dumps seems silly it would still be nice to have the feature.

Perry

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by John Peacock <jp...@rowman.com>.
Florian Weimer wrote:

>>Sure, do the backups nightly (or hourly), but dumping 
>>the database for every rev (even with an incremental) is, to put it 
>>delicately, quite insane. ;~)
> 
> 
> I disagree.  Most databases offer this functionality.  It's required for
> some forms of backup.  Even Subversion supports it (but of course not
> with plain-text data).

It's called a transaction log and, yes, Berkeley (and Oracle ;~) supports it. 
Disable the automatic log purges and use the hot backup script on a schedule. 
With the Berkeley log files, it is possible to roll the repository to any given 
point in time.  I believe you can even configure Berkeley to store the log files 
on a different partition.

There is no reason (AFAICT) to duplicate the database's own mechanisms for 
maintaining transaction history through so crude a method as performing 
incremental dumps after every transaction.

John

-- 
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4501 Forbes Boulevard
Suite H
Lanham, MD  20706
301-459-3366 x.5010
fax 301-429-5748

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by Florian Weimer <fw...@deneb.enyo.de>.
John Peacock wrote:

> Perry E. Metzger wrote:
> 
> >>You need to get over that fear. ;~)  Replace the word "subversion" in
> >>that paragraph with "Oracle" or "SQLServer" and you'll see exactly how
> >>crazy that scheme sounds.
> >
> >
> >I dump all my postgres databases to text files.
> 
> For every commit???

Oracle does this, by the way.  There is something rather close to
cleartext which stores SQL statements.  Apparently, some DBAs are quite
used to it because they often have to consult it to hush up users'
errors. 8-)

> Sure, do the backups nightly (or hourly), but dumping 
> the database for every rev (even with an incremental) is, to put it 
> delicately, quite insane. ;~)

I disagree.  Most databases offer this functionality.  It's required for
some forms of backup.  Even Subversion supports it (but of course not
with plain-text data).

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by John Peacock <jp...@rowman.com>.
Perry E. Metzger wrote:

>>You need to get over that fear. ;~)  Replace the word "subversion" in
>>that paragraph with "Oracle" or "SQLServer" and you'll see exactly how
>>crazy that scheme sounds.
> 
> 
> I dump all my postgres databases to text files.

For every commit???  Sure, do the backups nightly (or hourly), but dumping the 
database for every rev (even with an incremental) is, to put it delicately, 
quite insane. ;~)

> 
> Also, I was at a company once some years ago where the version control
> system, run inside a database, went corrupt and lost weeks of
> work.

Was that VSS by any chance? ;~o

> 
> Also, in the last few days someone wrote about how their (apparently
> buggy version of) db ate their revisions.

Berkeley 4.1 has documented problems and it has been known for some time that it 
has problems.  Unfortunately, not _everyone_ had problems (I had them myself) so 
it was very hard to forbid its use with subversion.  In fact, I believe that 4.1 
was the only version that worked for OS/X (don't quote me).  However, now that 
4.2 has been released, there is no reason not to prohibit 4.1 (in the 
'configure' code), and require either 4.0.x or 4.2 in order to build the files.

John

-- 
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4501 Forbes Boulevard
Suite H
Lanham, MD  20706
301-459-3366 x.5010
fax 301-429-5748

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by "Perry E. Metzger" <pe...@piermont.com>.
John Peacock <jp...@rowman.com> writes:
> Perry E. Metzger wrote:
>
>> 0) The documentation does not make it clear if "svnadmin dump" is safe
>>    to run at any time (i.e. without quiescing the database). I assume
>>    it is, but the docs might make that clear.
>
> Check out the BerkeleyDB docs if in doubt.  I believe that it is safe
> at all times (i.e. a consistent dumpfile is guaranteed), but don't
> take my word for it.

My point is that users shouldn't have to do that -- the svn docs
should tell them. :)

>> 1) Having an ancient fear of binary databases, I'd like to back up my
>>    subversion database to a text format dump file after every
>>    commit.
>
> You need to get over that fear. ;~)  Replace the word "subversion" in
> that paragraph with "Oracle" or "SQLServer" and you'll see exactly how
> crazy that scheme sounds.

I dump all my postgres databases to text files.

Also, I was at a company once some years ago where the version control
system, run inside a database, went corrupt and lost weeks of
work.

Also, in the last few days someone wrote about how their (apparently
buggy version of) db ate their revisions.

Backups are cheap, developers are expensive. I'll keep making
backups.

-- 
Perry E. Metzger		perry@piermont.com

---------------------------------------------------------------------

Re: versioning unversioned metadata + anonsvn strategies....

Posted by John Peacock <jp...@rowman.com>.
Perry E. Metzger wrote:

> 0) The documentation does not make it clear if "svnadmin dump" is safe
>    to run at any time (i.e. without quiescing the database). I assume
>    it is, but the docs might make that clear.

Check out the BerkeleyDB docs if in doubt.  I believe that it is safe at all 
times (i.e. a consistent dumpfile is guaranteed), but don't take my word for it.

> 1) Having an ancient fear of binary databases, I'd like to back up my
>    subversion database to a text format dump file after every
>    commit. 

You need to get over that fear. ;~)  Replace the word "subversion" in that 
paragraph with "Oracle" or "SQLServer" and you'll see exactly how crazy that 
scheme sounds.

> 2) I'd like (for an open source project) to run a separate anonsvn
>    server that is pretty close to "real time". 

See the svk project, which might be what you want:

	http://svk.elixus.org/

The anonsvn server would have an svk repository that mirrors the production 
server.  A post-commit on the production server (or a cron job for less "real 
time") would do

	svk sync //project

(where //project is the local copy of the primary repository).  Once that is 
done, this is a completely conventional svn repository.

HTH

John


-- 
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4501 Forbes Boulevard
Suite H
Lanham, MD  20706
301-459-3366 x.5010
fax 301-429-5748

---------------------------------------------------------------------

Re: many anonsvn replicas...

Posted by "Perry E. Metzger" <pe...@piermont.com>.
Holger Krekel <py...@devel.trillke.net> writes:
>> If you find nntp too disgusting, some custom service could work the
>> same way, but there's lots of good nntp code with high performance out
>> there already, I figure...
>
> For smaller setups maybe just triggering the peers to grab incremental 
> dumps from the master repo might be enough. This sounds simple enough
> if we are dealing with three or four repository mirrors. 
>
> For massively distributed environments, like every developer
> running his own repo, using bittorrents for wider distribution also 
> sounds worthwhile. 

I think NNTP is easier -- it is built for flood filling files already
between a bunch of servers. It is easy to build trees of NNTP hosts,
too. A custom protocol might also work, but I don't think Bit Torrent
is really built for the task...

> Then again, really interesting would be if you can actually commit to any 
> of the many repositories.

No. It provides you with read only replicas -- committing to them
will not work. Presumably, though, one could enhance "commit" so that
it knew which was the main repository, checked that your copy was at
the same revision number as the main one, and let you commit to the main...


-- 
Perry E. Metzger		perry@piermont.com

---------------------------------------------------------------------

Re: many anonsvn replicas...

Posted by Holger Krekel <py...@devel.trillke.net>.
Hi Perry,

Perry E. Metzger wrote:
> Let's say you have an open source project where you want to have a lot
> of anon repositories. Have the main repository run an nntp server, and
> post PGP signed versions of all incremental changes to a special
> newsgroup. Run nntp on a tree of anon servers... and watch as the
> updates flood fill to all of them in seconds. Add some code which
> checks the signatures and applies the changes, and something to do a
> periodic last ditch consistency check (say posting cryptographic
> hashes of the whole repository every day or so) and you have an
> anonsvn infrastructure that scales to thousands of copies of the
> repository that are all nearly in sync in real time...

I like the basic idea (but could only be bothered to help when doing
it in python :-). 

> If you find nntp too disgusting, some custom service could work the
> same way, but there's lots of good nntp code with high performance out
> there already, I figure...

For smaller setups maybe just triggering the peers to grab incremental 
dumps from the master repo might be enough. This sounds simple enough
if we are dealing with three or four repository mirrors. 

For massively distributed environments, like every developer
running his own repo, using bittorrents for wider distribution also 
sounds worthwhile. 

Then again, really interesting would be if you can actually commit to any 
of the many repositories.  AFAIK the possible resulting merge conflicts
are not trivial to handle or to avoid unless you revert to some centralized
locking mechanism, which defeats the massively distributed scenario and
leads to a single point of failure.  Also the uniqueness of revision numbers
will suffer and what not ...  (someone will surely point out that
the documentation explains all this already :-)

cheers,

    holger

---------------------------------------------------------------------

Re: many anonsvn replicas...

Posted by "Perry E. Metzger" <pe...@piermont.com>.
Craig Peterein <cr...@peterein.org> writes:
> On Wed, Feb 18, 2004 at 04:42:31PM -0500, Perry E. Metzger wrote:
>> Let's say you have an open source project where you want to have a lot
>> of anon repositories. Have the main repository run an nntp server, and
>> post PGP signed versions of all incremental changes to a special
>> newsgroup. Run nntp on a tree of anon servers... and watch as the
>> updates flood fill to all of them in seconds. Add some code which
>> checks the signatures and applies the changes, and something to do a
>> periodic last ditch consistency check (say posting cryptographic
>> hashes of the whole repository every day or so) and you have an
>> anonsvn infrastructure that scales to thousands of copies of the
>> repository that are all nearly in sync in real time...
>
> This sounds like monotone: http://www.venge.net/monotone/

Not really, no. In fact, not at all so far as I can tell.

I'm talking about a way of doing lots of read-only replicas of a
repository (or indeed of any set of data) -- not a source control
system.

Perry

---------------------------------------------------------------------

Re: many anonsvn replicas...

Posted by Craig Peterein <cr...@peterein.org>.
On Wed, Feb 18, 2004 at 04:42:31PM -0500, Perry E. Metzger wrote:
> Let's say you have an open source project where you want to have a lot
> of anon repositories. Have the main repository run an nntp server, and
> post PGP signed versions of all incremental changes to a special
> newsgroup. Run nntp on a tree of anon servers... and watch as the
> updates flood fill to all of them in seconds. Add some code which
> checks the signatures and applies the changes, and something to do a
> periodic last ditch consistency check (say posting cryptographic
> hashes of the whole repository every day or so) and you have an
> anonsvn infrastructure that scales to thousands of copies of the
> repository that are all nearly in sync in real time...

This sounds like monotone: http://www.venge.net/monotone/

Craig


---------------------------------------------------------------------

many anonsvn replicas...

Posted by "Perry E. Metzger" <pe...@piermont.com>.
"Perry E. Metzger" <pe...@piermont.com> writes:
>    The two strategies that have come to mind for me are to do an
>    incremental dump after every commit, copying that to the anon
>    repository -- but that would miss property changes (another reason
>    I'm bringing up issue #1). The other possibility that came to mind
>    is somehow sending the db transaction log files over to the copy
>    and replaying them there, but I must confess I have little or no
>    idea how to do that.

By the way, this brings to mind my favorite unimplemented project
that I've always wanted to do with CVS and now would like to do with
SVN -- I'm mentioning it here because I may never actually get enough
round tuits and other people might like the idea.

Let's say you have an open source project where you want to have a lot
of anon repositories. Have the main repository run an nntp server, and
post PGP signed versions of all incremental changes to a special
newsgroup. Run nntp on a tree of anon servers... and watch as the
updates flood fill to all of them in seconds. Add some code which
checks the signatures and applies the changes, and something to do a
periodic last ditch consistency check (say posting cryptographic
hashes of the whole repository every day or so) and you have an
anonsvn infrastructure that scales to thousands of copies of the
repository that are all nearly in sync in real time...

If you find nntp too disgusting, some custom service could work the
same way, but there's lots of good nntp code with high performance out
there already, I figure...
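
To make the replica side of this a little more concrete, here is a small
Python sketch of the "check, then apply in order" step. Flood-filled
updates can arrive out of order or more than once, so the replica buffers
anything that is not the next revision and drops duplicates. Everything in
it is hypothetical: the class name, the in-memory buffering, and above all
the integrity check, which uses a plain SHA-256 digest as a stand-in. A
real deployment would verify the PGP signature instead, and would load the
dump into the repository rather than appending it to a list.

```python
import hashlib


def digest(payload):
    """SHA-256 hex digest of a dump payload (bytes)."""
    return hashlib.sha256(payload).hexdigest()


class Replica:
    """Tracks the youngest revision applied and buffers out-of-order arrivals."""

    def __init__(self, head=0):
        self.head = head
        self.pending = {}   # rev -> payload still waiting for its predecessors
        self.applied = []   # stand-in for actually loading each dump

    def receive(self, rev, payload, expected_digest):
        """Accept one flooded update; return the revisions applied as a result."""
        if digest(payload) != expected_digest:
            return []       # failed the integrity check: drop it
        if rev <= self.head:
            return []       # already applied: duplicate delivery
        self.pending[rev] = payload
        done = []
        # Apply every buffered revision that is now contiguous with head.
        while self.head + 1 in self.pending:
            self.head += 1
            self.applied.append(self.pending.pop(self.head))
            done.append(self.head)
        return done
```

The periodic whole-repository hash mentioned above would then catch
anything this incremental check misses, such as a replica that silently
fell behind.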

Perry

---------------------------------------------------------------------

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by John Peacock <jp...@rowman.com>.
Perry E. Metzger wrote:

> Perhaps, but consider my desire to push out incrementals to not just
> one but very large numbers of users. (Say, for example, that a huge
> open source project used SVN and wanted to allow thousands of
> organizations to keep read-only repository mirrors.) 

svk does exactly this, using a "pull" model instead of a "push".  If you use 
some out-of-band method to alert the slaves to sync (nntp or e-mail would work), 
you would be all set.

John


-- 
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4501 Forbes Boulevard
Suite H
Lanham, MD  20706
301-459-3366 x.5010
fax 301-429-5748

---------------------------------------------------------------------

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by "Perry E. Metzger" <pe...@piermont.com>.
Dominic Anello <da...@danky.com> writes:
> On 2004-02-19 10:17:09 -0500, Perry E. Metzger wrote:
>> Thanks to someone who seemed excessively interested in flaming me for
>> wanting to get text dumps, it seems no one paid attention to the
>> questions I was actually asking, so I'm going to ask them again...
>
> I do an incremental dump on commit and a cumulative dump every night.
> If something takes out the repo, losing 1 day's worth of log message
> changes doesn't seem like the end of the world.

Perhaps, but consider my desire to push out incrementals to not just
one but very large numbers of users. (Say, for example, that a huge
open source project used SVN and wanted to allow thousands of
organizations to keep read-only repository mirrors.) You might not
want to give each and every one of them a 3G full dump every night if
you can avoid it -- flood filling just the incrementals, if it could
keep everyone up to date, would be far superior.

I have to say, though, that even for backup purposes, I don't like the
subtlety of "incrementals aren't really incrementals". Some users will
miss that fact and get screwed one day, without knowing it because
they won't think to check all the log messages etc.

>> 2) AnonSVN:
>> I can see two obvious ways to run a "shadow" AnonSVN server mirroring
>> an original in near-real time:
>> a) Do an incremental dump after every commit, and push it to the anon
>>    server(s) where they get loaded -- only 1) gets in the way here.
>> b) Send over the db transaction logs and replay them, but I
>>    don't really know how to do this.
>> Any advice available on this?
>
> Can't you push the new property to the shadow repo in
> pre-revprop-change?  I believe you get the new value on standard input.

It seems less general -- and like more of a pain. It also can't easily
accommodate my dream system in which huge numbers of people are
maintaining local read only copies.

-- 
Perry E. Metzger		perry@piermont.com

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by "C. Michael Pilato" <cm...@collab.net>.
Dominic Anello <da...@danky.com> writes:

> > I can see two obvious ways to run a "shadow" AnonSVN server mirroring
> > an original in near-real time:
> > a) Do an incremental dump after every commit, and push it to the anon
> >    server(s) where they get loaded -- only 1) gets in the way here.
> > b) Send over the db transaction logs and replay them, but I
> >    don't really know how to do this.
> > Any advice available on this?
> 
> Can't you push the new property to the shadow repo in
> pre-revprop-change?  I believe you get the new value on standard input.

Nah, pre-revprop-change doesn't get the new value.  That's
post-revprop-change, and it isn't via stdin -- you have to ask the
repository for that property's value.
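
For illustration only, a post-revprop-change hook could ask the repository
for the new value along these lines. This is a Python sketch, not a tested
hook: it assumes the hook is passed the repository path, revision, user and
property name as arguments, and that the installed svnlook accepts
"propget --revprop -r REV" (check "svnlook help propget" on your version).
The function names are made up for the example.

```python
import subprocess


def revprop_command(repos, rev, propname):
    """Build the svnlook command that prints one revision property."""
    return ["svnlook", "propget", "--revprop", "-r", str(rev), repos, propname]


def read_revprop(repos, rev, propname):
    """Run svnlook and return the property value as bytes."""
    result = subprocess.run(
        revprop_command(repos, rev, propname),
        stdout=subprocess.PIPE,
        check=True,  # raise if svnlook exits with an error
    )
    return result.stdout
```

The value returned by read_revprop() is what would then be pushed to the
shadow repository.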

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by Dominic Anello <da...@danky.com>.
On 2004-02-19 10:17:09 -0500, Perry E. Metzger wrote:
> Thanks to someone who seemed excessively interested in flaming me for
> wanting to get text dumps, it seems no one paid attention to the
> questions I was actually asking, so I'm going to ask them again...

I do an incremental dump on commit and a cumulative dump every night.
If something takes out the repo, losing 1 day's worth of log message
changes doesn't seem like the end of the world.  Besides, it seems like
the only other way to have a complete backup is to hotcopy the entire
repo every commit.
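
A minimal sketch of that arrangement in Python, for anyone who wants a
starting point. It assumes the post-commit hook is given the repository
path and the new revision number, and it relies on "svnadmin dump" with
"-r" and "--incremental"; the function names and output paths are
placeholders, not anything shipped with Subversion.

```python
import subprocess


def incremental_dump_command(repos, rev):
    """Command for a per-commit dump holding only revision `rev`."""
    return ["svnadmin", "dump", repos, "-r", str(rev), "--incremental"]


def full_dump_command(repos):
    """Command for the nightly cumulative dump of every revision."""
    return ["svnadmin", "dump", repos]


def write_dump(command, out_path):
    """Run a dump command and write its stdout to out_path."""
    with open(out_path, "wb") as out:
        subprocess.run(command, stdout=out, check=True)
```

The hook would call write_dump(incremental_dump_command(repos, rev), path)
after each commit, and a nightly cron job would do the same with
full_dump_command(repos).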

> 1) Metadata:
> a) The fact that metadata is unversioned makes updates of log messages
>    more dangerous than they need to be.
> b) The fact that metadata is unversioned means incremental dumps of
>    the database, which are useful for all sorts of reasons, don't
>    include updates to things like log messages.
> Therefore, is there any chance of revisiting the decision not to
> version metadata, or at least of finding a way of fixing b)?
> 
> 2) AnonSVN:
> I can see two obvious ways to run a "shadow" AnonSVN server mirroring
> an original in near-real time:
> a) Do an incremental dump after every commit, and push it to the anon
>    server(s) where they get loaded -- only 1) gets in the way here.
> b) Send over the db transaction logs and replay them, but I
>    don't really know how to do this.
> Any advice available on this?

Can't you push the new property to the shadow repo in
pre-revprop-change?  I believe you get the new value on standard input.

Or - couldn't you have your read-only anonSVN repo be an svk mirror?
Your commit scripts could send a message to the anonSVN server that it
needs to update its local copy.  I don't really know a lot about svk,
but from what I've heard, it might be worth looking into for a situation
like this.

-- 
Maybe I feel detached. I may just look too shy.
It's disinterest, not that I'm a timid guy.
    -The Faint, "Glass Danse"

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by "Perry E. Metzger" <pe...@piermont.com>.
Holger Krekel <py...@devel.trillke.net> writes:
> maybe a dumb comment: all the metadata i use is currently versioned
> (svn properties are generally versioned) as i am not using revision 
> properties, not explicitly at least.  Is using "unversioned metadata"
> something common?

Every log message with every commit is an unversioned property. If a
log message gets updated, the database has changed, but the next
incremental dump will *not* include that change. This means that a set
of incrementals cannot fully recover the database, or allow you to
keep a mirror fully up to date, which is a problem.

-- 
Perry E. Metzger		perry@piermont.com

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by Holger Krekel <py...@devel.trillke.net>.
Perry E. Metzger wrote:
> 
> Thanks to someone who seemed excessively interested in flaming me for
> wanting to get text dumps, it seems no one paid attention to the
> questions I was actually asking, so I'm going to ask them again...
> 1) Metadata:
> a) The fact that metadata is unversioned makes updates of log messages
>    more dangerous than they need to be.
> b) The fact that metadata is unversioned means incremental dumps of
>    the database, which are useful for all sorts of reasons, don't
>    include updates to things like log messages.
> Therefore, is there any chance of revisiting the decision not to
> version metadata, or at least of finding a way of fixing b)?

maybe a dumb comment: all the metadata i use is currently versioned
(svn properties are generally versioned) as i am not using revision 
properties, not explicitly at least.  Is using "unversioned metadata"
something common?   IOW, would it be an option if you simply 
restricted the users to change only versioned properties/metadata
if they want to use the mirrored/hotsynced svn repos? 

cheers,

    holger krekel

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by Andy Parkins <an...@leaseline.plus.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday 19 February 2004 16:31, Perry E. Metzger wrote:

> Well, certainly for 1.0 that's all that can be done, given that 1.0 is
> feature frozen. I might even kludge something up. However, is there
> any way we could do better for after that?

This probably is way off base; for the log at least though could this be 
solved by doing the following at rev 0:

	touch versioned.log
	svn add versioned.log

Then before each commit (obviously this would be scripted/hooked)

	svn log > versioned.log

Obviously you would always be one log message out, but with some more horrible 
hacks you could append svn-commit.tmp to that file before actually doing the 
commit.

Andy

- -- 
Andy Parkins
Technical Director                          email: andyp@leaseline.plus.com
Leaseline Systems Limited                   tel:   +44 (0)151 652 5551
Unit 31, Price Street Business Centre       fax:   +44 (0)151 652 9983
Birkenhead, CH41 4JQ

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFANOg0wQJ9gE9xL20RAsUYAJ46w0tVTrVaYUF/zLfxICkxmO+2fwCgvrSt
2uuNTsDp4293xXQOcFuMH8s=
=bB5h
-----END PGP SIGNATURE-----

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by "C. Michael Pilato" <cm...@collab.net>.
kfogel@collab.net writes:

> "Perry E. Metzger" <pe...@piermont.com> writes:
> > One possibility that comes to mind: just keep a log in the database
> > of property changes between two revision numbers. If you change a
> > property, it gets put in the log, and then the incremental dump
> > mechanism could just extract them along with the changes between those
> > revisions. It is a slight hack, but it would fix the problem of
> > incrementals not really fully containing the state changes of the
> > database.
> 
> Heh -- yes, Ben Collins-Sussman and I were just talking about that,
> after reading one of your earlier mails.
> 
> Could you file a 'feature' issue on this, and describe roughly how you
> think it would work?  I can't say when we could schedule it for
> (mostly depends on which developer gets interested enough to actually
> do it -- of course, a patch will accelerate this process! :-) ).  But
> having it in the tracker makes it more manipulatable ("manipulable"?).
> 
> I think it's a good idea, though we probably haven't thought of all
> the implications yet...

Actually, I'm -1 on such an idea.  The Subversion filesystem tracks
all the information it needs to track to work.  It is not the job of
the filesystem to remember what portions of it the administrator deems
"safely backed up."

In my opinion, the right steps to make this work are:

   - Fix the dump file format/dumper/loader to understand the notion
     of a revision property patchup record.  Right now, all revision
     records indicate that a new revision needs to be created.  It'd
     be great to be able to say, "Just update the properties of this
     already-existing revision."

   - Give 'svnadmin dump' a flag which says, "Do a rev-property-patch
     dump".

   - Tell people to use this type of 'svnadmin dump' in their
     post-revprop-change hooks.

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by kf...@collab.net.
"Perry E. Metzger" <pe...@piermont.com> writes:
> One possibility that comes to mind: just keep a log in the database
> of property changes between two revision numbers. If you change a
> property, it gets put in the log, and then the incremental dump
> mechanism could just extract them along with the changes between those
> revisions. It is a slight hack, but it would fix the problem of
> incrementals not really fully containing the state changes of the
> database.

Heh -- yes, Ben Collins-Sussman and I were just talking about that,
after reading one of your earlier mails.

Could you file a 'feature' issue on this, and describe roughly how you
think it would work?  I can't say when we could schedule it for
(mostly depends on which developer gets interested enough to actually
do it -- of course, a patch will accelerate this process! :-) ).  But
having it in the tracker makes it more manipulatable ("manipulable"?).

I think it's a good idea, though we probably haven't thought of all
the implications yet...

-Karl

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by "Perry E. Metzger" <pe...@piermont.com>.
kfogel@collab.net writes:
> There's certainly no chance of revisiting it for 1.0.0 or any of the
> 1.0.x series.  Personally, I would also not want to revisit it for
> 1.1.x, since I think there are more important improvements to be made
> there.

It is okay to me if metadata versioning itself isn't addressed -- but
it *would* be nice if some hack could be found to include the
properties that got changed in incremental dumps. That seems like a
very useful improvement to me.

>> 2) AnonSVN:
>> I can see two obvious ways to run a "shadow" AnonSVN server mirroring
>> an original in near-real time:
>> a) Do an incremental dump after every commit, and push it to the anon
>>    server(s) where they get loaded -- only 1) gets in the way here.
>
> I like (a), and for (1), you could have your post-revprop-change hook
> write out a log of just the metadata changes -- make up some simple
> XML format -- to be transported with the incremental dumpfile.  (If
> you do this, and you have time to package up the scripts in some
> reasonably comprehensible way, we'd love to include them in the
> contrib/ section of Subversion!)
>
> And yes, I agree this is a kluge.  It's just a question of priorities.

Well, certainly for 1.0 that's all that can be done, given that 1.0 is
feature frozen. I might even kludge something up. However, is there
any way we could do better for after that?

One possibility that comes to mind: just keep a log in the database
of property changes between two revision numbers. If you change a
property, it gets put in the log, and then the incremental dump
mechanism could just extract them along with the changes between those
revisions. It is a slight hack, but it would fix the problem of
incrementals not really fully containing the state changes of the
database.
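
To make the idea a little more concrete, here is a toy in-memory model in
Python. It is purely hypothetical and says nothing about how the
filesystem would really store such a log: each property change is recorded
together with the youngest revision at the time of the change, so that a
later incremental dump covering a revision range can pick up the changes
made during that range. All names are invented for the example.

```python
from typing import List, NamedTuple


class RevpropChange(NamedTuple):
    head: int      # youngest revision when the change was made
    revision: int  # revision whose property was changed
    name: str      # property name, e.g. "svn:log"
    value: str     # new property value


class RevpropJournal:
    """Append-only log of revision property changes (toy model)."""

    def __init__(self) -> None:
        self._changes: List[RevpropChange] = []

    def record(self, head: int, revision: int, name: str, value: str) -> None:
        self._changes.append(RevpropChange(head, revision, name, value))

    def changes_between(self, lower: int, upper: int) -> List[RevpropChange]:
        """Changes made while the youngest revision was in [lower, upper)."""
        return [c for c in self._changes if lower <= c.head < upper]
```

An incremental dump from revision 10 up to 13 would then include
journal.changes_between(10, 13) alongside the ordinary revision records.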

-- 
Perry E. Metzger		perry@piermont.com

Re: redux: versioning unversioned metadata + anonsvn strategies....

Posted by kf...@collab.net.
"Perry E. Metzger" <pe...@piermont.com> writes:
> Thanks to someone who seemed excessively interested in flaming me for
> wanting to get text dumps, it seems no one paid attention to the
> questions I was actually asking, so I'm going to ask them again...

Oh, I wouldn't assume that so-and-so's flame was the reason others
didn't answer your original question :-).  More likely no one had a
good answer to give, so they didn't respond.  That's a normal pattern,
and doesn't much correlate with the presence or absence of a flame in
the thread.

> 1) Metadata:
> a) The fact that metadata is unversioned makes updates of log messages
>    more dangerous than they need to be.
> b) The fact that metadata is unversioned means incremental dumps of
>    the database, which are useful for all sorts of reasons, don't
>    include updates to things like log messages.
> Therefore, is there any chance of revisiting the decision not to
> version metadata, or at least of finding a way of fixing b)?

There's certainly no chance of revisiting it for 1.0.0 or any of the
1.0.x series.  Personally, I would also not want to revisit it for
1.1.x, since I think there are more important improvements to be made
there.

> 2) AnonSVN:
> I can see two obvious ways to run a "shadow" AnonSVN server mirroring
> an original in near-real time:
> a) Do an incremental dump after every commit, and push it to the anon
>    server(s) where they get loaded -- only 1) gets in the way here.

I like (a), and for (1), you could have your post-revprop-change hook
write out a log of just the metadata changes -- make up some simple
XML format -- to be transported with the incremental dumpfile.  (If
you do this, and you have time to package up the scripts in some
reasonably comprehensible way, we'd love to include them in the
contrib/ section of Subversion!)

And yes, I agree this is a kluge.  It's just a question of priorities.
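
For what it's worth, the "simple XML format" could be as small as one
element per change. The Python sketch below is only a guess at what such a
record might look like; the element and attribute names are invented here,
and nothing like this exists in Subversion itself.

```python
import xml.etree.ElementTree as ET


def revprop_change_xml(rev, author, name, value):
    """Serialize one revision property change as a small XML record."""
    elem = ET.Element(
        "revprop-change",  # hypothetical element name
        {"revision": str(rev), "author": author, "name": name},
    )
    elem.text = value  # ElementTree escapes <, > and & on output
    return ET.tostring(elem, encoding="unicode")
```

The hook would append one such record per change to a file that travels
with the incremental dumpfile, and the mirror would replay the records
after loading the dump.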

> b) Send over the db transaction logs and replay them, but I
>    don't really know how to do this.
> Any advice available on this?

I think (a) is your best bet.  I've never tried (b), but anyway a
method that keeps the data in human-readable/human-hackable form seems
somehow more appealing.

-Karl

redux: versioning unversioned metadata + anonsvn strategies....

Posted by "Perry E. Metzger" <pe...@piermont.com>.
Thanks to someone who seemed excessively interested in flaming me for
wanting to get text dumps, it seems no one paid attention to the
questions I was actually asking, so I'm going to ask them again...

1) Metadata:
a) The fact that metadata is unversioned makes updates of log messages
   more dangerous than they need to be.
b) The fact that metadata is unversioned means incremental dumps of
   the database, which are useful for all sorts of reasons, don't
   include updates to things like log messages.
Therefore, is there any chance of revisiting the decision not to
version metadata, or at least of finding a way of fixing b)?

2) AnonSVN:
I can see two obvious ways to run a "shadow" AnonSVN server mirroring
an original in near-real time:
a) Do an incremental dump after every commit, and push it to the anon
   server(s) where they get loaded -- only 1) gets in the way here.
b) Send over the db transaction logs and replay them, but I
   don't really know how to do this.
Any advice available on this?

Perry
