Posted to dev@subversion.apache.org by "Eric S. Raymond" <es...@thyrsus.com> on 2016/02/03 00:42:08 UTC

svncutter can be removed

The svncutter utility under contrib/server-side can be removed.  I
wrote it, and I'd do this myself, but I no longer seem to have current
write-access credentials to the repo.

A revised and improved version named 'repocutter' is now part of the
reposurgeon distribution, where it actually has proper documentation
and tests.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

The IRS has become morally corrupted by the enormous power which we in
Congress have unwisely entrusted to it. Too often it acts like a
Gestapo preying upon defenseless citizens.
	-- Senator Edward V. Long

RE: svncutter can be removed

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Stefan Sperling [mailto:stsp@elego.de]
> Sent: woensdag 3 februari 2016 09:18
> To: Greg Stein <gs...@gmail.com>
> Cc: Eric Raymond <es...@thyrsus.com>; dev@subversion.apache.org
> Subject: Re: svncutter can be removed
> 
> On Tue, Feb 02, 2016 at 07:48:53PM -0600, Greg Stein wrote:
> > On Tue, Feb 2, 2016 at 7:45 PM, Eric S. Raymond <es...@thyrsus.com> wrote:
> > > Because of the difference between Subversion and DVCS branching models,
> > > this kind of situation is better handled by using svncutter to extract
> > > the components into separate dumpfiles (and then processing each one
> > > through reposurgeon separately) than it would be by reading in the whole
> > > repo at once and doing the dissection in gitspace.
> > >
> >
> > Gotcha. How is that different from "svndumpfilter include" ?
> 
> I would guess it handles missing copyfrom sources more intelligently
> than "svndumpfilter include", which simply errors out when it hits
> that case (svndumpfilter: E200003: Invalid copy source path '/foo').

In my experience, using svndumpfilter with --drop-*, but without
--renumber-revs, is the most likely cause of these problems. Forgetting to
add that flag just creates invalid dumpfiles, so I don't think we should
have allowed that scenario. (Where is my time machine?)

BTW: svndumpfilter has existed since Subversion 1.0, so it certainly predates
2009. Not sure about when its specific features were introduced though.


	Bert


Re: svncutter can be removed

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Stefan Sperling <st...@elego.de>:
> I would guess it handles missing copyfrom sources more intelligently
> than "svndumpfilter include", which simply errors out when it hits
> that case (svndumpfilter: E200003: Invalid copy source path '/foo').

Yes, it does.  I have to admit this is a recent development -
following a recent bug report, repocutter removes copyfrom references
to expunged revisions, but that bug never came up when it was named
svncutter.
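
To illustrate the idea (this is not repocutter's actual code), here is a
minimal Python sketch of dropping a copyfrom reference that points at an
expunged revision. Node-copyfrom-rev and Node-copyfrom-path are real
dumpfile header names; the function name, the dict representation of a
node's headers, and the policy of simply deleting both headers are
assumptions made for the example.

```python
# Dumpfile headers that record where a node was copied from.
COPYFROM_HEADERS = ("Node-copyfrom-rev", "Node-copyfrom-path")


def drop_dangling_copyfrom(headers, expunged):
    """Return a copy of `headers` without copyfrom info if its source is gone.

    `headers` is a dict of one node's dumpfile headers (hypothetical
    representation); `expunged` is a set of removed revision numbers.
    """
    rev = headers.get("Node-copyfrom-rev")
    if rev is None or int(rev) not in expunged:
        # No copy source, or the source revision survives: leave it alone.
        return dict(headers)
    return {k: v for k, v in headers.items() if k not in COPYFROM_HEADERS}
```

A node whose copy source survives is returned unchanged, so the filter only
touches references that would otherwise make the load fail.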

    esr@snark:~/WWW/reposurgeon$ repocutter help

    repocutter - stream surgery on SVN dump files
    general usage: repocutter [-q] [-r SELECTION] SUBCOMMAND

    In all commands, the -r (or --range) option limits the selection of revisions
    over which an operation will be performed. A selection consists of
    one or more comma-separated ranges. A range may consist of an integer
    revision number or the special name HEAD for the head revision. Or it
    may be a colon-separated pair of integers, or an integer followed by a
    colon followed by HEAD.

    Normally, each subcommand produces a progress spinner on standard
    error; each turn means another revision has been filtered. The -q (or
    --quiet) option suppresses this.

    Type 'repocutter help <subcommand>' for help on a specific subcommand.

    Available subcommands:
       squash
       select
       propdel
       propset
       proprename
       log
       setlog
       strip
       expunge
       pathrename
       renumber
       reduce
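
The selection syntax described in the help text above can be sketched in
Python roughly as follows. This is an illustration only, not repocutter's
parser: the name parse_selection and the head parameter are hypothetical,
and treating ranges as inclusive at both ends is an assumption.

```python
def parse_selection(spec, head):
    """Expand a selection such as '1:3,7,9:HEAD' into a sorted revision list.

    `head` is the head revision number, supplied by the caller because this
    sketch never reads a dump file.
    """
    def endpoint(token):
        token = token.strip()
        if token == "HEAD":
            return head
        if not token.isdigit():
            raise ValueError(f"bad revision {token!r}")
        return int(token)

    revisions = set()
    for part in spec.split(","):
        if ":" in part:
            low, high = part.split(":", 1)
            first, last = endpoint(low), endpoint(high)
            if first > last:
                raise ValueError(f"empty range {part!r}")
            # Assumption: both endpoints are part of the range.
            revisions.update(range(first, last + 1))
        else:
            revisions.add(endpoint(part))
    return sorted(revisions)
```

For example, parse_selection("1:3,7,9:HEAD", head=10) yields revisions
1, 2, 3, 7, 9 and 10.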

Basically, whenever I perceive a need for some kind of Subversion repo
transformation that isn't expressible in gitspace, I throw it in here.
Thus propdel/propset/proprename, no prize for guessing what those do.

The strip/reduce operations, as previously noted, do test load reduction.

The important ones for dissecting multiproject repos are
expunge, pathrename, and occasionally select.

Some other operations can be done in reposurgeon but lived here before
reposurgeon existed; log and setlog report and set change comments, and
renumber does the obvious (usually to clean up after an expunge or select).

The squash operation is a bit of a mini-epic.

    esr@snark:~/WWW/reposurgeon$ repocutter help squash
    squash: usage: repocutter [-q] [-r SELECTION] [-m mapfile] [-f] [-c] squash

    The 'squash' subcommand merges adjacent commits that have the same
    author and log text and were made within 5 minutes of each other.
    This can be helpful in cleaning up after migrations from file-oriented
    revision control systems, or if a developer has been using a pre-2006
    version of Emacs VC.

    With the -m (or --mapfile) option, squash emits a map to the named
    file showing how old revision numbers map into new ones.

    With the -e (or --excise) option, the specified set of revisions in
    unconditionally removed.  The tool will exit with an error if an
    excised remove is part of a clique eligible for squashing.  Note that
    repocutter does not perform any checks on whether the repository
    history is afterwards valid; if you delete a node using this option,
    you won't find out you have a problem until you attempt to load the
    resulting dumpfile.

    repocutter attempts to fix up references to Subversion revisions in log
    entries so they will still be correct after squashing.  It considers
    anything that looks like the regular expression \br[0-9]+\b to be
    a comment reference (this is the same format that Subversion uses
    in log headers).

    Every revision in the file after the first omitted one gets the property
    'repocutter:original' set to the revision number it had before the
    squash operation.

    The option --f (or --flagrefs) causes repocutter to wrap its revision-reference
    substitutions in curly braces ({}).  By doing this, then grepping for 'r{'
    in the output of 'repocutter log', you can check for false conversions.

    The -c (or --compressmap) option changes the mapfile format to one
    that is easier for human browsing, though less well suited for
    interpretation by other programs.
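
The squash criterion from the help text (adjacent commits, same author,
same log text, within 5 minutes) can be sketched in Python as below. It is
a rough illustration, not repocutter's implementation: the Commit class and
both function names are hypothetical, the 5-minute window is measured
between neighbouring commits (an assumption; the help text does not say
whether it is measured from the first commit of a clique), and new revision
numbers are assumed to run consecutively from the first input revision.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Commit:
    rev: int
    author: str
    log: str
    date: datetime


WINDOW = timedelta(minutes=5)


def cliques(commits, window=WINDOW):
    """Group adjacent commits that share author and log text within `window`."""
    groups = []
    for commit in commits:
        if groups:
            last = groups[-1][-1]
            if (commit.author == last.author
                    and commit.log == last.log
                    and commit.date - last.date <= window):
                groups[-1].append(commit)
                continue
        groups.append([commit])
    return groups


def revision_map(groups):
    """Map each old revision number to the number of its squashed clique."""
    if not groups:
        return {}
    first = groups[0][0].rev
    return {c.rev: new
            for new, group in enumerate(groups, start=first)
            for c in group}
```

The dictionary returned by revision_map plays the role of the mapfile that
the -m option writes, in a much simplified form.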

Nowadays the analogous operation would probably be handled by a git
lift followed by a reposurgeon 'coalesce' command, but I've left this
in place in case people want to stay in Subversion.  One of the few
places reposurgeon is still weak five years in is that its Subversion
export is not very good - doesn't round-trip losslessly with import
the way fast-import files do.
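
The reference fix-up that the squash help text describes can be sketched
with Python's re module. The pattern is the one quoted in the help text;
the function name, the rev_map argument, and the exact flagged form
(r{N}, chosen so that grepping for 'r{' finds it) are assumptions for the
sake of the example.

```python
import re

# Pattern quoted in the help text, with a group added to capture the number.
REV_REF = re.compile(r"\br([0-9]+)\b")


def fix_references(log_text, rev_map, flag=False):
    """Rewrite rNNN references in a log message using an old-to-new map.

    References to revisions missing from `rev_map` are left untouched.
    With `flag` set, each substitution is wrapped in curly braces.
    """
    def replace(match):
        old = int(match.group(1))
        if old not in rev_map:
            return match.group(0)
        new = rev_map[old]
        return f"r{{{new}}}" if flag else f"r{new}"

    return REV_REF.sub(replace, log_text)
```

Because of the leading \b, a string such as "bar12" is not mistaken for a
revision reference, but any standalone word of the form r123 is, which is
why a way to review the substitutions afterwards is useful.
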
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

Re: svncutter can be removed

Posted by Stefan Sperling <st...@elego.de>.
On Tue, Feb 02, 2016 at 07:48:53PM -0600, Greg Stein wrote:
> On Tue, Feb 2, 2016 at 7:45 PM, Eric S. Raymond <es...@thyrsus.com> wrote:
> > Because of the difference between Subversion and DVCS branching models,
> > this kind of situation is better handled by using svncutter to extract
> > the components into separate dumpfiles (and then processing each one
> > through reposurgeon separately) than it would be by reading in the whole
> > repo at once and doing the dissection in gitspace.
> >
> 
> Gotcha. How is that different from "svndumpfilter include" ?

I would guess it handles missing copyfrom sources more intelligently
than "svndumpfilter include", which simply errors out when it hits
that case (svndumpfilter: E200003: Invalid copy source path '/foo').

Re: svncutter can be removed

Posted by Greg Stein <gs...@gmail.com>.
Excellent. Thanks for the detailed information. It sounds like you've
filled in lots of edge details.

On Tue, Feb 2, 2016 at 9:43 PM, Eric S. Raymond <es...@thyrsus.com> wrote:

> Greg Stein <gs...@gmail.com>:
> > Gotcha. How is that different from "svndumpfilter include" ?
> >
> > Or maybe the latter was created after you started svncutter?
>
> I think it was; svncutter dates back to 2009, possibly earlier.
>
> Another function it has is test load reduction.  There's a 'strip' command
> that replaces all content blobs with small unique cookies, leaving all
> node structure and metadata intact.
>
> This is very handy when some metadata malformation triggers a bug and
> the person reporting wants to send me a smallest-possible repo that will
> reproduce it.
>
> Usually the test load can be made still smaller with a 'reduce' operation.
> This, essentially, drops all nodes corresponding to plain file changes
> that are not adjacent to something interesting, like a copyfrom or
> directory creation or property change. So, the branch topology is preserved.
>
> Always, so far, if a full repository triggers a bug, the stripped and
> reduced version does too. And is much easier to pass around and look at.
> --
>                 <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
>

Re: svncutter can be removed

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Greg Stein <gs...@gmail.com>:
> Gotcha. How is that different from "svndumpfilter include" ?
> 
> Or maybe the latter was created after you started svncutter?

I think it was; svncutter dates back to 2009, possibly earlier.

Another function it has is test load reduction.  There's a 'strip' command
that replaces all content blobs with small unique cookies, leaving all
node structure and metadata intact.
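
A rough Python sketch of what stripping one node might look like follows.
It is not the tool's code: the function name, the cookie wording, and the
dict representation of the headers are hypothetical, and the sketch assumes
a non-delta dump. The header names themselves (Text-content-length,
Prop-content-length, Content-length, Text-content-md5, Text-content-sha1)
come from the Subversion dumpfile format.

```python
def strip_blob(headers, text, rev, index):
    """Replace a node's content with a small cookie unique to (rev, index).

    Returns (new_headers, new_text). Nodes without content are returned
    unchanged apart from the headers being copied.
    """
    if text is None:
        return dict(headers), None
    # Hypothetical cookie format; anything short and unique would do.
    cookie = f"Revision {rev}, node {index}.\n".encode("ascii")
    new = dict(headers)
    prop_len = int(new.get("Prop-content-length", "0"))
    new["Text-content-length"] = str(len(cookie))
    new["Content-length"] = str(prop_len + len(cookie))
    # The old checksums no longer match the replaced content.
    for checksum in ("Text-content-md5", "Text-content-sha1"):
        new.pop(checksum, None)
    return new, cookie
```

Paths, actions, and property lengths pass through untouched, which is the
point: only the blob shrinks.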

This is very handy when some metadata malformation triggers a bug and
the person reporting wants to send me a smallest-possible repo that will
reproduce it.  

Usually the test load can be made still smaller with a 'reduce' operation.
This, essentially, drops all nodes corresponding to plain file changes
that are not adjacent to something interesting, like a copyfrom or directory
creation or property change. So, the branch topology is preserved.
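
One possible reading of that rule, sketched in Python. This is a guess at
the shape of the operation, not the tool's logic: "adjacent" is taken to
mean the previous or next node in the stream, each node is a plain dict of
its headers, and the props_changed key is a hypothetical flag standing in
for however a property change is actually detected.

```python
def is_interesting(node):
    """True for copies, directory creations, and property changes."""
    return (
        "Node-copyfrom-rev" in node
        or (node.get("Node-kind") == "dir" and node.get("Node-action") == "add")
        or node.get("props_changed", False)  # hypothetical flag
    )


def reduce_nodes(nodes):
    """Keep nodes that are interesting or next to an interesting node."""
    kept = []
    for i, node in enumerate(nodes):
        # Window of the node itself plus its immediate neighbours.
        window = nodes[max(i - 1, 0):i + 2]
        if any(is_interesting(n) for n in window):
            kept.append(node)
    return kept
```

With five file nodes of which only the third is a copy, this keeps the
second, third, and fourth and drops the two outer plain changes.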

Always, so far, if a full repository triggers a bug, the stripped and
reduced version does too. And is much easier to pass around and look at.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

Re: svncutter can be removed

Posted by Greg Stein <gs...@gmail.com>.
On Tue, Feb 2, 2016 at 7:45 PM, Eric S. Raymond <es...@thyrsus.com> wrote:

> Greg Stein <gs...@gmail.com>:
> > But I'll take care of it for you, no worries. I'll rm the content, and
> > leave a README pointing to reposurgeon.
>
> Thanks.
>
> > Much appreciated for the prior work, and the heads up!
>
> As a matter of potential interest: I originally thought that svncutter
> would be completely obsolesced by reposurgeon.  But it turns out there
> is one case where the former is still more useful.  That's when you
> have a multi-project Subversion repository and want to split it up
> into components for separate conversion to git or whatever.
>
> Because of the difference between Subversion and DVCS branching models,
> this kind of situation is better handled by using svncutter to extract
> the components into separate dumpfiles (and then processing each one
> through reposurgeon separately) than it would be by reading in the whole
> repo at once and doing the dissection in gitspace.
>

Gotcha. How is that different from "svndumpfilter include" ?

Or maybe the latter was created after you started svncutter?

> That's why repocutter still exists.  It's not a hugely common case, but
> it comes up often enough to merit some support.
>

We actually use it quite a bit at the ASF, when projects go from the
"mother" svn repository, to Git repositories. Git is simply unworkable at
the scale of our svn repository, so projects have to switch to one or more
Git repos when they convert.

Cheers,
-g

Re: svncutter can be removed

Posted by "Eric S. Raymond" <es...@thyrsus.com>.
Greg Stein <gs...@gmail.com>:
> But I'll take care of it for you, no worries. I'll rm the content, and
> leave a README pointing to reposurgeon.

Thanks.

> Much appreciated for the prior work, and the heads up!

As a matter of potential interest: I originally thought that svncutter
would be completely obsolesced by reposurgeon.  But it turns out there
is one case where the former is still more useful.  That's when you
have a multi-project Subversion repository and want to split it up
into components for separate conversion to git or whatever.

Because of the difference between Subversion and DVCS branching models,
this kind of situation is better handled by using svncutter to extract
the components into separate dumpfiles (and then processing each one
through reposurgeon separately) than it would be by reading in the whole
repo at once and doing the dissection in gitspace.

That's why repocutter still exists.  It's not a hugely common case, but
it comes up often enough to merit some support.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

Re: svncutter can be removed

Posted by Greg Stein <gs...@gmail.com>.
Done: http://svn.apache.org/r1728244

On Tue, Feb 2, 2016 at 7:28 PM, Greg Stein <gs...@gmail.com> wrote:

> A year or three back, the Infra team had everybody reset their passwords.
> I see you have an ASF account, so it is likely just a need to reset the
> thing.
>
> But I'll take care of it for you, no worries. I'll rm the content, and
> leave a README pointing to reposurgeon.
>
> Much appreciated for the prior work, and the heads up!
>
> Cheers,
> -g
>
>
> On Tue, Feb 2, 2016 at 5:42 PM, Eric S. Raymond <es...@thyrsus.com> wrote:
>
>> The svncutter utility under contrib/server-side can be removed.  I
>> wrote it, and I'd do this myself, but I no longer seem to have current
>> write-access credentials to the repo.
>>
>> A revised and improved version named 'repocutter' is now part of the
>> reposurgeon distribution, where it actually has proper documentation
>> and tests.
>> --
>>                 <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
>>
>> The IRS has become morally corrupted by the enormous power which we in
>> Congress have unwisely entrusted to it. Too often it acts like a
>> Gestapo preying upon defenseless citizens.
>>         -- Senator Edward V. Long
>>
>
>

Re: svncutter can be removed

Posted by Greg Stein <gs...@gmail.com>.
A year or three back, the Infra team had everybody reset their passwords. I
see you have an ASF account, so it is likely just a need to reset the thing.

But I'll take care of it for you, no worries. I'll rm the content, and
leave a README pointing to reposurgeon.

Much appreciated for the prior work, and the heads up!

Cheers,
-g


On Tue, Feb 2, 2016 at 5:42 PM, Eric S. Raymond <es...@thyrsus.com> wrote:

> The svncutter utility under contrib/server-side can be removed.  I
> wrote it, and I'd do this myself, but I no longer seem to have current
> write-access credentials to the repo.
>
> A revised and improved version named 'repocutter' is now part of the
> reposurgeon distribution, where it actually has proper documentation
> and tests.
> --
>                 <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
>
> The IRS has become morally corrupted by the enormous power which we in
> Congress have unwisely entrusted to it. Too often it acts like a
> Gestapo preying upon defenseless citizens.
>         -- Senator Edward V. Long
>