Posted to dev@subversion.apache.org by Mark Eichin <ei...@gmail.com> on 2011/05/12 22:26:56 UTC

svnrdump revision inconsistency bugs

I'm working on some svn migration-with-history between repos from
different companies, git-svn isn't cutting it (close, but it loses
properties entirely, and doesn't really give me enough control) so I
thought I'd give svnrdump a try, given that it's been recommended for
this use a bunch of times... built trunk (r1102449, but I don't see
any useful changes in the logs since then) under ubuntu maverick (in a
chroot, with --disable-shared, to avoid contamination from any
ubuntu-supplied subversion libraries.)

I found what appear to be three distinct bugs, demonstrated by the
attached script.  The one that actually prompted me to reduce this to
a standalone test script was that svnrdump load failed to handle
renames -- it gave "E160013: File not found: revision 31, path ...",
which I eventually realized was the Node-copyfrom-rev *in the dump*
that it was loading, not the revision the file had actually gotten
loaded into.  The only way that could be right is if the two repos
were in perfect sync (and we already have svnsync for that.)  The
attached script demonstrates this directly.

The second bug came up while writing the test case: if the destination
repo for the svnrdump load has fewer revisions in it than the source
repo that generated the dump, it blows up *immediately* on the first
revision, with an error number that references the source-dump
revision number.  I don't know what it's actually trying to do to that
revision, but it's clear that in the case where the target has more
revs, it's doing whatever that is to the *wrong* rev, and that
deserves investigation.  You can demonstrate this by deleting the
"pump the version number up" for loop at line 57 in the attached
script.

The third bug is that "svnrdump -r 3:HEAD" says "E205000: Unsupported
revision specifier used; use only integer values or 'HEAD'" (which is
why the attached script goes out of its way to parse revision numbers
out of "svn info" -- -r 3:7 works fine.)  Even if there's a reason to
not support HEAD (though I kind of doubt that) the error message
itself is inconsistent :-)

I haven't looked at the actual code (at this point I may be out of
time and have to go with convincing people that git-svn's
approximation is good enough) but since I'd actually managed to narrow
it down to a clean test case that didn't involve any of the
dump-mangling I'm doing for the actual task, I figured it best to get
that out there so people closer to the code can give it a try.

-- 
_Mark_ <ei...@thok.org> <ei...@gmail.com>

Re: svnrdump revision inconsistency bugs

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Mark Eichin wrote on Thu, May 12, 2011 at 16:26:56 -0400:
> I found what appear to be three distinct bugs, demonstrated by the
> attached script.

It didn't reach the list... could you try again with a different MIME type?

Re: svnrdump revision inconsistency bugs

Posted by Mark Eichin <ei...@gmail.com>.
cool, I'll admit I didn't take the time to look at existing test cases
to see what style to write them in (partly because I was mostly
simplifying the script I was using to perform the original goal);
looks like I didn't include trap "rm -rf" 0  lines either (but for
debugging the problem one probably wants to keep the repo around
anyway.)

On Fri, May 13, 2011 at 12:17 PM, C. Michael Pilato <cm...@collab.net> wrote:
> Just contributing a version of the recipe script that's a bit more flexible.
>
> --
> C. Michael Pilato <cm...@collab.net>
> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
>



-- 
_Mark_ <ei...@thok.org> <ei...@gmail.com>

Re: svnrdump revision inconsistency bugs

Posted by "C. Michael Pilato" <cm...@collab.net>.
Just contributing a version of the recipe script that's a bit more flexible.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Re: svnrdump revision inconsistency bugs

Posted by Mark Eichin <ei...@gmail.com>.
Fortunately shell scripts aren't really white-space sensitive, so I've
just pasted it inline.

> Just a reminder, you always have the option to use svnsync + 'svnadmin dump'
> instead of 'svnrdump dump'.

The two more serious bugs are on the svnrdump *load* side;
svnsync doesn't help with that (because it expects a clean repo, and
it expects more access than I have to the destination repo.)  I use
svnrdump dump here just to minimize the tools involved in the
reduction case; if I use "svnadmin dump" with the same options,
svnrdump load fails the same way.

#!/bin/sh -xe

PATH=$HOME/tmp/maverick.chroot/tmp/svnrdump/bin:$PATH

svn --version
svnrdump --version

TMPDIR=/dev/shm

REPO1_DIR=$TMPDIR/repo1
REPO1=file:////$REPO1_DIR
REPO2_DIR=$TMPDIR/repo2
REPO2=file:////$REPO2_DIR

rm -rf $REPO1_DIR $REPO2_DIR

# set up repo1
svnadmin create $REPO1_DIR
WORK=$(mktemp -d -t svnbug_work.XXXXXXX)
echo to clean up:
echo rm -rf $WORK

svn mkdir -m 'setup dross' --parents $REPO1/trunk/unrelated_project
svn co $REPO1/trunk/unrelated_project $WORK
cat > $WORK/sample1 <<EOF
sample1
EOF
cat > $WORK/sample2 <<EOF
sample2
EOF
svn add $WORK/sample1
svn commit -m 'setup dross file' $WORK/sample1
svn add $WORK/sample2
svn commit -m 'setup dross file' $WORK/sample2
svn mkdir -m 'setup real work' --parents $REPO1/trunk/real_project
# record starting rev for what we do want
REV=$(svn info $REPO1 | awk '/Revision:/ { print $2 }')
rm -rf $WORK
svn co $REPO1/trunk/real_project $WORK
cat > $WORK/sample1 <<EOF
sample1
EOF
cat > $WORK/sample2 <<EOF
sample2
EOF
svn add $WORK/sample1
svn commit -m 'setup real file' $WORK/sample1
svn add $WORK/sample2
svn commit -m 'setup real file' $WORK/sample2
svn mv $WORK/sample1 $WORK/sample3
svn commit -m 'perform a rename' $WORK
HEAD=$(svn info $REPO1 | awk '/Revision:/ { print $2 }')
# try crossloading
svnadmin create $REPO2_DIR
svn mkdir -m 'need at least something to load trunk/real_project into' \
    $REPO2/trunk
# need to pump the version number up, or we lose on *every* rev?
for i in $(seq 1 10); do
    svn mkdir -m 'dummy' $REPO2/trunk/p$i
done

(echo '#!/bin/sh'; echo 'echo "you win" ; exit 0') \
    > $REPO2_DIR/hooks/pre-revprop-change
chmod +x $REPO2_DIR/hooks/pre-revprop-change
# 3:HEAD gives, falsely, "svnrdump: E205000: Unsupported revision
# specifier used; use only integer values or 'HEAD'"
echo $REV = 4 $HEAD = 7
svnrdump dump --incremental -r $REV:$HEAD $REPO1 | svnrdump load $REPO2



On Thu, May 12, 2011 at 4:39 PM, Daniel Shahaf <da...@elego.de> wrote:
> Mark Eichin wrote on Thu, May 12, 2011 at 16:26:56 -0400:
>> I haven't looked at the actual code (at this point I may be out of
>> time and have to go with convincing people that git-svn's
>> approximation is good enough)
>
> Just a reminder, you always have the option to use svnsync + 'svnadmin dump'
> instead of 'svnrdump dump'.
>
>> but since I'd actually managed to narrow
>> it down to a clean test case that didn't involve any of the
>> dump-mangling I'm doing for the actual task, I figured it best to get
>> that out there so people closer to the code can give it a try.
>
> Thanks for the report; looking forward to seeing the concrete details (that
> script which our mailing list software killed).
>



-- 
_Mark_ <ei...@thok.org> <ei...@gmail.com>

Re: svnrdump revision inconsistency bugs

Posted by Daniel Shahaf <da...@elego.de>.
Mark Eichin wrote on Thu, May 12, 2011 at 16:26:56 -0400:
> I haven't looked at the actual code (at this point I may be out of
> time and have to go with convincing people that git-svn's
> approximation is good enough)

Just a reminder, you always have the option to use svnsync + 'svnadmin dump'
instead of 'svnrdump dump'.

> but since I'd actually managed to narrow
> it down to a clean test case that didn't involve any of the
> dump-mangling I'm doing for the actual task, I figured it best to get
> that out there so people closer to the code can give it a try.

Thanks for the report; looking forward to seeing the concrete details (that
script which our mailing list software killed).

Re: svnrdump revision inconsistency bugs

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 05/17/2011 07:14 AM, Mark Eichin wrote:
> Great, thanks (in a fixing-broken-windows sense; svnrdump itself is
> still unusable-or-unsafe without sane revision number handling.)

I've filed http://subversion.tigris.org/issues/show_bug.cgi?id=3890 to track
this.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: svnrdump revision inconsistency bugs

Posted by Mark Eichin <ei...@gmail.com>.
Great, thanks (in a fixing-broken-windows sense; svnrdump itself is
still unusable-or-unsafe without sane revision number handling.)

(I did discover "Tailor", http://progetti.arstecnica.it/tailor/ which
was actually the first tool to *work* on the original task, I'm
surprised it doesn't get more attention... though as far as I can
tell, it works purely with a combination of the client command line
interface and rsync :-)

On Fri, May 13, 2011 at 5:21 PM, C. Michael Pilato <cm...@collab.net> wrote:
> On 05/13/2011 05:13 PM, C. Michael Pilato wrote:
>> On 05/12/2011 04:26 PM, Mark Eichin wrote:
>>> The third bug is that "svnrdump -r 3:HEAD" says "E205000: Unsupported
>>> revision specifier used; use only integer values or 'HEAD'" (which is
>>> why the attached script goes out of its way to parse revision numbers
>>> out of "svn info" -- -r 3:7 works fine.)  Even if there's a reason to
>>> not support HEAD (though I kind of doubt that) the error message
>>> itself is inconsistent :-)
>>
>> This is just a coding problem.  (Probably my own.)  I'll see about fixing
>> this one in the next few minutes.
>
> Committed.
>
>   $ cat ~/log_message.txt
>   Fix a silly buglet:
>
>      $ svnrdump dump -r4:HEAD file:///path/to/repos
>      svnrdump: E205000: Unsupported revision specifier used; use only
>      integer values or 'HEAD'
>
>   * subversion/svnrdump/svnrdump.c
>     (validate_and_resolve_revisions): Resolve revisions of kind
>       svn_opt_revision_head to the HEAD revision (which we conveniently
>       have as 'latest_revision').
>
>   Reported by: Mark Eichin <ei...@gmail.com>
>   $ svn ci -F ~/log_message.txt
>   Sending        subversion/svnrdump/svnrdump.c
>   Transmitting file data .
>   Committed revision 1102901.
>   $
>
>
> --
> C. Michael Pilato <cm...@collab.net>
> CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
>
>



-- 
_Mark_ <ei...@thok.org> <ei...@gmail.com>

Re: svnrdump revision inconsistency bugs

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 05/13/2011 05:13 PM, C. Michael Pilato wrote:
> On 05/12/2011 04:26 PM, Mark Eichin wrote:
>> The third bug is that "svnrdump -r 3:HEAD" says "E205000: Unsupported
>> revision specifier used; use only integer values or 'HEAD'" (which is
>> why the attached script goes out of its way to parse revision numbers
>> out of "svn info" -- -r 3:7 works fine.)  Even if there's a reason to
>> not support HEAD (though I kind of doubt that) the error message
>> itself is inconsistent :-)
> 
> This is just a coding problem.  (Probably my own.)  I'll see about fixing
> this one in the next few minutes.

Committed.

   $ cat ~/log_message.txt
   Fix a silly buglet:

      $ svnrdump dump -r4:HEAD file:///path/to/repos
      svnrdump: E205000: Unsupported revision specifier used; use only
      integer values or 'HEAD'

   * subversion/svnrdump/svnrdump.c
     (validate_and_resolve_revisions): Resolve revisions of kind
       svn_opt_revision_head to the HEAD revision (which we conveniently
       have as 'latest_revision').

   Reported by: Mark Eichin <ei...@gmail.com>
   $ svn ci -F ~/log_message.txt
   Sending        subversion/svnrdump/svnrdump.c
   Transmitting file data .
   Committed revision 1102901.
   $
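
The resolution step named in the log message can be illustrated in shell.
This is only a sketch of the idea, not the C code in svnrdump.c: the
function name resolve_rev and its two arguments are made up for the
example, and the "latest" value is assumed to have been fetched already
(for instance from "svn info", as the test script does).

```sh
#!/bin/sh
# Hypothetical helper, not part of svnrdump.
# resolve_rev SPEC LATEST
#   prints LATEST when SPEC is the keyword HEAD,
#   prints SPEC unchanged when it is a non-negative integer,
#   and fails for anything else.
resolve_rev() {
    case "$1" in
        HEAD)
            echo "$2"
            ;;
        ''|*[!0-9]*)
            echo "unsupported revision specifier: $1" >&2
            return 1
            ;;
        *)
            echo "$1"
            ;;
    esac
}

# With a repository whose youngest revision is 7, the range 4:HEAD
# resolves to 4:7.
START=$(resolve_rev 4 7)
END=$(resolve_rev HEAD 7)
echo "$START:$END"
```

The point is only that HEAD becomes a plain number before the range is
used, so "-r 4:HEAD" and "-r 4:7" end up taking the same path.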


-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand


Re: svnrdump revision inconsistency bugs

Posted by "C. Michael Pilato" <cm...@collab.net>.
On 05/12/2011 04:26 PM, Mark Eichin wrote:
> I'm working on some svn migration-with-history between repos from
> different companies, git-svn isn't cutting it (close, but it loses
> properties entirely, and doesn't really give me enough control) so I
> thought I'd give svnrdump a try, given that it's been recommended for
> this use a bunch of times... built trunk (r1102449, but I don't see
> any useful changes in the logs since then) under ubuntu maverick (in a
> chroot, with --disable-shared, to avoid contamination from any
> ubuntu-supplied subversion libraries.)
> 
> I found what appear to be three distinct bugs, demonstrated by the
> attached script.  The one that actually prompted me to reduce this to
> a standalone test script was that svnrdump load failed to handle
> renames -- it gave "E160013: File not found: revision 31, path ...",
> which I eventually realized was the Node-copyfrom-rev *in the dump*
> that it was loading, not the revision the file had actually gotten
> loaded into.  The only way that could be right is if the two repos
> were in perfect sync (and we already have svnsync for that.)  The
> attached script demonstrates this directly.

It looks like svnrdump needs to implement the revision map concept that
'svnadmin load' implements, mapping revisions as specified in the dumpstream
to revisions in target repository.
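
A rough shell sketch of what such a revision map does, purely to
illustrate the concept: the names REV_MAP, record_rev and map_rev are
invented here, and the revision numbers are illustrative (dump revisions
4..7 landing in a target whose HEAD was 11). The real loader would keep
this table internally while committing each dumped revision.

```sh
#!/bin/sh
# Hypothetical illustration of a revision map; not svnrdump code.
REV_MAP=""   # space-separated "dump_rev:target_rev" pairs

# record_rev DUMP_REV TARGET_REV
#   remember that dump revision DUMP_REV was committed as TARGET_REV
record_rev() {
    REV_MAP="$REV_MAP $1:$2"
}

# map_rev DUMP_REV
#   print the target revision recorded for DUMP_REV, or fail if
#   that dump revision was never loaded
map_rev() {
    for pair in $REV_MAP; do
        if [ "${pair%%:*}" = "$1" ]; then
            echo "${pair#*:}"
            return 0
        fi
    done
    return 1
}

# Dump revisions 4..7 loaded on top of a target at r11.
record_rev 4 12
record_rev 5 13
record_rev 6 14
record_rev 7 15

# A Node-copyfrom-rev of 5 in the dump has to be looked up as r13
# in the target, not as r5.
map_rev 5   # prints 13
```

Without the lookup, the copy source is searched for at the dump's own
revision number, which is only correct when both repositories happen to
have identical numbering.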

> The second bug came up while writing the test case: if the destination
> repo for the svnrdump load has fewer revisions in it than the source
> repo that generated the dump, it blows up *immediately* on the first
> revision, with an error number that references the source-dump
> revision number.  I don't know what it's actually trying to do to that
> revision, but it's clear that in the case where the target has more
> revs, it's doing whatever that is to the *wrong* rev, and that
> deserves investigation.  You can demonstrate this by deleting the
> "pump the version number up" for loop at line 57 in the attached
> script.

I'm guessing that this is another bit of fallout from the general problem
that caused your first reported issue.

> The third bug is that "svnrdump -r 3:HEAD" says "E205000: Unsupported
> revision specifier used; use only integer values or 'HEAD'" (which is
> why the attached script goes out of its way to parse revision numbers
> out of "svn info" -- -r 3:7 works fine.)  Even if there's a reason to
> not support HEAD (though I kind of doubt that) the error message
> itself is inconsistent :-)

This is just a coding problem.  (Probably my own.)  I'll see about fixing
this one in the next few minutes.

-- 
C. Michael Pilato <cm...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand