Posted to users@subversion.apache.org by Lev Serebryakov <le...@serebryakov.spb.ru> on 2003/11/12 14:32:46 UTC

Re[2]: cvs2svn.pl 0.50 released: first public release

Hello Mats,

Wednesday, November 12, 2003, 5:12:50 PM, you wrote:

MN> I tried it on the small sample CVS-repos I sent to the list yesterday
MN> (2nd attachment to issue
MN> http://subversion.tigris.org/issues/show_bug.cgi?id=1540).
Thank you for your reaction.

This is really not a bug in my script.
There are too many tags for two files :)
There is NO SINGLE PROPER solution.
The script cannot choose among MANY SOLUTIONS that all have the same
priority and could be used without problems.

Even I (a human with intellect, not a script without any AI) could not
find the proper solution in this case, because I know nothing about this
repository, its release tagging rules, etc.

You could use the `-sp' option in such a case, because YOU know about
your tags and branches.

For some tag/branch naming rules `-sh' could help, but this is
not such a case.

Really, I can imagine many such `test' cases where my script will
fail. But most of them are very synthetic and will never be met
in real-life repos.

-- 
Best regards,
 Lev                            mailto:lev@serebryakov.spb.ru


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Re[6]: cvs2svn.pl 0.50 released: first public release

Posted by kf...@collab.net.
Lev, just wanted to point out something that may be helpful to you:

The cvs2svn.py test suite data contains many pathological cases, drawn
from real-life experience.  It's tools/cvs2svn/test-data/*-cvsrepos/
in the Subversion tree.  Perhaps they can be useful to you in testing
cvs2svn.pl.  (Or maybe you already knew about them?  I wasn't sure,
sorry if this is old news to you :-) ).

Good luck,
-Karl

RE: Re[6]: cvs2svn.pl 0.50 released: first public release

Posted by Mats Nilsson <ma...@xware.se>.
Lev Serebryakov wrote:
>  If the script has two equivalent choices (as in the example with only
>  file1.c), it cannot choose either, because it has no additional
>  information to base the choice on, and I don't want to make a RANDOM
>  choice. You could help the script with the `-sp' option or allow a
>  random choice with the `-w DiffSymParent' option.

Well. First you say they are equivalent, but then you don't take the
consequence (and select one of them).

And it wouldn't be random if you always selected the branch closer to
the trunk, or trunk if applicable. 
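
A minimal sketch of such a deterministic tie-breaker, in Python. It assumes the set of equivalent candidate parents has already been computed and that each branch has a representative CVS branch number (for example "1.3.2"); the function name `pick_parent` and the data layout are hypothetical and not taken from cvs2svn.pl.

```python
def pick_parent(candidates, branch_numbers):
    """Pick one parent deterministically from equivalent candidates.

    candidates:     iterable of names, "TRUNK" or branch names.
    branch_numbers: {branch_name: CVS branch number such as "1.3.2"}
                    (an assumed, representative number per branch).
    """
    def distance_from_trunk(name):
        if name == "TRUNK":
            return (0, name)
        # A deeper branch has more dot-separated components.
        depth = len(branch_numbers[name].split("."))
        # The name is a secondary key so that ties are still stable.
        return (depth, name)

    return min(candidates, key=distance_from_trunk)


numbers = {"BRANCH1": "1.3.2", "SUB": "1.3.2.1.2"}
print(pick_parent({"TRUNK", "BRANCH1"}, numbers))  # TRUNK
print(pick_parent({"SUB", "BRANCH1"}, numbers))    # BRANCH1
```

The same inputs always give the same answer, so the choice is arbitrary but not random.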

To me they are equivalent. How we copy stuff while creating the svn
dumpfile is not an inherent attribute of the original repository. What
matters to me is that every line of development and every tag has the
correct intermediate and final states, defined solely and exclusively in
terms of file revisions. That is, not how we have copied stuff around to
get there. 

-w DiffSymParent didn't make a difference on my example (cvs2svn-0.50).
I'll try later with 0.51.

Sorry for the fuss. It's your show. I'll be quiet now, waiting for some
other people's comments.


Thanks.

Mats


Re[6]: cvs2svn.pl 0.50 released: first public release

Posted by Lev Serebryakov <le...@serebryakov.spb.ru>.
Hello Mats,

Thursday, November 13, 2003, 2:46:37 PM, you wrote:

>> file1.c:
MN> Guess you mean file2.c.
  Yes, my fault.

>>    REL1:1.2
>>    REL2:1.4
>>    BRANCH1:1.5.0.2
>> 
>> If the script sees only `file1.c', it "thinks":
>> 
>>  (1) REL1 is a tag on TRUNK; we copy TRUNK to create it.
>>  (2) REL2 is a tag on TRUNK OR on BRANCH1 before the file was changed
>>      on this branch. WHAT SHOULD WE COPY!? BRANCH1 and TRUNK contain
>>      the SAME file at this moment in time!
MN> I'd say REL2 is a tag on TRUNK, since its version only has two
MN> components. A tag on BRANCH1 would have a version number 1.3.2.x.
  NO, NO AND NO. When you set a tag on a file that is checked out
  from a branch but was not changed on that branch, the tag is set on
  the branch-point revision. The tag was set ON the BRANCH (for all
  files), but for THIS ONE FILE (not changed on the branch) it will be
  on a TRUNK revision.
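
The numbering rule behind this disagreement can be sketched in Python. In an RCS file, a branch symbol is stored as a "magic" number with a 0 in the second-to-last position, so `1.3.0.2` names branch `1.3.2` sprouting from revision `1.3`, while a tag is stored as a plain revision number. The helper names `parse_symbol` and `possible_parents` are illustrative only and are not part of cvs2svn.pl.

```python
def parse_symbol(rev):
    """Classify a CVS symbol revision as a branch or a tag.

    Returns ("branch", branch_number, branch_point) for magic branch
    numbers such as 1.3.0.2, and ("tag", revision, None) otherwise.
    """
    parts = rev.split(".")
    # Magic branch numbers have an even count of at least four parts,
    # with 0 as the second-to-last component.
    if len(parts) >= 4 and len(parts) % 2 == 0 and parts[-2] == "0":
        branch_point = ".".join(parts[:-2])
        branch_number = branch_point + "." + parts[-1]
        return ("branch", branch_number, branch_point)
    return ("tag", rev, None)


def possible_parents(tag_rev, branches):
    """Lines of development on which a tag at tag_rev could have been set.

    branches maps branch name -> symbol revision (e.g. "1.3.0.2").
    """
    parts = tag_rev.split(".")
    # A two-component revision lives on the trunk.
    parents = {"TRUNK"} if len(parts) == 2 else set()
    for name, sym in branches.items():
        kind, number, point = parse_symbol(sym)
        if kind != "branch":
            continue
        if ".".join(parts[:-1]) == number:
            # The tag sits on a revision committed on the branch itself.
            parents.add(name)
        elif tag_rev == point:
            # The tag sits on the branch point: the file was unchanged
            # on the branch when the tag was set.
            parents.add(name)
    return parents


print(sorted(possible_parents("1.3", {"BRANCH1": "1.3.0.2"})))  # ['BRANCH1', 'TRUNK']
print(sorted(possible_parents("1.4", {"BRANCH1": "1.5.0.2"})))  # ['TRUNK']
```

For file1.c (REL2 on 1.3, BRANCH1 sprouting from 1.3) both parents are possible; for file2.c (REL2 on 1.4, BRANCH1 sprouting from 1.5) only the trunk is.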

MN> If I converted this repository (consisting of only file1.c) to svn, I'd
MN> expect to see something like this:
MN>         rev 1: A /trunk/file1.c (1.1)
MN>         rev 2: M /trunk/file1.c (1.2)
MN>         rev 3: M /trunk/file1.c (1.3)
MN>         rev 4: A /branches/BRANCH1 by copying /trunk@3
MN>         rev 5: A /tags/REL1        by copying /trunk@1
  Why not copy /branches/BRANCH1@4 here? They are equal and consist of
the same revision of this file! Why is trunk more important than the
branch?! Maybe this tag was created on the branch, but before the first
change of this file on that branch (it is possible).

  The result will not be the same, because when you copy from the branch,
you also copy the log message about the branch creation. That could be
important. If it is not important to you, read about the `-w' option and
the `DiffSymParents' event.

  In some cases, the `-sh' option can resolve such ambiguous situations,
but only when tags on branches start with the branch name... It
is a simple heuristic, but it works for the FreeBSD repository, for example.
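
A name-prefix heuristic of the kind described here might look like the following Python sketch. This is not the code behind `-sh', whose actual implementation is not shown in this thread; the function name `guess_parent_by_name` and the symbol names in the example are only illustrations.

```python
def guess_parent_by_name(tag, candidates):
    """Prefer the candidate branch whose name is a prefix of the tag name.

    Returns the longest matching branch name, or None when no candidate
    branch name prefixes the tag (the heuristic then gives no answer).
    """
    matches = [c for c in candidates if c != "TRUNK" and tag.startswith(c)]
    if not matches:
        return None
    # The longest prefix is the most specific branch.
    return max(matches, key=len)


print(guess_parent_by_name("RELENG_4_9_0_RELEASE",
                           {"TRUNK", "RELENG_4", "RELENG_4_9"}))  # RELENG_4_9
print(guess_parent_by_name("REL2", {"TRUNK", "BRANCH1"}))          # None
```

The second call shows why the heuristic does not help in the example from this thread: REL2 does not start with BRANCH1.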

MN> For a repository with both files, it might read:
MN>         rev 1: A /trunk/file1.c (1.1)
MN>                A /trunk/file2.c (1.1)
MN>         rev 2: M /trunk/file2.c (1.2)
MN>         rev 3: M /trunk/file1.c (1.2)
MN>                M /trunk/file2.c (1.3)
MN>         rev 4: M /trunk/file1.c (1.3)
MN>                M /trunk/file2.c (1.4)
MN>         rev 5: M /trunk/file2.c (1.5)
MN>         rev 6: A /branches/BRANCH1 by copying /trunk@5
MN>         rev 7: A /tags/REL1        by copying /trunk@2
MN> Look for a copy source with revisions file1(1.3) file2(1.4):
MN>         rev 8: A /tags/REL2        by copying /trunk@4
  Yes, the script will do exactly this in such a case.

MN> Lev, it would be really interesting if you could extend this example in
MN> such a way that the choices in the first walkthrough would not be
MN> equivalent. That could maybe help to illustrate your point.
 If the script has two equivalent choices (as in the example with only
 file1.c), it cannot choose either, because it has no additional
 information to base the choice on, and I don't want to make a RANDOM
 choice. You could help the script with the `-sp' option or allow a
 random choice with the `-w DiffSymParent' option.

-- 
Best regards,
 Lev                            mailto:lev@serebryakov.spb.ru


RE: Re[4]: cvs2svn.pl 0.50 released: first public release

Posted by Mats Nilsson <ma...@xware.se>.
Lev Serebryakov wrote:
>  And simple example.
> 
> Here are repository with two files: file1.c & file2.c
> Symbols in these files are:
> 
> file1.c:
>    REL1:1.1
>    REL2:1.3
>    BRANCH1:1.3.0.2
> 
> file1.c:

Guess you mean file2.c.

>    REL1:1.2
>    REL2:1.4
>    BRANCH1:1.5.0.2
> 
> If the script sees only `file1.c', it "thinks":
> 
>  (1) REL1 is a tag on TRUNK; we copy TRUNK to create it.
>  (2) REL2 is a tag on TRUNK OR on BRANCH1 before the file was changed
>      on this branch. WHAT SHOULD WE COPY!? BRANCH1 and TRUNK contain
>      the SAME file at this moment in time!

I'd say REL2 is a tag on TRUNK, since its version only has two
components. A tag on BRANCH1 would have a version number 1.3.2.x.

If I converted this repository (consisting of only file1.c) to svn, I'd
expect to see something like this:
	rev 1: A /trunk/file1.c (1.1)
	rev 2: M /trunk/file1.c (1.2)
	rev 3: M /trunk/file1.c (1.3)
	rev 4: A /branches/BRANCH1 by copying /trunk@3
	rev 5: A /tags/REL1        by copying /trunk@1
Look for a copy source with file1 (1.3):
Alternative 1:
	rev 6: A /tags/REL2        by copying /trunk@3
Alternative 2:
	rev 6: A /tags/REL2        by copying /branches/BRANCH1@4

Either way would be fine, since we have equivalent history recordings in
both cases, and all directories have the correct version of
file1.c@HEAD. First alternative would be simpler, but that's an
optimization.


> If the script sees BOTH files, it "thinks":
>      
>  (1) REL1 is a tag on TRUNK in BOTH files; we copy TRUNK to create it.
>  (2) REL2 is a tag on TRUNK OR BRANCH1 in file1.c
>      REL2 is a tag on TRUNK in file2.c
>      THERE IS NO CONFLICT: TRUNK is the PROPER parent for this tag,
>      copy it!

For a repository with both files, it might read:
	rev 1: A /trunk/file1.c (1.1)
	       A /trunk/file2.c (1.1)
	rev 2: M /trunk/file2.c (1.2)
	rev 3: M /trunk/file1.c (1.2)
	       M /trunk/file2.c (1.3)
	rev 4: M /trunk/file1.c (1.3)
	       M /trunk/file2.c (1.4)
	rev 5: M /trunk/file2.c (1.5)
	rev 6: A /branches/BRANCH1 by copying /trunk@5
	rev 7: A /tags/REL1        by copying /trunk@2
Look for a copy source with revisions file1(1.3) file2(1.4):
	rev 8: A /tags/REL2        by copying /trunk@4

All directories have correct history and contain the right versions of
each file @HEAD.

Now, if instead file2.c would have contained:
file2.c:
    REL1:1.2
    BRANCH1:1.5.0.2
    REL2:1.5.2.2

	rev 1: A /trunk/file1.c (1.1)
	       A /trunk/file2.c (1.1)
	rev 2: M /trunk/file2.c (1.2)
	rev 3: M /trunk/file1.c (1.2)
	       M /trunk/file2.c (1.3)
	rev 4: M /trunk/file1.c (1.3)
	       M /trunk/file2.c (1.4)
	rev 5: M /trunk/file2.c (1.5)
At this point we look for a copy source with revisions file1 (1.3) and
file2 (1.5).
	rev 6: A /branches/BRANCH1 by copying /trunk@5
	rev 7: M /branches/BRANCH1/file2.c (1.5.2.1)
	rev 8: M /branches/BRANCH1/file2.c (1.5.2.2)
	rev 9: A /tags/REL1        by copying /trunk@2
At this point, it should look for a copy source with revisions file1
(1.3) file2 (1.5.2.2):
	rev 10: A /tags/REL2       by copying /branches/BRANCH1@8

All directories have correct history and contain the right versions of
each file @HEAD.
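
The "look for a copy source" step in these walkthroughs can be sketched in Python. The sketch replays the last walkthrough (file2.c with REL2 on 1.5.2.2) as a list of changes, keeps a full snapshot of every tree after each revision, and then searches for trees whose file revisions match a tag exactly. The names `build_history` and `find_copy_sources` and the tuple layout of the changes are hypothetical; neither conversion script is known to work this way.

```python
import copy


def build_history(changes):
    """Turn a list of per-revision changes into full tree snapshots."""
    trees, history = {}, []
    for svn_rev, change in enumerate(changes, start=1):
        if change[0] == "copy":          # ("copy", dst, src, src_rev)
            _, dst, src, src_rev = change
            trees[dst] = dict(history[src_rev - 1][1][src])
        else:                            # ("edit", tree, {file: cvs_rev})
            _, tree, files = change
            trees.setdefault(tree, {}).update(files)
        history.append((svn_rev, copy.deepcopy(trees)))
    return history


def find_copy_sources(history, wanted):
    """Return every (tree_path, svn_rev) whose files match `wanted` exactly."""
    return [
        (path, svn_rev)
        for svn_rev, trees in history
        for path, files in sorted(trees.items())
        if files == wanted
    ]


changes = [
    ("edit", "/trunk", {"file1.c": "1.1", "file2.c": "1.1"}),
    ("edit", "/trunk", {"file2.c": "1.2"}),
    ("edit", "/trunk", {"file1.c": "1.2", "file2.c": "1.3"}),
    ("edit", "/trunk", {"file1.c": "1.3", "file2.c": "1.4"}),
    ("edit", "/trunk", {"file2.c": "1.5"}),
    ("copy", "/branches/BRANCH1", "/trunk", 5),
    ("edit", "/branches/BRANCH1", {"file2.c": "1.5.2.1"}),
    ("edit", "/branches/BRANCH1", {"file2.c": "1.5.2.2"}),
]
history = build_history(changes)

rel1 = {"file1.c": "1.1", "file2.c": "1.2"}
rel2 = {"file1.c": "1.3", "file2.c": "1.5.2.2"}
print(find_copy_sources(history, rel1))  # [('/trunk', 2)]
print(find_copy_sources(history, rel2))  # [('/branches/BRANCH1', 8)]
```

When the search returns more than one match, the candidates are equivalent in terms of file revisions, which is exactly the situation the thread is arguing about.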

--

Lev, it would be really interesting if you could extend this example in
such a way that the choices in the first walkthrough would not be
equivalent. That could maybe help to illustrate your point.

Mats


Re[4]: cvs2svn.pl 0.50 released: first public release

Posted by Lev Serebryakov <le...@serebryakov.spb.ru>.
Hello Mats,

Thursday, November 13, 2003, 11:28:03 AM, you wrote:

MN> I don't understand your point. Why aren't you allowed to have multiple
MN> tags on the same version? It's a very, very common use case.
 A simple example.

Here is a repository with two files: file1.c and file2.c.
The symbols in these files are:

file1.c:
   REL1:1.1
   REL2:1.3
   BRANCH1:1.3.0.2

file1.c:
   REL1:1.2
   REL2:1.4
   BRANCH1:1.5.0.2

If the script sees only `file1.c', it "thinks":

 (1) REL1 is a tag on TRUNK; we copy TRUNK to create it.
 (2) REL2 is a tag on TRUNK OR on BRANCH1 before the file was changed
     on this branch. WHAT SHOULD WE COPY!? BRANCH1 and TRUNK contain
     the SAME file at this moment in time!
     
If the script sees BOTH files, it "thinks":
     
 (1) REL1 is a tag on TRUNK in BOTH files; we copy TRUNK to create it.
 (2) REL2 is a tag on TRUNK OR BRANCH1 in file1.c
     REL2 is a tag on TRUNK in file2.c
     THERE IS NO CONFLICT: TRUNK is the PROPER parent for this tag,
     copy it!
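
The reasoning above amounts to intersecting the per-file sets of candidate parents. A minimal Python sketch, assuming the candidate sets per file have already been worked out as in the text; the function name `common_parents` is illustrative and not taken from cvs2svn.pl:

```python
def common_parents(candidates_per_file):
    """Intersect the per-file candidate parents of one symbol."""
    sets = [set(c) for c in candidates_per_file.values()]
    return set.intersection(*sets) if sets else set()


# Candidate parents of REL2, per file, as worked out in the text above.
only_file1 = {"file1.c": {"TRUNK", "BRANCH1"}}
both_files = {
    "file1.c": {"TRUNK", "BRANCH1"},
    "file2.c": {"TRUNK"},
}

print(sorted(common_parents(only_file1)))  # ['BRANCH1', 'TRUNK']
print(sorted(common_parents(both_files)))  # ['TRUNK']
```

One surviving parent means the tag can be created with a single copy; more than one means the files alone do not decide the question.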
     
-- 
Best regards,
 Lev                            mailto:lev@serebryakov.spb.ru


RE: Re[4]: cvs2svn.pl 0.50 released: first public release

Posted by Mats Nilsson <ma...@xware.se>.

Lev Serebryakov wrote:
> Thursday, November 13, 2003, 11:28:03 AM, you wrote:
> 
> MN> I don't understand your point. Why aren't you allowed to have multiple
> MN> tags on the same version? It's a very, very common use case.
>   You can have multiple TAGS on one REVISION. But you cannot have
>   one TAG on multiple BRANCHES IN ONE FILE (yes, one file can have
>   the tag on one branch and another file on a different branch; that
>   is allowed). Tags are created with a COPY operation in Subversion,
>   and if there are many possible parents for a tag, the script DOESN'T
>   KNOW which TREE it should copy to create this tag.

That is fair. But if the situation should appear, then the conversion
script could take steps to faithfully reproduce at least HEAD for each
tag, resorting to deletions and selective copies if necessary.
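
One way to picture "deletions and selective copies" is as a list of fix-ups applied after copying the nearest matching tree. The Python sketch below is only an illustration of that idea; the function name `plan_tag` and the operation names "delete" and "replace" are hypothetical and do not correspond to any option of cvs2svn.pl or cvs2svn.py.

```python
def plan_tag(wanted, source_files):
    """List the fix-ups needed after copying a source tree to a tag.

    wanted:       {file: cvs_rev} the tag must contain.
    source_files: {file: cvs_rev} present in the copied source tree.
    """
    ops = []
    # Files in the copy that the tag must not contain.
    for name in sorted(source_files.keys() - wanted.keys()):
        ops.append(("delete", name))
    # Files that are missing or at the wrong revision in the copy.
    for name in sorted(wanted):
        if source_files.get(name) != wanted[name]:
            ops.append(("replace", name, wanted[name]))
    return ops


tag = {"file1.c": "1.3", "file2.c": "1.4"}
trunk_at_5 = {"file1.c": "1.3", "file2.c": "1.5"}
print(plan_tag(tag, trunk_at_5))  # [('replace', 'file2.c', '1.4')]
print(plan_tag(tag, tag))         # []
```

An empty list means the copy alone reproduces the tag; anything else is the extra work needed to get the right file revisions at HEAD of the tag.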

> MN> A branch is unambiguously defined by its revision number (e.g
> MN> 1.24.0.22). Specifically, it is not defined in terms of other tags. IMO,
> MN> a conversion script should not depend on that, or it will not be useful
> MN> for a great number of well-formed CVS repositories.
>   A tag is always set on some branch. And a tag is created by COPYING.

I don't understand this comment.

> MN> As for synthetic test cases, this is in fact a fragment of a real world
> MN> repository. Not every file is changed for each release. This means that
> MN> for some files, there will be many tags on some versions.
>   The script gathers information from ALL files. This allows it to
> decide which branch (trunk is a special branch, of course) should be
> copied to create a tag. The script doesn't work on a per-file basis; it
> tries to reconstruct the FULL tree of branches and the tags on them. It
> NEEDS this information, because the tag creation process is not `put
> into the SVN repository THIS revision of this file and THAT revision of
> that file', but `COPY this tree at that moment in time'. So the script
> must know WHICH tree it should copy. And when a tag could be created
> by copying any of 4 trees (branches), for example (because all
> these branches are equal at the given moment in time), it asks the user
> for help, because it CANNOT decide which branch is the PROPER one to
> copy into the tag (or new branch).

Well, I argue that for my repository, if there exist alternative
solutions, and if all are equivalent and correct, you should simply
choose one of them.

>   OK, I could add a DUMB mode, in which tags and branches are created
> not with cheap copy operations, but by ADDING files to the repository.
> Then no conflicts will be detected and every situation will be resolved
> WITHOUT user interaction. But this will lead to a situation where each
> branch or tag consumes full space in the repository, the history of
> these files is lost, etc, etc, etc. Do you need it? It could be added
> in 10 minutes!

No, please don't. I still think there exists a solution. I guess the
cvs2svn.py and vcp efforts would already have complained about these
problems if they weren't resolvable, at least approximately resolvable.

(Well, cvs2svn.py doesn't handle this repository fragment either, so I
might stand corrected in the long run. :))


Mats


Re[4]: cvs2svn.pl 0.50 released: first public release

Posted by Lev Serebryakov <le...@serebryakov.spb.ru>.
Hello Mats,

Thursday, November 13, 2003, 11:28:03 AM, you wrote:

MN> I don't understand your point. Why aren't you allowed to have multiple
MN> tags on the same version? It's a very, very common use case.
  You can have multiple TAGS on one REVISION. But you cannot have
  one TAG on multiple BRANCHES IN ONE FILE (yes, one file can have
  the tag on one branch and another file on a different branch; that
  is allowed). Tags are created with a COPY operation in Subversion,
  and if there are many possible parents for a tag, the script DOESN'T
  KNOW which TREE it should copy to create this tag.

MN> A branch is unambiguously defined by its revision number (e.g
MN> 1.24.0.22). Specifically, it is not defined in terms of other tags. IMO,
MN> a conversion script should not depend on that, or it will not be useful
MN> for a great number of well-formed CVS repositories.
  A tag is always set on some branch. And a tag is created by COPYING.

MN> As for synthetic test cases, this is in fact a fragment of a real world
MN> repository. Not every file is changed for each release. This means that
MN> for some files, there will be many tags on some versions.
  The script gathers information from ALL files. This allows it to decide
which branch (trunk is a special branch, of course) should be copied
to create a tag. The script doesn't work on a per-file basis; it tries to
reconstruct the FULL tree of branches and the tags on them. It NEEDS this
information, because the tag creation process is not `put into the SVN
repository THIS revision of this file and THAT revision of that
file', but `COPY this tree at that moment in time'. So the script
must know WHICH tree it should copy. And when a tag could be created
by copying any of 4 trees (branches), for example (because all
these branches are equal at the given moment in time), it asks the user
for help, because it CANNOT decide which branch is the PROPER one to
copy into the tag (or new branch).

  OK, I could add a DUMB mode, in which tags and branches are created not
with cheap copy operations, but by ADDING files to the repository. Then
no conflicts will be detected and every situation will be resolved
WITHOUT user interaction. But this will lead to a situation where each
branch or tag consumes full space in the repository, the history of these
files is lost, etc, etc, etc. Do you need it? It could be added in 10
minutes!

-- 
Best regards,
 Lev                            mailto:lev@serebryakov.spb.ru


Re[4]: cvs2svn.pl 0.50 released: first public release

Posted by Lev Serebryakov <le...@serebryakov.spb.ru>.
Hello Mats,

Thursday, November 13, 2003, 11:28:03 AM, you wrote:

MN> I don't understand your point. Why aren't you allowed to have multiple
MN> tags on the same version? It's a very, very common use case.
  I need to add to my previous answer: a TREE could consist of a single
  file too; it is not always a directory.

-- 
Best regards,
 Lev                            mailto:lev@serebryakov.spb.ru


RE: Re[2]: cvs2svn.pl 0.50 released: first public release

Posted by Mats Nilsson <ma...@xware.se>.
Lev,

I don't understand your point. Why aren't you allowed to have multiple
tags on the same version? It's a very, very common use case.

A branch is unambiguously defined by its revision number (e.g
1.24.0.22). Specifically, it is not defined in terms of other tags. IMO,
a conversion script should not depend on that, or it will not be useful
for a great number of well-formed CVS repositories.
 
As for synthetic test cases, this is in fact a fragment of a real world
repository. Not every file is changed for each release. This means that
for some files, there will be many tags on some versions.

Mats

Lev Serebryakov wrote:
> This is really not a bug in my script.
> There are too many tags for two files :)
> There is NO SINGLE PROPER solution.
> The script cannot choose among MANY SOLUTIONS that all have the same
> priority and could be used without problems.

