Posted to dev@subversion.apache.org by e....@gmx.net on 2003/06/26 16:36:10 UTC

cvs2svn.py fails converting repository with the attached file

Hi,

Some time ago I committed files to my CVS repository with spaces in the
author name (i.e. USER="Erik Huelsmann"). Obviously, this was before I
started using Unix, but the repository came with me to my new OSes. Neither
CVS nor RCS chokes on it, but cvs2svn.py does. It happens with the file
attached to this mail, among others. Should an issue be created for this?

bye,


Erik Huelsmann

-- 
+++ GMX - Mail, Messaging & more  http://www.gmx.net +++
Smile, please! An online photo gallery with GMX, no homepage of your own required!

Re: cvs2svn.py fails converting repository with the attached file

Posted by e....@gmx.net.
It does fix my problems, so I guess that if it is not part of svn then there
must be some other way to make it part of rcsparse. Who maintains the
repository? I looked at ViewCVS, but did not really find an address to submit
things to.

It looks to me like this is a pre-1.0 issue.

bye,


Erik.

> On Thu, Jun 26, 2003 at 06:36:10PM +0200, e.huelsmann@gmx.net wrote:
> > Hi,
> > 
> > Some time in the past I have committed files to my cvs repository with
> > spaces in the author name (ie USER="Erik Huelsmann") obviously, this was
> > before I started using unix, but the repository came with me to my new
> > OSes. CVS nor RCS choke on it, but cvs2svn.py does. It happens - amongst
> > other files - with the file attached to this mail. Should an issue be
> > created for this?
> > 
> 
> Not if the following patch fixes it. :-)
> 
> rcsparse isn't technically part of svn, so I haven't committed
> this change. :-(
> 
> Let me know if this works.
> 
> --ben
> 
> 
> --- tools/cvs2svn/rcsparse/common.py	Sun Jun  8 18:08:27 2003
> +++ tools/cvs2svn/rcsparse/common.py	Thu Jun 26 13:10:43 2003
> @@ -153,11 +153,16 @@
>        timestamp = compat.timegm(tuple(date_fields))
>  
>        # Parse author
> -      semi, author, sym = self.ts.mget(3)
> +      sym = self.ts.get()
>        if sym != 'author':
>          raise RCSExpected(sym, 'author')
> -      if semi != ';':
> -        raise RCSExpected(semi, ';')
> +      author = ''
> +      while 1:
> +        token = self.ts.get()
> +        if token == ';':
> +          break
> +        author = author + token + ' '
> +      author = author[:-1]
>  
>        # Parse state
>        self.ts.match('state')
> 



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: cvs2svn.py fails converting repository with the attached file

Posted by mark benedetto king <mb...@lowlatency.com>.
On Thu, Jun 26, 2003 at 06:23:26PM -0700, Greg Stein wrote:
> On Thu, Jun 26, 2003 at 01:15:49PM -0400, mark benedetto king wrote:
> >...
> > +++ tools/cvs2svn/rcsparse/common.py	Thu Jun 26 13:10:43 2003
> > @@ -153,11 +153,16 @@
> >        timestamp = compat.timegm(tuple(date_fields))
> >  
> >        # Parse author
> > -      semi, author, sym = self.ts.mget(3)
> > +      sym = self.ts.get()
> 
> We can optimistically grab all three tokens. If 'semi' is not a semicolon,
> then we can assume it is a second part for the author, append them, and grab
> another token looking for the semicolon.
> 

But my name is Mark Benedetto King. :-)

> >        if sym != 'author':
> >          raise RCSExpected(sym, 'author')
> > -      if semi != ';':
> > -        raise RCSExpected(semi, ';')
> > +      author = ''
> > +      while 1:
> > +        token = self.ts.get()
> > +        if token == ';':
> > +          break
> > +        author = author + token + ' '
> > +      author = author[:-1]
> 
> I would favor an assumption of a single extra token for the author, rather
> than an unbounded loop. Note that these loops are quite sensitive to
> performance, too. That was the whole purpose for introducing .mget() :-)
> 

I thought about that too; error messages like "unexpected EOF" aren't
very useful when the real problem was corruption 10% into the file.
However, I noticed that the next few lines to parse the 'state' had
an unbounded loop of their own, so I figured, "when in Rome..."
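
One way to keep the loop while addressing the error-message concern is to
check for end of input and cap the number of name parts. The sketch below
assumes a tokenizer that returns ';' as its own token and None at end of
input. ToyTokenStream, parse_author and max_parts are hypothetical names
used for illustration only; they are not rcsparse's actual API.

```python
class RCSExpected(Exception):
    """Raised when the parser sees a token other than the one it expected."""

    def __init__(self, got, wanted):
        super().__init__("expected %r, got %r" % (wanted, got))
        self.got = got
        self.wanted = wanted


class ToyTokenStream:
    """Hypothetical stand-in for a token stream; get() returns None at EOF."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._pos = 0

    def get(self):
        if self._pos >= len(self._tokens):
            return None
        token = self._tokens[self._pos]
        self._pos += 1
        return token


def parse_author(ts, max_parts=8):
    """Read 'author <name...> ;' and return the name joined by single spaces."""
    sym = ts.get()
    if sym != 'author':
        raise RCSExpected(sym, 'author')
    parts = []
    while True:
        token = ts.get()
        if token == ';':
            break
        # Fail at the author field on EOF or an implausibly long name,
        # instead of running on and failing somewhere much later.
        if token is None or len(parts) >= max_parts:
            raise RCSExpected(token, ';')
        parts.append(token)
    if not parts:
        raise RCSExpected(';', 'author name')
    return ' '.join(parts)
```

With this shape, a three-part name is accepted, while a truncated file raises
RCSExpected for the missing ';' right where the author field is read. The cap
of 8 parts is an arbitrary choice for the sketch.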

--ben




Re: cvs2svn.py fails converting repository with the attached file

Posted by Julian Foad <ju...@btopenworld.com>.
Greg Stein wrote:
> On Thu, Jun 26, 2003 at 01:15:49PM -0400, mark benedetto king wrote:
> 
>>+      while 1:
> 
> I would favor an assumption of a single extra token for the author, rather
> than an unbounded loop.

Let me get this clear.  Are you saying "Greg Stein" is a valid name but "Mark Benedetto King" is an invalid name?  :-)

- Julian



Re: cvs2svn.py fails converting repository with the attached file

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Jun 26, 2003 at 01:15:49PM -0400, mark benedetto king wrote:
>...
> +++ tools/cvs2svn/rcsparse/common.py	Thu Jun 26 13:10:43 2003
> @@ -153,11 +153,16 @@
>        timestamp = compat.timegm(tuple(date_fields))
>  
>        # Parse author
> -      semi, author, sym = self.ts.mget(3)
> +      sym = self.ts.get()

We can optimistically grab all three tokens. If 'semi' is not a semicolon,
then we can assume it is a second part for the author, append them, and grab
another token looking for the semicolon.

>        if sym != 'author':
>          raise RCSExpected(sym, 'author')
> -      if semi != ';':
> -        raise RCSExpected(semi, ';')
> +      author = ''
> +      while 1:
> +        token = self.ts.get()
> +        if token == ';':
> +          break
> +        author = author + token + ' '
> +      author = author[:-1]

I would favor an assumption of a single extra token for the author, rather
than an unbounded loop. Note that these loops are quite sensitive to
performance, too. That was the whole purpose for introducing .mget() :-)

Something like:

  semi, author, sym = self.ts.mget(3)
  if sym != 'author':
    raise RCSExpected(sym, 'author')
  if semi != ';':
    ### some comment goes here about how broken the file is that we're
    ### reading, but we'll be Nice People and give it a shot.
    author = author + ' ' + semi
    semi = self.ts.get()
    if semi != ';':
      raise RCSExpected(semi, ';')
  
  # Parse state
  ...
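
For anyone who wants to try the bounded variant outside rcsparse, here is a
self-contained sketch of the same idea. ToyTokenStream is a hypothetical
stand-in for the real token stream, and it assumes that mget() hands tokens
back newest-first, which is what the unpacking order "semi, author, sym"
suggests.

```python
class RCSExpected(Exception):
    """Raised when the parser sees a token other than the one it expected."""

    def __init__(self, got, wanted):
        super().__init__("expected %r, got %r" % (wanted, got))


class ToyTokenStream:
    """Hypothetical stand-in for the rcsparse token stream."""

    def __init__(self, tokens):
        self._tokens = list(tokens)

    def get(self):
        # Return the next token, or None once the input is exhausted.
        return self._tokens.pop(0) if self._tokens else None

    def mget(self, count):
        # Assumption: tokens come back newest-first, matching the
        # "semi, author, sym = ...mget(3)" unpacking order.
        tokens = [self.get() for _ in range(count)]
        tokens.reverse()
        return tokens


def parse_author_bounded(ts):
    """Accept 'author NAME ;' and, leniently, 'author FIRST SECOND ;'."""
    semi, author, sym = ts.mget(3)
    if sym != 'author':
        raise RCSExpected(sym, 'author')
    if semi != ';':
        # The file is invalid per the RCS file format, but tolerate
        # exactly one extra name part before insisting on the ';'.
        if semi is None:
            raise RCSExpected(semi, ';')
        author = author + ' ' + semi
        semi = ts.get()
        if semi != ';':
            raise RCSExpected(semi, ';')
    return author
```

In this sketch a two-part name is accepted, while a name with three or more
parts still raises RCSExpected, since only one extra token is tolerated.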


Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


Re: cvs2svn.py fails converting repository with the attached file

Posted by Greg Stein <gs...@lyra.org>.
On Thu, Jun 26, 2003 at 06:42:30PM -0500, kfogel@collab.net wrote:
> MBK writes:
> > > rcsparse isn't technically part of svn, so I haven't committed
> > > this change. :-(

I'm not sure we should. The provided file is corrupt, per the RCS file
specification. Type 'man rcsfile' and look at the spec for the author. Or,
you can read this:

    http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=29647

If we want to be "nice" and go ahead and parse the file, then I'd like to see a
very clear comment in rcsparse.py about *why* we're allowing for invalid
files.

And note that we would still have problems around 'co' and its warning about
the invalid file.

> David Summers <da...@summersoft.fay.ar.us> writes:
> > The patch certainly fixes the abnormal termination of the script, but now 
> > I'm getting:  [errors from rcs 'co']

Yup. The file is invalid :-)

>...
> So the best solution would be to commit this change upstream.  I'll
> take a look at the patch & talk to Greg Stein about getting it into
> the master rcsparse tree.

See above.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


Re: cvs2svn.py fails converting repository with the attached file

Posted by kf...@collab.net.
MBK writes:
> > rcsparse isn't technically part of svn, so I haven't committed
> > this change. :-(

David Summers <da...@summersoft.fay.ar.us> writes:
> The patch certainly fixes the abnormal termination of the script, but now 
> I'm getting:  [errors from rcs 'co']

Thanks for the patch, MBK.

Subversion includes rcsparse as a third-party project, but we have
actually checked in its CVS dir along with its files, so that we can
update our Subversion copy more or less like this:

   $ cvs up
   $ svn ci -m "Update to head of master."

So the best solution would be to commit this change upstream.  I'll
take a look at the patch & talk to Greg Stein about getting it into
the master rcsparse tree.

The fact that we can also get a warning from RCS 'co' later on is
unfortunate, but it may not be fatal.  The question is whether RCS
reads the state correctly afterwards, since these lines tend
to look like this in the ,v file:

   date   2003.05.23.00.17.53;   author jrandom;   state Exp;

If RCS doesn't read the state field, then we may have a bigger
problem.  We'll see...
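
To make the field layout concrete, here is a small sketch that pulls the
date, author and state out of a line shaped like the one above, tolerating
spaces inside the author field. It is an illustrative regular expression
only; it says nothing about how RCS or rcsparse actually tokenizes the file,
and parse_delta_line is a hypothetical helper name.

```python
import re

# Matches "date <d>; author <name>; state <s>;" with arbitrary spacing.
# The author group takes everything up to the next ';', so embedded
# spaces are kept; the state may be empty.
DELTA_LINE = re.compile(
    r'date\s+(?P<date>[\d.]+)\s*;\s*'
    r'author\s+(?P<author>[^;]*?)\s*;\s*'
    r'state\s*(?P<state>[^;\s]*)\s*;'
)


def parse_delta_line(line):
    """Return (date, author, state) from one delta header line."""
    match = DELTA_LINE.search(line)
    if match is None:
        raise ValueError("not a delta header line: %r" % (line,))
    return match.group('date'), match.group('author'), match.group('state')


if __name__ == '__main__':
    print(parse_delta_line(
        'date   2003.05.23.00.17.53;   author jrandom;   state Exp;'))
    print(parse_delta_line(
        'date   2003.05.23.00.17.53;   author Erik Huelsmann;   state Exp;'))
```

Because the author group stops at the first ';', the state field is still
found after a multi-word author in this sketch.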

-Karl


Re: cvs2svn.py fails converting repository with the attached file

Posted by David Summers <da...@summersoft.fay.ar.us>.
On Thu, 26 Jun 2003, mark benedetto king wrote:
> On Thu, Jun 26, 2003 at 06:36:10PM +0200, e.huelsmann@gmx.net wrote:
> > Hi,
> > 
> > Some time in the past I have committed files to my cvs repository with
> > spaces in the author name (ie USER="Erik Huelsmann") obviously, this was before I
> > started using unix, but the repository came with me to my new OSes. CVS nor
> > RCS choke on it, but cvs2svn.py does. It happens - amongst other files - with
> > the file attached to this mail. Should an issue be created for this?
> > 
> 
> Not if the following patch fixes it. :-)
> 
> rcsparse isn't technically part of svn, so I haven't committed
> this change. :-(
> 
> Let me know if this works.
> 
> --ben
> 
> 
> --- tools/cvs2svn/rcsparse/common.py	Sun Jun  8 18:08:27 2003
> +++ tools/cvs2svn/rcsparse/common.py	Thu Jun 26 13:10:43 2003
> @@ -153,11 +153,16 @@
>        timestamp = compat.timegm(tuple(date_fields))
>  
>        # Parse author
> -      semi, author, sym = self.ts.mget(3)
> +      sym = self.ts.get()
>        if sym != 'author':
>          raise RCSExpected(sym, 'author')
> -      if semi != ';':
> -        raise RCSExpected(semi, ';')
> +      author = ''
> +      while 1:
> +        token = self.ts.get()
> +        if token == ';':
> +          break
> +        author = author + token + ' '
> +      author = author[:-1]
>  
>        # Parse state
>        self.ts.match('state')
> 

The patch certainly fixes the abnormal termination of the script, but now 
I'm getting:

co: /s/uranus/projects/CVSRoot/BattleStorm/data/images/Software/BattleStorm_InstallShield_Exterior.bmp,v:11: missing ';' after 'author'
co aborted

which doesn't make me feel good. The error occurs on every RCS *,v file
whose author field has a space between two text tokens before the ';'
character.

I think something is still not being handled correctly.  Note that this
error comes from the RCS co command.  I see the same results on both
Red Hat 8.0 and Red Hat 9.0.

Ideas?

Thanks!

-- 
David Wayne Summers          "Linux: Because reboots are for hardware upgrades!"
david@summersoft.fay.ar.us   PGP Key: http://summersoft.fay.ar.us/~david/pgp.txt
PGP Key fingerprint =  C0 E0 4F 50 DD A9 B6 2B  60 A1 31 7E D2 28 6D A8 



Re: cvs2svn.py fails converting repository with the attached file

Posted by mark benedetto king <mb...@lowlatency.com>.
On Thu, Jun 26, 2003 at 06:36:10PM +0200, e.huelsmann@gmx.net wrote:
> Hi,
> 
> Some time in the past I have committed files to my cvs repository with
> spaces in the author name (ie USER="Erik Huelsmann") obviously, this was before I
> started using unix, but the repository came with me to my new OSes. CVS nor
> RCS choke on it, but cvs2svn.py does. It happens - amongst other files - with
> the file attached to this mail. Should an issue be created for this?
> 

Not if the following patch fixes it. :-)

rcsparse isn't technically part of svn, so I haven't committed
this change. :-(

Let me know if this works.

--ben


--- tools/cvs2svn/rcsparse/common.py	Sun Jun  8 18:08:27 2003
+++ tools/cvs2svn/rcsparse/common.py	Thu Jun 26 13:10:43 2003
@@ -153,11 +153,16 @@
       timestamp = compat.timegm(tuple(date_fields))
 
       # Parse author
-      semi, author, sym = self.ts.mget(3)
+      sym = self.ts.get()
       if sym != 'author':
         raise RCSExpected(sym, 'author')
-      if semi != ';':
-        raise RCSExpected(semi, ';')
+      author = ''
+      while 1:
+        token = self.ts.get()
+        if token == ';':
+          break
+        author = author + token + ' '
+      author = author[:-1]
 
       # Parse state
       self.ts.match('state')


Re: cvs2svn.py fails converting repository with the attached file

Posted by David Summers <da...@summersoft.fay.ar.us>.
On Thu, 26 Jun 2003 e.huelsmann@gmx.net wrote:

> Some time in the past I have committed files to my cvs repository with
> spaces in the author name (ie USER="Erik Huelsmann") obviously, this was before I
> started using unix, but the repository came with me to my new OSes. CVS nor
> RCS choke on it, but cvs2svn.py does. It happens - amongst other files - with
> the file attached to this mail. Should an issue be created for this?
>

This happens to us too, on our Windows 2000 server with WinCVS as the client.

We have a 4 GB CVS repository covering some 3+ years of commits, and many of
them have spaces in the user name.   It would be nice if cvs2svn.py accepted
this, even if it is not *supposed* to be that way in the RCS/CVS *,v files.

I have attempted to write a Perl script that finds all the names with
spaces and takes the spaces out, but I've not yet figured out how to do it
properly. I could be close; I've just not had much time to work on it
lately.
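
A rough sketch of that clean-up idea, written in Python rather than Perl.
It assumes the date, author and state fields sit on one line, as they
usually do in *,v files, and it works on bytes because the files may hold
binary data. squash_author_spaces is a hypothetical helper, and the pattern
could in principle also match a look-alike line inside a log message or file
text, so run anything like this on a copy of the repository only.

```python
import re

# Matches the start of a delta header line such as
#   date 2003.05.23.00.17.53;  author Erik Huelsmann;  state Exp;
# Group 1 is everything up to the name, group 2 the name itself,
# group 3 the terminating ';' with any whitespace before it.
AUTHOR_FIELD = re.compile(
    rb'^(date\s+[\d.]+;\s*author\s+)([^;@\n]*?)(\s*;)', re.MULTILINE)


def squash_author_spaces(data, replacement=b''):
    """Return data with whitespace inside author names replaced."""
    def fix(match):
        name = re.sub(rb'\s+', replacement, match.group(2))
        return match.group(1) + name + match.group(3)
    return AUTHOR_FIELD.sub(fix, data)


if __name__ == '__main__':
    sample = b'date\t2003.05.23.00.17.53;\tauthor Erik Huelsmann;\tstate Exp;\n'
    print(squash_author_spaces(sample))
    print(squash_author_spaces(sample, replacement=b'_'))
```

Passing replacement=b'_' keeps the name parts visibly separate instead of
running them together.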

-- 
David Wayne Summers          "Linux: Because reboots are for hardware upgrades!"
david@summersoft.fay.ar.us   PGP Key: http://summersoft.fay.ar.us/~david/pgp.txt
PGP Key fingerprint =  C0 E0 4F 50 DD A9 B6 2B  60 A1 31 7E D2 28 6D A8 

