Posted to users@subversion.apache.org by "Delbert D. Franz" <dd...@sonic.net> on 2004/02/20 23:00:31 UTC

Strange problem with archive/extraction of working copy

I need to create an archive of a working copy that is about 
1 GB in size, including files under revision control and many 
that are not.  The bandwidth on my server is too small
(about 130 Kbits/s) for my user base to do a checkout, 
and afterwards they would still have to merge in the unversioned files. 

The working copy in question is on a Windows 2000 Pro 
operating system.  I tried three different archive 
products: pkzip, winace, and info-zip zip.  All had the 
same result: every file in the extracted working copy 
was shown as changed by Subversion.   I am using 
version 0.37 on my server (Linux) and on the clients
(W2K).  The end-of-line handling is perfect.  I have checked
that several times. 

Here is what I did:

1. Got an external diff program for W2K, WinMerge, and did 
a diff:

   1.1 between working files in the working copy on the
       source drive and the extracted copy on a test
       drive.  That is, I had two copies of the working
       directory: one that showed everything up to date
       and the second showing all files under version
       control as changed.

   1.2 between a working file and its base text file
       in the hidden svn dir.

   1.3 between the entire working copies on both drives,
       including the svn files and DB files.
         

2.  No diff using WinMerge showed a single byte of difference
     between any pair of files!
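
A comparison of this kind can be sketched in Python.  This is only an
illustration and not what WinMerge does: the function name, the
directory arguments and the tolerance parameter are hypothetical.  It
compares each pair of files byte by byte and, unlike a content-only
diff, also reports files whose modification times differ.

```python
import filecmp
import os


def compare_trees(src_root, dst_root, tolerance=0.0):
    """Compare two directory trees file by file.

    Returns three lists of paths relative to src_root:
    files whose bytes differ, files whose modification times differ by
    more than `tolerance` seconds, and files missing from dst_root.
    """
    content_diffs, mtime_diffs, missing = [], [], []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.isfile(dst):
                missing.append(rel)
                continue
            # shallow=False forces a byte-by-byte comparison.
            if not filecmp.cmp(src, dst, shallow=False):
                content_diffs.append(rel)
            # Timestamps are metadata that a content diff never looks at.
            if abs(os.path.getmtime(src) - os.path.getmtime(dst)) > tolerance:
                mtime_diffs.append(rel)
    return content_diffs, mtime_diffs, missing
```

Two trees can come back with an empty list of content differences and
still differ in their timestamps.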

This was most puzzling.  Our son, a computer science major
and working in the field for about 6 years, suggested that I
try this on drives formatted with FAT32 instead of NTFS,
the default format used by W2K during an install. 

I had the extra space so I created FAT32 partitions that
were copies of the NTFS partitions.  With two drives 
of each I could then archive a clean working copy on
one drive and extract it on two other drives: one 
formatted with NTFS and the other with FAT32.


Here are the results:


Archive created     Extracted archive on:
on:                 FAT32 drive    NTFS drive
-----------------   -----------    ----------
FAT32               OK             OK
NTFS                NOK            NOK

OK  -- files under version control appear as up to date, as they should
NOK -- all files under version control are shown as changed, as they should not!

A working copy on an NTFS drive cannot be archived and extracted
anywhere and still be correct!!  This outcome was strange because I
would not expect the file system format to make such a difference.   

I also checked what happened with a drag-and-drop copy using
Explorer.   The results were:


Source drive        Destination drive for copy
for copy            FAT32          NTFS
-----------------   -----------    ----------
FAT32               OK             OK
NTFS                NOK            OK

In each case a small test repository was checked out to 
the source drive.  Note that if the source drive is NTFS
then even a drag-and-drop copy with Explorer fails to
create a correct working copy on a FAT32 drive. 

My current work-around is to only use FAT32-formatted drives
for working copies on my system.  Then I can send archives to 
anyone and they will extract properly on a Windows OS machine. 
However, some subtle problem with files on an NTFS drive
is leading Subversion to conclude that files have been changed 
when in fact an external diff program shows that not one byte has changed.  

I have placed a small repository on my server that is 
about 40 KB in size and involves a few directories and
files.  It can be checked out at :

http://www.iqdotdt.com/svn/testsvn/trunk/testsvn


if anyone wants to try to duplicate this strange result. 
This repository is world readable.

                                                Thanks
                                                      Delbert
                                                      ddf@lka.com

        


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Strange problem with archive/extraction of working copy

Posted by Francois Beausoleil <fb...@users.sourceforge.net>.
Hi,

I would like to chime in.  I've tested the procedure you've outlined
below, and here are my results:

Checkout on     Copy Method     Copy To     Result
--------------------------------------------------
NTFS            XXCOPY[1]       NTFS        OK
                                FAT32       NOK
                Explorer        NTFS        OK
                                FAT32       NOK

FAT32           XXCOPY[1]       NTFS        OK
                                FAT32       OK
                Explorer        NTFS        OK
                                FAT32       OK

[1] http://www.xxcopy.com/, using the /clone command-line option

Didn't use XCOPY, because it horribly munges the short filenames, and
then Subversion complains that the directory is not a WC.

So, it looks like this is reproducible.  I use Win2K SP4, French
Canadian and Subversion 0.37.0.  It would be interesting to get WinXP and
Win9x machines to try that too.

Please note that I haven't used any archiving program.  I simply did a
regular copy.

Bye,
François

On Fri, 20 Feb 2004 15:00:31 -0800, "Delbert D. Franz" <dd...@sonic.net>
said:
[snip]
> 
> Archive created     Extracted archive on:
> on:                 FAT32 drive    NTFS drive
> -----------------   -----------    ----------
> FAT32               OK             OK
> NTFS                NOK            NOK
> 
> OK  -- files under version control appear as up to date, as they should
> NOK -- all files under version control are shown as changed, as they
>        should not!
> 
[snip]



Re: Strange problem with archive/extraction of working copy

Posted by Philip Martin <ph...@codematters.co.uk>.
"Delbert D. Franz" <dd...@sonic.net> writes:

> Thanks for creating the bug issue.  In the meantime here is
> what I will be doing:

Or you could:
  - import without setting svn:eol-style
  - check out the newly imported tree
  - run a recursive propset
      $ svn ps -R svn:eol-style native wc
That should work on Windows, assuming there aren't any other eol bugs.

-- 
Philip Martin


Re: Strange problem with archive/extraction of working copy

Posted by "Delbert D. Franz" <dd...@sonic.net>.
Thanks for creating the bug issue.  In the meantime here is
what I will be doing:

1. Do a svnadmin dump of my repository.  I have checked
and this yields a file in which each line of a text file contains
an extra character at the end, that is,  the spurious CR.  

2. Scan the file and strip off the spurious CRs everywhere
they appear.   A script file can do this with a bit of thought. 

3. Then reload the repository with the "cleaned" file.  This
should give me a "clean" repository.  I'll do this after I 
compile for 1.0.0 on the server. 

4.  Then checkout working copies to both Linux and W2K  clients
and check if they look OK.    If so,  make archives
of the W2K working copy and send them to my 
project team members.   I'm the only one on the team right
now using Linux and that will remain so for a long time. 

Having the correct line endings in the repository makes
me feel a bit more confident about the behavior of the 
repository when we all start pounding away at it in a 
week or two. 

My plan above seems simpler than moving the project to Linux,
doing a dos2unix, and then importing from Linux.   The output of
the dump command clearly shows every line that has the wrong
ending character in it.  
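
For comparison, the dos2unix route on a plain directory tree could look
roughly like the Python sketch below.  It is an illustration under
stated assumptions, not a tested migration tool: the function names are
hypothetical, the NUL-byte check is only a heuristic for spotting
binary files, and it works on ordinary files before an import.  It is
not meant for a dump file, which also records content lengths and
checksums that would presumably no longer match once characters are
removed.

```python
import os


def crlf_to_lf(data):
    """Replace every CRLF pair in a bytes object with a single LF."""
    return data.replace(b"\r\n", b"\n")


def looks_binary(data):
    # Heuristic only: treat anything containing a NUL byte as binary.
    return b"\0" in data


def convert_tree(root, skip_dirs=(".svn",)):
    """Rewrite text files under root with LF endings; return changed paths."""
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune administrative directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()
            if looks_binary(data):
                continue
            converted = crlf_to_lf(data)
            if converted != data:
                with open(path, "wb") as fh:
                    fh.write(converted)
                changed.append(path)
    return changed
```

Running it on a copy of the tree first, and checking the returned list
of changed paths, would be the cautious way to try it.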

  Thanks for your help.  Let's hope the bug can be fixed soon. 

                                                           Delbert

                                               
On Monday 23 February 2004 12:07 pm, you wrote:
> "Delbert D. Franz" <dd...@sonic.net> writes:
> 
> > Thanks for your response.  It seems that the behavior 
> > in question is currently shrouded in a bit of mystery. 
> >
> > I did some more testing on my system.   Moved my test case
> > to Linux, I run W2K under Vmware on Linux, so that was easy. 
> > The repository is on a Linux box already.  
> >
> > 1. Created a  test dir on Linux that matches the one on W2K 
> > but with the correct line endings for Linux. 
> >
> > 2. Used svn import to create a new test case: testsvn2 on 
> > the server.  Checked the db strings, and yes, they only have a LF as
> > line endings.  This differs from the result when importing from W2K
> > where the db strings have a CRLF as the eol signal!  This is not the
> > expected result.
> 
> Yes, there is a very obvious bug in the auto-props/import code.
> Importing files with non-LF line endings won't produce the right
> result if svn:eol-style is set.  This is a general problem, not
> Windows specific.  I have raised issue 1756.
> 
> -- 
> Philip Martin
> 



Re: Strange problem with archive/extraction of working copy

Posted by Philip Martin <ph...@codematters.co.uk>.
"Delbert D. Franz" <dd...@sonic.net> writes:

> Thanks for your response.  It seems that the behavior 
> in question is currently shrouded in a bit of mystery. 
>
> I did some more testing on my system.   Moved my test case
> to Linux, I run W2K under Vmware on Linux, so that was easy. 
> The repository is on a Linux box already.  
>
> 1. Created a  test dir on Linux that matches the one on W2K 
> but with the correct line endings for Linux. 
>
> 2. Used svn import to create a new test case: testsvn2 on 
> the server.  Checked the db strings, and yes, they only have a LF as
> line endings.  This differs from the result when importing from W2K
> where the db strings have a CRLF as the eol signal!  This is not the
> expected result.

Yes, there is a very obvious bug in the auto-props/import code.
Importing files with non-LF line endings won't produce the right
result if svn:eol-style is set.  This is a general problem, not
Windows specific.  I have raised issue 1756.

-- 
Philip Martin


Re: Strange problem with archive/extraction of working copy

Posted by "Delbert D. Franz" <dd...@sonic.net>.
Thanks for your response.  It seems that the behavior 
in question is currently shrouded in a bit of mystery. 

I did some more testing on my system.   I moved my test case
to Linux; I run W2K under VMware on Linux, so that was easy. 
The repository is on a Linux box already.  

1. Created a  test dir on Linux that matches the one on W2K 
but with the correct line endings for Linux. 

2. Used svn import to create a new test case: testsvn2 on 
the server.  Checked the db strings, and yes, they only have
a LF as line endings.  This differs from the result when importing from
W2K where the db strings have a CRLF as the eol signal!
This is not the expected result.

3. Did a checkout of testsvn2 to W2K and got the correct
line endings. 

Page 121 of the Subversion Book, Revision 8770, states that 
the line endings in the repository are supposed to be LF. 

Therefore it seems to me there is a bug of some sort 
in the W2K Subversion that retains the CRLF on import. 
Or perhaps it is a bug in the server on Linux that does not 
strip the CR when an import comes from a Windows client?

However, except for the archive problem, updates, commits,
etc. seem to work correctly with a repository that contains
CRLF  line endings.  This has been tested under W2K and
Linux.  I am running with the config-file option
auto-props as yes and then setting eol-style: native on every
ASCII file.  

Can anyone confirm if this behavior is:

1. A bug,

2. An undocumented "feature",

3. A mistake on my part in not setting some option
in a config file on the client or the server?

Also, I could transfer my main project from W2K to 
Linux, change all the line endings with dos2unix, 
then import to create the repository on the server. 
Would this provide greater insurance against problems
later?  At least it gets the correct line endings in the
db strings for the repository.

Otherwise Subversion is working well.  I'll be updating
to the 1.0.0 release this week but I expect the above
behavior to remain the same. 

                                                Delbert



On Saturday 21 February 2004 05:24 am, you wrote:
> "Delbert D. Franz" <dd...@sonic.net> writes:
> 
> > "The problem is that the text-base in the repository has CRLF line endings.
> >   This is an error."
> >
> > I'm not clear on what this statement means:  Did I do something 
> > incorrect in setting up the repository or is there an error in Subversion?  
> 
> As I said:
> 
> >> It works most of the time, but
> >> this problem with incorrect line endings in the repository has been
> >> reported before.  Nobody with a Windows box has tracked it down, I
> >> don't know if they ever tried.
> 
> You may have done something wrong, it may be a Subversion bug.  I have
> only ever seen this reported by Windows users.
> 
> -- 
> Philip Martin
> 



Re: Strange problem with archive/extraction of working copy

Posted by Philip Martin <ph...@codematters.co.uk>.
"Delbert D. Franz" <dd...@sonic.net> writes:

> "The problem is that the text-base in the repository has CRLF line endings.
>   This is an error."
>
> I'm not clear on what this statement means:  Did I do something 
> incorrect in setting up the repository or is there an error in Subversion?  

As I said:

>> It works most of the time, but
>> this problem with incorrect line endings in the repository has been
>> reported before.  Nobody with a Windows box has tracked it down, I
>> don't know if they ever tried.

You may have done something wrong, it may be a Subversion bug.  I have
only ever seen this reported by Windows users.

-- 
Philip Martin


Re: Strange problem with archive/extraction of working copy

Posted by "Delbert D. Franz" <dd...@sonic.net>.
"The problem is that the text-base in the repository has CRLF line endings.
  This is an error."

I'm not clear on what this statement means:  Did I do something 
incorrect in setting up the repository or is there an error in Subversion?  

I know how to find the text base in a working copy, under .svn,
but on the repository itself, I don't find a text-base but only
a strings file.  This has a mix of printing and non-printing
ASCII and it does show, when viewed in hex, that there
is a CRLF for each line.   
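
A quick way to check line endings without reading a hex dump is to
count them.  The Python sketch below is a hypothetical helper, not part
of Subversion; it is most meaningful on a single text file such as a
working file or its text-base under .svn.  On the repository strings
file, which mixes in non-text data, the counts would only be a rough
indication.

```python
def count_line_endings(data):
    """Count CRLF, bare LF and bare CR sequences in a bytes object."""
    crlf = data.count(b"\r\n")
    # Every CRLF contains one LF and one CR, so subtract those out.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"CRLF": crlf, "LF": lf, "CR": cr}


def count_line_endings_in_file(path):
    """Read a file in binary mode and count its line endings."""
    with open(path, "rb") as fh:
        return count_line_endings(fh.read())
```

A file stored with LF endings only should report zero for both CRLF
and CR.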

I have a repository of about 1.5 GB about to be used by several 
people, and it is set up in the same way that I did testsvn.  Thus
it is important to me to make a correction now if it is 
possible.
                                           Delbert Franz

On Friday 20 February 2004 06:28 pm, you wrote:
> "Delbert D. Franz" <dd...@sonic.net> writes:
> 
> > However, some subtle problem with files on an NTFS drive
> > is leading Subversion to conclude that files have been changed 
> > when in fact an external diff program shows that not one byte has changed.  
> >
> > I have placed a small repository on my server that is 
> > about 40 KB in size and involves a few directories and
> > files.  It can be checked out at :
> >
> > http://www.iqdotdt.com/svn/testsvn/trunk/testsvn
> 
> The problem is that the text-base in the repository has CRLF line
> endings.  This is an error, the text-base is supposed to have LF line
> endings.
> 
> What happens is that when you checkout a working copy it sets the
> working copy timestamp to that of the text base and then uses the
> matching timestamps to avoid doing a byte-by-byte comparison.  Thus
> the newly checked out working copy appears to be unmodified.
> 
> When you archive/extract using your broken tools they modify the
> working copy timestamps.  This causes the timestamp comparison to fail
> and Subversion falls back on a byte-by-byte comparison, which reveals
> the incorrect eols in the text-base.  Breaking the timestamps like
> this will also make Subversion much slower.
> 
> If you modify the timestamp of one of the files in the original
> "unmodified" working copy then it too will show up as modified.  If
> you commit these "modified" files it will correct the line endings in
> the repository and the problem will go away.
> 
> I never liked eol-conversion.  Nobody knows exactly how it should work
> in some corner cases, and I suspect nobody can say how it currently
> works without looking at the code[1].  It works most of the time, but
> this problem with incorrect line endings in the repository has been
> reported before.  Nobody with a Windows box has tracked it down, I
> don't know if they ever tried.
> 
> [1] The corner case: during an update with locally modified files a
> three-way merge is attempted.  There are three files involved: the
> repository version after the update, the working file and the working
> file's base revision.  All three may have different settings for
> svn:eol-style.  Should the three-way merge be attempted in repository
> format or working copy format?  If in working copy format, should each
> file use its own eol-style, or should they use a common eol-style?
> If they use a common eol-style, which one?  What does the current
> implementation do?
> 
> -- 
> Philip Martin
> 



Re: Strange problem with archive/extraction of working copy

Posted by Philip Martin <ph...@codematters.co.uk>.
"Delbert D. Franz" <dd...@sonic.net> writes:

> However, some subtle problem with files on an NTFS drive
> is leading Subversion to conclude that files have been changed 
> when in fact an external diff program shows that not one byte has changed.  
>
> I have placed a small repository on my server that is 
> about 40 KB in size and involves a few directories and
> files.  It can be checked out at :
>
> http://www.iqdotdt.com/svn/testsvn/trunk/testsvn

The problem is that the text-base in the repository has CRLF line
endings.  This is an error, the text-base is supposed to have LF line
endings.

What happens is that when you checkout a working copy it sets the
working copy timestamp to that of the text base and then uses the
matching timestamps to avoid doing a byte-by-byte comparison.  Thus
the newly checked out working copy appears to be unmodified.

When you archive/extract using your broken tools they modify the
working copy timestamps.  This causes the timestamp comparison to fail
and Subversion falls back on a byte-by-byte comparison, which reveals
the incorrect eols in the text-base.  Breaking the timestamps like
this will also make Subversion much slower.

If you modify the timestamp of one of the files in the original
"unmodified" working copy then it too will show up as modified.  If
you commit these "modified" files it will correct the line endings in
the repository and the problem will go away.
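
The check described above can be modelled in a few lines of Python.
This is a toy model and not Subversion's actual code: the function
names and arguments are hypothetical, and svn:eol-style native on a
Windows client is assumed, so the working file has CRLF endings and
its repository form has LF endings.

```python
def to_repository_form(working_bytes):
    """Translate working-file line endings (CRLF assumed) to LF."""
    return working_bytes.replace(b"\r\n", b"\n")


def is_modified(working_bytes, working_mtime, recorded_mtime, text_base_bytes):
    """Decide whether a working file counts as locally modified."""
    # Fast path: matching timestamps are taken to mean "unchanged",
    # so the contents are never examined.
    if working_mtime == recorded_mtime:
        return False
    # Slow path: compare contents after translating to repository form.
    return to_repository_form(working_bytes) != text_base_bytes
```

In this model, a text-base that wrongly contains CRLF goes unnoticed
while the timestamps match and shows up as a modification as soon as
they differ.  With a correct LF text-base, a changed timestamp alone
does not make the file appear modified.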

I never liked eol-conversion.  Nobody knows exactly how it should work
in some corner cases, and I suspect nobody can say how it currently
works without looking at the code[1].  It works most of the time, but
this problem with incorrect line endings in the repository has been
reported before.  Nobody with a Windows box has tracked it down, I
don't know if they ever tried.

[1] The corner case: during an update with locally modified files a
three-way merge is attempted.  There are three files involved: the
repository version after the update, the working file and the working
file's base revision.  All three may have different settings for
svn:eol-style.  Should the three-way merge be attempted in repository
format or working copy format?  If in working copy format, should each
file use its own eol-style, or should they use a common eol-style?
If they use a common eol-style, which one?  What does the current
implementation do?

-- 
Philip Martin
