Posted to users@subversion.apache.org by Johan Holmberg <ho...@iar.se> on 2012/02/03 12:47:57 UTC

Strange diffs after rsync copying of working copy

Hi!

I get some strange reports of differences from "svn diff" and "svn 
status" after doing the following operations:

   $ svn co http://somehost/somepath proj
   $ rsync -a proj proj-rsync-copy
   $
   $ svn status proj-rsync-copy
   ... reports modified files ...
   $ svn status proj                             # no modified files
   $
   $ svn diff proj-rsync-copy
   ... strange differences reported here ...
   $ svn diff proj                             # no differences
   $
   $ diff -r proj proj-rsync-copy           # reports no differences
   $

The differences reported by "svn diff" are of two kinds: one is related 
to keyword expansion, and the other one to the line-style of files. The 
problem only occurs for a small subset of all files in the project, and 
I suspect that they share some "peculiarity" in their history in the 
Subversion archive (but I haven't been able to figure out what).

So I have two directory trees that are exactly equal (byte for byte), 
but still treated differently by "svn". The only differences I can think 
of are the timestamps of files in the trees. It seems that rsync only 
copies "whole seconds" when creating the target directory tree. I use an 
ext4 file system, and my files are normally stored with sub-second 
precision (as can be seen with ls --full-time).

To verify this hypothesis, I also did a copy with "cp -a" of my working 
copy, and then I got NO differences in the new working copy. And GNU cp 
actually copies the sub-second part of timestamps.
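
One way to check the sub-second hypothesis directly is to compare the full-precision modification times of the same file in both trees. The following is only a minimal sketch, assuming GNU coreutils `stat` (its `-c` format option); the helper names `full_mtime` and `same_mtime` are invented here for illustration.

```shell
# Print the modification time of a file with full precision
# (GNU stat; the %y format includes the nanosecond part).
full_mtime() {
    stat -c '%y' -- "$1"
}

# Succeed only if both files have exactly the same mtime,
# including the sub-second part.
same_mtime() {
    [ "$(full_mtime "$1")" = "$(full_mtime "$2")" ]
}
```

For example, `same_mtime proj/foo.txt copy/foo.txt || echo "timestamps differ"` (both paths are placeholders) would report a file whose timestamp was truncated by the copy.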

Is Subversion really sensitive to such timestamp differences? And for 
which files? Since rsync is used as a backup tool, it can have quite 
undesirable consequences.

I'm using:

Ubuntu 10.10, with:
svn, version 1.6.12 (r955767)
rsync  version 3.0.7  protocol version 30
ext4 file system

Regards,
/Johan Holmberg



Re: Strange diffs after rsync copying of working copy

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Feb 03, 2012 at 01:58:08PM +0100, Johan Holmberg wrote:
> So I don't think saying "If that timestamp differs from the one on
> disk, the file is considered modified." is correct. "svn" does not
> work like that for me at least.

Right. I misremembered how this works.
Thanks to Philip for correcting me.

Re: Strange diffs after rsync copying of working copy

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Feb 03, 2012 at 04:23:35PM +0100, Johan Holmberg wrote:
> I would like to try it with Subversion 1.7. How do I get a binary
> for Ubuntu 10.10 or 11.10? I couldn't find any precompiled packages
> easily. Or do I have to build it from source?

I don't think Ubuntu is shipping Subversion 1.7 packages yet.
Please consider reporting the issue to Debian/Ubuntu package
maintainers.

I believe there are some PPA repositories on launchpad.net that
have Subversion 1.7 packages.

Re: Strange diffs after rsync copying of working copy

Posted by Johan Holmberg <ho...@iar.se>.
On 02/03/2012 03:23 PM, Stefan Sperling wrote:
> On Fri, Feb 03, 2012 at 03:04:54PM +0100, Johan Holmberg wrote:
>> On 02/03/2012 02:50 PM, Philip Martin wrote:
>>> I have now done some further experiments. I just issued commands like this:
>>>
>>>      $ svn status              # no modified files reported
>>> Subversion is not doing a full text comparison because the timestamps
>>> match.  So the difference that is present is not reported.
>> But this occurs right after a fresh "svn checkout". There can be no
>> differences.
> You said these differences were always due to keyword expansion
> or eol-style. Can you describe in more detail what the related
> svn: property settings are, and how file content differs from
> what you expect?

See my previous mail (the reply to Philip Martin).


>
>> To be really sure, I saved the file before/after my "touch + svn
>> status". The working copy file is identical before and after. But
>> first it is reported as unmodified and then as modified.
>>
>> Seems like a bug.
> It may well be. But we've stopped fixing bugs in the 1.6 working copy
> implementation long ago. 1.6 receives fixes only for critical data
> corruption issues for which no workaround exists, and security fixes.
> Does the issue exist in Subversion 1.7?
>

I would like to try it with Subversion 1.7. How do I get a binary for 
Ubuntu 10.10 or 11.10? I couldn't find any precompiled packages easily. 
Or do I have to build it from source?

/Johan Holmberg






Re: Strange diffs after rsync copying of working copy

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Feb 03, 2012 at 03:04:54PM +0100, Johan Holmberg wrote:
> On 02/03/2012 02:50 PM, Philip Martin wrote:
> >
> >I have now done some further experiments. I just issued commands like this:
> >
> >     $ svn status              # no modified files reported
> >Subversion is not doing a full text comparison because the timestamps
> >match.  So the difference that is present is not reported.
> 
> But this occurs right after a fresh "svn checkout". There can be no
> differences.

You said these differences were always due to keyword expansion
or eol-style. Can you describe in more detail what the related
svn: property settings are, and how file content differs from
what you expect?

> To be really sure, I saved the file before/after my "touch + svn
> status". The working copy file is identical before and after. But
> first it is reported as unmodified and then as modified.
> 
> Seems like a bug.

It may well be. But we've stopped fixing bugs in the 1.6 working copy
implementation long ago. 1.6 receives fixes only for critical data
corruption issues for which no workaround exists, and security fixes.
Does the issue exist in Subversion 1.7?

Re: Strange diffs after rsync copying of working copy

Posted by Ryan Schmidt <su...@ryandesign.com>.
On Feb 3, 2012, at 11:23, Johan Holmberg wrote:

> But even if I don't understand exactly why I had these problems, they are solved now (as I wrote in the previous mail).

The Subversion client is responsible for normalizing keywords and eol styles before sending the changes to the repository to be committed. If you used a broken Subversion client (such as git-svn), unnormalized content could have been written to the repository. To avoid this, you could write a pre-commit hook to prevent unnormalized content. Hopefully a future version of the Subversion server will do this for you, but at present, it does not.

http://subversion.tigris.org/issues/show_bug.cgi?id=4065
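
The content check that such a pre-commit hook would need can be sketched roughly as follows. This is an illustration only, not a complete hook: the function name `looks_normalized` is made up, the keyword list is a subset, and the check only makes sense for files that have `svn:eol-style native` and `svn:keywords` set. Feeding it the transaction's file content (for example via `svnlook cat`) and checking the properties is left out.

```shell
# Succeed if FILE looks normalized for the repository: no CR characters
# and no expanded (or colon-terminated) keyword such as "$Id:$".
# Assumption: the file is meant to be stored with LF line endings and
# unexpanded keywords.
looks_normalized() {
    cr=$(printf '\r')
    # Any carriage return means the line endings were not normalized.
    if grep -q "$cr" -- "$1"; then
        return 1
    fi
    # A keyword followed by a colon was not collapsed to its bare form.
    if grep -Eq '\$(Id|Rev|Revision|Date|Author|HeadURL|URL):[^$]*\$' -- "$1"; then
        return 1
    fi
    return 0
}
```

A hook built on this would reject the commit whenever the function fails for a file that carries the relevant properties.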




Re: Strange diffs after rsync copying of working copy

Posted by Johan Holmberg <ho...@iar.se>.
On 02/03/2012 05:57 PM, Stefan Sperling wrote:
> On Fri, Feb 03, 2012 at 05:33:11PM +0100, Johan Holmberg wrote:
>> But here my working copy files are already UNIX text files (no \r\n
>> line endings).
> Your working files might already be normalised according to the
> setting of svn:eol-style, and therefore not show mixed eols.
>
> Philip meant the files in the repository, reflected by the file's
> text base in the .svn directory (inside .svn/text-base/ for 1.6).
>

I just looked in the "text-base" directory (of a working copy before the 
"fix") and there the files use \r\n. None of the files have mixed eols. 
But the working copy still has the problematic behaviour described earlier.

But even if I don't understand exactly why I had these problems, they 
are solved now (as I wrote in the previous mail).

/Johan Holmberg



Re: Strange diffs after rsync copying of working copy

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Feb 03, 2012 at 05:33:11PM +0100, Johan Holmberg wrote:
> But here my working copy files are already UNIX text files (no \r\n
> line endings). 

Your working files might already be normalised according to the
setting of svn:eol-style, and therefore not show mixed eols.

Philip meant the files in the repository, reflected by the file's
text base in the .svn directory (inside .svn/text-base/ for 1.6).
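
To see what kind of line endings a text-base file actually contains, something like the sketch below could be used. It assumes a POSIX shell with `grep` and `wc`; the function name `eol_kind` is invented for this example, and the file name in the usage line is a placeholder following the 1.6 layout mentioned above.

```shell
# Classify the line endings of a file as "lf", "crlf" or "mixed".
eol_kind() {
    cr=$(printf '\r')
    # Number of newline-terminated lines in the file.
    total=$(( $(wc -l < "$1") ))
    # Number of lines whose last character before the newline is CR.
    # grep -c exits non-zero when the count is 0, hence "|| true".
    crlf=$(grep -c "${cr}\$" -- "$1" || true)
    if [ "$crlf" -eq 0 ]; then
        echo lf
    elif [ "$crlf" -eq "$total" ]; then
        echo crlf
    else
        echo mixed
    fi
}
```

For example: `eol_kind proj/.svn/text-base/foo.txt.svn-base`. A result of `crlf` or `mixed` for a file with `svn:eol-style native` would match the situation described in this thread.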

Re: Strange diffs after rsync copying of working copy

Posted by Johan Holmberg <ho...@iar.se>.
On 02/03/2012 05:54 PM, Philip Martin wrote:
> Johan Holmberg<ho...@iar.se>  writes:
>
>>> A similar problem here, the file in the repository should have \n line
>>> endings but the file has some \r\n line endings.  Once again committing
>>> the file should fix the problem (although mixed line endings may cause
>>> the commit to fail and require you to make them consistent before
>>> committing).
>> But here my working copy files are already UNIX text files (no \r\n
>> line endings). And Subversion refuses to commit unchanged
>> files.
> Are you trying to commit when status shows nothing (that won't work) or
> when status shows the files as modified? I'd expect the latter to work.
>

So this should work:

   $ svn co http://someserver/somedir proj
   $ touch proj/*
   $ svn ci -m"fixed line ending problems" proj

.... and it actually worked.

I think I have solved my immediate problems now.

Thanks Philip and Stefan.

/Johan Holmberg




Re: Strange diffs after rsync copying of working copy

Posted by Philip Martin <ph...@wandisco.com>.
Johan Holmberg <ho...@iar.se> writes:

>> A similar problem here, the file in the repository should have \n line
>> endings but the file has some \r\n line endings.  Once again committing
>> the file should fix the problem (although mixed line endings may cause
>> the commit to fail and require you to make them consistent before
>> committing).
>
> But here my working copy files are already UNIX text files (no \r\n
> line endings). And Subversion refuses to commit unchanged
> files.

Are you trying to commit when status shows nothing (that won't work) or
when status shows the files as modified? I'd expect the latter to work.

> Perhaps I should:
>
> - temporarily remove the "svn:eol-style native" settings + commit
> - then remove any \r\n + commit
> - finally add back the "svn:eol-style native" attribute
>
> Or is there some easier way? (this three-step process seemed to work
> for the first file I tried).

Add a blank line?  You could use 'svnmucc put' but doing that is the
sort of thing that causes the problem in the first place.

-- 
Philip

Re: Strange diffs after rsync copying of working copy

Posted by Johan Holmberg <ho...@iar.se>.
On 02/03/2012 04:57 PM, Philip Martin wrote:
> Johan Holmberg<ho...@iar.se>  writes:
>
>> Yes, in one case I get a difference for the $Id$ line (from "svn diff"):
>>
>>    -$Id:$
>>    +$Id$
> OK. I think the file in the repository has '$Id:$' instead of '$Id$' for
> the keyword.  Committing the file should fix the problem.

I edited the file to actually contain $Id$ (removing the previous 
expansion present in the working copy). That seems to have solved the 
problem for that file.


>> and in the other files the "svn diff" shows a difference in line
>> endings (Windows vs. UNIX). But the files have svn:eol-style native
>> set, and appear as normal UNIX text files in the working copies.
> A similar problem here, the file in the repository should have \n line
> endings but the file has some \r\n line endings.  Once again committing
> the file should fix the problem (although mixed line endings may cause
> the commit to fail and require you to make them consistent before
> committing).

But here my working copy files are already UNIX text files (no \r\n line 
endings). And Subversion refuses to commit unchanged files. Perhaps I
should:

- temporarily remove the "svn:eol-style native" settings + commit
- then remove any \r\n + commit
- finally add back the "svn:eol-style native" attribute

Or is there some easier way? (this three-step process seemed to work for 
the first file I tried).

/Johan Holmberg







Re: Strange diffs after rsync copying of working copy

Posted by Philip Martin <ph...@wandisco.com>.
Johan Holmberg <ho...@iar.se> writes:

> Yes, in one case I get a difference for the $Id$ line (from "svn diff"):
>
>   -$Id:$
>   +$Id$

OK. I think the file in the repository has '$Id:$' instead of '$Id$' for
the keyword.  Committing the file should fix the problem.

> and in the other files the "svn diff" shows a difference in line
> endings (Windows vs. UNIX). But the files have svn:eol-style native
> set, and appear as normal UNIX text files in the working copies.

A similar problem here, the file in the repository should have \n line
endings but the file has some \r\n line endings.  Once again committing
the file should fix the problem (although mixed line endings may cause
the commit to fail and require you to make them consistent before
committing).

-- 
Philip

Re: Strange diffs after rsync copying of working copy

Posted by Johan Holmberg <ho...@iar.se>.
On 02/03/2012 03:22 PM, Philip Martin wrote:
> Johan Holmberg<ho...@iar.se>  writes:
>
>> On 02/03/2012 02:50 PM, Philip Martin wrote:
>>> I have now done some further experiments. I just issued commands like this:
>>>
>>>       $ svn status              # no modified files reported
>>> Subversion is not doing a full text comparison because the timestamps
>>> match.  So the difference that is present is not reported.
>> But this occurs right after a fresh "svn checkout". There can be no
>> differences.
> If a fresh checkout really produces a working file with the wrong
> contents then that is a checkout bug, but it is nothing to do with
> status.  What sort of difference is present?  Is it something to do with
> svn:keywords or svn:eol-style?

Yes, in one case I get a difference for the $Id$ line (from "svn diff"):

   -$Id:$
   +$Id$

and in the other files the "svn diff" shows a difference in line endings
(Windows vs. UNIX). But the files have svn:eol-style native set, and
appear as normal UNIX text files in the working copies.

>
>> To be really sure, I saved the file before/after my "touch + svn
>> status". The working copy file is identical before and after. But
>> first it is reported as unmodified and then as modified.
> That doesn't prove anything.  The file is reported as unmodified because
> the timestamps match.  That does not mean that the file is unmodified.
> It means that the timestamps match.  The file could have modifications
> at that stage.  When the timestamps differ that modification becomes
> visible.
>

OK, I understand. So it appears to be a checkout bug instead (or some 
other svn-internal problem with the ".svn" information in the directory?).

/Johan Holmberg





Re: Strange diffs after rsync copying of working copy

Posted by Philip Martin <ph...@wandisco.com>.
Johan Holmberg <ho...@iar.se> writes:

> On 02/03/2012 02:50 PM, Philip Martin wrote:
>>
>> I have now done some further experiments. I just issued commands like this:
>>
>>      $ svn status              # no modified files reported
>> Subversion is not doing a full text comparison because the timestamps
>> match.  So the difference that is present is not reported.
>
> But this occurs right after a fresh "svn checkout". There can be no
> differences.

If a fresh checkout really produces a working file with the wrong
contents then that is a checkout bug, but it is nothing to do with
status.  What sort of difference is present?  Is it something to do with
svn:keywords or svn:eol-style?

> To be really sure, I saved the file before/after my "touch + svn
> status". The working copy file is identical before and after. But
> first it is reported as unmodified and then as modified.

That doesn't prove anything.  The file is reported as unmodified because
the timestamps match.  That does not mean that the file is unmodified.
It means that the timestamps match.  The file could have modifications
at that stage.  When the timestamps differ that modification becomes
visible.

-- 
Philip

Re: Strange diffs after rsync copying of working copy

Posted by Johan Holmberg <ho...@iar.se>.
On 02/03/2012 02:50 PM, Philip Martin wrote:
>
> I have now done some further experiments. I just issued commands like this:
>
>      $ svn status              # no modified files reported
> Subversion is not doing a full text comparison because the timestamps
> match.  So the difference that is present is not reported.

But this occurs right after a fresh "svn checkout". There can be no 
differences.

To be really sure, I saved the file before/after my "touch + svn 
status". The working copy file is identical before and after. But first 
it is reported as unmodified and then as modified.

Seems like a bug.

/Johan Holmberg




Re: Strange diffs after rsync copying of working copy

Posted by Philip Martin <ph...@wandisco.com>.
Johan Holmberg <ho...@iar.se> writes:

> On 02/03/2012 02:09 PM, Philip Martin wrote:
>> Johan Holmberg<ho...@iar.se>  writes:
>>
>>> So I don't think saying "If that timestamp differs from the one on
>>> disk, the file is considered modified." is correct. "svn" does not
>>> work like that for me at least.
>> You are correct, that is not how Subversion behaves.  Subversion checks
>> the timestamp to determine whether to do a full text comparison.  If the
>> timestamp and size of a working file match the recorded values the file
>> is assumed to be unmodified.  If the values differ Subversion does a
>> full text comparison to determine if the files differ.
>>
>
> I have now done some further experiments. I just issued commands like this:
>
>     $ svn status              # no modified files reported

Subversion is not doing a full text comparison because the timestamps
match.  So the difference that is present is not reported.

>     $ touch foo.txt bar.txt

Now the timestamps don't match.

>     $ svn status
>     M       foo.txt <------ only "foo.txt" not "bar.txt"

Subversion now does a full text comparison and the difference is
identified.


-- 
Philip
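
The behaviour walked through above (a hidden difference that only shows up once the timestamp changes) can be illustrated with a small model. This is not Subversion's actual code, only a sketch of the check as described in this thread: the function name `is_modified` and its arguments are invented, GNU `stat` is assumed, whole-second timestamps are used for simplicity, and the real client additionally accounts for keyword and eol translation when comparing against the text base.

```shell
# usage: is_modified FILE PRISTINE RECORDED_MTIME RECORDED_SIZE
# Exit status 0 means "reported as modified", 1 means "reported as unmodified".
is_modified() {
    file=$1 pristine=$2 rec_mtime=$3 rec_size=$4
    cur_mtime=$(stat -c %Y -- "$file")   # mtime in whole seconds
    cur_size=$(stat -c %s -- "$file")    # size in bytes
    # Cheap check: if timestamp and size match the recorded values, the
    # file is assumed to be unmodified and its content is never read.
    if [ "$cur_mtime" = "$rec_mtime" ] && [ "$cur_size" = "$rec_size" ]; then
        return 1
    fi
    # Otherwise fall back to a full comparison with the pristine copy.
    ! cmp -s -- "$file" "$pristine"
}
```

In this model, a file that already differs from its pristine copy but has matching recorded values is reported as unmodified until something like `touch` changes its timestamp, while a file that is really identical stays unmodified either way. That matches the "foo.txt" versus "bar.txt" result.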

Re: Strange diffs after rsync copying of working copy

Posted by Johan Holmberg <ho...@iar.se>.
On 02/03/2012 02:09 PM, Philip Martin wrote:
> Johan Holmberg<ho...@iar.se>  writes:
>
>> So I don't think saying "If that timestamp differs from the one on
>> disk, the file is considered modified." is correct. "svn" does not
>> work like that for me at least.
> You are correct, that is not how Subversion behaves.  Subversion checks
> the timestamp to determine whether to do a full text comparison.  If the
> timestamp and size of a working file match the recorded values the file
> is assumed to be unmodified.  If the values differ Subversion does a
> full text comparison to determine if the files differ.
>

I have now done some further experiments. I just issued commands like this:

     $ svn status              # no modified files reported
     $ touch foo.txt bar.txt
     $ svn status
     M       foo.txt <------ only "foo.txt" not "bar.txt"
     $

So for my "problematic" files (like "foo.txt" in my example), svn 
actually is sensitive to a change in timestamp. But my project consists 
of 21000 files, and only 19 of these are "problematic". I still have no
clue why just these 19 files behave like that.

I can repeat the commands above several times, and it consistently flags 
*one* of the two files as modified. Always the same file.

/Johan Holmberg



Re: Strange diffs after rsync copying of working copy

Posted by Philip Martin <ph...@wandisco.com>.
Johan Holmberg <ho...@iar.se> writes:

> So I don't think saying "If that timestamp differs from the one on
> disk, the file is considered modified." is correct. "svn" does not
> work like that for me at least.

You are correct, that is not how Subversion behaves.  Subversion checks
the timestamp to determine whether to do a full text comparison.  If the
timestamp and size of a working file match the recorded values the file
is assumed to be unmodified.  If the values differ Subversion does a
full text comparison to determine if the files differ.

-- 
Philip

Re: Strange diffs after rsync copying of working copy

Posted by Johan Holmberg <ho...@iar.se>.
On 02/03/2012 01:28 PM, Stefan Sperling wrote:
> On Fri, Feb 03, 2012 at 12:47:57PM +0100, Johan Holmberg wrote:
>> Is Subversion really sensitive to such timestamp differences?
> Yes, it is. To avoid checking the entire content of all files in the
> working copy every time you run 'svn diff' or 'svn status', Subversion
> keeps a record of the timestamp each file had after checkout. If that
> timestamp differs from the one on disk, the file is considered modified.
>

But isn't this just a cache? If I do a "touch" on a file in my working 
copy, "svn status" and "svn diff" do not report any new difference. I
guess the cached info is just invalidated, and "svn" has to re-calculate 
the cached info. But there is still no difference introduced by a simple 
timestamp change on a working copy file.

So I don't think saying "If that timestamp differs from the one on disk, 
the file is considered modified." is correct. "svn" does not work like 
that for me at least.

I'm beginning to suspect that the timestamps of files in the ".svn" 
directory may matter, and cause the behaviour I described in my earlier mail.

/Johan Holmberg



Re: Strange diffs after rsync copying of working copy

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Feb 03, 2012 at 12:47:57PM +0100, Johan Holmberg wrote:
> Is Subversion really sensitive to such timestamp differences?

Yes, it is. To avoid checking the entire content of all files in the
working copy every time you run 'svn diff' or 'svn status', Subversion
keeps a record of the timestamp each file had after checkout. If that
timestamp differs from the one on disk, the file is considered modified.