Posted to dev@subversion.apache.org by Vincent Lefevre <vi...@vinc17.org> on 2006/07/14 01:12:04 UTC

svn doesn't report modified file when timestamp has not changed

Subversion assumes that if the mtime value has not changed, then the
file of a working copy has not changed. This assumption is wrong under
Unix. For instance, the commands "recode" and "mv" don't change the
mtime by default (with "mv", the timestamp of the original file is
kept, which leads to the problem when one overwrites a file with a
different file that has the same timestamp); concerning "mv", this
behavior is even required by POSIX:

  http://www.opengroup.org/onlinepubs/009695399/utilities/mv.html
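
The "mv" case can be reproduced outside Subversion. The following Python
sketch is only an illustration, not Subversion code: the file names and the
helper name are hypothetical, and os.replace stands in for what "mv" does
within a single filesystem (a rename). It overwrites one file with a
different file that carries the same mtime, then reports whether the content
and the mtime changed.

```python
import os
import tempfile


def overwrite_keeps_mtime():
    """Overwrite a.txt with b.txt (same mtime, different content) via rename."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a.txt")  # hypothetical working-copy file
        b = os.path.join(tmp, "b.txt")  # hypothetical second file
        with open(a, "w") as f:
            f.write("original contents\n")
        with open(b, "w") as f:
            f.write("different contents\n")

        # Give both files the same timestamp, as if checked out together.
        stamp = 1_000_000_000  # arbitrary time in seconds since the epoch
        os.utime(a, (stamp, stamp))
        os.utime(b, (stamp, stamp))

        recorded_mtime = os.stat(a).st_mtime_ns  # what a client would record
        os.replace(b, a)  # rename over a.txt, like "mv b.txt a.txt"

        with open(a) as f:
            content_changed = f.read() != "original contents\n"
        mtime_changed = os.stat(a).st_mtime_ns != recorded_mtime
        return content_changed, mtime_changed


if __name__ == "__main__":
    print(overwrite_keeps_mtime())
```

A check that looks only at the recorded mtime would treat a.txt as
unmodified here, even though its content is different.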

According to a discussion in November 2002 ("svn status: does not
notice changed file if timestamp of "new" file is older") concerning
the behavior under Windows, it seems that Subversion did the right
thing (under Unix) in the past:

  http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=25545

and the change was done due to the Windows behavior (probably
because APR regards ctime as the creation time under Windows,
which is inconsistent with the Unix meaning of ctime).

Indeed, under Unix, if max(mtime,ctime) is the same, then it is
almost guaranteed that the file has not changed.
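
As a rough illustration of why ctime helps, here is a minimal Python sketch
(the file name and helper name are hypothetical). It edits a file in place
and then restores the old mtime, which is the behavior attributed to
"recode" in this thread. On Unix, both the write and the os.utime call
update the inode change time, so ctime is expected to move even though mtime
and size stay the same. On Windows st_ctime means something else, so this
only applies to Unix.

```python
import os
import tempfile
import time


def edit_and_restore_mtime():
    """Edit a file in place, then put the old mtime back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file.txt")  # hypothetical file name
        with open(path, "w") as f:
            f.write("before\n")
        before = os.stat(path)

        time.sleep(0.05)  # let the clock advance past timestamp granularity
        with open(path, "w") as f:
            f.write("after!\n")  # same size, different content
        # Restore the original atime/mtime; on Unix this updates ctime.
        os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
        after = os.stat(path)

        return {
            "mtime_same": after.st_mtime_ns == before.st_mtime_ns,
            "size_same": after.st_size == before.st_size,
            "ctime_before": before.st_ctime_ns,
            "ctime_after": after.st_ctime_ns,
        }


if __name__ == "__main__":
    print(edit_and_restore_mtime())
```

Whether ctime_after is strictly greater than ctime_before depends on the
timestamp granularity of the filesystem, which is why the sketch sleeps
briefly before the edit.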

This is a rather nasty bug, as some changes may remain unnoticed,
and therefore, they may be lost.

I also reported this bug in the Debian BTS:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=376103

-- 
Vincent Lefèvre <vi...@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: svn doesn't report modified file when timestamp has not changed

Posted by Peter Samuelson <pe...@p12n.org>.
[Ben Collins-Sussman]
> >For instance, the commands "recode" and "mv" don't change the
> >mtime by default
> 
> Don't do that.  Use 'svn mv', not 'mv'.

The counter-argument is that the documentation says you can use any
normal file manipulation commands to play with your wc.  'mv' is
certainly one.  If mv is dangerous, which it is if you're using it
between two checked-out files (which often have the same mtime because
they were checked out together), it might be worth mentioning this.

> I don't understand, can you elaborate?  Where's the risk of data loss?

This came up because someone tried to use GNU recode, which can edit a
file's character encoding "in situ".  When doing so, it has the
"interesting" default behavior of resetting the mtime to hide the fact
that it modified the file.  The documentation justifies this "feature"
by saying that your data was not, after all, _really_ changed.

Now, call me dense, but I cannot fathom how this "feature" can possibly
be considered desirable as default behavior.  Yet the author has
weighed in and reaffirmed that he sees nothing wrong.  See
http://bugs.debian.org/376124.

Anyway, if you run 'recode' on your wc file, and someone else checks in
a newer version, the next 'svn update' will overwrite your
modifications.  This is the data loss Vincent is referring to.

Peter

Re: svn doesn't report modified file when timestamp has not changed

Posted by Vincent Lefevre <vi...@vinc17.org>.
On 2006-07-15 09:09:34 -0500, Ben Collins-Sussman wrote:
> On 7/13/06, Vincent Lefevre <vi...@vinc17.org> wrote:
> >Subversion assumes that if the mtime value has not changed, then the
> >file of a working copy has not changed.
> 
> Yes, this is a deliberate choice.  Both CVS and Subversion examine
> current mtime, and compare it to a previously-recorded mtime to decide
> if further investigation (filesize comparison, brute-force comparison)
> is necessary.   If CVS and Subversion didn't have this algorithm, then
> commands which scan the working copy (like 'svn status' and 'svn
> commit') would be orders of magnitude slower.

Not if you use both ctime and mtime (POSIX has been designed this way).
Not if you call the command on one file (e.g. "svn revert file").

And even if ctime didn't exist, one may prefer a (not necessarily)
slow and reliable behavior to a fast but possibly buggy behavior,
at least under some conditions (e.g. before wiping a working copy).

Now, since ctime exists, there's no reason not to use it (possibly
forwarding the request to APR if there's some limitation there).

Also, Subversion currently doesn't work as documented.

> >For instance, the commands "recode" and "mv" don't change the
> >mtime by default
> 
> Don't do that.  Use 'svn mv', not 'mv'.

There are several reasons why one can use mv. First, it can be by
mistake. Then, one may also really want to move some file, but only
the contents, not the svn history associated with this file (so
that "svn mv" would be the wrong command).

> >This is a rather nasty bug, as some changes may remain unnoticed,
> >and therefore, they may be lost.
> 
> I don't understand, can you elaborate?  Where's the risk of data loss?

Because one doesn't know that a file has modifications. For instance,
if one wipes the working copy, one loses these modifications. Also,
one may want to modify the file, but with this bug, one would modify
the file that already has some possibly unwanted changes; this may be
fixable (using the history of the file), but not necessarily.

-- 
Vincent Lefèvre <vi...@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

Re: svn doesn't report modified file when timestamp has not changed

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 7/17/06, Vincent Lefevre <vi...@vinc17.org> wrote:
> On 2006-07-17 17:54:03 -0400, Garrett Rooney wrote:
> > Subversion only does that if you explicitly tell it to, IIRC.  The
> > default behavior of both export and update/checkout is to use the
> > current time as mtime.
>
> No, "svn export" does that by default (contrary to "svn checkout"),
> at least in 1.3.2.

Weird.  I would honestly consider that a bug...

-garrett

Re: svn doesn't report modified file when timestamp has not changed

Posted by Vincent Lefevre <vi...@vinc17.org>.
On 2006-07-17 17:54:03 -0400, Garrett Rooney wrote:
> Subversion only does that if you explicitly tell it to, IIRC.  The
> default behavior of both export and update/checkout is to use the
> current time as mtime.

No, "svn export" does that by default (contrary to "svn checkout"),
at least in 1.3.2.

-- 
Vincent Lefèvre <vi...@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

Re: svn doesn't report modified file when timestamp has not changed

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
On 7/15/06, Vincent Lefevre <vi...@vinc17.org> wrote:
> On 2006-07-15 10:17:19 -0700, Justin Erenkrantz wrote:
> > Recode doesn't necessarily change the file size either.  But, I
> > couldn't care less about the fact that recode wants to redefine the
> > meaning of mtime: don't use such hacky tools in the first place...  --
>
> That's not redefining the meaning of mtime. Even Subversion sets
> the mtime to values in the past (e.g. "svn export"). What is
> more important is that it is allowed by POSIX, and POSIX doesn't
> say that's bad, and even requires this kind of thing for some
> commands, like mv. Each tool has its own reasons to change the
> mtime value, like "svn export".

Subversion only does that if you explicitly tell it to, IIRC.  The
default behavior of both export and update/checkout is to use the
current time as mtime.

-garrett

Re: svn doesn't report modified file when timestamp has not changed

Posted by Vincent Lefevre <vi...@vinc17.org>.
On 2006-07-15 10:17:19 -0700, Justin Erenkrantz wrote:
> Recode doesn't necessarily change the file size either.  But, I
> couldn't care less about the fact that recode wants to redefine the
> meaning of mtime: don't use such hacky tools in the first place...  --

That's not redefining the meaning of mtime. Even Subversion sets
the mtime to values in the past (e.g. "svn export"). What is
more important is that it is allowed by POSIX, and POSIX doesn't
say that's bad, and even requires this kind of thing for some
commands, like mv. Each tool has its own reasons to change the
mtime value, like "svn export".

And people should stop saying that this will not work with "make"
either. This is wrong. "make" is a different utility and behaves
as expected (using the mtime value as *documented*). And some tools
may change the mtime value just because of "make"; this is the case
of "patch -Z", to prevent some rules from being applied by "make"
after patching the source. At the same time, this can cause problems
with Subversion; for instance, when patching the "configure.in" and
"configure" files, one may want to keep the same mtime value (I had
to do this recently, to make sure the autotools wouldn't be called).

-- 
Vincent Lefèvre <vi...@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

Re: svn doesn't report modified file when timestamp has not changed

Posted by Philip Martin <ph...@codematters.co.uk>.
"Justin Erenkrantz" <ju...@erenkrantz.com> writes:

> On 7/15/06, Philip Martin <ph...@codematters.co.uk> wrote:
>> I'm not against using ctime in addition to mtime, but it's low
>> priority as far as I am concerned.  Much more useful would be to start
>> using working file size in addition to mtime.
>
> Recode doesn't necessarily change the file size either.  But, I
> couldn't care less about the fact that recode wants to redefine the
> meaning of mtime: don't use such hacky tools in the first place...  --
> justin

The file size is not intended to help recode, it's to make things
faster for people using sensible tools -- if we know the file size has
changed we can avoid an expensive byte-for-byte translation and
comparison.

-- 
Philip Martin

Re: svn doesn't report modified file when timestamp has not changed

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On 7/15/06, Philip Martin <ph...@codematters.co.uk> wrote:
> I'm not against using ctime in addition to mtime, but it's low
> priority as far as I am concerned.  Much more useful would be to start
> using working file size in addition to mtime.

Recode doesn't necessarily change the file size either.  But, I
couldn't care less about the fact that recode wants to redefine the
meaning of mtime: don't use such hacky tools in the first place...  --
justin

Re: svn doesn't report modified file when timestamp has not changed

Posted by Vincent Lefevre <vi...@vinc17.org>.
On 2006-07-15 14:01:42 -0500, Peter Samuelson wrote:
> And yet people _will_ do it, and the author of recode even assures me
> that people even _want_ this behavior!  He didn't say why they want it.

If I've understood correctly, recoding is sometimes necessary when
exchanging files with Windows. I think this may be due to the different
end-of-line sequence, in particular. And the mtime value should be kept
just as when one extracts files from an archive.

IMHO, this kind of problem can happen if you get source files with
"svn export" on Windows (the files will have the CRLF end-of-line),
then transfer the files with a USB key or CD-ROM to some Unix machine,
then fix the EOL sequence with "recode".

-- 
Vincent Lefèvre <vi...@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

Re: svn doesn't report modified file when timestamp has not changed

Posted by Peter Samuelson <pe...@p12n.org>.
[Philip Martin]
> I think the old algorithm looked at both ctime and mtime and used
> whichever was the greater as the 'text-time'.  This caused problems on
> Windows as it is possible to have ctime much greater than mtime and
> then text modifications, which only affect mtime, don't cause the
> 'text-time' to change and then modified files get treated as
> unmodified.

And _that_ in turn is caused by a bug in apr where 'ctime' means two
completely different things.  They should have used two different
fields in the stat structure to denote "file creation time" and "file
inode change time".  Why they overloaded these two into a single field,
in a supposedly _portable_ runtime library, I will never know.

If any of you are apr developers, please consider adding two new fields
to apr_finfo_t, and deprecating the ctime field.  Then it will be
possible to make subversion use newer(mtime,inode_mtime) if that is
indeed judged reasonable.


> I suppose we could store both ctime and mtime in the entries file,
> and check the file for modifications if either is different.

As that involves a format change to the working copy file, it's too bad
it wasn't proposed a week ago, before 1.4.0rc3 was rolled.  That would
have been much more convenient, since the wc is already changing in a
most disruptive way.


> Looking at the original problem: yes, it's possible to modify a
> file's contents and change only the ctime, but that's not a good
> idea.  Don't do it!  It will confuse make for a start, and so for any
> sort of 'conventional' source code it's a bad idea.

And yet people _will_ do it, and the author of recode even assures me
that people even _want_ this behavior!  He didn't say why they want it.

Re: svn doesn't report modified file when timestamp has not changed

Posted by Philip Martin <ph...@codematters.co.uk>.
Greg Hudson <gh...@MIT.EDU> writes:

> Ben, your answer is boilerplate, and suggests that you didn't completely
> read or understand Vincent's message.  The proposal is not to scan every
> file's contents on each out-of-date check, but to take into account the
> inode change time on Unix (as we apparently once did, years ago).

I think the old algorithm looked at both ctime and mtime and used
whichever was the greater as the 'text-time'.  This caused problems on
Windows as it is possible to have ctime much greater than mtime and
then text modifications, which only affect mtime, don't cause the
'text-time' to change and then modified files get treated as
unmodified.

I suppose we could store both ctime and mtime in the entries file, and
check the file for modifications if either is different.  We would
still have the problem that ctime is something different on Windows
and Unix, but I don't suppose that would matter.  One argument against
this is that ctime doesn't usually change on Windows.  Another
argument against is that Subversion doesn't version the things usually
affected by ctime on Unix.
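
To make the difference between the two schemes concrete, here is a small
Python sketch. It is a toy model, not Subversion's code: the Stamps type,
the function names, and the timestamp values are all made up for
illustration. It compares the old "text-time" idea (the greater of mtime and
ctime) with the idea of storing both values and re-checking when either
differs.

```python
from typing import NamedTuple


class Stamps(NamedTuple):
    mtime: int  # last content modification
    ctime: int  # inode change time on Unix, creation time on Windows


def changed_old(recorded: Stamps, current: Stamps) -> bool:
    """Old scheme as described above: compare max(mtime, ctime)."""
    return max(recorded) != max(current)


def changed_proposed(recorded: Stamps, current: Stamps) -> bool:
    """Proposed scheme: re-check the file if either timestamp differs."""
    return recorded != current


# Windows-like case: ctime is far ahead of mtime, then the text is edited.
windows_before = Stamps(mtime=100, ctime=500)
windows_after = Stamps(mtime=200, ctime=500)

# Unix case: the old mtime is restored, but the inode change time moves.
unix_before = Stamps(mtime=100, ctime=100)
unix_after = Stamps(mtime=100, ctime=300)

if __name__ == "__main__":
    for name, before, after in [
        ("windows", windows_before, windows_after),
        ("unix", unix_before, unix_after),
    ]:
        print(name, changed_old(before, after), changed_proposed(before, after))
```

In this model the old scheme misses the Windows-like edit because the
maximum is dominated by the unchanged ctime, while comparing both values
flags the file in both cases.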

Looking at the original problem: yes, it's possible to modify a file's
contents and change only the ctime, but that's not a good idea.  Don't
do it!  It will confuse make for a start, and so for any sort of
'conventional' source code it's a bad idea.

I'm not against using ctime in addition to mtime, but it's low
priority as far as I am concerned.  Much more useful would be to start
using working file size in addition to mtime.

-- 
Philip Martin

Re: svn doesn't report modified file when timestamp has not changed

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2006-07-15 at 09:09 -0500, Ben Collins-Sussman wrote:
> On 7/13/06, Vincent Lefevre <vi...@vinc17.org> wrote:
> > Subversion assumes that if the mtime value has not changed, then the
> > file of a working copy has not changed.
> 
> Yes, this is a deliberate choice.  Both CVS and Subversion examine
> current mtime, and compare it to a previously-recorded mtime to decide
> if further investigation (filesize comparison, brute-force comparison)
> is necessary.   If CVS and Subversion didn't have this algorithm, then
> commands which scan the working copy (like 'svn status' and 'svn
> commit') would be orders of magnitude slower.

Ben, your answer is boilerplate, and suggests that you didn't completely
read or understand Vincent's message.  The proposal is not to scan every
file's contents on each out-of-date check, but to take into account the
inode change time on Unix (as we apparently once did, years ago).

Re: svn doesn't report modified file when timestamp has not changed

Posted by Ben Collins-Sussman <su...@red-bean.com>.
On 7/13/06, Vincent Lefevre <vi...@vinc17.org> wrote:
> Subversion assumes that if the mtime value has not changed, then the
> file of a working copy has not changed.

Yes, this is a deliberate choice.  Both CVS and Subversion examine
current mtime, and compare it to a previously-recorded mtime to decide
if further investigation (filesize comparison, brute-force comparison)
is necessary.   If CVS and Subversion didn't have this algorithm, then
commands which scan the working copy (like 'svn status' and 'svn
commit') would be orders of magnitude slower.
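
The tiered check described above can be sketched roughly as follows. This is
a simplified Python illustration under stated assumptions, not the actual
Subversion or CVS implementation: the function name and parameters are
hypothetical, the pristine copy is assumed to be a plain file on disk, and
keyword or end-of-line translation is ignored.

```python
import filecmp
import os


def looks_modified(working_path, pristine_path, recorded_mtime_ns):
    """Return True if working_path appears to differ from pristine_path."""
    st = os.stat(working_path)
    if st.st_mtime_ns == recorded_mtime_ns:
        return False  # cheap path: timestamp unchanged, assume unmodified
    if st.st_size != os.stat(pristine_path).st_size:
        return True  # sizes differ, so contents must differ
    # Same size but different timestamp: fall back to a full comparison.
    return not filecmp.cmp(working_path, pristine_path, shallow=False)
```

The first branch is what keeps a scan of a large working copy fast, and it
is also the branch that returns False for a file whose content changed while
its mtime was preserved.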

> For instance, the commands "recode" and "mv" don't change the
> mtime by default

Don't do that.  Use 'svn mv', not 'mv'.

> This is a rather nasty bug, as some changes may remain unnoticed,
> and therefore, they may be lost.

I don't understand, can you elaborate?  Where's the risk of data loss?
