You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Johan Corveleyn <jc...@gmail.com> on 2012/01/15 21:26:14 UTC

How to fix issue #4023 (on Windows, 'svn rm missingItem' deletes the unversioned 'MissingItem' from disk)?

Hi,

I'm looking into fixing issue #4023. I'd like to have some feedback on
a possible solution, and some hints on how to best implement it.

The problem
-----------
- 'svn rm missingItem' (while there is a 'MissingItem' on disk, and
not a 'missingItem') will eventually call
adm.ops.c#svn_wc__delete_many, with 'missingItem' as one of the
targets (so far so good).

- svn_wc__delete_many, after updating wc.db, will eventually call
erase_unversioned_from_wc, which will call
svn_io_remove_file2('missingItem').

- At this point, we have a problem, because
svn_io_remove_file2('missingItem') will eventually call the Windows
API DeleteFileW, which will happily delete 'MissingItem' when given
the 'missingItem' argument.


Proposed solution
-----------------
Inside svn_wc__delete_many, we could find out that 'missingItem' is,
you know, missing, before we try to delete it from disk. In that case,
there is no need to call erase_unversioned_from_wc, so no call to
DeleteFileW occurs.


I can't come up with a better way, but if someone has other ideas, shoot.

If this is indeed the way to proceed, then I'd appreciate a small
hint/pointer: how do I find out (as cheaply as possible) whether the
target in question is missing? In the beginning of
svn_wc__delete_many, a call is done to svn_wc__db_read_info, but this
doesn't seem to give this information (status == normal, kind == file,
whether or not the file in question is missing from disk).


-- 
Johan

Re: How to fix issue #4023 (on Windows, 'svn rm missingItem' deletes the unversioned 'MissingItem' from disk)?

Posted by Johan Corveleyn <jc...@gmail.com>.
On Tue, Jan 17, 2012 at 3:02 PM, Johan Corveleyn <jc...@gmail.com> wrote:
> On Sun, Jan 15, 2012 at 10:14 PM, Bert Huijben <be...@qqmail.nl> wrote:
>>
>>
>>> -----Original Message-----
>>> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
>>> Sent: zondag 15 januari 2012 21:26
>>> To: Subversion Development
>>> Subject: How to fix issue #4023 (on Windows, 'svn rm missingItem' deletes
>>> the unversioned 'MissingItem' from disk)?
>>>
>>> Hi,
>>>
>>> I'm looking into fixing issue #4023. I'd like to have some feedback on
>>> a possible solution, and some hints on how to best implement it.
>>>
>>> The problem
>>> -----------
>>> - 'svn rm missingItem' (while there is a 'MissingItem' on disk, and
>>> not a 'missingItem') will eventually call
>>> adm.ops.c#svn_wc__delete_many, with 'missingItem' as one of the
>>> targets (so far so good).
>>>
>>> - svn_wc__delete_many, after updating wc.db, will eventually call
>>> erase_unversioned_from_wc, which will call
>>> svn_io_remove_file2('missingItem').
>>>
>>> - At this point, we have a problem, because
>>> svn_io_remove_file2('missingItem') will eventually call the Windows
>>> API DeleteFileW, which will happily delete 'MissingItem' when given
>>> the 'missingItem' argument.
>>>
>>>
>>> Proposed solution
>>> -----------------
>>> Inside svn_wc__delete_many, we could find out that 'missingItem' is,
>>> you know, missing, before we try to delete it from disk. In that case,
>>> there is no need to call erase_unversioned_from_wc, so no call to
>>> DeleteFileW occurs.
>>>
>>>
>>> I can't come up with a better way, but if someone has other ideas, shoot.
>>>
>>> If this is indeed the way to proceed, then I'd appreciate a small
>>> hint/pointer: how do I find out (as cheaply as possible) whether the
>>> target in question is missing? In the beginning of
>>> svn_wc__delete_many, a call is done to svn_wc__db_read_info, but this
>>> doesn't seem to give this information (status == normal, kind == file,
>>> whether or not the file in question is missing from disk).
>>
>> The cheapest way to check would be (indirectly) calling the
>> apr_filepath_merge call to find the paths true name. If the basename of the
>> returned path is (case-)different than the path you entered, then the file
>> would be missing.
>> If the file is really missing, you would get the identical path or an error
>> from that call.
>
> Thanks. I'll look into that.
>
> I'll try to complete it during this week, but these are busy times, so
> if someone beats me to it: no problem.
>
>> On Windows this is cheaper than retrieving all directory entries in the
>> parent directory and checking yourself and on other platforms getting the
>> truename is a no-op.
>>
>>
>> We might have a wrapper for that apr_filepath_merge, for handling the path
>> encoding conversion.
>> If we have, then we probably used it for the file move fix we programmed in
>> Berlin. If not, then we should probably add one :)
>
> Yes, apr_filepath_merge (with APR_FILEPATH_TRUENAME flag) is wrapped
> inside svn_opt__arg_canonicalize_path. Am I allowed to call this
> function from inside svn_wc__delete_many? In that case, I can just
> call that function and compare its output with the output of
> svn_dirent_internal_style, just like we did for the 'case-only move'
> code.
>
> Actually, during the Berlin hackathon last year, we fixed the
> case-only moves without adding another call to
> svn_opt__arg_canonicalize_path. Instead, we did it by inserting some
> additional logic inside cmdline.c#svn_client_args_to_target_array,
> where the truepath conversion is done anyway, for all the path
> arguments. At that point we already had knowledge of both the original
> form and the truepath-converted form, so we could compare the two.
> (see r1124469 and r1131326)

Hi Bert,

Ok, the following patch fixes issue #4023 (and passes make check). But
I'm not sure if it's a "good fix" (more below). So I'd like some extra
eyes if possible ...

[[[
Index: subversion/libsvn_wc/adm_ops.c
===================================================================
--- subversion/libsvn_wc/adm_ops.c      (revision 1233695)
+++ subversion/libsvn_wc/adm_ops.c      (working copy)
@@ -59,6 +59,7 @@
 #include "svn_private_config.h"
 #include "private/svn_io_private.h"
 #include "private/svn_wc_private.h"
+#include "private/svn_opt_private.h"



@@ -742,14 +743,29 @@ svn_wc__delete_many(svn_wc_context_t *wc_ctx,
    * unversioned. */
   if (!keep_local)
     {
+      const char *local_truepath;
+
       for (i = 0; i < versioned_targets->nelts; i++)
         {
           svn_pool_clear(iterpool);

           local_abspath = APR_ARRAY_IDX(versioned_targets, i, const char *);
-          SVN_ERR(erase_unversioned_from_wc(local_abspath, TRUE,
-                                            cancel_func, cancel_baton,
-                                            iterpool));
+
+          /* Canonicalization will, on Windows, return the 'truepath' of the
+             given target (the actual casing on disk corresponding to the
+             given path).  If this differs from the original target, we know
+             that the latter is missing from disk, and another case-clashing
+             file is present.  In that case, we avoid the call to
+             erase_unversioned_from_wc (which would otherwise erase the wrong
+             file from disk).  See also issue #4023. */
+          SVN_ERR(svn_opt__arg_canonicalize_path(&local_truepath,
+                                                 local_abspath, iterpool));
+          if (strcmp(local_abspath, local_truepath) != 0)
+            continue;
+          else
+            SVN_ERR(erase_unversioned_from_wc(local_abspath, TRUE,
+                                              cancel_func, cancel_baton,
+                                              iterpool));
         }
     }

]]]

Some things on my mind:

- Is it ok to call svn_opt__arg_canonicalize_path from libsvn_wc, or
is this some kind of layering violation? (if so, we'd have to write
another wrapper for apr_filepath_merge (if so, where should we put
this?))

- It can only detect real 'case-clashes' (which is sufficient to fix
#4023), not 'missing' in general (like we discussed above). That's
because, going through svn_opt__arg_canonicalize_path, one cannot see
whether apr_filepath_merge returned an error, or just canonicalized to
the identical path (meaning the exact path was present). Both will
return the given path as is. OTOH, this is not really a problem, it's
enough to get #4023 fixed ...

-- 
Johan

Re: How to fix issue #4023 (on Windows, 'svn rm missingItem' deletes the unversioned 'MissingItem' from disk)?

Posted by Johan Corveleyn <jc...@gmail.com>.
On Sun, Jan 15, 2012 at 10:14 PM, Bert Huijben <be...@qqmail.nl> wrote:
>
>
>> -----Original Message-----
>> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
>> Sent: zondag 15 januari 2012 21:26
>> To: Subversion Development
>> Subject: How to fix issue #4023 (on Windows, 'svn rm missingItem' deletes
>> the unversioned 'MissingItem' from disk)?
>>
>> Hi,
>>
>> I'm looking into fixing issue #4023. I'd like to have some feedback on
>> a possible solution, and some hints on how to best implement it.
>>
>> The problem
>> -----------
>> - 'svn rm missingItem' (while there is a 'MissingItem' on disk, and
>> not a 'missingItem') will eventually call
>> adm.ops.c#svn_wc__delete_many, with 'missingItem' as one of the
>> targets (so far so good).
>>
>> - svn_wc__delete_many, after updating wc.db, will eventually call
>> erase_unversioned_from_wc, which will call
>> svn_io_remove_file2('missingItem').
>>
>> - At this point, we have a problem, because
>> svn_io_remove_file2('missingItem') will eventually call the Windows
>> API DeleteFileW, which will happily delete 'MissingItem' when given
>> the 'missingItem' argument.
>>
>>
>> Proposed solution
>> -----------------
>> Inside svn_wc__delete_many, we could find out that 'missingItem' is,
>> you know, missing, before we try to delete it from disk. In that case,
>> there is no need to call erase_unversioned_from_wc, so no call to
>> DeleteFileW occurs.
>>
>>
>> I can't come up with a better way, but if someone has other ideas, shoot.
>>
>> If this is indeed the way to proceed, then I'd appreciate a small
>> hint/pointer: how do I find out (as cheaply as possible) whether the
>> target in question is missing? In the beginning of
>> svn_wc__delete_many, a call is done to svn_wc__db_read_info, but this
>> doesn't seem to give this information (status == normal, kind == file,
>> whether or not the file in question is missing from disk).
>
> The cheapest way to check would be (indirectly) calling the
> apr_filepath_merge call to find the paths true name. If the basename of the
> returned path is (case-)different than the path you entered, then the file
> would be missing.
> If the file is really missing, you would get the identical path or an error
> from that call.

Thanks. I'll look into that.

I'll try to complete it during this week, but these are busy times, so
if someone beats me to it: no problem.

> On Windows this is cheaper than retrieving all directory entries in the
> parent directory and checking yourself and on other platforms getting the
> truename is a no-op.
>
>
> We might have a wrapper for that apr_filepath_merge, for handling the path
> encoding conversion.
> If we have, then we probably used it for the file move fix we programmed in
> Berlin. If not, then we should probably add one :)

Yes, apr_filepath_merge (with APR_FILEPATH_TRUENAME flag) is wrapped
inside svn_opt__arg_canonicalize_path. Am I allowed to call this
function from inside svn_wc__delete_many? In that case, I can just
call that function and compare its output with the output of
svn_dirent_internal_style, just like we did for the 'case-only move'
code.

Actually, during the Berlin hackathon last year, we fixed the
case-only moves without adding another call to
svn_opt__arg_canonicalize_path. Instead, we did it by inserting some
additional logic inside cmdline.c#svn_client_args_to_target_array,
where the truepath conversion is done anyway, for all the path
arguments. At that point we already had knowledge of both the original
form and the truepath-converted form, so we could compare the two.
(see r1124469 and r1131326)

-- 
Johan

RE: How to fix issue #4023 (on Windows, 'svn rm missingItem' deletes the unversioned 'MissingItem' from disk)?

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Johan Corveleyn [mailto:jcorvel@gmail.com]
> Sent: zondag 15 januari 2012 21:26
> To: Subversion Development
> Subject: How to fix issue #4023 (on Windows, 'svn rm missingItem' deletes
> the unversioned 'MissingItem' from disk)?
> 
> Hi,
> 
> I'm looking into fixing issue #4023. I'd like to have some feedback on
> a possible solution, and some hints on how to best implement it.
> 
> The problem
> -----------
> - 'svn rm missingItem' (while there is a 'MissingItem' on disk, and
> not a 'missingItem') will eventually call
> adm.ops.c#svn_wc__delete_many, with 'missingItem' as one of the
> targets (so far so good).
> 
> - svn_wc__delete_many, after updating wc.db, will eventually call
> erase_unversioned_from_wc, which will call
> svn_io_remove_file2('missingItem').
> 
> - At this point, we have a problem, because
> svn_io_remove_file2('missingItem') will eventually call the Windows
> API DeleteFileW, which will happily delete 'MissingItem' when given
> the 'missingItem' argument.
> 
> 
> Proposed solution
> -----------------
> Inside svn_wc__delete_many, we could find out that 'missingItem' is,
> you know, missing, before we try to delete it from disk. In that case,
> there is no need to call erase_unversioned_from_wc, so no call to
> DeleteFileW occurs.
> 
> 
> I can't come up with a better way, but if someone has other ideas, shoot.
> 
> If this is indeed the way to proceed, then I'd appreciate a small
> hint/pointer: how do I find out (as cheaply as possible) whether the
> target in question is missing? In the beginning of
> svn_wc__delete_many, a call is done to svn_wc__db_read_info, but this
> doesn't seem to give this information (status == normal, kind == file,
> whether or not the file in question is missing from disk).

The cheapest way to check would be (indirectly) calling the
apr_filepath_merge call to find the paths true name. If the basename of the
returned path is (case-)different than the path you entered, then the file
would be missing.
If the file is really missing, you would get the identical path or an error
from that call.

On Windows this is cheaper than retrieving all directory entries in the
parent directory and checking yourself and on other platforms getting the
truename is a no-op.


We might have a wrapper for that apr_filepath_merge, for handling the path
encoding conversion.
If we have, then we probably used it for the file move fix we programmed in
Berlin. If not, then we should probably add one :)

	Bert Huijben

> 
> 
> --
> Johan