You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@subversion.apache.org by sv...@apache.org on 2022/03/29 04:01:20 UTC

svn commit: r1899341 - in /subversion/branches/1.14.x: ./ STATUS subversion/libsvn_subr/io.c

Author: svn-role
Date: Tue Mar 29 04:01:20 2022
New Revision: 1899341

URL: http://svn.apache.org/viewvc?rev=1899341&view=rev
Log:
Merge r1883355 from trunk:

 * r1883355
   Use the APR-1.4+ API for flushing file contents to disk.
   Justification:
     Reduce code duplication between APR and SVN.
   Votes:
     +1: brane, jun66j5, markphip

Modified:
    subversion/branches/1.14.x/   (props changed)
    subversion/branches/1.14.x/STATUS
    subversion/branches/1.14.x/subversion/libsvn_subr/io.c

Propchange: subversion/branches/1.14.x/
------------------------------------------------------------------------------
  Merged /subversion/trunk:r1883355

Modified: subversion/branches/1.14.x/STATUS
URL: http://svn.apache.org/viewvc/subversion/branches/1.14.x/STATUS?rev=1899341&r1=1899340&r2=1899341&view=diff
==============================================================================
--- subversion/branches/1.14.x/STATUS (original)
+++ subversion/branches/1.14.x/STATUS Tue Mar 29 04:01:20 2022
@@ -79,13 +79,6 @@ Approved changes:
    Votes:
      +1: jamessan, stsp
 
- * r1883355
-   Use the APR-1.4+ API for flushing file contents to disk.
-   Justification:
-     Reduce code duplication between APR and SVN.
-   Votes:
-     +1: brane, jun66j5, markphip
-
  * r1881534
    Fix issue #4864 "build/ac-macros/macosx.m4: workaround AC_RUN_IFELSE"
    Justification:

Modified: subversion/branches/1.14.x/subversion/libsvn_subr/io.c
URL: http://svn.apache.org/viewvc/subversion/branches/1.14.x/subversion/libsvn_subr/io.c?rev=1899341&r1=1899340&r2=1899341&view=diff
==============================================================================
--- subversion/branches/1.14.x/subversion/libsvn_subr/io.c (original)
+++ subversion/branches/1.14.x/subversion/libsvn_subr/io.c Tue Mar 29 04:01:20 2022
@@ -2703,7 +2703,6 @@ svn_io__file_lock_autocreate(const char
 svn_error_t *svn_io_file_flush_to_disk(apr_file_t *file,
                                        apr_pool_t *pool)
 {
-  apr_os_file_t filehand;
   const char *fname;
   apr_status_t apr_err;
 
@@ -2713,49 +2712,21 @@ svn_error_t *svn_io_file_flush_to_disk(a
   if (apr_err)
     return svn_error_wrap_apr(apr_err, _("Can't get file name"));
 
-  /* ### In apr 1.4+ we could delegate most of this function to
-         apr_file_sync(). The only major difference is that this doesn't
-         contain the retry loop for EINTR on linux. */
-
-  /* First make sure that any user-space buffered data is flushed. */
-  SVN_ERR(svn_io_file_flush(file, pool));
-
-  apr_os_file_get(&filehand, file);
-
-  /* Call the operating system specific function to actually force the
-     data to disk. */
-  {
-#ifdef WIN32
-
-    if (! FlushFileBuffers(filehand))
-        return svn_error_wrap_apr(apr_get_os_error(),
-                                  _("Can't flush file '%s' to disk"),
-                                  try_utf8_from_internal_style(fname, pool));
-
-#else
-      int rv;
-
-      do {
-#ifdef F_FULLFSYNC
-        rv = fcntl(filehand, F_FULLFSYNC, 0);
-#else
-        rv = fsync(filehand);
-#endif
-      } while (rv == -1 && APR_STATUS_IS_EINTR(apr_get_os_error()));
-
-      /* If the file is in a memory filesystem, fsync() may return
-         EINVAL.  Presumably the user knows the risks, and we can just
-         ignore the error. */
-      if (rv == -1 && APR_STATUS_IS_EINVAL(apr_get_os_error()))
-        return SVN_NO_ERROR;
-
-      if (rv == -1)
-        return svn_error_wrap_apr(apr_get_os_error(),
-                                  _("Can't flush file '%s' to disk"),
-                                  try_utf8_from_internal_style(fname, pool));
+  do {
+    apr_err = apr_file_datasync(file);
+  } while(APR_STATUS_IS_EINTR(apr_err));
+
+  /* If the file is in a memory filesystem, fsync() may return
+     EINVAL.  Presumably the user knows the risks, and we can just
+     ignore the error. */
+  if (APR_STATUS_IS_EINVAL(apr_err))
+    return SVN_NO_ERROR;
+
+  if (apr_err)
+    return svn_error_wrap_apr(apr_err,
+                              _("Can't flush file '%s' to disk"),
+                              try_utf8_from_internal_style(fname, pool));
 
-#endif
-  }
   return SVN_NO_ERROR;
 }
 



Re: svn commit: r1899341 - in /subversion/branches/1.14.x: ./ STATUS subversion/libsvn_subr/io.c

Posted by Mark Phippard <ma...@gmail.com>.
On Mon, Apr 11, 2022 at 6:14 PM Evgeny Kotkov
<ev...@visualsvn.com> wrote:
>
> svn-role <sv...@apache.org> writes:
>
> > Merge r1883355 from trunk:
> >
> >  * r1883355
> >    Use the APR-1.4+ API for flushing file contents to disk.
> >    Justification:
> >      Reduce code duplication between APR and SVN.
> >    Votes:
> >      +1: brane, jun66j5, markphip
> …
> > +  do {
> > +    apr_err = apr_file_datasync(file);
> > +  } while(APR_STATUS_IS_EINTR(apr_err));
> > +
> > +  /* If the file is in a memory filesystem, fsync() may return
> > +     EINVAL.  Presumably the user knows the risks, and we can just
> > +     ignore the error. */
> > +  if (APR_STATUS_IS_EINVAL(apr_err))
> > +    return SVN_NO_ERROR;
> > +
> > +  if (apr_err)
> > +    return svn_error_wrap_apr(apr_err,
> > +                              _("Can't flush file '%s' to disk"),
> > +                              try_utf8_from_internal_style(fname, pool));
>
> A few thoughts on this change:
>
> 1) Previously, the check for EINVAL within the svn_io_file_flush_to_disk()
>    function only happened in the #else block, so it did not affect the
>    behavior on Windows.  With the change, the check happens unconditionally
>    on all platforms.
>
>    On Windows, APR_STATUS_IS_EINVAL() is defined as follows:
>
> #define APR_STATUS_IS_EINVAL(s)         ((s) == APR_EINVAL \
>                 || (s) == APR_OS_START_SYSERR + ERROR_INVALID_ACCESS \
>                 || (s) == APR_OS_START_SYSERR + ERROR_INVALID_DATA \
>                 || (s) == APR_OS_START_SYSERR + ERROR_INVALID_FUNCTION \
>                 || (s) == APR_OS_START_SYSERR + ERROR_INVALID_HANDLE \
>                 || (s) == APR_OS_START_SYSERR + ERROR_INVALID_PARAMETER \
>                 || (s) == APR_OS_START_SYSERR + ERROR_NEGATIVE_SEEK)
>
>    So with this change, all of these error codes are now going to be ignored,
>    and the svn_io_file_flush_to_disk() function will return SVN_NO_ERROR,
>    indicating the success of flushing the data to disk.  Some of these error
>    codes, such as ERROR_INVALID_HANDLE, are quite generic.
>    I would think that this change opens a data corruption possibility in
>    the following case:
>
>    - A real error happens during FlushFileBuffers().
>    - It gets translated into one of the error codes above by the I/O stack.
>    - The error is ignored in the implementation of svn_io_file_flush_to_disk().
>    - The caller of svn_io_file_flush_to_disk() interprets this situation as a
>      successful data flush.
>    - He continues to work under an assumption that the data was successfully
>      flushed to disk, whereas in fact it was not. For example, by committing
>      a repository transaction.
>    - A power loss after the transaction commit causes the data to be lost or
>      corrupted.
>
> 2) Unless I am mistaken, this change replaces an fcntl(F_FULLFSYNC) with
>    fdatasync() when the latter is supported.  So assuming that both operations
>    are supported, the new behavior of the svn_io_file_flush_to_disk() function
>    has weaker integrity guarantees than before, because fdatasync() does not
>    ensure that metadata such as the last modification time is written to disk.
>
>    I haven't done a thorough check if we rely on mtime being flushed to the
>    disk in our code, but even if we don't, svn_io_file_flush_to_disk() is part
>    of the public API, so altering its integrity guarantees in a patch release
>    might be unexpected for its callers.
>
> Thoughts?
>
> (A small disclaimer: these changes have slipped past my attention until now,
>  when I tried updating to 1.14.2.  But better late than sorry…)

I assume nothing has been done on /trunk since this change was added?

FWIW, if we decide we need a 1.14.3, I will do the RM work for it once
everything is backported.

Mark

Re: svn commit: r1899341 - in /subversion/branches/1.14.x: ./ STATUS subversion/libsvn_subr/io.c

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
Branko Čibej <br...@apache.org> writes:

> > 1) Previously, the check for EINVAL within the svn_io_file_flush_to_disk()
> >     function only happened in the #else block, so it did not affect the
> >     behavior on Windows.  With the change, the check happens unconditionally
> >     on all platforms.
>
> You're right, I hadn't considered that. And Windows behaves sufficiently
> differently that the EINVAL check as it stands is not appropriate. Would
> you consider putting that entire check in an #ifdef an appropriate fix?

Yes, I think it would be fine if we restored the pre-r1883355 way of handling
EINVAL under an #ifdef.

> > 2) Unless I am mistaken, this change replaces an fcntl(F_FULLFSYNC) with
> >     fdatasync() when the latter is supported.
>
> Depending platform yes, APR may pick fdatasync() over fcntl(F_FULLFSYNC).

I think that since fdatasync() has lower integrity guarantees, starting to
unconditionally use it in the existing API might be problematic.

For example, the API users (ourselves and third-party) may implicitly rely on
the current integrity guarantees and assume that metadata is flushed to disk.
And in this case it's not limited to just the svn_io_file_flush_to_disk()
function, because there are other parts of the public API that call
svn_io_file_flush_to_disk() internally — such as svn_io_file_rename2()
or svn_io_write_atomic2().

If the original intention of this change was to save I/O in cases where we
would be fine with just flushing the file contents, then perhaps we could do
something like svn_io_file_flush_to_disk2(…, svn_boolean_t sync_metadata)
that uses apr_file_datasync() if sync_metadata is false — and incrementally
start passing sync_metadata=FALSE on the calling sites where it's safe to
do so.

But if the original intention was to fully switch to APR functions, I am not
too sure if we can do so, because both apr_file_datasync() and apr_file_sync()
have lower integrity guarantees in some cases, compared to the previous code.
(apr_file_datasync() — due to calling fdatasync() in some cases and
 apr_file_sync() due to always doing just an fsync() instead of F_FULLFSYNC
 when it's supported).


Thanks,
Evgeny Kotkov

Re: svn commit: r1899341 - in /subversion/branches/1.14.x: ./ STATUS subversion/libsvn_subr/io.c

Posted by Branko Čibej <br...@apache.org>.
On 12.04.2022 00:14, Evgeny Kotkov wrote:
> svn-role <sv...@apache.org> writes:
>
>> Merge r1883355 from trunk:
>>
>>   * r1883355
>>     Use the APR-1.4+ API for flushing file contents to disk.
>>     Justification:
>>       Reduce code duplication between APR and SVN.
>>     Votes:
>>       +1: brane, jun66j5, markphip
> …
>> +  do {
>> +    apr_err = apr_file_datasync(file);
>> +  } while(APR_STATUS_IS_EINTR(apr_err));
>> +
>> +  /* If the file is in a memory filesystem, fsync() may return
>> +     EINVAL.  Presumably the user knows the risks, and we can just
>> +     ignore the error. */
>> +  if (APR_STATUS_IS_EINVAL(apr_err))
>> +    return SVN_NO_ERROR;
>> +
>> +  if (apr_err)
>> +    return svn_error_wrap_apr(apr_err,
>> +                              _("Can't flush file '%s' to disk"),
>> +                              try_utf8_from_internal_style(fname, pool));
> A few thoughts on this change:
>
> 1) Previously, the check for EINVAL within the svn_io_file_flush_to_disk()
>     function only happened in the #else block, so it did not affect the
>     behavior on Windows.  With the change, the check happens unconditionally
>     on all platforms.

You're right, I hadn't considered that. And Windows behaves sufficiently 
differently that the EINVAL check as it stands is not appropriate. Would 
you consider putting that entire check in an #ifdef an appropriate fix?

[...]
> 2) Unless I am mistaken, this change replaces an fcntl(F_FULLFSYNC) with
>     fdatasync() when the latter is supported.

Depending platform yes, APR may pick fdatasync() over fcntl(F_FULLFSYNC).

-- Brane

Re: svn commit: r1899341 - in /subversion/branches/1.14.x: ./ STATUS subversion/libsvn_subr/io.c

Posted by Evgeny Kotkov <ev...@visualsvn.com>.
svn-role <sv...@apache.org> writes:

> Merge r1883355 from trunk:
>
>  * r1883355
>    Use the APR-1.4+ API for flushing file contents to disk.
>    Justification:
>      Reduce code duplication between APR and SVN.
>    Votes:
>      +1: brane, jun66j5, markphip
…
> +  do {
> +    apr_err = apr_file_datasync(file);
> +  } while(APR_STATUS_IS_EINTR(apr_err));
> +
> +  /* If the file is in a memory filesystem, fsync() may return
> +     EINVAL.  Presumably the user knows the risks, and we can just
> +     ignore the error. */
> +  if (APR_STATUS_IS_EINVAL(apr_err))
> +    return SVN_NO_ERROR;
> +
> +  if (apr_err)
> +    return svn_error_wrap_apr(apr_err,
> +                              _("Can't flush file '%s' to disk"),
> +                              try_utf8_from_internal_style(fname, pool));

A few thoughts on this change:

1) Previously, the check for EINVAL within the svn_io_file_flush_to_disk()
   function only happened in the #else block, so it did not affect the
   behavior on Windows.  With the change, the check happens unconditionally
   on all platforms.

   On Windows, APR_STATUS_IS_EINVAL() is defined as follows:

#define APR_STATUS_IS_EINVAL(s)         ((s) == APR_EINVAL \
                || (s) == APR_OS_START_SYSERR + ERROR_INVALID_ACCESS \
                || (s) == APR_OS_START_SYSERR + ERROR_INVALID_DATA \
                || (s) == APR_OS_START_SYSERR + ERROR_INVALID_FUNCTION \
                || (s) == APR_OS_START_SYSERR + ERROR_INVALID_HANDLE \
                || (s) == APR_OS_START_SYSERR + ERROR_INVALID_PARAMETER \
                || (s) == APR_OS_START_SYSERR + ERROR_NEGATIVE_SEEK)

   So with this change, all of these error codes are now going to be ignored,
   and the svn_io_file_flush_to_disk() function will return SVN_NO_ERROR,
   indicating the success of flushing the data to disk.  Some of these error
   codes, such as ERROR_INVALID_HANDLE, are quite generic.
   I would think that this change opens a data corruption possibility in
   the following case:

   - A real error happens during FlushFileBuffers().
   - It gets translated into one of the error codes above by the I/O stack.
   - The error is ignored in the implementation of svn_io_file_flush_to_disk().
   - The caller of svn_io_file_flush_to_disk() interprets this situation as a
     successful data flush.
   - He continues to work under an assumption that the data was successfully
     flushed to disk, whereas in fact it was not. For example, by committing
     a repository transaction.
   - A power loss after the transaction commit causes the data to be lost or
     corrupted.

2) Unless I am mistaken, this change replaces an fcntl(F_FULLFSYNC) with
   fdatasync() when the latter is supported.  So assuming that both operations
   are supported, the new behavior of the svn_io_file_flush_to_disk() function
   has weaker integrity guarantees than before, because fdatasync() does not
   ensure that metadata such as the last modification time is written to disk.

   I haven't done a thorough check if we rely on mtime being flushed to the
   disk in our code, but even if we don't, svn_io_file_flush_to_disk() is part
   of the public API, so altering its integrity guarantees in a patch release
   might be unexpected for its callers.

Thoughts?

(A small disclaimer: these changes have slipped past my attention until now,
 when I tried updating to 1.14.2.  But better late than sorry…)


Thanks,
Evgeny Kotkov