You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@subversion.apache.org by hw...@apache.org on 2010/10/14 23:00:44 UTC

svn commit: r1022707 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

Author: hwright
Date: Thu Oct 14 21:00:43 2010
New Revision: 1022707

URL: http://svn.apache.org/viewvc?rev=1022707&view=rev
Log:
Merge r985472 from the performance branch, which see for more information.

This revision randomizes the numerical suffix of temporary files, in an effort
to make finding one easier.

Modified:
    subversion/trunk/   (props changed)
    subversion/trunk/subversion/libsvn_subr/io.c

Propchange: subversion/trunk/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Oct 14 21:00:43 2010
@@ -23,7 +23,7 @@
 /subversion/branches/log-g-performance:870941-871032
 /subversion/branches/merge-skips-obstructions:874525-874615
 /subversion/branches/nfc-nfd-aware-client:870276,870376
-/subversion/branches/performance:982355,983764,983766,984927,985014,985037,985046,985477,985669,987888,987893,995507,995603,1001413
+/subversion/branches/performance:982355,983764,983766,984927,985014,985037,985046,985472,985477,985669,987888,987893,995507,995603,1001413
 /subversion/branches/ra_serf-digest-authn:875693-876404
 /subversion/branches/reintegrate-improvements:873853-874164
 /subversion/branches/subtree-mergeinfo:876734-878766

Modified: subversion/trunk/subversion/libsvn_subr/io.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_subr/io.c?rev=1022707&r1=1022706&r2=1022707&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_subr/io.c (original)
+++ subversion/trunk/subversion/libsvn_subr/io.c Thu Oct 14 21:00:43 2010
@@ -334,6 +334,10 @@ svn_io_open_uniquely_named(apr_file_t **
   unsigned int i;
   struct temp_file_cleanup_s *baton = NULL;
 
+  /* At the beginning, we don't know whether unique_path will need 
+     UTF8 conversion */
+  svn_boolean_t needs_utf8_conversion = TRUE;
+
   SVN_ERR_ASSERT(file || unique_path);
 
   if (dirpath == NULL)
@@ -374,6 +378,11 @@ svn_io_open_uniquely_named(apr_file_t **
       if (delete_when == svn_io_file_del_on_close)
         flag |= APR_DELONCLOSE;
 
+      /* Increase the chance that rand() will return something truely
+         independent from what others get or do. */
+      if (i == 2)
+        srand(apr_time_now());
+
       /* Special case the first attempt -- if we can avoid having a
          generated numeric portion at all, that's best.  So first we
          try with just the suffix; then future tries add a number
@@ -384,17 +393,35 @@ svn_io_open_uniquely_named(apr_file_t **
          This is good, since "1" would misleadingly imply that
          the second attempt was actually the first... and if someone's
          got conflicts on their conflicts, we probably don't want to
-         add to their confusion :-). */
+         add to their confusion :-). 
+
+         Also, the randomization used to minimize the number of re-try 
+         cycles will interfere with certain tests that compare working
+         copies etc.
+       */
       if (i == 1)
         unique_name = apr_psprintf(scratch_pool, "%s%s", path, suffix);
       else
-        unique_name = apr_psprintf(scratch_pool, "%s.%u%s", path, i, suffix);
+        unique_name = apr_psprintf(scratch_pool, "%s.%u_%x%s", path, i, rand(), suffix);
 
       /* Hmmm.  Ideally, we would append to a native-encoding buf
          before starting iteration, then convert back to UTF-8 for
          return. But I suppose that would make the appending code
          sensitive to i18n in a way it shouldn't be... Oh well. */
-      SVN_ERR(cstring_from_utf8(&unique_name_apr, unique_name, scratch_pool));
+      if (needs_utf8_conversion)
+        {
+          SVN_ERR(cstring_from_utf8(&unique_name_apr, unique_name, scratch_pool));
+          if (i == 1)
+            {
+              /* The variable parts of unique_name will not require UTF8
+                 conversion. Therefore, if UTF8 conversion had no effect
+                 on it in the first iteration, it won't require conversion
+                 in any future interation. */
+              needs_utf8_conversion = strcmp(unique_name_apr, unique_name);
+            }
+        }
+      else
+        unique_name_apr = unique_name;
 
       apr_err = file_open(&try_file, unique_name_apr, flag,
                           APR_OS_DEFAULT, FALSE, result_pool);



Re: svn commit: r1022707 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Sat, Oct 16, 2010 at 12:54 PM, Hyrum K. Wright
<hy...@mail.utexas.edu> wrote:
> On Fri, Oct 15, 2010 at 10:27 AM, Bert Huijben <be...@qqmail.nl> wrote:
>>
>>
>>> -----Original Message-----
>>> From: Stefan Sperling [mailto:stsp@elego.de]
>>> Sent: donderdag 14 oktober 2010 23:39
>>> To: dev@subversion.apache.org
>>> Subject: Re: svn commit: r1022707 - in /subversion/trunk: ./
>>> subversion/libsvn_subr/io.c
>>>
>>> On Thu, Oct 14, 2010 at 09:00:44PM -0000, hwright@apache.org wrote:
>>> > Author: hwright
>>> > Date: Thu Oct 14 21:00:43 2010
>>> > New Revision: 1022707
>>> >
>>> > URL: http://svn.apache.org/viewvc?rev=1022707&view=rev
>>> > Log:
>>> > Merge r985472 from the performance branch, which see for more
>>> information.
>>> >
>>> > This revision randomizes the numerical suffix of temporary files, in an
>>> effort
>>> > to make finding one easier.
>>
>>
>>> > @@ -384,17 +393,35 @@ svn_io_open_uniquely_named(apr_file_t **
>>> >           This is good, since "1" would misleadingly imply that
>>> >           the second attempt was actually the first... and if someone's
>>> >           got conflicts on their conflicts, we probably don't want to
>>> > -         add to their confusion :-). */
>>> > +         add to their confusion :-).
>>> > +
>>> > +         Also, the randomization used to minimize the number of re-try
>>> > +         cycles will interfere with certain tests that compare working
>>> > +         copies etc.
>>> > +       */
>>> >        if (i == 1)
>>> >          unique_name = apr_psprintf(scratch_pool, "%s%s", path, suffix);
>>> >        else
>>> > -        unique_name = apr_psprintf(scratch_pool, "%s.%u%s", path, i,
>> suffix);
>>> > +        unique_name = apr_psprintf(scratch_pool, "%s.%u_%x%s", path, i,
>>> rand(), suffix);
>>>
>>> -1 on using rand() here, for several reasons:
>>>
>>> The intent of svn_io_open_uniquely_named() is to provide user-facing
>>> temporary files. The order of numbers actually carry meaning in some
>> cases.
>>> With files named svn-commit.tmp, svn-commit.2.tmp, svn-commit.3.tmp,
>>> users
>>> can easily tell which one is the newest by looking at the number.
>>> Names with random numbers don't provide this information.
>>>
>>> We already have a different function to create truely randomly-named
>>> temporary files, svn_io_open_unique_file3(). It should be used instead
>>> of svn_io_open_uniquely_named() wherever doing so doesn't hurt user
>>> convenience (i.e. for any tempfile a user will never directly work with).
>>>
>>> Also, rand() isn't thread-safe, and this is library code.
>>
>> +1 on this reasoning and the -1.
>>  This merge should be reverted and performance critical code should use
>> svn_io_open_unique_file3() instead of this function.
>> (I think we already did this in most cases. In 1.6 this function was a
>> severe performance problem)
>
> There are parts to the merge which are still useful, so I propose that
> instead of reverting the merge, we make the appropriate reversions on
> the performance branch, and then merge that revision to trunk.

Change made on the branch in r1025660, and merged to trunk in r1025668.

-Hyrum

Re: svn commit: r1022707 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

Posted by "Hyrum K. Wright" <hy...@mail.utexas.edu>.
On Fri, Oct 15, 2010 at 10:27 AM, Bert Huijben <be...@qqmail.nl> wrote:
>
>
>> -----Original Message-----
>> From: Stefan Sperling [mailto:stsp@elego.de]
>> Sent: donderdag 14 oktober 2010 23:39
>> To: dev@subversion.apache.org
>> Subject: Re: svn commit: r1022707 - in /subversion/trunk: ./
>> subversion/libsvn_subr/io.c
>>
>> On Thu, Oct 14, 2010 at 09:00:44PM -0000, hwright@apache.org wrote:
>> > Author: hwright
>> > Date: Thu Oct 14 21:00:43 2010
>> > New Revision: 1022707
>> >
>> > URL: http://svn.apache.org/viewvc?rev=1022707&view=rev
>> > Log:
>> > Merge r985472 from the performance branch, which see for more
>> information.
>> >
>> > This revision randomizes the numerical suffix of temporary files, in an
>> effort
>> > to make finding one easier.
>
>
>> > @@ -384,17 +393,35 @@ svn_io_open_uniquely_named(apr_file_t **
>> >           This is good, since "1" would misleadingly imply that
>> >           the second attempt was actually the first... and if someone's
>> >           got conflicts on their conflicts, we probably don't want to
>> > -         add to their confusion :-). */
>> > +         add to their confusion :-).
>> > +
>> > +         Also, the randomization used to minimize the number of re-try
>> > +         cycles will interfere with certain tests that compare working
>> > +         copies etc.
>> > +       */
>> >        if (i == 1)
>> >          unique_name = apr_psprintf(scratch_pool, "%s%s", path, suffix);
>> >        else
>> > -        unique_name = apr_psprintf(scratch_pool, "%s.%u%s", path, i,
> suffix);
>> > +        unique_name = apr_psprintf(scratch_pool, "%s.%u_%x%s", path, i,
>> rand(), suffix);
>>
>> -1 on using rand() here, for several reasons:
>>
>> The intent of svn_io_open_uniquely_named() is to provide user-facing
>> temporary files. The order of numbers actually carry meaning in some
> cases.
>> With files named svn-commit.tmp, svn-commit.2.tmp, svn-commit.3.tmp,
>> users
>> can easily tell which one is the newest by looking at the number.
>> Names with random numbers don't provide this information.
>>
>> We already have a different function to create truely randomly-named
>> temporary files, svn_io_open_unique_file3(). It should be used instead
>> of svn_io_open_uniquely_named() wherever doing so doesn't hurt user
>> convenience (i.e. for any tempfile a user will never directly work with).
>>
>> Also, rand() isn't thread-safe, and this is library code.
>
> +1 on this reasoning and the -1.
>  This merge should be reverted and performance critical code should use
> svn_io_open_unique_file3() instead of this function.
> (I think we already did this in most cases. In 1.6 this function was a
> severe performance problem)

There are parts to the merge which are still useful, so I propose that
instead of reverting the merge, we make the appropriate reversions on
the performance branch, and then merge that revision to trunk.

As an aside, it would be easier if people reviewed these changes on
the branch and commented in the STATUS file, rather than wait until
the merge actually happens. :/

-Hyrum

RE: svn commit: r1022707 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

Posted by Bert Huijben <be...@qqmail.nl>.

> -----Original Message-----
> From: Stefan Sperling [mailto:stsp@elego.de]
> Sent: donderdag 14 oktober 2010 23:39
> To: dev@subversion.apache.org
> Subject: Re: svn commit: r1022707 - in /subversion/trunk: ./
> subversion/libsvn_subr/io.c
> 
> On Thu, Oct 14, 2010 at 09:00:44PM -0000, hwright@apache.org wrote:
> > Author: hwright
> > Date: Thu Oct 14 21:00:43 2010
> > New Revision: 1022707
> >
> > URL: http://svn.apache.org/viewvc?rev=1022707&view=rev
> > Log:
> > Merge r985472 from the performance branch, which see for more
> information.
> >
> > This revision randomizes the numerical suffix of temporary files, in an
> effort
> > to make finding one easier.


> > @@ -384,17 +393,35 @@ svn_io_open_uniquely_named(apr_file_t **
> >           This is good, since "1" would misleadingly imply that
> >           the second attempt was actually the first... and if someone's
> >           got conflicts on their conflicts, we probably don't want to
> > -         add to their confusion :-). */
> > +         add to their confusion :-).
> > +
> > +         Also, the randomization used to minimize the number of re-try
> > +         cycles will interfere with certain tests that compare working
> > +         copies etc.
> > +       */
> >        if (i == 1)
> >          unique_name = apr_psprintf(scratch_pool, "%s%s", path, suffix);
> >        else
> > -        unique_name = apr_psprintf(scratch_pool, "%s.%u%s", path, i,
suffix);
> > +        unique_name = apr_psprintf(scratch_pool, "%s.%u_%x%s", path, i,
> rand(), suffix);
> 
> -1 on using rand() here, for several reasons:
> 
> The intent of svn_io_open_uniquely_named() is to provide user-facing
> temporary files. The order of numbers actually carry meaning in some
cases.
> With files named svn-commit.tmp, svn-commit.2.tmp, svn-commit.3.tmp,
> users
> can easily tell which one is the newest by looking at the number.
> Names with random numbers don't provide this information.
> 
> We already have a different function to create truely randomly-named
> temporary files, svn_io_open_unique_file3(). It should be used instead
> of svn_io_open_uniquely_named() wherever doing so doesn't hurt user
> convenience (i.e. for any tempfile a user will never directly work with).
> 
> Also, rand() isn't thread-safe, and this is library code.

+1 on this reasoning and the -1.
 This merge should be reverted and performance critical code should use
svn_io_open_unique_file3() instead of this function.
(I think we already did this in most cases. In 1.6 this function was a
severe performance problem)

	Bert

Re: svn commit: r1022707 - in /subversion/trunk: ./ subversion/libsvn_subr/io.c

Posted by Stefan Sperling <st...@elego.de>.
On Thu, Oct 14, 2010 at 09:00:44PM -0000, hwright@apache.org wrote:
> Author: hwright
> Date: Thu Oct 14 21:00:43 2010
> New Revision: 1022707
> 
> URL: http://svn.apache.org/viewvc?rev=1022707&view=rev
> Log:
> Merge r985472 from the performance branch, which see for more information.
> 
> This revision randomizes the numerical suffix of temporary files, in an effort
> to make finding one easier.
> 
> Modified:
>     subversion/trunk/   (props changed)
>     subversion/trunk/subversion/libsvn_subr/io.c
> 
> Propchange: subversion/trunk/
> ------------------------------------------------------------------------------
> --- svn:mergeinfo (original)
> +++ svn:mergeinfo Thu Oct 14 21:00:43 2010
> @@ -23,7 +23,7 @@
>  /subversion/branches/log-g-performance:870941-871032
>  /subversion/branches/merge-skips-obstructions:874525-874615
>  /subversion/branches/nfc-nfd-aware-client:870276,870376
> -/subversion/branches/performance:982355,983764,983766,984927,985014,985037,985046,985477,985669,987888,987893,995507,995603,1001413
> +/subversion/branches/performance:982355,983764,983766,984927,985014,985037,985046,985472,985477,985669,987888,987893,995507,995603,1001413
>  /subversion/branches/ra_serf-digest-authn:875693-876404
>  /subversion/branches/reintegrate-improvements:873853-874164
>  /subversion/branches/subtree-mergeinfo:876734-878766
> 
> Modified: subversion/trunk/subversion/libsvn_subr/io.c
> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_subr/io.c?rev=1022707&r1=1022706&r2=1022707&view=diff
> ==============================================================================
> --- subversion/trunk/subversion/libsvn_subr/io.c (original)
> +++ subversion/trunk/subversion/libsvn_subr/io.c Thu Oct 14 21:00:43 2010
> @@ -334,6 +334,10 @@ svn_io_open_uniquely_named(apr_file_t **
>    unsigned int i;
>    struct temp_file_cleanup_s *baton = NULL;
>  
> +  /* At the beginning, we don't know whether unique_path will need 
> +     UTF8 conversion */
> +  svn_boolean_t needs_utf8_conversion = TRUE;
> +
>    SVN_ERR_ASSERT(file || unique_path);
>  
>    if (dirpath == NULL)
> @@ -374,6 +378,11 @@ svn_io_open_uniquely_named(apr_file_t **
>        if (delete_when == svn_io_file_del_on_close)
>          flag |= APR_DELONCLOSE;
>  
> +      /* Increase the chance that rand() will return something truely
> +         independent from what others get or do. */
> +      if (i == 2)
> +        srand(apr_time_now());
> +
>        /* Special case the first attempt -- if we can avoid having a
>           generated numeric portion at all, that's best.  So first we
>           try with just the suffix; then future tries add a number
> @@ -384,17 +393,35 @@ svn_io_open_uniquely_named(apr_file_t **
>           This is good, since "1" would misleadingly imply that
>           the second attempt was actually the first... and if someone's
>           got conflicts on their conflicts, we probably don't want to
> -         add to their confusion :-). */
> +         add to their confusion :-). 
> +
> +         Also, the randomization used to minimize the number of re-try 
> +         cycles will interfere with certain tests that compare working
> +         copies etc.
> +       */
>        if (i == 1)
>          unique_name = apr_psprintf(scratch_pool, "%s%s", path, suffix);
>        else
> -        unique_name = apr_psprintf(scratch_pool, "%s.%u%s", path, i, suffix);
> +        unique_name = apr_psprintf(scratch_pool, "%s.%u_%x%s", path, i, rand(), suffix);

-1 on using rand() here, for several reasons:

The intent of svn_io_open_uniquely_named() is to provide user-facing
temporary files. The order of numbers actually carry meaning in some cases.
With files named svn-commit.tmp, svn-commit.2.tmp, svn-commit.3.tmp, users
can easily tell which one is the newest by looking at the number.
Names with random numbers don't provide this information.

We already have a different function to create truely randomly-named
temporary files, svn_io_open_unique_file3(). It should be used instead
of svn_io_open_uniquely_named() wherever doing so doesn't hurt user
convenience (i.e. for any tempfile a user will never directly work with).

Also, rand() isn't thread-safe, and this is library code.

>  
>        /* Hmmm.  Ideally, we would append to a native-encoding buf
>           before starting iteration, then convert back to UTF-8 for
>           return. But I suppose that would make the appending code
>           sensitive to i18n in a way it shouldn't be... Oh well. */
> -      SVN_ERR(cstring_from_utf8(&unique_name_apr, unique_name, scratch_pool));
> +      if (needs_utf8_conversion)
> +        {
> +          SVN_ERR(cstring_from_utf8(&unique_name_apr, unique_name, scratch_pool));
> +          if (i == 1)
> +            {
> +              /* The variable parts of unique_name will not require UTF8
> +                 conversion. Therefore, if UTF8 conversion had no effect
> +                 on it in the first iteration, it won't require conversion
> +                 in any future interation. */

This part is nice though, good catch.

Stefan

> +              needs_utf8_conversion = strcmp(unique_name_apr, unique_name);
> +            }
> +        }
> +      else
> +        unique_name_apr = unique_name;
>  
>        apr_err = file_open(&try_file, unique_name_apr, flag,
>                            APR_OS_DEFAULT, FALSE, result_pool);
>