You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Matt Doran <ma...@bigpond.com> on 2004/08/12 22:41:25 UTC

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

> On Sun, 8 Aug 2004, Brane wrote: 
>>>
>>>SVN 1.0.6: ~ 4 secs
>>>SVN 1.1rc1: ~ 38 secs !!
>>>
>>>I used the Filemon utility from sysinternals to try to determine what it
>>>happening. It looks like 1.1 excessively hits 'iconv' files. (e.g.
>>>windows-1252.so, utf-8.so, etc)
>>>
>>>
>>I did some more checking in GDB. It appears that apr_xlate_open gets
>>called 4312 times out of 17204 calls to get_xlate_handle (in
>>libsvn_subr/utf.c).
>>This was during a run of "svn st" on trunk. This caching strategy needs to
>>be improved.
>>
>>
>That's a 75% hit rate, which -- given our constraints -- isn't all that
>bad. The problem is that we can only safely cache the xlate handle in
>the current pool. If that happens to be a subpool in some loop -- well,
>tough luck.
>
>>I don't know why it is faster on 1.0, though.
>>
>>
>We probably fixed some pool usage bugs in the meantime, and those fixes
>happened to narrow the scopes of some xlate handles.

So is that the answer on this one?  There's nothing that can be done?

I've just installed the TortoiseSVN 1.1RC (built from svn 1.1RC2) and it's
showing the same behaviour.  It takes 15-20 times longer to do a
status/commit on a large repository compared to 1.0.6.  For a WC for a repo
with about 5000 files it takes ~40 secs at 100% CPU on a P4 1.8G.  Not very
pretty, especially when it used to only take about 4 secs with 1.0.6.

FileMon still shows very aggressive reading of the iconv files.

Is anyone else out there experiencing this type of perf degradation on
windows?   Could this be something peculiar on my box?

Thanks,
Matt

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Greg Hudson <gh...@MIT.EDU>.
Looks good to me.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Philip Martin <ph...@codematters.co.uk>.
Greg Hudson <gh...@MIT.EDU> writes:

> On Thu, 2004-08-12 at 19:01, Philip Martin wrote:
>> Does it make sense for to look for a suitable xlate handle in the
>> parent pools before falling back and creating a new one in the current
>> pool?  If such a handle is found it could be stored in the current
>> pool so that the search would only occur once per pool.
>
> Yes, I proposed the same thing at some point, though perhaps only over
> IRC.

Something like

Index: subversion/libsvn_subr/utf.c
===================================================================
--- subversion/libsvn_subr/utf.c	(revision 10603)
+++ subversion/libsvn_subr/utf.c	(working copy)
@@ -54,11 +54,19 @@
   /* If we already have a handle, just return it. */
   if (userdata_key)
     {
-      apr_pool_userdata_get (&old_handle, userdata_key, pool);
-      if (old_handle != NULL)
+      apr_pool_t *search_pool = pool;
+      while (search_pool)
         {
-          *ret = old_handle;
-          return SVN_NO_ERROR;
+          apr_pool_userdata_get (&old_handle, userdata_key, search_pool);
+          if (old_handle != NULL)
+            {
+              *ret = old_handle;
+              if (search_pool != pool)
+                apr_pool_userdata_set (*ret, userdata_key,
+                                       apr_pool_cleanup_null, pool);
+              return SVN_NO_ERROR;
+            }
+          search_pool = apr_pool_get_parent (search_pool);
         }
     }
 
> That, possibly in addition to a few strategic calls to a new
> function which deliberately places an xlate handle in a pool, would
> probably solve our problem.

I suspect svn_opt_args_to_target_array will be sufficient for the
command line client.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sun, 15 Aug 2004, Greg Hudson wrote:

> On Sun, 2004-08-15 at 16:41, Peter N. Lundblad wrote:
> > I don't see a way to improve the performance via caching in pools. If we
> > want to fix this, I think we need to add init functions and be able to
> > detect if they were called. If this is problematic for 1.1 ( to which I
> > might agree), then we won't fix this until 1.2. This will hurt TSVN users
> > I think and that's bad because that's an important application on Windows.
>
> I agree that we should solve this problem for 1.1, even at a high cost.
>
> Let's introduce an optional initializer, but let's make it very limited
> and specific.  "void svn_utf_initialize(void);".  We acknowledge that

I want to introduce a pool argument since I need it for initializing TSD,
requiring that the pool lives as long as any of the UTF functions are used
(or better introduce svn_utf_terminate to be safe).

//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Issue 2016, next try (was Re: Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2))

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Thu, 26 Aug 2004, Greg Hudson wrote:

> Your docstring for svn_utf_initialize says:
>
> +/** @since New in 1.1.
> + * Initialize the utf8 encoding/decoding routines.
> + * NOTE: It is optional to call this function, but if it is used, no other
> + * svn_utf function may be in use during the call of this function.
> + * Initializing the UTF-8 routines will improve performance.
> + */
> +void svn_utf_initialize (apr_pool_t *pool);
>
> There was a reason I said "no Subversion library function may be in use"
> when I described the constraint; it's clearer to describe it that way.
> Almost any Subversion library function is liable to use svn_utf
> functions.
>
I changed it to require that no svn function may be active during this
call. So you can create a pool before this call, which is important for
cleanup order and performance.

> If you're going to take a pool parameter, document that the pool must
> live for as long as Subversion library functions are in use.
> (Another option is to create a pool from scratch like we do in
> svn_error_create().  But on consideration, I think that's a bad idea,
> since that pool could never be cleaned up.)
>
OK. The docstring wasn't ready yet. Really, I wanted to get feedback on
the approach more than the exact patch, but I could have been clearer on
that... I removed the pool, since we need to create a subpool anyway,
which we can protect with the mutex. Pools created with no parent are
really owned by an internal pool in APR which will be destroyed during
apr_terminate. The patch works even if something uses UTF-8 translations
after the cache is cleaned up, but then without the cache ofcourse. I'm
assuming that all other threads are finished when apr_terminate gets
called. This must be required anyway...

 > +  if (utf_initialized++ == 0)
>
> If you're treating utf_initialized as a flag, then it's stylistically
> confusing to operate on it like it's a counter.  APR treats its
> initialization static as a counter because it wants to count
> apr_terminate() calls to know when it should clean up.
>
You're right. I removed the flag completely and use the hash table pointer
as a flag instead.

> I'll review more thoroughly when you've written a log message.  It's
> good to do that before asking for review.
>
You're right again. As I said, I asked more about the concept, but
expecting you to read my mind might be to require too much:-)

This time, it would be nice with some review. This got complicated, at
least for me...

Thanks,
//Peter

Re: [PATCH] Issue 2016 (was Re: Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2))

Posted by Greg Hudson <gh...@MIT.EDU>.
Your docstring for svn_utf_initialize says:

+/** @since New in 1.1.
+ * Initialize the utf8 encoding/decoding routines.
+ * NOTE: It is optional to call this function, but if it is used, no other
+ * svn_utf function may be in use during the call of this function.
+ * Initializing the UTF-8 routines will improve performance.
+ */
+void svn_utf_initialize (apr_pool_t *pool);

There was a reason I said "no Subversion library function may be in use"
when I described the constraint; it's clearer to describe it that way. 
Almost any Subversion library function is liable to use svn_utf
functions.

If you're going to take a pool parameter, document that the pool must
live for as long as Subversion library functions are in use.  
(Another option is to create a pool from scratch like we do in
svn_error_create().  But on consideration, I think that's a bad idea,
since that pool could never be cleaned up.)

+  if (utf_initialized++ == 0)

If you're treating utf_initialized as a flag, then it's stylistically
confusing to operate on it like it's a counter.  APR treats its
initialization static as a counter because it wants to count
apr_terminate() calls to know when it should clean up.

I'll review more thoroughly when you've written a log message.  It's
good to do that before asking for review.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

[PATCH] Issue 2016 (was Re: Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2))

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Tue, 17 Aug 2004, Philip Martin wrote:

> "Peter N. Lundblad" <pe...@famlundblad.se> writes:
>
> > - Have a global hash with mappings from handle name (identifying the
> > translation) to a linked list of xlate handles.
> > - When you want to translate:
> > . Lock a mutex.
> > . Look for a handle in the hash. If one is found, remove it from the list
> > and use it.
> > . If no handle was found, create one.
> > . Unlock the mutex
> > . Use the handle.
> > . Lock, put the handle back into the list and unlock.
>
> Just make sure you add/remove from the list without allocating memory,
> > This avoids having the lock held during the translation. It will create as
> > many handles as is needed simultaneously and destroy them at program
> > termination.
> >
> > It would be simpler to just have one handle per combination of to/from
> > encoding, but as Greg pointed out, this might give to much contention on
> > the translation lock, especially since we do a lot of translations during
> > our WC crawling commands.
>
> It's notoriously difficult to predict thread contention accurately.
> However, instinctively I prefer your other design, the one that
> provides per-thread xlate handles, assuming it can be implemented
> correctly.
>
OK. Here is the patch. It keeps the lock during apr_xlate_open, but that
call should be very rare compared to the number of translations we do. It
isn't ready yet, but quite so and I'd like feedback.

Regards,
//Peter

Re: [PATCH] Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Philip Martin <ph...@codematters.co.uk>.
"Peter N. Lundblad" <pe...@famlundblad.se> writes:

> - Have a global hash with mappings from handle name (identifying the
> translation) to a linked list of xlate handles.
> - When you want to translate:
> . Lock a mutex.
> . Look for a handle in the hash. If one is found, remove it from the list
> and use it.
> . If no handle was found, create one.
> . Unlock the mutex
> . Use the handle.
> . Lock, put the handle back into the list and unlock.

Just make sure you add/remove from the list without allocating memory,
otherwise the memory use will grow without bound.  Allocating memory
per handle is OK, allocating memory per use of a handle is not.

> This avoids having the lock held during the translation. It will create as
> many handles as is needed simultaneously and destroy them at program
> termination.
>
> It would be simpler to just have one handle per combination of to/from
> encoding, but as Greg pointed out, this might give to much contention on
> the translation lock, especially since we do a lot of translations during
> our WC crawling commands.

It's notoriously difficult to predict thread contention accurately.
However, instinctively I prefer your other design, the one that
provides per-thread xlate handles, assuming it can be implemented
correctly.

> Comments/flames/... Maybe we just hve to give up on this one?

One report had "svn st" more then 10x slower on a large wc, that's
1.0.x compared to 1.1.x on win32.  I'm glad I don't use Windows!

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
Hi,

Here is another idea for solving this caching problem.

WARNING, WARNING, WARNING: I am starting to feel insane solving this one.
It's a bad regressions, but we have to stop somewhere.

- Have a global hash with mappings from handle name (identifying the
translation) to a linked list of xlate handles.
- When you want to translate:
. Lock a mutex.
. Look for a handle in the hash. If one is found, remove it from the list
and use it.
. If no handle was found, create one.
. Unlock the mutex
. Use the handle.
. Lock, put the handle back into the list and unlock.

This avoids having the lock held during the translation. It will create as
many handles as is needed simultaneously and destroy them at program
termination.

It would be simpler to just have one handle per combination of to/from
encoding, but as Greg pointed out, this might give to much contention on
the translation lock, especially since we do a lot of translations during
our WC crawling commands.

Comments/flames/... Maybe we just hve to give up on this one?

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Wed, 18 Aug 2004, Philip Martin wrote:

> Philip Martin <ph...@codematters.co.uk> writes:
>
> > "Peter N. Lundblad" <pe...@famlundblad.se> writes:
> >
> >>> > +              handle_pool = svn_pool_create (NULL);
> >>>
> >>> This creates a pool in every thread, but I don't see the pool getting
> >>> cleared or destroyed.  An application that creates threads continually
> >>> (e.g. threaded svnserve) will cause the number of pools to grow
> >>> without bound.
> >>>
> >> xlate_cleanup will take care of that. It is called when the thread is
> >> destoyed.
> >
> > Unless you are planning to write some more code I don't understand.
> > xlate_cleanup appears to be applied to the pool passed to
> > svn_utf_initialize.  That pool could well have a lifetime much longer
> > than the thread lifetime, typically I would expect the pool to last as
> > long as the application runs.
>
> I see now, it's a thread cleanup not a pool cleanup.  That would work
> if APR were not broken as you point out below.
>
Then I understand your confusion...

> >> Problems is, when I check the implementation of
> >> apr_threadkey_private_create, the destructor is only used on POSIX and
> >> netware systems, not on Windows/OS2. So on these platforms, it is a leak
> >> after all. Maybe there is no way to register cleanups on these
> >> platforms,. This should have been documented in the APR headers!!!
>
> That does look broken, it should probably be brought up on the dev@apr
> list.
>
> Without a working thread destructor it will be very hard to use a
> thread private solution to solve the caching problem.
>
It will be hard to use thread-specific storage from a library in general,
or from an application when you don't have control of the thread
creation/termination. I don't know, however, if it is possible to get the
destructors called on Win32. The Win32 TLS doesn't have destructors.
POssibly you could solve it in DllMain if APR is linked dynamically.
Anyway, this is an APR question, not svn...

 > >>
> >>> > +              handle_hash = apr_hash_make (pool);
> >>> > +              if (apr_threadkey_private_set (handle_hash, xlate_handle_key)
> >>>
> >>> If pool is cleared handle_hash will become invalid, but will still be
> >>> referenced by the thread private data.  The next time the thread calls
> >>> get_xlate_handle it will crash, or do something worse.
> >>>
> >> When the pool is cleared, the thread is dying, so the TSD for that thread
> >> isn't used again. This would be true where the pool is destroyed at all...
> >
> > Again I don't understand.  pool is an argument to one of the
> > svn_utf_xxx_(to/from)_xxx functions, it could well have a lifetime
> > much shorter than the thread and may well get repeatedly cleared.
>
> Did you mean to allocate handle_hash from handle_pool rather than
> pool?  Then the hash would have the correct lifetime.  It's all a bit
> academic if thread destructors don't work on win32.
>
Yes and yes. I noticed that error later.

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Philip Martin <ph...@codematters.co.uk>.
Philip Martin <ph...@codematters.co.uk> writes:

> "Peter N. Lundblad" <pe...@famlundblad.se> writes:
>
>>> > +              handle_pool = svn_pool_create (NULL);
>>>
>>> This creates a pool in every thread, but I don't see the pool getting
>>> cleared or destroyed.  An application that creates threads continually
>>> (e.g. threaded svnserve) will cause the number of pools to grow
>>> without bound.
>>>
>> xlate_cleanup will take care of that. It is called when the thread is
>> destoyed.
>
> Unless you are planning to write some more code I don't understand.
> xlate_cleanup appears to be applied to the pool passed to
> svn_utf_initialize.  That pool could well have a lifetime much longer
> than the thread lifetime, typically I would expect the pool to last as
> long as the application runs.

I see now, it's a thread cleanup not a pool cleanup.  That would work
if APR were not broken as you point out below.

>> Problems is, when I check the implementation of
>> apr_threadkey_private_create, the destructor is only used on POSIX and
>> netware systems, not on Windows/OS2. So on these platforms, it is a leak
>> after all. Maybe there is no way to register cleanups on these
>> platforms,. This should have been documented in the APR headers!!!

That does look broken, it should probably be brought up on the dev@apr
list.

Without a working thread destructor it will be very hard to use a
thread private solution to solve the caching problem.

>>
>>> > +              handle_hash = apr_hash_make (pool);
>>> > +              if (apr_threadkey_private_set (handle_hash, xlate_handle_key)
>>>
>>> If pool is cleared handle_hash will become invalid, but will still be
>>> referenced by the thread private data.  The next time the thread calls
>>> get_xlate_handle it will crash, or do something worse.
>>>
>> When the pool is cleared, the thread is dying, so the TSD for that thread
>> isn't used again. This would be true where the pool is destroyed at all...
>
> Again I don't understand.  pool is an argument to one of the
> svn_utf_xxx_(to/from)_xxx functions, it could well have a lifetime
> much shorter than the thread and may well get repeatedly cleared.

Did you mean to allocate handle_hash from handle_pool rather than
pool?  Then the hash would have the correct lifetime.  It's all a bit
academic if thread destructors don't work on win32.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Philip Martin <ph...@codematters.co.uk>.
"Peter N. Lundblad" <pe...@famlundblad.se> writes:

>> > +              handle_pool = svn_pool_create (NULL);
>>
>> This creates a pool in every thread, but I don't see the pool getting
>> cleared or destroyed.  An application that creates threads continually
>> (e.g. threaded svnserve) will cause the number of pools to grow
>> without bound.
>>
> xlate_cleanup will take care of that. It is called when the thread is
> destoyed.

Unless you are planning to write some more code I don't understand.
xlate_cleanup appears to be applied to the pool passed to
svn_utf_initialize.  That pool could well have a lifetime much longer
than the thread lifetime, typically I would expect the pool to last as
long as the application runs.

> Problems is, when I check the implementation of
> apr_threadkey_private_create, the destructor is only used on POSIX and
> netware systems, not on Windows/OS2. So on these platforms, it is a leak
> after all. Maybe there is no way to register cleanups on these
> platforms,. This should have been documented in the APR headers!!!
>
>> > +              handle_hash = apr_hash_make (pool);
>> > +              if (apr_threadkey_private_set (handle_hash, xlate_handle_key)
>>
>> If pool is cleared handle_hash will become invalid, but will still be
>> referenced by the thread private data.  The next time the thread calls
>> get_xlate_handle it will crash, or do something worse.
>>
> When the pool is cleared, the thread is dying, so the TSD for that thread
> isn't used again. This would be true where the pool is destroyed at all...

Again I don't understand.  pool is an argument to one of the
svn_utf_xxx_(to/from)_xxx functions, it could well have a lifetime
much shorter than the thread and may well get repeatedly cleared.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.

On Tue, 17 Aug 2004, Philip Martin wrote:

> "Peter N. Lundblad" <pe...@famlundblad.se> writes:
>
> > +void
> > +svn_utf_initialize(apr_pool_t *pool)
> > +{
> > +#if APR_HAS_THREADS
> > +  apr_threadkey_t *key;
> > +  if (utf_initialized++ == 0)
> > +    {
> > +      if (apr_threadkey_private_create(&key, xlate_cleanup, pool)
> > +          != APR_SUCCESS)
> > +        return;
>
> I don't see a call to apr_threadkey_private_destroy.  Is this a
> resource leak?  It's probably trivial.
>
You're right. It isn't destroyed by the pool (strangely enough), so
apr_threadkey_private_delete needs to be registered in the pool
explicitly.

 > > +      xlate_handle_key = key;
> > +    }
> > +#endif
> > +}
> > +
> >  /* Return an apr_xlate handle for converting from FROMPAGE to
> >     TOPAGE. Create one if it doesn't exist in USERDATA_KEY. If
> >     unable to find a handle, or unable to create one because
> > @@ -50,11 +83,47 @@
> >  {
> >    void *old_handle = NULL;
> >    apr_status_t apr_err;
> > +  apr_pool_t *handle_pool = NULL;
> > +#if APR_HAS_THREADS
> > +  apr_hash_t *handle_hash = NULL;
> > +#endif
> >
> >    /* If we already have a handle, just return it. */
> >    if (userdata_key)
> >      {
> > -      apr_pool_userdata_get (&old_handle, userdata_key, pool);
> > +#if APR_HAS_THREADS
> > +      /* First try our thread private handle. */
> > +      if (xlate_handle_key)
> > +        {
> > +          void *val;
> > +
> > +          if (apr_threadkey_private_get (&val, xlate_handle_key) == APR_SUCCESS)
> > +            {
> > +              handle_hash = val;
> > +            }
> > +
> > +          if (handle_hash)
> > +            {
> > +              old_handle = apr_hash_get(handle_hash, userdata_key,
> > +                                        APR_HASH_KEY_STRING);
> > +              handle_pool = apr_hash_pool_get (handle_hash);
> > +            }
> > +          else
> > +            {
> > +              handle_pool = svn_pool_create (NULL);
>
> This creates a pool in every thread, but I don't see the pool getting
> cleared or destroyed.  An application that creates threads continually
> (e.g. threaded svnserve) will cause the number of pools to grow
> without bound.
>
xlate_cleanup will take care of that. It is called when the thread is
destoyed.

Problems is, when I check the implementation of
apr_threadkey_private_create, the destructor is only used on POSIX and
netware systems, not on Windows/OS2. So on these platforms, it is a leak
after all. Maybe there is no way to register cleanups on these
platforms,. This should have been documented in the APR headers!!!

> > +              handle_hash = apr_hash_make (pool);
> > +              if (apr_threadkey_private_set (handle_hash, xlate_handle_key)
>
> If pool is cleared handle_hash will become invalid, but will still be
> referenced by the thread private data.  The next time the thread calls
> get_xlate_handle it will crash, or do something worse.
>
When the pool is cleared, the thread is dying, so the TSD for that thread
isn't used again. This would be true where the pool is destroyed at all...


Thanks for the review. Currently, I don't know how to get further.
Creating a set of xlate handles per thread is a leak, since there is no
way to clean it up. Caching the handles in parent pools doesn't work,
since that's not thread-safe. Well... Anyone has an idea??? Maybe we could
check if it is possible to decrease the number of translations by some
factor? I'll give it a try.

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: [PATCH] Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Philip Martin <ph...@codematters.co.uk>.
"Peter N. Lundblad" <pe...@famlundblad.se> writes:

> +void
> +svn_utf_initialize(apr_pool_t *pool)
> +{
> +#if APR_HAS_THREADS
> +  apr_threadkey_t *key;
> +  if (utf_initialized++ == 0)
> +    {
> +      if (apr_threadkey_private_create(&key, xlate_cleanup, pool)
> +          != APR_SUCCESS)
> +        return;

I don't see a call to apr_threadkey_private_destroy.  Is this a
resource leak?  It's probably trivial.

> +      xlate_handle_key = key;
> +    }
> +#endif
> +}
> +
>  /* Return an apr_xlate handle for converting from FROMPAGE to
>     TOPAGE. Create one if it doesn't exist in USERDATA_KEY. If
>     unable to find a handle, or unable to create one because
> @@ -50,11 +83,47 @@
>  {
>    void *old_handle = NULL;
>    apr_status_t apr_err;
> +  apr_pool_t *handle_pool = NULL;
> +#if APR_HAS_THREADS
> +  apr_hash_t *handle_hash = NULL;
> +#endif
>  
>    /* If we already have a handle, just return it. */
>    if (userdata_key)
>      {
> -      apr_pool_userdata_get (&old_handle, userdata_key, pool);
> +#if APR_HAS_THREADS
> +      /* First try our thread private handle. */
> +      if (xlate_handle_key)
> +        {
> +          void *val;
> +
> +          if (apr_threadkey_private_get (&val, xlate_handle_key) == APR_SUCCESS)
> +            {
> +              handle_hash = val;
> +            }
> +
> +          if (handle_hash)
> +            {
> +              old_handle = apr_hash_get(handle_hash, userdata_key,
> +                                        APR_HASH_KEY_STRING);
> +              handle_pool = apr_hash_pool_get (handle_hash);
> +            }
> +          else
> +            {
> +              handle_pool = svn_pool_create (NULL);

This creates a pool in every thread, but I don't see the pool getting
cleared or destroyed.  An application that creates threads continually
(e.g. threaded svnserve) will cause the number of pools to grow
without bound.

> +              handle_hash = apr_hash_make (pool);
> +              if (apr_threadkey_private_set (handle_hash, xlate_handle_key)

If pool is cleared handle_hash will become invalid, but will still be
referenced by the thread private data.  The next time the thread calls
get_xlate_handle it will crash, or do something worse.

> +                  != APR_SUCCESS)
> +                {
> +                  handle_hash = NULL;
> +                  handle_pool = NULL;
> +                }
> +            }
> +        }
> +#endif
> +      /* Try what might be cached in the pool. */
> +      if (old_handle == NULL)
> +        apr_pool_userdata_get (&old_handle, userdata_key, pool);
>        if (old_handle != NULL)
>          {
>            *ret = old_handle;

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

[PATCH] Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sun, 15 Aug 2004, Greg Hudson wrote:

> On Sun, 2004-08-15 at 16:41, Peter N. Lundblad wrote:
> > I don't see a way to improve the performance via caching in pools. If we
> > want to fix this, I think we need to add init functions and be able to
> > detect if they were called. If this is problematic for 1.1 ( to which I
> > might agree), then we won't fix this until 1.2. This will hurt TSVN users
> > I think and that's bad because that's an important application on Windows.
>
> Let's introduce an optional initializer, but let's make it very limited
> and specific.  "void svn_utf_initialize(void);".  We acknowledge that
> it's a short-term hack.  It can become a no-op in 1.x and can disappear
> in 2.0.
>
Here is this hack in the form of a patch that could have some review. If
someone is actually using the svn libs in a multi-threaded environment,
please test. I'm running the testsuite right now, and it seems to work.

I kept the original caching in the pool to not hurt performance even more
if svn_utf_initialize wasn't called. An alternative would be to drop it
and tell people to use this new API if they feel svn is too slow. Maybe
this cleanup could wait until we have the real solution in place.

Comments?
//Peter

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sun, 2004-08-15 at 16:41, Peter N. Lundblad wrote:
> But the apr_pool_userdata_{get,set} isn't thread-safe just because we
> protect our uses by a lock. Other callers don't.

Right, sorry.  I was imagining our specific userdata slot could be
protected this way, but of course it can't.

> I don't see a way to improve the performance via caching in pools. If we
> want to fix this, I think we need to add init functions and be able to
> detect if they were called. If this is problematic for 1.1 ( to which I
> might agree), then we won't fix this until 1.2. This will hurt TSVN users
> I think and that's bad because that's an important application on Windows.

I agree that we should solve this problem for 1.1, even at a high cost.

Let's introduce an optional initializer, but let's make it very limited
and specific.  "void svn_utf_initialize(void);".  We acknowledge that
it's a short-term hack.  It can become a no-op in 1.x and can disappear
in 2.0.

I can think of three long-term solutions to free ourselves from this
hack:

  * Add a more general library initialization framework for this sort of
thing.
  * Invent the notion of a Subversion library context, like the
libsvn_client context but more opaque and general, and pass a library
context around to most or all svn library functions.  This is the
"cleanest" solution, but it's also burdensome; callers already have to
pass around pools and exceptions.  (We could fold the pools and/or the
exceptions into the library context to reduce the burden, but I'm not
sure that idea really pans out.)
  * Teach APR to handle statically-initialized onces.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sun, 15 Aug 2004, Greg Hudson wrote:

> On Sun, 2004-08-15 at 10:29, Peter N. Lundblad wrote:
> > But apr_pool_userdata_{get,set} aren't thread-safe, so you can't use them
> > on the same pool in different threads simultanously.
>
> Yeah, I think this is the lynchpin of the problem.
>
> > Oh, had we only had init routines that were required to be called
> > by the user!
>
> What would we do with them?  We could add one, and just say that xlate
> handle caching isn't very good unless you call it.  But adding an
> initializer before 1.1 frightens me, since we wouldn't really have a lot
> of time to consider the API design.
>
I would initialize thread specific data and store the xlate handles there.
Seems like APR detects if initialize has been called by a plain int. We
could do the same, requiring the init function to be called before
anything in other threads use the lib. Then we can detect if the TSD is to
be used.

> (Also, if you're going to use the init routine to initialize a global
> lock, I think that would hurt the MP performance of apps which use svn.
> Right now, if you have N threads performing svn operations, each with a
> totally independent pool, they won't lock against each other.  But if
> you add a global lock in a frequently-used low-level routine, that
> changes.)
>
Correct, so I wouldn't.

> I think I've identified an alternative.  The APR pool code locks pools
> (or rather, locks the allocator which I believe is shared between all
> pools in a tree of pools) like this:
>
...
> All of those functions are public.  So we should be able to do the same
> thing ourselves.  (The allocator mutex could be retrieved again in the
> unlock code, so we could add svn_pool_lock() and svn_pool_unlock() to
> svn_pools.h, although we might not want to do that in the 1.1 backport.)
>
> That gives us a couple of options for storing xlate handles in parent
> pools.  We can either store bare xlate handles and lock around the
...

But the apr_pool_userdata_{get,set} isn't thread-safe just because we
protect our uses by a lock. Other callers don't.

I don't see a way to improve the performance via caching in pools. If we
want to fix this, I think we need to add init functions and be able to
detect if they were called. If this is problematic for 1.1 ( to which I
might agree), then we won't fix this until 1.2. This will hurt TSVN users
I think and that's bad because that's an important application on Windows.

//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sun, 2004-08-15 at 10:29, Peter N. Lundblad wrote:
> But apr_pool_userdata_{get,set} aren't thread-safe, so you can't use them
> on the same pool in different threads simultanously.

Yeah, I think this is the lynchpin of the problem.

> Oh, had we only had init routines that were required to be called
> by the user!

What would we do with them?  We could add one, and just say that xlate
handle caching isn't very good unless you call it.  But adding an
initializer before 1.1 frightens me, since we wouldn't really have a lot
of time to consider the API design.

(Also, if you're going to use the init routine to initialize a global
lock, I think that would hurt the MP performance of apps which use svn. 
Right now, if you have N threads performing svn operations, each with a
totally independent pool, they won't lock against each other.  But if
you add a global lock in a frequently-used low-level routine, that
changes.)

I think I've identified an alternative.  The APR pool code locks pools
(or rather, locks the allocator which I believe is shared between all
pools in a tree of pools) like this:

#if APR_HAS_THREADS
    apr_thread_mutex_t *mutex;

    mutex = apr_allocator_mutex_get(allocator);
    if (mutex != NULL)
        apr_thread_mutex_lock(mutex);
#endif /* APR_HAS_THREADS */
  [...]
#if APR_HAS_THREADS
    if (mutex != NULL)
        apr_thread_mutex_unlock(mutex);
#endif

All of those functions are public.  So we should be able to do the same
thing ourselves.  (The allocator mutex could be retrieved again in the
unlock code, so we could add svn_pool_lock() and svn_pool_unlock() to
svn_pools.h, although we might not want to do that in the 1.1 backport.)

That gives us a couple of options for storing xlate handles in parent
pools.  We can either store bare xlate handles and lock around the
translation we do with them, or we can store <xlate handle, thread id>
pairs and just lock around the retrieval of the xlate handle, ensuring
that we never use a handle in more than one thread.  I'm guessing the
latter will turn out to be easier to implement, although it's a little
harder to think about.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sun, 15 Aug 2004, Travis P wrote:
> On Aug 14, 2004, at 9:54 AM, Peter N. Lundblad wrote:
>
> > On Sat, 14 Aug 2004, Mark Benedetto King wrote:
> >
> >> On Fri, Aug 13, 2004 at 10:50:26PM +0200, Peter N. Lundblad wrote:
> >>> On Fri, 13 Aug 2004, C.A.T.Magic wrote:
> >>>
> >>>>
> >> We don't tell people not to trade pools between threads, so just
> >> because a pool was created by one thread doesn't mean it won't
> >> be in use by another.
> >>
> > I think I see what you mean now:
> > - T1: Create pool p1
> > - T1: Create subpool of p1, p2
> > - T1: start thread T2 and give it p1
> > - T2: use p1 for char xlations
> > - T1: use p2 for xlations, get cached xlate handle from p1 and use,
> > kaboom!
> >
> > Sorry. I'll rethink again.
>
> To defend against that scenario, as a slight variation of the "store
> the current apr_os_thread_t in the pool" approach, what if rather than
> store the thead id with by pool, store the id as part of the
> cache-lookup id for the cached xlate handle.  The thread-safety rule
> then being:  a thread can only use the xlate handle that it cached.
> This may involve the cached handle not always being used and maybe
> being re-looked up and cached independently.  For the common case where
> pools aren't handed around and multiple xlate handles are looked up in
> the same thread (I'm assuming this is most common), then the cached
> xlate handle will be readilty available.
>
But apr_pool_userdata_{get,set} aren't thread-safe, so you can't use them
on the same pool in different threads simultanously.

Oh, had we only had init routines that were required to be called by the
user!

//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Travis P <sv...@castle.fastmail.fm>.
On Aug 14, 2004, at 9:54 AM, Peter N. Lundblad wrote:

> On Sat, 14 Aug 2004, Mark Benedetto King wrote:
>
>> On Fri, Aug 13, 2004 at 10:50:26PM +0200, Peter N. Lundblad wrote:
>>> On Fri, 13 Aug 2004, C.A.T.Magic wrote:
>>>
>>>>
>>>> Peter N. Lundblad wrote:
>>>>> This was discussed earlier on this thread. If a pools parent is in 
>>>>> a
>>>>> different thread, this will not work. You can't use a xlate handle 
>>>>> in
>>>>> different threads at the same time, AFAICK.
>>>>
>>>> maybe you can store store the threadID together with the xlate 
>>>> handle,
>>>> and only use the pool's xlate handle if the threadIDs match?
>>>
>>> That requires locking, since another thread might be setting it 
>>> during the
>>> test.
>>>
>>> But, what about the following approach:
>>> - When a pool is created in svn_pool_create(_ex), store the current
>>> apr_os_thread_t in the pool.
>>> - When we want an xlate handle, search the pool tre upwards until we 
>>> are
>>> in a pool created by another thread or we find a handle. If the pool
>>> wasn't created by our routines, then we will detect that and stop.
>>>
>>
>> We don't tell people not to trade pools between threads, so just
>> because a pool was created by one thread doesn't mean it won't
>> be in use by another.
>>
> I think I see what you mean now:
> - T1: Create pool p1
> - T1: Create subpool of p1, p2
> - T1: start thread T2 and give it p1
> - T2: use p1 for char xlations
> - T1: use p2 for xlations, get cached xlate handle from p1 and use,
> kaboom!
>
> Sorry. I'll rethink again.

To defend against that scenario, as a slight variation of the "store 
the current apr_os_thread_t in the pool" approach, what if rather than 
store the thead id with by pool, store the id as part of the 
cache-lookup id for the cached xlate handle.  The thread-safety rule 
then being:  a thread can only use the xlate handle that it cached.  
This may involve the cached handle not always being used and maybe 
being re-looked up and cached independently.  For the common case where 
pools aren't handed around and multiple xlate handles are looked up in 
the same thread (I'm assuming this is most common), then the cached 
xlate handle will be readilty available.

(Disclaimer:  I haven't programmed with APR pools nor used multilingual 
libraries as Subversion is using.  I've just been following the email 
list.  Apologies if I've the wrong ideas about what is possible in that 
context.)

-Travis


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sat, 14 Aug 2004, Mark Benedetto King wrote:

> On Fri, Aug 13, 2004 at 10:50:26PM +0200, Peter N. Lundblad wrote:
> > On Fri, 13 Aug 2004, C.A.T.Magic wrote:
> >
> > >
> > > Peter N. Lundblad wrote:
> > > > This was discussed earlier on this thread. If a pools parent is in a
> > > > different thread, this will not work. You can't use a xlate handle in
> > > > different threads at the same time, AFAICK.
> > >
> > > maybe you can store store the threadID together with the xlate handle,
> > > and only use the pool's xlate handle if the threadIDs match?
> >
> > That requires locking, since another thread might be setting it during the
> > test.
> >
> > But, what about the following approach:
> > - When a pool is created in svn_pool_create(_ex), store the current
> > apr_os_thread_t in the pool.
> > - When we want an xlate handle, search the pool tre upwards until we are
> > in a pool created by another thread or we find a handle. If the pool
> > wasn't created by our routines, then we will detect that and stop.
> >
>
> We don't tell people not to trade pools between threads, so just
> because a pool was created by one thread doesn't mean it won't
> be in use by another.
>
I think I see what you mean now:
- T1: Create pool p1
- T1: Create subpool of p1, p2
- T1: start thread T2 and give it p1
- T2: use p1 for char xlations
- T1: use p2 for xlations, get cached xlate handle from p1 and use,
kaboom!

Sorry. I'll rethink again.

Thanks,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sat, 14 Aug 2004, Mark Benedetto King wrote:

> On Fri, Aug 13, 2004 at 10:50:26PM +0200, Peter N. Lundblad wrote:
> > On Fri, 13 Aug 2004, C.A.T.Magic wrote:
> >
> > >
> > > Peter N. Lundblad wrote:
> > > > This was discussed earlier on this thread. If a pools parent is in a
> > > > different thread, this will not work. You can't use a xlate handle in
> > > > different threads at the same time, AFAICK.
> > >
> > > maybe you can store store the threadID together with the xlate handle,
> > > and only use the pool's xlate handle if the threadIDs match?
> >
> > That requires locking, since another thread might be setting it during the
> > test.
> >
> > But, what about the following approach:
> > - When a pool is created in svn_pool_create(_ex), store the current
> > apr_os_thread_t in the pool.
> > - When we want an xlate handle, search the pool tre upwards until we are
> > in a pool created by another thread or we find a handle. If the pool
> > wasn't created by our routines, then we will detect that and stop.
> >
>
> We don't tell people not to trade pools between threads, so just
> because a pool was created by one thread doesn't mean it won't
> be in use by another.
>
Then you won't get the optimization. NOte that we can still cache the
handle in the pool that was passed to the conversion function regardless
of where it was created. I don't think it is safe to use a pool in
different threads at the same time.

//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Mark Benedetto King <mb...@lowlatency.com>.
On Fri, Aug 13, 2004 at 10:50:26PM +0200, Peter N. Lundblad wrote:
> On Fri, 13 Aug 2004, C.A.T.Magic wrote:
> 
> >
> > Peter N. Lundblad wrote:
> > > This was discussed earlier on this thread. If a pools parent is in a
> > > different thread, this will not work. You can't use a xlate handle in
> > > different threads at the same time, AFAICK.
> >
> > maybe you can store store the threadID together with the xlate handle,
> > and only use the pool's xlate handle if the threadIDs match?
> 
> That requires locking, since another thread might be setting it during the
> test.
> 
> But, what about the following approach:
> - When a pool is created in svn_pool_create(_ex), store the current
> apr_os_thread_t in the pool.
> - When we want an xlate handle, search the pool tre upwards until we are
> in a pool created by another thread or we find a handle. If the pool
> wasn't created by our routines, then we will detect that and stop.
> 

We don't tell people not to trade pools between threads, so just
because a pool was created by one thread doesn't mean it won't
be in use by another.

--ben


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sat, 14 Aug 2004, Greg Hudson wrote:

> On Fri, 2004-08-13 at 16:50, Peter N. Lundblad wrote:
> > But, what about the following approach:
> > - When a pool is created in svn_pool_create(_ex), store the current
> > apr_os_thread_t in the pool.
> > - When we want an xlate handle, search the pool tre upwards until we are
> > in a pool created by another thread or we find a handle. If the pool
> > wasn't created by our routines, then we will detect that and stop.
>
> I think this will work.  It seems awfully complicated, but I don't have
> any better ideas.  Just make sure to write a very clear comment. :)
>
I posted later about a case where someone gives a parent pool tthat she
created to a thread and then uses a subpool of that herself. I don't know
if this convoluted case would be thread-safe in general. Does some expert
on APR pools know?

Thanks,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Fri, 2004-08-13 at 16:50, Peter N. Lundblad wrote:
> But, what about the following approach:
> - When a pool is created in svn_pool_create(_ex), store the current
> apr_os_thread_t in the pool.
> - When we want an xlate handle, search the pool tre upwards until we are
> in a pool created by another thread or we find a handle. If the pool
> wasn't created by our routines, then we will detect that and stop.

I think this will work.  It seems awfully complicated, but I don't have
any better ideas.  Just make sure to write a very clear comment. :)


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Fri, 13 Aug 2004, Mark Phippard wrote:

> "Peter N. Lundblad" <pe...@famlundblad.se> wrote on 08/13/2004 04:50:26
> PM:
>
> > On Fri, 13 Aug 2004, C.A.T.Magic wrote:
> >
> > >
> Anything sounds better to me.
>
Anything that works, though... :-)

> I do not know the function flow here, but if the client commands call the
> conversion routines directly, couldn't a new set of conversion routines
> just be made (copied).  Those routines would make the assumption they are
> using a single-thread environment and could safely cache the xlate handle
> in the pool?
>
> Obviously this would be ugly, and if the conversion routines are only
> called from deeper in the library structure it would be impractical to do
> it this way.
>

That's the case, unfortunately. This conversion is to UTF-8 when reading
the directory listing from the OS. We need a general solution here. Else,
it isn't acceptable.

//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Mark Phippard <Ma...@softlanding.com>.
"Peter N. Lundblad" <pe...@famlundblad.se> wrote on 08/13/2004 04:50:26 
PM:

> On Fri, 13 Aug 2004, C.A.T.Magic wrote:
> 
> >
> > Peter N. Lundblad wrote:
> > > This was discussed earlier on this thread. If a pools parent is in a
> > > different thread, this will not work. You can't use a xlate handle 
in
> > > different threads at the same time, AFAICK.
> >
> > maybe you can store store the threadID together with the xlate handle,
> > and only use the pool's xlate handle if the threadIDs match?
> 
> That requires locking, since another thread might be setting it during 
the
> test.
> 
> But, what about the following approach:
> - When a pool is created in svn_pool_create(_ex), store the current
> apr_os_thread_t in the pool.
> - When we want an xlate handle, search the pool tre upwards until we are
> in a pool created by another thread or we find a handle. If the pool
> wasn't created by our routines, then we will detect that and stop.
> 
>  > (assuming the threadid doesnt get re-used).
> >
> It won't be reused until the previous user is dead and then it can't be
> the current thread.
> 
> I'll finish some other work and then work on this idea if a) no one 
kills
> it (brane, you're good at such things:-) or b) someone else beats me to
> it.

Anything sounds better to me.

I do not know the function flow here, but if the client commands call the 
conversion routines directly, couldn't a new set of conversion routines 
just be made (copied).  Those routines would make the assumption they are 
using a single-thread environment and could safely cache the xlate handle 
in the pool?

Obviously this would be ugly, and if the conversion routines are only 
called from deeper in the library structure it would be impractical to do 
it this way.

Mark

PS - I am working on an EBCDIC port (OS/400) and we have to call these 
conversion routines everywhere.



_____________________________________________________________________________
Scanned for SoftLanding Systems, Inc. by IBM Email Security Management Services powered by MessageLabs. 
_____________________________________________________________________________

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Fri, 13 Aug 2004, C.A.T.Magic wrote:

>
> Peter N. Lundblad wrote:
> > This was discussed earlier on this thread. If a pools parent is in a
> > different thread, this will not work. You can't use a xlate handle in
> > different threads at the same time, AFAICK.
>
> maybe you can store store the threadID together with the xlate handle,
> and only use the pool's xlate handle if the threadIDs match?

That requires locking, since another thread might be setting it during the
test.

But, what about the following approach:
- When a pool is created in svn_pool_create(_ex), store the current
apr_os_thread_t in the pool.
- When we want an xlate handle, search the pool tre upwards until we are
in a pool created by another thread or we find a handle. If the pool
wasn't created by our routines, then we will detect that and stop.

 > (assuming the threadid doesnt get re-used).
>
It won't be reused until the previous user is dead and then it can't be
the current thread.

I'll finish some other work and then work on this idea if a) no one kills
it (brane, you're good at such things:-) or b) someone else beats me to
it.

//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "C.A.T.Magic" <c....@gmx.at>.
Peter N. Lundblad wrote:
> This was discussed earlier on this thread. If a pools parent is in a
> different thread, this will not work. You can't use a xlate handle in
> different threads at the same time, AFAICK.

maybe you can store store the threadID together with the xlate handle,
and only use the pool's xlate handle if the threadIDs match?
(assuming the threadid doesnt get re-used).

-

better solution would be:

create 2 new functions to initialize / exit
the subversion api. _if_ the init function is called,
the xlate handle can be stored in some thread local
storage. if it is _not_ called, everything remains like
it is: slow :)
afaik you are allowed to introduce _new_ functions
to the public api?

-

third option (better than to force everyone to a slow svn)

allow some config or environment variable to _disable_
the translations. svn would be plain english, but at least
it would be fast(er) again.
For everyone using an english windows, US locale, untranslated svn -
is there any reason why an xlate handle is created at all?
(currently I compile svn myself and withouth language support)


:-)
c.a.t.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Philip Martin <ph...@codematters.co.uk>.
"Peter N. Lundblad" <pe...@famlundblad.se> writes:

> On Thu, 12 Aug 2004, Greg Hudson wrote:
>
>> On Thu, 2004-08-12 at 19:01, Philip Martin wrote:
>> > Does it make sense for to look for a suitable xlate handle in the
>> > parent pools before falling back and creating a new one in the current
>> > pool?  If such a handle is found it could be stored in the current
>> > pool so that the search would only occur once per pool.
>>
>> Yes, I proposed the same thing at some point, though perhaps only over
>> IRC.  That, possibly in addition to a few strategic calls to a new
>> function which deliberately places an xlate handle in a pool, would
>> probably solve our problem.
>>
> This was discussed earlier on this thread. If a pools parent is in a
> different thread, this will not work. You can't use a xlate handle in
> different threads at the same time, AFAICK.

I've reverted r10613.  It's a pity, even on Linux I measured a 20%
speed increase running "svn st" on a large/deep working copy.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Fri, 13 Aug 2004, Mark Phippard wrote:

> "Peter N. Lundblad" <pe...@famlundblad.se> wrote on 08/13/2004 01:56:08
> AM:
>
> > On Thu, 12 Aug 2004, Greg Hudson wrote:
> >
> > This was discussed earlier on this thread. If a pools parent is in a
> > different thread, this will not work. You can't use a xlate handle in
> > different threads at the same time, AFAICK.
> >
>
> How would svn st be running multi-threaded?  Aren't client ops always
> single thread?  Or is this a general routine used in other cases as well?
>
The question isn't about the status command, but a routine that converts
between UTF-8 and the character encoding used in the current environment.
This routine caches the translation handle, but this is an internal
implementation detail. It is supposed to be thread-safe in the sense that
simultanous calls from different threads. Remember that subversion is a
set of libraries and we don't, in general, have control over the
application.

> Is there a simple way to check if there are multiple threads running so
> that it could use the xlate handle when running client operations?
>
Not that I know of.

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Mark Phippard <Ma...@softlanding.com>.
"Peter N. Lundblad" <pe...@famlundblad.se> wrote on 08/13/2004 01:56:08 
AM:

> On Thu, 12 Aug 2004, Greg Hudson wrote:
> 
> > On Thu, 2004-08-12 at 19:01, Philip Martin wrote:
> > > Does it make sense for to look for a suitable xlate handle in the
> > > parent pools before falling back and creating a new one in the 
current
> > > pool?  If such a handle is found it could be stored in the current
> > > pool so that the search would only occur once per pool.
> >
> > Yes, I proposed the same thing at some point, though perhaps only over
> > IRC.  That, possibly in addition to a few strategic calls to a new
> > function which deliberately places an xlate handle in a pool, would
> > probably solve our problem.
> >
> This was discussed earlier on this thread. If a pools parent is in a
> different thread, this will not work. You can't use a xlate handle in
> different threads at the same time, AFAICK.
> 

How would svn st be running multi-threaded?  Aren't client ops always 
single thread?  Or is this a general routine used in other cases as well?

Is there a simple way to check if there are multiple threads running so 
that it could use the xlate handle when running client operations?

Mark


_____________________________________________________________________________
Scanned for SoftLanding Systems, Inc. by IBM Email Security Management Services powered by MessageLabs. 
_____________________________________________________________________________

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Thu, 12 Aug 2004, Greg Hudson wrote:

> On Thu, 2004-08-12 at 19:01, Philip Martin wrote:
> > Does it make sense for to look for a suitable xlate handle in the
> > parent pools before falling back and creating a new one in the current
> > pool?  If such a handle is found it could be stored in the current
> > pool so that the search would only occur once per pool.
>
> Yes, I proposed the same thing at some point, though perhaps only over
> IRC.  That, possibly in addition to a few strategic calls to a new
> function which deliberately places an xlate handle in a pool, would
> probably solve our problem.
>
This was discussed earlier on this thread. If a pools parent is in a
different thread, this will not work. You can't use a xlate handle in
different threads at the same time, AFAICK.

//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Greg Hudson <gh...@MIT.EDU>.
On Thu, 2004-08-12 at 19:01, Philip Martin wrote:
> Does it make sense for to look for a suitable xlate handle in the
> parent pools before falling back and creating a new one in the current
> pool?  If such a handle is found it could be stored in the current
> pool so that the search would only occur once per pool.

Yes, I proposed the same thing at some point, though perhaps only over
IRC.  That, possibly in addition to a few strategic calls to a new
function which deliberately places an xlate handle in a pool, would
probably solve our problem.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: 1.1rc1 performance regression in 'svn status' (and also 1.1rc2)

Posted by Philip Martin <ph...@codematters.co.uk>.
Matt Doran <ma...@bigpond.com> writes:

>> On Sun, 8 Aug 2004, Brane wrote: 
>>>>
>>>>SVN 1.0.6: ~ 4 secs
>>>>SVN 1.1rc1: ~ 38 secs !!
>>>>
>>>>I used the Filemon utility from sysinternals to try to determine what it
>>>>happening. It looks like 1.1 excessively hits 'iconv' files. (e.g.
>>>>windows-1252.so, utf-8.so, etc)
>>>>
>>>>
>>>I did some more checking in GDB. It appears that apr_xlate_open gets
>>>called 4312 times out of 17204 calls to get_xlate_handle (in
>>>libsvn_subr/utf.c).
>>>This was during a run of "svn st" on trunk. This caching strategy needs to
>>>be improved.
>>>
>>>
>>That's a 75% hit rate, which -- given our constraints -- isn't all that
>>bad. The problem is that we can only safely cache the xlate handle in
>>the current pool. If that happens to be a subpool in some loop -- well,
>>tough luck.

Does it make sense for to look for a suitable xlate handle in the
parent pools before falling back and creating a new one in the current
pool?  If such a handle is found it could be stored in the current
pool so that the search would only occur once per pool.

-- 
Philip Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org