Posted to dev@subversion.apache.org by Doug Robinson <do...@wandisco.com> on 2019/10/04 13:20:02 UTC

Better choice for Linux semaphore than spinlock?

Folks:

From a Subversion user:

“... we have very high concurrent connections to Subversion that seem to
crater Subversion. The SVN Serve process we use to access the Subversion
repository is using the “svn” protocol by our “system user”, mostly
read-only.  Then, we, on behalf of the user make request to Subversion
using the “http” protocol to fetch their data. So we have lots of
connections to Subversion. But the volume of concurrent requests over the
“svn” protocol cause the “svnserve” process to consume CPU cycles in a
kernel “mutex-lock” which is implemented using “spin locks”. The “svnserve”
process makes the mutex calls using the “apache” (APR) semaphore wait
calls, but on Linux this is a “mutex-lock” request.”

So is there a better, more scalable, semaphore that can be used?

Cheers.

Doug
-- 
*DOUGLAS B ROBINSON* SENIOR PRODUCT MANAGER

T +1 925 396 1125
*E* doug.robinson@wandisco.com

-- 


* <http://wandisco.com/>*

**The *LiveData* Company
*Find out more 
*wandisco.com <http://wandisco.com/>*



 
<https://www.wandisco.com/liveanalytics>


THIS MESSAGE AND ANY ATTACHMENTS 
ARE CONFIDENTIAL, PROPRIETARY AND MAY BE PRIVILEGED
*


If this message was 
misdirected, WANdisco, Inc. and its subsidiaries, ("WANdisco") does not 
waive any confidentiality or privilege. If you are not the intended 
recipient, please notify us immediately and destroy the message without 
disclosing its contents to anyone. Any distribution, use or copying of this 
email or the information it contains by other than an intended recipient is 
unauthorized. The views and opinions expressed in this email message are 
the author's own and may not reflect the views and opinions of WANdisco, 
unless the author is authorized by WANdisco to express such views or 
opinions on its behalf. All email sent to or from this address is subject 
to electronic storage and review by WANdisco. Although WANdisco operates 
anti-virus programs, it does not accept responsibility for any damage 
whatsoever caused by viruses being passed.

Re: Better choice for Linux semaphore than spinlock?

Posted by eponymous alias <ep...@yahoo.com>.
 Perhaps these links might be of help in some way:

https://webkit.org/blog/6161/locking-in-webkit/
https://blog.mozilla.org/nfroyd/2017/03/29/on-mutex-performance-part-1/
https://preshing.com/20111118/locks-arent-slow-lock-contention-is/

On Monday, October 7, 2019, 1:56:14 PM PDT, Doug Robinson <do...@wandisco.com> wrote:

Rüdiger:

On Mon, Oct 7, 2019 at 3:51 PM Ruediger Pluem <rp...@apache.org> wrote:

 On 10/07/2019 08:40 PM, Branko Čibej wrote:
 > On Mon, 7 Oct 2019, 19:47 Doug Robinson, <doug.robinson@wandisco.com <ma...@wandisco.com>> wrote:
 >
 > Folks:
 >
 > I spoke with this user late last week. They stated that they can only get approximately 400 parallel SVN operations
 > before the "system time" consumes all available CPU for an 8-core machine. Adding more cores won't help because of
 > the nature of spin locks (it makes things worse). Turns out that even with ~100 parallel SVN operations the "system
 > time" starts becoming significant/measurable (~10%). Both HTTP (mod_dav_svn) and "svnserve" protocols participate
 > in the lock contention.
 >
 > Your help would be greatly appreciated.
 >
 > Whew. So. Reducing this issue to "use a more efficient lock" is not going to work, and you provided far too little
 > information to even attempt a diagnosis. For starters, I recommend gathering as much info as possible (anonymised of
 > course) about the server configuration, everything from httpd and svnserve to the repository config and underlying
 > filesystem, if possible. Getting stack traces of the "stuck" threads would be necessary, too. Without knowing exactly
 > what is happening, these kinds of problems are extremely hard to understand, let alone fix.

 Plus depending on which part of the code requires this lock a different locking mechanism that might suit better for
 this use case can possibly be chosen via configuration changes (e.g. httpd allows this for most of its locking).

That would be awesome! I'll definitely try to get those stack tracebacks.

Cheers.

Doug
  

Re: Better choice for Linux semaphore than spinlock?

Posted by Doug Robinson <do...@wandisco.com>.
Rüdiger:

On Mon, Oct 7, 2019 at 3:51 PM Ruediger Pluem <rp...@apache.org> wrote:

> On 10/07/2019 08:40 PM, Branko Čibej wrote:
> > On Mon, 7 Oct 2019, 19:47 Doug Robinson, <doug.robinson@wandisco.com
> <ma...@wandisco.com>> wrote:
> >
> >     Folks:
> >
> >     I spoke with this user late last week.  They stated that they can
> only get approximately 400 parallel SVN operations
> >     before the "system time" consumes all available CPU for an 8-core
> machine.  Adding more cores won't help because of
> >     the nature of spin locks (it makes things worse).  Turns out that
> even with ~100 parallel SVN operations the "system
> >     time" starts becoming significant/measurable (~10%).  Both HTTP
> (mod_dav_svn) and "svnserve" protocols participate
> >     in the lock contention.
> >
> >     Your help would be greatly appreciated.
> >
> > Whew. So. Reducing this issue to "use a more efficient lock" is not
> going to work, and you provided far too little
> > information to even attempt a diagnosis. For starters, I recommend
> gathering as much info as possible (anonymised of
> > course) about the server configuration, everything from httpd and
> svnserve to the repository config and underlying
> > filesystem, if possible. Getting stack traces of the "stuck" threads
> would be necessary, too. Without knowing exactly
> > what is happening, these kinds of problems are extremely hard to
> understand, let alone fix.
>
> Plus depending on which part of the code requires this lock a different
> locking mechanism that might suit better for
> this use case can possibly be chosen via configuration changes (e.g. httpd
> allows this for most of its locking).
>

That would be awesome!  I'll definitely try to get those stack tracebacks.

Cheers.

Doug

Re: Better choice for Linux semaphore than spinlock?

Posted by Ruediger Pluem <rp...@apache.org>.

On 10/07/2019 08:40 PM, Branko Čibej wrote:
> On Mon, 7 Oct 2019, 19:47 Doug Robinson, <doug.robinson@wandisco.com <ma...@wandisco.com>> wrote:
> 
>     Folks:
> 
>     I spoke with this user late last week.  They stated that they can only get approximately 400 parallel SVN operations
>     before the "system time" consumes all available CPU for an 8-core machine.  Adding more cores won't help because of
>     the nature of spin locks (it makes things worse).  Turns out that even with ~100 parallel SVN operations the "system
>     time" starts becoming significant/measurable (~10%).  Both HTTP (mod_dav_svn) and "svnserve" protocols participate
>     in the lock contention.
> 
>     Your help would be greatly appreciated.
> 
> 
> 
> Whew. So. Reducing this issue to "use a more efficient lock" is not going to work, and you provided far too little
> information to even attempt a diagnosis. For starters, I recommend gathering as much info as possible (anonymised of
> course) about the server configuration, everything from httpd and svnserve to the repository config and underlying
> filesystem, if possible. Getting stack traces of the "stuck" threads would be necessary, too. Without knowing exactly
> what is happening, these kinds of problems are extremely hard to understand, let alone fix.

Plus depending on which part of the code requires this lock a different locking mechanism that might suit better for
this use case can possibly be chosen via configuration changes (e.g. httpd allows this for most of its locking).

Regards

Rüdiger

Re: Better choice for Linux semaphore than spinlock?

Posted by Doug Robinson <do...@wandisco.com>.
Brane:

On Mon, Oct 7, 2019 at 2:40 PM Branko Čibej <br...@apache.org> wrote:

> On Mon, 7 Oct 2019, 19:47 Doug Robinson, <do...@wandisco.com>
> wrote:
>
>> I spoke with this user late last week.  They stated that they can only
>> get approximately 400 parallel SVN operations before the "system
>> time" consumes all available CPU for an 8-core machine.  Adding more cores
>> won't help because of the nature of spin locks (it makes things worse).
>> Turns out that even with ~100 parallel SVN operations the "system time"
>> starts becoming significant/measurable (~10%).  Both HTTP (mod_dav_svn) and
>> "svnserve" protocols participate in the lock contention.
>>
>> Your help would be greatly appreciated.
>>
>
> Whew. So. Reducing this issue to "use a more efficient lock" is not going
> to work, and you provided far too little information to even attempt a
> diagnosis. For starters, I recommend gathering as much info as possible
> (anonymised of course) about the server configuration, everything from
> httpd and svnserve to the repository config and underlying filesystem, if
> possible. Getting stack traces of the "stuck" threads would be necessary,
> too. Without knowing exactly what is happening, these kinds of problems are
> extremely hard to understand, let alone fix.
>

I'll try to get this information and report back.  Or perhaps they can join
this conversation (I gave them a pointer).

> I'd be surprised if the spinlock is the actual culprit. AFAIK, kernel-level
> locks hand off to the scheduler if they spin too long; on multiprocessor
> machines, this is usually more efficient than immediately yielding and
> causing an expensive context switch. It's possible that you're seeing an
> unfortunate timing "resonance" that might go away with either more *or*
> less cores being available. The behaviour is really hard to predict.
>

Note: they told me that RHEL support was used and that they identified the
culprit as SVN mutex locks being translated into spin-locks at the OS level.
They also provided the example of Apache itself already having worked
around this in better ways but because this is really buried deep in
mod_dav_svn/svnserve the Apache work-arounds won't help.

Again, I'll see what I can obtain in terms of stack tracebacks, etc.

Cheers.

Doug


>
> -- Brane
>
>
>
>> On Fri, Oct 4, 2019 at 9:20 AM Doug Robinson <do...@wandisco.com>
>> wrote:
>>
>>> Folks:
>>>
>>> From a Subversion user:
>>>
>>> “... we have very high concurrent connections to Subversion that seem to
>>> crater Subversion. The SVN Serve process we use to access the Subversion
>>> repository is using the “svn” protocol by our “system user”, mostly
>>> read-only.  Then, we, on behalf of the user make request to Subversion
>>> using the “http” protocol to fetch their data. So we have lots of
>>> connections to Subversion. But the volume of concurrent requests over the
>>> “svn” protocol cause the “svnserve” process to consume CPU cycles in a
>>> kernel “mutex-lock” which is implemented using “spin locks”. The “svnserve”
>>> process makes the mutex calls using the “apache” (APR) semaphore wait
>>> calls, but on Linux this is a “mutex-lock” request.”
>>>
>>> So is there a better, more scalable, semaphore that can be used?
>>>
>>
>
>

Re: Better choice for Linux semaphore than spinlock?

Posted by Branko Čibej <br...@apache.org>.
On Mon, 7 Oct 2019, 19:47 Doug Robinson, <do...@wandisco.com> wrote:

> Folks:
>
> I spoke with this user late last week.  They stated that they can only get
> approximately 400 parallel SVN operations before the "system time" consumes
> all available CPU for an 8-core machine.  Adding more cores won't help
> because of the nature of spin locks (it makes things worse).  Turns out
> that even with ~100 parallel SVN operations the "system time" starts
> becoming significant/measurable (~10%).  Both HTTP (mod_dav_svn) and
> "svnserve" protocols participate in the lock contention.
>
> Your help would be greatly appreciated.
>


Whew. So. Reducing this issue to "use a more efficient lock" is not going
to work, and you provided far too little information to even attempt a
diagnosis. For starters, I recommend gathering as much info as possible
(anonymised of course) about the server configuration, everything from
httpd and svnserve to the repository config and underlying filesystem, if
possible. Getting stack traces of the "stuck" threads would be necessary,
too. Without knowing exactly what is happening, these kinds of problems are
extremely hard to understand, let alone fix.

I'd be surprised if the spinlock is the actual culprit. AFAIK, kernel-level
locks hand off to the scheduler if they spin too long; on multiprocessor
machines, this is usually more efficient than immediately yielding and
causing an expensive context switch. It's possible that you're seeing an
unfortunate timing "resonance" that might go away with either more *or*
less cores being available. The behaviour is really hard to predict.

-- Brane
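
The spin-then-hand-off behaviour described above can be sketched in user
space. This is only an illustration of the control flow, not how the Linux
kernel, glibc or APR actually implement their locks; the function name
`spin_then_block_acquire` and the `max_spins` parameter are made up for this
sketch.

```python
import threading


def spin_then_block_acquire(lock, max_spins=100):
    """Take `lock`, spinning briefly before falling back to a blocking wait.

    Returns the number of failed non-blocking attempts made before the
    lock was obtained (hypothetical helper, for illustration only).
    """
    for attempt in range(max_spins):
        # Non-blocking attempt: returns True or False immediately.
        if lock.acquire(blocking=False):
            return attempt
    # Spun for too long: block, so the scheduler can run other threads.
    lock.acquire()
    return max_spins
```

An uncontended lock is taken on the first attempt; under contention the
caller only pays for a bounded number of spins before it sleeps. Whether that
bound is well chosen for a given workload is exactly the kind of thing that
stack traces and profiles would have to show.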



> On Fri, Oct 4, 2019 at 9:20 AM Doug Robinson <do...@wandisco.com>
> wrote:
>
>> Folks:
>>
>> From a Subversion user:
>>
>> “... we have very high concurrent connections to Subversion that seem to
>> crater Subversion. The SVN Serve process we use to access the Subversion
>> repository is using the “svn” protocol by our “system user”, mostly
>> read-only.  Then, we, on behalf of the user make request to Subversion
>> using the “http” protocol to fetch their data. So we have lots of
>> connections to Subversion. But the volume of concurrent requests over the
>> “svn” protocol cause the “svnserve” process to consume CPU cycles in a
>> kernel “mutex-lock” which is implemented using “spin locks”. The “svnserve”
>> process makes the mutex calls using the “apache” (APR) semaphore wait
>> calls, but on Linux this is a “mutex-lock” request.”
>>
>> So is there a better, more scalable, semaphore that can be used?
>>
>

Re: Better choice for Linux semaphore than spinlock?

Posted by Doug Robinson <do...@wandisco.com>.
Folks:

I spoke with this user late last week.  They stated that they can only get
approximately 400 parallel SVN operations before the "system time" consumes
all available CPU for an 8-core machine.  Adding more cores won't help
because of the nature of spin locks (it makes things worse).  Turns out
that even with ~100 parallel SVN operations the "system time" starts
becoming significant/measurable (~10%).  Both HTTP (mod_dav_svn) and
"svnserve" protocols participate in the lock contention.

Your help would be greatly appreciated.

Cheers.

Doug
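
One way to put a number on the "system time" share of a single svnserve or
httpd process on Linux is to sample the utime and stime counters (fields 14
and 15) from /proc/<pid>/stat twice and compare the deltas. The sketch below
assumes that file format; the helper names `parse_cpu_ticks` and
`system_time_share` are invented for this example.

```python
def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')'.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state); utime is field 14, stime is field 15.
    utime = int(after_comm[11])
    stime = int(after_comm[12])
    return utime, stime


def system_time_share(before, after):
    """Fraction of consumed CPU ticks spent in kernel mode between two samples."""
    user = after[0] - before[0]
    system = after[1] - before[1]
    total = user + system
    return system / total if total else 0.0
```

In practice one would read /proc/<pid>/stat for the process of interest,
wait a few seconds, read it again, and pass both parsed samples to
`system_time_share`. A share that climbs with the number of parallel
operations would match the symptom described, but it does not by itself say
which lock is responsible.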

On Fri, Oct 4, 2019 at 9:20 AM Doug Robinson <do...@wandisco.com>
wrote:

> Folks:
>
> From a Subversion user:
>
> “... we have very high concurrent connections to Subversion that seem to
> crater Subversion. The SVN Serve process we use to access the Subversion
> repository is using the “svn” protocol by our “system user”, mostly
> read-only.  Then, we, on behalf of the user make request to Subversion
> using the “http” protocol to fetch their data. So we have lots of
> connections to Subversion. But the volume of concurrent requests over the
> “svn” protocol cause the “svnserve” process to consume CPU cycles in a
> kernel “mutex-lock” which is implemented using “spin locks”. The “svnserve”
> process makes the mutex calls using the “apache” (APR) semaphore wait
> calls, but on Linux this is a “mutex-lock” request.”
>
> So is there a better, more scalable, semaphore that can be used?
>
> Cheers.
>
> Doug
> --
> *DOUGLAS B ROBINSON* SENIOR PRODUCT MANAGER
>
> T +1 925 396 1125
> *E* doug.robinson@wandisco.com
>


-- 
*DOUGLAS B ROBINSON* SENIOR PRODUCT MANAGER

T +1 925 396 1125
*E* doug.robinson@wandisco.com

