Posted to dev@subversion.apache.org by "Krotil, Radek" <ra...@siemens.com> on 2020/02/04 11:01:40 UTC

Re: Better choice for Linux semaphore than spinlock?

Hi All.
I believe this issue originates with one of our customers and is related to how Polarion ALM uses Subversion at scale. It is a recurring issue encountered by several enterprise customers, and I'd be more than happy to help the community pin it down and get it fixed. Has there been any update on this problem since October?
On the customer end the problem is easy to detect: the svnserve process consumes almost all CPU, with more than 95% of that CPU spent in system time, leaving almost no throughput for the actual operation.
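For anyone who wants to check for the same symptom on a Linux host, here is a minimal sketch of how the system-time share could be measured from two samples of the aggregate "cpu" line in /proc/stat. The function names are made up for illustration and are not part of Subversion or any monitoring tool. The figure it reports is machine-wide rather than per-process, so it is only a rough indicator.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into integer tick counters.

    Field order on Linux: user nice system idle iowait irq softirq steal ...
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def system_share(before, after):
    """Fraction of all elapsed CPU ticks spent in kernel (system) mode.

    Only the first eight counters are summed, because the guest counters
    are already included in user/nice time.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return deltas[2] / total  # index 2 is the 'system' counter


def read_cpu_counters(path="/proc/stat"):
    """Read the current aggregate counters (Linux only)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())
```

Calling read_cpu_counters() twice a few seconds apart and passing both results to system_share() gives the share for that interval. A value close to 1.0 while svnserve is busy would match the behaviour described above.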
Best regards,
Radek Krotil

Siemens Digital Industries Software
Polarion ALM Product Management
https://polarion.plm.automation.siemens.com/


On 2019/10/12 23:29:44, eponymous alias <e....@yahoo.com> wrote:
> Perhaps these links might be of help in some way:
>
> https://webkit.org/blog/6161/locking-in-webkit/
> https://blog.mozilla.org/nfroyd/2017/03/29/on-mutex-performance-part-1/
> https://preshing.com/20111118/locks-arent-slow-lock-contention-is/
>
> On Monday, October 7, 2019, 1:56:14 PM PDT, Doug Robinson <do...@wandisco.com> wrote:
>
> Rüdiger:
>
> On Mon, Oct 7, 2019 at 3:51 PM Ruediger Pluem <rp...@apache.org> wrote:
>
>  On 10/07/2019 08:40 PM, Branko Čibej wrote:
>  > On Mon, 7 Oct 2019, 19:47 Doug Robinson <doug.robinson@wandisco.com> wrote:
>  >
>  > Folks:
>  >
>  > I spoke with this user late last week. They stated that they can only get approximately 400 parallel SVN operations
>  > before the "system time" consumes all available CPU for an 8-core machine. Adding more cores won't help because of
>  > the nature of spin locks (it makes things worse). Turns out that even with ~100 parallel SVN operations the "system
>  > time" starts becoming significant/measurable (~10%). Both HTTP (mod_dav_svn) and "svnserve" protocols participate
>  > in the lock contention.
>  >
>  > Your help would be greatly appreciated.
>  >
>  > Whew. So. Reducing this issue to "use a more efficient lock" is not going to work, and you provided far too little
>  > information to even attempt a diagnosis. For starters, I recommend gathering as much info as possible (anonymised of
>  > course) about the server configuration, everything from httpd and svnserve to the repository config and underlying
>  > filesystem, if possible. Getting stack traces of the "stuck" threads would be necessary, too. Without knowing exactly
>  > what is happening, these kinds of problems are extremely hard to understand, let alone fix.
>
>  Plus, depending on which part of the code requires this lock, a different locking mechanism that is better suited to
>  this use case can possibly be chosen via configuration changes (e.g. httpd allows this for most of its locking).
>
> That would be awesome! I'll definitely try to get those stack tracebacks.
>
> Cheers.
>
> Doug

-----------------
Siemens Industry Software, s.r.o.
Praha 4, Doudlebská 1699/5, PSČ 140 00
IČ 256 51 897
Registered in the Commercial Register kept by the Municipal Court in Prague, Section C, Insert 58222

Important Note: This message is of an informative nature only. Its content is not binding on the sender, and the sender does not intend by this message to conclude a contract, accept an offer, or confirm the conclusion of a contract, nor does this message give rise to pre-contractual liability of the sender, unless the sender explicitly states otherwise in the message. The content of this message (including attachments) is confidential. If you are not the intended recipient of this message, any access, copying, distribution or use of its content is strictly prohibited; in that case, please notify the sender immediately and then delete the entire message (including attachments) from your system.

Re: Better choice for Linux semaphore than spinlock?

Posted by Nathan Hartman <ha...@gmail.com>.
On Tue, Feb 4, 2020 at 9:02 AM Krotil, Radek <ra...@siemens.com> wrote:
> From what I understand, svnserve on Windows can run only in
> threaded mode. On Linux the default is fork mode, but
> fork mode may produce significant overhead and excessive memory
> allocation due to the caches at high concurrency. So the particular
> case where the problem was reported most likely comes from a threaded
> svnserve.
>
> When it comes to comparing Windows and Linux deployments,
> our experience shows that svnserve stalls sooner on
> Windows than on Linux using fork mode. Symptoms include all CPU
> being burned in the svnserve process and long response times, on the
> order of 100 seconds, on the SVN side. Note that our application
> uses an HTTP connection in parallel to get and put data into SVN on
> behalf of regular end users, as well as system-user access via the
> svn protocol and svnserve. Both Apache and svnserve run on top of the
> same repository.
>
> This subject was also raised some 18 months ago, and you can
> see the history of the conversation at http://subversion.1072662.n5.nabble.com/Deadlock-like-behaviour-of-svnserve-in-multi-threaded-mode-T-td196421.html#a197955.

I just finished reading the earlier discussions and issue SVN-4626
(https://issues.apache.org/jira/browse/SVN-4626).

Some thoughts...

There is a combination of several different pieces of software at play
here, and the issue could be isolated to one of them, or could be the
result of several seemingly unrelated things.

It was suggested it could be a regression in svnserve that appeared
sometime after Subversion 1.8.x and before 1.9.3. Has anyone tried to
run a bisect (between r1467414 and r1718531), to try to nail down a
specific change that introduces the issue?
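To make the bisect suggestion concrete, here is a minimal sketch of the search itself. The `is_bad` callable is hypothetical: it stands for whatever procedure builds svnserve at a given revision and reports whether the stall reproduces under load. The sketch assumes r1467414 behaves well, r1718531 shows the problem, and the behaviour changes only once in between.

```python
def bisect_revisions(good, bad, is_bad):
    """Return the first revision in (good, bad] for which is_bad(rev) is true.

    Assumes is_bad(good) is False, is_bad(bad) is True, and that the result
    flips from False to True exactly once inside the range.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return bad


# Hypothetical usage, with a stand-in predicate instead of a real build and load test:
first_bad = bisect_revisions(1467414, 1718531, lambda rev: rev >= 1600000)
```

A range of this size needs roughly 18 build-and-test rounds. In practice it may be fewer, since many revision numbers in the range probably do not touch the relevant code.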

SVNKit was mentioned several times. That is a separate project to
Subversion. Has anyone been able to reproduce the lockup issue without
SVNKit, either by interfacing to the SVN libraries directly, or via
the command line client? Is it possible that a regression appeared in
SVNKit? Have you tried, for example, using older versions of SVNKit
with the newest available Subversion? Or alternately, the newest
version of SVNKit with Subversion 1.8.x? Again, this is to try to
isolate the issue to one of these pieces of software, or to rule out
their involvement.

One thing I didn't see was how/why did the discussion end up with the
title "Better choice for Linux semaphore than spinlock?" That might be
beside the point.

> I can pull in additional experts from our team that were involved in
> the analysis with Red Hat in detail.

We can use all the help we can get! Please feel free to involve every
expert who can help.

Nathan

RE: Better choice for Linux semaphore than spinlock?

Posted by "Krotil, Radek" <ra...@siemens.com>.
From what I understand, svnserve on Windows can run only in threaded mode. On Linux the default is fork mode, but fork mode may produce significant overhead and excessive memory allocation due to the caches at high concurrency. So the particular case where the problem was reported most likely comes from a threaded svnserve.

When it comes to comparing Windows and Linux deployments, our experience shows that svnserve stalls sooner on Windows than on Linux using fork mode. Symptoms include all CPU being burned in the svnserve process and long response times, on the order of 100 seconds, on the SVN side. Note that our application uses an HTTP connection in parallel to get and put data into SVN on behalf of regular end users, as well as system-user access via the svn protocol and svnserve. Both Apache and svnserve run on top of the same repository.

This subject was also raised some 18 months ago, and you can see the history of the conversation at http://subversion.1072662.n5.nabble.com/Deadlock-like-behaviour-of-svnserve-in-multi-threaded-mode-T-td196421.html#a197955.

I can pull in additional experts from our team that were involved in the analysis with Red Hat in detail.

Best regards,
Radek Krotil

Siemens Digital Industries Software
Polarion ALM Product Management
https://polarion.plm.automation.siemens.com/


From: Nathan Hartman <ha...@gmail.com>
Sent: Tuesday, February 4, 2020 12:55 PM
To: Krotil, Radek (DI SW LCS PMT ALM) <ra...@siemens.com>; Subversion Developers <de...@subversion.apache.org>
Subject: Re: Better choice for Linux semaphore than spinlock?

On Tue, Feb 4, 2020 at 6:45 AM Krotil, Radek <ra...@siemens.com> wrote:
Hi All.
I believe this issue originates with one of our customers and is related to how Polarion ALM uses Subversion at scale. It is a recurring issue encountered by several enterprise customers, and I’d be more than happy to help the community pin it down and get it fixed. Has there been any update on this problem since October?

I think there haven't been any changes in this area but we will definitely appreciate your help.

The underlying cause might be in the APR (Apache Portable Runtime) libraries, which Subversion uses for its platform independence, or could be in the choice of APR APIs used somewhere in svnserve.

By the way, are you doing a threaded or non-threaded build?

Nathan



Re: Better choice for Linux semaphore than spinlock?

Posted by Nathan Hartman <ha...@gmail.com>.
On Tue, Feb 4, 2020 at 6:45 AM Krotil, Radek <ra...@siemens.com>
wrote:

> Hi All.
>
> I believe this issue originates with one of our customers and is related to how
> Polarion ALM uses Subversion at scale. It is a recurring issue
> encountered by several enterprise customers, and I’d be more than
> happy to help the community pin it down and get it fixed. Has there been
> any update on this problem since October?
>

I think there haven't been any changes in this area but we will definitely
appreciate your help.

The underlying cause might be in the APR (Apache Portable Runtime)
libraries, which Subversion uses for its platform independence, or could be
in the choice of APR APIs used somewhere in svnserve.

By the way, are you doing a threaded or non-threaded build?

Nathan