Posted to dev@subversion.apache.org by Radek Krotil <ra...@polarion.com> on 2016/05/21 08:46:19 UTC

RE: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

Hi Ivan.

Apologies for my late reply. I've been caught up in other tasks and events, but
recently I encountered the behavior again, and a similar pattern was also
observed in the environment of one of our enterprise customers.

All the dumps from svnserve, our application and also the memory dump can be
downloaded at https://drive.google.com/open?id=0B-yalijSzAIULWNuRGNGUW9KeWM.
I've been using the binaries from
http://de.apachehaus.com/downloads/mod_svn-1.9.3-ap24-x64.zip.

So, to answer your questions:
1. Do you have debug symbols for Subversion binaries you're using?
A: I'm not an expert in C development, so I will need to consult one of
my developers to understand what you are actually asking for. Maybe you can
work out the necessary details from the binaries themselves, or maybe you can
give me a bit more of a hint on how to get the information you seek.

2. Other threads should have different stack trace, because AFAIK only one
thread calls accept().
A: Correct. All the stack traces are captured in the zip file above. The
majority of the threads are in the following state, and our application does
not receive any response for a long time.

ntoskrnl.exe!KeSynchronizeExecution+0x3de6
ntoskrnl.exe!KeWaitForMutexObject+0xc7a
ntoskrnl.exe!KeWaitForMutexObject+0x709
ntoskrnl.exe!KeWaitForMutexObject+0x375
ntoskrnl.exe!IoThreadToProcess+0xff0
ntoskrnl.exe!KeRemoveQueueEx+0x16ba
ntoskrnl.exe!KeWaitForMutexObject+0xe8e
ntoskrnl.exe!KeWaitForMutexObject+0x709
ntoskrnl.exe!KeWaitForMutexObject+0x375
ntoskrnl.exe!NtWaitForSingleObject+0xf2
ntoskrnl.exe!setjmpex+0x3963
ntdll.dll!NtWaitForSingleObject+0x14
MSWSOCK.dll!Tcpip6_WSHSetSocketInformation+0x155
MSWSOCK.dll!WSPStartup+0x3b1c
WS2_32.dll!WSARecv+0x17d
libapr-1.dll!apr_socket_recv+0x4a
svnserve.exe+0x15895
svnserve.exe+0x10a6d
svnserve.exe+0x1175b
svnserve.exe+0x1f8f
svnserve.exe+0xbeb9
libaprutil-1.dll!apr_thread_pool_top+0x797
MSVCR110.dll!beginthreadex+0x107
MSVCR110.dll!endthreadex+0x192
KERNEL32.DLL!BaseThreadInitThunk+0x22
ntdll.dll!RtlUserThreadStart+0x34

3. Could you please create full memory dump of locked process and send it to
me including debug symbols?
A: The dump is part of the zip file. As for the debug symbols, I'll follow up
on that next week with my developers.

Thank you,
Radek Krotil

-----Original Message-----
From: Ivan Zhakov [mailto:ivan@visualsvn.com]
Sent: Friday, April 15, 2016 9:46 AM
To: Radek Krotil
Cc: dev@subversion.apache.org
Subject: Re: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

On 14 April 2016 at 19:14, Radek Krotil <ra...@polarion.com> wrote:
> Hi all.
>
> Our application generates a lot of concurrent read requests to
> Subversion using the svn:// protocol. When we tested the multithreaded mode
> of svnserve after upgrading to 1.9.3, we noticed strange 'deadlock-like'
> behavior: at some point all the requests are blocked in svnserve and
> wait there for a few minutes (3 to 15 minutes, no CPU activity), after
> which they continue to work. This makes our application significantly
> slower.
>
> Operating system: Windows 10, CentOS 6.6, CentOS 7.2
>
> The release and/or revision of Subversion: 1.9.3
>
> The compiler and configuration options you built Subversion with:
> WANDisco binaries for CentOS, Apache Haus binaries for Windows
>
> The workaround on Linux is to run svnserve without the -T switch, i.e. not
> using multithreaded mode. For Windows, there is no workaround as
> svnserve only supports the multi-threaded mode.
>
> Here is a sample thread dump of svnserve.exe during the 'deadlock',
> obtained on Windows 10 using Process Explorer:
>
> ntoskrnl.exe!KeSynchronizeExecution+0x3de6
> ntoskrnl.exe!KeWaitForMutexObject+0xc7a
> ntoskrnl.exe!KeWaitForMutexObject+0x709
> ntoskrnl.exe!KeWaitForMutexObject+0x375
> ntoskrnl.exe!IoThreadToProcess+0xff0
> ntoskrnl.exe!KeRemoveQueueEx+0x16ba
> ntoskrnl.exe!KeWaitForMutexObject+0xe8e
> ntoskrnl.exe!KeWaitForMutexObject+0x709
> ntoskrnl.exe!KeWaitForMutexObject+0x375
> ntoskrnl.exe!NtWaitForSingleObject+0xf2
> ntoskrnl.exe!setjmpex+0x3963
> ntdll.dll!NtWaitForSingleObject+0x14
> MSWSOCK.dll!Tcpip6_WSHSetSocketInformation+0x155
> MSWSOCK.dll+0x1bf1
> WS2_32.dll!WSAAccept+0xce
>
> WS2_32.dll!accept+0x12
> libapr-1.dll!apr_socket_accept+0x46
> svnserve.exe+0xc11c
> svnserve.exe+0xbae5
> svnserve.exe+0xaf6c
> svnserve.exe+0x13ab
> KERNEL32.DLL!BaseThreadInitThunk+0x22
> ntdll.dll!RtlUserThreadStart+0x34
>
> A similar stack can be seen in other threads too.
>
1. Do you have debug symbols for Subversion binaries you're using?
2. Other threads should have different stack trace, because AFAIK only one
thread calls accept().
3. Could you please create full memory dump of locked process and send it to
me including debug symbols?

--
Ivan Zhakov

RE: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

Posted by Radek Krotil <ra...@polarion.com>.
Hi again, Ivan.

Sorry, it took me a while to get back to this issue, but it still occurs. Today
I encountered it again, even after updating to the most recent binaries available.

What do you mean by the network dump? I tried to collect all the relevant
information for my environment; it is now uploaded to
https://drive.google.com/drive/folders/0B-yalijSzAIUdVA3Rm5hQW1pLTg
threaddump - Polarion 2.txt - thread dump from our Java app
threaddumps  - svnserve 2.txt - thread dumps from svnserve processes
server logs.txt - particular snippets from Polarion and svnserve logs
TPCview.txt - export of TCPview monitor

I'm kindly asking for suggestions on how we can help you analyze this issue
further. I checked with my developers and they told me that the SVN connection
timeout is set to 60 s, so if a thread is stuck in this frozen state
for about 10 minutes, it indicates that the connection was established but
the server is not returning any data. The read timeout for the SVN connection
is set to 3600 s, but communication recovers before reaching this timeout.
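
To illustrate the distinction between the two timeouts, here is a minimal Python sketch. It is not Polarion or SVNKit code, and the names `silent_server` and `probe` are made up for this example. A local server accepts the connection and then sends nothing, so the connect phase succeeds and only the read timeout fires, which is the pattern described above.

```python
import socket
import threading


def silent_server(listener, hold):
    """Accept one connection, then send nothing until `hold` is set."""
    conn, _ = listener.accept()
    hold.wait()
    conn.close()


def probe(host, port, connect_timeout, read_timeout):
    """Report which phase failed: 'connect', 'read', 'closed', or None if data arrived."""
    try:
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except OSError:
        return "connect"
    try:
        sock.settimeout(read_timeout)  # from here on, the timeout applies to reads
        data = sock.recv(1)
        return None if data else "closed"
    except socket.timeout:
        return "read"
    finally:
        sock.close()


# Hypothetical local setup: an established connection on which the server stays silent.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
hold = threading.Event()
threading.Thread(target=silent_server, args=(listener, hold), daemon=True).start()

result = probe("127.0.0.1", listener.getsockname()[1],
               connect_timeout=1.0, read_timeout=0.2)
hold.set()
listener.close()
print(result)
```

With a 60 s connect timeout and a 3600 s read timeout, a client in this situation would sit silently in the read for up to an hour, which is consistent with stalls of about 10 minutes that never raise an error.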

Thank you,
Radek Krotil

-----Original Message-----
From: Ivan Zhakov [mailto:ivan@visualsvn.com]
Sent: Tuesday, May 31, 2016 4:39 PM
To: Radek Krotil
Cc: dev@subversion.apache.org
Subject: Re: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

On 31 May 2016 at 17:19, Radek Krotil <ra...@polarion.com> wrote:
> Polarion is our application and it uses SVNKit (http://svnkit.com/) as
> a connector to SVN. We use SVNKit version 1.8.12.
>
> The dump from our application also shows that the threads are waiting for
> data from the network. Could it be an issue in Windows itself? Anyway, we
> have seen this behavior in both Windows and Linux environments and have
> never seen it with Subversion 1.8 and earlier, so we suspect it's a
> regression in Subversion.
>
I suggest checking a network dump taken prior to the deadlock.

--
Ivan Zhakov

RE: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

Posted by "Stroleny, Jakub" <ja...@siemens.com>.
Hi,

we're struggling with this issue again, but we're now able to reproduce it.
We have tested it on svn 1.9.7 and 1.10.0, and the issue still appears to be present.

I wrote a simple application to reproduce the issue. I've
described it in a comment on the related issue, https://issues.apache.org/jira/browse/SVN-4626
The application probably needs to be run multiple times to reproduce the issue reliably.

Basically, it seems to be reproducible only when svnserve is running on a Windows machine.
I did some tests with svnserve running on a Linux machine and the issue did not occur.

The client application is stuck on a socket read from svnserve and there is no response
from the server. We have network dumps and Process Monitor dumps, but
they show no response or network traffic while the issue occurs.
The client application is blocked on the connection and no progress is made
until we free some other connection to svnserve.

Can someone help us analyze this issue? I can provide more details if needed.
I suspected the svnserve.c:min_thread_count setting, which defaults to 1 while
we have 5+ concurrent connections, so the pool might not have enough workers to process
incoming requests. I tried the "--min-threads" and "--max-threads" options, as
also suggested in previous replies, but without any change: the behavior is the same as before.
Is there anything more we can do to debug or trace this issue?
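
The suspected mechanism (more long-lived connections than pool workers) can be sketched in Python as follows. This only illustrates the hypothesis and is not svnserve's actual thread pool logic; the function `run` and its parameters are hypothetical. Each simulated connection holds a worker while it "blocks on a read", so later connections wait until an earlier one is released.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run(pool_size, connections, hold_seconds):
    """Return, sorted, how long each simulated connection waited for a worker."""
    waits = []
    lock = threading.Lock()

    def serve(submitted_at):
        # Record the queueing delay, then hold the worker as a blocking read would.
        with lock:
            waits.append(time.monotonic() - submitted_at)
        time.sleep(hold_seconds)

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for _ in range(connections):
            pool.submit(serve, time.monotonic())
    return sorted(waits)


# Fewer workers than connections: the last connections wait for earlier ones to finish.
starved = run(pool_size=2, connections=6, hold_seconds=0.2)
# As many workers as connections: nobody has to queue.
roomy = run(pool_size=6, connections=6, hold_seconds=0.2)

print("longest wait, 2 workers:", round(starved[-1], 2), "s")
print("longest wait, 6 workers:", round(roomy[-1], 2), "s")
```

Since changing "--min-threads" and "--max-threads" made no difference in our tests, this sketch shows what the hypothesis would predict, not a confirmed cause.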


Thank you,
Jakub Stroleny


-----Original Message-----
From: Daniel Shahaf [mailto:danielsh@apache.org]
Sent: Tuesday, November 22, 2016 5:53 PM
To: dev@subversion.apache.org
Subject: Re: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

Ping.  Has anyone looked into the network dump to try and understand what's happening here?


Re: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

Posted by Tomas Bilek <to...@polarion.com>.
Hello Ivan,
I am continuing the topic that Radek started previously. I have created a
network dump in which you can see the communication on localhost. The dump was
started approx. 5 s before our application started, but the application starts
using SVN later. The time stamps are relative. I have also packed the Process
Monitor results, but unfortunately only from the time after svn started to hang:
while Process Monitor was running I was not able to reproduce the problem, so I
turned it on after the system hung.

About the logs:
1) Network (network dump
https://drive.google.com/file/d/0ByHPI8jVgwP8ZHJxbHZ5Z3BScVU/view?usp=sharing)
When I filter on tcp.port == 3690 only, I see that communication stopped and
there is a gap from 245 s until 637 s.
This is a snippet from our app log:
2016-11-07 12:54:20,753 [ObjectIndex-refreshIndex-1 | u:p] INFO  com.polarion.platform.startup  -  -- -- 631 from 731 objects were refreshed (86%)
2016-11-07 13:00:54,030 [ObjectIndex-refreshIndex-2 | u:p] INFO  com.polarion.platform.startup  -  -- -- 731 from 731 objects were refreshed (100%)
Indexing normally takes some time, 10-20 s, so the difference of 300-400 s
corresponds to the gap in the network dump.

2) Process Monitor (dumps
https://drive.google.com/file/d/0ByHPI8jVgwP8Qm0xa1ExNEQxZFU/view?usp=sharing)
You can start by filtering for svnserve.exe. At the start, at 07.11.2016
13:00:52,5687177, there is a "disconnect sequence".
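
As a cross-check of item 1, the gap between the two application log lines can be compared with the silent window in the capture. This is a small Python sketch using only the values quoted above; the variable names are made up for the example.

```python
from datetime import datetime

# Timestamp layout of the application log lines quoted above.
FMT = "%Y-%m-%d %H:%M:%S,%f"

before = datetime.strptime("2016-11-07 12:54:20,753", FMT)
after = datetime.strptime("2016-11-07 13:00:54,030", FMT)

# Time between the 86% and 100% progress messages in the application log.
log_gap = (after - before).total_seconds()

# Silent window in the capture (relative time stamps, in seconds).
capture_gap = 637 - 245

print(f"log gap: {log_gap:.3f} s, capture gap: {capture_gap} s")
```

The two values differ by little more than a second, which supports reading the pause in the application log and the silence on port 3690 as the same event.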

Do you think this is enough information to determine where the problem is?



--
View this message in context: http://subversion.1072662.n5.nabble.com/Deadlock-like-behaviour-of-svnserve-in-multi-threaded-mode-T-tp196421p197955.html
Sent from the Subversion Dev mailing list archive at Nabble.com.

RE: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

Posted by Radek Krotil <ra...@polarion.com>.
Polarion is our application and it uses SVNKit (http://svnkit.com/) as a
connector to SVN. We use SVNKit version 1.8.12.

The dump from our application also shows that the threads are waiting for data
from the network. Could it be an issue in Windows itself? Anyway, we have seen
this behavior in both Windows and Linux environments and have never seen it with
Subversion 1.8 and earlier, so we suspect it's a regression in Subversion.

Thanks,
Radek

-----Original Message-----
From: Ivan Zhakov [mailto:ivan@visualsvn.com]
Sent: Tuesday, May 31, 2016 2:10 PM
To: Radek Krotil
Cc: dev@subversion.apache.org
Subject: Re: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

On 31 May 2016 at 14:43, Radek Krotil <ra...@polarion.com> wrote:
> Hi Ivan.
>
> I managed to get the debug symbols for the Subversion binaries we've
> been using. They can be downloaded at
> http://www.apachehaus.de/subversion-1.9.3-ap24-x64_pdb.zip.
>
As far as I can see, all 22 worker threads are waiting for data from the network:
[[[
ntdll.dll!NtWaitForSingleObject ()    Unknown
mswsock.dll!SockWaitForSingleObject ()    Unknown
mswsock.dll!WSPRecv ()    Unknown
ws2_32.dll!WSARecv ()    Unknown
libapr-1.dll!0000000057bff0aa()    Unknown
svnserve.exe!sock_read_cb(void * baton, char * buffer, unsigned __int64 * len) Line 120    C
svnserve.exe!readbuf_fill(svn_ra_svn_conn_st * conn, apr_pool_t * pool) Line 391    C
svnserve.exe!svn_ra_svn__read_tuple(svn_ra_svn_conn_st * conn, apr_pool_t * pool, const char * fmt, ...) Line 1379    C
svnserve.exe!serve_interruptable(int * terminate_p, connection_t * connection, int(*)(connection_t *) is_busy, apr_pool_t * pool) Line 4057    C
svnserve.exe!serve_thread(apr_thread_t * tid, void * data) Line 598    C
]]]

It also seems that you're using a third-party (Polarion) svn:// client
for Subversion. Is that true?

--
Ivan Zhakov

RE: Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

Posted by Radek Krotil <ra...@polarion.com>.
Hi Ivan.

I managed to get the debug symbols for the Subversion binaries we've been
using. They can be downloaded at
http://www.apachehaus.de/subversion-1.9.3-ap24-x64_pdb.zip.

Thank you,
Radek Krotil
