You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by "Kirk Marple (Jira)" <ji...@apache.org> on 2021/03/24 19:03:01 UTC

[jira] [Commented] (TINKERPOP-2341) GremlinClientExtensions.SubmitAsync hangs as it tries to dispose connection

    [ https://issues.apache.org/jira/browse/TINKERPOP-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308130#comment-17308130 ] 

Kirk Marple commented on TINKERPOP-2341:
----------------------------------------

[~Florian Hockmann] We've started seeing a similar issue with 3.4.10, and CosmosDb.

It sends several queries successfully, and then hangs on the next query.  I've seen it timeout after 60 secs, and then successfully execute the query on our Polly retry; but I've also seen it hang for 10min+.

Are there still any known issues around this in 3.4.10?

> GremlinClientExtensions.SubmitAsync hangs as it tries to dispose connection
> ---------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2341
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2341
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: dotnet
>    Affects Versions: 3.4.3
>            Reporter: Wilson Li
>            Priority: Major
>
> We have been seeing occasional hangs when logging to Cosmos DB graph using the Gremlin.Net NuGet (version 3.4.3).
>  
> Here's our code snippet: 
>     var results = await GremlinClientExtensions.SubmitAsync<dynamic>(this.Client, gremlinExpression).ConfigureAwait(false);
>  
>     foreach (var result in results)
>     {
>         Tracer.TraceMessage()(
>             traceId,
>             "Successfully logged activity result (ElapsedMilliseconds=\{0}): '\{1}'. Request charge is '\{2}'.",
>             sw.ElapsedMilliseconds,
>             JsonConvert.SerializeObject(result),
>             this.GetRequestCharge(results.StatusAttributes));
>     }
>  
> From the dumps, it seems the connection to the DB may have hit some {color:#de350b}failure{color}:
>  0:000> !PrintException /d 000001c61a0fd2c8
> Exception object: 000001c61a0fd2c8
> Exception type: System.Net.WebSockets.WebSocketException
> Message: An internal WebSocket error occurred. Please see the innerException, if present, for more details.
> InnerException: System.IO.IOException, Use !PrintException 000001c61a0fa760 to see more.
> StackTrace (generated):
>  SP IP Function
>  000000D5B446CA20 00007FFD9263D383 System_ni!System.Net.WebSockets.WebSocketBase.ThrowIfConvertibleException(System.String, System.Exception, System.Threading.CancellationToken, Boolean)+0x483
>  000000D5B446CAC0 00007FFD9203372B System_ni!System.Net.WebSockets.WebSocketBase+<ReceiveAsyncCore>d__45.MoveNext()+0x56b
>  000000D5B446A360 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
>  000000D5B446A390 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
>  000000D5B446A3C0 00007FFD35D7B46C Gremlin_Net!Gremlin.Net.Driver.WebSocketConnection+<ReceiveMessageAsync>d__9.MoveNext()+0x20c
>  000000D5B4467CA0 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
>  000000D5B4467CD0 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
>  000000D5B4467D00 00007FFD35D7A96F Gremlin_Net!Gremlin.Net.Driver.Connection+<ReceiveMessagesAsync>d__20.MoveNext()+0x16f
>  000000D3EF8FE770 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
>  000000D3EF8FE7A0 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
>  000000D3EF8FE7D0 00007FFD35D8B1BC Gremlin_Net!Gremlin.Net.Driver.ProxyConnection+<SubmitAsync>d__3`1[[System.__Canon, mscorlib]].MoveNext()+0x15c
>  000000D3EF8FC0D0 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
>  000000D3EF8FC100 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
>  000000D3EF8FC130 00007FFD35D8887C Gremlin_Net!Gremlin.Net.Driver.GremlinClient+<SubmitAsync>d__6`1[[System.__Canon, mscorlib]].MoveNext()+0x2ac
>  
> StackTraceString: <none>
> HResult: 80004005
> 0:000> !PrintException /d 000001c61a0fa760
> Exception object: 000001c61a0fa760
> Exception type: System.IO.IOException
> Message: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
> InnerException: System.Net.Sockets.SocketException, Use !PrintException 000001c61a0fa570 to see more.
> StackTrace (generated):
>  SP IP Function
>  000000D5B447A610 00007FFD928993C3 System_ni!System.Net.Security._SslStream.EndRead(System.IAsyncResult)+0x7f3873
>  000000D5B447A660 00007FFD920A5B4C System_ni!System.Net.TlsStream.EndRead(System.IAsyncResult)+0x10c
>  000000D5B447A6B0 00007FFD94043F13 mscorlib_ni!System.Threading.Tasks.TaskFactory`1+FromAsyncTrimPromise`1[[System.Int32, mscorlib],[System.__Canon, mscorlib]].Complete(System.__Canon, System.Func`3<System.__Canon,System.IAsyncResult,Int32>, System.IAsyncResult, Boolean)+0x23
>  000000D5B4475DD0 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
>  000000D5B4475E00 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
>  000000D5B4475E30 00007FFD92035D25 System_ni!System.Net.WebSockets.WebSocketConnectionStream+<ReadAsync>d__21.MoveNext()+0x385
>  000000D5B4471430 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
>  000000D5B4471460 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
>  000000D5B4471490 00007FFD92034541 System_ni!System.Net.WebSockets.WebSocketBase+WebSocketOperation+<Process>d__19.MoveNext()+0x8e1
>  000000D5B446ED40 00007FFD93F7C3DF mscorlib_ni!System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()+0x1f
>  000000D5B446ED70 00007FFD935EDF2C mscorlib_ni!System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task)+0x3c
>  000000D5B446EDA0 00007FFD9203339E System_ni!System.Net.WebSockets.WebSocketBase+<ReceiveAsyncCore>d__45.MoveNext()+0x1de
>  
> StackTraceString: <none>
> HResult: 80131620
> 0:000> !PrintException /d 000001c61a0fa570
> Exception object: 000001c61a0fa570
> {color:#de350b}Exception type: System.Net.Sockets.SocketException{color}
> {color:#de350b}Message: An existing connection was forcibly closed by the remote host{color}
> InnerException: <none>
> StackTrace (generated):
>  SP IP Function
>  000000D5B447ED00 00007FFD928975F7 System_ni!System.Net.Sockets.Socket.EndReceive(System.IAsyncResult)+0x7fe207
>  000000D5B447ED40 00007FFD9209924D System_ni!System.Net.Sockets.NetworkStream.EndRead(System.IAsyncResult)+0x2d
>  
>  
> But then the Gremlin.Net 's exception handling to {color:#ff8b00}dispose and close{color} connections seemed to hit some deadlock:
> 0:000> ~26e !clrstack
> OS Thread Id: 0x31b8 (26)
>         Child SP               IP Call Site
> 000000d3ef8f8798 +00007ffda80b6594+ [GCFrame: 000000d3ef8f8798]
> 000000d3ef8f8868 +00007ffda80b6594+ [HelperMethodFrame_1OBJ: 000000d3ef8f8868] System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object)
> 000000d3ef8f8980 +00007ffd935f6a0f+ System.Threading.ManualResetEventSlim.Wait(Int32, System.Threading.CancellationToken) [f:\dd\ndp\clr\src\BCL\system\threading\ManualResetEventSlim.cs @ 669]
> 000000d3ef8f8a10 +00007ffd935ef815+ System.Threading.Tasks.Task.SpinThenBlockingWait(Int32, System.Threading.CancellationToken) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3320]
> 000000d3ef8f8a80 +00007ffd93f60a88+ System.Threading.Tasks.Task.InternalWait(Int32, System.Threading.CancellationToken) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3259]
> 000000d3ef8f8b50 +00007ffd935eef08+ System.Threading.Tasks.Task.Wait(Int32, System.Threading.CancellationToken) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3167]
> {color:#ff8b00}000000d3ef8f8b80 +00007ffd35d53f7f+ Gremlin.Net.Process.Utils.WaitUnwrap(System.Threading.Tasks.Task){color}
> {color:#ff8b00}000000d3ef8f8bc0 +00007ffd35db01dc+ Gremlin.Net.Driver.ConnectionPool.ReturnConnectionIfOpen(Gremlin.Net.Driver.Connection){color}
> 000000d3ef8f8bf0 +00007ffd35d88937+ Gremlin.Net.Driver.GremlinClient+d__6`1[[System.__Canon, mscorlib]].MoveNext()
> 000000d3ef8fbfe8 +00007ffd94756f21+ [HelperMethodFrame: 000000d3ef8fbfe8]
> 000000d3ef8fc0d0 +00007ffd93f7c3df+ System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() [f:\dd\ndp\clr\src\BCL\system\runtime\exceptionservices\exceptionservicescommon.cs @ 133]
> 000000d3ef8fc100 +00007ffd935edf2c+ System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task) [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\TaskAwaiter.cs @ 156]
> 000000d3ef8fc130 +00007ffd35d8887c+ Gremlin.Net.Driver.GremlinClient+d__6`1[[System.__Canon, mscorlib]].MoveNext()
> 000000d3ef8fc1e0 +00007ffd935a5ab3+ System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]
> 000000d3ef8fc2b0 +00007ffd935a5944+ System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]
> 000000d3ef8fc2e0 +00007ffd935f12bc+ System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run() [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\AsyncMethodBuilder.cs @ 1070]
> 000000d3ef8fc330 +00007ffd935f80b2+ System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Action, Boolean, System.Threading.Tasks.Task ByRef) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\TaskContinuation.cs @ 810]
> 000000d3ef8fc380 +00007ffd93573dc7+ System.Threading.Tasks.Task.FinishContinuations() [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 3617]
> 000000d3ef8fc410 +00007ffd935ee890+ System.Threading.Tasks.Task.Finish(Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Task.cs @ 2263]
> 000000d3ef8fc470 +00007ffd935f0101+ System.Threading.Tasks.Task`1[[System.__Canon, mscorlib]].TrySetException(System.Object) [f:\dd\ndp\clr\src\BCL\system\threading\Tasks\Future.cs @ 606]
> 000000d3ef8fc4c0 +00007ffd935f06a8+ System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, mscorlib]].SetException(System.Exception)
> 000000d3ef8fc500 +00007ffd35d8b27d+ Gremlin.Net.Driver.ProxyConnection+d__3`1[[System.__Canon, mscorlib]].MoveNext()
> 000000d3ef8fe688 +00007ffd94756f21+ [HelperMethodFrame: 000000d3ef8fe688]
> 000000d3ef8fe770 +00007ffd93f7c3df+ System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() [f:\dd\ndp\clr\src\BCL\system\runtime\exceptionservices\exceptionservicescommon.cs @ 133]
> 000000d3ef8fe7a0 +00007ffd935edf2c+ System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(System.Threading.Tasks.Task) [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\TaskAwaiter.cs @ 156]
> 000000d3ef8fe7d0 +00007ffd35d8b1bc+ Gremlin.Net.Driver.ProxyConnection+d__3`1[[System.__Canon, mscorlib]].MoveNext()
> 000000d3ef8fe860 +00007ffd935a5ab3+ System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 954]
> 000000d3ef8fe930 +00007ffd935a5944+ System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) [f:\dd\ndp\clr\src\BCL\system\threading\executioncontext.cs @ 902]
> 000000d3ef8fe960 +00007ffd935f12bc+ System.Runtime.CompilerServices.AsyncMethodBuilderCore+MoveNextRunner.Run() [f:\dd\ndp\clr\src\BCL\system\runtime\compilerservices\AsyncMethodBuilder.cs @ 1070]
> 000000d3ef8fe9b0 +00007ffd935e4e22+ System.Threading.ThreadPoolWorkQueue.Dispatch() [f:\dd\ndp\clr\src\BCL\system\threading\threadpool.cs @ 820]
> 000000d3ef8fee48 +00007ffd94646a93+ [DebuggerU2MCatchHandlerFrame: 000000d3ef8fee48]
>  
> Looks like Gremlin.Net handles a dead connection by attempting to forcing a closure of the connection pool.  This path calls Task.Wait() instead of using await Task, so it is very possible that this is the cause of the hang.
>  
> Might be related to this issue:
> [https://issues.apache.org/jira/projects/TINKERPOP/issues/TINKERPOP-2268?filter=allopenissues&orderby=affectedVersion+DESC%2C+priority+DESC%2C+updated+DESC]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)