You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@activemq.apache.org by GitBox <gi...@apache.org> on 2021/01/27 01:54:07 UTC

[GitHub] [activemq-nms-amqp] brudo opened a new pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

brudo opened a new pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59


   This is in followup to #45. It appears that the Task.Run that was removed from the PR may have been needed after all. At least, I have seen failover fail to reconnect occasionally in our environment, and it appears to be related to this.
   
   I can't offer a full repro to illustrate the issue was present and its resolution, as it is very timing dependent. However, errors are to be expected when an await statement is inside a lock statement.
   
   See also AMQNET-656, opened for this issue, and related cases AMQNET-624 and AMQNET-626, both closed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-781134528


   @Havret the thread IDs are consistent with what I said. Good, so I understand that much at least! But fair enough, without any await, there is at least a guarantee that the lock will be exited from the same thread where it was entered. I'll have to try to find more evidence or a more complete theory, or better yet a repro, of what actually caused it to stall in our case.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780173091


   I'm not sure if this solves anything. Currently you will leave the lock on first `await`. With the proposed fix you will leave it instantly. In either case lock doesn't work as it should. I'm not sure why https://issues.apache.org/jira/browse/AMQNET-626 was closed, so I reopened it. 
   
   As I said in https://github.com/apache/activemq-nms-amqp/pull/45 
   
   > I think we should revisit this part of code (mainly TriggerReconnectionAttempt) after we hopefully deliver the initial release. Calling async method without await or GetResult() is always a red flag.
   
   This was a bit too naively ported from Java and definitely requires a second look. A more reliable implementation might look as [follows](https://github.com/Havret/dotnet-activemq-artemis-client/blob/master/src/ActiveMQ.Artemis.Client/AutoRecovering/AutoRecoveringConnection.cs) but this wouldn't be a simple change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780173091


   I'm not sure if this solve anything. Currently you will leave lock on first await. With the proposed fix you will leave it instantly. In either case lock doesn't work as it should. I'm not sure why https://issues.apache.org/jira/browse/AMQNET-626 was closed, so I reopened it. 
   
   As I said in https://github.com/apache/activemq-nms-amqp/pull/45 
   
   > I think we should revisit this part of code (mainly TriggerReconnectionAttempt) after we hopefully deliver the initial release. Calling async method without await or GetResult() is always a red flag.
   
   This was a bit too naively ported from Java and definitely requires a second look. A more reliable implementation might look as [follows](https://github.com/Havret/dotnet-activemq-artemis-client/blob/master/src/ActiveMQ.Artemis.Client/AutoRecovering/AutoRecoveringConnection.cs) but this wouldn't be a simple change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelpearce-gain commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelpearce-gain commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-803548695


   @brudo any update on your progress?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-887835435


   @michaelandrepearce I will submit RC3 on Friday. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@activemq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-777887567


   I don't think the additional commit is what fixed it - in theory it should have the same run time behavior either way.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780898935


   @brudo Please take a look at the following snippet:
   
   ```
   class Program
   {
       private static object syncRoot = new object();
       
       static async Task Main(string[] args)
       {
           lock (syncRoot)
           {
               Console.WriteLine("Before not awaited");    
               NotWaitedMethod();
               Console.WriteLine("After not awaited");
           }
   
           await Task.Delay(TimeSpan.FromSeconds(2));
       }
   
       private static async Task NotWaitedMethod()
       {
           Console.WriteLine("Inside before await");
           await Task.Delay(TimeSpan.FromSeconds(1)).ConfigureAwait(false);
           Console.WriteLine("Inside after await");
       }
   }
   ```
   
   The output will be as follows: 
   ```
   Before not awaited
   Inside before await
   After not awaited
   Inside after await
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-768012428


   So far I have tested this locally - no regressions on initial connect, and I have not observed any stalls yet when failover is induced by stopping / starting brokers.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780898935


   @brudo Please take a look at the following snippet, or better, try to run it. 
   
   ```
       class Program
       {
           private static object syncRoot = new object();
           
           static async Task Main(string[] args)
           {
               lock (syncRoot)
               {
                   Console.WriteLine("Before not awaited");    
                   NotWaitedMethod();
                   Console.WriteLine("After not awaited");
               }
   
               await Task.Delay(TimeSpan.FromSeconds(2));
           }
   
           private static async Task NotWaitedMethod()
           {
               Console.WriteLine("Inside before await");
               await Task.Delay(TimeSpan.FromSeconds(1)).ConfigureAwait(false);
               Console.WriteLine("Inside after await");
           }
       }
   ```
   
   The output will look as follows. 
   ```
   Before not awaited
   Inside before await
   After not awaited
   Inside after await
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780173091


   I'm not sure if this solve anything. Currently you will leave the lock on first `await`. With the proposed fix you will leave it instantly. In either case lock doesn't work as it should. I'm not sure why https://issues.apache.org/jira/browse/AMQNET-626 was closed, so I reopened it. 
   
   As I said in https://github.com/apache/activemq-nms-amqp/pull/45 
   
   > I think we should revisit this part of code (mainly TriggerReconnectionAttempt) after we hopefully deliver the initial release. Calling async method without await or GetResult() is always a red flag.
   
   This was a bit too naively ported from Java and definitely requires a second look. A more reliable implementation might look as [follows](https://github.com/Havret/dotnet-activemq-artemis-client/blob/master/src/ActiveMQ.Artemis.Client/AutoRecovering/AutoRecoveringConnection.cs) but this wouldn't be a simple change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-823599401


   Sorry for the delay following up... no, other things have taken priority and I did not have any progress on this. I won't be able to switch to NMS 2 until activemq-nms-openwire is also updated - right now it is missing the async methods defined in the 2.0.x interfaces.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelpearce-gain commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelpearce-gain commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-823604253


   @brudo openwire nms client atm, isnt the most active, whilst bug fixes and bits like ensuring it works for latest language im not seeing much activity adding new features to it, most recent efforts are currently put into AMQP for NMS going forwards as this became an industry standard protocol, which both ActiveMQ 5 (Classic) and ActiveMQ Artemis brokers support and many other commercial offerings such as Solace, IBM MQ,Azure etc. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780505562


   Hi @brudo, what you're describing here could theoretically happen, when there is an await inside of a lock. In reality, the compiler prevents you from writing this kind of statement. It simply doesn't compile. You can give it a try. :) What we are dealing with here though is a slightly different situation. `TriggerReconnectionAttempt` is invoked inside of a lock in a fire and forget fashion. On the first await inside of it, it will simply return from the caller's perspective. The continuation will kick off on another thread, but the calling code will not be aware of that. This may cause some problems, but a deadlock is not one of them. 
   
   The solution that I proposed is a rather standard approach to synchronization/coordination when one needs to deal with asynchronous code.  


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780898935


   @brudo Please take a look at the following snippet:
   
   ```
   class Program
   {
       private static object syncRoot = new object();
       
       static async Task Main(string[] args)
       {
           lock (syncRoot)
           {
               Console.WriteLine("Before not awaited");    
               NotWaitedMethod();
               Console.WriteLine("After not awaited");
           }
   
           await Task.Delay(TimeSpan.FromSeconds(2));
       }
   
       private static async Task NotWaitedMethod()
       {
           Console.WriteLine("Inside before await");
           await Task.Delay(TimeSpan.FromSeconds(1));
           Console.WriteLine("Inside after await");
       }
   }
   ```
   
   The output will be as follows: 
   ```
   Before not awaited
   Inside before await
   After not awaited
   Inside after await
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780986295


   @Havret I will have to study this further, look at some IL, etc. I do note that "Inside after await" would / could be on a different thread (scheduled on the thread pool), while all the other WriteLines would be running on the initial thread (which might not be on the thread pool). And, now as I think about it, it is the "other side" (on that other thread) which actually has the await / continuation on it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780877854


   @Havret the compiler will catch it if the await appears directly inside the same code block where the lock statement appeared; but as far as I understand, it's still invalid to await when inside a lock, even in the cases where the compiler can't detect it - as in the case where the await occurs in a different block, that gets invoked from inside the lock.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780898935


   @brudo Please take a look at the following snippet:
   
   ```
   class Program
   {
       private static object syncRoot = new object();
       
       static async Task Main(string[] args)
       {
           lock (syncRoot)
           {
               Console.WriteLine("Before not awaited");    
               NotAwaitedMethod();
               Console.WriteLine("After not awaited");
           }
   
           await Task.Delay(TimeSpan.FromSeconds(2));
       }
   
       private static async Task NotAwaitedMethod()
       {
           Console.WriteLine("Inside before await");
           await Task.Delay(TimeSpan.FromSeconds(1)).ConfigureAwait(false);
           Console.WriteLine("Inside after await");
       }
   }
   ```
   
   The output will be as follows: 
   ```
   Before not awaited
   Inside before await
   After not awaited
   Inside after await
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-823631985


   @michaelandrepearce We had dropped openwire support from our product and pivoted to AMQP when we were migrating to .NET Standard and .NET Core, but had to add it back in due to the impacts of AMQNET-656.
   
   Perhaps if we can get a test and a fix in for this failover issue included in NMS 2.0 AMQP, then we could pull the plug on the openwire support again, particularly if it will not be moving back into a more-active status.
   
   Although I don't have any deep knowledge of the AmqpNetLite implementation at all, what @lukeabsent observed about the AsyncPump thread making synchronous calls to HandleProviderError sounded like a very strong lead toward the root cause. I'll take a look.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-781117131


   @brudo The most important thing in this particular case is that "After not awaited" will be on the same thread that "Before not awaited", this is sth that guarantees that there won't be any deadlock here.
   
   I've updated my example a little bit. I'm not sure if it would be cost-effective to dive into IL. This is just how async-await works. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-776357664


   @michaelpearce-gain I wasn't able to replicate the issue with Travis CI. Not sure what's going on there. I was able to build in Ubuntu 18.04 and run the test suite, without error, using the same commands as from the script section in travis.yml.
   
   I'll look at creating a PR for the 2.0.x branch with the same change, but hope to sort out this issue first, or it will likely repeat over there.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelpearce-gain edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelpearce-gain edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-823604253


   @brudo openwire nms client atm, isnt the most active, whilst bug fixes and bits like ensuring it works for latest language support, e.g. support for .net std im not seeing much activity adding new features to it, most recent efforts are currently put into AMQP for NMS going forwards as this became an industry standard protocol, which both ActiveMQ 5 (Classic) and ActiveMQ Artemis brokers support and many other commercial offerings such as Solace, IBM MQ,Azure etc. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780898935


   @brudo Please take a look at the following snippet:
   
   ```
   class Program
   {
       private static object syncRoot = new object();
       
       static async Task Main(string[] args)
       {
           lock (syncRoot)
           {
               Console.WriteLine("Before not awaited");    
               NotAwaitedMethod();
               Console.WriteLine("After not awaited");
           }
   
           await Task.Delay(TimeSpan.FromSeconds(2));
       }
   
       private static async Task NotAwaitedMethod()
       {
           Console.WriteLine("Inside before await");
           await Task.Delay(TimeSpan.FromSeconds(1));
           Console.WriteLine("Inside after await");
       }
   }
   ```
   
   The output will be as follows: 
   ```
   Before not awaited
   Inside before await
   After not awaited
   Inside after await
   ```
   
   In order to preserve the async-await behavior, you need to push it all the way up the call stack. If you do that and at the same time try to fool the compiler by changing the lock statement to Monitor.Enter, Monitor.Exit we would in fact end up with a deadlock. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] lukeabsent edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
lukeabsent edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-783215340


   Had a little look, and it seems overall that there could be more Reconnect() going 
   at the same time than just one (which I believe was the original intention), and that is one risk. 
   If you suspect deadlock happening around HandleProviderError-lock(SyncRoot) then maybe 
   it is being possible cause there is provider.Close inside - though deeper inside it seems to be timeouted for 1m.
   Other risk is that HandleProviderError looks like it may be called from underlying AsyncPump (AmqpNetLite) thread (inline execution) and if it locks or waits on something, then pump stops pumping and if thing being expected is something that was supposed to be delivered by AsyncPump, then there is deadlock.
   In general, it looks like it requires some work, we could be good to be able to reproduce it in some test.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780898935


   @brudo Please take a look at the following snippet, or better, try to run it. 
   
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780898935


   @brudo Please take a look at the following snippet:
   
   ```
   class Program
   {
       private static object syncRoot = new object();
       
       static async Task Main(string[] args)
       {
           lock (syncRoot)
           {
               Console.WriteLine("Before not awaited");    
               NotAwaitedMethod();
               Console.WriteLine("After not awaited");
           }
   
           await Task.Delay(TimeSpan.FromSeconds(2));
       }
   
       private static async Task NotAwaitedMethod()
       {
           Console.WriteLine("Inside before await");
           await Task.Delay(TimeSpan.FromSeconds(1)).ConfigureAwait(false);
           Console.WriteLine("Inside after await");
       }
   }
   ```
   
   The output will be as follows: 
   ```
   Before not awaited
   Inside before await
   After not awaited
   Inside after await
   ```
   
   In order to preserve the async-await behavior, you need to push it all the way up the call stack. If you do that and at the same time try to fool the compiler by changing the lock statement to Monitor.Enter, Monitor.Exit we would in fact end up with a deadlock. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelandrepearce commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelandrepearce commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-923616645


   @brudo im closing this out, now 1.8.2 is released.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@activemq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780898935


   @brudo Please take a look at the following snippet:
   
   ```
       class Program
       {
           private static object syncRoot = new object();
           
           static async Task Main(string[] args)
           {
               lock (syncRoot)
               {
                   Console.WriteLine("Before not awaited");    
                   NotWaitedMethod();
                   Console.WriteLine("After not awaited");
               }
   
               await Task.Delay(TimeSpan.FromSeconds(2));
           }
   
           private static async Task NotWaitedMethod()
           {
               Console.WriteLine("Inside before await");
               await Task.Delay(TimeSpan.FromSeconds(1)).ConfigureAwait(false);
               Console.WriteLine("Inside after await");
           }
       }
   ```
   
   The output will look as follows: 
   ```
   Before not awaited
   Inside before await
   After not awaited
   Inside after await
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelpearce-gain commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelpearce-gain commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-887295067


   @brudo thanks for the update, and its good news in a way, that it turns out to be issue in underlying amqpnetlite, as then as you say people can patch this themselves by forcing dependency update. I know that 1.8.2 is in process of release @Havret has done two RC's but i think a third is needed to fix some bits like headers, @Havret are you able to give @brudo an idea how long till we fix RC to go again on votes, to hopefully get 1.8.2 out, or are we now just focussed on 2.0.0?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@activemq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelandrepearce commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelandrepearce commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-777460484


   @brudo a small note, the 2.0.x work is now merged into here https://github.com/apache/activemq-nms-amqp/tree/2.0.x, i think the aim is to release the api in the next week or two, followed by it after that
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-887056833


   Although I did not fully understand the root cause, and the change I proposed in this PR may not have been the correct way to fix it, a stall was nevertheless observed where failover did not occur when the current active broker crashed and another broker took over.
   
   It seems the following bug in AMQPNetLite, addressed in 2.4.3, could have been the true root cause: https://github.com/Azure/amqpnetlite/issues/460 "NotifyClosed not being called on connection loss detected by HeartBeat..."
   
   If so, the issue should no longer occur in 1.8.2 as it now references AMQPNetLite 2.4.3. Likewise, even before 1.8.2 is released, perhaps a workaround could be to reference AMQPNetLite 2.4.3 explicitly.
   
   @michaelandrepearce @lukeabsent @michaelpearce-gain


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@activemq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780173091


   I'm not sure if this solve anything. Currently you will leave lock on first await. With the proposed fix you will leave it instantly. In either case lock doesn't work as it should. I'm not sure why https://issues.apache.org/jira/browse/AMQNET-626 was closed, so I reopened it. 
   
   As I said in https://github.com/apache/activemq-nms-amqp/pull/45 
   
   > I think we should revisit this part of code (mainly TriggerReconnectionAttempt) after we hopefully deliver the initial release. Calling async method without await or GetResult() is always a red flag.
   
   This was a bit too naively ported from Java and definitely requires a second look. A more reliable implementation might look as [follows](https://github.com/Havret/dotnet-activemq-artemis-client/blob/master/src/ActiveMQ.Artemis.Client/AutoRecovering/AutoRecoveringConnection.cs) but this wouldn't be simple change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-825363253


   Thus far I haven't been successful in creating a unit test that would fail with a timeout indicating a stall. I wonder if other changes in 2.0.x might have cleared up the issue. However, it's at least as likely that my test attempts were flawed, e.g. that  synchronization within the test may have interfered with the results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780173091


   I'm not sure if this solves anything. Currently you will leave lock on first await. With the proposed fix you will leave it instantly. In either case lock doesn't work as it should. I'm not sure why https://issues.apache.org/jira/browse/AMQNET-626 was closed, so I reopened it. 
   
   As I said in https://github.com/apache/activemq-nms-amqp/pull/45 
   
   > I think we should revisit this part of code (mainly TriggerReconnectionAttempt) after we hopefully deliver the initial release. Calling async method without await or GetResult() is always a red flag.
   
   This was a bit too naively ported from Java and definitely requires a second look. A more reliable implementation might look as follows: https://github.com/Havret/dotnet-activemq-artemis-client/blob/master/src/ActiveMQ.Artemis.Client/AutoRecovering/AutoRecoveringConnection.cs


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] lukeabsent commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
lukeabsent commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-783215340


   Had a little look, and it seems overall that there could be more Reconnect() going 
   at the same time than just one (which I believe was the original intention), and that is one risk. 
   If you suspect deadlock happening around HandleProviderError-lock(SyncRoot) then maybe 
   it is being possible cause there is provider.Close inside - though deeper inside it seems to be timeouted for 1m.
   Other risk is that HandleProviderError looks like it may be called from underlying AsyncPump (AmqpNetLite) thread (inline execution) and if it locks or waits on something, then pump stops pumping and if thing being expected is something that was supposed to be delivered by AsyncPump, then there is deadlock.
   In general, it looks like it requires some work, if I have some time I will see if I can recreate it in some test.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-781154045


   Unfortunately I don't know how you're using this library. I recall that we were experiencing some deadlock situations, but it was caused by the Spring.NET wrapper that we are using for our services. With Spring.NET it is quite easy to shoot yourself in the foot unfortunately. :(  
   
   Have you tried to take a memory dump and see where your treads are actually stuck?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelandrepearce commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelandrepearce commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780141406


   @lukeabsent would you mind quickly looking at this as a second pair of eyes?
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780173091


   I'm not sure if this solve anything. Currently you will leave lock on first await. With the proposed fix you will leave it instantly. In either case lock doesn't work as it should. I'm not sure why https://issues.apache.org/jira/browse/AMQNET-626 was closed, so I reopened it. 
   
   As I said in https://github.com/apache/activemq-nms-amqp/pull/45 
   
   > I think we should revisit this part of code (mainly TriggerReconnectionAttempt) after we hopefully deliver the initial release. Calling async method without await or GetResult() is always a red flag.
   
   This was a bit too naively ported from Java and definitely requires a second look. A more reliable implementation might look as follows: https://github.com/Havret/dotnet-activemq-artemis-client/blob/master/src/ActiveMQ.Artemis.Client/AutoRecovering/AutoRecoveringConnection.cs


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelandrepearce closed pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelandrepearce closed pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@activemq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelandrepearce commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelandrepearce commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-778046197


   @Havret wdyt


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-783694222


   @Havret in our case, the stall occurs very rarely, and I do not have a memory dump from when it last occurred. The failover AMQP connection is initialized from a thread pool thread. We do have some logging code in ConnectionInterruptedListener and ConnectionResumedListener events, which currently runs synchronously - I wonder if that could be a factor as it might affect the timing.
   
   I will try to create a test case and add to FailoverIntegrationTest.cs and see if I can get the issue to occur more consistently.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-824271798


   I tried to see if I could understand this better, including digging into AmqpNetLite. In FailoverProvider, the underlying provider and its listener are cleared before calling provider.Close(), while Amqp.AmqpObject.NotifyClosed protects against calling back more than once from the same instance, so it's probably safe from triggering a second call to HandleProviderError there.
   
   In any case my initial approach with adding a call to Task.Run may be more heavy-handed than it needed to be, and might be in the wrong place for all we know. I notice there is not a single Task.Run, other than the one I added, and the ones in the tests.
   
   One approach that doesn't use Task.Run might be something like this: have a dedicated background thread, that waits for a manual reset event that is set when a reconnect is required. I don't know if that is better, though. It guarantees only one reconnect at a time, and doesn't tie up a thread from the thread pool, and doesn't require blocking the AsyncPump, but uses a kernel handle and an idle thread per failover provider instance.
   
   The next step though should be to come up with a test that confirms the issue... hopefully the test will point to the right way to fix it, and if it's a quick fix that stands up to review, then perhaps it will make it into the 2.0 release. I'll be looking into that this afternoon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelpearce-gain edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelpearce-gain edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-823604253






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780898935


   @brudo Please take a look at the following snippet:
   
   ```
   class Program
   {
       private static object syncRoot = new object();
   
       static async Task Main(string[] args)
       {
           lock (syncRoot)
           {
               Console.WriteLine($"Before not awaited on thread {Thread.CurrentThread.ManagedThreadId}.");
               NotAwaitedMethod();
               Console.WriteLine($"After not awaited on thread {Thread.CurrentThread.ManagedThreadId}.");
           }
   
           await Task.Delay(TimeSpan.FromSeconds(2));
       }
   
       private static async Task NotAwaitedMethod()
       {
           Console.WriteLine($"Inside before await on thread {Thread.CurrentThread.ManagedThreadId}.");
           await Task.Delay(TimeSpan.FromSeconds(1));
           Console.WriteLine($"Inside after await on thread {Thread.CurrentThread.ManagedThreadId}.");
       }
   }
   ```
   
   The output will be as follows: 
   ```
   Before not awaited on thread 1.
   Inside before await on thread 1.
   After not awaited on thread 1.
   Inside after await on thread 4.
   ```
   
   In order to preserve the async-await behavior, you need to push it all the way up the call stack. If you do that and at the same time try to fool the compiler by changing the lock statement to Monitor.Enter, Monitor.Exit we would in fact end up with a deadlock. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] Havret closed pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
Havret closed pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelpearce-gain commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelpearce-gain commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-770739544


   @brudo this seems to have failed one of the CI builds. Also a quick note, we've re-written alot of the internals ready for nms2.0, see here https://github.com/apache/activemq-nms-amqp/pull/56 without a test case, its not possible to validate we wont regress your change, you may want to look there, and send a pr to that branch, as its likely we will merge this soon


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] brudo commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
brudo commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-780246783


   @Havret thanks for your comments. The issue is that when you hit that first await, on a thread that is currently in a synchronized block anywhere up the call stack, that is already an invalid condition that may lead to deadlocks (and has been observed doing so - assuming I identified the right root cause). I agree that Task.Run is not the most elegant way to resolve it, but it might be the right solution for patching 1.8.x.
   
   A more correct way, assuming this kind of synchronization is still required in 2.0.x, would be to change to use an async-friendly alternative such as a SemaphoreSlim wherever the synchronized statement appears. Then Task.Run would not be needed. I'd be willing to help with a change like that for 2.0.x as a separate PR.
   
   I don't see any locks of any kind in your Artemis-specific client, but putting the recovery loop in its own long-running task also provides a form of synchronization / coordination, as that Task will only be running on one thread at a time. I like that approach.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] lukeabsent edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
lukeabsent edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-783215340


   Had a little look, and it seems overall that there could be more Reconnect() going 
   at the same time than just one (which I believe was the original intention), and that is one risk. 
   If you suspect deadlock happening around HandleProviderError-lock(SyncRoot) then maybe 
   it is being possible cause there is provider.Close inside - though deeper inside it seems to be timeouted for 1m.
   Other risk is that HandleProviderError looks like it may be called from underlying AsyncPump (AmqpNetLite) thread (inline execution) and if it locks or waits on something, then pump stops pumping and if thing being expected is something that was supposed to be delivered by AsyncPump, then there is deadlock.
   In general, it looks like it requires some work, it would be good to be able to reproduce it in some test.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelandrepearce commented on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelandrepearce commented on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-805145366


   @brudo we have merged the 2.0.x feature into master and will be looking to cut a release this week. Is there any progress on this we should hold up on? If not we will proceed with 2.0.0 as is. And come back to this after


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [activemq-nms-amqp] michaelandrepearce edited a comment on pull request #59: AMQNET-656 failover stalls without reconnecting in some cases - ensure an await statement is not inside a lock statement

Posted by GitBox <gi...@apache.org>.
michaelandrepearce edited a comment on pull request #59:
URL: https://github.com/apache/activemq-nms-amqp/pull/59#issuecomment-923616645


   @brudo im closing this out, now 1.8.2 is released. re-open if you disagree.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@activemq.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org