You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucenenet.apache.org by GitBox <gi...@apache.org> on 2020/10/06 15:35:01 UTC

[GitHub] [lucenenet] NightOwl888 opened a new issue #363: Investigate Failing Test: Lucene.Net.Replicator.Http.HttpReplicatorTest::TestBasic()

NightOwl888 opened a new issue #363:
URL: https://github.com/apache/lucenenet/issues/363


   This test failure began occurring when we set up the [nightly build](https://dev.azure.com/lucene-net/Lucene.NET/_build?definitionId=4&_a=summary).
   
   Note that the failure could have something to do with changing the nightly limits of [Rarely()](https://github.com/apache/lucenenet/blob/f0930a5f8df377cccc428dcfeba02ea4098a6dee/src/Lucene.Net.TestFramework/Util/LuceneTestCase.cs#L1405) and [AtLeast()](https://github.com/apache/lucenenet/blob/f0930a5f8df377cccc428dcfeba02ea4098a6dee/src/Lucene.Net.TestFramework/Util/LuceneTestCase.cs#L1384) which were changed to try to make the nightly tests run in under 1 hour.
   
   ```
   Error message
   System.Threading.Tasks.TaskCanceledException : A task was canceled.
   
   
   Stack trace
      at Microsoft.AspNetCore.TestHost.ResponseStream.ReadAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken)
      at System.IO.Stream.CopyToAsyncInternal(Stream destination, Int32 bufferSize, CancellationToken cancellationToken)
      at System.Net.Http.HttpContent.LoadIntoBufferAsyncCore(Task serializeToStreamTask, MemoryStream tempBuffer)
      at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
      at Lucene.Net.Replicator.Http.HttpClientBase.ExecuteGet(String request, String[] parameters) in D:\a\1\s\src\Lucene.Net.Replicator\Http\HttpClientBase.cs:line 229
      at Lucene.Net.Replicator.Http.HttpReplicator.CheckForUpdate(String currentVersion) in D:\a\1\s\src\Lucene.Net.Replicator\Http\HttpReplicator.cs:line 71
      at Lucene.Net.Replicator.ReplicationClient.DoUpdate() in D:\a\1\s\src\Lucene.Net.Replicator\ReplicationClient.cs:line 228
      at Lucene.Net.Replicator.ReplicationClient.UpdateNow() in D:\a\1\s\src\Lucene.Net.Replicator\ReplicationClient.cs:line 472
      at Lucene.Net.Replicator.Http.HttpReplicatorTest.TestBasic() in D:\a\1\s\src\Lucene.Net.Tests.Replicator\Http\HttpReplicatorTest.cs:line 103
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] NightOwl888 closed issue #363: Investigate Failing Test: Lucene.Net.Replicator.Http.HttpReplicatorTest::TestBasic()

Posted by GitBox <gi...@apache.org>.
NightOwl888 closed issue #363:
URL: https://github.com/apache/lucenenet/issues/363


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] NightOwl888 commented on issue #363: Investigate Failing Test: Lucene.Net.Replicator.Http.HttpReplicatorTest::TestBasic()

Posted by GitBox <gi...@apache.org>.
NightOwl888 commented on issue #363:
URL: https://github.com/apache/lucenenet/issues/363#issuecomment-833078253


   Unfortunately, even with the `[Repeat]` or [`[FindFirstFailingSeed]`](https://github.com/nunit/nunit/issues/1461#issuecomment-429580661) attributes I was unable to get this to fail on my local (Windows 10) system, but admittedly I gave up rather easily after only a couple of attempts. Perhaps doing a Debug build on Azure DevOps, downloading the `testbinaries` assets, and then doing a local `dotnet test` while [attaching a debugger](https://github.com/dotnet/sdk/issues/4994#issuecomment-431147227) is a way of reproducing it, but that hasn't been attempted yet.
   
   This might have something to do with the fact that we have upgraded `Microsoft.AspNetCore.TestHost` to 2.0.0 when we dropped support for .NET Standard 1.x, or it might just be something we missed because the tests weren't run frequently before the nightly build was set up.
   
   > Off topic:  It isn't clear whether the [`ReplicationClient.ReplicationThread`](https://github.com/apache/lucenenet/blob/Lucene.Net_4_8_0_beta00014/src/Lucene.Net.Replicator/ReplicationClient.cs#L48) is working as was expected, being that in Lucene it was passing `ThreadInterruptedException` from the background thread to the calling thread. Would reverting this to a [`ThreadJob`](https://github.com/NightOwl888/J2N/blob/release/v2.0/src/J2N/Threading/ThreadJob.cs) subclass (going back to a more Java-like design) break anything substantial? `ThreadJob` takes care of propagating the first exception from the background thread to the calling thread. 
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] NightOwl888 commented on issue #363: Investigate Failing Test: Lucene.Net.Replicator.Http.HttpReplicatorTest::TestBasic()

Posted by GitBox <gi...@apache.org>.
NightOwl888 commented on issue #363:
URL: https://github.com/apache/lucenenet/issues/363#issuecomment-962488136


   @jeme 
   
   I just submitted some patches for Replicator in #534, including the fix for this issue and 2 other tests that were failing more rarely and sometimes deadlocking.
   
   During this process, I ended up going through and updating the AspNetCore tests to follow a more modern approach (with a conditional compile for the .NET Core 3.1 and lower tests). And of course, it was a bit cumbersome because of the lack of async/await support, as you mentioned. It seems like there should be a way to get there, but more analysis is needed to know how deep that rabbit hole goes. It would be nice to support async/await for the AspNetCore users of replicator, and if we do provide a solution it would be safer to provide it before the 4.8.0 release to avoid breaking any APIs. That said, it isn't a critical feature so we could live without it if need be.
   
   But it doesn't look like it is a contributor to any real problem with Replicator. Without the async support, we are probably blocking other parallel tasks from occurring. I had to add a configuration setting to the context to allow blocking IO code to run because in .NET 5 it will throw an `InvalidOperationException` to remind you that it is bad to run synchronous code when you can run async.
   
   > But I can't see that it would break anything to switch to the ThreadJob class as such. We could also do the same in the current implementation, catching the exception that halted the executing thread and then rethrowing that on a call to "StopUpdateThread".
   
   The tests that were overriding `HandleUpdateException` and throwing within that method, which is called within an [unhandled catch block](https://github.com/apache/lucenenet/blob/972d1f50dcb7b49cd08ae91cd984309f942909b8/src/Lucene.Net.Replicator/ReplicationClient.cs#L172). So, any exceptions that were thrown in the tests were causing the thread to terminate. I suspect this was causing the failures and possibly contributing to the deadlocks, but I tested this very thoroughly, and there are no longer any failures after moving to `ThreadJob`.
   
   You are right, it could have been done in the current implementation, but this does end up being less code to maintain by subclassing `ThreadJob`. Maybe there is a way we can still convert `ThreadJob` to use a `ThreadPool`, and it is probably something that is worth looking into being that it is used frequently across the codebase.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] NightOwl888 commented on issue #363: Investigate Failing Test: Lucene.Net.Replicator.Http.HttpReplicatorTest::TestBasic()

Posted by GitBox <gi...@apache.org>.
NightOwl888 commented on issue #363:
URL: https://github.com/apache/lucenenet/issues/363#issuecomment-950225411


   This seems to be a known issue with Microsoft and there is a lot of discussion about it. However, I haven't yet tried any of the proposed solutions.
   
   - https://github.com/dotnet/runtime/issues/21965
   - https://stackoverflow.com/questions/10547895/how-can-i-tell-when-httpclient-has-timed-out/32230327#32230327
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] jeme commented on issue #363: Investigate Failing Test: Lucene.Net.Replicator.Http.HttpReplicatorTest::TestBasic()

Posted by GitBox <gi...@apache.org>.
jeme commented on issue #363:
URL: https://github.com/apache/lucenenet/issues/363#issuecomment-832874514


   I have not had a lucene environment up and running for a while (sadly) - is this reproduceable locally?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [lucenenet] jeme commented on issue #363: Investigate Failing Test: Lucene.Net.Replicator.Http.HttpReplicatorTest::TestBasic()

Posted by GitBox <gi...@apache.org>.
jeme commented on issue #363:
URL: https://github.com/apache/lucenenet/issues/363#issuecomment-841424473


   Ok, that is unfortunate as that would likely make it easier to fix.
   I have been trying to go through the code to formalize a theory, but so far I have not been able to come up with something good, but this mapping "ASYNC" code to old style blocking code so I am wondering if that can cause trouble, after all it depends on where and in what the code runs how tasks are handled. (That is one thing I really think was a shitty choice by the .NET framework designers... regardless of them wanting to make it "Easier" to write async code in form based apps... And I am wondering if that is a thing that might hit us here...)
   
   As far as goes for the threading, I choose the design that is in place as that does not allocate a dedicated thread which would be against general recommendations, as well as not allowing workitem exceptions to bring down the whole process (which is a difference to JAVA).
   
   But I can't see that it would break anything to switch to the ThreadJob class as such. We could also do the same in the current implementation, catching the exception that halted the executing thread and then rethrowing that on a call to "StopUpdateThread".
   
   Threading and Exception in Java and .NET is very different, so it can be hard to be certain to ensure the Java behavior fully, so i think it's worth looking critically on the "Java Design" in such areas as we can hardly avoid differences anyways, then if it can be implemented in a .NET way in a "isolated" way, I would prefer that. But I have not hard feelings otherwise.
   
   StopUpdateThread is never called as far as I can tell in this case though so regardless of one or the other, I think that could be a discusison that could be taken in another issue.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org