You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@lucenenet.apache.org by GitBox <gi...@apache.org> on 2021/11/06 18:01:49 UTC

[GitHub] [lucenenet] NightOwl888 commented on issue #363: Investigate Failing Test: Lucene.Net.Replicator.Http.HttpReplicatorTest::TestBasic()

NightOwl888 commented on issue #363:
URL: https://github.com/apache/lucenenet/issues/363#issuecomment-962488136

@jeme

I just submitted some patches for Replicator in #534, including the fix for this issue and 2 other tests that were failing more rarely and sometimes deadlocking.

During this process, I ended up going through and updating the AspNetCore tests to follow a more modern approach (with a conditional compile for the .NET Core 3.1 and lower tests). And of course, it was a bit cumbersome because of the lack of async/await support, as you mentioned. It seems like there should be a way to get there, but more analysis is needed to know how deep that rabbit hole goes. It would be nice to support async/await for the AspNetCore users of replicator, and if we do provide a solution it would be safer to provide it before the 4.8.0 release to avoid breaking any APIs. That said, it isn't a critical feature so we could live without it if need be.

But it doesn't look like it is a contributor to any real problem with Replicator. Without the async support, we are probably blocking other parallel tasks from occurring. I had to add a configuration setting to the context to allow blocking IO code to run because in .NET 5 it will throw an `InvalidOperationException` to remind you that it is bad to run synchronous code when you can run async.

> But I can't see that it would break anything to switch to the ThreadJob class as such. We could also do the same in the current implementation, catching the exception that halted the executing thread and then rethrowing that on a call to "StopUpdateThread".

The tests that were overriding `HandleUpdateException` and throwing within that method, which is called within an [unhandled catch block](https://github.com/apache/lucenenet/blob/972d1f50dcb7b49cd08ae91cd984309f942909b8/src/Lucene.Net.Replicator/ReplicationClient.cs#L172). So, any exceptions that were thrown in the tests were causing the thread to terminate. I suspect this was causing the failures and possibly contributing to the deadlocks, but I tested this very thoroughly, and there are no longer any failures after moving to `ThreadJob`.

You are right, it could have been done in the current implementation, but this does end up being less code to maintain by subclassing `ThreadJob`. Maybe there is a way we can still convert `ThreadJob` to use a `ThreadPool`, and it is probably something that is worth looking into being that it is used frequently across the codebase.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org