Posted to dev@lucenenet.apache.org by Shad Storhaug <sh...@shadstorhaug.com> on 2021/10/24 17:28:00 UTC

RE: Release Signing Lucene.NET

Hi Shannon,

I am working on a new release of Lucene.NET. As you already know, Apache’s policy is to sign the release assets before starting the release vote.

https://lucenenet.apache.org/contributing/make-release.html#sign-the-release

However, recently the laptop I use for code signing failed due to a hardware problem (won’t even boot to the BIOS). I can either try to import my private key from a backup or register a new release signing key with Apache, but I am not sure how much time it will take to recover the ability to sign.

If you are able, I think the quickest way to sign the release would be to have you do that step. Are you still set up to sign Lucene.NET releases?


Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson – Apache Lucene.NET

From: Karen Albrecht <Ka...@microsoft.com>
Sent: Wednesday, October 20, 2021 9:50 AM
To: Shad Storhaug <sh...@shadstorhaug.com>; Manish Godse <Ma...@microsoft.com>; Rainer Sigwald <ra...@microsoft.com>; Irina Gorbach <ir...@microsoft.com>
Cc: Kount Veluri <ko...@microsoft.com>; dev@lucenenet.apache.org; PowerBI Engineering Systems <pb...@microsoft.com>; Aaron Meyers <Aa...@microsoft.com>; Ron Clabo <Ro...@GiftOasis.com>
Subject: RE: Code analyzer fix - new release (beta00015) timeline?

+@Irina Gorbach

From: Shad Storhaug <sh...@shadstorhaug.com>
Sent: Tuesday, October 19, 2021 12:53 PM
To: Karen Albrecht <Ka...@microsoft.com>; Manish Godse <Ma...@microsoft.com>; Rainer Sigwald <ra...@microsoft.com>
Cc: Kount Veluri <ko...@microsoft.com>; dev@lucenenet.apache.org; PowerBI Engineering Systems <pb...@microsoft.com>; Aaron Meyers <Aa...@microsoft.com>; Ron Clabo <Ro...@GiftOasis.com>
Subject: [EXTERNAL] RE: Code analyzer fix - new release (beta00015) timeline?

Hi Karen,


  *   Let me see what I can do about sponsoring a new laptop.  I've never done something like this before so I don't know the policies on it.  I'll investigate.

Thanks.


  *   In the meantime, Power BI has an Azure OneBox image that has 16 cores and is fully stocked with the latest development tools.  Let me know if it would help to provision one of those for your development and I can get that set up today.

Yes, that would be helpful.


  *   @Rainer Sigwald - For this issue (.NET Framework x86 Issue), it looks similar to the issue we debugged a few months ago where Visual Studio was reading in environment variables that were overriding the settings in Visual Studio, causing projects to skip being built.  Do you mind helping @Shad Storhaug figure out what environmental settings might impact CHK/FRE builds in this use case?

To be clear, debugging is working in Visual Studio. However, the tests always pass when in Debug mode. They always fail in Release mode unless we turn off optimizations on the Lucene.Net assembly. We haven’t found a configuration that we can use to step through when the tests are failing to work out which lines are causing the failures.


Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson – Apache Lucene.NET


From: Karen Albrecht <Ka...@microsoft.com>
Sent: Wednesday, October 20, 2021 2:12 AM
To: Shad Storhaug <sh...@shadstorhaug.com>; Manish Godse <Ma...@microsoft.com>; Rainer Sigwald <ra...@microsoft.com>
Cc: Kount Veluri <ko...@microsoft.com>; dev@lucenenet.apache.org; PowerBI Engineering Systems <pb...@microsoft.com>; Aaron Meyers <Aa...@microsoft.com>; Ron Clabo <Ro...@GiftOasis.com>
Subject: RE: Code analyzer fix - new release (beta00015) timeline?


@Shad Storhaug -



Let me see what I can do about sponsoring a new laptop.  I've never done something like this before so I don't know the policies on it.  I'll investigate.



In the meantime, Power BI has an Azure OneBox image that has 16 cores and is fully stocked with the latest development tools.  Let me know if it would help to provision one of those for your development and I can get that set up today.



@Rainer Sigwald - For this issue (.NET Framework x86 Issue), it looks similar to the issue we debugged a few months ago where Visual Studio was reading in environment variables that were overriding the settings in Visual Studio, causing projects to skip being built.  Do you mind helping @Shad Storhaug figure out what environmental settings might impact CHK/FRE builds in this use case?



-Karen



-----Original Message-----
From: Shad Storhaug <sh...@shadstorhaug.com>
Sent: Tuesday, October 19, 2021 11:42 AM
To: Karen Albrecht <Ka...@microsoft.com>; Manish Godse <Ma...@microsoft.com>
Cc: Kount Veluri <ko...@microsoft.com>; dev@lucenenet.apache.org; PowerBI Engineering Systems <pb...@microsoft.com>; Aaron Meyers <Aa...@microsoft.com>; Ron Clabo <Ro...@GiftOasis.com>
Subject: [EXTERNAL] RE: Code analyzer fix - new release (beta00015) timeline?



Hi Karen,



Java Lucene Debugging

===================



Unfortunately, the laptop I use to debug on Java failed today due to a hardware issue, and my main dev machine doesn't have enough disk space to install Eclipse, which is the development environment for Java Lucene. Fortunately, I worked with Ron Clabo recently to set up a VM for him and we documented the process of setting it up, so one of us can debug. Also, I haven't lost any work. I just don't have the hardware to set up a Java IDE for myself at present, which is frustrating.  What I really could use is a new laptop, but frankly I don't currently have the money for one.  I could limp along by setting up a VM in Azure, but that's not optimal either.  Any chance Microsoft could sponsor a new laptop for my Lucene.NET development work?  Alternatively, while not ideal, perhaps you can get some Azure credits so I can set up a VM for debugging on Java? Without being able to compare execution paths between .NET and Java, or having to do it all through Ron, tracking down issues is extremely slow.



.NET Framework x86 Issue

=====================



I tried a couple of things over the weekend:



1. Created a custom code analyzer to locate lines that are comparing floating point values for equality, and disabled optimizations on the containing methods using [MethodImpl(MethodImplOptions.NoOptimization)], but this had no effect on the test failures.

2. I discovered there is a setting for JIT debugging (https://github.com/dotnet/roslyn/issues/7333#issuecomment-560207547), but changing the setting has no effect on the failures in either Debug or Release mode.
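
For reference, a minimal sketch of how the attribute from item 1 is applied to a single method. The class and method names are hypothetical and are not taken from Lucene.NET; the only real API used is MethodImplOptions.NoOptimization from System.Runtime.CompilerServices.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ScoreComparer
{
    // Hypothetical method for illustration only. The attribute asks the JIT
    // not to optimize this one method; other methods are unaffected.
    [MethodImpl(MethodImplOptions.NoOptimization)]
    public static bool SameScore(float left, float right)
    {
        // Exact equality on floats: the kind of line such an analyzer would flag.
        return left == right;
    }

    public static void Main()
    {
        Console.WriteLine(SameScore(0.5f, 0.5f));
        Console.WriteLine(SameScore(0.5f, 0.25f));
    }
}
```

Because the attribute is scoped to one method, it only helps once the offending method is already known, which is the limitation described in this thread.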



The second point seems to indicate this is not a JIT problem, but perhaps we are getting differences in IL output. I was looking at the possibility of using ildasm.exe to output the IL and compare it using Beyond Compare, but I got sidetracked with higher priority issues before investigating further.



I could really use some help to narrow down where to look for the problem, since it doesn't happen in Debug mode and the project has too many interdependencies to split up to try the divide and conquer approach.



We previously had a similar x86 floating point calculation issue that was resolved with an extra cast:



https://github.com/apache/lucenenet/blob/3dcffb2929662ffb9c4678458eaf90f0df300f92/src/Lucene.Net/Util/Packed/MonotonicAppendingLongBuffer.cs#L88-L89

https://github.com/apache/lucene/blob/releases/lucene-solr/4.8.0/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicAppendingLongBuffer.java#L77



But it was easier to track down because we were able to step through the code to find the misbehaving lines.
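
A rough sketch of the "extra cast" pattern follows. It is not the code at the linked lines; the helper names and arguments are assumptions made for illustration. The idea, as I understand the C# language rules, is that floating-point operations may be carried out at a higher precision than float, and an explicit (float) cast asks for the value to be rounded to float at that point.

```csharp
using System;

public static class AverageDemo
{
    // Hypothetical helper. Without a cast on the result, the runtime may keep
    // the quotient at a wider precision than float before it is used.
    public static float AverageWithoutCast(long first, long last, int count)
    {
        return count == 1 ? 0f : (last - first) / (float)(count - 1);
    }

    // Same calculation, with an explicit cast that requests rounding of the
    // quotient to float precision.
    public static float AverageWithCast(long first, long last, int count)
    {
        return count == 1 ? 0f : (float)((last - first) / (float)(count - 1));
    }

    public static void Main()
    {
        Console.WriteLine(AverageWithoutCast(0L, 10L, 4));
        Console.WriteLine(AverageWithCast(0L, 10L, 4));
    }
}
```

On runtimes that already evaluate float arithmetic at float precision, both helpers should return the same value; the cast only matters where intermediates are kept wider.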



Deadlocks During Testing

====================



The patch to use UninterruptableMonitor seems to have fixed both the test failures and the widespread deadlocks we were seeing during testing.



Unfortunately, while the tests pass, there is a non-zero chance a ThreadInterruptedException could occur if there is contention on a lock in a dependency of Lucene.Net or in a custom user component that is engaged during the Commit() (such as a custom Directory implementation). I have opened https://github.com/apache/lucenenet/issues/526 to track the issue, which we can work on in a future release.



There is still one (seemingly rare) deadlock that occurs in one Lucene.Net.Suggest test that I am investigating, but so far, I have only seen it occur on a couple of branches with experimental changes in them and not on the master branch.



Summary

========



So, while there are many issues we are dealing with to prepare for the release such as releasing dependencies and integrating changes we have in the development pipeline, there are 2 things we could use assistance with from your end:



1. A development machine for a Java IDE.

2. Tracking down the cause of the test failures on x86 in .NET Framework which we are unable to debug.



Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson – Apache Lucene.NET



-----Original Message-----

From: Karen Albrecht <Ka...@microsoft.com>

Sent: Tuesday, October 19, 2021 10:29 PM

To: Shad Storhaug <sh...@shadstorhaug.com>; Manish Godse <Ma...@microsoft.com>

Cc: Kount Veluri <ko...@microsoft.com>; dev@lucenenet.apache.org; PowerBI Engineering Systems <pb...@microsoft.com>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



@Shad Storhaug -



It looks like there is still some more investigation needed to fix the build so that we can stop seeing Lucene Analyzer build failures in Power BI.



What are the next steps to resolving the remaining active issues?



-Karen



-----Original Message-----

From: Shad Storhaug <sh...@shadstorhaug.com>

Sent: Thursday, October 14, 2021 12:41 PM

To: Manish Godse <Ma...@microsoft.com>

Cc: Kount Veluri <ko...@microsoft.com>; dev@lucenenet.apache.org; Karen Albrecht <Ka...@microsoft.com>

Subject: [EXTERNAL] RE: Code analyzer fix - new release (beta00015) timeline?



Hi Manish,



> Ok glad that the workaround is somewhat working. The Thread.Interrupt call here: https://github.com/NightOwl888/lucenenet/blob/9ed017f45ad4d1cbb9a11c5245fd379ff544791c/src/Lucene.Net/Support/Threading/UninterruptableMonitor.cs#L74 does look somewhat unnatural, but I guess given the differences in Interrupt behavior it might be necessary.



I tried using thread local storage to store the interrupted state in a previous iteration, but then there needed to be a RestoreInterrupt() method to copy the interrupt state back to .NET and it had to be called before every Wait(), Sleep() or Join(). Although it will perform slightly worse because an interrupt on each lock entry after the first one will have to deal with an exception, restoring the interrupt status immediately removes the need for this extra method call.
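
The approach described above (retry the lock, then restore the interrupt right away) can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the project's UninterruptableMonitor; the class name is hypothetical. It uses only Monitor.Enter, ThreadInterruptedException and Thread.Interrupt from System.Threading.

```csharp
using System;
using System.Threading;

public static class InterruptTolerantLock
{
    // Hypothetical sketch: acquire the lock even if the thread is interrupted
    // while blocked, then re-request the interrupt once the lock is held.
    public static void Enter(object sync)
    {
        bool interrupted = false;
        while (true)
        {
            try
            {
                Monitor.Enter(sync);
                break; // lock acquired
            }
            catch (ThreadInterruptedException)
            {
                // The lock was not taken. Remember the interrupt and retry.
                interrupted = true;
            }
        }

        if (interrupted)
        {
            // Restore the interrupt so the next blocking call on this thread
            // (Wait, Sleep, Join or a contended lock) observes it.
            Thread.CurrentThread.Interrupt();
        }
    }

    public static void Main()
    {
        object sync = new object();
        Enter(sync);
        try
        {
            Console.WriteLine(Monitor.IsEntered(sync));
        }
        finally
        {
            Monitor.Exit(sync);
        }
    }
}
```

The trade-off is the one noted above: once the interrupt is restored, a later contended lock entry will throw and retry again, but no separate restore call is needed before each Wait(), Sleep() or Join().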



Update - We have run 28 full test runs and there wasn't a single deadlock after the UninterruptableMonitor change. By comparison, we were seeing 3-4 deadlocks in 12 full test runs before it.



> 2. For the JIT optimization issue, if you could please provide a repro or dump of when the failures occur, we could route to our .NET Framework servicing team for some help. Or if you could provide details on which methods when optimized lead to failures, they could suggest if that is a JIT bug or possibly some race conditions in code which show up in release builds.



Here is a quick example of the failures:

https://dev.azure.com/shad0962/Experiments/_build/results?buildId=1456&view=ms.vss-test-web.build-test-results-tab



If filtered for net461, you can see all 8. On net48 only 4 of them are failing because we have turned off optimizations on the net45 build (net461 is for testing the netstandard2.0 target framework which has optimizations enabled on the known problematic methods).



Repro setup steps:



1. Setup prerequisites: https://github.com/apache/lucenenet#prerequisites-1

2. Clone our repository from: https://github.com/apache/lucenenet

3. Hand edit the file at https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/TestTargetFramework.props#L29-L33 to comment out net5.0 and uncomment net48 (to see the 4 failures we cannot find) or net461 to see all 8 failures.

4. Change the processor architecture in Test Explorer to x86.

5. Run the tests named in the above example. They fail 100% of the time in Release mode and never in Debug mode.





Viewing one of the offending methods, it is fairly obvious what the cause of the failures is - we are losing float precision with the JIT compiler and the methods are doing an exact float comparison (which is how it was originally designed and is what we have copied).



https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net.Sandbox/Queries/FuzzyLikeThisQuery.cs#L375

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net.Sandbox/Queries/SlowFuzzyTermsEnum.cs#L149

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net/Search/FuzzyTermsEnum.cs#L418

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net/Search/TopScoreDocCollector.cs#L65

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net/Search/TopScoreDocCollector.cs#L112

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net/Search/TopScoreDocCollector.cs#L118

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net/Search/TopScoreDocCollector.cs#L167

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net/Search/TopScoreDocCollector.cs#L173

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net/Search/TopScoreDocCollector.cs#L213

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net/Search/TopScoreDocCollector.cs#L218

https://github.com/apache/lucenenet/blob/d7a68e4ef67a34f365c2ebc58563102ff8af1d25/src/Lucene.Net/Search/TopScoreDocCollector.cs#L224
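
The kind of mismatch described above can be imitated deterministically with a small sketch. It is an illustration only: double stands in for a value that the JIT kept at a wider precision, which is an assumption about the mechanism, and none of the names come from Lucene.NET.

```csharp
using System;

public static class PrecisionDemo
{
    // Compares a product rounded to float against the same product kept at
    // double precision (a stand-in for a wider intermediate value).
    public static bool MatchesWide(float a, float b)
    {
        float stored = (float)(a * b);       // rounded to float precision
        double wide = (double)a * (double)b; // kept at wider precision
        return stored == wide;               // comparison happens at double precision
    }

    // Same comparison, but the wider value is rounded to float first.
    public static bool MatchesRounded(float a, float b)
    {
        float stored = (float)(a * b);
        double wide = (double)a * (double)b;
        return stored == (float)wide;
    }

    public static void Main()
    {
        Console.WriteLine(MatchesWide(0.1f, 3f));
        Console.WriteLine(MatchesRounded(0.1f, 3f));
    }
}
```

With these inputs the exact product of the two floats needs more significant bits than a float can hold, so the first comparison is expected to fail while the second succeeds. An exact float comparison behaves like the first case whenever one side is silently kept wider than the other.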



So, I wouldn't characterize this as a .NET Framework JIT bug; it is just an application design peculiarity that we need to work around. However, this code works fine on .NET Framework x64 as well as on all newer .NET runtimes. There is plenty of well-documented advice on how and why not to design an app like this, but there isn't much help available for locating these types of problems if they already exist.



The problem we are running into is finding the rest of the offending lines in a 500,000-line codebase so we can fix them. I have narrowed it down to the Lucene.Net assembly where turning off the optimizations there makes the tests pass, but that is still a large amount of code. Going from "somewhere in the assembly" to using an attribute that only works on one method at a time isn't that helpful, but if there were a way to turn on or simulate the JIT optimizations in Debug mode it would help a lot.



I suppose one solution might be to write a code analyzer rule to check whether any float data types are compared like this (I did a search, but it doesn't appear there is a built-in Code Analysis rule for "don't compare floating point types for equality"). However, there are no guarantees the rest of the test failures will have the same cause.
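As a rough illustration of what such a rule might steer code toward, the sketch below compares two float values by their 32-bit patterns instead of with ==. It assumes the x86 failures come from a value being held at higher precision than a stored float, which this thread does not fully establish. SingleComparisonSketch and its members are hypothetical names, not Lucene.NET APIs.

```csharp
using System;

internal static class SingleComparisonSketch
{
    // Hypothetical helper: round-trips the value through a 4-byte buffer so
    // the comparison sees exactly the 32-bit representation of the float.
    public static int ToInt32Bits(float value)
    {
        return BitConverter.ToInt32(BitConverter.GetBytes(value), 0);
    }

    // Bitwise equality. Unlike ==, this treats 0.0f and -0.0f as different
    // and treats two NaN values with identical bit patterns as equal.
    public static bool BitwiseEquals(float a, float b)
    {
        return ToInt32Bits(a) == ToInt32Bits(b);
    }
}
```

Whether bitwise semantics are acceptable depends on the call site, so this is only a starting point for the comparisons an analyzer would flag.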



Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson – Apache Lucene.NET



-----Original Message-----

From: Manish Godse <Ma...@microsoft.com>

Sent: Friday, October 15, 2021 12:25 AM

To: Shad Storhaug <sh...@shadstorhaug.com>

Cc: Kount Veluri <ko...@microsoft.com>; dev@lucenenet.apache.org; Karen Albrecht <Ka...@microsoft.com>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Hey Shad,



Ok glad that the workaround is somewhat working. The Thread.Interrupt call here: https://github.com/NightOwl888/lucenenet/blob/9ed017f45ad4d1cbb9a11c5245fd379ff544791c/src/Lucene.Net/Support/Threading/UninterruptableMonitor.cs#L74 does look somewhat unnatural. I guess that, given the differences in Interrupt behavior, it might be necessary.



For the other issues:



1. but we got a strange test run crash on .NET 6 x86 on Linux that might be of note to the .NET 6 team:

[manish] .NET is not officially supported on x86 Linux, and the exception also seems to come from the VS test helper





2. For the JIT optimization issue, if you could provide a repro or a dump from when the failures occur, we could route it to our .NET Framework servicing team for some help. Or, if you could provide details on which methods lead to failures when optimized, they could suggest whether that is a JIT bug or possibly a race condition in the code that shows up in release builds.



Thanks



-----Original Message-----

From: Shad Storhaug <sh...@shadstorhaug.com>

Sent: Thursday, October 14, 2021 8:55 AM

To: Manish Godse <Ma...@microsoft.com>

Cc: Kount Veluri <ko...@microsoft.com>; dev@lucenenet.apache.org; Karen Albrecht <Ka...@microsoft.com>

Subject: [EXTERNAL] RE: Code analyzer fix - new release (beta00015) timeline?



Hi Manish,



I am still iterating through different prototypes, and of course changing 558 lock statements to try-finally blocks was no small task. I just completed all of those today. The tests are passing on .NET Core 3.1/.NET 5/.NET 6, and I am just trying to ascertain whether we still are getting deadlocks.



Here are the latest iteration and tests for your review:



https://github.com/NightOwl888/lucenenet/blob/fix/threadinterrupt-4/src/Lucene.Net/Support/Threading/UninterruptableMonitor.cs

https://github.com/NightOwl888/lucenenet/blob/fix/threadinterrupt-4/src/Lucene.Net.Tests/Support/Threading/TestUninterruptableMonitor.cs



Since an interrupt can happen at any time, it seems the only way to guarantee a lock will occur is by recursively retrying until it succeeds. The Sleep() call appears to be unnecessary, since if Monitor.Enter() throws ThreadInterruptedException it will also clear the interrupted state.
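The retry idea described above can be sketched roughly as follows. This is a minimal illustration, not the code on the linked branch: RetryingMonitorSketch is a hypothetical name, the loop stands in for the recursive retry, and restoring the interrupt afterwards is an assumption about what the caller would want.

```csharp
using System;
using System.Threading;

internal static class RetryingMonitorSketch
{
    // Hypothetical helper: keeps retrying Monitor.Enter until the lock is held.
    public static void Enter(object obj)
    {
        bool interrupted = false;
        while (true)
        {
            try
            {
                Monitor.Enter(obj);
                break; // lock acquired
            }
            catch (ThreadInterruptedException)
            {
                // Per the observation above, the thrown exception has already
                // cleared the interrupted state, so no Sleep() is used here.
                interrupted = true;
            }
        }

        // Assumption: re-assert the interrupt so that a later blocking call
        // on this thread still observes it.
        if (interrupted)
            Thread.CurrentThread.Interrupt();
    }
}
```

A caller would pair this with Monitor.Exit(obj) in a finally block, in the same way the lock statements were converted to try-finally.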





The latest 4 runs didn't have any deadlocks, but we got a strange test run crash on .NET 6 x86 on Linux that might be of note to the .NET 6 team:



The active test run was aborted. Reason: Test host process crashed
Test Run Aborted with error System.Exception: One or more errors occurred.
---> System.Exception: Unable to read beyond the end of the stream.
   at System.IO.BinaryReader.Read7BitEncodedInt()
   at System.IO.BinaryReader.ReadString()
   at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.LengthPrefixCommunicationChannel.NotifyDataAvailable()
   at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.TcpClientExtensions.MessageLoopAsync(TcpClient client, ICommunicationChannel channel, Action`1 errorHandler, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---.



The output and hang dump from the build are attached as artifacts:

https://dev.azure.com/lucene-net-temp3/Lucene.NET/_build/results?buildId=465&view=results





On another note, we have another problem that is proving difficult to debug, and I am wondering if you can help or connect me to someone who can. We are getting 8 test failures on .NET Framework when run in x86. I was able to narrow down the cause to some sort of JIT optimization problem in one assembly by turning off optimizations in Release mode and seeing the test failures cease. However, since the failures never happen in Debug mode I am at a loss how to proceed.



I was able to track down 4 of the failing methods by applying the [MethodImpl(MethodImplOptions.NoOptimization)] attribute on related methods until I located the one that was causing the failure, but despite doing that on more than 2200 methods, I was unable to locate the others. Are there any debugging tools that exist to locate methods causing compiler optimization issues such as these when they don't happen in Debug mode? Failing that option, is there a tool that can apply an attribute to every method in a single project in one go?
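For reference, applying the attribute to a single method looks roughly like the sketch below. The class and method are hypothetical stand-ins, and combining NoOptimization with NoInlining is an assumption made here so that the body is not inlined into an optimized caller.

```csharp
using System;
using System.Runtime.CompilerServices;

internal static class NoOptimizationSketch
{
    // Hypothetical method standing in for one suspected of misbehaving
    // only when the JIT optimizes it.
    [MethodImpl(MethodImplOptions.NoOptimization | MethodImplOptions.NoInlining)]
    public static bool ScoresEqual(float a, float b)
    {
        return a == b;
    }
}
```

Because the attribute is applied per method, bisecting this way across thousands of methods is slow, which is the limitation described above.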



Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson – Apache Lucene.NET





-----Original Message-----

From: Manish Godse <Ma...@microsoft.com>

Sent: Thursday, October 14, 2021 8:57 PM

To: Shad Storhaug <sh...@shadstorhaug.com>

Cc: Kount Veluri <ko...@microsoft.com>; dev@lucenenet.apache.org; Karen Albrecht <Ka...@microsoft.com>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



+Karen.



Hey Shad,



Checking in whether you were able to try out the workaround, else we can continue to iterate on this repro.



Thanks



-----Original Message-----

From: Manish Godse

Sent: Tuesday, October 12, 2021 10:21 AM

To: Shad Storhaug <sh...@shadstorhaug.com>

Cc: Kount Veluri <ko...@microsoft.com>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Hey Shad,



The workaround should be mostly ok. One thing I noticed: shouldn’t the last call be Monitor.Enter instead of TryEnter? TryEnter may occasionally return false if the monitor is held by another thread, and we would want to block just like the first call within the try.



I have made a small repro for the issue and I can see that the lock can be successfully acquired after an interrupt. Perhaps we could iterate on this repro if this still doesn’t work for you.



Thanks



-----Original Message-----

From: Shad Storhaug <sh...@shadstorhaug.com>

Sent: Tuesday, October 12, 2021 5:22 AM

To: dev@lucenenet.apache.org; Manish Godse <Ma...@microsoft.com>; Stephen Toub <st...@microsoft.com>; Karen Albrecht <Ka...@microsoft.com>; Dustin Campbell <Du...@microsoft.com>; Chaz Beck <Ch...@microsoft.com>; Aaron Meyers <Aa...@microsoft.com>; Donald Drake <do...@microsoft.com>; Steve Carroll (DEVDIV) <St...@microsoft.com>; Kount Veluri <ko...@microsoft.com>

Cc: Vipul Gupta <Gu...@microsoft.com>

Subject: [EXTERNAL] RE: Code analyzer fix - new release (beta00015) timeline?



Hi Manish,



I have managed to work out most (all?) of what is happening.



1. Using a custom ThreadInterruptedException is necessary because it subclasses RuntimeException in Java, and therefore it gets caught and handled in different places than System.Threading.ThreadInterruptedException. We have refactored the exception handling to act like it does in Java even though the inheritance hierarchy is different in .NET.

2. Java does not throw an InterruptedException on the synchronized statement, but .NET does throw System.Threading.ThreadInterruptedException on the lock statement or Monitor.Enter().



It appears that .NET Core is throwing on the lock statement more frequently than .NET Framework, which is why we are getting failures on .NET Core.



While the changes for issue #1 are straightforward, I am having difficulty with a workaround for #2.



1. The ideal solution would be to turn off System.Threading.ThreadInterruptedException application wide.

2. Failing that, most of the lock statements can be patched by wrapping them in a try/catch block to catch System.Threading.ThreadInterruptedException and throw the custom ThreadInterruptedException wrapper.

3. However, there are several methods (StartCommit, CommitInternal, PrepareCommitInternal, FinishCommit, etc.) that are transactional. They are not expecting an exception when entering a lock and are expected to complete as an atomic unit. But if Thread.Interrupt() occurs after PrepareCommitInternal() has run, the interrupt takes it into CloseInternal(), which throws an exception because neither FinishCommit() nor RollbackCommit() ran, yet both of those require entering locks to complete successfully.
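Workaround 2 in the list above might look roughly like the following sketch. PortedThreadInterruptedException is a hypothetical stand-in for the custom exception, and RunLocked is an illustrative helper rather than anything in Lucene.NET. Only an interrupt raised while entering the lock is translated; exceptions thrown by the body pass through unchanged.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the custom ThreadInterruptedException wrapper.
internal sealed class PortedThreadInterruptedException : Exception
{
    public PortedThreadInterruptedException(Exception cause)
        : base(cause.Message, cause) { }
}

internal static class LockTranslationSketch
{
    // Runs body while holding syncRoot, translating an interrupt that
    // occurs during lock acquisition into the custom exception type.
    public static void RunLocked(object syncRoot, Action body)
    {
        bool lockTaken = false;
        try
        {
            try
            {
                Monitor.Enter(syncRoot, ref lockTaken);
            }
            catch (ThreadInterruptedException e)
            {
                throw new PortedThreadInterruptedException(e);
            }

            body();
        }
        finally
        {
            // Release only if the lock was actually acquired.
            if (lockTaken)
                Monitor.Exit(syncRoot);
        }
    }
}
```

This does not help with the transactional methods in item 3, since they still see an exception at the point of entering the lock.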



I made a naïve attempt to patch it by using a custom version of the Monitor.Enter() method:



        internal static class UninterruptableMonitor
        {
            public static bool Enter(object obj)
            {
                try
                {
                    Monitor.Enter(obj);
                    return false;
                }
                catch (Exception ie) when (ie.IsInterruptedException())
                {
                    if (Monitor.TryEnter(obj))
                        return true;

                    // Attempt to clear the interrupt status and try again.
                    try
                    {
                        Thread.Sleep(0);
                    }
                    catch (Exception ie2) when (ie2.IsInterruptedException())
                    {
                        // ignore
                    }

                    if (Monitor.TryEnter(obj))
                        return true;

                    throw new Util.LuceneSystemException($"Could not enter {obj}");
                }
            }
        }



However, when there is an interrupt, it occasionally throws LuceneSystemException rather than entering the lock.



Unfortunately, other than a mention in the Monitor.Enter() docs that this is the expected behavior, there is nothing in the docs to suggest what to do to suppress or ignore System.Threading.ThreadInterruptedException and successfully acquire a lock anyway.



Please suggest a workaround.



Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson – Apache Lucene.NET





-----Original Message-----

From: Shad Storhaug <sh...@shadstorhaug.com>

Sent: Tuesday, October 12, 2021 7:46 AM

To: Manish Godse <Ma...@microsoft.com>; dev@lucenenet.apache.org; Stephen Toub <st...@microsoft.com>; Karen Albrecht <Ka...@microsoft.com>; Dustin Campbell <Du...@microsoft.com>; Chaz Beck <Ch...@microsoft.com>; Aaron Meyers <Aa...@microsoft.com>; Donald Drake <do...@microsoft.com>; Steve Carroll (DEVDIV) <St...@microsoft.com>; Kount Veluri <ko...@microsoft.com>

Cc: Vipul Gupta <Gu...@microsoft.com>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Hi Manish,



Investigation is still required on .NET Core. We are getting both test failures and deadlocks on .NET Core 3.1, .NET 5, and .NET 6.



Lucene 4.8.0 was designed specifically to run on Java 8, and Lucene.NET is a line-by-line port to .NET. The approach taken in Java for disposing IndexWriter while safely interrupting running background threads seems to work in .NET Framework. But it is unclear what differs in .NET Core that is causing these tests to fail and deadlock.



Of course, if a different approach would be more sensible in .NET, please suggest one. But at the very least we are seeking an explanation as to why there is a difference between .NET Framework and .NET Core and what we can do to account for the difference.



I apologize for not providing a more isolated example, but as per the comments at https://github.com/apache/lucenenet/blob/70cd8b87a8e61a36c4d093d40f6c4dccff986e90/src/Lucene.Net.Tests/Index/TestIndexWriter.cs#L1335-L1338, this is not an easy scenario to reproduce.



I am attaching a table of the lines of code that catch InterruptedException in Java Lucene for your reference. A few things of note:



1. While in Java a custom ThreadInterruptedException was thrown, we have opted to allow the System.Threading.ThreadInterruptedException to propagate to the caller.

2. J2N.Threading.ThreadJob is a wrapper around System.Threading.Thread that allows us to use inheritance on a thread so we can use a code structure similar to Lucene's.

3. J2N.Threading.ThreadJob uses System.Runtime.ExceptionServices.ExceptionDispatchInfo in its Join() overloads to propagate exceptions from independent threads to the calling thread. We are not sure how this happens in Java, but it is the behavior we observe in tests, and Lucene's design depends on it to allow ThreadInterruptedException to propagate to IndexWriter.CloseInternal(), where it is handled.
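The propagation described in item 3 can be sketched as below. This is not J2N's implementation; JoinPropagatingThread is a hypothetical wrapper that only illustrates capturing an exception on the worker thread and rethrowing it from Join() with ExceptionDispatchInfo.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Threading;

// Hypothetical wrapper; illustrates the technique only.
internal sealed class JoinPropagatingThread
{
    private readonly Thread thread;
    private ExceptionDispatchInfo captured;

    public JoinPropagatingThread(Action body)
    {
        thread = new Thread(() =>
        {
            try
            {
                body();
            }
            catch (Exception e)
            {
                // Capture preserves the original stack trace for the rethrow.
                captured = ExceptionDispatchInfo.Capture(e);
            }
        });
    }

    public void Start()
    {
        thread.Start();
    }

    // Waits for the worker, then rethrows its exception on the calling thread.
    public void Join()
    {
        thread.Join();
        if (captured != null)
            captured.Throw();
    }
}
```

With this shape, an exception raised on a background thread surfaces at the Join() call site, which is where the calling code can handle it.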



Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson – Apache Lucene.NET



-----Original Message-----

From: Manish Godse <Ma...@microsoft.com>

Sent: Monday, October 11, 2021 11:31 PM

To: Shad Storhaug <sh...@shadstorhaug.com>; dev@lucenenet.apache.org; Stephen Toub <st...@microsoft.com>; Karen Albrecht <Ka...@microsoft.com>; Dustin Campbell <Du...@microsoft.com>; Chaz Beck <Ch...@microsoft.com>; Aaron Meyers <Aa...@microsoft.com>; Donald Drake <do...@microsoft.com>; Steve Carroll (DEVDIV) <St...@microsoft.com>; Kount Veluri <ko...@microsoft.com>

Cc: Vipul Gupta <Gu...@microsoft.com>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Thanks Shad. So is the issue resolved, or is investigation still required for the .NET Core failures?



Thanks



-----Original Message-----

From: Shad Storhaug <sh...@shadstorhaug.com>

Sent: Monday, October 11, 2021 4:07 AM

To: dev@lucenenet.apache.org; Manish Godse <Ma...@microsoft.com>; Stephen Toub <st...@microsoft.com>; Karen Albrecht <Ka...@microsoft.com>; Dustin Campbell <Du...@microsoft.com>; Chaz Beck <Ch...@microsoft.com>; Aaron Meyers <Aa...@microsoft.com>; Donald Drake <do...@microsoft.com>; Steve Carroll (DEVDIV) <St...@microsoft.com>; Kount Veluri <ko...@microsoft.com>

Cc: Vipul Gupta <Gu...@microsoft.com>

Subject: [EXTERNAL] RE: Code analyzer fix - new release (beta00015) timeline?



Hi Manish,



I ran a test and it seems there are test failures on .NET Framework with the current configuration, but they happen so rarely that we have never seen a failure in CI. After a bit of trial and error, I discovered that the ThreadJob.Interrupted() method is causing the issue because it internally calls Thread.Sleep(0), which causes undesired side effects.



I have submitted https://github.com/apache/lucenenet/pull/524 with the relevant details, which I have merged into the master branch. With these changes, the two tests are stable on .NET Framework over 500 continual iterations and complete successfully.



We are still seeing failures on .NET Core with this change, but at least now we are starting from a point where .NET Framework appears to be stable.



Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson – Apache Lucene.NET



-----Original Message-----

From: Shad Storhaug <sh...@shadstorhaug.com>

Sent: Saturday, October 9, 2021 5:17 AM

To: Manish Godse <Ma...@microsoft.com>; Stephen Toub <st...@microsoft.com>; Karen Albrecht <Ka...@microsoft.com>; Dustin Campbell <Du...@microsoft.com>; Chaz Beck <Ch...@microsoft.com>; dev@lucenenet.apache.org; Aaron Meyers <Aa...@microsoft.com>; Donald Drake <do...@microsoft.com>; Steve Carroll (DEVDIV) <St...@microsoft.com>; Kount Veluri <ko...@microsoft.com>

Cc: Vipul Gupta <Gu...@microsoft.com>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Hi Manish,



I have been unable to reproduce the issue in any other context than the tests that we have ported from Java Lucene to detect this deadlock condition, which are a bit involved but reproduce the problem reliably around 60% of the time.



Our build only has a few prerequisites to run tests and debug in VS2019.





  1.  Install Prerequisites at: https://github.com/apache/lucenenet#visual-studio

  2.  The two repro tests are in the Lucene.Net.Tests._E-I project

     *   https://github.com/apache/lucenenet/blob/8ebd83090165d4688729040ea12ad4ed588bf7bf/src/Lucene.Net.Tests/Index/TestIndexWriter.cs#L1434

     *   https://github.com/apache/lucenenet/blob/8ebd83090165d4688729040ea12ad4ed588bf7bf/src/Lucene.Net.Tests/Index/TestIndexWriter.cs#L1476

  3.  The equivalent Java tests (which we have ported) are

     *   https://github.com/apache/lucene/blob/releases/lucene-solr/4.8.0/lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java#L1192

     *   https://github.com/apache/lucene/blob/releases/lucene-solr/4.8.0/lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java#L1224

NOTE: We currently don’t have docs on how to run these, but it may not be strictly necessary since we have passing tests on .NET Framework 4.8. The dependencies for this project have gone stale, but I have an updated build at: https://github.com/NightOwl888/lucene/tree/releases/lucene-solr%2F4.8.0%2Fupdated



  4.  To switch target framework, you will need to hand-edit the file at: https://github.com/apache/lucenenet/blob/8ebd83090165d4688729040ea12ad4ed588bf7bf/TestTargetFramework.props#L26-L33. Switching between net5.0 and net48 is enough to see the tests fail and pass reliably. Note that picking .NET Framework 4.8 loads the .NET Framework 4.5 targeted library for testing.

  5.  You may need to comment out the [AwaitsFix] attribute on each of the test methods to get these particular tests to run in your environment.



We also have an Azure DevOps setup that is designed for any contributor to run simply by pushing our repo to Azure DevOps Repos, creating a new build pipeline and pointing it at the azure-pipelines.yml at the repo root. Every run yields several of these test failures (but again, they don’t occur when running .NET Framework 4.8).



Here is an example of such a run, where you can drill down to see the stack traces:

https://dev.azure.com/LuceneNET-Temp/Lucene.NET/_build/results?buildId=1571&view=ms.vss-test-web.build-test-results-tab



“From what I have read below the issue seems to be occurring after migrating from .net framework to core?”



This is true, but note this migration happened several years ago. These tests have always failed on .NET Core. A few months ago, I went through older tags to see if I could work out what change caused these 2 tests to run successfully on .NET Framework (since our debug notes stated that the failures once occurred on .NET Framework), but I couldn’t find a commit where the tests failed.



There are 2 methods in particular where we rely on both ThreadInterruptedException and Thread.CurrentThread.Interrupt() to propagate an event to the calling code which are highly suspect:





  1.  https://github.com/apache/lucenenet/blob/8ebd83090165d4688729040ea12ad4ed588bf7bf/src/Lucene.Net/Index/IndexWriter.cs#L1130-L1281

  2.  https://github.com/apache/lucenenet/blob/8ebd83090165d4688729040ea12ad4ed588bf7bf/src/Lucene.Net/Index/ConcurrentMergeScheduler.cs#L320-L365



Again, I am making a few assumptions here, based on the fact that the reference counter in MockDirectoryWrapper is not reaching zero. It isn’t clear whether this is due to a missing ThreadInterruptedException in the background thread, a failure of Thread.CurrentThread.Interrupt() to propagate the signal to its caller, or something else. But given that these are the only 2 places in production code where we call Thread.Interrupt(), that the Lucene team designed these tests specifically to exercise the thread interrupt signal, and that we are seeing a difference between .NET Framework and .NET Core, it seems likely this is it.



Let me know whether you need any additional assistance reproducing the error.



Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson – Apache Lucene.NET





From: Manish Godse <Ma...@microsoft.com>>

Sent: Saturday, October 9, 2021 1:48 AM

To: Stephen Toub <st...@microsoft.com>>; Karen Albrecht <Ka...@microsoft.com>>; Dustin Campbell <Du...@microsoft.com>>; Chaz Beck <Ch...@microsoft.com>>; Shad Storhaug <sh...@shadstorhaug.com>>; dev@lucenenet.apache.org<ma...@lucenenet.apache.org>; Aaron Meyers <Aa...@microsoft.com>>; Donald Drake <do...@microsoft.com>>; Steve Carroll (DEVDIV) <St...@microsoft.com>>; Kount Veluri <ko...@microsoft.com>>

Cc: Vipul Gupta <Gu...@microsoft.com>>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Hi Karen,



Yeah we can help investigate this issue, but we will need some standalone repro steps for the issue. From what I have read below the issue seems to be occurring after migrating from .net framework to core?



Thanks



From: Stephen Toub <st...@microsoft.com>>>

Sent: Friday, October 8, 2021 9:40 AM

To: Karen Albrecht <Ka...@microsoft.com>>>; Dustin Campbell <Du...@microsoft.com>>>; Chaz Beck <Ch...@microsoft.com>>>; Shad Storhaug <sh...@shadstorhaug.com>>>; dev@lucenenet.apache.org<ma...@lucenenet.apache.org>>; Aaron Meyers <Aa...@microsoft.com>>>; Donald Drake <do...@microsoft.com>>>; Steve Carroll (DEVDIV) <St...@microsoft.com>>>; Kount Veluri <ko...@microsoft.com>>>; Manish Godse <Ma...@microsoft.com>>>

Cc: Vipul Gupta <Gu...@microsoft.com>>>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



+Kount/Manish for Thread.Interrupt



From: Karen Albrecht <Ka...@microsoft.com>>>

Sent: Friday, October 8, 2021 12:28 PM

To: Dustin Campbell <Du...@microsoft.com>>>; Chaz Beck <Ch...@microsoft.com>>>; Shad Storhaug <sh...@shadstorhaug.com>>>; dev@lucenenet.apache.org<ma...@lucenenet.apache.org>>; Aaron Meyers <Aa...@microsoft.com>>>; Donald Drake <do...@microsoft.com>>>; Steve Carroll (DEVDIV) <St...@microsoft.com>>>; Stephen Toub <st...@microsoft.com>>>

Cc: Vipul Gupta <Gu...@microsoft.com>>>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Thank you @Dustin Campbell<ma...@microsoft.com>!  😊



@Steve Carroll (DEVDIV)<ma...@microsoft.com> and @Stephen Toub<ma...@microsoft.com> –



PowerBI needs some help debugging an issue (the issue description section starts in this highlight and below).  To summarize, Power BI uses the Lucene.Net package for search indexing and we critically need the next version of Lucene.Net to unblock the intermittent build failures we are hitting in our main build pipeline.  The build failures cause a 7% reduction in pipeline reliability, so this is an urgent issue for Power BI.



Who can help investigate why this threading issue is occurring when Lucene.Net targets multiple versions of .Net (.NET Core/.NET 5/.NET 6)?



-Karen



From: Dustin Campbell <Du...@microsoft.com>>>

Sent: Friday, October 8, 2021 8:26 AM

To: Chaz Beck <Ch...@microsoft.com>>>; Karen Albrecht <Ka...@microsoft.com>>>; Shad Storhaug <sh...@shadstorhaug.com>>>; dev@lucenenet.apache.org<ma...@lucenenet.apache.org>>; Aaron Meyers <Aa...@microsoft.com>>>; Donald Drake <do...@microsoft.com>>>; Steve Carroll (DEVDIV) <St...@microsoft.com>>>; Stephen Toub <st...@microsoft.com>>>

Cc: Vipul Gupta <Gu...@microsoft.com>>>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Hi @Karen<ma...@microsoft.com> – great to hear from you!



This is probably a bit closer to @Steve<ma...@microsoft.com>’s universe, since it looks more like a runtime or library issue. Also, looping in my other favorite @Stephen<ma...@microsoft.com>, who might have some insight on the threading issue.



Dust-



From: Chaz Beck <Ch...@microsoft.com>>>

Sent: Friday, October 8, 2021 8:22 AM

To: Karen Albrecht <Ka...@microsoft.com>>>; Shad Storhaug <sh...@shadstorhaug.com>>>; dev@lucenenet.apache.org<ma...@lucenenet.apache.org>>; Aaron Meyers <Aa...@microsoft.com>>>; Dustin Campbell <Du...@microsoft.com>>>; Donald Drake <do...@microsoft.com>>>

Cc: Vipul Gupta <Gu...@microsoft.com>>>

Subject: Re: Code analyzer fix - new release (beta00015) timeline?



Throwing this out there as a suggestion.



There is a MSFT research tool some Azure teams are using to root out concurrency issues:

Coyote - Microsoft Research<https://www.microsoft.com/en-us/research/project/coyote/>

GitHub - microsoft/coyote: Coyote is a tool designed to help ensure that your C# code is free of annoying concurrency bugs.<https://github.com/microsoft/coyote/>

Coyote (microsoft.github.io)<https://microsoft.github.io/coyote/#>



I haven't looked at the Lucene.Net source code to determine how compatible it is with this tool, considering its current test bed design and whether it can mess with Thread directly. The documentation states that it can control Task and TPL calls along with Monitor.





--Chaz

Power BI Engineering Systems

________________________________

From: Karen Albrecht <Ka...@microsoft.com>>

Sent: Thursday, October 7, 2021 5:25 PM

To: Shad Storhaug <sh...@shadstorhaug.com>>; dev@lucenenet.apache.org<ma...@lucenenet.apache.org> <de...@lucenenet.apache.org>>; Aaron Meyers <Aa...@microsoft.com>>; Dustin Campbell <Du...@microsoft.com>>; Donald Drake <do...@microsoft.com>>

Cc: Vipul Gupta <Gu...@microsoft.com>>; Chaz Beck <Ch...@microsoft.com>>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Hi @Dustin Campbell and @Donald Drake -



Good to see you guys again!  😃



I need your help finding someone who can help investigate a threading issue we are seeing in a package called lucene.net.  Power BI uses the package for search indexing and we critically need the next version of the package to unblock the intermittent build failures we are hitting in our main build pipeline.  The build failures cause a 7% reduction in pipeline reliability, so this is an urgent issue for Power BI.



Lucene.Net supports .NET Core/.NET 5/.NET 6 and is hitting a threading issue (described in detail below).  The .Net team has grown a lot since we last worked together (😃) and I can't tell if Svetlana's team can help with the investigation or if this should go to Steve's team.  Do you know who can help look into this issue?



-Karen



-----Original Message-----

From: Shad Storhaug <sh...@shadstorhaug.com>>

Sent: Thursday, October 7, 2021 2:32 PM

To: Karen Albrecht <Ka...@microsoft.com>>; dev@lucenenet.apache.org<ma...@lucenenet.apache.org>; Aaron Meyers <Aa...@microsoft.com>>

Cc: Vipul Gupta <Gu...@microsoft.com>>; Chaz Beck <Ch...@microsoft.com>>

Subject: [EXTERNAL] RE: Code analyzer fix - new release (beta00015) timeline?



[You don't often get email from shad@shadstorhaug.com. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.]



Hi Karen,



No, I have not been in contact with the .NET team about this because as I mentioned, I have no definite proof that Thread.Interrupt() is the underlying cause. But we could use some debugging help getting to the bottom of this issue, yes. These are literally the only 2 test failures we have on the Lucene.Net core library, and I suspect that fixing the cause of the failures will also fix the deadlocks.



Furthermore, it isn't clear whether using Thread.Interrupt() in this way is intended to be a valid use case in .NET as it is in Java. It may be that another structure such as AutoResetEvent or ManualResetEvent would be a better design choice to propagate an event signal in .NET. But maybe asking an engineer could at least clarify whether we are dealing with a limitation we need to work around in .NET that doesn't exist in Java or whether this is some kind of .NET bug.
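To make the alternative concrete, here is a minimal sketch of signaling a background thread with an explicit event object instead of an interrupt. It is written in Java, using CountDownLatch as a rough one-shot stand-in for a .NET ManualResetEvent (an assumption for illustration only; a real ManualResetEvent can also be reset). The class and method names are hypothetical and nothing here is taken from the Lucene.NET code base.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class StopSignalDemo {

    // Returns true if the worker noticed the explicit stop signal and exited.
    static boolean workerStopsOnSignal() {
        CountDownLatch stop = new CountDownLatch(1);    // "unset" until countDown()
        CountDownLatch stopped = new CountDownLatch(1); // set by the worker on exit

        Thread worker = new Thread(() -> {
            try {
                // await(...) returns false on timeout and true once signaled,
                // so the loop keeps working until the stop signal is set.
                while (!stop.await(10, TimeUnit.MILLISECONDS)) {
                    // one unit of background work would go here
                }
                stopped.countDown();
            } catch (InterruptedException e) {
                // Not expected in this sketch; restore the status and exit.
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        stop.countDown(); // signal the worker without interrupting it
        try {
            return stopped.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("worker stopped on signal: " + workerStopsOnSignal());
    }
}
```

The trade-off in this kind of design is that the worker only notices the signal at the points where it explicitly checks, whereas an interrupt can wake a thread out of a blocking wait.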



Do note there is another possible issue here:



https://github.com/apache/lucenenet/blob/8ebd83090165d4688729040ea12ad4ed588bf7bf/src/Lucene.Net/Index/IndexWriter.cs#L1165

https://github.com/NightOwl888/J2N/blob/3d3c7211a39395c00aa3dfd1cdf9c01501391d87/src/J2N/Threading/ThreadJob.cs#L479



where we use a workaround to read the "interrupt status" which may also be an issue on .NET Core. The contrived tests for it pass, but it may also be that we cannot rely on this approach in .NET to mimic the missing interrupted() Java method to read the status.



The JDK docs for interrupt() are here: https://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html.
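For reference, the Java-side behavior being mimicked can be sketched as follows. This is a minimal, self-contained illustration of the standard JDK pattern (catch InterruptedException, restore the status with Thread.currentThread().interrupt(), and let the caller read it with Thread.interrupted()). The class and method names are hypothetical, and it says nothing about how .NET or J2N behaves.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class InterruptStatusDemo {

    // Simulates library code that blocks and is interrupted. It does not
    // rethrow, so it restores the interrupt status for its caller instead.
    static void blockingLibraryCall() {
        try {
            Thread.sleep(10_000);
        } catch (InterruptedException e) {
            // Throwing InterruptedException clears the status; set it again
            // so code further up the stack can still observe the interrupt.
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if the calling code observed the restored status.
    static boolean callerSeesInterrupt() {
        AtomicBoolean observed = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            blockingLibraryCall();
            // Thread.interrupted() reads the status and clears it.
            observed.set(Thread.interrupted());
        });
        worker.start();
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println("caller saw interrupt: " + callerSeesInterrupt());
    }
}
```

In Java the status is an ordinary readable flag, which is the part that has no direct equivalent in .NET and is why a workaround is needed there.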



Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson - Apache Lucene.NET





-----Original Message-----

From: Karen Albrecht <Ka...@microsoft.com>>

Sent: Friday, October 8, 2021 3:26 AM

To: Shad Storhaug <sh...@shadstorhaug.com>>; dev@lucenenet.apache.org<ma...@lucenenet.apache.org>; Aaron Meyers <Aa...@microsoft.com>>

Cc: Vipul Gupta <Gu...@microsoft.com>>; Chaz Beck <Ch...@microsoft.com>>

Subject: RE: Code analyzer fix - new release (beta00015) timeline?



Hi @Shad Storhaug -



First, I really like your profile name (NightOwl888).  Super cool.  😃



Second, this seems like an issue where the different versions of .Net handle thread safety differently.  Have you already connected with the .Net team to investigate this?  Do you have contacts?  If not, would you like some?



-Karen



-----Original Message-----

From: Shad Storhaug <sh...@shadstorhaug.com>>

Sent: Thursday, October 7, 2021 1:15 PM

To: dev@lucenenet.apache.org<ma...@lucenenet.apache.org>; Aaron Meyers <Aa...@microsoft.com>>

Cc: Karen Albrecht <Ka...@microsoft.com>>; Vipul Gupta <Gu...@microsoft.com>>; Chaz Beck <Ch...@microsoft.com>>

Subject: [EXTERNAL] RE: Code analyzer fix - new release (beta00015) timeline?






Hi Aaron,



Yes, you are correct the code analyzer issue has been addressed but not yet released. You are also correct that issue #429 is not blocking a release.



However, we have an issue with (I suspect) the way Thread.Interrupt() changed between .NET Framework and .NET Core/.NET 5/.NET 6 that has been causing issues since we integrated with .NET Core. After fixing several concurrency issues (missing locks due to differences in collections between Java and .NET and incorrect translations during porting) there are now random deadlocks. In turn, this is blocking https://github.com/apache/lucenenet/pull/511 from being merged and has forced us to revert https://github.com/apache/lucenenet/pull/513, which cause the deadlocks to happen much more frequently.



The TestThreadInterruptDeadlock() and TestTwoThreadsInterruptDeadlock() tests have always failed on .NET Core. These are being tracked in https://github.com/apache/lucenenet/issues/269, but do note that the info is a bit out of date. The current stack trace looks like:



 TestThreadInterruptDeadlock

   Source: TestIndexWriter.cs line 1434

   Duration: 2 sec



  Message:

Lucene.Net.Util.LuceneSystemException : MockDirectoryWrapper: cannot close: there are still open files: {_75.tim=1, _75.doc=1, _74.dvdd=1, _75.dvdd=1, _75.dnvd=1, _7a.dvdd=1, _74.tim=1, _77.fdt=1, _7c.pos=1, _7c.dnvd=1, _7c.doc=1, _7c.fdt=1, _70.doc=1, _77.pos=1, _78.tim=1, _6t.dvdd=1, _6t.pos=1, _70.dvdd=1, _6t.fdx=1, _7d.tim=1, _7d.doc=1, _6t.tim=1, _6t.dnvd=1, _70.dnvd=1, _7b.dnvd=1, _7b.pos=1, _7a.fdt=1, _70.fdx=1, _7b.fdt=1, _70.tim=1, _78.doc=1, _78.dnvd=1, _78.fdx=1, _77.fdx=1, _7a.dnvd=1, _78.fdt=1, _77.tim=1, _77.doc=1, _6t.doc=1, _70.fdt=1, _6z.dnvd=1, segments_2r=1, _70.pos=1, _78.pos=1, _78.dvdd=1, _7c.dvdd=1, _7d.pos=1, _7b.doc=1, _6z.tim=1, _77.dvdd=1, _7a.pos=1, _7c.fdx=1, _7a.fdx=1, _7b.dvdd=1, _7a.doc=1, _6z.dvdd=1, _7a.tim=1, _7b.fdx=1, _6z.fdt=1, _6t.fdt=1, _6z.fdx=1, _77.dnvd=1, _7d.tip=1, _6z.doc=1, _7b.tim=1, _7c.tim=1, _6z.pos=1, _75.pos=1, _75.fdx=1, _74.dnvd=1, _75.fdt=1, _74.doc=1, _74.fdx=1, _74.fdt=1, _74.pos=1}

Data:

  OriginalMessage: Lucene.Net.Util.LuceneSystemException: MockDirectoryWrapper: cannot close: there are still open files: {_75.tim=1, _75.doc=1, _74.dvdd=1, _75.dvdd=1, _75.dnvd=1, _7a.dvdd=1, _74.tim=1, _77.fdt=1, _7c.pos=1, _7c.dnvd=1, _7c.doc=1, _7c.fdt=1, _70.doc=1, _77.pos=1, _78.tim=1, _6t.dvdd=1, _6t.pos=1, _70.dvdd=1, _6t.fdx=1, _7d.tim=1, _7d.doc=1, _6t.tim=1, _6t.dnvd=1, _70.dnvd=1, _7b.dnvd=1, _7b.pos=1, _7a.fdt=1, _70.fdx=1, _7b.fdt=1, _70.tim=1, _78.doc=1, _78.dnvd=1, _78.fdx=1, _77.fdx=1, _7a.dnvd=1, _78.fdt=1, _77.tim=1, _77.doc=1, _6t.doc=1, _70.fdt=1, _6z.dnvd=1, segments_2r=1, _70.pos=1, _78.pos=1, _78.dvdd=1, _7c.dvdd=1, _7d.pos=1, _7b.doc=1, _6z.tim=1, _77.dvdd=1, _7a.pos=1, _7c.fdx=1, _7a.fdx=1, _7b.dvdd=1, _7a.doc=1, _6z.dvdd=1, _7a.tim=1, _7b.fdx=1, _6z.fdt=1, _6t.fdt=1, _6z.fdx=1, _77.dnvd=1, _7d.tip=1, _6z.doc=1, _7b.tim=1, _7c.tim=1, _6z.pos=1, _75.pos=1, _75.fdx=1, _74.dnvd=1, _75.fdt=1, _74.doc=1, _74.fdx=1, _74.fdt=1, _74.pos=1}  ---> Lucene.Net.Util.LuceneSystemException: unclosed IndexInput: _77.fdt

   --- End of inner exception stack trace ---

   at Lucene.Net.Store.MockDirectoryWrapper.Dispose(Boolean disposing) in F:\Projects\lucenenet\src\Lucene.Net.TestFramework\Store\MockDirectoryWrapper.cs:line 862

   at Lucene.Net.Store.Directory.Dispose() in F:\Projects\lucenenet\src\Lucene.Net\Store\Directory.cs:line 134

   at Lucene.Net.Util.IOUtils.ReThrowUnchecked(Exception th) in F:\Projects\lucenenet\src\Lucene.Net\Util\IOUtils.cs:line 530

   at Lucene.Net.Index.TestIndexWriter.IndexerThreadInterrupt.Run() in F:\Projects\lucenenet\src\Lucene.Net.Tests\Index\TestIndexWriter.cs:line 1413

   at J2N.Threading.ThreadJob.SafeRun(ThreadStart start)

  ----> Lucene.Net.Util.LuceneSystemException : unclosed IndexInput: _77.fdt



  Stack Trace:

MockDirectoryWrapper.Dispose(Boolean disposing) line 862

Directory.Dispose() line 134

IOUtils.ReThrowUnchecked(Exception th) line 530

IndexerThreadInterrupt.Run() line 1413

ThreadJob.SafeRun(ThreadStart start)

--- End of stack trace from previous location ---

TestIndexWriter.TestThreadInterruptDeadlock() line 1465 --LuceneSystemException



Nobody has reported any negative effects due to these Thread.Interrupt() test failures and they either never happened on .NET Framework or the issue was inadvertently fixed by correcting mistranslations in the port. However, the deadlocking is new and (based on the names of the tests) I suspect this is what is expected when the Thread.Interrupt() signal fails to fire as anticipated by the original design. In other words, the missing locks were allowing us to get by until now (while causing other thread safety issues), but with the original locks in place we are getting the expected deadlocks that occur when the interrupt events are not reliable. I haven't had a chance to put in additional logging to prove for certain we are missing interrupt events, though.



Several contributing issues to these test failures have been identified and fixed already, the main one being that the ThreadInterruptedException that happens in the ConcurrentMergeScheduler and in IndexWriter is supposed to be propagated to the calling thread to signal the Thread.Interrupt() call in that thread (to reset the interrupt status according to the comments):



https://github.com/apache/lucenenet/blob/8ebd83090165d4688729040ea12ad4ed588bf7bf/src/Lucene.Net/Index/ConcurrentMergeScheduler.cs#L362

https://github.com/apache/lucenenet/blob/8ebd83090165d4688729040ea12ad4ed588bf7bf/src/Lucene.Net/Index/IndexWriter.cs#L1278



The TestThreadInterruptDeadlock() and TestTwoThreadsInterruptDeadlock() tests use a reference counter to keep track of open files and if it doesn't reach 0 by the time Dispose() is called on the directory, the exception occurs. The ThreadInterruptedException is what decrements the reference counter, which is the primary suspect because it uses Thread.Interrupt() to carry the signal to decrement the reference counter. The mismatched reference count is what is causing the failures to occur, and I suspect the difference in Thread.Interrupt() behavior between .NET Framework and .NET Core is also what is causing the deadlocks to occur (which are more widespread than these two tests).
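As a rough illustration of that failure mode, the sketch below models a directory that counts open files and refuses to close while any count is non-zero. It is a simplified, hypothetical stand-in written in Java, not the actual MockDirectoryWrapper logic; the names TrackingDirectory and closeSucceeds are invented for the example, and the file name is borrowed from the stack trace above. When the cleanup that is supposed to run on interrupt is skipped, the count never returns to zero and the close fails.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class OpenFileTrackerDemo {

    // Hypothetical bookkeeping: one counter per file name, incremented on
    // open and decremented on release.
    static final class TrackingDirectory {
        private final Map<String, Integer> openFiles = new ConcurrentHashMap<>();

        void open(String name) {
            openFiles.merge(name, 1, Integer::sum);
        }

        void release(String name) {
            // Remove the entry entirely once its count would reach zero.
            openFiles.computeIfPresent(name, (k, count) -> count == 1 ? null : count - 1);
        }

        void close() {
            if (!openFiles.isEmpty()) {
                throw new IllegalStateException(
                    "cannot close: there are still open files: " + openFiles);
            }
        }
    }

    // Runs an interrupted worker and reports whether the directory closed cleanly.
    static boolean closeSucceeds(boolean releaseOnInterrupt) {
        TrackingDirectory dir = new TrackingDirectory();
        Thread worker = new Thread(() -> {
            dir.open("_77.fdt");
            try {
                Thread.sleep(10_000);       // stands in for blocking index work
                dir.release("_77.fdt");     // normal completion path
            } catch (InterruptedException e) {
                if (releaseOnInterrupt) {
                    dir.release("_77.fdt"); // cleanup driven by the interrupt
                }
            }
        });
        worker.start();
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            dir.close();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("cleanup ran:     close ok = " + closeSucceeds(true));
        System.out.println("cleanup skipped: close ok = " + closeSucceeds(false));
    }
}
```

The sketch only shows why a lost or unhandled interrupt would surface as a "still open files" error. It does not establish which of the suspected causes is actually responsible.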



While there may be another cause for this issue than Thread.Interrupt(), it seems like the most likely culprit, especially given that the deadlocks occur more frequently on macOS than on other operating systems and that the problem doesn't seem to exist at all on .NET Framework.



There do seem to be some caveats to how interrupting threads works on .NET vs Java: https://docs.microsoft.com/en-us/dotnet/standard/threading/pausing-and-resuming-threads#interrupting-threads, but it is not clear from the docs what changed between .NET Framework and .NET Core that could be the cause.



The bottom line is this issue is currently blocking beta00015 from being released and at present I don't have a lot of time to dedicate to solve it. But suggestions and PRs are welcome.



Note that the code analyzer isn't strictly necessary to compile, it is simply there to prevent a custom TokenStream from being created without sealing the IncrementToken() method or the class itself. I haven't done much experimentation with disabling the analyzer in VS 2017 - would that be an option for you or does the failure to load the DLL supersede that setting?



Thanks,

Shad Storhaug (NightOwl888)

Project Chairperson - Apache Lucene.NET





-----Original Message-----

From: Aaron Meyers <Aa...@microsoft.com.INVALID>

Sent: Thursday, October 7, 2021 1:57 AM

To: dev@lucenenet.apache.org

Cc: Karen Albrecht <Ka...@microsoft.com>; Vipul Gupta <Gu...@microsoft.com>; Chaz Beck <Ch...@microsoft.com>

Subject: Code analyzer fix - new release (beta00015) timeline?



Hi all,



We've recently been hitting the following issue in the Power BI build pipeline. It appears to have started only recently, due to our efforts to move to modern SDK-style projects and MSBuild:

Lucene.Net NuGet (4.8.0-beta00013) does not compile under Visual 2017 - Lucene.Net.CodeAnalysis.Lucene1000_TokenStreamOrItsIncrementTokenMethodMustBeSealedCSAnalyzer · Issue #394 · apache/lucenenet (github.com) <https://github.com/apache/lucenenet/issues/394>



I see that Shad already fixed this issue back in April:

Fix for code analyzer support on Visual Studio 2017 (fixes #394) by NightOwl888 · Pull Request #467 · apache/lucenenet (github.com) <https://github.com/apache/lucenenet/pull/467>



Would it be possible to get a beta00015 release that includes this fix? It's been about 6 months since the last beta release, and from what I see there is only one open issue tagged for the beta00015 milestone. I'm not sure whether it is critically needed for beta00015 or could be deferred:

FieldDoc.Fields boxing issue · Issue #429 · apache/lucenenet (github.com) <https://github.com/apache/lucenenet/issues/429>



Thanks always for your help,



Aaron Meyers

Microsoft Power BI