Posted to dev@lucenenet.apache.org by Shad Storhaug <sh...@shadstorhaug.com> on 2017/03/20 22:25:35 UTC

API Work/Stabilization Update

I am very close to getting #203 merged. I wouldn't go so far as to say that the API is finished, but the most significant breaking API changes are now behind us.

BUILD/VERSIONING

I want to make sure someone is available to help get the build working after the merge. I think it would be appropriate to change the pre-release label from "beta" to "beta2" (without resetting the build number, since that is actually what NuGet uses). This would be primarily because of the major breaking API changes, but also to signal another step toward release.

We should probably also get this onto NuGet as soon as possible to (hopefully) make it easier to recruit help to stabilize and create some integration packages for popular Microsoft frameworks.

KNOWN ISSUES


1.       The QueryParser.Flexible custom localized message functionality is not yet implemented for .NET Core, so those tests are now failing.

2.       The implementation of Lucene.Net.Expressions currently reads data from the configuration file. This is not how modern libraries are supposed to be built; instead, any configuration should be pushed in from the application that uses Lucene.Net. Reading from the configuration file directly leaves no opportunity to use dependency injection. The Support/Configuration namespace can and should be removed once the implementation is refactored to be DI-friendly (see http://blog.ploeh.dk/2014/05/19/di-friendly-framework/). I haven't yet worked out how the .NET implementation was done. In Java, the defaults were read from an embedded resource file and could be overridden by passing in a ClassLoader (similar to .NET's Assembly class). If anyone has any information on how the "auto generated" C# code was generated, please share.

3.       The Collation functionality in Analysis.Common doesn't work with icu-dotnet, and has been excluded from compilation using the constant FEATURE_COLLATION. I am now convinced after reading the docs that it would be better to port the similar functionality from Analysis.ICU because it was designed to work with icu4j and is therefore more likely to work with icu-dotnet.

4.       The Highlighter PostingsHighlight and VectorHighlight functionality relies on icu-dotnet, which doesn't have a close match for the BreakIterator in the JRE, so there are likely some big differences in the functionality. Several hacks were put in to make the tests pass, but these are not likely to fix all of the issues in the wild.

5.       There are several namespaces in Lucene.Net.Core and Lucene.Net.Codecs that have broken documentation comments.

6.       There are some concurrency and performance issues (as pointed out by Vincent Van Den Berghe): http://git.net/ml/general/2017-02/msg00168.html

7.       We have around two dozen tests that fail intermittently under randomization (averaging about 17 failures per run), and 8 tests that fail all or most of the time.
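To make the goal in item 2 concrete, here is a minimal sketch of the DI-friendly shape, written in Python only for brevity. Every name in it (DEFAULT_FUNCTIONS, ExpressionSettings, ExpressionCompiler, and the function mappings) is hypothetical and is not the Lucene.Net.Expressions API. The idea is that the library ships built-in defaults and the host application pushes in any overrides through the constructor, so nothing in the library reads a configuration file.

```python
from typing import Mapping, Optional

# Hypothetical built-in defaults, standing in for an embedded resource file.
DEFAULT_FUNCTIONS = {
    "abs": "Math.Abs",
    "sqrt": "Math.Sqrt",
}


class ExpressionSettings:
    """Holds function mappings. Nothing here touches a config file."""

    def __init__(self, overrides: Optional[Mapping[str, str]] = None):
        # Start from the library defaults, then layer on whatever the
        # host application chose to push in.
        merged = dict(DEFAULT_FUNCTIONS)
        if overrides:
            merged.update(overrides)
        self._functions = merged

    def resolve(self, name: str) -> str:
        if name not in self._functions:
            raise KeyError(f"unknown expression function: {name!r}")
        return self._functions[name]


class ExpressionCompiler:
    """Receives its settings from the caller (constructor injection)."""

    def __init__(self, settings: Optional[ExpressionSettings] = None):
        # Fall back to pure defaults when the caller supplies nothing.
        self._settings = settings if settings is not None else ExpressionSettings()

    def target_for(self, name: str) -> str:
        return self._settings.resolve(name)


# The application, not the library, decides what to override.
default_compiler = ExpressionCompiler()
custom_compiler = ExpressionCompiler(
    ExpressionSettings({"sqrt": "MyMath.FastSqrt"})
)
```

Because the settings object is an ordinary constructor argument, a DI container (or a unit test) can supply it without any file on disk.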

RESOLVED ISSUES (in addition to API refactoring)


1.       Finished implementing the randomization of Codecs, Culture, Time Zone, and InfoStream in the TestFramework.

2.       Added factories for Codec, DocValuesFormat, and PostingsFormat so custom implementations can be provided via dependency injection instead of using the Java-ish NamedSPILoader class. The name must now be provided by an attribute (or by class naming convention) rather than via constructor, so it can be read without creating an instance of the class.

3.       Fixed several of the codecs in Lucene.Net.Codecs that were still not functioning (and not being tested because of the unfinished RandomCodec class and test mocks).

4.       Reviewed all catch blocks in Lucene.Net.Core to ensure that the right exception types are being caught and the right types re-thrown.

5.       Fixed culture-sensitive comparison and sort order issues when using strings in Lucene.Net.Core and Lucene.Net.Codecs.

6.       Merged similar functionality in Support into the same class and deleted several unused Support classes.

7.       Made the API CLS-compliant, so it can now be consumed from any CLS-compliant .NET language.
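The point of item 2 (reading a codec's name without creating an instance) can be illustrated with a small sketch, again in Python for brevity. The decorator, the name_of helper, the CodecFactory class, and the rule of stripping a trailing "Codec" suffix are all assumptions made for illustration; they are not the actual Lucene.Net types or its exact naming convention.

```python
from typing import Dict


def codec_name(name: str):
    """Class decorator that attaches an explicit name as class-level metadata."""
    def apply(cls):
        cls._codec_name = name
        return cls
    return apply


def name_of(cls: type) -> str:
    """Resolve a codec's name from the type alone; no instance is created."""
    # Look only at the class's own metadata so a subclass does not
    # silently inherit its parent's explicit name.
    explicit = cls.__dict__.get("_codec_name")
    if explicit is not None:
        return explicit
    # Naming-convention fallback (assumed rule): drop a trailing "Codec".
    name = cls.__name__
    if name.endswith("Codec") and len(name) > len("Codec"):
        return name[: -len("Codec")]
    return name


class CodecFactory:
    """Maps names to codec types and instantiates only on request."""

    def __init__(self):
        self._types: Dict[str, type] = {}

    def register(self, cls: type) -> None:
        self._types[name_of(cls)] = cls

    def get(self, name: str):
        return self._types[name]()


@codec_name("Lucene46")
class MyCustomCodec:
    pass


class SimpleTextCodec:
    instances = 0  # counts constructions, to show registration creates none

    def __init__(self):
        SimpleTextCodec.instances += 1


factory = CodecFactory()
factory.register(MyCustomCodec)
factory.register(SimpleTextCodec)
```

Registration only inspects the types, so a custom factory can be swapped in by the application and codecs are constructed lazily when first requested.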


Shad Storhaug (NightOwl888)