Posted to dev@lucenenet.apache.org by "CrommVardek (via GitHub)" <gi...@apache.org> on 2023/02/28 12:00:35 UTC

[GitHub] [lucenenet] CrommVardek commented on issue #785: Lucene.net sort returns doc with NULL fields. Invalidating sort

CrommVardek commented on issue #785:
URL: https://github.com/apache/lucenenet/issues/785#issuecomment-1448061298

   The null fields were returned because of the analyzer used for both indexing and searching:
   
   `private readonly StandardAnalyzer _standardAnalyzer = new(AppLuceneVersion);`
   
   This means some last names (Will, To, etc.) were treated as stop words. Those fields were therefore set to NULL, and the sort handled them as if the field were empty (null).
   
   Changing it to `private readonly StandardAnalyzer _standardAnalyzer = new(AppLuceneVersion, CharArraySet.EMPTY_SET);` solved the issue.
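
To make the difference between the two constructors concrete, here is a minimal sketch that prints the terms each analyzer emits for a few last names. It assumes Lucene.NET 4.8 with the `Lucene.Net` and `Lucene.Net.Analysis.Common` packages referenced, and that `AppLuceneVersion` is `LuceneVersion.LUCENE_48`. `StopWordDemo`, `Tokens`, and the `"lastname"` field name are hypothetical names used only for illustration, and `CharArraySet.EMPTY_SET` is the identifier quoted in the comment above (its name may differ in other Lucene.NET releases).

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class StopWordDemo
{
    // Assumption: the application targets the Lucene 4.8 compatibility version.
    private const LuceneVersion AppLuceneVersion = LuceneVersion.LUCENE_48;

    // Hypothetical helper: collects the terms the analyzer emits for one field value.
    public static List<string> Tokens(Analyzer analyzer, string text)
    {
        var terms = new List<string>();
        using (TokenStream stream = analyzer.GetTokenStream("lastname", text))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                terms.Add(term.ToString());
            }
            stream.End();
        }
        return terms;
    }

    public static void Main()
    {
        // Default constructor: uses the English stop word set.
        using var withStopWords = new StandardAnalyzer(AppLuceneVersion);
        // Explicit empty set: no token is dropped as a stop word.
        using var withoutStopWords = new StandardAnalyzer(AppLuceneVersion, CharArraySet.EMPTY_SET);

        foreach (string lastname in new[] { "Will", "To", "Smith" })
        {
            string byDefault = string.Join(",", Tokens(withStopWords, lastname));
            string byEmptySet = string.Join(",", Tokens(withoutStopWords, lastname));
            Console.WriteLine($"{lastname}: default=[{byDefault}] emptySet=[{byEmptySet}]");
        }
    }
}
```

With the default constructor, "Will" and "To" should produce no terms at all, while "Smith" should still yield `smith`. With the empty stop word set, all three names should yield one lowercased term each.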
   
   IMHO, the stop word set should be opt-in, and the constructor that does not take the CharArraySet parameter should default to CharArraySet.EMPTY_SET, not ENGLISH_STOP_WORDS_SET. Mostly because: not everyone is building search engines for English only, and not everyone is building search engines for common words only. Defaulting to ENGLISH_STOP_WORDS_SET assumes a certain business case / usage of the search engine, but really it should not interfere with the context of usage.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@lucenenet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org