Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2023/01/11 23:57:25 UTC

[GitHub] [lucene] hossman commented on issue #12077: WordBreakSpellChecker.maxEvaluations usage in generateBreakUpSuggestions() makes no sense

hossman commented on issue #12077:
URL: https://github.com/apache/lucene/issues/12077#issuecomment-1379633112

   FWIW: It also seems strange to me that this method is essentially doing a "depth first" walk of the possible splits, given that it's working a character at a time and the only possible `BreakSuggestionSortMethod` values start with `NUM_CHANGES_THEN_...`.
   
   It seems like we could get "better" results with lower values of `maxEvaluations`, and with less recursion (even if `maxChanges` is very large), if the logic were something like:
   
   * init a BitSet the same length as our input
   * loop over each character position (`i`)
     * break if `totalEvaluations >= maxEvaluations` otherwise increment `totalEvaluations`
     * if `leftWord` is a "valid" suggestion, record `i` in our bitset
     * if `rightWord` is also a "valid" suggestion, offer this left+right combo to our `suggestions` queue
   * if `numberBreaks` has not yet exceeded `maxChanges`:
     * loop over each set bit (`i`) in our BitSet:
       * break if `totalEvaluations >= maxEvaluations` otherwise increment `totalEvaluations`
       * recursively parse the portion of our input to the "right" of `i` (in the context of a new prefix using the portion to the "left" of `i` and an incremented `numberBreaks`)
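
To make the steps above concrete, here is a rough, self-contained sketch of that two-pass idea. It is not the actual `WordBreakSpellChecker` code: the class name `BreakUpSketch`, the `Set<String>` dictionary (a stand-in for checking terms against the index), and the plain `List` used in place of the `suggestions` queue are all assumptions made for illustration. It also assumes `numberBreaks` starts at 1 and that recursion is allowed while `numberBreaks < maxChanges`, and it splits on `char` indices rather than code points to keep things short.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from Lucene itself.
class BreakUpSketch {
    private final Set<String> dictionary; // assumption: stands in for index term lookups
    private final int maxChanges;
    private final int maxEvaluations;
    private int totalEvaluations = 0;
    private final List<List<String>> suggestions = new ArrayList<>();

    BreakUpSketch(Set<String> dictionary, int maxChanges, int maxEvaluations) {
        this.dictionary = dictionary;
        this.maxChanges = maxChanges;
        this.maxEvaluations = maxEvaluations;
    }

    List<List<String>> suggest(String input) {
        generate(input, new ArrayList<>(), 1);
        return suggestions;
    }

    private void generate(String input, List<String> prefix, int numberBreaks) {
        int len = input.length();
        BitSet validLeft = new BitSet(len);

        // Pass 1: try every split point at this depth before recursing at all.
        for (int i = 1; i < len; i++) {
            if (totalEvaluations >= maxEvaluations) {
                break;
            }
            totalEvaluations++;
            String leftWord = input.substring(0, i);
            if (!dictionary.contains(leftWord)) {
                continue;
            }
            validLeft.set(i); // remember that a valid left word ends at i
            String rightWord = input.substring(i);
            if (dictionary.contains(rightWord)) {
                List<String> combo = new ArrayList<>(prefix);
                combo.add(leftWord);
                combo.add(rightWord);
                suggestions.add(combo);
            }
        }

        // Pass 2: only now go deeper, and only from split points with a valid left word.
        if (numberBreaks < maxChanges) {
            for (int i = validLeft.nextSetBit(0); i >= 0; i = validLeft.nextSetBit(i + 1)) {
                if (totalEvaluations >= maxEvaluations) {
                    break;
                }
                totalEvaluations++;
                List<String> newPrefix = new ArrayList<>(prefix);
                newPrefix.add(input.substring(0, i));
                generate(input.substring(i), newPrefix, numberBreaks + 1);
            }
        }
    }
}
```

With a dictionary of `a`, `b`, `ab`, `c` and the input `abc`, `maxChanges = 2` and a generous `maxEvaluations` give `[ab, c]` followed by `[a, b, c]`. Dropping `maxEvaluations` to 2 still yields `[ab, c]`, because the whole budget is spent on the single-break candidates before any recursion happens.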
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: issues-help@lucene.apache.org