Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2023/01/11 17:15:04 UTC

[GitHub] [lucene] tang-hi commented on issue #11902: Customization of Edit distance costs for different operations

tang-hi commented on issue #11902:
URL: https://github.com/apache/lucene/issues/11902#issuecomment-1379206167

   Lucene does not compute the Levenshtein distance term by term. Instead, it precompiles a Levenshtein automaton from your input (the query term) and then finds the indexed terms that the automaton accepts, i.e. those within the required distance. The automaton's state transitions are hard-coded in the source. This is also why the maximum edit distance supported by Lucene is 2.
   I think it would be difficult to support custom distances with different operation costs, because this would involve reworking the code related to the Levenshtein automaton.
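
   To make the idea concrete, here is a minimal sketch of matching terms with an automaton built from the query term. It is not Lucene's implementation: Lucene uses precomputed transition tables, whereas this sketch derives each transition on the fly from a dynamic-programming row, and the class name `LevenshteinAutomatonSketch` and its methods are made up for illustration. Note that the unit cost of each operation is baked into `step`, which hints at why changing the costs means reworking the automaton itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a Levenshtein automaton simulated on the fly (hypothetical class, not a Lucene API). */
public class LevenshteinAutomatonSketch {
    private final int[] query;   // code points of the query term
    private final int maxEdits;  // maximum allowed edit distance

    public LevenshteinAutomatonSketch(String queryTerm, int maxEdits) {
        this.query = queryTerm.codePoints().toArray();
        this.maxEdits = maxEdits;
    }

    /** Start state: cost of turning the empty string into each prefix of the query. */
    public int[] start() {
        int[] row = new int[query.length + 1];
        for (int i = 0; i < row.length; i++) {
            row[i] = Math.min(i, maxEdits + 1); // cap values so the state space stays finite
        }
        return row;
    }

    /** Transition: consume one character of the candidate term. All operations cost 1. */
    public int[] step(int[] state, int c) {
        int[] next = new int[state.length];
        next[0] = Math.min(state[0] + 1, maxEdits + 1);
        for (int i = 1; i < state.length; i++) {
            int substitute = state[i - 1] + (query[i - 1] == c ? 0 : 1);
            int extraChar = state[i] + 1;      // candidate has a character the query lacks
            int missingChar = next[i - 1] + 1; // candidate lacks a character of the query
            int best = Math.min(substitute, Math.min(extraChar, missingChar));
            next[i] = Math.min(best, maxEdits + 1);
        }
        return next;
    }

    /** A state is dead once every entry exceeds the budget; no suffix can rescue it. */
    public boolean canMatch(int[] state) {
        return Arrays.stream(state).min().getAsInt() <= maxEdits;
    }

    public boolean isAccepting(int[] state) {
        return state[state.length - 1] <= maxEdits;
    }

    /** Runs the candidate term through the automaton, stopping early on a dead state. */
    public boolean accepts(String term) {
        int[] state = start();
        for (int c : term.codePoints().toArray()) {
            state = step(state, c);
            if (!canMatch(state)) {
                return false;
            }
        }
        return isAccepting(state);
    }

    public static void main(String[] args) {
        LevenshteinAutomatonSketch automaton = new LevenshteinAutomatonSketch("lucene", 2);
        List<String> matches = new ArrayList<>();
        for (String term : List.of("lucene", "lucine", "luce", "lucky", "solr")) {
            if (automaton.accepts(term)) {
                matches.add(term);
            }
        }
        System.out.println(matches);
    }
}
```

   The automaton is built once per query and then only consumes candidate terms character by character, so a hopeless term can be rejected as soon as it reaches a dead state.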
   
   If you're interested, you can check out the following resource:
   [Blog](https://blog.mikemccandless.com/2011/03/lucenes-fuzzyquery-is-100-times-faster.html)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: issues-help@lucene.apache.org