Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2014/03/18 05:53:47 UTC
[jira] [Updated] (LUCENE-5111) Fix WordDelimiterFilter
[ https://issues.apache.org/jira/browse/LUCENE-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-5111:
--------------------------------
Attachment: LUCENE-5111.patch
Here is a patch. It's not super-optimized, but the 3 common conditions (no delimiters, all delimiters, just one word surrounded by delimiters) are just as fast. For the concatenation+parts stuff I used captureState (we can avoid it; it was just about correctness for me).
I think this is fairly important to fix, so that users can use e.g. the postings highlighter and don't hit bugs like http://stackoverflow.com/questions/20324016/shingle-filter-factory-startoffset-must-be-non-negative-and-endoffset-must-be
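To make the three common conditions above concrete, here is a rough standalone sketch in plain Java. It is not the code from the patch and does not use the Lucene API. The class name FastPathSketch, the Kind enum, and both helper methods are hypothetical, and it assumes that any character that is not a letter or digit is a delimiter (the real filter has more rules than that). It also assumes the term text has the same length as the span between the token's start and end offsets, so that trimmed positions can simply be added to the start offset.

```java
public class FastPathSketch {

    // The three cheap cases from the comment above, plus everything else.
    enum Kind { NO_DELIMITERS, ALL_DELIMITERS, SINGLE_WORD, GENERAL }

    // Assumption for this sketch only: non-alphanumeric means delimiter.
    static boolean isDelimiter(char c) {
        return !Character.isLetterOrDigit(c);
    }

    // Index of the first non-delimiter character (term.length() if none).
    static int trimStart(String term) {
        int start = 0;
        while (start < term.length() && isDelimiter(term.charAt(start))) {
            start++;
        }
        return start;
    }

    // Index just past the last non-delimiter character, never below start.
    static int trimEnd(String term, int start) {
        int end = term.length();
        while (end > start && isDelimiter(term.charAt(end - 1))) {
            end--;
        }
        return end;
    }

    static Kind classify(String term) {
        int start = trimStart(term);
        int end = trimEnd(term, start);
        if (start == end) {
            // Nothing but delimiters (an empty term also lands here).
            return Kind.ALL_DELIMITERS;
        }
        for (int i = start; i < end; i++) {
            if (isDelimiter(term.charAt(i))) {
                // Interior delimiter: needs the full splitting logic.
                return Kind.GENERAL;
            }
        }
        boolean untouched = start == 0 && end == term.length();
        return untouched ? Kind.NO_DELIMITERS : Kind.SINGLE_WORD;
    }

    // Offsets of the single word left after trimming leading and trailing
    // delimiters, relative to the original text. Because start <= end and
    // both lie within the term, the result is non-negative and ordered
    // whenever tokenStartOffset is non-negative.
    static int[] singleWordOffsets(String term, int tokenStartOffset) {
        int start = trimStart(term);
        int end = trimEnd(term, start);
        return new int[] { tokenStartOffset + start, tokenStartOffset + end };
    }
}
```

In this sketch only the GENERAL case would need the more expensive path (saving and restoring token state for the concatenation+parts output). The offset helper shows the invariant that matters for the highlighter: the start offset stays non-negative and the end offset is never smaller than the start offset.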
> Fix WordDelimiterFilter
> -----------------------
>
> Key: LUCENE-5111
> URL: https://issues.apache.org/jira/browse/LUCENE-5111
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Attachments: LUCENE-5111.patch
>
>
> WordDelimiterFilter is documented as broken in TestRandomChains (LUCENE-4641). Given how widely used it is, we should try to fix it.