Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2014/01/09 05:04:51 UTC

[jira] [Updated] (LUCENE-5388) Eliminate construction over readers for Tokenizer

     [ https://issues.apache.org/jira/browse/LUCENE-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-5388:
--------------------------------

    Attachment: LUCENE-5388.patch

Attached are Benson's changes, merged up to the latest trunk, as a patch for review.

I also simplified the PreAnalyzed code further (only possible because of this change), since it still had problems, and fixed the test for UpperCaseFilter.

I think this is good to go.

> Eliminate construction over readers for Tokenizer
> -------------------------------------------------
>
>                 Key: LUCENE-5388
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5388
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/other
>            Reporter: Benson Margulies
>         Attachments: LUCENE-5388.patch
>
>
> Tokenizers are now intended to be reusable, with input supplied via #setReader. The constructors that take a Reader are a vestige. Worse, they invite people to make mistakes in handling the reader that tangle them up with the state machine in Tokenizer. The sensible fix is to eliminate these constructors and force the use of setReader.
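
For readers unfamiliar with the pattern the description refers to, here is a minimal, self-contained sketch of a reusable tokenizer whose input arrives only through a setReader call. It does not use Lucene's classes: SketchWhitespaceTokenizer, its next() method, and TokenizerReuseSketch are hypothetical names invented for illustration, and the small state check is only an assumption about the kind of state machine the description mentions, not a copy of Tokenizer's real one.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver: reuses ONE tokenizer instance for two inputs.
class TokenizerReuseSketch {
    static List<String> tokenize(SketchWhitespaceTokenizer t, String text) throws IOException {
        t.setReader(new StringReader(text)); // supply input after construction
        t.reset();                           // make the supplied reader active
        List<String> out = new ArrayList<>();
        for (String tok = t.next(); tok != null; tok = t.next()) {
            out.add(tok);
        }
        t.close();                           // release input so the instance can be reused
        return out;
    }

    public static void main(String[] args) throws IOException {
        SketchWhitespaceTokenizer t = new SketchWhitespaceTokenizer();
        System.out.println(tokenize(t, "first document"));
        System.out.println(tokenize(t, "second one here"));
    }
}

// Hypothetical stand-in for a reusable tokenizer; NOT Lucene's Tokenizer.
class SketchWhitespaceTokenizer {
    private Reader pending; // set by setReader(), not yet active
    private Reader input;   // active only between reset() and close()

    // Deliberately no constructor that takes a Reader.
    void setReader(Reader reader) {
        if (input != null) {
            throw new IllegalStateException("close() the previous input first");
        }
        pending = reader;
    }

    void reset() {
        if (pending == null) {
            throw new IllegalStateException("call setReader() before reset()");
        }
        input = pending;
        pending = null;
    }

    // Returns the next whitespace-delimited token, or null at end of input.
    String next() throws IOException {
        if (input == null) {
            throw new IllegalStateException("call reset() before next()");
        }
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (sb.length() > 0) {
                    break; // token complete
                }
            } else {
                sb.append((char) c);
            }
        }
        return sb.length() > 0 ? sb.toString() : null;
    }

    void close() throws IOException {
        if (input != null) {
            input.close();
            input = null;
        }
    }
}
```

Because the instance never receives a reader at construction time, there is only one path by which input enters it, which is the simplification the issue argues for. In Lucene itself the consumer workflow has more steps (for example end() before close()), so treat this only as an outline of the idea.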



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)