Posted to dev@lucene.apache.org by "Tommaso Teofili (JIRA)" <ji...@apache.org> on 2014/11/03 09:06:34 UTC

[jira] [Assigned] (LUCENE-5548) Improve flexibility and testability of the classification module

     [ https://issues.apache.org/jira/browse/LUCENE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tommaso Teofili reassigned LUCENE-5548:
---------------------------------------

    Assignee: Tommaso Teofili

> Improve flexibility and testability of the classification module
> ----------------------------------------------------------------
>
>                 Key: LUCENE-5548
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5548
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/classification
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>              Labels: gsoc2014, mentor
>
> The flexibility and capabilities of the Lucene classification module may be improved with the following:
> - make it possible to use the classifiers "online" (or provide online versions of them), so that when the underlying index (reader) is updated, a classifier doesn't need to be trained again to take newly added docs into account
> - optionally pass a different Analyzer (or a TokenStream directly) together with the text to be classified, to specify custom tokenization/filtering
> - normalize score calculations of existing classifiers
> - provide accuracy and speed tests based on publicly available datasets
> - add more Lucene-based classification algorithms
> Specific subtasks should be created for each of the above topics so that each can be discussed in depth.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
