Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2011/05/10 20:09:47 UTC

[jira] [Created] (LUCENE-3086) add ElisionsFilter to ItalianAnalyzer

add ElisionsFilter to ItalianAnalyzer
-------------------------------------

                 Key: LUCENE-3086
                 URL: https://issues.apache.org/jira/browse/LUCENE-3086
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Robert Muir
             Fix For: 3.2, 4.0
         Attachments: LUCENE-3086.patch

We enable the elision filter for French by default, but not for Italian.
We should enable it with the standard Italian contractions (e.g. elided definite articles).

The stemmers for these languages assume elision has already been handled
and do nothing about it. In general, things like Snowball assume very simple
tokenization, where you split on the word-internal apostrophe ('), and they add
the resulting fragments to their stoplists.
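
To make the behaviour concrete, here is a minimal, self-contained sketch of what stripping elisions from tokens could look like. It does not use Lucene's API and is not the code from the attached patch: the class name ElisionSketch, the method stripElision, and the short list of elided forms are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; this is not Lucene's filter class.
final class ElisionSketch {

    // Illustrative subset of Italian elided forms (an assumption, not the patch's list).
    static final Set<String> ELIDED = new HashSet<>(Arrays.asList(
            "l", "un", "dell", "all", "dall", "nell", "sull", "quell"));

    // Returns the token without a leading elided form, or the token unchanged.
    static String stripElision(String token) {
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            // Accept both the ASCII and the typographic apostrophe.
            if (c == '\'' || c == '\u2019') {
                String prefix = token.substring(0, i).toLowerCase(Locale.ROOT);
                boolean hasRest = i < token.length() - 1;
                if (hasRest && ELIDED.contains(prefix)) {
                    return token.substring(i + 1);
                }
                return token; // only the first apostrophe is considered
            }
        }
        return token;
    }

    public static void main(String[] args) {
        for (String t : new String[] {"l'albero", "dell'arte", "Un'amica", "po'", "casa"}) {
            System.out.println(t + " -> " + stripElision(t));
        }
    }
}
```

Under these assumptions, a token such as l'albero reaches the stemmer as albero, while tokens with no recognised elided prefix (po', casa) pass through unchanged.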

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3086) add ElisionsFilter to ItalianAnalyzer

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3086:
--------------------------------

    Attachment: LUCENE-3086.patch



[jira] [Resolved] (LUCENE-3086) add ElisionsFilter to ItalianAnalyzer

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-3086.
---------------------------------

    Resolution: Fixed

Committed in revisions 1102120 and 1102127.
