Posted to dev@lucene.apache.org by "Douglas Campos (JIRA)" <ji...@apache.org> on 2009/03/27 17:54:50 UTC

[jira] Created: (LUCENE-1576) Brazilian Analyzer doesn't remove stopwords when uppercase is given

Brazilian Analyzer doesn't remove stopwords when uppercase is given
-------------------------------------------------------------------

                 Key: LUCENE-1576
                 URL: https://issues.apache.org/jira/browse/LUCENE-1576
             Project: Lucene - Java
          Issue Type: Bug
          Components: contrib/analyzers
    Affects Versions: 2.3.3, 2.4.2, 2.9, 3.0
         Environment: not applicable
            Reporter: Douglas Campos


The order of the filters matters here: the lowercase token filter needs to be applied before the stopwords are removed. The current chain is:

	result = new StopFilter( result, stoptable );
	result = new BrazilianStemFilter( result, excltable );
	// Convert to lowercase after stemming!
	result = new LowerCaseFilter( result );

LowerCaseFilter must come before StopFilter, and therefore before BrazilianStemFilter as well.

I'll attach a patch at the end of the day; it's straightforward.
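
To make the ordering problem concrete, here is a minimal sketch in plain Java that does not use Lucene at all. The class FilterOrderDemo and the helpers removeStopwords and lowercase are hypothetical stand-ins for StopFilter and LowerCaseFilter, and the sketch assumes the stop list holds lowercase words and is matched case-sensitively. The stop words and tokens are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FilterOrderDemo {

    // Hypothetical stand-in for StopFilter: drops tokens that appear
    // verbatim (case-sensitive) in the stop set.
    static List<String> removeStopwords(List<String> tokens, Set<String> stopwords) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!stopwords.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    // Hypothetical stand-in for LowerCaseFilter.
    static List<String> lowercase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Order described in the report: stopword removal first, lowercase last.
    static List<String> stopThenLower(List<String> tokens, Set<String> stopwords) {
        return lowercase(removeStopwords(tokens, stopwords));
    }

    // Proposed order: lowercase first, then stopword removal.
    static List<String> lowerThenStop(List<String> tokens, Set<String> stopwords) {
        return removeStopwords(lowercase(tokens), stopwords);
    }

    public static void main(String[] args) {
        // Illustrative lowercase stop list and mixed-case input.
        Set<String> stop = new HashSet<>(Arrays.asList("a", "de", "o"));
        List<String> tokens = Arrays.asList("A", "Casa", "DE", "Pedra");

        System.out.println(stopThenLower(tokens, stop)); // [a, casa, de, pedra]
        System.out.println(lowerThenStop(tokens, stop)); // [casa, pedra]
    }
}
```

With the first order the uppercase stopwords slip past the stop list and are only lowercased afterwards, so they end up in the output; with the second order they are removed as expected.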

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-1576) Brazilian Analyzer doesn't remove stopwords when uppercase is given

Posted by "Adriano Crestani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689973#action_12689973 ] 

Adriano Crestani commented on LUCENE-1576:
------------------------------------------

FYI, this topic was already discussed on this thread: http://markmail.org/thread/5wjjl6jx4yoxake5

> Brazilian Analyzer doesn't remove stopwords when uppercase is given
> -------------------------------------------------------------------
>
>                 Key: LUCENE-1576
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1576
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/analyzers
>    Affects Versions: 2.3.3, 2.4.2, 2.9, 3.0
>         Environment: not applicable
>            Reporter: Douglas Campos
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The order of filters matter here, just need to apply lowercase token filter before removing stopwords
> 	result = new StopFilter( result, stoptable );
> 		result = new BrazilianStemFilter( result, excltable );
> 		// Convert to lowercase after stemming!
> 		result = new LowerCaseFilter( result );
> Lowercase must come before BrazilianStemFilter
> At the end of day I'll attach a patch, it's straightforward


[jira] Assigned: (LUCENE-1576) Brazilian Analyzer doesn't remove stopwords when uppercase is given

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-1576:
------------------------------------------

    Assignee: Michael McCandless

> Brazilian Analyzer doesn't remove stopwords when uppercase is given
> -------------------------------------------------------------------
>
>                 Key: LUCENE-1576
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1576
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/analyzers
>    Affects Versions: 2.3.3, 2.4.2, 2.9, 3.0
>         Environment: not applicable
>            Reporter: Douglas Campos
>            Assignee: Michael McCandless
>             Fix For: 2.9
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The order of filters matter here, just need to apply lowercase token filter before removing stopwords
> 	result = new StopFilter( result, stoptable );
> 		result = new BrazilianStemFilter( result, excltable );
> 		// Convert to lowercase after stemming!
> 		result = new LowerCaseFilter( result );
> Lowercase must come before BrazilianStemFilter
> At the end of day I'll attach a patch, it's straightforward


[jira] Commented: (LUCENE-1576) Brazilian Analyzer doesn't remove stopwords when uppercase is given

Posted by "Douglas Campos (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689975#action_12689975 ] 

Douglas Campos commented on LUCENE-1576:
----------------------------------------

After reading this discussion, the next step is to provide the patches, right?


[jira] Updated: (LUCENE-1576) Brazilian Analyzer doesn't remove stopwords when uppercase is given

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1576:
---------------------------------------

    Fix Version/s: 2.9


[jira] Commented: (LUCENE-1576) Brazilian Analyzer doesn't remove stopwords when uppercase is given

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12690005#action_12690005 ] 

Michael McCandless commented on LUCENE-1576:
--------------------------------------------

No need for a patch -- I see it in the thread.  Thanks!


[jira] Resolved: (LUCENE-1576) Brazilian Analyzer doesn't remove stopwords when uppercase is given

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1576.
----------------------------------------

    Resolution: Fixed

Thanks!
