Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2010/04/06 23:41:33 UTC

[jira] Created: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Replace deprecated TermAttribute by new CharTermAttribute
---------------------------------------------------------

                 Key: LUCENE-2372
                 URL: https://issues.apache.org/jira/browse/LUCENE-2372
             Project: Lucene - Java
          Issue Type: Improvement
    Affects Versions: 3.1
            Reporter: Uwe Schindler
             Fix For: 3.1


After LUCENE-2302 is merged to trunk with flex, we need to carry over all tokenizers and consumers of the TokenStreams to the new CharTermAttribute.

We should also think about adding an AttributeFactory that creates a subclass of CharTermAttributeImpl which returns collation keys from its toBytesRef() accessor. CollationKeyFilter would then be obsolete: instead, you could convert any TokenStream to index only CollationKeys simply by changing the attribute implementation.
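The collation-key idea above can be sketched without Lucene on the classpath. The classes below (SketchTermAttribute, CollatedSketchTermAttribute, CollationSketch) and the method toIndexedBytes() are hypothetical stand-ins invented for this illustration; they are not CharTermAttributeImpl or toBytesRef(). The sketch only shows the general shape: the term text stays the same, and a subclass swaps the byte form for a java.text.Collator collation key, so that plain byte order follows the collator's order.

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.Locale;

// Hypothetical stand-in for a term attribute; this is NOT Lucene's CharTermAttributeImpl.
class SketchTermAttribute {
    protected final StringBuilder term = new StringBuilder();

    SketchTermAttribute setTerm(CharSequence text) {
        term.setLength(0);
        term.append(text);
        return this;
    }

    // Default: the bytes handed to the indexer are the UTF-8 encoding of the term.
    byte[] toIndexedBytes() {
        return term.toString().getBytes(StandardCharsets.UTF_8);
    }
}

// Same term text, different byte form: a collation key from java.text.Collator.
class CollatedSketchTermAttribute extends SketchTermAttribute {
    private final Collator collator;

    CollatedSketchTermAttribute(Collator collator) {
        this.collator = collator;
    }

    @Override
    byte[] toIndexedBytes() {
        return collator.getCollationKey(term.toString()).toByteArray();
    }
}

class CollationSketch {
    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        Collator english = Collator.getInstance(Locale.ENGLISH);
        byte[] plainA = new SketchTermAttribute().setTerm("apple").toIndexedBytes();
        byte[] plainB = new SketchTermAttribute().setTerm("Banana").toIndexedBytes();
        byte[] keyA = new CollatedSketchTermAttribute(english).setTerm("apple").toIndexedBytes();
        byte[] keyB = new CollatedSketchTermAttribute(english).setTerm("Banana").toIndexedBytes();

        System.out.println("UTF-8 byte order:         " + Integer.signum(compareBytes(plainA, plainB)));
        System.out.println("collation-key byte order: " + Integer.signum(compareBytes(keyA, keyB)));
        System.out.println("Collator.compare:         " + Integer.signum(english.compare("apple", "Banana")));
    }
}
```

In this sketch no extra filter stage is needed: choosing which attribute class gets instantiated decides what ends up in the index, which is the role the proposal assigns to an AttributeFactory.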

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855498#action_12855498 ] 

Uwe Schindler commented on LUCENE-2372:
---------------------------------------

One more: PerFieldAnalyzerWrapper :( - Sorry



[jira] Updated: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2372:
----------------------------------

    Attachment: LUCENE-2372.patch

Updated patch: KeywordAnalyzer and PerFieldAnalyzerWrapper are now also final, and the backwards layer is removed.

I will commit this later today and proceed with contrib. Robert, we should talk about who does which one!


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        



[jira] Updated: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2372:
----------------------------------

    Attachment: LUCENE-2372.patch

Updated patch after last commit.



[jira] Updated: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2372:
----------------------------------

    Attachment: LUCENE-2372.patch

Small updates.

Just one question: The only non-final Analyzer left is KeywordAnalyzer. If I make it final and also use ReusableTokenizerBase, can we remove the overridesTokenStream check completely? The question is whether anyone would ever want to override this class.

StandardAnalyzer was made final in this patch, so why not this one as well?
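To illustrate why finality makes such a check unnecessary, here is a minimal sketch of a reflection-based "does a subclass override tokenStream?" test. All class names and the String-based method signatures are hypothetical simplifications for this example, not Lucene's actual code; the assumption is only that a check of this kind exists so the reuse path can fall back to a subclass's overridden method.

```java
import java.lang.reflect.Method;

// Hypothetical non-final analyzer: a subclass may override tokenStream,
// so the reuse path has to detect that and defer to the override.
class OpenKeywordAnalyzer {
    public String tokenStream(String text) {
        return "fresh:" + text;
    }

    public String reusableTokenStream(String text) {
        if (overridesTokenStream()) {
            return tokenStream(text); // respect the subclass's behavior
        }
        return "reused:" + text;
    }

    // True when the runtime class declares its own tokenStream method.
    private boolean overridesTokenStream() {
        try {
            Method m = getClass().getMethod("tokenStream", String.class);
            return m.getDeclaringClass() != OpenKeywordAnalyzer.class;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}

// A subclass that triggers the fallback.
class CustomKeywordAnalyzer extends OpenKeywordAnalyzer {
    @Override
    public String tokenStream(String text) {
        return "custom:" + text;
    }
}

// Hypothetical final variant: nobody can override tokenStream, so no check is needed.
final class FinalKeywordAnalyzer {
    public String tokenStream(String text) {
        return "fresh:" + text;
    }

    public String reusableTokenStream(String text) {
        return "reused:" + text;
    }
}

class OverrideCheckSketch {
    public static void main(String[] args) {
        System.out.println(new OpenKeywordAnalyzer().reusableTokenStream("a"));
        System.out.println(new CustomKeywordAnalyzer().reusableTokenStream("a"));
        System.out.println(new FinalKeywordAnalyzer().reusableTokenStream("a"));
    }
}
```

The trade-off is the one raised in the question: the final variant is simpler and avoids reflection, at the cost of breaking anyone who currently subclasses the analyzer.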



[jira] Commented: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855590#action_12855590 ] 

Uwe Schindler commented on LUCENE-2372:
---------------------------------------

Committed core part in revision: 932749



[jira] Updated: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2372:
----------------------------------

    Attachment: LUCENE-2372.patch

Here is a first patch for the core TokenStreams. The tests are not yet changed.

The following things were additionally fixed:
- StandardAnalyzer was made final (a backwards break; we forgot to make it final in the 3.0 TS finalization issue). This enabled me to subclass StopwordAnalyzerBase and remove heavy code duplication. The original code also contained a bug in the tokenStream method (no setReplaceInvalidAcronym call), which was handled correctly in reusableTokenStream. Now both are correct.
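
The kind of bug described in this item, and why routing both methods through one factory avoids it, can be sketched as follows. Every name here (SketchTokenizer, DuplicatedSetupAnalyzer, SharedSetupBase, SharedSetupAnalyzer, createComponents) is a hypothetical stand-in chosen for the illustration; this is not the code from the patch.

```java
// Hypothetical stand-in for a configurable tokenizer.
class SketchTokenizer {
    boolean replaceInvalidAcronym;
    String source;
}

// Duplicated setup: each method configures the tokenizer on its own,
// so one copy can silently drift from the other.
class DuplicatedSetupAnalyzer {
    private SketchTokenizer saved;

    SketchTokenizer tokenStream(String text) {
        SketchTokenizer t = new SketchTokenizer();
        t.source = text;
        // The flag is never set here: the omission this sketch demonstrates.
        return t;
    }

    SketchTokenizer reusableTokenStream(String text) {
        if (saved == null) {
            saved = new SketchTokenizer();
        }
        saved.source = text;
        saved.replaceInvalidAcronym = true;
        return saved;
    }
}

// Shared setup: both entry points go through a single factory method.
abstract class SharedSetupBase {
    private SketchTokenizer saved;

    protected abstract SketchTokenizer createComponents(String text);

    final SketchTokenizer tokenStream(String text) {
        return createComponents(text);
    }

    final SketchTokenizer reusableTokenStream(String text) {
        if (saved == null) {
            saved = createComponents(text);
        } else {
            saved.source = text; // reuse the configured instance, only swap the input
        }
        return saved;
    }
}

final class SharedSetupAnalyzer extends SharedSetupBase {
    @Override
    protected SketchTokenizer createComponents(String text) {
        SketchTokenizer t = new SketchTokenizer();
        t.source = text;
        t.replaceInvalidAcronym = true; // configured in exactly one place
        return t;
    }

    public static void main(String[] args) {
        DuplicatedSetupAnalyzer dup = new DuplicatedSetupAnalyzer();
        SharedSetupAnalyzer shared = new SharedSetupAnalyzer();
        System.out.println("duplicated tokenStream flag: " + dup.tokenStream("x").replaceInvalidAcronym);
        System.out.println("duplicated reusable flag:    " + dup.reusableTokenStream("x").replaceInvalidAcronym);
        System.out.println("shared tokenStream flag:     " + shared.tokenStream("x").replaceInvalidAcronym);
        System.out.println("shared reusable flag:        " + shared.reusableTokenStream("x").replaceInvalidAcronym);
    }
}
```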

I will post further patches for core.



[jira] Updated: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-2372:
----------------------------------

    Attachment: LUCENE-2372.patch

Patch that completely removes deprecated usage of TermAttribute from Lucene core; all tests are fixed as well.



[jira] Commented: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855489#action_12855489 ] 

Mark Miller commented on LUCENE-2372:
-------------------------------------

> If I make it final and

+1 - let's just remember to add these breaks to the CHANGES BW break section...



[jira] Commented: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855493#action_12855493 ] 

Uwe Schindler commented on LUCENE-2372:
---------------------------------------

Already did it for StandardAnalyzer (see patch).



[jira] Commented: (LUCENE-2372) Replace deprecated TermAttribute by new CharTermAttribute

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855492#action_12855492 ] 

Michael McCandless commented on LUCENE-2372:
--------------------------------------------

+1 to making KeywordAnalyzer final.
