Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2010/04/22 13:11:49 UTC
[jira] Created: (LUCENE-2409) add a tokenfilter for icu transforms
add a tokenfilter for icu transforms
------------------------------------
Key: LUCENE-2409
URL: https://issues.apache.org/jira/browse/LUCENE-2409
Project: Lucene - Java
Issue Type: New Feature
Components: contrib/*
Affects Versions: 3.1
Reporter: Robert Muir
Priority: Minor
Fix For: 3.1
I pulled the ICUTransformFilter out of LUCENE-1488 and created an issue for it here.
This is a tokenfilter that applies an ICU Transliterator, which is a context-sensitive way
to transform text.
Transliterators are typically rule-based: you can use the ones included with ICU (such as Traditional-Simplified)
or build your own from a custom set of rules.
User's Guide: http://userguide.icu-project.org/transforms/general
Rule Tutorial: http://userguide.icu-project.org/transforms/general/rules
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
[jira] Commented: (LUCENE-2409) add a tokenfilter for icu transforms
Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859801#action_12859801 ]
Robert Muir commented on LUCENE-2409:
-------------------------------------
Thanks Uwe, I will remove the "crude benchmark" (since tokenfilters can be benchmarked with the benchmark module) and add some examples to overview.html.
[jira] Assigned: (LUCENE-2409) add a tokenfilter for icu transforms
Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir reassigned LUCENE-2409:
-----------------------------------
Assignee: Robert Muir
[jira] Commented: (LUCENE-2409) add a tokenfilter for icu transforms
Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859793#action_12859793 ]
Uwe Schindler commented on LUCENE-2409:
---------------------------------------
Go for it, it's a private impl class, what else should we do? Speed, speed, speed. It's better than copying into a StringBuilder before and after. Even Java 6 has no Replaceable interface!
[jira] Updated: (LUCENE-2409) add a tokenfilter for icu transforms
Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-2409:
--------------------------------
Attachment: LUCENE-2409.patch
Attached is a patch; it's a little ugly since CharTermAttribute doesn't implement Replaceable :)
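For readers unfamiliar with the issue: ICU edits text through its Replaceable interface, so a term buffer that does not implement it needs a small adapter, or else the term has to be copied out and back for every token. The sketch below is not the code from the patch. It is a plain-Java illustration, with hypothetical names (TermBuffer, replace), of the in-place idea: a growable char[] whose replace(start, limit, text) shifts the tail inside the same buffer instead of building a new string.

```java
import java.util.Arrays;

/**
 * Hypothetical stand-in for a term buffer: a growable char[] plus a
 * logical length. The names here are illustrative only and are not
 * taken from the LUCENE-2409 patch or from ICU.
 */
final class TermBuffer {
    private char[] buf;
    private int len;

    TermBuffer(String initial) {
        buf = initial.toCharArray();
        len = buf.length;
    }

    int length() {
        return len;
    }

    char charAt(int i) {
        if (i < 0 || i >= len) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return buf[i];
    }

    /** Replaces the range [start, limit) with text, reusing the same buffer. */
    void replace(int start, int limit, String text) {
        if (start < 0 || limit < start || limit > len) {
            throw new IndexOutOfBoundsException("range " + start + ".." + limit);
        }
        int newLen = len - (limit - start) + text.length();
        if (newLen > buf.length) {
            // Grow only when the replacement does not fit.
            buf = Arrays.copyOf(buf, Math.max(newLen, buf.length * 2));
        }
        // Move the tail first so the new text cannot overwrite it.
        System.arraycopy(buf, limit, buf, start + text.length(), len - limit);
        text.getChars(0, text.length(), buf, start);
        len = newLen;
    }

    @Override
    public String toString() {
        return new String(buf, 0, len);
    }

    public static void main(String[] args) {
        TermBuffer term = new TermBuffer("colour");
        term.replace(4, 5, "");      // drop the 'u'
        System.out.println(term);
        term.replace(0, 0, "the ");  // insertion forces the buffer to grow
        System.out.println(term);
    }
}
```

A real adapter would expose the remaining Replaceable methods as well; the point here is only that edits touch the existing array, which is the speed argument made in the comments on this issue.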
[jira] Resolved: (LUCENE-2409) add a tokenfilter for icu transforms
Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir resolved LUCENE-2409.
---------------------------------
Resolution: Fixed
Committed revision 937039.
[jira] Updated: (LUCENE-2409) add a tokenfilter for icu transforms
Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-2409:
--------------------------------
Attachment: LUCENE-2409.patch
Attached is an updated patch, with examples in the overview etc.
I would like to commit at the end of the day if no one objects.