Posted to dev@lucene.apache.org by "Christian Moen (Created) (JIRA)" <ji...@apache.org> on 2012/02/05 07:44:54 UTC
[jira] [Created] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Align default Japanese configurations for Lucene and Solr
---------------------------------------------------------
Key: LUCENE-3751
URL: https://issues.apache.org/jira/browse/LUCENE-3751
Project: Lucene - Java
Issue Type: Improvement
Components: modules/analysis
Affects Versions: 3.6, 4.0
Reporter: Christian Moen
The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
[jira] [Updated] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Christian Moen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christian Moen updated LUCENE-3751:
-----------------------------------
Attachment: LUCENE-3751.patch
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch, LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Updated] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Christian Moen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christian Moen updated LUCENE-3751:
-----------------------------------
Attachment: LUCENE-3751.patch
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Commented] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204521#comment-13204521 ]
Robert Muir commented on LUCENE-3751:
-------------------------------------
I opened LUCENE-3765 for that, but I think we are good to move forward here.
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Commented] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Christian Moen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200690#comment-13200690 ]
Christian Moen commented on LUCENE-3751:
----------------------------------------
Patch for {{trunk}} is attached.
The behavior of {{KuromojiAnalyzer}} is now the same as field type {{text_ja}} in Solr's example {{schema.xml}} (see SOLR-3056), including the order of the filters.
I think it makes sense to have the {{LowerCaseFilter}} late in the chain, as it might make sense to use a case-based {{StopFilter}}. It perhaps doesn't matter much in {{KuromojiAnalyzer}}'s case since the defaults don't do this anyway, but I thought it was good practice to align the configuration anyway.
I've also clarified an error message and a javadoc.
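To illustrate why filter order can matter with a case-based stop filter, here is a minimal sketch in plain Java. It does not use Lucene's classes: {{FilterOrderSketch}}, {{stop}} and {{lowerCase}} are hypothetical stand-ins that operate on lists of strings, and the token values are made up for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-ins for a stop filter and a lowercase filter.
// These are NOT Lucene's StopFilter/LowerCaseFilter, only a model of the idea.
final class FilterOrderSketch {

    // Case-sensitive stop step: drops tokens that match a stop word exactly.
    static List<String> stop(List<String> tokens, Set<String> stopWords) {
        return tokens.stream()
                .filter(token -> !stopWords.contains(token))
                .collect(Collectors.toList());
    }

    // Lowercase step: normalizes every token to lower case.
    static List<String> lowerCase(List<String> tokens) {
        return tokens.stream()
                .map(token -> token.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("IT", "Solr", "it");
        Set<String> stopWords = Collections.singleton("it");

        // Stop first, lowercase late: the upper-case token "IT" survives the
        // case-sensitive stop step and is lowercased afterwards.
        System.out.println(lowerCase(stop(tokens, stopWords))); // [it, solr]

        // Lowercase first: both "IT" and "it" become "it" and are removed.
        System.out.println(stop(lowerCase(tokens), stopWords)); // [solr]
    }
}
```

With a stop set that ignores case, both orders would give the same result, which is why the ordering matters little for the defaults discussed here.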
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Commented] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203636#comment-13203636 ]
Robert Muir commented on LUCENE-3751:
-------------------------------------
I agree with the patch... I'll commit soon.
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Updated] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Christian Moen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christian Moen updated LUCENE-3751:
-----------------------------------
Attachment: LUCENE-3751.patch
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Updated] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-3751:
--------------------------------
Attachment: LUCENE-3751.patch
Hmm, reviewing the patch, there is a trap in StopFilter/CommonGrams/etc.
That is, if you pass it a CharArraySet, the ignoreCase parameter is silently ignored... sure, it's in the javadocs, but this is really bogus... I'm gonna open a separate issue for that trap.
Here is a modified patch building the set with ignoreCase=true instead.
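The trap described above can be modeled roughly as follows. This is a simplified sketch in plain Java, not Lucene's actual code: {{StopSetTrapSketch}}, {{filterWithPrebuiltSet}} and {{caseInsensitiveSet}} are hypothetical names, and a {{TreeSet}} with a case-insensitive comparator stands in for a set built with ignoreCase=true.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Simplified model of the trap, NOT Lucene's StopFilter or CharArraySet.
final class StopSetTrapSketch {

    // Hypothetical filter: when the caller supplies a pre-built set, the set's
    // own matching rules decide everything. The ignoreCase flag is accepted
    // but never consulted, which is the trap.
    static List<String> filterWithPrebuiltSet(
            List<String> tokens, Set<String> stopSet, boolean ignoreCase) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopSet.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    // The workaround: build case-insensitivity into the set itself.
    static Set<String> caseInsensitiveSet(String... words) {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(Arrays.asList(words));
        return set;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("The", "quick", "fox");

        // Case-sensitive set plus ignoreCase=true: "The" is NOT removed.
        Set<String> caseSensitive = new HashSet<>(Arrays.asList("the"));
        System.out.println(filterWithPrebuiltSet(tokens, caseSensitive, true)); // [The, quick, fox]

        // Set built to ignore case: "The" is removed as intended.
        System.out.println(filterWithPrebuiltSet(tokens, caseInsensitiveSet("the"), true)); // [quick, fox]
    }
}
```

In this model the flag only looks like it does something; the behavior is fixed at the moment the set is built, which is the reason for constructing the set itself to ignore case.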
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Issue Comment Edited] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Christian Moen (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204464#comment-13204464 ]
Christian Moen edited comment on LUCENE-3751 at 2/9/12 11:53 AM:
-----------------------------------------------------------------
I've updated the patch to now use a {{StopFilter}} that ignores case.
was (Author: cm):
Updated patch that now uses a {{StopFilter}} that ignores case.
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Resolved] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Robert Muir (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir resolved LUCENE-3751.
---------------------------------
Resolution: Fixed
Fix Version/s: 4.0, 3.6
Thanks Christian!
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Fix For: 3.6, 4.0
>
> Attachments: LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Commented] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Christian Moen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203666#comment-13203666 ]
Christian Moen commented on LUCENE-3751:
----------------------------------------
Thanks a lot, Robert. Let's put this one on hold until I've changed the default in {{StopFilter}} to ignore case (ref. discussion on SOLR-3056.) I expect to provide an updated patch tomorrow.
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Issue Comment Edited] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Christian Moen (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204464#comment-13204464 ]
Christian Moen edited comment on LUCENE-3751 at 2/9/12 11:53 AM:
-----------------------------------------------------------------
I've updated the patch to now use a {{StopFilter}} that ignores case. I think this is good to go.
was (Author: cm):
I've updated the patch to now use a {{StopFilter}} that ignores case.
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.
[jira] [Commented] (LUCENE-3751) Align default Japanese
configurations for Lucene and Solr
Posted by "Christian Moen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204464#comment-13204464 ]
Christian Moen commented on LUCENE-3751:
----------------------------------------
Updated patch that now uses a {{StopFilter}} that ignores case.
> Align default Japanese configurations for Lucene and Solr
> ---------------------------------------------------------
>
> Key: LUCENE-3751
> URL: https://issues.apache.org/jira/browse/LUCENE-3751
> Project: Lucene - Java
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 3.6, 4.0
> Reporter: Christian Moen
> Attachments: LUCENE-3751.patch, LUCENE-3751.patch, LUCENE-3751.patch
>
>
> The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.