Posted to dev@lucene.apache.org by "Christian Moen (Created) (JIRA)" <ji...@apache.org> on 2012/02/05 07:44:54 UTC

[jira] [Created] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Align default Japanese configurations for Lucene and Solr
---------------------------------------------------------

                 Key: LUCENE-3751
                 URL: https://issues.apache.org/jira/browse/LUCENE-3751
             Project: Lucene - Java
          Issue Type: Improvement
          Components: modules/analysis
    Affects Versions: 3.6, 4.0
            Reporter: Christian Moen


The {{KuromojiAnalyzer}} in Lucene should have the same default configuration as the {{text_ja}} field type introduced in {{schema.xml}} by SOLR-3056.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Christian Moen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Moen updated LUCENE-3751:
-----------------------------------

    Attachment: LUCENE-3751.patch
    



[jira] [Updated] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Christian Moen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Moen updated LUCENE-3751:
-----------------------------------

    Attachment: LUCENE-3751.patch
    



[jira] [Commented] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204521#comment-13204521 ] 

Robert Muir commented on LUCENE-3751:
-------------------------------------

I opened LUCENE-3765 for that, but I think we are good to move forward here.
                



[jira] [Commented] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Christian Moen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200690#comment-13200690 ] 

Christian Moen commented on LUCENE-3751:
----------------------------------------

Patch for {{trunk}} is attached.

The behavior of {{KuromojiAnalyzer}} is now the same as field type {{text_ja}} in Solr's example {{schema.xml}} (see SOLR-3056), including the order of the filters.

I think it makes sense to have the {{LowerCaseFilter}} late in the chain, since someone might want to use a case-sensitive {{StopFilter}}.  It perhaps doesn't matter much in {{KuromojiAnalyzer}}'s case since the defaults don't do this anyway, but I thought it was good practice to align the configurations.
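
To illustrate why the position of the lowercase step matters, here is a minimal sketch. It is a toy model built on plain Java collections, not the patch and not the Lucene filter classes; the class name {{FilterOrderSketch}}, its methods, and the sample tokens are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of an analysis chain; these are NOT Lucene classes.
class FilterOrderSketch {

    // Case-sensitive stop step: drops tokens that match a stop word exactly.
    static List<String> stop(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    // Lowercase step: normalizes every token to lower case.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("US", "helps", "us");
        Set<String> stopWords = new HashSet<>(Arrays.asList("us"));

        // Stop first, lowercase late: the upper-case "US" survives.
        System.out.println(lowerCase(stop(tokens, stopWords))); // [us, helps]

        // Lowercase first: the stop step can no longer tell "US" from "us".
        System.out.println(stop(lowerCase(tokens), stopWords)); // [helps]
    }
}
```

With the lowercase step last, a case-sensitive stop list still sees the original casing; with it first, that distinction is gone before the stop step runs.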

I've also clarified an error message and a javadoc.
                



[jira] [Commented] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203636#comment-13203636 ] 

Robert Muir commented on LUCENE-3751:
-------------------------------------

I agree with the patch... I'll commit soon.

                



[jira] [Updated] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Christian Moen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Moen updated LUCENE-3751:
-----------------------------------

    Attachment: LUCENE-3751.patch
    



[jira] [Updated] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3751:
--------------------------------

    Attachment: LUCENE-3751.patch

Hmm, reviewing the patch, there is a trap in StopFilter/CommonGrams/etc.

That is, if you pass it a CharArraySet, the ignoreCase parameter is silently ignored... sure, it's in the javadocs, but this is really bogus... I'm gonna open a separate issue for that trap.

Here is a modified patch that builds the set with ignoreCase=true instead.
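
The trap can be sketched roughly as follows. This is a toy model in plain Java, not Lucene code: {{CaseAwareSet}} stands in for a set that fixes its own case sensitivity at construction time, and {{stopSetFor}} is a hypothetical entry point that accepts a separate ignoreCase argument. Both names are assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model of the trap; CaseAwareSet and stopSetFor are hypothetical, not Lucene APIs.
class StopSetTrapSketch {

    // A set that decides case sensitivity itself, once, when it is built.
    static final class CaseAwareSet {
        private final Set<String> words = new HashSet<>();
        private final boolean ignoreCase;

        CaseAwareSet(Collection<String> source, boolean ignoreCase) {
            this.ignoreCase = ignoreCase;
            for (String word : source) {
                words.add(normalize(word));
            }
        }

        boolean contains(String token) {
            return words.contains(normalize(token));
        }

        private String normalize(String s) {
            return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
        }
    }

    // Filter-style entry point: the flag only takes effect for plain collections.
    static CaseAwareSet stopSetFor(Object stopWords, boolean ignoreCase) {
        if (stopWords instanceof CaseAwareSet) {
            // The trap: the caller's ignoreCase argument is dropped here.
            return (CaseAwareSet) stopWords;
        }
        @SuppressWarnings("unchecked")
        Collection<String> plain = (Collection<String>) stopWords;
        return new CaseAwareSet(plain, ignoreCase);
    }

    public static void main(String[] args) {
        CaseAwareSet caseSensitive = new CaseAwareSet(Arrays.asList("the"), false);
        // Asking for ignoreCase=true has no effect on a prebuilt set.
        System.out.println(stopSetFor(caseSensitive, true).contains("The")); // false

        // The workaround: build the set itself with ignoreCase=true.
        CaseAwareSet caseInsensitive = new CaseAwareSet(Arrays.asList("the"), true);
        System.out.println(stopSetFor(caseInsensitive, true).contains("The")); // true
    }
}
```

In this model the only reliable way to get case-insensitive matching is to set it when the set is built, which mirrors the approach taken in the modified patch.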


                



[jira] [Issue Comment Edited] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Christian Moen (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204464#comment-13204464 ] 

Christian Moen edited comment on LUCENE-3751 at 2/9/12 11:53 AM:
-----------------------------------------------------------------

I've updated the patch to now use a {{StopFilter}} that ignores case.
                
      was (Author: cm):
    Updated patch that now uses a {{StopFilter}} that ignores case.
                  



[jira] [Resolved] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Robert Muir (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-3751.
---------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0
                   3.6

Thanks Christian!
                



[jira] [Commented] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Christian Moen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203666#comment-13203666 ] 

Christian Moen commented on LUCENE-3751:
----------------------------------------

Thanks a lot, Robert.  Let's put this one on hold until I've changed the default in {{StopFilter}} to ignore case (ref. discussion on SOLR-3056.)  I expect to provide an updated patch tomorrow.
                



[jira] [Issue Comment Edited] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Christian Moen (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204464#comment-13204464 ] 

Christian Moen edited comment on LUCENE-3751 at 2/9/12 11:53 AM:
-----------------------------------------------------------------

I've updated the patch to now use a {{StopFilter}} that ignores case.  I think this is good to go.
                
      was (Author: cm):
    I've updated the patch to now use a {{StopFilter}} that ignores case.
                  



[jira] [Commented] (LUCENE-3751) Align default Japanese configurations for Lucene and Solr

Posted by "Christian Moen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204464#comment-13204464 ] 

Christian Moen commented on LUCENE-3751:
----------------------------------------

Updated patch that now uses a {{StopFilter}} that ignores case.
                
