Posted to solr-dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2007/04/28 06:51:15 UTC

[jira] Created: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Determine if prefix, wildcard, fuzzy queries should be lowercased
-----------------------------------------------------------------

                 Key: SOLR-219
                 URL: https://issues.apache.org/jira/browse/SOLR-219
             Project: Solr
          Issue Type: Improvement
            Reporter: Yonik Seeley
            Priority: Minor


Solr should be able to "do the right thing" when doing prefix/wildcard/fuzzy queries on fields with respect to lowercasing or not.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492751 ] 

Hoss Man commented on SOLR-219:
-------------------------------

I'm not opposed to an approach like this ... but it seems like a slippery slope to go down, with hard-coded test strings and assumptions about how analyzers will behave in all cases based on one test case.

Perhaps a simpler approach that requires less guesswork would be adding the ability for Fields and FieldTypes to contain arbitrary key/val pair options that can be accessed as a map, and documenting that SolrQueryParser looks at some of these to make query parsing decisions?

    <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
      <option name="lowerCaseForPrefix">false</option>
    </fieldType>
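
A minimal sketch of how a query parser might consult such an option. This is not Solr code: the `FieldTypeOptions` class, the `prefixTerm` helper, and the default of leaving the term untouched when the option is absent are all assumptions made for illustration; only the option name `lowerCaseForPrefix` comes from the example above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class FieldTypeOptionsSketch {

    /** Hypothetical stand-in for a field type that carries arbitrary key/value options. */
    public static final class FieldTypeOptions {
        private final Map<String, String> options;

        public FieldTypeOptions(Map<String, String> options) {
            this.options = new HashMap<>(options);
        }

        /** Reads a boolean option, falling back to defaultValue when it is not set. */
        public boolean getBoolean(String name, boolean defaultValue) {
            String value = options.get(name);
            return value == null ? defaultValue : Boolean.parseBoolean(value);
        }
    }

    /** Assumed default for this sketch: do not touch the term unless configured. */
    static final boolean DEFAULT_LOWERCASE_FOR_PREFIX = false;

    /** Lowercases the raw prefix term only if the field type's option asks for it. */
    public static String prefixTerm(String raw, FieldTypeOptions type) {
        boolean lower = type.getBoolean("lowerCaseForPrefix", DEFAULT_LOWERCASE_FOR_PREFIX);
        return lower ? raw.toLowerCase(Locale.ROOT) : raw;
    }

    public static void main(String[] args) {
        FieldTypeOptions lowercased =
            new FieldTypeOptions(Collections.singletonMap("lowerCaseForPrefix", "true"));
        FieldTypeOptions asTyped =
            new FieldTypeOptions(Collections.singletonMap("lowerCaseForPrefix", "false"));

        System.out.println(prefixTerm("Sol", lowercased)); // prints "sol"
        System.out.println(prefixTerm("Sol", asTyped));    // prints "Sol"
    }
}
```

The point of the design is that the decision is read from configuration rather than inferred, so no guess about analyzer behavior is involved.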



> Determine if prefix, wildcard, fuzzy queries should be lowercased
> -----------------------------------------------------------------
>
>                 Key: SOLR-219
>                 URL: https://issues.apache.org/jira/browse/SOLR-219
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>            Priority: Minor
>         Attachments: lowercase_prefix.patch
>
>
> Solr should be able to "do the right thing" when doing prefix/wildcard/fuzzy queries on fields with respect to lowercasing or not.



[jira] Commented: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494623 ] 

Yonik Seeley commented on SOLR-219:
-----------------------------------

> I personally much prefer having direct control over query case sensitivity on a per-field basis, thanks!

Sure, if Solr is going to get it incorrect.

I'm inclined to wait until someone comes up with an analyzer where we *can't* figure out if it's case insensitive or not before adding more configuration complexity... for the sake of both Solr developers and users.



[jira] Updated: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-219:
------------------------------

    Attachment: lowercase_prefix.patch

Here's a demo patch that optionally lowercases prefix queries by testing the analyzer for the fieldType.  No tests, no wildcard/fuzzy implementation yet.  This is for evaluation of the approach.
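
The idea of testing the analyzer can be sketched roughly as follows. This is not the attached patch: the analysis chain is modeled as a plain `UnaryOperator<String>` rather than a Lucene analyzer, and the probe string `AbCdE` and the names `looksCaseInsensitive` and `normalizePrefix` are hypothetical.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

public class CaseProbe {

    /** Hypothetical mixed-case probe, chosen so that lowercasing is observable. */
    static final String PROBE = "AbCdE";

    /** Returns true if running the probe through the analysis chain lowercases it. */
    public static boolean looksCaseInsensitive(UnaryOperator<String> analysis) {
        String out = analysis.apply(PROBE);
        return out != null && out.equals(PROBE.toLowerCase(Locale.ROOT));
    }

    /** Lowercases the prefix only when the probe suggests indexed terms are lowercased. */
    public static String normalizePrefix(String prefix, UnaryOperator<String> analysis) {
        return looksCaseInsensitive(analysis) ? prefix.toLowerCase(Locale.ROOT) : prefix;
    }

    public static void main(String[] args) {
        // Stand-ins for two field types: one that lowercases, one that only trims.
        UnaryOperator<String> lowercasing = s -> s.toLowerCase(Locale.ROOT);
        UnaryOperator<String> whitespaceOnly = s -> s.trim();

        System.out.println(normalizePrefix("Sol", lowercasing));    // prints "sol"
        System.out.println(normalizePrefix("Sol", whitespaceOnly)); // prints "Sol"
    }
}
```

A single probe string is exactly the kind of one-test-case assumption raised as a concern elsewhere in this thread, so a real implementation would likely need more than this.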

I delegated complete query construction to the fieldType (as opposed to just lowercasing the term) because I'm thinking ahead to more efficiently supporting other types of wildcard queries in the future based on the field type.  As an example, *foo* could be turned into a simple term query if the field contained the right ngram filter.
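
To illustrate the ngram remark, here is a small self-contained sketch, assuming a toy in-memory inverted index instead of Lucene. The gram size of 3 and the class and method names are assumptions; the sketch only handles an infix whose length equals the gram size.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class NgramInfixSketch {

    /** Gram size; an assumption for this sketch. */
    static final int N = 3;

    /** Maps each gram to the ids of documents whose value contains it. */
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Indexes every N-character substring of the lowercased value as its own term. */
    public void add(int docId, String value) {
        String v = value.toLowerCase(Locale.ROOT);
        for (int i = 0; i + N <= v.length(); i++) {
            postings.computeIfAbsent(v.substring(i, i + N), k -> new TreeSet<>()).add(docId);
        }
    }

    /** Answers *infix* with a single term lookup; the infix must be exactly N characters. */
    public Set<Integer> infixQuery(String infix) {
        if (infix.length() != N) {
            throw new IllegalArgumentException("this sketch only handles infixes of length " + N);
        }
        return postings.getOrDefault(infix.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        NgramInfixSketch index = new NgramInfixSketch();
        index.add(1, "seafood");
        index.add(2, "Football");
        index.add(3, "barfly");

        // *foo* becomes one lookup of the term "foo" instead of a scan over all terms.
        System.out.println(index.infixQuery("foo")); // prints "[1, 2]"
    }
}
```

Because only the field type knows whether such grams were indexed, this kind of rewrite is a reason to let the field type build the whole query rather than just adjust the term.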




[jira] Issue Comment Edited: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Posted by "Claus Brod (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859903#action_12859903 ] 

Claus Brod edited comment on SOLR-219 at 4/22/10 1:09 PM:
----------------------------------------------------------

We also needed lowercase query support. We extended Yonik's patch to wildcard queries. Seems to work well in our environment. I added the patch as wildcardlowercase.patch; it's probably more useful for illustration purposes than as an industrial-strength final solution, but maybe it's useful for somebody.

Needless to say we'd love to see official support for case-insensitive searches in 1.5 :-)



      was (Author: clausb):
    We also needed lowercase query support. We extended Yonik's patch to wildcard queries. Seems to work well in our environment.


  
> Determine if prefix, wildcard, fuzzy queries should be lowercased
> -----------------------------------------------------------------
>
>                 Key: SOLR-219
>                 URL: https://issues.apache.org/jira/browse/SOLR-219
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>            Priority: Minor
>             Fix For: 1.5
>
>         Attachments: lowercase_prefix.patch, wildcardlowercase.patch
>
>
> Solr should be able to "do the right thing" when doing prefix/wildcard/fuzzy queries on fields with respect to lowercasing or not.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Posted by "Claus Brod (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claus Brod updated SOLR-219:
----------------------------

    Attachment: wildcardlowercase.patch

Patch relative to Solr 1.4 which adds lowercase support for both prefix and wildcard queries



[jira] Commented: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Posted by "Michael Pelz-Sherman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492779 ] 

Michael Pelz-Sherman commented on SOLR-219:
-------------------------------------------

IMHO, if this is implemented, it should be optional (via schema configuration) and NOT the default behavior. I personally much prefer having direct control over query case sensitivity on a per-field basis, thanks!



[jira] Commented: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Posted by "David Smiley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592553#action_12592553 ] 

David Smiley commented on SOLR-219:
-----------------------------------

I'm totally with you, Yonik.  I was surprised today to see that my prefix queries (part of an auto-complete feature I'm adding to my app) were turning up nothing because I was using upper case characters.  It's silly because Solr is smart enough about case in other basic queries, yet not in this one.



[jira] Updated: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-219:
---------------------------------------

    Fix Version/s: 1.4

> Determine if prefix, wildcard, fuzzy queries should be lowercased
> -----------------------------------------------------------------
>
>                 Key: SOLR-219
>                 URL: https://issues.apache.org/jira/browse/SOLR-219
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: lowercase_prefix.patch
>
>
> Solr should be able to "do the right thing" when doing prefix/wildcard/fuzzy queries on fields with respect to lowercasing or not.



[jira] Issue Comment Edited: (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Posted by "Claus Brod (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859903#action_12859903 ] 

Claus Brod edited comment on SOLR-219 at 4/22/10 1:08 PM:
----------------------------------------------------------

We also needed lowercase query support. We extended Yonik's patch to wildcard queries. Seems to work well in our environment.



      was (Author: clausb):
    Patch relative to Solr 1.4 which adds lowercase support for both prefix and wildcard queries
  