You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cayenne.apache.org by Andrus Adamchik <an...@objectstyle.org> on 2009/04/24 08:02:39 UTC

Re: [jira] Created: (CAY-1210) mysql does not use index for case insensitive searches

Yeah, I am aware of this issue, as I am using MySQL daily. Here is the  
problem - if we force case-sensitive syntax for case-insensitive  
searches on MySQL, this will break for case-sensitive (non-default)  
collations. I am personally using case-sensitive LIKE searches in my  
code explicitly, which makes such code MySQL-specific, which is  
definitely not good, and I'd hate to recommend it to the users.  
Checking and overriding collation for every column also doesn't seem  
practical.

So not sure how to solve that in a reasonable way?

Andrus


On Apr 24, 2009, at 3:49 AM, Ari Maniatis (JIRA) wrote:

> mysql does not use index for case insensitive searches
> ------------------------------------------------------
>
>                 Key: CAY-1210
>                 URL: https://issues.apache.org/jira/browse/CAY-1210
>             Project: Cayenne
>          Issue Type: Improvement
>            Reporter: Ari Maniatis
>            Assignee: Ari Maniatis
>             Fix For: 3.0
>
>
> When performing a case insensitive search Cayenne spits out SQL  
> which looks like this
>
> SELECT .... WHERE upper(name) LIKE upper("fred")
>
> This prevents any index being used for the search. Since mysql  
> already performed case insensitive searches on text fields we need  
> to suppress the 'upper' functions being used in these situations.  
> All searches on these fields are already case insensitive.
>
> http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html
>
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>


Re: [jira] Created: (CAY-1210) mysql does not use index for case insensitive searches

Posted by Andrus Adamchik <an...@objectstyle.org>.
On Apr 24, 2009, at 9:21 AM, Aristedes Maniatis wrote:

> * Binary collation (which is non-standard)

non-default != non-standard. Also note that there are binary and there  
are character case-sensitive collations (e.g. "latin1_general_cs"). I  
think it is reasonable to expect a user to pick a *_cs collation.  
After all, most other DB's are using *_cs collations as a default.

> * Explicit attempt to perform non-case sensitive search (which  
> questions why they chose a binary collation)

Maybe because they are doing both types of searches. Also *_cs  
collations affect ORDER BY if I am not mistaken, which can be  
important, independently from search. Essentially if we stop using  
UPPER for MySQL, we'll address one important case at the expense of  
another.

As it seems that any solution here is going to be a compromise, I  
guess it has to be implemented as a flag in MySQLAdapter that defines  
one or the other strategy. AutoAdapter can set this flag based on the  
database default collation:

==> show variables like "collation_database";

If a user needs different behavior, they will have to set it manually  
per adapter.

Andrus



Re: [jira] Created: (CAY-1210) mysql does not use index for case insensitive searches

Posted by Aristedes Maniatis <ar...@ish.com.au>.
On 24/04/2009, at 4:02 PM, Andrus Adamchik wrote:

> Yeah, I am aware of this issue, as I am using MySQL daily. Here is  
> the problem - if we force case-sensitive syntax for case-insensitive  
> searches on MySQL, this will break for case-sensitive (non-default)  
> collations. I am personally using case-sensitive LIKE searches in my  
> code explicitly, which makes such code MySQL-specific, which is  
> definitely not good, and I'd hate to recommend it to the users.  
> Checking and overriding collation for every column also doesn't seem  
> practical.

Well, there are two ways in MySQL to force case sensitive searches.  
Either by picking a binary collation for the text field, or by  
changing the column type from TEXT/VARCHAR to BLOB/CLOB. The beauty of  
the latter is that Cayenne knows about the datatype and so can behave  
differently. But Cayenne knows nothing about collations (perhaps it  
should?).

So there is a workaround to the problem you raise. But there is no  
workaround to getting case-insensitive searches (which in our use  
cases are 99.9% of searches on text fields) to use an index.

The only case where existing behaviour changes due to this  
modification is if:

* Binary collation (which is non-standard)
* Explicit attempt to perform non-case sensitive search (which  
questions why they chose a binary collation)

The recommendation for users is to change that column to CLOB and  
current behaviour continues.


What do you think?


Ari Maniatis


-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001   fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A