Posted to dev@cloudstack.apache.org by Abhinandan Prateek <Ab...@citrix.com> on 2013/02/06 16:22:22 UTC

Re: [API] Fuzzy API searches when listing domains, accounts and/or volumes

+1. We should have exact-match search for name.

On 30/01/13 11:06 PM, "Min Chen" <mi...@citrix.com> wrote:

>As I understand it, fuzzy (pattern-matching) search should only happen
>through the keyword query parameter. I don't understand why we introduced
>this inconsistent search behavior for "name" here, since other list APIs
>still use exact search for "name". We should just change these fuzzy list
>APIs to exact search to be consistent.
>
>IMHO, I am not in favor of adding an extra wildcard parameter to each Cmd.
>That one parameter would control the search behavior of all the query
>parameters, but a user may sometimes want exact search for one query
>parameter and wildcard search for another, and it would not solve that
>case. Ideally we should allow such wildcard search through keyword search.
>That would of course need some enhancement to the current keyword search
>implementation, making it a Google-like search using an inverted index,
>since it currently hard-codes which field to search for each command.
>
>Thanks
>-min
>
>On 1/30/13 7:46 AM, "Wido den Hollander" <wi...@widodh.nl> wrote:
>
>>Hi,
>>
>>During some API work I found that when you query for a 'name' with
>>ListDomains, ListAccounts and/or ListVolumes this search is fuzzy (with
>>a wildcard).
>>
>>For example when listing domains:
>>
>>         if (domainName != null) {
>>             sc.setParameters("name", "%" + domainName + "%");
>>         }
>>
>>Or when listing volumes:
>>
>>         if (name != null) {
>>             sc.setParameters("name", "%" + name + "%");
>>         }
>>
>>This search is always a wildcard search.
>>
>>So if you want to know if domain 'customerX' exists you query for that,
>>but your results can also contain 'customerXY' and 'customerXX'.
>>
>>command=listDomains&name=customerX
>>
>>Taking the listing of domains as an example again, you can also add the
>>'keyword' parameter:
>>
>>command=listDomains&name=customerX&keyword=customerX
>>
>>When tracing it back to MySQL I see these queries:
>>
>>* Without keyword *
>>SELECT domain.id, domain.parent, domain.name, domain.owner, domain.path,
>>domain.level, domain.removed, domain.child_count, domain.next_child_seq,
>>domain.state, domain.network_domain, domain.uuid FROM domain WHERE
>>domain.name LIKE _binary'%customerX%'  AND domain.state = 'Active'  AND
>>domain.removed IS NULL  ORDER BY domain.id ASC
>>
>>* With keyword *
>>SELECT domain.id, domain.parent, domain.name, domain.owner, domain.path,
>>domain.level, domain.removed, domain.child_count, domain.next_child_seq,
>>domain.state, domain.network_domain, domain.uuid FROM domain WHERE
>>domain.name LIKE _binary'%customerX%'  AND domain.state = 'Active'  AND
>>  (domain.name LIKE _binary'%customerX%' )  AND domain.removed IS NULL
>>ORDER BY domain.id ASC
>>
>>
>>I'd like to propose adding a new API parameter to BaseListCmd called
>>'wildcard'.
>>
>>By default it is set to true, so it behaves as it does now, but you can
>>set it to:
>>
>>* true (default): A %LIKE% search
>>* false: An exact search
>>* pre: %LIKE search
>>* post: LIKE% search
>>
>>This way you can do more exact searching with the API and you don't have
>>to process all this information on the client.
>>
>>Would this be an acceptable solution to use for all the list* API calls?
>>
>>Wido
>
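
For readers following the thread, the four proposed 'wildcard' values can be sketched as a mapping from mode to LIKE pattern. This is only an illustration of the proposal as described above, not CloudStack code: the class name, the WildcardMode enum and the toPattern helper are all hypothetical, and a real change would presumably go through BaseListCmd and the existing search criteria.

```java
import java.util.Locale;

// Hypothetical sketch: none of these names exist in CloudStack.
class WildcardPatternSketch {

    /** Modes mirroring the proposed 'wildcard' parameter values. */
    enum WildcardMode {
        TRUE, FALSE, PRE, POST;

        /** Parses the raw API value; a missing value falls back to the proposed default (true). */
        static WildcardMode parse(String raw) {
            if (raw == null || raw.trim().isEmpty()) {
                return TRUE;
            }
            return valueOf(raw.trim().toUpperCase(Locale.ROOT));
        }
    }

    /**
     * Builds the string that would be bound to the LIKE clause.
     * For FALSE the value is returned unchanged; an actual implementation
     * would more likely switch to an equality comparison, since '%' and '_'
     * inside the value are still treated as wildcards by LIKE.
     */
    static String toPattern(String value, WildcardMode mode) {
        switch (mode) {
            case FALSE:
                return value;             // exact search
            case PRE:
                return "%" + value;       // %LIKE: wildcard before the value
            case POST:
                return value + "%";       // LIKE%: wildcard after the value
            default:
                return "%" + value + "%"; // %LIKE%: current behaviour
        }
    }

    public static void main(String[] args) {
        for (String raw : new String[] {"true", "false", "pre", "post"}) {
            System.out.println(raw + " -> " + toPattern("customerX", WildcardMode.parse(raw)));
        }
    }
}
```

Note that this keeps a single mode for the whole request, which is exactly the limitation Min points out: it cannot express exact search on one query parameter and wildcard search on another.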


Re: [API] Fuzzy API searches when listing domains, accounts and/or volumes

Posted by Wido den Hollander <wi...@widodh.nl>.
I just created an issue for it: 
https://issues.apache.org/jira/browse/CLOUDSTACK-1179

Wido
