Posted to user@couchdb.apache.org by Antony Blakey <an...@gmail.com> on 2009/05/01 02:07:21 UTC

Re: The need for a key prefix view parameter

On 01/05/2009, at 12:49 AM, Wojciech Kaczmarek wrote:

> On Thu, Apr 30, 2009 at 16:56, Brian Candler <B....@pobox.com> wrote:
>> On Thu, Apr 30, 2009 at 02:23:17PM +0100, Brian Candler wrote:
>>> (5) Strangely, doc id keys in _all_docs appear to behave differently;
>>> perhaps they are ASCII-compared rather than UCA compared. See script 3
>>> below.
>>
>> And this has just had me tearing my hair out for the last half hour: a
>> search for
>>
>>    _all_docs?startkey="_design/"&endkey="_design/ZZZZ"
>>
>> did not match some of my documents, e.g. _design/c000. Now I realise that
>> almost certainly this is because Z comes before c in ASCII collation.
>>
>> Is this intentional behaviour? If so I will change the Wiki so it
>> recommends
>>
>>    _all_docs?startkey="_design/"&endkey="_design/~"
>
> Isn't it better to use "\u9999" as the ending marker?


\u9999 isn't the final Unicode collation point: firstly, that's not
the last value in a 16-bit space; secondly, Unicode isn't 16 bits; and
finally, Unicode collation is locale-dependent.
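
To make the sentinel problem concrete, here is a minimal sketch that compares the candidate end keys by raw code point, which is how Python orders strings and roughly what an ASCII-style comparison does. It is not the UCA ordering that views use, and the non-ASCII doc id is a made-up example, so treat it only as an illustration of why no fixed end marker is safe.

```python
# Compare candidate end-of-range sentinels by raw code point. This is an
# illustration only: it is NOT the UCA collation applied to view keys.

doc_id = "_design/c000"

# "Z" (0x5A) sorts before "c" (0x63), so the ZZZZ end key excludes c000.
zzzz_covers = "_design/" <= doc_id <= "_design/ZZZZ"

# "~" (0x7E) is the last printable ASCII character, so it does cover c000 ...
tilde_covers = "_design/" <= doc_id <= "_design/~"

# ... but not a (hypothetical) id whose next character lies beyond ASCII.
non_ascii_id = "_design/\u00e9t\u00e9"
tilde_covers_non_ascii = "_design/" <= non_ascii_id <= "_design/~"

# U+9999 is neither the top of the 16-bit range nor the top of Unicode.
below_16bit_max = "\u9999" < "\uffff"
below_unicode_max = "\uffff" < chr(0x10FFFF)

print(zzzz_covers, tilde_covers, tilde_covers_non_ascii)
print(below_16bit_max, below_unicode_max)
```

Even the highest code point would not be a reliable end marker under a locale-dependent collation, which is the third objection above.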

I've previously argued that the only way to do this correctly is to
allow a prefix search defined over all JSON values:
http://mail-archives.apache.org/mod_mbox/couchdb-dev/200901.mbox/%3c67C42C78-4F52-409A-847B-F545F664D190@gmail.com%3e

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Only two things are infinite, the universe and human stupidity, and  
I'm not sure about the former.
  -- Albert Einstein


Re: The need for a key prefix view parameter

Posted by Chris Anderson <jc...@apache.org>.
On Mon, May 4, 2009 at 2:03 PM, Dean Landolt <de...@deanlandolt.com> wrote:
> On Mon, May 4, 2009 at 4:46 PM, Brian Candler <B....@pobox.com> wrote:
>
>> On Thu, Apr 30, 2009 at 06:02:38PM -0700, kowsik wrote:
>> > Maybe just a new URL parameter for all views to query by prefix?
>> >
>> > _all_docs?prefix="_design/"
>>
>> That's basically what I was thinking of. Would also be nice to be able to
>> POST {"prefixes":[...]}
>>
>
> Why not GET or POST for ?prefix=foo&prefix=bar
>

Same old saw we've talked about a million times. We should add the
option to do it via GET as well as POST. Multi-key is only done by
POST for now because 90% of the usage would not fit within URL length
limits, and GET doesn't tend to support request bodies.

Multi-prefix is the same story, and actually should be wrapped up in a
generic multi-query API so we can do multiple discontinuous prefix,
key, startkey, limit, endkey, etc. queries in one request. For this,
GET would be nice to support as well, but it is practically useful
only for the smallest cases.
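
A rough sketch of the GET versus POST trade-off, using only the Python standard library. The `prefix` query parameter and the `{"prefixes": [...]}` body are the shapes proposed in this thread, not an existing CouchDB API, and `/mydb` plus the sample prefixes are made up for illustration. Nothing is sent over the network; the code only builds the request pieces so their sizes can be compared.

```python
import json
from urllib.parse import urlencode

# Hypothetical prefixes to query. "prefix" and "prefixes" are the names
# proposed in this thread, not parameters CouchDB is known to accept.
prefixes = ["_design/", "user/", "order/2009-05/"]

# GET form: one repeated query parameter per prefix, each JSON-encoded
# the same way startkey/endkey values are.
get_query = urlencode([("prefix", json.dumps(p)) for p in prefixes])
get_url = "/mydb/_all_docs?" + get_query  # "/mydb" is a made-up database

# POST form: the same list carried in a JSON request body.
post_body = json.dumps({"prefixes": prefixes})


def get_url_length(n):
    """Length of the GET URL for n made-up 20-character prefixes."""
    many = [("prefix", json.dumps("p%019d" % i)) for i in range(n)]
    return len("/mydb/_all_docs?" + urlencode(many))


print(get_url)
print(post_body)
print(get_url_length(10), get_url_length(1000))
```

The URL grows linearly with the number of prefixes, and percent-encoding the JSON quotes and slashes inflates it further, which is why the GET form only suits small cases.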


-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: The need for a key prefix view parameter

Posted by Dean Landolt <de...@deanlandolt.com>.
On Mon, May 4, 2009 at 4:46 PM, Brian Candler <B....@pobox.com> wrote:

> On Thu, Apr 30, 2009 at 06:02:38PM -0700, kowsik wrote:
> > Maybe just a new URL parameter for all views to query by prefix?
> >
> > _all_docs?prefix="_design/"
>
> That's basically what I was thinking of. Would also be nice to be able to
> POST {"prefixes":[...]}
>

Why not GET or POST for ?prefix=foo&prefix=bar

Re: The need for a key prefix view parameter

Posted by Brian Candler <B....@pobox.com>.
On Thu, Apr 30, 2009 at 06:02:38PM -0700, kowsik wrote:
> Maybe just a new URL parameter for all views to query by prefix?
> 
> _all_docs?prefix="_design/"

That's basically what I was thinking of. Would also be nice to be able to
POST {"prefixes":[...]}

Regards,

Brian.

Re: The need for a key prefix view parameter

Posted by Jan Lehnardt <ja...@apache.org>.
Guys, this needs to run on dev@.

Cheers
Jan
--



Re: The need for a key prefix view parameter

Posted by Chris Anderson <jc...@apache.org>.
On Thu, Apr 30, 2009 at 6:02 PM, kowsik <ko...@gmail.com> wrote:
> Maybe just a new URL parameter for all views to query by prefix?
>
> _all_docs?prefix="_design/"
>
> 'prefix' will need startkey_docid and/or endkey_docid, but could
> handle the proper prefix matching without the user having to worry
> about what the next character in the collating sequence is.
>

IMHO the prefix= param is a fine thing for CouchDB to implement. If
anyone has a patch, please share. There may be implementation edge
cases; I haven't really thought it through. Hopefully a patch and a
test case will be all that's needed to ensure that it's done
correctly.




-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: The need for a key prefix view parameter

Posted by kowsik <ko...@gmail.com>.
Maybe just a new URL parameter for all views to query by prefix?

_all_docs?prefix="_design/"

'prefix' will need startkey_docid and/or endkey_docid, but could
handle the proper prefix matching without the user having to worry
about what the next character in the collating sequence is.
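
As a rough illustration of that idea (not how CouchDB implements or would implement it), prefix matching over a sorted key list needs no end-key sentinel at all: seek to the first key at or after the prefix, then stop at the first key that no longer starts with it. The sketch assumes keys sorted by plain code point; `prefix_range` and the sample doc ids are made up, and `startkey_docid`/`endkey_docid` handling is left out.

```python
from bisect import bisect_left


def prefix_range(sorted_keys, prefix):
    """Return all keys that start with `prefix`.

    Assumes `sorted_keys` is sorted by plain code point comparison, so
    every key sharing a prefix sits in one contiguous run. A UCA-based
    collation may not give the same guarantee.
    """
    start = bisect_left(sorted_keys, prefix)  # first key >= prefix
    end = start
    # Walk forward until a key no longer carries the prefix; no end-key
    # sentinel such as "ZZZZ" or "~" is involved.
    while end < len(sorted_keys) and sorted_keys[end].startswith(prefix):
        end += 1
    return sorted_keys[start:end]


# Made-up doc ids, including one with non-ASCII characters after the prefix.
doc_ids = sorted([
    "_design/Alpha",
    "_design/c000",
    "_design/\u00e9t\u00e9",
    "_design",
    "a_doc",
    "zebra",
])

design_docs = prefix_range(doc_ids, "_design/")
print(design_docs)
```

The caller never has to name a "last" character, so the ZZZZ, ~ and \u9999 guesses from earlier in the thread become unnecessary.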

K.
