Posted to user@couchdb.apache.org by "Schroiff, Klaus" <Kl...@fast.au.fujitsu.com> on 2014/04/02 04:56:17 UTC

question about "complex" range queries

Hi,

Let's assume that I have the following view function:

function(doc) { emit([a, b], whatever); }

In the query I'm running something like
startkey=[a1, b1] & endkey=[a2,b2]
Thus there are TWO explicit ranges here: a1->a2 and b1->b2.

What is the expected result?

My understanding:
The query is executed in two phases.
In the first phase, the index is filtered for qualifying results where the first key element ranges from a1 to a2.
In the second phase, these filtered results are filtered once more according to the range of the second key element, from b1 to b2.
Thus it is essentially an "AND" operation.
The filtering is performed using lexicographical rules.

Is that correct? The doc about complex keys is a bit slim.

Thanks

Klaus


Re: question about "complex" range queries

Posted by Mike Marino <mm...@gmail.com>.
Hi Klaus,

In case you actually want the functionality you described (i.e. being able
to define multiple key ranges in a single query), there is an open JIRA issue
about this here:

https://issues.apache.org/jira/browse/COUCHDB-523

There is currently no indication of when, or in which version, this will be
available in CouchDB, but I believe it will be after the BigCouch merge.

Cheers,
Mike


On Wed, Apr 2, 2014 at 2:31 AM, Nick North <no...@gmail.com> wrote:

> It doesn't work quite like this. There is a single order across all
> possible keys, including both simple and complex keys, as described
> here<https://wiki.apache.org/couchdb/View_collation>.
> In the case of keys that are lists, the two lists are compared element by
> element and the sort order is the sort order of the first unequal elements.
>
> In your example, if a key has its first element between a1 and a2 (and a1
> and a2 are different), then the second element will not be inspected at
> all, so it does not matter whether it is between b1 and b2 or not. In fact
> the second element will only be inspected if the first element is either a1
> or a2.
>
> This is usually the behaviour we want. For example, dates are often
> represented as lists of [year, month, day]. Then you can pull out all the
> documents in a date range by specifying start date and end date as startkey
> and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
> respectively, we want to include [2013, 3, 6] in the output even though its
> second element does not lie between the second elements of the keys.
>
> Nick
>
>
> On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com> wrote:
>
> > Hi,
> >
> > Let's assume that I have the following view function:
> >
> > function(doc) { emit([a, b], whatever); }
> >
> > In the query I'm running something like
> > startkey=[a1, b1] & endkey=[a2,b2]
> > Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
> >
> > What is the expected result ?
> >
> > My understanding:
> > The query is executed in two phases
> > In the first phase, the index is filtered for qualifying results where
> the
> > first key ranges from a1 to a2
> > In the second phase, these filtered results are filtered once more
> > according to the range of the second emitted key from b1 to b2.
> > Thus essentially an "AND" operation.
> > The filtering is performed using lexicographical rules.
> >
> > Is that correct ? The doc about complex keys is a bit slim.
> >
> > Thanks
> >
> > Klaus
> >
> >
>
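
To make Nick's point concrete, here is a minimal sketch of element-by-element comparison for array keys. It is not CouchDB's actual implementation: the helper names compareKeys and inRange are made up for illustration, and it assumes every key element is a number (real view collation also orders across types and compares strings with ICU rules).

```javascript
// Hypothetical helper: compare two array keys element by element.
// Simplifying assumption: every element is a number.
function compareKeys(a, b) {
  var n = Math.min(a.length, b.length);
  for (var i = 0; i < n; i++) {
    if (a[i] !== b[i]) {
      return a[i] < b[i] ? -1 : 1; // the first unequal element decides
    }
  }
  return a.length - b.length; // a shorter prefix sorts first
}

// Hypothetical helper: does key fall within startkey..endkey (both inclusive)?
function inRange(key, startkey, endkey) {
  return compareKeys(startkey, key) <= 0 && compareKeys(key, endkey) <= 0;
}

var startkey = [2012, 4, 5];
var endkey = [2014, 8, 3];

console.log(inRange([2013, 3, 6], startkey, endkey));  // true
console.log(inRange([2012, 3, 31], startkey, endkey)); // false
console.log(inRange([2014, 8, 3], startkey, endkey));  // true
```

The month 3 in [2013, 3, 6] is never compared against 4 or 8, because the year already places the key between the two bounds. A true "AND" over both elements would need an extra filter on the second element, for example on the client.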

Re: question about "complex" range queries

Posted by Robert Samuel Newson <rn...@apache.org>.
You *can* rely on the order of rows with identical keys to be in _id order as that’s the tie-breaking field.

Another case where the value appears to be considered in the sort order is when views are clustered (à la BigCouch).

Sorting a view by value requires chained map-reduce, presently only on the proprietary Cloudant platform, but something CouchDB should deliver after the Rcouch merge.

B.

On 3 Apr 2014, at 15:04, Jens Alfke <je...@couchbase.com> wrote:

> 
> On Apr 3, 2014, at 5:53 AM, Scott Weber <sc...@sbcglobal.net> wrote:
> 
>> Are you saying that it is NOT something that should be relied on?
> 
> Bob Newson just said that pretty emphatically.
> 
> If you want your index to be sorted in a particular way, then emit keys that sort that way! Don’t try to find some undocumented hack that lets you get away without emitting properly ordered keys.
> 
>> I also notice that the documents are sent into the view already sorted by their _id.  Is that a behavior that CAN be relied on?
> 
> Definitely not. And a properly written map function shouldn’t care anyway — if your map function uses any sort of external state that could distinguish the order in which docs are passed to it, then You’re Doing It Wrong.
> 
> —Jens


Re: question about "complex" range queries

Posted by Scott Weber <sc...@sbcglobal.net>.
I appreciate that.  
I am more used to the catchphrase "RTFM", the clarification of which was, ironically, the goal of my question.

Now I can go back to my MFIC and tell him what we can and can't rely on :-)




________________________________
 From: Jens Alfke <je...@couchbase.com>
To: user@couchdb.apache.org; Scott Weber <sc...@sbcglobal.net> 
Sent: Friday, April 4, 2014 9:30 AM
Subject: Re: question about "complex" range queries
 


On Apr 3, 2014, at 12:32 PM, Scott Weber <sc...@sbcglobal.net> wrote:

> Thank you for being so kind, while you were putting statements into my text and then telling me that I am wrong.

Sorry for causing offense, but I didn’t say that you're wrong; I said that _if_ you were to rely on these undocumented behaviors, _then_ you would be doing it wrong. The specific wording “You’re Doing It Wrong” was intended as a humorous reference to a popular catchphrase, viz. http://knowyourmeme.com/memes/youre-doing-it-wrong

—Jens

Re: question about "complex" range queries

Posted by Jens Alfke <je...@couchbase.com>.
On Apr 3, 2014, at 12:32 PM, Scott Weber <sc...@sbcglobal.net> wrote:

> Thank you for being so kind, while you were putting statements into my text and then telling me that I am wrong.

Sorry for causing offense, but I didn’t say that you're wrong; I said that _if_ you were to rely on these undocumented behaviors, _then_ you would be doing it wrong. The specific wording “You’re Doing It Wrong” was intended as a humorous reference to a popular catchphrase, viz. http://knowyourmeme.com/memes/youre-doing-it-wrong

—Jens

Re: question about "complex" range queries

Posted by Scott Weber <sc...@sbcglobal.net>.
Jens,
Thank you for being so kind, while you were putting statements into my text and then telling me that I am wrong.

I never said what I was relying on.  I was trying to determine if this was a known behavior because it does not seem to appear in any of the no less than four different sites that purport to have documentation (any single one of which is not comprehensive).

I have specifically *not* relied on it because to date, none of the documentation I can find says it is expected behavior. I have coded my own way to solve it, or have found creative ways to use the keys which are documented behavior. And yet the sorted value behavior is there, yielding unexpected results that I had been spending time investigating.

My question was asking if it can be relied upon, and should be in the documentation (or perhaps was in yet another site I hadn't stumbled across yet), or if it is not intended, and therefore could stop occurring now or in the future.

Robert,
As you say it's an artifact, so we won't attempt to take advantage of it.  I just thought that if the system was using CPU cycles to intentionally sort that far into the string, and it was known and reliable, we could avoid wasting code and CPU doing it again.

-Scott




________________________________
 From: Jens Alfke <je...@couchbase.com>
To: user@couchdb.apache.org; Scott Weber <sc...@sbcglobal.net> 
Sent: Thursday, April 3, 2014 9:04 AM
Subject: Re: question about "complex" range queries
 


On Apr 3, 2014, at 5:53 AM, Scott Weber <sc...@sbcglobal.net> wrote:

> Are you saying that it is NOT something that should be relied on?

Bob Newson just said that pretty emphatically.

If you want your index to be sorted in a particular way, then emit keys that sort that way! Don’t try to find some undocumented hack that lets you get away without emitting properly ordered keys.


> I also notice that the documents are sent into the view already sorted by their _id.  Is that a behavior that CAN be relied on?

Definitely not. And a properly written map function shouldn’t care anyway — if your map function uses any sort of external state that could distinguish the order in which docs are passed to it, then You’re Doing It Wrong.

—Jens

Re: question about "complex" range queries

Posted by Jens Alfke <je...@couchbase.com>.
On Apr 3, 2014, at 5:53 AM, Scott Weber <sc...@sbcglobal.net> wrote:

> Are you saying that it is NOT something that should be relied on?

Bob Newson just said that pretty emphatically.

If you want your index to be sorted in a particular way, then emit keys that sort that way! Don’t try to find some undocumented hack that lets you get away without emitting properly ordered keys.

> I also notice that the documents are sent into the view already sorted by their _id.  Is that a behavior that CAN be relied on?

Definitely not. And a properly written map function shouldn’t care anyway — if your map function uses any sort of external state that could distinguish the order in which docs are passed to it, then You’re Doing It Wrong.

—Jens

Re: question about "complex" range queries

Posted by Scott Weber <sc...@sbcglobal.net>.
I have been following this, and was wondering about it.

The documentation doesn't say anything about the second arg (i.e. value) of the emit call being used for sorting.  However it works that way every time, and I can find no situation where it doesn't happen.

Are you saying that it is NOT something that should be relied on?

I also notice that the documents are sent into the view already sorted by their _id.  Is that a behavior that CAN be relied on?

See my samples below:


Two of my docs:
{ "_id": "a", "_rev": "1-a50d41b0fdfaaca4e679c7f2c6a1722d",
  "items": [
    { "b": "5" }, { "a": "6" }, { "c": "4" }, { "x": "1" }, { "w": "3" }, { "z": "2" }
  ] }

{ "_id": "b", "_rev": "1-82a40a986a557f05352dbcb1fb07b26e",
  "items": [
    { "m": "5" }, { "l": "6" }, { "n": "4" }
  ] }


The temp view:
function(doc) {
  var x;
  for (x = 0; x < doc.items.length; x++) {
    emit("1", doc.items[x]);
  }
}


The output:  

{"total_rows":9,"offset":0,"rows":[
{"id":"a","key":"1","value":{"a":"6"}},
{"id":"a","key":"1","value":{"b":"5"}},
{"id":"a","key":"1","value":{"c":"4"}},
{"id":"a","key":"1","value":{"w":"3"}},
{"id":"a","key":"1","value":{"x":"1"}},
{"id":"a","key":"1","value":{"z":"2"}},
{"id":"b","key":"1","value":{"l":"6"}},
{"id":"b","key":"1","value":{"m":"5"}},
{"id":"b","key":"1","value":{"n":"4"}}
]}


-Scott



________________________________

From: Robert Samuel Newson <rn...@apache.org>
>Date: April 3, 2014, 3:30:14 AM CDT
>To: user@couchdb.apache.org, "Rian R. Maloney" <ri...@yahoo.com>
>Subject: Re: question about "complex" range queries
>Reply-To: user@couchdb.apache.org

>Sorry, but the value is not included in the ordering of the views, what you’re seeing there is an artifact of how multiple emits from the same document (where the emitted key is the same *and* the tie-breaking _id is also the same) are handled.
>
>B.
>
>On 3 Apr 2014, at 01:52, Rian R. Maloney <ri...@yahoo.com> wrote:
>
>
>
>This is NOT what I am seeing ( just trying to understand how this example is not sorted on value). Here is my test case:
> 
>1) create database called test
>2) create 1 doc with _id = test and a field called stuff ="this is test data"
>3)  create the following map:
> 
>function(doc) {
>  var i, data;
> 
>  // First pass: emit sortkey 1..4 in ascending order.
>  for (i = 1; i < 5; i++) {
>    data = {};
>    data.sortkey = i;
>    data.stuff = doc.stuff;
>    emit(doc._id, data);
>  }
>  // Second pass: emit sortkey 4..1 in descending order.
>  for (i = 4; i > 0; i--) {
>    data = {};
>    data.sortkey = i;
>    data.stuff = doc.stuff;
>    emit(doc._id, data);
>  }
>}
> 
> 
>Result:
>{"total_rows":8,"offset":0,"rows":[
>{"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}}
>]}
> 
>But if value was not included in the sort, it would be:
> 
>{"total_rows":8,"offset":0,"rows":[
>{"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
>{"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}}
>]}
>On Wednesday, April 2, 2014 6:07 PM, Ryan Ramage <ry...@gmail.com> wrote:
> 
>The emitted value does not affect the sort.
> 
> 
>If you wanted to keep sorting add your sort field in an array key, so move
>from
> 
>emit(doc.key, doc.value)
> 
> 
>to
> 
>emit([doc.key, doc.value], doc.value)
> 
> 
> 
>On Wed, Apr 2, 2014 at 4:57 PM, <ri...@yahoo.com> wrote:
> 
>Thanks Robert. Let me clarify. Is the value included in the sort? I have
>10 entries all with identical keys. The behavior I am seeing indicates the
>contents of the emitted value is included in the sort criteria and if so
>can I count on this? Bottom line I am trying to sort on 2 fields, one is a
>key and the other is the beginning of the value. I also don't want to sort
>the results in a list due to performance constraints
> 
>Thanks
> 
>On Apr 2, 2014, at 5:45 PM, Robert Samuel Newson <rn...@apache.org>
>wrote:
> 
>A view is a single dimension index, ordered by the full key in
>accordance with the rules enumerated here:
>https://wiki.apache.org/couchdb/View_collation#Collation_Specification
> 
>For example;
> 
>9 sorts after false
>999 sorts after 9
>"foo" sorts after 999
>["foo"] sorts after "foo"
>["foo", "bar"] sorts after ["foo"]
> 
>B.
> 
>On 2 Apr 2014, at 23:29, rian.maloney@yahoo.com wrote:
> 
>I have a quick question related to this. What exactly is sorted? My
>test cases indicated the key AND the entire value is sorted as if it was
>one big string.
> 
>Thanks
>Rian
> 
>On Apr 2, 2014, at 2:31 AM, Nick North <no...@gmail.com> wrote:
> 
>It doesn't work quite like this. There is a single order across all
>possible keys, including both simple and complex keys, as described
>here<https://wiki.apache.org/couchdb/View_collation>.
>In the case of keys that are lists, the two lists are compared element
>by
>element and the sort order is the sort order of the first unequal
>elements.
> 
>In your example, if a key has its first element between a1 and a2 (and
>a1
>and a2 are different), then the second element will not be inspected at
>all, so it does not matter whether it is between b1 and b2 or not. In
>fact
>the second element will only be inspected if the first element is
>either a1
>or a2.
> 
>This is usually the behaviour we want. For example, dates are often
>represented as lists of [year, month, day]. Then you can pull out all
>the
>documents in a date range by specifying start date and end date as
>startkey
>and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
>respectively, we want to include [2013, 3, 6] in the output even
>though its
>second element does not lie between the second elements of the keys.
> 
>Nick
> 
> 
>On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com>
>wrote:
> 
>Hi,
> 
>Let's assume that I have the following view function:
> 
>function(doc) { emit([a, b], whatever); }
> 
>In the query I'm running something like
>startkey=[a1, b1] & endkey=[a2,b2]
>Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
> 
>What is the expected result ?
> 
>My understanding:
>The query is executed in two phases
>In the first phase, the index is filtered for qualifying results
>where the
>first key ranges from a1 to a2
>In the second phase, these filtered results are filtered once more
>according to the range of the second emitted key from b1 to b2.
>Thus essentially an "AND" operation.
>The filtering is performed using lexicographical rules.
> 
>Is that correct ? The doc about complex keys is a bit slim.
> 
>Thanks
> 
>Klaus
> 
> 
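
Robert's collation examples quoted above (false, 9, 999, "foo", ["foo"], ["foo", "bar"]) can be reproduced with a rough sketch like the following. The function names typeRank and collate are made up for illustration, and two simplifications are assumed: strings are compared by plain JavaScript ordering rather than the ICU rules CouchDB uses, and objects are not compared with each other at all.

```javascript
// Type ranks follow the documented view collation order:
// null < false < true < numbers < strings < arrays < objects.
function typeRank(v) {
  if (v === null) return 0;
  if (v === false) return 1;
  if (v === true) return 2;
  if (typeof v === "number") return 3;
  if (typeof v === "string") return 4;
  if (Array.isArray(v)) return 5;
  return 6; // objects
}

// Hypothetical comparator approximating view collation.
function collate(a, b) {
  var ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra - rb;          // different types: rank decides
  if (ra === 3) return a - b;             // numbers: numeric order
  if (ra === 4) return a < b ? -1 : a > b ? 1 : 0; // strings: simplified, not ICU
  if (ra === 5) {                         // arrays: element by element
    var n = Math.min(a.length, b.length);
    for (var i = 0; i < n; i++) {
      var c = collate(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length;           // a shorter prefix sorts first
  }
  return 0; // null, booleans of equal rank; objects are left uncompared here
}

var keys = [["foo", "bar"], "foo", 999, ["foo"], false, 9];
keys.sort(collate);
console.log(JSON.stringify(keys));
// [false,9,999,"foo",["foo"],["foo","bar"]]
```

Only keys go through this ordering; the emitted value plays no part in it.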

Re: question about "complex" range queries

Posted by Robert Samuel Newson <rn...@apache.org>.
Sorry, but the value is not included in the ordering of the views; what you’re seeing there is an artifact of how multiple emits from the same document (where the emitted key is the same *and* the tie-breaking _id is also the same) are handled.

B.

On 3 Apr 2014, at 01:52, Rian R. Maloney <ri...@yahoo.com> wrote:

> This is NOT what I am seeing ( just trying to understand how this example is not sorted on value). Here is my test case:
> 
> 1) create database called test
> 2) create 1 doc with _id = test and a field called stuff ="this is test data"
> 3)  create the following map:
> 
> function(doc) {
>   var i, data;
> 
>   // First pass: emit sortkey 1..4 in ascending order.
>   for (i = 1; i < 5; i++) {
>     data = {};
>     data.sortkey = i;
>     data.stuff = doc.stuff;
>     emit(doc._id, data);
>   }
>   // Second pass: emit sortkey 4..1 in descending order.
>   for (i = 4; i > 0; i--) {
>     data = {};
>     data.sortkey = i;
>     data.stuff = doc.stuff;
>     emit(doc._id, data);
>   }
> }
> 
> 
> Result:
> {"total_rows":8,"offset":0,"rows":[
> {"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}}
> ]}
> 
> But if value was not included in the sort, it would be:
> 
> {"total_rows":8,"offset":0,"rows":[
> {"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
> {"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}}
> ]}
> On Wednesday, April 2, 2014 6:07 PM, Ryan Ramage <ry...@gmail.com> wrote:
> 
> The emitted value does not affect the sort.
> 
> 
> If you wanted to keep sorting add your sort field in an array key, so move
> from
> 
> emit(doc.key, doc.value)
> 
> 
> to
> 
> emit([doc.key, doc.value], doc.value)
> 
> 
> 
> On Wed, Apr 2, 2014 at 4:57 PM, <ri...@yahoo.com> wrote:
> 
>> Thanks Robert. Let me clarify. Is the value included in the sort? I have
>> 10 entries all with identical keys. The behavior I am seeing indicates the
>> contents of the emitted value is included in the sort criteria and if so
>> can I count on this? Bottom line I am trying to sort on 2 fields, one is a
>> key and the other is the beginning of the value. I also don't want to sort
>> the results in a list due to performance constraints
>> 
>> Thanks
>> 
>> On Apr 2, 2014, at 5:45 PM, Robert Samuel Newson <rn...@apache.org>
>> wrote:
>> 
>>> A view is a single dimension index, ordered by the full key in
>> accordance with the rules enumerated here:
>> https://wiki.apache.org/couchdb/View_collation#Collation_Specification
>>> 
>>> For example;
>>> 
>>> 9 sorts after false
>>> 999 sorts after 9
>>> "foo" sorts after 999
>>> ["foo"] sorts after "foo"
>>> ["foo", "bar"] sorts after ["foo"]
>>> 
>>> B.
>>> 
>>> On 2 Apr 2014, at 23:29, rian.maloney@yahoo.com wrote:
>>> 
>>>> I have a quick question related to this. What exactly is sorted? My
>> test cases indicated the key AND the entire value is sorted as if it was
>> one big string.
>>>> 
>>>> Thanks
>>>> Rian
>>>> 
>>>> On Apr 2, 2014, at 2:31 AM, Nick North <no...@gmail.com> wrote:
>>>> 
>>>>> It doesn't work quite like this. There is a single order across all
>>>>> possible keys, including both simple and complex keys, as described
>>>>> here<https://wiki.apache.org/couchdb/View_collation>.
>>>>> In the case of keys that are lists, the two lists are compared element
>> by
>>>>> element and the sort order is the sort order of the first unequal
>> elements.
>>>>> 
>>>>> In your example, if a key has its first element between a1 and a2 (and
>> a1
>>>>> and a2 are different), then the second element will not be inspected at
>>>>> all, so it does not matter whether it is between b1 and b2 or not. In
>> fact
>>>>> the second element will only be inspected if the first element is
>> either a1
>>>>> or a2.
>>>>> 
>>>>> This is usually the behaviour we want. For example, dates are often
>>>>> represented as lists of [year, month, day]. Then you can pull out all
>> the
>>>>> documents in a date range by specifying start date and end date as
>> startkey
>>>>> and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
>>>>> respectively, we want to include [2013, 3, 6] in the output even
>> though its
>>>>> second element does not lie between the second elements of the keys.
>>>>> 
>>>>> Nick
>>>>> 
>>>>> 
>>>>> On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com>
>> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> Let's assume that I have the following view function:
>>>>>> 
>>>>>> function(doc) { emit([a, b], whatever); }
>>>>>> 
>>>>>> In the query I'm running something like
>>>>>> startkey=[a1, b1] & endkey=[a2,b2]
>>>>>> Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
>>>>>> 
>>>>>> What is the expected result ?
>>>>>> 
>>>>>> My understanding:
>>>>>> The query is executed in two phases
>>>>>> In the first phase, the index is filtered for qualifying results
>> where the
>>>>>> first key ranges from a1 to a2
>>>>>> In the second phase, these filtered results are filtered once more
>>>>>> according to the range of the second emitted key from b1 to b2.
>>>>>> Thus essentially an "AND" operation.
>>>>>> The filtering is performed using lexicographical rules.
>>>>>> 
>>>>>> Is that correct ? The doc about complex keys is a bit slim.
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> Klaus
>>> 


Re: question about "complex" range queries

Posted by "Rian R. Maloney" <ri...@yahoo.com>.
This is NOT what I am seeing (I am just trying to understand how this example is not sorted on value). Here is my test case:

1) Create a database called test.
2) Create 1 doc with _id = test and a field called stuff = "this is test data".
3) Create the following map:

function(doc) {
  var i, data;

  // First pass: emit sortkey 1..4 in ascending order.
  for (i = 1; i < 5; i++) {
    data = {};
    data.sortkey = i;
    data.stuff = doc.stuff;
    emit(doc._id, data);
  }
  // Second pass: emit sortkey 4..1 in descending order.
  for (i = 4; i > 0; i--) {
    data = {};
    data.sortkey = i;
    data.stuff = doc.stuff;
    emit(doc._id, data);
  }
}


Result:
{"total_rows":8,"offset":0,"rows":[
{"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}}
]}

But if value was not included in the sort, it would be:

{"total_rows":8,"offset":0,"rows":[
{"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":1,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":2,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":3,"stuff":"this is test data"}},
{"id":"test","key":"test","value":{"sortkey":4,"stuff":"this is test data"}}
]}
On Wednesday, April 2, 2014 6:07 PM, Ryan Ramage <ry...@gmail.com> wrote:
 
The emitted value does not affect the sort.


If you wanted to keep sorting add your sort field in an array key, so move
from

emit(doc.key, doc.value)


to

emit([doc.key, doc.value], doc.value)



On Wed, Apr 2, 2014 at 4:57 PM, <ri...@yahoo.com> wrote:

> Thanks Robert. Let me clarify. Is the value included in the sort? I have
> 10 entries all with identical keys. The behavior I am seeing indicates the
> contents of the emitted value is included in the sort criteria and if so
> can I count on this? Bottom line I am trying to sort on 2 fields, one is a
> key and the other is the beginning of the value. I also don't want to sort
> the results in a list due to performance constraints
>
> Thanks
>
> On Apr 2, 2014, at 5:45 PM, Robert Samuel Newson <rn...@apache.org>
> wrote:
>
> > A view is a single dimension index, ordered by the full key in
> accordance with the rules enumerated here:
> https://wiki.apache.org/couchdb/View_collation#Collation_Specification
> >
> > For example;
> >
> > 9 sorts after false
> > 999 sorts after 9
> > "foo" sorts after 999
> > ["foo"] sorts after "foo"
> > ["foo", "bar"] sorts after ["foo"]
> >
> > B.
> >
> > On 2 Apr 2014, at 23:29, rian.maloney@yahoo.com wrote:
> >
> >> I have a quick question related to this. What exactly is sorted? My
> test cases indicated the key AND the entire value is sorted as if it was
> one big string.
> >>
> >> Thanks
> >> Rian
> >>
> >> On Apr 2, 2014, at 2:31 AM, Nick North <no...@gmail.com> wrote:
> >>
> >>> It doesn't work quite like this. There is a single order across all
> >>> possible keys, including both simple and complex keys, as described
> >>> here<https://wiki.apache.org/couchdb/View_collation>.
> >>> In the case of keys that are lists, the two lists are compared element
> by
> >>> element and the sort order is the sort order of the first unequal
> elements.
> >>>
> >>> In your example, if a key has its first element between a1 and a2 (and
> a1
> >>> and a2 are different), then the second element will not be inspected at
> >>> all, so it does not matter whether it is between b1 and b2 or not. In
> fact
> >>> the second element will only be inspected if the first element is
> either a1
> >>> or a2.
> >>>
> >>> This is usually the behaviour we want. For example, dates are often
> >>> represented as lists of [year, month, day]. Then you can pull out all
> the
> >>> documents in a date range by specifying start date and end date as
> startkey
> >>> and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
> >>> respectively, we want to include [2013, 3, 6] in the output even
> though its
> >>> second element does not lie between the second elements of the keys.
> >>>
> >>> Nick
> >>>
> >>>
> >>> On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com>
> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Let's assume that I have the following view function:
> >>>>
> >>>> function(doc) { emit([a, b], whatever); }
> >>>>
> >>>> In the query I'm running something like
> >>>> startkey=[a1, b1] & endkey=[a2,b2]
> >>>> Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
> >>>>
> >>>> What is the expected result ?
> >>>>
> >>>> My understanding:
> >>>> The query is executed in two phases
> >>>> In the first phase, the index is filtered for qualifying results
> where the
> >>>> first key ranges from a1 to a2
> >>>> In the second phase, these filtered results are filtered once more
> >>>> according to the range of the second emitted key from b1 to b2.
> >>>> Thus essentially an "AND" operation.
> >>>> The filtering is performed using lexicographical rules.
> >>>>
> >>>> Is that correct ? The doc about complex keys is a bit slim.
> >>>>
> >>>> Thanks
> >>>>
> >>>> Klaus
> >
>

Re: question about "complex" range queries

Posted by Ryan Ramage <ry...@gmail.com>.
The emitted value does not affect the sort.


If you want to keep that sort order, add your sort field to an array key, i.e. move
from

emit(doc.key, doc.value)


to

emit([doc.key, doc.value], doc.value)
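
As a rough illustration of the compound-key form, the sketch below runs such a map function outside CouchDB. Everything around the map function is an assumption for demonstration purposes: the emit stand-in, the sample docs, and the simplified sort (a string first element, a numeric second element) are not CouchDB code.

```javascript
// Stand-in for CouchDB's emit(); in a real view the server provides it.
var rows = [];
var currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// Map function using the compound key: the sort field is part of the key.
function map(doc) {
  emit([doc.key, doc.value], doc.value);
}

// Made-up sample documents sharing the same doc.key.
var docs = [
  { _id: "d1", key: "k", value: 3 },
  { _id: "d2", key: "k", value: 1 },
  { _id: "d3", key: "k", value: 2 }
];
docs.forEach(function (doc) {
  currentId = doc._id;
  map(doc);
});

// Simplified key ordering: first element (string), then second (number).
rows.sort(function (r1, r2) {
  if (r1.key[0] !== r2.key[0]) return r1.key[0] < r2.key[0] ? -1 : 1;
  return r1.key[1] - r2.key[1];
});

console.log(JSON.stringify(rows.map(function (r) { return r.id; })));
// ["d2","d3","d1"]
```

With a key of this shape, a query such as startkey=["k"]&endkey=["k",{}] should return all rows for "k" in value order, since an object sorts after any number or string in the second position.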


On Wed, Apr 2, 2014 at 4:57 PM, <ri...@yahoo.com> wrote:

> Thanks Robert. Let me clarify. Is the value included in the sort? I have
> 10 entries all with identical keys. The behavior I am seeing indicates the
> contents of the emitted value is included in the sort criteria and if so
> can I count on this? Bottom line I am trying to sort on 2 fields, one is a
> key and the other is the beginning of the value. I also don't want to sort
> the results in a list due to performance constraints
>
> Thanks
>
> On Apr 2, 2014, at 5:45 PM, Robert Samuel Newson <rn...@apache.org>
> wrote:
>
> > A view is a single dimension index, ordered by the full key in
> accordance with the rules enumerated here:
> https://wiki.apache.org/couchdb/View_collation#Collation_Specification
> >
> > For example;
> >
> > 9 sorts after false
> > 999 sorts after 9
> > "foo" sorts after 999
> > ["foo"] sorts after "foo"
> > ["foo", "bar"] sorts after ["foo"]
> >
> > B.
> >
> > On 2 Apr 2014, at 23:29, rian.maloney@yahoo.com wrote:
> >
> >> I have a quick question related to this. What exactly is sorted? My
> test cases indicated the key AND the entire value is sorted as if it was
> one big string.
> >>
> >> Thanks
> >> Rian
> >>
> >> On Apr 2, 2014, at 2:31 AM, Nick North <no...@gmail.com> wrote:
> >>
> >>> It doesn't work quite like this. There is a single order across all
> >>> possible keys, including both simple and complex keys, as described
> >>> here<https://wiki.apache.org/couchdb/View_collation>.
> >>> In the case of keys that are lists, the two lists are compared element by
> >>> element and the sort order is the sort order of the first unequal elements.
> >>>
> >>> In your example, if a key has its first element between a1 and a2 (and a1
> >>> and a2 are different), then the second element will not be inspected at
> >>> all, so it does not matter whether it is between b1 and b2 or not. In fact
> >>> the second element will only be inspected if the first element is either a1
> >>> or a2.
> >>>
> >>> This is usually the behaviour we want. For example, dates are often
> >>> represented as lists of [year, month, day]. Then you can pull out all the
> >>> documents in a date range by specifying start date and end date as startkey
> >>> and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
> >>> respectively, we want to include [2013, 3, 6] in the output even though its
> >>> second element does not lie between the second elements of the keys.
> >>>
> >>> Nick
> >>>
> >>>
> >>> On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Let's assume that I have the following view function:
> >>>>
> >>>> function(doc) { emit([a, b], whatever) }
> >>>>
> >>>> In the query I'm running something like
> >>>> startkey=[a1, b1] & endkey=[a2,b2]
> >>>> Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
> >>>>
> >>>> What is the expected result ?
> >>>>
> >>>> My understanding:
> >>>> The query is executed in two phases
> >>>> In the first phase, the index is filtered for qualifying results where the
> >>>> first key ranges from a1 to a2
> >>>> In the second phase, these filtered results are filtered once more
> >>>> according to the range of the second emitted key from b1 to b2.
> >>>> Thus essentially an "AND" operation.
> >>>> The filtering is performed using lexicographical rules.
> >>>>
> >>>> Is that correct ? The doc about complex keys is a bit slim.
> >>>>
> >>>> Thanks
> >>>>
> >>>> Klaus
> >
>

Re: question about "complex" range queries

Posted by Robert Samuel Newson <rn...@apache.org>.
No, only the key. If you have 10 rows with identical keys, you’ll find that they are in document _id order.
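
As a rough illustration of that ordering (not CouchDB's actual code), the sketch below sorts some made-up rows by key and then by document id, never looking at the value. The row contents and the compareRows helper are hypothetical, and plain string comparison stands in for CouchDB's real collation.

```javascript
// Hypothetical rows as a view might return them: same key, different values.
const rows = [
  { id: "doc-c", key: "red", value: "apple" },
  { id: "doc-a", key: "red", value: "zebra" },
  { id: "doc-b", key: "red", value: "mango" },
];

// Order by key first, then by document id; the value is never consulted.
// Plain code-unit comparison is an assumption made to keep the sketch short.
function compareRows(r1, r2) {
  if (r1.key !== r2.key) return r1.key < r2.key ? -1 : 1;
  if (r1.id !== r2.id) return r1.id < r2.id ? -1 : 1;
  return 0;
}

const ordered = rows.slice().sort(compareRows);
const values = ordered.map((r) => r.value);
console.log(values); // the values come out in _id order, not alphabetical order
```

If a second field has to take part in the ordering, one common approach is to move it into the key, for example emit([doc.a, doc.b], ...) with whatever field names apply, so that it is covered by the key collation.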

B.

On 2 Apr 2014, at 23:57, rian.maloney@yahoo.com wrote:

> Thanks Robert. Let me clarify. Is the value included in the sort? I have 10 entries all with identical keys. The behavior I am seeing indicates the contents of the emitted value is included in the sort criteria and if so can I count on this? Bottom line I am trying to sort on 2 fields, one is a key and the other is the beginning of the value. I also don't want to sort the results in a list due to performance constraints
> 
> Thanks
> 
> On Apr 2, 2014, at 5:45 PM, Robert Samuel Newson <rn...@apache.org> wrote:
> 
>> A view is a single dimension index, ordered by the full key in accordance with the rules enumerated here: https://wiki.apache.org/couchdb/View_collation#Collation_Specification
>> 
>> For example;
>> 
>> 9 sorts after false
>> 999 sorts after 9
>> "foo" sorts after 999
>> ["foo"] sorts after "foo"
>> ["foo", "bar"] sorts after ["foo"]
>> 
>> B.
>> 
>> On 2 Apr 2014, at 23:29, rian.maloney@yahoo.com wrote:
>> 
>>> I have a quick question related to this. What exactly is sorted? My test cases indicated the key AND the entire value is sorted as if it was one big string. 
>>> 
>>> Thanks
>>> Rian
>>> 
>>> On Apr 2, 2014, at 2:31 AM, Nick North <no...@gmail.com> wrote:
>>> 
>>>> It doesn't work quite like this. There is a single order across all
>>>> possible keys, including both simple and complex keys, as described
>>>> here<https://wiki.apache.org/couchdb/View_collation>.
>>>> In the case of keys that are lists, the two lists are compared element by
>>>> element and the sort order is the sort order of the first unequal elements.
>>>> 
>>>> In your example, if a key has its first element between a1 and a2 (and a1
>>>> and a2 are different), then the second element will not be inspected at
>>>> all, so it does not matter whether it is between b1 and b2 or not. In fact
>>>> the second element will only be inspected if the first element is either a1
>>>> or a2.
>>>> 
>>>> This is usually the behaviour we want. For example, dates are often
>>>> represented as lists of [year, month, day]. Then you can pull out all the
>>>> documents in a date range by specifying start date and end date as startkey
>>>> and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
>>>> respectively, we want to include [2013, 3, 6] in the output even though its
>>>> second element does not lie between the second elements of the keys.
>>>> 
>>>> Nick
>>>> 
>>>> 
>>>> On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> Let's assume that I have the following view function:
>>>>> 
>>>>> function(doc) { emit([a, b], whatever) }
>>>>> 
>>>>> In the query I'm running something like
>>>>> startkey=[a1, b1] & endkey=[a2,b2]
>>>>> Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
>>>>> 
>>>>> What is the expected result ?
>>>>> 
>>>>> My understanding:
>>>>> The query is executed in two phases
>>>>> In the first phase, the index is filtered for qualifying results where the
>>>>> first key ranges from a1 to a2
>>>>> In the second phase, these filtered results are filtered once more
>>>>> according to the range of the second emitted key from b1 to b2.
>>>>> Thus essentially an "AND" operation.
>>>>> The filtering is performed using lexicographical rules.
>>>>> 
>>>>> Is that correct ? The doc about complex keys is a bit slim.
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Klaus
>> 


Re: question about "complex" range queries

Posted by ri...@yahoo.com.
Thanks Robert. Let me clarify. Is the value included in the sort? I have 10 entries all with identical keys. The behavior I am seeing indicates that the contents of the emitted value are included in the sort criteria, and if so, can I count on this? Bottom line: I am trying to sort on 2 fields; one is a key and the other is the beginning of the value. I also don't want to sort the results in a list due to performance constraints.

Thanks

On Apr 2, 2014, at 5:45 PM, Robert Samuel Newson <rn...@apache.org> wrote:

> A view is a single dimension index, ordered by the full key in accordance with the rules enumerated here: https://wiki.apache.org/couchdb/View_collation#Collation_Specification
> 
> For example;
> 
> 9 sorts after false
> 999 sorts after 9
> "foo" sorts after 999
> ["foo"] sorts after "foo"
> ["foo", "bar"] sorts after ["foo"]
> 
> B.
> 
> On 2 Apr 2014, at 23:29, rian.maloney@yahoo.com wrote:
> 
>> I have a quick question related to this. What exactly is sorted? My test cases indicated the key AND the entire value is sorted as if it was one big string. 
>> 
>> Thanks
>> Rian
>> 
>> On Apr 2, 2014, at 2:31 AM, Nick North <no...@gmail.com> wrote:
>> 
>>> It doesn't work quite like this. There is a single order across all
>>> possible keys, including both simple and complex keys, as described
>>> here<https://wiki.apache.org/couchdb/View_collation>.
>>> In the case of keys that are lists, the two lists are compared element by
>>> element and the sort order is the sort order of the first unequal elements.
>>> 
>>> In your example, if a key has its first element between a1 and a2 (and a1
>>> and a2 are different), then the second element will not be inspected at
>>> all, so it does not matter whether it is between b1 and b2 or not. In fact
>>> the second element will only be inspected if the first element is either a1
>>> or a2.
>>> 
>>> This is usually the behaviour we want. For example, dates are often
>>> represented as lists of [year, month, day]. Then you can pull out all the
>>> documents in a date range by specifying start date and end date as startkey
>>> and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
>>> respectively, we want to include [2013, 3, 6] in the output even though its
>>> second element does not lie between the second elements of the keys.
>>> 
>>> Nick
>>> 
>>> 
>>> On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Let's assume that I have the following view function:
>>>> 
>>>> function(doc) { emit([a, b], whatever) }
>>>> 
>>>> In the query I'm running something like
>>>> startkey=[a1, b1] & endkey=[a2,b2]
>>>> Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
>>>> 
>>>> What is the expected result ?
>>>> 
>>>> My understanding:
>>>> The query is executed in two phases
>>>> In the first phase, the index is filtered for qualifying results where the
>>>> first key ranges from a1 to a2
>>>> In the second phase, these filtered results are filtered once more
>>>> according to the range of the second emitted key from b1 to b2.
>>>> Thus essentially an "AND" operation.
>>>> The filtering is performed using lexicographical rules.
>>>> 
>>>> Is that correct ? The doc about complex keys is a bit slim.
>>>> 
>>>> Thanks
>>>> 
>>>> Klaus
> 

Re: question about "complex" range queries

Posted by Robert Samuel Newson <rn...@apache.org>.
A view is a single-dimension index, ordered by the full key in accordance with the rules enumerated here: https://wiki.apache.org/couchdb/View_collation#Collation_Specification

For example:

9 sorts after false
999 sorts after 9
"foo" sorts after 999
["foo"] sorts after "foo"
["foo", "bar"] sorts after ["foo"]
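
The same examples can be reproduced with a small comparator. This is only a sketch of the type ordering (null, false, true, numbers, strings, arrays, objects), not CouchDB's implementation: typeRank and collate are hypothetical names, localeCompare is a stand-in for the ICU string collation CouchDB uses, and objects are not compared with each other at all.

```javascript
// Rank of each JSON type in the view collation order:
// null < false < true < numbers < strings < arrays < objects.
function typeRank(v) {
  if (v === null) return 0;
  if (v === false) return 1;
  if (v === true) return 2;
  if (typeof v === "number") return 3;
  if (typeof v === "string") return 4;
  if (Array.isArray(v)) return 5;
  return 6; // objects
}

// Compare two keys: different types are ordered by rank, same types by value.
function collate(x, y) {
  const rx = typeRank(x);
  const ry = typeRank(y);
  if (rx !== ry) return rx < ry ? -1 : 1;
  if (rx === 3) return x < y ? -1 : x > y ? 1 : 0;
  if (rx === 4) return x.localeCompare(y); // assumption: stand-in for ICU collation
  if (rx === 5) {
    // Arrays: element by element; the first unequal pair decides.
    const n = Math.min(x.length, y.length);
    for (let i = 0; i < n; i++) {
      const c = collate(x[i], y[i]);
      if (c !== 0) return c;
    }
    return x.length - y.length; // a prefix sorts before the longer array
  }
  return 0; // null/false/true are equal to themselves; objects are not handled here
}

const keys = [["foo", "bar"], "foo", 999, ["foo"], false, 9];
const sorted = keys.slice().sort(collate);
console.log(JSON.stringify(sorted));
```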

B.

On 2 Apr 2014, at 23:29, rian.maloney@yahoo.com wrote:

> I have a quick question related to this. What exactly is sorted? My test cases indicated the key AND the entire value is sorted as if it was one big string. 
> 
> Thanks
> Rian
> 
> On Apr 2, 2014, at 2:31 AM, Nick North <no...@gmail.com> wrote:
> 
>> It doesn't work quite like this. There is a single order across all
>> possible keys, including both simple and complex keys, as described
>> here<https://wiki.apache.org/couchdb/View_collation>.
>> In the case of keys that are lists, the two lists are compared element by
>> element and the sort order is the sort order of the first unequal elements.
>> 
>> In your example, if a key has its first element between a1 and a2 (and a1
>> and a2 are different), then the second element will not be inspected at
>> all, so it does not matter whether it is between b1 and b2 or not. In fact
>> the second element will only be inspected if the first element is either a1
>> or a2.
>> 
>> This is usually the behaviour we want. For example, dates are often
>> represented as lists of [year, month, day]. Then you can pull out all the
>> documents in a date range by specifying start date and end date as startkey
>> and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
>> respectively, we want to include [2013, 3, 6] in the output even though its
>> second element does not lie between the second elements of the keys.
>> 
>> Nick
>> 
>> 
>> On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com> wrote:
>> 
>>> Hi,
>>> 
>>> Let's assume that I have the following view function:
>>> 
>>> function(doc) { emit([a, b], whatever) }
>>> 
>>> In the query I'm running something like
>>> startkey=[a1, b1] & endkey=[a2,b2]
>>> Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
>>> 
>>> What is the expected result ?
>>> 
>>> My understanding:
>>> The query is executed in two phases
>>> In the first phase, the index is filtered for qualifying results where the
>>> first key ranges from a1 to a2
>>> In the second phase, these filtered results are filtered once more
>>> according to the range of the second emitted key from b1 to b2.
>>> Thus essentially an "AND" operation.
>>> The filtering is performed using lexicographical rules.
>>> 
>>> Is that correct ? The doc about complex keys is a bit slim.
>>> 
>>> Thanks
>>> 
>>> Klaus
>>> 
>>> 


Re: question about "complex" range queries

Posted by ri...@yahoo.com.
I have a quick question related to this. What exactly is sorted? My test cases indicated that the key AND the entire value are sorted as if they were one big string.

Thanks
Rian

On Apr 2, 2014, at 2:31 AM, Nick North <no...@gmail.com> wrote:

> It doesn't work quite like this. There is a single order across all
> possible keys, including both simple and complex keys, as described
> here<https://wiki.apache.org/couchdb/View_collation>.
> In the case of keys that are lists, the two lists are compared element by
> element and the sort order is the sort order of the first unequal elements.
> 
> In your example, if a key has its first element between a1 and a2 (and a1
> and a2 are different), then the second element will not be inspected at
> all, so it does not matter whether it is between b1 and b2 or not. In fact
> the second element will only be inspected if the first element is either a1
> or a2.
> 
> This is usually the behaviour we want. For example, dates are often
> represented as lists of [year, month, day]. Then you can pull out all the
> documents in a date range by specifying start date and end date as startkey
> and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
> respectively, we want to include [2013, 3, 6] in the output even though its
> second element does not lie between the second elements of the keys.
> 
> Nick
> 
> 
> On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com> wrote:
> 
>> Hi,
>> 
>> Let's assume that I have the following view function:
>> 
>> function(doc) { emit([a, b], whatever) }
>> 
>> In the query I'm running something like
>> startkey=[a1, b1] & endkey=[a2,b2]
>> Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
>> 
>> What is the expected result ?
>> 
>> My understanding:
>> The query is executed in two phases
>> In the first phase, the index is filtered for qualifying results where the
>> first key ranges from a1 to a2
>> In the second phase, these filtered results are filtered once more
>> according to the range of the second emitted key from b1 to b2.
>> Thus essentially an "AND" operation.
>> The filtering is performed using lexicographical rules.
>> 
>> Is that correct ? The doc about complex keys is a bit slim.
>> 
>> Thanks
>> 
>> Klaus
>> 
>> 

Re: question about "complex" range queries

Posted by Nick North <no...@gmail.com>.
It doesn't work quite like this. There is a single order across all
possible keys, including both simple and complex keys, as described
here <https://wiki.apache.org/couchdb/View_collation>.
In the case of keys that are lists, the two lists are compared element by
element and the sort order is the sort order of the first unequal elements.

In your example, if a key has its first element between a1 and a2 (and a1
and a2 are different), then the second element will not be inspected at
all, so it does not matter whether it is between b1 and b2 or not. In fact
the second element will only be inspected if the first element is either a1
or a2.

This is usually the behaviour we want. For example, dates are often
represented as lists of [year, month, day]. Then you can pull out all the
documents in a date range by specifying start date and end date as startkey
and endkey. For example, if these are [2012, 4, 5] and [2014, 8, 3]
respectively, we want to include [2013, 3, 6] in the output even though its
second element does not lie between the second elements of the keys.
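
To make the date example concrete, here is a minimal sketch of the element-by-element comparison, restricted to lists of numbers. The compareKeys and inRange helpers are hypothetical names for illustration and are not part of CouchDB; the range check treats both startkey and endkey as inclusive.

```javascript
// Simplified stand-in for key collation: handles only arrays of numbers,
// which is enough for [year, month, day] keys.
function compareKeys(x, y) {
  const n = Math.min(x.length, y.length);
  for (let i = 0; i < n; i++) {
    if (x[i] !== y[i]) return x[i] < y[i] ? -1 : 1; // first unequal element decides
  }
  return x.length - y.length; // a prefix sorts before the longer list
}

// True when startkey <= key <= endkey in the single linear key order.
function inRange(key, startkey, endkey) {
  return compareKeys(startkey, key) <= 0 && compareKeys(key, endkey) <= 0;
}

const startkey = [2012, 4, 5];
const endkey = [2014, 8, 3];

// The year alone decides, so month 3 being outside 4..8 does not matter.
const included = inRange([2013, 3, 6], startkey, endkey);
// The first two elements tie with startkey, so the day is inspected: 4 < 5.
const boundary = inRange([2012, 4, 4], startkey, endkey);
console.log(included, boundary); // true false
```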

Nick


On 2 April 2014 03:56, Schroiff, Klaus <Kl...@fast.au.fujitsu.com> wrote:

> Hi,
>
> Let's assume that I have the following view function:
>
> function(doc) { emit([a, b], whatever) }
>
> In the query I'm running something like
> startkey=[a1, b1] & endkey=[a2,b2]
> Thus there're TWO explicit ranges here - a1->a2 and b1->b2.
>
> What is the expected result ?
>
> My understanding:
> The query is executed in two phases
> In the first phase, the index is filtered for qualifying results where the
> first key ranges from a1 to a2
> In the second phase, these filtered results are filtered once more
> according to the range of the second emitted key from b1 to b2.
> Thus essentially an "AND" operation.
> The filtering is performed using lexicographical rules.
>
> Is that correct ? The doc about complex keys is a bit slim.
>
> Thanks
>
> Klaus
>
>