Posted to user@couchdb.apache.org by Alex P <ap...@kolosy.com> on 2009/10/27 15:50:31 UTC

multiple range queries via POST?

i know this is currently unsupported (and may be more of a question for the
dev list), but is there a technical reason why multi-range queries can't
be submitted to couch (slight ah-hah moment at the end)?

the specific problem i'm trying to address is this:

suppose i have a message document, and a corresponding map function:

function (doc) {
  if (doc.docType != 'message') return;

  emit(doc.owner, null);
}

if i wanted to pull back all messages for users foo and bar, i'd simply do a
POST path/to/couch keys = ['foo', 'bar']. now let's make this data come back
sorted by create date:

function (doc) {
  if (doc.docType != 'message') return;

  emit([doc.owner, doc.createDate], null);
}

also cool, but now, to retrieve all messages pertaining to a single user, i
need to do GET path/to/couch startkey=['foo']&endkey=['foo', 'a']. this
works, but it now means that if i want all messages pertaining to both foo
and bar, i need to run two separate queries.
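The two-query workaround described above can be sketched as follows. This is only an illustration: the view path is a made-up placeholder, `ownerRangeUrl` is a hypothetical helper, and `[owner, {}]` is used as the upper bound (the convention that appears elsewhere in this thread) rather than `['foo', 'a']`. It assumes CouchDB's lowercase `startkey` and `endkey` query parameters with JSON-encoded values.

```javascript
// Hypothetical view path; substitute your own database, design document, and view.
const VIEW_PATH = '/db/_design/messages/_view/by_owner_and_date';

// Build the GET URL that selects every row whose key starts with `owner`.
// Keys are JSON, so they are serialized and then URL-encoded.
function ownerRangeUrl(owner) {
  const startkey = encodeURIComponent(JSON.stringify([owner]));
  const endkey = encodeURIComponent(JSON.stringify([owner, {}]));
  return `${VIEW_PATH}?startkey=${startkey}&endkey=${endkey}`;
}

// One URL per owner: without multi-range support, each needs its own request.
const urls = ['foo', 'bar'].map(ownerRangeUrl);
```

Each URL in `urls` would then be fetched separately, which is exactly the round-trip cost the question is about.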

as i'm writing this, i think i'm starting to see that the problem would be
with having to merge overlapping ranges, but i still would like someone else
to weigh in on this


thanks,
alex.

Re: multiple range queries via POST?

Posted by Paul Davis <pa...@gmail.com>.
Alex,

Views are streamed from the database with no buffering in RAM, to the
point that all operations must keep those semantics.

There's an old ticket on striped queries that's quite similar to what
you're asking for at [1]. It had a patch that probably won't apply
to trunk at all, but the idea is there. In general, though, I'm not overly
convinced of the necessity. Assuming your HTTP client is capable of
persistent connections, I don't see this as having a huge effect in
terms of speed in normal use. Perhaps if you lean heavily on the feature,
but in the general scheme of things, there's not much that
you'd be able to implement that would give you a boost.

Oh.... Actually, one thing you could convince me on is single reader
snapshots for the view, which you couldn't get with separate requests.
Either way, the implementation wouldn't be *too* difficult. It should be
able to sit almost 100% in couch_httpd_view.erl and would be able to pull
a lot from output_map_view and output_reduce_view.

Also, I wouldn't attempt to resolve overlaps. I think it'd be more
confusing to have merged result sets than just having a single result
for each sub-request. I.e., the return would just be an array of view
outputs. Or something.
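To make the "array of view outputs" idea concrete, here is a minimal in-memory sketch. It is not how CouchDB implements anything: `queryRanges` and `compareKeys` are hypothetical names, the rows are invented sample data, and the comparison is a deliberately simplified stand-in for CouchDB's real collation (it only understands null, strings, and `{}` as a high sentinel).

```javascript
// Simplified ordering, NOT CouchDB's full collation:
// null sorts lowest, then strings, then any object such as {}.
function rank(v) {
  if (v === null) return 0;
  if (typeof v === 'string') return 1;
  return 2;
}

function compareParts(a, b) {
  if (rank(a) !== rank(b)) return rank(a) - rank(b);
  if (typeof a === 'string') return a < b ? -1 : a > b ? 1 : 0;
  return 0;
}

// Compare two array keys element by element; a shorter prefix sorts first.
function compareKeys(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const c = compareParts(a[i], b[i]);
    if (c !== 0) return c;
  }
  return a.length - b.length;
}

// Return one result set per range, in request order, with no merging.
function queryRanges(sortedRows, ranges) {
  return ranges.map(({ startkey, endkey }) => ({
    rows: sortedRows.filter(
      (row) =>
        compareKeys(row.key, startkey) >= 0 &&
        compareKeys(row.key, endkey) <= 0
    ),
  }));
}

// Invented sample rows, already in key order.
const rows = [
  { id: 'm1', key: ['bar', '2009-10-01'], value: null },
  { id: 'm2', key: ['foo', '2009-10-02'], value: null },
  { id: 'm3', key: ['foo', '2009-10-05'], value: null },
];

const results = queryRanges(rows, [
  { startkey: ['foo'], endkey: ['foo', {}] },
  { startkey: ['bar'], endkey: ['foo', {}] }, // overlaps the first range
]);
```

Because the second range overlaps the first, rows `m2` and `m3` appear in both result sets. Nothing is deduplicated, which keeps each sub-result identical to what a standalone request for that range would have returned.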

HTH,
Paul Davis

[1] http://issues.apache.org/jira/browse/COUCHDB-244



Re: multiple range queries via POST?

Posted by Alex P <ap...@kolosy.com>.
good to hear. re: http call - well it should be consistent with the keys
call, shouldn't it? so a POST with startKey[s] and an endKey[s] arguments?

out of curiosity - when a view subset is returned, is it 'streamed' out? or
is the entire dataset prefetched and then returned?

i could see overlapping ranges being simple to solve mathematically, but
posing either seek or memory issues (reading the two ranges concurrently vs.
pre-fetching both and doing a merge)


Re: multiple range queries via POST?

Posted by Nathan Stott <nr...@gmail.com>.
I opened a JIRA ticket about this some time ago:
https://issues.apache.org/jira/browse/COUCHDB-523


Re: multiple range queries via POST?

Posted by Paul Davis <pa...@gmail.com>.
Indeed. Good catch.


Re: multiple range queries via POST?

Posted by Adam Wolff <aw...@gmail.com>.
makes sense. should that be "ranges"?
A


Re: multiple range queries via POST?

Posted by Paul Davis <pa...@gmail.com>.
I'd rather see something like:

{keys: [k1, k2, k3]}

or

{range: [query_params_1, query_params_2]}

with

query_params_1 = {startkey: k1, endkey: k2} etc etc

With ranges and keys being mutually exclusive
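A rough sketch of how a client might build such a request body, including the mutual-exclusion rule. Everything here is hypothetical: `buildViewRequestBody` is an invented helper, and the `ranges` field (spelled in the plural, as agreed in the follow-up replies) is a proposal from this thread, not an existing CouchDB parameter.

```javascript
// Hypothetical helper for the request shape proposed in this thread.
// `keys` is the existing multi-key form; `ranges` is only a proposal.
function buildViewRequestBody({ keys, ranges } = {}) {
  if (keys !== undefined && ranges !== undefined) {
    throw new Error('keys and ranges are mutually exclusive');
  }
  if (keys !== undefined) {
    return JSON.stringify({ keys });
  }
  if (ranges !== undefined) {
    return JSON.stringify({ ranges });
  }
  throw new Error('either keys or ranges is required');
}

// Each entry in `ranges` is an ordinary set of query parameters.
const rangeBody = buildViewRequestBody({
  ranges: [
    { startkey: ['foo'], endkey: ['foo', {}] },
    { startkey: ['bar'], endkey: ['bar', {}] },
  ],
});
```

Rejecting a body that carries both fields on the client side mirrors the rule above. A server implementing the proposal would presumably need the same check.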

Paul Davis


Re: multiple range queries via POST?

Posted by Adam Wolff <aw...@gmail.com>.
We could really use this feature too. Right now, we do this:

post("/path_to_view/", {keys : keys});

For this feature, maybe the syntax could be

post("/path_to_view/", {keys : [{start:[k1, null], end:[k1, {}]},
{start:[k2, null], end:[k2, {}]}] });
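For illustration, the range objects in that proposed syntax could be generated from a plain list of keys. This is a sketch of the proposal only: `prefixRanges` is a hypothetical helper, and the `start`/`end` object form is not something CouchDB accepts in `keys`.

```javascript
// Hypothetical: expand each key into the prefix range from the proposal
// above, using null as the low sentinel and {} as the high sentinel.
function prefixRanges(owners) {
  return owners.map((k) => ({ start: [k, null], end: [k, {}] }));
}

// Body in the proposed shape, ready to be serialized for a POST.
const body = { keys: prefixRanges(['foo', 'bar']) };
const serialized = JSON.stringify(body);
```

The `[k, null]` to `[k, {}]` pair brackets every composite key whose first element is `k`, which is why a single range per owner is enough.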

A



Re: multiple range queries via POST?

Posted by Adam Kocoloski <ko...@apache.org>.

Hi Alex, internally, multiple keys are actually just a special case of  
multiple ranges.  So that part is easy.  We would want to be clear  
about how we handle overlapping ranges, but it's not that hard of a  
problem really.

I wonder what the HTTP call for this should look like?

Adam