Posted to user@couchdb.apache.org by Henrik Skupin <hs...@gmail.com> on 2010/09/13 09:12:02 UTC

Filtering of view results

Hi,

Using complex keys for my views gives me a problem with filtering the
results. Let's say I have two filters:

Branch: "All", "4.0", "3.6"
Platform: "All", "Windows NT", "Linux", "Mac"

By default no filtering is active and all results are shown sorted by date.
The key used for emit looks like:

emit([doc.date, doc.branch, doc.platform], { ... });

But how can I reuse the view to show only results for branch 4.0 sorted by
date, or even for the combination of branch 4.0 and platform Mac? I tried
?key=[{},"4.0",{}] but no results are returned. Does the key option not
allow catch-all values? Sadly I can't find any information about the usage
of key.

I hope that is somehow possible.

Thanks,

-- 
Henrik Skupin
QA Engineer
Mozilla Corporation

Re: Filtering of view results

Posted by Henrik Skupin <hs...@gmail.com>.
On Mon, Sep 13, 2010 at 2:10 PM, Kenneth Tyler <ke...@8thfold.com> wrote:


> Can you say more about what this document is recording? What do "branch"
> and "platform" mean, and what is the reason for storing this information? I
> am trying to think about your document structure, and the sample
> values alone do not give me enough background.
>

You can find a working example of the couchapp here:
http://mozmill.hskupin.info/general/reports

Here, branch means the major version of Firefox, while platform can be
Windows NT, Linux, or Mac. Click on the date of an entry to see the
complete document.

A raw document can be found here:
http://mozmill.couchone.com/mozmill-crowd/b2d76c168f83fa5519f2b8855a73d42c

-- 
Henrik Skupin
QA Engineer
Mozilla Corporation

Re: Filtering of view results

Posted by Kenneth Tyler <ke...@8thfold.com>.
Henrik,
Can you say more about what this document is recording? What do "branch" and
"platform" mean, and what is the reason for storing this information? I
am trying to think about your document structure, and the sample
values alone do not give me enough background.

Thanks

ken tyler


Re: Filtering of view results

Posted by Randall Leeds <ra...@gmail.com>.
On Mon, Sep 13, 2010 at 14:11, J Chris Anderson <jc...@apache.org> wrote:
>
> I could show you a complicated way to achieve the API you are asking for, using _list functions, but it would be slow, so I won't.
>
> Chris
>

This is the essential point. Indices make queries fast at the
expense of disk space. Since documents are passed to the view
server only once per design doc (or "view group"), it shouldn't be much
more expensive computationally to build four indices instead of just
one; most of the overhead is serialization to and from the view server.
You should probably just use four views. That will use more disk space
and strain your disk a bit more when the views are updated, but the
community can help you scale out (e.g., by sharding) if it becomes a
bottleneck.

Randall

Re: Filtering of view results

Posted by J Chris Anderson <jc...@apache.org>.
On Sep 13, 2010, at 12:12 AM, Henrik Skupin wrote:

> 
> But how can I reuse the view to show only results for branch 4.0 sorted by
> date, or even for the combination of branch 4.0 and platform Mac? I tried
> ?key=[{},"4.0",{}] but no results are returned. Does the key option not
> allow catch-all values? Sadly I can't find any information about the usage
> of key.
> 

The solution is to relax and make four views, one to support each of the queries you describe: all by date, branch by date, branch and platform by date, and platform by date.

E.g., there would be one view for each, with emit calls like this:

1) emit(doc.timestamp, foo)
2) emit([doc.branch, doc.timestamp], foo)
3) emit([doc.branch, doc.platform, doc.timestamp], foo)
4) emit([doc.platform, doc.timestamp], foo)

This will give you the best performance.
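
As a rough sketch, the four emit calls above could sit in one design document as four map functions. The view names, the design document id, the sample document, and the small runMap harness are all made up for illustration, and the placeholder foo is replaced by null. CouchDB supplies emit() itself; the stand-in below only exists so the functions can be tried outside the server.

```javascript
// Hypothetical view names; each map function mirrors one emit call above.
const views = {
  all_by_date: function (doc) { emit(doc.timestamp, null); },
  branch_by_date: function (doc) { emit([doc.branch, doc.timestamp], null); },
  branch_platform_by_date: function (doc) {
    emit([doc.branch, doc.platform, doc.timestamp], null);
  },
  platform_by_date: function (doc) { emit([doc.platform, doc.timestamp], null); }
};

// Stand-in for the emit() that CouchDB's view server normally provides.
let emitted = [];
function emit(key, value) { emitted.push({ key: key, value: value }); }

// Run one map function over one document and return the rows it emits.
function runMap(mapFn, doc) {
  emitted = [];
  mapFn(doc);
  return emitted;
}

// Design documents store each map function as a source string.
const designDoc = {
  _id: "_design/reports", // assumed name
  views: Object.fromEntries(
    Object.entries(views).map(([name, fn]) => [name, { map: fn.toString() }])
  )
};

// Assumed document shape, using the field names from the emit calls above.
const sampleDoc = { timestamp: "2010-09-13T09:12:02Z", branch: "4.0", platform: "Mac" };
console.log(runMap(views.branch_platform_by_date, sampleDoc));
console.log(Object.keys(designDoc.views));
```

With a layout like this, a query such as branch_platform_by_date?startkey=["4.0","Mac"]&endkey=["4.0","Mac",{}] should return the rows for that combination in timestamp order.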

Views in CouchDB are just sorted structures. That is, the rows are sorted on insert. This means that queries are fast when they return contiguous ranges of rows. Queries which attempt to return non-contiguous ranges of rows would be slow, so CouchDB does not support them.
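
To make the "contiguous range" point concrete, here is a small simulation of a view keyed on [date, branch, platform]. The collate function is a simplified stand-in for CouchDB's collation (real CouchDB compares strings with ICU rules, and this sketch treats all objects as equal), and the sample keys are invented. It shows that key=[{},"4.0",{}] is an exact-match lookup that finds nothing, that a date range is one contiguous slice, and that the branch 4.0 rows are scattered through the index.

```javascript
// Simplified type ordering: null < false < true < numbers < strings < arrays < objects.
function typeRank(v) {
  if (v === null) return 0;
  if (v === false) return 1;
  if (v === true) return 2;
  if (typeof v === "number") return 3;
  if (typeof v === "string") return 4;
  if (Array.isArray(v)) return 5;
  return 6; // objects sort after everything else
}

// Approximate key comparison (assumption: plain code-unit string order).
function collate(a, b) {
  const ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  if (ra === 3) return a - b;
  if (ra === 4) return a < b ? -1 : a > b ? 1 : 0;
  if (ra === 5) {
    const n = Math.min(a.length, b.length);
    for (let i = 0; i < n; i++) {
      const c = collate(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length; // a shorter prefix sorts first
  }
  return 0; // same null/boolean value; objects are not compared further here
}

// Invented keys, as emit([doc.date, doc.branch, doc.platform], ...) would produce.
const keys = [
  ["2010-09-12", "4.0", "Linux"],
  ["2010-09-11", "4.0", "Mac"],
  ["2010-09-13", "4.0", "Mac"],
  ["2010-09-11", "3.6", "Linux"],
  ["2010-09-12", "3.6", "Mac"]
].sort(collate);

// key=... is an exact match, so {} only matches a literal empty object.
function byKey(key) {
  return keys.filter(function (k) { return collate(k, key) === 0; });
}

// startkey/endkey select one contiguous slice of the sorted keys.
function byRange(startkey, endkey) {
  return keys.filter(function (k) {
    return collate(k, startkey) >= 0 && collate(k, endkey) <= 0;
  });
}

console.log(byKey([{}, "4.0", {}]));
console.log(byRange(["2010-09-12"], ["2010-09-12", {}]));
// Index positions of the branch "4.0" rows:
console.log(keys.map((k, i) => (k[1] === "4.0" ? i : -1)).filter(i => i >= 0));
```

In a range query the trailing {} works as an upper bound because objects sort after strings; it is not a wildcard.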

I could show you a complicated way to achieve the API you are asking for, using _list functions, but it would be slow, so I won't.

Chris



Re: Filtering of view results

Posted by Stephen Prater <st...@agrussell.com>.
I don't think you need to create different views; you just need to emit
all the permutations of the view parameters.

So if you had the parameters D, B, P (Date, Branch, Platform), you'd have
to emit 6 times:

[d,b,p]
[d,p,b]
[b,d,p]
[b,p,d]
[p,d,b]
[p,b,d]

That would let you do range queries based on whatever is being filtered
on. For filtering by date, then branch, then platform, you would choose
the first one.

I think you'd need to sort each entry so they show up together.
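
A possible sketch of the single-view idea follows. The message does not say how to keep the six orderings apart inside one index, so the tag prefix ("dbp", "bpd", ...) is an assumption added here, similar in spirit to the labelled keys in Aaron Miller's reply. The document fields and the emit() stand-in are also illustrative. In a real map function the helper would have to be defined inside the function body.

```javascript
// Return every ordering of the given array (3 fields give 6 orderings).
function permutations(items) {
  if (items.length <= 1) return [items.slice()];
  const result = [];
  items.forEach(function (item, i) {
    const rest = items.slice(0, i).concat(items.slice(i + 1));
    permutations(rest).forEach(function (p) { result.push([item].concat(p)); });
  });
  return result;
}

// Stand-in for CouchDB's emit().
const rows = [];
function emit(key, value) { rows.push({ key: key, value: value }); }

// Hypothetical map function: one row per field ordering, prefixed with a
// tag naming the ordering so each ordering forms its own contiguous block.
function map(doc) {
  const fields = { d: doc.date, b: doc.branch, p: doc.platform };
  permutations(["d", "b", "p"]).forEach(function (order) {
    const tag = order.join("");
    emit([tag].concat(order.map(function (f) { return fields[f]; })), null);
  });
}

map({ date: "2010-09-13", branch: "4.0", platform: "Mac" });
console.log(rows.map(function (r) { return JSON.stringify(r.key); }));
```

Under these assumptions, startkey=["bpd","4.0","Mac"]&endkey=["bpd","4.0","Mac",{}] would select branch 4.0 on Mac in date order, at the cost of six rows per document.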

Or, you could use couchdb-lucene.  I'm pretty sure it does something  
very similar with a custom indexer.

stephen



Re: Filtering of view results

Posted by Dirkjan Ochtman <di...@ochtman.nl>.
On Mon, Sep 13, 2010 at 22:46, Henrik Skupin <hs...@gmail.com> wrote:
> What I really miss is a filter parameter which will be executed when
> accessing a view and which will accept/deny b-tree entries depending on the
> filter criteria. Here are some examples:

AFAICT this simply doesn't make sense from the point of view of the
b-tree indexes. You may indeed be better off using some other kind of
database.

Cheers,

Dirkjan

Re: Filtering of view results

Posted by Henrik Skupin <hs...@gmail.com>.
On Mon, Sep 13, 2010 at 12:14 AM, Dirkjan Ochtman <di...@ochtman.nl> wrote:

> > I hope that is somehow possible.
>
> It's not, you need two different views.
>

This worries me a lot, because I will never be able to filter on more than
one key at the same time. I don't want to create a dozen different views
to handle this specific requirement.

What I really miss is a filter parameter which will be executed when
accessing a view and which will accept/deny b-tree entries depending on the
filter criteria. Here are some examples:

viewName                               No filtering
viewName?filter=[{}, {}, {}]           No filtering
viewName?filter=["09/13/10", {}, {}]   Only return rows from Sep 13th 2010
viewName?filter=[{}, "4.0", {}]        Only return rows for the 4.0 application branch
viewName?filter=[{}, {}, "Mac"]        Only return rows on Mac
viewName?filter=[{}, "4.0", "Mac"]     Only return rows for the 4.0 application branch on Mac
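
To pin down the semantics being asked for: the filter parameter above is a proposal, not an existing CouchDB option. The sketch below applies the same matching rule on the client to rows that have already been fetched, with {} as the wildcard. The function names and sample rows are invented, and every row is inspected, so this is a full scan rather than an index lookup.

```javascript
// An empty object acts as the wildcard in the proposed filter syntax.
function isWildcard(v) {
  return typeof v === "object" && v !== null && !Array.isArray(v) &&
    Object.keys(v).length === 0;
}

// A key matches when every non-wildcard position is equal.
function matchesFilter(key, filter) {
  return filter.every(function (want, i) {
    return isWildcard(want) || want === key[i];
  });
}

// Full scan over already-fetched rows.
function filterRows(rows, filter) {
  return rows.filter(function (row) { return matchesFilter(row.key, filter); });
}

// Invented sample rows keyed on [date, branch, platform].
const rows = [
  { key: ["09/12/10", "3.6", "Mac"] },
  { key: ["09/12/10", "4.0", "Linux"] },
  { key: ["09/13/10", "4.0", "Mac"] }
];
console.log(filterRows(rows, [{}, "4.0", {}]));
console.log(filterRows(rows, [{}, "4.0", "Mac"]));
```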


This thread is probably similar to the "missing join" thread we have right
now. But instead of having multiple views, a single view should be entirely
sufficient. All the information we have to query on is part of the key,
so filtering on specific entries should be possible. Or is there a
limitation on the b-tree side?

For us it is a hard requirement. If it can't be solved easily with
CouchDB, we will probably have to switch to another storage backend. But I
wouldn't like to do that, because CouchDB is impressive and I want to
support it.

-- 
Henrik Skupin
QA Engineer
Mozilla Corporation

Re: Filtering of view results

Posted by James Hayton <th...@purplebulldog.com>.
This is something I have run into many times.  Wish there was a better solution than two views.

James


Re: Filtering of view results

Posted by Dirkjan Ochtman <di...@ochtman.nl>.
On Mon, Sep 13, 2010 at 09:12, Henrik Skupin <hs...@gmail.com> wrote:
> I hope that is somehow possible.

It's not, you need two different views.

Cheers,

Dirkjan

Re: Filtering of view results

Posted by Aaron Miller <ap...@ninjawhale.com>.
If date is what you want to sort by and you don't mind not being able to
intersect filters you can use your complex keys like this (emitting multiple
rows per doc), and you'll only need the one view:

emit(["branch", doc.branch, doc.date]);
emit(["platform", doc.platform, doc.date]);

Then you can query
/whateverview?startkey=["branch","4.0"]&endkey=["branch","4.0",{}]
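
When building that query in code, the JSON keys need to be URL-encoded. A minimal helper might look like the following; filterQuery is a made-up name, and /whateverview is the placeholder path from above.

```javascript
// Build the query string for one filter ("branch" or "platform") and value.
// The trailing {} in endkey sorts after any date, so the range covers every
// row whose key starts with [field, value].
function filterQuery(viewPath, field, value) {
  const startkey = JSON.stringify([field, value]);
  const endkey = JSON.stringify([field, value, {}]);
  return viewPath +
    "?startkey=" + encodeURIComponent(startkey) +
    "&endkey=" + encodeURIComponent(endkey);
}

console.log(filterQuery("/whateverview", "branch", "4.0"));
console.log(filterQuery("/whateverview", "platform", "Mac"));
```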

Combined filtering is not currently possible. If you -really- need it,
you can do a merge join (http://en.wikipedia.org/wiki/Merge_join) on the
client side, as long as both "queries" are collated by whatever you're
sorting on; in this example, the doc.date field. The gist is that when
intersecting two datasets sorted by the same field, you always know that
no match in one set can appear before the key of the last element you
looked at in the other set, so you can rule out large chunks of data at a
time.
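
A client-side merge join along those lines could be sketched like this. It assumes each result list has already been reduced to { date, id } pairs, that dates are strings which sort chronologically (e.g. ISO 8601), and that both lists arrive ordered by date and then by document id under the same comparison used here. The names and sample data are invented.

```javascript
// Order rows by date, then by document id.
function compareRows(a, b) {
  if (a.date !== b.date) return a.date < b.date ? -1 : 1;
  if (a.id !== b.id) return a.id < b.id ? -1 : 1;
  return 0;
}

// Intersect two sorted lists in one pass: advance whichever side is behind.
function mergeJoin(left, right) {
  const out = [];
  let i = 0, j = 0;
  while (i < left.length && j < right.length) {
    const c = compareRows(left[i], right[j]);
    if (c === 0) {
      out.push(left[i]);
      i++;
      j++;
    } else if (c < 0) {
      i++; // left row cannot match anything at or after right[j]
    } else {
      j++; // right row cannot match anything at or after left[i]
    }
  }
  return out;
}

// Invented results of the "branch" query and the "platform" query.
const branchRows = [
  { date: "2010-09-11", id: "a" },
  { date: "2010-09-12", id: "b" },
  { date: "2010-09-13", id: "c" },
  { date: "2010-09-13", id: "d" }
];
const platformRows = [
  { date: "2010-09-10", id: "z" },
  { date: "2010-09-12", id: "b" },
  { date: "2010-09-13", id: "d" }
];
console.log(mergeJoin(branchRows, platformRows).map(function (r) { return r.id; }));
```

Each list is walked once, so the cost is proportional to the combined length of the two results.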
