Posted to user@couchdb.apache.org by Carl Bourne <ca...@me.com> on 2012/09/18 17:51:31 UTC

Exclude documents from view based on list of regex expressions

Hi,

What's the best approach for excluding documents from a view based on a list of regular expressions? For example, I want to exclude any document where doc.issue.name matches one of the regexes in a list.

e.g. exclusion list: [/foo/, /bar/]

{
  "_id": "1",
  "issue": {
    "name": "foo"
  }
}

{
  "_id": "2",
  "issue": {
    "name": "bar"
  }
}

{
  "_id": "3",
  "issue": {
    "name": "fred"
  }
}

So based on the documents above, the view should return only the document where doc.issue.name = "fred".


Best Regards, 

Carl

Re: Exclude documents from view based on list of regex expressions

Posted by Carl Bourne <ca...@me.com>.
>>However, I'm puzzled as I would expect it to only return the documents that match the regex NOT the ones that don't. I can't see a NOT operator in the syntax anywhere!

OK, sorry, stupid question: it's an IF/ELSE statement. I'm more familiar with Ruby, so sorry to appear dumb!
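
For anyone else puzzled by the missing NOT: the early return inside the loop is what does the negating. Below is a minimal sketch of the same logic with the negation written out. It assumes a stubbed emit() so that it runs outside CouchDB, and matchesAny is a hypothetical helper name, not something from the thread or from CouchDB.

```javascript
// Stub for CouchDB's emit(), only so the sketch runs outside CouchDB.
var emitted = [];
function emit(key, value) { emitted.push([key, value]); }

// Hypothetical helper: true if the text matches at least one pattern.
function matchesAny(text, patterns) {
  return patterns.some(function (re) { return re.test(text); });
}

// Same behaviour as the loop with an early return, but the NOT is explicit.
var map = function (doc) {
  var exclusions = [/ORA/, /Hew/, /VM/];
  if (!matchesAny(doc.subject.name, exclusions)) {
    emit(doc.subject.organisation_name, 1);
  }
};

map({ subject: { name: "VMware", organisation_name: "A" } }); // excluded
map({ subject: { name: "fred", organisation_name: "B" } });   // emitted
```

In a design document only the function expression itself would be stored; the stub and the two calls at the end are there just to try it out.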


On 18 Sep 2012, at 19:57, Carl Bourne <ca...@me.com> wrote:

> Gentlemen, 
> 
> OK so this works exactly how I would like it to:
> 
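> function(doc) {
>   // Skip documents that lack the field being tested.
>   if (!doc.subject || typeof doc.subject.name !== "string") return;
>   // Exclusion patterns; the "g" flag is not needed for a yes/no test.
>   var reg_exps = [/ORA/, /Hew/, /VM/];
>   for (var i = 0; i < reg_exps.length; i++) {
>     if (reg_exps[i].test(doc.subject.name)) {
>       return; // matched an exclusion pattern, so emit nothing
>     }
>   }
>   emit(doc.subject.organisation_name, 1);
> }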
> 
> However, I'm puzzled as I would expect it to only return the documents that match the regex NOT the ones that don't. I can't see a NOT operator in the syntax anywhere!
> 
> I haven't tried the compound regex yet though!
> 


Re: Exclude documents from view based on list of regex expressions

Posted by Carl Bourne <ca...@me.com>.
Gentlemen, 

OK so this works exactly how I would like it to:

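function(doc) {
  // Skip documents that lack the field being tested.
  if (!doc.subject || typeof doc.subject.name !== "string") return;
  // Exclusion patterns; the "g" flag is not needed for a yes/no test.
  var reg_exps = [/ORA/, /Hew/, /VM/];
  for (var i = 0; i < reg_exps.length; i++) {
    if (reg_exps[i].test(doc.subject.name)) {
      return; // matched an exclusion pattern, so emit nothing
    }
  }
  emit(doc.subject.organisation_name, 1);
}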

However, I'm puzzled as I would expect it to only return the documents that match the regex NOT the ones that don't. I can't see a NOT operator in the syntax anywhere!

I haven't tried the compound regex yet though!


On 18 Sep 2012, at 18:24, Jens Alfke <je...@couchbase.com> wrote:

> 
> On Sep 18, 2012, at 10:17 AM, Aurélien Bénel <au...@utt.fr> wrote:
> 
>> Are you sure an array of regexes would be as efficient as a compound regex?
> 
> Good point — I don’t know the innards of SpiderMonkey but I think it’s pretty much a guarantee that a compound regex would be much faster. In general the fewer times you have to jump out of or into an interpreter (meaning both JS and the regex state-machine) the better.
> 
> If I recall the syntax correctly, it would look like /bar|baz/ … right?
> 
> —Jens
> 


Re: Exclude documents from view based on list of regex expressions

Posted by Jens Alfke <je...@couchbase.com>.
On Sep 18, 2012, at 10:17 AM, Aurélien Bénel <au...@utt.fr> wrote:

> Are you sure an array of regexes would be as efficient as a compound regex?

Good point — I don’t know the innards of SpiderMonkey but I think it’s pretty much a guarantee that a compound regex would be much faster. In general the fewer times you have to jump out of or into an interpreter (meaning both JS and the regex state-machine) the better.

If I recall the syntax correctly, it would look like /bar|baz/ … right?
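
For reference, here is a minimal sketch of a map function using a single compound regex with that alternation syntax. It is applied to the three example documents from the original question. The emit() stub is an assumption that only makes the sketch runnable outside CouchDB, and the emitted key and value are placeholders.

```javascript
// Stub for CouchDB's emit(), only so the sketch runs outside CouchDB.
var emitted = [];
function emit(key, value) { emitted.push([key, value]); }

var map = function (doc) {
  // One alternation instead of an array of separate patterns.
  var excluded = /foo|bar/;
  if (doc.issue && typeof doc.issue.name === "string" &&
      !excluded.test(doc.issue.name)) {
    emit(doc.issue.name, null);
  }
};

// The three example documents from the original question.
[{ _id: "1", issue: { name: "foo" } },
 { _id: "2", issue: { name: "bar" } },
 { _id: "3", issue: { name: "fred" } }].forEach(map);
```

Only the document named "fred" gets through. Note that /foo|bar/ is unanchored, so it also excludes names that merely contain "foo" or "bar".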

—Jens


Re: Exclude documents from view based on list of regex expressions

Posted by Aurélien Bénel <au...@utt.fr>.
Hi Jens,

>> If you need real regexes, I suppose you will need to build one huge regex containing them all. 
> The map function should contain an array of regexes, and match the doc.issue.name against each one in turn. If it matches any of them, just return, else emit whatever the appropriate key/value are.

Are you sure an array of regexes would be as efficient as a compound regex? Well, it's true that with JavaScript you never know... I suppose the only way to figure it out is to try both and take the faster one.
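
A rough way to try both is sketched below. The sample names and the helper names are made up for illustration, and the timings come from whatever engine runs the script, which is not necessarily the SpiderMonkey build inside CouchDB. Treat the numbers as indicative only.

```javascript
var patterns = [/ORA/, /Hew/, /VM/];
var compound = /ORA|Hew|VM/;

// Variant 1: test each pattern in turn.
function excludedByArray(name) {
  for (var i = 0; i < patterns.length; i++) {
    if (patterns[i].test(name)) return true;
  }
  return false;
}

// Variant 2: a single compound regex.
function excludedByCompound(name) {
  return compound.test(name);
}

// Made-up sample names; real timings depend on the data and the engine.
var names = [];
for (var n = 0; n < 100000; n++) {
  names.push(n % 3 === 0 ? "VMware " + n : "fred " + n);
}

// Runs fn over all names, prints the elapsed time, returns the hit count.
function time(label, fn) {
  var start = Date.now();
  var hits = 0;
  for (var i = 0; i < names.length; i++) {
    if (fn(names[i])) hits++;
  }
  console.log(label + ": " + (Date.now() - start) + " ms, " + hits + " excluded");
  return hits;
}

var hitsArray = time("array of regexes", excludedByArray);
var hitsCompound = time("compound regex", excludedByCompound);
```

Both variants must report the same number of excluded names; if they don't, the compound regex is not equivalent to the list.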


Aurélien

Re: Exclude documents from view based on list of regex expressions

Posted by Aurélien Bénel <au...@utt.fr>.
Hi Carl,

> Whats the best approach for excluding documents from a view based on a list of regex expressions. For example I want to exclude anything where doc.issue.name contains a value that matches a list of regex expressions.
> e.g. exclusion list: [/foo/, /bar/]

I am not sure what the best approach is.

For your example, you could have done:

function(o) {
  switch (o.issue.name) {
    case "foo":
    case "bar":
      break;
    default:
      emit(o.issue.name, null);
  }
}

The advantage is that you can add other excluded values very easily, but it only works with exact strings. If you need real regexes, I suppose you will need to build one big regex containing them all.
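
One possible way to build that single regex from a list is sketched here. The helper name combine is hypothetical. It relies only on the standard RegExp source property and the RegExp constructor. It drops any flags on the input patterns and does not try to handle numbered backreferences.

```javascript
// Hypothetical helper: joins the sources of several regexes into one
// alternation. Each source is wrapped in a non-capturing group so that a
// "|" inside one pattern stays local to that pattern.
function combine(patterns) {
  var sources = patterns.map(function (re) {
    return "(?:" + re.source + ")";
  });
  return new RegExp(sources.join("|"));
}

// Example: one unanchored pattern and one anchored pattern.
var excluded = combine([/foo/, /^ba[rz]$/]);

var names = ["foo", "bar", "baz", "bark", "fred"];
var kept = names.filter(function (name) { return !excluded.test(name); });
```

Inside a map function the combined regex would be built once at the top and then tested against doc.issue.name before calling emit().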


Regards,

Aurélien

Re: Exclude documents from view based on list of regex expressions

Posted by Carl Bourne <ca...@me.com>.
Hi Simon, 

Thanks for the example - much appreciated!

On 18 Sep 2012, at 18:05, Simon Metson <si...@cloudant.com> wrote:

> Hey Carl,
> I think you'll want a map like:
> 
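> function(doc) {
>   // Skip documents that have no string "name" to test.
>   if (typeof doc.name !== "string") return;
>   // Exclusion patterns; the "g" flag is not needed for a yes/no test.
>   var reg_exps = [/bar/, /baz/];
>   for (var i = 0; i < reg_exps.length; i++) {
>     if (reg_exps[i].test(doc.name)) {
>       return; // excluded, so emit nothing
>     }
>   }
>   emit(doc.name, 1);
> }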
> 
> Cheers
> Simon
> 

Re: Exclude documents from view based on list of regex expressions

Posted by Simon Metson <si...@cloudant.com>.
Hey Carl,
I think you'll want a map like:

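function(doc) {
  // Skip documents that have no string "name" to test.
  if (typeof doc.name !== "string") return;
  // Exclusion patterns; the "g" flag is not needed for a yes/no test.
  var reg_exps = [/bar/, /baz/];
  for (var i = 0; i < reg_exps.length; i++) {
    if (reg_exps[i].test(doc.name)) {
      return; // excluded, so emit nothing
    }
  }
  emit(doc.name, 1);
}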
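
A map function of this shape can be tried outside CouchDB before it goes into a design document. The sketch below is only an illustration: runMap is a hypothetical harness, emit() is a stand-in for the function CouchDB provides, and the sample documents are made up. The third document has no name field, to show that calling .match() on a missing field throws.

```javascript
// Hypothetical test harness: runs a map function over plain objects and
// collects what it emits. emit() is a stand-in for the one CouchDB provides.
var rows;
function emit(key, value) { rows.push({ key: key, value: value }); }

function runMap(mapFn, docs) {
  rows = [];
  docs.forEach(function (doc) {
    try {
      mapFn(doc);
    } catch (e) {
      // A document without the expected field makes this map function throw.
      console.log("map function threw for " + doc._id + ": " + e.message);
    }
  });
  return rows;
}

var map = function (doc) {
  var reg_exps = [/bar/, /baz/];
  for (var i = 0; i < reg_exps.length; i++) {
    if (doc.name.match(reg_exps[i])) {
      return; // excluded, so emit nothing
    }
  }
  emit(doc.name, 1);
};

var result = runMap(map, [
  { _id: "a", name: "bar" },   // excluded by /bar/
  { _id: "b", name: "fred" },  // emitted
  { _id: "c" }                 // no name: doc.name.match throws
]);
```

Checking for the field at the top of the map function avoids the exception for documents like the third one.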

Cheers
Simon

On Tue, Sep 18, 2012 at 5:57 PM, Carl Bourne <ca...@me.com> wrote:

> Thanks for the advice Jens!
>
> I'm fairly new to Couch any chance of a simple example that shows how to
> build this type of map function?
>
> Regards,
>
> Carl
>
> Carl Bourne | Senior Sales Engineer | mobile: +44 (0) 7770 284294 |
> www.venafi.com

Re: Exclude documents from view based on list of regex expressions

Posted by Carl Bourne <ca...@me.com>.
Thanks for the advice Jens!

I'm fairly new to Couch. Any chance of a simple example that shows how to build this type of map function?

Regards,

Carl

Carl Bourne | Senior Sales Engineer | mobile: +44 (0) 7770 284294 | www.venafi.com

On 18 Sep 2012, at 17:40, Jens Alfke <je...@couchbase.com> wrote:

> 
> On Sep 18, 2012, at 8:51 AM, Carl Bourne <ca...@me.com> wrote:
> 
>> Whats the best approach for excluding documents from a view based on a list of regex expressions. For example I want to exclude anything where doc.issue.name contains a value that matches a list of regex expressions.
> 
> The map function should contain an array of regexes, and match the doc.issue.name against each one in turn. If it matches any of them, just return, else emit whatever the appropriate key/value are.
> 
> —Jens

Re: Exclude documents from view based on list of regex expressions

Posted by Jens Alfke <je...@couchbase.com>.
On Sep 18, 2012, at 8:51 AM, Carl Bourne <ca...@me.com> wrote:

> Whats the best approach for excluding documents from a view based on a list of regex expressions. For example I want to exclude anything where doc.issue.name contains a value that matches a list of regex expressions.

The map function should contain an array of regexes, and match the doc.issue.name against each one in turn. If it matches any of them, just return, else emit whatever the appropriate key/value are.

—Jens