Posted to users@spamassassin.apache.org by Ray Jette <rj...@mestek.com> on 2008/12/02 17:17:07 UTC

Rule to catch PO#

Good morning,
I am trying to write a negative scoring rule that fires on the following:
PO
PO#
PO #

Following is the rule I am using:

header PO_AND_ORDERS        Subject =~ /\bPO*?#?/i
score PO_AND_ORDERS        -0.50
describe PO_AND_ORDERS    A negative scoring rule that searches the subject for PO #'s.

Thanks for any help you can provide.





Re: Rule to catch PO#

Posted by Duane Hill <d....@yournetplus.com>.
On Tue, 2 Dec 2008, Ray Jette wrote:

> Good morning,
> I am trying to write a negative scoring rule that fires on the following:
> PO
> PO#
> PO #
>
> Following is the rule I am using:
>
> header PO_AND_ORDERS        Subject =~ /\bPO*?#?/i
> score PO_AND_ORDERS        -0.50
> describe PO_AND_ORDERS    A negative scoring rule that searches the subject 
> for PO #'s. 
> Thanks for any help you can provide.

This regex will match the presence of any of the three you show above:

/PO(?: ?#)?/

Re: Rule to catch PO#

Posted by Chris Hoogendyk <ho...@bio.umass.edu>.

Ray Jette wrote:
> Thanks for all the help. I am still having issues. Let me try to 
> explain a little more. Subjects can contain the following
> PO <random #s>
> PO<random #s>
> PO# <random #s>
> PO#<random #s>
> PO # <random #s>
> PO #<random #s>
>
> I can match PO with /\bPO/i but this does not fill my requirements.
> I need to be able to match all above and I'm not sure where to start.
>
> Thank you for any help you may provide. 

/\bPO ?\#? ?[0-9]*\b/i

or

/\bPO\s?\#?\s?[0-9]*\b/i

Just construct what you want, step by step. If you don't want PO to be
contained within anything else (except the possible numbers following),
then you want the word boundary at the beginning and end. If you want a
single space, use that; if you want any whitespace, allow that instead.
The "?" makes the preceding character optional, and the "*" allows any
number (including 0) of digits. So, that ought to do it. Then, of
course, you have to incorporate it into your Perl snippet.
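
A minimal sketch of that step-by-step approach, assuming plain Perl run outside SpamAssassin. It checks the first pattern against the subject forms from the question. The sample subjects and the `po_match` helper are made up for illustration and are not part of any rule set.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern from the post: optional space, optional '#', optional space,
# then zero or more digits, with a word boundary at both ends.
my $po_re = qr/\bPO ?\#? ?[0-9]*\b/i;

# Returns the text the pattern matched, or undef if nothing matched.
sub po_match {
    my ($subject) = @_;
    return $subject =~ /($po_re)/ ? $1 : undef;
}

# Hypothetical subjects: the six forms from the question, a lower-case
# variant, and one where PO is embedded in another word.
my @subjects = (
    'PO 1234', 'PO1234', 'PO# 1234', 'PO#1234', 'PO # 1234', 'PO #1234',
    'po#345', 'this regardsPO234',
);

for my $subject (@subjects) {
    my $hit = po_match($subject);
    printf "%-20s %s\n", $subject,
        defined($hit) ? "match: '$hit'" : 'no match';
}
```

Only the last subject should come back as "no match", because there is no word boundary between "regards" and "PO".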


I tried the first of these in the following simple script (named ignore.pl):

#! /usr/local/bin/perl -w
# Routine to ignore "normal" log entries - after Marcus Ranum's "artificial
# ignorance"
#
        while (<>)
        {
          if (/\bPO ?\#? ?[0-9]*\b/i) { next }
          else {print}
        }



The script is put to use as follows:

# cat | ./ignore.pl

After that, I could type anything I wanted. If it matched, it would be
ignored. If it didn't, it would be printed back out. It matched all your
examples, including the lower case.

For example:

PO
lskdfjs
lskdfjs
this is in regard to po #2
this regardsPO234
this regardsPO234
can you grab me a PO 234
what about po#345?
what about that PO

where the non-matches got spat back at me.

Then you can play around a bit. Since " " and "#" are non-word
characters, there is a word boundary between "PO" and either of them, so
you can cut them out and use:

/\bPO\b|\bPO[0-9]+\b/i

which works as well.

For reference, I have that script in my /var/adm/  directory. I 
routinely toss several regexps in it and use it when I'm scanning log 
files to filter out the commonly occurring lines I don't want to be 
bothered by. It helps focus in on the oddities.



-- 
---------------

Chris Hoogendyk

-
   O__  ---- Systems Administrator
  c/ /'_ --- Biology & Geology Departments
 (*) \(*) -- 140 Morrill Science Center
~~~~~~~~~~ - University of Massachusetts, Amherst 

<ho...@bio.umass.edu>

--------------- 

Erdös 4



Re: Rule to catch PO#

Posted by Ray Jette <rj...@mestek.com>.
Matt Garretson wrote:
> Ray Jette wrote:
>   
>> PO <random #s>
>> PO<random #s>
>> PO# <random #s>
>> PO#<random #s>
>> PO # <random #s>
>> PO #<random #s>
>>     
>
>
> Try:
>
>   Subject =~ /PO ?\#? ?\d+/i
>
> If you don't need case insensitivity, remove the trailing 'i'.
>
>
>
>
>   
Thanks for the reply. I tried to use Subject ~
That matched PO but it did not match po. I have /i at the end.

Re: Rule to catch PO#

Posted by Matt Garretson <ma...@assembly.state.ny.us>.
Ray Jette wrote:
> PO <random #s>
> PO<random #s>
> PO# <random #s>
> PO#<random #s>
> PO # <random #s>
> PO #<random #s>


Try:

  Subject =~ /PO ?\#? ?\d+/i

If you don't need case insensitivity, remove the trailing 'i'.
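
A small sketch of what this pattern does and does not accept, assuming plain Perl outside SpamAssassin. The `has_po_number` helper and the sample subjects are hypothetical. Two properties are worth seeing: `\d+` requires at least one digit, and without a leading `\b` the pattern can also match inside a longer word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern as suggested above; \d+ demands at least one digit after the
# optional space, '#', and space.
my $po_number_re = qr/PO ?\#? ?\d+/i;

# Returns 1 if the subject contains "PO" followed by digits, else 0.
sub has_po_number {
    my ($subject) = @_;
    return $subject =~ $po_number_re ? 1 : 0;
}

# Hypothetical subjects; the last one has "po" at the end of another word.
for my $subject ('PO 1234', 'po#1234', 'PO # 1234', 'PO', 'PO #', 'Repo 12') {
    printf "%-12s %s\n", $subject,
        has_po_number($subject) ? 'match' : 'no match';
}
```

A bare "PO" or "PO #" does not match because no digits follow, while "Repo 12" does match. Adding `\b` in front of `PO` would exclude that last case.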



Re: Rule to catch PO#

Posted by Ray Jette <rj...@mestek.com>.
Thanks for all the help. I am still having issues. Let me try to explain 
a little more. Subjects can contain the following
PO <random #s>
PO<random #s>
PO# <random #s>
PO#<random #s>
PO # <random #s>
PO #<random #s>

I can match PO with /\bPO/i but this does not fill my requirements.
I need to be able to match all above and I'm not sure where to start.

Thank you for any help you may provide.

Ray


Re: Rule to catch PO#

Posted by David B Funk <db...@engineering.uiowa.edu>.
On Tue, 2 Dec 2008, Ray Jette wrote:

> Karsten Bräckelmann wrote:
> > On Tue, 2008-12-02 at 14:06 -0500, Ray Jette wrote:
> > [ *snipp* ]
> >
> > If all else fails, just save the message out of your MUA.
> >
> > You can then test with the saved file and investigate the output:
> >   spamassassin < message.file | less
> >
> That might be hard to do. I am using Exchange.

I think you're confused. MUA == Mail User Agent (AKA User-Agent in a mail
context), e.g. Thunderbird. Exchange is just a mail server.

Looking at the headers in your message, you use Thunderbird as
your MUA. Select an example message from a folder, right-click on
it, choose "Save As..", pick a file name, and it will save the
message as a ".eml" file.
You can then pipe that into spamassassin to test things.

Dave

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{

Re: Rule to catch PO#

Posted by Ray Jette <rj...@mestek.com>.
Karsten Bräckelmann wrote:
> On Tue, 2008-12-02 at 14:06 -0500, Ray Jette wrote:
> [ *snipp* ]
>
>   
>> I reset the daemon. How do I call spamassassin with the message? I'm not 
>> sure how to create a message from the server without sending one.
>>     
>
> If all else fails, just save the message out of your MUA.
>
> You can then test with the saved file and investigate the output:
>   spamassassin < message.file | less
>
>
>   
That might be hard to do. I am using Exchange.

Re: Rule to catch PO#

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2008-12-02 at 14:06 -0500, Ray Jette wrote:
[ *snipp* ]

> I reset the daemon. How do I call spamassassin with the message? I'm not 
> sure how to create a message from the server without sending one.

If all else fails, just save the message out of your MUA.

You can then test with the saved file and investigate the output:
  spamassassin < message.file | less


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: Rule to catch PO#

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2008-12-02 at 21:11 +0100, Karsten Bräckelmann wrote:

> > I created the test message and ran it through both ways. One with PO and 
> > the other with po. The rule fired on both.
> 
> Err, this is bad, isn't it?

Doh!  Ignore that line. A brain-fart made me read "with no".




Re: Rule to catch PO#

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2008-12-02 at 14:55 -0500, Ray Jette wrote:
> I created the test message and ran it through both ways. One with PO and 
> the other with po. The rule fired on both.

Err, this is bad, isn't it?

What rule *exactly* are you talking about? Copy-n-paste it from the cf
file. What file name does it come from? Are there any other, similarly
named rules?

How *exactly* did you run the test? What were the modified lines in the
test message?

> When receiving mail from the outside the rule only fires on PO and not 
> po. Is there any reason for this to happen?

You did not restart the SA incorporating daemon. Again. It's still
running with an old config. Did you --lint your configuration?

How *exactly* is SA being called in your mail processing chain? Are you
using spamd, amavis, etc...?


I have provided proof of a rule that works. Your problem is not with
that rule.  I'm out, unless we get a detailed description of your
environment, and answers to all questions above.




Re: Rule to catch PO#

Posted by John Hardin <jh...@impsec.org>.
On Thu, 4 Dec 2008, Ray Jette wrote:

> The following looks like it will work. Does anyone see any reason why this 
> would not work?
> /\bPO ?s?:?#?\d{0,10}?[a-z]{0,5}?/i

The order of your optional bits will be respected, and there's not a "v" 
or apostrophe in there, which was in one of your actual samples, and a PO 
number with dashes won't match.

Try the one I provided. It should work on alphanumeric POs as long as they 
don't _start_ with letters.
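
A rough sketch, assuming plain Perl outside SpamAssassin, of how the quoted pattern behaves. Every piece after `PO` is optional, and the last two pieces are non-greedy, so the match can stop almost immediately. The `matched_text` helper and the sample subjects are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pattern from the quoted message, unchanged.
my $ray_re = qr/\bPO ?s?:?#?\d{0,10}?[a-z]{0,5}?/i;

# Returns the exact text the pattern consumed, or undef on no match.
sub matched_text {
    my ($subject) = @_;
    return $subject =~ /($ray_re)/ ? $1 : undef;
}

# Hypothetical subjects, including ordinary words that start with "po".
for my $subject ('PO 12345', 'POs: 123', 'PO-2008-17', 'Police report', 'repo man') {
    my $hit = matched_text($subject);
    printf "%-15s %s\n", $subject,
        defined($hit) ? "matched [$hit]" : 'no match';
}
```

Because all the trailing pieces may match nothing, the pattern succeeds wherever `/\bPO/i` would. It hits on "Police report", and on "PO 12345" it consumes only "PO " without the digits. Requiring at least one digit, for example with `\d+`, would tie the rule to an actual PO number.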

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   North Korea: the only country in the world where people would risk
   execution to flee to communist China.                  -- Ride Fast
-----------------------------------------------------------------------
  11 days until Bill of Rights day

Re: Rule to catch PO#

Posted by Ray Jette <rj...@mestek.com>.
Ray Jette wrote:
> mouss wrote:
>> use your favourite editor and write a file named message.eml:
>> ------------- cut here -----------
>> Date: Tue, 02 Dec 2008 14:06:52 -0500
>> From: Ray Jette <rj...@mestek.com>
>> To: Ray Jette <rj...@mestek.com>
>> Subject: PO ney
>>
>> blah blah
>> ------------- cut here ------------
>>
>>
>> then run:
>>
>> spamassassin -t < message.eml
>>
>>
>>
>>
>>
>>   
> I created the test message and ran it through both ways. One with PO 
> and the other with po. The rule fired on both.
> When receiving mail from the outside the rule only fires on PO and not 
> po. Is there any reason for this to happen?
>
>
The following looks like it will work. Does anyone see any reason why 
this would not work?
/\bPO ?s?:?#?\d{0,10}?[a-z]{0,5}?/i

Ray

Re: Rule to catch PO#

Posted by Ray Jette <rj...@mestek.com>.
mouss wrote:
> use your favourite editor and write a file named message.eml:
> ------------- cut here -----------
> Date: Tue, 02 Dec 2008 14:06:52 -0500
> From: Ray Jette <rj...@mestek.com>
> To: Ray Jette <rj...@mestek.com>
> Subject: PO ney
>
> blah blah
> ------------- cut here ------------
>
>
> then run:
>
> spamassassin -t < message.eml
>
>
>
>
>
>   
I created the test message and ran it through both ways. One with PO and 
the other with po. The rule fired on both.
When receiving mail from the outside the rule only fires on PO and not 
po. Is there any reason for this to happen?

Re: Rule to catch PO#

Posted by Ray Jette <rj...@mestek.com>.
mouss wrote:
> use your favourite editor and write a file named message.eml:
> ------------- cut here -----------
> Date: Tue, 02 Dec 2008 14:06:52 -0500
> From: Ray Jette <rj...@mestek.com>
> To: Ray Jette <rj...@mestek.com>
> Subject: PO ney
>
> blah blah
> ------------- cut here ------------
>
>
> then run:
>
> spamassassin -t < message.eml
>
>
>
>
>
>   
Thanks, I'll give that a try. This will make my testing a lot easier to do.

Re: Rule to catch PO#

Posted by mouss <mo...@netoyen.net>.
Ray Jette a écrit :
> Karsten Bräckelmann wrote:
>> Back on-list.
>>
>> On Tue, 2008-12-02 at 13:40 -0500, Ray Jette wrote:
>>  
>>>> Yes, and it does match case insensitively.
>>>>
>>>> I guess the issue is with your testing environment. How are you testing
>>>> the rule, err, regexp for a rule?
>>>>       
>>> I sent two messages from Yahoo. One with a subject of PO and the other
>>> with a subject of po.
>>>
>>
>> Wow, that's quite a lag for debugging and testing. Try calling
>> spamassassin with the message piped in instead. Also be sure to always
>> --lint before going live.
>>
>>
>>> The rule only applied to PO.
>>>     
>>
>> You either  (a) forgot to restart the daemon, or  (b) are actually using
>> a different rule in your cf files than you pasted in your mail.
>>
>>
>>   
> I reset the daemon. How do I call spamassassin with the message? I'm not
> sure how to create a message from the server without sending one.

use your favourite editor and write a file named message.eml:
------------- cut here -----------
Date: Tue, 02 Dec 2008 14:06:52 -0500
From: Ray Jette <rj...@mestek.com>
To: Ray Jette <rj...@mestek.com>
Subject: PO ney

blah blah
------------- cut here ------------


then run:

spamassassin -t < message.eml
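
To try several subject variants in one go, the hand-written file above can be generated instead. This is only a sketch: the addresses, file names, and the `build_message` and `write_test_messages` helpers are made up for illustration, and it only writes files, leaving the actual `spamassassin -t` run to you.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Build a minimal test message with the given Subject header.
# The addresses are placeholders, not real accounts.
sub build_message {
    my ($subject) = @_;
    return join("\n",
        'Date: Tue, 02 Dec 2008 14:06:52 -0500',
        'From: sender@example.com',
        'To: recipient@example.com',
        "Subject: $subject",
        '',
        'blah blah',
        '',
    );
}

# Write one .eml file per subject into $dir and return the file paths.
sub write_test_messages {
    my ($dir, @subjects) = @_;
    my @paths;
    my $n = 0;
    for my $subject (@subjects) {
        my $path = File::Spec->catfile($dir, sprintf('message-%02d.eml', ++$n));
        open my $fh, '>', $path or die "cannot write $path: $!";
        print {$fh} build_message($subject);
        close $fh or die "cannot close $path: $!";
        push @paths, $path;
    }
    return @paths;
}

# Keep the directory around so the files can be fed to spamassassin.
my $dir   = tempdir(CLEANUP => 0);
my @paths = write_test_messages($dir, 'PO 1234', 'po 1234', 'PO# 1234', 'po #1234');
print "spamassassin -t < $_\n" for @paths;
```

The script prints one command line per generated file, so upper-case and lower-case subjects can be compared side by side.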




Re: Rule to catch PO#

Posted by Ray Jette <rj...@mestek.com>.
Karsten Bräckelmann wrote:
> Back on-list.
>
> On Tue, 2008-12-02 at 13:40 -0500, Ray Jette wrote:
>   
>>> Yes, and it does match case insensitively.
>>>
>>> I guess the issue is with your testing environment. How are you testing
>>> the rule, err, regexp for a rule?
>>>       
>> I sent two messages from Yahoo. One with a subject of PO and the other 
>> with a subject of po.
>>
>
> Wow, that's quite a lag for debugging and testing. Try calling
> spamassassin with the message piped in instead. Also be sure to always
> --lint before going live.
>
>
>> The rule only applied to PO.
>>     
>
> You either  (a) forgot to restart the daemon, or  (b) are actually using
> a different rule in your cf files than you pasted in your mail.
>
>
>   
I reset the daemon. How do I call spamassassin with the message? I'm not 
sure how to create a message from the server without sending one.

Re: Rule to catch PO#

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
Back on-list.

On Tue, 2008-12-02 at 13:40 -0500, Ray Jette wrote:
> > Yes, and it does match case insensitively.
> >
> > I guess the issue is with your testing environment. How are you testing
> > the rule, err, regexp for a rule?
> 
> I sent two messages from Yahoo. One with a subject of PO and the other 
> with a subject of po.

Wow, that's quite a lag for debugging and testing. Try calling
spamassassin with the message piped in instead. Also be sure to always
--lint before going live.

> The rule only applied to PO.

You either  (a) forgot to restart the daemon, or  (b) are actually using
a different rule in your cf files than you pasted in your mail.




Re: Rule to catch PO#

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Tue, 2008-12-02 at 13:20 -0500, Ray Jette wrote:
> I am having a lot of issues with this. Sorry but my regex skills are not 
> very good. I'm trying to learn, though. This is a skill I need to learn. 
> I decided to start at the beginning and build the expression up from 
> there. I have the following:
> /\bPO\b/i      I would assume this would match PO and po. The problem is 
> that it is only matching PO. It will not match po. Any ideas why?

Yes, and it does match case insensitively.

I guess the issue is with your testing environment. How are you testing
the rule, err, regexp for a rule?
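
A quick way to confirm this outside of any mail flow, sketched in plain Perl. The `fires` helper and the sample subjects are hypothetical and only stand in for the Subject header check.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pattern under discussion: PO as a whole word, any letter case.
my $re = qr/\bPO\b/i;

# Returns 1 if the pattern matches the subject, else 0.
sub fires {
    my ($subject) = @_;
    return $subject =~ $re ? 1 : 0;
}

# Hypothetical subjects in mixed case, plus two that should not match.
for my $subject ('PO', 'po', 'Po 1234', 'PO#', 'PO1234', 'post') {
    printf "%-8s %s\n", $subject, fires($subject) ? 'match' : 'no match';
}
```

Both "PO" and "po" match, so the /i modifier is doing its job. "PO1234" does not match, since there is no word boundary between the letters and the digits.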




Re: Rule to catch PO#

Posted by Ray Jette <rj...@mestek.com>.
Ray Jette wrote:
> Good morning,
> I am trying to write a negative scoring rule that fires on the following:
> PO
> PO#
> PO #
>
> Following is the rule I am using:
>
> header PO_AND_ORDERS        Subject =~ /\bPO*?#?/i
> score PO_AND_ORDERS        -0.50
> describe PO_AND_ORDERS    A negative scoring rule that searches the 
> subject for PO #'s.  
> Thanks for any help you can provide.
>
>
>
>
>
>
I am having a lot of issues with this. Sorry but my regex skills are not 
very good. I'm trying to learn, though. This is a skill I need to learn. 
I decided to start at the beginning and build the expression up from 
there. I have the following:
/\bPO\b/i      I would assume this would match PO and po. The problem is 
that it is only matching PO. It will not match po. Any ideas why?