Posted to users@spamassassin.apache.org by Ben Wylie <sa...@benwylie.co.uk> on 2007/02/19 22:51:02 UTC
Using ^ and $ in SA Rules
I have tried to write a rule that hits a line containing only four
capital letters, each separated by a space.
So I wrote a body rule:
/^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$/
Unfortunately, it doesn't hit when I expect it to, for example on this line:
C T C X
If I take out the ^ and $ parts, it does hit, but I would like it to
hit only if that is the only thing on a particular line.
Have I made a mistake here? How might I get a rule like this to work?
Thanks
Ben
Re: Using ^ and $ in SA Rules
Posted by Ben Wylie <sa...@benwylie.co.uk>.
Matt Kettler wrote:
> Ben Wylie wrote:
>> I have tried to write a rule which would hit a line which only
>> contains four capital letters, each separated by a space.
>>
>> so i wrote a body rule:
>> /^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$/
>>
>> unfortunately it doesn't hit when I expect it to:
>> C T C X
>>
>> If I take the ^ and $ parts, it does hit, but i would like it to only
>> hit if that is the only thing on a particular line.
>>
>> Have I made a mistake here? How might I get a rule like this to work?
> Use rawbody for this. Body rules have CR/LF stripped out.
That would explain it.
Thanks
Ben
Re: Using ^ and $ in SA Rules
Posted by Matt Kettler <mk...@verizon.net>.
Ben Wylie wrote:
> I have tried to write a rule which would hit a line which only
> contains four capital letters, each separated by a space.
>
> so i wrote a body rule:
> /^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$/
>
> unfortunately it doesn't hit when I expect it to:
> C T C X
>
> If I take the ^ and $ parts, it does hit, but i would like it to only
> hit if that is the only thing on a particular line.
>
> Have I made a mistake here? How might I get a rule like this to work?
Use rawbody for this. Body rules have CR/LF stripped out.
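The effect of keeping the line breaks can be sketched outside SpamAssassin. The snippet below is only an illustration in Python, not SpamAssassin code: it assumes a raw body that still contains its newlines, and it uses Python's re.MULTILINE as a stand-in for Perl's /m modifier. The sample text and the names RAW_BODY and PATTERN are made up for the example.

```python
import re

# Hypothetical raw body text with line breaks intact
# (roughly the kind of text a rawbody rule gets to see).
RAW_BODY = "Hot stock alert\nC T C X\nCurrently priced low\n"

# The pattern from the original question.
PATTERN = r"^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$"

# Without multiline mode, ^ and $ anchor to the whole string only.
no_flag = re.search(PATTERN, RAW_BODY)

# With multiline mode, ^ and $ also match at each line break.
with_flag = re.search(PATTERN, RAW_BODY, re.MULTILINE)

print(no_flag)             # None
print(with_flag.group(0))  # C T C X
```

So the anchors only become useful once the text still has line breaks and the pattern is run in multiline mode.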
Re: Using ^ and $ in SA Rules
Posted by Loren Wilton <lw...@earthlink.net>.
> An example email which doesn't hit can be found here:
> http://www.arkbb.co.uk/ExampleEmail.txt
I just looked again at that spam. I'm somewhat amused by the current and
projected prices:
Currently priced at: .80
Expected: .00
Loren
Re: Using ^ and $ in SA Rules
Posted by Loren Wilton <lw...@earthlink.net>.
Oh, you are going for a body rule, and the source is html. Whether the body
(which is broken into sections) starts just before the term you want is
questionable. Ah, there is also a plain text section. That makes it a
little easier.
Try this:
body FOO_SYMBOL /\n\s{0,15}[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]\s{0,15}\n/
There are a lot of things Mr. Spammer can do to get around that, but I think
it will catch this one case.
If you don't have the SARE stock rules, you should get those also. I'd
expect at least a few points from them on this thing.
Loren
Re: Using ^ and $ in SA Rules
Posted by Matt Kettler <mk...@verizon.net>.
Mark Martinec wrote:
> Theo Van Dinter writes:
>
>> body rules aren't run on lines, they're run on paragraphs,
>> so that text is in the middle of a string.
>>
>
> Matt Kettler writes:
>
>> Use rawbody for this. Body rules have CR/LF stripped out.
>>
>
> Giving whole paragraphs to regexp is fine, but why are newlines
> stripped out in 'body' rules?
In order to normalize whitespace. This way rules don't have to care
about whitespace, they can just be written normally.
Otherwise, the rule
/Hello I'm a spammer/i
would fail to match:
Hello I'm
a spammer.
SA also collapses excess spaces before running normal body rules, so
spammers can't obfuscate text by simply inserting piles of spaces.
It would be a real pain to have to rewrite the above rule as:
/Hello\s+I'm\s+a\s+spammer/i
It would also be much slower if you had to do that for a few hundred rules.
> Perl regexp modifiers m (and s)
> would be handy:
>
> body L_TEST /^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$/m
>
> but as it stands now the m modifier is of no use in 'body' rules
> (unlike in 'rawbody').
True. If you care about whitespace formatting and EOLs, use rawbody.
If you want to match text in a straightforward way, use body and let
SA's pre-processing of the text deal with simplifying whitespace.
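The whitespace point can be approximated in a few lines of Python. This is a sketch of the idea only, not SpamAssassin's actual rendering code: normalize_whitespace is a hypothetical helper, and the sample text is invented.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace (spaces, tabs, line breaks) into one space."""
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical spam text with a line break and padded spaces.
wrapped = "Hello I'm\na     spammer."

# A rule written "normally", with single literal spaces.
simple_rule = re.compile(r"Hello I'm a spammer", re.IGNORECASE)

before = simple_rule.search(wrapped)
after = simple_rule.search(normalize_whitespace(wrapped))

print(before)          # None
print(after.group(0))  # Hello I'm a spammer
```

The simple pattern misses the wrapped text but matches once the whitespace has been flattened, which is the trade-off described above: the rule stays simple, but the line structure is gone.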
Re: Using ^ and $ in SA Rules
Posted by Mark Martinec <Ma...@ijs.si>.
Theo Van Dinter writes:
> body rules aren't run on lines, they're run on paragraphs,
> so that text is in the middle of a string.
Matt Kettler writes:
> Use rawbody for this. Body rules have CR/LF stripped out.
Giving whole paragraphs to regexp is fine, but why are newlines
stripped out in 'body' rules? Perl regexp modifiers m (and s)
would be handy:
body L_TEST /^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$/m
but as it stands now the m modifier is of no use in 'body' rules
(unlike in 'rawbody').
Mark
Re: Using ^ and $ in SA Rules
Posted by Theo Van Dinter <fe...@apache.org>.
On Tue, Feb 20, 2007 at 12:26:17AM +0000, Ben Wylie wrote:
> >>so i wrote a body rule:
> >>/^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$/
> >>
> >>Have I made a mistake here? How might I get a rule like this to
> >>work?
body rules aren't run on lines, they're run on paragraphs, so that text is in
the middle of a string.
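A small Python sketch of what that means for the anchors. It is an illustration under assumptions, not SpamAssassin itself: PARAGRAPH is an invented example of a paragraph whose line breaks have already been replaced by spaces, and re.MULTILINE stands in for Perl's /m.

```python
import re

# Hypothetical paragraph as a body rule might see it: the ticker line is
# now in the middle of one long string.
PARAGRAPH = "Hot stock alert C T C X Currently priced low"

anchored = re.compile(r"^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$")
anchored_multiline = re.compile(r"^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$", re.MULTILINE)
unanchored = re.compile(r"[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]")

print(anchored.search(PARAGRAPH))             # None
print(anchored_multiline.search(PARAGRAPH))   # None, there are no line breaks left
print(unanchored.search(PARAGRAPH).group(0))  # C T C X
```

With no line breaks left in the string, ^ and $ can only match at its very start and end, with or without multiline mode, so only the unanchored pattern hits.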
Re: Using ^ and $ in SA Rules
Posted by Ben Wylie <sa...@benwylie.co.uk>.
John D. Hardin wrote:
> On Mon, 19 Feb 2007, Ben Wylie wrote:
>
>> so i wrote a body rule:
>> /^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$/
>>
>> unfortunately it doesn't hit when I expect it to:
>> C T C X
>>
>> If I take the ^ and $ parts, it does hit, but i would like it to
>> only hit if that is the only thing on a particular line.
>>
>> Have I made a mistake here? How might I get a rule like this to
>> work?
>
> There might be leading and/or trailing space. Try:
>
> /^\s?[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]\s?$/
Thanks for the suggestion.
This still doesn't work for me.
An example email which doesn't hit can be found here:
http://www.arkbb.co.uk/ExampleEmail.txt
Thanks
Ben
Re: Using ^ and $ in SA Rules
Posted by "John D. Hardin" <jh...@impsec.org>.
On Mon, 19 Feb 2007, Ben Wylie wrote:
> so i wrote a body rule:
> /^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$/
>
> unfortunately it doesn't hit when I expect it to:
> C T C X
>
> If I take the ^ and $ parts, it does hit, but i would like it to
> only hit if that is the only thing on a particular line.
>
> Have I made a mistake here? How might I get a rule like this to
> work?
There might be leading and/or trailing space. Try:
/^\s?[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]\s?$/