Posted to users@spamassassin.apache.org by Simon Loewenthal <si...@klunky.co.uk> on 2012/10/25 16:47:20 UTC

Question about rule: 2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'

Evening all,

A great majority of our ham starts with Dear Sir/ Dear Madam / Dear Bob.

Therefore I've always wondered why this is scored so highly:

*  2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'


Does anyone know the rationale behind this, or is our user base simply communicating on a higher level?  :)  I imagine the rationale is sound, but I do not know what it is.


Cheers, S

-- 
	     PGP is optional: 4BA78604
	I won't accept your confidentiality
	agreement, and your Emails are mine.
      		       ~Ö¿Ö~


Re: Question about rule: 2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'

Posted by Alexandre Boyer <bi...@gmail.com>.
Hi all,

Simon, I had some FPs because of this rule and because my threshold is
lower than 5.

I just used a score override to lower it, but this rule still hits a lot
of spam (essentially 419 scams).

You may want to fine tune the score according to your specific FPs.

Regards,

Alex, from prypiat.
Yes, I recycle.


On 12-10-25 10:57 AM, Bowie Bailey wrote:
> On 10/25/2012 10:47 AM, Simon Loewenthal wrote:
>> Evening all,
>>
>> A great majority of our ham starts with Dear Sir/ Dear Madam / Dear Bob.
>>
>> Therefore I've always wondered why this is scored so highly:
>>
>> *  2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'
>>
>>
>> Does anyone know the rationale behind this, or is our user base simply
>> communicating on a higher level?  :)  I imagine the rationale is
>> sound, but I do not know what it is.
>
> The rationale is simple.  The masscheck finds that this rule hits more
> spam than ham, so it gets a higher score.
>

Re: Question about rule: 2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'

Posted by Bowie Bailey <Bo...@BUC.com>.
On 10/25/2012 10:47 AM, Simon Loewenthal wrote:
> Evening all,
>
> A great majority of our ham starts with Dear Sir/ Dear Madam / Dear Bob.
>
> Therefore I've always wondered why this is scored so highly:
>
> *  2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'
>
>
> Does anyone know the rationale behind this, or is our user base simply communicating on a higher level?  :)  I imagine the rationale is sound, but I do not know what it is.

The rationale is simple.  The masscheck finds that this rule hits more 
spam than ham, so it gets a higher score.

-- 
Bowie

Re: Question about rule: 2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'

Posted by RW <rw...@googlemail.com>.
On Thu, 25 Oct 2012 18:59:03 +0200
Simon Loewenthal top-posted:

> RW <rw...@googlemail.com> wrote:
> 
> >On Thu, 25 Oct 2012 16:47:20 +0200
> >Simon Loewenthal wrote:

> >> *  2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'
> >> 
> >> 
> >> Does anyone know the rationale behind this, 
> >
> >So it won't hit Dear Bob, but will hit Dear Sir etc. It seems
> >reasonable, they're all forms of address that typically wouldn't be
> >used if the recipient's name were known to the sender.
> 
> Except for formal letters to administrative addresses.

Such addresses tend to get a lot of spam, while the legitimate mail is
unlikely to look particularly spammy, and likely to do well in Bayes.
I suspect that this type of DEAR_SOMETHING FP has a relatively small
effect on the classification FP rate at the default threshold.

If you have a particular problem with it you might want to shift the
bulk of the score to a meta rule that excludes problematic addresses.



Re: Dear Dfs (was Re: Question about rule: 2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)')

Posted by Adam Moffett <ad...@plexicomm.net>.
> Here's an argument for *not* making your email address "first@example.com",
> "last@example.com" or something like that.
>
> I believe an email to me starting "Dear Dfs," has 100% probability of
> being spam.  If my email address were "david@roaringpenguin.com"
> instead, I'd get a lot of FPs on "Dear David,"
>
> So there you go... use obscure local parts in your email addresses. ;)
>
I couldn't agree more.  I've always used my name and my wife has always 
used a very long word from a fictional language.  She gets almost no 
spam even without filtering.  I think she's somewhat immune to her 
address being discovered by dictionary attacks.

Dear Dfs (was Re: Question about rule: 2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)')

Posted by "David F. Skoll" <df...@roaringpenguin.com>.
On Thu, 25 Oct 2012 18:59:03 +0200
Simon Loewenthal <si...@klunky.co.uk> wrote:

> Except for formal letters to administrative addresses.
> Dear Bob was a frivolous and incorrect example. It is really Sir/Madam

Here's an argument for *not* making your email address "first@example.com",
"last@example.com" or something like that.

I believe an email to me starting "Dear Dfs," has 100% probability of
being spam.  If my email address were "david@roaringpenguin.com"
instead, I'd get a lot of FPs on "Dear David,"

So there you go... use obscure local parts in your email addresses. ;)

Regards,

David.

Re: Question about rule: 2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'

Posted by Simon Loewenthal <si...@klunky.co.uk>.
Except for formal letters to administrative addresses.
Dear Bob was a frivolous and incorrect example. It is really Sir/Madam.

As Alex noted, I could score it lower, but I am concerned about the overall effect. I'll test first.

Cheers.

RW <rw...@googlemail.com> wrote:

>On Thu, 25 Oct 2012 16:47:20 +0200
>Simon Loewenthal wrote:
>
>> 
>> Evening all,
>> 
>> A great majority of our ham starts with Dear Sir/ Dear Madam / Dear
>> Bob.
>> 
>> Therefore I've always wondered why this is scored so highly: 
>> 
>> *  2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'
>> 
>> 
>> Does anyone know the rationale behind this, or is our user base simply
>> communicating on a higher level?  :)  I imagine the rationale is
>> sound, but I do not know what it is.
>> 
>> 
>
>The test is
>
>/\bDear (?:IT\W|Internet|candidate|sirs?|madam|investor|travell?er|car shopper|web)\b/i
>
>So it won't hit Dear Bob, but will hit Dear Sir etc. It seems
>reasonable, they're all forms of address that typically wouldn't be
>used if the recipient's name were known to the sender.


Re: Question about rule: 2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'

Posted by RW <rw...@googlemail.com>.
On Thu, 25 Oct 2012 16:47:20 +0200
Simon Loewenthal wrote:

> 
> Evening all,
> 
> A great majority of our ham starts with Dear Sir/ Dear Madam / Dear
> Bob.
> 
> Therefore I've always wondered why this is scored so highly: 
> 
> *  2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'
> 
> 
> Does anyone know the rationale behind this, or is our user base simply
> communicating on a higher level?  :)  I imagine the rationale is
> sound, but I do not know what it is.
> 
> 

The test is

/\bDear (?:IT\W|Internet|candidate|sirs?|madam|investor|travell?er|car shopper|web)\b/i

So it won't hit Dear Bob, but will hit Dear Sir etc. It seems
reasonable, they're all forms of address that typically wouldn't be
used if the recipient's name were known to the sender.
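
For anyone who wants to check a few greetings against the pattern quoted above, here is a minimal sketch using Python's re module rather than SpamAssassin's own Perl engine. It assumes the line break in the quoted pattern was mail wrapping, so "car shopper" is written with a single space, and it skips whatever preprocessing SpamAssassin applies to the body before matching. The helper name hits_dear_something is made up for illustration.

```python
import re

# Pattern as quoted in the thread, joined onto one line.
# Assumption: the space in "car shopper" is where the mail client wrapped.
DEAR_SOMETHING = re.compile(
    r"\bDear (?:IT\W|Internet|candidate|sirs?|madam|investor|travell?er|car shopper|web)\b",
    re.IGNORECASE,
)


def hits_dear_something(text: str) -> bool:
    """Return True if the quoted pattern matches anywhere in the text."""
    return DEAR_SOMETHING.search(text) is not None


if __name__ == "__main__":
    # Print a hit/miss line for a few sample greetings.
    for greeting in ("Dear Sir,", "Dear Madam,", "Dear Sirs,", "Dear Bob,", "Dear Dfs,"):
        print(f"{greeting!r}: {hits_dear_something(greeting)}")
```

The generic forms (Sir, Madam, Sirs) match, while greetings that use a name or a local part (Bob, Dfs) do not, which is consistent with the explanation above.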