Posted to users@spamassassin.apache.org by Robert Menschel <Ro...@Menschel.net> on 2004/08/12 15:40:30 UTC
Re[2]: spammer headers: X-Mailer
Hello John,
Wednesday, August 11, 2004, 3:16:43 PM, you wrote:
>> On Wed, 2004-08-11 at 12:08, Procmail Security daemon wrote:
>> > Headers from message:
>> > > X-Mailer: moo airlift
>> > > eater-carabao: agee delusive tid
JH> "X-Mailer: [random words]" is a good indicator, the problem is how do
JH> you tell (in a program) the words are random? A person can pick it up
JH> pretty easily because it doesn't look like the name of a real program.
header SARE_XMAIL_SUSP2 X-Mailer =~ /^(?:[a-z]{4,20}[\-\.\,]? ){2,8}/ # no /i, trailing space
describe SARE_XMAIL_SUSP2 X-Mailer suggests spam
score SARE_XMAIL_SUSP2 1.666
#hist SARE_XMAIL_SUSP2 Loren Wilton, LW_BOGUS_MAILER
#counts SARE_XMAIL_SUSP2 370s/0h of 58338 corpus (33610s/24728h RM) 08/07/04
#max SARE_XMAIL_SUSP2 677s/0h of 85084 corpus (62489s/22595h RM) 06/08/04
#counts SARE_XMAIL_SUSP2 78s/0h of 32586 corpus (9341s/23245h JH) 06/10/04
#counts SARE_XMAIL_SUSP2 59s/0h of 17050 corpus (14617s/2433h MY) 08/08/04
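For anyone who wants to poke at the pattern outside SpamAssassin, here is a rough sketch in Python. It assumes Python's re module treats this particular expression the same way Perl does, and the helper name and sample values are made up for illustration. Note that each counted word needs at least four lowercase letters and a following space, so a value like "moo airlift" would not hit, since "moo" is too short.

```python
import re

# Python transcription of the SARE_XMAIL_SUSP2 pattern quoted above.
# Case-sensitive on purpose (the rule has no /i), and every counted word
# must be followed by a literal space.
XMAIL_SUSP2 = re.compile(r'^(?:[a-z]{4,20}[\-\.\,]? ){2,8}')

def xmailer_looks_random(value):
    """Return True if the X-Mailer value matches the pattern (hypothetical helper)."""
    return XMAIL_SUSP2.search(value) is not None

# Hypothetical sample values, not taken from any corpus.
samples = [
    "eater carabao delusive",          # two lowercase words, each followed by a space
    "agee delusive",                   # only one word is followed by a space
    "moo airlift",                     # first word is shorter than four letters
    "Microsoft Outlook Express 6.00",  # starts with an uppercase letter
]
for value in samples:
    print(repr(value), xmailer_looks_random(value))
```

Only the first sample should report True. This says nothing about how SpamAssassin prepares the header value before matching; it only exercises the regular expression itself.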
JH> And random-word all-lowercase headers (the "eater-carabao" header above)
JH> are also a good indicator, but again, how does a program recognize words
JH> are random and don't make sense in the context? The fact that it's all
JH> lowercase might be worth a few tenths of a point towards spam
JH> independent of the actual content.
header SARE_MULT_LCASE_X2 ALL =~ /\n[a-z-]+: [a-z ,.]+\n/
describe SARE_MULT_LCASE_X2 Contains all lc header, all lc value
score SARE_MULT_LCASE_X2 1.666
#hist SARE_MULT_LCASE_X2 SARE_TM2_RW_UNSC
#counts SARE_MULT_LCASE_X2 338s/0h of 60211 corpus (35236s/24975h RM) 08/11/04
#counts SARE_MULT_LCASE_X2 113s/0h of 32586 corpus (9341s/23245h JH) 06/10/04
#counts SARE_MULT_LCASE_X2 0s/0h of 17050 corpus (14617s/2433h MY) 08/08/04
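The same kind of sketch works for the all-lowercase header rule. This assumes the headers are available as one newline-separated string, which is only an approximation of what SpamAssassin's ALL pseudo-header provides; the helper name and the sample header blocks are invented for illustration.

```python
import re

# Python transcription of the SARE_MULT_LCASE_X2 pattern quoted above.
# The leading \n means the very first header line can never match, and the
# trailing \n means the whole value must be lowercase letters, spaces,
# commas, and periods.
MULT_LCASE_X2 = re.compile(r'\n[a-z-]+: [a-z ,.]+\n')

def has_lowercase_header(headers):
    """Return True if some header line has an all-lowercase name and value (hypothetical helper)."""
    return MULT_LCASE_X2.search(headers) is not None

# Hypothetical header blocks; addresses use the reserved .invalid TLD.
suspicious = (
    "Received: from mail.example.invalid\n"
    "X-Mailer: moo airlift\n"
    "eater-carabao: agee delusive tid\n"
    "Subject: hello\n"
)
ordinary = (
    "From: someone@example.invalid\n"
    "Subject: hello\n"
)
print(has_lowercase_header(suspicious))
print(has_lowercase_header(ordinary))
```

In the first block it is the "eater-carabao" line that triggers the match; the X-Mailer line does not, because the header name itself contains uppercase letters.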
Both of these rules are in 70_sare_header0.cf.
Bob Menschel