Posted to users@spamassassin.apache.org by Nigel Frankcom <ni...@blue-canoe.net> on 2007/01/25 07:44:03 UTC

Drug spam, some caught some not - none caught by drug rules

Hi All,

Does anyone have any idea why there are such scoring disparities
between these two emails? I've been seeing a few of these creep
through lately.

http://dev.blue-canoe.net/spam/spam01.txt
http://dev.blue-canoe.net/spam/spam02.txt
http://dev.blue-canoe.net/spam/spam03.txt
http://dev.blue-canoe.net/spam/spam04.txt

More to the point with these is why are they not hitting any of the
drugs rules?

All help gratefully received

Nigel

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Nigel Frankcom <ni...@blue-canoe.net>.

On Thu, 25 Jan 2007 02:40:30 -0500, Matt Kettler
<mk...@verizon.net> wrote:

>Nigel Frankcom wrote:
>> Hi All,
>>
>> Does anyone have any idea why there are such scoring disparities
>> between these two emails? I've been seeing a few of these creep
>> through lately.
>>
>> http://dev.blue-canoe.net/spam/spam01.txt
>> http://dev.blue-canoe.net/spam/spam02.txt
>> http://dev.blue-canoe.net/spam/spam03.txt
>> http://dev.blue-canoe.net/spam/spam04.txt
>>
>> More to the point with these is why are they not hitting any of the
>> drugs rules?
>
>There's a few million obfuscation methods, and the rules can't always
>cover em all.
>
>The examples you posted are using "duplicated letters", as well as
>inserted underscores.
>
>The old Antidrug rules (part of xx_drugs.cf now) that I wrote will deal
>with the underscores, and a wide range of character substitutions, but
>only a few special-cases of insertions.
>
>It's taken the spammers a long time to figure that out, but it appears
>they finally have.
>
>I used to have to update the set constantly, but lately I've been a bit
>too busy with real life.


Thanks for the info, I'll see what I can do locally to stop them.

Kind regards

Nigel
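
The two tricks Matt names above (duplicated letters and inserted underscores) can be illustrated with a small normalisation step. This is only a sketch in Python, not how the drugs rules in SpamAssassin are implemented; the keyword list and the function names `squash` and `find_obfuscated` are made up for the example. It will not catch other insertions, such as extra unrelated letters.

```python
import re

# Hypothetical keyword list for illustration only; not taken from any ruleset.
KEYWORDS = ("viagra", "cialis", "levitra")


def squash(word: str) -> str:
    """Lowercase, drop underscores, and collapse runs of a repeated character."""
    word = word.lower().replace("_", "")
    return re.sub(r"(.)\1+", r"\1", word)


# Map the squashed form of each keyword back to the keyword itself, so that
# keywords containing legitimate double letters would still compare equal.
SQUASHED = {squash(k): k for k in KEYWORDS}


def find_obfuscated(text: str) -> list[str]:
    """Return the keywords whose squashed form appears among the tokens of text."""
    hits = []
    for token in re.findall(r"[A-Za-z_]+", text):
        keyword = SQUASHED.get(squash(token))
        if keyword:
            hits.append(keyword)
    return hits


if __name__ == "__main__":
    print(find_obfuscated("Buy V_ii_aagra and Ciaaliis now"))
```

Squashing both sides before comparing is the design choice here: it avoids listing every possible duplication by hand, at the cost of also matching any ordinary word that happens to squash to the same string.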

Re: Re: Drug spam, some caught some not - none caught by drug rules

Posted by Rich Shepard <rs...@appl-ecosys.com>.

On Fri, 26 Jan 2007, Ben Wylie wrote:

> On top of these rules, I have written a rule to give 4 points to any email
> with an .exe attachment as there have been a lot of these. With the above
> rules and the 4 for having an exe attachment, it hits a rating of 12. The
> rule I have for detecting the exe attachment is this:
>
> full EXE_ATTACH /file(?:name)?=\".*\.exe/i
> score EXE_ATTACH 4.0
>
> I'm not sure if there is a better way of writing it, but it works for me.

   Thank you. I've appended that to local.cf.

Rich

-- 
Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.        |          Accelerator(TM)
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Re: Drug spam, some caught some not - none caught by drug rules

Posted by Ben Wylie <sa...@benwylie.co.uk>.

Rich Shepard wrote:
> Andy et al.:
> 
>   You can use <wget
>           http://www.appl-ecosys.com/temp-files/analyzed-spam.tgz>.
> 
>   I'll leave it there for a day. Any insight into how to better trap this
> type of spam would be welcome. I have a few other representative types, 
> too.

*  2.0 BOTNET Relay might be a spambot or virusbot
*      [botnet0.7,ip=65.123.242.225,nordns]
*  0.5 BAYES_50 BODY: Bayesian spam probability is 40 to 60%
*      [score: 0.5580]
*  1.6 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
*      [Blocked - see <http://www.spamcop.net/bl.shtml?65.123.242.225>]
*  3.9 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
*      [65.123.242.225 listed in zen.spamhaus.org]

On top of these rules, I have written a rule to give 4 points to any 
email with an .exe attachment as there have been a lot of these.
With the above rules and the 4 for having an exe attachment, it hits a 
rating of 12.
The rule I have for detecting the exe attachment is this:

full EXE_ATTACH /file(?:name)?=\".*\.exe/i
score EXE_ATTACH 4.0

I'm not sure if there is a better way of writing it, but it works for me.

Cheers,
Ben
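
For anyone wanting to see what the EXE_ATTACH pattern above does and does not match, here is a rough check using Python's re module, which treats this particular pattern the same way Perl does. The sample header lines are invented, and `EXE_ATTACH_STRICT` is a hypothetical tighter variant added for comparison, not something proposed in the thread.

```python
import re

# Ben's pattern from the post, compiled case-insensitively like the /i flag.
EXE_ATTACH = re.compile(r'file(?:name)?=\".*\.exe', re.IGNORECASE)

# Hypothetical tighter variant (an assumption, not from the thread): stay
# inside the quoted value and require .exe to be the final extension.
EXE_ATTACH_STRICT = re.compile(r'file(?:name)?="[^"\r\n]*\.exe"', re.IGNORECASE)


def hits(pattern: re.Pattern, line: str) -> bool:
    """True when the pattern matches anywhere in the given line."""
    return pattern.search(line) is not None


# Invented sample header lines, for illustration only.
EXE_LINE = 'Content-Disposition: attachment; filename="invoice.EXE"'
PDF_LINE = 'Content-Type: application/pdf; name="report.pdf"'
TXT_LINE = 'Content-Disposition: attachment; filename="setup.exe.txt"'

if __name__ == "__main__":
    for line in (EXE_LINE, PDF_LINE, TXT_LINE):
        print(hits(EXE_ATTACH, line), hits(EXE_ATTACH_STRICT, line), line)
```

The original pattern also matches a name such as setup.exe.txt because nothing anchors ".exe" to the end of the quoted value; whether that matters is a judgment call for a rule scored at 4.0.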

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Rich Shepard <rs...@appl-ecosys.com>.

On Thu, 25 Jan 2007, Andy Figueroa wrote:

> Rich, if you can post the output as text files to a web site somewhere and
> just send the link/url, that's the kindest way to do this.  And then if I
> knew what I was doing, I'd go look at them and analyze them for you. 
> Though it won't be me, I'm sure someone will.

Andy et al.:

   You can use <wget
           http://www.appl-ecosys.com/temp-files/analyzed-spam.tgz>.

   I'll leave it there for a day. Any insight into how to better trap this
type of spam would be welcome. I have a few other representative types, too.
But, Friday evening I run sa-learn on my spam-uncaught message file and
delete them.

Thanks,

Rich

-- 
Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.        |          Accelerator(TM)
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Andy Figueroa <fi...@andyfigueroa.net>.

Rich, if you can post the output as text files to a web site somewhere 
and just send the link/url, that's the kindest way to do this.  And then 
if I knew what I was doing, I'd go look at them and analyze them for 
you.  Though it won't be me, I'm sure someone will.

Andy Figueroa

Rich Shepard wrote:
> On Thu, 25 Jan 2007, Matt Kettler wrote:
> 
>> The proper command would be:
>>
>> spamassassin -D bayes < message1 2> debug1.txt
> 
>   OK. I have a spam message that made it to my inbox today. Empty body, the
> spam base64 encoded. SA gave it a score of 0 this morning.
> 
>   I've run it through the debug process per the above, but I've no idea how
> to interpret the results or learn from them what -- if anything -- 
> should be
> tweaked.
> 
>   How should I make the message and debug output tarball available?
> 
> Rich
>

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Rich Shepard <rs...@appl-ecosys.com>.

On Thu, 25 Jan 2007, Matt Kettler wrote:

> The proper command would be:
>
> spamassassin -D bayes < message1 2> debug1.txt

   OK. I have a spam message that made it to my inbox today. Empty body, the
spam base64 encoded. SA gave it a score of 0 this morning.

   I've run it through the debug process per the above, but I've no idea how
to interpret the results or learn from them what -- if anything -- should be
tweaked.

   How should I make the message and debug output tarball available?

Rich

-- 
Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.        |          Accelerator(TM)
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Andy Figueroa <fi...@andyfigueroa.net>.

Thanks, again, Matt.  I need all the help I can get.  I've only been 
managing my own SpamAssassin installations (two mailservers) for about 
four months and still have a lot to learn.

Andy

Matt Kettler wrote:
> Andy Figueroa wrote:
>> You can capture the debug output by using:
>> spamassassin -D -t < message1 2> debug1.txt
> 
> Andy, you're missing something VERY important here. They need BAYES
> debugging, not general debugging. And using -t here is pointless. Won't
> hurt, but serves no useful purpose. (-t forces SA to mark the message up
> and generate a report like it would for spam, even if the score isn't
> over the threshold.)
> 
> The proper command would be:
> 
> spamassassin -D bayes < message1 2> debug1.txt

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Matt Kettler <mk...@verizon.net>.

Andy Figueroa wrote:
> Thanks, Matt.  That sounds like a good suggestion.
>
> Nigel, since you have the emails, if you could capture the debug
> output in a file and post like you did the messages, perhaps someone
> wise could evaluate what is going on.
>
> You can capture the debug output by using:
> spamassassin -D -t < message1 2> debug1.txt

Andy, you're missing something VERY important here. They need BAYES
debugging, not general debugging. And using -t here is pointless. Won't
hurt, but serves no useful purpose. (-t forces SA to mark the message up
and generate a report like it would for spam, even if the score isn't
over the threshold.)

The proper command would be:

spamassassin -D bayes < message1 2> debug1.txt
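
If several messages need the same treatment, the command above can be wrapped in a few lines of Python. This is a sketch only: the function name `capture_bayes_debug` and the file names are placeholders, and unlike the shell command it discards the marked-up message on stdout rather than printing it.

```python
import subprocess

# Same invocation as the shell command: debug output limited to the bayes area.
BAYES_DEBUG_COMMAND = ["spamassassin", "-D", "bayes"]


def capture_bayes_debug(message_path: str, debug_path: str) -> int:
    """Feed one message to spamassassin on stdin and save stderr to a file.

    Equivalent in spirit to: spamassassin -D bayes < message 2> debug
    Returns the exit status of the spamassassin process.
    """
    with open(message_path, "rb") as message, open(debug_path, "wb") as debug:
        result = subprocess.run(
            BAYES_DEBUG_COMMAND,
            stdin=message,              # the "< message1" part
            stdout=subprocess.DEVNULL,  # rewritten message is not needed here
            stderr=debug,               # the "2> debug1.txt" part
        )
    return result.returncode


if __name__ == "__main__":
    # Placeholder file names; adjust to wherever the samples are saved.
    for n in range(1, 5):
        capture_bayes_debug(f"spam0{n}.txt", f"debug{n}.txt")
```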

>
> Matt Kettler wrote:
>>
>> BAYES changes are easily explained by the header changes, but a deeper
>> analysis would involve running through spamassassin -D bayes and looking
>> at the exact tokens.
>>
>

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Nigel Frankcom <ni...@blue-canoe.net>.

On Thu, 25 Jan 2007 10:28:21 -0500, Andy Figueroa
<fi...@andyfigueroa.net> wrote:

>Thanks, Matt.  That sounds like a good suggestion.
>
>Nigel, since you have the emails, if you could capture the debug output 
>in a file and post like you did the messages, perhaps someone wise could 
>evaluate what is going on.
>
>You can capture the debug output by using:
>spamassassin -D -t < message1 2> debug1.txt
>
>Andy Figueroa
>
>Matt Kettler wrote:
>> Andy Figueroa wrote:
>>> Matt (but not just to Matt), I don't understand your reply (though I
>>> am deeply in your debt for the work you do for this community).  The
>>> sample emails that Nigel posted are identical in content, including
>>> obfuscation.  I've noted the same situation.  Yet, the scoring is
>>> really different. On the low scoring ones, DCC and RAZOR2 didn't hit,
>>> and the BAYES score is different.  The main differences are in the
>>> headers' different forged From and To addresses.  I thought these
>>> samples were worthy of deeper analysis.
>> 
>> Well, there might be other analysis worth making.
>> 
>>  However,  Nigel asked why the drugs rules weren't matching. I answered
>> that question alone.
>> 
>> Not sure why the change in razor/dcc happened.
>> 
>> BAYES changes are easily explained by the header changes, but a deeper
>> analysis would involve running through spamassassin -D bayes and looking
>> at the exact tokens.
>> 

I'll sit down with a beer later and run the debug on them. In the
meantime Steve Basford from sanesecurity.com has added them to the
Clam add on I mentioned a while back. 

Their main download point is
http://sanesecurity.com/clamav/downloads.htm (in my experience here
it's worked very well indeed). For those of you that are interested
and are running multiple servers contact me off list for the URL to
the scripts James Rallo mod'd for updating multiple backend servers
(or you can hunt back through the mail archives for it :-D).

Kind regards

Nigel

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Nigel Frankcom <ni...@blue-canoe.net>.

On Fri, 26 Jan 2007 13:54:03 +0000, Ben Wylie
<sa...@benwylie.co.uk> wrote:

>I recommend the KAM rules list which can be found here:
>http://www.peregrinehw.com/downloads/SpamAssassin/contrib/KAM.cf
>This catches the drugs names in these emails.
>
>Cheers,
>Ben
>
>Nigel Frankcom wrote:
>> On Thu, 25 Jan 2007 20:16:42 -0500, Matt Kettler
>> <mk...@verizon.net> wrote:
>> 
>>> Nigel Frankcom wrote:
>>>> Debug results are available on: 
>>>> http://dev.blue-canoe.net/spam/spam01.txt
>>>> http://dev.blue-canoe.net/spam/debug1.txt
>>>>
>>>> http://dev.blue-canoe.net/spam/spam02.txt
>>>> http://dev.blue-canoe.net/spam/debug2.txt
>>>>
>>>> http://dev.blue-canoe.net/spam/spam03.txt
>>>> http://dev.blue-canoe.net/spam/debug3.txt
>>>>
>>>> http://dev.blue-canoe.net/spam/spam04.txt
>>>> http://dev.blue-canoe.net/spam/debug4.txt
>>>>
>>>> Make of them what you will, I think I need more beer before that lot
>>>> makes much sense :-D
>>>>
>>>> Kind regards
>>>>
>>>> Nigel
>>>>   
>>> Sorry Nigel. Andy steered you a bit wrong and those debug outputs are
>>> useless.. You need "-D bayes" not just "-D".
>>>
>>> Try it again with:
>>>
>>> spamassassin -D bayes < message1 2> debug1.txt
>>>
>>> Instead of
>>> spamassassin -D -t < message1 2> debug1.txt
>>>
>> 
>> Files redone... a little more informative this time round :-D
>> 
>>  http://dev.blue-canoe.net/spam/spam01.txt
>>  http://dev.blue-canoe.net/spam/debug1.txt
>> 
>>  http://dev.blue-canoe.net/spam/spam02.txt
>>  http://dev.blue-canoe.net/spam/debug2.txt
>> 
>>  http://dev.blue-canoe.net/spam/spam03.txt
>>  http://dev.blue-canoe.net/spam/debug3.txt
>> 
>>  http://dev.blue-canoe.net/spam/spam04.txt
>>  http://dev.blue-canoe.net/spam/debug4.txt
>> 

Thanks Ben,

Training seems to have resolved the short term problem, I'll pull a
copy of that rule and if the problem strikes again I'll run it in.

Kind regards

Nigel

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Nigel Frankcom <ni...@blue-canoe.net>.

On Mon, 29 Jan 2007 10:18:33 +0100, "D Ivago" <ba...@gmail.com>
wrote:

>> On Fri, 26 Jan 2007, Jim Maul wrote:
>>
>> > Those are the DEFAULT rules.  Do not add/remove/modify anything in this
>> > folder.
>> >
>> > custom rules go in /etc/mail/spamassassin/
>
>
>So basically you just need to 'cd /etc/mail/spamassassin'
>and 'wget http://www.peregrinehw.com/downloads/SpamAssassin/contrib/KAM.cf'
>into this folder and restart spamassassin? Or do I need to refer to his
>KAM.cf file in local.cf or something so SA knows it's there?
>
>kind regards,
>
>ivago

Just wget into /etc/mail/spamassassin then run spamassassin --lint
(just to check) and restart if --lint comes back with no reports.

The mails I saw were scoring 5.9 from the KAM rules, DCC added another
2.0 and I think they picked up various small scores as well, all
together it put them way over my threshold.

The rules run clean here on 3.1.7

regards

Nigel
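
The "restart only if --lint comes back with no reports" step above can be scripted. The sketch below is in Python and makes one assumption worth stating: that a clean lint means exit status 0 and no output at all. The function names are invented, and the restart itself is left out because the command differs between systems.

```python
import subprocess


def lint_is_clean(returncode: int, output: str) -> bool:
    """Treat the lint as clean only if it exited 0 and reported nothing."""
    return returncode == 0 and not output.strip()


def run_lint() -> bool:
    """Run `spamassassin --lint` and report whether it came back clean."""
    result = subprocess.run(
        ["spamassassin", "--lint"],
        capture_output=True,
        text=True,
    )
    return lint_is_clean(result.returncode, result.stdout + result.stderr)


if __name__ == "__main__":
    if run_lint():
        print("lint clean; safe to restart")
    else:
        print("lint reported problems; fix the rules before restarting")
```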

Re: Drug spam, some caught some not - none caught by drug rules

Posted by D Ivago <ba...@gmail.com>.

> On Fri, 26 Jan 2007, Jim Maul wrote:
>
> > Those are the DEFAULT rules.  Do not add/remove/modify anything in this
> > folder.
> >
> > custom rules go in /etc/mail/spamassassin/

So basically you just need to 'cd /etc/mail/spamassassin'
and 'wget http://www.peregrinehw.com/downloads/SpamAssassin/contrib/KAM.cf'
into this folder and restart spamassassin? Or do I need to refer to his
KAM.cf file in local.cf or something so SA knows it's there?

kind regards,

ivago

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Rich Shepard <rs...@appl-ecosys.com>.

On Fri, 26 Jan 2007, Jim Maul wrote:

> Those are the DEFAULT rules.  Do not add/remove/modify anything in this
> folder.
>
> custom rules go in /etc/mail/spamassassin/

   OK. I'll put the new ones there.

> You really need to have a better understanding of the basics of SA.  I'd 
> suggest going over the documentation again. Specifically: 
> http://wiki.apache.org/spamassassin/WhereDoLocalSettingsGo

   Sure will -- this weekend.

Rich

-- 
Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.        |          Accelerator(TM)
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Jim Maul <jm...@elih.org>.

Rich Shepard wrote:
> On Fri, 26 Jan 2007, Rich Shepard wrote:
> 
>>  Where do I put this file so it's seen and used by SpamAssassin?
> 
>   Nevermind. I put it in /usr/share/spamassassin/ with all the other .cf
> files.
> 
> Rich
> 

nooooooo

Those are the DEFAULT rules.  Do not add/remove/modify anything in this 
folder.

custom rules go in /etc/mail/spamassassin/

You really need to have a better understanding of the basics of SA.  I'd 
suggest going over the documentation again. Specifically: 
http://wiki.apache.org/spamassassin/WhereDoLocalSettingsGo

-Jim

Re: Re: Drug spam, some caught some not - none caught by drug rules

Posted by Rich Shepard <rs...@appl-ecosys.com>.

On Fri, 26 Jan 2007, Rich Shepard wrote:

>  Where do I put this file so it's seen and used by SpamAssassin?

   Nevermind. I put it in /usr/share/spamassassin/ with all the other .cf
files.

Rich

-- 
Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.        |          Accelerator(TM)
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Re: Drug spam, some caught some not - none caught by drug rules

Posted by Rich Shepard <rs...@appl-ecosys.com>.

On Fri, 26 Jan 2007, Ben Wylie wrote:

> I recommend the KAM rules list which can be found here:
> http://www.peregrinehw.com/downloads/SpamAssassin/contrib/KAM.cf This
> catches the drugs names in these emails.

Ben,

   Where do I put this file so it's seen and used by SpamAssassin?

Thanks,

Rich

-- 
Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.        |          Accelerator(TM)
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Stefan Hornburg <ra...@linuxia.de>.

Nigel Frankcom wrote:
> On Sun, 28 Jan 2007 14:51:21 -0500, "Tim Boyer" <ti...@denmantire.com>
> wrote:
> 
> 
>>One thing I've noticed is that Polyakov is starting to obfuscate the URL.
>>What would normally be caught because it's in the Spamhaus SBL is getting
>>missed because of this:
>>
>>Good day,
>>
>>Viazzgra  $1, 80
>>Ciazzlis  $3, 00
>>Levizztra $3, 35
>>
>>http://www.printeryml.*com ( Important ! Remove "*" )
>>
> 
> 
> I saw a few of those hit over the weekend, they got caught with a
> combination of DCC, bayes and the KAM.cf mentioned earlier in the
> week. They also tagged a modified test rule I'm running at the mo. 
> 
> body Test_01 /remove \"\*\"/i
> score Test_01 0.1
> describe Test_01 Test remove asterisk for URL spams
> 
> Warning, the above has not been mass checked and is running here only
> as a test. I can imagine instances where that would hit ham,
> particularly where some people obfuscate their email address.
> 
> No doubt a different character will be substituted for the * in due
> course.

How about letting SpamAssassin remove invalid characters like that from the
URL before passing it to the URL blacklists? Different characters can
be handled by making this configurable.

Bye
      Racke

-- 
LinuXia Systems => http://www.linuxia.de/
Expert Interchange Consulting and System Administration
ICDEVGROUP => http://www.icdevgroup.org/
Interchange Development Team
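
The idea above, stripping a configurable set of junk characters from URLs before any blacklist lookup, might look roughly like this in Python. It is a sketch of the suggestion only and says nothing about how SpamAssassin extracts URLs; `clean_urls` and `DEFAULT_JUNK` are invented names, and a real version would probably restrict the stripping to the host part, since "*" can legitimately appear elsewhere in a URL.

```python
import re

# Characters to strip; configurable, as suggested. "*" is the one seen so far.
DEFAULT_JUNK = "*"

# Deliberately crude URL matcher: scheme, then everything up to whitespace.
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def clean_urls(text: str, junk: str = DEFAULT_JUNK) -> list[str]:
    """Return every URL found in text with the junk characters removed."""
    table = str.maketrans("", "", junk)
    return [url.translate(table) for url in URL_RE.findall(text)]


if __name__ == "__main__":
    sample = 'http://www.printeryml.*com ( Important ! Remove "*" )'
    print(clean_urls(sample))
```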

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Nigel Frankcom <ni...@blue-canoe.net>.

On Sun, 28 Jan 2007 14:51:21 -0500, "Tim Boyer" <ti...@denmantire.com>
wrote:

>One thing I've noticed is that Polyakov is starting to obfuscate the URL.
>What would normally be caught because it's in the Spamhaus SBL is getting
>missed because of this:
>
>Good day,
> 
>Viazzgra  $1, 80
>Ciazzlis  $3, 00
>Levizztra $3, 35
> 
>http://www.printeryml.*com ( Important ! Remove "*" )
> 

I saw a few of those hit over the weekend, they got caught with a
combination of DCC, bayes and the KAM.cf mentioned earlier in the
week. They also tagged a modified test rule I'm running at the mo. 

body Test_01 /remove \"\*\"/i
score Test_01 0.1
describe Test_01 Test remove asterisk for URL spams

Warning, the above has not been mass checked and is running here only
as a test. I can imagine instances where that would hit ham,
particularly where some people obfuscate their email address.

No doubt a different character will be substituted for the * in due
course.

Kind regards

Nigel
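
The ham risk mentioned above is easy to demonstrate outside SpamAssassin. The following Python sketch compiles the same pattern as Test_01 and runs it against the spam line from the thread plus an invented ham line in which someone obfuscates an email address; the `matches` helper and the ham sample are assumptions for illustration.

```python
import re

# Same pattern as the Test_01 body rule, with /i mapped to IGNORECASE.
TEST_01 = re.compile(r'remove \"\*\"', re.IGNORECASE)


def matches(text: str) -> bool:
    """True when the Test_01 pattern occurs anywhere in text."""
    return TEST_01.search(text) is not None


# Line taken from the spam quoted in the thread.
SPAM_LINE = 'http://www.printeryml.*com ( Important ! Remove "*" )'

# Invented ham example: a person obfuscating their own address.
HAM_LINE = 'Mail me at bob*@example.org (remove "*" first)'

if __name__ == "__main__":
    print(matches(SPAM_LINE), matches(HAM_LINE))
```

Both lines match, which is consistent with keeping the score at 0.1 while the rule is only being tested.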

RE: Re: Drug spam, some caught some not - none caught by drug rules

Posted by Tim Boyer <ti...@denmantire.com>.

One thing I've noticed is that Polyakov is starting to obfuscate the URL.
What would normally be caught because it's in the Spamhaus SBL is getting
missed because of this:

Good day,
 
Viazzgra  $1, 80
Ciazzlis  $3, 00
Levizztra $3, 35
 
http://www.printeryml.*com ( Important ! Remove "*" )
 
-- 
Tim Boyer 
Director
Information Systems and Engineering Projects
Denman Tire Corporation
tim@denmantire.com

Re: Re: Drug spam, some caught some not - none caught by drug rules

Posted by Ben Wylie <sa...@benwylie.co.uk>.

Hi Andy and Dave,

I asked the same question of Daryl back in November, and this was his 
response:

 > I'm not aware of Kevin publishing a channel for his rules, although he
 > does have commit access to SpamAssassin, so I'd hope that he would
 > commit his rules to SA for inclusion (upon meeting rule promotion
 > criteria) in the updates.spamassassin.org channel.

I have not found a channel to update it from, myself. If anyone has, 
then perhaps they could post details.

Cheers,
Ben



Andy Figueroa wrote:
> Ben, or others.  I've been experimenting with the KAM.cf rules and find them 
> quite helpful.  Is there a means of keeping these up-to-date, or are 
> they possibly on their way in to the standard set of rules?
> 
> Andy Figueroa
> 
> Ben Wylie wrote:
>> I recommend the KAM rules list which can be found here:
>> http://www.peregrinehw.com/downloads/SpamAssassin/contrib/KAM.cf
>> This catches the drugs names in these emails.
>>
>> Cheers,
>> Ben

RE: Drug spam, some caught some not - none caught by drug rules

Posted by Dave Koontz <dk...@mbc.edu>.

Same here.  I've been very impressed with this ruleset so far. 

-----Original Message-----
From: Andy Figueroa [mailto:figueroa@andyfigueroa.net] 
Sent: Saturday, January 27, 2007 9:23 AM
To: users@spamassassin.apache.org
Subject: Re: Drug spam, some caught some not - none caught by drug rules

Ben, or others.  I've been experimenting with the KAM.cf rules and find them
quite helpful.  Is there a means of keeping these up-to-date, or are they
possibly on their way in to the standard set of rules?

Andy Figueroa

Ben Wylie wrote:
> I recommend the KAM rules list which can be found here:
> http://www.peregrinehw.com/downloads/SpamAssassin/contrib/KAM.cf
> This catches the drugs names in these emails.
> 
> Cheers,
> Ben

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Andy Figueroa <fi...@andyfigueroa.net>.

Ben, or others.  I've been experimenting with the KAM.cf rules and find 
them quite helpful.  Is there a means of keeping these up-to-date, or 
are they possibly on their way in to the standard set of rules?

Andy Figueroa

Ben Wylie wrote:
> I recommend the KAM rules list which can be found here:
> http://www.peregrinehw.com/downloads/SpamAssassin/contrib/KAM.cf
> This catches the drugs names in these emails.
> 
> Cheers,
> Ben

Re: Re: Drug spam, some caught some not - none caught by drug rules

Posted by Ben Wylie <sa...@benwylie.co.uk>.

I recommend the KAM rules list which can be found here:
http://www.peregrinehw.com/downloads/SpamAssassin/contrib/KAM.cf
This catches the drugs names in these emails.

Cheers,
Ben

Nigel Frankcom wrote:
> On Thu, 25 Jan 2007 20:16:42 -0500, Matt Kettler
> <mk...@verizon.net> wrote:
> 
>> Nigel Frankcom wrote:
>>> Debug results are available on: 
>>> http://dev.blue-canoe.net/spam/spam01.txt
>>> http://dev.blue-canoe.net/spam/debug1.txt
>>>
>>> http://dev.blue-canoe.net/spam/spam02.txt
>>> http://dev.blue-canoe.net/spam/debug2.txt
>>>
>>> http://dev.blue-canoe.net/spam/spam03.txt
>>> http://dev.blue-canoe.net/spam/debug3.txt
>>>
>>> http://dev.blue-canoe.net/spam/spam04.txt
>>> http://dev.blue-canoe.net/spam/debug4.txt
>>>
>>> Make of them what you will, I think I need more beer before that lot
>>> makes much sense :-D
>>>
>>> Kind regards
>>>
>>> Nigel
>>>   
>> Sorry Nigel. Andy steered you a bit wrong and those debug outputs are
>> useless.. You need "-D bayes" not just "-D".
>>
>> Try it again with:
>>
>> spamassassin -D bayes < message1 2> debug1.txt
>>
>> Instead of
>> spamassassin -D -t < message1 2> debug1.txt
>>
> 
> Files redone... a little more informative this time round :-D
> 
>  http://dev.blue-canoe.net/spam/spam01.txt
>  http://dev.blue-canoe.net/spam/debug1.txt
> 
>  http://dev.blue-canoe.net/spam/spam02.txt
>  http://dev.blue-canoe.net/spam/debug2.txt
> 
>  http://dev.blue-canoe.net/spam/spam03.txt
>  http://dev.blue-canoe.net/spam/debug3.txt
> 
>  http://dev.blue-canoe.net/spam/spam04.txt
>  http://dev.blue-canoe.net/spam/debug4.txt
> 
> Kind regards
> 
> Nigel

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Nigel Frankcom <ni...@blue-canoe.net>.

On Fri, 26 Jan 2007 09:16:09 -0500, Matt Kettler
<mk...@verizon.net> wrote:

>Nigel Frankcom wrote:
>>
>> Files redone... a little more informative this time round :-D
>>
>>  http://dev.blue-canoe.net/spam/spam01.txt
>>  http://dev.blue-canoe.net/spam/debug1.txt
>>
>>  http://dev.blue-canoe.net/spam/spam02.txt
>>  http://dev.blue-canoe.net/spam/debug2.txt
>>
>>  http://dev.blue-canoe.net/spam/spam03.txt
>>  http://dev.blue-canoe.net/spam/debug3.txt
>>
>>  http://dev.blue-canoe.net/spam/spam04.txt
>>  http://dev.blue-canoe.net/spam/debug4.txt
>>
>>   
>
>Well, it looks like whatever caused spam01 to hit bayes_99 and spam03 to
>hit bayes_80 is gone.. based on debug3, spam03 would now hit bayes_99
>more strongly than spam01 would.
>
>So whatever caused the slight bayes dropout has been trained out of your
>system now..

It occurred to me after I did the debug I'd already trained the misses
in.

Thanks for taking a look though.

Kind regards

Nigel

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Matt Kettler <mk...@verizon.net>.

Nigel Frankcom wrote:
>
> Files redone... a little more informative this time round :-D
>
>  http://dev.blue-canoe.net/spam/spam01.txt
>  http://dev.blue-canoe.net/spam/debug1.txt
>
>  http://dev.blue-canoe.net/spam/spam02.txt
>  http://dev.blue-canoe.net/spam/debug2.txt
>
>  http://dev.blue-canoe.net/spam/spam03.txt
>  http://dev.blue-canoe.net/spam/debug3.txt
>
>  http://dev.blue-canoe.net/spam/spam04.txt
>  http://dev.blue-canoe.net/spam/debug4.txt
>
>   

Well, it looks like whatever caused spam01 to hit bayes_99 and spam03 to
hit bayes_80 is gone.. based on debug3, spam03 would now hit bayes_99
more strongly than spam01 would.

So whatever caused the slight bayes dropout has been trained out of your
system now..

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Nigel Frankcom <ni...@blue-canoe.net>.

On Thu, 25 Jan 2007 20:16:42 -0500, Matt Kettler
<mk...@verizon.net> wrote:

>Nigel Frankcom wrote:
>> Debug results are available on: 
>> http://dev.blue-canoe.net/spam/spam01.txt
>> http://dev.blue-canoe.net/spam/debug1.txt
>>
>> http://dev.blue-canoe.net/spam/spam02.txt
>> http://dev.blue-canoe.net/spam/debug2.txt
>>
>> http://dev.blue-canoe.net/spam/spam03.txt
>> http://dev.blue-canoe.net/spam/debug3.txt
>>
>> http://dev.blue-canoe.net/spam/spam04.txt
>> http://dev.blue-canoe.net/spam/debug4.txt
>>
>> Make of them what you will, I think I need more beer before that lot
>> makes much sense :-D
>>
>> Kind regards
>>
>> Nigel
>>   
>
>Sorry Nigel. Andy steered you a bit wrong and those debug outputs are
>useless.. You need "-D bayes" not just "-D".
>
>Try it again with:
>
>spamassassin -D bayes < message1 2> debug1.txt
>
>Instead of
>spamassassin -D -t < message1 2> debug1.txt
>

Files redone... a little more informative this time round :-D

 http://dev.blue-canoe.net/spam/spam01.txt
 http://dev.blue-canoe.net/spam/debug1.txt

 http://dev.blue-canoe.net/spam/spam02.txt
 http://dev.blue-canoe.net/spam/debug2.txt

 http://dev.blue-canoe.net/spam/spam03.txt
 http://dev.blue-canoe.net/spam/debug3.txt

 http://dev.blue-canoe.net/spam/spam04.txt
 http://dev.blue-canoe.net/spam/debug4.txt

Kind regards

Nigel

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Matt Kettler <mk...@verizon.net>.

Nigel Frankcom wrote:
> Debug results are available on: 
> http://dev.blue-canoe.net/spam/spam01.txt
> http://dev.blue-canoe.net/spam/debug1.txt
>
> http://dev.blue-canoe.net/spam/spam02.txt
> http://dev.blue-canoe.net/spam/debug2.txt
>
> http://dev.blue-canoe.net/spam/spam03.txt
> http://dev.blue-canoe.net/spam/debug3.txt
>
> http://dev.blue-canoe.net/spam/spam04.txt
> http://dev.blue-canoe.net/spam/debug4.txt
>
> Make of them what you will, I think I need more beer before that lot
> makes much sense :-D
>
> Kind regards
>
> Nigel
>   

Sorry Nigel. Andy steered you a bit wrong and those debug outputs are
useless.. You need "-D bayes" not just "-D".

Try it again with:

spamassassin -D bayes < message1 2> debug1.txt

Instead of
spamassassin -D -t < message1 2> debug1.txt

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Nigel Frankcom <ni...@blue-canoe.net>.

On Thu, 25 Jan 2007 10:28:21 -0500, Andy Figueroa
<fi...@andyfigueroa.net> wrote:

>Thanks, Matt.  That sounds like a good suggestion.
>
>Nigel, since you have the emails, if you could capture the debug output 
>in a file and post like you did the messages, perhaps someone wise could 
>evaluate what is going on.
>
>You can capture the debug output by using:
>spamassassin -D -t < message1 2> debug1.txt
>
>Andy Figueroa
>
>Matt Kettler wrote:
>> Andy Figueroa wrote:
>>> Matt (but not just to Matt), I don't understand your reply (though I
>>> am deeply in your debt for the work you do for this community).  The
>>> sample emails that Nigel posted are identical in content, including
>>> obfuscation.  I've noted the same situation.  Yet, the scoring is
>>> really different. On the low scoring ones, DCC and RAZOR2 didn't hit,
>>> and the BAYES score is different.  The main differences are in the
>>> headers' different forged From and To addresses.  I thought these
>>> samples were worthy of deeper analysis.
>> 
>> Well, there might be other analysis worth making.
>> 
>>  However,  Nigel asked why the drugs rules weren't matching. I answered
>> that question alone.
>> 
>> Not sure why the change in razor/dcc happened.
>> 
>> BAYES changes are easily explained by the header changes, but a deeper
>> analysis would involve running through spamassassin -D bayes and looking
>> at the exact tokens.
>> 

Debug results are available on: 
http://dev.blue-canoe.net/spam/spam01.txt
http://dev.blue-canoe.net/spam/debug1.txt

http://dev.blue-canoe.net/spam/spam02.txt
http://dev.blue-canoe.net/spam/debug2.txt

http://dev.blue-canoe.net/spam/spam03.txt
http://dev.blue-canoe.net/spam/debug3.txt

http://dev.blue-canoe.net/spam/spam04.txt
http://dev.blue-canoe.net/spam/debug4.txt

Make of them what you will, I think I need more beer before that lot
makes much sense :-D

Kind regards

Nigel

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Andy Figueroa <fi...@andyfigueroa.net>.

Thanks, Matt.  That sounds like a good suggestion.

Nigel, since you have the emails, if you could capture the debug output 
in a file and post like you did the messages, perhaps someone wise could 
evaluate what is going on.

You can capture the debug output by using:
spamassassin -D -t < message1 2> debug1.txt

Andy Figueroa

Matt Kettler wrote:
> Andy Figueroa wrote:
>> Matt (but not just to Matt), I don't understand your reply (though I
>> am deeply in your debt for the work you do for this community).  The
>> sample emails that Nigel posted are identical in content, including
>> obfuscation.  I've noted the same situation.  Yet, the scoring is
>> really different. On the low scoring ones, DCC and RAZOR2 didn't hit,
>> and the BAYES score is different.  The main differences are in the
>> headers' different forged From and To addresses.  I thought these
>> samples were worthy of deeper analysis.
> 
> Well, there might be other analysis worth making.
> 
>  However,  Nigel asked why the drugs rules weren't matching. I answered
> that question alone.
> 
> Not sure why the change in razor/dcc happened.
> 
> BAYES changes are easily explained by the header changes, but a deeper
> analysis would involve running through spamassassin -D bayes and looking
> at the exact tokens.
>

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Matt Kettler <mk...@verizon.net>.

Andy Figueroa wrote:
> Matt (but not just to Matt), I don't understand your reply (though I
> am deeply in your debt for the work you do for this community).  The
> sample emails that Nigel posted are identical in content, including
> obfuscation.  I've noted the same situation.  Yet, the scoring is
> really different. On the low scoring ones, DCC and RAZOR2 didn't hit,
> and the BAYES score is different.  The main differences are in the
> headers' different forged From and To addresses.  I thought these
> samples were worthy of deeper analysis.

Well, there might be other analysis worth making.

However, Nigel asked why the drugs rules weren't matching. I answered
that question alone.

Not sure why the change in razor/dcc happened.

BAYES changes are easily explained by the header changes, but a deeper
analysis would involve running through spamassassin -D bayes and looking
at the exact tokens.

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Andy Figueroa <fi...@andyfigueroa.net>.

Matt (but not just to Matt), I don't understand your reply (though I am 
deeply in your debt for the work you do for this community).  The sample
emails that Nigel posted are identical in content, including 
obfuscation.  I've noted the same situation.  Yet, the scoring is really 
different. On the low scoring ones, DCC and RAZOR2 didn't hit, and the 
BAYES score is different.  The main differences are in the headers' 
different forged From and To addresses.  I thought these samples were 
worthy of deeper analysis.
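
One way to make this kind of disparity concrete is to compare the rule
names each copy hit. The following is a minimal Python sketch, assuming
the messages carry an X-Spam-Status header with a tests= list. The two
sample headers (HIGH and LOW) and the helper name rules_hit are invented
for illustration; they are not the actual headers of the posted messages.

```python
import re
from email import message_from_string


def rules_hit(raw_message: str) -> set:
    """Return the rule names listed after tests= in X-Spam-Status."""
    status = message_from_string(raw_message).get("X-Spam-Status", "")
    # The tests= list may be folded across lines, so capture everything
    # up to the next "key=" field (or the end of the header value).
    m = re.search(r"tests=(.*?)(?=\s+\w+=|\s*$)", status, re.DOTALL)
    if not m:
        return set()
    names = {n.strip() for n in m.group(1).split(",")}
    return names - {"", "none"}


# Invented sample headers, for illustration only.
HIGH = ("X-Spam-Status: Yes, score=9.1 required=5.0 tests=BAYES_99,DCC_CHECK,\n"
        "\tRAZOR2_CHECK autolearn=no version=3.1.7\n"
        "Subject: sample\n\nbody\n")
LOW = ("X-Spam-Status: No, score=2.0 required=5.0 tests=BAYES_50 "
       "autolearn=no version=3.1.7\n"
       "Subject: sample\n\nbody\n")

# Rules that fired on the high-scoring copy but not on the low-scoring one.
missing = rules_hit(HIGH) - rules_hit(LOW)
print(sorted(missing))
```

A set difference like this only shows which rules differ, not why; the
network checks and Bayes tokens still need the debug output to explain.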

Sincerely,
Andy Figueroa

Matt Kettler wrote:
> Nigel Frankcom wrote:
>> Hi All,
>>
>> Does anyone have any idea why there are such scoring disparities
>> between these two emails? I've been seeing a few of these creep
>> through lately.
>>
>> http://dev.blue-canoe.net/spam/spam01.txt
>> http://dev.blue-canoe.net/spam/spam02.txt
>> http://dev.blue-canoe.net/spam/spam03.txt
>> http://dev.blue-canoe.net/spam/spam04.txt
>>
>> More to the point with these is why are they not hitting any of the
>> drugs rules?
> 
> There's a few million obfuscation methods, and the rules can't always
> cover em all.
> 
> The examples you posted are using "duplicated letters", as well as
> inserted underscores.
> 
> The old Antidrug rules (part of xx_drugs.cf now) that I wrote will deal
> with the underscores, and a wide range of character substitutions, but
> only a few special-cases of insertions.
> 
> It's taken the spammers a long time to figure that out, but it appears
> they finally have.
> 
> I used to have to update the set constantly, but lately I've been a bit
> too busy with real life.

Re: Drug spam, some caught some not - none caught by drug rules

Posted by Matt Kettler <mk...@verizon.net>.

Nigel Frankcom wrote:
> Hi All,
>
> Does anyone have any idea why there are such scoring disparities
> between these two emails? I've been seeing a few of these creep
> through lately.
>
> http://dev.blue-canoe.net/spam/spam01.txt
> http://dev.blue-canoe.net/spam/spam02.txt
> http://dev.blue-canoe.net/spam/spam03.txt
> http://dev.blue-canoe.net/spam/spam04.txt
>
> More to the point with these is why are they not hitting any of the
> drugs rules?

There's a few million obfuscation methods, and the rules can't always
cover em all.

The examples you posted are using "duplicated letters", as well as
inserted underscores.

The old Antidrug rules (part of xx_drugs.cf now) that I wrote will deal
with the underscores, and a wide range of character substitutions, but
only a few special-cases of insertions.
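
To illustrate the two tricks named above (duplicated letters and
inserted underscores), here is a minimal Python sketch of a pattern that
tolerates both. It is not how the rules in xx_drugs.cf are written; the
function name obfuscation_pattern and the test strings are assumptions
made up for this example.

```python
import re


def obfuscation_pattern(word: str) -> re.Pattern:
    """Build a regex that matches `word` even when letters are repeated
    or underscores are inserted between letters."""
    # Each letter may repeat one or more times ("ii" for "i").
    letters = [re.escape(ch) + "+" for ch in word]
    # Any number of underscores may sit between consecutive letters.
    body = "_*".join(letters)
    # Do not match when embedded in a longer run of letters.
    return re.compile(r"(?<![A-Za-z])" + body + r"(?![A-Za-z])", re.IGNORECASE)


pat = obfuscation_pattern("viagra")
for text in ("Buy Viiaagra now", "V_i_a_g_r_a", "vigra"):
    print(text, "->", bool(pat.search(text)))
```

Allowing every letter to repeat is deliberately loose. On short or
common words it would match ordinary text, so a real rule set would
need to restrict it to a list of distinctive terms.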

It's taken the spammers a long time to figure that out, but it appears
they finally have.

I used to have to update the set constantly, but lately I've been a bit
too busy with real life.