Posted to users@spamassassin.apache.org by omehegan <ow...@nerdnetworks.org> on 2009/06/17 20:18:01 UTC

Lots of 419/scam and investment spams getting through suddenly

I'm running SpamAssassin 3.2.1 on Linux, with spamd integrated with Postfix.
I use SPF, greylisting, and bayes. Lately a lot of 419 and investment spams
have been getting through with very low SA scores. Can anyone take a look at
these and see if there's another ruleset I should use to trap them? I see
that they all appear in DCC now, but they didn't when I received them.
Thanks in advance...

http://www.nerdnetworks.org/spam/spam1
http://www.nerdnetworks.org/spam/spam2
http://www.nerdnetworks.org/spam/spam3
http://www.nerdnetworks.org/spam/spam4
http://www.nerdnetworks.org/spam/spam5
http://www.nerdnetworks.org/spam/spam6
-- 
View this message in context: http://www.nabble.com/Lots-of-419-scam-and-investment-spams-getting-through-suddenly-tp24079208p24079208.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


Re: Lots of 419/scam and investment spams getting through suddenly

Posted by John Hardin <jh...@impsec.org>.
On Wed, 2009-06-17 at 11:18 -0700, omehegan wrote:
> Lately a lot of 419 and investment spams
> have been getting through with very low SA scores. Can anyone take a look at
> these and see if there's another ruleset I should use to trap them?

One thing I've been fiddling with for a while is a ruleset to detect
fill-in-the-form type stuff that you see a lot in scam emails. I've
recently modified it to use ReplaceTags, as the older non-tokenized
version has reached the point of unmaintainability.

If you're willing to try beta rules, you are welcome to download a
patched ReplaceTags plugin that implements multipass, and the FillForm
ruleset. As always, reduce the scores somewhat at first until you gain
confidence in the rules.

I get fairly good results against the fraud spams I get, but the results
against the SA masscheck are disappointing. I'd like to think that's
because the spam corpora don't have a lot of scam messages... :)

I'd appreciate some feedback if you do try the rules out, especially any
false positives with FILL_THIS_FORM_LONG.

http://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/ReplaceTags.pm

http://svn.apache.org/viewvc/spamassassin/rules/trunk/sandbox/jhardin/20_fillform.cf

-- 
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79


Re: Lots of 419/scam and investment spams getting through suddenly

Posted by Anthony Peacock <a....@chime.ucl.ac.uk>.
Hi,

My results below...

omehegan wrote:
> 

<SNIP>

>>
> 
> Here are two more of a type that have been getting through CONSTANTLY.
> They're always almost exactly the same, and I keep training them into my
> bayes DB but it's not hitting on them :(
> 
> http://www.nerdnetworks.org/spam/spam7

Content analysis details:   (9.3 points, 5.0 required)

  pts rule name              description
---- ---------------------- --------------------------------------------------
  3.5 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                             [score: 1.0000]
  2.1 SUBJ_ALL_CAPS          Subject is all capitals
  0.6 US_DOLLARS_3           BODY: Mentions millions of $ ($NN,NNN,NNN.NN)
  0.6 J_CHICKENPOX_73        BODY: 7alpha-pock-3alpha
  0.5 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
  1.5 RAZOR2_CF_RANGE_E4_51_100 Razor2 gives engine 4 confidence level
                             above 50%
                             [cf: 100]
  0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
                             [cf: 100]


> http://www.nerdnetworks.org/spam/spam8

Content analysis details:   (7.5 points, 5.0 required)

  pts rule name              description
---- ---------------------- --------------------------------------------------
  3.5 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                             [score: 1.0000]
  1.5 MILLION_USD            BODY: Talks about millions of dollars
  2.1 SUBJ_ALL_CAPS          Subject is all capitals
  0.4 AWL                    AWL: From: address is in the auto white-list


It looks like my Bayes is trained to be better at picking these up.
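
For anyone comparing reports like the two above, here is a minimal Python sketch that pulls the rule hits out of a "Content analysis details" block and sums them. It assumes the report layout shown in this message (score, rule name, description, with indented continuation lines); the function names are made up for illustration and are not part of SpamAssassin. On the spam7 report it shows that BAYES_99 alone supplies 3.5 of the 9.3 points.

```python
import re

# Report body copied from the spam7 analysis above.
REPORT = """\
  3.5 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                             [score: 1.0000]
  2.1 SUBJ_ALL_CAPS          Subject is all capitals
  0.6 US_DOLLARS_3           BODY: Mentions millions of $ ($NN,NNN,NNN.NN)
  0.6 J_CHICKENPOX_73        BODY: 7alpha-pock-3alpha
  0.5 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
  1.5 RAZOR2_CF_RANGE_E4_51_100 Razor2 gives engine 4 confidence level
                             above 50%
                             [cf: 100]
  0.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
                             [cf: 100]
"""

# A rule line is: optional spaces, a decimal score, then an upper-case rule name.
# Continuation lines (descriptions, [score: ...]) do not start with a number.
RULE_LINE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z0-9_]+)\b")


def rule_hits(report):
    """Return a list of (rule_name, score) pairs found in the report text."""
    hits = []
    for line in report.splitlines():
        match = RULE_LINE.match(line)
        if match:
            hits.append((match.group(2), float(match.group(1))))
    return hits


def total_score(report):
    """Sum the per-rule scores, rounded to one decimal like the report header."""
    return round(sum(score for _, score in rule_hits(report)), 1)


if __name__ == "__main__":
    for name, score in rule_hits(REPORT):
        print(f"{score:5.1f} {name}")
    print("total:", total_score(REPORT))
```

Running the same function over the output of a poorly trained installation makes it easy to see how much of the gap comes from the Bayes rule and how much from the other tests.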

-- 
Anthony Peacock
CHIME, UCL Medical School
WWW:    http://www.chime.ucl.ac.uk/~rmhiajp/
Study Health Informatics - Modular Postgraduate Degree
http://www.chime.ucl.ac.uk/study-health-informatics/

Re: Lots of 419/scam and investment spams getting through suddenly

Posted by omehegan <ow...@nerdnetworks.org>.


John Hardin wrote:
> 
> That's not what I asked - are you _training_ as that user? That's often 
> the problem when bayes isn't behaving the way you expect.
> 
> sa-update won't bring 3.2.1 up to 3.2.5; you're not getting the up-to-date 
> rules, which may catch those.
> 
> That said, I'm getting really poor scores on those from my 3.2.5 testbed 
> (which does not have a trained bayes), so upgrading might not help much...
> 

Yes, I'm training as user 'bayes' and SA is running as user 'bayes.' I get
the sense from other replies that maybe my bayes DB is underperforming. I
believe I've read some stuff on verifying and fixing that, so I'll Google.
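
As a first check, something along these lines might help. It is only a sketch: it parses the text printed by `sa-learn --dump magic` (run as the same user that SA runs as, 'bayes' in my case) and reports the learned spam and ham counts. The sample output below is an illustrative placeholder with made-up numbers, not my real database, and the helper names are hypothetical. The 200/200 thresholds reflect the default minimums SpamAssassin wants before it applies Bayes at all; adjust them if your configuration overrides those.

```python
import re

# Illustrative placeholder output; these numbers are made up, not from a real DB.
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        150          0  non-token data: nspam
0.000          0       4200          0  non-token data: nham
0.000          0      98000          0  non-token data: ntokens
"""

# Each "magic" line has four numeric columns followed by a label;
# the third column carries the value we care about.
MAGIC_LINE = re.compile(r"^\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (.+)$")


def parse_magic(dump):
    """Map each non-token label (e.g. 'nspam') to its integer value."""
    values = {}
    for line in dump.splitlines():
        match = MAGIC_LINE.match(line.strip())
        if match:
            values[match.group(2).strip()] = int(match.group(1))
    return values


def bayes_is_active(values, min_spam=200, min_ham=200):
    """True if both learned counts reach the assumed minimum thresholds."""
    return (values.get("nspam", 0) >= min_spam
            and values.get("nham", 0) >= min_ham)


if __name__ == "__main__":
    magic = parse_magic(SAMPLE_DUMP)
    print("nspam:", magic.get("nspam"), "nham:", magic.get("nham"))
    print("bayes active:", bayes_is_active(magic))
```

If the counts do not move after a training run, that would suggest the training is going into a different database than the one spamd reads.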

I'll upgrade to 3.2.5 anyway - it's easy enough for me. For some reason I
thought that minor releases only included rules changes, and that these
would be propagated to earlier minor releases via sa-update.


Re: Lots of 419/scam and investment spams getting through suddenly

Posted by John Hardin <jh...@impsec.org>.
On Wed, 17 Jun 2009, omehegan wrote:

Please trim irrelevant content when you reply, thanks.

> I have site-wide bayes, and yeah its rules are owned by the same user 
> that SA is running as.

That's not what I asked - are you _training_ as that user? That's often 
the problem when bayes isn't behaving the way you expect.

> The leakers are not being autolearned as ham.

Good.

> I could upgrade SA, I didn't think that would help because I do run
> sa-update every night at midnight.

sa-update won't bring 3.2.1 up to 3.2.5; you're not getting the up-to-date 
rules, which may catch those.

That said, I'm getting really poor scores on those from my 3.2.5 testbed 
(which does not have a trained bayes), so upgrading might not help much...

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   A sword is never a killer, it is but a tool in the killer's hands.
                           -- Lucius Annaeus Seneca (Martial) 4BC-65AD
-----------------------------------------------------------------------
  Today: SWMBO's Birthday

Re: Lots of 419/scam and investment spams getting through suddenly

Posted by omehegan <ow...@nerdnetworks.org>.


John Hardin wrote:
> 
> On Wed, 17 Jun 2009, omehegan wrote:
> 
>>>>> http://www.nerdnetworks.org/spam/spam1
>>>>> http://www.nerdnetworks.org/spam/spam2
>>>>> http://www.nerdnetworks.org/spam/spam3
>>>>> http://www.nerdnetworks.org/spam/spam4
>>>>> http://www.nerdnetworks.org/spam/spam5
>>>>> http://www.nerdnetworks.org/spam/spam6
>>
>> Here are two more of a type that have been getting through CONSTANTLY.
>> They're always almost exactly the same, and I keep training them into my
>> bayes DB but it's not hitting on them :(
>>
>> http://www.nerdnetworks.org/spam/spam7
>> http://www.nerdnetworks.org/spam/spam8
> 
> The highest score on any of those was BAYES_60, most were BAYES_50, and 
> one was BAYES_20. Bayes training seems to be a big problem.
> 
> If you are not running per-user bayes, are you sure you're training as the 
> same user that SA is running as?
> 
> Are any of the leakers being autolearned as ham?
> 
> I'm surprised by how few rules are hitting those.
> 
> Can you upgrade to 3.2.5?
> 
> Have you ever run sa-update?
> 
> -- 
>   John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
>   jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
>   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
>    After ten years (1998-2008) of draconian gun control in the State
>    of Massachusetts, the results are in: firearms-related assaults up
>    78%, firearms-related homicides up 67%, assault-related emergency
>    room visits up 331%. Gun Control does not reduce violent crime.
> -----------------------------------------------------------------------
>   Tomorrow: SWMBO's Birthday
> 
> 

I have site-wide bayes, and yeah its rules are owned by the same user that
SA is running as. 

The leakers are not being autolearned as ham.

I could upgrade SA, I didn't think that would help because I do run
sa-update every night at midnight.


Re: Lots of 419/scam and investment spams getting through suddenly

Posted by John Hardin <jh...@impsec.org>.
On Wed, 17 Jun 2009, omehegan wrote:

>>>> http://www.nerdnetworks.org/spam/spam1
>>>> http://www.nerdnetworks.org/spam/spam2
>>>> http://www.nerdnetworks.org/spam/spam3
>>>> http://www.nerdnetworks.org/spam/spam4
>>>> http://www.nerdnetworks.org/spam/spam5
>>>> http://www.nerdnetworks.org/spam/spam6
>
> Here are two more of a type that have been getting through CONSTANTLY.
> They're always almost exactly the same, and I keep training them into my
> bayes DB but it's not hitting on them :(
>
> http://www.nerdnetworks.org/spam/spam7
> http://www.nerdnetworks.org/spam/spam8

The highest score on any of those was BAYES_60, most were BAYES_50, and 
one was BAYES_20. Bayes training seems to be a big problem.

If you are not running per-user bayes, are you sure you're training as the 
same user that SA is running as?

Are any of the leakers being autolearned as ham?

I'm surprised by how few rules are hitting those.

Can you upgrade to 3.2.5?

Have you ever run sa-update?

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   After ten years (1998-2008) of draconian gun control in the State
   of Massachusetts, the results are in: firearms-related assaults up
   78%, firearms-related homicides up 67%, assault-related emergency
   room visits up 331%. Gun Control does not reduce violent crime.
-----------------------------------------------------------------------
  Tomorrow: SWMBO's Birthday

Re: Lots of 419/scam and investment spams getting through suddenly

Posted by omehegan <ow...@nerdnetworks.org>.


omehegan wrote:
> 
> 
> 
> John Hardin wrote:
>> 
>> On Wed, 17 Jun 2009, omehegan wrote:
>> 
>>> Lately a lot of 419 and investment spams have been getting through with 
>>> very low SA scores.
>>>
>>> http://www.nerdnetworks.org/spam/spam1
>>> http://www.nerdnetworks.org/spam/spam2
>>> http://www.nerdnetworks.org/spam/spam3
>>> http://www.nerdnetworks.org/spam/spam4
>>> http://www.nerdnetworks.org/spam/spam5
>>> http://www.nerdnetworks.org/spam/spam6
>> 
>> Have you tried the Sought-Fraud ruleset? How about the SARE fraud
>> ruleset? 
>> I use both and, with bayes, get only rare leakers - mostly very short 
>> ones.
>> 
>> -- 
>>   John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
>>   jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
>>   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
>> -----------------------------------------------------------------------
>>    Win95: Where do you want to go today?
>>    Vista: Where will Microsoft allow you to go today?
>> -----------------------------------------------------------------------
>>   Tomorrow: SWMBO's Birthday
>> 
>> 
> 
> I already use the Sought-Fraud ruleset. Based on my logs, it's working,
> but it didn't hit on any of these messages. I just installed the SARE
> fraud ruleset, and verified that it's getting loaded, but it doesn't hit
> on any of these sample messages.
> 

Here are two more of a type that have been getting through CONSTANTLY.
They're always almost exactly the same, and I keep training them into my
bayes DB but it's not hitting on them :(

http://www.nerdnetworks.org/spam/spam7
http://www.nerdnetworks.org/spam/spam8


Re: Lots of 419/scam and investment spams getting through suddenly

Posted by omehegan <ow...@nerdnetworks.org>.


John Hardin wrote:
> 
> On Wed, 17 Jun 2009, omehegan wrote:
> 
>> Lately a lot of 419 and investment spams have been getting through with 
>> very low SA scores.
>>
>> http://www.nerdnetworks.org/spam/spam1
>> http://www.nerdnetworks.org/spam/spam2
>> http://www.nerdnetworks.org/spam/spam3
>> http://www.nerdnetworks.org/spam/spam4
>> http://www.nerdnetworks.org/spam/spam5
>> http://www.nerdnetworks.org/spam/spam6
> 
> Have you tried the Sought-Fraud ruleset? How about the SARE fraud ruleset? 
> I use both and, with bayes, get only rare leakers - mostly very short 
> ones.
> 
> -- 
>   John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
>   jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
>   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
>    Win95: Where do you want to go today?
>    Vista: Where will Microsoft allow you to go today?
> -----------------------------------------------------------------------
>   Tomorrow: SWMBO's Birthday
> 
> 

I already use the Sought-Fraud ruleset. Based on my logs, it's working, but
it didn't hit on any of these messages. I just installed the SARE fraud
ruleset, and verified that it's getting loaded, but it doesn't hit on any of
these sample messages.


Re: Lots of 419/scam and investment spams getting through suddenly

Posted by John Hardin <jh...@impsec.org>.
On Wed, 17 Jun 2009, omehegan wrote:

> Lately a lot of 419 and investment spams have been getting through with 
> very low SA scores.
>
> http://www.nerdnetworks.org/spam/spam1
> http://www.nerdnetworks.org/spam/spam2
> http://www.nerdnetworks.org/spam/spam3
> http://www.nerdnetworks.org/spam/spam4
> http://www.nerdnetworks.org/spam/spam5
> http://www.nerdnetworks.org/spam/spam6

Have you tried the Sought-Fraud ruleset? How about the SARE fraud ruleset? 
I use both and, with bayes, get only rare leakers - mostly very short 
ones.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Win95: Where do you want to go today?
   Vista: Where will Microsoft allow you to go today?
-----------------------------------------------------------------------
  Tomorrow: SWMBO's Birthday

Re: Lots of 419/scam and investment spams getting through suddenly

Posted by omehegan <ow...@nerdnetworks.org>.

Chip M. wrote:
> 
> Owen, particularly with 419/scam spams, it's VERY helpful if you
> tell us more about your ham ecology.
> 
> It would also be helpful if you told us about your FP pipeline.
> For example:  Do you have a corpus?  Can you easily analyze
> individual SA hits on ham, over an extended period?
> 
> The better your pipeline, the more aggressive you can be.
> If you have a deep understanding of your own ham ecology
> (based on analyzing data over multiple years), you can make
> informed decisions as to how to slant your tests.
> 
> 
> From your use of Nabble, I infer you are a small domain, with
> mostly/completely non-arms-length users.  From your domain name,
> I infer your userbase consists partly (perhaps completely) of
> Nerds. :)
> 
> If those inferences are correct, here's some things that should
> help:
>  1. raise the score for "SUBJ_ALL_CAPS" and some scammy tests
>  2. use a "FreeMail" plugin
>  3. use a country of origin/route plugin
> 
> 
> #1 is low-risk in a "pure Nerd" ham environment.  In Nerd/Geek ham,
> it hits most often on forwarded chain letters, and other crud, so
> even if it FPs, it's minimal "harm".
> 
> You might also want to tweak all the AdvanceFee/scam SA tests,
> including "ADVANCE_FEE_[n]", "DEAR_FRIEND", "MILLION_USD",
> "US_DOLLARS_[n]".  Of those, the first two occur occasionally in
> ham, but usually it's of low loss/FP value.
> 
> 
> #2 should hit on about half of your samples (I'm using a different
> implementation, so can't verify the exact performance - perhaps
> someone with the SA plugin can run your samples and report?).
> 
> Note that your middle scoring samples ALL should hit the FreeMail
> plugin.
> 
> 
> #3 is somewhat controversial, and if implemented must be done
> VERY carefully.
> 
> I hope we can all agree that scoring West Africa, particularly in
> combination with scam oriented metas, has an excellent risk-reward
> ratio.  So far this year, over half of all my AdvanceFee-ish spams
> have been sent via West Africa (typically originating there, and
> sent via a compromised USA/WEurope IP).
> 
> Here's a dump of the complete Countries routes of your samples
> (frequency first, then square brackets around the IP immediately
> outside your own network):
>  2 [France], Nigeria
>  1 [India], Japan
>  3 [Netherlands], Mexico
>  1 [Taiwan]
>  1 [United States], United States, Great Britain
> 
> In your samples, the lowest scoring three just happened to have the
> most unlikely nations (Nigeria, India+Japan) in their routes.
> That won't always be so.
> 
> I would NEVER block the Netherlands (it _IS_ one of the Geekiest
> nations on the planet!), however it does have many freemailers who
> are often compromised, so when it occurs in COMBINATION with an
> "unlikely" nation like Mexico, it's worth considering a CAUTIOUS
> score.
> 
> 
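
The route dump quoted above can be reproduced with a short Python sketch like the following. It assumes the relay countries for each sample have already been worked out by some other means (no particular GeoIP tool or plugin is implied), and it simply tallies identical routes, printing the frequency first and putting square brackets around the hop immediately outside the recipient's network. The function name is hypothetical.

```python
from collections import Counter

# Relay-country routes for the eight samples, transcribed from the dump above.
# The first entry of each route is the IP immediately outside the local network.
ROUTES = [
    ("France", "Nigeria"),
    ("France", "Nigeria"),
    ("India", "Japan"),
    ("Netherlands", "Mexico"),
    ("Netherlands", "Mexico"),
    ("Netherlands", "Mexico"),
    ("Taiwan",),
    ("United States", "United States", "Great Britain"),
]


def format_routes(routes):
    """Return one line per distinct route: count, then the bracketed first hop."""
    counts = Counter(routes)
    lines = []
    for route, count in sorted(counts.items()):
        first, rest = route[0], route[1:]
        hops = ", ".join([f"[{first}]", *rest])
        lines.append(f"{count} {hops}")
    return lines


if __name__ == "__main__":
    for line in format_routes(ROUTES):
        print(line)
```

A tally like this only shows which combinations occur; deciding which combinations deserve a cautious score still depends on knowing what your own ham looks like.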

OK, in terms of my domain, it's a collection of, yes, nerdy users : ) It's
mostly friends, plus one guy who has a fleet of users of his own that I
maintain but don't know. However, in terms of my complaints about spam, they
relate only to my own mail. My other users don't complain to me about spam,
and I don't take it upon myself to monitor their spam folders for false
positives. That said, for my own case, I hardly get any. Maybe 1-2 a month,
and those are always because of over-scoring on FREEMAIL_FROM.

So, I will bump the scores of some of the tests you mentioned. I was hoping
for a less fiddly solution, like "install this plugin/rule set," but that's
OK.

Can you recommend a country of origin/route plugin for me to look at? I'm
not sure how I would search for one.