Posted to users@spamassassin.apache.org by Jonathan Nichols <jn...@pbp.net> on 2013/06/23 03:35:37 UTC

'hair' spam

I've been getting flooded with pump-n-dump spam for a particular stock symbol, and my feeble admin skills these days are making it difficult to slow down. I've been using Mailspike and SpamCop at the MTA, and Barracuda too.

Here's a sample:

http://pastebin.com/Y5q4QTnf

What kind of worries me are the low Bayes scores. I've been feeding it one fairly consistent message after another.

Also, this part might be off topic, but I seem to have headers being cut in half by postfix (postfix/amavis setup).

cheers
--
j

Re: 'hair' spam

Posted by RW <rw...@googlemail.com>.
On Sat, 22 Jun 2013 20:38:43 -0700 (PDT)
John Hardin wrote:

> On Sat, 22 Jun 2013, Jonathan Nichols wrote:
> 
> > What kind of worries me are the low Bayes scores. I've been feeding
> > fairly consistent message after message.
> 
> If it never leaves BAYES_50 then the training isn't being properly
> done. Are you sure you're training the bayes database that SA is
> using? What user are you running training as? What user is SA running
> as? Are you configured for per-user or site-wide bayes databases?

I'm finding they stick at BAYES_50 too, and when I ran his spam it
actually hit BAYES_00. In my case the reason is that I've had very
little pump & dump spam for years.  Bayes learns very slowly when it
needs to detune existing tokens.

Bogofilter is shooting fish in a barrel with these, but if you look at
the tokens, the hammy tokens that make it through the cut are mostly
single word whereas the spammy tokens are mostly multiword. If I switch
off multiword tokenization, the result for the linked spam changes from
1.00 to 0.32 - similar to Bayes.
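
To illustrate the difference between single-word and multiword tokens, here is a minimal Python sketch. It is not Bogofilter's actual tokenizer: the function name `tokenize`, the `max_words` parameter, and the space-joined token format are assumptions made only for this example.

```python
def tokenize(text: str, max_words: int = 2) -> list[str]:
    """Return single-word tokens plus runs of up to max_words adjacent words.

    Hypothetical illustration only; real filters use their own token format.
    """
    words = text.lower().split()
    tokens = []
    for n in range(1, max_words + 1):
        # Slide a window of n words over the text.
        for i in range(len(words) - n + 1):
            tokens.append(" ".join(words[i:i + n]))
    return tokens


single_only = tokenize("Buy this stock", max_words=1)
with_pairs = tokenize("Buy this stock", max_words=2)
```

With `max_words=1` only the individual words remain, so any evidence carried by word pairs is unavailable to the classifier. That is consistent with the drop described above when multiword tokenization is switched off.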


I think a rule to catch the HA_I R variants should be worth a point.
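
A rough sketch of what such a pattern might look like, written in Python purely for illustration. In SpamAssassin this would be a body rule with a Perl regex. The assumption here, not confirmed anywhere in the thread, is that the variants are the upper-case letters H, A, I, R with spaces, underscores, or punctuation inserted between them. The names `OBFUSCATED_SYMBOL` and `is_obfuscated_symbol` are hypothetical.

```python
import re

# Assumption: letters H, A, I, R in upper case, with any run of
# non-word characters or underscores allowed between them.
OBFUSCATED_SYMBOL = re.compile(r"\bH[\W_]*A[\W_]*I[\W_]*R\b")


def is_obfuscated_symbol(text: str) -> bool:
    """True if the text has a broken-up spelling, but not for plain 'HAIR'."""
    return any(m.group(0) != "HAIR" for m in OBFUSCATED_SYMBOL.finditer(text))
```

Excluding the plain spelling keeps the check focused on the obfuscation itself. Whether that is worth a full point would need testing against real ham and spam.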

Re: 'hair' spam

Posted by John Hardin <jh...@impsec.org>.
On Sat, 22 Jun 2013, Jonathan Nichols wrote:

> What kind of worries me are the low Bayes scores. I've been feeding
> fairly consistent message after message.

If it never leaves BAYES_50 then the training isn't being properly done. 
Are you sure you're training the bayes database that SA is using? What 
user are you running training as? What user is SA running as? Are you 
configured for per-user or site-wide bayes databases?

> Also, this part might be off topic but i seem to have headers being cut 
> in half by postfix. (postfix/amavis setup)

Headers are allowed to be wrapped; the continuation lines must start with 
whitespace (typically a tab).
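
As a minimal sketch of that folding rule, the Python below joins continuation lines back onto the header they belong to. The function name `unfold_headers` is hypothetical, and collapsing the fold to a single space is a simplification. The sample value is the X-Spam-Status header from the pastebin as quoted elsewhere in this thread.

```python
def unfold_headers(raw_header_block: str) -> list[tuple[str, str]]:
    """Join folded continuation lines onto the preceding header."""
    headers: list[tuple[str, str]] = []
    for line in raw_header_block.splitlines():
        if not line:
            break  # a blank line ends the header block
        if line[0] in " \t" and headers:
            # Continuation line: starts with whitespace, belongs to the
            # previous header.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))
    return headers


sample = (
    "X-Spam-Status: No, score=2.655 tagged_above=-999 required=5.31\n"
    "    tests=[BAYES_50=0.8, MISSING_DATE=1.36, MISSING_MID=0.497,\n"
    "    NO_RECEIVED=-0.001, NO_RELAYS=-0.001] autolearn=no\n"
    "Subject: example\n"
)
headers = unfold_headers(sample)
```

A header that looks "cut in half" in a raw view is often just folded like this. It is only broken if the continuation line does not begin with whitespace.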

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Our government should bear in mind the fact that the American
   Revolution was touched off by the then-current government
   attempting to confiscate firearms from the people.
-----------------------------------------------------------------------
  387 days since the first successful private support mission to ISS (SpaceX)

Re: 'hair' spam

Posted by Jonathan Nichols <jn...@pbp.net>.
>> Missing Date and Message-Id headers, no Received headers. I'd focus on
>> fixing those, before looking at Bayes again.
> 
> Just had a look at various samples here. Correction: The Message-Id
> header appears to be missing indeed.
> 
> 

I have now noticed this with all other messages. Something in my postfix/amavis setup is chopping headers in half. Not sure what that is. Will have to research further.

Re: 'hair' spam

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
Replying to self, minor correction.

On Sun, 2013-06-23 at 19:02 +0200, Karsten Bräckelmann wrote:
> On Sat, 2013-06-22 at 20:35 -0500, Jonathan Nichols wrote:
> > I've been getting flooded with pump n dump spams for a particular stock
> > symbol, and my feeble admin skills these days are making it difficult
> > to slow. Been using mailspike, spam cop at the mta, and barracuda too.
> 
> For outright rejecting on a single BL hit?
> 
> I am seeing these, too, scored by SA without rejecting -- and indeed,
> these are commonly tripping Spamhaus XBL, PBL, Barracuda, MSPIKE, SSBL,
> Spamcop and friends. There doesn't seem to be a low-ish scorer here.
> 
> 
> > Here's a sample:
> > http://pastebin.com/Y5q4QTnf
> > 
> > > What kind of worries me are the low Bayes scores. I've been feeding
> > fairly consistent message after message.
> 
> Your problem isn't a neutral Bayes classification, but something
> severely harming your messages prior to scanning with SA. From your
> pastebin:
> 
>   X-Spam-Status: No, score=2.655 tagged_above=-999 required=5.31
>       tests=[BAYES_50=0.8, MISSING_DATE=1.36, MISSING_MID=0.497,
>       NO_RECEIVED=-0.001, NO_RELAYS=-0.001] autolearn=no
> 
> Missing Date and Message-Id headers, no Received headers. I'd focus on
> fixing those, before looking at Bayes again.

Just had a look at various samples here. Correction: The Message-Id
header appears to be missing indeed.

One thing they all seem to have in common -- besides a juicy mix of
blacklist hits -- are RCVD_NUMERIC_HELO and RDNS_NONE. All of them got
dropped along with the Received headers for you.


> To be clear: The pump-n-dump spam of this campaign (including variations
> with different stock symbols earlier) does have these headers. Something
> in your mail processing chain butchered the messages. Given the Amavis
> Warning header, the culprit appears to be earlier in the chain.
> 
> Also, with no Received headers, any other blacklist not already checked
> at the perimeter at SMTP time is effectively disabled in SA.

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: 'hair' spam

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Sat, 2013-06-22 at 20:35 -0500, Jonathan Nichols wrote:
> I've been getting flooded with pump n dump spams for a particular stock
> symbol, and my feeble admin skills these days are making it difficult
> to slow. Been using mailspike, spam cop at the mta, and barracuda too.

For outright rejecting on a single BL hit?

I am seeing these, too, scored by SA without rejecting -- and indeed,
these are commonly tripping Spamhaus XBL, PBL, Barracuda, MSPIKE, SSBL,
Spamcop and friends. There doesn't seem to be a low-ish scorer here.


> Here's a sample:
> http://pastebin.com/Y5q4QTnf
> 
> > What kind of worries me are the low Bayes scores. I've been feeding
> fairly consistent message after message.

Your problem isn't a neutral Bayes classification, but something
severely harming your messages prior to scanning with SA. From your
pastebin:

  X-Spam-Status: No, score=2.655 tagged_above=-999 required=5.31
      tests=[BAYES_50=0.8, MISSING_DATE=1.36, MISSING_MID=0.497,
      NO_RECEIVED=-0.001, NO_RELAYS=-0.001] autolearn=no

Missing Date and Message-Id headers, no Received headers. I'd focus on
fixing those, before looking at Bayes again.
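
To see how much of that score comes from the header-related rules, here is a minimal Python sketch that splits the `tests=[...]` list into per-rule scores. It assumes the `NAME=score` format shown in the header above, and the function name `parse_spam_status` is hypothetical.

```python
import re


def parse_spam_status(value: str) -> dict[str, float]:
    """Return {rule_name: score} from the tests=[...] part of X-Spam-Status."""
    match = re.search(r"tests=\[([^\]]*)\]", value)
    if not match:
        return {}
    scores: dict[str, float] = {}
    for item in match.group(1).split(","):
        name, sep, score = item.strip().partition("=")
        if sep:
            scores[name] = float(score)
    return scores


status = (
    "No, score=2.655 tagged_above=-999 required=5.31 "
    "tests=[BAYES_50=0.8, MISSING_DATE=1.36, MISSING_MID=0.497, "
    "NO_RECEIVED=-0.001, NO_RELAYS=-0.001] autolearn=no"
)
scores = parse_spam_status(status)
# Rules that fire because headers are absent from the scanned message.
header_rules = [name for name in scores if name.startswith(("MISSING_", "NO_"))]
```

Here four of the five rules that fired are about missing headers, and BAYES_50 contributes only 0.8 of the 2.655 total.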

To be clear: The pump-n-dump spam of this campaign (including variations
with different stock symbols earlier) does have these headers. Something
in your mail processing chain butchered the messages. Given the Amavis
Warning header, the culprit appears to be earlier in the chain.

Also, with no Received headers, any other blacklist not already checked
at the perimeter at SMTP time is effectively disabled in SA.


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}