Posted to users@spamassassin.apache.org by Lorenzo Thurman <lo...@thethurmans.com> on 2015/03/10 18:29:33 UTC

Improve spam hit rate

I have these messages in a paste: http://pastebin.com/jNQfRerx. They were received about 1 1/2 hours apart. After I received the first one, I ran sudo sa-learn --spam /path/to/mail/folder against it and then sudo sa-learn --sync. SpamAssassin reported that it ‘learned tokens from 1 message…’
I then received the second message, but it was not marked as spam even though, as far as I can see, the messages are identical, all the way down to the low-contrast ‘hidden’ text. I’m seeing a lot of this lately, although sometimes the messages come from different domains (reverse lookups are always OK). My server runs Ubuntu Linux 14.04. What can I do to improve the detection rate?

I’m running SA 3.4.0, which is invoked via Postfix in master.cf:
smtp	inet	n	-	-	-	-	smtpd -vvv -o content_filter=spamassassin

sa-update is run via a cron job daily and it last ran early this morning, so its rules should be up to date.
So, any ideas?
Thanks

Re: Improve spam hit rate

Posted by Lorenzo Thurman <lo...@thethurmans.com>.
> On Mar 10, 2015, at 12:54 PM, Reindl Harald <h....@thelounge.net> wrote:
> 
> 
> Am 10.03.2015 um 18:29 schrieb Lorenzo Thurman:
>> I have these messages in a paste: http://pastebin.com/jNQfRerx. They
>> were received about 1 1/2 hours apart. After I received the first one, I
>> ran sudo sa-learn --spam /path/to/mail/folder against it and then sudo
>> sa-learn --sync. spamassassin reported that it ‘learned tokens from 1
>> message…’
> 
> You are likely training the wrong Bayes database.
> sa-learn must run as the same user as spamassassin / spamd.
> 
> Nobody runs such things as root via sudo, BTW.
> 
Yes, I’m embarrassed. I actually receive mail in a different account. When training, I thought I could just run sa-learn as root and get the desired effect. I’ve now run it as the correct user, and at least a couple of duplicate messages have been correctly labeled as spam.
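
For anyone who lands on the same problem, the fix amounts to something like the sketch below. The function name train_spam_as and the account name in the usage comment are made up for illustration; use whichever user the content filter actually runs as. It also assumes the Bayes database lives in that user's home directory (no bayes_path override), which is why sudo is given -H.

```shell
# Train and sync Bayes as the user that scans mail, not as root.
# $1 = account the filter runs as, $2 = file or folder of spam.
train_spam_as() {
    # -H sets HOME to the target user's home directory, so sa-learn
    # works on that user's Bayes database rather than root's.
    sudo -u "$1" -H sa-learn --spam "$2" &&
    sudo -u "$1" -H sa-learn --sync
}

# Usage (hypothetical account name):
#   train_spam_as filteruser /path/to/mail/folder
```

Running sa-learn through plain sudo trains root's database, which the scanner never reads, so the 'learned tokens' message says nothing about the database that matters.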

Re: Improve spam hit rate

Posted by Reindl Harald <h....@thelounge.net>.
Am 10.03.2015 um 18:29 schrieb Lorenzo Thurman:
> I have these messages in a paste: http://pastebin.com/jNQfRerx. They
> were received about 1 1/2 hours apart. After I received the first one, I
> ran sudo sa-learn --spam /path/to/mail/folder against it and then sudo
> sa-learn --sync. spamassassin reported that it ‘learned tokens from 1
> message…’

You are likely training the wrong Bayes database.
sa-learn must run as the same user as spamassassin / spamd.

Nobody runs such things as root via sudo, BTW.


Re: Improve spam hit rate

Posted by John Hardin <jh...@impsec.org>.
On Tue, 10 Mar 2015, Lorenzo Thurman wrote:

> I have these messages in a paste: http://pastebin.com/jNQfRerx 
> <http://pastebin.com/jNQfRerx>. They were received about 1 1/2 hours 
> apart. After I received the first one, I ran sudo sa-learn --spam 
> /path/to/mail/folder against it and then sudo sa-learn --sync.

Is that the only message you've trained bayes with?

Bayes needs sufficient examples of both spam and ham in order to make a 
decision. The default minimum is 200 of each.

There's also the common error of training a different bayes database than 
the one that SA is using when it scans mail. What user is SA/postfix 
running under?
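
A rough way to check both points at once is to look at the message counts in the database that the scanning user sees. The sketch below reads the output of sa-learn --dump magic on stdin. It assumes the usual layout of that output, where the count is in the third column and the line ends with nspam or nham. The function name bayes_ready is made up for illustration.

```shell
# Report the spam/ham counts from `sa-learn --dump magic` output and
# succeed only if both reach the minimum (default 200, as noted above).
bayes_ready() {
    min="${1:-200}"
    awk -v min="$min" '
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d min=%d\n", nspam, nham, min
            # Exit status 0 only when both counts meet the minimum.
            exit !(nspam + 0 >= min + 0 && nham + 0 >= min + 0)
        }'
}

# Usage (hypothetical account name):
#   sudo -u filteruser -H sa-learn --dump magic | bayes_ready
```

If the counts shown for the scanning user do not move after training, the training is going into a different database.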

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Rights can only ever be individual, which means that you cannot
   gain a right by joining a mob, no matter how shiny the issued
   badges are, or how many of your neighbors are part of it.  -- Marko
-----------------------------------------------------------------------
  4 days until Albert Einstein's 136th Birthday