Posted to users@spamassassin.apache.org by Joseph Acquisto <jo...@j4computers.com> on 2012/12/12 02:29:13 UTC

Suddenly a lot of low scores

Suddenly a lot of garbage is getting thru.  Stuff with nonsense text, etc.

This is what I see:


X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on open-122
X-Spam-Level: 
X-Spam-Status: No, score=0.1 required=5.0 tests=DECEASED_NO_ML,HTML_MESSAGE
	autolearn=unavailable version=3.3.2
X-Spam-Report: 
	*  0.0 HTML_MESSAGE BODY: HTML included in message
	*  0.1 DECEASED_NO_ML Dead not via mailing list


and 


X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on open-122
X-Spam-Level: **
X-Spam-Status: No, score=2.7 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
	FROM_12LTRDOM,HTML_IMAGE_ONLY_20,HTML_MESSAGE,HTML_SHORT_LINK_IMG_3,
	T_REMOTE_IMAGE autolearn=no version=3.3.2
X-Spam-Report: 
	*  0.7 HTML_IMAGE_ONLY_20 BODY: HTML: images with 1600-2000 bytes of words
	*  0.0 HTML_MESSAGE BODY: HTML included in message
	*  0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
	*      valid
	* -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
	*  0.3 HTML_SHORT_LINK_IMG_3 HTML is very short with a linked image
	*  1.6 T_REMOTE_IMAGE Message contains an external image
	*  0.1 FROM_12LTRDOM From a 12-letter domain

The autolearn seems odd.

joe a.

Re: Suddenly a lot of low scores

Posted by Joseph Acquisto <jo...@j4computers.com>.

>>> On 12/12/2012 at 11:39 AM, Joseph Acquisto wrote:
>> 
>>
>>Without seeing the messages, there's not much we can say about the 
>>scores.  Put the full messages in pastebin and give us the link so we 
>>can look at it.
>>
>>The autolearn looks normal to me.
>>
>>autolearn=unavailable  -- This means that something was locking the 
>>bayes database when this message was processed.
>>
>>autolearn=no  -- This means that SA looked at the message and decided 
>>not to learn from it.  In this case, the score is too high to autolearn 
>>as ham and too low to autolearn as spam.
>>
>>I don't see the bayes rules firing.  Is this a new SA setup?  Once you 
>>learn enough messages to activate the bayes scoring, you should see a 
>>bayes rule hit on every email.
>>
>>-- 
>>Bowie
> 
> It's a relatively new setup.
> 
> No bayes seems wrong, but I'll have to check how many messages are in the
> database when I get back there.
> 
> I send 5-10 messages daily.  Spam only, tho, little ham seems to get by, mostly
> missed spam.
> 
> joe a.

I'm willing to bet (a penny) that this is closer to what I should see when bayes is working:

X-Spam-Report: 
	*  1.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
	*      (a.mail.user[at]gmail.com)
	* -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
	*      [score: 0.0000]
	*  0.0 HTML_MESSAGE BODY: HTML included in message
	* -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's
	*       domain
	*  0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
	*      valid
	* -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
joe a.

(Permissions, Permissions?  We don't need no stinking permissions . . .)

Re: Suddenly a lot of low scores

Posted by Bowie Bailey <Bo...@BUC.com>.

Please keep this on the list.

On 12/12/2012 8:09 PM, Joseph Acquisto wrote:
>> It doesn't matter how many messages SA has processed.  What matters is
>> how many messages Bayes has learned via autolearn or manual sa-learn runs.
>>
>> You can log in as the user SA runs as and check the bayes database:
>>
>> $ sa-learn --dump magic
>>
>> You want to look at the nham and nspam numbers.  You MUST do this as the
>> same user SA is using or the results will not be useful.  Also, if you
>> do manual learning via sa-learn, you must do it as the same user as SA.
>>
> This is my result:
>
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0        878          0  non-token data: nspam
> 0.000          0       1064          0  non-token data: nham
> 0.000          0     114391          0  non-token data: ntokens
> 0.000          0 1352511853          0  non-token data: oldest atime
> 0.000          0 1355310610          0  non-token data: newest atime
> 0.000          0          0          0  non-token data: last journal sync atime
> 0.000          0 1355278210          0  non-token data: last expiry atime
> 0.000          0    2764800          0  non-token data: last expire atime delta
> 0.000          0      38573          0  non-token data: last expire reduction coun
>
> I run sa-learn via a script, as root.  spamd runs as root.  spamassassin, in /etc/postfix/main.cf, has the user defined as spamfilter.
> I don't know if that is an issue.

It might be.  Take a look at spamfilter's database.

If spamd is running as root, it may be doing per-user filtering, 
depending on your setup.  If this is the case, spamd will switch 
users each time it receives a message so that it scans the message with 
that user's settings.  This means that each user's bayes db must be above 
the threshold before that user will see bayes scoring.
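
To compare databases between users, a small script can pull the nham and 
nspam counts out of the "sa-learn --dump magic" output and test them 
against the minimum.  This is only a sketch: the function names 
parse_dump_magic and bayes_ready are made up for illustration, the 
line format is assumed to match the output pasted above, and the 
default of 200 is the figure quoted in this thread (your config may 
differ).

```python
import re

# Assumed line shape, taken from the output pasted in this thread:
# four whitespace-separated columns, then "non-token data: <name>".
MAGIC_LINE = re.compile(
    r"^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (.+?)\s*$"
)


def parse_dump_magic(text):
    """Return {name: value} for every 'non-token data' line in text."""
    values = {}
    for line in text.splitlines():
        line = line.lstrip("> ")  # tolerate quoted mail text
        match = MAGIC_LINE.match(line)
        if match:
            values[match.group(2)] = int(match.group(1))
    return values


def bayes_ready(values, min_ham=200, min_spam=200):
    """True if both counts reach the minimum (200 is an assumed default)."""
    return (values.get("nham", 0) >= min_ham
            and values.get("nspam", 0) >= min_spam)


# The first few lines of the output quoted in this thread.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        878          0  non-token data: nspam
0.000          0       1064          0  non-token data: nham
0.000          0     114391          0  non-token data: ntokens
"""

if __name__ == "__main__":
    counts = parse_dump_magic(SAMPLE)
    print(counts["nspam"], counts["nham"], bayes_ready(counts))
```

Run against the numbers above, root's database passes the check, so 
the same check is worth repeating on output captured as the spamfilter 
user.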

> What should I see in headers if bayes is active?

If bayes is active, you should see a BAYES_XX rule hit on every email.
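
A quick way to check saved messages for this is to read the rule list 
out of X-Spam-Status and look for names starting with BAYES_.  The 
sketch below uses Python's standard email module; spam_tests and 
bayes_hits are hypothetical helper names, the first sample is the 
header from the original post, and the second sample is an invented 
status line for illustration only, not taken from a real message.

```python
import re
from email import message_from_string


def spam_tests(raw_message):
    """Return the rule names listed in the X-Spam-Status header."""
    status = message_from_string(raw_message).get("X-Spam-Status", "")
    # Folded header lines break after a comma; rejoin the rule list first.
    status = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=(\S+)", status)
    return match.group(1).split(",") if match else []


def bayes_hits(raw_message):
    """Return only the rule names that start with BAYES_."""
    return [name for name in spam_tests(raw_message)
            if name.startswith("BAYES_")]


# Header copied from the original post in this thread.
NO_BAYES = (
    "X-Spam-Status: No, score=2.7 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,\n"
    "\tFROM_12LTRDOM,HTML_IMAGE_ONLY_20,HTML_MESSAGE,HTML_SHORT_LINK_IMG_3,\n"
    "\tT_REMOTE_IMAGE autolearn=no version=3.3.2\n"
    "\n"
    "body\n"
)

# Invented example of a status line with a bayes hit (illustration only).
WITH_BAYES = (
    "X-Spam-Status: No, score=-1.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,\n"
    "\tDKIM_VALID,HTML_MESSAGE autolearn=no version=3.3.2\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    print(bayes_hits(NO_BAYES))
    print(bayes_hits(WITH_BAYES))
```

An empty list on every message is the symptom described in this 
thread: bayes is not being consulted for that user.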

> Tangent - I noticed this in /var/log/messages (probably unrelated)
>
> Dec 12 02:13:55 open-122 echo[665]: Starting spamd:
> Dec 12 02:13:58 open-122 echo[645]: Starting the SpamAssassin Proxy Daemon:
> Dec 12 06:14:09 open-122 spampd[682]: defined(@array) is deprecated at /usr/lib/perl5/vendor_perl/5.16.0/Net/Server.pm line 211.
> Dec 12 06:14:11 open-122 spampd[682]: (Maybe you should just omit the defined()?)
> Dec 12 06:14:50 open-122 systemd[1]: spampd.service: main process exited, code=exited, status=1
> Dec 12 06:14:50 open-122 systemd[1]: Unit spampd.service entered failed state.
>
> Seen a few times over a month or so.

No idea about this.

-- 
Bowie

Re: Suddenly a lot of low scores

Posted by Joseph Acquisto <jo...@j4computers.com>.

>I send 5-10 messages daily.  Spam only, tho, little ham seems to get by, mostly
>missed spam.
>
>joe a.

I meant that is the number of messages I forward for bayes to learn.  That should be well over 200 spam by now.
Will it accept unmarked mail as ham if I send it as such, or would that mess things up?

joe a.

Re: Suddenly a lot of low scores

Posted by Bowie Bailey <Bo...@BUC.com>.

On 12/12/2012 11:39 AM, Joseph Acquisto wrote:
>>
>> Without seeing the messages, there's not much we can say about the
>> scores.  Put the full messages in pastebin and give us the link so we
>> can look at it.
>>
>> The autolearn looks normal to me.
>>
>> autolearn=unavailable  -- This means that something was locking the
>> bayes database when this message was processed.
>>
>> autolearn=no  -- This means that SA looked at the message and decided
>> not to learn from it.  In this case, the score is too high to autolearn
>> as ham and too low to autolearn as spam.
>>
>> I don't see the bayes rules firing.  Is this a new SA setup?  Once you
>> learn enough messages to activate the bayes scoring, you should see a
>> bayes rule hit on every email.
>>
>> -- 
>> Bowie
> It's a relatively new setup.
>
> No bayes seems wrong, but I'll have to check how many messages are in the
> database when I get back there.
>
> I send 5-10 messages daily.  Spam only, tho, little ham seems to get by, mostly
> missed spam.

There must be at least 200 ham and 200 spam in the database before SA 
will start using the bayes rules.

-- 
Bowie

Re: Suddenly a lot of low scores

Posted by Joseph Acquisto <jo...@j4computers.com>.

>
>
>Without seeing the messages, there's not much we can say about the 
>scores.  Put the full messages in pastebin and give us the link so we 
>can look at it.
>
>The autolearn looks normal to me.
>
>autolearn=unavailable  -- This means that something was locking the 
>bayes database when this message was processed.
>
>autolearn=no  -- This means that SA looked at the message and decided 
>not to learn from it.  In this case, the score is too high to autolearn 
>as ham and too low to autolearn as spam.
>
>I don't see the bayes rules firing.  Is this a new SA setup?  Once you 
>learn enough messages to activate the bayes scoring, you should see a 
>bayes rule hit on every email.
>
>-- 
>Bowie

It's a relatively new setup.

No bayes seems wrong, but I'll have to check how many messages are in the
database when I get back there.

I send 5-10 messages daily.  Spam only, tho, little ham seems to get by, mostly
missed spam.

joe a.

Re: Suddenly a lot of low scores

Posted by Bowie Bailey <Bo...@BUC.com>.

On 12/11/2012 8:29 PM, Joseph Acquisto wrote:
> Suddenly a lot of garbage is getting thru.  Stuff with nonsense text, etc.
>
> This is what I see:
>
>
> X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on open-122
> X-Spam-Level:
> X-Spam-Status: No, score=0.1 required=5.0 tests=DECEASED_NO_ML,HTML_MESSAGE
> 	autolearn=unavailable version=3.3.2
> X-Spam-Report:
> 	*  0.0 HTML_MESSAGE BODY: HTML included in message
> 	*  0.1 DECEASED_NO_ML Dead not via mailing list
>
>
> and
>
>
> X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on open-122
> X-Spam-Level: **
> X-Spam-Status: No, score=2.7 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
> 	FROM_12LTRDOM,HTML_IMAGE_ONLY_20,HTML_MESSAGE,HTML_SHORT_LINK_IMG_3,
> 	T_REMOTE_IMAGE autolearn=no version=3.3.2
> X-Spam-Report:
> 	*  0.7 HTML_IMAGE_ONLY_20 BODY: HTML: images with 1600-2000 bytes of words
> 	*  0.0 HTML_MESSAGE BODY: HTML included in message
> 	*  0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
> 	*      valid
> 	* -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
> 	*  0.3 HTML_SHORT_LINK_IMG_3 HTML is very short with a linked image
> 	*  1.6 T_REMOTE_IMAGE Message contains an external image
> 	*  0.1 FROM_12LTRDOM From a 12-letter domain
>
> The autolearn seems odd.

Without seeing the messages, there's not much we can say about the 
scores.  Put the full messages in pastebin and give us the link so we 
can look at it.

The autolearn looks normal to me.

autolearn=unavailable  -- This means that something was locking the 
bayes database when this message was processed.

autolearn=no  -- This means that SA looked at the message and decided 
not to learn from it.  In this case, the score is too high to autolearn 
as ham and too low to autolearn as spam.
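
The "too high for ham, too low for spam" window can be sketched as a 
simple two-threshold check.  This is a deliberate simplification and 
not SA's actual code: the function name autolearn_decision is made 
up, the 0.1 and 12.0 defaults are what I believe the stock values of 
bayes_auto_learn_threshold_nonspam and bayes_auto_learn_threshold_spam 
to be, the exact boundary handling is an assumption, and the real 
logic applies further conditions that are left out here.

```python
def autolearn_decision(score, nonspam_threshold=0.1, spam_threshold=12.0):
    """Simplified two-threshold sketch of the autolearn window.

    Thresholds are assumed defaults; check your own configuration.
    """
    if score < nonspam_threshold:
        return "ham"    # low enough to be learned as ham
    if score >= spam_threshold:
        return "spam"   # high enough to be learned as spam
    return "no"         # in between: not learned either way


if __name__ == "__main__":
    for score in (-1.0, 2.7, 15.0):
        print(score, autolearn_decision(score))
```

Under these assumptions, the 2.7 message from the original post falls 
in the middle band, which matches the autolearn=no it received.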

I don't see the bayes rules firing.  Is this a new SA setup?  Once you 
learn enough messages to activate the bayes scoring, you should see a 
bayes rule hit on every email.

-- 
Bowie