Posted to users@spamassassin.apache.org by Paul Dulaba <sc...@gmail.com> on 2006/07/26 22:43:41 UTC

bayes: not available for scanning

Hi all,

Running SA 3.1.0. I have been using the Bayes DB for 8-9 months now. All of
a sudden it seems like a lot of Spam is getting through, and training it
with the new Spam does not seem to have any effect. There is a quirk in the
following output:

>spamassassin -D --lint

[5284] dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 200
<snip>
[5284] dbg: bayes: corpus size: nspam = 124095, nham = 262008
[5284] dbg: bayes: DB expiry: tokens in DB: 119251, Expiry max size: 150000, Oldest atime: 1144827345, Newest atime: 1146270059, Last expire: 1144870877, Current time: 1153945397


It doesn't matter how many Spams I have passed through sa-learn, it still
just says 0 spam(s) in bayes DB < 200. Yet you can see there are 124095 spam
tokens.

Here is the output of sa-learn --dump magic

0.000          0          3          0  non-token data: bayes db version
0.000          0     124095          0  non-token data: nspam
0.000          0     262008          0  non-token data: nham
0.000          0     119251          0  non-token data: ntokens
0.000          0 1144827345          0  non-token data: oldest atime
0.000          0 1146270059          0  non-token data: newest atime
0.000          0 1144876310          0  non-token data: last journal sync atime
0.000          0 1144870877          0  non-token data: last expiry atime
0.000          0      43200          0  non-token data: last expire atime delta
0.000          0     106911          0  non-token data: last expire reduction count
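
To make the numbers in the dump above easier to compare with the debug line, here is a minimal Python sketch that parses the "non-token data" rows and checks them against the 200-message threshold quoted in the debug output. The function names (parse_magic, summarize) and the MIN_LEARNED constant are illustrative assumptions, not part of SpamAssassin; the sample rows are copied from this post.

```python
from datetime import datetime, timezone

# Sample rows copied from the `sa-learn --dump magic` output in this post.
DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0     124095          0  non-token data: nspam
0.000          0     262008          0  non-token data: nham
0.000          0     119251          0  non-token data: ntokens
0.000          0 1144827345          0  non-token data: oldest atime
0.000          0 1146270059          0  non-token data: newest atime
"""

MIN_LEARNED = 200  # assumption: the threshold shown in the debug line ("< 200")


def parse_magic(text):
    """Map each 'non-token data' label to its integer value (third column)."""
    values = {}
    for line in text.splitlines():
        head, sep, label = line.partition("non-token data:")
        fields = head.split()
        if sep and len(fields) >= 3:
            values[label.strip()] = int(fields[2])
    return values


def summarize(text, now):
    """Return (enough_spam, enough_ham, whole days since newest atime)."""
    magic = parse_magic(text)
    idle_days = (now - magic["newest atime"]) // 86400
    return (magic["nspam"] >= MIN_LEARNED, magic["nham"] >= MIN_LEARNED, idle_days)


if __name__ == "__main__":
    now = 1153945397  # "Current time" from the debug output in this post
    print(summarize(DUMP, now))
    newest = parse_magic(DUMP)["newest atime"]
    print(datetime.fromtimestamp(newest, tz=timezone.utc))
```

For the values posted here, summarize returns (True, True, 88): the database that sa-learn dumps is well above the threshold, and its newest atime is 88 days older than the current time. Note that nspam and nham count learned messages, while ntokens holds the token count. A database this full does not match the "0 spam(s)" in the debug line, which may mean the scanner is reading a different database.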


I am not sure if the inconsistency noted above is causing the increase in
missed Spam, but it's the only thing I can find.
Any suggestions?

Thanks!

Re: bayes: not available for scanning

Posted by mouss <us...@free.fr>.
Paul Dulaba wrote:
> Hi all,
>
> Running SA 3.1.0. I have been using the Bayes DB for 8-9 months now. All of
> a sudden it seems like a lot of Spam is getting through, and training it
> with the new Spam does not seem to have any effect. There is a quirk in the
> following output:
>
>> spamassassin -D --lint
>
> [5284] dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 200
> <snip>
> [5284] dbg: bayes: corpus size: nspam = 124095, nham = 262008
> [5284] dbg: bayes: DB expiry: tokens in DB: 119251, Expiry max size: 150000, Oldest atime: 1144827345, Newest atime: 1146270059, Last expire: 1144870877, Current time: 1153945397
>
>
> It doesn't matter how many Spams I have passed through sa-learn, it still
> just says 0 spam(s) in bayes DB < 200. Yet you can see there are 124095 spam
> tokens.

A common pitfall is to train as root but run spamassassin as a different
user, so that training and scanning use different Bayes databases.

If you want a site-wide Bayes database, I'd recommend using MySQL and
setting bayes_sql_override_username.
Otherwise, use sudo when training (examples were posted to this list some
time ago; a web search should find them).
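
As a rough illustration of the sudo approach, the Python sketch below builds the command lines for running sa-learn as the account that does the scanning, so that training and inspection hit that account's database rather than root's. The helper name as_user, the user name "spamd", and the corpus path are hypothetical placeholders; substitute whatever account and path apply on your system.

```python
import shlex


def as_user(user, *sa_args):
    """Build an argv that runs sa-learn as `user` via sudo."""
    # -H asks sudo to set HOME to the target user's home directory, so
    # sa-learn resolves ~/.spamassassin for that user rather than for root.
    return ["sudo", "-H", "-u", user, "sa-learn", *sa_args]


if __name__ == "__main__":
    scan_user = "spamd"  # hypothetical: the account your scanner runs as
    commands = (
        as_user(scan_user, "--dump", "magic"),          # inspect that user's DB
        as_user(scan_user, "--spam", "/path/to/spam"),  # hypothetical corpus path
    )
    for argv in commands:
        # Print only; pass argv to subprocess.run(argv, check=True) to execute.
        print(shlex.join(argv))
```

Running the --dump magic variant as the scanning user and comparing its nspam value with the one seen as root is a quick way to check whether two separate databases are in play.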