Posted to users@spamassassin.apache.org by di...@deeztech.com on 2007/01/03 22:15:57 UTC

Training Bayesian Filter


Running spamassassin 3.0 and I'm invoking it through amavisd. When I train
the spamassassin using sa-learn for ham and spam respectively, it seems to
only work for the ham not the spam. The command runs fine, but spam e-mail
that I trained spamassassin with still show up untagged as spam. The ham
e-mail that I trained spamassassin with work fine and they don't get
tagged as spam anymore.

Running spamassassin under Mandriva 2006 Linux.

Your help would be appreciated.

Re: Training Bayesian Filter

Posted by di...@deeztech.com.

Darn. I thought that might be the problem. Well, back to square one. You
are right, this is not the right list for amavisd questions, but where do
you draw the line?

> On Thu, Jan 04, 2007 at 03:49:35PM -0500, dino@deeztech.com wrote:
>> So, the bayes_path is set in my local.cf files and it points to
>> "/etc/mail/spamassassin/bayes". Now, in my
>> "/etc/mail/spamassassin" I have two files, bayes_seen and
>> bayes_toks. So, really the line should read:
>> "/etc/mail/spamassassin" right?
>
> Nope.  So far it all looks fine.
>
>> Amavisd runs at
>> amavisd user. So, which options does the system look in the local.cf file
>> if you have amavisd running. I know that rewriting the subject is handled
>> by amavisd in my setup.
>
> This is the wrong list to ask about Amavis stuff (Amavis != SpamAssassin),
> but SA reads SA configs, so it'll read local.cf if Amavis tells it to
> (or by default if Amavis doesn't specify).  Whether or not the config is used
> by what Amavis calls is a different question (ie: the header
> additions/rewrites, etc.).
>
> --
> Randomly Selected Tagline:
> EACH year, as they pay their taxes, many Americans conduct a tiny
>  mental debate. "Why should I have to turn over such a huge fraction of my
>  hard-earned money to the government?" And then, a moment later: "Oh, yeah:
>  schools, roads, national security - blah, blah, blah. Sign the check."
>  - http://www.nytimes.com/2003/07/31/technology/circuits/31stat.html

Re: Training Bayesian Filter

Posted by Theo Van Dinter <fe...@apache.org>.
On Thu, Jan 04, 2007 at 03:49:35PM -0500, dino@deeztech.com wrote:
> So, the bayes_path is set in my local.cf files and it points to
> "/etc/mail/spamassassin/bayes". Now, in my
> "/etc/mail/spamassassin" I have two files, bayes_seen and
> bayes_toks. So, really the line should read:
> "/etc/mail/spamassassin" right?

Nope.  So far it all looks fine.
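
For anyone wondering why the existing setting is fine: bayes_path is a directory plus a filename prefix, not a directory. SpamAssassin appends suffixes such as _toks and _seen to it, so a prefix of /etc/mail/spamassassin/bayes gives exactly the two files listed above. A minimal shell sketch of that mapping follows; the helper name bayes_files is hypothetical, and the sa-learn call is skipped when the tool is not installed.

```shell
#!/bin/sh
# bayes_path value taken from the local.cf quoted in this thread.
BAYES_PATH="/etc/mail/spamassassin/bayes"

# Hypothetical helper: print the database files SpamAssassin derives
# from a bayes_path prefix by appending _toks and _seen.
bayes_files() {
    printf '%s_toks\n%s_seen\n' "$1" "$1"
}

bayes_files "$BAYES_PATH"

# If sa-learn is available, dump the database counters (nspam, nham, ...).
# Run this as the user whose configuration you want to inspect.
if command -v sa-learn >/dev/null 2>&1; then
    sa-learn --dump magic || echo "could not read the Bayes DB" >&2
fi
```

If the nspam counter in the dump does not grow after training on spam, the training is probably going into a different database than the one being read.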

> Amavisd runs at
> amavisd user. So, which options does the system look in the local.cf file
> if you have amavisd running. I know that rewriting the subject is handled
> by amavisd in my setup.

This is the wrong list to ask about Amavis stuff (Amavis != SpamAssassin),
but SA reads SA configs, so it'll read local.cf if Amavis tells it to
(or by default if Amavis doesn't specify).  Whether or not the config is used
by what Amavis calls is a different question (ie: the header
additions/rewrites, etc.).

-- 
Randomly Selected Tagline:
EACH year, as they pay their taxes, many Americans conduct a tiny
 mental debate. "Why should I have to turn over such a huge fraction of my
 hard-earned money to the government?" And then, a moment later: "Oh, yeah:
 schools, roads, national security - blah, blah, blah. Sign the check."
 - http://www.nytimes.com/2003/07/31/technology/circuits/31stat.html

Re: Training Bayesian Filter

Posted by di...@deeztech.com.

So, the bayes_path is set in my local.cf files and it points to
"/etc/mail/spamassassin/bayes". Now, in my
"/etc/mail/spamassassin" I have two files, bayes_seen and
bayes_toks. So, really the line should read:
"/etc/mail/spamassassin" right?

Amavisd runs as the amavisd user. So, which options in the local.cf file
does the system look at if you have amavisd running? I know that rewriting
the subject is handled by amavisd in my setup.

Here's my local.cf file for reference:

###########################################################################
#
# rewrite_header Subject *****SPAM*****
# report_safe 1
# trusted_networks
# lock_method flock

trusted_networks xx.xx.xx.xx
internal_networks xx.xx.xx.xx

#required_hits 5
#rewrite_header Subject [SPAM]
#report_safe 0
auto_whitelist_path        /var/spool/spamassassin/auto-whitelist
auto_whitelist_file_mode   0666
#dcc_home                  /var/lib/dcc

bayes_auto_learn 1
bayes_path /etc/mail/spamassassin/bayes
bayes_file_mode 0666

use_razor2 1
razor_config /var/lib/amavis/.razor/razor-agent.conf
razor_timeout 10

use_pyzor 1
pyzor_timeout 10
pyzor_max 5
add_header all Pyzor _PYZOR_

use_dcc 1
dcc_timeout 10
dcc_home /var/lib/dcc
dcc_path /usr/bin/dccproc




> Let's keep traffic on the list, please.  Other people might be able to
> help you and/or benefit from this discussion.
>
> dino@deeztech.com wrote:
>> Where is the bayes_path defined at?
>
> bayes_path would be defined in local.cf.  Try man
> Mail::SpamAssassin::Conf for more info.
>
>> Running sa-learn as root has been working for ham, is there a reason it
>> won't for spam?
>
> Are you *sure* training on ham is actually working?  Are you seeing
> messages come in with BAYES_00 or other rules lower than BAYES_50?
>
> What user does amavisd run as?
>
>> Using SQL storage? I'm not sure how to do that.
>
> Let me rephrase that: *if* you're using SQL storage for Bayes, you
> should use the bayes_sql_override_username option (since you're calling
> SA through a milter).  If you're not familiar with it, you're probably
> not using it, so don't worry about that one.
>
> --
> Kelson Vibber
> SpeedGate Communications <www.speed.net>

Re: Training Bayesian Filter

Posted by Kelson <ke...@speed.net>.
Let's keep traffic on the list, please.  Other people might be able to 
help you and/or benefit from this discussion.

dino@deeztech.com wrote:
> Where is the bayes_path defined at?

bayes_path would be defined in local.cf.  Try man 
Mail::SpamAssassin::Conf for more info.

> Running sa-learn as root has been working for ham, is there a reason it 
> won't for spam?

Are you *sure* training on ham is actually working?  Are you seeing 
messages come in with BAYES_00 or other rules lower than BAYES_50?
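
One way to check this is to pull the BAYES_* rule names out of a delivered message's headers. The sketch below is only an illustration: the function name bayes_rule is hypothetical, the sample message is made up, and which header (if any) lists the rule names depends on how amavisd is configured.

```shell
#!/bin/sh
# Print the BAYES_* rule names found in the header block of a message
# read from stdin. The body is ignored so quoted reports do not count.
bayes_rule() {
    sed '/^$/q' | grep -o 'BAYES_[0-9][0-9]*' | sort -u
}

# Made-up sample message, for illustration only.
bayes_rule <<'EOF'
X-Spam-Status: No, hits=-2.6 tagged_above=-999 required=5
 tests=[BAYES_00=-2.599]
Subject: hello

This body mentions BAYES_99 but is not scanned.
EOF
```

For a real message, something like `bayes_rule < /path/to/message` would do (the path is a placeholder). If no BAYES_* name shows up at all, Bayes is probably not being consulted for that mail, which would also explain why training seems to have no effect.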

What user does amavisd run as?

> Using SQL storage? I'm not sure how to do that.

Let me rephrase that: *if* you're using SQL storage for Bayes, you 
should use the bayes_sql_override_username option (since you're calling 
SA through a milter).  If you're not familiar with it, you're probably 
not using it, so don't worry about that one.

-- 
Kelson Vibber
SpeedGate Communications <www.speed.net>

Re: Training Bayesian Filter

Posted by Kelson <ke...@speed.net>.
dino@deeztech.com wrote:
> Running spamassassin 3.0 and I'm invoking it through amavisd. When I train
> the spamassassin using sa-learn for ham and spam respectively, it seems to
> only work for the ham not the spam. The command runs fine, but spam e-mail
> that I trained spamassassin with still show up untagged as spam. The ham
> e-mail that I trained spamassassin with work fine and they don't get
> tagged as spam anymore.

Make sure you are doing one of the following:
- Using a sitewide Bayes DB defined using bayes_path.
- Running sa-learn as the same user under which amavisd runs.
- Using SQL storage and the bayes_sql_override_username option.
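
For the second option, a minimal sketch of training as the amavisd account rather than as root. The account name and the sample directory are assumptions, not taken from this thread, so check what your install actually uses. The helper only prints the command so it can be inspected before it is run.

```shell
#!/bin/sh
# Assumptions: the account name and the sample directory are placeholders.
AMAVIS_USER="${AMAVIS_USER:-amavis}"
SPAM_DIR="${SPAM_DIR:-/var/tmp/spam-samples}"

# Print (not run) the sa-learn command that trains as the amavisd account.
# -H makes sudo set HOME to that account's home directory, which matters
# when no sitewide bayes_path is configured.
learn_cmd() {
    printf 'sudo -u %s -H sa-learn %s %s\n' "$AMAVIS_USER" "$1" "$2"
}

learn_cmd --spam "$SPAM_DIR"
```

Printing first is a deliberate choice: it keeps the sketch harmless on a machine without sa-learn, and lets you confirm the user and path before touching the database.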

Also make sure that Amavisd is using the config file you think it is. 
I'm more familiar with MIMEDefang, which will use sa-mimedefang.cf 
instead of local.cf.

-- 
Kelson Vibber
SpeedGate Communications <www.speed.net>

Re: Training Bayesian Filter

Posted by maillist <ma...@emailacs.com>.
dino@deeztech.com wrote:
> Running spamassassin 3.0 and I'm invoking it through amavisd. When I train
> the spamassassin using sa-learn for ham and spam respectively, it seems to
> only work for the ham not the spam. The command runs fine, but spam e-mail
> that I trained spamassassin with still show up untagged as spam. The ham
> e-mail that I trained spamassassin with work fine and they don't get
> tagged as spam anymore.
>
> Running spamassassin under Mandriva
> 2006 Linux.
>
> Your help would be appreciated.

This depends on how your server is set up.  Are you using mbox style 
in-boxes?

If so, make sure that you're using the --mbox switch along with the 
--spam or --ham switches.
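
A rough sketch of how that choice could be scripted, assuming an mbox file can be recognised by a first line starting with "From " (a common convention, not a guarantee). The helper names is_mbox and learn_args are hypothetical.

```shell
#!/bin/sh
# Return success if the file looks like an mbox, i.e. its first line
# starts with the "From " separator.
is_mbox() {
    [ -f "$1" ] && head -n 1 "$1" | grep -q '^From '
}

# Print the sa-learn arguments for a path.
# $1 = spam or ham, $2 = file or directory to learn from.
learn_args() {
    if is_mbox "$2"; then
        printf '%s --mbox %s\n' "--$1" "$2"
    else
        printf '%s %s\n' "--$1" "$2"
    fi
}
```

Usage would be along the lines of `sa-learn $(learn_args spam /path/to/spam-folder)`, where the path is a placeholder and must not contain spaces for this simple form to work.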

-=Aubrey=-