Posted to users@spamassassin.apache.org by Tinni <t_...@yahoo.co.in> on 2005/02/13 23:54:11 UTC

Spamassassin with sa-learn

Hi 

I am a little bit confused about *sa-learn*.  I have installed SA 3.0.2 and
did not set any Bayes path in local.cf.  When I check with

# spamassassin --lint -D

it shows these paths:

_________________________________________
debug: using "/home/sites/www.domain.org/users/<user>/.spamassassin" for user state dir
debug: using "/home/sites/www.domain.org/users/<user>/.spamassassin/user_prefs" for user prefs file
debug: config: read file /home/sites/www.domain.org/users/<user>/.spamassassin/user_prefs
debug: using "/home/sites/www.domain.org/users/<user>/.spamassassin" for user state dir
debug: bayes: 7103 tie-ing to DB file R/O /home/sites/www.domain.org/users/<user>/.spamassassin/bayes_toks
debug: bayes: 7103 tie-ing to DB file R/O /home/sites/www.domain.org/users/<user>/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: using "/home/sites/www.domain.org/users/<user>/.spamassassin" for user state dir
debug: Score set 3 chosen.

_________________________________________
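To see at a glance which Bayes files an account is using, the "tie-ing" lines in debug output like this can be picked out with a short script. The following is only a minimal Python sketch. It assumes each debug message sits on a single, unwrapped line in the format shown here. The name bayes_db_files is made up for illustration, and the R/W alternative in the pattern is an assumption about how write access is reported.

```python
import re

# Matches Bayes debug lines of the form:
#   debug: bayes: 7103 tie-ing to DB file R/O /path/.spamassassin/bayes_toks
# R/O is taken from the output in this thread; R/W is an assumed variant.
TIE_RE = re.compile(r"bayes: \d+ tie-ing to DB file (R/O|R/W) (\S+)")


def bayes_db_files(debug_output):
    """Return (mode, path) pairs for every Bayes DB file in the debug text."""
    return [(m.group(1), m.group(2)) for m in TIE_RE.finditer(debug_output)]


# Sample lines copied from the debug output in this message.
sample = """\
debug: bayes: 7103 tie-ing to DB file R/O /home/sites/www.domain.org/users/<user>/.spamassassin/bayes_toks
debug: bayes: 7103 tie-ing to DB file R/O /home/sites/www.domain.org/users/<user>/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
"""

for mode, path in bayes_db_files(sample):
    print(mode, path)
```

If the paths printed for different accounts differ, those accounts are not sharing one database.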


I am running *sa-learn* as root, so is the path above the central Bayes
database?  I also see that the individual users' Bayes databases are being
updated, although I do not allow ANYBODY to run *sa-learn*.
I want mail to be filtered through the *central database* only.

I am using procmail with sendmail.

Here are my questions:

  -  Do I need to specify the path of the Bayes db in local.cf?

  -  Or is the default path that is being used also OK?

  -  Or will only the user shown in the --lint output benefit from
     the Bayes database learning?

  -  If no user is running *sa-learn*, how is every user's
     Bayes database being updated? (I am going only by the
     timestamps.)

Any suggestions or advice would be appreciated.

Thanks in advance.
-tinni

 


Re: Spamassassin with sa-learn

Posted by Kris Deugau <kd...@vianet.ca>.
Tinni wrote:
> My question is: when a mail comes to the server for, say, *user2*
> or many others, will SpamAssassin check it for ham/spam against the
> *default* database, which is by default set to *user1*? Or will it
> check only the mails of *user1*? I am a little bit confused here.

I don't have your original message, but IIRC you said you're calling
SpamAssassin from procmail.  This implies that you're doing so just
before the message is put into a mail folder (whether that's the inbox
for a user or elsewhere is determined by procmail).  On most mail
systems, this *also* means that mail processing is done one message at a
time, for one recipient at a time.

As I said in my first reply, if you want a single global Bayes database
you **MUST** at the very minimum put a bayes_path statement in one of
your local configuration files - local.cf is most commonly used.

When a message is processed by SA, with that bayes_path statement in
place, *ALL* Bayes activity is done on that global database.

> As I understand it, an individual user_prefs will supersede the
> values of the global parameter settings.

For certain settings, yes.  See the man page for
Mail::SpamAssassin::Conf for the details on which ones.

> Does this concept apply to the Bayes database also?

IIRC, no;  bayes* options are considered "privileged" settings.  Check
the man page on your installed SpamAssassin copy to be certain for your
usage.

> If yes, then whatever the default Bayes database has learned
> (spam + ham) for the default user *user1* will not work for the
> other users - is this so?

Assuming that bayes* options are not privileged, then yes, any user
could stick in a bayes_path statement and avoid the global database.

Otherwise, all users will refer to the global database.

> I simply want the default path, where I see SpamAssassin
> updating/working, to apply to all the users.

Please see the suggestions at the bottom of my first reply, and refer to
the man page to make sure you have the settings laid out correctly for
your installed version of SA.

Those settings have been working on one of my systems for several years
now.

-kgd
-- 
Get your mouse off of there!  You don't know where that email has been!

Re: Spamassassin with sa-learn

Posted by Tinni <t_...@yahoo.co.in>.
Hi 

Thanks for your reply.

> OK, looks good. SA puts preferences and AWL data and Bayes data files
> in ~/.spamassassin/ by default.


I am sorry if my question sounds a little bit funny, but I am new to this
and have some confusion.


My question is: when a mail comes to the server for, say, *user2* or many
others, will SpamAssassin check it for ham/spam against the *default*
database, which is by default set to *user1*? Or will it check only the
mails of *user1*? I am a little bit confused here.

As I understand it, an individual user_prefs will supersede the values of
the global parameter settings. Does this concept apply to the Bayes
database also? If yes, then whatever the default Bayes database has learned
(spam + ham) for the default user *user1* will not work for the other
users - is this so?

I simply want the default path, where I see SpamAssassin updating/working,
to apply to all the users.

Suggestions/advice are really appreciated.

Thanks again

-Tinni





Re: Spamassassin with sa-learn

Posted by Kris Deugau <kd...@vianet.ca>.
Please post messages in plaintext in the future.  Thanks.

Tinni wrote:
> I am a little bit confused about *sa-learn*.  I have installed SA 3.0.2 and
> did not set any Bayes path in local.cf.  When I check with
>
> # spamassassin --lint -D
>
> it shows these paths:

[snip]
> debug: bayes: 7103 tie-ing to DB file R/O
> /home/sites/www.domain.org/users/<user>/.spamassassin/bayes_toks
> debug: bayes: 7103 tie-ing to DB file R/O
> /home/sites/www.domain.org/users/<user>/.spamassassin/bayes_seen

OK, looks good.  SA puts preferences and AWL data and Bayes data files
in ~<user>/.spamassassin/ by default.

> I am running *sa-learn* as root, so is the path above the
> central Bayes database?

No, if you run sa-learn as root it will, like any other default SA call,
put Bayes data in ~<user>/.spamassassin/.  In particular, it will create
/root/.spamassassin/bayes_seen and /root/.spamassassin/bayes_toks.
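The effect described here can be modelled in a few lines. This is a simplified Python sketch of the path selection only, not SpamAssassin's own code. It assumes bayes_path acts as a filename prefix to which suffixes such as _toks and _seen are appended, and that the default prefix is .spamassassin/bayes under the home directory of whoever runs the command. The function name bayes_files and the /home/user2 directory are hypothetical.

```python
import posixpath

# Suffixes seen in the debug output in this thread; they are appended to
# the bayes_path prefix to form the actual database file names.
SUFFIXES = ("_toks", "_seen")


def bayes_files(home, bayes_path=None):
    """Model which Bayes files are used for a given home directory.

    With no bayes_path configured, assume the per-user default prefix
    <home>/.spamassassin/bayes. With bayes_path set, every caller gets
    the same files regardless of home directory.
    """
    if bayes_path:
        prefix = bayes_path
    else:
        prefix = posixpath.join(home, ".spamassassin", "bayes")
    return [prefix + suffix for suffix in SUFFIXES]


# sa-learn run as root with no bayes_path: trains root's private database.
print(bayes_files("/root"))
# Mail delivered to a (hypothetical) user: a different, per-user database.
print(bayes_files("/home/user2"))
# With a site-wide bayes_path, both resolve to the same files.
print(bayes_files("/root", bayes_path="/var/SpamAssassin/bayes"))
print(bayes_files("/home/user2", bayes_path="/var/SpamAssassin/bayes"))
```

Under these assumptions, training as root without a site-wide bayes_path never touches the database that is consulted when another user's mail is scanned.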

> I also see that the individual users' Bayes databases are being
> updated, although I do not allow ANYBODY to run *sa-learn*.
>  -  If no user is running *sa-learn*, how is every user's
>     Bayes database being updated? (I am going only by the
>     timestamps.)

This is due to SpamAssassin's autolearning capability;  by default
messages scoring under 0.1 or over 12 (IIRC, check the documentation)
will get autolearned as ham or spam respectively.  Each user's
autolearned Bayes data will be put in the appropriate files in
~user/.spamassassin/.
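As a rough illustration of that threshold rule, here is a minimal Python sketch. It takes the 0.1 and 12 figures quoted above as defaults and deliberately ignores any further conditions the real autolearner may apply. The function name autolearn_decision is invented for this example, and how a score exactly equal to a threshold is treated is an assumption.

```python
def autolearn_decision(score, ham_threshold=0.1, spam_threshold=12.0):
    """Return "ham", "spam", or None (not learned) for a message score.

    Simplified model: scores under ham_threshold are learned as ham and
    scores over spam_threshold are learned as spam. Anything in between
    is left alone. Behaviour exactly at a threshold is an assumption.
    """
    if score < ham_threshold:
        return "ham"
    if score > spam_threshold:
        return "spam"
    return None


for score in (-1.2, 0.05, 3.4, 15.7):
    print(score, autolearn_decision(score))

# A stricter ham threshold, such as -0.01, stops low positive scores
# from being learned as ham.
print(autolearn_decision(0.05, ham_threshold=-0.01))
```

Every scanned message that falls outside the thresholds updates the Bayes files of whichever account the scan ran as, which would explain changing timestamps even when nobody runs sa-learn.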

> I want mail to be filtered through the *central database* only.
>  -  Do I need to specify the path of the Bayes db in local.cf?

If you want a single, global Bayes database, you **MUST** set bayes_path
in your configuration.

For instance, on one of the systems I administer, I have the following
in my local.cf to set up SA's Bayes subsystem:

use_bayes                               1
bayes_auto_learn                        1
bayes_auto_learn_threshold_nonspam      -0.01
bayes_learn_to_journal                  1
bayes_expiry_max_db_size                1000000
bayes_auto_expire                       0
bayes_path                              /var/SpamAssassin/bayes
bayes_file_mode                         0777

I've explicitly set a number of options to their defaults as well, but
this provides me with a single, global Bayes database, accessible and
autolearn-able for all users, with a larger number of tokens than the
default.  Check the Mail::SpamAssassin::Conf manpage for details on what
these options do.  Note that some of them may have changed for 3.x; 
this is a working 2.64 install.
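A quick way to confirm that a local.cf actually requests a shared database is to scan it for the relevant options. The Python sketch below is a hedged illustration only: parse_conf is a deliberately naive "option value" reader, not SpamAssassin's real configuration parser, and check_global_bayes is a hypothetical helper that looks at just use_bayes and bayes_path.

```python
def parse_conf(text):
    """Parse "option value" lines, ignoring blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1] if len(parts) > 1 else ""
    return settings


def check_global_bayes(settings):
    """Return a list of problems that would prevent a shared Bayes db."""
    problems = []
    # An unset use_bayes is treated as enabled in this sketch.
    if settings.get("use_bayes", "1") != "1":
        problems.append("use_bayes is disabled")
    if "bayes_path" not in settings:
        problems.append("bayes_path is not set, so each user gets a private db")
    return problems


# A shortened version of the settings listed above.
local_cf = """\
use_bayes 1
bayes_auto_learn 1
bayes_path /var/SpamAssassin/bayes
bayes_file_mode 0777
"""

print(check_global_bayes(parse_conf(local_cf)))
print(check_global_bayes(parse_conf("use_bayes 1\n")))
```

An empty list means the two options checked here are consistent with a single site-wide database. Whether every user can actually read and write the files still depends on bayes_file_mode and the directory permissions.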

-kgd
-- 
Get your mouse off of there!  You don't know where that email has been!