Posted to users@spamassassin.apache.org by Wolfgang Jeltsch <wo...@jeltsch.net> on 2006/08/08 22:47:30 UTC

problems, problems

Hello,

I was kind of shocked when I discovered that there is no SpamAssassin manual 
or tutorial.  For me, it's unimaginable that the world's leading open source 
spam detection software is missing such an important piece of documentation.

The wiki pages are more bits and pieces than coherent documentation and 
often don't explain things in principle but give you finished configuration 
files for procmail & Co.  But what if I don't use procmail?  (I use Courier 
maildrop.)

At the moment, I run spamassassin with no arguments as an ordinary user on 
every message I receive and decide what to do with the message according to 
the X-Spam-Flag: header line.  But I have some problems with this.

First, SpamAssassin seems to do autolearning.  What does this mean?  Does it 
learn that messages which it already considers spam are spam, and messages 
which it already considers ham are ham?  Wouldn't this mean that SpamAssassin 
is just doing self-affirmation?

Second, I often have a message of the following form in my mail log:

	courierlocal: […] Cannot open bayes databases
	/home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists

What's the problem here, and how can I get rid of it?

I'm using SpamAssassin 3.0.3 on Debian GNU/Linux 3.1.

Thanks for your help.

Best wishes,
Wolfgang

Re: problems, problems

Posted by Wolfgang Jeltsch <wo...@jeltsch.net>.
On Tuesday, 8 August 2006 23:54, Logan Shaw wrote:
> On Tue, 8 Aug 2006, Wolfgang Jeltsch wrote:
> [...]

> > Second, I often have a message of the following form in my mail log:
> >
> > 	courierlocal: [...] Cannot open bayes databases
> > 	/home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists
> >
> > What's the problem here, and how can I get rid of it?
>
> Without any more information than that, I would say that
> something is either still using the Bayes database in your
> home directory or it is finished but the lock file hasn't
> been removed.  I haven't tried using SpamAssassin with Courier
> anything, so I'm not really familiar with how it's normally
> invoked.

What I currently do is pipe each message through spamassassin as an 
ordinary user before the mail is delivered.  How do you normally invoke 
SpamAssassin in conjunction with mail software other than the Courier tools?
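
The setup described above (pipe the message through spamassassin, then act on the X-Spam-Flag: header) can be sketched in plain shell, independent of any particular delivery agent. This is only an illustration: the function name is_flagged_spam and the file name message.txt are made up, and the sketch assumes spamassassin reads a message on standard input and writes the tagged message to standard output.

```shell
# is_flagged_spam: read one message on stdin and succeed (exit 0) if its
# header block contains "X-Spam-Flag: YES". Only the lines up to the first
# empty line are inspected, so a body that merely quotes the header does
# not count. The function name is hypothetical.
is_flagged_spam() {
    sed '/^$/q' | grep -qi '^X-Spam-Flag:[[:space:]]*YES'
}

# Possible usage (message.txt is a placeholder file name):
#   spamassassin < message.txt | is_flagged_spam && echo "flagged as spam"
```

A delivery agent's own filter language can make the same header test; the shell version only shows that the decision needs nothing beyond the header.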

>    - Logan

Best wishes,
Wolfgang

Re: problems, problems

Posted by Logan Shaw <ls...@emitinc.com>.
On Tue, 8 Aug 2006, Wolfgang Jeltsch wrote:
> I was kind of shocked when I discovered that there is no SpamAssassin manual
> or tutorial.  For me, it's unimaginable that the world's leading open source
> spam detection software is missing such an important piece of documentation.

Well, it's not entirely true that there isn't a manual.
The various components do have manuals.  Here are the most
commonly useful ones:

     perldoc Mail::SpamAssassin
     perldoc Mail::SpamAssassin::Conf
     man spamassassin
     man sa-learn
     man sa-update

And some other ones:

     perldoc Mail::SpamAssassin::Plugin
     perldoc Mail::SpamAssassin::Bayes
     perldoc Mail::SpamAssassin::BayesStore
     perldoc Mail::SpamAssassin::Plugin::Hashcash

Not all the modules that should have documentation do have
documentation (for instance, Mail::SpamAssassin::BayesStore::DBM
doesn't have any), but there is at least some information.

You can root around in the Mail/SpamAssassin directory (should
be somewhere inside your site_perl directory) to find more
modules that might have documentation.  There may be a more
elegant way, but this is one way of seeing a list of modules
which have documentation:

     cd ...../site_perl/...../Mail/SpamAssassin
     find . -name '*.pm' -print | xargs grep -l '^=head'
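
The same search can be wrapped in a small function so that the directory is passed in rather than typed by hand. This is a sketch only: the name list_documented_modules is made up, and the perl one-liner in the comment assumes Mail::SpamAssassin is installed where perl can load it.

```shell
# list_documented_modules DIR
# Print every .pm file below DIR that contains at least one POD heading
# (a line starting with "=head"). The function name is hypothetical.
list_documented_modules() {
    find "$1" -name '*.pm' -exec grep -l '^=head' {} +
}

# Possible usage, assuming Mail::SpamAssassin is installed and loadable:
#   sa_pm=$(perl -MMail::SpamAssassin -e 'print $INC{"Mail/SpamAssassin.pm"}')
#   list_documented_modules "${sa_pm%.pm}"
```

Asking perl where it loaded the module from avoids guessing the site_perl path.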

> The wiki pages are more bits and pieces than coherent documentation and
> often don't explain things in principle but give you finished configuration
> files for procmail & Co.  But what if I don't use procmail?

Well, SpamAssassin doesn't deliver mail, so this question,
which is about delivery methods, isn't really relevant.

> First, SpamAssassin seems to do autolearning.  What does this mean?  Does it
> learn that messages which it already considers spam are spam, and messages
> which it already considers ham are ham?  Wouldn't this mean that SpamAssassin
> is just doing self-affirmation?

The Bayes database needs to be fed training data in order to
be effective.  It needs to see many (preferably hundreds and
hundreds of) known spam and known ham messages.  sa-learn is the
command that is used to do this manually.  Autolearning means
doing the same thing as sa-learn, but automatically.

Basically, the other rules work well enough that they can identify
obvious spam and ham.  Those messages can be used to train the
Bayes database.

> Second, I often have a message of the following form in my mail log:
>
> 	courierlocal: […] Cannot open bayes databases
> 	/home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists
> 
> What's the problem here, and how can I get rid of it?

Without any more information than that, I would say that
something is either still using the Bayes database in your
home directory or it is finished but the lock file hasn't
been removed.  I haven't tried using SpamAssassin with Courier
anything, so I'm not really familiar with how it's normally
invoked.
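
To tell those two cases apart, it may help to look for leftover lock files and for a SpamAssassin process that is still running. The sketch below is an assumption-laden illustration: the function name list_bayes_locks is made up, and the 'bayes*lock*' pattern is a guess at how the lock files are named, so check the directory listing before relying on it.

```shell
# list_bayes_locks DIR
# Print lock-like files that sit next to the Bayes databases in DIR.
# ASSUMPTION: lock files match 'bayes*lock*'; the function name is
# hypothetical.
list_bayes_locks() {
    find "$1" -maxdepth 1 -type f -name 'bayes*lock*' -print
}

# Possible usage:
#   list_bayes_locks "$HOME/.spamassassin"
#   ps -u "$USER" -o pid,etime,args | grep '[s]pamassassin'
```

If a lock file is listed while no SpamAssassin process shows up in the ps output, the lock is probably stale; if a process is still running, the database is simply in use.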

   - Logan

Re: problems, problems

Posted by jdow <jd...@earthlink.net>.
man spamassassin is the key to the whole thing beyond the INSTALL files.

Then you have things like "man Mail::SpamAssassin" and its kith and kin
like "man Mail::SpamAssassin::Conf". These will generally be more up to
date than any other documentation file that exists. And of course the
man spamassassin output points to some of the others that are important.

{^_^}

RE: problems, problems

Posted by Gary V <mr...@hotmail.com>.
>Hello,
>
>I was kind of shocked when I discovered that there is no SpamAssassin manual
>or tutorial.  For me, it's unimaginable that the world's leading open source
>spam detection software is missing such an important piece of documentation.

http://spamassassin.apache.org/doc.html

There are a large number of ways SpamAssassin can be incorporated into 
someone's system. Besides what is provided on the SpamAssassin site and the 
documentation provided with SpamAssassin itself, there are many HOWTOs out 
there that deal with particular setups. Google is your friend.

>
>The wiki pages are more bits and pieces than coherent documentation and
>often don't explain things in principle but give you finished configuration
>files for procmail & Co.  But what if I don't use procmail?  (I use Courier
>maildrop.)
>
>At the moment, I run spamassassin with no arguments as an ordinary user on
>every message I receive and decide what to do with the message according to
>the X-Spam-Flag: header line.  But I have some problems with this.
>
>First, SpamAssassin seems to do autolearning.  What does this mean?  Does it
>learn that messages which it already considers spam are spam, and messages
>which it already considers ham are ham?  Wouldn't this mean that SpamAssassin
>is just doing self-affirmation?
>
>

Bayes builds a database of the tokens in obvious spam and in obvious ham. 
When a message is received, its tokens are compared to the database to help 
push the score one way or the other (or not). It's not self-affirmation 
because Bayes itself does not influence whether something is autolearned or 
not. The Bayes score tweak happens afterwards. It's more akin to learning 
from experience.
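
The decision can be pictured as a simple threshold test on the score that the non-Bayes rules produced. The sketch below is a simplification, not SpamAssassin's actual code: the function name autolearn_decision is made up, the default values are meant to mirror the documented 3.0.x defaults of bayes_auto_learn_threshold_nonspam (0.1) and bayes_auto_learn_threshold_spam (12.0), and the real check applies further conditions that are not modelled here.

```shell
# autolearn_decision SCORE [HAM_THRESHOLD [SPAM_THRESHOLD]]
# Print "spam", "ham", or "no" for a score computed WITHOUT the Bayes rules.
# Hypothetical helper; thresholds default to 0.1 and 12.0 (assumed to match
# the 3.0.x defaults). Simplified: only the two thresholds are checked.
autolearn_decision() {
    awk -v score="$1" -v ham_max="${2:-0.1}" -v spam_min="${3:-12.0}" 'BEGIN {
        if (score + 0 >= spam_min + 0) { print "spam" }
        else if (score + 0 < ham_max + 0) { print "ham" }
        else { print "no" }
    }'
}

# Possible usage:
#   autolearn_decision 15.2    # very high score: learn as spam
#   autolearn_decision 5.0     # in between: do not learn at all
```

Messages that land between the two thresholds are not learned from, which is why only the obvious cases feed the database.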

>Second, I often have a message of the following form in my mail log:
>
>	courierlocal: […] Cannot open bayes databases
>	/home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists
>
>What's the problem here, and how can I get rid of it?

I would first try setting
lock_method flock
in local.cf

and if that does not help, try

bayes_learn_to_journal 1

http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_Conf.html#learning_options
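
With bayes_learn_to_journal enabled, learned data is first written to a journal and only later merged into the Bayes database, so fewer processes compete for the database lock. The merge can also be triggered by hand. The helper below is a sketch: the name sync_bayes_journal is made up, and it assumes the installed sa-learn accepts --sync, which is worth confirming in "man sa-learn" for your version.

```shell
# sync_bayes_journal: merge pending journal entries into the Bayes database.
# Hypothetical helper; it only wraps "sa-learn --sync" and reports a clear
# error when sa-learn is not on PATH.
sync_bayes_journal() {
    if ! command -v sa-learn >/dev/null 2>&1; then
        echo "sa-learn not found on PATH" >&2
        return 127
    fi
    sa-learn --sync
}

# Possible usage, run as the user who owns ~/.spamassassin:
#   sync_bayes_journal
```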

Better yet, move Bayes to MySQL. This HOWTO is geared towards amavisd-new, 
but it could be used for any other user and would be good for site-wide use; 
simply substitute the user name:

http://www200.pair.com/mecham/spam/debian-spamassassin-sql.html

>
>I'm using SpamAssassin 3.0.3 on Debian GNU/Linux 3.1.
>
>Thanks for your help.
>
>Best wishes,
>Wolfgang

Gary V
