Posted to users@spamassassin.apache.org by Nick Bright <ni...@terraworld.net> on 2004/08/27 18:05:56 UTC

SA3.0-rc1: Bayes, AWL, and User-Prefs in Postgres SQL

Greetings all,

 I am trying to set up SA3 with Bayes, User-Prefs, and AWL all stored in
a postgres database. I'm using a RedHat EL3 server for this application,
with the version of postgres that comes with RHEL3. I set up all the SQL
tables and such with the *.sql scripts in the SA3 distrib file, and SA's
local.cf according to the README's therein.


 When I send a message through the scanner, the following debug level
output is observed:

Aug 27 10:35:59 sanford spamd[19782]: logmsg: connection from prv1004132
[10.0.4.132] at port 32979
Aug 27 10:35:59 sanford spamd[19782]: connection from prv1004132
[10.0.4.132] at port 32979
Aug 27 10:35:59 sanford spamd[19782]: debug: Conf::SQL: executing SQL:
select preference, value  from userpref where username = 'nickb' or
username = '@GLOBAL' order by username asc
Aug 27 10:35:59 sanford spamd[19782]: debug: retrieving prefs for nickb
from SQL server
Aug 27 10:35:59 sanford spamd[19782]: debug: user has changed
Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Using username:
nickb
Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Database connection
established
Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: found bayes db
version 3
Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Using userid: 4
Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Not available for
scanning, only 0 spam(s) in Bayes DB < 200
Aug 27 10:35:59 sanford spamd[19782]: debug: Score set 0 chosen.
Aug 27 10:35:59 sanford spamd[19782]: logmsg: processing message
<10...@excite.com> for nickb:89.
Aug 27 10:35:59 sanford spamd[19782]: processing message
<10...@excite.com> for nickb:89.
Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Database connection
established
Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: found bayes db
version 3
Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Using userid: 4
Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Not available for
scanning, only 0 spam(s) in Bayes DB < 200
Aug 27 10:35:59 sanford spamd[19782]: debug: received-header: parsed as
[ ip=200.30.148.30 rdns= helo=terraworld.net by=mail.terraworld.net
ident= envfrom= intl=0 id= ]
Aug 27 10:35:59 sanford spamd[19782]: debug: received-header: cannot use
DNS, do not trust any hosts from here on
Aug 27 10:35:59 sanford spamd[19782]: debug: received-header: relay
200.30.148.30 trusted? no internal? no
Aug 27 10:35:59 sanford spamd[19782]: debug: metadata:
X-Spam-Relays-Trusted:
Aug 27 10:35:59 sanford spamd[19782]: debug: metadata:
X-Spam-Relays-Untrusted: [ ip=200.30.148.30 rdns= helo=terraworld.net
by=mail.terraworld.net ident= envfrom= intl=0 id= ]
Aug 27 10:35:59 sanford spamd[19782]: debug: ---- MIME PARSER START ----
Aug 27 10:35:59 sanford spamd[19782]: debug: main message type:
text/plain
Aug 27 10:35:59 sanford spamd[19782]: debug: parsing normal part
Aug 27 10:35:59 sanford spamd[19782]: debug: added part, type:
text/plain
Aug 27 10:35:59 sanford spamd[19782]: debug: ---- MIME PARSER END ----
Aug 27 10:35:59 sanford spamd[19782]: debug: decoding: other encoding
type (8bit), ignoring
Aug 27 10:35:59 sanford spamd[19782]: debug: uri found:
http://www.cutpricerxpills.com/_85924943b9db73ac62baa654773c6a8e/4
Aug 27 10:35:59 sanford spamd[19782]: debug: Running tests for priority:
0
Aug 27 10:35:59 sanford spamd[19782]: debug: running header regexp
tests; score so far=0
Aug 27 10:35:59 sanford spamd[19782]: debug: all '*From' addrs:
pookie121gambit@hotmail.com evan5mndy13@hotmail.com
Aug 27 10:35:59 sanford spamd[19782]: debug: all '*To' addrs:
buddha@terraworld.net terraworld.net-bugs@terraworld.net
Aug 27 10:35:59 sanford spamd[19782]: debug: forged-HELO: from=
helo=terraworld.net by=terraworld.net
Aug 27 10:35:59 sanford spamd[19782]: debug: running body-text per-line
regexp tests; score so far=1.352
Aug 27 10:35:59 sanford spamd[19782]: debug: running uri tests; score so
far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running raw-body-text
per-line regexp tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running full-text regexp
tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: Running tests for priority:
500
Aug 27 10:35:59 sanford spamd[19782]: debug: running meta tests; score
so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running header regexp
tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running body-text per-line
regexp tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running uri tests; score so
far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running raw-body-text
per-line regexp tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running full-text regexp
tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: Running tests for priority:
1000
Aug 27 10:35:59 sanford spamd[19782]: debug: running meta tests; score
so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running header regexp
tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: using
"/home/vpopmail/.spamassassin" for user state dir
Aug 27 10:35:59 sanford spamd[19782]: debug: mkdir
/home/vpopmail/.spamassassin failed: mkdir /home/vpopmail: Permission
denied at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin.pm line 1438_
Aug 27 10:35:59 sanford spamd[19782]: debug: open of AWL file failed:
lock: 19782 cannot create tmp lockfile
/home/vpopmail/.spamassassin/auto-whitelist.lock.sanford.19782 for
/home/vpopmail/.spamassassin/auto-whitelist.lock: No such file or
directory
Aug 27 10:35:59 sanford spamd[19782]: debug: Post AWL score: 2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running body-text per-line
regexp tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running uri tests; score so
far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running raw-body-text
per-line regexp tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: running full-text regexp
tests; score so far=2.281
Aug 27 10:35:59 sanford spamd[19782]: debug: auto-learn? ham=0.1,
spam=12, body-points=0.929, head-points=1.352
Aug 27 10:35:59 sanford spamd[19782]: debug: auto-learn: currently using
scoreset 0.  no need to recompute.
Aug 27 10:35:59 sanford spamd[19782]: debug: auto-learn? no: inside
auto-learn thresholds
Aug 27 10:35:59 sanford spamd[19782]: debug: is spam? score=2.281
required=5
Aug 27 10:35:59 sanford spamd[19782]: debug:
tests=FORGED_HOTMAIL_RCVD2,SAVE_THOUSANDS,SUBJ_BUY
Aug 27 10:35:59 sanford spamd[19782]: debug:
subtests=__CT,__CTE,__CT_TEXT_PLAIN,__HAS_MSGID,__HAS_SUBJECT,__MIME_VERSION,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__MSGID_RANDY,__SANE_MSGID
Aug 27 10:36:00 sanford spamd[19782]: logmsg: clean message (2.3/5.0)
for nickb:89 in 0.5 seconds, 1206 bytes.
Aug 27 10:36:00 sanford spamd[19782]: clean message (2.3/5.0) for
nickb:89 in 0.5 seconds, 1206 bytes.
Aug 27 10:36:00 sanford spamd[19782]: logmsg: result: .  2 -
FORGED_HOTMAIL_RCVD2,SAVE_THOUSANDS,SUBJ_BUY
scantime=0.5,size=1206,mid=<10...@excite.com>,autolearn=no
Aug 27 10:36:00 sanford spamd[19782]: result: .  2 -
FORGED_HOTMAIL_RCVD2,SAVE_THOUSANDS,SUBJ_BUY
scantime=0.5,size=1206,mid=<10...@excite.com>,autolearn=no
 

Starting from the top, I've interpreted this output to mean:

1) SQL User-Prefs are being queried, but I don't know how to put
specific prefs in so I can't really test it (advice, please?).

2) SQL Bayes is being queried, but not correctly. The log states "bayes:
Not available for scanning, only 0 spam(s) in Bayes DB < 200" but I've
used sa-learn to train in about 500 spams and about 500 hams, it should
be trained well enough to at least try working (I do see this
information in the database tables for bayes).

3) AWL isn't using the SQL DB: "debug: open of AWL file failed: lock:
19782 cannot create tmp lockfile
/home/vpopmail/.spamassassin/auto-whitelist.lock.sanford.19782 for
/home/vpopmail/.spamassassin/auto-whitelist.lock: No such file or
directory", however it should be using SQL for this. There are zero rows
on the awl table in postgres.

In the process of troubleshooting #3, I created "/home/vpopmail". After
doing so, the log output in that section changed to this, which
reinforces "AWL isn't looking in SQL":

Aug 27 10:55:01 sanford spamd[19860]: debug: using
"/home/vpopmail/.spamassassin" for user state dir
Aug 27 10:55:01 sanford spamd[19860]: debug: lock: 19860 created
/home/vpopmail/.spamassassin/auto-whitelist.lock.sanford.19860
Aug 27 10:55:01 sanford spamd[19860]: debug: lock: 19860 trying to get
lock on /home/vpopmail/.spamassassin/auto-whitelist with 0 retries
Aug 27 10:55:01 sanford spamd[19860]: debug: lock: 19860 link to
/home/vpopmail/.spamassassin/auto-whitelist.lock: link ok
Aug 27 10:55:01 sanford spamd[19860]: debug: Tie-ing to DB file R/W in
/home/vpopmail/.spamassassin/auto-whitelist
Aug 27 10:55:01 sanford spamd[19860]: debug: auto-whitelist (db-based):
pookie121gambit@hotmail.com|ip=200.30 scores 0/0
Aug 27 10:55:01 sanford spamd[19860]: debug: auto-whitelist (db-based):
pookie121gambit@hotmail.com|ip=none scores 0/0
Aug 27 10:55:01 sanford spamd[19860]: debug: AWL active, pre-score:
2.281, autolearn score: 2.281, mean: undef, IP: 200.30.148.30
Aug 27 10:55:01 sanford spamd[19860]: debug: add_score: New count: 1,
new totscore: 2.281
Aug 27 10:55:01 sanford spamd[19860]: debug: DB addr list: untie-ing and
unlocking.
Aug 27 10:55:01 sanford spamd[19860]: debug: DB addr list: file locked,
breaking lock.
Aug 27 10:55:01 sanford spamd[19860]: debug: unlock: 19860 unlink
/home/vpopmail/.spamassassin/auto-whitelist.lock
Aug 27 10:55:02 sanford spamd[19860]: debug: Post AWL score: 2.281
--

For reference, my local.cf is:

user_scores_dsn                 DBI:Pg:dbname=spamassassin
user_scores_sql_username        vpopmail
user_scores_sql_password        vpop
user_awl_dsn                    DBI:Pg:dbname=spamassassin
user_awl_sql_username           vpopmail
user_awl_sql_password           vpop
bayes_store_module              Mail::SpamAssassin::BayesStore::SQL
bayes_sql_dsn                   DBI:Pg:dbname=spamassassin
bayes_sql_username              vpopmail
bayes_sql_password              vpop
required_hits 5.0
rewrite_subject 1
report_safe 0
use_terse_report 1
use_bayes               1
auto_learn              1
use_auto_whitelist      1
use_razor2              0
use_dcc                 0
use_pyzor               0
ok_languages all
ok_locales all

Note the lack of user_awl_table and user_scores_table, the default
values are being used (in fact, I had to remove it to make user_scores
work).


What I'd like to accomplish is to have AWL work through the database
properly, and get bayes to realize that there *is* stuff in the
database. sa-learn put it there, so spamd should be able to read it!
Also, what is the format I should use for user prefs in the database? Is
there a php application known that will put these prefs into postgres? I
googled around and didn't find one.
-- 
- Nick Bright
  Terraworld, Inc
  http://home.terraworld.net | 888-332-1616


Re: SA3.0-rc1: Bayes, AWL, and User-Prefs in Postgres SQL

Posted by Michael Parker <pa...@pobox.com>.
On Fri, Aug 27, 2004 at 01:30:31PM -0500, Nick Bright wrote:
> On Fri, 2004-08-27 at 13:06, Michael Parker wrote:
> > /me goes off and starts the Bayes SQL wiki page with all of these
> > useful tips
> > 
> 
> I wondered why there wasn't one already. I figured all of these were
> probably very common questions, and I did try to find this information
> on the Wiki rather unsuccessfully.
> 

Time and energy....

> 
> Thanks for the help getting my other issues resolved, I appreciate it
> very much.
> 

<shameless plug>
Come to ApacheCon, come to one of the several SpamAssassin talks,
including one on SQL.
</shameless plug>

Michael

Re: SA3.0-rc1: Bayes, AWL, and User-Prefs in Postgres SQL

Posted by Nick Bright <ni...@terraworld.net>.
On Fri, 2004-08-27 at 13:06, Michael Parker wrote:
> > On Fri, 2004-08-27 at 11:23, Michael Parker wrote:
> > > On Fri, Aug 27, 2004 at 11:05:56AM -0500, Nick Bright wrote:
> > > > 
> > <log trimmed for brevity>
> 
> Yeah, oops, I kept it around in case I needed to look at it again and
> forgot to remove it before sending, sorry all for the wasted
> bandwidth.
> 
> > > 
> > > Are you sure you ran sa-learn as the same user who is receiving the
> > > mail?  What user does userid 4 correspond to in your bayes_vars table?
> > > 
> > I ran sa-learn as root. . . userid 4 corresponds to the user I'm
> > submitting the test mail as, I see some other things in the table too.
> > In specific, I see root with the spam_count and ham_count I expected to
> > see.
> > 
> > I assume, then, that I am doing something wrong WRT bayes. I want all
> > bayes information to be global, for all users. Can this be accomplished?
> > I can't just have it always use the same username to run under, as I am
> > also doing user_prefs. . .
> > 
> 
> You want to use bayes_sql_override_username for global bayes, you can
> read more about it in sql/README.bayes and the Conf documentation.
> Bayes SQL will store the data as whatever user you ran sa-learn as,
> unless you have bayes_sql_override_username set.
> 
> /me goes off and starts the Bayes SQL wiki page with all of these
> useful tips
> 

I wondered why there wasn't one already. I figured all of these were
probably very common questions, and I did try to find this information
on the Wiki rather unsuccessfully.

> 
> > > Did you set:
> > > auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
> > > ?
> > 
> > No as a matter of fact, I did not. I put that in my local.cf, and AWL in
> > SQL is now working. Perhaps that should be in the sql/README.awl file?
> > 
> 
> You mean this part right at the top?
> 
> "In order to activate the SQL based auto-whitelist you have to
> configure spamassassin and spamd to use a different whitelist factory.
> This is  done with the auto_whitelist_factory config variable, like
> so:
> 
> auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
> "

Yes, apparently I'm blind :)

> 
> > 
> > I noticed that if you go up one level, there is a much more recent
> > version of the php-sa-mysql available, but I don't see one for pgsql
> > still.
> > 
> > However, now that I know how to put the prefs into the DB, I should be
> > able to write my own php-sa-pgsql application, or beg someone I know to
> > do it for me :)
> > 
> 
> I guess I assumed it would be trivial to adapt for postgres.  I've
> been contemplating writing a cgi application to handle
> user_prefs/bayes/awl SQL data that would hopefully eventually make
> its way into the distribution (i.e. officially supported); maybe I'll
> work it up and present it at ApacheCon.
> 

I hadn't looked at the source yet, but I'm not much of a programmer
anyways. I can make do on this front.

Thanks for the help getting my other issues resolved, I appreciate it
very much.

> Michael
-- 
- Nick Bright
  Terraworld, Inc
  http://home.terraworld.net | 888-332-1616


Re: SA3.0-rc1: Bayes, AWL, and User-Prefs in Postgres SQL

Posted by Michael Parker <pa...@pobox.com>.
> On Fri, 2004-08-27 at 11:23, Michael Parker wrote:
> > On Fri, Aug 27, 2004 at 11:05:56AM -0500, Nick Bright wrote:
> > > 
> <log trimmed for brevity>

Yeah, oops, I kept it around in case I needed to look at it again and
forgot to remove it before sending, sorry all for the wasted
bandwidth.

> > 
> > Are you sure you ran sa-learn as the same user who is receiving the
> > mail?  What user does userid 4 correspond to in your bayes_vars table?
> > 
> I ran sa-learn as root. . . userid 4 corresponds to the user I'm
> submitting the test mail as, I see some other things in the table too.
> In specific, I see root with the spam_count and ham_count I expected to
> see.
> 
> I assume, then, that I am doing something wrong WRT bayes. I want all
> bayes information to be global, for all users. Can this be accomplished?
> I can't just have it always use the same username to run under, as I am
> also doing user_prefs. . .
> 

You want to use bayes_sql_override_username for global bayes, you can
read more about it in sql/README.bayes and the Conf documentation.
Bayes SQL will store the data as whatever user you ran sa-learn as,
unless you have bayes_sql_override_username set.
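
The per-user lookup described above can be sketched roughly as follows.
This is only an illustration, not SpamAssassin's actual code: sqlite3
stands in for Postgres so it runs anywhere, the bayes_vars schema is
trimmed to the columns mentioned in this thread, bayes_available() is a
hypothetical helper, and only the spam threshold from the debug log
(200) is checked.

```python
import sqlite3

# Assumption: sqlite3 stands in for Postgres, and this bayes_vars schema
# is trimmed to the columns discussed in the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bayes_vars ("
    "id INTEGER PRIMARY KEY, username TEXT UNIQUE, "
    "spam_count INTEGER DEFAULT 0, ham_count INTEGER DEFAULT 0)"
)
# sa-learn was run as root, so the training counts landed on root's row.
conn.execute(
    "INSERT INTO bayes_vars (username, spam_count, ham_count) "
    "VALUES ('root', 500, 500)"
)
# spamd scanned as nickb, which has its own, still empty, row.
conn.execute("INSERT INTO bayes_vars (username) VALUES ('nickb')")

MIN_SPAM = 200  # threshold quoted in the debug log


def bayes_available(conn, scan_user, override_username=None):
    """Hypothetical helper: pick the effective Bayes user, check its spam count."""
    # With an override set, every scan uses the same row regardless of user.
    effective = override_username or scan_user
    row = conn.execute(
        "SELECT spam_count FROM bayes_vars WHERE username = ?", (effective,)
    ).fetchone()
    spam_count = row[0] if row else 0
    return spam_count >= MIN_SPAM


print(bayes_available(conn, "nickb"))                            # per-user row
print(bayes_available(conn, "nickb", override_username="root"))  # shared row
```

In this model the first call finds nickb's empty row and the second finds
the trained one, which is the difference the override is meant to make.
The real option and its exact behavior are documented in
sql/README.bayes.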

/me goes off and starts the Bayes SQL wiki page with all of these
useful tips


> > Did you set:
> > auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
> > ?
> 
> No as a matter of fact, I did not. I put that in my local.cf, and AWL in
> SQL is now working. Perhaps that should be in the sql/README.awl file?
> 

You mean this part right at the top?

"In order to activate the SQL based auto-whitelist you have to
configure spamassassin and spamd to use a different whitelist factory.
This is  done with the auto_whitelist_factory config variable, like
so:

auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
"

> 
> I noticed that if you go up one level, there is a much more recent
> version of the php-sa-mysql available, but I don't see one for pgsql
> still.
> 
> However, now that I know how to put the prefs into the DB, I should be
> able to write my own php-sa-pgsql application, or beg someone I know to
> do it for me :)
> 

I guess I assumed it would be trivial to adapt for postgres.  I've
been contemplating writing a cgi application to handle
user_prefs/bayes/awl SQL data that would hopefully eventually make
its way into the distribution (i.e. officially supported); maybe I'll
work it up and present it at ApacheCon.

Michael

Re: SA3.0-rc1: Bayes, AWL, and User-Prefs in Postgres SQL

Posted by Nick Bright <ni...@terraworld.net>.
On Fri, 2004-08-27 at 11:23, Michael Parker wrote:
> On Fri, Aug 27, 2004 at 11:05:56AM -0500, Nick Bright wrote:
> > Greetings all,
> > 
> >  I am trying to set up SA3 with Bayes, User-Prefs, and AWL all stored in
> > a postgres database. I'm using a RedHat EL3 server for this application,
> > with the version of postgres that comes with RHEL3. I set up all the SQL
> > tables and such with the *.sql scripts in the SA3 distrib file, and SA's
> > local.cf according to the README's therein.
> > 
> > 
> >  When I send a message through the scanner, the following debug level
> > output is observed:
> > 
<log trimmed for brevity>
> >  
> > 
> > Starting from the top, I've interpreted this output to mean:
> > 
> > 1) SQL User-Prefs are being queried, but I don't know how to put
> > specific prefs in so I can't really test it (advice, please?).
> > 
> 
> insert into userpref (username,preference,value) values ('yourusername','required_score','15');
> etc etc etc
> 
Thanks, that should do nicely. I just tested it and it showed what it
should have, so my userprefs are certainly working.

> > 2) SQL Bayes is being queried, but not correctly. The log states "bayes:
> > Not available for scanning, only 0 spam(s) in Bayes DB < 200" but I've
> > used sa-learn to train in about 500 spams and about 500 hams, it should
> > be trained well enough to at least try working (I do see this
> > information in the database tables for bayes).
> > 
> 
> Are you sure you ran sa-learn as the same user who is receiving the
> mail?  What user does userid 4 correspond to in your bayes_vars table?
> 
I ran sa-learn as root. . . userid 4 corresponds to the user I'm
submitting the test mail as, I see some other things in the table too.
In specific, I see root with the spam_count and ham_count I expected to
see.

I assume, then, that I am doing something wrong WRT bayes. I want all
bayes information to be global, for all users. Can this be accomplished?
I can't just have it always use the same username to run under, as I am
also doing user_prefs. . .

> > 3) AWL isn't using the SQL DB: "debug: open of AWL file failed: lock:
> > 19782 cannot create tmp lockfile
> > /home/vpopmail/.spamassassin/auto-whitelist.lock.sanford.19782 for
> > /home/vpopmail/.spamassassin/auto-whitelist.lock: No such file or
> > directory", however it should be using SQL for this. There are zero rows
> > on the awl table in postgres.
> 
> Did you set:
> auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
> ?

No as a matter of fact, I did not. I put that in my local.cf, and AWL in
SQL is now working. Perhaps that should be in the sql/README.awl file?

> > 
> > Note the lack of user_awl_table and user_scores_table, the default
> > values are being used (in fact, I had to remove it to make user_scores
> > work).
> > 
> 
> Odd, user_scores_table was removed, but it shouldn't have broken
> anything and user_awl_table should not interfere, can you confirm that
> this breaks things and submit a bug
> (http://bugzilla.spamassassin.org/) if it does?

I put the line back in, and the user pref still queried, I must have
been doing something else wrong that caused the problem. 

> 
> > 
> > What I'd like to accomplish is to have AWL work through the database
> > properly, and get bayes to realize that there *is* stuff in the
> > database. sa-learn put it there, so spamd should be able to read it!
> > Also, what is the format I should use for user prefs in the database? Is
> > there a php application known that will put these prefs into postgres? I
> > googled around and didn't find one.
> 
> Does:
> http://www.peregrinehw.com/downloads/SpamAssassin/old/
> work for you?

I found that URI in my searching, but I saw only
"php-sa-mysql-<version>.tar.gz", so I assumed that it would only work
with MySQL and not PostgreSQL. The newest file is also more than a year
old, so I didn't think that it would work well and decided not to even
try it.

I noticed that if you go up one level, there is a much more recent
version of the php-sa-mysql available, but I don't see one for pgsql
still.

However, now that I know how to put the prefs into the DB, I should be
able to write my own php-sa-pgsql application, or beg someone I know to
do it for me :)

> 
> Michael
-- 
- Nick Bright
  Terraworld, Inc
  http://home.terraworld.net | 888-332-1616


Re: SA3.0-rc1: Bayes, AWL, and User-Prefs in Postgres SQL

Posted by Michael Parker <pa...@pobox.com>.
On Fri, Aug 27, 2004 at 11:05:56AM -0500, Nick Bright wrote:
> Greetings all,
> 
>  I am trying to set up SA3 with Bayes, User-Prefs, and AWL all stored in
> a postgres database. I'm using a RedHat EL3 server for this application,
> with the version of postgres that comes with RHEL3. I set up all the SQL
> tables and such with the *.sql scripts in the SA3 distrib file, and SA's
> local.cf according to the README's therein.
> 
> 
>  When I send a message through the scanner, the following debug level
> output is observed:
> 
> Aug 27 10:35:59 sanford spamd[19782]: logmsg: connection from prv1004132
> [10.0.4.132] at port 32979
> Aug 27 10:35:59 sanford spamd[19782]: connection from prv1004132
> [10.0.4.132] at port 32979
> Aug 27 10:35:59 sanford spamd[19782]: debug: Conf::SQL: executing SQL:
> select preference, value  from userpref where username = 'nickb' or
> username = '@GLOBAL' order by username asc
> Aug 27 10:35:59 sanford spamd[19782]: debug: retrieving prefs for nickb
> from SQL server
> Aug 27 10:35:59 sanford spamd[19782]: debug: user has changed
> Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Using username:
> nickb
> Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Database connection
> established
> Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: found bayes db
> version 3
> Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Using userid: 4
> Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Not available for
> scanning, only 0 spam(s) in Bayes DB < 200
> Aug 27 10:35:59 sanford spamd[19782]: debug: Score set 0 chosen.
> Aug 27 10:35:59 sanford spamd[19782]: logmsg: processing message
> <10...@excite.com> for nickb:89.
> Aug 27 10:35:59 sanford spamd[19782]: processing message
> <10...@excite.com> for nickb:89.
> Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Database connection
> established
> Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: found bayes db
> version 3
> Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Using userid: 4
> Aug 27 10:35:59 sanford spamd[19782]: debug: bayes: Not available for
> scanning, only 0 spam(s) in Bayes DB < 200
> Aug 27 10:35:59 sanford spamd[19782]: debug: received-header: parsed as
> [ ip=200.30.148.30 rdns= helo=terraworld.net by=mail.terraworld.net
> ident= envfrom= intl=0 id= ]
> Aug 27 10:35:59 sanford spamd[19782]: debug: received-header: cannot use
> DNS, do not trust any hosts from here on
> Aug 27 10:35:59 sanford spamd[19782]: debug: received-header: relay
> 200.30.148.30 trusted? no internal? no
> Aug 27 10:35:59 sanford spamd[19782]: debug: metadata:
> X-Spam-Relays-Trusted:
> Aug 27 10:35:59 sanford spamd[19782]: debug: metadata:
> X-Spam-Relays-Untrusted: [ ip=200.30.148.30 rdns= helo=terraworld.net
> by=mail.terraworld.net ident= envfrom= intl=0 id= ]
> Aug 27 10:35:59 sanford spamd[19782]: debug: ---- MIME PARSER START ----
> Aug 27 10:35:59 sanford spamd[19782]: debug: main message type:
> text/plain
> Aug 27 10:35:59 sanford spamd[19782]: debug: parsing normal part
> Aug 27 10:35:59 sanford spamd[19782]: debug: added part, type:
> text/plain
> Aug 27 10:35:59 sanford spamd[19782]: debug: ---- MIME PARSER END ----
> Aug 27 10:35:59 sanford spamd[19782]: debug: decoding: other encoding
> type (8bit), ignoring
> Aug 27 10:35:59 sanford spamd[19782]: debug: uri found:
> http://www.cutpricerxpills.com/_85924943b9db73ac62baa654773c6a8e/4
> Aug 27 10:35:59 sanford spamd[19782]: debug: Running tests for priority:
> 0
> Aug 27 10:35:59 sanford spamd[19782]: debug: running header regexp
> tests; score so far=0
> Aug 27 10:35:59 sanford spamd[19782]: debug: all '*From' addrs:
> pookie121gambit@hotmail.com evan5mndy13@hotmail.com
> Aug 27 10:35:59 sanford spamd[19782]: debug: all '*To' addrs:
> buddha@terraworld.net terraworld.net-bugs@terraworld.net
> Aug 27 10:35:59 sanford spamd[19782]: debug: forged-HELO: from=
> helo=terraworld.net by=terraworld.net
> Aug 27 10:35:59 sanford spamd[19782]: debug: running body-text per-line
> regexp tests; score so far=1.352
> Aug 27 10:35:59 sanford spamd[19782]: debug: running uri tests; score so
> far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running raw-body-text
> per-line regexp tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running full-text regexp
> tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: Running tests for priority:
> 500
> Aug 27 10:35:59 sanford spamd[19782]: debug: running meta tests; score
> so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running header regexp
> tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running body-text per-line
> regexp tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running uri tests; score so
> far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running raw-body-text
> per-line regexp tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running full-text regexp
> tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: Running tests for priority:
> 1000
> Aug 27 10:35:59 sanford spamd[19782]: debug: running meta tests; score
> so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running header regexp
> tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: using
> "/home/vpopmail/.spamassassin" for user state dir
> Aug 27 10:35:59 sanford spamd[19782]: debug: mkdir
> /home/vpopmail/.spamassassin failed: mkdir /home/vpopmail: Permission
> denied at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin.pm line 1438_
> Aug 27 10:35:59 sanford spamd[19782]: debug: open of AWL file failed:
> lock: 19782 cannot create tmp lockfile
> /home/vpopmail/.spamassassin/auto-whitelist.lock.sanford.19782 for
> /home/vpopmail/.spamassassin/auto-whitelist.lock: No such file or
> directory
> Aug 27 10:35:59 sanford spamd[19782]: debug: Post AWL score: 2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running body-text per-line
> regexp tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running uri tests; score so
> far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running raw-body-text
> per-line regexp tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: running full-text regexp
> tests; score so far=2.281
> Aug 27 10:35:59 sanford spamd[19782]: debug: auto-learn? ham=0.1,
> spam=12, body-points=0.929, head-points=1.352
> Aug 27 10:35:59 sanford spamd[19782]: debug: auto-learn: currently using
> scoreset 0.  no need to recompute.
> Aug 27 10:35:59 sanford spamd[19782]: debug: auto-learn? no: inside
> auto-learn thresholds
> Aug 27 10:35:59 sanford spamd[19782]: debug: is spam? score=2.281
> required=5
> Aug 27 10:35:59 sanford spamd[19782]: debug:
> tests=FORGED_HOTMAIL_RCVD2,SAVE_THOUSANDS,SUBJ_BUY
> Aug 27 10:35:59 sanford spamd[19782]: debug:
> subtests=__CT,__CTE,__CT_TEXT_PLAIN,__HAS_MSGID,__HAS_SUBJECT,__MIME_VERSION,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__MSGID_RANDY,__SANE_MSGID
> Aug 27 10:36:00 sanford spamd[19782]: logmsg: clean message (2.3/5.0)
> for nickb:89 in 0.5 seconds, 1206 bytes.
> Aug 27 10:36:00 sanford spamd[19782]: clean message (2.3/5.0) for
> nickb:89 in 0.5 seconds, 1206 bytes.
> Aug 27 10:36:00 sanford spamd[19782]: logmsg: result: .  2 -
> FORGED_HOTMAIL_RCVD2,SAVE_THOUSANDS,SUBJ_BUY
> scantime=0.5,size=1206,mid=<10...@excite.com>,autolearn=no
> Aug 27 10:36:00 sanford spamd[19782]: result: .  2 -
> FORGED_HOTMAIL_RCVD2,SAVE_THOUSANDS,SUBJ_BUY
> scantime=0.5,size=1206,mid=<10...@excite.com>,autolearn=no
>  
> 
> Starting from the top, I've interpreted this output to mean:
> 
> 1) SQL User-Prefs are being queried, but I don't know how to put
> specific prefs in so I can't really test it (advice, please?).
> 

insert into userpref (username,preference,value) values ('yourusername','required_score','15');
etc etc etc
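
A rough, runnable sketch of how such rows combine with the query shown
in the spamd debug log. The assumptions are mine, not from the
distribution: sqlite3 stands in for Postgres, the three-column schema is
a simplified userpref table, load_prefs() is a hypothetical helper, rows
are assumed to apply in the order returned (later ones replacing earlier
ones), and ordering is byte-wise so that '@GLOBAL' sorts before
usernames that start with a letter.

```python
import sqlite3

# Assumption: sqlite3 stands in for Postgres so the sketch runs anywhere;
# the three-column schema is a simplified version of the userpref table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE userpref (username TEXT, preference TEXT, value TEXT)"
)

rows = [
    ("@GLOBAL", "required_score", "5"),  # site-wide default
    ("nickb", "required_score", "15"),   # per-user override
]
conn.executemany(
    "INSERT INTO userpref (username, preference, value) VALUES (?, ?, ?)",
    rows,
)


def load_prefs(conn, username):
    """Hypothetical helper: run the logged query and fold the rows into a dict."""
    cur = conn.execute(
        "SELECT preference, value FROM userpref "
        "WHERE username = ? OR username = '@GLOBAL' ORDER BY username ASC",
        (username,),
    )
    prefs = {}
    for preference, value in cur:
        prefs[preference] = value  # a later row replaces an earlier one
    return prefs


print(load_prefs(conn, "nickb"))
print(load_prefs(conn, "someone_else"))
```

A user with no rows of their own simply falls back to whatever is stored
under '@GLOBAL'.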

> 2) SQL Bayes is being queried, but not correctly. The log states "bayes:
> Not available for scanning, only 0 spam(s) in Bayes DB < 200" but I've
> used sa-learn to train in about 500 spams and about 500 hams, it should
> be trained well enough to at least try working (I do see this
> information in the database tables for bayes).
> 

Are you sure you ran sa-learn as the same user who is receiving the
mail?  What user does userid 4 correspond to in your bayes_vars table?

> 3) AWL isn't using the SQL DB: "debug: open of AWL file failed: lock:
> 19782 cannot create tmp lockfile
> /home/vpopmail/.spamassassin/auto-whitelist.lock.sanford.19782 for
> /home/vpopmail/.spamassassin/auto-whitelist.lock: No such file or
> directory", however it should be using SQL for this. There are zero rows
> on the awl table in postgres.

Did you set:
auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
?

> 
> Note the lack of user_awl_table and user_scores_table, the default
> values are being used (in fact, I had to remove it to make user_scores
> work).
> 

Odd, user_scores_table was removed, but it shouldn't have broken
anything and user_awl_table should not interfere, can you confirm that
this breaks things and submit a bug
(http://bugzilla.spamassassin.org/) if it does?

> 
> What I'd like to accomplish is to have AWL work through the database
> properly, and get bayes to realize that there *is* stuff in the
> database. sa-learn put it there, so spamd should be able to read it!
> Also, what is the format I should use for user prefs in the database? Is
> there a php application known that will put these prefs into postgres? I
> googled around and didn't find one.

Does:
http://www.peregrinehw.com/downloads/SpamAssassin/old/
work for you?

Michael