Posted to users@spamassassin.apache.org by "C. Bensend" <be...@bennyvision.com> on 2006/12/04 06:05:07 UTC

SA 3.1.7 not picking up SQL-based Bayes

Hey folks,

   I'm finishing up a mailserver upgrade this weekend, and I notice
that my new SQL-based install isn't picking up on user-based Bayes
data.  This is on a new, squeaky-clean OpenBSD 4.0-STABLE machine
running on AMD64, using SpamAssassin 3.1.7 with perl 5.8.8.

As per spamd -D info:

2006-12-03 22:41:53.760956500 [12889] dbg: config: retrieving prefs for benny@bennyvision.com from SQL server

OK, yay, spamd is picking up on the SQL userprefs.

2006-12-03 22:41:53.772480500 [12889] dbg: info: user has changed

Not sure what this means?

2006-12-03 22:41:53.774209500 [12889] dbg: bayes: using username: benny@bennyvision.com
2006-12-03 22:41:53.781308500 [12889] dbg: bayes: database connection established
2006-12-03 22:41:53.786485500 [12889] dbg: bayes: found bayes db version 3
2006-12-03 22:41:53.789654500 [12889] dbg: bayes: unable to initialize database for benny@bennyvision.com user, aborting!
2006-12-03 22:41:54.117388500 [12889] dbg: bayes: not scoring message, returning undef
2006-12-03 22:41:54.118260500 [12889] dbg: bayes: opportunistic call attempt failed, DB not readable

Uh.  What does "unable to initialize database" mean?  Spamd has already
successfully connected to the PostgreSQL database above, right?  So what
does "initializing database" mean?

My user_scores_sql_custom_query is as follows, if that makes a
difference (not sure if that's consulted for Bayes data):


user_scores_sql_custom_query    SELECT preference, value FROM userpref
WHERE username = _MAILBOX_ OR username = _USERNAME_ OR username =
'$GLOBAL' ORDER BY username ASC;


To add insult to injury, learning spam and ham work just fine.
It's just the Bayes scoring that seems to have issues.

So.  I'm at a loss at the moment...  My SA install is doing well,
but not as well as it should, if it's ignoring Bayes.  What info
can I pass along to help diagnose this problem?

Thanks much!

Benny


-- 
"If stupidity were a handicap, you'd have the best parking spot."
                                                    --Bill Paul



Re: SA 3.1.7 not picking up SQL-based Bayes

Posted by "C. Bensend" <be...@bennyvision.com>.
> add the rest of your --dump magic command to that.

Right.  Duh me.  Heh.  The following was captured via -D:

[20507] dbg: bayes: using username: benny@bennyvision.com
[20507] dbg: bayes: database connection established
[20507] dbg: bayes: found bayes db version 3
[20507] dbg: bayes: unable to initialize database for benny@bennyvision.com user, aborting!
[20507] dbg: config: score set 0 chosen.
[20507] dbg: bayes: database connection established
[20507] dbg: bayes: found bayes db version 3
[20507] dbg: bayes: unable to initialize database for benny@bennyvision.com user, aborting!
ERROR: Bayes dump returned an error, please re-run with -D for more information

> That custom query has nothing to do with bayes or awl sql stuffs.

Gotcha.  Thanks.

Thanks for taking a look at this, Michael,

Benny


-- 
"If stupidity were a handicap, you'd have the best parking spot."
                                                    --Bill Paul



Re: SA 3.1.7 not picking up SQL-based Bayes

Posted by Michael Parker <pa...@pobox.com>.
C. Bensend wrote:
>> Ahh, but you didn't run the command I asked you to run.  You are passing
>> the user benny@bennyvision.com to SpamAssassin, so it will use that as
>> the key for the database.  Running the command from the command line that
>> way is going to use your unix id as the key.  I'm guessing you changed
>> something in your mail setup to start passing in @domain in addition to
>> the regular unix username.
> 
> Actually, yes, I did, but I don't think it turned out like we
> were expecting (hence I didn't include it, I'm sorry):
> 
> 
> [benny@fusion ~]$ sa-learn -u benny@bennyvision.com       

add the rest of your --dump magic command to that.


> 
> But regardless - won't the user_scores_sql_custom_query I posted
> handle that possibility?  I am _so_ not an SQL guru, but it looks
> correct to me?  I'm never afraid to admit a mistake, so if I'm
> smoking crack here, please step up and say so.  :)
> 

That custom query has nothing to do with bayes or awl sql stuffs.

Michael



> Benny
> 
> 


Re: SA 3.1.7 not picking up SQL-based Bayes

Posted by "C. Bensend" <be...@bennyvision.com>.
> Ahh, but you didn't run the command I asked you to run.  You are passing
> the user benny@bennyvision.com to SpamAssassin, so it will use that as
> the key for the database.  Running the command from the command line that
> way is going to use your unix id as the key.  I'm guessing you changed
> something in your mail setup to start passing in @domain in addition to
> the regular unix username.

Actually, yes, I did, but I don't think it turned out like we
were expecting (hence I didn't include it, I'm sorry):


[benny@fusion ~]$ sa-learn -u benny@bennyvision.com
SpamAssassin version 3.1.7
Please select either --spam, --ham, --folders, --forget, --sync, --import,
--dump, --clear, --backup or --restore
Usage:
    sa-learn [options] [file]...

    sa-learn [options] --dump [ all | data | magic ]

    Options:

     --ham                             Learn messages as ham (non-spam)
     --spam                            Learn messages as spam
     --forget                          Forget a message
     --use-ignores                     Use bayes_ignore_from and bayes_ignore_to
     --sync                            Syncronize the database and the journal if needed
     --force-expire                    Force a database sync and expiry run
     --dbpath <path>                   Allows commandline override (in bayes_path form)
                                       for where to read the Bayes DB from
     --dump [all|data|magic]           Display the contents of the Bayes database
                                       Takes optional argument for what to display
      --regexp <re>                    For dump only, specifies which tokens to
                                       dump based on a regular expression.
     -f file, --folders=file           Read list of files/directories from file
     --dir                             Ignored; historical compatability
     --file                            Ignored; historical compatability
     --mbox                            Input sources are in mbox format
     --mbx                             Input sources are in mbx format
     --showdots                        Show progress using dots
     --no-sync                         Skip syncronizing the database and journal
                                       after learning
     -L, --local                       Operate locally, no network accesses
     --import                          Migrate data from older version/non DB_File
                                       based databases
     --clear                           Wipe out existing database
     --backup                          Backup, to STDOUT, existing database
     --restore <filename>              Restore a database from filename

     -u username, --username=username  Override username taken from the runtime environment
     -C path, --configpath=path, --config-file=path   Path to standard configuration dir
     -p prefs, --prefspath=file, --prefs-file=file    Set user preferences file
     --siteconfigpath=path             Path for site configs (def: /etc/mail/spamassassin)
     -D, --debug-level                 Print debugging messages
     -V, --version                     Print version
     -h, --help                        Print usage message


But regardless - won't the user_scores_sql_custom_query I posted
handle that possibility?  I am _so_ not an SQL guru, but it looks
correct to me?  I'm never afraid to admit a mistake, so if I'm
smoking crack here, please step up and say so.  :)

Benny


-- 
"If stupidity were a handicap, you'd have the best parking spot."
                                                    --Bill Paul



Re: SA 3.1.7 not picking up SQL-based Bayes

Posted by Michael Parker <pa...@pobox.com>.
C. Bensend wrote:
>> I think it's just a slightly confusing message.  If you run:
>> sa-learn -u benny@bennyvision.com
>>
>> Does it show that you have 200 ham and 200 spam in the database?  If so
>> then there is a problem, if not you just need to train it some more.
>>
>> What the WARNING is telling you is that hey this database isn't ready
>> for scoring so I'm not gonna use it.  This is why learning works just
>> fine.  Finish training up the DB and see if it then starts working for
>> you.
>>
>> Michael
>>
>> PS Possibly we should get the warning text changed a bit, feel free to
>> open up a bug so we can track the work, thanks.
> 
> Hi Michael,
> 
> Well, I have the following in the script that runs every now and
> again, to execute sa-learn:
> 
> [benny@fusion ~]$ sa-learn --dump magic | grep "non-token data: nham" |
> awk '{ print $3 }'
> 257526
> [benny@fusion ~]$ sa-learn --dump magic | grep "non-token data: nspam" |
> awk '{ print $3 }'
> 470150
> 
> I'm fairly sure I have enough ham and spam.  :)  Also, I'm watching
> the PostgreSQL logfile when I do that, and it _is_ querying the
> database.
> 

Ahh, but you didn't run the command I asked you to run.  You are passing
the user benny@bennyvision.com to SpamAssassin, so it will use that as
the key for the database.  Running the command from the command line that
way is going to use your unix id as the key.  I'm guessing you changed
something in your mail setup to start passing in @domain in addition to
the regular unix username.

Michael
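
A minimal sketch of the comparison described above, assuming the two
candidate keys are the short unix login and the full address that spamd
is given.  It only uses the -u and --dump magic options that appear in
this thread.  The function name compare_bayes_keys and the SA_LEARN
override variable are made up for illustration and are not part of
SpamAssassin.

```shell
#!/bin/sh
# Sketch: dump the Bayes "magic" rows for each candidate username key so the
# counts stored under each key can be compared side by side.

# SA_LEARN is a hypothetical override so the function can be tried against a
# stub; by default it runs the real sa-learn from PATH.
SA_LEARN=${SA_LEARN:-sa-learn}

compare_bayes_keys() {
    for user in "$@"; do
        echo "== key: $user"
        # -u overrides the username taken from the runtime environment.
        if ! "$SA_LEARN" -u "$user" --dump magic; then
            echo "   (dump failed for this key)"
        fi
    done
}

# Example (not executed here): unix login versus the address passed to spamd.
# compare_bayes_keys benny benny@bennyvision.com
```

If one key shows large nspam/nham counts and the other fails or shows
none, the training data is probably stored under a different key than
the one spamd looks up at scan time.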

> Just for argument's sake, I checked for *BAYES* in the spamd logfile,
> and I don't get a single hit.  So, Bayes is definitely not working
> for _any_ of the accounts, not just mine.  :(
> 
> Thanks for any insight,
> 
> Benny
> 
> 


Re: SA 3.1.7 not picking up SQL-based Bayes

Posted by "C. Bensend" <be...@bennyvision.com>.
> I think it's just a slightly confusing message.  If you run:
> sa-learn -u benny@bennyvision.com
>
> Does it show that you have 200 ham and 200 spam in the database?  If so
> then there is a problem, if not you just need to train it some more.
>
> What the WARNING is telling you is that hey this database isn't ready
> for scoring so I'm not gonna use it.  This is why learning works just
> fine.  Finish training up the DB and see if it then starts working for
> you.
>
> Michael
>
> PS Possibly we should get the warning text changed a bit, feel free to
> open up a bug so we can track the work, thanks.

Hi Michael,

Well, I have the following in the script that runs every now and
again, to execute sa-learn:

[benny@fusion ~]$ sa-learn --dump magic | grep "non-token data: nham" |
awk '{ print $3 }'
257526
[benny@fusion ~]$ sa-learn --dump magic | grep "non-token data: nspam" |
awk '{ print $3 }'
470150

I'm fairly sure I have enough ham and spam.  :)  Also, I'm watching
the PostgreSQL logfile when I do that, and it _is_ querying the
database.

Just for argument's sake, I checked for *BAYES* in the spamd logfile,
and I don't get a single hit.  So, Bayes is definitely not working
for _any_ of the accounts, not just mine.  :(

Thanks for any insight,

Benny


-- 
"If stupidity were a handicap, you'd have the best parking spot."
                                                    --Bill Paul



Re: SA 3.1.7 not picking up SQL-based Bayes

Posted by Michael Parker <pa...@pobox.com>.
C. Bensend wrote:
> Hey folks,
> 
>    I'm finishing up a mailserver upgrade this weekend, and I notice
> that my new SQL-based install isn't picking up on user-based Bayes
> data.  This is on a new, squeaky-clean OpenBSD 4.0-STABLE machine
> running on AMD64, using SpamAssassin 3.1.7 with perl 5.8.8.
> 
> As per spamd -D info:
> 
> 2006-12-03 22:41:53.760956500 [12889] dbg: config: retrieving prefs for
> benny@bennyvision.com from SQL server
> 
> OK, yay, spamd is picking up on the SQL userprefs.
> 
> 2006-12-03 22:41:53.772480500 [12889] dbg: info: user has changed
> 
> Not sure what this means?
> 
> 2006-12-03 22:41:53.774209500 [12889] dbg: bayes: using username:
> benny@bennyvision.com
> 2006-12-03 22:41:53.781308500 [12889] dbg: bayes: database connection
> established
> 2006-12-03 22:41:53.786485500 [12889] dbg: bayes: found bayes db version 3
> 2006-12-03 22:41:53.789654500 [12889] dbg: bayes: unable to initialize
> database for benny@bennyvision.com user, aborting!
> 2006-12-03 22:41:54.117388500 [12889] dbg: bayes: not scoring message,
> returning undef
> 2006-12-03 22:41:54.118260500 [12889] dbg: bayes: opportunistic call
> attempt failed, DB not readable
> 
> Uh.  What does "unable to initialize database" mean?  Spamd has already
> successfully connected to the PostgreSQL database above, right?  So what
> does "initializing database" mean?
> 
> My user_scores_sql_custom_query is as follows, if that makes a
> difference (not sure if that's consulted for Bayes data):
> 
> 
> user_scores_sql_custom_query    SELECT preference, value FROM userpref
> WHERE username = _MAILBOX_ OR username = _USERNAME_ OR username =
> '$GLOBAL' ORDER BY username ASC;
> 
> 
> To add insult to injury, learning spam and ham work just fine.
> It's just the Bayes scoring that seems to have issues.
> 
> So.  I'm at a loss at the moment...  My SA install is doing well,
> but not as well as it should, if it's ignoring Bayes.  What info
> can I pass along to help diagnose this problem?

I think it's just a slightly confusing message.  If you run:
sa-learn -u benny@bennyvision.com

Does it show that you have 200 ham and 200 spam in the database?  If so
then there is a problem, if not you just need to train it some more.

What the WARNING is telling you is that hey this database isn't ready
for scoring so I'm not gonna use it.  This is why learning works just
fine.  Finish training up the DB and see if it then starts working for you.

Michael

PS Possibly we should get the warning text changed a bit, feel free to
open up a bug so we can track the work, thanks.
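
The "200 ham and 200 spam" check can be sketched as a small filter over
`sa-learn --dump magic` output.  This is an illustration only: the
function names magic_value and bayes_ready and the MIN_COUNT variable
are made up here, the count is assumed to sit in the third column (as in
the awk commands elsewhere in this thread), and 200 is simply the figure
quoted above.

```shell
#!/bin/sh
# Sketch: decide whether a Bayes database has enough training data to score.
# Reads `sa-learn --dump magic` output on stdin.

# Hypothetical knob; 200 is the figure quoted in the message above.
MIN_COUNT=${MIN_COUNT:-200}

# Print column 3 of the "non-token data: <label>" row, or 0 if it is absent.
magic_value() {
    awk -v label="$1" '
        NF >= 5 && $(NF-2) == "non-token" && $(NF-1) == "data:" && $NF == label {
            print $3; found = 1
        }
        END { if (!found) print 0 }
    '
}

# Succeeds (exit 0) only if both counts reach MIN_COUNT.
bayes_ready() {
    dump=$(cat)
    nspam=$(printf '%s\n' "$dump" | magic_value nspam)
    nham=$(printf '%s\n' "$dump" | magic_value nham)
    echo "nspam=$nspam nham=$nham (minimum $MIN_COUNT each)"
    [ "$nspam" -ge "$MIN_COUNT" ] && [ "$nham" -ge "$MIN_COUNT" ]
}

# Example (not executed here):
# sa-learn -u benny@bennyvision.com --dump magic | bayes_ready
```

An empty or failed dump counts as zero for both values, so the check
fails in that case as well rather than reporting the database as ready.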

> 
> Thanks much!
> 
> Benny
> 
>