Posted to users@spamassassin.apache.org by Joseph Acquisto <jo...@j4computers.com> on 2012/10/22 00:32:14 UTC

No magic, since clearing database

I cleared the sa database by saying: sa-learn --clear

Since then, I have been seeing this:

my_host:~ # sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0          0          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0          0          0  non-token data: ntokens
0.000          0          0          0  non-token data: oldest atime
0.000          0          0          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

It has not changed despite my attempts to learn spam and ham.  In fact, it does not seem to have
registered the "auto learned" message I posted about earlier.

Did I break something?

joe a.


Re: No magic, since clearing database

Posted by Joseph Acquisto <jo...@j4computers.com>.
> That's pretty easy. The SA man page says that the default bayes database 
> path is  ~/.spamassassin/bayes, which is in each user's home directory.
> 
> Just set the bayes path in your local config to a path which is not based 
> on the user (i.e. does not start with ~), perhaps something like this:
> 
>  	bayes_path /etc/mail/spamassassin/bayes_db/bayes
> 
> Make sure that the user who does run SA has permission to access that 
> directory (/etc/mail/spamassassin/bayes_db) and the files in it (bayes*) - 
> writable if you're using autolearn, read-only if not.
> 
> Then any user with write permission to the files in that directory (e.g. 
> root) can run sa-learn.
> 
> I'll see about updating the wiki.
> 

Still not doing something right.  

Even after editing /etc/mail/spamassassin/local.cf to include the bayes_path
(after having created the directories) and restarting spamd, still only the
/root/.spamassassin/bayes files get updated by sa-learn.

The new directory stays empty.  

Spamd seems to have been invoked as root (ps aux | grep spam).

I'll have to review the setup, as I thought spamd was supposed to run as the specific user I created.
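
A quick way to narrow this down is to confirm what the config file actually sets and whether the directory behind it is usable by the account running the command. The sketch below does only that. check_bayes_path is a hypothetical helper, not part of SpamAssassin, and it assumes that the last uncommented bayes_path line in the file is the one that takes effect.

```shell
#!/bin/sh
# Sketch: report the bayes_path a config file sets and whether the directory
# behind it is usable by the current user. check_bayes_path is a hypothetical
# helper, not part of SpamAssassin.
# Usage: check_bayes_path CONFIG_FILE
check_bayes_path() {
    cf="$1"
    # Take the last uncommented bayes_path line (assumes later lines win).
    path=$(sed -n 's/^[[:space:]]*bayes_path[[:space:]]\{1,\}//p' "$cf" | tail -n 1)
    if [ -z "$path" ]; then
        echo "no bayes_path set in $cf"
        return 1
    fi
    # bayes_path is a file-name prefix, so the directory is its parent.
    dir=$(dirname "$path")
    echo "bayes_path: $path"
    if [ ! -d "$dir" ]; then
        echo "directory missing: $dir"
        return 1
    fi
    if [ -w "$dir" ]; then
        echo "writable by $(id -un): $dir"
    else
        echo "not writable by $(id -un): $dir"
        return 1
    fi
}
```

For example, check_bayes_path /etc/mail/spamassassin/local.cf could be run once as root and once as the user spamd scans as. If the setting is present and the directory is writable but sa-learn still writes under /root, sa-learn may be reading a different site config directory; running sa-learn -D --dump magic should show in its debug output which files it opens, though the exact messages vary by version.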

joe a.




Re: No magic, since clearing database

Posted by John Hardin <jh...@impsec.org>.
On Mon, 22 Oct 2012, Joseph Acquisto wrote:

>>>> On 10/22/2012 at 7:18 PM, John Hardin <jh...@impsec.org> wrote:
>> On Mon, 22 Oct 2012, Joseph Acquisto wrote:
>>
>>>>>> On 10/22/2012 at 12:15 AM, John Hardin <jh...@impsec.org> wrote:
>>>> On Sun, 21 Oct 2012, Joseph Acquisto wrote:
>>>>
>>>>> If I then try to learn ham or spam, it tells me permission denied
>>>>> trying to access the mail directory.
>>>>
>>>> Yeah, that can happen. One way around this for smaller installations is
>>>> to define a "site-wide" bayes rather than per-user bayes. Another way
>>>> is to have root move messages from users' training mailboxes to
>>>> training mailboxes owned by (e.g.) spamfilter before running sa-learn.
>>>
>>> I don't grok how to do that.
>>
>> Which, the site-wide bayes or moving messages to a different mailbox
>> before learning?
>
> Site wide.  Sorry.

OK.

That's pretty easy. The SA man page says that the default bayes database 
path is  ~/.spamassassin/bayes, which is in each user's home directory.

Just set the bayes path in your local config to a path which is not based 
on the user (i.e. does not start with ~), perhaps something like this:

 	bayes_path /etc/mail/spamassassin/bayes_db/bayes

Make sure that the user who does run SA has permission to access that 
directory (/etc/mail/spamassassin/bayes_db) and the files in it (bayes*) - 
writable if you're using autolearn, read-only if not.

Then any user with write permission to the files in that directory (e.g. 
root) can run sa-learn.

I'll see about updating the wiki.
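
The steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: setup_sitewide_bayes is a hypothetical name, the 770 directory mode is a guess at a reasonable shared setting, and the ownership change is skipped unless an owner is passed in (it needs root).

```shell
#!/bin/sh
# Sketch: create a shared Bayes directory and add bayes_path to a config file.
# setup_sitewide_bayes is a hypothetical helper, not a SpamAssassin tool.
# Usage: setup_sitewide_bayes DIR CONFIG_FILE [OWNER]
setup_sitewide_bayes() {
    dir="$1"; cf="$2"; owner="${3:-}"
    mkdir -p "$dir" || return 1
    # Hand the directory to the account SA runs as (needs root; skipped if unset).
    if [ -n "$owner" ]; then
        chown "$owner" "$dir" || return 1
    fi
    # Owner and group may read and write; the mode is an assumption.
    chmod 770 "$dir" || return 1
    # Add the setting only once so repeated runs do not duplicate it.
    if ! grep -q '^[[:space:]]*bayes_path[[:space:]]' "$cf" 2>/dev/null; then
        printf 'bayes_path %s/bayes\n' "$dir" >> "$cf"
    fi
}
```

With the paths from this message, that would be setup_sitewide_bayes /etc/mail/spamassassin/bayes_db /etc/mail/spamassassin/local.cf spamfilter, run as root and followed by a restart of spamd. Files that sa-learn later creates as root may still need their ownership or mode adjusted so the SA user can write to them.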

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   So Microsoft's invented the ASCII equivalent to ugly ink spots that
   appear on your letter when your pen is malfunctioning.
          -- Greg Andrews, about Microsoft's way to encode apostrophes
-----------------------------------------------------------------------
  144 days since the first successful private support mission to ISS (SpaceX)

Re: No magic, since clearing database

Posted by Joseph Acquisto <jo...@j4computers.com>.
>>> On 10/22/2012 at 7:18 PM, John Hardin <jh...@impsec.org> wrote:
> On Mon, 22 Oct 2012, Joseph Acquisto wrote:
> 
>>>>> On 10/22/2012 at 12:15 AM, John Hardin <jh...@impsec.org> wrote:
>>> On Sun, 21 Oct 2012, Joseph Acquisto wrote:
>>>
>>>> If I then try to learn ham or spam, it tells me permission denied 
>>>> trying to access the mail directory.
>>>
>>> Yeah, that can happen. One way around this for smaller installations is 
>>> to define a "site-wide" bayes rather than per-user bayes. Another way 
>>> is to have root move messages from users' training mailboxes to 
>>> training mailboxes owned by (e.g.) spamfilter before running sa-learn.
>>
>> I don't grok how to do that.
> 
> Which, the site-wide bayes or moving messages to a different mailbox 
> before learning?
> 

Site wide.  Sorry.

joe a.


Re: No magic, since clearing database

Posted by John Hardin <jh...@impsec.org>.
On Mon, 22 Oct 2012, Joseph Acquisto wrote:

>>>> On 10/22/2012 at 12:15 AM, John Hardin <jh...@impsec.org> wrote:
>> On Sun, 21 Oct 2012, Joseph Acquisto wrote:
>>
>>> If I then try to learn ham or spam, it tells me permission denied 
>>> trying to access the mail directory.
>>
>> Yeah, that can happen. One way around this for smaller installations is 
>> to define a "site-wide" bayes rather than per-user bayes. Another way 
>> is to have root move messages from users' training mailboxes to 
>> training mailboxes owned by (e.g.) spamfilter before running sa-learn.
>
> I don't grok how to do that.

Which, the site-wide bayes or moving messages to a different mailbox 
before learning?

>> There is also a size limit on the messages it will scan, are the sample
>> messages unusually large?
>
> I did not think so.   How do I know what too large is?

Uh. I disremember what the max for sa-learn is; I want to say around 
256 kB.
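
To see whether any sample messages are over a given size, something like the following could be used. It is a sketch only: list_oversized is a hypothetical helper, the 262144-byte default is just the 256 kB figure recalled above rather than a confirmed sa-learn limit, and it assumes the training folder stores one message per file (it will not help with a single mbox file).

```shell
#!/bin/sh
# Sketch: list training messages larger than a size limit.
# list_oversized is a hypothetical helper; 262144 bytes (256 KiB) is only the
# figure recalled in the thread, not a confirmed sa-learn limit.
# Usage: list_oversized DIR [LIMIT_BYTES]
list_oversized() {
    dir="$1"
    limit="${2:-262144}"
    # -size +Nc matches regular files strictly larger than N bytes.
    find "$dir" -type f -size +"${limit}c" -print
}
```

If this prints nothing for the training folder, message size is probably not why sa-learn reports nothing learned.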


Re: No magic, since clearing database

Posted by Joseph Acquisto <jo...@j4computers.com>.
>>> On 10/22/2012 at 12:15 AM, John Hardin <jh...@impsec.org> wrote:
> On Sun, 21 Oct 2012, Joseph Acquisto wrote:
> 
>>>> On 10/21/2012 at 6:39 PM, John Hardin <jh...@impsec.org> wrote:
>>>
>>> This typically indicates that you aren't running sa-learn while logged in
>>> as the user that SA is running as.
>>>
>>> Does sa-learn actually say it is learning tokens from the messages?
>>
>> Hmm.  If I su to spamfilter and run sa-learn --dump magic, I do see
>> reasonable results.
>>
>> If I then try to learn ham or spam, it tells me permission denied trying
>> to access the mail directory.
> 
> Yeah, that can happen. One way around this for smaller installations is to 
> define a "site-wide" bayes rather than per-user bayes. Another way is to 
> have root move messages from users' training mailboxes to training 
> mailboxes owned by (e.g.) spamfilter before running sa-learn.

I don't grok how to do that.  The wiki describes what to do, but I find things don't 
quite match.  It says that if there are any files starting with "bayes_", that can break
locking.  Yes, that's what I seem to see.

>> Trying to learn spam or ham as root runs without complaint, but appears 
>> to be saying 0 learned.
> 
> That's odd, if your dump shows no tokens are learned. That's expected if 
> sa-learn remembers that it's seen the messages before.
> 
> There is also a size limit on the messages it will scan, are the sample 
> messages unusually large?

I did not think so.  How do I know what counts as too large?

joe a.

> --
>   John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/ 
>   jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org 
>   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79



Re: No magic, since clearing database

Posted by John Hardin <jh...@impsec.org>.
On Sun, 21 Oct 2012, Joseph Acquisto wrote:

>>> On 10/21/2012 at 6:39 PM, John Hardin <jh...@impsec.org> wrote:
>>
>> This typically indicates that you aren't running sa-learn while logged in
>> as the user that SA is running as.
>>
>> Does sa-learn actually say it is learning tokens from the messages?
>
> Hmm.  If I su to spamfilter and run sa-learn --dump magic, I do see reasonable results.
>
> If I then try to learn ham or spam, it tells me permission denied trying to access the mail directory.

Yeah, that can happen. One way around this for smaller installations is to 
define a "site-wide" bayes rather than per-user bayes. Another way is to 
have root move messages from users' training mailboxes to training 
mailboxes owned by (e.g.) spamfilter before running sa-learn.
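
The second approach could look roughly like the sketch below, run as root. stage_training_mail is a hypothetical helper and every path used with it is an assumption. It copies rather than moves the messages, so the users' own folders stay untouched, and it assumes maildir-style storage with one message per file and unique file names.

```shell
#!/bin/sh
# Sketch: copy training messages into a staging directory that the SA user
# can read. stage_training_mail is a hypothetical helper. It copies rather
# than moves, so the source folders are left as they are.
# Usage: stage_training_mail SRC_DIR DEST_DIR [OWNER]
stage_training_mail() {
    src="$1"; dest="$2"; owner="${3:-}"
    mkdir -p "$dest" || return 1
    # Copy regular files only; files with the same name would overwrite each other.
    find "$src" -type f -exec cp -p {} "$dest"/ \; || return 1
    # Changing ownership needs root; skipped when no owner is given.
    if [ -n "$owner" ]; then
        chown -R "$owner" "$dest" || return 1
    fi
}
```

After staging, learning would run as the SA user, for example su spamfilter -c 'sa-learn --spam /var/spool/sa-train/spam', where the staging path is made up for illustration and the account needs a usable login shell for su -c to work.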

> Trying to learn spam or ham as root runs without complaint, but appears 
> to be saying 0 learned.

That's odd if your dump shows no tokens learned. "0 learned" is expected 
only if sa-learn remembers that it has seen the messages before.

There is also a size limit on the messages it will scan. Are the sample 
messages unusually large?


Re: No magic, since clearing database

Posted by Joseph Acquisto <jo...@j4computers.com>.
>>> On 10/21/2012 at 6:39 PM, John Hardin <jh...@impsec.org> wrote:
> On Sun, 21 Oct 2012, Joseph Acquisto wrote:
> 
>> I cleared the sa database by saying: sa-learn --clear
>>
>> Since then, I have been see this:
>>
>> my_host:~ # sa-learn --dump magic
>> 0.000          0          3          0  non-token data: bayes db version
>> 0.000          0          0          0  non-token data: nspam
>> 0.000          0          0          0  non-token data: nham
>> 0.000          0          0          0  non-token data: ntokens
>> 0.000          0          0          0  non-token data: oldest atime
>> 0.000          0          0          0  non-token data: newest atime
>> 0.000          0          0          0  non-token data: last journal sync atime
>> 0.000          0          0          0  non-token data: last expiry atime
>> 0.000          0          0          0  non-token data: last expire atime delta
>> 0.000          0          0          0  non-token data: last expire reduction count
>>
>> Has not changed despite attempting to learn spam and ham.
> 
> This typically indicates that you aren't running sa-learn while logged in 
> as the user that SA is running as.
> 
> Does sa-learn actually say it is learning tokens from the messages?
> 

Hmm.  If I su to spamfilter and run sa-learn --dump magic, I do see reasonable results.

If I then try to learn ham or spam, it tells me permission denied trying to access the mail directory.

Trying to learn spam or ham as root runs without complaint, but appears to be saying 0 learned.

joe a.



Re: No magic, since clearing database

Posted by John Hardin <jh...@impsec.org>.
On Sun, 21 Oct 2012, Joseph Acquisto wrote:

> I cleared the sa database by saying: sa-learn --clear
>
> Since then, I have been see this:
>
> my_host:~ # sa-learn --dump magic
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0          0          0  non-token data: nspam
> 0.000          0          0          0  non-token data: nham
> 0.000          0          0          0  non-token data: ntokens
> 0.000          0          0          0  non-token data: oldest atime
> 0.000          0          0          0  non-token data: newest atime
> 0.000          0          0          0  non-token data: last journal sync atime
> 0.000          0          0          0  non-token data: last expiry atime
> 0.000          0          0          0  non-token data: last expire atime delta
> 0.000          0          0          0  non-token data: last expire reduction count
>
> Has not changed despite attempting to learn spam and ham.

This typically indicates that you aren't running sa-learn while logged in 
as the user that SA is running as.

Does sa-learn actually say it is learning tokens from the messages?
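
One way to check whether a learning run registered is to compare the counters in the dump before and after it, as the same user each time. The sketch below pulls out the three relevant counters. magic_counts is a hypothetical helper, and it assumes the column layout shown in the dump quoted above, with the value in the third column.

```shell
#!/bin/sh
# Sketch: pull the learned-message counters out of "sa-learn --dump magic"
# output read from standard input. magic_counts is a hypothetical helper.
# It assumes the layout shown in this thread: value in the third column,
# label after "non-token data:".
magic_counts() {
    awk '$5 == "non-token" && $6 == "data:" &&
         ($7 == "nspam" || $7 == "nham" || $7 == "ntokens") {
             printf "%s=%s\n", $7, $3
         }'
}
```

Usage would be sa-learn --dump magic | magic_counts, run once before and once after the sa-learn --spam or --ham call. If the numbers change for one user but not another, the two are looking at different databases.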
