Posted to users@spamassassin.apache.org by Vernon Webb <ve...@comp-wiz.com> on 2006/12/29 14:23:46 UTC

sa-learn explained

Yesterday someone asked if I used sa-learn, and my response to myself was, I have
something else to learn. Can someone explain to me how to use it?

If I understand correctly, sa-learn can be used to train SA to recognize certain
messages as SPAM or HAM. I've run the sa-learn command, but it is not very clear
how it is used. I understand that if I use "sa-learn --spam" I can train SA that
something is SPAM, but what do I feed it, and from where? For instance, today the
thing is not "Effie Present" but rather "Happy NW Effie". So the efforts I made
yesterday using the phish.ndb and scan.ndb databases are still not catching these
guys (however, they are catching some phishing scams).

I'm willing to try sa-learn, but what will that do for me? These guys are beginning to
drive me nuts, and obviously I have something wrong, as others are telling me these are
being caught as SPAM on their systems.

Thanks

Re: --lint test fails

Posted by Vernon Webb <ve...@comp-wiz.com>.

That did it thanks. It was in the local.cf file.

Re: --lint test fails

Posted by Theo Van Dinter <fe...@apache.org>.

On Sun, Dec 31, 2006 at 05:30:39PM -0500, Vernon Webb wrote:
> > 2) "pyzor_add_header" isn't a valid config option.  See "perldoc 
> > Mail::SpamAssassin::Plugin::Pyzor" for more info.  Perhaps you want to just 
> > use the add_header option with the _PYZOR_ tag?  (see "perldoc 
> > Mail::SpamAssassin::Conf" for info on that) 
> 
> I'm sorry I know how tired people get of answering questions of people how have not 
> read the docs. I have I'm just lost. Where exactly is this line written the add_header 
> so I can remove it? 

Now I'm confused.  The original message you posted was about a lint
failure for "pyzor_add_header 1", which I had assumed you added in.
Are you asking where that config line is?  If so, I can't answer that
for you, it's your config. ;)

It would likely be in your site config area, which is probably
/etc/mail/spamassassin.  So something like "grep pyzor_add_header
/etc/mail/spamassassin/*.cf" is probably going to find it for you.
If it doesn't, you can run "spamassassin --lint -D config", get the
list of config files being used, and grep each of them looking for
"pyzor_add_header".

> I've checked the perldoc Mail::SpamAssassin::Plugin::Pyzor and it doesn't make any 
> sense to me. I have installed SA and pyzor (and the myriad of other afore mentioned 
> plugins) and have not had to modify anything other than SA itself. Is there something 
> I am missing here?

If you've enabled the plugin, and there are no errors as seen by "--lint -D",
then you should be fine.  The problem so far is that you added in a config
option that's not valid, so you get a lint warning.
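
A minimal sketch of that search, for anyone following along. The helper name
find_sa_option is made up for illustration, and the default directory is only
the likely location guessed above; adjust it to wherever your site config lives.

```shell
# Hypothetical helper: print file:line:text for every .cf or .pre file in a
# config directory that mentions the given option name.
find_sa_option() {
    option="$1"
    confdir="${2:-/etc/mail/spamassassin}"   # assumed default location
    for f in "$confdir"/*.cf "$confdir"/*.pre; do
        # Skip unmatched glob patterns and anything that is not a regular file.
        [ -f "$f" ] || continue
        # /dev/null as a second file makes grep always print the file name.
        grep -n -- "$option" "$f" /dev/null || true
    done
    return 0
}
```

Running "find_sa_option pyzor_add_header" should list each file and line
number where the option appears. If nothing turns up, the list of config
files shown by "spamassassin --lint -D config" tells you where else to look.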

-- 
Randomly Selected Tagline:
"You are in a twisty little maze of Sendmail rules, all confusing."
         - jon schatz in <10...@valium.divisionbyzero.com>

Re: --lint test fails

Posted by Vernon Webb <ve...@comp-wiz.com>.

> 2) "pyzor_add_header" isn't a valid config option.  See "perldoc 
> Mail::SpamAssassin::Plugin::Pyzor" for more info.  Perhaps you want to just 
> use the add_header option with the _PYZOR_ tag?  (see "perldoc 
> Mail::SpamAssassin::Conf" for info on that) 

I'm sorry, I know how tired people get of answering questions from people who have not
read the docs. I have read them; I'm just lost. Where exactly is this add_header line
written, so I can remove it?

I've checked the perldoc for Mail::SpamAssassin::Plugin::Pyzor and it doesn't make any
sense to me. I have installed SA and pyzor (and the myriad of other aforementioned
plugins) and have not had to modify anything other than SA itself. Is there something
I am missing here?

Thanks

Re: --lint test fails

Posted by Theo Van Dinter <fe...@apache.org>.

On Fri, Dec 29, 2006 at 07:56:09PM -0500, Vernon Webb wrote:
> Well it was there and it was not commented out so I did comment it out but I am still 
> get the error.

OK, there are two things going on here.

1) You need the plugin loaded.  It sounds like you have that, if the
"loadplugin" line is there, uncommented in the pre file.

2) "pyzor_add_header" isn't a valid config option.  See "perldoc
Mail::SpamAssassin::Plugin::Pyzor" for more info.  Perhaps you want to just
use the add_header option with the _PYZOR_ tag?  (see "perldoc
Mail::SpamAssassin::Conf" for info on that)
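
As a rough illustration of both points (a sketch, not a drop-in config): the
loadplugin line belongs in a .pre file, and the header is requested with the
generic add_header option. The header name "Pyzor" is an arbitrary choice for
this example.

```
# In a .pre file such as init.pre or v310.pre: load the plugin.
loadplugin Mail::SpamAssassin::Plugin::Pyzor

# In local.cf: enable Pyzor checks and add a header carrying the _PYZOR_ tag.
use_pyzor 1
add_header all Pyzor _PYZOR_
```

Since add_header prefixes header names with X-Spam-, this should show up as
X-Spam-Pyzor on scanned mail. Rerun "spamassassin --lint" afterwards to
confirm the config parses cleanly.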

-- 
Randomly Selected Tagline:
"Spending time with my ex-wife this weekend was more enjoyable than this
 interview, but it was close." - Unknown

Re: --lint test fails

Posted by Sander Holthaus <in...@orangexl.com>.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
 
Vernon Webb wrote:
>> Erm, you're not supposed to remove it. You're supposed to ADD it, or if
>> it's already there, make sure it's not commented out with a #.
>
> Well it was there and it was not commented out so I did comment it out
> but I am still get the error.
>
You really really really need to read the documentation.

People are here to help you and more than willing to, but it is very
impolite to ask questions without reading the docs first (and getting
a basic understanding of SpamAssassin).

Kind Regards,
Sander Holthaus
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)
 
iD8DBQFFlcR+Vf373DysOTURAtgaAJ4+kWrFjrxJl/at0YuspcwUtB3dCACeP8Cf
gdXrCUQh9ZIF+ZvLf/e84DQ=
=SaeB
-----END PGP SIGNATURE-----

Re: --lint test fails

Posted by Vernon Webb <ve...@comp-wiz.com>.

> Erm, you're not supposed to remove it. You're supposed to ADD it, or if 
> it's already there, make sure it's not commented out with a #. 

Well, it was there and it was not commented out, so I did comment it out, but I am still
getting the error.

Re: --lint test fails

Posted by Matt Kettler <mk...@verizon.net>.

Vernon Webb wrote:
> I'm using 3.1.4 and I tried removing the line in the v310pre however I am still get 
> that error. 
>   

Erm, you're not supposed to remove it. You're supposed to ADD it, or if
it's already there, make sure it's not commented out with a #.

>   
>> assuming you're running a recent 31x ver of SA, that cmd is no longer 
>> the way to enable pyzor ... 
>>
>> rather, this 
>>
>>    loadplugin Mail::SpamAssassin::Plugin::Pyzor 
>>
>> is added to init.pre. 
>>     
>
>
>

Re: --lint test fails

Posted by Vernon Webb <ve...@comp-wiz.com>.

I'm using 3.1.4 and I tried removing the line in v310.pre; however, I am still getting
that error.

> assuming you're running a recent 31x ver of SA, that cmd is no longer 
> the way to enable pyzor ... 
> 
> rather, this 
> 
>    loadplugin Mail::SpamAssassin::Plugin::Pyzor 
> 
> is added to init.pre.

Re: --lint test fails

Posted by snowcrash+spamassassin <sc...@gmail.com>.

> In running a lint test on one of my boxes I get the following error which I can't seem
> to figure out why. Pyzor is installed and the path is correct:
>
> [3075] warn: config: failed to parse line, skipping: pyzor_add_header 1
> [3075] warn: lint: 1 issues detected, please rerun with debug enabled for more
> information

assuming you're running a recent 31x ver of SA, that cmd is no longer
the way to enable pyzor ...

rather, this

    loadplugin Mail::SpamAssassin::Plugin::Pyzor

is added to init.pre.

--lint test fails

Posted by Vernon Webb <ve...@comp-wiz.com>.

When running a lint test on one of my boxes I get the following error, and I can't
figure out why. Pyzor is installed and the path is correct:

[3075] warn: config: failed to parse line, skipping: pyzor_add_header 1
[3075] warn: lint: 1 issues detected, please rerun with debug enabled for more 
information

Anyone?

RE: sa-learn explained

Posted by vertito <ve...@aim-consultants.com>.

Personally, attended sa-learn works better for me than unattended auto-learn;
as they always say, one man's spam is another man's ham.

2 cents here.

-----Original Message-----
From: Jim Maul [mailto:jmaul@elih.org] 
Sent: Friday, December 29, 2006 6:28 PM
To: users@spamassassin.apache.org
Subject: Re: sa-learn explained

Dave Koontz wrote:
>  
> I guess milage varies.  Auto-Learn has been a life saver for us and 
> has drastically reduced false postives we used to get with emails to 
> our College's Health Care & Research departments.  We pass all local 
> user email through SA as well, so this really helps the system learn what is 'good'
> email.
> 
> I'd suggest that everyone should at least try it and monitor the results.
> 
> 

I have found autolearn to be quite a valuable function here as well. 
Keep in mind that i have adjusted the autolearn threshold values to prevent things from being
autolearned incorrectly.  I would suggest others do the same if they use autolearn.  IMO, with the
default scores, it is too easy for false learning to occur. I use:

bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 10.0

-Jim


> -----Original Message-----
> From: Nigel Frankcom [mailto:nigel@blue-canoe.net]
> Sent: Friday, December 29, 2006 11:17 AM
> To: users@spamassassin.apache.org
> Subject: Re: sa-learn explained
> 
> On Fri, 29 Dec 2006 09:51:05 -0500, Andy Figueroa 
> <fi...@andyfigueroa.net> wrote:
> 
>> I still fee like a tyro with SpamAssassin, but my installation is 
>> catching better than 99% with perhaps 0.1% false positives (thanks in 
>> large part to things I've learned from this list), and I think I can 
>> tell you a couple of things better than just read the manual.  (But, 
>> do read the manual!)  My initial experience with SpamAssassin about a 
>> year ago was through a large web hosting company and I was limited to 
>> playing with SpamAssassin through cpanel, though till they moved 
>> SpamAssassin to its own server, I could also edit my own user 
>> preferences directly.  The problem was, this big company never could 
>> get it right, so now I'm running my own mailserver(s) out of what 
>> seemed like necessity.  I'm running Gentoo with SA 3.1.7.
>>
>> sa-learn is used to train and keep up-to-date the bayesian database.  
>> So, turn on autolearn in your /etc/mail/spamassassin/local.cf so the 
>> line reads:
>> bayes_auto_learn 1
>> (should be on by default).
>> This will cause selected spam and ham that you get to be used 
>> automagically to keep the bayesian database up-to-date.
>>
>> I'm using maildir and have two subdirectories in my .maildir called:
>> 2-learn-spam
>> 2-learn-ham
>>
>> I put missed spam in 2-learn-spam and ham misclassified as ham in 
>> 2-learn-ham.  Then, whenever I have a few messages in one of those 
>> directories, I run one of the following scripts:
>>
>> learnspam.scr, which contains this line:
>> sa-learn --spam --progress /home/figueroa/.maildir/.2-learn-spam/cur
>>
>> learnham.scr which contains this line:
>> sa-learn --ham --progress /home/figueroa/.maildir/.2-learn-ham/cur
>>
>> This is on my personal mailserver.  On the mailserver I run at a 
>> school, I run that script on each users 2-learn-spam/ham directories 
>> every night under crontab.
>>
>> Run an up-to-date version of SpmaAsssasin.  I was having pretty good 
>> results with 3.1.3 (the unmasked version in Gentoo), but got 
>> immediately better results when I upgraded to the current version.
>>
>> Also, to keep your RULES up-to-date, run sa-update as root from 
>> time-to-time.
>>
>> Good luck!  Happy spamassassaning!
> 
> 
> Personally, I'd disagree with auto-learn; having used SA in a 
> production environment for some years I've found manual training to be 
> a better solution.
> 
> YMMV
> 
> Just my 2 (pick your currency) worth.
> 
> Nigel
> 
> 
> 
>

Re: sa-learn explained

Posted by Jim Maul <jm...@elih.org>.

Dave Koontz wrote:
>  
> I guess milage varies.  Auto-Learn has been a life saver for us and has
> drastically reduced false postives we used to get with emails to our
> College's Health Care & Research departments.  We pass all local user email
> through SA as well, so this really helps the system learn what is 'good'
> email.
> 
> I'd suggest that everyone should at least try it and monitor the results.
> 
> 

I have found autolearn to be quite a valuable function here as well. 
Keep in mind that I have adjusted the autolearn threshold values to
prevent things from being autolearned incorrectly.  I would suggest 
others do the same if they use autolearn.  IMO, with the default scores, 
it is too easy for false learning to occur. I use:

bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 10.0
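
For reference, a sketch of how those two lines might sit in local.cf next to
the settings they depend on. The stock thresholds are 0.1 for nonspam and
12.0 for spam, so the nonspam value here is stricter than the default and the
spam value is lower than the default.

```
use_bayes 1
bayes_auto_learn 1
# Learn as ham only when the score is below -0.1 (default: 0.1).
bayes_auto_learn_threshold_nonspam -0.1
# Learn as spam only when the score is above 10.0 (default: 12.0).
bayes_auto_learn_threshold_spam 10.0
```

Whatever values you pick, it is worth watching the autolearn results in the
X-Spam-Status header for a while to see what is actually being learned.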

-Jim


> -----Original Message-----
> From: Nigel Frankcom [mailto:nigel@blue-canoe.net] 
> Sent: Friday, December 29, 2006 11:17 AM
> To: users@spamassassin.apache.org
> Subject: Re: sa-learn explained
> 
> On Fri, 29 Dec 2006 09:51:05 -0500, Andy Figueroa
> <fi...@andyfigueroa.net> wrote:
> 
>> I still fee like a tyro with SpamAssassin, but my installation is 
>> catching better than 99% with perhaps 0.1% false positives (thanks in 
>> large part to things I've learned from this list), and I think I can 
>> tell you a couple of things better than just read the manual.  (But, do 
>> read the manual!)  My initial experience with SpamAssassin about a year 
>> ago was through a large web hosting company and I was limited to 
>> playing with SpamAssassin through cpanel, though till they moved 
>> SpamAssassin to its own server, I could also edit my own user 
>> preferences directly.  The problem was, this big company never could 
>> get it right, so now I'm running my own mailserver(s) out of what 
>> seemed like necessity.  I'm running Gentoo with SA 3.1.7.
>>
>> sa-learn is used to train and keep up-to-date the bayesian database.  
>> So, turn on autolearn in your /etc/mail/spamassassin/local.cf so the 
>> line reads:
>> bayes_auto_learn 1
>> (should be on by default).
>> This will cause selected spam and ham that you get to be used 
>> automagically to keep the bayesian database up-to-date.
>>
>> I'm using maildir and have two subdirectories in my .maildir called:
>> 2-learn-spam
>> 2-learn-ham
>>
>> I put missed spam in 2-learn-spam and ham misclassified as ham in 
>> 2-learn-ham.  Then, whenever I have a few messages in one of those 
>> directories, I run one of the following scripts:
>>
>> learnspam.scr, which contains this line:
>> sa-learn --spam --progress /home/figueroa/.maildir/.2-learn-spam/cur
>>
>> learnham.scr which contains this line:
>> sa-learn --ham --progress /home/figueroa/.maildir/.2-learn-ham/cur
>>
>> This is on my personal mailserver.  On the mailserver I run at a 
>> school, I run that script on each users 2-learn-spam/ham directories 
>> every night under crontab.
>>
>> Run an up-to-date version of SpmaAsssasin.  I was having pretty good 
>> results with 3.1.3 (the unmasked version in Gentoo), but got 
>> immediately better results when I upgraded to the current version.
>>
>> Also, to keep your RULES up-to-date, run sa-update as root from 
>> time-to-time.
>>
>> Good luck!  Happy spamassassaning!
> 
> 
> Personally, I'd disagree with auto-learn; having used SA in a production
> environment for some years I've found manual training to be a better
> solution.
> 
> YMMV
> 
> Just my 2 (pick your currency) worth.
> 
> Nigel
> 
> 
> 
>

RE: sa-learn explained

Posted by Dave Koontz <dk...@mbc.edu>.

 
I guess mileage varies.  Auto-Learn has been a life saver for us and has
drastically reduced the false positives we used to get with emails to our
College's Health Care & Research departments.  We pass all local user email
through SA as well, so this really helps the system learn what is 'good'
email.

I'd suggest that everyone should at least try it and monitor the results.


-----Original Message-----
From: Nigel Frankcom [mailto:nigel@blue-canoe.net] 
Sent: Friday, December 29, 2006 11:17 AM
To: users@spamassassin.apache.org
Subject: Re: sa-learn explained

On Fri, 29 Dec 2006 09:51:05 -0500, Andy Figueroa
<fi...@andyfigueroa.net> wrote:

>I still fee like a tyro with SpamAssassin, but my installation is 
>catching better than 99% with perhaps 0.1% false positives (thanks in 
>large part to things I've learned from this list), and I think I can 
>tell you a couple of things better than just read the manual.  (But, do 
>read the manual!)  My initial experience with SpamAssassin about a year 
>ago was through a large web hosting company and I was limited to 
>playing with SpamAssassin through cpanel, though till they moved 
>SpamAssassin to its own server, I could also edit my own user 
>preferences directly.  The problem was, this big company never could 
>get it right, so now I'm running my own mailserver(s) out of what 
>seemed like necessity.  I'm running Gentoo with SA 3.1.7.
>
>sa-learn is used to train and keep up-to-date the bayesian database.  
>So, turn on autolearn in your /etc/mail/spamassassin/local.cf so the 
>line reads:
>bayes_auto_learn 1
>(should be on by default).
>This will cause selected spam and ham that you get to be used 
>automagically to keep the bayesian database up-to-date.
>
>I'm using maildir and have two subdirectories in my .maildir called:
>2-learn-spam
>2-learn-ham
>
>I put missed spam in 2-learn-spam and ham misclassified as ham in 
>2-learn-ham.  Then, whenever I have a few messages in one of those 
>directories, I run one of the following scripts:
>
>learnspam.scr, which contains this line:
>sa-learn --spam --progress /home/figueroa/.maildir/.2-learn-spam/cur
>
>learnham.scr which contains this line:
>sa-learn --ham --progress /home/figueroa/.maildir/.2-learn-ham/cur
>
>This is on my personal mailserver.  On the mailserver I run at a 
>school, I run that script on each users 2-learn-spam/ham directories 
>every night under crontab.
>
>Run an up-to-date version of SpmaAsssasin.  I was having pretty good 
>results with 3.1.3 (the unmasked version in Gentoo), but got 
>immediately better results when I upgraded to the current version.
>
>Also, to keep your RULES up-to-date, run sa-update as root from 
>time-to-time.
>
>Good luck!  Happy spamassassaning!


Personally, I'd disagree with auto-learn; having used SA in a production
environment for some years I've found manual training to be a better
solution.

YMMV

Just my 2 (pick your currency) worth.

Nigel

Re: sa-learn explained

Posted by Nigel Frankcom <ni...@blue-canoe.net>.

On Fri, 29 Dec 2006 09:51:05 -0500, Andy Figueroa
<fi...@andyfigueroa.net> wrote:

>I still fee like a tyro with SpamAssassin, but my installation is 
>catching better than 99% with perhaps 0.1% false positives (thanks in 
>large part to things I've learned from this list), and I think I can 
>tell you a couple of things better than just read the manual.  (But, do 
>read the manual!)  My initial experience with SpamAssassin about a year 
>ago was through a large web hosting company and I was limited to 
>playing with SpamAssassin through cpanel, though till they moved 
>SpamAssassin to its own server, I could also edit my own user 
>preferences directly.  The problem was, this big company never could 
>get it right, so now I'm running my own mailserver(s) out of what 
>seemed like necessity.  I'm running Gentoo with SA 3.1.7.
>
>sa-learn is used to train and keep up-to-date the bayesian database.  
>So, turn on autolearn in your /etc/mail/spamassassin/local.cf so the 
>line reads:
>bayes_auto_learn 1
>(should be on by default).
>This will cause selected spam and ham that you get to be used 
>automagically to keep the bayesian database up-to-date.
>
>I'm using maildir and have two subdirectories in my .maildir called:
>2-learn-spam
>2-learn-ham
>
>I put missed spam in 2-learn-spam and ham misclassified as ham in 
>2-learn-ham.  Then, whenever I have a few messages in one of those 
>directories, I run one of the following scripts:
>
>learnspam.scr, which contains this line:
>sa-learn --spam --progress /home/figueroa/.maildir/.2-learn-spam/cur
>
>learnham.scr which contains this line:
>sa-learn --ham --progress /home/figueroa/.maildir/.2-learn-ham/cur
>
>This is on my personal mailserver.  On the mailserver I run at a school, 
>I run that script on each users 2-learn-spam/ham directories every 
>night under crontab.
>
>Run an up-to-date version of SpmaAsssasin.  I was having pretty good 
>results with 3.1.3 (the unmasked version in Gentoo), but got 
>immediately better results when I upgraded to the current version.
>
>Also, to keep your RULES up-to-date, run sa-update as root from 
>time-to-time.
>
>Good luck!  Happy spamassassaning!


Personally, I'd disagree with auto-learn; having used SA in a
production environment for some years I've found manual training to be
a better solution.

YMMV

Just my 2 (pick your currency) worth.

Nigel

Re: sa-learn explained

Posted by Andy Figueroa <fi...@andyfigueroa.net>.

I still feel like a tyro with SpamAssassin, but my installation is
catching better than 99% with perhaps 0.1% false positives (thanks in
large part to things I've learned from this list), and I think I can
tell you a couple of things more useful than just "read the manual".  (But do
read the manual!)  My initial experience with SpamAssassin, about a year
ago, was through a large web hosting company, and I was limited to
playing with SpamAssassin through cPanel, though until they moved
SpamAssassin to its own server I could also edit my own user
preferences directly.  The problem was that this big company never could
get it right, so now I'm running my own mailserver(s) out of what
seemed like necessity.  I'm running Gentoo with SA 3.1.7.

sa-learn is used to train and keep up-to-date the bayesian database.  
So, turn on autolearn in your /etc/mail/spamassassin/local.cf so the 
line reads:
bayes_auto_learn 1
(should be on by default).
This will cause selected spam and ham that you get to be used 
automagically to keep the bayesian database up-to-date.

I'm using maildir and have two subdirectories in my .maildir called:
2-learn-spam
2-learn-ham

I put missed spam in 2-learn-spam and ham misclassified as spam in
2-learn-ham.  Then, whenever I have a few messages in one of those
directories, I run one of the following scripts:

learnspam.scr, which contains this line:
sa-learn --spam --progress /home/figueroa/.maildir/.2-learn-spam/cur

learnham.scr which contains this line:
sa-learn --ham --progress /home/figueroa/.maildir/.2-learn-ham/cur

This is on my personal mailserver.  On the mailserver I run at a school,
I run those scripts on each user's 2-learn-spam/ham directories every
night under crontab.
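
The nightly run could look roughly like the sketch below. The folder names
come from the scripts above; HOME_ROOT, SA_LEARN and the learn_all function
are assumptions added so the paths and the binary can be overridden. Note
that which Bayes database gets trained depends on the user sa-learn runs as,
so with per-user databases this loop would need to run as each user rather
than once as root.

```shell
#!/bin/sh
# Assumed defaults; override from the environment if your layout differs.
HOME_ROOT="${HOME_ROOT:-/home}"
SA_LEARN="${SA_LEARN:-sa-learn}"

# Train from every user's 2-learn-spam and 2-learn-ham maildir folders.
learn_all() {
    for userdir in "$HOME_ROOT"/*; do
        spam="$userdir/.maildir/.2-learn-spam/cur"
        ham="$userdir/.maildir/.2-learn-ham/cur"
        # Only train from folders that actually exist for this user.
        if [ -d "$spam" ]; then "$SA_LEARN" --spam "$spam"; fi
        if [ -d "$ham" ]; then "$SA_LEARN" --ham "$ham"; fi
    done
    return 0
}

# Uncomment to run when the script is called from cron:
# learn_all
```

A crontab entry such as "30 2 * * * /usr/local/bin/learn-all.sh" (the path
and time are placeholders) would then run it once a night.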

Run an up-to-date version of SpamAssassin.  I was having pretty good
results with 3.1.3 (the unmasked version in Gentoo), but got
immediately better results when I upgraded to the current version.

Also, to keep your RULES up-to-date, run sa-update as root from 
time-to-time.

Good luck!  Happy spamassassaning!

-- 
Andy Figueroa 
http://philippians-1-20.us/
figueroa@andyfigueroa.net

On Friday 29 December 2006 08:23, Vernon Webb wrote:
> Yesterday someone asked if I used sa-learn and the response to myself
> was, I have something else to learn. Can someone explain to me how to
> use it?
>
> If I understand correctly sa-learn can be used to train SA to
> recognize certain messages as SPAM or HAM. I've run the sa-learn
> command but it is not very clear as to how it is used. I mean I
> understand if I use "sa-learn --spam" I can train SA that something
> is SPAM but what, where? For instance today the thing is not "Effie
> Present"  but rather "Happy NW Effie". So the efforts I took
> yesterday using the phish.ndb and scan.ndb database is still not
> cathcing these guys (however it is catching some Phishing scams).
>
> I'm willing to try sa-learn, but what will that do for me? These guys
> are beginning to drive me nuts and obvioulsy I have something wrong
> as others are telling me these are being caught as SPAM on their
> systems.
>
> Thanks

Re: sa-learn explained

Posted by Sebastian Ries <se...@dtnet.de>.

Hi

On Friday 29 December 2006 14:23, Vernon Webb wrote:
> Yesterday someone asked if I used sa-learn and the response to myself was,
> I have something else to learn. Can someone explain to me how to use it?
man sa-learn
sa-learn [options] [file]...

As far as I know, sa-learn accepts nearly every email format as input.
You can give it an mbox file that represents your spam folder, a single email
file from your maildir, or it might even work with a directory that contains
mail in maildir format.

I did not test all of that, but this is what I understood from the documentation.

Just try it
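
A small sketch of the difference between the input types, under a
simplifying assumption: a directory is passed straight through, and any
regular file is treated as an mbox and passed with --mbox. The learn_path
wrapper and the SA_LEARN variable are made up for this example; for a single
message file you would drop --mbox.

```shell
SA_LEARN="${SA_LEARN:-sa-learn}"   # assumed to be on PATH

# learn_path CLASS TARGET, where CLASS is "spam" or "ham".
learn_path() {
    class="$1"
    target="$2"
    if [ -d "$target" ]; then
        # A directory, e.g. a maildir "cur" folder: one message per file.
        "$SA_LEARN" "--$class" "$target"
    elif [ -f "$target" ]; then
        # A regular file: assumed here to be an mbox holding many messages.
        "$SA_LEARN" "--$class" --mbox "$target"
    else
        echo "learn_path: no such file or directory: $target" >&2
        return 1
    fi
}
```

For example, "learn_path spam ~/mail/spam" for an mbox file, or
"learn_path ham ~/Maildir/cur" for a maildir folder. Afterwards,
"sa-learn --dump magic" shows how many spam and ham messages the database
has seen.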

Regards
Sebastian Ries
-- 
------------------------------------------------------------
DT Netsolution GmbH -  Talaeckerstr. 30 -  D-70437 Stuttgart
Tel: +49-711-849910-36               Fax: +49-711-849910-936
WEB: http://www.dtnet.de/     email: Sebastian.Ries@dtnet.de

Re: sa-learn explained

Posted by snowcrash+spamassassin <sc...@gmail.com>.

> Perhaps it's not ready for prime time. I can't imagine that if it was they
> would not be making it headline news.

linford has, apparently, stated in posts to newsgroups that folks
should switch _now_. i think there's a reference in this list's
archive, iirc.

public announcements, i'd guess, will be made when all t's are crossed etc etc

Re: sa-learn explained

Posted by Phil Barnett <ph...@philb.us>.

On Friday 29 December 2006 23:55, snowcrash+spamassassin wrote:
> and this,
>
> http://www.spamhaus.org/zen
>
> "Caution: zen.spamhaus.org replaces sbl-xbl.spamhaus.org.
>
> If you are currently using sbl-xbl.spamhaus.org you can now replace
> 'sbl-xbl' with 'zen' (sbl-xbl.spamhaus.org will eventually become
> obsolete and may in the future be withdrawn from service).
>
> zen.spamhaus.org should now be the only spamhaus.org DNSBL in your
> configuration. You should not use ZEN together with other Spamhaus
> blocklists or you will simply be wasting DNS queries and slowing your
> mail queue."

It makes me wonder why there are no links to it from anywhere on their front 
page, from any FAQ or from any menu from the front page.

Perhaps it's not ready for prime time. I can't imagine that if it was they 
would not be making it headline news.

How did everyone hear about this when there is no apparent attempt at the 
spamhaus.org website to let anyone know that there is a change coming?

-- 
My other computer is your Windows machine

Re: sa-learn explained

Posted by snowcrash+spamassassin <sc...@gmail.com>.

and this,

http://www.spamhaus.org/zen

"Caution: zen.spamhaus.org replaces sbl-xbl.spamhaus.org.

If you are currently using sbl-xbl.spamhaus.org you can now replace
'sbl-xbl' with 'zen' (sbl-xbl.spamhaus.org will eventually become
obsolete and may in the future be withdrawn from service).

zen.spamhaus.org should now be the only spamhaus.org DNSBL in your
configuration. You should not use ZEN together with other Spamhaus
blocklists or you will simply be wasting DNS queries and slowing your
mail queue."
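
If an MTA config still names the old zone, the swap described above is a
one-line substitution. A sketch, using a Postfix-style reject_rbl_client line
purely as example input; to_zen is a made-up helper name.

```shell
# Rewrite every sbl-xbl.spamhaus.org reference read from stdin
# so that it points at zen.spamhaus.org instead.
to_zen() {
    sed 's/sbl-xbl\.spamhaus\.org/zen.spamhaus.org/g'
}

# Example:
#   echo 'reject_rbl_client sbl-xbl.spamhaus.org' | to_zen
```

Write the result to a new file and review it by hand before installing it.
Per the caution quoted above, any other Spamhaus zone left in the config
alongside zen should be removed as well.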

Re: sa-learn explained

Posted by Phil Barnett <ph...@philb.us>.

On Friday 29 December 2006 16:23, Duane Hill wrote:
> Phil Barnett wrote:
> > On Friday 29 December 2006 14:50, Vernon Webb wrote:
> >>  What are you using?
> >
> > Right now, I'm using sbl-xbl.
>
> I could be mistaken. sbl-xbl is being replaced by zen.spamhaus.org. That
> is what I'm currently using.

Their web site currently states this:

http://www.spamhaus.org/sbl/howtouse.html

-- 
My other computer is your Windows machine

Re: sa-learn explained

Posted by Duane Hill <d....@yournetplus.com>.

Phil Barnett wrote:
> On Friday 29 December 2006 14:50, Vernon Webb wrote:
>>  What are you using?
> 
> Right now, I'm using sbl-xbl.

I could be mistaken. sbl-xbl is being replaced by zen.spamhaus.org. That 
is what I'm currently using.

Re: sa-learn explained

Posted by Phil Barnett <ph...@philb.us>.

On Friday 29 December 2006 14:50, Vernon Webb wrote:
>  What are you using?

Right now, I'm using sbl-xbl.

-- 
My other computer is your Windows machine

Re: RBLs

Posted by Jason Faulkner <jf...@broadwick.com>.

Larry Nedry wrote:
> On 12/29/06 at 2:50 PM -0500 Vernon Webb wrote:
>> What are you using?
>
> Currently I am using only zen.spamhaus.org.  The rest of the RBLs that 
> I have tried have had too many false positives to be useful for my 
> requirements.
>
> Which RBLs do the rest of you folks feel comfortable using?

Spamhaus is amazingly good -- but some others are less than stellar. I'd
recommend using the rbl and xbl from Spamhaus.

-- 
Jason Faulkner
Systems Manager
Broadwick Corporation
(919) 459-2509

Re: RBLs (was: sa-learn explained)

Posted by Jeff Chan <je...@surbl.org>.

On Friday, December 29, 2006, 1:25:10 PM, Larry Nedry wrote:
> On 12/29/06 at 2:50 PM -0500 Vernon Webb wrote:
>>What are you using?

> Currently I am using only zen.spamhaus.org.  The rest of the RBLs that I
> have tried have had too many false positives to be useful for my
> requirements.

> Which RBLs do the rest of you folks feel comfortable using?

> Nedry

zen.spamhaus.org is the only RBL I recommend using for outright
blocking at the MTA level.  Of the spamhaus lists, zen is the
only one people should be using going forward, as already
mentioned from the Spamhaus site:

  http://www.spamhaus.org/faq/answers.lasso?section=DNSBL%20Technical#186

Be aware that zen will include the new PBL list in addition to
SBL and XBL:

  http://www.spamhaus.org/pbl/index.lasso

The PBL list should be quite effective, but it is a different,
new list.  It is a "Policy Black List" that lists network spaces
that ISPs say should not be emitting mail, such as dialup spaces,
DHCP spaces, DSL, cable modem, etc.  Many of the mail emitters in
such spaces tend to be botnets sending spam.

Also note that MTA level blocking is not the same as the way
SpamAssassin uses RBLs.  SpamAssassin uses many blacklists in
addition to Spamhaus:

  http://wiki.apache.org/spamassassin/UsingNetworkTests
  http://wiki.apache.org/spamassassin/DnsBlocklists

in order to score the senders of messages.  It scores different
blacklists differently, essentially depending on how accurate
they are.  The more accurate lists get a higher score, etc.

SpamAssassin also uses some RBLs to check message body URIs,
including using Spamhaus and SURBLs:

  http://spamassassin.apache.org/full/3.1.x/dist/doc/Mail_SpamAssassin_Plugin_URIDNSBL.html

Blocking in an MTA means not allowing the message to even get to
SpamAssassin for checking.  This is the normal way most mail
servers are set up, since the volume of all spam could generally
overwhelm SpamAssassin without MTA blocking.  Blocking by sender
IP is much more efficient, so it's generally used as a fast
pre-filter before SpamAssassin even sees a message.
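
To make the sender-IP check concrete: a DNSBL lookup is an ordinary DNS
query for the sender's IPv4 address with its octets reversed, prepended to
the list's zone. The sketch below only builds that query name; the function
name is made up, and zen.spamhaus.org is used as the default zone because it
is the list discussed here.

```shell
# Build the DNS name to query for an IPv4 address against a DNSBL zone.
# The body runs in a subshell so the IFS change does not leak out.
dnsbl_query_name() (
    ip="$1"
    zone="${2:-zen.spamhaus.org}"
    IFS=.
    # Split a.b.c.d into $1..$4, then emit the octets in reverse order.
    set -- $ip
    echo "$4.$3.$2.$1.$zone"
)
```

Something like "host $(dnsbl_query_name 127.0.0.2)" can then be used to try
it out, 127.0.0.2 being the conventional DNSBL test address. An answer in
127.0.0.0/8 means the address is listed, and a not-found answer means it is
not.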

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/

Re: RBLs

Posted by John Rudd <jr...@ucsc.edu>.

Sander Holthaus wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>  
> John Rudd wrote:
>> John D. Hardin wrote:
>>> On Fri, 29 Dec 2006, Larry Nedry wrote:
>>>
>>>> On 12/29/06 at 2:50 PM -0500 Vernon Webb wrote:
>>>>> What are you using?
>>>> Currently I am using only zen.spamhaus.org.  The rest of the
>>>> RBLs that I have tried have had too many false positives to be
>>>> useful for my requirements.
>>>>
>>>> Which RBLs do the rest of you folks feel comfortable using?
>>> I use a few others from sorbs.net, but I don't see them having
>>> any effect as zen.spamhaus.org catches everything first... :)
>>>
>>
>> I've been using sbl-xbl for a while, and then recently switched to
>> zen.
>>
>> I also recently added list.dsbl.org (called before zen, so I can
>> see how much it's really catching).  It's pretty small (about 1/6
>> of what zen catches).
>>
>> I'm also contemplating adding dul.dnsbl.sorbs.net.
>>
>> I tend to put the newest (to me) rbl first, so I can see what it's
>> actually catching before the stuff I was already using :-)
>>
>>
> zen != xbl-sbl. It is xbl-sbl-pbl. AFAIK, the PBL's aren't active, but
> will be in near future. You might want to change the scoring for
> PBL-entries.
> 


Yes, I never implied that zen == sbl-xbl.  However, for now, according 
to spamhaus, zen only contains the production databases, so it IS 
currently (practically) the same as sbl-xbl.  The difference is that 
once the PBL becomes fully published/public, zen will include all 3, but 
sbl-xbl will not.

So, if what you want is "the one with everything [fully publicly 
published]" you can start using zen now and won't have to make a change 
in the future.

Re: RBLs

Posted by Sander Holthaus <in...@orangexl.com>.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
 
John Rudd wrote:
> John D. Hardin wrote:
>> On Fri, 29 Dec 2006, Larry Nedry wrote:
>>
>>> On 12/29/06 at 2:50 PM -0500 Vernon Webb wrote:
>>>> What are you using?
>>> Currently I am using only zen.spamhaus.org.  The rest of the
>>> RBLs that I have tried have had too many false positives to be
>>> useful for my requirements.
>>>
>>> Which RBLs do the rest of you folks feel comfortable using?
>>
>> I use a few others from sorbs.net, but I don't see them having
>> any effect as zen.spamhaus.org catches everything first... :)
>>
>
>
> I've been using sbl-xbl for a while, and then recently switched to
> zen.
>
> I also recently added list.dsbl.org (called before zen, so I can
> see how much it's really catching).  It's pretty small (about 1/6
> of what zen catches).
>
> I'm also contemplating adding dul.dnsbl.sorbs.net.
>
> I tend to put the newest (to me) rbl first, so I can see what it's
> actually catching before the stuff I was already using :-)
>
>
zen != xbl-sbl. It is xbl-sbl-pbl. AFAIK, the PBL's aren't active, but
will be in near future. You might want to change the scoring for
PBL-entries.

Kind Regards,
Sander Holthaus
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)
 
iD8DBQFFlcT7Vf373DysOTURAjd3AKC+Q8OY4AIiO6JSREp192zSPK/zHwCgwJes
Oqrza5QHzHQWS3X4T39G+l0=
=lbP5
-----END PGP SIGNATURE-----

Re: RBLs

Posted by Jason Faulkner <jf...@broadwick.com>.

>>
>> In a lot of cases, that seems to boil down to "sending a legitimate 
>> email to a recipient who once *asked* to be sent such email, who has 
>> now forgotten they signed up in the first place".  :(
>>
>> There's not much a sender can do about that - particularly for 
>> periodic emails of the type *many* companies send to customers (or 
>> potential customers) who have signed up for these messages.
>
>
> Not only can the sender not do anything about the reporting and 
> getting blacklisted, but the way spamcop sometimes (always?) lists the 
> host, they can't find out which of their senders was involved, and 
> thus have no hope of figuring out which of that sender's recipients is 
> responsible.
>
> Kind of hard to solve a problem when you're just being told "something 
> is wrong" and _nothing_ more.  Which is the case when a spamtrap was 
> involved.

Exactly the point I was trying to make earlier. As an ESP (email service 
provider), we have a tough job in separating the wheat from the chaff. 
When you have just under 10,000 customers and 12 IPs, it's a little 
difficult to know who sent to a spamtrap when we aren't even given the 
most basic information about a message.

-- 
Jason Faulkner
Systems Manager
Broadwick Corporation
(919) 459-2509

Re: RBLs

Posted by John Rudd <jr...@ucsc.edu>.

Kris Deugau wrote:
> Jeff Chan wrote:
>> The SpamCop BL is a fair representation of the sending IPs of the
>> messages that its users are reporting as spam.  One of your goals
>> as an ESP should be to not get perceived as spam in the mailboxes
>> of those users.  If the users get your messages and report them
>> as spam (via SpamCop, AOL, etc.), then you may be doing something
>> inappropriate that's worth reviewing and correcting.
> 
> In a lot of cases, that seems to boil down to "sending a legitimate 
> email to a recipient who once *asked* to be sent such email, who has now 
> forgotten they signed up in the first place".  :(
> 
> There's not much a sender can do about that - particularly for periodic 
> emails of the type *many* companies send to customers (or potential 
> customers) who have signed up for these messages.

Not only can the sender not do anything about the reporting and getting 
blacklisted, but the way spamcop sometimes (always?) lists the host, 
they can't find out which of their senders was involved, and thus have 
no hope of figuring out which of that sender's recipients is responsible.

Kind of hard to solve a problem when you're just being told "something 
is wrong" and _nothing_ more.  Which is the case when a spamtrap was 
involved.

Re: RBLs

Posted by Kris Deugau <kd...@vianet.ca>.

Jeff Chan wrote:
> The SpamCop BL is a fair representation of the sending IPs of the
> messages that its users are reporting as spam.  One of your goals
> as an ESP should be to not get perceived as spam in the mailboxes
> of those users.  If the users get your messages and report them
> as spam (via SpamCop, AOL, etc.), then you may be doing something
> inappropriate that's worth reviewing and correcting.

In a lot of cases, that seems to boil down to "sending a legitimate 
email to a recipient who once *asked* to be sent such email, who has now 
forgotten they signed up in the first place".  :(

There's not much a sender can do about that - particularly for periodic 
emails of the type *many* companies send to customers (or potential 
customers) who have signed up for these messages.

-kgd

Re: RBLs

Posted by Jeff Chan <je...@surbl.org>.

On Saturday, December 30, 2006, 10:40:21 AM, Jason Faulkner wrote:

> I will completely concur with the statement about spamcop being too 
> aggressive -- I work with a company that sends out ~10 million messages 
> per month per ip (we're an ESP) and we can get listed on Spamcop for as 
> few as 20 complaints on one of those IPs, and there's absolutely no 
> feedback mechanism that they'll listen to us with.

> Spamhaus is fair. DULs are a great idea. But please, please don't 
> support SpamCop. Their policies are not  fair and you /will/ lose some 
> legitimate email in the process.

While I agree that the SpamCop BL is too aggressive for use as
MTA blocking, it's unfair to say that SpamCop or their blacklist
are unfair.

The SpamCop BL is a fair representation of the sending IPs of the
messages that its users are reporting as spam.  One of your goals
as an ESP should be to not get perceived as spam in the mailboxes
of those users.  If the users get your messages and report them
as spam (via SpamCop, AOL, etc.), then you may be doing something
inappropriate that's worth reviewing and correcting.  Frankly if
I were an ESP, I'd be grateful for the feedback that something
may be wrong.  That feedback is valuable and gives you a chance
to review your practices before you become more widely viewed as
spammers.  Presumably that's something you would want to avoid.

In contrast to outright blocking at the MTA level, SpamAssassin
uses the SpamCop BL and many other BLs to create a score to tag
messages as spammy or not.  For a list that's a bit too
aggressive like SCBL, the score is lower.  For a list that's more
accurate like xbl.spamhaus.org, SpamAssassin gives it a higher
score.  Etc.
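
A minimal sketch of that scoring idea, assuming made-up point values (the real scores ship with SpamAssassin and differ between versions). Only the rule names and the default required score of 5.0 are taken from SpamAssassin itself.

```python
# Hypothetical per-list scores: the more accurate list gets more points.
BL_SCORES = {
    "RCVD_IN_XBL": 3.0,             # assumed value, for illustration only
    "RCVD_IN_BL_SPAMCOP_NET": 1.5,  # assumed value, for illustration only
}
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score


def total_score(hits, other_points=0.0):
    """Add the points for every blacklist rule that hit to the other rules' points."""
    return other_points + sum(BL_SCORES.get(rule, 0.0) for rule in hits)


def is_spam(hits, other_points=0.0):
    """A message is tagged as spam once the total reaches the required score."""
    return total_score(hits, other_points) >= REQUIRED_SCORE


# A SpamCop hit alone only nudges the score; it does not block the message.
print(is_spam(["RCVD_IN_BL_SPAMCOP_NET"]))
# Combined with another list and a point from content rules it crosses the line.
print(is_spam(["RCVD_IN_XBL", "RCVD_IN_BL_SPAMCOP_NET"], other_points=1.0))
```

The point of the design is that no single aggressive list can condemn a message by itself, unlike an outright MTA reject.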

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org
http://www.surbl.org/

Re: RBLs

Posted by Jason Faulkner <jf...@broadwick.com>.

>> I will completely concur with the statement about spamcop being too 
>> aggressive -- I work with a company that sends out ~10 million 
>> messages per month per ip (we're an ESP) and we can get listed on 
>> Spamcop for as few as 20 complaints on one of those IPs, and there's 
>> absolutely no feedback mechanism that they'll listen to us with.
>>
>> Spamhaus is fair. DULs are a great idea. But please, please don't 
>> support SpamCop. Their policies are not  fair and you /will/ lose 
>> some legitimate email in the process.
>
> Jason, if your folks own "emaildirect.com" expect to remain blacklisted.
> I just received a mortgage spam from them which hit no BL rules at all.
> I'm motivated to submit it to all and sundry.
>
> {^_^}

I don't know who emaildirect.com is -- but it's certainly not us, and I 
just checked our DB, and they aren't a customer of ours.

If you wanted to know who we were, you could have just looked up the 
website in my sig --- http://broadwick.com -- owners of 
http://www.intellicontact.com

-- 
Jason Faulkner
Systems Manager
Broadwick Corporation
(919) 459-2509
jfaulkne@broadwick.com

Re: RBLs

Posted by jdow <jd...@earthlink.net>.

From: "Jason Faulkner" <jf...@broadwick.com>
> 
>> we are using sbl-xbl.spamhaus.org, dul.dnsbl.sorbs.net, bl.spamcop.net
>> and list.dsbl.org in this particular order. The results are available
>> at:
>>
>> http://graph.noc.ntua.gr/a/graph_529.html
>>
>> Sbl-xbl(zen).spamhaus.org being first in the list and more complete
>> gets the most hits. Dul.dnsbl.sorbs.net does a pretty good job without
>> causing the problems of dnsbl.sorbs.net (too aggressive, weird
>> re-listing policy etc.). Bl.spamcop.net gets fewer hits, tends to be
>> aggressive sometimes (that's why we have combined rbls with a proper
>> whitelist) but also works as an early detector which is useful.
>> List.dsbl.org gets even fewer hits being last and smaller.
> 
> I will completely concur with the statement about spamcop being too 
> aggressive -- I work with a company that sends out ~10 million messages 
> per month per ip (we're an ESP) and we can get listed on Spamcop for as 
> few as 20 complaints on one of those IPs, and there's absolutely no 
> feedback mechanism that they'll listen to us with.
> 
> Spamhaus is fair. DULs are a great idea. But please, please don't 
> support SpamCop. Their policies are not  fair and you /will/ lose some 
> legitimate email in the process.

Jason, if your folks own "emaildirect.com" expect to remain blacklisted.
I just received a mortgage spam from them which hit no BL rules at all.
I'm motivated to submit it to all and sundry.

{^_^}

Re: RBLs

Posted by Jason Faulkner <jf...@broadwick.com>.

> we are using sbl-xbl.spamhaus.org, dul.dnsbl.sorbs.net, bl.spamcop.net
> and list.dsbl.org in this particular order. The results are available
> at:
>
> http://graph.noc.ntua.gr/a/graph_529.html
>
> Sbl-xbl(zen).spamhaus.org being first in the list and more complete
> gets the most hits. Dul.dnsbl.sorbs.net does a pretty good job without
> causing the problems of dnsbl.sorbs.net (too aggressive, weird
> re-listing policy etc.). Bl.spamcop.net gets fewer hits, tends to be
> aggressive sometimes (that's why we have combined rbls with a proper
> whitelist) but also works as an early detector which is useful.
> List.dsbl.org gets even fewer hits being last and smaller.

I will completely concur with the statement about spamcop being too 
aggressive -- I work with a company that sends out ~10 million messages 
per month per ip (we're an ESP) and we can get listed on Spamcop for as 
few as 20 complaints on one of those IPs, and there's absolutely no 
feedback mechanism that they'll listen to us with.

Spamhaus is fair. DULs are a great idea. But please, please don't 
support SpamCop. Their policies are not  fair and you /will/ lose some 
legitimate email in the process.

-- 
Jason Faulkner
Systems Manager
Broadwick Corporation
(919) 459-2509
jfaulkne@broadwick.com

Re: RBLs

Posted by Panagiotis Christias <ch...@gmail.com>.

On 12/30/06, John Rudd <jr...@ucsc.edu> wrote:
> John D. Hardin wrote:
> > On Fri, 29 Dec 2006, Larry Nedry wrote:
> >
> >> On 12/29/06 at 2:50 PM -0500 Vernon Webb wrote:
> >>> What are you using?
> >> Currently I am using only zen.spamhaus.org.  The rest of the RBLs
> >> that I have tried have had too many false positives to be useful
> >> for my requirements.
> >>
> >> Which RBLs do the rest of you folks feel comfortable using?
> >
> > I use a few others from sorbs.net, but I don't see them having any
> > effect as zen.spamhaus.org catches everything first... :)
> >
>
>
> I've been using sbl-xbl for a while, and then recently switched to zen.
>
> I also recently added list.dsbl.org (called before zen, so I can see how
> much it's really catching).  It's pretty small (about 1/6 of what zen
> catches).
>
> I'm also contemplating adding dul.dnsbl.sorbs.net.
>
> I tend to put the newest (to me) rbl first, so I can see what it's
> actually catching before the stuff I was already using :-)

Hello,

we are using sbl-xbl.spamhaus.org, dul.dnsbl.sorbs.net, bl.spamcop.net
and list.dsbl.org in this particular order. The results are available
at:

http://graph.noc.ntua.gr/a/graph_529.html

Sbl-xbl(zen).spamhaus.org being first in the list and more complete
gets the most hits. Dul.dnsbl.sorbs.net does a pretty good job without
causing the problems of dnsbl.sorbs.net (too aggressive, weird
re-listing policy etc.). Bl.spamcop.net gets fewer hits, tends to be
aggressive sometimes (that's why we have combined rbls with a proper
whitelist) but also works as an early detector which is useful.
List.dsbl.org gets even fewer hits being last and smaller.

Regards,
Panagiotis

Re: RBLs

Posted by John Rudd <jr...@ucsc.edu>.

John D. Hardin wrote:
> On Fri, 29 Dec 2006, Larry Nedry wrote:
> 
>> On 12/29/06 at 2:50 PM -0500 Vernon Webb wrote:
>>> What are you using?
>> Currently I am using only zen.spamhaus.org.  The rest of the RBLs
>> that I have tried have had too many false positives to be useful
>> for my requirements.
>>
>> Which RBLs do the rest of you folks feel comfortable using?
> 
> I use a few others from sorbs.net, but I don't see them having any 
> effect as zen.spamhaus.org catches everything first... :)
> 

I've been using sbl-xbl for a while, and then recently switched to zen.

I also recently added list.dsbl.org (called before zen, so I can see how 
much it's really catching).  It's pretty small (about 1/6 of what zen 
catches).

I'm also contemplating adding dul.dnsbl.sorbs.net.

I tend to put the newest (to me) rbl first, so I can see what it's 
actually catching before the stuff I was already using :-)
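
A rough sketch of why list order changes the per-list hit counts, assuming the MTA stops at the first list that matches. The listing data below is invented and the function name is hypothetical.

```python
from collections import Counter


def count_first_hits(senders, rbl_order, listings):
    """Credit each sender to the first RBL, in order, that lists it."""
    counts = Counter()
    for ip in senders:
        for rbl in rbl_order:
            if ip in listings.get(rbl, set()):
                counts[rbl] += 1
                break  # later lists are never consulted for this sender
    return counts


# Invented data: both lists know 192.0.2.1, only zen knows 192.0.2.2.
listings = {
    "list.dsbl.org": {"192.0.2.1"},
    "zen.spamhaus.org": {"192.0.2.1", "192.0.2.2"},
}
senders = ["192.0.2.1", "192.0.2.2"]

dsbl_first = count_first_hits(senders, ["list.dsbl.org", "zen.spamhaus.org"], listings)
zen_first = count_first_hits(senders, ["zen.spamhaus.org", "list.dsbl.org"], listings)
print(dict(dsbl_first))
print(dict(zen_first))
```

With the newer list first it gets credit for everything it knows about; with it last, it only shows what the earlier lists missed.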

Re: RBLs (was: sa-learn explained)

Posted by "John D. Hardin" <jh...@impsec.org>.

On Fri, 29 Dec 2006, Larry Nedry wrote:

> On 12/29/06 at 2:50 PM -0500 Vernon Webb wrote:
> >What are you using?
> 
> Currently I am using only zen.spamhaus.org.  The rest of the RBLs
> that I have tried have had too many false positives to be useful
> for my requirements.
> 
> Which RBLs do the rest of you folks feel comfortable using?

I use a few others from sorbs.net, but I don't see them having any 
effect as zen.spamhaus.org catches everything first... :)

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Liberals love sex ed because it teaches kids to be safe around their
  sex organs. Conservatives love gun education because it teaches kids
  to be safe around guns. However, both believe that the other's
  education goals lead to dangers too terrible to contemplate.
-----------------------------------------------------------------------
 676 days until the Presidential Election

Re: RBLs (was: sa-learn explained)

Posted by Larry Nedry <sp...@bluestreak.net>.

On 12/29/06 at 2:50 PM -0500 Vernon Webb wrote:
>What are you using?

Currently I am using only zen.spamhaus.org.  The rest of the RBLs that I
have tried have had too many false positives to be useful for my
requirements.

Which RBLs do the rest of you folks feel comfortable using?

Nedry

Re: sa-learn explained

Posted by Larry Nedry <sp...@bluestreak.net>.

On 12/29/06 at 12:09 PM -0500 Vernon Webb wrote:
>Yes, ORDB-RBL & SBL+XBL

FWIW, relays.ordb.org no longer exists:
<http://ordb.org/news/?id=38>

Nedry

Re: sa-learn explained

Posted by Vernon Webb <ve...@comp-wiz.com>.

> My first question would be: 
> Have you installed Rules Du Jour and set it up to have comprehensive coverage? 
> Is pyzor, razor and DCC running? 

Yes
 
> Are you using an RBL? 

Yes, ORDB-RBL & SBL+XBL

Re: sa-learn explained

Posted by Phil Barnett <ph...@philb.us>.

On Friday 29 December 2006 08:23, Vernon Webb wrote:

> These guys are beginning to
> drive me nuts and obviously I have something wrong as others are telling me
> these are being caught as SPAM on their systems.

My first question would be:

Have you installed Rules Du Jour and set it up to have comprehensive coverage?

Is pyzor, razor and DCC running?

Are you using an RBL?

-- 
My other computer is your Windows machine

Re: sa-learn explained

Posted by Duane Hill <d....@yournetplus.com>.

Vernon Webb wrote:
> Yesterday someone asked if I used sa-learn and the response to myself was, I have 
> something else to learn. Can someone explain to me how to use it?
> 
> If I understand correctly sa-learn can be used to train SA to recognize certain 
> messages as SPAM or HAM. I've run the sa-learn command but it is not very clear as to 
> how it is used. I mean I understand if I use "sa-learn --spam" I can train SA that 
> something is SPAM but what, where? For instance today the thing is not "Effie 
> Present"  but rather "Happy NW Effie". So the efforts I took yesterday using the 
> phish.ndb and scan.ndb database is still not catching these guys (however it is 
> catching some Phishing scams).
> 
> I'm willing to try sa-learn, but what will that do for me? These guys are beginning to 
> drive me nuts and obviously I have something wrong as others are telling me these are 
> being caught as SPAM on their systems.
> 
> Thanks
> 

Tons here getting trapped with the "Happy NW (name)" spam:

X-Spam-Level: xxxxxxxxxxxxxxxxxxxx
X-Spam-Status: Hits:20.2 Learn:no Tests:BAYES_99,DATE_IN_PAST_03_06,
	HELO_DYNAMIC_DHCP,HELO_DYNAMIC_IPADDR,RCVD_FORGED_WROTE,RCVD_IN_SORBS_DUL,
	SARE_LWSHORTT,SARE_MLB_Stock1,SARE_MLB_Stock2

auto_learn, auto_whitelist and auto_expire are on and I have set a more 
strict window for auto_learn and when bayes first kicks in:

   bayes_min_ham_num 500
   bayes_min_spam_num 500
   bayes_auto_learn_threshold_nonspam -0.15
   bayes_auto_learn_threshold_spam 15.0
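
Roughly what those four settings mean, as a simplified sketch. As far as I know, real auto-learning works from a separate score that leaves out the Bayes rules (and has extra header/body point requirements for spam), which would explain how a message can show a high hit count and still say Learn:no. The function names here are made up.

```python
# Values copied from the settings above.
MIN_HAM = 500
MIN_SPAM = 500
THRESHOLD_NONSPAM = -0.15
THRESHOLD_SPAM = 15.0


def bayes_active(nham, nspam):
    """Bayes results are only used once enough ham AND spam have been learned."""
    return nham >= MIN_HAM and nspam >= MIN_SPAM


def autolearn_decision(autolearn_score):
    """Simplified: learn as ham below the low mark, as spam above the high mark."""
    if autolearn_score < THRESHOLD_NONSPAM:
        return "ham"
    if autolearn_score > THRESHOLD_SPAM:
        return "spam"
    return "no"  # the wide middle band is never auto-learned


print(autolearn_decision(-1.0), autolearn_decision(3.0), autolearn_decision(16.0))
```

Widening the gap between the two thresholds trades learning speed for a lower risk of training on misclassified mail.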

Here is a dump of the bayes DB stats:

0.000    0          3    0  non-token data: bayes db version
0.000    0       9307    0  non-token data: nspam
0.000    0       2461    0  non-token data: nham
0.000    0     195899    0  non-token data: ntokens
0.000    0 1167342651    0  non-token data: oldest atime
0.000    0 1167411577    0  non-token data: newest atime
0.000    0 1167411583    0  non-token data: last journal sync atime
0.000    0 1167393605    0  non-token data: last expiry atime
0.000    0      50956    0  non-token data: last expire atime delta
0.000    0      86718    0  non-token data: last expire reduction count
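
If you want to track those numbers over time, here is a small sketch that reads them. It assumes the text came from "sa-learn --dump magic" and that the value sits in the third column, as it does in the lines above. The function name is hypothetical.

```python
from datetime import datetime, timezone

# First six lines of the dump above, copied as-is.
DUMP = """\
0.000    0          3    0  non-token data: bayes db version
0.000    0       9307    0  non-token data: nspam
0.000    0       2461    0  non-token data: nham
0.000    0     195899    0  non-token data: ntokens
0.000    0 1167342651    0  non-token data: oldest atime
0.000    0 1167411577    0  non-token data: newest atime
"""

PREFIX = "non-token data: "


def parse_magic(text):
    """Map each 'non-token data' label to its integer value (third column)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split(None, 4)
        if len(fields) == 5 and fields[4].startswith(PREFIX):
            stats[fields[4][len(PREFIX):]] = int(fields[2])
    return stats


stats = parse_magic(DUMP)
spam_share = stats["nspam"] / (stats["nspam"] + stats["nham"])
# The atime values are Unix timestamps.
newest = datetime.fromtimestamp(stats["newest atime"], tz=timezone.utc)
print(f"{spam_share:.0%} of learned messages are spam; newest atime {newest:%Y-%m-%d}")
```

The nspam and nham counts are the ones that get compared against bayes_min_spam_num and bayes_min_ham_num.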

I have had this going now for a few days. I deposit anything scoring 
over 25 in a special mailbox and then reject the message at SMTP. 
Anything scoring over 5 is then placed in individual account spambox 
mailboxes. I am also monitoring closely what is getting rejected. So far 
nothing legit gets rejected. I have also not noticed any message that 
was a false positive/negative get auto_learned. Therefore, token data in 
the bayes DB has not been changed. That is good in a sense as you don't 
want false positive/negative messages getting auto_learned in one 
direction or another.
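
The two cut-offs described above boil down to something like this sketch. The numbers come from the description; the function and the destination names are made up.

```python
REJECT_AT = 25.0   # over this: copy to the special mailbox, reject at SMTP
SPAMBOX_AT = 5.0   # over this: deliver to the account's spambox


def route(score):
    """Return where a message with this SpamAssassin score ends up."""
    if score > REJECT_AT:
        return "reject"
    if score > SPAMBOX_AT:
        return "spambox"
    return "inbox"


# The "Happy NW" example earlier scored 20.2, so it would land in the spambox.
print(route(20.2))
```

Keeping a copy of everything rejected is what makes it possible to check for false positives afterwards.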

Maybe I'm wrong. It works, so I will continue to monitor and be happy.

RE: sa-learn explained

Posted by vertito <ve...@aim-consultants.com>.

man sa-learn 
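
For the "what, where?" part: sa-learn is pointed at a file, directory or mbox of messages that you have already sorted by hand. A small sketch that only builds the command lines, without running anything. The --spam, --ham and --mbox options are real sa-learn options; the helper name and the paths are made up.

```python
import shlex


def sa_learn_command(kind, path, mbox=False):
    """Build (but do not run) an sa-learn command line for hand-sorted mail."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = ["sa-learn", f"--{kind}"]
    if mbox:
        cmd.append("--mbox")  # the path is a single mbox file, not a directory
    cmd.append(path)
    return cmd


# Hypothetical locations; use wherever your missed spam and good mail live.
print(shlex.join(sa_learn_command("spam", "/home/vernon/Maildir/.Junk/cur")))
print(shlex.join(sa_learn_command("ham", "/home/vernon/mail/good.mbox", mbox=True)))
```

Training on ham matters as much as training on spam, since Bayes needs examples of both before it scores anything.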

-----Original Message-----
From: Vernon Webb [mailto:vernon@comp-wiz.com] 
Sent: Friday, December 29, 2006 2:24 PM
To: SpamAssassin
Subject: sa-learn explained

Yesterday someone asked if I used sa-learn and the response to myself was, I have something else to
learn. Can someone explain to me how to use it?

If I understand correctly sa-learn can be used to train SA to recognize certain messages as SPAM or
HAM. I've run the sa-learn command but it is not very clear as to how it is used. I mean I
understand if I use "sa-learn --spam" I can train SA that something is SPAM but what, where? For
instance today the thing is not "Effie Present"  but rather "Happy NW Effie". So the efforts I took
yesterday using the phish.ndb and scan.ndb database is still not catching these guys (however it is
catching some Phishing scams).

I'm willing to try sa-learn, but what will that do for me? These guys are beginning to drive me nuts
and obviously I have something wrong as others are telling me these are being caught as SPAM on
their systems.

Thanks