Posted to users@spamassassin.apache.org by Dan Roberts <da...@JLAZYH.COM> on 2009/04/22 16:25:47 UTC

Issues with sa-update - seems to be running but not really updating

I am running Spamassassin Version 3.2.4 with Perl Version 5.8.8 on a  
CentOS Linux 5.2 system.

As FAQ postings and other notes appeared to suggest running sa-update  
daily, I set up a cron job to do that.   Though it has updated in the  
past, it seems most of the time to find nothing to update, yet the  
volume of spam getting through is definitely increasing.

Below  is a typical result from sa-update -D.  The current version  
759778 seems to have been unchanged for a while - and that seems  
strange.

I do see that there are modules not installed, and perhaps adding a  
few would help - it was a default install via CentOS, which up till  
recently has been pretty good.

So my question is, should I see greater frequency of updates, and if  
so, does the info below indicate why I am not?   Any suggestions on  
how to improve or resolve this?

Much appreciated.
Dan

[root@trailrunner dan]# sa-update -D && service spamassassin restart
[6634] dbg: logger: adding facilities: all
[6634] dbg: logger: logging level is DBG
[6634] dbg: generic: SpamAssassin version 3.2.4
[6634] dbg: config: score set 0 chosen.
[6634] dbg: dns: is Net::DNS::Resolver available? yes
[6634] dbg: dns: Net::DNS version: 0.59
[6634] dbg: generic: sa-update version svn607589
[6634] dbg: generic: using update directory: /var/lib/spamassassin/3.002004
[6634] dbg: diag: perl platform: 5.008008 linux
[6634] dbg: diag: module installed: Digest::SHA1, version 2.11
[6634] dbg: diag: module installed: HTML::Parser, version 3.55
[6634] dbg: diag: module installed: Net::DNS, version 0.59
[6634] dbg: diag: module installed: MIME::Base64, version 3.07
[6634] dbg: diag: module installed: DB_File, version 1.814
[6634] dbg: diag: module installed: Net::SMTP, version 2.29
[6634] dbg: diag: module not installed: Mail::SPF ('require' failed)
[6634] dbg: diag: module not installed: Mail::SPF::Query ('require' failed)
[6634] dbg: diag: module not installed: IP::Country::Fast ('require' failed)
[6634] dbg: diag: module not installed: Razor2::Client::Agent ('require' failed)
[6634] dbg: diag: module not installed: Net::Ident ('require' failed)
[6634] dbg: diag: module installed: IO::Socket::INET6, version 2.51
[6634] dbg: diag: module installed: IO::Socket::SSL, version 1.01
[6634] dbg: diag: module installed: Compress::Zlib, version 1.42
[6634] dbg: diag: module installed: Time::HiRes, version 1.86
[6634] dbg: diag: module not installed: Mail::DomainKeys ('require' failed)
[6634] dbg: diag: module not installed: Mail::DKIM ('require' failed)
[6634] dbg: diag: module installed: DBI, version 1.52
[6634] dbg: diag: module installed: Getopt::Long, version 2.35
[6634] dbg: diag: module installed: LWP::UserAgent, version 2.033
[6634] dbg: diag: module installed: HTTP::Date, version 1.47
[6634] dbg: diag: module installed: Archive::Tar, version 1.30
[6634] dbg: diag: module installed: IO::Zlib, version 1.04
[6634] dbg: diag: module not installed: Encode::Detect ('require' failed)
[6634] dbg: gpg: Searching for 'gpg'
[6634] dbg: util: current PATH is: /usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin
[6634] dbg: util: executable for gpg was found at /usr/bin/gpg
[6634] dbg: gpg: found /usr/bin/gpg
[6634] dbg: gpg: release trusted key id list: 5E541DC959CB8BAC7C78DFDC4056A61A5244EC45 26C900A46DD40CD5AD24F6D7DEE01987265FA05B 0C2B1D7175B852C64B3CDC716C55397824F434CE
[6634] dbg: channel: attempting channel updates.spamassassin.org
[6634] dbg: channel: update directory /var/lib/spamassassin/3.002004/updates_spamassassin_org
[6634] dbg: channel: channel cf file /var/lib/spamassassin/3.002004/updates_spamassassin_org.cf
[6634] dbg: channel: channel pre file /var/lib/spamassassin/3.002004/updates_spamassassin_org.pre
[6634] dbg: channel: metadata version = 759778
[6634] dbg: dns: 4.2.3.updates.spamassassin.org => 759778, parsed as 759778
[6634] dbg: channel: current version is 759778, new version is 759778, skipping channel
[6634] dbg: diag: updates complete, exiting with code 1
[root@trailrunner dan]#
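
For reading the last debug line: the sa-update man page documents exit
code 0 as "update downloaded and installed", 1 as "no fresh updates
available", and 4 or higher as an error. Below is a minimal cron-wrapper
sketch under those assumptions. The function names and the SA_UPDATE /
RESTART_CMD override variables are made up for illustration and are not
part of SpamAssassin.

```shell
#!/bin/sh
# Map an sa-update exit status to a short label.
# 0 = updated, 1 = nothing new, 4 or higher = error (per the man page).
# Anything else is reported as "other" in this sketch.
describe_sa_update_exit() {
    case "$1" in
        0) echo "updated" ;;
        1) echo "no-update" ;;
        *) if [ "$1" -ge 4 ] 2>/dev/null; then echo "error"; else echo "other"; fi ;;
    esac
}

# Run sa-update and restart spamd only after a successful update.
# SA_UPDATE and RESTART_CMD are hypothetical hooks for dry runs.
run_sa_update() {
    status=0
    ${SA_UPDATE:-sa-update} || status=$?
    result=$(describe_sa_update_exit "$status")
    echo "sa-update: $result (exit $status)"
    if [ "$result" = "updated" ]; then
        ${RESTART_CMD:-service spamassassin restart}
    fi
    # Report "nothing new" as success so cron does not raise a false alarm.
    [ "$result" = "updated" ] || [ "$result" = "no-update" ]
}
```

With this mapping, "exiting with code 1" in the log above only means that
nothing new was published, which is also why the
"&& service spamassassin restart" part of the command did not run.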



Re: Issues with sa-update - seems to be running but not really updating

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Wed, 2009-04-22 at 09:01 -0600, Dan Roberts wrote:
> I thought with the newer versions, rather than training Spamassassin,  
> the sa-update was the recommended way to update the rule set for  
> thwarting spam.

No. Rule updates are *NO* substitute for training. The Bayes subsystem
is independent and can *only* be enhanced by the same old training using
sa-learn.

Given that -- did you check your Bayes results? Are they getting worse?
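
A rough sketch of what training and checking Bayes can look like from the
shell. It uses only the documented sa-learn options --spam, --ham and
--dump magic. The helper names, the SA_LEARN override variable, and the
assumption that the counter sits in the third column of the
"non-token data" lines are illustrative and should be checked against
your own output.

```shell
#!/bin/sh
# Train Bayes from a directory or mbox of hand-sorted mail.
# kind must be "spam" or "ham"; SA_LEARN is a hypothetical dry-run hook.
train_bayes() {
    kind=$1
    path=$2
    case "$kind" in
        spam|ham) ${SA_LEARN:-sa-learn} "--$kind" "$path" ;;
        *) echo "usage: train_bayes spam|ham PATH" >&2; return 2 ;;
    esac
}

# Read `sa-learn --dump magic` output on stdin and print the counter
# (assumed to be the third column) for the named entry, e.g. nspam or nham.
bayes_count() {
    awk -v name="$1" '$0 ~ ("non-token data: " name "$") { print $3 }'
}
```

Usage would be along the lines of "train_bayes spam ~/Maildir/.Spam/cur"
followed by "sa-learn --dump magic | bayes_count nspam". By default Bayes
is not used for scoring until at least 200 spam and 200 ham messages have
been learned, and the training has to run as the user whose Bayes
database spamd actually reads.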


> Is there another, perhaps better way to update the rule sets and get  
> spam to start dropping off?

One issue at a time. ;)  See my other post regarding enhancing results
and fixing issues with FNs.

Anyway, no, there is no better way "to update the [stock] rule sets".
Third-party rule-sets are a different topic and some of them actually do
use sa-update themselves.
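
As an illustration of rule-sets distributed through sa-update, here is a
minimal sketch that updates several channels in one run using the
documented --channel and --gpgkey options. The helper name, the SA_UPDATE
override variable, the channel "rules.example.org" and the key id
"DEADBEEF" are placeholders, not real values; the real ones come from
whoever publishes the rule-set.

```shell
#!/bin/sh
# Update one or more channels in a single sa-update run.
# Usage: update_channels GPGKEY CHANNEL...
# SA_UPDATE is a hypothetical hook so the command line can be dry-run.
update_channels() {
    key=$1
    shift
    args=""
    for ch in "$@"; do
        args="$args --channel $ch"
    done
    # $args is left unquoted on purpose so it splits into separate options;
    # channel names are DNS names and contain no whitespace.
    ${SA_UPDATE:-sa-update} --gpgkey "$key" $args
}
```

For example, "update_channels DEADBEEF updates.spamassassin.org
rules.example.org". Once --channel is given explicitly, the stock channel
is only updated if it is listed too, and the publisher's key has to be
imported first with "sa-update --import".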

  guenther


[ useless full quote under including the sig -- snipped ]

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Re: Issues with sa-update - seems to be running but not really updating

Posted by Dan Roberts <da...@jlazyh.com>.
Hi back -

I thought with the newer versions, rather than training Spamassassin,  
the sa-update was the recommended way to update the rule set for  
thwarting spam.

Is there another, perhaps better way to update the rule sets and get  
spam to start dropping off?



On Apr 22, 2009, at 8:48 AM, Karsten Bräckelmann wrote:

> On Wed, 2009-04-22 at 08:25 -0600, Dan Roberts wrote:
>> I am running Spamassassin Version 3.2.4 with Perl Version 5.8.8 on a
>> CentOS Linux 5.2 system.
>>
>> As FAQ postings and other notes appeared to suggest running sa-update
>> daily, I set up a cron job to do that.   Though it has updated in the
>> past, it seems most of the time to find nothing to update, [...]
>
> One issue at a time...
>
>> So my question is, should I see greater frequency of updates, and if
>> so, does the info below indicate why I am not?   Any suggestions on
>> how to improve or resolve this?
>
> There is nothing wrong or strange about that. There just is no more
> recent update.
>
> Updates as of now are primarily used to fix issues, like dropping
> obsolete rules with a low hit-rate or poor S/O ratio but resulting in
> FPs. Updates are *not* used to drastically change scores or rules,
> because that requires a full GA run on a massive corpus -- and sometimes
> also depends on new code.
>
> Also, please keep in mind that updates are pushed manually, by humans
> (err, the devs) and thus require some work and spare time on their
> part...
>
>
>> [6634] dbg: channel: metadata version = 759778
>> [6634] dbg: dns: 4.2.3.updates.spamassassin.org => 759778, parsed as 759778
>> [6634] dbg: channel: current version is 759778, new version is 759778, skipping channel


Re: Issues with sa-update - seems to be running but not really updating

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Wed, 2009-04-22 at 08:25 -0600, Dan Roberts wrote:
> I am running Spamassassin Version 3.2.4 with Perl Version 5.8.8 on a  
> CentOS Linux 5.2 system.
> 
> As FAQ postings and other notes appeared to suggest running sa-update  
> daily, I set up a cron job to do that.   Though it has updated in the  
> past, it seems most of the time to find nothing to update, [...]

One issue at a time...

> So my question is, should I see greater frequency of updates, and if  
> so, does the info below indicate why I am not?   Any suggestions on  
> how to improve or resolve this?

There is nothing wrong or strange about that. There just is no more
recent update.

Updates as of now are primarily used to fix issues, like dropping
obsolete rules with a low hit-rate or poor S/O ratio but resulting in
FPs. Updates are *not* used to drastically change scores or rules,
because that requires a full GA run on a massive corpus -- and sometimes
also depends on new code.

Also, please keep in mind that updates are pushed manually, by humans
(err, the devs) and thus require some work and spare time on their
part...


> [6634] dbg: channel: metadata version = 759778
> [6634] dbg: dns: 4.2.3.updates.spamassassin.org => 759778, parsed as 759778
> [6634] dbg: channel: current version is 759778, new version is 759778, skipping channel
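
The three debug lines above show the whole check: sa-update looks up a
TXT record whose name is the SpamAssassin version reversed plus the
channel name, and compares the number it gets back with the locally
installed one. Below is a rough sketch for repeating that by hand. The
helper names are hypothetical, and the assumption that the channel .cf
file carries a "# UPDATE version NNN" line should be verified against
your own file.

```shell
#!/bin/sh
# Build the DNS name sa-update queries: version components reversed,
# followed by the channel name (3.2.4 becomes 4.2.3).
channel_dns_name() {
    version=$1
    channel=$2
    reversed=$(echo "$version" |
        awk -F. '{ for (i = NF; i > 1; i--) printf "%s.", $i; print $1 }')
    echo "$reversed.$channel"
}

# Print the update number recorded in a channel .cf file.
# Assumes a "# UPDATE version NNN" line exists in that file.
local_update_version() {
    sed -n 's/^# UPDATE version \([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Compare the local update number ($1) with the published one ($2).
compare_versions() {
    if [ "$2" -gt "$1" ]; then
        echo "update available: $1 -> $2"
    else
        echo "up to date ($1)"
    fi
}
```

If dig is installed, something like
dig +short TXT "$(channel_dns_name 3.2.4 updates.spamassassin.org)"
should print the published number in quotes, which can then be fed to
compare_versions together with the local one.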

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}


Spam slipping through (was: Issues with sa-update - seems to be running but not really updating)

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Wed, 2009-04-22 at 08:25 -0600, Dan Roberts wrote:
> I am running Spamassassin Version 3.2.4 with Perl Version 5.8.8 on a  
> CentOS Linux 5.2 system.
> 
> As FAQ postings and other notes appeared to suggest running sa-update  
> daily, I set up a cron job to do that.   Though it has updated in the  
> past, it seems most of the time to find nothing to update, yet the  
> volume of spam getting through is definitely increasing.

Not for me... ;)

Seriously, this can hardly be "fixed" with an sa-update run, but
indicates there is much room for improvement in your local install, or
probably even a mis-configuration.

As always, we can't give any advice on how to improve spam detection
unless you provide samples. Well, other than re-iterating [1] over and
over again what's been posted here. It's in the archives. Yes, lurking
on and reading lists often yields hints before one even has the problem.

Please either use your own web-space or a pastebin to upload full, raw
samples including all headers. Provide a link, don't post spam samples
on-list.

  guenther


[1] Some of them:  Enable network tests, train Bayes. Use third-party
    rules and plugins, like iXhash, SOUGHT and 90_2tld for URIBL
    results. The latter has shown quite some impact recently.
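
On "enable network tests": spamd skips all network tests when it is
started with -L (or --local). The sketch below is a small check for that
flag in an option string. The helper name is hypothetical, it only
recognises the flag as a separate word, and the idea that the options
live in SPAMDOPTIONS in /etc/sysconfig/spamassassin is an assumption
about a CentOS-style install.

```shell
#!/bin/sh
# Succeed if a spamd option string contains -L or --local as its own word.
# Crude on purpose: bundled short options such as "-dL" are not detected.
spamd_is_local_only() {
    for opt in $1; do
        case "$opt" in
            -L|--local) return 0 ;;
        esac
    done
    return 1
}
```

For example, after sourcing the sysconfig file, run
spamd_is_local_only "$SPAMDOPTIONS" && echo "network tests are off".
Independently of that flag, the "module not installed" lines for
Razor2::Client::Agent and Mail::SPF in the sa-update -D output earlier in
the thread mean those particular network checks cannot run until the
modules are installed.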

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}