Posted to users@spamassassin.apache.org by Yves Goergen <no...@unclassified.de> on 2010/05/22 20:42:35 UTC

Updated rules are not regarded

Hi,

I have recently added the 70_zmi_german_cf_zmi_sa-update_dostech_net
rules to my sa-update call. The rules seem to get updated; I can also
find the files under /var/lib/spamassassin/3.002005/. It just doesn't
look as if they're being applied. In my reject log, none of the ZMI rules
have appeared for days. And still some spam is passing the filter (though
most of it should be caught).

What more do I need to do, beyond declaring the channel in sa-update, for
it to be used?
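
(For reference, a call along these lines; this is only a sketch: the channel
name is inferred from the rules file name above, and the GPG key ID is left
out.)

  sa-update -v \
    --channel updates.spamassassin.org \
    --channel 70_zmi_german.cf.zmi.sa-update.dostech.net \
    --gpgkey <key ID published by the channel>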

-- 
Yves Goergen "LonelyPixel" <no...@unclassified.de>
Visit my web laboratory at http://beta.unclassified.de

Re: Updated rules are not regarded

Posted by Yves Goergen <no...@unclassified.de>.
On 22.05.2010 21:18 CE(S)T, Jari Fredriksson wrote:
> Just stating the obvious, but have you restarted/reloaded spamd, or
> amavisd, or whatever you use to run the SA code?

Yes, I have. My sa-update-check script passes all the sa-update parameters
I need and restarts spamd whenever there was an update. I saw in my syslog
report that it did so. This has usually picked up my local configuration
changes; it just doesn't seem to pick up this rule set.
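
(The core of such a check is just sa-update's exit status: 0 means updates
were downloaded and installed, 1 means no fresh updates were available. A
minimal sketch, with placeholder channel names and init script path:)

  #!/bin/sh
  # run sa-update and restart spamd only when new rules were installed
  sa-update --channel updates.spamassassin.org \
            --channel 70_zmi_german.cf.zmi.sa-update.dostech.net
  case $? in
    0) /etc/init.d/spamassassin restart ;;   # new rules installed
    1) ;;                                    # no updates available
    *) echo "sa-update failed" >&2 ;;        # lint failure or other error
  esac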

Or maybe the new rules don't catch a thing for me? How could I test that?

-- 
Yves Goergen "LonelyPixel" <no...@unclassified.de>
Visit my web laboratory at http://beta.unclassified.de

Re: Updated rules are not regarded

Posted by Jari Fredriksson <ja...@iki.fi>.
> Hi,
>
> I have recently added the 70_zmi_german_cf_zmi_sa-update_dostech_net
> rules to my sa-update call. The rules seem to get updated; I can also
> find the files under /var/lib/spamassassin/3.002005/. It just doesn't
> look as if they're being applied. In my reject log, none of the ZMI rules
> have appeared for days. And still some spam is passing the filter (though
> most of it should be caught).
>
> What more do I need to do, beyond declaring the channel in sa-update,
> for it to be used?
>

Just stating the obvious, but have you restarted/reloaded spamd, or
amavisd, or whatever you use to run the SA code?

Re: Updated rules are not regarded

Posted by Adam Katz <an...@khopis.com>.
On 05/29/2010 05:03 AM, Yves Goergen wrote:
>> Stepping away from the ZMI issue and heading towards the larger
>> picture, what kind of spam are you trying to nail down with this 
>> ruleset?  What goals did you hope to meet with the ZMI rules?  If
>> it's a specific type of spam, can you pastebin an example so we
>> can help you more directly?
> 
> I have submitted a couple of those spam messages to the ruleset
> maintainer, but I'm not sure whether that helps. I can repost them here
> if you'd like to see them. (ZIP, 48 kB)

If they're evading Bayes and the other filters, they might be worth a
look.  I can take a look at them if you post them to pastebin.com or a
similar site and then send me the links (this is the best way to avoid
spam filters on the list, etc.).

>> Are you using Bayes?  Are you training it?
> 
> Yes. Yes. I'm only training it with spam messages, though; I assume
> it autolearns all the rest. But the Bayes filter is absolutely
> useless to me: it most often rates spam at 0-1%, even for repeatedly
> learned spam messages. Maybe I should erase the Bayes database and
> start over from scratch?

Bayes won't work unless you have lots of both spam and ham.  Autolearn
is apparently not doing its job if most of your spams hit 0-1%.  Try
teaching it everything you have.  If you're that out of whack, it
might be worthwhile to start from scratch as you suggested.
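
(A minimal sketch of that with sa-learn; the Maildir paths are placeholders
for wherever you keep sorted spam and ham. Note that Bayes won't score at
all until it has learned a minimum number of each, 200 by default.)

  sa-learn --clear                      # wipe the existing Bayes database
  sa-learn --spam ~/Maildir/.Junk/cur   # train on known spam
  sa-learn --ham  ~/Maildir/cur         # train on known ham
  sa-learn --dump magic                 # check the nspam/nham counts afterwards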

>> Most people who want to improve their deployment's SA filters
>> aren't properly utilizing the various plugins.  Specifically,
>> DNSBLs, URIBLs, and Bayes, but also things like Razor2, DCC (if
>> legal), and Pyzor.
> 
> The single most helpful plugin for me is Botnet. It detects most spam
> and adds 5 points, which is often a big step towards rejection.

I've heard good things about Botnet, though most of its dynamic checks
appear to already be folded into SA's trunk (I've actually got some
detection rules in there that are more sophisticated but are not yet
done cooking).

That said, the dynamic detection bits like Botnet should pale in
comparison to any one of: DNSBLs, URIBLs, Bayes, Razor2, DCC, and
Pyzor.  Almost every "help me make SA filter better" case I encounter
ends up being a misconfiguration of some or all of those things.
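
(A quick sanity check for that, as a sketch: confirm the network-test
plugins are actually enabled. The paths below follow the Debian layout;
adjust them for your install.)

  grep '^loadplugin.*\(Razor2\|Pyzor\|DCC\|URIDNSBL\)' /etc/spamassassin/*.pre
  grep -r 'skip_rbl_checks\|use_bayes\|use_razor2\|use_pyzor\|use_dcc' \
      /etc/spamassassin/ /etc/mail/spamassassin/ 2>/dev/null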

> Most other SA rules don't detect anything, although I'm running
> sa-update daily and it reports an update every few weeks. Only the
> DNSBL rules apply every once in a while - at least to what is
> passing the filter. I haven't investigated what's been blocked
> successfully. I think I still have the ImageInfo plugin installed,
> but I don't think it catches anything these days. Image spam seems
> to be over.

DNSBLs do a good job; you're probably not noticing them because
anything they nail gets hit pretty hard by several rules and thus
probably hits your "block" threshold.
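
(One way to check, as a sketch: many of SA's DNSBL rules share the RCVD_IN_
prefix and the URIBL rules the URIBL_ prefix, so a log count is easy:)

  zgrep -c 'spamd: result: .*RCVD_IN_' /var/log/mail.log*
  zgrep -c 'spamd: result: .*URIBL_'   /var/log/mail.log*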

Image spam comes and goes.  Third party plugins like iXhash can help.

>> Upgrading to SA 3.3.1 would be a big step up if you're not there 
>> already (if you can't, you might want to consider a back-port of
>> the better DNSBLs to SA 3.2.x like my khop-bl channel).
> 
> I need to upgrade to SA 3.3, true. It's always been a hassle
> somewhere between CPAN, other dysfunctional Perl junk, source code,
> and Debian packages... It's a very complicated job. I'm also
> considering setting up the entire machine anew on an Ubuntu basis
> and using only distribution packages, but that's not something I can
> do in the near future.

Messing with CPAN will work, but might feel daunting, especially if
you've never done it before.  It also introduces an additional thing
to keep track of.  For Debian, I recommend the volatile and backports
repositories.  Go to www.backports.org and add lenny-backports, then
pin it to a low priority and un-pin spamassassin.
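
(The matching sources.list entry would look roughly like this; a sketch,
with the archive key import omitted:)

deb http://www.backports.org/debian lenny-backports main contrib non-free

And the corresponding pins: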


Package: *
Pin: release a=lenny-backports
Pin-Priority: 150

Package: spamassassin
Pin: release a=lenny-backports
Pin-Priority: 500


I've also got testing and unstable pinned even lower at 1 and -1, but
that's up to you.  500 is the default pin, 101-500 will upgrade a
manually-installed newer package if there is a candidate, 1-100 will
install candidates if higher pin versions are missing, and lower pins
are never installed.  See the apt_preferences man page for details.


# apt-cache policy spamassassin
spamassassin:
  Installed: 3.2.5-2+lenny1.1~volatile1
  Candidate: 3.3.1-1~bpo50+1
  Package pin: 3.3.1-1~bpo50+1
  Version table:
     3.3.1-1 500
          1 http://debian.lcs.mit.edu/debian/ squeeze/main Packages
         -1 http://debian.lcs.mit.edu/debian/ unstable/main Packages
     3.3.1-1~bpo50+1 500
        150 http://www.backports.org lenny-backports/main Packages
     3.2.5-2+lenny2 500
        500 http://debian.lcs.mit.edu/debian/ lenny/main Packages
     3.2.5-2+lenny1.1~volatile1 500
        500 http://volatile.debian.org lenny/volatile/main Packages
# aptitude install spamassassin
...

Re: Updated rules are not regarded

Posted by Yves Goergen <no...@unclassified.de>.
On 26.05.2010 01:09 CE(S)T, Adam Katz wrote:
> Please note that the ZMI German rules are very old, and while there
> have been a few recent tweaks to the file, it doesn't look terribly
> useful to any system that uses the Bayesian filter (more on this
> later).  I would expect these rules to fire quite rarely, even in
> environments that have lots of German-language mail.

Thanks for the info. So I might as well drop it again.

>   spamassassin --lint -D config 2>&1 |grep zmi_german

>   spamassassin --lint

Both of those told me that everything should be fine.

> Next, let's see if the rules are ever triggering.  This is merely a
> question of filtering your logs (assuming SA is properly logged).

I did that already a few days ago but there was no hit. That's why I was
asking. But it seems the rules are useless to me.

> Stepping away from the ZMI issue and heading towards the larger
> picture, what kind of spam are you trying to nail down with this
> ruleset?  What goals did you hope to meet with the ZMI rules?  If it's
> a specific type of spam, can you pastebin an example so we can help
> you more directly?

I have submitted a couple of those spam messages to the ruleset
maintainer, but I'm not sure whether that helps. I can repost them here
if you'd like to see them. (ZIP, 48 kB)

> Are you using Bayes?  Are you training it?

Yes. Yes. I'm only training it with spam messages, though; I assume it
autolearns all the rest. But the Bayes filter is absolutely useless to
me: it most often rates spam at 0-1%, even for repeatedly learned spam
messages. Maybe I should erase the Bayes database and start over from
scratch?

> Most people who want to improve their deployment's SA filters aren't
> properly utilizing the various plugins.  Specifically, DNSBLs, URIBLs,
> and Bayes, but also things like Razor2, DCC (if legal), and Pyzor.

The single most helpful plugin for me is Botnet. It detects most spam and
adds 5 points, which is often a big step towards rejection. Most other
SA rules don't detect anything, although I'm running sa-update daily and
it reports an update every few weeks. Only the DNSBL rules apply every
once in a while - at least to what is passing the filter. I haven't
investigated what's been blocked successfully. I think I still have the
ImageInfo plugin installed, but I don't think it catches anything these
days. Image spam seems to be over.

> Upgrading to SA 3.3.1 would be a big step up if you're not there
> already (if you can't, you might want to consider a back-port of the
> better DNSBLs to SA 3.2.x like my khop-bl channel).

I need to upgrade to SA 3.3, true. It's always been a hassle somewhere
between CPAN, other dysfunctional Perl junk, source code, and Debian
packages... It's a very complicated job. I'm also considering setting up
the entire machine anew on an Ubuntu basis and using only distribution
packages, but that's not something I can do in the near future.

-- 
Yves Goergen "LonelyPixel" <no...@unclassified.de>
Visit my web laboratory at http://beta.unclassified.de

Re: Updated rules are not regarded

Posted by Adam Katz <an...@khopis.com>.
Please note that the ZMI German rules are very old, and while there
have been a few recent tweaks to the file, it doesn't look terribly
useful to any system that uses the Bayesian filter (more on this
later).  I would expect these rules to fire quite rarely, even in
environments that have lots of German-language mail.


Yves added ZMI via sa-update channels.  He confirmed its presence in
the correct area but wants to confirm it can run.

This command will tell you if SA is properly loading the configuration
file (this should note loading the ZMI rules):

  spamassassin --lint -D config 2>&1 |grep zmi_german

You can run lint without debug to see if SA takes issue with any of
the rules (no output means you're good):

  spamassassin --lint

Next, let's see if the rules are ever triggering.  This is merely a
question of filtering your logs (assuming SA is properly logged).

To do this, we'll first verify that the expected data is in your logs
and see how many messages SA scanned in this sampling period:

  zgrep -c 'spamd: result:' /var/log/mail.log*

Now let's look for rules from ZMI.  Since this rule set uses a common
prefix for all rules, this is an easy search:

  zgrep -c "spamd: result: .*ZMI" /var/log/mail.log*

I expect the results of the last two scans to be a very high number
for the total scanned message count and then a very low number (like
zero) for the ZMI-hitting message count.


For completeness, here's how to actually grab the rules by name (in any
POSIX/Bourne shell such as bash, but not in tcsh):

  RULES=`egrep '^ *score' 70_zmi_german.cf |awk '{printf $2"|"}'`

  zgrep -E -c "spamd: result: .*(${RULES%?})" /var/log/mail.log*


Finally, if you believe that the rules are being ignored, you can
compose a test to see if that is actually the case.  Take a *full*
sample spam and feed it into SA with a replaced subject as a test:

  formail -I "Subject: NLP Profis" < message.txt |spamassassin -t

You should see (among other things) a line noting that
ZMIde_SUBNLP_PROFI has been hit.


Stepping away from the ZMI issue and heading towards the larger
picture, what kind of spam are you trying to nail down with this
ruleset?  What goals did you hope to meet with the ZMI rules?  If it's
a specific type of spam, can you pastebin an example so we can help
you more directly?

Returning to my initial statement, I am under the impression that this
channel is useful only to victims of German spam who do not use
Bayes.  From a quick examination of the rules, it appears to be mostly
geared toward SA installations that cannot run Bayesian filtering, since
Bayes should be fully capable of catching everything ALL of those rules
do (with the possible exception of ZMISOBER_P_SPAM, due to its
examination of several non-word elements) ... and Bayes should do a
better job, too.

Are you using Bayes?  Are you training it?

Most people who want to improve their deployment's SA filters aren't
properly utilizing the various plugins.  Specifically, DNSBLs, URIBLs,
and Bayes, but also things like Razor2, DCC (if legal), and Pyzor.
Upgrading to SA 3.3.1 would be a big step up if you're not there
already (if you can't, you might want to consider a back-port of the
better DNSBLs to SA 3.2.x like my khop-bl channel).

Testing on a piece of spam:

  spamassassin -D < msg.txt > debug.txt 2>&1

Should reveal (among MANY other lines) output similar to this:

[5841] dbg: async: completed in 0.240 s: DNSBL-A,
dns:A:107.49.73.222.zen.spamhaus.org.

[5841] dbg: async: completed in 0.249 s: URI-DNSBL,
DNSBL:multi.uribl.com.:www.net.cn

[5841] dbg: bayes: score = 1

[5841] dbg: razor2: results: spam? 1

[5841] dbg: pyzor: got response: public.pyzor.org:24441 (200, 'OK') 4 0

[5841] dbg: dcc: dccifd got response: X-DCC-SIHOPE-DCC-3-Metrics:
guardian.ics.com 1085; Body=1 Fuz1=many Fuz2=many


This hit all of those checks because I tested on a spam that had
previously been run through 'spamassassin -r' (which trains Bayes and
reports to Razor2 and the others) ... you should still see results, even
if they come back as ham.  What you want from this test is simply
successful connections to the servers, not the spam/ham verdicts.

Re: Updated rules are not regarded

Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Sat, 2010-05-22 at 20:42 +0200, Yves Goergen wrote:
> I have recently added the 70_zmi_german_cf_zmi_sa-update_dostech_net
> rules to my sa-update call. The rules seem to get updated; I can also
> find the files under /var/lib/spamassassin/3.002005/. It just doesn't
> look as if they're being applied. In my reject log, none of the ZMI rules
> have appeared for days. And still some spam is passing the filter (though
> most of it should be caught).

It appears you believe that just throwing in a new rule-set would make
you catch more spam. This assumption is false.

If it doesn't match, it doesn't match. *Your* particular incoming spam
stream, that is.


In a later post you said:
> Or maybe the new rules don't catch a thing for me? How could I test that?

Sic. So you don't know whether it might even help. You just added it to
the mix and expected it to work wonders.

If you do *not* know that it even *should* match anything for you, don't
assume it will. You pulled the rule-set via sa-update. You restarted
your spamd daemon. Thus, it works. Whether or not it catches a lot of
spam is an entirely different topic.

You want to test whether it catches anything for you? Well, just check
your logs. Not specifically your reject log, mind you, but for any SA
rule hits. Alternatively, have a look at the rule-set you blindly
installed, and check whether it actually matches the spam you *want* to
be caught.


Lastly, if you really wonder whether sa-update and spamd are using the
same rules, you can always hand-craft a message that triggers one of
these rules. Feed it to your SA, and you'll see.
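
(A sketch of that, reusing the subject trick from Adam's post: force a
subject that one of the ZMI rules matches, then look for the rule name in
the report. Here sample.txt is any full message you have on hand.)

  formail -I "Subject: NLP Profis" < sample.txt | spamassassin -t | grep -i zmi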


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}