Posted to users@spamassassin.apache.org by Robert Menschel <Ro...@Menschel.net> on 2004/08/28 23:22:33 UTC

[SARE] HTML rules updates published

SARE's HTML rule sets have been updated.

All rules have been renamed and redescribed as necessary to prevent
--lint warnings under version 3.0.0.

A rule that duplicates a rule added to the 3.0.0 distribution
rule set has been moved to a separate file.

Rules that work well for English-based systems, but that might cause
false positives elsewhere, have been moved to a separate file.

Rules that used to work well but no longer hit spam have been archived.

HTML rules from ratware.cf have been merged into this collection.

There are several ruleset files in this collection (eight, counting the
archive):

70_sare_html0.cf contains those HTML rules which in all SARE
mass-check testing hit ONLY spam. This is the safest of the HTML
rulesets to use.

Unlike 70_sare_html0.cf, the 70_sare_html1.cf ruleset contains rules
which hit ham, or have done so in the past, during SARE mass-check
tests. The S/O ratios calculated by SA's hit-frequencies scripts are all
at or above 0.900. This file also contains rules which hit only spam,
but fewer than 10 spam in our mass-check tests. Systems which are highly
sensitive to false positives and/or tight on resources may want to
exclude this ruleset, pick and choose among its rules, or lower their
scores. Any system using file 1 should also use file 0.
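
For readers unfamiliar with the S/O figure mentioned above, here is a
minimal sketch of how such a ratio can be computed. It assumes the usual
reading of S/O as "spam over overall", taken from the percentage of spam
and the percentage of ham that a rule hits. The function name, parameter
names, and sample counts are hypothetical and are not taken from SARE's
results or from the hit-frequencies scripts themselves.

```python
def so_ratio(spam_hits, spam_total, ham_hits, ham_total):
    """Return spam% / (spam% + ham%), or None if the rule hit nothing.

    Assumption: S/O is computed from hit percentages within each corpus,
    so that corpora of different sizes are weighted equally.
    """
    if spam_total <= 0 or ham_total <= 0:
        raise ValueError("corpus sizes must be positive")
    spam_pct = 100.0 * spam_hits / spam_total
    ham_pct = 100.0 * ham_hits / ham_total
    if spam_pct + ham_pct == 0:
        return None  # no hits at all, so the ratio is undefined
    return spam_pct / (spam_pct + ham_pct)


# Hypothetical counts: 90 of 1000 spam and 10 of 1000 ham were hit.
ratio = so_ratio(90, 1000, 10, 1000)
meets_threshold = ratio is not None and ratio >= 0.900
```

Under this reading, a rule that hits only spam has an S/O of 1.0, and a
rule with no hits in either corpus has no defined S/O.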

70_sare_html2.cf contains only rules which should never hit ham, but
that do not currently hit any emails during SARE mass-check testing
against current corpora. Therefore, systems which are very sensitive to
SpamAssassin overhead may want to exclude this ruleset to avoid the cost
of running its regex tests.

70_sare_html3.cf contains a subset of HTML rules which hit a
significant amount of ham during SARE mass-check tests. Systems which are
very sensitive to false positives should probably NOT install this
ruleset.

70_sare_html4.cf contains a subset of HTML rules which hit over 100 ham
during SARE mass-check tests. Systems which are very sensitive to false
positives should probably NOT install this ruleset.

70_sare_html_eng.cf contains HTML rules which work well where English
is the only expected language, but that may cause false positives in
systems which receive a significant number of emails in other languages.

70_sare_html_x30.cf contains HTML rules which have been merged into
the SpamAssassin distribution rule set in version 3.0.0. Systems which
have upgraded to this version should not use this file. Systems running
any 2.5x or 2.6x version should benefit from these rules.

70_sare_html_arc.cf contains HTML rules which used to work, but which
no longer hit spam, or that hit too many ham. SARE will continue testing
these rules and will reactivate any that again become productive. This
rules file should be used only by the most aggressive systems with
available resources.

As announced two months ago, the old coding_html.cf rule set file has
now been deleted.

Bob Menschel