Posted to dev@spamassassin.apache.org by Theo Van Dinter <fe...@apache.org> on 2006/03/06 21:31:34 UTC

Re: svn commit: r383618 - in /spamassassin: rules/trunk/core/25_replace.cf rules/trunk/core/60_whitelist_spf.cf trunk/rules/25_replace.cf trunk/rules/60_whitelist_spf.cf

On Mon, Mar 06, 2006 at 08:11:24PM +0000, Justin Mason wrote:
> > more config files that go with plugins, so they should be in rules and not rulesrc
> 
> Hold on -- I'm not keen on this.  What's wrong with "ifplugin"?

There's nothing wrong with "ifplugin", but the idea was to have
code-related rules in the rules area, and not rulesrc.  Plugins are code,
and we already had the other plugin files (uribl, razor, dcc, awl, etc.)
in rules.  I was just moving the outliers into the right place.

-- 
Randomly Generated Tagline:
"I don't think Microsoft is evil in itself; I just think they make really
 crappy operating systems."            - Linus Torvalds

Re: code-tied rules / full as plugin

Posted by Theo Van Dinter <fe...@apache.org>.
On Wed, Mar 08, 2006 at 12:31:57PM +0000, Justin Mason wrote:
> - (c) If something is implemented in a plugin, it is *not* a code-tied
>   rule, that just makes no sense.

Sure it does; you say as much below.  It depends on the situation.

> I would define the following criteria to determine if a rule is code-tied
> or not:
> 
> - 1. implementation is an "eval:" function
> 
> - 2. the eval function in question is not a documented public API, for use
>   in more than one rule.

Just to finish my statement from above, there are a bunch of plugins
which meet these criteria to be considered "code-tied": Razor, Pyzor,
DCC, Whitelist SPF, DomainKeys, etc.  So it makes sense, depending on
the situation.  We both were talking about different situations where
we're both right in our own little space.



Anyway, I haven't fully thought through whether or not the above criteria
are sufficient, but for now let's say they are.  There are still a few
issues that come to mind:

1) Any rules implemented by a plugin need to be wrapped in ifplugin/endif
   conditionals.  This usually isn't a problem, but we need to keep it in mind.
2) Any rules not implemented by a plugin, and not of a type that has
   been around for a long time (let's say since 3.0.0), need to be wrapped
   in "if version"/endif conditionals.  Right now I think we just put
   them in the rules/ dir, which may be sufficient, but on to #3.
3) Rule updates/auto activation/promotion/score generation need to be
   modified to deal with multiple versions of the engine.  Otherwise
   there's no point in having separate rulesrc/rules, etc.  At the moment, the
   automatic stuff only helps for the development version.  With "if version"
   we can probably do multiple versions, but score generation will still be
   difficult -- perhaps we'll have to do it manually through many "if version"
   sections.  Also, we'll probably need to remove "require_version" from all
   of the configs.
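
As a rough sketch of what points 1 and 2 could look like in a .cf file
(the plugin name, rule names, eval functions and version number below are
all made up for illustration; only the ifplugin/endif and
"if (version ...)"/endif wrappers are the point):

```
# Point 1: a rule implemented by a plugin.  The block is only parsed when
# the plugin is loaded.  Mail::SpamAssassin::Plugin::Example and
# check_example() are hypothetical names.
ifplugin Mail::SpamAssassin::Plugin::Example
header   EXAMPLE_PLUGIN_RULE   eval:check_example()
describe EXAMPLE_PLUGIN_RULE   Hypothetical rule backed by plugin code
endif

# Point 2: a rule that needs engine support newer than 3.0.0 but is not
# in a plugin.  3.001000 is an arbitrary example version, and
# check_example_new_test() is a hypothetical eval function.
if (version >= 3.001000)
header   EXAMPLE_NEW_EVAL_RULE eval:check_example_new_test()
describe EXAMPLE_NEW_EVAL_RULE Hypothetical rule needing a newer engine
endif
```

Older engines would skip the second block rather than fail lint on an
eval function they don't have, which is what would let one file be shared
across versions.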

> If the term "code-tied" causes confusion, let's make it clearer.
> "non-public eval rules"?

Sure.

-- 
Randomly Generated Tagline:
I am Quayle of Borg....potatoe is irrelevant

Re: moving uribl to rules dir (Re: svn commit: r383618 - in /spamassassin: rules/trunk/core/25_replace.cf rules/trunk/core/60_whitelist_spf.cf trunk/rules/25_replace.cf trunk/rules/60_whitelist_spf.cf )

Posted by Loren Wilton <lw...@earthlink.net>.
> If it's a plugin, it has to be a code-tied rule!  Otherwise it wouldn't need
> the plugin.

Hey, what a neat way to completely disable the initial concept of the Rules
project and put things back into the Land Of Arcana where they belong!

Just move 'body', 'rawbody', 'header', and 'full' to plugins!  Then you can
move all those ruletypes into the code-tied directory and never have to
worry about whether they are applicable to any particular version!

Hey!  Great idea!  While you are at it, why don't you make all those plugins
disabled by default, and admins can enable them only if they think they need
them?

:-(

        Loren


Re: moving uribl to rules dir (Re: svn commit: r383618 - in /spamassassin: rules/trunk/core/25_replace.cf rules/trunk/core/60_whitelist_spf.cf trunk/rules/25_replace.cf trunk/rules/60_whitelist_spf.cf )

Posted by John Myers <jg...@proofpoint.com>.
Theo Van Dinter wrote:
> When it comes down to it, the body/header/etc rules aren't any different
> from other code, beyond the fact that they have had the same API for ages.
>   
Similarly, the DNS rules have had the same API for ages.  Adding and 
removing new block lists can be done independent of release/branch.

On the flip side, replace_rules, while being text match, are highly 
dependent on releases which have the ReplaceTags plugin.

The use or non-use of "eval:" is not particularly relevant.  What is 
relevant is the "version dependence" and ubiquity of the underlying 
mechanism and the amount of information contained in the rule definition.
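
To illustrate the replace_rules point, here is a minimal sketch as I
understand the ReplaceTags syntax.  The tag, rule name and pattern are
invented for the example and are not taken from 25_replace.cf:

```
# Everything here is meaningless to an engine without ReplaceTags, so the
# whole block is guarded by ifplugin.
ifplugin Mail::SpamAssassin::Plugin::ReplaceTags

# Hypothetical tag: <A> expands to "a" or a common look-alike.
replace_tag   A   [a@]

# A plain text-match rule on the surface, but the <A> markers only
# become a real regexp once the plugin rewrites the rule.
body          EXAMPLE_OBFU_RULE   /vi<A>gr<A>/i
describe      EXAMPLE_OBFU_RULE   Hypothetical obfuscation rule
replace_rules EXAMPLE_OBFU_RULE

endif
```

Without the plugin, the body rule would still compile but would look for
the literal text "<A>", so it is version-dependent even though no "eval:"
is involved.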


Re: moving uribl to rules dir (Re: svn commit: r383618 - in /spamassassin: rules/trunk/core/25_replace.cf rules/trunk/core/60_whitelist_spf.cf trunk/rules/25_replace.cf trunk/rules/60_whitelist_spf.cf )

Posted by Theo Van Dinter <fe...@apache.org>.
On Tue, Mar 07, 2006 at 10:07:53AM -0800, John Myers wrote:
> I don't understand this term "code-tied."  It seems to me that all of 
> the rules are code-tied, since they all depend on what is in the code.

In the end, sure.  The way we've been using the phrase, "code-tied" means
something that requires "eval:", which means EvalTests and rule plugins.
Generally the code that these rules depend on is only made available
through full code releases, and therefore the rules are tied to the code.

The "standard" rules are all based on regular expressions, and while
they're eventually executed as code, they are generally considered
simple text string matching and not something more complicated that
requires writing perl or calling out to third-party code, etc.  These can
generally be used on any version of SpamAssassin, and are not specific
to any one release/branch.

In short, you're correct in saying that everything is code in the end.
When it comes down to it, the body/header/etc rules aren't any different
from other code, beyond the fact that they have had the same API for ages.
So from a strictly "what is code and what isn't" POV, they're all code.
Generally it's a "text match" versus "something more complicated"
differentiation.
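
A small side-by-side sketch of that differentiation.  The first rule is
invented for illustration; the second is the NO_DNS_FOR_FROM line quoted
elsewhere in this thread:

```
# "Text match": everything the rule does is in the regexp, so it should
# behave the same on any release/branch.  Hypothetical rule.
body     EXAMPLE_TEXT_RULE   /hypothetical spammy phrase/i
describe EXAMPLE_TEXT_RULE   Hypothetical plain regexp rule

# "Something more complicated": the config only names a function, and the
# real logic lives in perl code that ships with a particular release.
header   NO_DNS_FOR_FROM     eval:check_dns_sender()
```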

-- 
Randomly Generated Tagline:
"L: And stop saying Okay, okay?
  M: Okay.
  L: Good."                  - The Professional

Re: moving uribl to rules dir (Re: svn commit: r383618 - in /spamassassin: rules/trunk/core/25_replace.cf rules/trunk/core/60_whitelist_spf.cf trunk/rules/25_replace.cf trunk/rules/60_whitelist_spf.cf )

Posted by John Myers <jg...@proofpoint.com>.
Theo Van Dinter wrote:
> If it's a plugin, it has to be a code-tied rule!  Otherwise it wouldn't need
> the plugin.
>   
I don't understand this term "code-tied."  It seems to me that all of 
the rules are code-tied, since they all depend on what is in the code.

There are indeed rules that have less information in config than others, 
but there is no clear demarcation.  The various BL rules clearly have 
significant information in config.


Re: moving uribl to rules dir (Re: svn commit: r383618 - in /spamassassin: rules/trunk/core/25_replace.cf rules/trunk/core/60_whitelist_spf.cf trunk/rules/25_replace.cf trunk/rules/60_whitelist_spf.cf )

Posted by Theo Van Dinter <fe...@apache.org>.
On Tue, Mar 07, 2006 at 10:02:06AM +0000, Justin Mason wrote:
> > There's nothing wrong with "ifplugin", but the idea was to have
> > code-related rules in the rules area, and not rulesrc.  Plugins are code,
> > and we already had the other plugin files (uribl, razor, dcc, awl, etc.)
> > in rules.  I was just moving the outliers into the right place.
> 
> I thought the idea was to share as many rules as possible between
> releases, by keeping them in rulesrc instead of rules?

For non-code rules, yes.

> These are not really code-tied rules -- they're just a rule type
> implemented by a plugin!

If it's a plugin, it has to be a code-tied rule!  Otherwise it wouldn't need
the plugin.

> The check_rbl() syntax, and the URIDNSBL rule syntax, hasn't changed afaik
> since 3.0.  That's nice version portability, and rules using those
> syntaxes can live safely in rulesrc as a result, in my opinion.

Unless we're distributing the plugins via rulesrc, I'd rather see us keep the
RE-rules and code rules separate.

> If we make it a guideline that rules implemented using "check_rbl()" or
> "urirhssub" are considered "code-tied" rules, as much as let's say
> 
>   header NO_DNS_FOR_FROM          eval:check_dns_sender()
> 
> and therefore should go in "rules", we can't test them in a sandbox --
> instead we'd have to throw them in with the released set in
> rules/25_uribl.cf, or revive the obsolete rules/70_testing.cf idea again.
> I think this is messy.

Well now, that's another issue.  For *testing*, living in the sandbox is fine.
However, if the rule is going to be promoted to "released", it should be moved
into rules.

It's the same thing that happens now for new EvalTests.  We test by putting a
plugin into the sandbox area, and writing the rules in the sandbox area.  If
it gets to the point it needs to be promoted, we copy the function to
EvalTests (or copy the plugin over), and move the rule to the rules/ dir.

> Basically I was planning to keep rules as much as possible in rulesrc, and
> where possible, render the "rules" dir obsolete -- just use it as the
> place where mkrules writes its output.
> 
> Otherwise we run into a situation where there isn't really much point in
> *having* the rulesrc-vs-rules differentiation!

The main issue is that if we end up putting everything into rulesrc, we'll
need to start wrapping rules in "if version > ..." conditionals as things
change.  It gets a bit complex.  Then again, at the moment, we're using a
separate area for released rule updates, so maybe there isn't an issue?

> (Bear in mind that rules in "rules" are considered legacy rules, are not
> compiled/lint-checked, are allowed to get past the new accuracy
> requirements, and so on.  It really is a legacy situation.)

There are a ton of legacy rules in the rulesrc/core area, so that's not a real
reason imo.

The reason we started this discussion is that there isn't a clear policy about
what goes where.  My understanding was that eval rules go into rules/, but
if you look in rulesrc/core you find a bunch of EvalTests-based rules, URIBL,
etc, and in rules/ you find other EvalTests-based rules, SURBL, Razor, etc.

Basically it looks like we went half-way when moving things around, and stopped.

So we really ought to figure out what we want to do and go that way.  If
we care about code-based vs non-code based, then we need to finish moving
things out of rulesrc/ and into rules/.  If we don't care, we can move
basically everything from rules/ into rulesrc/, and surround new
EvalTests/plugins with "if" conditionals, etc.

This tips into another issue which is: how are we going to deal with
rule updates?  The automatic promotion/demotion/active.list thing is
neat, but doesn't necessarily work for all versions of code since we're
only running mass-checks on the latest development version and not on
older versions which may behave differently (message and received header
parser changes, for instance, can cause rule hitrate/accuracy changes).

With judicious use of "if version" and such, we could probably just have a
single update and let everyone download it, though I haven't thought fully
about it yet.  If we manually maintain the previous version's updates, we can
just handle the issues when we want to make a new update -- but then what's
the point of active.list, etc?

-- 
Randomly Generated Tagline:
Dumb luck beats sound planning every time.