Posted to dev@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2005/07/24 20:42:50 UTC

Re: compiled user rules

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Loren --

it's not a matter of "popularity" -- it's a matter of being horrendously
difficult to support.

Without somebody doing the work to profile this beforehand to see if it
helps (in particular loading code from disk may be *slower* than what we
do now), and without patches to implement the change anyway, no, this is
unlikely to be implemented. (in other words: please don't make me look
at that code ;)

- --j.

Loren Wilton writes:
> I know user rules aren't real popular with the sa dev community, however
> that attitude isn't universally shared by sa users.  Therefore may I
> suggest:
> 
> Would it be possible when reorganizing things to come up with some
> semi-persistent storage for compiled user rules, so that they don't have to
> be rewritten and recompiled in a bunch of evals for every message processed
> for the user?
> 
> At the very least it should be moderately trivial to save the text passed to
> the evals into a file someplace, and then just eval that one file to get
> everything compiled.  Timestamps would indicate if the pre-built stuff was
> out of date or not.
> 
> Is there a way that they could be saved in a more-compiled state when used
> with spamd and similar?  Maybe name the rules with the username as part of
> the procedure name, and just add them to the namespace the first time
> encountered?
> 
> If there were a way to make user rules be a subclass of the main rules, then
> you could just call the user rules for each user, and any user rules (or I
> presume scores) would override the standard rules, and anything that didn't
> override would pass through to the main rules.
> 
> Of course this assumes "the rules" was a class instance of some sort or
> other, and you could call ruleclass->dorules(message) or the like (however
> that C++ invocation would be spelled in Perl).
> 
> Having all user rules remain in memory forever would certainly add some to
> the bloat of SA memory usage. However, if one assumes most users would have
> few rules compared to the number of overall system rules, then the increase
> might be reasonable; especially since you would be trading off that memory
> for the compile and build overhead on every message.
> 
>         Loren
> 
> BTW, if "the rules" were in a class, then it would be possible to create the
> instance of that class at startup in just about exactly the same way user
> rule subclasses could be created.  Then a hup, instead of taking spamd and
> children down, would just cause the children to discard the current rules
> instance and fabricate a new instance and continue onward.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFC4+EqMJF5cimLx9ARAqfnAJ9CTZhaf14ZMeWcGqKZlSwve2GkZQCfdJdk
7P7QJKK2Tc+abWJDwX4cFIk=
=bliK
-----END PGP SIGNATURE-----


Re: compiled user rules

Posted by Loren Wilton <lw...@earthlink.net>.
> it's not a matter of "popularity" -- it's a matter of being horrendously
> difficult to support.

I grant from what I've seen of PMS that this gets pretty ugly.  Or at least
it seems so to me, but then a lot of apparently good Perl looks pretty ugly
to me.  ;-)  But I'm a C++ and Algol programmer, so the whole structure
looks kinda odd to me. :-)

That said, if I understood enough about object-oriented Perl to safely do any
coding in it, I'd be perfectly happy to make something do this.  This would
come close to 'trivial implementation' in C++, and I'd hope it really isn't
all that much harder in Perl.

I brought this up now because there is talk of changing the structure of
things, and it seems to me that compiling the rules down into a class of
their own that could potentially be subclassed might be the clean way to
deal with a lot of the stuff that looks ugly now.
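
To make that idea concrete, here is a rough sketch, written in Python rather
than Perl purely for brevity.  Every name in it (RuleSet, the rule names, the
patterns and scores) is made up for illustration and does not correspond to
anything in SpamAssassin.  It also uses delegation to a parent instance
rather than literal class inheritance, since per-user rules are data that
only shows up at run time.

```python
import re


class RuleSet:
    """Hypothetical rule container: name -> (compiled regex, score)."""

    def __init__(self, rules, parent=None):
        # Compile once, when the rule set is built, not once per message.
        self.rules = {
            name: (re.compile(pattern), score)
            for name, (pattern, score) in rules.items()
        }
        self.parent = parent  # the system rules, for a per-user rule set

    def effective_rules(self):
        # Start from the parent's rules, then let our own entries override.
        merged = dict(self.parent.effective_rules()) if self.parent else {}
        merged.update(self.rules)
        return merged

    def score(self, message):
        # Sum the scores of every effective rule whose regex matches.
        return sum(
            score
            for regex, score in self.effective_rules().values()
            if regex.search(message)
        )


# Made-up system rules and one user's overrides.
system = RuleSet({
    "FREE_MONEY": (r"free money", 2.0),
    "VIAGRA": (r"v[i1]agra", 3.0),
})
loren = RuleSet({
    "FREE_MONEY": (r"free money", 0.5),   # overrides the system score
    "MY_LIST": (r"\[sa-dev\]", -1.0),     # user-only rule
}, parent=system)
```

Calling loren.score(message) applies the user's overrides and lets everything
else pass through to the system rules, while system.score(message) is
unaffected by any user's rules.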

From reading through PMS, there just doesn't seem to be all that much there
that absolutely *has* to be ugly, and ugliness (at least in my experience)
is usually a sign of an incorrect structural or algorithmic decision at the
outset, and trying to force the code into a structure it doesn't like.  Of
course, the initial design decisions always seem good at the time; it's only
once you no longer want to touch the code that it starts to become
obvious that there should be another way, whether there is or not.

There are a lot of things I'd like to see done with rules to make rule
writing both easier and more productive, and I've been trying to come up
with a structure that conveniently supports all the things I'd like to see.
A few of the more obvious:

1. User rules handled efficiently
2. Counting rules
3. Chained re's
4. A simple algorithmic form of rule so you don't have to write an eval or
plugin to do something simple
5. Reloading the rule base without having to restart SA
6. Rules with scores based on number of hits (variation on counting rules)
7. Short circuiting.
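
As a rough illustration of items 1 and 5, here is a minimal sketch of keeping
compiled user rules in memory and recompiling only when the rule file's
timestamp changes.  It is in Python rather than Perl, and everything in it is
an assumption made for the example: the UserRuleCache name, and the one rule
per line "NAME SCORE PATTERN" file format, which is not SpamAssassin's rule
syntax.

```python
import os
import re


class UserRuleCache:
    """Hypothetical in-memory cache of compiled per-user rules."""

    def __init__(self):
        self._cache = {}   # path -> (mtime_ns, compiled rules)
        self.compiles = 0  # how many times a file was actually parsed

    def load(self, path):
        mtime = os.stat(path).st_mtime_ns
        cached = self._cache.get(path)
        if cached is not None and cached[0] == mtime:
            # File unchanged since last compile: reuse the compiled rules.
            return cached[1]

        rules = {}
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                # Assumed format: NAME SCORE PATTERN
                name, score, pattern = line.split(None, 2)
                rules[name] = (re.compile(pattern), float(score))

        self.compiles += 1
        self._cache[path] = (mtime, rules)
        return rules
```

A long-running daemon would call load() for each message; the parse and
compile cost is paid only on the first message for a user and again after
that user's file changes, with no restart needed.  The trade-off is the one
already mentioned in this thread: the compiled rules stay in memory.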

I just have this feeling that with the right structure a lot of this will
fall out as no-brainers.  With the wrong structure a lot is difficult or
impossible.  I know how I'd do it in other languages.  The question in my
mind is whether any of those techniques convert to Perl.  I guess I'll
either have to learn enough Perl to answer that myself, or be able to spend
an hour or two bouncing ideas off someone that already knows Perl.

        Loren