You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/07/24 01:06:44 UTC

[Bug 4497] New: reorganise PerMsgStatus code

http://bugzilla.spamassassin.org/show_bug.cgi?id=4497

           Summary: reorganise PerMsgStatus code
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Libraries
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: jm@jmason.org


talking about this at the hackathon and there's pretty unanimous agreement that
PerMsgStatus does too many different things.

So here are the conceptual blocks of functionality in PerMsgStatus, and the
proposed new classes to implement it:

- context of scan --\
  results of scan ----> Mail::SpamAssassin::PerMsgStatus
  (keep this where it is, badly named or not.)

- logic for check() ......> Mail::SpamAssassin::Plugin::Check.
  (a new plugin to implement check logic, so that it can be swapped
  out for alternative check prioritisation implementations, short-circuiting
  implementations, etc.)

- body formats / rendering -\
- URI list --------------------> Mail::SpamAssassin::Message::Rendered

  (a new object to hold rendered state off Message.  *NOT* Message itself, as
  this object has a ptr to {main} and {conf}, so can use configuration data
  during its work; it's also deleted when Maill::SpamAssassin::PerMsgStatus
  is deleted.)

- logic for each rule type --\
- rule compilation ------------> Mail::SpamAssassin::Scanner
- config parsing for rules --/

- rewrite .....................> Mail::SpamAssassin::Plugin::Rewrite
  (a plugin that implements rewriting of mails, so that alternative
  rewriting methods can be used instead, optionally)

Regarding backwards compatibility -- all public APIs remain compatible.
This is essential (but easily done).

Every public member variable remains on PerMsgStatus; every public method keeps
a public facade method on PerMsgStatus that just calls into the real
implementation code.

this is post-3.1.0, just wanted to get this into bits and off the whiteboard...



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4497] reorganise PerMsgStatus code

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497





------- Additional Comments From jgmyers@proofpoint.com  2005-10-17 16:36 -------
Mail::SpamAssassin::Message::Node will need a ptr to {main} for bug 4636, so
there's little point for a separate Mail::SpamAssassin::Message:Rendered.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4497] reorganise PerMsgStatus code

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497





------- Additional Comments From jgmyers@proofpoint.com  2005-10-17 16:59 -------
There is a work-in-progress patch attached to bug 4636.  It would be better to
discuss the issue in that bug.




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4497] reorganise PerMsgStatus code

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4497


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX




------- Additional Comments From jm@jmason.org  2006-12-04 10:41 -------
marking this WONTFIX -- this work pretty much stalled and other stuff happened
instead.

this part of the idea is probably still worthwhile, though:

- rewrite .....................> Mail::SpamAssassin::Plugin::Rewrite
  (a plugin that implements rewriting of mails, so that alternative
  rewriting methods can be used instead, optionally)


maybe someone will eventually have tuits for that ;)



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4497] reorganise PerMsgStatus code

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4497





------- Additional Comments From felicity@apache.org  2005-10-17 16:54 -------
Subject: Re:  reorganise PerMsgStatus code

On Mon, Oct 17, 2005 at 04:36:06PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Mail::SpamAssassin::Message::Node will need a ptr to {main} for bug 4636, so
> there's little point for a separate Mail::SpamAssassin::Message:Rendered.

FYI, I will almost definitely -1 any such modification to the Message or
Message::Node code unless there is some huge absolute need to have it.
That code is designed to very specifically have no dependence, relation,
or access to the rest of the SA code.





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

Re: [Bug 4497] New: reorganise PerMsgStatus code

Posted by Daniel Quinlan <qu...@pathname.com>.
Justin Mason <jm...@jmason.org> writes:

> I don't get it -- how does that help?

Basically, user rule would always run last.  Something like:

 for (@priorities) {
    standard prorities loop
 }
 if ($allow_user_rules && $defined_rules{$user}) {
    do_xxx_tests($user, ...);  # $user is the priority
    do_yyy_tests($user, ...);
 }

Precompiling could be done by an early Plugin that runs an init message
on each user that has configuration.  The do_xxx_tests routines should
not care about whether you are doing user rules, ideally.

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/

Re: [Bug 4497] New: reorganise PerMsgStatus code

Posted by Daniel Quinlan <qu...@pathname.com>.
"Loren Wilton" <lw...@earthlink.net> writes:

> Is there a way that [user rules] could be saved in a more-compiled
> state when used with spamd and similar?  Maybe name the rules with the
> username as part of the procedure name, and just add them to the
> namespace the first time encountered?

Beyond the memory bloating this could potentially cause and the
potentially more complicated security aspects, it's important to
optimize for the common case.  I've been thinking we might be able to
move all user rules into their own priority and then *only* those would
be slower and overall performance would be as good as reasonably
possible.  Also, when there are no user rules, the user rule priority
could just be skipped for good performance.

In the long run, it might even be possible to give user rules priorities
based on the user name (non-numeric ones) so they could be cached.

So, are any user rule sets using priorities?

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/

Re: [Bug 4497] New: reorganise PerMsgStatus code

Posted by Loren Wilton <lw...@earthlink.net>.
I know user rules aren't real popular with the sa dev community, however
that attitude isn't universally shared by sa users.  Therefore may I
suggest:

Would it be possible when reorganizing things to come up with some
semi-persistant storage for compiled user rules, so that they don't have to
be rewritten and recompiled in a bunch of evals for every message processed
for the user?

At the very least it should be moderately trivial to save the text passed to
the evals into a file someplace, and then just eval that one file to get
everything compiled.  Timestamps would indicate if the pre-built stuff was
out of date or not.

Is there a way that they could be saved in a more-compiled state when used
with spamd and similar?  Maybe name the rules with the username as part of
the procedure name, and just add them to the namespace the first time
encountered?

If there were a way to make user rules be a subclass of the main rules, then
you could just call the user rules for each user, and any user rules (or I
presume scores) would override the standard rules, and anything that didn't
override would pass through to the main rules.

Of course this assumes "the rules" was a class instance of some sort of
other, and you could call ruleclass->dorules(message) or the like (however
that C++ invocation would be spelled in Perl).

Having all user rules remain in memory forever would certainly add some to
the bloat of SA memory usage. However, if one assumes most users would have
few rules compared to the number of overall system rules, then the increase
might be reasonable; especially since you would be trading off that memory
for the compile and build overhead on every message.

        Loren

BTW, if "the rules" were in a class, then it would be possible to create the
instance of that class at startup in just about exactly the same way user
rule subclasses could be created.  Then a hup, instead of taking spamd and
children down, would just cause the children to discard the current rules
instance and fabricate a new instance and continue onward.