Posted to dev@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2004/11/19 03:16:15 UTC

proposal: an automated rule-qa system

So, we were discussing the rules situation -- i.e. that we've been pretty
crap at getting rules into the distro. I proposed this, and I think we're
reasonably into the idea as a way to help out.

We add a web-app somewhere that periodically scrapes bugzilla
for bugs on the "rules" component which contain some token from trusted
users indicating that they contain rules that need testing.

That then extracts rules from attachments/text on that bug, and

- (a) checks out SVN trunk and adds them to the "rules" dir of that
  checkout in a temporary file
- (b) runs a mass-check on those rules
- (c) does simple lint using "spamassassin --lint" and
  "lint-rules-from-freqs"
- (d) does some kind of basic S/O testing
- (e) it may be that we can also check the rules into SVN for a full
  nightly mass-check from all the people doing those, in which case it
  should come up with the results from that, nicely snipped out of the
  full reports.
- (f) if we do (e), we can even get the results, segmented by the age of
  the corpus used!  In other words, give us a picture of the freqs based
  on how old the messages it was hitting on were.
- (g) -- possibly -- do a quick perceptron run to evaluate if the rule
  overlaps with other rules too much.
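
To make step (d) a bit more concrete, here is a rough Python sketch of a basic S/O check. It assumes S/O is computed the way the freqs reports usually show it, as spam% / (spam% + ham%). The function names and the 0.95 cut-off are made up for illustration and are not anything the project has agreed on.

```python
def s_o_ratio(spam_hits, spam_total, ham_hits, ham_total):
    """Return S/O for one rule, or None if the rule hit nothing at all."""
    if spam_total <= 0 or ham_total <= 0:
        raise ValueError("both corpora must be non-empty")
    # Work in percentages so corpora of different sizes are comparable.
    spam_pct = 100.0 * spam_hits / spam_total
    ham_pct = 100.0 * ham_hits / ham_total
    if spam_pct + ham_pct == 0:
        return None
    return spam_pct / (spam_pct + ham_pct)


def passes_basic_check(so, min_so=0.95):
    """Hypothetical pass/fail gate; min_so is an illustrative threshold."""
    return so is not None and so >= min_so
```

A rule hitting 90 of 1000 spam and 10 of 1000 ham gets an S/O of 0.9 here, which would fail the illustrative 0.95 gate.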

Finally, it'll display the results at a given URL -- probably based on the
bug and comment numbers, so it's easily hyperlinkable.
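
The front end of the flow described above, from spotting the token to naming the results page, could be sketched roughly like this in Python. Everything specific is an assumption: the token string, the trusted-user list, the comment dict layout and the URL path are all hypothetical, and the rule matcher is only a crude filter on common rule-file keywords.

```python
import re

# Hypothetical values; the real token and user list are still to be decided.
TRUSTED_USERS = {"committer@example.org"}
QA_TOKEN = "[rule-qa]"

# Crude filter: a line starting with a common rule-file keyword followed
# by a name. Ordinary prose starting with "body ..." would also match.
RULE_LINE = re.compile(
    r"^\s*(header|body|rawbody|uri|full|meta|describe|score|tflags)\s+\S+"
)


def wants_testing(comments):
    """True if any trusted user's comment on the bug carries the token."""
    return any(
        c["author"] in TRUSTED_USERS and QA_TOKEN in c["text"]
        for c in comments
    )


def extract_rules(text):
    """Pull rule-looking lines out of comment or attachment text."""
    return [ln.strip() for ln in text.splitlines() if RULE_LINE.match(ln)]


def results_path(bug_id, comment_no):
    """Hypothetical results location keyed on bug and comment numbers."""
    return f"/ruleqa/bug{bug_id}/c{comment_no}"
```

Keying the path on the bug and comment numbers is what keeps each result easy to link from the bug itself.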

Using bugzilla as the backend is useful, btw, as that gives us

  - threaded discussion of rules
  - contributor CLA status tracking
  - good ways to get lists and overviews of what contributions are
    available and their status
  - gatewayed to mailing list, and viewable via www

Sound useful?  That should at least take some legwork out of rule QA,
and stop us committers being a bottleneck in the process.

--j.

Re: proposal: an automated rule-qa system

Posted by Henry Stern <he...@stern.ca>.
>- (g) -- possibly -- do a quick perceptron run to evaluate if the rule
>  overlaps with other rules too much.
>
The perceptron won't tell us much about overlap, but I'm sure that I can
come up with something to help out in that department... after I finish
my thesis.

Henry

P.S.  Writing a thesis is hard work!