Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/07/03 08:44:32 UTC

[Bug 4356] tests for the same thing should be grouped

http://bugzilla.spamassassin.org/show_bug.cgi?id=4356


Bob@Menschel.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
           Keywords|triage                      |
         Resolution|                            |INVALID




------- Additional Comments From Bob@Menschel.net  2005-07-02 23:44 -------
Closing as invalid provisionally, based on
> A message is no more or less likely spam if the sender's IP is listed for
> previous spamming / having an open relay / dynamic IP address
[i.e., multiple rule hits],
> therefore it is not wise to count every such listing toward the spam probability.

Thinking about the scoring algorithms: yes, a message is more likely spam if it
hits multiple rules, because that's how the rule scores are determined. During
the pre-release mass-checks, we determine which rules hit which emails. The
universe of matched rules is then analyzed, and just about all rules are scored
based on a) how much ham/spam the rules hit, b) what other rules hit the same
emails, c) how accurately the rules identify ham/spam, and d) what scores for
the universe of rules can be used to score > 5 the most spam and the least ham.
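To make the point about overlapping rules concrete, here is a minimal sketch of
judging a candidate score set against mass-check results. It is not the actual
mass-check tooling or the real score optimizer; the rule names, the tiny corpus,
and both score sets are made up for illustration.

```python
THRESHOLD = 5.0  # a message scoring above this is treated as spam


def evaluate(scores, corpus):
    """Return (spam_caught, ham_flagged) for a candidate score set."""
    spam_caught = 0
    ham_flagged = 0
    for is_spam, rules_hit in corpus:
        # A message's score is the sum of the scores of the rules it hit.
        total = sum(scores.get(rule, 0.0) for rule in rules_hit)
        if total > THRESHOLD:
            if is_spam:
                spam_caught += 1
            else:
                ham_flagged += 1
    return spam_caught, ham_flagged


# Hypothetical mass-check results: (is_spam, rules that hit the message).
corpus = [
    (True, {"RBL_A", "RBL_B"}),
    (True, {"RBL_A", "RBL_B", "BODY_X"}),
    (True, {"BODY_X"}),
    (False, {"RBL_A"}),
    (False, {"RBL_B"}),
]

# Scores chosen as if each rule stood alone.
independent = {"RBL_A": 5.5, "RBL_B": 5.5, "BODY_X": 5.5}
# Scores chosen with the overlap between RBL_A and RBL_B in mind.
overlap_aware = {"RBL_A": 3.0, "RBL_B": 3.0, "BODY_X": 5.5}

print(evaluate(independent, corpus))    # (3, 2)
print(evaluate(overlap_aware, corpus))  # (3, 0)
```

In this toy corpus the overlap-aware scores catch the same spam without flagging
either ham message, which is the kind of trade-off the score analysis is meant
to find across the whole rule set.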

Meta tests as you suggest would clean up the spam reports, but they would lose
some of the finesse of the current method. Since the method of determining the
scores takes your concern into account (the accuracy of combined rules), I think
we would not want to make this change.

It might be worthwhile to do && metas, since
a) if network rule 1 is 75% certain to flag spam and not ham, 
b) if network rule 2 is 80% certain to flag spam and not ham, 
then c) it's very possible that rule1 && rule2 will be 95% certain. I'll see if
maybe SARE can play with this idea and suggest something along those lines for
3.2...
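As a rough check on the figures above, here is a small sketch of how two rule
certainties could combine. It assumes spam and ham are equally likely
beforehand and that the two rules hit independently of each other; neither
assumption is guaranteed for real network rules, and the function name is
just illustrative.

```python
def combined_certainty(p1, p2):
    """Certainty that a message is spam when both rules hit.

    Assumes equal spam/ham priors and independent rules, so the odds
    of each rule simply multiply.
    """
    odds = (p1 / (1.0 - p1)) * (p2 / (1.0 - p2))
    return odds / (1.0 + odds)


print(round(combined_certainty(0.75, 0.80), 3))  # 0.923
```

Under those assumptions the pair comes out at roughly 92%, in the neighborhood
of the 95% guessed at above. If the two rules mostly hit the same emails, the
gain would be smaller, so the real figure would have to come from mass-check
data.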



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.