Posted to dev@spamassassin.apache.org by Theo Van Dinter <fe...@kluge.net> on 2004/08/19 20:16:26 UTC

Re: Why HAM learn on BAYES_99?

Moving this part to dev ...

On Thu, Aug 19, 2004 at 11:06:27AM -0700, Dan Quinlan wrote:
>   X-Spam-Status: No, score=1.9 required=5.0 tests=BAYES_99 autolearn=ham
>        version=3.0.0-rc1
> 
> That looks a lot like it was not fixed, or maybe we fixed something
> else.

According to the code...  If the message isn't considered spam,
PMS::learn() checks learned_points to see if it's > 1 point and aborts
autolearning as ham if it is.  The intent, it seems, is that if a learn
rule gives over 1 point to the score, we don't want to learn it as ham.

However, learned_points is only increased when a rule that hits has the
"noautolearn" tflag, which in 3.0rc1 means whitelist/blacklist, AWL, and GTUBE.

Seems to me like "noautolearn" should just be ignored, and "learn"
rules ought to go into learned_points.  This would all then "Do the
Right Thing(tm)", I believe.
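
To make the difference concrete, here is a rough sketch in Python rather
than the actual Perl.  Hit, learned_points() and ham_autolearn_allowed()
are made-up names, the 1.0 limit is just the "> 1 point" check described
above, and the 1.9 score for BAYES_99 is taken from the quoted header.
It only models the learned_points check, nothing else in PMS::learn().

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One rule hit: hypothetical stand-in, not a SpamAssassin structure."""
    name: str
    score: float
    tflags: frozenset


def learned_points(hits, counted_flag):
    """Sum the scores of hits whose tflags include counted_flag."""
    return sum(h.score for h in hits if counted_flag in h.tflags)


def ham_autolearn_allowed(hits, counted_flag, limit=1.0):
    """Ham autolearning is aborted once learned_points exceeds limit."""
    return learned_points(hits, counted_flag) <= limit


# The message from the quoted header: BAYES_99 alone, worth 1.9 points.
# Assumption: BAYES_99 carries "learn" but not "noautolearn".
hits = [Hit("BAYES_99", 1.9, frozenset({"learn"}))]

# Counting "noautolearn" hits, as described for 3.0rc1.
as_described = ham_autolearn_allowed(hits, "noautolearn")

# Counting "learn" hits, as suggested above.
as_proposed = ham_autolearn_allowed(hits, "learn")

print(as_described, as_proposed)
```

Under those assumptions the first check lets the message through as ham
and the second one blocks it, which matches the autolearn=ham header.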

-- 
Randomly Generated Tagline:
"There are two major products to come out of Berkeley: LSD and UNIX.  We
 don't believe this to be a coincidence."      - Unknown

Re: Why HAM learn on BAYES_99?

Posted by Theo Van Dinter <fe...@kluge.net>.
On Thu, Aug 19, 2004 at 02:26:49PM -0400, Theo Van Dinter wrote:
> /me works up a patch

Ok, so our documentation is sketchy on the whole thing.

bayes_auto_learn is documented as:
           Note that certain tests are ignored when determining whether a
           message should be trained upon:
            - auto-whitelist (AWL)
            - rules with tflags set to 'learn' (the Bayesian rules)
            - rules with tflags set to 'userconf' (user white/black-listing
              rules, etc)

and the tflags option is documented as:
           userconf
               The test requires user configuration before it can be used
               (like language-specific tests).

           learn
               The test requires training before it can be used.

           noautolearn
               The test will be ignored when calculating the score for
               learning systems.


So now it's unclear as to whether userconf and learn ought to just be
ignored or whether those rules should also have "noautolearn" as a tflag.

The patch I'm working on will give "learn" and "userconf" an automatic
"noautolearn" and update the documentation to match.  We can go from there.
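
Roughly, the idea is something like the following (a Python sketch, not
the patch itself; effective_tflags() and IMPLIES_NOAUTOLEARN are
hypothetical names used only for illustration):

```python
# Tflags that would automatically imply "noautolearn" under this idea.
IMPLIES_NOAUTOLEARN = {"learn", "userconf"}


def effective_tflags(tflags):
    """Return a rule's tflags with "noautolearn" added where implied."""
    flags = set(tflags)
    if flags & IMPLIES_NOAUTOLEARN:
        flags.add("noautolearn")
    return flags


print(sorted(effective_tflags({"learn"})))
print(sorted(effective_tflags({"userconf"})))
print(sorted(effective_tflags(set())))
```

That way anything that only checks for "noautolearn" would also skip the
"learn" and "userconf" rules without each rule having to say so.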

-- 
Randomly Generated Tagline:
Come on, honey.  You work yourself stupid for this family.  If anyone
 deserves to be wrapped up in seaweed and buried in mud, it's you.
 
 		-- Homer Simpson
 		   Home Sweet Homediddly-Dum-Doodily

Re: Why HAM learn on BAYES_99?

Posted by Theo Van Dinter <fe...@kluge.net>.
On Thu, Aug 19, 2004 at 02:16:26PM -0400, Theo Van Dinter wrote:
> Seems to me like "noautolearn" should just be ignored, and "learn"
> rules ought to go into learned_points.  This would all then "Do the
> Right Thing(tm)", I believe.

Erg.  This is all messed up.

PMS::get_nonlearn_nonuserconf_points() only looks for noautolearn; it
doesn't actually look for !learn && !userconf.  Then {score} is used
for autolearning for some reason, and there's a kluge with
{learned_points} ...

/me works up a patch

-- 
Randomly Generated Tagline:
"It's one thing if government pork directly benefits me, but a quarter
 million dollars to fight Goth culture in Blue Springs, MO?  Hey!  If you
 want to fight the Goths, I know a couple of Huns and Mongols who'll do
 it for free!" - Lewis Black, The Daily Show