Posted to users@spamassassin.apache.org by LuKreme <kr...@kreme.com> on 2009/04/01 00:36:05 UTC
Re: quirks with bayes ?
On 31-Mar-2009, at 14:24, James Wilkinson wrote:
> I wrote (about the AWL):
>> In the absence of any sort of expire mechanism¹ (see, for example,
>> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6059) one can do
>> a crude approximation by periodically resetting it.
>
> LuKreme wrote:
>> But why would you want to ever reset the AWL?
>
> To quote that bug report:
> at this stage I don’t think it’s worth having on by default, given
> the problems it causes for disk load and out-of-control bloated db
> files eating lots of disk space and memory, vs the marginal gains in
> accuracy it provides. let’s set it off by default.
Wow. I couldn't possibly disagree more. AWL is crucial for keeping the
occasional 'spammish' message from people I correspond with often from
being flagged as spam. For example, I just got this from my cousin:
X-Spam-Status: No, score=3.4 required=5.0
tests=AWL,BAYES_95,EXTRA_MPART_TYPE,
HTML_FONT_SIZE_HUGE,HTML_IMAGE_RATIO_02,HTML_MESSAGE,SPF_PASS,
UNPARSEABLE_RELAY autolearn=no version=3.2.5
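To make it easier to see what that header is saying, here is a minimal Python sketch that pulls the score, the threshold, and the list of tests out of an X-Spam-Status value. The function name parse_spam_status and the returned dictionary layout are my own choices for illustration, not anything SpamAssassin provides; the header text is the one quoted above.

```python
import re

def parse_spam_status(header: str) -> dict:
    """Parse an X-Spam-Status header value (hypothetical helper)."""
    # Unfold continuation lines, then drop the whitespace that folding
    # leaves after commas so the tests list becomes one token.
    text = " ".join(header.split())
    text = re.sub(r",\s+", ",", text)

    score = float(re.search(r"score=(-?[\d.]+)", text).group(1))
    required = float(re.search(r"required=(-?[\d.]+)", text).group(1))
    tests_field = re.search(r"tests=(\S*)", text).group(1)
    tests = [t for t in tests_field.split(",") if t]

    return {
        "score": score,
        "required": required,
        "tests": tests,
        # A message is tagged as spam once the score reaches the threshold.
        "is_spam": score >= required,
    }

HEADER = """No, score=3.4 required=5.0
tests=AWL,BAYES_95,EXTRA_MPART_TYPE,
HTML_FONT_SIZE_HUGE,HTML_IMAGE_RATIO_02,HTML_MESSAGE,SPF_PASS,
UNPARSEABLE_RELAY autolearn=no version=3.2.5"""

result = parse_spam_status(HEADER)
print(result["score"], result["required"], result["is_spam"])
print("AWL" in result["tests"], "BAYES_95" in result["tests"])
```

The point is that AWL and BAYES_95 both fired on the same message, and the total still came in under the 5.0 threshold. The header alone does not show how many points each rule contributed.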
> One can temporarily get the db files under control by deleting them and
> letting SpamAssassin recreate them from scratch.
Under control? They are not significantly large. The largest auto-
whitelist on my system is 10M and most of them are under 500K.
What's odd is that running "db41_dump185 autowhitelist" produces no
readable info, while "strings autowhitelist" at least shows SOME
information.
--
Can't stop the signal