Posted to users@spamassassin.apache.org by Michael Holzt <kj...@fqdn.org> on 2005/03/10 21:30:08 UTC

Strange Scoring Results?

I'm operating a small company's mail server (running SpamAssassin 3.0.2
invoked by the qpsmtpd spamassassin plugin on Debian Linux), and lately I'm
observing a major increase in spam that gets through the filter.

When looking into this problem, I noticed that the scores seem to be much
lower than what I was used to. I have set up my system to reject any spam
with a score over 8 and to handle a score over 5 as probable spam. But when I
checked my messages I found that nearly all ham now scores below zero,
while spam already starts at 1.5 to 2.0.

I wonder what the reasons for this are, and whether I should lower my
threshold to a value of 3 or something like this, but I fear false positives
because I do not understand the reason for this scoring. Here are some
examples of spam which got through with only a low score despite having
triggered many rules:

| X-Spam-Status: No, hits=5.0 required=8.0
| tests=BAYES_99,DRUGS_ERECTILE,DRUG_DOSAGE,DRUG_ED_CAPS,HTML_40_50,
| HTML_MESSAGE,HTML_TEXT_AFTER_BODY,HTML_TEXT_AFTER_HTML,MIME_HTML_ONLY,
| RCVD_NUMERIC_HELO,SUBJECT_DRUG_GAP_VIA

| X-Spam-Status: No, hits=1.4 required=8.0
| tests=BAYES_50,HTML_90_100,HTML_FONT_BIG,HTML_FONT_LOW_CONTRAST,
| HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HEADER_CTYPE_ONLY,MIME_HTML_ONLY,
| NORMAL_HTTP_TO_IP

| X-Spam-Status: No, hits=1.6 required=8.0
| tests=BAYES_50,DATE_IN_PAST_12_24,INVALID_DATE,NO_REAL_NAME,SUBJ_ALL_CAPS

| X-Spam-Status: No, hits=1.9 required=8.0
| tests=BAYES_50,HTML_50_60,HTML_MESSAGE,IP_LINK_PLUS,MIME_HEADER_CTYPE_ONLY,
| MIME_HTML_ONLY,NORMAL_HTTP_TO_IP,RCVD_IN_BL_SPAMCOP_NET,URI_REDIRECTOR
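
To make the comparison against the two thresholds concrete, here is a minimal
Python sketch that parses a header like the ones above and applies the limits
described earlier (reject over 8, probable spam over 5). The function names
parse_spam_status and classify are hypothetical helpers for illustration, not
part of SpamAssassin or qpsmtpd, and the parser only assumes the
"hits=... tests=..." layout shown in these examples.

```python
import re
from typing import List, Tuple

# Thresholds as described in the post: reject above 8, probable spam above 5.
REJECT_ABOVE = 8.0
PROBABLE_ABOVE = 5.0


def parse_spam_status(header: str) -> Tuple[float, List[str]]:
    """Return (score, triggered tests) from a quoted X-Spam-Status header.

    Hypothetical helper: it assumes the 'hits=... tests=A,B,C' layout of the
    examples above, possibly folded over several '| '-prefixed lines.
    """
    # Remove the leading "| " quote markers and unfold continuation lines.
    text = " ".join(line.lstrip("| ").strip() for line in header.splitlines())
    hits = re.search(r"hits=(-?\d+(?:\.\d+)?)", text)
    tests = re.search(r"tests=([A-Z0-9_,\s]+)", text)
    if hits is None or tests is None:
        raise ValueError("not an X-Spam-Status header in the expected layout")
    names = [t.strip() for t in tests.group(1).split(",") if t.strip()]
    return float(hits.group(1)), names


def classify(score: float) -> str:
    """Apply the two thresholds from the post to a score."""
    if score > REJECT_ABOVE:
        return "reject"
    if score > PROBABLE_ABOVE:
        return "probable spam"
    return "pass"


# First example header from above, copied verbatim.
EXAMPLE = """| X-Spam-Status: No, hits=5.0 required=8.0
| tests=BAYES_99,DRUGS_ERECTILE,DRUG_DOSAGE,DRUG_ED_CAPS,HTML_40_50,
| HTML_MESSAGE,HTML_TEXT_AFTER_BODY,HTML_TEXT_AFTER_HTML,MIME_HTML_ONLY,
| RCVD_NUMERIC_HELO,SUBJECT_DRUG_GAP_VIA"""

if __name__ == "__main__":
    score, tests = parse_spam_status(EXAMPLE)
    print(score, len(tests), classify(score))
```

With "over 5" read strictly, even the first message (11 rules, BAYES_99
included) stays just under the probable-spam limit, and the other three are
far below it.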

Now, is this scoring normal? I wonder if messages with 50 to 99% bayes
should only get such low scores. Should I lower my threshold or should I
raise the scores for the bayes rules? Or is something broken with my
setup?

Regards
Michael

-- 
      It's an insane world, but i'm proud to be a part of it. -- Bill Hicks

Re: Strange Scoring Results?

Posted by Matt Kettler <mk...@evi-inc.com>.
At 03:30 PM 3/10/2005, Michael Holzt wrote:
>Now, is this scoring normal? I wonder if messages with 50 to 99% bayes
>should only get such low scores.

Only one of those messages has 99% bayes. The others all have 50%.
A message with 50% bayes is by definition undecided between spam and ham.
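
The count is easy to check mechanically. The following is only an
illustrative Python sketch: the test lists are copied from the four quoted
headers, and bayes_bucket is a hypothetical helper name, not a SpamAssassin
function.

```python
import re
from collections import Counter
from typing import Optional

# Test lists copied from the four X-Spam-Status headers in the original post.
EXAMPLES = [
    "BAYES_99,DRUGS_ERECTILE,DRUG_DOSAGE,DRUG_ED_CAPS,HTML_40_50,"
    "HTML_MESSAGE,HTML_TEXT_AFTER_BODY,HTML_TEXT_AFTER_HTML,MIME_HTML_ONLY,"
    "RCVD_NUMERIC_HELO,SUBJECT_DRUG_GAP_VIA",
    "BAYES_50,HTML_90_100,HTML_FONT_BIG,HTML_FONT_LOW_CONTRAST,"
    "HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HEADER_CTYPE_ONLY,MIME_HTML_ONLY,"
    "NORMAL_HTTP_TO_IP",
    "BAYES_50,DATE_IN_PAST_12_24,INVALID_DATE,NO_REAL_NAME,SUBJ_ALL_CAPS",
    "BAYES_50,HTML_50_60,HTML_MESSAGE,IP_LINK_PLUS,MIME_HEADER_CTYPE_ONLY,"
    "MIME_HTML_ONLY,NORMAL_HTTP_TO_IP,RCVD_IN_BL_SPAMCOP_NET,URI_REDIRECTOR",
]


def bayes_bucket(tests: str) -> Optional[str]:
    """Return the BAYES_NN rule named in a test list, or None if absent."""
    match = re.search(r"\bBAYES_\d{2}\b", tests)
    return match.group(0) if match else None


# Tally which bayes rule fired for each of the four messages.
bucket_counts = Counter(bayes_bucket(tests) for tests in EXAMPLES)

if __name__ == "__main__":
    for bucket, count in sorted(bucket_counts.items()):
        print(bucket, count)
```

Three of the four messages land in BAYES_50 and only one in BAYES_99.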

I think the question you should be asking yourself is why your bayes
training fails to categorize these spam messages as having a higher
probability of spam than ham (i.e., a bayes probability over 50%).