Posted to users@spamassassin.apache.org by Lorenzo Lucioni <lo...@lucioni.it> on 2005/07/05 14:22:04 UTC

what does "MIME_HTML_ONLY: Message only has text/html MIME parts" mean?

Hello,
I receive some emails from a newsletter that is not spam. These emails go through SpamAssassin and get this score:


Content analysis details:   (2.0 points, 2.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.2 INVALID_DATE           Invalid Date: header (not RFC 2822)
 0.1 HTML_40_50             BODY: Message is 40% to 50% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.2 HTML_FONT_BIG          BODY: HTML tag for a big font size
 0.2 HTML_TAG_EXIST_TBODY   BODY: HTML has "tbody" tag
 1.2 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 0.1 HTML_MIME_NO_HTML_TAG  HTML-only message, but there is no HTML tag
 0.0 MISSING_MIMEOLE        Message has X-MSMail-Priority, but no X-MimeOLE


I configured SpamAssassin with a threshold of 2.0 points because many spam messages came in with a score lower than 3.0.
I would like to suggest that the person who sends this newsletter correct his emails to avoid:
"1.2 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts"
but I don't understand what it means. Can you help me and suggest how to modify the emails so that they no longer match this rule?

Thank you very much,
Lorenzo


Re: what does "MIME_HTML_ONLY: Message only has text/html MIME parts" mean?

Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Lorenzo,

Tuesday, July 5, 2005, 5:22:04 AM, you wrote:

LL> Hello,
LL> I receive some emails from a newsletter that is not spam.
LL> These emails go through SpamAssassin and they get this score:
LL> Content analysis details:   (2.0 points, 2.0 required)

LL> I configured SpamAssassin with a 2.0 points as threshold
LL> because many spams came with a score lower than 3.0.

What version of SpamAssassin are you using?

I would suggest you raise your threshold to at least 4.0 if not back
to the original 5.0, and you fix your operation so it catches more
spam:

1) If you aren't using network tests, do so.

2) If you aren't using SURBL tests, do so.

3) If you aren't using the better SARE rules files, do so.

Thousands of non-spam messages come through the systems here each week
with scores of 2.0 to 3.0; very few spam messages score below 5.0.

However, realize that you cannot reach 100% accuracy.  There is no way
to catch every single spam without also flagging some non-spam as if
it were spam.

Bob Menschel




Re: what does "MIME_HTML_ONLY: Message only has text/html MIME parts" mean?

Posted by Kai Schaetzl <ma...@conactive.com>.
Lorenzo Lucioni wrote on Tue, 05 Jul 2005 14:22:04 +0200:

> I would suggest to the person who send this newsletter to apply a
> correction to his emails to avoid the:
> "1.2 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts"
> but I don't understand what it does mean. Can you help me and suggest
> my how to modify emails to avoid the matching with this rule?

It means he sends only an HTML part in his mail. A decent email should contain only
text/plain; a decent HTML mail should contain a text/plain *and* a text/html part.
This one contains only the latter. Evidently he uses mass-marketing software that
allows this. I consider this software broken.
*I* would tell him to just use text/plain ;-)
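
For the newsletter sender, here is a minimal sketch of a message that
carries both a text/plain and a text/html part, using Python's standard
email package. The function name build_newsletter, the addresses and the
body texts are made up for illustration; the sender's actual mailing
software is unknown and may offer an equivalent option instead.

```python
from email.message import EmailMessage


def build_newsletter(subject, sender, recipient, plain_body, html_body):
    """Build a multipart/alternative message with plain and HTML parts."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # The first body becomes the text/plain part.
    msg.set_content(plain_body)
    # Adding an alternative turns the message into multipart/alternative
    # and appends the text/html part after the plain one.
    msg.add_alternative(html_body, subtype="html")
    return msg


# Hypothetical example values.
msg = build_newsletter(
    "July newsletter",
    "news@example.com",
    "reader@example.org",
    "Hello,\nthis is the plain text version of the newsletter.\n",
    "<html><body><p>Hello,<br>this is the HTML version.</p></body></html>",
)

# List the content type of each direct sub-part.
part_types = [part.get_content_type() for part in msg.iter_parts()]
print(msg.get_content_type(), part_types)
```

A message built this way has a text/plain part next to the text/html
part, so it is no longer made up of text/html parts only.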

BTW: your line width is set quite high; please reduce it to something normal like <75
characters.

Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de & http://msie.winware.org




Re: what does "MIME_HTML_ONLY: Message only has text/html MIME parts" mean?

Posted by Matt Kettler <mk...@evi-inc.com>.
Lorenzo Lucioni wrote:

> I configured SpamAssassin with a 2.0 points as threshold because many
> spams came with a score lower than 3.0.

The spam scoring below 3.0 is your problem. Fix that.

Do NOT drive your score threshold through the floor to try to catch spam, as all
you're going to wind up doing is creating more problems such as this false positive.

Whenever you lower the threshold, your false negative rate goes down, but your
false positive rate goes up significantly.

Take a look at the data from STATISTICS-set3.txt:

# SUMMARY for threshold 5.0:
# Correctly non-spam:  29443  99.97%
# Correctly spam:      27220  97.53%
# False positives:         9  0.03%
# False negatives:       688  2.47%

# SUMMARY for threshold 2.0:
# Correctly non-spam:  29277  99.41%
# Correctly spam:      27696  99.24%
# False positives:       175  0.59%
# False negatives:       212  0.76%

By dropping your threshold from 5.0 to 2.0 you've theoretically reduced your FN
rate from 2.47% to 0.76%; about a factor of 3 fewer spam messages will get by SA.

However, you've also increased your false positive rate from 0.03% to 0.59%;
about 19.6 times as many messages will get tagged when they should not be.
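
The arithmetic behind these figures can be reproduced from the raw counts
in the STATISTICS-set3.txt excerpt quoted above. This is only a sketch;
the names STATS and error_rates are made up for illustration. Computed
from the raw counts, the false positive increase comes out at about 19.4
times; dividing the rounded percentages instead gives a figure closer to
the one quoted.

```python
# Raw counts copied from the STATISTICS-set3.txt excerpt quoted above.
STATS = {
    5.0: {"ham_ok": 29443, "spam_ok": 27220, "fp": 9, "fn": 688},
    2.0: {"ham_ok": 29277, "spam_ok": 27696, "fp": 175, "fn": 212},
}


def error_rates(counts):
    """Return (false positive %, false negative %) for one threshold."""
    ham_total = counts["ham_ok"] + counts["fp"]    # all non-spam messages
    spam_total = counts["spam_ok"] + counts["fn"]  # all spam messages
    fp_rate = 100.0 * counts["fp"] / ham_total
    fn_rate = 100.0 * counts["fn"] / spam_total
    return fp_rate, fn_rate


fp_5, fn_5 = error_rates(STATS[5.0])
fp_2, fn_2 = error_rates(STATS[2.0])

# How many times more false positives, and how many times fewer false
# negatives, when the threshold drops from 5.0 to 2.0.
fp_increase = STATS[2.0]["fp"] / STATS[5.0]["fp"]
fn_reduction = STATS[5.0]["fn"] / STATS[2.0]["fn"]

print(f"threshold 5.0: FP {fp_5:.2f}%  FN {fn_5:.2f}%")
print(f"threshold 2.0: FP {fp_2:.2f}%  FN {fn_2:.2f}%")
print(f"FP increase: {fp_increase:.1f}x, FN reduction: {fn_reduction:.1f}x")
```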

Thus, you're getting the exact behavior that was expected... You've jacked your
threshold down low enough that you're getting a significantly increased rate of
false positives.





> I would suggest to the person who send this newsletter to apply a
> correction to his emails to avoid the:
> "1.2 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts"
> but I don't understand what it does mean. Can you help me and suggest my
> how to modify emails to avoid the matching with this rule?



That rule means just what it says. It got a MIME-encoded message that
contained only text/html parts.

No text/plain or other types of MIME sections were present in the message. This
is pretty common for HTML newsletters as well as spam, which is why this rule
only scores 1.2.
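
As a rough illustration of the condition described here, the sketch below
checks whether every non-container part of a message is text/html. It is
an approximation for testing a newsletter before sending it, not
SpamAssassin's actual rule code; the function name has_only_html_parts
and the sample messages are invented for the example.

```python
from email import message_from_string


def has_only_html_parts(raw_message):
    """Return True if every leaf MIME part of the message is text/html."""
    msg = message_from_string(raw_message)
    # walk() visits the message and all nested parts; multipart
    # containers are skipped so only the leaf parts are compared.
    leaf_types = [
        part.get_content_type()
        for part in msg.walk()
        if not part.is_multipart()
    ]
    return bool(leaf_types) and all(t == "text/html" for t in leaf_types)


# Hypothetical sample: a single text/html body, like the newsletter.
html_only = (
    "From: news@example.com\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/html; charset=us-ascii\n"
    "\n"
    "<p>Hello</p>\n"
)

# Hypothetical sample: text/plain and text/html alternatives.
plain_and_html = (
    "From: news@example.com\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hello\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>Hello</p>\n"
    "--XYZ--\n"
)

print(has_only_html_parts(html_only))
print(has_only_html_parts(plain_and_html))
```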

Normally this rule hitting nonspam email would be no problem, as the message in
your example only scored 2.0, which is less than half the required score for a
spam tag in