Posted to users@spamassassin.apache.org by Alan <sp...@ambitonline.com> on 2023/04/12 15:26:54 UTC

FP on KAM_SOMETLD_ARE_BAD_TLD

A lovely message from a reputable sender with a penchant for fancy email formatting has CSS rules expressed in JSON, presumably so it can adjust for the mail client or some such.

A segment contains the text:

"items":[{"type":"Input.Date","id":"date"}]}

The KAM_SOMETLD_ARE_BAD_TLD rule is triggering on Input.Date. The rule is weighted quite high by default (5.0 here).
This is pushing messages over the spam threshold. I've adjusted the weight locally but it's probably something that should be tweaked globally.
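
To illustrate why a token like Input.Date can look like a hostname, here is a minimal Python sketch. It is not the actual KAM rule: the TLD list, the regex, and the function name are assumptions made up for this example. It only shows that a naive schemeless-URL matcher sees "label.tld" and cannot tell JSON from a link.

```python
import re

# Hypothetical TLD list, for illustration only; NOT the list the real rule uses.
SUSPECT_TLDS = ("date", "click", "top")

# Naive schemeless "URL" matcher: one or more dot-separated labels
# followed by a dot and one of the suspect TLDs.
BARE_HOST = re.compile(
    r"\b[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.(?:" + "|".join(SUSPECT_TLDS) + r")\b",
    re.IGNORECASE,
)

def suspect_hosts(text):
    """Return every substring that looks like a host in a suspect TLD."""
    return [m.group(0) for m in BARE_HOST.finditer(text)]

snippet = '"items":[{"type":"Input.Date","id":"date"}]}'
print(suspect_hosts(snippet))
```

The bare word "date" in the "id" field is not matched because there is no label and dot in front of it; only Input.Date has the host-like shape.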

--
For SpamAssassin Users List

Re: FP on KAM_SOMETLD_ARE_BAD_TLD

Posted by Alan <sp...@ambitonline.com>.
On 2023-04-12 20:42, Greg Troxel wrote:
> Alan <sp...@ambitonline.com> writes:
>
>> A lovely message from a reputable sender with a penchant for fancy
>> email formatting has CSS rules expressed in JSON, presumably so it can
>> adjust for the mail client or some such.
>>
>> A segment contains the text:
>>
>> "items":[{"type":"Input.Date","id":"date"}]}
>>
>> The KAM_SOMETLD_ARE_BAD_TLD rule is triggering on Input.Date. The rule is weighted quite high by default (5.0 here).
>> This is pushing messages over the spam threshold. I've adjusted the weight locally but it's probably something that should be tweaked globally.
> (The KAM rules are on the aggressive side, and downscoring is appropriate
> for those who like to be a bit less aggressive, especially those who are
> not comfortable with single rules over 4ish.  But I am still running
> them, because I think they help a lot more than they hurt.)
>
> You seem to be suggesting reducing score, but that's not the real issue
> in this case.  What you have found, I think, is treating something like
> a URL that isn't.  However, that's really hard to fix given the MUA
> so-called feature of treating things that sort of look like URLs as
> URLs.
>
> If you haven't, I would send the message in question to KAM for analysis
> and perhaps rule adjustment.
>
> FWIW, I find that I have adjusted score to 1.5.

KAM is on this list and has replied off list. I trust him to find the 
best way to mitigate the problem.

I just lowered the score, knowing it will take some time for any update
to make it through my upstream. Short of running a headless Chromium,
parsing the entire HTML, and then inspecting the resulting DOM, there are
always going to be issues like this. I've been doing battle with a
particularly persistent spammer (multiple spams per user per day from
different sources) who always used long URLs that followed a specific
format. Now he uses three formats, so I have to match only on the
handful of users who I know are on his list to avoid my own FPs. With
that one, I really wish I had the DOM, because the [curse words] follows
a format that would be easy to catch with an XPath query.
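
As a rough illustration of the DOM-plus-XPath idea, here is a minimal Python sketch. Everything specific in it is an assumption: the sample markup, the example.com URLs, the function name, and the "link inside a table cell" pattern are invented, since the spammer's real format isn't given here. It uses the limited XPath subset in Python's xml.etree.ElementTree, which only parses well-formed markup; real HTML mail would need a forgiving HTML parser first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample body; real mail HTML is rarely this tidy.
SAMPLE = """<html><body>
<table><tr><td><a href="http://example.com/a/b/c">Click</a></td></tr></table>
<p><a href="http://example.org/">Home</a></p>
</body></html>"""

def links_in_table_cells(markup):
    """Return the href of every <a> that is a direct child of a <td>."""
    root = ET.fromstring(markup)
    # ".//td/a[@href]" selects <a> elements with an href attribute
    # that sit directly inside any <td>, at any depth.
    return [a.get("href") for a in root.findall(".//td/a[@href]")]

print(links_in_table_cells(SAMPLE))
```

The point is that a structural query ignores how the URL itself is spelled, which is what makes it attractive when the URL format keeps changing.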

All in a day's work...


Re: FP on KAM_SOMETLD_ARE_BAD_TLD

Posted by Greg Troxel <gd...@lexort.com>.
Alan <sp...@ambitonline.com> writes:

> A lovely message from a reputable sender with a penchant for fancy
> email formatting has CSS rules expressed in JSON, presumably so it can
> adjust for the mail client or some such.
>
> A segment contains the text:
>
> "items":[{"type":"Input.Date","id":"date"}]}
>
> The KAM_SOMETLD_ARE_BAD_TLD rule is triggering on Input.Date. The rule is weighted quite high by default (5.0 here).
> This is pushing messages over the spam threshold. I've adjusted the weight locally but it's probably something that should be tweaked globally.

(The KAM rules are on the aggressive side, and downscoring is appropriate
for those who like to be a bit less aggressive, especially those who are
not comfortable with single rules over 4ish.  But I am still running
them, because I think they help a lot more than they hurt.)

You seem to be suggesting reducing score, but that's not the real issue
in this case.  What you have found, I think, is treating something like
a URL that isn't.  However, that's really hard to fix given the MUA
so-called feature of treating things that sort of look like URLs as
URLs.

If you haven't, I would send the message in question to KAM for analysis
and perhaps rule adjustment.

FWIW, I find that I have adjusted score to 1.5.