Posted to users@spamassassin.apache.org by Kenneth Porter <sh...@sewingwitch.com> on 2021/08/11 01:57:11 UTC

KAM_SOMETLD_ARE_BAD_TLD false positive

My cellular supplier has a weekly bag of goodies (coupons, schwag) and last 
week's included a free photo refrigerator magnet from CVS. So I signed up a 
CVS/Kodak account to put in my order. Like most such offers, they started 
sending me marketing mail, and the first one hit KAM_SOMETLD_ARE_BAD_TLD, 
with a 5.0 score. I'll be turning that score down (probably to 3.5) but I 
think the rule itself is the issue. It's firing on a URI that has ".shop" 
as the last part of the path in a legitimate .com URI. Perhaps the rule 
could require that no single slash (a path separator, as opposed to the 
"//" after the scheme) appears before the offending TLD. There's a helper 
rule that checks for false positives; it could be replaced with one that 
ignores TLDs after an isolated slash in a URI.
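
To make the failure mode concrete, here is a minimal Python sketch of the URI sub-rule's pattern applied to a few made-up URLs. The regex is copied from the rule quoted below; the URLs are hypothetical stand-ins (the actual CVS/Kodak link is not reproduced here), and Python's re module only approximates how SpamAssassin evaluates uri rules.

```python
import re

# Pattern from __KAM_SOMETLD_ARE_BAD_TLD_URI, with the /i flag mapped to re.IGNORECASE.
BAD_TLD_URI = re.compile(
    r"\.(pw|stream|trade|press|top|date|guru|casa|online|cam|shop|club|bar)($|\/)",
    re.IGNORECASE,
)

# Hypothetical URLs, for illustration only.
spammy = "http://cheap-pills.shop/"
legit = "https://www.example.com/photos/offers.shop"

def hits(url):
    """Return True if the pattern matches anywhere in the URL."""
    return BAD_TLD_URI.search(url) is not None

print(hits(spammy))  # True: .shop is the host's TLD
print(hits(legit))   # True: .shop is only a path suffix, yet it matches too
```

Nothing in the pattern ties the match to the host part of the URI, so a path component ending in a listed TLD looks the same to it as a host name.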

Do the KAM rules have an issue tracker where this kind of report can be 
made?

The rule:

header      __KAM_SOMETLD_ARE_BAD_TLD_FROM          From:addr =~ /\.(pw|stream|trade|press|top|date|guru|casa|online|cam|shop|club|b
uri      __KAM_SOMETLD_ARE_BAD_TLD_URI /\.(pw|stream|trade|press|top|date|guru|casa|online|cam|shop|club|bar)($|\/)/i

#FPs
uri      __KAM_SOMETLD_ARE_BAD_TLD_URI_NEGATIVE /(^|\b)td\.date|div\.top($|\/)/i

meta     KAM_SOMETLD_ARE_BAD_TLD    (__KAM_SOMETLD_ARE_BAD_TLD_FROM) || (__KAM_SOMETLD_ARE_BAD_TLD_URI && !__KAM_SOMETLD_ARE_BAD_TLD
describe    KAM_SOMETLD_ARE_BAD_TLD         .stream, .trade, .pw, .top, .press, .guru, .casa, .online, .cam, .shop, .bar, .club & .d
score       KAM_SOMETLD_ARE_BAD_TLD         5.0


Re: KAM_SOMETLD_ARE_BAD_TLD false positive

Posted by Kenneth Porter <sh...@sewingwitch.com>.
--On Wednesday, August 11, 2021 12:29 AM -0400 "Kevin A. McGrail" 
<km...@apache.org> wrote:

> Hi Kenneth, the ruleset is designed for a system scoring over 5.0.
>
> Did the rule from the cell provider cause an fp?
>
> Is your threshold higher than 5.0?

I use the stock threshold of 5.0. I'm using the ruleset via the channel 
distribution on a CentOS (RHEL) 7 system.

> There is a way to report problems listed in the file but feel free to
> contact me off list and I'll tell you how to send me a sample.

Thanks, now that I know where to look, I submitted the sample with your web 
form.

Perhaps you could echo the support information to the main KAM web page? 
That's where I looked because that's where I found the channel information.

<https://mcgrail.com/newsmanager/news_article.cgi?&template=news.template&news_id=11&article_template=news_mcgrail_article_style>

That's what I found from following the link here:

<https://cwiki.apache.org/confluence/display/SPAMASSASSIN/CustomRulesets>



Re: KAM_SOMETLD_ARE_BAD_TLD false positive

Posted by "Kevin A. McGrail" <km...@apache.org>.
Hi Kenneth, the ruleset is designed for a system scoring over 5.0.

Did the rule from the cell provider cause an fp?

Is your threshold higher than 5.0?

There is a way to report problems listed in the file but feel free to
contact me off list and I'll tell you how to send me a sample.

Regards, KAM


Re: Leaning toothpick syndrome (was: KAM_SOMETLD_ARE_BAD_TLD false positive)

Posted by "Kevin A. McGrail" <km...@apache.org>.
As a note, I sometimes make my rules harder to read on purpose to dissuade
bad actors from trying to unwind them.


Leaning toothpick syndrome (was: KAM_SOMETLD_ARE_BAD_TLD false positive)

Posted by Kenneth Porter <sh...@sewingwitch.com>.
On 8/11/2021 8:05 AM, Kenneth Porter wrote:
>
> BTW, does SA permit use of Perl-style regex delimiters to avoid 
> leaning toothpick syndrome?
>
> https://en.wikipedia.org/wiki/Leaning_toothpick_syndrome
>
Answering my own question, I see it used in this rule:

uri        __IMGUR_IMG 
m,^https?://(?:[^.]+\.)?imgur\.com/[a-z0-9]{7}\.(?:png|gif|jpe?g)$,i

I see a dozen rules in the latest SA rule update using the m<delimiter> 
scheme to avoid having to escape slashes in a uri. The result is 
significantly more readable.
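
The readability gain can be checked without a SpamAssassin install. The Python sketch below is only an analogy: Python patterns are plain strings, so there is no delimiter to escape, but it does show that the escaped (\/) and unescaped (/) spellings describe the same regex. The sample URLs are made up.

```python
import re

# Body of a /.../-delimited rule: every slash needs a backslash.
escaped = r"^https?:\/\/(?:[^.]+\.)?imgur\.com\/[a-z0-9]{7}\.(?:png|gif|jpe?g)$"
# Body of the m,...,i form quoted above: slashes appear as-is.
plain = r"^https?://(?:[^.]+\.)?imgur\.com/[a-z0-9]{7}\.(?:png|gif|jpe?g)$"

samples = [
    "https://i.imgur.com/abc1234.png",    # hypothetical direct image link
    "http://imgur.com/ABCDEFG.jpeg",      # upper case, accepted because of the i flag
    "https://imgur.com/gallery/abc1234",  # not a direct image link
]

def verdicts(pattern):
    """Compile case-insensitively and report a match/no-match flag per sample."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [rx.search(s) is not None for s in samples]

# Both spellings give identical results on every sample.
print(verdicts(escaped) == verdicts(plain))
```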



Re: KAM_SOMETLD_ARE_BAD_TLD false positive

Posted by Kenneth Porter <sh...@sewingwitch.com>.
On 8/11/2021 7:39 AM, Jared Hall wrote:
>
> *Maybe* a little more refinement could prevent it picking  up .hidden 
> folders that have a BAD_TLD name.
>
> /[A-z0-9]+\.(pw|stream|trade|press|top|date|guru|casa|online|cam|shop|club|bar)(\s|$|\/)/i 


The CVS/Kodak uri would still fail on this pattern, as the BAD_TLD is 
the extension in the final path component.
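
A quick check of both points, as a Python sketch rather than a SpamAssassin test. The pattern is Jared's as quoted; the URLs are invented for illustration. One side note: the class [A-z0-9] is broader than it looks, since the ASCII range A-z also covers [ \ ] ^ _ and the backtick.

```python
import re

# Jared's refinement: require at least one name character before the dot.
REFINED = re.compile(
    r"[A-z0-9]+\.(pw|stream|trade|press|top|date|guru|casa|online|cam|shop|club|bar)(\s|$|\/)",
    re.IGNORECASE,
)

def hits(url):
    """Return True if the refined pattern matches anywhere in the URL."""
    return REFINED.search(url) is not None

# Hidden folder named like a listed TLD: the dot follows a slash, so no match.
print(hits("https://nvr.example.com/.cam/"))               # False
# Listed TLD as a file extension in the path: still matches.
print(hits("https://www.example.com/photos/offers.shop"))  # True
```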

My initial idea for fixing this in the negative pattern wouldn't work 
because a spammer could use https://example.badtld/example.badtld to 
sneak through.

Perhaps something like 
"//[^/]+\.(pw|stream|trade|press|top|date|guru|casa|online|cam|shop|club|bar)($|/)"i 
?

That might also need a matcher on the end for the optional port number.
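
A rough Python sketch of that idea, with the optional port added. The (:\d+)? group and all URLs are assumptions made for illustration; this has not been tested against SpamAssassin's own URI handling, and it ignores cases such as user@host.

```python
import re

TLDS = "pw|stream|trade|press|top|date|guru|casa|online|cam|shop|club|bar"

# Host-anchored idea: "//", then a host with no slash in it, ending in a
# listed TLD. The (:\d+)? port group is an added assumption.
HOST_BAD_TLD = re.compile(
    r"//[^/]+\.(" + TLDS + r")(:\d+)?($|/)",
    re.IGNORECASE,
)

def hits(url):
    """Return True if the host part of the URL ends in a listed TLD."""
    return HOST_BAD_TLD.search(url) is not None

# Hypothetical URLs, for illustration only.
print(hits("https://www.example.com/photos/offers.shop"))  # False: .shop is in the path
print(hits("http://cheap-pills.shop/"))                    # True: .shop is the host's TLD
print(hits("http://cheap-pills.shop:8080/win"))            # True: port is tolerated
print(hits("https://example.shop/example.shop"))           # True: host still gives it away
```

A doubled slash later in the path (for example .com//offers.shop) would reintroduce the match, so anchoring the pattern to the scheme may be worth considering.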

BTW, does SA permit use of Perl-style regex delimiters to avoid leaning 
toothpick syndrome?

https://en.wikipedia.org/wiki/Leaning_toothpick_syndrome



Re: KAM_SOMETLD_ARE_BAD_TLD false positive

Posted by Jared Hall <ja...@jaredsec.com>.
Kenneth Porter wrote:
>
> uri      __KAM_SOMETLD_ARE_BAD_TLD_URI 
> /\.(pw|stream|trade|press|top|date|guru|casa|online|cam|shop|club|bar)($|\/)/i
>

I have a client whose NVR writes its archived video spools to a .cam 
folder on their server.  Heaven forbid ".well-known" ever becomes a TLD :)

*Maybe* a little more refinement could prevent it picking  up .hidden 
folders that have a BAD_TLD name.

/[A-z0-9]+\.(pw|stream|trade|press|top|date|guru|casa|online|cam|shop|club|bar)(\s|$|\/)/i 


$0.02,

-- Jared Hall