Posted to users@spamassassin.apache.org by Ken Bass <kb...@kenbass.com> on 2014/10/15 22:49:32 UTC
SA skipping URI processing
I'm using CentOS 7, which means SA version 3.3.2.
I am encountering several emails that are not being processed correctly
when checking against URI rules.
1) My local.cf has a rule to address the new .link domain which spammers
appear to be using recently:
uri LR_LINK_TLD /^(?:https?:\/\/|mailto:)[^\/]+\.link(?:\/|$)/i
describe LR_LINK_TLD Contains a URL in the LINK top-level domain
score LR_LINK_TLD 3.0
2) The URIDNSBL rules are not being executed for these emails either.
The SA debug output shows an empty 'domains to query' list. Huh?
Oct 15 16:24:55.416 [15519] dbg: uridnsbl: domains to query:
Here is the pastebin link to the full spam email:
http://pastebin.com/RJWyGkKB
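The regular expression in the LR_LINK_TLD rule can be checked on its own, apart from SpamAssassin. The following is a minimal sketch using Python's re module, on the assumption that Python and Perl treat this particular expression the same way. The sample URIs are hypothetical and are not taken from the pastebin message. A rule like this can only fire on URIs that SA has actually extracted from the message, so a pattern that passes here does not guarantee a hit.

```python
import re

# Same pattern as the LR_LINK_TLD rule, compiled with Python's re module.
# Assumption: Python and Perl agree on this particular expression.
LR_LINK_TLD = re.compile(
    r"^(?:https?:\/\/|mailto:)[^\/]+\.link(?:\/|$)", re.IGNORECASE
)


def hits_link_tld(uri):
    """Return True if the URI matches the LR_LINK_TLD pattern."""
    return LR_LINK_TLD.search(uri) is not None


if __name__ == "__main__":
    # Hypothetical sample URIs for illustration only.
    samples = [
        "http://example.link/",
        "http://example.link",
        "mailto:user@example.link",
        "http://example.com/a.link",
        "http://example.linker/",
    ]
    for uri in samples:
        print(uri, hits_link_tld(uri))
```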
Re: SA skipping URI processing
Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 10/15/2014 7:33 PM, Ken Bass wrote:
> On 10/15/2014 6:50 PM, Kevin A. McGrail wrote:
>> I'd have to dig into it to find out more but there are different
>> modules used for different tests so deviation in behavior is not
>> something that alarms me. If you replace your RegistrarBoundaries.pm
>> and it still has issues, please let us know. I am 99.9% sure I'm right.
>>
>> regards,
>> KAM
> Thanks -- My apologies for doubting you. Kind of scary that there is
> a loophole that will grow each time a new tld is introduced. For now,
> I'll just block the .link domain at the smtp level.
I'm an engineer, so doubt is a good thing. Trust but verify ;-)
But yes, we know the TLD issue is a growing pain point and we have some
thoughts in progress to resolve it.
Re: SA skipping URI processing
Posted by Ken Bass <kb...@kenbass.com>.
On 10/15/2014 6:50 PM, Kevin A. McGrail wrote:
> I'd have to dig into it to find out more but there are different
> modules used for different tests so deviation in behavior is not
> something that alarms me. If you replace your RegistrarBoundaries.pm
> and it still has issues, please let us know. I am 99.9% sure I'm right.
>
> regards,
> KAM
Thanks -- My apologies for doubting you. Kind of scary that there is a
loophole that will grow each time a new tld is introduced. For now, I'll
just block the .link domain at the smtp level.
Re: SA skipping URI processing
Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 10/15/2014 6:20 PM, Ken Bass wrote:
> On 10/15/2014 6:12 PM, Martin Gregorie wrote:
>> I'm certain KAM is right and here's why.
> ...snip...
>> IOW, uri rules depend on matching the terminal part of the domain name
>> with an entry in SA's built-in TLD list and my version, installed from
>> the Fedora repo, doesn't yet include .link.
>>
>> I reverted my rules and test messages to test for the .link TLD and am
>> now waiting for a TLD list that contains .link to percolate through the
>> Fedora update process.
>>
>>
> I think my confusion is that for many spam messages, the uri rule is
> working fine for the .link domain.
> After looking at some different spam emails, I think the difference is
> that if the .link is inside an 'HTML' spam, the url processing works.
> If it is a normal text spam email, the url processing does not work.
> That has been the source of my confusion and why I was thinking KAM
> was referring to a different issue.
>
> So I am thinking that the HTML decoding part of SA doesn't use that
> built-in TLD list, but the text email processing does. That is the
> only way I can explain what I am seeing.
I'd have to dig into it to find out more but there are different modules
used for different tests so deviation in behavior is not something that
alarms me. If you replace your RegistrarBoundaries.pm and it still has
issues, please let us know. I am 99.9% sure I'm right.
regards,
KAM
Re: SA skipping URI processing
Posted by Martin Gregorie <ma...@gregorie.org>.
On Wed, 2014-10-15 at 18:20 -0400, Ken Bass wrote:
> On 10/15/2014 6:12 PM, Martin Gregorie wrote:
> > I'm certain KAM is right and here's why.
> ...snip...
> > IOW, uri rules depend on matching the terminal part of the domain name
> > with an entry in SA's built-in TLD list and my version, installed from
> > the Fedora repo, doesn't yet include .link.
> >
> > I reverted my rules and test messages to test for the .link TLD and am
> > now waiting for a TLD list that contains .link to percolate through the
> > Fedora update process.
> >
> >
> I think my confusion is that for many spam messages, the uri rule is
> working fine for the .link domain.
> After looking at some different spam emails, I think the difference is
> that if the .link is inside an 'HTML' spam, the url processing works. If
> it is a normal text spam email, the url processing does not work. That
> has been the source of my confusion and why I was thinking KAM was
> referring to a different issue.
>
> So I am thinking that the HTML decoding part of SA doesn't use that
> built-in TLD list, but the text email processing does. That is the only
> way I can explain what I am seeing.
>
That's quite possible. My test messages are all plaintext or have the
uris in plaintext MIME parts.
Martin
Re: SA skipping URI processing
Posted by Ken Bass <kb...@kenbass.com>.
On 10/15/2014 6:12 PM, Martin Gregorie wrote:
> I'm certain KAM is right and here's why.
...snip...
> IOW, uri rules depend on matching the terminal part of the domain name
> with an entry in SA's built-in TLD list and my version, installed from
> the Fedora repo, doesn't yet include .link.
>
> I reverted my rules and test messages to test for the .link TLD and am
> now waiting for a TLD list that contains .link to percolate through the
> Fedora update process.
>
>
I think my confusion is that for many spam messages, the uri rule is
working fine for the .link domain.
After looking at some different spam emails, I think the difference is
that if the .link is inside an 'HTML' spam, the url processing works. If
it is a normal text spam email, the url processing does not work. That
has been the source of my confusion and why I was thinking KAM was
referring to a different issue.
So I am thinking that the HTML decoding part of SA doesn't use that
built-in TLD list, but the text email processing does. That is the only
way I can explain what I am seeing.
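The hypothesis above can be pictured with a small sketch. In an HTML part the URI sits in an explicit href attribute, so a parser can collect it without deciding whether the host name ends in a known TLD. This is only an illustration of the idea, written with Python's standard html.parser module. It is not SpamAssassin's code, the thread does not confirm that SA works this way, and the names HrefCollector and html_uris are made up for the example.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href values from <a> tags without any TLD check."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def html_uris(html):
    """Return every href found in anchor tags of the given HTML text."""
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


if __name__ == "__main__":
    # Hypothetical HTML fragment, not taken from the pastebin message.
    print(html_uris('<a href="http://spam.example.link/offer">click</a>'))
```

In plain text there is no such markup, so a scanner has to guess where a URI starts and ends, which is where a TLD list would come into play.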
Re: SA skipping URI processing
Posted by Martin Gregorie <ma...@gregorie.org>.
On Wed, 2014-10-15 at 17:01 -0400, Ken Bass wrote:
> On 10/15/2014 4:52 PM, Kevin A. McGrail wrote:
> > On 10/15/2014 4:49 PM, Ken Bass wrote:
> >> 1) My local.cf has a rule to address the new .link domain which
> >> spammers appear to be using recently:
> >>
> >> uri LR_LINK_TLD /^(?:https?:\/\/|mailto:)[^\/]+\.link(?:\/|$)/i
> >> describe LR_LINK_TLD Contains a URL in the LINK top-level domain
> >> score LR_LINK_TLD 3.0
> >>
> >> 2) The URIDNSBL rules are not being executed for these emails either.
> >>
> >> The SA debug output shows an empty 'domains to query' list. Huh?
> >> Oct 15 16:24:55.416 [15519] dbg: uridnsbl: domains to query:
> >>
> >> Here is the pastebin link to the full spam email:
> >>
> >> http://pastebin.com/RJWyGkKB
> > The TLDs are hardcoded in SA 3.3.2. We are working on not having
> > them hard-coded in 3.4.1.
> >
> > I believe someone made a patch suitable for 3.3.2 but I can't find it
> > at the moment.
>
> Sorry, but I think you might be confusing some specific TLD-related rule
> issues with the more generic custom uri rules and uridnsbl rules that I
> am using, because these work fine on OTHER emails. Something in specific
> emails, like the one in the above pastebin, is causing the issue. I've
> got lots of other emails that hit the above LR_LINK_TLD and/or
> URIBL_DBL_SPAM.
>
I'm certain KAM is right and here's why.
I recently wrote a set of three experimental rules to detect *.link in
body text, Received headers and From headers, and set up some test
messages, since I've yet to see any .link TLDs. The body text rule was,
of course, a URI rule. It didn't work, though the other two rules, which
used ordinary regexes with \.link as part of the expression, worked as
expected. Eventually, as a debugging aid, I changed the rules and the
test messages to search for \.com, and all three rules worked as
expected.
IOW, uri rules depend on matching the terminal part of the domain name
with an entry in SA's built-in TLD list and my version, installed from
the Fedora repo, doesn't yet include .link.
I reverted my rules and test messages to test for the .link TLD and am
now waiting for a TLD list that contains .link to percolate through the
Fedora update process.
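The behaviour described above can be modelled with a toy example. This is a simplified sketch and not SpamAssassin's implementation: KNOWN_TLDS is a hypothetical, heavily shortened stand-in for a hardcoded TLD list, and extract_uris is an invented helper that only accepts a plain-text URI candidate when its host name ends in a listed TLD.

```python
import re

# Hypothetical, abbreviated stand-in for a hardcoded TLD list.
KNOWN_TLDS = {"com", "net", "org"}

# Very loose candidate pattern: scheme, host name, optional path.
URI_CANDIDATE = re.compile(r"https?://([A-Za-z0-9.-]+)(/\S*)?", re.IGNORECASE)


def extract_uris(text, known_tlds):
    """Return URI candidates whose host ends in one of known_tlds."""
    found = []
    for match in URI_CANDIDATE.finditer(text):
        host = match.group(1).rstrip(".").lower()
        tld = host.rsplit(".", 1)[-1]
        # A candidate with an unlisted TLD is silently dropped, so no
        # uri rule or URIDNSBL lookup would ever see it.
        if tld in known_tlds:
            found.append(match.group(0))
    return found


if __name__ == "__main__":
    body = "Visit http://spam.example.link/offer and http://example.com/"
    print(extract_uris(body, KNOWN_TLDS))
    print(extract_uris(body, KNOWN_TLDS | {"link"}))
```

With the shortened list only the .com URI survives. Once "link" is added to the set, both URIs are returned, which mirrors the effect of waiting for an updated TLD list.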
HTH
Martin
Re: SA skipping URI processing
Posted by Ken Bass <kb...@kenbass.com>.
On 10/15/2014 4:52 PM, Kevin A. McGrail wrote:
> On 10/15/2014 4:49 PM, Ken Bass wrote:
>> 1) My local.cf has a rule to address the new .link domain which
>> spammers appear to be using recently:
>>
>> uri LR_LINK_TLD /^(?:https?:\/\/|mailto:)[^\/]+\.link(?:\/|$)/i
>> describe LR_LINK_TLD Contains a URL in the LINK top-level domain
>> score LR_LINK_TLD 3.0
>>
>> 2) The URIDNSBL rules are not being executed for these emails either.
>>
>> The SA debug output shows an empty 'domains to query' list. Huh?
>> Oct 15 16:24:55.416 [15519] dbg: uridnsbl: domains to query:
>>
>> Here is the pastebin link to the full spam email:
>>
>> http://pastebin.com/RJWyGkKB
> The TLDs are hardcoded in SA 3.3.2. We are working on not having
> them hard-coded in 3.4.1.
>
> I believe someone made a patch suitable for 3.3.2 but I can't find it
> at the moment.
Sorry, but I think you might be confusing some specific TLD-related rule
issues with the more generic custom uri rules and uridnsbl rules that I
am using, because these work fine on OTHER emails. Something in specific
emails, like the one in the above pastebin, is causing the issue. I've
got lots of other emails that hit the above LR_LINK_TLD and/or
URIBL_DBL_SPAM.
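To compare messages that hit the rules with messages that do not, the "domains to query" debug lines can be pulled out of saved debug output. The sketch below is a minimal helper for that. It assumes the line format shown in the quoted debug output and further assumes that, when the list is not empty, the domains are separated by whitespace. The function name domains_to_query is invented for this example.

```python
import re

# Matches the uridnsbl debug line quoted in this thread.
# Assumption: any domains follow the colon, separated by whitespace.
QUERY_LINE = re.compile(r"dbg: uridnsbl: domains to query:(.*)$")


def domains_to_query(debug_line):
    """Return the listed domains, or None if this is a different line."""
    match = QUERY_LINE.search(debug_line)
    if match is None:
        return None
    return match.group(1).split()


if __name__ == "__main__":
    line = "Oct 15 16:24:55.416 [15519] dbg: uridnsbl: domains to query:"
    print(domains_to_query(line))
```

An empty list for a message that visibly contains a URL suggests that the URL was never extracted in the first place, so neither uri rules nor URIDNSBL lookups had anything to work on.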
Re: SA skipping URI processing
Posted by Ken Bass <kb...@kenbass.com>.
On 10/15/2014 4:52 PM, Kevin A. McGrail wrote:
> The TLDs are hardcoded in SA 3.3.2. We are working on not having
> them hard-coded in 3.4.1.
I found Bug 6782, which I think you are referring to. I don't quite
understand the details of it. But are you saying that the 'uri' and
uridnsbl rules rely on those functions? If so, I am confused, because I
have many spam emails with the '.link' domain that are being tagged
properly.
Re: SA skipping URI processing
Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 10/15/2014 4:49 PM, Ken Bass wrote:
> I'm using CentOS 7, which means SA version 3.3.2.
>
> I am encountering several emails that are not being processed
> correctly when checking against URI rules.
>
> 1) My local.cf has a rule to address the new .link domain which
> spammers appear to be using recently:
>
> uri LR_LINK_TLD /^(?:https?:\/\/|mailto:)[^\/]+\.link(?:\/|$)/i
> describe LR_LINK_TLD Contains a URL in the LINK top-level domain
> score LR_LINK_TLD 3.0
>
> 2) The URIDNSBL rules are not being executed for these emails either.
>
> The SA debug output shows an empty 'domains to query' list. Huh?
> Oct 15 16:24:55.416 [15519] dbg: uridnsbl: domains to query:
>
> Here is the pastebin link to the full spam email:
>
> http://pastebin.com/RJWyGkKB
The TLDs are hardcoded in SA 3.3.2. We are working on not having them
hard-coded in 3.4.1.
I believe someone made a patch suitable for 3.3.2 but I can't find it at
the moment.
regards,
KAM