Posted to users@spamassassin.apache.org by Mike Marynowski <mi...@singulink.com> on 2019/02/27 17:16:20 UTC
Spam rule for HTTP/HTTPS request to sender's root domain
Hi everyone,
I haven't been able to find any existing spam rules or checks that do
this, but from my analysis of the ham/spam I'm getting, I think this would
be a really great addition. Almost all of the spam emails that are coming
through do not have a working website at the root domain of the sender.
Of the last 100 legitimate email domains that have sent me mail, 100% of
them have working websites at the root domain.
As far as I can tell there isn't currently a way to build a rule that
does this and a Perl plugin would have to be created. Is this an
accurate assessment? Can you recommend some good resources for building
a SpamAssassin plugin if this is the case?
Thanks!
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Sorry, I meant that I thought it was doing those checks because I know I
was playing with checking A records before and figured the rules would
have it enabled by default... I tried to find the rules after I sent that
message and realized that was related to sender domain A record checks
done in my MTA.
On 3/1/2019 2:26 PM, Antony Stone wrote:
> On Friday 01 March 2019 at 17:37:18, Mike Marynowski wrote:
>
>> Quick sampling of 10 emails: 8 of them have valid A records on the email
>> domain. I presumed SpamAssassin was already doing simple checks like that.
> That doesn't sound like a good idea to me (presuming, I mean).
>
>
> Antony.
>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Antony Stone <An...@spamassassin.open.source.it>.
On Friday 01 March 2019 at 17:37:18, Mike Marynowski wrote:
> Quick sampling of 10 emails: 8 of them have valid A records on the email
> domain. I presumed SpamAssassin was already doing simple checks like that.
That doesn't sound like a good idea to me (presuming, I mean).
Antony.
--
"The future is already here. It's just not evenly distributed yet."
- William Gibson
Please reply to the list;
please *don't* CC me.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
On 3/1/2019 1:07 PM, RW wrote:
> Sure, but had it turned-out that most of these domains didn't have the A
> record necessary for your HTTP test, it wouldn't have been worth doing
> anything more complicated.
I've noticed a lot of the spam domains appear to point to actual web
servers but throw 403 or 503 errors, which A records wouldn't help with
and which has been taken into account here. As for being "more
complicated" - it's basically done and running in my test environment for
final tweaking, haha, so it's a bit late now :P It was only a day's work
to put everything together, including the DNS service and caching layer,
so meh. Unless you mean complicated in the technical sense as opposed to
effort-wise.
> You don't need an A record for email. The last time I looked it just
> tests that there's enough DNS for a bounce to be received, so an A or
> MX for the sender domain.
I was confusing different tests here; you can disregard my previous message.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by RW <rw...@googlemail.com>.
On Fri, 1 Mar 2019 11:37:18 -0500
Mike Marynowski wrote:
> Looking for an A record on what - just the email address domain or
> the chain of parent domains as well? If the latter, a lack of an A
> record will cause this check to fail anyway, so it's kind of built in.
Sure, but had it turned-out that most of these domains didn't have the A
record necessary for your HTTP test, it wouldn't have been worth doing
anything more complicated.
> Quick sampling of 10 emails: 8 of them have valid A records on the
> email domain. I presumed SpamAssassin was already doing simple checks
> like that.
You don't need an A record for email. The last time I looked it just
tests that there's enough DNS for a bounce to be received, so an A or
MX for the sender domain.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Looking for an A record on what - just the email address domain or the
chain of parent domains as well? If the latter, a lack of an A record
will cause this check to fail anyway, so it's kind of built in.
Quick sampling of 10 emails: 8 of them have valid A records on the email
domain. I presumed SpamAssassin was already doing simple checks like that.
On 3/1/2019 10:23 AM, RW wrote:
> On Wed, 27 Feb 2019 12:16:20 -0500
> Mike Marynowski wrote:
>> Almost all of the spam emails that are
>> coming through do not have a working website at the root domain of
>> the sender.
> Did you establish what fraction of this spam could be caught just by
> looking for an A record?
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by RW <rw...@googlemail.com>.
On Wed, 27 Feb 2019 12:16:20 -0500
Mike Marynowski wrote:
> Almost all of the spam emails that are
> coming through do not have a working website at the root domain of
> the sender.
Did you establish what fraction of this spam could be caught just by
looking for an A record?
Re: Open source
Posted by Ralph Seichter <ab...@monksofcool.net>.
* RW:
> You're missing the point.
It may surprise you, but there is more than one "point" to having
packages, and I can choose to make whatever point I damn well
please. :-)
-Ralph
Re: Open source (WAS: Spam rule for HTTP/HTTPS request to sender's
root domain)
Posted by RW <rw...@googlemail.com>.
On Thu, 21 Mar 2019 18:26:15 +0100
Ralph Seichter wrote:
> * Mike Marynowski:
>
> > I was more asking if there is a good reason to build packages
> > intended for local installation by email server operators and I
> > don't think there really is.
>
> As a maintainer of several Gentoo Linux ebuilds, I agree you should
> leave packaging to the various Linux distributions. Building, testing,
> dependency management etc. vary significantly. Best leave that to the
> folks who do it on a regular basis.
You're missing the point. The reason for not having packages is that
the check is more accurate if everyone shares the same database.
Re: Open source (WAS: Spam rule for HTTP/HTTPS request to sender's root domain)
Posted by Ralph Seichter <ab...@monksofcool.net>.
* Mike Marynowski:
> I was more asking if there is a good reason to build packages intended
> for local installation by email server operators and I don't think
> there really is.
As a maintainer of several Gentoo Linux ebuilds, I agree you should
leave packaging to the various Linux distributions. Building, testing,
dependency management etc. vary significantly. Best leave that to the
folks who do it on a regular basis.
-Ralph
Re: Open source (WAS: Spam rule for HTTP/HTTPS request to sender's
root domain)
Posted by Mike Marynowski <mi...@singulink.com>.
Perhaps I should have been clearer - I'm not against posting the code
for any reason and I am planning to do that anyway in case anyone wants
to look at it or chip in improvements and whatnot.
I'm an active contributor on many open source projects and I have fully
embraced OSS :) I was more asking if there is a good reason to build
packages intended for local installation by email server operators, and I
don't think there really is. There's a fundamental difference in how the
project would be set up if it was intended to be installed by all email
server operators, i.e. writing a config file loader instead of
hardcoding values, allowing more flexibility, building packages for
different operating systems, etc. What I'm saying is I don't think I
will be officially supporting that route, as it seems more beneficial to
collaborate on a central database, though people are obviously free to
do with the code as they wish.
Cheers!
Mike
On 3/21/2019 5:42 AM, Tom Hendrikx wrote:
> On 20-03-19 19:56, Mike Marynowski wrote:
>> A couple people asked about me posting the code/service so they could
>> run it on their own systems but I'm currently leaning away from that. I
>> don't think there is any benefit to doing that instead of just utilizing
>> the centralized service. The whole thing works better if everyone using
>> it queries a central service and helps avoid people easily making bad
>> mistakes like the one above and then spending hours scrambling to try to
>> find non-existent botnet infections on their network while mail bounces
>> because they are on a blocklist :( If someone has a good reason for
>> making the service locally installable let me know though, haha.
> When people are interested in seeing the code, their main incentive for
> such a request is probably not that they want to run it themselves. They
> might, in no particular order:
>
> - would like to learn from what you're doing
> - would like to see how you're treating their contributed data
> - would like to verify the listing policy that you're proposing
> - would like to study if there could be better criteria for
> listing/unlisting than the ones currently available
> - would like to change things in the software and contribute that back
> for the benefit of everyone
> - would like to squash bugs that you might currently be missing
> - would like to help out on further development of the service if or
> when your time is limited
> - would not like to depend on a single person to maintain a service
> they like
>
> This is called open source, and it's a good thing. For details on the
> philosophy behind it,
> http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ is
> a good read.
>
> In short: if you like your project to prosper, put it on github for
> everyone to see.
>
> Kind regards,
>
> Tom
>
Re: Open source (WAS: Spam rule for HTTP/HTTPS request to sender's
root domain)
Posted by Mike Marynowski <mi...@singulink.com>.
Here ya go ;)
https://github.com/mikernet/HttpCheckDnsServer
On 3/21/2019 5:42 AM, Tom Hendrikx wrote:
> On 20-03-19 19:56, Mike Marynowski wrote:
>> A couple people asked about me posting the code/service so they could
>> run it on their own systems but I'm currently leaning away from that. I
>> don't think there is any benefit to doing that instead of just utilizing
>> the centralized service. The whole thing works better if everyone using
>> it queries a central service and helps avoid people easily making bad
>> mistakes like the one above and then spending hours scrambling to try to
>> find non-existent botnet infections on their network while mail bounces
>> because they are on a blocklist :( If someone has a good reason for
>> making the service locally installable let me know though, haha.
> When people are interested in seeing the code, their main incentive for
> such a request is probably not that they want to run it themselves. They
> might, in no particular order:
>
> - would like to learn from what you're doing
> - would like to see how you're treating their contributed data
> - would like to verify the listing policy that you're proposing
> - would like to study if there could be better criteria for
> listing/unlisting than the ones currently available
> - would like to change things in the software and contribute that back
> for the benefit of everyone
> - would like to squash bugs that you might currently be missing
> - would like to help out on further development of the service if or
> when your time is limited
> - would not like to depend on a single person to maintain a service
> they like
>
> This is called open source, and it's a good thing. For details on the
> philosophy behind it,
> http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ is
> a good read.
>
> In short: if you like your project to prosper, put it on github for
> everyone to see.
>
> Kind regards,
>
> Tom
>
Open source (WAS: Spam rule for HTTP/HTTPS request to sender's root
domain)
Posted by Tom Hendrikx <to...@whyscream.net>.
On 20-03-19 19:56, Mike Marynowski wrote:
>
> A couple people asked about me posting the code/service so they could
> run it on their own systems but I'm currently leaning away from that. I
> don't think there is any benefit to doing that instead of just utilizing
> the centralized service. The whole thing works better if everyone using
> it queries a central service and helps avoid people easily making bad
> mistakes like the one above and then spending hours scrambling to try to
> find non-existent botnet infections on their network while mail bounces
> because they are on a blocklist :( If someone has a good reason for
> making the service locally installable let me know though, haha.
When people are interested in seeing the code, their main incentive for
such a request is probably not that they want to run it themselves. They
might, in no particular order:
- would like to learn from what you're doing
- would like to see how you're treating their contributed data
- would like to verify the listing policy that you're proposing
- would like to study if there could be better criteria for
listing/unlisting than the ones currently available
- would like to change things in the software and contribute that back
for the benefit of everyone
- would like to squash bugs that you might currently be missing
- would like to help out on further development of the service if or
when your time is limited
- would not like to depend on a single person to maintain a service
they like
This is called open source, and it's a good thing. For details on the
philosophy behind it,
http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/ is
a good read.
In short: if you like your project to prosper, put it on github for
everyone to see.
Kind regards,
Tom
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Continuing to fine-tune this service - thank you to everyone testing it.
Some updates were pushed out yesterday:
* Initial new domain "grace period" reduced to 8 minutes (down from 15
mins) - 4 attempts are made within this time to get a valid HTTP response
* Mozilla browser spoofing is implemented to avoid problems with
websites that block HttpClient requests
* Fixes to NXDOMAIN negative result caching appear to be working well now
Some lessons learned in the meantime as well. Turns out that letting the
HTTP test run through an email server IP is a terrible idea, as it will
put the IP on some blocklists for attempting to make HTTP connections to
botnet command & control honeypot servers if someone happens to query
one of those domains, LOL.
A couple people asked about me posting the code/service so they could
run it on their own systems but I'm currently leaning away from that. I
don't think there is any benefit to doing that instead of just utilizing
the centralized service. The whole thing works better if everyone using
it queries a central service, and it helps avoid people easily making bad
mistakes like the one above and then spending hours scrambling to try to
find non-existent botnet infections on their network while mail bounces
because they are on a blocklist :( If someone has a good reason for
making the service locally installable let me know though, haha.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Jari Fredriksson <ja...@iki.fi>.
> Antony Stone <An...@spamassassin.open.source.it> wrote on 13.3.2019 at 20.36:
>
> On Wednesday 13 March 2019 at 19:21:47, Jari Fredriksson wrote:
>
>> What would it result in for this:
>>
>> I have a couple of domains that do not have any services for the root
>> domain name. However, the server the A record points to does have a web
>> server that acts as a reverse proxy for many subdomains, which will be
>> served a web page. An HTTP 503 is returned by the Pound reverse proxy
>> for the root domains.
>
> What is a "pound reverse"?
>
> Antony.
>
>> gladiator:~ jarif$ curl -v http://bitwell.biz
>> * Rebuilt URL to: http://bitwell.biz/
>> * Trying 138.201.119.25...
>> * TCP_NODELAY set
>> * Connected to bitwell.biz (138.201.119.25) port 80 (#0)
>>
>>> GET / HTTP/1.1
>>> Host: bitwell.biz
>>> User-Agent: curl/7.54.0
>>> Accept: */*
>>
>> * HTTP 1.0, assume close after body
>> < HTTP/1.0 503 Service Unavailable
>> < Content-Type: text/html
>> < Content-Length: 53
>> < Expires: now
>> < Pragma: no-cache
>> < Cache-control: no-cache,no-store
>> <
>> * Closing connection 0
>>
>> Br. Jarif
>
Pound reverse proxy. I forgot the "proxy" in that. Pound is a simple but
effective open-source reverse proxy for HTTP(S).
Br. Jarif
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Antony Stone <An...@spamassassin.open.source.it>.
On Wednesday 13 March 2019 at 19:21:47, Jari Fredriksson wrote:
> What would it result in for this:
>
> I have a couple of domains that do not have any services for the root
> domain name. However, the server the A record points to does have a web
> server that acts as a reverse proxy for many subdomains, which will be
> served a web page. An HTTP 503 is returned by the Pound reverse proxy
> for the root domains.
What is a "pound reverse"?
Antony.
> gladiator:~ jarif$ curl -v http://bitwell.biz
> * Rebuilt URL to: http://bitwell.biz/
> * Trying 138.201.119.25...
> * TCP_NODELAY set
> * Connected to bitwell.biz (138.201.119.25) port 80 (#0)
>
> > GET / HTTP/1.1
> > Host: bitwell.biz
> > User-Agent: curl/7.54.0
> > Accept: */*
>
> * HTTP 1.0, assume close after body
> < HTTP/1.0 503 Service Unavailable
> < Content-Type: text/html
> < Content-Length: 53
> < Expires: now
> < Pragma: no-cache
> < Cache-control: no-cache,no-store
> <
> * Closing connection 0
>
> Br. Jarif
--
Numerous psychological studies over the years have demonstrated that the
majority of people genuinely believe they are not like the majority of people.
Please reply to the list;
please *don't* CC me.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Any HTTP status code 400 or higher is treated as no valid website on the
domain. I see a considerable amount of spam that returns 5xx codes so at
this point I don't plan on changing that behavior. 503 is supposed to
indicate a temporary condition so this seems like an abuse of the error
code.
On 3/13/2019 2:21 PM, Jari Fredriksson wrote:
> What would it result in for this:
>
> I have a couple of domains that do not have any services for the root domain name. However, the server the A record points to does have a web server that acts as a reverse proxy for many subdomains, which will be served a web page. An HTTP 503 is returned by the Pound reverse proxy for the root domains.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Jari Fredriksson <ja...@iki.fi>.
What would it result in for this:
I have a couple of domains that do not have any services for the root domain name. However, the server the A record points to does have a web server that acts as a reverse proxy for many subdomains, which will be served a web page. An HTTP 503 is returned by the Pound reverse proxy for the root domains.
gladiator:~ jarif$ curl -v http://bitwell.biz
* Rebuilt URL to: http://bitwell.biz/
* Trying 138.201.119.25...
* TCP_NODELAY set
* Connected to bitwell.biz (138.201.119.25) port 80 (#0)
> GET / HTTP/1.1
> Host: bitwell.biz
> User-Agent: curl/7.54.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
< Content-Type: text/html
< Content-Length: 53
< Expires: now
< Pragma: no-cache
< Cache-control: no-cache,no-store
<
* Closing connection 0
Br. Jarif
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Dominic Raferd <do...@timedicer.co.uk>.
On Wed, 13 Mar 2019 at 13:04, RW <rw...@googlemail.com> wrote:
>
> On Wed, 13 Mar 2019 10:53:06 +0000
> Dominic Raferd wrote:
>
> > On Wed, 13 Mar 2019 at 10:33, Mike Marynowski <mi...@singulink.com>
> > wrote:
> > >
> >
> > For those of us who are not SA experts can you give an example of how
> > to use your helpful new lookup facility (i.e. lines to add in
> > local.cf)? Thanks
>
>
> askdns AUTHOR_IN_HTTPCHECK _AUTHORDOMAIN_.httpcheck.singulink.com A 1
>
> score AUTHOR_IN_HTTPCHECK 0.1 # adjust as appropriate
>
> This assumes that Mail::SpamAssassin::Plugin::AskDNS is loaded, which
> it is by default.
Thanks, giving it a go...
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by RW <rw...@googlemail.com>.
On Wed, 13 Mar 2019 10:53:06 +0000
Dominic Raferd wrote:
> On Wed, 13 Mar 2019 at 10:33, Mike Marynowski <mi...@singulink.com>
> wrote:
> >
>
> For those of us who are not SA experts can you give an example of how
> to use your helpful new lookup facility (i.e. lines to add in
> local.cf)? Thanks
askdns AUTHOR_IN_HTTPCHECK _AUTHORDOMAIN_.httpcheck.singulink.com A 1
score AUTHOR_IN_HTTPCHECK 0.1 # adjust as appropriate
This assumes that Mail::SpamAssassin::Plugin::AskDNS is loaded, which
it is by default.
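Outside SpamAssassin, the same lookup can be done directly. Here is a minimal Python sketch of a client for the zone described in this thread (the zone name is taken from the messages above; the NXDOMAIN-means-valid / 127.0.0.1-means-invalid interpretation follows Mike's description, and the function names are my own):

```python
import socket

def interpret(answer):
    # Per the thread: NXDOMAIN (no answer) => a working website was found;
    # an A record of 127.0.0.1 => no valid HTTP response from the domain.
    return answer is None

def website_exists(domain, zone="httpcheck.singulink.com"):
    """Query the httpcheck zone for `domain` and report whether a working
    website was found. Sketch only - real use should allow a generous
    resolver timeout, as discussed elsewhere in the thread."""
    try:
        answer = socket.gethostbyname(f"{domain}.{zone}")
    except socket.gaierror:
        answer = None  # treated as NXDOMAIN
    return interpret(answer)
```

For example, `website_exists("mail1.mx.google.com")` performs the same lookup as the askdns rule above.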
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Dominic Raferd <do...@timedicer.co.uk>.
On Wed, 13 Mar 2019 at 10:33, Mike Marynowski <mi...@singulink.com> wrote:
>
For those of us who are not SA experts can you give an example of how
to use your helpful new lookup facility (i.e. lines to add in
local.cf)? Thanks
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Back up after some extensive modifications.
Setting the DNS request timeout to 30 seconds is no longer necessary -
the service instantly responds to queries.
In order to prevent mail delivery issues if the website is having
technical issues the first time a domain is seen by the service, it will
instantly return a response that it is a valid domain (NXDOMAIN) with a
15 minute TTL. It will then queue up testing of this domain in the
background and automatically keep retrying every few minutes if HTTP
contact fails. After 15 minutes of failed HTTP contact, the DNS service
will begin responding with an invalid domain response (127.0.0.1),
exponentially increasing TTLs and time between background checks until
it reaches about 17 hours between checks. The service automatically runs
checks in the background for all domains queried within the last 30 days
and instantly responds to DNS queries with the cached result. If a web
server goes down, has technical issues, etc., it will still be reported
as a valid domain for approximately 4 days after the last successful
HTTP contact while continually being rechecked in the background, so
temporary issues won't affect mail delivery.
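The grace-period behaviour described above can be condensed into a small decision function. This is my own illustration of the described policy (names, structure, and the exact constants are assumptions, not the actual service code):

```python
import time

GRACE = 15 * 60          # new-domain grace period: 15 minutes (in seconds)
VALID_FOR = 4 * 86400    # keep reporting valid ~4 days after last HTTP success

def dns_answer(first_seen, last_success, now=None):
    """Return what the DNS service would answer for a cached domain:
    'valid' maps to NXDOMAIN, 'invalid' to a 127.0.0.1 A record."""
    now = time.time() if now is None else now
    if last_success is not None and now - last_success < VALID_FOR:
        return "valid"    # recent successful HTTP contact
    if now - first_seen < GRACE:
        return "valid"    # still within the new-domain grace period
    return "invalid"      # 15+ minutes of failed HTTP contact
```

The point of the design is that a domain only flips to "invalid" once both the grace period and the 4-day validity window have lapsed without a successful HTTP check.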
On 3/11/2019 7:18 PM, RW wrote:
> It doesn't seem to be working. Is it gone?
>
>
>
> $ dig +norecurse @ns1.singulink.com hwvyuprmjpdrws.com.httpcheck.singulink.com
>
> ; <<>> DiG 9.11.0-P5 <<>> +norecurse @ns1.singulink.com hwvyuprmjpdrws.com.httpcheck.singulink.com
> ; (1 server found)
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 57443
> ;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
> ...
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by RW <rw...@googlemail.com>.
On Fri, 1 Mar 2019 01:21:40 -0500
Mike Marynowski wrote:
> For anyone who wants to play around with this, the DNS service has
> been posted. You can test the existence of a website on a domain or
> any of its parent domains by making DNS queries as follows:
>
> subdomain.domain.com.httpcheck.singulink.com
It doesn't seem to be working. Is it gone?
$ dig +norecurse @ns1.singulink.com hwvyuprmjpdrws.com.httpcheck.singulink.com
; <<>> DiG 9.11.0-P5 <<>> +norecurse @ns1.singulink.com hwvyuprmjpdrws.com.httpcheck.singulink.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 57443
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
...
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Andrea Venturoli <ml...@netfence.it>.
On 2019-03-01 07:21, Mike Marynowski wrote:
> For anyone who wants to play around with this, the DNS service has been
> posted. You can test the existence of a website on a domain or any of
> its parent domains by making DNS queries as follows:
>
> subdomain.domain.com.httpcheck.singulink.com
Hello.
I was getting around to testing this, but I can't seem to reach the service.
Is it still active?
bye & Thanks
av.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
For anyone who wants to play around with this, the DNS service has been
posted. You can test the existence of a website on a domain or any of
its parent domains by making DNS queries as follows:
subdomain.domain.com.httpcheck.singulink.com
So, if you wanted to check if mail1.mx.google.com or any of its parent
domains have a website, you would do a DNS query with a 30 second
timeout for:
mail1.mx.google.com.httpcheck.singulink.com
This will check the following domains for a valid HTTP response within
15 seconds:
mail1.mx.google.com
mx.google.com
google.com
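Enumerating that chain of parent domains can be sketched as follows (a naive version: a real implementation would need the Public Suffix List to know where the registrable root actually is, which this sketch ignores):

```python
def domain_chain(hostname):
    """List the hostname and each parent domain, stopping above the TLD.
    Naive: treats the last two labels as the root domain."""
    parts = hostname.split(".")
    return [".".join(parts[i:]) for i in range(len(parts) - 1)]
```

For example, `domain_chain("mail1.mx.google.com")` yields the three domains listed above.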
If a valid HTTP response comes back then the DNS query will return
NXDOMAIN with a 7 day TTL. If no valid HTTP response comes back then the
DNS query will return 127.0.0.1 with progressively increasing TTLs:
#1: 2 mins
#2: 4 mins
#3: 6 mins
#4: 8 mins
#5: 10 mins
#6: 20 mins
#7: 30 mins
#8: 40 mins
#9: 50 mins
#10: 1 hour
#11: 2 hours
#12+: add 2 hours extra for each attempt up to 24h max
As long as an invalid domain has been queried in the last 7 days, it
will remain cached and any further invalid attempts will continue to
progressively increase the TTL according to the rules above. If a domain
doesn't get queried for 7 days then it drops out of the cache and its
invalid attempt counter is reset. A valid HTTP response will reset the
domain's invalid counter and a 7 day TTL is returned. Once a domain is in
the cache, responses are immediate until the TTL runs out and the domain
is rechecked.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Benny Pedersen <me...@junc.eu>.
Ralph Seichter skrev den 2019-02-28 18:53:
> By the way, are you aware of https://www.dnswl.org ?
https://www.mywot.com
https://www.trustpilot.com
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Ralph Seichter <ab...@monksofcool.net>.
* David Jones:
> I would like to see an Open Mail Reputation System setup by a working
> group of big companies so it would have some weight behind it.
Running a smaller business, I have no interest whatsoever in a "group of
big companies" having any say in our mail reputation, as you can surely
understand. All our commercial email passes DKIM, SPF and DMARC tests
anyway.
By the way, are you aware of https://www.dnswl.org ?
-Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by David Jones <dj...@ena.com>.
On 2/28/19 10:50 AM, Ralph Seichter wrote:
> * Mike Marynowski:
>
>> And the cat and mouse game continues :)
>
> It sure does, and that's what sticks in my craw here: For a pro spammer,
> it is easy to set up websites in an automated fashion. If I was such a
> naughty person, I'd just add one tiny service that answers "all is well"
> for every incoming HTTP request.
>
> Why even use a test for something that is so easily compromised?
>
> -Ralph
>
I would like to see an Open Mail Reputation System set up by a working
group of big companies so it would have some weight behind it. Set up
some sort of scale, like 0 to 100, for reputation that starts established
domains older than X days at 50 (in the middle), and then have a clearing
house for spam reports where it takes several different reports from
different sources to lower a domain's score. I am sure some smart
Google engineers or SpamCop.net could do this in their spare time in a
way that can't be abused or poisoned.
Newly registered domains that are less than X days old would start at
zero or 25 and have to earn their increase in score over time. Maybe
every week without a report of spam the score goes up by some increment.
Domains would have to implement good SPF, DKIM, and DMARC to participate
in this reputation system. A postmaster address (maybe the DMARC
reporting email address) would be required with a mail loop verification.
Bounce messages would have a clear/plain message with a link explaining
why the message was bounced (because of a sender problem, not a
recipient mail server problem). Default to opting in to sending copies of
the bounce message to the postmaster address and require mail admins to
opt out if they don't want it. (A major problem in email support today
is not having good contacts for admins on the other end. End users don't
know what to do with bounce messages and mail admins can't easily get
together to work on delivery problems.)
--
David Jones
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Ralph Seichter <ab...@monksofcool.net>.
* Mike Marynowski:
> You know what I mean.
That's quite an assumption to make, in a mailing list. ;-)
> I could just not publish this and keep it for myself and I'm sure that
> would make it more effective long term for me, but I figured I would
> contribute it so that others can gain some benefit from it.
Sounds reasonable to me, as long as such a plugin is not activated as a
SpamAssassin default, and defaults to a low score when activated.
-Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by John Schmerold <sc...@gmail.com>.
Mike: If you want a tester, I am happy to join the effort. I see little
harm in assigning 0.75 to the results.
There are quite a few email-only domains; we end up whitelist_auth'ing
them and all is well.
John Schmerold
Katy Computer Systems, Inc
https://katycomputer.com
St Louis
On 2/28/2019 11:19 AM, Mike Marynowski wrote:
> You know what I mean. *Many (not all) of the rules (rDNS verification,
> hostname check, SPF records, etc) are easy to circumvent but we still
> check all that. Those simple checks still manage to catch a surprising
> amount of spam.
>
> I could just not publish this and keep it for myself and I'm sure that
> would make it more effective long term for me, but I figured I would
> contribute it so that others can gain some benefit from it.
>
> If it doesn't become widespread and SpamAssassin isn't interested in
> embedding it directly into their rule checks then that's fine by me,
> I'm not going to cry about it...more spam catching for me and whoever
> decides to install the plugin on their own servers. If it does become
> widespread and some spammers adapt then I'll take solace in knowing I
> helped a lot of people stop at least some of their spam.
>> * Mike Marynowski:
>>
>>> Everything we test for is easily compromised on its own.
>> That's quite a sweeping statement, and I disagree. IP-based real time
>> blacklists, anyone? Also, "we" is too unspecific. In addition to the
>> stock rules, I happen to maintain a set of custom tests which are
>> neither published nor easily circumvented. They have proven pretty
>> effective for us.
>>
>> -Ralph
>
>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
You know what I mean. *Many (not all) of the rules (rDNS verification,
hostname check, SPF records, etc) are easy to circumvent but we still
check all that. Those simple checks still manage to catch a surprising
amount of spam.
I could just not publish this and keep it for myself and I'm sure that
would make it more effective long term for me, but I figured I would
contribute it so that others can gain some benefit from it.
If it doesn't become widespread and SpamAssassin isn't interested in
embedding it directly into their rule checks then that's fine by me, I'm
not going to cry about it...more spam catching for me and whoever
decides to install the plugin on their own servers. If it does become
widespread and some spammers adapt then I'll take solace in knowing I
helped a lot of people stop at least some of their spam.
> * Mike Marynowski:
>
>> Everything we test for is easily compromised on its own.
> That's quite a sweeping statement, and I disagree. IP-based real time
> blacklists, anyone? Also, "we" is too unspecific. In addition to the
> stock rules, I happen to maintain a set of custom tests which are
> neither published nor easily circumvented. They have proven pretty
> effective for us.
>
> -Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Ralph Seichter <ab...@monksofcool.net>.
* Mike Marynowski:
> Everything we test for is easily compromised on its own.
That's quite a sweeping statement, and I disagree. IP-based real time
blacklists, anyone? Also, "we" is too unspecific. In addition to the
stock rules, I happen to maintain a set of custom tests which are
neither published nor easily circumvented. They have proven pretty
effective for us.
-Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
> Why even use a test for something that is so easily compromised?
> -Ralph
Everything we test for is easily compromised on its own.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Ralph Seichter <ab...@monksofcool.net>.
* Mike Marynowski:
> And the cat and mouse game continues :)
It sure does, and that's what sticks in my craw here: For a pro spammer,
it is easy to set up websites in an automated fashion. If I was such a
naughty person, I'd just add one tiny service that answers "all is well"
for every incoming HTTP request.
Why even use a test for something that is so easily compromised?
-Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
And the cat and mouse game continues :)
That said, all the big obvious "email-only domains" that send out
newsletters and notifications and such that I've come across in my
sampling already have placeholder websites or redirects to their main
websites configured. I'm sure that's not always the case but the data I
have indicates that's the exception and not the rule.
On 2/28/2019 11:37 AM, Ralph Seichter wrote:
> * Antony Stone:
>
>> Each to their own.
> Of course. Alas, if this gets widely adopted, we'll probably have to set
> up placeholder websites (as will spammers, I'm sure).
>
> -Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Ralph Seichter <ab...@monksofcool.net>.
* Antony Stone:
> Each to their own.
Of course. Alas, if this gets widely adopted, we'll probably have to set
up placeholder websites (as will spammers, I'm sure).
-Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Antony Stone <An...@spamassassin.open.source.it>.
On Thursday 28 February 2019 at 17:14:04, Ralph Seichter wrote:
> * Grant Taylor:
> > Why would you do it per email? I would think that you would do the
> > test and cache the results for some amount of time.
>
> I would not do it at all, caching or no caching. Personally, I don't see
> a benefit trying to correlate email with a website, as mentioned before,
> based on how we utilise email-only-domains.
Each to their own.
If a mail admin finds a good correlation between no-website and spam, it's a
good check to add into the mix.
Nothing should be a poison pill in itself, and if you use email-only domains,
you (they) still won't get blocked provided the emails they send don't
otherwise look spammy.
Mike has already said:
On Thursday 28 February 2019 at 15:25:39, Mike Marynowski wrote:
> as a 100% ban rule this is obviously a bad idea. As a score modifier I think
> it would be highly effective.
>
> I found several "email only" domains in my sampling but all the big ones
> still had landing pages at the root domain saying "this domain is only
> used for serving email" or similar. I'm sure there are exceptions and
> some people will have email only domains, but that's why we don't put
> 100% confidence into any one rule.
Personally I'm very interested in such a rule and its real-world effectiveness.
Antony.
--
Tinned food was developed for the British Navy in 1813.
The tin opener was not invented until 1858.
Please reply to the list;
please *don't* CC me.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
I've tested this with good results, and I'm actually not creating any
HTTPS connections - what I've found is that a single HTTP request with
zero redirections is enough. If it returns a status code >= 400 then you
treat it like no valid website, and if you get a < 400 result (i.e. a
301/302 redirect or a 200 OK) then you can treat it like a valid
website. You don't even need to receive the body of the HTTP response;
you can quit after seeing the status.
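A minimal sketch of that single-request check, assuming plain HTTP on port
80, a HEAD request, and no redirects followed (the helper names are mine,
not from any existing plugin):

```python
# Sketch of the check described above: one HTTP request, verdict taken from
# the status line alone (assumption: port 80, HEAD, redirects not followed).
import http.client

def status_indicates_website(status):
    # Anything below 400 - a 2xx success or a 3xx redirect - counts as a
    # valid website; a 4xx/5xx status counts as no website.
    return status < 400

def has_working_website(domain, timeout=5):
    try:
        conn = http.client.HTTPConnection(domain, 80, timeout=timeout)
        conn.request("HEAD", "/")
        status = conn.getresponse().status
        conn.close()
    except (OSError, http.client.HTTPException):
        # A connection failure is treated the same as an error status.
        return False
    return status_indicates_website(status)
```

Since only the status line matters, the response body is never read.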
And yes, as a 100% ban rule this is obviously a bad idea. As a score
modifier I think it would be highly effective.
I found several "email only" domains in my sampling but all the big ones
still had landing pages at the root domain saying "this domain is only
used for serving email" or similar. I'm sure there are exceptions and
some people will have email only domains, but that's why we don't put
100% confidence into any one rule.
On 2/27/2019 7:57 PM, Grant Taylor wrote:
> On 02/27/2019 03:25 PM, Ralph Seichter wrote:
>> We use some of our domains specifically for email, with no associated
>> website.
>
> I agree that /requiring/ a website at one of the parent domains
> (stopping before traversing into the Public Suffix List) is
> problematic and prone to false positives.
>
> There /may/ be some value to /some/ people in doing such a check and
> altering the spam score. (See below.)
>
>> Besides, I think the overhead to establish a HTTPS connection for
>> every incoming email would be prohibitive.
>
> Why would you do it per email? I would think that you would do the
> test and cache the results for some amount of time.
>
>> There is a reason most whitelist/blacklist services use "cheap" DNS
>> queries instead.
> I wonder if there is a way to hack DNS into doing this for us. I.e. a
> custom DNS "server" (BIND's DLZ comes to mind) that can perform the
> test(s) and fabricate an answer that could then be cached. Publish
> these answers in a new zone / domain name, and treat it like another RBL.
>
> Meaning a query goes to the new RBL server, which does the necessary
> $MAGIC to return an answer (possibly NXDOMAIN if there is a site and
> 127.0.0.1 if there is no site) which can be cached by standard local /
> recursive DNS servers.
>
>
>
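The RBL-style lookup Grant sketches could look roughly like this (all names
here are hypothetical: "website-check.example" stands in for whatever zone
such a checking service would publish its answers under):

```python
# Sketch of an RBL-style query for the website check (hypothetical zone).
RBL_ZONE = "website-check.example"

def rbl_query_name(sender_domain):
    # DNSBL convention: prepend the item being checked to the list's zone,
    # so ordinary recursive resolvers can cache each verdict.
    return f"{sender_domain}.{RBL_ZONE}"

def interpret_answer(answer):
    # Answer convention proposed in the thread: NXDOMAIN (modelled here as
    # None) means a website exists; 127.0.0.1 means no website was found.
    if answer is None:
        return "has-website"
    if answer == "127.0.0.1":
        return "no-website"
    return "unknown"
```

The actual DNS round-trip is omitted; the point is that the verdict rides on
standard, cacheable DNS answers, just like any other RBL.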
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
> I would not do it at all, caching or no caching. Personally, I don't see
> a benefit trying to correlate email with a website, as mentioned before,
> based on how we utilise email-only-domains.
>
> -Ralph
Fair enough. Based on the sampling I've done and the way I intend to use
this, I still see this as a net benefit. If you're running an email-only
domain then you're probably doing some pretty email intensive stuff and
you should be well-configured enough to the point where a nudge in the
score shouldn't put you over the spam threshold. If you're a spammer
just trying to make quick use of a domain and the spam score is already
quite high but not quite over the threshold, then this can tip the score
over into marking the message as spam.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Antony Stone <An...@spamassassin.open.source.it>.
On Thursday 28 February 2019 at 20:25:36, Bill Cole wrote:
> On 28 Feb 2019, at 13:43, Mike Marynowski wrote:
> > On 2/28/2019 12:41 PM, Bill Cole wrote:
> >> You should probably put the envelope sender (i.e. the SA
> >> "EnvelopeFrom" pseudo-header) into that list, maybe even first. That
> >> will make many messages sent via discussion mailing lists (such as
> >> this one) pass your test where a test of real header domains would
> >> fail, while it is more likely to cause commercial bulk mail to
> >> fail where it would usually pass based on real standard headers.
> >> (That's based on a hunch, not testing.)
> >
> > Can you clarify why you think my currently proposed headers would fail
> > with the mailing list? As far as I can tell, all the messages I've
> > received from this mailing list would pass just fine. As an example
> > from the emails in this list, which header value specifically would
> > cause it to fail?
>
> If I did not explicitly set the Reply-To header, this message would be
> delivered without one. The domain part of the From header on messages I
> post to this and other mailing lists has no website and never will.
The same applies to my messages as well. I use a list-specific "subdomain" on
all my various list subscription addresses, however unlike Bill, I never set a
Reply-To address, because I expect all list replies to go to the list (which I
then receive as a subscriber).
Any emails which are sent to my list-subscription addresses directly (ie: not
via the mailing list server, which adds its own identifiable headers) are
discarded.
Regards,
Antony.
--
It may not seem obvious, but (6 x 5 + 5) x 5 - 55 equals 5!
Please reply to the list;
please *don't* CC me.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 2/28/19 1:24 PM, Luis E. Muñoz wrote:
> I suggest you look at the Mozilla Public Suffix List at
> https://publicsuffix.org/ — it was created for different purposes, but I
> believe it maps well enough to my understanding of your use case. You'll
> be able to pad the gaps using a custom list.
+1 for Mozilla's PSL.
Also, remember to stop at the domain before the PS(L). (Another message
mentioned co.uk or something like that. That's a PS and shouldn't be
checked.)
--
Grant. . . .
unix || die
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
I'm pretty sure that, the way I ended up implementing it, everything is
working fine and it's nice and simple and clean, but maybe there's some
edge case that doesn't work properly. If there is, I haven't found it
yet, so if you can think of one let me know.
Since I'm sending an HTTP request to all subdomains simultaneously it
doesn't really matter if I go one further than the actual root domain. A
"co.uk" request will come back with no website, so there's no need to
handle it specially. For example, if the email address being tested is
bob@mail1.mx.stuff.co.uk, an HTTP request goes out to:
mail1.mx.stuff.co.uk
mx.stuff.co.uk
stuff.co.uk
co.uk
The last one will always be cached from a previous .co.uk address lookup
so it won't actually be sent out anyway. If any of them respond with a
valid website then an OK result is returned.
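The fan-out described above can be sketched as a simple suffix walk (a
sketch only; the actual implementation fires the HTTP probes concurrently
and caches results):

```python
# Build the list of domains to probe for an email domain, walking from the
# full host down to the last two labels. A public suffix like "co.uk" is
# included deliberately: it will simply never answer with a website.
def candidate_domains(email_domain):
    labels = email_domain.split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]
```

For bob@mail1.mx.stuff.co.uk this yields mail1.mx.stuff.co.uk,
mx.stuff.co.uk, stuff.co.uk and co.uk, matching the list above.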
On 2/28/2019 3:24 PM, Luis E. Muñoz wrote:
> This is more complicated than it seems. I have the t-shirt to prove it.
>
> I suggest you look at the Mozilla Public Suffix List at
> https://publicsuffix.org/ — it was created for different purposes, but
> I believe it maps well enough to my understanding of your use case.
> You'll be able to pad the gaps using a custom list.
>
> Best regards
>
> -lem
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by "Luis E. Muñoz" <sa...@lem.click>.
On 28 Feb 2019, at 11:53, Mike Marynowski wrote:
> There are many ways to determine what the root domain is. One way is
> analyzing the DNS response from the query to realize it's actually a
> root domain, or you can just grab the ICANN TLD list and use that to
> make a determination.
>
> What I'm probably going to do now that I'm building this as a cached
> DNS service is just walk up the subdomains until I hit the root domain
> and if any of them have a website then it's fine.
This is more complicated than it seems. I have the t-shirt to prove it.
I suggest you look at the Mozilla Public Suffix List at
https://publicsuffix.org/ — it was created for different purposes,
but I believe it maps well enough to my understanding of your use case.
You'll be able to pad the gaps using a custom list.
Best regards
-lem
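Luis's PSL suggestion boils down to finding the registrable ("root") domain
one label below the longest matching public suffix. A minimal sketch, with a
tiny hard-coded suffix set standing in for the full list (the real PSL also
has wildcard and exception rules, which is exactly where it gets
complicated):

```python
# Toy stand-in for the Public Suffix List; the real list at publicsuffix.org
# has thousands of entries plus wildcard/exception rules.
PUBLIC_SUFFIXES = {"com", "org", "net", "uk", "co.uk", "org.uk", "net.uk"}

def registrable_domain(hostname):
    labels = hostname.lower().split(".")
    # Scan from the longest suffix to the shortest so "co.uk" wins over "uk".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            if i == 0:
                return None  # the hostname itself is a public suffix
            # Registrable domain = one label below the public suffix.
            return ".".join(labels[i - 1:])
    return None
```

So mail1.mx.stuff.co.uk maps to stuff.co.uk, while co.uk itself yields no
registrable domain at all.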
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
There are many ways to determine what the root domain is. One way is
analyzing the DNS response from the query to realize it's actually a
root domain, or you can just grab the ICANN TLD list and use that to
make a determination.
What I'm probably going to do now that I'm building this as a cached DNS
service is just walk up the subdomains until I hit the root domain and
if any of them have a website then it's fine.
On 2/28/2019 2:39 PM, Antony Stone wrote:
> On Thursday 28 February 2019 at 20:33:42, Mike Marynowski wrote:
>
>> But scconsult.com does in fact have a website so I'm not sure what you
>> mean. This method checks the *root* domain, not the subdomain.
> How do you identify the root domain, given an email address?
>
> For example, for many years in the UK, it was possible to get something.co.uk
> or something.org.uk (and maybe something.net.uk), but now it is also possible
> to get something.uk
>
> So, I'm just wondering how you determine what the "root" domain for a given
> email address is.
>
>
> Antony.
>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 28 Feb 2019, at 14:39, Antony Stone wrote:
> On Thursday 28 February 2019 at 20:33:42, Mike Marynowski wrote:
>
>> But scconsult.com does in fact have a website so I'm not sure what you
>> mean. This method checks the *root* domain, not the subdomain.
>
> How do you identify the root domain, given an email address?
Mail::SpamAssassin::RegistryBoundaries, of course! :)
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Antony Stone <An...@spamassassin.open.source.it>.
On Thursday 28 February 2019 at 20:33:42, Mike Marynowski wrote:
> But scconsult.com does in fact have a website so I'm not sure what you
> mean. This method checks the *root* domain, not the subdomain.
How do you identify the root domain, given an email address?
For example, for many years in the UK, it was possible to get something.co.uk
or something.org.uk (and maybe something.net.uk), but now it is also possible
to get something.uk
So, I'm just wondering how you determine what the "root" domain for a given
email address is.
Antony.
--
"It is easy to be blinded to the essential uselessness of them by the sense of
achievement you get from getting them to work at all. In other words - and
this is the rock solid principle on which the whole of the Corporation's
Galaxy-wide success is founded - their fundamental design flaws are completely
hidden by their superficial design flaws."
- Douglas Noel Adams
Please reply to the list;
please *don't* CC me.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Rupert Gallagher <ru...@protonmail.com>.
On Fri, Mar 1, 2019 at 23:14, Mike Marynowski <mi...@singulink.com> wrote:
>> Does SpamAssassin even have facilities to do that?
> Yes, if spf runs at priority 1, you can define your test at priority 2, so SA executes them in the given order.
>> Don't all rules run all the time?
> They run when relevant, in the given order, and they do what they say, so if you say that webtest stops if spf test succeeds, then SA does it.
>> SpamAssassin still needs to run all the rules because MTAs might have different spam mark / spam delete /etc thresholds than the one set in SA.
>
>> The number of cycles you're talking about is the same as an RBL lookup so I really don't see it as being significant. The DNS service does all the heavy lifting and I'm planning to make it public.
>
> It is significant if you have many emails to process. It is even more significant if you run the test locally.
>
> On 3/1/2019 5:09 PM, Rupert Gallagher wrote:
>
>> Case study:
>>
>> example.com bans any e-mail sent from its third levels up, and does it by spf.
>>
>> spf-banned.example.com sent mail, and my SA at server.com adds a big fat penalty, high enough to bounch it.
>>
>> Suppose I do not bounch it, and use your filter to check for its websites. It turns out that both example.com and spf-banned.example.com have a website. Was it worth it to spend cycles on it? I guess not. The spf is an accepted rfc and it should have priority. So, I recommend the website test to first read the result of the SPF test, quit when positive, continue otherwise.
>>
>> --- ruga
>>
>> On Fri, Mar 1, 2019 at 22:31, Grant Taylor <gt...@tnetconsulting.net> wrote:
>>
>>> On 02/28/2019 09:39 PM, Mike Marynowski wrote:
>>>> I modified it so it checks the root domain and all subdomains up to the
>>>> email domain.
>>>
>>> :-)
>>>
>>>> As for your question - if afraid.org has a website then you are correct,
>>>> all subdomains of afraid.org will not flag this rule, but if lots of
>>>> afraid.org subdomains are sending spam then I imagine other spam
>>>> detection methods will have a good chance of catching it.
>>>
>>> ACK
>>>
>>> afraid.org is much like DynDNS in that one entity (afaid.org themselves
>>> or DynDNS) provide DNS services for other entities.
>>>
>>> I don't see a good way to differentiate between the sets of entities.
>>>
>>>> I'm not sure what you mean by "working up the tree" - if afraid.org has
>>>> a website and I work my way up the tree then either way eventually I'll
>>>> hit afraid.org and get a valid website, no?
>>>
>>> True.
>>>
>>> I wonder if there is any value in detecting zone boundaries via not
>>> going any higher up the tree past the zone that's containing the email
>>> domain(s).
>>>
>>> Perhaps something like that would enable differentiation between Afraid
>>> & DynDNS and the entities that they are hosting DNS services for.
>>> (Assuming that there are separate zones.
>>>
>>>> My current implementation fires off concurrent HTTP requests to the root
>>>> domain and all subdomains up to the email domain and waits for a valid
>>>> answer from any of them.
>>>
>>> ACK
>>>
>>> s/up to/down to/
>>>
>>> I don't grok the value of doing this as well as you do. But I think
>>> your use case is enough different than mine such that I can't make an
>>> objective value estimate.
>>>
>>> That being said, I do find the idea technically interesting, even if I
>>> think I'll not utilize it.
>>>
>>> --
>>> Grant. . . .
>>> unix || die
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by RW <rw...@googlemail.com>.
On Fri, 01 Mar 2019 22:09:01 +0000
Rupert Gallagher wrote:
> Case study:
>
> example.com bans any e-mail sent from its third levels up, and does
> it by spf.
>
> spf-banned.example.com sent mail, and my SA at server.com adds a big
> fat penalty, high enough to bounch it.
example.com has a TXT record of "v=spf1 -all"
spf-banned.example.com has no TXT record at all
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Does SpamAssassin even have facilities to do that? Don't all rules run
all the time? SpamAssassin still needs to run all the rules because MTAs
might have different spam mark / spam delete /etc thresholds than the
one set in SA.
The number of cycles you're talking about is the same as an RBL lookup
so I really don't see it as being significant. The DNS service does all
the heavy lifting and I'm planning to make it public.
On 3/1/2019 5:09 PM, Rupert Gallagher wrote:
> Case study:
>
> example.com bans any e-mail sent from its third levels up, and does it
> by spf.
>
> spf-banned.example.com sent mail, and my SA at server.com adds a big
> fat penalty, high enough to bounch it.
>
> Suppose I do not bounch it, and use your filter to check for its
> websites. It turns out that both example.com and
> spf-banned.example.com have a website. Was it worth it to spend cycles
> on it? I guess not. The spf is an accepted rfc and it should have
> priority. So, I recommend the website test to first read the result of
> the SPF test, quit when positive, continue otherwise.
>
> --- ruga
>
>
>
> On Fri, Mar 1, 2019 at 22:31, Grant Taylor <gtaylor@tnetconsulting.net
> <ma...@tnetconsulting.net>> wrote:
>> On 02/28/2019 09:39 PM, Mike Marynowski wrote:
>> > I modified it so it checks the root domain and all subdomains up to the
>> > email domain.
>>
>> :-)
>>
>> > As for your question - if afraid.org has a website then you are
>> correct,
>> > all subdomains of afraid.org will not flag this rule, but if lots of
>> > afraid.org subdomains are sending spam then I imagine other spam
>> > detection methods will have a good chance of catching it.
>>
>> ACK
>>
>> afraid.org is much like DynDNS in that one entity (afaid.org themselves
>> or DynDNS) provide DNS services for other entities.
>>
>> I don't see a good way to differentiate between the sets of entities.
>>
>> > I'm not sure what you mean by "working up the tree" - if afraid.org has
>> > a website and I work my way up the tree then either way eventually I'll
>> > hit afraid.org and get a valid website, no?
>>
>> True.
>>
>> I wonder if there is any value in detecting zone boundaries via not
>> going any higher up the tree past the zone that's containing the email
>> domain(s).
>>
>> Perhaps something like that would enable differentiation between Afraid
>> & DynDNS and the entities that they are hosting DNS services for.
>> (Assuming that there are separate zones.
>>
>> > My current implementation fires off concurrent HTTP requests to the
>> root
>> > domain and all subdomains up to the email domain and waits for a valid
>> > answer from any of them.
>>
>> ACK
>>
>> s/up to/down to/
>>
>> I don't grok the value of doing this as well as you do. But I think
>> your use case is enough different than mine such that I can't make an
>> objective value estimate.
>>
>> That being said, I do find the idea technically interesting, even if I
>> think I'll not utilize it.
>>
>>
>>
>> --
>> Grant. . . .
>> unix || die
>>
>
>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Rupert Gallagher <ru...@protonmail.com>.
Case study:
example.com bans any e-mail sent from its third levels up, and does it by spf.
spf-banned.example.com sent mail, and my SA at server.com adds a big fat penalty, high enough to bounce it.
Suppose I do not bounce it, and use your filter to check for its websites. It turns out that both example.com and spf-banned.example.com have a website. Was it worth it to spend cycles on it? I guess not. SPF is an accepted RFC and it should have priority. So, I recommend the website test first read the result of the SPF test, quit when positive, and continue otherwise.
--- ruga
On Fri, Mar 1, 2019 at 22:31, Grant Taylor <gt...@tnetconsulting.net> wrote:
> On 02/28/2019 09:39 PM, Mike Marynowski wrote:
>> I modified it so it checks the root domain and all subdomains up to the
>> email domain.
>
> :-)
>
>> As for your question - if afraid.org has a website then you are correct,
>> all subdomains of afraid.org will not flag this rule, but if lots of
>> afraid.org subdomains are sending spam then I imagine other spam
>> detection methods will have a good chance of catching it.
>
> ACK
>
> afraid.org is much like DynDNS in that one entity (afaid.org themselves
> or DynDNS) provide DNS services for other entities.
>
> I don't see a good way to differentiate between the sets of entities.
>
>> I'm not sure what you mean by "working up the tree" - if afraid.org has
>> a website and I work my way up the tree then either way eventually I'll
>> hit afraid.org and get a valid website, no?
>
> True.
>
> I wonder if there is any value in detecting zone boundaries via not
> going any higher up the tree past the zone that's containing the email
> domain(s).
>
> Perhaps something like that would enable differentiation between Afraid
> & DynDNS and the entities that they are hosting DNS services for.
> (Assuming that there are separate zones.
>
>> My current implementation fires off concurrent HTTP requests to the root
>> domain and all subdomains up to the email domain and waits for a valid
>> answer from any of them.
>
> ACK
>
> s/up to/down to/
>
> I don't grok the value of doing this as well as you do. But I think
> your use case is enough different than mine such that I can't make an
> objective value estimate.
>
> That being said, I do find the idea technically interesting, even if I
> think I'll not utilize it.
>
> --
> Grant. . . .
> unix || die
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
On 3/1/2019 4:31 PM, Grant Taylor wrote:
> afraid.org is much like DynDNS in that one entity (afaid.org
> themselves or DynDNS) provide DNS services for other entities.
>
> I don't see a good way to differentiate between the sets of entities.
I haven't come across any notable amount of spam that's punched through
all the other detection methods in place with a reply-to/from email
address subdomain on a service like that. I'm sure it happens though and
in that case this filter simply won't add any value.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 02/28/2019 09:39 PM, Mike Marynowski wrote:
> I modified it so it checks the root domain and all subdomains up to the
> email domain.
:-)
> As for your question - if afraid.org has a website then you are correct,
> all subdomains of afraid.org will not flag this rule, but if lots of
> afraid.org subdomains are sending spam then I imagine other spam
> detection methods will have a good chance of catching it.
ACK
afraid.org is much like DynDNS in that one entity (afraid.org themselves
or DynDNS) provides DNS services for other entities.
I don't see a good way to differentiate between the sets of entities.
> I'm not sure what you mean by "working up the tree" - if afraid.org has
> a website and I work my way up the tree then either way eventually I'll
> hit afraid.org and get a valid website, no?
True.
I wonder if there is any value in detecting zone boundaries via not
going any higher up the tree past the zone that's containing the email
domain(s).
Perhaps something like that would enable differentiation between Afraid
& DynDNS and the entities that they are hosting DNS services for.
(Assuming that there are separate zones.)
> My current implementation fires off concurrent HTTP requests to the root
> domain and all subdomains up to the email domain and waits for a valid
> answer from any of them.
ACK
s/up to/down to/
I don't grok the value of doing this as well as you do. But I think
your use case is enough different than mine such that I can't make an
objective value estimate.
That being said, I do find the idea technically interesting, even if I
think I'll not utilize it.
--
Grant. . . .
unix || die
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
I modified it so it checks the root domain and all subdomains up to the
email domain.
As for your question - if afraid.org has a website then you are correct,
all subdomains of afraid.org will not flag this rule, but if lots of
afraid.org subdomains are sending spam then I imagine other spam
detection methods will have a good chance of catching it.
I'm not sure what you mean by "working up the tree" - if afraid.org has
a website and I work my way up the tree then either way eventually I'll
hit afraid.org and get a valid website, no?
My current implementation fires off concurrent HTTP requests to the root
domain and all subdomains up to the email domain and waits for a valid
answer from any of them.
On 2/28/2019 10:27 PM, Grant Taylor wrote:
> What about domains that have many client subdomains?
>
> afraid.org (et al) come to mind.
>
> You might end up allowing email from spammer.afraid.org who doesn't
> have a website because the parent afraid.org does have a website.
>
> I would think that checking from the child and working up the tree
> would be more accurate, even if it may take longer.
>
>
>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 2/28/19 12:33 PM, Mike Marynowski wrote:
> This method checks the *root* domain, not the subdomain.
What about domains that have many client subdomains?
afraid.org (et al) come to mind.
You might end up allowing email from spammer.afraid.org who doesn't have
a website because the parent afraid.org does have a website.
I would think that checking from the child and working up the tree would
be more accurate, even if it may take longer.
--
Grant. . . .
unix || die
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 28 Feb 2019, at 14:33, Mike Marynowski wrote:
> But scconsult.com does in fact have a website so I'm not sure what you
> mean. This method checks the *root* domain, not the subdomain.
Ah, I see. I had missed that detail.
That's likely to have fewer issues, as long as you get the registry
boundary correct. SA actually helps with that: see
Mail::SpamAssassin::RegistryBoundaries.
> Even if this wasn't the case well, it is what it is. Emails from this
> mailing list (and most well configured lists) come in at a spam score
> of -6, so they are no risk of being blocked even if a non-website
> domain triggers this particular rule.
>
> On 2/28/2019 2:25 PM, Bill Cole wrote:
>> On 28 Feb 2019, at 13:43, Mike Marynowski wrote:
>>
>>> On 2/28/2019 12:41 PM, Bill Cole wrote:
>>>> You should probably put the envelope sender (i.e. the SA
>>>> "EnvelopeFrom" pseudo-header) into that list, maybe even first.
>>>> That will make many messages sent via discussion mailing lists
>>>> (such as this one) pass your test where a test of real header
>>>> domains would fail, while it is more likely to cause commercial
>>>> bulk mail to fail where it would usually pass based on real
>>>> standard headers. (That's based on a hunch, not testing.)
>>> Can you clarify why you think my currently proposed headers would
>>> fail with the mailing list? As far as I can tell, all the messages
>>> I've received from this mailing list would pass just fine. As an
>>> example from the emails in this list, which header value
>>> specifically would cause it to fail?
>>
>> If I did not explicitly set the Reply-To header, this message would
>> be delivered without one. The domain part of the From header on
>> messages I post to this and other mailing lists has no website and
>> never will.
>>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
But scconsult.com does in fact have a website so I'm not sure what you
mean. This method checks the *root* domain, not the subdomain.
Even if this wasn't the case, well, it is what it is. Emails from this
mailing list (and most well-configured lists) come in at a spam score of
-6, so there is no risk of being blocked even if a non-website domain
triggers this particular rule.
On 2/28/2019 2:25 PM, Bill Cole wrote:
> On 28 Feb 2019, at 13:43, Mike Marynowski wrote:
>
>> On 2/28/2019 12:41 PM, Bill Cole wrote:
>>> You should probably put the envelope sender (i.e. the SA
>>> "EnvelopeFrom" pseudo-header) into that list, maybe even first. That
>>> will make many messages sent via discussion mailing lists (such as
>>> this one) pass your test where a test of real header domains would
>>> fail, while it is more likely to cause commercial bulk mail to
>>> fail where it would usually pass based on real standard headers.
>>> (That's based on a hunch, not testing.)
>> Can you clarify why you think my currently proposed headers would
>> fail with the mailing list? As far as I can tell, all the messages
>> I've received from this mailing list would pass just fine. As an
>> example from the emails in this list, which header value specifically
>> would cause it to fail?
>
> If I did not explicitly set the Reply-To header, this message would be
> delivered without one. The domain part of the From header on messages
> I post to this and other mailing lists has no website and never will.
>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 28 Feb 2019, at 13:43, Mike Marynowski wrote:
> On 2/28/2019 12:41 PM, Bill Cole wrote:
>> You should probably put the envelope sender (i.e. the SA
>> "EnvelopeFrom" pseudo-header) into that list, maybe even first. That
>> will make many messages sent via discussion mailing lists (such as
>> this one) pass your test where a test of real header domains would
>> fail, while it is more likely to cause commercial bulk mail to
>> fail where it would usually pass based on real standard headers.
>> (That's based on a hunch, not testing.)
> Can you clarify why you think my currently proposed headers would fail
> with the mailing list? As far as I can tell, all the messages I've
> received from this mailing list would pass just fine. As an example
> from the emails in this list, which header value specifically would
> cause it to fail?
If I did not explicitly set the Reply-To header, this message would be
delivered without one. The domain part of the From header on messages I
post to this and other mailing lists has no website and never will.
--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
On 2/28/2019 12:41 PM, Bill Cole wrote:
> You should probably put the envelope sender (i.e. the SA
> "EnvelopeFrom" pseudo-header) into that list, maybe even first. That
> will make many messages sent via discussion mailing lists (such as
> this one) pass your test where a test of real header domains would
> fail, while it is more likely to cause commercial bulk mail to fail
> where it would usually pass based on real standard headers. (That's
> based on a hunch, not testing.)
Can you clarify why you think my currently proposed headers would fail
with the mailing list? As far as I can tell, all the messages I've
received from this mailing list would pass just fine. As an example from
the emails in this list, which header value specifically would cause it
to fail?
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Rupert Gallagher <ru...@protonmail.com>.
The focus was on the To header for mailing lists, complaints about MUAs, and people's choices. If you do not want to appear in the To header of a list, you are exercising a legal right under the GDPR. So, to cut through all those problems and enforce a sound solution, I suggest list majordomos do the compliance heavy lifting by forcing a sane To header. That's all. If you want to talk more generally about the GDPR, I do it every day, so leave me alone on weekends, will you? :-)
On Fri, Mar 1, 2019 at 22:41, Grant Taylor <gt...@tnetconsulting.net> wrote:
> On 03/01/2019 01:25 AM, Rupert Gallagher wrote:
>> A future-proof list that complies with GDPR would automatically rewrite
>> the To header, leaving the list address only.
>
> Doesn't GDPR also include things like signatures? Thus if the mailing
> list is only modifying the email metadata and not the message body (thus
> signature), then it's still subject to GDPR.
>
> I also feel like it is a disservice to the mailing list to hide who the
> message is from. But I have no idea of the legalities of (not) doing such.
>
>> Any other recipient will still receive it from the original sender.
>
> I presume you're talking about (B)CC and additional To recipients.
>
> I never did hear, how does GDPR play out in such a scenario. Does the
> sender need to make a request to all To / (B)CC recipients for them to
> forget the sender? Also, does the mailing list operator have any
> responsibility to pass the request on to all subscribers to purge the
> requester from their personal archives? I feel like there's a LOT of
> unaddressed issues here, and that singling out the mailing list is
> somewhat unfair. But life's unfair. So … ¯\_(ツ)_/¯
>
> --
> Grant. . . .
> unix || die
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 03/01/2019 01:25 AM, Rupert Gallagher wrote:
> A future-proof list that complies with GDPR would automatically rewrite
> the To header, leaving the list address only.
Doesn't GDPR also include things like signatures? Thus if the mailing
list is only modifying the email metadata and not the message body (thus
signature), then it's still subject to GDPR.
I also feel like it is a disservice to the mailing list to hide who the
message is from. But I have no idea of the legalities of (not) doing such.
> Any other recipient will still receive it from the original sender.
I presume you're talking about (B)CC and additional To recipients.
I never did hear, how does GDPR play out in such a scenario. Does the
sender need to make a request to all To / (B)CC recipients for them to
forget the sender? Also, does the mailing list operator have any
responsibility to pass the request on to all subscribers to purge the
requester from their personal archives? I feel like there's a LOT of
unaddressed issues here, and that singling out the mailing list is
somewhat unfair. But life's unfair. So … ¯\_(ツ)_/¯
--
Grant. . . .
unix || die
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Rupert Gallagher <ru...@protonmail.com>.
A future-proof list that complies with GDPR would automatically rewrite the To header, leaving the list address only. Any other recipient will still receive it from the original sender.
On Thu, Feb 28, 2019 at 20:29, Mike Marynowski <mi...@singulink.com> wrote:
> Unfortunately I don't see a reply-to header on your messages. What do
> you have it set to? I thought mailing lists see who is in the "to"
> section of a reply so that 2 copies aren't sent out. The "mailing list
> ethics" guide I read said to always use "reply all" and the mailing list
> system takes care of not sending duplicate replies.
>
> I removed your direct email from this reply and only kept the mailing
> list address, but for the record I don't see any reply-to headers.
>
> On 2/28/2019 2:21 PM, Bill Cole wrote:
>> Please respect my consciously set Reply-To header. I don't ever need 2
>> copies of a message posted to a mailing list, and ignoring that header
>> is rude.
>>
>> On 28 Feb 2019, at 13:28, Mike Marynowski wrote:
>>
>>> On 2/28/2019 12:41 PM, Bill Cole wrote:
>>>> You should probably put the envelope sender (i.e. the SA
>>>> "EnvelopeFrom" pseudo-header) into that list, maybe even first. That
>>>> will make many messages sent via discussion mailing lists (such as
>>>> this one) pass your test where a test of real header domains would
>>>> fail, while it is more likely to cause commercial bulk mail to
>>>> fail where it would usually pass based on real standard headers.
>>>> (That's based on a hunch, not testing.)
>>>
>>> Hmmm. I'll have to give some more thought to the exact headers it
>>> decides to test. I'm not sure if my MTA puts envelope info into
>>> the SA request or not. For sake of simplicity right now I might just
>>> ignore mailing lists, I don't know. What I do know is that in the
>>> spam messages I'm reviewing right now, the reply-to / from headers
>>> set often don't have websites at those domains and none of them are
>>> masquerading as mailing lists. I haven't thought through the
>>> situation with mailing lists yet.
>>>
>>> I'm new to this whole SA plugin dev process - can you suggest the
>>> best way to log the full requests that SA receives so I can see what
>>> info it is getting and what I have to work with?
>>
>> The best way to see far too much information about what SA is doing is
>> to add a "-D all" to the invocation of the spamassassin script. You
>> can also add that to the flags used by spamd, if you want to punish
>> your logging subsystem.
>>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 28 Feb 2019, at 21:10, Mike Marynowski wrote:
> Thunderbird normally shows reply-to in normal messages...is this
> something that some MUAs ignore just on mailing list emails or all
> emails?
I cannot keep track of all the irrational things done by all MUAs. I'm
not even surprised by anything new and wrong I see any more.
It is certainly POSSIBLE that because *some* mailing lists impose a
fixed Reply-To, some MUAs ignore it on list messages. It is certain that
some just ignore it no matter what because they are written by fools.
Also, some people configure their MUA to "reply all" by default all the
time, overriding any normal Reply-To support.
> Because I see reply-to on plenty of other emails.
You can see it for yourself in the raw message at the list archive:
http://mail-archives.apache.org/mod_mbox/spamassassin-users/201902.mbox/raw/%3cE54C4C01-CFCC-430B-8A52-6D8D866DB038@billmail.scconsult.com%3e
>
> On 2/28/2019 3:44 PM, Bill Cole wrote:
>> On 28 Feb 2019, at 14:29, Mike Marynowski wrote:
>>
>>> Unfortunately I don't see a reply-to header on your messages. What
>>> do you have it set to? I thought mailing lists see who is in the
>>> "to" section of a reply so that 2 copies aren't sent out. The
>>> "mailing list ethics" guide I read said to always use "reply all"
>>> and the mailing list system takes care of not sending duplicate
>>> replies.
>>>
>>> I removed your direct email from this reply and only kept the
>>> mailing list address, but for the record I don't see any reply-to
>>> headers.
>>
>> But it's right there in the copy that the list delivered to me:
>>
>> From: "Bill Cole" <sa...@billmail.scconsult.com>
>> To: users@spamassassin.apache.org
>> Subject: Re: Spam rule for HTTP/HTTPS request to sender's
>> root domain
>> Date: Thu, 28 Feb 2019 14:21:41 -0500
>> Reply-To: users@spamassassin.apache.org
>>
>> Whether you see it is a function of how your MUA (TBird, it seems...
>> ) displays messages. Unfortunately, it has become common for MUAs to
>> simply ignore Reply-To. I didn't think TBird was in that class.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Thunderbird normally shows reply-to in normal messages...is this
something that some MUAs ignore just on mailing list emails or all
emails? Because I see reply-to on plenty of other emails.
On 2/28/2019 3:44 PM, Bill Cole wrote:
> On 28 Feb 2019, at 14:29, Mike Marynowski wrote:
>
>> Unfortunately I don't see a reply-to header on your messages. What do
>> you have it set to? I thought mailing lists see who is in the "to"
>> section of a reply so that 2 copies aren't sent out. The "mailing
>> list ethics" guide I read said to always use "reply all" and the
>> mailing list system takes care of not sending duplicate replies.
>>
>> I removed your direct email from this reply and only kept the mailing
>> list address, but for the record I don't see any reply-to headers.
>
> But it's right there in the copy that the list delivered to me:
>
> From: "Bill Cole" <sa...@billmail.scconsult.com>
> To: users@spamassassin.apache.org
> Subject: Re: Spam rule for HTTP/HTTPS request to sender's root domain
> Date: Thu, 28 Feb 2019 14:21:41 -0500
> Reply-To: users@spamassassin.apache.org
>
> Whether you see it is a function of how your MUA (TBird, it seems... )
> displays messages. Unfortunately, it has become common for MUAs to
> simply ignore Reply-To. I didn't think TBird was in that class.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 28 Feb 2019, at 14:29, Mike Marynowski wrote:
> Unfortunately I don't see a reply-to header on your messages. What do
> you have it set to? I thought mailing lists see who is in the "to"
> section of a reply so that 2 copies aren't sent out. The "mailing list
> ethics" guide I read said to always use "reply all" and the mailing
> list system takes care of not sending duplicate replies.
>
> I removed your direct email from this reply and only kept the mailing
> list address, but for the record I don't see any reply-to headers.
But it's right there in the copy that the list delivered to me:
From: "Bill Cole" <sa...@billmail.scconsult.com>
To: users@spamassassin.apache.org
Subject: Re: Spam rule for HTTP/HTTPS request to sender's root domain
Date: Thu, 28 Feb 2019 14:21:41 -0500
Reply-To: users@spamassassin.apache.org
Whether you see it is a function of how your MUA (TBird, it seems... )
displays messages. Unfortunately, it has become common for MUAs to
simply ignore Reply-To. I didn't think TBird was in that class.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Unfortunately I don't see a reply-to header on your messages. What do
you have it set to? I thought mailing lists see who is in the "to"
section of a reply so that 2 copies aren't sent out. The "mailing list
ethics" guide I read said to always use "reply all" and the mailing list
system takes care of not sending duplicate replies.
I removed your direct email from this reply and only kept the mailing
list address, but for the record I don't see any reply-to headers.
On 2/28/2019 2:21 PM, Bill Cole wrote:
> Please respect my consciously set Reply-To header. I don't ever need 2
> copies of a message posted to a mailing list, and ignoring that header
> is rude.
>
> On 28 Feb 2019, at 13:28, Mike Marynowski wrote:
>
>> On 2/28/2019 12:41 PM, Bill Cole wrote:
>>> You should probably put the envelope sender (i.e. the SA
>>> "EnvelopeFrom" pseudo-header) into that list, maybe even first. That
>>> will make many messages sent via discussion mailing lists (such as
>>> this one) pass your test where a test of real header domains would
>>> fail, while it is more likely to cause commercial bulk mail to
>>> fail where it would usually pass based on real standard headers.
>>> (That's based on a hunch, not testing.)
>>
>> Hmmm. I'll have to give some more thought to the exact headers it
>> decides to test. I'm not sure if my MTA puts envelope info into
>> the SA request or not. For sake of simplicity right now I might just
>> ignore mailing lists, I don't know. What I do know is that in the
>> spam messages I'm reviewing right now, the reply-to / from headers
>> set often don't have websites at those domains and none of them are
>> masquerading as mailing lists. I haven't thought through the
>> situation with mailing lists yet.
>>
>> I'm new to this whole SA plugin dev process - can you suggest the
>> best way to log the full requests that SA receives so I can see what
>> info it is getting and what I have to work with?
>
> The best way to see far too much information about what SA is doing is
> to add a "-D all" to the invocation of the spamassassin script. You
> can also add that to the flags used by spamd, if you want to punish
> your logging subsystem.
>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Bill Cole <sa...@billmail.scconsult.com>.
Please respect my consciously set Reply-To header. I don't ever need 2
copies of a message posted to a mailing list, and ignoring that header
is rude.
On 28 Feb 2019, at 13:28, Mike Marynowski wrote:
> On 2/28/2019 12:41 PM, Bill Cole wrote:
>> You should probably put the envelope sender (i.e. the SA
>> "EnvelopeFrom" pseudo-header) into that list, maybe even first. That
>> will make many messages sent via discussion mailing lists (such as
>> this one) pass your test where a test of real header domains would
>> fail, while it is more likely to cause commercial bulk mail to
>> fail where it would usually pass based on real standard headers.
>> (That's based on a hunch, not testing.)
>
> Hmmm. I'll have to give some more thought to the exact headers it
> decides to test. I'm not sure if my MTA puts envelope info into the
> SA request or not. For sake of simplicity right now I might just
> ignore mailing lists, I don't know. What I do know is that in the spam
> messages I'm reviewing right now, the reply-to / from headers set
> often don't have websites at those domains and none of them are
> masquerading as mailing lists. I haven't thought through the situation
> with mailing lists yet.
>
> I'm new to this whole SA plugin dev process - can you suggest the best
> way to log the full requests that SA receives so I can see what info
> it is getting and what I have to work with?
The best way to see far too much information about what SA is doing is
to add a "-D all" to the invocation of the spamassassin script. You can
also add that to the flags used by spamd, if you want to punish your
logging subsystem.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
On 2/28/2019 12:41 PM, Bill Cole wrote:
> You should probably put the envelope sender (i.e. the SA
> "EnvelopeFrom" pseudo-header) into that list, maybe even first. That
> will make many messages sent via discussion mailing lists (such as
> this one) pass your test where a test of real header domains would
> fail, while it is more likely to cause commercial bulk mail to fail
> where it would usually pass based on real standard headers. (That's
> based on a hunch, not testing.)
Hmmm. I'll have to give some more thought to the exact headers it
decides to test. I'm not sure if my MTA puts envelope info into the
SA request or not. For sake of simplicity right now I might just ignore
mailing lists, I don't know. What I do know is that in the spam messages
I'm reviewing right now, the reply-to / from headers set often don't
have websites at those domains and none of them are masquerading as
mailing lists. I haven't thought through the situation with mailing
lists yet.
I'm new to this whole SA plugin dev process - can you suggest the best
way to log the full requests that SA receives so I can see what info it
is getting and what I have to work with?
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 28 Feb 2019, at 11:33, Mike Marynowski wrote:
> Question though - what is your reply-to address set to in the emails
> coming from your email-only domain?
I can't answer for Ralph, but in my case I use a mail-only domain in
From for most of my personal mail, and while I usually set Reply-To to
list submission addresses when posting to a mailing list (because some
mail clients honor it...) I NEVER have a Reply-To on non-list mail. For
mailing lists I administer, aside from targeted DMARC workarounds
affecting a small subset of members, there is also no Reply-To forced.
Users can set it as they like. Note that for mailing lists, the From
header domain normally doesn't match the envelope sender domain.
> The domain checking I'm doing grabs the first available address in
> this order: reply-to, from, sender. It's not using the domain of the
> SMTP server. I did come across some email-only domain SENDERS in my
> sampling, but the overwhelming majority of reply-to addresses pointed
> to emails with HTTP servers on their domains.
You should probably put the envelope sender (i.e. the SA "EnvelopeFrom"
pseudo-header) into that list, maybe even first. That will make many
messages sent via discussion mailing lists (such as this one) pass your
test where a test of real header domains would fail, while it is more
likely to cause commercial bulk mail to fail where it would usually pass
based on real standard headers. (That's based on a hunch, not testing.)
> On 2/28/2019 11:14 AM, Ralph Seichter wrote:
>> * Grant Taylor:
>>
>>> Why would you do it per email? I would think that you would do the
>>> test and cache the results for some amount of time.
>> I would not do it at all, caching or no caching. Personally, I don't
>> see
>> a benefit trying to correlate email with a website, as mentioned
>> before,
>> based on how we utilise email-only-domains.
>>
>> -Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Ralph Seichter <ab...@monksofcool.net>.
* Mike Marynowski:
> Question though - what is your reply-to address set to in the emails
> coming from your email-only domain?
We very rarely inject Reply-To, because this might interfere with what
the original sender intended.
-Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
You'll be able to decide how you want to prioritize the fields - I've
implemented it as a DNS server, so which domain you decide to send to
the DNS server is entirely up to you.
On 2/28/2019 10:23 PM, Grant Taylor wrote:
> On 2/28/19 9:33 AM, Mike Marynowski wrote:
>> I'm doing grabs the first available address in this order: reply-to,
>> from, sender.
>
> That sounds like it might be possible to game things by playing with
> the order.
>
> I'm not sure what sorts of validations are applied to the Sender:
> header. (I don't remember if DMARC checks the Sender: header or not.)
>
> How would your filter respond if the MAIL FROM: and the From: header
> were set to something that didn't have a website, yet had a Sender:
> header with <something>@gmail.com listed before the Reply-To: and
> From: headers?
>
>
>
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 2/28/19 9:33 AM, Mike Marynowski wrote:
> I'm doing grabs the first available address in this order: reply-to,
> from, sender.
That sounds like it might be possible to game things by playing with the
order.
I'm not sure what sorts of validations are applied to the Sender:
header. (I don't remember if DMARC checks the Sender: header or not.)
How would your filter respond if the MAIL FROM: and the From: header
were set to something that didn't have a website, yet had a Sender:
header with <something>@gmail.com listed before the Reply-To: and From:
headers?
--
Grant. . . .
unix || die
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Question though - what is your reply-to address set to in the emails
coming from your email-only domain?
The domain checking I'm doing grabs the first available address in this
order: reply-to, from, sender. It's not using the domain of the SMTP
server. I did come across some email-only domain SENDERS in my sampling,
but the overwhelming majority of reply-to addresses pointed to emails
with HTTP servers on their domains.
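The reply-to / from / sender fallback described above could be sketched like this (a hedged illustration using Python's stdlib email parsing; the function name and exact fallback behavior are assumptions, not the actual plugin code):

```python
import email
from email.utils import parseaddr

def sender_domain(msg):
    """Return the domain of the first address found, in priority order."""
    for header in ("Reply-To", "From", "Sender"):
        addr = parseaddr(msg.get(header, ""))[1]
        if "@" in addr:
            return addr.rsplit("@", 1)[1].lower()
    return None  # no usable address header at all

raw = "From: Alice <alice@example.com>\r\nSubject: hi\r\n\r\nbody"
print(sender_domain(email.message_from_string(raw)))  # example.com
```

A Reply-To header, when present, wins over From and Sender, which is what makes the mailing-list case discussed in this thread interesting.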
On 2/28/2019 11:14 AM, Ralph Seichter wrote:
> * Grant Taylor:
>
>> Why would you do it per email? I would think that you would do the
>> test and cache the results for some amount of time.
> I would not do it at all, caching or no caching. Personally, I don't see
> a benefit trying to correlate email with a website, as mentioned before,
> based on how we utilise email-only-domains.
>
> -Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Ralph Seichter <ab...@monksofcool.net>.
* Grant Taylor:
> Why would you do it per email? I would think that you would do the
> test and cache the results for some amount of time.
I would not do it at all, caching or no caching. Personally, I don't see
a benefit trying to correlate email with a website, as mentioned before,
based on how we utilise email-only-domains.
-Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 02/27/2019 03:25 PM, Ralph Seichter wrote:
> We use some of our domains specifically for email, with no associated
> website.
I agree that /requiring/ a website at one of the parent domains
(stopping before traversing into the Public Suffix List) is problematic
and prone to false positives.
There /may/ be some value to /some/ people in doing such a check and
altering the spam score. (See below.)
> Besides, I think the overhead to establish a HTTPS connection for
> every incoming email would be prohibitive.
Why would you do it per email? I would think that you would do the test
and cache the results for some amount of time.
> There is a reason most whitelist/blacklist services use "cheap" DNS
> queries instead.
I wonder if there is a way to hack DNS into doing this for us. I.e. a
custom DNS "server" (BIND's DLZ comes to mind) that can perform the
test(s) and fabricate an answer that could then be cached. Publish
these answers in a new zone / domain name, and treat it like another RBL.
Meaning a query goes to the new RBL server, which does the necessary
$MAGIC to return an answer (possibly NXDOMAIN if there is a site and
127.0.0.1 if there is no site) which can be cached by standard local /
recursive DNS servers.
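The client side of that RBL-style scheme might look like this (a hedged sketch; the zone name is a placeholder and the NXDOMAIN / 127.0.0.1 convention follows the suggestion above, not any deployed service):

```python
NO_SITE = "127.0.0.1"  # A record the hypothetical service returns when no website exists

def rbl_query_name(domain, zone="httpcheck.example.net"):
    # RBL-style lookup name: the checked domain prepended to the service zone.
    return f"{domain}.{zone}"

def interpret(answer):
    """Map a resolver result to the rule outcome.

    answer is None for NXDOMAIN (a website exists, so the rule does not fire),
    or the A record string returned by the service.
    """
    return answer == NO_SITE  # True => no website => spam rule fires

print(rbl_query_name("spammy.example"))  # spammy.example.httpcheck.example.net
```

Because the answers are ordinary DNS records, every recursive resolver between SpamAssassin and the service caches them for free.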
--
Grant. . . .
unix || die
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Andrea Venturoli <ml...@netfence.it>.
On 2/28/19 3:40 PM, Mike Marynowski wrote:
> Right now the test plugin I've built makes a single HTTP request for
> each email while I evaluate this but I'll be building a DNS query
> endpoint or a local domain cache to make it more efficient before
> putting it into production.
Please keep us updated: I love the idea.
bye & Thanks
av.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Just one more note - I've excluded .email domains from the check as I've
noticed several organizations using them as email-only domains.
Right now the test plugin I've built makes a single HTTP request for
each email while I evaluate this but I'll be building a DNS query
endpoint or a local domain cache to make it more efficient before
putting it into production.
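A minimal version of that per-message probe might look like this (a hedged sketch, not the actual test plugin; the HTTPS-then-HTTP order and 10-second timeout are assumptions, and the fetcher is injectable so the logic can be exercised without the network):

```python
from urllib.error import URLError
from urllib.request import urlopen

def candidate_urls(root_domain):
    # Probe HTTPS first, then plain HTTP, at the registrar-boundary domain.
    return [f"https://{root_domain}/", f"http://{root_domain}/"]

def default_fetch(url):
    # Any 2xx/3xx response counts as "working"; 4xx/5xx raise HTTPError.
    with urlopen(url, timeout=10):
        return True

def has_website(root_domain, fetch=default_fetch):
    """Return True if any probe succeeds; connection failures count as no site."""
    for url in candidate_urls(root_domain):
        try:
            if fetch(url):
                return True
        except (URLError, OSError):
            pass
    return False
```

In production this result would be cached (or served via DNS, as discussed elsewhere in the thread) rather than probed per message.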
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Ralph Seichter <ab...@monksofcool.net>.
* Mike Marynowski:
> Of the 100 last legitimate email domains that have sent me mail, 100%
> of them have working websites at the root domain.
We use some of our domains specifically for email, with no associated
website. Besides, I think the overhead to establish a HTTPS connection
for every incoming email would be prohibitive. There is a reason most
whitelist/blacklist services use "cheap" DNS queries instead.
-Ralph
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Thank you! I have no idea how I missed that...
On 3/13/2019 7:11 PM, RW wrote:
> On Wed, 13 Mar 2019 17:40:57 -0400
> Mike Marynowski wrote:
>
>> Can someone help me form the correct SOA record in my DNS responses
>> to ensure the NXDOMAIN responses get cached properly? Based on the
>> logs I don't think downstream DNS servers are caching it as requests
>> for the same valid HTTP domains keep hitting the service instead of
>> being cached for 4 days.
> ...
>> Based on random sampling of responses from other DNS servers this
>> seems correct to me. Nothing I'm reading indicates that TTL factors
>> into the negative caching but is it possible servers are only caching
>> the negative response for 15 mins because of the TTL on the SOA
>> record, using the smaller value between that and the default TTL?
> I believe so, from RFC 2308:
>
> 3 - Negative Answers from Authoritative Servers
>
> Name servers authoritative for a zone MUST include the SOA record of
> the zone in the authority section of the response when reporting an
> NXDOMAIN or indicating that no data of the requested type exists.
> This is required so that the response may be cached. The TTL of this
> record is set from the minimum of the MINIMUM field of the SOA record
> and the TTL of the SOA itself, and indicates how long a resolver may
> cache the negative answer.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by RW <rw...@googlemail.com>.
On Wed, 13 Mar 2019 17:40:57 -0400
Mike Marynowski wrote:
> Can someone help me form the correct SOA record in my DNS responses
> to ensure the NXDOMAIN responses get cached properly? Based on the
> logs I don't think downstream DNS servers are caching it as requests
> for the same valid HTTP domains keep hitting the service instead of
> being cached for 4 days.
...
> Based on random sampling of responses from other DNS servers this
> seems correct to me. Nothing I'm reading indicates that TTL factors
> into the negative caching but is it possible servers are only caching
> the negative response for 15 mins because of the TTL on the SOA
> record, using the smaller value between that and the default TTL?
I believe so, from RFC 2308:
3 - Negative Answers from Authoritative Servers
Name servers authoritative for a zone MUST include the SOA record of
the zone in the authority section of the response when reporting an
NXDOMAIN or indicating that no data of the requested type exists.
This is required so that the response may be cached. The TTL of this
record is set from the minimum of the MINIMUM field of the SOA record
and the TTL of the SOA itself, and indicates how long a resolver may
cache the negative answer.
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Can someone help me form the correct SOA record in my DNS responses to
ensure the NXDOMAIN responses get cached properly? Based on the logs I
don't think downstream DNS servers are caching it as requests for the
same valid HTTP domains keep hitting the service instead of being cached
for 4 days.
From what I understand, if you want to cache an NXDOMAIN response then
you need to include an SOA record with the response and DNS servers
should use the min/default TTL value as a negative cache hint. My
NXDOMAIN responses currently look like this:
HEADER:
opcode = QUERY, id = 27, rcode = NXDOMAIN
header flags: response, want recursion, recursion avail.
questions = 1, answers = 0, authority records = 1, additional = 0
QUESTIONS:
www.singulink.com.httpcheck.singulink.com, type = A, class = IN
AUTHORITY RECORDS:
-> httpcheck.singulink.com
ttl = 900 (15 mins)
primary name server = httpcheck.singulink.com
responsible mail addr = admin.singulink.com
serial = 4212294798
refresh = 172800 (2 days)
retry = 86400 (1 day)
expire = 2592000 (30 days)
default TTL = 345600 (4 days)
Based on random sampling of responses from other DNS servers this seems
correct to me. Nothing I'm reading indicates that TTL factors into the
negative caching but is it possible servers are only caching the
negative response for 15 mins because of the TTL on the SOA record,
using the smaller value between that and the default TTL?
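Per RFC 2308, the negative-cache lifetime is the minimum of the SOA record's own TTL and its MINIMUM field, which with the values in the response above works out to only 15 minutes:

```python
def negative_ttl(soa_ttl, soa_minimum):
    # RFC 2308: resolvers cache an NXDOMAIN for the lesser of the two values.
    return min(soa_ttl, soa_minimum)

# Values from the SOA response above: ttl = 900, default (MINIMUM) TTL = 345600.
print(negative_ttl(900, 345600))     # 900 seconds, i.e. 15 minutes
# Raising the SOA record's own TTL to match lets the full 4 days apply.
print(negative_ttl(345600, 345600))  # 345600 seconds, i.e. 4 days
```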
Re: Spam rule for HTTP/HTTPS request to sender's root domain
Posted by Mike Marynowski <mi...@singulink.com>.
Changing up the algorithm a bit. Once a domain has been added to the
cache, the DNS service will automatically perform HTTP checks in the
background, on a much more aggressive schedule for invalid domains. This
makes temporary website problems much less of an issue, and invalid
domains no longer delay mail delivery threads for up to 15s after TTL
expirations during the initial test period with progressively increasing
TTLs - queries can always return instantly after the first one, as long
as the domain has been queried in the last 30 days and is still in cache.
Domains deemed to have "invalid" websites will be rechecked much more
aggressively in the background to ensure newly queried domains with
temporary website issues stop tripping this filter as soon as possible.
There will be a "sliding window" of a few days where temporary website
issues during the window won't cause the filter to trip, it just needs
to provide a valid response sometime during the sliding window to stay
in good standing.
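The sliding-window logic described above can be sketched like this (a hedged illustration; the three-day window and the class/field names are assumptions, not the service's actual implementation):

```python
import time

WINDOW = 3 * 24 * 3600  # hypothetical "few days" sliding window, in seconds

class DomainStatus:
    """Track background HTTP-check results for one cached domain."""

    def __init__(self):
        self.last_success = None  # epoch seconds of the last passing check

    def record_check(self, ok, now=None):
        if ok:
            self.last_success = time.time() if now is None else now

    def in_good_standing(self, now=None):
        # The rule only fires once no check has passed inside the window,
        # so a temporary website outage does not trip the filter.
        now = time.time() if now is None else now
        return self.last_success is not None and now - self.last_success < WINDOW

s = DomainStatus()
s.record_check(True, now=0)
print(s.in_good_standing(now=24 * 3600))      # True: a one-day outage is tolerated
print(s.in_good_standing(now=4 * 24 * 3600))  # False: the window has expired
```

The background rechecks keep last_success fresh for domains with transient problems, while persistently dead domains age out of the window and start scoring.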