Posted to ruleqa@spamassassin.apache.org by Marcin Mirosław <ma...@mejor.pl> on 2021/01/17 15:49:52 UTC

weekly masscheck, --net and URIBL_BLOCKED

Hi!
I run masscheck on the same host as my MTA, and after each weekly masscheck
the host is blocked by uribl for some time.
I think the weekly masscheck produces incorrect results because of uribl's
rate limit, and I receive more spam for some time afterwards ;)
Does it make sense to run the weekly masscheck with --net when it can't
return correct results?

Marcin

Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by Dave Warren <dw...@thedave.ca>.
On 2021-08-06 11:25, Marcin Mirosław wrote:
>> The other thing I did on my masscheck box was to configure my DNS
>> resolver with a minimum TTL of some hours. Since we are checking mail
>> that is relatively ancient anyway, who cares if a particular DNS request
>> is 5 minutes or 3 hours old? This reduced the number of DNS queries
>> hitting "the world" considerably and was probably sufficient to stay
>> under limits. Be sure to increase the negative TTL cache as well.
> But long TTL doesn't help when masscheck will be checking 2 month old
> emails.

In practice, it helps a lot.

Each masscheck run will still need to query each sender IP on each 
DNSBL, but now only once (per IP, per DNSBL, per masscheck run).

My masscheck took over an hour to finish, so I asked myself what value 
is there in asking if each of Google's outbound IPs are listed a hundred 
times in a row when I could just ask once and trust that answer for the 
duration. It turned out to make the masscheck process run a decent amount
faster too, narrowing the gap between the daily and weekly net runs
significantly. The cache hit rate was a lot higher than I expected, too.
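
As a rough illustration of the effect described above, here is a toy sketch in Python of a cache that enforces a minimum TTL. It is not the resolver configuration that was actually used; the class name, the fake upstream function, the query name, and the 60-second upstream TTL are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MinTtlCache:
    """Toy resolver cache: no answer expires sooner than min_ttl seconds."""
    min_ttl: float
    upstream_queries: int = 0
    _entries: dict = field(default_factory=dict)  # name -> (answer, expires_at)

    def lookup(self, name, now, upstream):
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # still cached, nothing leaves the box
        answer, ttl = upstream(name)
        self.upstream_queries += 1
        # Negative answers are cached with the same floor as positive ones.
        self._entries[name] = (answer, now + max(ttl, self.min_ttl))
        return answer


def fake_dnsbl(name):
    """Hypothetical upstream: always 'not listed', with a 60-second TTL."""
    return ("NXDOMAIN", 60.0)


def count_upstream(min_ttl, lookups=100, spacing=90.0):
    """Ask about the same name `lookups` times, `spacing` seconds apart."""
    cache = MinTtlCache(min_ttl=min_ttl)
    for i in range(lookups):
        cache.lookup("2.0.0.127.dnsbl.example", now=i * spacing, upstream=fake_dnsbl)
    return cache.upstream_queries
```

With no floor, every one of the 100 lookups goes upstream because the 60-second TTL has expired each time; with a three-hour floor, only the first one does.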


> Henrik mentioned to use "--reuse" but I don't use header X-Spam-Status :/
> Anyway, where is a problem, why masscheck have to query all uribl/dnsbl?
> It needs a lot of time for devs?

There is value when DNSBLs are being evaluated. I used it in this 
fashion, but I'm unclear if anyone writing rules does (or can).

The older the corpus, and the more dynamic a DNSBL, the less value there 
will be. But conversely if I see a DNSBL's ham hitrate go up, it likely 
doesn't matter how old the corpus is, there is likely a problem with the 
DNSBL. Or an ESP playing games switching their spammy and non-spammy 
ranges around.

I'm not sure if the data would be useful as collected from masscheckers, 
I certainly looked closer at the specifics before making any decisions, 
but there was value to me, at the time. Whether this extends to having 
all masscheckers do this weekly or not is unclear to me.

Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by Marcin Mirosław <ma...@mejor.pl>.
Hi!

On 2021-08-05 at 09:51, Dave Warren wrote:
> There are two other things you can do to avoid hitting limits.
>
> Maybe the obvious one: Contact the DNSBL.

It's better to fix it in one place, in masscheck, because the fix will
then reach all masscheckers. Why should every masschecker have to bother
the dnsbl admins?

[...]
> I also already ran my own rbldnsd installation on my live mail server,
> and therefore was in a position to accept rsync based rbldnsd formatted
> files.

A workaround, as above.

> The other thing I did on my masscheck box was to configure my DNS
> resolver with a minimum TTL of some hours. Since we are checking mail
> that is relatively ancient anyway, who cares if a particular DNS request
> is 5 minutes or 3 hours old? This reduced the number of DNS queries
> hitting "the world" considerably and was probably sufficient to stay
> under limits. Be sure to increase the negative TTL cache as well.

But a long TTL doesn't help when masscheck is checking two-month-old
emails.

Henrik mentioned using "--reuse", but I don't use the X-Spam-Status header :/
Anyway, where is the problem? Why does masscheck have to query every
uribl/dnsbl? Would changing it take a lot of the devs' time?
Regards!
Marcin


Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by Dave Warren <dw...@thedave.ca>.
There are two other things you can do to avoid hitting limits.

Maybe the obvious one: Contact the DNSBL.

Back when I was running masscheck I reached out to a couple places that 
blocked me due to volume and they were happy to work with me to make 
arrangements at no cost since I was contributing to the community as a 
whole (and contributing to the community using their DNSBL in an 
indirect fashion).

I stopped running masscheck at least a couple years ago as I no longer 
have a representative mail stream that I can sample, and of course the 
landscape may have changed.

I also already ran my own rbldnsd installation on my live mail server,
and therefore was in a position to accept rsync based rbldnsd formatted 
files.

The other thing I did on my masscheck box was to configure my DNS 
resolver with a minimum TTL of some hours. Since we are checking mail 
that is relatively ancient anyway, who cares if a particular DNS request 
is 5 minutes or 3 hours old? This reduced the number of DNS queries 
hitting "the world" considerably and was probably sufficient to stay 
under limits. Be sure to increase the negative TTL cache as well.



On 2021-08-01 07:53, Henrik K wrote:
> 
> Change what? Not use --net?
> 
> Obviously we can't do that, otherwise stuff like DKIM_VALID won't hit and
> things like that will affect many meta rules too.
> 
> I've advocated that all mass checkers should make use of --reuse, that way
> no network lookups are even required and hits actually reflect what happened
> during the original mail handling.  It's pretty much pointless to query
> blacklists weeks after the mail was received, it's not accurate.  And also
> like mentioned you might get ratelimited and abuse the lists with
> unnecessary queries..
> 
> Cheers,
> Henrik
> 
> On Sun, Aug 01, 2021 at 03:30:01PM +0200, Marcin Mirosław wrote:
>> Hi,
>> others are rather quiet so maybe it can be changed?
>>
>>
>> On 2021-01-17 at 17:32, Kevin A. McGrail wrote:
>>> Marcin,
>>>
>>> I don't think masscheck really affects what RBLs we include or the
>>> scores except when we are considering adding or removing them.
>>>
>>> I would be +1 for changing that behavior but defer to others to weigh in
>>> first.
>>>
>>> Regards,
>>>
>>> KAM
>>>
>>> On 1/17/2021 10:49 AM, Marcin Mirosław wrote:
>>>> Hi!
>>>> I've run masscheck on my host with MTA, after weekly masscheck my host
>>>> is blocked by uribl for sometime.
>>>> I think that weekly masscheck has incorrect results due to ratelimit of
>>>> uribl and I receive more spam for sometime ;)
>>>> Does it make sense to run weekley masscheck with --net when it can't
>>>> return correct results?
>>>>
>>>> Marcin
>>>


Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by Henrik K <he...@hege.li>.
Does your whole corpus have X-Spam-Status headers?  And no shortcircuiting?
If those are in order, then there shouldn't be any traffic; I fixed many
such cases before.
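
A quick way to check the first question is to scan the corpus for messages that lack the header. The following is only a minimal sketch in Python: the function names are made up for illustration, and it assumes that a message is usable with --reuse when it carries an X-Spam-Status header containing a tests= field. It does not detect shortcircuited messages.

```python
from email import message_from_string


def has_reusable_status(raw_message):
    """True if the message has an X-Spam-Status header with a tests= field."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status")
    if status is None:
        return False
    # Folded headers contain line breaks; collapse whitespace before searching.
    return "tests=" in " ".join(status.split())


def split_corpus(messages):
    """Partition {message_id: raw_text} into (reusable_ids, missing_ids)."""
    ready, missing = [], []
    for msg_id, raw in messages.items():
        (ready if has_reusable_status(raw) else missing).append(msg_id)
    return ready, missing
```

Any message that ends up in the second list is a candidate for triggering live lookups during a --net run.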

Cheers,
Henrik


On Sun, Aug 01, 2021 at 10:15:26AM -0700, Steven Ihde wrote:
> I experimented with --reuse a few weeks back but it did not seem to
> suppress (all?) DNS lookups. Has anyone else observed this? If not
> I will try it again to get some more details.
> 
> -Steve
> 
> On 8/1/21 09:10, Kevin A. McGrail wrote:
> > +1 to reuse.  Thanks for reiterating that!
> > 
> > On 8/1/2021 9:53 AM, Henrik K wrote:
> > > Change what? Not use --net?
> > > 
> > > Obviously we can't do that, otherwise stuff like DKIM_VALID won't
> > > hit and
> > > things like that will affect many meta rules too.
> > > 
> > > I've advocated that all mass checkers should make use of --reuse,
> > > that way
> > > no network lookups are even required and hits actually reflect what
> > > happened
> > > during the original mail handling.  It's pretty much pointless to query
> > > blacklists weeks after the mail was received, it's not accurate. 
> > > And also
> > > like mentioned you might get ratelimited and abuse the lists with
> > > unnecessary queries..
> > > 
> > > Cheers,
> > > Henrik
> > > 
> > > On Sun, Aug 01, 2021 at 03:30:01PM +0200, Marcin Mirosław wrote:
> > > > Hi,
> > > > others are rather quiet so maybe it can be changed?
> > > > 
> > > > 
> > > > On 2021-01-17 at 17:32, Kevin A. McGrail wrote:
> > > > > Marcin,
> > > > > 
> > > > > I don't think masscheck really affects what RBLs we include or the
> > > > > scores except when we are considering adding or removing them.
> > > > > 
> > > > > I would be +1 for changing that behavior but defer to others
> > > > > to weigh in
> > > > > first.
> > > > > 
> > > > > Regards,
> > > > > 
> > > > > KAM
> > > > > 
> > > > > > On 1/17/2021 10:49 AM, Marcin Mirosław wrote:
> > > > > > Hi!
> > > > > > I've run masscheck on my host with MTA, after weekly
> > > > > > masscheck my host
> > > > > > is blocked by uribl for sometime.
> > > > > > I think that weekly masscheck has incorrect results due
> > > > > > to ratelimit of
> > > > > > uribl and I receive more spam for sometime ;)
> > > > > > Does it make sense to run weekley masscheck with --net when it can't
> > > > > > return correct results?
> > > > > > 
> > > > > > Marcin
> > 

Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by Steven Ihde <si...@hamachi.us>.
I experimented with --reuse a few weeks back but it did not seem to
suppress (all?) DNS lookups. Has anyone else observed this? If not
I will try it again to get some more details.

-Steve

On 8/1/21 09:10, Kevin A. McGrail wrote:
> +1 to reuse.  Thanks for reiterating that!
>
> On 8/1/2021 9:53 AM, Henrik K wrote:
>> Change what? Not use --net?
>>
>> Obviously we can't do that, otherwise stuff like DKIM_VALID won't hit 
>> and
>> things like that will affect many meta rules too.
>>
>> I've advocated that all mass checkers should make use of --reuse, 
>> that way
>> no network lookups are even required and hits actually reflect what 
>> happened
>> during the original mail handling.  It's pretty much pointless to query
>> blacklists weeks after the mail was received, it's not accurate.  And 
>> also
>> like mentioned you might get ratelimited and abuse the lists with
>> unnecessary queries..
>>
>> Cheers,
>> Henrik
>>
>> On Sun, Aug 01, 2021 at 03:30:01PM +0200, Marcin Mirosław wrote:
>>> Hi,
>>> others are rather quiet so maybe it can be changed?
>>>
>>>
>>> On 2021-01-17 at 17:32, Kevin A. McGrail wrote:
>>>> Marcin,
>>>>
>>>> I don't think masscheck really affects what RBLs we include or the
>>>> scores except when we are considering adding or removing them.
>>>>
>>>> I would be +1 for changing that behavior but defer to others to 
>>>> weigh in
>>>> first.
>>>>
>>>> Regards,
>>>>
>>>> KAM
>>>>
>>>> On 1/17/2021 10:49 AM, Marcin Mirosław wrote:
>>>>> Hi!
>>>>> I've run masscheck on my host with MTA, after weekly masscheck my 
>>>>> host
>>>>> is blocked by uribl for sometime.
>>>>> I think that weekly masscheck has incorrect results due to 
>>>>> ratelimit of
>>>>> uribl and I receive more spam for sometime ;)
>>>>> Does it make sense to run weekley masscheck with --net when it can't
>>>>> return correct results?
>>>>>
>>>>> Marcin
>


Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by "Kevin A. McGrail" <km...@apache.org>.
+1 to reuse.  Thanks for reiterating that!

On 8/1/2021 9:53 AM, Henrik K wrote:
> Change what? Not use --net?
>
> Obviously we can't do that, otherwise stuff like DKIM_VALID won't hit and
> things like that will affect many meta rules too.
>
> I've advocated that all mass checkers should make use of --reuse, that way
> no network lookups are even required and hits actually reflect what happened
> during the original mail handling.  It's pretty much pointless to query
> blacklists weeks after the mail was received, it's not accurate.  And also
> like mentioned you might get ratelimited and abuse the lists with
> unnecessary queries..
>
> Cheers,
> Henrik
>
> On Sun, Aug 01, 2021 at 03:30:01PM +0200, Marcin Mirosław wrote:
>> Hi,
>> others are rather quiet so maybe it can be changed?
>>
>>
>> On 2021-01-17 at 17:32, Kevin A. McGrail wrote:
>>> Marcin,
>>>
>>> I don't think masscheck really affects what RBLs we include or the
>>> scores except when we are considering adding or removing them.
>>>
>>> I would be +1 for changing that behavior but defer to others to weigh in
>>> first.
>>>
>>> Regards,
>>>
>>> KAM
>>>
>>> On 1/17/2021 10:49 AM, Marcin Mirosław wrote:
>>>> Hi!
>>>> I've run masscheck on my host with MTA, after weekly masscheck my host
>>>> is blocked by uribl for sometime.
>>>> I think that weekly masscheck has incorrect results due to ratelimit of
>>>> uribl and I receive more spam for sometime ;)
>>>> Does it make sense to run weekley masscheck with --net when it can't
>>>> return correct results?
>>>>
>>>> Marcin

-- 
Kevin A. McGrail
KMcGrail@Apache.org

Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by Henrik K <he...@hege.li>.
Change what? Not use --net?

Obviously we can't do that, otherwise stuff like DKIM_VALID won't hit and
things like that will affect many meta rules too.

I've advocated that all mass checkers should make use of --reuse, that way
no network lookups are even required and hits actually reflect what happened
during the original mail handling.  It's pretty much pointless to query
blacklists weeks after the mail was received, because the result is no
longer accurate.  And, as mentioned, you might get ratelimited and abuse
the lists with unnecessary queries.

Cheers,
Henrik

On Sun, Aug 01, 2021 at 03:30:01PM +0200, Marcin Mirosław wrote:
> Hi,
> others are rather quiet so maybe it can be changed?
> 
> 
> On 2021-01-17 at 17:32, Kevin A. McGrail wrote:
> > Marcin,
> > 
> > I don't think masscheck really affects what RBLs we include or the
> > scores except when we are considering adding or removing them.
> > 
> > I would be +1 for changing that behavior but defer to others to weigh in
> > first.
> > 
> > Regards,
> > 
> > KAM
> > 
> > On 1/17/2021 10:49 AM, Marcin Mirosław wrote:
> >> Hi!
> >> I've run masscheck on my host with MTA, after weekly masscheck my host
> >> is blocked by uribl for sometime.
> >> I think that weekly masscheck has incorrect results due to ratelimit of
> >> uribl and I receive more spam for sometime ;)
> >> Does it make sense to run weekley masscheck with --net when it can't
> >> return correct results?
> >>
> >> Marcin
> > 

Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by Marcin Mirosław <ma...@mejor.pl>.
Hi,
others are rather quiet so maybe it can be changed?


On 2021-01-17 at 17:32, Kevin A. McGrail wrote:
> Marcin,
> 
> I don't think masscheck really affects what RBLs we include or the
> scores except when we are considering adding or removing them.
> 
> I would be +1 for changing that behavior but defer to others to weigh in
> first.
> 
> Regards,
> 
> KAM
> 
> On 1/17/2021 10:49 AM, Marcin Mirosław wrote:
>> Hi!
>> I've run masscheck on my host with MTA, after weekly masscheck my host
>> is blocked by uribl for sometime.
>> I think that weekly masscheck has incorrect results due to ratelimit of
>> uribl and I receive more spam for sometime ;)
>> Does it make sense to run weekley masscheck with --net when it can't
>> return correct results?
>>
>> Marcin
> 


Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by "Kevin A. McGrail" <km...@apache.org>.
Marcin,

I don't think masscheck really affects what RBLs we include or the 
scores except when we are considering adding or removing them.

I would be +1 for changing that behavior but defer to others to weigh in 
first.

Regards,

KAM

On 1/17/2021 10:49 AM, Marcin Mirosław wrote:
> Hi!
> I've run masscheck on my host with MTA, after weekly masscheck my host
> is blocked by uribl for sometime.
> I think that weekly masscheck has incorrect results due to ratelimit of
> uribl and I receive more spam for sometime ;)
> Does it make sense to run weekley masscheck with --net when it can't
> return correct results?
>
> Marcin


Re: weekly masscheck, --net and URIBL_BLOCKED

Posted by John Hardin <jh...@impsec.org>.
On Sun, 17 Jan 2021, Marcin Mirosław wrote:

> Hi!
> I've run masscheck on my host with MTA, after weekly masscheck my host
> is blocked by uribl for sometime.
> I think that weekly masscheck has incorrect results due to ratelimit of
> uribl and I receive more spam for sometime ;)
> Does it make sense to run weekley masscheck with --net when it can't
> return correct results?

The scores of URIBL_(BLACK|GREY|RED) are fixed in 50_scores.cf, so the --net
masscheck doesn't affect them, and no active meta rules use them, so I
just disable the URIBL lookup in a custom ruleset for nightly masschecks:

   dns_query_restriction deny uribl.com
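
To make the effect of that line concrete, here is a small Python sketch of suffix matching on query names. It assumes, as an approximation of how the option is understood to behave, that a restriction on a domain also covers its subdomains and that the tightest matching entry wins; the function name and the sample query names are illustrative, not SpamAssassin code.

```python
def query_allowed(qname, restrictions, default=True):
    """Decide whether a DNS query name may be sent.

    `restrictions` maps a domain to True (allow) or False (deny). Leading
    labels are stripped one at a time, so the longest matching suffix
    (the tightest match) decides; unmatched names fall back to `default`.
    """
    labels = qname.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in restrictions:
            return restrictions[candidate]
    return default


# Mirrors the single "deny uribl.com" entry from the ruleset line above.
restrictions = {"uribl.com": False}
```

Under these assumptions, a lookup such as example.org.multi.uribl.com is suppressed, while queries to unrelated zones still go out.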


-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org                         pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Maxim VI: If violence wasn’t your last resort, you failed to resort
             to enough of it.
-----------------------------------------------------------------------
  Today: Benjamin Franklin's 315th Birthday