Posted to users@spamassassin.apache.org by ji...@wohosting.net on 2019/06/05 13:36:10 UTC

Optimum Number of Spamd Children

Greetings,

I've searched but haven't had any luck finding documentation about how 
to determine the optimal settings for spamd children (max-children, 
min-children, max-spare, min-spare, and max-conn-per-child). I have a 
dedicated server for running spamd. It has 6GB (can add more) and 6 
cores. What would be the best settings? Or how would I determine the 
best settings?

Thanks!
Jim

Re: Optimum Number of Spamd Children

Posted by John Hardin <jh...@impsec.org>.
On Wed, 5 Jun 2019, jim.anderson@wohosting.net wrote:

> Greetings,
>
> I've searched but haven't had any luck finding documentation about how to 
> determine the optimal settings for spamd children (max-children, 
> min-children, max-spare, min-spare, and max-conn-per-child). I have a 
> dedicated server for running spamd. It has 6GB (can add more) and 6 cores. 
> What would be the best settings? Or how would I determine the best settings?

What's your system load when they are all running actively scanning mail? 
If you have unused capacity, add more.

I don't think there's a more formal way to do it than that...
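
As a rough illustration of the "check your load" advice above, here is a minimal Python sketch that compares the 1-minute load average with the core count. The helper name load_headroom is made up for this example, and treating "load / cores" as a measure of spare capacity is a simplifying assumption; it ignores I/O wait and anything else running on the box.

```python
import os


def load_headroom(load_1min: float, cores: int) -> float:
    """Return the unused fraction of CPU capacity (negative if overloaded).

    Hypothetical helper: assumes load average divided by core count is a
    fair proxy for how busy the machine is.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return 1.0 - load_1min / cores


if __name__ == "__main__":
    # os.getloadavg() is only available on Unix-like systems.
    load_1min, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1
    headroom = load_headroom(load_1min, cores)
    print(f"1-min load {load_1min:.2f} on {cores} cores, headroom {headroom:.0%}")
```

Run it while spamd is busy scanning; clearly positive headroom at peak suggests there is room for more children, subject to the RAM limit.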


-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   ...a great many people are not fit for Liberty, it scares the crap
   out of them and they'd much rather be ruled. As Loki said in the
   Avengers movie, kneeling is their natural state.    -- Mark D @ TSM
-----------------------------------------------------------------------
  Tomorrow: the 75th anniversary of D-Day

Re: Optimum Number of Spamd Children

Posted by Kris Deugau <kd...@vianet.ca>.
RW wrote:
> On Wed, 5 Jun 2019 10:45:13 -0400
> Kris Deugau wrote:
> 
>> jim.anderson@wohosting.net wrote:
>>> Greetings,
>>>
>>> I've searched but haven't had any luck finding documentation about
>>> how to determine the optimal settings for spamd children
>>> (max-children, min-children, max-spare, min-spare, and
>>> max-conn-per-child). I have a dedicated server for running spamd.
>>> It has 6GB (can add more) and 6 cores. What would be the best
>>> settings? Or how would I determine the best settings?
>>
>> "Try it and see."  :/
>>
>> At a minimum you'll want to make sure that you don't spawn more spamd
>> children than you can keep in RAM;  watch your system for a while,
>> take the worst-case spamd memory footprint, and divide that into your
>> physical RAM to find the absolute largest max-children you should
>> use.
> 
> You can get a more accurate feeling for that limit by stress-testing
> and watching out for significant swap I/O, but if it turns out to be
> relevant, more memory may be needed.
> 
> What I did was measure the CPU-limited throughput without network
> tests, and then calculate the number of children needed to sustain that
> throughput with a scan time on the high end of those seen.
> 
> It's a good idea to check that you can actually reach full CPU usage
> and aren't running into an avoidable locking bottleneck with Bayes etc.

*nod*  If you're having to fine-tune any of these to keep the system 
fully busy but not overloaded, that's another key factor to watch;  you 
can allow more processes than CPU cores, but unless your DNS resolver is 
slow, not by much.  Hyperthreading may give you a bit more slack but I 
don't think a HT "core" really gives you a full CPU core of benefit for 
a workload like SA.  On top of which you have the growing list of 
security issues that come from just having it enabled.  :/


>> We've also found that it's best to set max-children to
>> min-children+1, and max/min-spare to 1.  It may have been improved
>> since we last reviewed our settings in detail, but at the time spamd
>> didn't spawn new children fast enough under load spikes,
> 
> Despite what it says in the documentation there isn't an actual rate
> limit. What happens is that above 'min-free' processes the number of
> children only gets incremented when a child becomes idle after
> completing a scan or initializing. So the worst case is that there is a
> delay of the time it takes to scan one message. Once a scan completes
> the number of children can jump to 'max-free' instantaneously.

My memory is a little hazy on the specifics;  it was ~8+ years ago IIRC 
when I was seeing problems and experimented with settings to avoid them. 
  I don't recall any documentation regarding a *defined* limit on the 
child spawn rate, but in live testing at the time there certainly was one.

We were seeing on the order of 30-60s to scale up;  think "single 
message with huge CC list", where SA is called on final delivery for 
each recipient.  I don't recall offhand if it was "spawn new child 
process, wait, spawn, wait, etc" or if it was a burst as you say above, 
but the ultimate result was a lot of mail suddenly stuck in the inbound 
mail queue waiting for delivery.  Prespawning the maximum number of 
child processes "fixed" the problem.  In the worst cases IIRC it took up 
to about 30 minutes to clear the backlog.

It might not be required any more but I haven't seen any issues 
continuing to do so.

We were also having issues at the time with pathological spam using 
gigantic (>200K) HTML comments causing severe slowdowns, so scantime on 
any given spam sometimes averaged 15-20s if it returned at all.  We 
added a second spamd instance relying almost solely on a subset of DNS 
rules plus a handful of local rules that we tuned to skim these off first.

> The worst case happens when the system is completely idle when the
> spike arrives, so it's something that's more likely to be seen in
> testing than on a busy system.

Our two scanning nodes are nearly idle most of the time (usually up to 
about 5 active children of 70 in the main SA instance) but if a big 
burst of mail comes in, it can hit the current limits we have set.  I'd 
raise them but I think at this point we'd hit a CPU limit instead;  we 
do not have 70 CPU cores available on these systems, and they're also 
running ClamAV and a separate spamd instance for outbound mail scanning.

We've also scaled to allow for load balancing and taking a node down 
without impacting operations;  while we're heavily overprovisioned for 
the average load, we're still a bit tight at peak load.

>> Setting them equal caused a deadlock of some kind IIRC.
> 
> Is there a bug report for that?

It was quite a while ago (possibly as far back as SA 3.2 - wasn't there 
a new forking pattern introduced around then?).  I'll see if I can 
reproduce it with current release or trunk versions.

-kgd

Re: Optimum Number of Spamd Children

Posted by RW <rw...@googlemail.com>.
On Wed, 5 Jun 2019 10:45:13 -0400
Kris Deugau wrote:

> jim.anderson@wohosting.net wrote:
> > Greetings,
> > 
> > I've searched but haven't had any luck finding documentation about
> > how to determine the optimal settings for spamd children
> > (max-children, min-children, max-spare, min-spare, and
> > max-conn-per-child). I have a dedicated server for running spamd.
> > It has 6GB (can add more) and 6 cores. What would be the best
> > settings? Or how would I determine the best settings?  
> 
> "Try it and see."  :/
> 
> At a minimum you'll want to make sure that you don't spawn more spamd 
> children than you can keep in RAM;  watch your system for a while,
> take the worst-case spamd memory footprint, and divide that into your 
> physical RAM to find the absolute largest max-children you should
> use.

You can get a more accurate feeling for that limit by stress-testing
and watching out for significant swap I/O, but if it turns out to be
relevant, more memory may be needed.

What I did was measure the CPU-limited throughput without network
tests, and then calculate the number of children needed to sustain that
throughput with a scan time on the high end of those seen.
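
One way to read the calculation just described is as Little's law: the number of scans in flight equals throughput multiplied by time per scan. The Python sketch below takes that reading; the function name and the figures in the usage note are hypothetical, not measurements from this thread.

```python
import math


def children_needed(msgs_per_sec: float, scan_seconds: float) -> int:
    """Estimate the children required to sustain a given throughput.

    msgs_per_sec: CPU-limited throughput measured with network tests off.
    scan_seconds: a scan time on the high end of those seen in production
                  (network tests on).
    Assumes each child handles one message at a time.
    """
    if msgs_per_sec < 0 or scan_seconds < 0:
        raise ValueError("inputs must be non-negative")
    return max(1, math.ceil(msgs_per_sec * scan_seconds))
```

For instance, if the box could sustain 20 messages per second on CPU alone and slow scans take about 3 seconds, children_needed(20, 3) gives 60. That figure still has to fit within the RAM limit discussed earlier.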

It's a good idea to check that you can actually reach full CPU usage
and aren't running into an avoidable locking bottleneck with Bayes etc.


> We've also found that it's best to set max-children to
> min-children+1, and max/min-spare to 1.  It may have been improved
> since we last reviewed our settings in detail, but at the time spamd
> didn't spawn new children fast enough under load spikes, 

Despite what it says in the documentation there isn't an actual rate
limit. What happens is that above 'min-free' processes the number of
children only gets incremented when a child becomes idle after
completing a scan or initializing. So the worst case is that there is a
delay of the time it takes to scan one message. Once a scan completes
the number of children can jump to 'max-free' instantaneously.

The worst case happens when the system is completely idle when the
spike arrives, so it's something that's more likely to be seen in
testing than on a busy system.

> Setting them equal caused a deadlock of some kind IIRC.

Is there a bug report for that?


Re: Optimum Number of Spamd Children

Posted by Kris Deugau <kd...@vianet.ca>.
jim.anderson@wohosting.net wrote:
> Greetings,
> 
> I've searched but haven't had any luck finding documentation about how 
> to determine the optimal settings for spamd children (max-children, 
> min-children, max-spare, min-spare, and max-conn-per-child). I have a 
> dedicated server for running spamd. It has 6GB (can add more) and 6 
> cores. What would be the best settings? Or how would I determine the 
> best settings?

"Try it and see."  :/

At a minimum you'll want to make sure that you don't spawn more spamd 
children than you can keep in RAM;  watch your system for a while, take 
the worst-case spamd memory footprint, and divide that into your 
physical RAM to find the absolute largest max-children you should use. 
Modify to taste depending on what else is on the system (e.g., we run 
multiple spamd instances with different rule sets for different mail 
flows, as well as ClamAV;  no one spamd instance gets to hog all the RAM).
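
The arithmetic above can be sketched in a few lines of Python. The function name, the 1024 MB held back for the OS and other services, and the 150 MB worst-case child size are all assumptions for illustration; measure the real footprint on your own system.

```python
def max_children_by_ram(total_ram_mb: int, reserved_mb: int, worst_child_mb: int) -> int:
    """Upper bound on max-children so that all children fit in physical RAM.

    total_ram_mb:   physical RAM on the machine.
    reserved_mb:    RAM held back for the OS and other services (assumed).
    worst_child_mb: largest observed resident size of one spamd child.
    """
    if worst_child_mb <= 0:
        raise ValueError("worst_child_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    return max(0, usable_mb // worst_child_mb)
```

With the 6GB box from the original question, max_children_by_ram(6144, 1024, 150) comes out at 34. Treat that as a ceiling, not a target; CPU will likely be the tighter limit.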

We've also found that it's best to set max-children to min-children+1, 
and max/min-spare to 1.  It may have been improved since we last 
reviewed our settings in detail, but at the time spamd didn't spawn new 
children fast enough under load spikes, so we just forced it to 
pre-spawn the maximum number to ensure it could handle load spikes. 
Setting them equal caused a deadlock of some kind IIRC.

max-conn-per-child is a bit handwavy.  Locally we're using 100, but we 
could arguably cut that in half, or bump it to 1000, with essentially 
zero effect.  It's mostly a leash in case of memory leaks IMO;  spamd 
doesn't reread the rules for each new child process the way e.g. Amavis or 
MIMEDefang do.
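
To make the settings described in this message concrete, here is a small Python sketch that builds a spamd argument list following the pattern: max-children one above min-children, both spare limits at 1, and max-conn-per-child at 100. The function name and the idea of generating the arguments from a script are assumptions for illustration; only the option names come from spamd itself.

```python
from typing import List


def spamd_args(children: int, max_conn_per_child: int = 100) -> List[str]:
    """Build spamd arguments that pre-spawn (almost) the maximum number of children.

    children is the number to keep running at all times. max-children is set
    one higher because setting the two equal reportedly caused a deadlock.
    """
    if children < 1:
        raise ValueError("children must be at least 1")
    return [
        "spamd",
        f"--min-children={children}",
        f"--max-children={children + 1}",
        "--min-spare=1",
        "--max-spare=1",
        f"--max-conn-per-child={max_conn_per_child}",
    ]
```

Calling " ".join(spamd_args(10)) gives a command line that can be pasted into whatever init script or service file starts spamd on your system.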

-kgd