Posted to users@spamassassin.apache.org by Francesco Potorti` <po...@potorti.it> on 2004/11/07 23:18:20 UTC

performance observations: 3.0.1 versus 2.42

I just finished installing 3.0.1 over a two-year-old 2.42 installation
on a Sun (don't know which model, sorry) with 0.5 GB of RAM and 1.2 GB
of swap, serving fewer than 1000 users with SMTP and POP services, about
15000 local deliveries per working day.

Undesired mail jumped from an average of 1% two years ago to around 60%
now.  Mail arrival was much less bursty then; now bursts of mail are
quite frequent, up to 300 mails in ten minutes.

SpamAssassin 3.0 is much better at identifying spam than 2.42: about
half the spam it currently identifies went unnoticed with 2.42.

POP usage is heavy on this box, and unfortunately 3.0 does not help.
Every spamd child uses at least 27 MB of real memory (RSS), which grows
up to 42 MB in busy periods.

Comparing performance before and after the upgrade, I suspect that
having a fixed number of servers makes things worse.  While having the
servers ready to work makes them more responsive, from a global
perspective performance degrades less gracefully than with the model
where you spawn a child when needed.

In fact, when you have a fixed number of servers, you cannot exceed the
real memory size without risking constant memory thrashing (so on this
box I can use a maximum of, say, 12 servers).  If more servers are
needed for short time periods, nothing can be easily done.  With the old
model, I could define a maximum of 30 servers, which were spawned only
when needed, so performance degraded gracefully.
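
As a rough check of the figure above, here is a back-of-the-envelope
sketch using only the numbers from this post (0.5 GB of RAM, 27 to 42 MB
RSS per child).  The function name and the reserved_mb parameter are
made up for illustration, and the sketch ignores memory shared between
children, so take it as an estimate only.

```python
RAM_MB = 512        # 0.5 GB of real memory, as stated in the post
RSS_IDLE_MB = 27    # smallest per-child RSS reported in the post
RSS_BUSY_MB = 42    # per-child RSS reported for busy periods


def max_children(ram_mb: int, rss_mb: int, reserved_mb: int = 0) -> int:
    """Largest number of children whose combined RSS fits in real memory.

    reserved_mb is a hypothetical allowance for everything else on the
    box (POP servers, MTA, kernel); the post gives no figure for it.
    """
    return max((ram_mb - reserved_mb) // rss_mb, 0)


if __name__ == "__main__":
    # Children that fit in RAM when every child is at its busy size.
    print(max_children(RAM_MB, RSS_BUSY_MB))
    # Children that fit in RAM when every child is at its smallest size.
    print(max_children(RAM_MB, RSS_IDLE_MB))
    # Total RSS in MB of 30 busy children, the old on-demand ceiling.
    print(30 * RSS_BUSY_MB)
```

With nothing reserved, the busy-period size gives the 12 servers
mentioned above.  Thirty busy children would need about 1260 MB, more
than real memory but less than RAM plus swap, which is why the old model
could only work if those children were short-lived.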

Maybe I could run two or three spamd servers, each with
--max-children=6 and listening on its own socket, and let procmail
choose the second only when the first gives an EX_UNAVAILABLE (69)
error, and the third only when the second returns the same error, but I
have not tried that.  This arrangement could probably give the best of
the two models: children that are idle for some time should go to swap
and leave real memory free for the POP servers in normal conditions,
while still being available for overload periods.
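
The fallback chain described above could be sketched as a small wrapper
that procmail pipes each message through.  This is untested against a
real setup and rests on several assumptions: the port numbers are made
up, the wrapper uses TCP ports rather than whatever sockets one would
actually pick, it relies on spamc's -p (port) and -x (report failures
through the exit status) options, and it assumes a saturated spamd makes
spamc exit with 69, which I have not verified.

```python
import subprocess
from typing import Callable, Sequence, Tuple

EX_UNAVAILABLE = 69  # exit status named in the post

# Hypothetical ports, one per spamd instance; not from the original post.
PORTS = (783, 784, 785)

Runner = Callable[[int, bytes], Tuple[int, bytes]]


def run_spamc(port: int, message: bytes) -> Tuple[int, bytes]:
    """Pipe one message through spamc, talking to the spamd on `port`."""
    proc = subprocess.run(
        ["spamc", "-x", "-p", str(port)],
        input=message,
        stdout=subprocess.PIPE,
    )
    return proc.returncode, proc.stdout


def filter_with_fallback(
    message: bytes,
    ports: Sequence[int] = PORTS,
    runner: Runner = run_spamc,
) -> Tuple[int, bytes]:
    """Try each spamd in order, moving on only after EX_UNAVAILABLE.

    Returns (status, output) from the first instance that does not
    report EX_UNAVAILABLE.  If every instance does, the original
    message is returned unfiltered together with EX_UNAVAILABLE.
    """
    for port in ports:
        status, output = runner(port, message)
        if status != EX_UNAVAILABLE:
            return status, output
    return EX_UNAVAILABLE, message
```

The runner argument exists only so the chain can be exercised without a
running spamd; in real use the default run_spamc would be called.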