Posted to users@spamassassin.apache.org by Luis Hernán Otegui <lu...@gmail.com> on 2004/10/01 18:09:52 UTC

Re: 3.0 scanning delays

Same thing here, except that it also eats as much memory as it can...
Scan times keep growing over time...
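
To put numbers on this kind of growth, the per-message times can be pulled out of spamd result lines like the ones quoted below. This is only a rough sketch: the regular expression is written against the line format shown in this thread (other spamd versions may log differently), and the names `scan_seconds` and `summarize` are made up for illustration, not part of SpamAssassin.

```python
import re
from statistics import mean

# Matches the tail of spamd result lines as quoted in this thread, e.g.
#   clean message (-104.9/5.0) for user1:8 in 0.8 seconds, 1129 bytes.
# Assumption: other spamd versions may format this line differently.
RESULT_RE = re.compile(
    r"\((?P<score>-?\d+(?:\.\d+)?)/(?P<threshold>\d+(?:\.\d+)?)\) "
    r"for (?P<user>[^:\s]+):\d+ "
    r"in (?P<seconds>\d+(?:\.\d+)?) seconds, (?P<size>\d+) bytes"
)


def scan_seconds(lines):
    """Return the scan time in seconds for every line that matches."""
    times = []
    for line in lines:
        match = RESULT_RE.search(line)
        if match:
            times.append(float(match.group("seconds")))
    return times


def summarize(lines):
    """Return (count, mean_seconds, max_seconds), or None if nothing matched."""
    times = scan_seconds(lines)
    if not times:
        return None
    return len(times), mean(times), max(times)


if __name__ == "__main__":
    # Sample lines copied from the 3.0 run quoted in this thread.
    sample = [
        "clean message (-102.8/5.0) for user1:8 in 5.8 seconds, 1282 bytes.",
        "clean message (-5.0/5.0) for user2:8 in 41.8 seconds, 2867 bytes.",
        "clean message (-100.0/5.0) for user3:8 in 37.8 seconds, 2250 bytes.",
    ]
    print(summarize(sample))
```

Running something like this over the log at intervals would show whether the mean and the maximum keep climbing the longer spamd stays up.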


On Thu, 30 Sep 2004 11:56:01 -0600, Shane Hickey
<sh...@howsyournetwork.com> wrote:
> So, I take it that no one is seeing these weird spamd delays but me?  Rats.
> 
> Shane Hickey <sh...@howsyournetwork.com> [2004-09-29 14:11]:
> > Howdy all.  I'm running version 3.0.0 on Gentoo Linux (using the
> > 3.0.0-r1 ebuild).  The machine is a dual P3/450 and it is also running
> > sendmail 8.12.11 and it handles mail for 20 or so domains with less
> > than 20 users total.  So, the mail volume is pretty low.
> >
> > I'm running spamd in the following manner:
> >
> > /usr/sbin/spamd -d -r /var/run/spamd/spamd.pid -u mail -x -m 10 -L
> >
> > I'm running spamc out of my /etc/procmailrc (with no options).
> >
> > What I've noticed is that after spamd has been running for a little
> > while, it starts to take longer and longer to check each message.
> > Here is a snippet of my times from 2.64:
> >
> > clean message (-104.9/5.0) for user1:8 in 0.8 seconds, 1129 bytes.
> > clean message (-104.9/5.0) for user2:8 in 0.9 seconds, 1231 bytes.
> > clean message (-104.9/5.0) for user1:8 in 0.8 seconds, 1231 bytes.
> > clean message (-4.9/5.0) for user1:8 in 1.1 seconds, 1046 bytes.
> >
> > When I first start spamd, I see times that are very close to this.
> > But, within 10-20 minutes, they start to climb.  Here is how they look
> > right now (I started spamd 40 minutes ago).
> >
> > clean message (-102.8/5.0) for user1:8 in 5.8 seconds, 1282 bytes.
> > clean message (-5.0/5.0) for user2:8 in 41.8 seconds, 2867 bytes.
> > clean message (-100.0/5.0) for user3:8 in 37.8 seconds, 2250 bytes.
> >
> > If I let spamd run for several hours, I'll see times near 200 seconds
> > per message and it seems to keep increasing.
> >
> > I have always had "skip_rbl_checks 1" in my local.cf.  But, I've been
> > trying to isolate what's caused this new slowness, so I've also tried
> > to first disable razor2, dcc and pyzor and that didn't seem to make
> > much difference.  Then I set use_bayes to 0 and that seems to help a
> > little bit, but I still see long delays.  The delayed times that I
> > show above are for this configuration:
> >
> > # Enable the Bayes system
> > use_bayes               0
> >
> > # Enable or disable network checks
> > skip_rbl_checks         1
> > use_razor2              1
> > use_dcc                 1
> > use_pyzor               1
> >
> > I also tried "lock_method flock" and I didn't see much success there
> > either.  Anyway, I was hoping someone else had seen this behavior, or
> > that someone could shed some light on what might be the cause of
> > this.
> >
> > Thanks,
> > Shane
> >
> > --
> > Shane Hickey <sh...@howsyournetwork.com>: Network/System Consultant
> > GPG KeyID: 777CBF3F
> > Key fingerprint: 254F B2AC 9939 C715 278C  DA95 4109 9F69 777C BF3F
> > Listening to: The Courtship of Birdy Numnum - The
> > Parapalegic-Homoerotic Episode
> >
> 
> --
> Shane Hickey <sh...@howsyournetwork.com>: Network/System Consultant
> GPG KeyID: 777CBF3F
> Key fingerprint: 254F B2AC 9939 C715 278C  DA95 4109 9F69 777C BF3F
> Listening to: The Styrenes - Cold Meat
> 



-- 
-------------------------------------------------
GNU-GPL: "May The Source Be With You...
-------------------------------------------------

Re: 3.0 scanning delays

Posted by Loren Wilton <lw...@earthlink.net>.
> I posted some graphs of my Mem and Swap utilization under 3.0 and 2.64 but
> I didn't provide a legend.  The green is memory/swap used and the
> blue is free.  You can see that under 3.0 (the first half of the graph)
> about 1/2 of my swap was used and under 2.64, just a tiny fraction is used.
> The memory graph is less dramatic.

Yes, I looked at them after posting that question, and was confused.  :-)

It doesn't sound like you were completely out of swap space, but you could
well have been thrashing.  That would account for increasing total processing
times per message, even though the cpu time per message should have stayed
relatively constant (there might not be a way to measure that to see,
though).
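
The difference between total (wall-clock) time and cpu time can be seen in a small sketch. It has nothing to do with spamd itself: `time.sleep` merely stands in for a process that is stalled waiting on paging, and the helper name `timed` is made up for illustration.

```python
import time


def timed(func):
    """Return (wall_seconds, cpu_seconds) spent running func()."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # cpu time used by this process
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def blocked():
    # Stand-in for a process stalled on swap: it waits without using cpu.
    time.sleep(0.2)


if __name__ == "__main__":
    wall, cpu = timed(blocked)
    print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```

A thrashing machine would show the same pattern: wall-clock time per message grows while the cpu time spent on it stays roughly the same.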


> I'm hearing a lot of talk about BigEvil and I am running that rule (I
> emerged spamassassin-ruledujour in Gentoo, which includes a collection of
> 3rd party rules).  However, I'm fond of many of these rules, so I think I'll
> just keep running 2.64 for the time being since it seems to handle these
> rules without completely devouring all my swap and ram.

BigEvil is big evil, even on 2.64.  It worked for a while, but then toward
the end just got waaay too huge.  It will fry lightning in its tracks on
most machines.

Why it would result in taking more memory on 3.0 than on 2.64 is an
interesting theoretical question that probably only the devs could answer.
It implies a change in rule processing such that the various rules take a
lot more memory to process in 3.0 than in 2.6x.  That will likely become
less of a theoretical question and more of a practical question as time goes
on with 3.0; but at this point I doubt that anyone is interested.

The practical thing to do is eliminate BigEvil on 3.0 and turn on SURBL,
which should give you the same results at the cost of trading off memory for
net tests.  That will probably halve the amount of rule memory required.

In fact, doing the same thing on 2.64 would probably be worthwhile, except
that you have the memory to run BE on that system.

        Loren





Re: 3.0 scanning delays

Posted by Shane Hickey <sh...@howsyournetwork.com>.
"Loren Wilton" <lw...@earthlink.net> [2004-10-01 15:13]:
> > On Thu, 30 Sep 2004 11:56:01 -0600, Shane Hickey
> > <sh...@howsyournetwork.com> wrote:
> > > So, I take it that no one is seeing these weird spamd delays but
> > > me?  Rats.
> 
> Hum.  Are you also perchance out of swap space as well as memory?

I posted some graphs of my Mem and Swap utilization under 3.0 and 2.64 but I didn't provide a legend.  The green is memory/swap used and the blue is free.  You can see that under 3.0 (the first half of the graph) about 1/2 of my swap was used and under 2.64, just a tiny fraction is used.  The memory graph is less dramatic.

I'm hearing a lot of talk about BigEvil and I am running that rule (I emerged spamassassin-ruledujour in Gentoo, which includes a collection of 3rd party rules).  However, I'm fond of many of these rules, so I think I'll just keep running 2.64 for the time being since it seems to handle these rules without completely devouring all my swap and ram.

http://www.howsyournetwork.com/mem-day.png
http://www.howsyournetwork.com/swap-day.png

Shane

-- 
Shane Hickey <sh...@howsyournetwork.com>: Network/System Consultant
GPG KeyID: 777CBF3F
Key fingerprint: 254F B2AC 9939 C715 278C  DA95 4109 9F69 777C BF3F
Listening to: Pixies - Bone Machine

Re: 3.0 scanning delays

Posted by Loren Wilton <lw...@earthlink.net>.
> On Thu, 30 Sep 2004 11:56:01 -0600, Shane Hickey
> <sh...@howsyournetwork.com> wrote:
> > So, I take it that no one is seeing these weird spamd delays but me?  Rats.

Hum.  Are you also perchance out of swap space as well as memory?

        Loren


Re: 3.0 scanning delays

Posted by Jon Trulson <jo...@radscan.com>.
On Fri, 1 Oct 2004, Luis Hernán Otegui wrote:

> Same thing here, except that it also eats as much memory as it can...
> Scan times keep growing over time...
>

 	I saw this problem too on our scanning machine (dual Xeon HT, 1GB
RAM), which was upgraded to SA 3.0 over the weekend.  After a while (4-8
hours) it would get slower and slower, to the point that the milter on the
mail gateway would time out waiting for spamd to finish a message, and the
email would be delivered unscanned.

 	I tracked it down (partially) to 3 or more of the spamd children
jumping up to around 320MB of allocated RAM and staying there.  Easy to suck
up a gig that way.  The more spamd children 'blew up', the slower the
system became due to the increased swapping.
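
The arithmetic behind "easy to suck up a gig" can be sketched as follows. The two figures come from this post; the calculation is a simplification that ignores memory used by the OS, the other spamd children and anything else on the box, and the function name is made up for illustration.

```python
def bloated_children_that_fit(ram_mb, child_mb):
    """How many children of child_mb fit in ram_mb before swap is needed."""
    return ram_mb // child_mb


RAM_MB = 1024           # "1GB RAM" from the post
BLOATED_CHILD_MB = 320  # "around 320MB" per blown-up child

fit = bloated_children_that_fit(RAM_MB, BLOATED_CHILD_MB)
used_mb = fit * BLOATED_CHILD_MB

if __name__ == "__main__":
    print(f"{fit} bloated children use {used_mb} MB of {RAM_MB} MB")
```

So three blown-up children already take almost all of the physical memory, and a fourth cannot fit without swapping.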

 	By default each spamd child will handle 200 connections before
terminating and allowing the 'master' to start a new child.  After several
hours, these blown-up spamd children would bring the machine to its knees.

 	What I did was add '--max-conn-per-child=1' to the spamd start
line.  This causes each child to die after handling one connection.  I
still see the occasional 'blow up' for a spamd child, but at least now the
memory gets released as soon as that particular child has finished scanning
its message.

 	Since then, I haven't had any more problems - running 2 days now 
without requiring a manual restart to regain control of the machine.

 	Of course this is really just a workaround.  spamd really should
release its allocated memory after handling a message.  I have no idea what
causes a spamd child to explode like that - a 'special' message that exploits
some bug in spamd?  You guys might try that option to spamd and see if it
helps.
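
Why the per-child connection limit matters can be shown with a toy model. It assumes, as described above, that a child which blows up stays at that size until it exits; the function name and the example position of the "bad" message are hypothetical, and this is not how spamd itself is implemented.

```python
def connections_served_bloated(bad_connection, max_conn_per_child):
    """Connections a child handles at its inflated size, counting the bad one.

    bad_connection is the 1-based position, within the child's lifetime,
    of the message that makes the child blow up.
    """
    if not 1 <= bad_connection <= max_conn_per_child:
        raise ValueError("bad_connection must fall within the child's lifetime")
    return max_conn_per_child - bad_connection + 1


if __name__ == "__main__":
    # Default limit of 200 from the post, blow-up on the 5th connection
    # (the 5 is an arbitrary example value).
    print(connections_served_bloated(5, 200))
    # With --max-conn-per-child=1 the child exits right after that message.
    print(connections_served_bloated(1, 1))
```

With the default limit, a child that blows up early keeps its inflated size for most of its 200 connections; with a limit of 1 the memory is held only for the one message that caused it, at the cost of starting a new child for every connection.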


-- 
Jon Trulson    mailto:jon@radscan.com
ID: 1A9A2B09, FP: C23F328A721264E7 B6188192EC733962
PGP keys at http://radscan.com/~jon/PGPKeys.txt
#include <std/disclaimer.h>
"I am Nomad." -Nomad

Re: 3.0 scanning delays

Posted by Shane Hickey <sh...@howsyournetwork.com>.
I agree about the memory.  But my problem also seemed to be that it was eating a fair amount of swap.  This is on a machine with 500M of RAM.  I finally gave up and went back to 2.64.

Here's a link to some pngs that illustrate the symptoms.

http://www.howsyournetwork.com/mem-day.png
http://www.howsyournetwork.com/swap-day.png

You can see the big change when I finally switched back to 2.64 a little after 5pm.  The jumps before that were when I would stop spamd to tweak.

Shane


-- 
Shane Hickey <sh...@howsyournetwork.com>: Network/System Consultant
GPG KeyID: 777CBF3F
Key fingerprint: 254F B2AC 9939 C715 278C  DA95 4109 9F69 777C BF3F
Listening to: Jr and His Soulettes - Rock N Roll Santa