Posted to users@spamassassin.apache.org by John Fleming <jo...@wa9als.com> on 2004/10/31 03:59:41 UTC

Re: Load Average Problems

jdow said:
> On another paw I note that most family tools are not left running
> 24x7. If this is his case then a large portion of his 250 messages
> may be coming in right after he boots. If he is setup to spawn
> too many spamds then he could experience a memory crisis.

That's not it.  It's mostly a family/hobby server, but it functions
"fairly professionally" - I just meant I'm not an ISP or big business
with thousands of emails a day.  The server's on 24/7/365 running
Apache, Mailman and other common server stuff - but all at a VERY low
activity/use level.

I've reviewed my local.cf, and there was some duplication.  I've
removed the dupes and we'll see if that helps.

I call spamd via spamc in procmail.  I've read man spamc/d - I see
where to limit the spamd children when using the spamd option, but I
don't see how to pass that option on when using spamc.  IOW, I don't see how
to limit spamd children when using spamc.

Also, my procmailrc uses a lock file when evaluating the results of
spamd - I guess that doesn't limit starting another spamd before
that file has been evaluated?  - John




Re: Load Average Problems

Posted by Duncan Findlay <du...@debian.org>.
On Sun, Oct 31, 2004 at 04:20:40PM -0500, John Fleming wrote:
> OK, I'll bare my ignorance here in hopes of enlightenment.  I'm probably
> lucky that I have SA working as well as I do.  I only have a loose
> understanding of the different roles of "spamassassin", "spamc", and
> "spamd".  I start things with /etc/init.d/spamassassin.  Then in procmail, I
> pipe the msg to spamc.  In neither of these places do I see how to pass any
> options to spamd.
> 
> I've also tried:
> # spamd -m 2
> but this gets an error about the socket being in use.
> 
> What am I missing?  - John

If you're using Debian, and from the sounds of it, you are, command
line options are set in /etc/default/spamassassin. Try
adding/subtracting options there.

-- 
Duncan Findlay

Re: Load Average Problems

Posted by jdow <jd...@earthlink.net>.
From: "Duncan Findlay" <du...@debian.org>
> Sorry, Mandrake not Debian. Anyways, change options in
> /etc/sysconfig/spamassassin, I think.


Yeah, in theory that's the best way. But do copy the "SPAMDOPTIONS"
line and then place IT into the /etc/sysconfig/spamassassin. I get
naughty and cheat on that issue by changing the line in the script.

{^_-}


Re: Load Average Problems

Posted by Duncan Findlay <du...@debian.org>.
On Sun, Oct 31, 2004 at 02:41:36PM -0800, jdow wrote:

[ In the future, please trim the message you are replying to so that
you only include the relevant bits. ]

> > OK, I'll bare my ignorance here in hopes of enlightenment.  I'm probably
> > lucky that I have SA working as well as I do.  I only have a loose
> > understanding of the different roles of "spamassassin", "spamc", and
> > "spamd".  I start things with /etc/init.d/spamassassin.  Then in procmail, I
> > pipe the msg to spamc.  In neither of these places do I see how to pass any
> > options to spamd.
> >
> > I've also tried:
> > # spamd -m 2
> > but this gets an error about the socket being in use.
> >
> > What am I missing?  - John
> 
> OK, from the "spamd --help" output:
>      -m num, --max-children num         Allow maximum num children
> 
> So that option is positively "a spamd thing." So how does one get that
> option into spamd? On the Mandrake test machine I have the init script
> in /etc/init.d as "spamassassin". It includes these lines:
> ====
> # Source spamd configuration.
> if [ -f /etc/sysconfig/spamassassin ] ; then
>         . /etc/sysconfig/spamassassin
> else
>         SPAMDOPTIONS="-d -c -m5 -Hi --user-config"
> fi

Sorry, Mandrake not Debian. Anyways, change options in
/etc/sysconfig/spamassassin, I think.

-- 
Duncan Findlay

Re: SOLVED Re: Load Average Problems

Posted by jdow <jd...@earthlink.net>.
From: "John Fleming" <jo...@wa9als.com>
> > > What am I missing?  - John
> >
> > OK, from the "spamd --help" output:
> >      -m num, --max-children num         Allow maximum num children
> >
> > So that option is positively "a spamd thing." So how does one get that
> > option into spamd? On the Mandrake test machine I have the init script
> > in /etc/init.d as "spamassassin". It includes these lines:
> > ====
>
> Ahhh, THAT's what's missing from my understanding!  (plus a lot of other
> stuff!)
>
> I reviewed my spamassassin init.d script and saw the options in there.  A
> comment line in there directed me to /etc/default/spamassassin (specific to
> Debian).  GUESS WHAT I FOUND IN THERE?
>
> OPTIONS="-c -m 10 -a -H"
>
> GOOD GRIEF!  -m 10 and me with 512 RAM!
>
> Hope my load average will go down soon!!  Thanks especially to Jason and
> jdow and **WAIT** - I see I just got a msg from Duncan that directed me
> specifically to /etc/default/spamassassin!!!
>
> So I think the -m option could be added in the init script OR the
> /etc/default/spamassassin file, but the init script is probably overwritten
> during updates, so better to use the default file.
>
> I know Fedora Core didn't used to have the /etc/default/spamassassin file,
> so that is specific to Debian.

Um, I just look at the actual script file for the distro in use and see
where *IT* expects the options file. It works better that way.

(I know I overkilled the explanation. I hoped it would increase the
understanding better than a rote solution.)

{^_-}



SOLVED Re: Load Average Problems

Posted by John Fleming <jo...@wa9als.com>.
> > What am I missing?  - John
>
> OK, from the "spamd --help" output:
>      -m num, --max-children num         Allow maximum num children
>
> So that option is positively "a spamd thing." So how does one get that
> option into spamd? On the Mandrake test machine I have the init script
> in /etc/init.d as "spamassassin". It includes these lines:
> ====

Ahhh, THAT's what's missing from my understanding!  (plus a lot of other
stuff!)

I reviewed my spamassassin init.d script and saw the options in there.  A
comment line in there directed me to /etc/default/spamassassin (specific to
Debian).  GUESS WHAT I FOUND IN THERE?

OPTIONS="-c -m 10 -a -H"

GOOD GRIEF!  -m 10 and me with 512 RAM!

Hope my load average will go down soon!!  Thanks especially to Jason and
jdow and **WAIT** - I see I just got a msg from Duncan that directed me
specifically to /etc/default/spamassassin!!!

So I think the -m option could be added in the init script OR the
/etc/default/spamassassin file, but the init script is probably overwritten
during updates, so better to use the default file.
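
For anyone making the same change, here is a minimal sketch of lowering the limit in the defaults file. It works on a scratch copy rather than the real /etc/default/spamassassin. The helper name set_max_children and the target value 2 are only illustrative; the OPTIONS line is the one quoted above.

```shell
#!/bin/sh
# Sketch only: rewrite the "-m N" child limit in a Debian-style defaults file.
# set_max_children is a hypothetical helper, not part of SpamAssassin.

# set_max_children FILE N
# Replaces "-m <num>" (or "-m<num>") on the OPTIONS= line with "-m N".
# A backup of the original is left in FILE.bak.
set_max_children() {
    sed -i.bak -E "/^OPTIONS=/ s/-m ?[0-9]+/-m $2/" "$1"
}

# Demo on a scratch copy; on a real system FILE would be
# /etc/default/spamassassin and spamd would need a restart afterwards.
defaults_file=$(mktemp)
printf '%s\n' 'OPTIONS="-c -m 10 -a -H"' > "$defaults_file"

set_max_children "$defaults_file" 2
cat "$defaults_file"
```

After editing the real file, spamd has to be restarted through the init script before the new limit takes effect. Starting a second spamd by hand while the first is still running is most likely what produces the socket-in-use error.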

I know Fedora Core didn't used to have the /etc/default/spamassassin file,
so that is specific to Debian.

Thanks everyone!  - John



Re: Load Average Problems

Posted by jdow <jd...@earthlink.net>.
From: "John Fleming" <jo...@wa9als.com>
> From: "jdow" <jd...@earthlink.net>
> > From: "John Fleming" <jo...@wa9als.com>
> >
> > > jdow said:
> > > > On another paw I note that most family tools are not left running
> > > > 24x7. If this is his case then a large portion of his 250 messages
> > > > may be coming in right after he boots. If he is setup to spawn
> > > > too many spamds then he could experience a memory crisis.
> > >
> > > That's not it.  It's mostly a family/hobby server, but it functions
> > > "fairly professionally" - I just meant I'm not an ISP or big business
> > > with thousands of emails a day.  The server's on 24/7/365 running
> > > Apache, Mailman and other common server stuff - but all at a VERY low
> > > activity/use level.
> > >
> > > I've reviewed my local.cf, and there was some duplication.  I've
> > > removed the dupes and we'll see if that helps.
> > >
> > > I call spamd via spamc in procmail.  I've read man spamc/d - I see
> > > where to limit the spamd children when using the spamd option, but I
> > > don't see how to pass that option on when using spamc.  IOW, I don't see how
> > > to limit spamd children when using spamc.
> > >
> > > Also, my procmailrc uses a lock file when evaluating the results of
> > > spamd - I guess that doesn't limit starting another spamd before
> > > that file has been evaluated?  - John
> >
> > Um, you do not limit with spamc. You simply setup the limit in spamd when
> > you start or restart it. It is probably a good idea to play with several
> > values to see which gives you performance closest to your desired
> > performance. As soon as you get enough spamds up to trigger paging the
> > overall performance will take a serious dive. To a fairly real extent
> > a limit of two or three is probably best for single processor systems
> > modulo how much time is spent computing compared to waiting on IO for
> > any given spamd. If it is heavily compute bound 2 might be optimum.
>
> OK, I'll bare my ignorance here in hopes of enlightenment.  I'm probably
> lucky that I have SA working as well as I do.  I only have a loose
> understanding of the different roles of "spamassassin", "spamc", and
> "spamd".  I start things with /etc/init.d/spamassassin.  Then in procmail, I
> pipe the msg to spamc.  In neither of these places do I see how to pass any
> options to spamd.
>
> I've also tried:
> # spamd -m 2
> but this gets an error about the socket being in use.
>
> What am I missing?  - John

OK, from the "spamd --help" output:
     -m num, --max-children num         Allow maximum num children

So that option is positively "a spamd thing." So how does one get that
option into spamd? On the Mandrake test machine I have the init script
in /etc/init.d as "spamassassin". It includes these lines:
====
# Source spamd configuration.
if [ -f /etc/sysconfig/spamassassin ] ; then
        . /etc/sysconfig/spamassassin
else
        SPAMDOPTIONS="-d -c -m5 -Hi --user-config"
fi

[ -f /usr/bin/spamd -o -f /usr/local/bin/spamd ] || exit 0
PATH=$PATH:/usr/bin:/usr/local/bin

# See how we were called.
case "$1" in
  start)
        # Start daemon.
        gprintf "Starting spamd: "
        daemon spamd $SPAMDOPTIONS
        RETVAL=$?
        echo
        [ $RETVAL = 0 ] && touch /var/lock/subsys/spamassassin
        ;;
====
The first clause sets some options either from a file in the sysconfig
directory or a default from the /etc/init.d/spamassassin file itself.
As can be seen that default includes the option "-m5". (It appears
that spamd may not be happy with the form "-m 5"?) The next clause
makes sure the /usr/bin and /usr/local/bin directories are on the
path for spamd's execution if spamd is in either of those directories.
Otherwise it leaves the PATH variable unchanged. The final clause is
the start of the case statement on the command arguments for the
/etc/init.d/spamassassin script. In the "start" argument case spamd
gets run as a daemon with the "SPAMDOPTIONS" set in the first one of
the excerpted clauses.

So basically you want to look for the location in your version of
/etc/init.d/spamassassin that holds the "spamd" starting as a daemon
instruction. Note how the parameters are assigned to it, in this case
via SPAMDOPTIONS. Look for where those parameters are assigned and
change the "-m5" (in this case) to "-m2". (Which I think I will do
with this test machine because I note it still bogs down nastily
when a collection of Linux Kernel patch files are sent to the Linux
Kernel Mailing List. That's usually a dozen to two dozen files of
varying length that hit almost all at once. With -m5 it seems the
various spamd invocations are preempting each other to death. With
-m2 there might be fewer preemptions and better overall throughput.
At least, I'm willing to try, not that the machine cracks anything
close to a serious sweat on the load I place on it, about 1000
messages a day. At that rate it's loafing for a "2GHz" Athlon with
1G of memory even with X running.)
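
The first clause of the excerpt can be tried in isolation. This is a minimal sketch, assuming a POSIX shell. load_spamd_options is a made-up name, and the config file is a scratch file standing in for /etc/sysconfig/spamassassin. It shows that the built-in default is used only when the file is absent, so a file that exists but never sets SPAMDOPTIONS leaves spamd with no options at all.

```shell
#!/bin/sh
# load_spamd_options CONFIG_FILE
# Hypothetical helper mimicking the init script's first clause: source the
# config file if it exists, otherwise fall back to the script's default.
# The body runs in a subshell so the demo does not leak variables.
load_spamd_options() (
    unset SPAMDOPTIONS
    if [ -f "$1" ]; then
        . "$1"
    else
        SPAMDOPTIONS="-d -c -m5 -Hi --user-config"
    fi
    printf '%s\n' "${SPAMDOPTIONS-}"
)

cfg=$(mktemp)   # stand-in for /etc/sysconfig/spamassassin

# Case 1: the file sets SPAMDOPTIONS, so its value wins.
printf '%s\n' 'SPAMDOPTIONS="-d -c -m2 -Hi --user-config"' > "$cfg"
load_spamd_options "$cfg"

# Case 2: the file exists but is empty, so spamd would get no options.
: > "$cfg"
load_spamd_options "$cfg"

# Case 3: no file at all, so the built-in default applies.
rm -f "$cfg"
load_spamd_options "$cfg"
```

That is the reason for copying the whole SPAMDOPTIONS line into the sysconfig file and editing it there, rather than creating a file that carries only the one flag.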

{^_^}



RE: Load Average Problems

Posted by "Jason J. Ellingson" <ja...@ellingson.com>.
Being pretty much a new guy to the SA scene, I think I can help you
understand which does what...

SpamAssassin is the actual processing program.  When run directly as
"spamassassin" it has to load the Perl interpreter (Perl is the scripting
language it's written in), run, and then unload from memory when done.  This
is fine for many applications, but when you need to check a lot of email (like
many of us that host email accounts for customers) that translates into a
very slow process, as you have to wait for the whole load, execute, unload
cycle for every message.  You also must run "spamassassin" on the machine that
has the email to be scanned.

"spamd" is a "daemon" (the "d" in spamd) or service.  It is a copy of
"spamassassin" that is loaded ahead of time (usually during the computer's
boot up), and not unloaded.  So initially, you may have 5 copies (also
called children) of spamd running (5 copies of spamassassin) which is a
quick hit on resources, but from there on it is MUCH faster as it doesn't
ever need to unload and reload again for each message it needs to process.
It is always ready and waiting... plus it has code to allow it to talk to
another server that has the email that needs processing... which brings me
to...

"spamc" is a "client" (the "c" in spamc).  It is very small, so it loads
very quickly as all it has to do is simply pass the message that needs
processing/checking to the server that is running spamd and then wait for a
response from spamd on what it found.

Now, you don't need two computers to use spamc/spamd.  Many run it on the
same computer because it is faster than running spamassassin as it is always
ready to run (no load/unload waiting).

Recap:
======
SpamAssassin: Processing program.  It loads, processes, and unloads.
SpamD: It is SpamAssassin, but doesn't unload, so it is always ready.  It
listens for a communication from SpamC (on same or different computer).
SpamC: It passes a message to be processed to SpamD (on same or different
computer).

So what you really want to do is get SpamD running with a line like:

spamd -i 0.0.0.0 -A 192.168/16,127.0.0.1

the -i tells spamd to listen on all IPs available (in case the computer has
more than 1 IP)
the -A tells spamd to accept SpamC connections from the following IP/IP
blocks - in my case 192.168.x.x (any computer on my private network - I have
3 servers using SpamC to talk to it) and 127.0.0.1 (itself)

By default spamd in spamassassin 3.x will run 5 children (5 copies of
spamassassin)... which will require a 512MB machine.  You can add a "-m 3"
to make it have 3 children if you have only 256MB.

You call spamc with a line like:

spamc -d 192.168.0.13 <messagetotest >messageresults

the -d 192.168.0.13 tells spamc that spamd is running on the computer at
IP 192.168.0.13... the default is 127.0.0.1, so you don't need this bit if
you want it to talk to spamd running on the same PC.
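
As a rough sketch of how the spamc call might be wrapped in a script: scan_message is a hypothetical helper, not something shipped with SpamAssassin, and the test message is made up. It assumes only the -d option described above, and simply passes the message through if spamc is not installed.

```shell
#!/bin/sh
# scan_message [HOST]
# Hypothetical wrapper: reads one message on stdin and writes the result to
# stdout. HOST defaults to 127.0.0.1, the same default spamc itself uses.
scan_message() {
    if command -v spamc >/dev/null 2>&1; then
        spamc -d "${1:-127.0.0.1}"
    else
        cat    # spamc is not installed: pass the message through untouched
    fi
}

# Demo with a tiny hand-made message instead of real mail.
printf 'Subject: hello\n\nJust a test body.\n' | scan_message
```

As far as I know, spamc 3.x by default hands the message back unprocessed when it cannot reach spamd, so a wrapper like this should not lose mail while spamd is down.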

I hope this helps you and others out.

Oh, and you "wise and knowledgeable" devs and users... feel free to correct
me if I'm wrong about anything.
------------------------------------------------------------
Jason J Ellingson
Technical Consultant

615.301.1682 : nashville
612.605.1132 : minneapolis

www.ellingson.com
jason@ellingson.com

-----Original Message-----
From: John Fleming [mailto:john@wa9als.com] 
Sent: Sunday, October 31, 2004 3:21 PM
To: users@spamassassin.apache.org
Subject: Re: Load Average Problems

OK, I'll bare my ignorance here in hopes of enlightenment.  I'm probably
lucky that I have SA working as well as I do.  I only have a loose
understanding of the different roles of "spamassassin", "spamc", and
"spamd".  I start things with /etc/init.d/spamassassin.  Then in procmail, I
pipe the msg to spamc.  In neither of these places do I see how to pass any
options to spamd.

I've also tried:
# spamd -m 2
but this gets an error about the socket being in use.

What am I missing?  - John





Re: Load Average Problems

Posted by John Fleming <jo...@wa9als.com>.
----- Original Message -----
From: "jdow" <jd...@earthlink.net>
To: <us...@spamassassin.apache.org>
Sent: Saturday, October 30, 2004 10:41 PM
Subject: Re: Load Average Problems


> From: "John Fleming" <jo...@wa9als.com>
>
> > jdow said:
> > > On another paw I note that most family tools are not left running
> > > 24x7. If this is his case then a large portion of his 250 messages
> > > may be coming in right after he boots. If he is setup to spawn
> > > too many spamds then he could experience a memory crisis.
> >
> > That's not it.  It's mostly a family/hobby server, but it functions
> > "fairly professionally" - I just meant I'm not an ISP or big business
> > with thousands of emails a day.  The server's on 24/7/365 running
> > Apache, Mailman and other common server stuff - but all at a VERY low
> > activity/use level.
> >
> > I've reviewed my local.cf, and there was some duplication.  I've
> > removed the dupes and we'll see if that helps.
> >
> > I call spamd via spamc in procmail.  I've read man spamc/d - I see
> > where to limit the spamd children when using the spamd option, but I
> > don't see how to pass that option on when using spamc.  IOW, I don't see how
> > to limit spamd children when using spamc.
> >
> > Also, my procmailrc uses a lock file when evaluating the results of
> > spamd - I guess that doesn't limit starting another spamd before
> > that file has been evaluated?  - John
>
> Um, you do not limit with spamc. You simply setup the limit in spamd when
> you start or restart it. It is probably a good idea to play with several
> values to see which gives you performance closest to your desired
> performance. As soon as you get enough spamds up to trigger paging the
> overall performance will take a serious dive. To a fairly real extent
> a limit of two or three is probably best for single processor systems
> modulo how much time is spent computing compared to waiting on IO for
> any given spamd. If it is heavily compute bound 2 might be optimum.

OK, I'll bare my ignorance here in hopes of enlightenment.  I'm probably
lucky that I have SA working as well as I do.  I only have a loose
understanding of the different roles of "spamassassin", "spamc", and
"spamd".  I start things with /etc/init.d/spamassassin.  Then in procmail, I
pipe the msg to spamc.  In neither of these places do I see how to pass any
options to spamd.

I've also tried:
# spamd -m 2
but this gets an error about the socket being in use.

What am I missing?  - John




Re: Load Average Problems

Posted by jdow <jd...@earthlink.net>.
From: "John Fleming" <jo...@wa9als.com>

> jdow said:
> > On another paw I note that most family tools are not left running
> > 24x7. If this is his case then a large portion of his 250 messages
> > may be coming in right after he boots. If he is setup to spawn
> > too many spamds then he could experience a memory crisis.
>
> That's not it.  It's mostly a family/hobby server, but it functions
> "fairly professionally" - I just meant I'm not an ISP or big business
> with thousands of emails a day.  The server's on 24/7/365 running
> Apache, Mailman and other common server stuff - but all at a VERY low
> activity/use level.
>
> I've reviewed my local.cf, and there was some duplication.  I've
> removed the dupes and we'll see if that helps.
>
> I call spamd via spamc in procmail.  I've read man spamc/d - I see
> where to limit the spamd children when using the spamd option, but I
> don't see how to pass that option on when using spamc.  IOW, I don't see how
> to limit spamd children when using spamc.
>
> Also, my procmailrc uses a lock file when evaluating the results of
> spamd - I guess that doesn't limit starting another spamd before
> that file has been evaluated?  - John

Um, you do not limit with spamc. You simply setup the limit in spamd when
you start or restart it. It is probably a good idea to play with several
values to see which gives you performance closest to your desired
performance. As soon as you get enough spamds up to trigger paging the
overall performance will take a serious dive. To a fairly real extent
a limit of two or three is probably best for single processor systems
modulo how much time is spent computing compared to waiting on IO for
any given spamd. If it is heavily compute bound 2 might be optimum.
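
A back-of-the-envelope way to pick a starting value before experimenting, as a sketch only. suggest_max_children is a hypothetical helper, and the memory figures in the demo are assumptions rather than measurements. The real per-child footprint depends on the rule set and should be checked on the machine itself.

```shell
#!/bin/sh
# suggest_max_children TOTAL_MB RESERVED_MB PER_CHILD_MB CAP
# Hypothetical helper: how many spamd children fit in the RAM left over after
# RESERVED_MB is set aside, limited to CAP and never less than 1.
suggest_max_children() {
    n=$(( ($1 - $2) / $3 ))
    if [ "$n" -gt "$4" ]; then n=$4; fi
    if [ "$n" -lt 1 ]; then n=1; fi
    printf '%s\n' "$n"
}

# 512 MB box, 256 MB kept for Apache, Mailman and the OS, an assumed 60 MB
# per spamd child, and a cap of 3 for a single-processor machine.
suggest_max_children 512 256 60 3    # prints 3

# Same box if each child turned out to need 150 MB.
suggest_max_children 512 256 150 3   # prints 1
```

The cap reflects the two-or-three guideline for single-processor systems; the memory term is there to stay clear of paging. Either way the result is only a first guess to feed into -m, to be adjusted by watching the load.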

{^_^}