Posted to users@spamassassin.apache.org by email builder <em...@yahoo.com> on 2004/11/06 00:56:33 UTC

multiple spamd machines

Hello,

  Since I have not been able to tackle the excessive CPU usage of spamd on a
single machine (see the thread "spamd still burning CPU in 3.0.1"), I am
hoping that the people who said I am simply maxing out spamd's capabilities
are right:  I am moving to a multiple box solution.  I will be putting a
machine online that will likely do little else than run spamd.  I may or may
not continue to run a couple spamd children on the original machine, just to
help out.  I have two questions:

  1) Is it a waste of a whole machine to just run spamd?  I suppose I will be
watching its resource usage, but am wondering if it will be available to be
used as a MySQL server and/or something like NFS.

  2) I am clueless about how to serve spamd in more than one place.  How do I
tell spamc that I have more than one spamd listening for requests in more
than one place?  Also, one spamd will be running a lot more children and will
thus take more requests -- do I somehow (how???) need to load balance or will
it happen "automatically" by simply seeing that one of my machines' spamd
currently has no available children?  Uhhhh, I am lost here.  Anyone know of
any good reading I can do?  Links?  Tips?  How-to's?

Reading man spamc, I see:

-d host   In TCP/IP mode, connect to spamd server on given host (default:
          localhost).

          If host resolves to multiple addresses, then spamc will fail-over
          to the other addresses, if the first one cannot be connected to.

So do I just do this:

/usr/bin/spamc  -u  <username>  -d  127.0.0.1  123.45.6.789

Again, is load balancing going to be an issue with this configuration? 
Remember, one of these addresses will be underpowered compared to the other.

Ah, then this:  

-H  For TCP/IP sockets, randomize the IP addresses returned from a DNS
    name lookup (when more than one IP is returned). This provides for a
    kind of hostname-base load balancing.

I am not sure how to use this one, but it looks a little more like what I
want.  Can anyone offer pointers on how to implement this?


TIA!



		
 


Re: multiple spamd machines

Posted by email builder <em...@yahoo.com>.
Thanks so much for your reply.  Further thoughts/questions inline below:

> Hi,
> 
> You are darn close there... What you want is
> 
> /usr/bin/spamc  -u  <username>  -d  spa.yourdomain.com -H
> 
> And spa.yourdomain.com has two A records, one pointing to 127.0.0.1 and the
> other to 123.45.6.789
> 
> in Bind talk that would be
> 
> spa.yourdomain.com.	IN A	127.0.0.1
> spa.yourdomain.com.	IN A	123.45.6.789
> 
> and in tinydns
> 
> +spa.yourdomain.com:127.0.0.1:3600
> +spa.yourdomain.com:123.45.6.789:3600
> 
> I'm not sure you really need the -H; I know I don't, since I use dnscache
> as my local DNS server.

Why?  Does dnscache randomize for you?
 
> To do true load balancing you need a hardware or software load balancer
> running; Linux can do it if you ever need/want to get into that.

Ideally, I'd not like to have to take that up just yet.  ;)

> I haven't tested using a -d 127.0.0.1,123.45.6.789 to see if spamc will 
> fail over to the second host if all the connections are busy on the 
> first host.  I could be wrong but I don't think it will fail over to the 
> second host because the first host will just place it in the queue to be 
> processed.  I could very well be wrong though.

But using the DNS-based approach as you do, it *will* fail over?  Why?  It
seems like if spamd tries to queue up any request it gets, then it would
happen to you, too.  Spamd should not have any way of knowing if you used DNS
to resolve it or the addresses were listed on the command line, no?

Thanks!



		
 


Re: multiple spamd machines

Posted by Rick Macdougall <ri...@nougen.com>.

email builder wrote:
> Hello,
> 
>   Since I have not been able to tackle the excessive CPU usage of spamd on a
> single machine (see the thread "spamd still burning CPU in 3.0.1"), I am
> hoping that the people who said I am simply maxing out spamd's capabilities
> are right:  I am moving to a multiple box solution.  I will be putting a
> machine online that will likely do little else than run spamd.  I may or may
> not continue to run a couple spamd children on the original machine, just to
> help out.  I have two questions:
> 
>   1) Is it a waste of a whole machine to just run spamd?  I suppose I will be
> watching its resource usage, but am wondering if it will be available to be
> used as a MySQL server and/or something like NFS.
> 
>   2) I am clueless about how to serve spamd in more than one place.  How do I
> tell spamc that I have more than one spamd listening for requests in more
> than one place?  Also, one spamd will be running a lot more children and will
> thus take more requests -- do I somehow (how???) need to load balance or will
> it happen "automatically" by simply seeing that one of my machines' spamd
> currently has no available children?  Uhhhh, I am lost here.  Anyone know of
> any good reading I can do?  Links?  Tips?  How-to's?
> 
> Reading man spamc, I see:
> 
> -d host   In TCP/IP mode, connect to spamd server on given host (default:
>           localhost).
> 
>           If host resolves to multiple addresses, then spamc will fail-over
>           to the other addresses, if the first one cannot be connected to.
> 
> So do I just do this:
> 
> /usr/bin/spamc  -u  <username>  -d  127.0.0.1  123.45.6.789
> 
> Again, is load balancing going to be an issue with this configuration? 
> Remember, one of these addresses will be underpowered compared to the other.
> 
> Ah, then this:  
> 
> -H  For TCP/IP sockets, randomize the IP addresses returned from a DNS
>     name lookup (when more than one IP is returned). This provides for a
>     kind of hostname-base load balancing.
> 
> I am not sure how to use this one, but it looks a little more like what I
> want.  Can anyone offer pointers on how to implement this?

Hi,

You are darn close there... What you want is

/usr/bin/spamc  -u  <username>  -d  spa.yourdomain.com -H

And spa.yourdomain.com has two A records, one pointing to 127.0.0.1 and the
other to 123.45.6.789

in Bind talk that would be

spa.yourdomain.com.	IN A	127.0.0.1
spa.yourdomain.com.	IN A	123.45.6.789

and in tinydns

+spa.yourdomain.com:127.0.0.1:3600
+spa.yourdomain.com:123.45.6.789:3600
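
For illustration only, here is a rough Python sketch of what "-d name -H"
amounts to on the client side: look up every A record for the name, then
shuffle the list before trying the addresses. This is not spamc's actual
code. resolve_ipv4 and randomized_order are hypothetical helper names, and
spa.yourdomain.com is just the placeholder name used above. Port 783 is
spamd's default port.

```python
import random
import socket


def resolve_ipv4(hostname, port=783):
    """Return the unique IPv4 addresses the resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    seen = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in seen:
            seen.append(ip)
    return seen


def randomized_order(addresses, rng=random):
    """Return a shuffled copy of addresses, leaving the input untouched."""
    shuffled = list(addresses)
    rng.shuffle(shuffled)
    return shuffled
```

With the records above in place, randomized_order(resolve_ipv4("spa.yourdomain.com"))
would return the two addresses in a random order on each call.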

I'm not sure you really need the -H; I know I don't, since I use dnscache
as my local DNS server.

To do true load balancing you need a hardware or software load balancer
running; Linux can do it if you ever need/want to get into that.
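
To put a rough number on why randomized DNS is not "true" load balancing:
assuming -H amounts to a uniform shuffle of the returned addresses (an
assumption, not checked against the spamc source), each host ends up first
about equally often, no matter how many spamd children it runs. A small
Python simulation, with made-up host labels:

```python
import random
from collections import Counter


def first_choice_counts(addresses, requests, seed=0):
    """Count how often each address ends up first after a uniform shuffle."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(requests):
        order = list(addresses)
        rng.shuffle(order)
        counts[order[0]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical labels: one big spamd box, one underpowered one.
    total = 10_000
    counts = first_choice_counts(["big-box", "small-box"], total)
    for host, n in counts.items():
        print(f"{host}: {n / total:.1%} of requests")
```

Both hosts should come out close to half of the requests each, so the
underpowered machine is asked to do as much work as the big one.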


I haven't tested using a -d 127.0.0.1,123.45.6.789 to see if spamc will 
fail over to the second host if all the connections are busy on the 
first host.  I could be wrong but I don't think it will fail over to the 
second host because the first host will just place it in the queue to be 
processed.  I could very well be wrong though.
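
The distinction I am guessing at can be sketched in Python. Assuming
fail-over means "try the next address only when the TCP connect itself
fails" (that is how the man page text reads; it is not verified against the
spamc source), a host that is listening but busy still accepts the
connection, so the client never moves on to the next one.
connect_with_failover is a hypothetical helper, not part of SpamAssassin.

```python
import socket


def connect_with_failover(servers, timeout=5.0):
    """Try each (host, port) pair in order and return (server, socket) for
    the first one that accepts a TCP connection.

    A server whose workers are all busy normally still completes the TCP
    handshake (the kernel queues the connection in the listen backlog),
    so this loop only advances when a host is down or refusing connections.
    """
    last_error = None
    for server in servers:
        try:
            sock = socket.create_connection(server, timeout=timeout)
            return server, sock
        except OSError as exc:
            last_error = exc  # unreachable or refused: try the next server
    raise ConnectionError(f"no spamd reachable at {servers}") from last_error
```

Called with something like [("127.0.0.1", 783), ("10.0.0.2", 783)] (the
second address is a placeholder), it would return the local spamd whenever
that one is up, busy or not.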

Regards,

Rick