Posted to users@spamassassin.apache.org by Sammy Anderson <sa...@yahoo.com> on 2006/10/27 00:52:17 UTC

High CPU running SA in a VMware VM

We recently migrated our SpamAssassin installation from a physical 3.6 GHz system running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with RHEL 4 as the guest OS and SA 3.1.7.  Each user has their own Bayes files (Berkeley DB) and these were copied from the old to the new server.  Now whenever an expiry process runs on a user's database, the CPU spikes, sometimes for a minute or longer.  We did not notice spikes on the old server, but it is really hammering the VM.  Has anyone else experienced this problem?  For now I have disabled Bayes altogether because of the unacceptable load.
  
  --SA
  
 		

Re: High CPU running SA in a VMware VM

Posted by "Tim B." <mo...@optonline.net>.
Sammy Anderson wrote:
> We recently migrated our SpamAssassin installation from a physical 3.6 
> GHz system running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with 
> RHEL 4 as the guest OS and SA 3.1.7.  Each user has their own Bayes 
> files (Berkeley DB) and these were copied from the old to the new 
> server.  Now whenever an expiry process runs on a user's database, the 
> CPU spikes, sometimes for a minute or longer.  We did not notice 
> spikes on the old server, but it is really hammering the VM.  Has 
> anyone else experienced this problem?  For now I have disabled Bayes 
> altogether because of the unacceptable load.
>
> --SA
>
I'm no VMware expert, but it's been my experience that any kind of 
database should not be run in a VMware VM.




Re: High CPU running SA in a VMware VM

Posted by Sammy Anderson <sa...@yahoo.com>.
You are correct, this was a new build, with a later version of SA and migrated Bayes files.  It could very well be the case that Berkeley DB needs to be patched, or the data converted in some fashion.

I will say that in a VM environment, we tried to build gcc, and it took MUCH longer than on a physical box with the same processors.  VMware analyzed our data, and they determined that we should disable NPTL and use LinuxThreads instead (kb 1470).  This did help substantially, and though slower than the physical machine, it was acceptable.

I have tried this for SA, and it does seem to cut down the CPU required, so there is some hope.

Theo Van Dinter <fe...@apache.org> wrote:  On Fri, Oct 27, 2006 at 09:10:28PM +0000, Mark wrote:
> > I run my SMTP server entirely in a VMware VM, and have *never* seen a
> > high CPU usage on that particular machine. I run Postfix, Amavis-new
> > 2.4.3, SA 3.1.7 and quite some plug-ins.
> 
> I would run any of the "db_dump" or "db_upgrade" utils for BerkeleyDB; or
> reinstall DB_File (and make darn sure it's compiled against the correct
> BerkeleyDB libs). At any rate, I myself would probably be more inclined to
> look into a BerkeleyDB issue than a Vmware one.

Yeah, I doubt there's an issue with VMware specifically (ESX++).  My guess is
that if you're seeing different behavior between a physical host and virtual
host, there's something different in the virtual host -- different OS, libs,
perl modules, etc.

Obviously that won't be the case if you virtualized a physical machine, but I
seem to recall from the start of the thread that you migrated the data but not
the OS.

-- 
Randomly Selected Tagline:
My wife and I were happy for years.  Then we met.


 		

Re: High CPU running SA in a VMware VM

Posted by Theo Van Dinter <fe...@apache.org>.
On Fri, Oct 27, 2006 at 09:10:28PM +0000, Mark wrote:
> > I run my SMTP server entirely in a VMware VM, and have *never* seen a
> > high CPU usage on that particular machine. I run Postfix, Amavis-new
> > 2.4.3, SA 3.1.7 and quite some plug-ins.
> 
> I would run any of the "db_dump" or "db_upgrade" utils for BerkeleyDB; or
> reinstall DB_File (and make darn sure it's compiled against the correct
> BerkeleyDB libs). At any rate, I myself would probably be more inclined to
> look into a BerkeleyDB issue than a Vmware one.

Yeah, I doubt there's an issue with VMware specifically (ESX++).  My guess is
that if you're seeing different behavior between a physical host and virtual
host, there's something different in the virtual host -- different OS, libs,
perl modules, etc.

Obviously that won't be the case if you virtualized a physical machine, but I
seem to recall from the start of the thread that you migrated the data but not
the OS.

-- 
Randomly Selected Tagline:
My wife and I were happy for years.  Then we met.

Re: High CPU running SA in a VMware VM

Posted by Sammy Anderson <sa...@yahoo.com>.
Rick Macdougall <ri...@ummm-beer.com> wrote:  Sammy Anderson wrote:
>  And there is one of these for each user, this is just for one user.  Sounds like we may have to abandon Bayes or possibly use mysql. Not  sure we are ready to invest in setting that all up...
> 

Bayes in MySQL is a snap to set up and it really runs rings around the 
dbm setup in a real world situation.

I switched over two clients this morning and neither of them had MySQL 
installed.  Installed from source (php 5 requirements etc) and still had 
both installs done before lunch.

Regards,

Rick

I may look at that route then.

 

Re: High CPU running SA in a VMware VM

Posted by Rick Macdougall <ri...@ummm-beer.com>.
Sammy Anderson wrote:
> And there is one of these for each user, this is just for one  user.  Sounds like we may have to abandon Bayes or possibly use  mysql.  Not sure we are ready to invest in setting that all up...
> 

Bayes in MySQL is a snap to set up and it really runs rings around the 
dbm setup in a real world situation.

I switched over two clients this morning and neither of them had MySQL 
installed.  Installed from source (php 5 requirements etc) and still had 
both installs done before lunch.

Regards,

Rick
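
For readers wondering what the switch involves on the SpamAssassin side, here is a minimal sketch. It is written in Python only to render the handful of local.cf lines that point Bayes at a MySQL store; the helper name `bayes_sql_config`, the database name, host, user and password are all placeholders, not anything Rick described. It also assumes the Bayes tables have already been created from the schema shipped with SpamAssassin.

```python
# Hypothetical helper: renders the local.cf directives that tell
# SpamAssassin to keep Bayes data in MySQL instead of DBM files.
# All argument values below are placeholders for illustration.

def bayes_sql_config(dsn, username, password):
    """Return local.cf lines that point Bayes at a MySQL store."""
    return "\n".join([
        "bayes_store_module Mail::SpamAssassin::BayesStore::MySQL",
        f"bayes_sql_dsn {dsn}",
        f"bayes_sql_username {username}",
        f"bayes_sql_password {password}",
    ])


print(bayes_sql_config("DBI:mysql:bayes:localhost", "sa_user", "change-me"))
```

Existing per-user DBM data can usually be carried over with `sa-learn --backup` on the old store and `sa-learn --restore` on the new one, though that step was not covered in this thread.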


Re: High CPU running SA in a VMware VM

Posted by Sammy Anderson <sa...@yahoo.com>.
And there is one of these for each user; this log is just for one user.  Sounds like we may have to abandon Bayes or possibly use MySQL.  Not sure we are ready to invest in setting that all up...

Theo Van Dinter <fe...@apache.org> wrote:  On Fri, Oct 27, 2006 at 03:01:45PM -0700, Sammy Anderson wrote:
>  I manually ran sa-learn --force-expire, and it hammered the box. Here  is a debug and timing information (for just a 5 MB file!):
>   
>   [18002] dbg: bayes: token count: 161725, final goal reduction size: 49225

want to get rid of (max) 49225 tokens

>  [18002] dbg: bayes: can't use estimation method for expiry, unexpected  result, calculating optimal atime delta (first pass)

have to do step 1 and can't estimate

>   [18002] dbg: bayes: expiry max exponent: 9
>   ------ about 20 seconds elapsed

it's going through every token in your db

>   [18002] dbg: bayes: atime token reduction
>   [18002] dbg: bayes: ======== ===============
>   [18002] dbg: bayes: 43200 144256
>   [18002] dbg: bayes: 86400 133029
>   [18002] dbg: bayes: 172800 111350
>   [18002] dbg: bayes: 345600 72306
>   [18002] dbg: bayes: 691200 9457
>   [18002] dbg: bayes: 1382400 0
[...]
>   [18002] dbg: bayes: first pass decided on 691200 for atime delta

691200 wins the Price Is Right (9457 is the closest without going over)

>   ------ about 40 seconds elapsed [a sort going on here???]

It's creating a new DB file, going back through every token in the original
DB, and for any that were accessed within the last 691200 seconds (the chosen
atime delta), it copies the entry to the new DB.  The other 9457 are dropped.

>   expired old bayes database entries in 60 seconds <= YIKES

yep.  expiry is relatively resource intensive and slow w/ DBMs, but
there's no other good way to do it (or at least, no one has suggested
a really better way to do it...)

-- 
Randomly Selected Tagline:
I believe it's not butter, I just can't believe it's $1.59!


 

Re: High CPU running SA in a VMware VM

Posted by Theo Van Dinter <fe...@apache.org>.
On Fri, Oct 27, 2006 at 03:01:45PM -0700, Sammy Anderson wrote:
> I manually ran sa-learn --force-expire, and it hammered the box.   Here is a debug and timing information (for just a 5 MB file!):
>   
>   [18002] dbg: bayes: token count: 161725, final goal reduction size: 49225

want to get rid of (max) 49225 tokens

>   [18002] dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)

have to do step 1 and can't estimate

>   [18002] dbg: bayes: expiry max exponent: 9
>   ------ about 20 seconds elapsed

it's going through every token in your db

>   [18002] dbg: bayes: atime token reduction
>   [18002] dbg: bayes: ======== ===============
>   [18002] dbg: bayes: 43200 144256
>   [18002] dbg: bayes: 86400 133029
>   [18002] dbg: bayes: 172800 111350
>   [18002] dbg: bayes: 345600 72306
>   [18002] dbg: bayes: 691200 9457
>   [18002] dbg: bayes: 1382400 0
[...]
>   [18002] dbg: bayes: first pass decided on 691200 for atime delta

691200 wins the Price Is Right (9457 is the closest without going over)

>   ------ about 40 seconds elapsed [a sort going on here???]

It's creating a new DB file, going back through every token in the original
DB, and for any that were accessed within the last 691200 seconds (the chosen
atime delta), it copies the entry to the new DB.  The other 9457 are dropped.

>   expired old bayes database entries in 60 seconds <= YIKES

yep.  expiry is relatively resource intensive and slow w/ DBMs, but
there's no other good way to do it (or at least, no one has suggested
a really better way to do it...)
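
The first-pass choice described above can be sketched in a few lines. This is an illustration in Python, not SpamAssassin's Perl code: the function name `pick_atime_delta` is hypothetical, and the reduction counts are copied from the debug output quoted in this message. It assumes the rule is "largest non-zero reduction that does not exceed the goal".

```python
# Sketch of the "first pass" choice: try atime deltas of 43200 * 2**k
# seconds and keep the one that expires the most tokens without
# exceeding the goal reduction. Names are illustrative only.

EXPIRY_PERIOD = 43200  # 12 hours, the "period" shown in the debug output


def pick_atime_delta(reductions, goal):
    """Return (delta, removed) for the largest reduction that is <= goal.

    reductions maps an atime delta in seconds to the number of tokens
    that would be expired at that delta. Returns None if nothing fits.
    """
    best = None
    for delta, removed in sorted(reductions.items()):
        if 0 < removed <= goal and (best is None or removed > best[1]):
            best = (delta, removed)
    return best


# Reduction table from the debug log (exponents 0 through 9).
table = {
    EXPIRY_PERIOD * 2 ** k: count
    for k, count in enumerate(
        [144256, 133029, 111350, 72306, 9457, 0, 0, 0, 0, 0]
    )
}

print(pick_atime_delta(table, goal=49225))  # (691200, 9457)
```

Every candidate delta needs a count over the whole token set, which is consistent with the roughly 20 seconds spent before the table is printed.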

-- 
Randomly Selected Tagline:
I believe it's not butter, I just can't believe it's $1.59!

Re: High CPU running SA in a VMware VM

Posted by Sammy Anderson <sa...@yahoo.com>.
I manually ran sa-learn --force-expire, and it hammered the box.  Here is the debug and timing information (for just a 5 MB file!):
  
  [18002] dbg: bayes: tie-ing to DB file R/O /home/ian/.spamassassin/bayes_toks
  [18002] dbg: bayes: tie-ing to DB file R/O /home/ian/.spamassassin/bayes_seen
  [18002] dbg: bayes: found bayes db version 3
  [18002] dbg: bayes: DB journal sync: last sync: 1161899721
  [18002] dbg: bayes: opportunistic call found journal sync due
  [18002] dbg: bayes: bayes journal sync starting
  [18002] dbg: bayes: tie-ing to DB file R/W /home/ian/.spamassassin/bayes_toks
  [18002] dbg: bayes: tie-ing to DB file R/W /home/ian/.spamassassin/bayes_seen
  [18002] dbg: bayes: found bayes db version 3
  [18002] dbg: bayes: synced databases from journal in 0 seconds: 792 unique entries (974 total entries)
  [18002] dbg: bayes: bayes journal sync completed
  [18002] dbg: bayes: bayes journal sync starting
  [18002] dbg: bayes: bayes journal sync completed
  [18002] dbg: bayes: expiry starting
  [18002] dbg: bayes: expiry check keep size, 0.75 * max: 112500
  [18002] dbg: bayes: token count: 161725, final goal reduction size: 49225
  [18002] dbg: bayes: first pass? current: 1161986180, Last: 1161862273,  atime: 691200, count: 10015, newdelta: 140627, ratio: 4.91512730903645,  period: 43200
  [18002] dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)
  [18002] dbg: bayes: expiry max exponent: 9
  ------ about 20 seconds elapsed
  [18002] dbg: bayes: atime token reduction
  [18002] dbg: bayes: ======== ===============
  [18002] dbg: bayes: 43200 144256
  [18002] dbg: bayes: 86400 133029
  [18002] dbg: bayes: 172800 111350
  [18002] dbg: bayes: 345600 72306
  [18002] dbg: bayes: 691200 9457
  [18002] dbg: bayes: 1382400 0
  [18002] dbg: bayes: 2764800 0
  [18002] dbg: bayes: 5529600 0
  [18002] dbg: bayes: 11059200 0
  [18002] dbg: bayes: 22118400 0
  [18002] dbg: bayes: first pass decided on 691200 for atime delta
  ------ about 40 seconds elapsed [a sort going on here???]
  [18002] dbg: bayes: untie-ing
  [18002] dbg: bayes: untie-ing db_toks
  [18002] dbg: bayes: untie-ing db_seen
  [18002] dbg: bayes: files locked, now unlocking lock
  expired old bayes database entries in 60 seconds <= YIKES
  152268 entries kept, 9457 deleted
  token frequency: 1-occurrence tokens: 68.79%
  token frequency: less than 8 occurrences: 18.63%
  [18002] dbg: bayes: expiry completed
  .
  real    1m6.157s
  user    0m56.044s <= WOW!
  sys     0m2.370s
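
The figures in this log can be reproduced with a little arithmetic. The sketch below is Python, not SpamAssassin's own code. It assumes a maximum database size of 150000 tokens (implied by the `0.75 * max: 112500` line) and recomputes the goal reduction plus the `newdelta` and `ratio` values from the `first pass?` line. The 1.5 ratio cutoff and the 12-hour minimum delta are assumptions based on the expiry rules described in the sa-learn documentation, and only the case seen here (previous reduction smaller than the goal) is handled.

```python
# Recompute the figures from the debug log above. Variable names are
# illustrative; max_db_size is inferred from "0.75 * max: 112500".

max_db_size = 150000
token_count = 161725

keep_size = int(0.75 * max_db_size)
goal_reduction = token_count - keep_size
print(keep_size, goal_reduction)  # 112500 49225

# Values from the "first pass?" line: results of the previous expiry run.
last_atime_delta = 691200
last_count = 10015
period = 43200

# Estimated delta that would have removed goal_reduction tokens last time.
new_delta = last_atime_delta * last_count // goal_reduction
ratio = goal_reduction / last_count
print(new_delta, round(ratio, 4))  # 140627 4.9151

# Assumed rule: the estimate is only trusted when the goal is within 50%
# of the previous reduction and the estimated delta is at least one period.
can_estimate = ratio <= 1.5 and new_delta >= period
print(can_estimate)  # False, so the slower first pass runs
```

Under these assumptions the goal is almost five times the previous reduction, which would explain the "can't use estimation method" message and the full scan that follows.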
  
  

Anders Norrbring <li...@norrbring.se> wrote:  Sorry about top-posting, but I just caught the topic, and found it a 
bit interesting...

I run my SMTP server entirely in a VMware VM, and have *never* seen a 
high CPU usage on that particular machine.  I run Postfix, Amavis-new 
2.4.3, SA 3.1.7 and quite some plug-ins.

Bayes and quarantine are all in a MySQL database stored on another VM, 
no big load there either...
At peaks, I have a 2-4% CPU usage and 20-65% memory usage on each VM, 
all reported by Virtual Center 1.4.

So, naturally I'm curious about why there would be a high CPU load from 
using SA.... My guess is that it's something else causing it.

-- 

Anders Norrbring
Norrbring Consulting

Sammy Anderson skrev:
> I'm pretty sure it is that, because when I turn off bayes altogether, the 
> spikes go away.  I also ran sa-learn --force-expire and it PEGS the VM.  
> With bayes debugging enabled, I see lines like this in my syslog:
> 
> bayes: expired old bayes database entries in 236 seconds: 152268 entries 
> kept, 9457 deleted
> 
> We have about 140 users, each with a 5 MB bayes_toks file, so there is a 
> need to expire somebody all throughout the day.  Each user is virtual, 
> they don't really have an account on the box, but the directories 
> correspond to each user address.  And we do auto-learn, with 
> opportunistic expiry.
> 
> Good thought about --round-robin, I am willing to use a little more 
> memory if it saves on CPU.
> 
> */"Ring, John C" /* wrote:
> 
>      >From: Sammy Anderson [mailto:sammyanderson789@yahoo.com]
>      >
>      >We recently migrated our SpamAssassin installation from a physical 3.6
>     GHz system
>      >running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with RHEL 4 as
>     the guest OS
>      >and SA 3.1.7.
> 
>     I just did the same thing last week, except we're using RHEL 3 and ESX
>     2.5.2, and the physical box it used to be on was far less powerful than
>     yours.
> 
>      >Each user has their own Bayes files (Berkeley DB) and these were
>     copied
>     from the old to
>      >the new server. Now whenever an expiry process runs on a user's
>     database, the CPU
>      >spikes, sometimes for a minute or longer.
> 
>     Hmm. We're using ours as a site-wide MTA to be able to reject incoming
>     mails at SMTP time, so no user DBs on the box, but we are running with
>     Bayes checking on (Berkeley DB), autolearning off, and manual Bayes
>     feeding only a few times a day. Because of that, I don't have practice
>     with a heavy Bayes load, but how certain are you that it's Bayes hitting
>     the CPU; did you run sa-learn (or spamassassin) with network reporting
>     turned off to see if that makes a difference?
> 
>     I ask because pyzor did keep our CPU at a constant 75% until I turned it
>     off; now it varies from 25% to 75% over the day, which is a lot more
>     acceptable :)
> 
>     Another thought, albeit perhaps not directly related, is are you running
>     spamd with --round-robin? When I did that, it reduced the CPU load with
>     the trade-off of using a little more memory, which seems to be the
>     better trade-off, especially for a VM on ESX.
> 
>     -- 
>     John C. Ring, Jr.
>     jcring@switch.com
>     Network Engineer
>     Union Switch & Signal Inc.
> 
>     "If men were angels, no government would be necessary. If angels were to
>     govern men, neither external nor internal controls on government would
>     be necessary." -- James Madison
> 
> 


 

RE: High CPU running SA in a VMware VM

Posted by Mark <ad...@asarian-host.net>.

> -----Original Message-----
> From: Anders Norrbring [mailto:lists@norrbring.se]
> Sent: vrijdag 27 oktober 2006 20:58
> To: users@spamassassin.apache.org
> Subject: Re: High CPU running SA in a VMware VM


> I run my SMTP server entirely in a VMware VM, and have *never* seen a
> high CPU usage on that particular machine. I run Postfix, Amavis-new
> 2.4.3, SA 3.1.7 and quite some plug-ins.
>
> Bayes and quarantine are all in a MySQL database stored on
> another VM, no big load there either...

I concur. I've been using VMware for years as a shadow/test server for the
production FreeBSD one, and never had any such issue.
VMware rocks! :)

I would run the "db_dump" or "db_upgrade" utils for BerkeleyDB, or
reinstall DB_File (and make darn sure it's compiled against the correct
BerkeleyDB libs). At any rate, I myself would probably be more inclined to
look into a BerkeleyDB issue than a VMware one.

- Mark


Re: High CPU running SA in a VMware VM

Posted by Anders Norrbring <li...@norrbring.se>.
Sorry about top-posting, but I just caught the topic, and found it a 
bit interesting...

I run my SMTP server entirely in a VMware VM, and have *never* seen a 
high CPU usage on that particular machine.  I run Postfix, Amavis-new 
2.4.3, SA 3.1.7 and quite some plug-ins.

Bayes and quarantine are all in a MySQL database stored on another VM, 
no big load there either...
At peaks, I have a 2-4% CPU usage and 20-65% memory usage on each VM, 
all reported by Virtual Center 1.4.

So, naturally I'm curious about why there would be a high CPU load from 
using SA.... My guess is that it's something else causing it.

-- 

Anders Norrbring
Norrbring Consulting

Sammy Anderson skrev:
> I'm pretty sure it is that, because when I turn off bayes altogether, the 
> spikes go away.  I also ran sa-learn --force-expire and it PEGS the VM.  
> With bayes debugging enabled, I see lines like this in my syslog:
> 
> bayes: expired old bayes database entries in 236 seconds: 152268 entries 
> kept, 9457 deleted
> 
> We have about 140 users, each with a 5 MB bayes_toks file, so there is a 
> need to expire somebody all throughout the day.  Each user is virtual, 
> they don't really have an account on the box, but the directories 
> correspond to each user address.  And we do auto-learn, with 
> opportunistic expiry.
> 
> Good thought about --round-robin, I am willing to use a little more 
> memory if it saves on CPU.
> 
> */"Ring, John C" <jc...@switch.com>/* wrote:
> 
>      >From: Sammy Anderson [mailto:sammyanderson789@yahoo.com]
>      >
>      >We recently migrated our SpamAssassin installation from a physical 3.6
>     GHz system
>      >running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with RHEL 4 as
>     the guest OS
>      >and SA 3.1.7.
> 
>     I just did the same thing last week, except we're using RHEL 3 and ESX
>     2.5.2, and the physical box it used to be on was far less powerful than
>     yours.
> 
>      >Each user has their own Bayes files (Berkeley DB) and these were
>     copied
>     from the old to
>      >the new server. Now whenever an expiry process runs on a user's
>     database, the CPU
>      >spikes, sometimes for a minute or longer.
> 
>     Hmm. We're using ours as a site-wide MTA to be able to reject incoming
>     mails at SMTP time, so no user DBs on the box, but we are running with
>     Bayes checking on (Berkeley DB), autolearning off, and manual Bayes
>     feeding only a few times a day. Because of that, I don't have practice
>     with a heavy Bayes load, but how certain are you that it's Bayes hitting
>     the CPU; did you run sa-learn (or spamassassin) with network reporting
>     turned off to see if that makes a difference?
> 
>     I ask because pyzor did keep our CPU at a constant 75% until I turned it
>     off; now it varies from 25% to 75% over the day, which is a lot more
>     acceptable :)
> 
>     Another thought, albeit perhaps not directly related, is are you running
>     spamd with --round-robin? When I did that, it reduced the CPU load with
>     the trade-off of using a little more memory, which seems to be the
>     better trade-off, especially for a VM on ESX.
> 
>     -- 
>     John C. Ring, Jr.
>     jcring@switch.com
>     Network Engineer
>     Union Switch & Signal Inc.
> 
>     "If men were angels, no government would be necessary. If angels were to
>     govern men, neither external nor internal controls on government would
>     be necessary." -- James Madison
> 
> 

RE: High CPU running SA in a VMware VM

Posted by Sammy Anderson <sa...@yahoo.com>.
I'm pretty sure it is that, because when I turn off Bayes altogether, the spikes go away.  I also ran sa-learn --force-expire and it PEGS the VM.  With Bayes debugging enabled, I see lines like this in my syslog:

  bayes: expired old bayes database entries in 236 seconds: 152268 entries kept, 9457 deleted

We have about 140 users, each with a 5 MB bayes_toks file, so there is a need to expire somebody all throughout the day.  Each user is virtual; they don't really have an account on the box, but the directories correspond to each user address.  And we do auto-learn, with opportunistic expiry.

Good thought about --round-robin, I am willing to use a little more memory if it saves on CPU.
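
To put the "expire somebody all throughout the day" remark in numbers, here is a rough back-of-the-envelope sketch in Python. It assumes, as a worst case that this thread does not confirm, that every one of the 140 databases goes through one expiry per 12-hour period (43200 seconds), and it uses the two durations reported in the thread: 60 seconds from the manual run and 236 seconds from syslog. The function name `expiry_cpu_share` is hypothetical.

```python
# Worst-case estimate of how much of one CPU per-user expiry could eat.
# Assumption: each user's database is expired once per 12-hour period.

USERS = 140
PERIOD_SECONDS = 43200  # assumed minimum gap between expiries per database


def expiry_cpu_share(seconds_per_expiry, users=USERS, period=PERIOD_SECONDS):
    """Fraction of one CPU spent on expiry if every user expires once per period."""
    return users * seconds_per_expiry / period


for secs in (60, 236):
    print(secs, round(expiry_cpu_share(secs), 2))
# 60 0.19
# 236 0.76
```

Under that assumption, expiry alone would occupy somewhere between a fifth and three quarters of one CPU, before any actual message scanning.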

"Ring, John C" <jc...@switch.com> wrote:  >From: Sammy Anderson [mailto:sammyanderson789@yahoo.com] 
>
>We recently migrated our SpamAssassin installation from a physical 3.6
GHz system
>running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with RHEL 4 as
the guest OS
>and SA 3.1.7.

I just did the same thing last week, except we're using RHEL 3 and ESX
2.5.2, and the physical box it used to be on was far less powerful than
yours.

>Each user has their own Bayes files (Berkeley DB) and these were copied
from the old to
>the new server.  Now whenever an expiry process runs on a user's
database, the CPU
>spikes, sometimes for a minute or longer.

Hmm.  We're using ours as a site-wide MTA to be able to reject incoming
mails at SMTP time, so no user DBs on the box, but we are running with
Bayes checking on (Berkeley DB), autolearning off, and manual Bayes
feeding only a few times a day.  Because of that, I don't have practice
with a heavy Bayes load, but how certain are you that it's Bayes hitting
the CPU; did you run sa-learn (or spamassassin) with network reporting
turned off to see if that makes a difference?

I ask because pyzor did keep our CPU at a constant 75% until I turned it
off; now it varies from 25% to 75% over the day, which is a lot more
acceptable :)

Another thought, albeit perhaps not directly related, is are you running
spamd with --round-robin?  When I did that, it reduced the CPU load with
the trade-off of using a little more memory, which seems to be the
better trade-off, especially for a VM on ESX.

-- 
John C. Ring, Jr. 
jcring@switch.com 
Network Engineer
Union Switch & Signal Inc.

"If men were angels, no government would be necessary. If angels were to
govern men, neither external nor internal controls on government would
be necessary." -- James Madison


 		

RE: High CPU running SA in a VMware VM

Posted by "Ring, John C" <jc...@switch.com>.
>From: Sammy Anderson [mailto:sammyanderson789@yahoo.com] 
>
>We recently migrated our SpamAssassin installation from a physical 3.6
GHz system
>running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with RHEL 4 as
the guest OS
>and SA 3.1.7.

I just did the same thing last week, except we're using RHEL 3 and ESX
2.5.2, and the physical box it used to be on was far less powerful than
yours.

>Each user has their own Bayes files (Berkeley DB) and these were copied
from the old to
>the new server.  Now whenever an expiry process runs on a user's
database, the CPU
>spikes, sometimes for a minute or longer.

Hmm.  We're using ours as a site-wide MTA to be able to reject incoming
mails at SMTP time, so no user DBs on the box, but we are running with
Bayes checking on (Berkeley DB), autolearning off, and manual Bayes
feeding only a few times a day.  Because of that, I don't have practice
with a heavy Bayes load, but how certain are you that it's Bayes hitting
the CPU; did you run sa-learn (or spamassassin) with network reporting
turned off to see if that makes a difference?

I ask because pyzor did keep our CPU at a constant 75% until I turned it
off; now it varies from 25% to 75% over the day, which is a lot more
acceptable :)

Another thought, albeit perhaps not directly related: are you running
spamd with --round-robin?  When I did that, it reduced the CPU load with
the trade-off of using a little more memory, which seems to be the
better trade-off, especially for a VM on ESX.

-- 
John C. Ring, Jr. 
jcring@switch.com 
Network Engineer
Union Switch & Signal Inc.

"If men were angels, no government would be necessary. If angels were to
govern men, neither external nor internal controls on government would
be necessary." -- James Madison

RE: High CPU running SA in a VMware VM

Posted by Sammy Anderson <sa...@yahoo.com>.
The I/O rate is pretty low.  The files going through expiration are only about 5 MB, and it only takes one of these to drive the CPU up.  I think there are over 100,000 tokens in the file, each with a timestamp, and I believe there must be some sorting going on, so I suspect that is where the issue is.
    
    Thanks,
    Ian

"Gary W. Smith" <ga...@primeexalia.com> wrote:  What does the IO usage look like on the server?  We ran a couple of our backup SA instances on VMware, but the database is on a remote SQL server, so the only IO is logging.  We have several VM instances for a variety of things.  Did you pre-allocate the disk space?  If not, you might consider doing that first and defragging the disk.
     
     
            
---------------------------------
    
    From: Sammy Anderson  [mailto:sammyanderson789@yahoo.com] 
  Sent: Thursday, October 26, 2006  3:52 PM
  To: users@spamassassin.apache.org
  Subject: High CPU running SA in a  VMware VM
    
     
    We recently migrated our SpamAssassin installation from a physical 3.6  GHz system running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with RHEL 4  as the guest OS and SA 3.1.7.  Each user has their own Bayes files  (Berkeley DB) and these were copied from the old to the new server.  Now  whenever an expiry process runs on a user's database, the CPU spikes, sometimes  for a minute or longer.  We did not notice spikes on the old server, but  it is really hammering the VM.  Has anyone else experienced this problem?   For now I have disabled Bayes altogether because of the unacceptable load.
  
  --SA
      
        

Re: High CPU running SA in a VMware VM

Posted by d....@yournetplus.com.
On Thu, 26 Oct 2006 21:48:22 -0700
  "Gary W. Smith" <ga...@primeexalia.com> wrote:
>Did you pre-allocate the disk space? If not you
>might consider doing that first and defragging the disk.

Good point! I forgot about the disk space.

RE: High CPU running SA in a VMware VM

Posted by "Gary W. Smith" <ga...@primeexalia.com>.
What does the IO usage look like on the server?  We ran a couple of our
backup SA instances on VMware, but the database is on a remote SQL
server, so the only IO is logging.  We have several VM instances for a
variety of things.  Did you pre-allocate the disk space?  If not, you
might consider doing that first and defragging the disk.

 

 

________________________________

From: Sammy Anderson [mailto:sammyanderson789@yahoo.com] 
Sent: Thursday, October 26, 2006 3:52 PM
To: users@spamassassin.apache.org
Subject: High CPU running SA in a VMware VM

 

We recently migrated our SpamAssassin installation from a physical 3.6
GHz system running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with
RHEL 4 as the guest OS and SA 3.1.7.  Each user has their own Bayes
files (Berkeley DB) and these were copied from the old to the new
server.  Now whenever an expiry process runs on a user's database, the
CPU spikes, sometimes for a minute or longer.  We did not notice spikes
on the old server, but it is really hammering the VM.  Has anyone else
experienced this problem?  For now I have disabled Bayes altogether
because of the unacceptable load.

--SA

  



Re: High CPU running SA in a VMware VM

Posted by Sammy Anderson <sa...@yahoo.com>.
The guest has more memory than it is using, so it isn't doing any paging or swapping.
    
    As for the ESX 2.5.4 box, it isn't swapping either.  There is currently enough physical RAM for the few VMs running.

d.hill@yournetplus.com wrote:  On Thu, 26 Oct 2006 15:52:17 -0700 (PDT)
  Sammy Anderson  wrote:
>We recently migrated our SpamAssassin installation from a 
>physical 3.6  GHz system running RHEL 4 and SA 3.0.4 to a 
>VMware VM (ESX 2.5.4) with  RHEL 4 as the guest OS and SA 
>3.1.7.  Each user has their own  Bayes files (Berkeley 
>DB) and these were copied from the old to the new 
> server.  Now whenever an expiry process runs on a user's 
>database,  the CPU spikes, sometimes for a minute or 
>longer.  We did not  notice spikes on the old server, but 
>it is really hammering the  VM.  Has anyone else 
>experienced this problem?  For now I  have disabled Bayes 
>altogether because of the unacceptable load.

Perhaps memory started to spill into the swap on either 
the VM or guest OS.

I don't know what version of VMWare you are using. I'm 
using v5.2.2 running
under Windows. In the memory preferences I have mine set 
so all the virtual
machine memory has to fit into the reserved host ram. I've 
done small tests
with SA before and haven't had any problems. Then again, I 
haven't found
anything I can use to put a load on a test install. My 
test bed is on a
duo-core 3.2ghz with four gig of ram. The VM has a full 
gig of ram allocated
and is running the release version of FreeBSD 6.1.



Re: High CPU running SA in a VMware VM

Posted by d....@yournetplus.com.
On Thu, 26 Oct 2006 15:52:17 -0700 (PDT)
  Sammy Anderson <sa...@yahoo.com> wrote:
>We recently migrated our SpamAssassin installation from a 
>physical 3.6  GHz system running RHEL 4 and SA 3.0.4 to a 
>VMware VM (ESX 2.5.4) with  RHEL 4 as the guest OS and SA 
>3.1.7.  Each user has their own  Bayes files (Berkeley 
>DB) and these were copied from the old to the new 
> server.  Now whenever an expiry process runs on a user's 
>database,  the CPU spikes, sometimes for a minute or 
>longer.  We did not  notice spikes on the old server, but 
>it is really hammering the  VM.  Has anyone else 
>experienced this problem?  For now I  have disabled Bayes 
>altogether because of the unacceptable load.

Perhaps memory started to spill into swap on either the host or the
guest OS.

I don't know what version of VMware you are using. I'm using v5.2.2 running
under Windows. In the memory preferences I have mine set so all the virtual
machine memory has to fit into the reserved host RAM. I've done small tests
with SA before and haven't had any problems. Then again, I haven't found
anything I can use to put a load on a test install. My test bed is on a
dual-core 3.2 GHz with four gigs of RAM. The VM has a full gig of RAM allocated
and is running the release version of FreeBSD 6.1.