
Posted to users@httpd.apache.org by Nicholas Sherlock <n....@gmail.com> on 2009/07/20 15:00:09 UTC

[users@httpd] Re: High load using memcache and 9G tmpfs

Matthew Tice wrote:
> Currently we're migrating our static node cluster from 32bit OpenSuse 
> 10.3 using the disk_cache_module on a 2G tmpfs to a 64bit CentOS 5.3 
> using the disk_cache module on a 9G tmpfs.  After pushing these CentOS 
> nodes into production (and consequently adding many more requests) we 
> started seeing a load spike on these systems.  Preliminary tests have 
> shown that using a 2G (maybe 3G - still testing that one) tmpfs on the 
> same CentOS node doesn't have the same high load.  I'm not sure if this 
> is a bug with tmpfs, Apache/disk_cache, CentOS, or what.  Any insight 
> into this strange problem would be appreciated.

I had this problem on my server where the system service "mlocate" was 
scheduled to run every day. It basically scans every file on the system, 
and with the huge numbers of files generated by disk_cache, it took more 
than a day to finish one scan. So the next day, there were two running 
mlocate instances. Then three. Then no legitimate IO requests were being 
serviced and the whole server ground to a halt. The load average 
skyrocketed because of all the waiting processes. "mlocate" didn't show 
up on 'top' because it used almost no CPU time. I diagnosed the problem 
with 'iotop' - it gives per-process IO stats.

This is probably not the same problem you're having, but iotop is still 
a useful tool to identify IO competition when you can't find the culprit 
based on CPU-time.
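
As a rough complement to iotop, the "waiting processes" described above can be counted directly. On Linux, processes in uninterruptible sleep (state 'D', usually waiting on IO) count toward the load average even though they use almost no CPU. The sketch below is only an illustration, not something from the original post: it assumes a Linux /proc filesystem, and the function names parse_stat_line and blocked_by_name are hypothetical.

```python
import os
from collections import Counter


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, command name, state letter)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so locate the last ')' instead of splitting on spaces.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def blocked_by_name(proc_root="/proc"):
    """Count processes in uninterruptible sleep (state 'D'), grouped by name."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                _, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan or the line was unreadable
        if state == "D":
            counts[comm] += 1
    return counts
```

Calling blocked_by_name().most_common() on an affected host lists the command names with the most IO-blocked processes. A name that keeps appearing with a growing count is a candidate for the kind of pile-up described above, but this is a single snapshot, so it is worth sampling several times before drawing conclusions.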

Cheers,
Nicholas Sherlock

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org

[users@httpd] Re: High load using memcache and 9G tmpfs

Posted by Nicholas Sherlock <n....@gmail.com>.

Matthew Tice wrote:
> Thanks Nicholas, I'll take a look at that.  I had htcacheclean running 
> every 5 min. which could have caused a bulk of my problems.  I changed 
> the daemon to kick off every 30 min. instead which seems to have helped 
> - a little.  The machine isn't quite as sluggish but the load is still 
> hovering around 2 (5 min. average). 

If you see multiple htcacheclean instances running at the same time then 
you know it's in trouble - they'll probably be saturating the IO 
capacity for your filesystem.
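
One common way to keep a periodic job from stacking up behind its own earlier runs is to take a non-blocking exclusive lock before starting and to skip the run if the lock is already held. The following is a minimal sketch of that idea, assuming a Unix system and Python 3; it is not how htcacheclean or mlocate behave by themselves, and the lock path and the names try_lock and run_exclusively are hypothetical.

```python
import fcntl
import os
import subprocess

LOCK_PATH = "/tmp/cache-maintenance.lock"  # hypothetical path, pick your own


def try_lock(path):
    """Return an open fd holding an exclusive lock, or None if already held."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting in line.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def run_exclusively(command, lock_path=LOCK_PATH):
    """Run command unless a previous run still holds the lock.

    Returns the command's exit status, or None if the run was skipped.
    """
    fd = try_lock(lock_path)
    if fd is None:
        return None
    try:
        return subprocess.call(command)
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A scheduled wrapper would pass the maintenance command as a list to run_exclusively and treat a None result as "previous run still active, skipped". Skipping rather than queueing is the design choice here: a run that cannot start on time is dropped, so slow runs never accumulate and compete for the same IO.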

A load of 2! I dream of the days when my server had a load average of 2! 
I'm making do with a slow single-core machine at the moment, with a lot 
of very persistent site visitors, and our load average rarely drops 
below 50. :)

Cheers,
Nicholas Sherlock


Re: [users@httpd] Re: High load using memcache and 9G tmpfs

Posted by Matthew Tice <mj...@gmail.com>.

Thanks Nicholas, I'll take a look at that.  I had htcacheclean running
every 5 min. which could have caused a bulk of my problems.  I changed the
daemon to kick off every 30 min. instead which seems to have helped - a
little.  The machine isn't quite as sluggish but the load is still hovering
around 2 (5 min. average).

Matt