Posted to modperl@perl.apache.org by Jeff Anderson <ca...@gmail.com> on 2011/01/03 22:48:53 UTC

Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

Greetings,

I am looking to set up a mod_perl handler which keeps track of the
count of requests coming in. Each child process will store this data
in local memory and after 5-10 minutes have passed, each child process
will merge its data into a central database, the goal being that each
child will not have to hit a database for every request.

I have a handler that stores data in $self for each child, and when
a request comes through, a check is made to see if the interval has
passed; if so, the child merges its data with the database.

The problem is --- how do I additionally have each child merge its
data on a schedule -- that is, without relying only on an incoming
request to "hit" that specific child process? I have tried two
solutions with no luck. (Keep in mind that as long as requests are
coming in, the children will eventually merge their data with a good
degree of accuracy, but only if requests are coming in.)

Attempt #1 --- configure a signal handler and send a signal to each
child process. This didn't seem to work, but I am about to try some
more tests. I have read in the docs, however, that sending direct
signals to mod_perl children is not recommended.

Attempt #2 --- register a cleanup hook. This doesn't seem to work for
me because, as I understand it so far, assigning a reference to a sub via
PerlCleanupHandler is not the same as calling the object's method.
Hence, I do not have access to $self or the local memory. So the
sub is called during the cleanup phase, but it is meant to be
called as a method (and I can't use $self as a hash ref unless it is
called as a method).
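
To illustrate, here is a sketch of the kind of thing I'm after (all
package and method names here are made up): the cleanup sub would need to
close over $self so that the cleanup phase still results in a real method
call:

```perl
package My::Counter;
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestUtil ();
use Apache2::Const -compile => qw(OK);

my $self;    # per-child singleton; every child process gets its own copy

sub handler {
    my $r = shift;
    $self ||= bless { count => 0, last_merge => time }, __PACKAGE__;
    $self->{count}++;

    # Register a per-request cleanup as a closure over $self, so the
    # plain-sub cleanup phase still ends up invoking a proper method.
    $r->push_handlers( PerlCleanupHandler => sub { $self->maybe_merge } );
    return Apache2::Const::OK;
}

sub maybe_merge {
    my $self = shift;
    return unless time - $self->{last_merge} >= 300;    # 5 minutes
    # ... merge $self->{count} into the central database here ...
    $self->{count}      = 0;
    $self->{last_merge} = time;
}

1;
```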

Other considerations:

- Perhaps each child process will need to use its own SQLite or similar cache?
- Perhaps there is another hook that I do not know about that better
suits such needs?
- Perhaps my mistake is obvious -- configuring the cleanup hook incorrectly, etc.

Any information will be greatly appreciated. I hope everyone had a
Happy New Year. On a side note -- there is a storage facility in LA
called "Dollar Self Storage" ... :)


-- 
jeffa

Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

Posted by Jeff Anderson <ca...@gmail.com>.
Greetings,

First, a big thank you to everyone for these great suggestions and
corrections. The idea of sending a HUP signal to the parent process works fine
under the pre-forked server model, but does not work at all under the
threaded worker model. So signals are right out.

I probably forgot to mention that we are also dealing with multiple servers,
so a central data store will be required.

We finally decided to go with a convention -- sending as many "internal"
specialized requests as there are child processes to each server, which
seems to be working well enough. We are not interested in precision, just
pretty darned good accuracy.

Please feel free to keep this discussion open, ask questions or make further
suggestions.

Thank you all once again!
jeffa



On Thu, Jan 13, 2011 at 7:12 AM, Perrin Harkins <pe...@elem.com> wrote:

> Hi Jeff,
>
> > I am looking to set up a mod_perl handler which keeps track of the
> > count of requests coming in. Each child process will store this data
> > in local memory and after 5-10 minutes have passed, each child process
> > will merge its data into a central database, the goal being that each
> > child will not have to hit a database for every request.
>
> I agree with the people saying that memcached/Cache::FastMmap or an
> in-memory file is probably fast enough to hit on every request.  In
> general though, storing and dumping things to a db now and then is not
> a bad way to go for non-critical data.
>
> > The problem is --- how do i additionally have each child merge its
> > data on a schedule -- that is, without relying only on an incoming
> > request to "hit" that specific child process?
>
> You can't.  The nature of apache is that it responds to network
> events.  Cleanup handlers are also only going to fire after a request.
>  You could rig up a cron to hit your server regularly and if the data
> was shared between the children then whatever child picked that up
> could write it to the db, but that seems a lot harder than
> alternatives already suggested.
>
> > Attempt #2 --- register a Clean Up hook. This doesn't seem to work for
> > me because, as i understand so far, assigning a reference to a sub via
> > PerlCleanupHandler is not the same as calling the object's method.
>
> You could just store this data in a $My::Global::Counter and read it
> from anywhere.  Each child has its own variable storage, so this is
> safe.
>
> Second, you should be able to make a cleanup handler call your sub as a
> method
>
> > - Perhaps each child process will need to use its own SQLite or similar
> cache?
>
> SQLite may well be slower than your real database, so I wouldn't do
> that without testing.
>
> BTW, how are you configuring a handler to create a $self that lasts
> across multiple requests?
>
> - Perrin
>



-- 
jeffa

Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

Posted by Perrin Harkins <pe...@elem.com>.
Hi Jeff,

> I am looking to set up a mod_perl handler which keeps track of the
> count of requests coming in. Each child process will store this data
> in local memory and after 5-10 minutes have passed, each child process
> will merge its data into a central database, the goal being that each
> child will not have to hit a database for every request.

I agree with the people saying that memcached/Cache::FastMmap or an
in-memory file is probably fast enough to hit on every request.  In
general though, storing and dumping things to a db now and then is not
a bad way to go for non-critical data.

> The problem is --- how do i additionally have each child merge its
> data on a schedule -- that is, without relying only on an incoming
> request to "hit" that specific child process?

You can't.  The nature of Apache is that it responds to network
events.  Cleanup handlers are also only going to fire after a request.
You could rig up a cron job to hit your server regularly, and if the
data was shared between the children, then whatever child picked it up
could write it to the db, but that seems a lot harder than the
alternatives already suggested.

> Attempt #2 --- register a Clean Up hook. This doesn't seem to work for
> me because, as i understand so far, assigning a reference to a sub via
> PerlCleanupHandler is not the same as calling the object's method.

You could just store this data in a package global like
$My::Global::Counter and read it from anywhere.  Each child has its own
variable storage, so this is safe.

Second, you should be able to make a cleanup handler call your sub as a method.
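
A sketch of what I mean (names are made up): keep the counter in a package
global, so both the response handler and a plain cleanup sub can reach it
without needing $self:

```perl
package My::Global;
use strict;
use warnings;

our $Counter   = 0;       # one copy per child process
our $LastMerge = time;

package My::Handler;
use Apache2::Const -compile => qw(OK);

sub response {
    my $r = shift;
    $My::Global::Counter++;    # no locking needed: storage is per-child
    return Apache2::Const::OK;
}

sub cleanup {
    my $r = shift;
    # Plain sub, no $self required; flush the global to the database
    # once the interval has elapsed.
    if ( time - $My::Global::LastMerge >= 300 ) {
        # ... write $My::Global::Counter to the db, then reset ...
        $My::Global::Counter   = 0;
        $My::Global::LastMerge = time;
    }
    return Apache2::Const::OK;
}

1;
```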

> - Perhaps each child process will need to use its own SQLite or similar cache?

SQLite may well be slower than your real database, so I wouldn't do
that without testing.

BTW, how are you configuring a handler to create a $self that lasts
across multiple requests?

- Perrin

Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

Posted by Ryan Gies <ry...@livesite.net>.
On 01/03/2011 04:48 PM, Jeff Anderson wrote:
> the goal being that each child will not have to hit a database for every request.
>    
If the reason for not hitting the database on each request is that
you don't want to impact your page-response times, note that the cleanup
phase happens after the response has been sent to the client.

The demonstration below assigns a cleanup handler which sleeps for 3
seconds.  You will notice that your pages still load before the snooze
handler runs.

# ---- Source from: .../httpd.conf ----

PerlModule            Apache2::Testing
PerlCleanupHandler    Apache2::Testing->snooze_handler

# ---- Source from: .../Apache2/Testing.pm ----

package Apache2::Testing;
use strict;
use ModPerl::Util;
our $Seconds = 3;
# Configured as a method handler, so the first argument is the class name.
sub snooze_handler {
    my $phase = ModPerl::Util::current_callback();
    warn sprintf("[%d] %s: is sleeping for %d seconds.\n", $$, $phase, $Seconds);
    sleep $Seconds;
    warn sprintf("[%d] %s: is now awake.\n", $$, $phase);
}
1;


Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

Posted by Cosimo Streppone <co...@streppone.it>.
On Mon, 03 Jan 2011 22:48:53 +0100, Jeff Anderson <ca...@gmail.com>  
wrote:

> I am looking to set up a mod_perl handler which keeps track of the
> count of requests coming in. Each child process will store this data
> in local memory and after 5-10 minutes have passed, each child process
> will merge its data into a central database, the goal being that each
> child will not have to hit a database for every request.

Hi Jeff,

we usually do that with a local memcached server and
counters (the Cache::Memcached incr() method).

I'm looking into using Cache::FastMmap as an alternative.

I'm not sure about the volume of requests you have,
but using memcached, we got as far as 500 req/s
without any problem or slowdown at all.
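
A minimal sketch of what that looks like (server address and key name are
just examples):

```perl
use strict;
use warnings;
use Cache::Memcached;

my $memd = Cache::Memcached->new( { servers => ['127.0.0.1:11211'] } );

# incr() only works on a key that already exists, so seed it once;
# add() is a no-op if the key is already there.
$memd->add( 'req_count', 0 );

# In the request handler: one atomic increment per request.
$memd->incr('req_count');

# In a periodic job: read the total (and reset it if desired).
my $count = $memd->get('req_count');
```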

-- 
Cosimo

Re: Dollar Self Storage (aka mod_perl children sharing/updating data on a schedule)

Posted by Torsten Förtsch <to...@gmx.net>.
On Monday, January 03, 2011 22:48:53 Jeff Anderson wrote:
> I am looking to set up a mod_perl handler which keeps track of the
> count of requests coming in. Each child process will store this data
> in local memory and after 5-10 minutes have passed, each child process
> will merge its data into a central database, the goal being that each
> child will not have to hit a database for every request.
> 
> I have a handler that stores data in $self for each child and when
> a request comes through, a check is made to see if the interval has
> passed and if so, the child will merge its data with the database.
> 
> The problem is --- how do i additionally have each child merge its
> data on a schedule -- that is, without relying only on an incoming
> request to "hit" that specific child process? I have tried 2 attempted
> solutions with no luck. (Keep in mind that as long as requests are
> coming in, the children will eventually merge their data within a good
> degree of accuracy, but only if requests are coming in.)
> 
> Attempt #1 --- configure a signal handler and send a signal to each
> child process - this didn't seem to work but i am about to try some
> more tests. I have read in the docs, however, that sending direct
> signals to mod_perl children is not recommended.
> 
> Attempt #2 --- register a Clean Up hook. This doesn't seem to work for
> me because, as i understand so far, assigning a reference to a sub via
> PerlCleanupHandler is not the same as calling the object's method.
> Hence ... i do not have access to $self nor the local memory. So, the
> sub is called via the Clean Up phase, but the sub is meant to be
> called as a method (and I can't use $self as a hash ref unless called
> as a method).
> 
> Other considerations:
> 
> - Perhaps each child process will need to use its own SQLite or similar
> cache?
> - Perhaps there is another hook that I do not know about that
> better suits such needs?
> - Perhaps my mistake is obvious -- configuring Clean up hook incorrectly,
> etc.
> 
> Any information will be greatly appreciated. I hope everyone had a
> Happy New Year. On a side note -- there is a storage facility in LA
> called "Dollar Self Storage" ... :)

I can think of 2 solutions.

1) Perhaps the Apache scoreboard already has all the information you need.
Then you don't need any special hook. Just configure a scoreboard file on
disk. I normally do that on a tmpfs filesystem (Linux), so it really
exists only in RAM. Then write an external daemon that uses
Apache2::ScoreboardFile to read the information on a regular basis.

2) Quite similar, but if the required information is not available in the
scoreboard, you can establish your own using either File::Map or
IPC::ScoreBoard.

In both cases you have to deal with the fact that Apache starts additional
children on demand and also terminates them when the load goes down. The
Apache scoreboard contains the information necessary to track that.
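
As a rough sketch of option 2 with File::Map (the file path, slot size and
the pid-based slot assignment are all just placeholders): give each child
its own fixed-width slot in a shared mapping, so no cross-process locking
is needed because every slot has exactly one writer:

```perl
use strict;
use warnings;
use File::Map qw(map_file);

my $slots     = 256;    # should be >= MaxClients
my $slot_size = 8;      # one 64-bit counter per child

# The backing file must already exist at full size ($slots * $slot_size)
# before it is mapped, e.g. created once at server startup.
map_file my $board, '/dev/shm/req_counts', '+<';

my $slot = $$ % $slots;    # naive slot assignment by process id

sub bump {
    my $off   = $slot * $slot_size;
    my $count = unpack 'Q', substr $board, $off, $slot_size;
    # A same-length substr assignment writes straight into the mapping.
    substr( $board, $off, $slot_size ) = pack 'Q', $count + 1;
}

# An external daemon can map the same file read-only and sum the slots.
```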

Torsten Förtsch

-- 
Need professional modperl support? Hire me! (http://foertsch.name)

Like fantasy? http://kabatinte.net