Posted to modperl@perl.apache.org by ed phillips <ar...@sirius.com> on 2000/10/04 20:53:28 UTC

Re: Forking in mod_perl?

Hi David,

Check out the guide at

http://perl.apache.org/guide/performance.html#Forking_and_Executing_Subprocess

The Eagle book also covers the C API subprocess details on pages 622-631.

Let us know if the guide is unclear to you, so we can improve it.

Ed


"David E. Wheeler" wrote:

> Hi All,
>
> Quick question - can I fork off a process in mod_perl? I've got a piece
> of code that needs to do a lot of processing that's unrelated to what
> shows up in the browser. So I'd like to be able to fork the processing
> off and return data to the browser, letting the forked process handle
> the extra processing at its leisure. Is this doable? Is forking a good
> idea in a mod_perl environment? Might there be another way to do it?
>
> TIA for the help!
>
> David
>
> --
> David E. Wheeler
> Software Engineer
> Salon Internet                                     ICQ:   15726394
> dw@salon.com                                       AIM:   dwTheory


Re: Forking in mod_perl?

Posted by Jim Woodgate <wo...@realtime.net>.
David E. Wheeler writes:
 > Using the cleanup phase, as Geoffrey Young suggests, might be a bit
 > nicer, but I'll have to look into how much time my processing will
 > likely take, hogging up an apache fork while it finishes.

I've wondered about this as well.  I really like the cleanup handler,
and thought that in general it would be better to tie up the httpd
process and let apache decide when a new process is needed rather than 
always forking.

For the most part I use the cleanup handlers to handle something that
takes a lot of time, but doesn't happen very often.  If I had something
that took a lot of time every time someone hit a page I still don't
think I'd fork, instead I'd pass off the information to another
process and let that process run through the data asynchronously like
a spooler...

-- 
woody@bga.com

Re: Forking in mod_perl? (benchmarking)

Posted by Bill Moseley <mo...@hank.org>.
I'm working with the SWISH search engine (www.apache.org and the guide use
it).

Until this month, SWISH could only be called via a fork/exec.  Now there's
an early C library for swish that I've built into a perl module for use
with mod_perl.

Yea! No forking!

I decided to do some quick benchmarking with ab.  I'm rotten at
benchmarking anything, so any suggestions are welcome.

My main question was this:  With the library version you first call a
routine to open the index files.  This reads in header info and gets ready
for the search.  Then you run the query, and then you call a routine to
close the index.

OR, you can open the index file, and do multiple queries without opening
and closing the index each time.  Somewhat like caching a DBI connection, I
suppose.
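
The open-once approach can be sketched in plain Perl as below. Nothing here is the SWISH library's API: open_index() is a hypothetical stand-in that only counts how often it is called, and the index path is made up. The sketch only shows that a file-scoped lexical can hold a handle across requests within one child process, much as a cached DBI connection would be held.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an expensive "open the index" call.
# It is NOT the SWISH library API; it only counts invocations.
my $open_calls = 0;
sub open_index {
    my ($path) = @_;
    $open_calls++;
    return { path => $path };
}

# One cached handle per index path. In a persistent interpreter a
# file-scoped lexical like this lives as long as the process does.
my %handle_for;

sub get_index {
    my ($path) = @_;
    $handle_for{$path} ||= open_index($path);
    return $handle_for{$path};
}

# Simulate three requests served by the same child.
get_index('/tmp/example.index') for 1 .. 3;
print "open_index was called $open_calls time(s)\n";
```

The trade-off is that each child keeps its own open handle, so a rebuilt index would need some way to invalidate the cache; that part is left out here.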

So I wanted to see how much faster it is to keep the index file open.

I decided to start Apache with only one child, so it would handle ALL the
requests.  I'm running ab on the same machine, and only doing 100 requests.

Running my mod_perl program without asking for a query I can get almost 100
requests per second.  That's just writing from memory and logging to an
open file.

Now comparing the two methods of calling SWISH I got about 7.7 requests per
second leaving the index file open between requests, and 6.5 per second
opening each time.  My guess is Linux is helping buffer the file contents
quite a bit since this machine isn't doing anything else at the time, so
there might be a wider gap if the machine was busy.

Now, here's why this post is under this subject thread:

For fun I changed over to forking Apache and exec'ing SWISH on each request,
and I got just over 6 requests per second.  I guess I would have expected
much worse, but again, I think Linux is helping out quite a bit in the fork.

And for more fun, the "same" program under mod_cgi: 0.90 requests/second





Bill Moseley
mailto:moseley@hank.org

Re: Forking in mod_perl?

Posted by "David E. Wheeler" <Da...@Wheeler.net>.
ed phillips wrote:
> 
> I hope it is clear that you don't want to fork the whole server!
> 
> Mod_cgi goes to great pains to effectively fork a subprocess, and
> was the major impetus I believe for the development of
> the C subprocess API. It  (the source code for
> mod_cgi) is a great place to learn some of the
> subtleties as the Eagle book points out. As the Eagle book
> says, Apache is a complex beast. Mod_perl gives
> you the power to use the beast to your best advantage.

Yeah, but I don't speak C. Just Perl. And it looks like the way to do it
in Perl is to call system() and then detach the called script. I was
trying to keep this all nice and tidy in modules, but I don't know if
it'll be possible.
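
For reference, detaching can also be done from inside a module with a double fork rather than system(). The following is a minimal sketch in core Perl, not the guide's code: spawn_detached() and the marker file are hypothetical names, and a real mod_perl handler would also have to deal with the inherited client connection, which this sketch does not attempt.

```perl
use strict;
use warnings;
use POSIX qw(setsid _exit);
use File::Temp qw(tempdir);

# Run $job in a detached grandchild and return right away.
# The intermediate child exits at once, so the grandchild is
# re-parented and never becomes a zombie of the caller.
sub spawn_detached {
    my ($job) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid) {
        waitpid($pid, 0);    # reap the short-lived intermediate child
        return;
    }

    # Intermediate child: new session, then fork again.
    my $sid = setsid();
    _exit(1) if !defined $sid || $sid == -1;
    my $grandchild = fork();
    _exit(1) unless defined $grandchild;
    _exit(0) if $grandchild;

    # Grandchild: drop the inherited standard handles, run the job.
    open STDIN,  '<', '/dev/null' or _exit(1);
    open STDOUT, '>', '/dev/null' or _exit(1);
    open STDERR, '>', '/dev/null' or _exit(1);
    my $ok = eval { $job->(); 1 };
    _exit($ok ? 0 : 1);      # _exit skips END blocks and destructors
}

# Usage: the job writes a marker file (hypothetical path) when done.
my $dir    = tempdir(CLEANUP => 1);
my $marker = "$dir/done";
spawn_detached(sub {
    open my $fh, '>', $marker or return;
    print {$fh} "finished\n";
    close $fh;
});
print "parent returned without waiting for the job\n";
```

Whether this is cheaper than tying up the child in the cleanup phase still depends on how long the job runs, so it is worth timing both.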

> Now you are faced with a trade off.  Is it more expensive to
> detach a subprocess, or use the child cleanup phase to do
> some extra processing? I'd have to know more specifics to answer
> that with any modicum of confidence.

I think I can probably evaluate that with a few tests.

Thanks!

David

Re: Forking in mod_perl?

Posted by "C. Jon Larsen" <jl...@richweb.com>.
I use a database table for the queue. No file locking issues, atomic
transactions, you can sort and order the jobs, etc . . . you can wrap the
entire "queue" library in a module. Plus, the background script that
processes the queue can easily run with higher permissions, and you don't
have to worry as much about setuid issues when forking from a parent
process (like your apache) running as a user with fewer privileges than
what you (may) need. You can pass all the args you need via a column in
the db, and, if passing data back and forth is a must, serialize your data
using Storable and have the queue runner thaw it back out. Very simple,
very fast, very powerful.
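
The Storable round trip mentioned above might look like the sketch below. The job fields are invented for illustration and the database side is left out; the frozen string is what would go into a binary column of the queue table.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical job arguments; any nested Perl data structure works.
my %job = (
    action => 'resize',
    files  => [ 'a.png', 'b.png' ],
    opts   => { width => 640 },
);

# Web side: serialize to a byte string suitable for a binary column.
my $frozen = freeze(\%job);

# Queue runner side: rebuild the structure from the stored bytes.
my $restored = thaw($frozen);

print "action: $restored->{action}\n";
print "files:  @{ $restored->{files} }\n";
```

If the queue runner lives on a machine with a different architecture, Storable's nfreeze() writes a portable byte order and is read back with the same thaw().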

On Wed, 4 Oct 2000, Neil Conway wrote:

> On Wed, Oct 04, 2000 at 02:42:50PM -0700, David E. Wheeler wrote:
> > Yeah, I was thinking something along these lines. Don't know if I need
> > something as complex as IPC. I was thinking of perhaps a second Apache
> > server set up just to handle long-term processing. Then the first server
> > could send a request to the second with the commands it needs to execute
> > in a header. The second server processes those commands independently of
> > the first server, which then returns data to the browser.
> 
> In a pinch, I'd just use something like a 'queue' directory. In other
> words, when your mod_perl code gets some info to process, it writes
> this into a file in a certain directory (name it with a timestamp /
> cksum to ensure the filename is unique). Every X seconds, have a
> daemon poll the directory; if it finds a file, it processes it.
> If not, it goes back to sleep for X seconds. I guess it's poor
> man's IPC. But it runs over NFS nicely, it's *very* simple, it's
> portable, and I've never needed anything more complex. You also
> don't need to fork the daemon or start up a new script for every
> processing request. But if you need to do the processing in realtime,
> waiting up to X seconds for the results might be unacceptable.
> 
> How does this sound?
> 
> HTH,
> 
> Neil
> 
> -- 
> Neil Conway <ne...@home.com>
> Get my GnuPG key from: http://klamath.dyndns.org/mykey.asc
> Encrypted mail welcomed
> 
> It is dangerous to be right when the government is wrong.
>         -- Voltaire
> 





Re: Forking in mod_perl?

Posted by Tim Bishop <ti...@activespace.com>.

On Thu, 5 Oct 2000, Sean D. Cook wrote:

> > On Wed, Oct 04, 2000 at 02:42:50PM -0700, David E. Wheeler wrote:
> > > Yeah, I was thinking something along these lines. Don't know if I need
> > > something as complex as IPC. I was thinking of perhaps a second Apache
> > > server set up just to handle long-term processing. Then the first server
> > > could send a request to the second with the commands it needs to execute
> > > in a header. The second server processes those commands independently of
> > > the first server, which then returns data to the browser.
> > 
> > In a pinch, I'd just use something like a 'queue' directory. In other
> > words, when your mod_perl code gets some info to process, it writes
> > this into a file in a certain directory (name it with a timestamp /
> > cksum to ensure the filename is unique). Every X seconds, have a
> 
> It might be safer to do this in a db rather than the file system.  That
> way there is less chance for collision and you don't have to worry about
> the file being half written when the daemon comes along and tries to read
> the file while mod_perl/apache is trying to write it.  Let the DB do the
> storage side and let the daemon do a select to gather the info.

If you don't have a db easily available, I've had good luck using temp
files.  You can avoid partially written file errors by exploiting the
atomic nature of moving  (renaming) files.  NFS does *not* have this nice
behavior, however.
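
A minimal sketch of the write-then-rename trick, assuming a local (non-NFS) filesystem. enqueue(), the ".job" suffix and the ".incoming-" prefix are illustrative choices, and the queue directory is a throwaway temp directory standing in for a real spool.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir tempfile);
use File::Spec;

# Write $data under a temporary name inside $queue_dir, then rename it
# to its final name. The temp file is created in the same directory so
# that rename() never has to cross a filesystem boundary.
sub enqueue {
    my ($queue_dir, $name, $data) = @_;

    my ($fh, $tmp) = tempfile('.incoming-XXXXXX', DIR => $queue_dir);
    print {$fh} $data or die "write failed: $!";
    close $fh or die "close failed: $!";

    my $final = File::Spec->catfile($queue_dir, "$name.job");
    rename $tmp, $final or die "rename failed: $!";
    return $final;
}

my $queue_dir = tempdir(CLEANUP => 1);    # stand-in for a real spool directory
my $path = enqueue($queue_dir, 'job-0001', "payload\n");
print "queued $path\n";
```

A reader that only picks up names ending in ".job" never sees a half-written file, because the final name appears only once the contents are complete.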

-Tim 


Re: Forking in mod_perl?

Posted by "Sean D. Cook" <sd...@edutest.com>.
> On Wed, Oct 04, 2000 at 02:42:50PM -0700, David E. Wheeler wrote:
> > Yeah, I was thinking something along these lines. Don't know if I need
> > something as complex as IPC. I was thinking of perhaps a second Apache
> > server set up just to handle long-term processing. Then the first server
> > could send a request to the second with the commands it needs to execute
> > in a header. The second server processes those commands independently of
> > the first server, which then returns data to the browser.
> 
> In a pinch, I'd just use something like a 'queue' directory. In other
> words, when your mod_perl code gets some info to process, it writes
> this into a file in a certain directory (name it with a timestamp /
> cksum to ensure the filename is unique). Every X seconds, have a

It might be safer to do this in a db rather than the file system.  That
way there is less chance for collision and you don't have to worry about
the file being half written when the daemon comes along and tries to read
the file while mod_perl/apache is trying to write it.  Let the DB do the
storage side and let the daemon do a select to gather the info.


Re: Forking in mod_perl?

Posted by Neil Conway <nc...@klamath.dyndns.org>.
On Wed, Oct 04, 2000 at 02:42:50PM -0700, David E. Wheeler wrote:
> Yeah, I was thinking something along these lines. Don't know if I need
> something as complex as IPC. I was thinking of perhaps a second Apache
> server set up just to handle long-term processing. Then the first server
> could send a request to the second with the commands it needs to execute
> in a header. The second server processes those commands independently of
> the first server, which then returns data to the browser.

In a pinch, I'd just use something like a 'queue' directory. In other
words, when your mod_perl code gets some info to process, it writes
this into a file in a certain directory (name it with a timestamp /
cksum to ensure the filename is unique). Every X seconds, have a
daemon poll the directory; if it finds a file, it processes it.
If not, it goes back to sleep for X seconds. I guess it's poor
man's IPC. But it runs over NFS nicely, it's *very* simple, it's
portable, and I've never needed anything more complex. You also
don't need to fork the daemon or start up a new script for every
processing request. But if you need to do the processing in realtime,
waiting up to X seconds for the results might be unacceptable.
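
The polling side of such a queue directory could be sketched as follows. drain_queue(), the ".job" suffix and the timestamp-like file names are assumptions made for the example, and the directory is a temp directory rather than a real spool.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Process every *.job file currently in $queue_dir in name order
# (oldest first when names begin with a timestamp) and return how
# many were handled. $handler receives each file's contents.
sub drain_queue {
    my ($queue_dir, $handler) = @_;

    opendir my $dh, $queue_dir or die "cannot open $queue_dir: $!";
    my @jobs = sort grep { /\.job\z/ } readdir $dh;
    closedir $dh;

    my $count = 0;
    for my $name (@jobs) {
        my $path = "$queue_dir/$name";
        open my $fh, '<', $path or next;    # may have vanished; skip it
        my $data = do { local $/; <$fh> };
        close $fh;
        $handler->($data);
        unlink $path or warn "could not remove $path: $!";
        $count++;
    }
    return $count;
}

# Demo: drop two job files into a scratch directory and drain them.
my $dir = tempdir(CLEANUP => 1);
for my $n (1, 2) {
    open my $out, '>', "$dir/2000100$n.job" or die "cannot write: $!";
    print {$out} "task $n\n";
    close $out;
}
my @seen;
my $handled = drain_queue($dir, sub { push @seen, $_[0] });
print "handled $handled job(s)\n";

# A daemon's main loop would wrap this, with X as $interval seconds:
#   while (1) { drain_queue($dir, \&work) or sleep $interval }
```

Sleeping only when the directory was empty keeps the daemon busy while there is a backlog and idle otherwise.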

How does this sound?

HTH,

Neil

-- 
Neil Conway <ne...@home.com>
Get my GnuPG key from: http://klamath.dyndns.org/mykey.asc
Encrypted mail welcomed

It is dangerous to be right when the government is wrong.
        -- Voltaire

Re: Forking in mod_perl?

Posted by "David E. Wheeler" <Da...@Wheeler.net>.
Billy Donahue wrote:

> > Now you are faced with a trade off.  Is it more expensive to
> > detach a subprocess, or use the child cleanup phase to do
> > some extra processing? I'd have to know more specifics to answer
> > that with any modicum of confidence.
> 
> He might try a daemon coprocess using some IPC to communicate with
> Apache, which is my favorite way to do it..

Yeah, I was thinking something along these lines. Don't know if I need
something as complex as IPC. I was thinking of perhaps a second Apache
server set up just to handle long-term processing. Then the first server
could send a request to the second with the commands it needs to execute
in a header. The second server processes those commands independently of
the first server, which then returns data to the browser.

But maybe that's overkill. I'll have to weigh the heft of the
post-request processing I need to do.

Thanks for the suggestion!

David

Re: Forking in mod_perl?

Posted by Billy Donahue <bi...@dadadada.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 4 Oct 2000, ed phillips wrote:

> Now you are faced with a trade off.  Is it more expensive to
> detach a subprocess, or use the child cleanup phase to do
> some extra processing? I'd have to know more specifics to answer
> that with any modicum of confidence.

He might try a daemon coprocess using some IPC to communicate with
Apache, which is my favorite way to do it..

- --
"The Funk, the whole Funk, and nothing but the Funk."
Linux barcode software mirror: http://dadadada.net/cuecat
Billy Donahue <ma...@dadadada.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.3 (GNU/Linux)
Comment: pgpenvelope 2.9.0 - http://pgpenvelope.sourceforge.net/

iD8DBQE525yz+2VvpwIZdF0RAjddAJ46Zxa4qHlLJuMfc1FHnS4aa7E7pwCfSFf8
MctjBHbwd8x31CAACVA98Ug=
=B/EE
-----END PGP SIGNATURE-----


Re: Forking in mod_perl?

Posted by ed phillips <ar...@sirius.com>.
I hope it is clear that you don't want to fork the whole server!

Mod_cgi goes to great pains to effectively fork a subprocess, and
was the major impetus I believe for the development of
the C subprocess API. It  (the source code for
mod_cgi) is a great place to learn some of the
subtleties as the Eagle book points out. As the Eagle book
says, Apache is a complex beast. Mod_perl gives
you the power to use the beast to your best advantage.

Now you are faced with a trade off.  Is it more expensive to
detach a subprocess, or use the child cleanup phase to do
some extra processing? I'd have to know more specifics to answer
that with any modicum of confidence.

Cheers,

Ed


"David E. Wheeler" wrote:

> ed phillips wrote:
> >
> > Hi David,
> >
> > Check out the guide at
> >
> > http://perl.apache.org/guide/performance.html#Forking_and_Executing_Subprocess
> >
> > The Eagle book also covers the C API subprocess details on pages 622-631.
> >
> > Let us know if the guide is unclear to you, so we can improve it.
>
> Yeah, it's a bit unclear. If I understand correctly, it's suggesting
> that I do a system() call and have the called perl script detach itself
> from Apache, yes? I'm not too sure I like this approach. I was hoping
> for something a little more integrated. And how much overhead are we
> talking about getting taken up by this approach?
>
> Using the cleanup phase, as Geoffrey Young suggests, might be a bit
> nicer, but I'll have to look into how much time my processing will
> likely take, hogging up an apache fork while it finishes.
>
> Either way, I'll have to think about various ways to handle this stuff,
> since I'm writing it into a regular Perl module that will then be called
> from mod_perl...
>
> Thanks,
>
> David


Re: Forking in mod_perl?

Posted by "David E. Wheeler" <Da...@Wheeler.net>.
ed phillips wrote:
> 
> Hi David,
> 
> Check out the guide at
> 
> http://perl.apache.org/guide/performance.html#Forking_and_Executing_Subprocess
> 
> The Eagle book also covers the C API subprocess details on pages 622-631.
> 
> Let us know if the guide is unclear to you, so we can improve it.

Yeah, it's a bit unclear. If I understand correctly, it's suggesting
that I do a system() call and have the called perl script detach itself
from Apache, yes? I'm not too sure I like this approach. I was hoping
for something a little more integrated. And how much overhead are we
talking about getting taken up by this approach?

Using the cleanup phase, as Geoffrey Young suggests, might be a bit
nicer, but I'll have to look into how much time my processing will
likely take, hogging up an apache fork while it finishes.

Either way, I'll have to think about various ways to handle this stuff,
since I'm writing it into a regular Perl module that will then be called
from mod_perl...

Thanks,

David