Posted to modperl@perl.apache.org by Morbus Iff <mo...@disobey.com> on 2001/05/09 21:51:33 UTC

mod_perl and 700k files...

Hey there, wondering if anyone could help me with this.

I'm relatively new to mod_perl... I've got a 700k file that is loaded each 
time I run a CGI script, so I'm hoping to cache the file using mod_perl 
somehow. The file will change occasionally (maybe once a week) - the reload 
of a few seconds isn't worrisome, but it has to be done without restarting 
the server.

Any suggestions on the best way to do this? I'm going to:

  - PerlSetupEnv Off
  - PerlModule and PerlRequire
  - Remove buffering.
  - Cache from XML::Simple **

** The 700k file is an XML file, read in by XML::Simple. XML::Simple can 
cache that file into memory. Is this how I should do it? Or should I load 
the file from my startup.pl script so that the file is shared amongst all 
the apache children? If that's the case, how would I dynamically reload it?



Morbus Iff
.sig on other machine.
http://www.disobey.com/
http://www.gamegrene.com/


Re: mod_perl and 700k files...

Posted by Mike Miller <mm...@crusoe.net>.
On Wednesday, May 09, 2001, Morbus Iff wrote the
following about "mod_perl and 700k files..."

MI>  >Keep in mind, if you load this data during startup (in the parent) it will
MI>  >be shared, but reloading it later will make a separate copy in each child,
MI>  >chewing up a large amount of memory.  You might have better luck using dbm

MI> That is something I was hoping I wouldn't hear ;) ... Even reloading the 
MI> file into the same variable in my startup.pl wouldn't cause the parent to 
MI> share it with new children?

Has anyone suggested (I am jumping in late) a shared memory segment a
la IPC::ShareLite, etc.?
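For the archives, here's a minimal sketch of the IPC::ShareLite idea (untested; the key is arbitrary and the Storable serialization is my own choice, not part of ShareLite itself):

```perl
use strict;
use IPC::ShareLite;
use Storable qw(nfreeze thaw);

# One shared memory segment holds the frozen data structure.
my $share = IPC::ShareLite->new(
    -key     => 1971,      # arbitrary example key
    -create  => 'yes',
    -destroy => 'no',
) or die "IPC::ShareLite: $!";

# Whoever reloads the file stores the frozen structure once:
#   $share->store( nfreeze($data) );

# Each child fetches and thaws the current copy on demand:
#   my $data = thaw( $share->fetch );
```

Note that thawing still builds a full private copy of the structure in the fetching process, which is the per-child memory caveat raised elsewhere in the thread.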

Best Regards,

Mike Miller
mmiller@crusoe.net


Re: mod_perl and 700k files...

Posted by Perrin Harkins <pe...@elem.com>.
on 5/9/01 5:45 PM, Morbus Iff at morbus@disobey.com wrote:
>> Keep in mind, if you load this data during startup (in the parent) it will
>> be shared, but reloading it later will make a separate copy in each child,
>> chewing up a large amount of memory.  You might have better luck using dbm
> 
> That is something I was hoping I wouldn't hear ;) ... Even reloading the
> file into the same variable in my startup.pl wouldn't cause the parent to
> share it with new children?

No, that won't work.  You could try one of the IPC:: modules like
IPC::ShareLite or IPC::MM, but I think you'll still end up with a scalar that
takes up more than 700K in each child.  If you can't live with that, you
might try breaking up the file more so that you can access it in smaller
chunks.
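To sketch the dbm idea concretely (untested; the file path and key are invented for illustration): flatten the parsed XML into key/value records once, offline, then let each child tie the dbm and pull out only the record it needs.

```perl
use strict;
use Fcntl;
use SDBM_File;   # ships with perl; DB_File works the same way

my $dbfile = "/tmp/bigdata";   # hypothetical location

# Offline step: write each record under its own key.
tie my %db, 'SDBM_File', $dbfile, O_RDWR|O_CREAT, 0644
    or die "tie: $!";
$db{'widget42'} = 'value for widget 42';
untie %db;

# In the request handler: tie read-only and fetch just the key you
# need, instead of holding the whole 700k structure in every child.
tie my %data, 'SDBM_File', $dbfile, O_RDONLY, 0644
    or die "tie: $!";
print $data{'widget42'}, "\n";
untie %data;
```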

- Perrin


Re: mod_perl and 700k files...

Posted by Dave Hodgkinson <da...@hodgkinson.org>.
Morbus Iff <mo...@disobey.com> writes:

> Well, it's a big deal if you've no "in" with the place you're webhosting
> with, sure... No one wants to be told that some lowly customer wants to
> restart the server that's running 200 other vhosts...
> 
> Granted, I work at the damn webhost, but it's gotten to the point where all
> the gizmos and crap I've added have slowly been tipping things toward "stop
> touching the server, you feature freak" territory - I'd rather not stress
> the relationship.

Sounds like you need to start running multiple Apaches. Get to know
the architecture sections of the guide, get familiar with mod_rewrite
and start dividing up your servers.


-- 
Dave Hodgkinson,                             http://www.hodgkinson.org
Editor-in-chief, The Highway Star           http://www.deep-purple.com
	  Interim CTO, web server farms, technical strategy
	       ----------------------------------------

Re: mod_perl and 700k files...

Posted by Morbus Iff <mo...@disobey.com>.
>> Ultimately, I'm looking for something I can do totally from within Perl.
>
>Couldn't you create a Perl script to run as a cron job that could stat
>the file off-line for you and HUP the server when it has changed?
>That would seem easy enough.  You'd just have to work out the perms on
>the cron user to make sure it could affect httpd.  Restarting Apache
>isn't the end of the world, is it?

Well, it's a big deal if you've no "in" with the place you're webhosting
with, sure... No one wants to be told that some lowly customer wants to
restart the server that's running 200 other vhosts...

Granted, I work at the damn webhost, but it's gotten to the point where all
the gizmos and crap I've added have slowly been tipping things toward "stop
touching the server, you feature freak" territory - I'd rather not stress
the relationship.

-- 
      ICQ: 2927491      /      AOL: akaMorbus
   Yahoo: morbus_iff    /  Jabber: morbus@jabber.org
   morbus@disobey.com   /   http://www.disobey.com/

Re: mod_perl and 700k files...

Posted by "Ken Y. Clark" <kc...@logsoft.com>.
On Wed, 9 May 2001, Morbus Iff wrote:

> Date: Wed, 09 May 2001 17:45:03 -0400
> From: Morbus Iff <mo...@disobey.com>
> To: Perrin Harkins <pe...@elem.com>
> Cc: modperl@apache.org
> Subject: Re: mod_perl and 700k files...
>
> Ok. Thanks for the replies everybody. Collectively, I'm looking for a
> solution that DOES NOT require an Apache restart, or one that requires me
> to use a kill/killall command. I'm not in front of the server 100%, and I
> won't have access to telnet/ssh in to issue commands.
>
> Ultimately, I'm looking for something I can do totally from within Perl.

Couldn't you create a Perl script to run as a cron job that could stat
the file off-line for you and HUP the server when it has changed?
That would seem easy enough.  You'd just have to work out the perms on
the cron user to make sure it could affect httpd.  Restarting Apache
isn't the end of the world, is it?

ky


Re: mod_perl and 700k files...

Posted by Morbus Iff <mo...@disobey.com>.
Ok. Thanks for the replies everybody. Collectively, I'm looking for a 
solution that DOES NOT require an Apache restart, or one that requires me 
to use a kill/killall command. I'm not in front of the server 100%, and I 
won't have access to telnet/ssh in to issue commands.

Ultimately, I'm looking for something I can do totally from within Perl.

 >You might have better luck just having your app check -M against the file
 >and reload as needed.  If you don't want to take the hit on every request,
 >you can just use a counter or a "last checked" time kept in a global, and
 >check every 10 minutes or 100 requests or whatever.

 From various other posts, I'm understanding that I can do what I want by 
defining a subroutine in my startup.pl that would reload the file, and then 
I could call that subroutine whenever I determine the file is out of date 
(either through a cache, request, or -M system)...

 >Keep in mind, if you load this data during startup (in the parent) it will
 >be shared, but reloading it later will make a separate copy in each child,
 >chewing up a large amount of memory.  You might have better luck using dbm

That is something I was hoping I wouldn't hear ;) ... Even reloading the 
file into the same variable in my startup.pl wouldn't cause the parent to 
share it with new children?


Morbus Iff
.sig on other machine.
http://www.disobey.com/
http://www.gamegrene.com/


Re: mod_perl and 700k files...

Posted by Tim Tompkins <ti...@arttoday.com>.
kill -s USR2 {process id}

see: man kill
see also: man killall

This is just one way to handle it.  It, of course, requires that you
manually send the signal to the process when your data file has changed.
Using this method eliminates the need to stat() the datafile with each
request to see if it has changed.  You surely will not want to base your
check on -M, since -M is measured against the script start time ($^T) and
will drift over time under mod_perl.  Instead, you'd want to base it on
(stat)[9].  Remember though, this will have the overhead of stat()ing the
file with each FETCH(), but at least it won't have the overhead of loading
and parsing the file with each request.

For example:

package My::AutoData;

use strict;
use vars qw($MOD_TIME $DATA_FILE %DATA);

$DATA_FILE = "/path/to/data/file.dat";
$MOD_TIME  = (stat($DATA_FILE))[9];

sub FETCH {
      load_data() if !%DATA or $MOD_TIME != (stat($DATA_FILE))[9];

      # continue with code to return requested data

}

sub load_data {
    # insert code to parse data file and assign to global %DATA
    $MOD_TIME = (stat($DATA_FILE))[9];  # remember the mtime we just loaded
}

1;
__END__



I am assuming this would work, I've not tested it.


Thanks,

Tim Tompkins
----------------------------------------------
Staff Engineer / Programmer
http://www.arttoday.com/
----------------------------------------------


----- Original Message -----
From: "Morbus Iff" <mo...@disobey.com>
To: "G.W. Haywood" <ge...@www.jubileegroup.co.uk>
Cc: <mo...@apache.org>
Sent: Wednesday, May 09, 2001 2:14 PM
Subject: Re: mod_perl and 700k files...


> >> Ahhh. Ok. What's this $SIG{'USR2'} thingy. What's that do?
>  >
>  >http://perl.apache.org/guide
>
> Well, that's all fine and dandy, and I've gone through there before, but
> the only thing the search engine brings up concerning USR2 is:
>
>    >The above code assigns a signal handler for the USR2 signal.
>    >This signal has been chosen because it's least likely to be
>    >used by the other parts of the server.
>
> That, unfortunately, doesn't tell me what causes a USR2 signal to be sent
> to Apache. Or when it's caused. I only want to reload the file when said file
> has changed. Am I supposed to do some checking against the file -M time
> myself, and then send a USR2 signal myself?
>
>
> Morbus Iff
> .sig on other machine.
> http://www.disobey.com/
> http://www.gamegrene.com/
>
>


Re: mod_perl and 700k files...

Posted by ___cliff rayman___ <cl...@genwax.com>.
Morbus Iff wrote:

>  >> Ahhh. Ok. What's this $SIG{'USR2'} thingy. What's that do?
>
> has changed. Am I supposed to do some checking against the file -M time
> myself, and then send a USR2 signal myself?
>

yes.  this method assumes that the administrator of apache has made
a change to a file and now wants to reload it.  this is how some unix
services work such as named and inetd.  you modify the config
files and send a kill -HUP to the daemon.  in this case you would do:
kill -USR2 `cat /path/to/apache/logs/httpd.pid`
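since the original poster wanted to stay inside perl: the same signal can be
sent with perl's built-in kill (a sketch; the pid file path follows the usual
apache layout and may differ on your box):

```perl
use strict;

# Send USR2 from perl instead of the shell: read apache's pid file
# and use perl's built-in kill().
sub send_usr2 {
    my $pidfile = shift;   # e.g. /path/to/apache/logs/httpd.pid
    open my $fh, '<', $pidfile or die "can't read $pidfile: $!";
    chomp( my $pid = <$fh> );
    close $fh;
    kill( 'USR2', $pid ) == 1 or die "couldn't signal $pid: $!";
    return $pid;
}
```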

--
___cliff rayman___cliff@genwax.com___http://www.genwax.com/



Re: mod_perl and 700k files...

Posted by "Ken Y. Clark" <kc...@logsoft.com>.
On Wed, 9 May 2001, Morbus Iff wrote:

> Date: Wed, 09 May 2001 17:14:10 -0400
> From: Morbus Iff <mo...@disobey.com>
> To: G.W. Haywood <ge...@www.jubileegroup.co.uk>
> Cc: modperl@apache.org
> Subject: Re: mod_perl and 700k files...
>
>  >> Ahhh. Ok. What's this $SIG{'USR2'} thingy. What's that do?
>  >
>  >http://perl.apache.org/guide
>
> Well, that's all fine and dandy, and I've gone through there before, but
> the only thing the search engine brings up concerning USR2 is:
>
>    >The above code assigns a signal handler for the USR2 signal.
>    >This signal has been chosen because it's least likely to be
>    >used by the other parts of the server.
>
> That, unfortunately doesn't tell me what causes a USR2 signal to be sent to
> Apache. Or when it's caused. I only want to reload the file when said file
> has changed. Am I supposed to do some checking against the file -M time
> myself, and then send a USR2 signal myself?
>
>
> Morbus Iff
> .sig on other machine.
> http://www.disobey.com/
> http://www.gamegrene.com/
>

Well, here's an idea that I've used before, though this will only
cache the data in each child and not when the parent starts up.
Still...  Here's a pseudo-code version of my idea:

package Foo;

use strict;
use Apache::Constants qw(OK);   # for the OK return value

use vars qw[ %CACHE ];
%CACHE = ( mtime => 0 );   # so the first mtime comparison is defined

use constant EXPIRES  => 60*5; # five minutes

sub handler {
    my $r = shift;

    my $expires = $CACHE{'expires'} || 0;

    if ( $expires < time ) {
        #
        # stat the file, figure out when it was changed
        #
        my $mtime = ...

        if ( $mtime > $CACHE{'mtime'} ) {
            #
            # get your new data here
            #
            $CACHE{'data'}  = ...
            $CACHE{'mtime'} = $mtime;
        }

        #
        # reset CACHE's expire time
        #
        $CACHE{'expires'} = time + EXPIRES;
    }

    $r->print( $CACHE{'data'} );

    return OK;
}

You see that my idea is to eliminate stat'ing the file on every
request by caching data for at least 'x' amount of time.  Then when
the data expires, stat the file and reload only if it has changed.
Again, this means each child maintains its own cache, but it might not
be that bad (?).

ky


Re: mod_perl and 700k files...

Posted by Perrin Harkins <pe...@elem.com>.
on 5/9/01 5:14 PM, Morbus Iff at morbus@disobey.com wrote:
> That, unfortunately doesn't tell me what causes a USR2 signal to be sent to
> Apache.

You can use the kill command to send a USR2 signal.

> Or when it's caused.

When you send it.

> I only want to reload the file when said file
> has changed. Am I supposed to do some checking against the file -M time
> myself, and then send a USR2 signal myself?

You might have better luck just having your app check -M against the file
and reload as needed.  If you don't want to take the hit on every request,
you can just use a counter or a "last checked" time kept in a global, and
check every 10 minutes or 100 requests or whatever.
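A rough sketch of that throttle (untested; the sub and variable names are my own, and the loader is a trivial stand-in for the real XML parse):

```perl
use strict;
use constant CHECK_EVERY => 600;   # re-stat at most once per ten minutes

my %cache;   # per-file state: last check time, last seen mtime, data

sub get_data {
    my $file = shift;
    my $c = $cache{$file} ||= { checked => 0, mtime => 0, data => undef };

    # Only stat the file once per CHECK_EVERY seconds...
    if (time - $c->{checked} >= CHECK_EVERY) {
        $c->{checked} = time;
        my $mtime = (stat $file)[9] || 0;

        # ...and only reload when the mtime has actually moved.
        if ($mtime != $c->{mtime}) {
            $c->{mtime} = $mtime;
            $c->{data}  = load_file($file);
        }
    }
    return $c->{data};
}

sub load_file {    # stand-in for the real parse (XML::Simple etc.)
    my $file = shift;
    open my $fh, '<', $file or return undef;
    local $/;
    return scalar <$fh>;
}
```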

Keep in mind, if you load this data during startup (in the parent) it will
be shared, but reloading it later will make a separate copy in each child,
chewing up a large amount of memory.  You might have better luck using dbm
files or something similar that doesn't need to keep the whole thing in
memory at once.

- Perrin


RE: mod_perl and 700k files...

Posted by Rob Bloodgood <ro...@empire2.com>.
> That, unfortunately, doesn't tell me what causes a USR2 signal to be
> sent to Apache. Or when it's caused. I only want to reload the file
> when said file has changed. Am I supposed to do some checking against
> the file -M time myself, and then send a USR2 signal myself?

USR2 only fires when you do it yourself, e.g.
kill -USR2 `cat /var/run/httpd.pid`
or under linux
killall -USR2 httpd

However, if you want to change based on mod time, then one way to do it
would be as follows (THIS CODE IS UNTESTED!!!).

in the handler/CGI that USES the 700k doc:

my $big_doc = My::get_big_doc();

and in startup.pl you can say:

package My;

my $big_doc = undef;
my $mod_time = 0;

my $big_file = '/path/to/big/file';

sub get_big_doc {

    # reuse the cached copy if the file hasn't changed since we read
    # it; compare raw mtimes rather than -M, which drifts over time
    if (defined $big_doc and (stat $big_file)[9] <= $mod_time) {
        return $big_doc;
    } # implicit else

    ($big_doc, $mod_time) = some_complex_operation();

    return $big_doc;

}

sub some_complex_operation {

	# read in $big_doc and set $mod_time = (stat $big_file)[9]

	return ($big_doc, $mod_time);
}

HTH!

L8r,
Rob

#!/usr/bin/perl -w
use Disclaimer qw/:standard/;



Re: mod_perl and 700k files...

Posted by Morbus Iff <mo...@disobey.com>.
 >> I hope ya understand.
 >
 >Well, I hope we've all got that off our chests.

I'm really hoping so - I *hate* this sort of stuff.

 >Now, have you got enough to get you going OK?

I'm thinking I do, yes. Thanks for asking.


Morbus Iff
.sig on other machine.
http://www.disobey.com/
http://www.gamegrene.com/


Re: mod_perl and 700k files...

Posted by "G.W. Haywood" <ge...@www.jubileegroup.co.uk>.
Hi there,

On Thu, 10 May 2001, Morbus Iff wrote:

> I hope ya understand.

Well, I hope we've all got that off our chests.

Now, have you got enough to get you going OK?

73,
Ged.



Re: mod_perl and 700k files...

Posted by Morbus Iff <mo...@disobey.com>.
 >> Sigh. What the frel is your problem, binky?
 >
 >Stas' problem, which apparently your researches have not discovered,
 >is that he WROTE the guide and when somebody starts spamming fifteen
 >hundred mailboxes because he didn't read it he's understandably a
 >little irritated.

Oh, no, don't get me wrong. I know he wrote the manual. I've even been to 
his website and seen his credentials (unknowingly enough, I've had his perl 
pages bookmarked for quite a while). I totally understand the whole RTFM 
thing. I did RTFM - how else would I have known all the other things to try 
in my original message (reproduced below)?:

 > - PerlSetupEnv Off
 > - PerlModule and PerlRequire
 > - Remove buffering.
 > - Cache from XML::Simple **

It wouldn't have been that big of a deal if he didn't point me to the exact 
same page that I had quoted from in my return email. And now, all of a 
sudden, I'm "spamming"?

 >Please try to understand that a lot of people have put a lot of work
 >into the stuff you are using for free.  Try to remain polite.

I'm not the one who initially accused me of not reading the manual and then 
trying to pull a fast one over on everyone else ("You didn't search the 
guide, even if you try to make everyone believe that you did").

 >We're doing our best to help.  Some of us are very busy.  Stas is very busy.

You don't think I'm busy? I code large projects, run an ISP/webhost, and 
maintain a content site called Disobey.com which gets a million hits a 
month, and I've been doing it for the past four years. Disobey brought the 
world Ghost Sites and NetSlaves, and gets press 
(disobey.com/about/presspit.shtml) frequently enough that it's becoming 
flippin' annoying. But I digress.

I know what it's like to be busy. I know what it's like to say "Hey! RTFM!". 
But I don't like being sent to the same page that I quoted from, with 
solutions outside of perl (which I had already said in another reply was 
not what I was looking for), and then to be given an elitism attitude at 
the same time.

If the list feels it best that I unsubscribe, then by all means, say so 
offlist. For the record, however, I'm not the clueless newbie, I did RTFM 
(although, as admitted in another post, skipped the explanation of 
triggering the USR2 signal due to it being related to 'kill' and not a perl 
solution), and was only impolite after the golden rule came into effect.

I hope ya understand.

Morbus Iff
.sig on other machine.
http://www.disobey.com/
http://www.gamegrene.com/


Re: mod_perl and 700k files...

Posted by "G.W. Haywood" <ge...@www.jubileegroup.co.uk>.
Hi there,

On Thu, 10 May 2001, Morbus Iff wrote:

> >You didn't search the guide, even if you try to make everyone believe that
> 
> Sigh. What the frel is your problem, binky?

Stas' problem, which apparently your researches have not discovered,
is that he WROTE the guide and when somebody starts spamming fifteen
hundred mailboxes because he didn't read it he's understandably a
little irritated.

Please try to understand that a lot of people have put a lot of work
into the stuff you are using for free.  Try to remain polite.

We're doing our best to help.  Some of us are very busy.  Stas is very busy.
Please respect that.

73,
Ged.


Re: mod_perl and 700k files...

Posted by Morbus Iff <mo...@disobey.com>.
>>    >The above code assigns a signal handler for the USR2 signal.
>>    >This signal has been chosen because it's least likely to be
>>    >used by the other parts of the server.
>>
>> That, unfortunately doesn't tell me what causes a USR2 signal to be sent to
>> Apache. Or when it's caused. I only want to reload the file when said file
>> has changed. Am I supposed to do some checking against the file -M time
>> myself, and then send a USR2 signal myself?
>
>You didn't search the guide, even if you try to make everyone believe that

Sigh. What the frel is your problem, binky?

>talking about using USR2, not the caching technique). The first hit in the
>http://thingy.kcilink.com/modperlguide/debug/Using_the_Perl_Trace.html

Actually, no, not really. That's the same exact page I got when I "didn't"
search the search engine (thus pulling the wool over your elite mod_perl
eyes) before. Strangely enough, the same sentence I quoted in my email is,
golly gosh darn, the second sentence on that page.

And my comments to that page still apply. I will agree, in some sense,
however, that my comments about not "tell[ing] me what causes a USR2
signal" are incorrect. I was looking for a solution in all perl, so when I
saw the kill command, I just stopped reading, since it wasn't what I'm
looking for.

>http://perl.apache.org/guide/porting.html#Configuration_Files_Writing_Dy
>http://perl.apache.org/guide/porting.html#Using_Apache_Reload

The first was more helpful than the second (ignorantly speaking). The thing
that worries me the most is about the 700k file being passed around to all
the children during a ::Reload or other solution. More specifically, is
there any way to reload a file into the parent, which would automatically be
shared with existing children, and all newly spawned children? Without
restarting the server?

Thanks for the help...

-- 
      ICQ: 2927491      /      AOL: akaMorbus
   Yahoo: morbus_iff    /  Jabber: morbus@jabber.org
   morbus@disobey.com   /   http://www.disobey.com/

Re: mod_perl and 700k files...

Posted by Stas Bekman <st...@stason.org>.
On Wed, 9 May 2001, Morbus Iff wrote:

>  >> Ahhh. Ok. What's this $SIG{'USR2'} thingy. What's that do?
>  >
>  >http://perl.apache.org/guide
>
> Well, that's all fine and dandy, and I've gone through there before, but
> the only thing the search engine brings up concerning USR2 is:
>
>    >The above code assigns a signal handler for the USR2 signal.
>    >This signal has been chosen because it's least likely to be
>    >used by the other parts of the server.
>
> That, unfortunately doesn't tell me what causes a USR2 signal to be sent to
> Apache. Or when it's caused. I only want to reload the file when said file
> has changed. Am I supposed to do some checking against the file -M time
> myself, and then send a USR2 signal myself?

You didn't search the guide, even if you try to make everyone believe that
you did, and you've started a thread that was documented a long time ago (I'm
talking about using USR2, not the caching technique). The first hit in the
search gives:
http://thingy.kcilink.com/modperlguide/debug/Using_the_Perl_Trace.html
which is exactly what you need.

Your question about reloading something from within the code is answered
here:
http://perl.apache.org/guide/porting.html#Configuration_Files_Writing_Dy
and here:
http://perl.apache.org/guide/porting.html#Using_Apache_Reload

Hope this helps.

_____________________________________________________________________
Stas Bekman              JAm_pH     --   Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide  http://perl.apache.org/guide
mailto:stas@stason.org   http://apachetoday.com http://eXtropia.com/
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/



Re: mod_perl and 700k files...

Posted by "G.W. Haywood" <ge...@www.jubileegroup.co.uk>.
Hi again,

On Wed, 9 May 2001, Morbus Iff wrote:

>  >> Ahhh. Ok. What's this $SIG{'USR2'} thingy. What's that do?
>  >
>  >http://perl.apache.org/guide
> 
> Well, that's all fine and dandy, and I've gone through there before, but 
> the only thing the search engine brings up concerning USR2 is:
> 
>    >The above code assigns a signal handler for the USR2 signal.
>    >This signal has been chosen because it's least likely to be
>    >used by the other parts of the server.

Look again.

Get the source, go to the directory which contains all the .pod
files and type

grep USR2 * | less -S

or something similar.

(Or you could just read it. :)

73,
Ged.



Re: mod_perl and 700k files...

Posted by Morbus Iff <mo...@disobey.com>.
 >> Ahhh. Ok. What's this $SIG{'USR2'} thingy. What's that do?
 >
 >http://perl.apache.org/guide

Well, that's all fine and dandy, and I've gone through there before, but 
the only thing the search engine brings up concerning USR2 is:

   >The above code assigns a signal handler for the USR2 signal.
   >This signal has been chosen because it's least likely to be
   >used by the other parts of the server.

That, unfortunately, doesn't tell me what causes a USR2 signal to be sent to 
Apache. Or when it's caused. I only want to reload the file when said file 
has changed. Am I supposed to do some checking against the file -M time 
myself, and then send a USR2 signal myself?


Morbus Iff
.sig on other machine.
http://www.disobey.com/
http://www.gamegrene.com/


Re: mod_perl and 700k files...

Posted by "G.W. Haywood" <ge...@www.jubileegroup.co.uk>.
Hi there,

On Wed, 9 May 2001, Morbus Iff wrote:

> Ahhh. Ok. What's this $SIG{'USR2'} thingy. What's that do?

http://perl.apache.org/guide

73,
Ged.


Re: mod_perl and 700k files...

Posted by Morbus Iff <mo...@disobey.com>.
At 04:24 PM 5/9/01, Robert Landrum wrote:
 >At 3:51 PM -0400 5/9/01, Morbus Iff wrote:
 >>** The 700k file is an XML file, read in by XML::Simple. XML::Simple
 >>can cache that file into memory. Is this how I should do it? Or
 >>should I load the file from my startup.pl script so that the file is
 >>shared amongst all the apache children? If that's the case, how
 >>would I dynamically reload it?
 >
 >Basically, I loaded the file into $My::Data via the startup.pl.
 >
 >Then I did a
 >$SIG{'USR2'} = sub {
 >	open(FILE,'file');
 >	$My::Data = process_file(\*FILE);
 >	close(FILE);
 >};

Ahhh. Ok. What's this $SIG{'USR2'} thingy. What's that do?


Morbus Iff
.sig on other machine.
http://www.disobey.com/
http://www.gamegrene.com/


Re: mod_perl and 700k files...

Posted by Robert Landrum <rl...@capitoladvantage.com>.
At 3:51 PM -0400 5/9/01, Morbus Iff wrote:
>** The 700k file is an XML file, read in by XML::Simple. XML::Simple 
>can cache that file into memory. Is this how I should do it? Or 
>should I load the file from my startup.pl script so that the file is 
>shared amongst all the apache children? If that's the case, how 
>would I dynamically reload it?
>


I've done similar things....

Basically, I loaded the file into $My::Data via the startup.pl.

Then I did a
$SIG{'USR2'} = sub {
	open(FILE,'file');
	$My::Data = process_file(\*FILE);
	close(FILE);
};


Rob

--
As soon as you make something foolproof, someone will create a better fool.

Re: mod_perl and 700k files...

Posted by Ask Bjoern Hansen <as...@valueclick.com>.
On Wed, 9 May 2001, Morbus Iff wrote:

> I'm relatively new to mod_perl... I've got a 700k file that is loaded each 
> time I run a CGI script, so I'm hoping to cache the file using mod_perl 
> somehow. The file will change occasionally (maybe once a week) - the reload 
> of a few seconds isn't worrisome, but it has to be done without restarting 
> the server.
[...]
> ** The 700k file is an XML file, read in by XML::Simple. XML::Simple can 
> cache that file into memory. Is this how I should do it? Or should I load 
> the file from my startup.pl script so that the file is shared amongst all 
> the apache children? 

As others have pointed out, it'll only be shared until you reload it,
so I wouldn't worry about it.

Things I would try:

   1) Have an external process parse the XML with XML::Simple into the
      data structure you need and save it with Storable::nfreeze.
      Then the apache processes can load it with thaw, which should
      be much faster and might very well use less memory than the
      XML::Simple foo.

      For checking for updates something like 

      my $last_reload = 0;
      my $last_check  = 0;
      sub handler {
         reload_data() if time >= $last_check+3600; # check every hour
         [...]
      }

      sub reload_data {
        $last_check = time;
        return unless (stat "/home/foo/data/datafile")[9] > $last_reload;

        # load data

        $last_reload = time;
      }
      

   2) Save the data in a BerkeleyDB 3 database (which can be
      shared) and have an external process just load the data into
      it. Update will "just work".
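For the archives, option 1's freeze/thaw round trip might look like this (untested sketch; nstore/retrieve are Storable's file-writing wrappers around nfreeze/thaw, and the data and path here are made up):

```perl
use strict;
use Storable qw(nstore retrieve);

my $stor_file = "/tmp/bigdata_$$.stor";   # hypothetical location

# Offline step (cron job etc.): parse the XML once, freeze to disk.
# In real life $data would come from XML::Simple's XMLin().
my $data = { title => 'example', items => [ 1, 2, 3 ] };
nstore( $data, $stor_file );

# In the apache child: thawing the frozen structure is much cheaper
# than re-parsing 700k of XML on every load.
my $thawed = retrieve($stor_file);
```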


 - ask


-- 
ask bjoern hansen, http://ask.netcetera.dk/   !try; do();
more than 100M impressions per day, http://valueclick.com


Re: mod_perl and 700k files...

Posted by Matt Sergeant <ma...@sergeant.org>.
On Sat, 12 May 2001, Morbus Iff wrote:

> >I store a .stor file which is a storable dump of my XML tree. I check the
> >mtime of that against the mtime of the .xml file. Whichever is newer I
> >load that. Works fast and is very simple.
> 
> I'll certainly check it out. I also started looking into SAX to see if I
> could do something with that as well, and that looks promising too. Tell
> me, is there some write up of the differences between XPath and SAX? And
> why I should use either one?

Kip Hampton is doing a series of three articles on XML.com about just this
sort of thing. The first one has gone out, so expect the next in a couple
of weeks and then a month after that for the last one.

-- 
<Matt/>

    /||    ** Founder and CTO  **  **   http://axkit.com/     **
   //||    **  AxKit.com Ltd   **  ** XML Application Serving **
  // ||    ** http://axkit.org **  ** XSLT, XPathScript, XSP  **
 // \\| // ** mod_perl news and resources: http://take23.org  **
     \\//
     //\\
    //  \\


Re: mod_perl and 700k files...

Posted by Perrin Harkins <pe...@elem.com>.
on 5/12/01 5:46 PM, Morbus Iff at morbus@disobey.com wrote:
>> I store a .stor file which is a storable dump of my XML tree. I check the
>> mtime of that against the mtime of the .xml file. Whichever is newer I
>> load that. Works fast and is very simple.
> 
> I'll certainly check it out.

The only trouble with that is that you will have a separate copy in every
child taking up 700K or more.  You can only avoid that if you restart the
server or use some kind of shared memory approach.

- Perrin


Re: mod_perl and 700k files...

Posted by Morbus Iff <mo...@disobey.com>.
>> I'm relatively new to mod_perl... I've got a 700k file that is loaded each
>> time I run a CGI script, so I'm hoping to cache the file using mod_perl
>> somehow. The file will change occasionally (maybe once a week) - the reload
>> of a few seconds isn't worrisome, but it has to be done without restarting
>> the server.
>
>Boy you sure got some complex answers...

You know, I really was thinking the same thing myself, but I'm new with
mod_perl, so I thought it was normal. Funnily enough, I got some offlist
replies about how sending signals causes memory leaks (and how the mod_perl
people won't admit it), and how Stas flips out a lot. Ah well...

>I store a .stor file which is a storable dump of my XML tree. I check the
>mtime of that against the mtime of the .xml file. Whichever is newer I
>load that. Works fast and is very simple.

I'll certainly check it out. I also started looking into SAX to see if I
could do something with that as well, and that looks promising too. Tell
me, is there some write-up of the differences between XPath and SAX? And
why I should use either one?

-- 
      ICQ: 2927491      /      AOL: akaMorbus
   Yahoo: morbus_iff    /  Jabber: morbus@jabber.org
   morbus@disobey.com   /   http://www.disobey.com/

Re: mod_perl and 700k files...

Posted by Matt Sergeant <ma...@sergeant.org>.
On Wed, 9 May 2001, Morbus Iff wrote:

> Hey there, wondering if anyone could help me with this.
> 
> I'm relatively new to mod_perl... I've got a 700k file that is loaded each 
> time I run a CGI script, so I'm hoping to cache the file using mod_perl 
> somehow. The file will change occasionally (maybe once a week) - the reload 
> of a few seconds isn't worrisome, but it has to be done without restarting 
> the server.
> 
> Any suggestions on the best way to do this? I'm going to:
> 
>   - PerlSetupEnv Off
>   - PerlModule and PerlRequire
>   - Remove buffering.
>   - Cache from XML::Simple **
> 
> ** The 700k file is an XML file, read in by XML::Simple. XML::Simple can 
> cache that file into memory. Is this how I should do it? Or should I load 
> the file from my startup.pl script so that the file is shared amongst all 
> the apache children? If that's the case, how would I dynamically reload it?

Boy you sure got some complex answers...

I store a .stor file which is a storable dump of my XML tree. I check the
mtime of that against the mtime of the .xml file. Whichever is newer I
load that. Works fast and is very simple.
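That scheme might be sketched like so (untested; the helper names and the trivial "parser" are my inventions, not Matt's actual code):

```perl
use strict;
use Storable qw(nstore retrieve);

# Compare the .stor dump's mtime against the .xml source and load
# whichever is newer, refreshing the dump when the XML has changed.
sub load_tree {
    my ($xml_file) = @_;
    ( my $stor_file = $xml_file ) =~ s/\.xml$/.stor/;

    my $xml_mtime  = (stat $xml_file)[9]  || 0;
    my $stor_mtime = (stat $stor_file)[9] || 0;

    if ( $stor_mtime && $stor_mtime >= $xml_mtime ) {
        return retrieve($stor_file);   # dump is current: fast path
    }

    my $tree = parse_xml($xml_file);   # stand-in for XML::Simple etc.
    nstore( $tree, $stor_file );       # refresh the dump for next time
    return $tree;
}

sub parse_xml {    # placeholder so the sketch is self-contained
    my ($file) = @_;
    open my $fh, '<', $file or die "can't read $file: $!";
    local $/;
    return { raw => scalar <$fh> };
}
```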

-- 
<Matt/>

    /||    ** Founder and CTO  **  **   http://axkit.com/     **
   //||    **  AxKit.com Ltd   **  ** XML Application Serving **
  // ||    ** http://axkit.org **  ** XSLT, XPathScript, XSP  **
 // \\| // ** mod_perl news and resources: http://take23.org  **
     \\//
     //\\
    //  \\