Posted to modperl@perl.apache.org by Wesley Peng <wp...@pobox.com> on 2020/09/14 02:51:34 UTC

cache a object in modperl

Hello

I am not very familiar with mod_perl.

For work, I need to access the IANA TLD database.

So I wrote this Perl module:
https://metacpan.org/pod/Net::IANA::TLD

But each call to new() in the module downloads the database file
from IANA's website.

I know this is pretty inefficient.

My question is: can I cache the new'ed object with mod_perl?

If so, how do I do it?

Thanks.

Re: cache a object in modperl

Posted by Mithun Bhattacharya <mi...@gmail.com>.
Your cache would have to be independent of mod_perl. I would suggest
saving it to a Redis instance.

On Sun, Sep 13, 2020 at 9:51 PM Wesley Peng <wp...@pobox.com> wrote:

> Hello
>
> I am not so familiar with modperl.
>
> For work requirement, I need to access IANA TLD database.
>
> So I wrote this perl module:
> https://metacpan.org/pod/Net::IANA::TLD
>
> But, for each new() in the module, the database file will be downloaded
> from IANA's website.
>
> I know this is pretty Inefficient.
>
> My question is, can I cache the new'ed object by modperl?
>
> If so, how to do?
>
> Thanks.
>

Re: cache a object in modperl

Posted by Patrick Mevzek <pa...@patoche.org>.

On Sun, Sep 13, 2020, at 21:51, Wesley Peng wrote:
> For work requirement, I need to access IANA TLD database.
> 
> So I wrote this perl module:
> https://metacpan.org/pod/Net::IANA::TLD
> 
> But, for each new() in the module, the database file will be downloaded 
> from IANA's website.
> 
> I know this is pretty Inefficient.

Not only is it inefficient, but you also abuse remote resources and risk
having your access rate-limited or simply blocked.

You should use the caching features that HTTP provides, as the resource has an ETag:

$ wget -SqO /dev/null http://www.internic.net/domain/root.zone
  HTTP/1.1 200 OK
  Date: Mon, 14 Sep 2020 15:17:50 GMT
  Server: Apache
  Last-Modified: Mon, 14 Sep 2020 05:44:00 GMT
  Content-Length: 2164237
  Vary: Accept-Encoding
  ETag: "21060d-5af3f856f0800"
  Accept-Ranges: bytes
  Cache-Control: max-age=420
  Expires: Mon, 14 Sep 2020 15:22:04 GMT
  X-Frame-Options: SAMEORIGIN
  Referrer-Policy: origin-when-cross-origin
  Content-Security-Policy: upgrade-insecure-requests
  Age: 165
  Keep-Alive: timeout=2, max=358
  Connection: Keep-Alive
  Content-Type: text/plain; charset=UTF-8
  Content-Language: en


So you can do a conditional GET as long as you store the latest ETag
on your side:

$ wget -SqO /dev/null --header 'If-None-Match: "21060d-5af3f856f0800"' http://www.internic.net/domain/root.zone
  HTTP/1.1 304 Not Modified
  Date: Mon, 14 Sep 2020 15:20:43 GMT
  Server: Apache
  Connection: Keep-Alive
  Keep-Alive: timeout=2, max=358
  ETag: "21060d-5af3f856f0800"
  Expires: Mon, 14 Sep 2020 15:22:04 GMT
  Cache-Control: max-age=420
  Vary: Accept-Encoding
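
For what it's worth, the same conditional GET could be sketched in Perl roughly as follows. This is only an illustration under assumptions: the function name fetch_if_changed and the { etag, content } cache layout are made up here and are not part of Net::IANA::TLD. HTTP::Tiny is used because it ships with Perl (5.14 and later); any client that can send an If-None-Match header would do.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Fetch $url only if it changed since we last stored $cached->{etag}.
# $http is any object with an HTTP::Tiny-compatible get() method.
# Returns a hashref { etag => ..., content => ... }; on a 304 the
# hashref that was passed in is returned unchanged.
sub fetch_if_changed {
    my ($http, $url, $cached) = @_;
    $cached ||= {};

    my %headers;
    $headers{'If-None-Match'} = $cached->{etag} if defined $cached->{etag};

    my $res = $http->get($url, { headers => \%headers });

    # 304 Not Modified: our stored copy is still current.
    return $cached if $res->{status} == 304;

    die "GET $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};

    # HTTP::Tiny lower-cases response header names.
    return { etag => $res->{headers}{etag}, content => $res->{content} };
}

# Possible use (left commented out so that nothing hits the network here):
#   my $http  = HTTP::Tiny->new;
#   my $cache = fetch_if_changed($http, 'http://www.internic.net/domain/root.zone');
#   ... later ...
#   $cache = fetch_if_changed($http, 'http://www.internic.net/domain/root.zone', $cache);
```

To survive a restart, the ETag and content would have to be written somewhere persistent (a file, for instance); that part is left out of the sketch.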


None of this has anything to do with mod_perl, and in fact very little to do with Perl at all.

See also the "Cache-Control" and "Age" headers.

Your module on CPAN should take care of that automatically.

PS: TLDs do not change that often, so fetching once per day or once per week should be enough (with a manual override for the exceptional cases that need it). But it depends on why you do it. Note that the whole content is also available as a zone transfer from various root servers.
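
A rough Perl sketch of the "once per day" idea, under stated assumptions: the function names, the "$file.checked" stamp file and the example path are all hypothetical. It keeps a local copy on disk and only asks the server again once the last check is older than a chosen age. HTTP::Tiny's mirror() sends If-Modified-Since when the file already exists and counts a 304 as success.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# True if $file is missing or was last modified more than $max_age seconds ago.
# $now is optional and defaults to the current time.
sub is_stale {
    my ($file, $max_age, $now) = @_;
    $now = time unless defined $now;
    my $mtime = (stat $file)[9];
    return 1 unless defined $mtime;
    return ($now - $mtime) > $max_age ? 1 : 0;
}

# Re-check the upstream file at most once per $max_age seconds.
# Returns 1 if a new copy was downloaded, 0 otherwise.
sub refresh_local_copy {
    my ($url, $file, $max_age) = @_;
    my $stamp = "$file.checked";    # records when we last asked the server
    return 0 unless is_stale($stamp, $max_age);

    my $res = HTTP::Tiny->new->mirror($url, $file);
    die "Mirroring $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};

    # Touch the stamp file so the next check waits another $max_age seconds.
    open my $fh, '>', $stamp or die "Cannot write $stamp: $!\n";
    close $fh;

    return $res->{status} == 200 ? 1 : 0;
}

# Possible use, once per day (path is an assumption):
#   refresh_local_copy('http://www.internic.net/domain/root.zone',
#                      '/var/cache/tld/root.zone', 24 * 60 * 60);
```

The separate stamp file is a design choice in this sketch: it leaves the data file's timestamp as mirror() set it, so the If-Modified-Since header keeps reflecting the server's own Last-Modified value.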

-- 
  Patrick Mevzek

RE: cache a object in modperl [EXT]

Posted by James Smith <js...@sanger.ac.uk>.
You can still have an always-up service, but it will require a bit of work and a load-balancing proxy set up in front of multiple Apache instances. You can then restart each backend independently without an issue.

If the Apache instances are relatively lightweight, you can run two on the same machine (e.g. on ports 8000 and 8001) and another one set up as a proxy on 80/443 which proxies back to these two.

I use this for a dev/live setup on a VM, where 8000 is live and 8001 is dev, and a lightweight Apache proxies back to the other two.

From: Mithun Bhattacharya <mi...@gmail.com>
Sent: 14 September 2020 06:49
To: mod_perl list <mo...@perl.apache.org>
Subject: Re: cache a object in modperl [EXT]

Haha I can't answer that - I work with systems which are always up. We have users working across the globe so there is no non-active time.

In my case I would have to throw an independent cache (my current choice is REDIS but you could chose a DB_File for all I know) and refresh it as needed - IANA I could hit every 30 min to check for update :)

On Mon, Sep 14, 2020 at 12:44 AM Wesley Peng <wp...@pobox.com>> wrote:


Mithun Bhattacharya wrote:
> Does IANA have an easy way of determining whether there is an update
> since a certain date ? I was thinking it might make sense to just run a
> scheduled job to monitor for update and then restart your service or
> refresh your local cache depending upon how you solve it.

Yes I agree with this.
I may monitor IANA's database via their version changes, and run a
crontab to restart my apache server during the non-active user time
(i.e, 3:00 AM).

Or do you have better solution?
Thanks.




Re: cache a object in modperl

Posted by Mithun Bhattacharya <mi...@gmail.com>.
Haha, I can't answer that. I work with systems which are always up. We have
users working across the globe, so there is no non-active time.

In my case I would have to throw in an independent cache (my current choice is
Redis, but you could choose DB_File for all I know) and refresh it as
needed. I could hit IANA every 30 minutes to check for updates :)

On Mon, Sep 14, 2020 at 12:44 AM Wesley Peng <wp...@pobox.com> wrote:

>
>
> Mithun Bhattacharya wrote:
> > Does IANA have an easy way of determining whether there is an update
> > since a certain date ? I was thinking it might make sense to just run a
> > scheduled job to monitor for update and then restart your service or
> > refresh your local cache depending upon how you solve it.
>
> Yes I agree with this.
> I may monitor IANA's database via their version changes, and run a
> crontab to restart my apache server during the non-active user time
> (i.e, 3:00 AM).
>
> Or do you have better solution?
> Thanks.
>

Re: cache a object in modperl

Posted by Wesley Peng <wp...@pobox.com>.

Mithun Bhattacharya wrote:
> Does IANA have an easy way of determining whether there is an update 
> since a certain date ? I was thinking it might make sense to just run a 
> scheduled job to monitor for update and then restart your service or 
> refresh your local cache depending upon how you solve it.

Yes, I agree with this.
I could monitor IANA's database for version changes and run a
cron job to restart my Apache server during off-peak hours
(e.g., 3:00 AM).

Or do you have a better solution?
Thanks.

Re: cache a object in modperl

Posted by Mithun Bhattacharya <mi...@gmail.com>.
So how flexible are you with your service restarts, and how frequently do you
wish to update your cache?

Does IANA have an easy way of determining whether there has been an update since
a certain date? I was thinking it might make sense to just run a scheduled
job to monitor for updates and then restart your service or refresh your
local cache, depending on how you solve it.

On Mon, Sep 14, 2020 at 12:34 AM Wesley Peng <wp...@pobox.com> wrote:

> Hello
>
> Mithun Bhattacharya wrote:
> > How frequently do you wish to refresh the cache ? if you do in startup
> > then your cache refresh is tied to the service restart which might not
> > be ideal or feasible.
>
> I saw recent days IANA has updated their database on date of:
>
> 2020.09.09
> 2020.09.13
>
> So I assume they will update the DB file in few days.
>
> Regards.
>

Re: cache a object in modperl

Posted by Wesley Peng <wp...@pobox.com>.
Hello

Mithun Bhattacharya wrote:
> How frequently do you wish to refresh the cache ? if you do in startup 
> then your cache refresh is tied to the service restart which might not 
> be ideal or feasible.

I saw that in recent days IANA updated their database on these dates:

2020.09.09
2020.09.13

So I assume they update the DB file every few days.

Regards.

Re: cache a object in modperl

Posted by Mithun Bhattacharya <mi...@gmail.com>.
Startup is not a great idea if your web server stays up for a long time; I have some
that have been running for months.

How frequently do you wish to refresh the cache? If you do it at startup, then
your cache refresh is tied to the service restart, which might not be ideal
or feasible.

On Mon, Sep 14, 2020 at 12:26 AM Adam Prime <ad...@utoronto.ca> wrote:

> I left out the link to the thread.  Here it is.
>
> https://marc.info/?t=119062870700002&r=1&w=2
>
>
>
> On Sep 14, 2020, at 1:18 AM, Wesley Peng <wp...@pobox.com> wrote:
>
> That's great. Thank you Adam.
>
> Adam Prime wrote:
>
> If the database doesn't change very often, and you don't mind only getting
> updates to your database when you restart apache, and you're using prefork
> mod_perl, then you could use a startup.pl to load your database before
> apache forks, and get a shared copy globally in all your apache children.
>
> https://perl.apache.org/docs/1.0/guide/config.html#The_Startup_File
>
> This thread from 13 years ago seems to have a clear-ish example of how to
> use startup.pl to do what i'm talking about.
>
> If you need it to update more frequently than when you restart apache, you
> could potentially use a PerlChildInitHandler to load the data when apache
> creates children.  This will use more memory, as each child will have it's
> own copy, and can also result in situation where children can have
> different versions of the database loaded and be serving requests at the
> same time.  If you want to go this way you might want to also add a
> MaxRequestsPerChild directive to your apache config to make sure that
> you're children die and get refreshed on the regular, if you don't already
> have one.
>
> Adam
>
> On 9/13/2020 10:51 PM, Wesley Peng wrote:
>
> Hello
>
>
> I am not so familiar with modperl.
>
>
> For work requirement, I need to access IANA TLD database.
>
>
> So I wrote this perl module:
>
> https://metacpan.org/pod/Net::IANA::TLD
>
>
> But, for each new() in the module, the database file will be downloaded
> from IANA's website.
>
>
> I know this is pretty Inefficient.
>
>
> My question is, can I cache the new'ed object by modperl?
>
>
> If so, how to do?
>
>
> Thanks.
>
>

Re: cache a object in modperl

Posted by Adam Prime <ad...@utoronto.ca>.
I left out the link to the thread.  Here it is. 

https://marc.info/?t=119062870700002&r=1&w=2



> On Sep 14, 2020, at 1:18 AM, Wesley Peng <wp...@pobox.com> wrote:
> 
> That's great. Thank you Adam.
> 
> Adam Prime wrote:
>> If the database doesn't change very often, and you don't mind only getting updates to your database when you restart apache, and you're using prefork mod_perl, then you could use a startup.pl to load your database before apache forks, and get a shared copy globally in all your apache children.
>> https://perl.apache.org/docs/1.0/guide/config.html#The_Startup_File
>> This thread from 13 years ago seems to have a clear-ish example of how to use startup.pl to do what i'm talking about.
>> If you need it to update more frequently than when you restart apache, you could potentially use a PerlChildInitHandler to load the data when apache creates children.  This will use more memory, as each child will have it's own copy, and can also result in situation where children can have different versions of the database loaded and be serving requests at the same time.  If you want to go this way you might want to also add a MaxRequestsPerChild directive to your apache config to make sure that you're children die and get refreshed on the regular, if you don't already have one.
>> Adam
>>> On 9/13/2020 10:51 PM, Wesley Peng wrote:
>>> Hello
>>> 
>>> I am not so familiar with modperl.
>>> 
>>> For work requirement, I need to access IANA TLD database.
>>> 
>>> So I wrote this perl module:
>>> https://metacpan.org/pod/Net::IANA::TLD
>>> 
>>> But, for each new() in the module, the database file will be downloaded from IANA's website.
>>> 
>>> I know this is pretty Inefficient.
>>> 
>>> My question is, can I cache the new'ed object by modperl?
>>> 
>>> If so, how to do?
>>> 
>>> Thanks.

Re: cache a object in modperl

Posted by Wesley Peng <wp...@pobox.com>.
That's great. Thank you Adam.

Adam Prime wrote:
> If the database doesn't change very often, and you don't mind only 
> getting updates to your database when you restart apache, and you're 
> using prefork mod_perl, then you could use a startup.pl to load your 
> database before apache forks, and get a shared copy globally in all your 
> apache children.
> 
> https://perl.apache.org/docs/1.0/guide/config.html#The_Startup_File
> 
> This thread from 13 years ago seems to have a clear-ish example of how 
> to use startup.pl to do what i'm talking about.
> 
> If you need it to update more frequently than when you restart apache, 
> you could potentially use a PerlChildInitHandler to load the data when 
> apache creates children.  This will use more memory, as each child will 
> have it's own copy, and can also result in situation where children can 
> have different versions of the database loaded and be serving requests 
> at the same time.  If you want to go this way you might want to also add 
> a MaxRequestsPerChild directive to your apache config to make sure that 
> you're children die and get refreshed on the regular, if you don't 
> already have one.
> 
> Adam
> 
> 
> On 9/13/2020 10:51 PM, Wesley Peng wrote:
>> Hello
>>
>> I am not so familiar with modperl.
>>
>> For work requirement, I need to access IANA TLD database.
>>
>> So I wrote this perl module:
>> https://metacpan.org/pod/Net::IANA::TLD
>>
>> But, for each new() in the module, the database file will be 
>> downloaded from IANA's website.
>>
>> I know this is pretty Inefficient.
>>
>> My question is, can I cache the new'ed object by modperl?
>>
>> If so, how to do?
>>
>> Thanks.

Re: cache a object in modperl

Posted by Adam Prime <ad...@utoronto.ca>.
If the database doesn't change very often, and you don't mind only
getting updates to your database when you restart Apache, and you're
using prefork mod_perl, then you could use a startup.pl to load your
database before Apache forks, and get a copy shared globally across all
your Apache children.

https://perl.apache.org/docs/1.0/guide/config.html#The_Startup_File

This thread from 13 years ago seems to have a clear-ish example of how
to use startup.pl to do what I'm talking about.
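
As a rough illustration of the startup.pl approach (a sketch under assumptions, not the code from that thread): the package name My::TLDCache, the file path and the file format (one TLD per line, with '#' comment lines) are all hypothetical. The data is parsed once into a package-level hash; if that happens in the parent before Apache forks, prefork children start with the same copy.

```perl
# In a real setup this package would live in its own file (My/TLDCache.pm).
package My::TLDCache {    # hypothetical module name
    use strict;
    use warnings;

    my %TLD;    # filled once in the parent, inherited by forked children

    # Parse a local file with one TLD per line; '#' lines are comments.
    # Returns the number of TLDs loaded.
    sub load {
        my ($class, $file) = @_;
        open my $fh, '<', $file or die "Cannot open $file: $!\n";
        %TLD = ();
        while (my $line = <$fh>) {
            $line =~ s/^\s+|\s+$//g;    # trim whitespace and the newline
            next if $line eq '' || $line =~ /^#/;
            $TLD{ lc $line } = 1;
        }
        close $fh;
        return scalar keys %TLD;
    }

    # True if $name is a known TLD (case-insensitive).
    sub is_tld {
        my ($class, $name) = @_;
        return exists $TLD{ lc $name } ? 1 : 0;
    }
}

# startup.pl would then contain something like (path is an assumption):
#   use My::TLDCache;
#   My::TLDCache->load('/var/cache/tld/tlds-alpha-by-domain.txt');
#
# and a handler would call:
#   My::TLDCache->is_tld('com');
```

Reading from a local file rather than downloading inside startup.pl keeps an Apache restart from depending on IANA's site being reachable.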

If you need it to update more frequently than when you restart Apache,
you could potentially use a PerlChildInitHandler to load the data when
Apache creates children.  This will use more memory, as each child will
have its own copy, and can also result in a situation where children
have different versions of the database loaded while serving requests
at the same time.  If you want to go this way, you might also want to add
a MaxRequestsPerChild directive to your Apache config to make sure that
your children die and get refreshed on the regular, if you don't
already have one.
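
A minimal sketch of the child-init variant, assuming mod_perl 2 (where the handler is called with the child pool and the server object). The package name, file path and file format are hypothetical, and the handler returns 0 directly, which is the numeric value of Apache2::Const::OK, so the sketch also runs outside Apache.

```perl
package My::TLDChildInit {    # hypothetical module name
    use strict;
    use warnings;

    our $FILE = '/var/cache/tld/tlds-alpha-by-domain.txt';    # assumed location
    our %TLD;    # each child process gets its own copy

    sub handler {
        my ($child_pool, $server) = @_;    # passed by mod_perl 2; unused here

        if (open my $fh, '<', $FILE) {
            my %fresh;
            while (my $line = <$fh>) {
                $line =~ s/^\s+|\s+$//g;    # trim whitespace and the newline
                next if $line eq '' || $line =~ /^#/;
                $fresh{ lc $line } = 1;
            }
            close $fh;
            %TLD = %fresh;
        }
        else {
            # Keep whatever was loaded before rather than killing the child.
            warn "Cannot open $FILE: $!";
        }
        return 0;    # same value as Apache2::Const::OK
    }
}

# httpd.conf would then need something like (values are assumptions):
#   PerlChildInitHandler My::TLDChildInit
#   MaxRequestsPerChild  1000
```

With this layout a separate job (cron, for instance) would keep the local file up to date, and each new child picks up whatever version is on disk when it starts.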

Adam


On 9/13/2020 10:51 PM, Wesley Peng wrote:
> Hello
>
> I am not so familiar with modperl.
>
> For work requirement, I need to access IANA TLD database.
>
> So I wrote this perl module:
> https://metacpan.org/pod/Net::IANA::TLD
>
> But, for each new() in the module, the database file will be 
> downloaded from IANA's website.
>
> I know this is pretty Inefficient.
>
> My question is, can I cache the new'ed object by modperl?
>
> If so, how to do?
>
> Thanks.