Posted to modperl@perl.apache.org by "James. L" <pe...@yahoo.com> on 2007/05/07 00:47:59 UTC

few newbie questions..

hello, 

few beginner questions on using mod_perl.

1. memory usage. 

i have an app that reads/parses a file line by line
then passes the data to TT for html output. each http
request may request different files. my question is
once the app produce the html, does the memory
allocated by the parsed data get released to perl?
that memory will be reused by other mod_perl app?

2. global variable.

in the config module example from mod_perl doc, it
says that declaring a global hash which consists the
configs is better than declaring few global variables.
why is that? i thought that they takes the same amount
of memory..

3. preloading modules in startup.pl..

is it appropriate to preload all modules used in my app
in startup.pl so that it won't be loaded in each
apache child process?

if module A.pm uses B.pm, does preloading A also
preload B as well?

4. OO, class method vs object method?

for some module, i may create an instance first then
use it in all my app ( $obj->method ) . other modules,
i may access method by using it directly (
Foo::Bar->method ). 

does the object method way(create the object at app's
start up and cache it for later use) have less
overhead than class method? i usually don't use object
method if i don't have data shared by other module
methods. 

thanks in advance,

James.

Re: few newbie questions..

Posted by Perrin Harkins <pe...@elem.com>.

On 5/7/07, James. L <pe...@yahoo.com> wrote:
> do you mean parsing the file during
> each iteration and get the next available result then
> return it to TT?

Yes.  Open the file and then read it line-by-line as the iterator
requests it.  Look at the docs for Template::Iterator.

- Perrin
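
As a rough illustration of the line-by-line idea above, here is a minimal sketch of a lazy reader in plain Perl. The helper name make_row_iterator is hypothetical, and the sketch does not show how to hook the reader into TT's FOREACH (the Template::Iterator docs cover that); it only shows that each call reads and parses a single line, so the whole file is never held in memory.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields one parsed row per call
# and undef at end of file, so only one line is held in memory at a time.
sub make_row_iterator {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    return sub {
        return unless $fh;              # already exhausted
        my $line = <$fh>;
        if ( !defined $line ) {         # end of file: close and stop
            close $fh;
            undef $fh;
            return;
        }
        chomp $line;
        return [ split /=/, $line ];    # same "=" separator as the parse() sub
    };
}

# Usage (the file name is an assumption):
#   my $next = make_row_iterator('data.txt');
#   while ( my $row = $next->() ) {
#       print join( ' | ', @$row ), "\n";
#   }
```

The closure keeps the filehandle open between calls, which is what lets the caller pull rows one at a time instead of receiving a fully built array.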

Re: few newbie questions..

Posted by "James. L" <pe...@yahoo.com>.

--- Perrin Harkins <pe...@elem.com> wrote:

> On 5/11/07, James. L <pe...@yahoo.com> wrote:
> > even if i am using an iterator object and call it.next
> > in TT, doesn't TT actually keep the rendered template
> > page into one variable and dump it to the browser? in
> > that case, the memory consumed is still equal to the
> > entire size of the data and iterator doesn't help
> > here. am i right?
> 
> You would end up with one copy of the data in memory (in the rendered
> template) instead of two if you use an iterator.  If it's a really
> large amount of data, you could avoid creating a giant template output
> variable by breaking it up into multiple chunks and sending them to
> the client as you go.  Template Toolkit may not be the best tool for
> this part of your application.
> 

thanks for confirming this. 

also, i found this post on TT mailing list that
describes a way to flush TT content as soon as it gets
generated.

http://www.template-toolkit.org/pipermail/templates/2004-February/005776.html

downside of it is that WRAPPER directive is broken in
this case. but i don't use it in this app.

> - Perrin
> 

James.

Re: few newbie questions..

Posted by Perrin Harkins <pe...@elem.com>.

On 5/11/07, James. L <pe...@yahoo.com> wrote:
> even if i am using an iterator object and call it.next
> in TT, doesn't TT actually keep the rendered template
> page into one variable and dump it to the browser? in
> that case, the memory consumed is still equal to the
> entire size of the data and iterator doesn't help
> here. am i right?

You would end up with one copy of the data in memory (in the rendered
template) instead of two if you use an iterator.  If it's a really
large amount of data, you could avoid creating a giant template output
variable by breaking it up into multiple chunks and sending them to
the client as you go.  Template Toolkit may not be the best tool for
this part of your application.

- Perrin
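
One way to picture the "multiple chunks" suggestion above is the sketch below. It is only an illustration under assumptions: stream_table and rows_to_html are hypothetical names, the renderer is plain Perl rather than a TT template, no HTML escaping is done, and the $emit callback stands in for whatever actually writes to the client (under mod_perl that would typically wrap the request object's print).

```perl
use strict;
use warnings;

# Hypothetical helper: read $file in batches of $batch_size rows, turn each
# batch into HTML with $render, and hand the result to $emit right away.
sub stream_table {
    my ( $file, $batch_size, $render, $emit ) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @batch;
    while ( my $line = <$fh> ) {
        chomp $line;
        push @batch, [ split /=/, $line ];
        if ( @batch >= $batch_size ) {
            $emit->( $render->( \@batch ) );
            @batch = ();                         # reuse the same small array
        }
    }
    $emit->( $render->( \@batch ) ) if @batch;   # flush the last partial batch
    close $fh;
    return;
}

# Plain-Perl stand-in for a per-chunk template (no escaping, for brevity).
sub rows_to_html {
    my ($rows) = @_;
    return join '',
        map { '<tr><td>' . join( '</td><td>', @$_ ) . "</td></tr>\n" } @$rows;
}

# Usage (file name and batch size are assumptions):
#   stream_table( 'data.txt', 500, \&rows_to_html, sub { print $_[0] } );
```

With this shape, peak memory is bounded by one batch of rows plus one rendered chunk, rather than by the whole file plus the whole page.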

Re: few newbie questions..

Posted by "James. L" <pe...@yahoo.com>.

--- Perrin Harkins <pe...@elem.com> wrote:

> > sub parse {
> >   my ($class,$file) = @_;
> >   my @data;
> >   open my $F, $file or die $!;
> >   while ( my $line = <$F> ) {
> >     my @fields = split /=/, $line;
> >     push @data, \@fields;
> >   }
> >   close $F;
> >   return \@data;
> > }
> 
> If you read enough data into @data to use up 20MB, it will stay that
> size.  That's a good thing if you intend to read another file of
> similar size on the next request.  This would only be bad if you read
> a very large amount of data in but only now and then.
> 
> The best way to avoid this kind of problem is to not read the whole
> thing into RAM.  You can pass an iterator object to TT instead of
> loading all the data at once.

how do you do that? I have to parse the file to get
the data first. do you mean parsing the file during
each iteration and get the next available result then
return it to TT?

hypothetically in a database env, that would mean hit
the database with fetch_next_row for every iteration.
right?

> - Perrin
> 

James.

Re: few newbie questions..

Posted by Andy Armstrong <an...@hexten.net>.

On 11 May 2007, at 15:31, James. L wrote:
> here is my understanding and please verify.
>
> say i have sub parse { return
> $array_ref_contain_the_data }
>
> does that memory used by the
> $array_ref_contain_the_data can be reused by other
> mod_perl application too? or it is used only by the
> particular $array_ref_contain_the_data?

The memory used by the content of the array will stay allocated until  
the reference count for the anonymous array goes to zero. As long as  
you have a reference to the data the memory will remain allocated.  
Once the reference count goes to zero that memory will be freed for  
use by Perl.

It's only storage directly attached to a scalar that hangs around for  
subsequent re-use.

So your $array_ref_contain_the_data will continue to occupy memory -  
but it's just a few bytes. The data in the actual array will be freed  
when there are no longer any references to it.

I'm not sure if this helps but give it a try:

Consider that many things may refer to your array data:

   $pork = [ 1, 2, 3 ];
   $beef = $pork;

Neither $pork nor $beef has exclusive ownership of the anonymous  
array - so neither of them can sensibly attempt to hang on to that  
storage for subsequent use. The only thing that's private to a scalar  
is its own allocated storage. /That/ memory may be left allocated to  
the scalar on the assumption that it might be needed again.

In comparison the memory owned by the [ 1, 2, 3 ] anon array is bit  
of a free spirit and can't reasonably be owned by any of the things  
that refer to it.  It remains allocated until nothing refers to it  
and then is freed.

-- 
Andy Armstrong, hexten.net
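
The reference-counting behaviour described above can be observed with a weak reference, which watches the array without keeping it alive. This is a small sketch using the core Scalar::Util module; the $watch variable and the two flag variables are additions for illustration only.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $pork = [ 1, 2, 3 ];
my $beef = $pork;          # second strong reference to the same array

my $watch = $pork;
weaken($watch);            # weak reference: observes without keeping alive

undef $pork;
my $alive_after_first = defined $watch ? 1 : 0;    # $beef still holds the array

undef $beef;
my $alive_after_second = defined $watch ? 1 : 0;   # nothing refers to it now

print "array alive after undef \$pork: $alive_after_first\n";
print "array alive after undef \$beef: $alive_after_second\n";
```

The array survives the first undef because $beef still refers to it, and it is freed as soon as the last strong reference goes away, at which point perl sets the weak reference to undef.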

Re: few newbie questions..

Posted by "James. L" <pe...@yahoo.com>.

i still have few questions, would you please answer
them for me? see below..

--- Perrin Harkins <pe...@elem.com> wrote:

> On 5/7/07, James. L <pe...@yahoo.com> wrote:
> > the files i need to parse are usually in size of 2M -
> > 10M. will the mod_perl server(2G Mem) use up the
> > memory pretty quick after few hundred requests on
> > different files?
> 
> You're misunderstanding a little bit.  It's not that the memory used
> in parsing a file gets lost permanently.  Instead, the variable that
> you loaded the data holds onto the memory from the largest size it got
> to.

here is my understanding and please verify.

say i have sub parse { return
$array_ref_contain_the_data } 

does that memory used by the
$array_ref_contain_the_data can be reused by other
mod_perl application too? or it is used only by the
particular $array_ref_contain_the_data? 

> > sub parse {
> >   my ($class,$file) = @_;
> >   my @data;
> >   open my $F, $file or die $!;
> >   while ( my $line = <$F> ) {
> >     my @fields = split /=/, $line;
> >     push @data, \@fields;
> >   }
> >   close $F;
> >   return \@data;
> > }
> 
> If you read enough data into @data to use up 20MB, it will stay that
> size.  That's a good thing if you intend to read another file of
> similar size on the next request.  This would only be bad if you read
> a very large amount of data in but only now and then.
> 
> The best way to avoid this kind of problem is to not read the whole
> thing into RAM.  You can pass an iterator object to TT instead of
> loading all the data at once.

even if i am using an iterator object and call it.next
in TT, doesn't TT actually keep the rendered template
page into one variable and dump it to the browser? in
that case, the memory consumed is still equal to the
entire size of the data and iterator doesn't help
here. am i right?

> - Perrin
> 

James.

Re: few newbie questions..

Posted by Perrin Harkins <pe...@elem.com>.

On 5/7/07, James. L <pe...@yahoo.com> wrote:
> the files i need to parse are usually in size of 2M -
> 10M. will the mod_perl server(2G Mem) use up the
> memory pretty quick after few hundred requests on
> different files?

You're misunderstanding a little bit.  It's not that the memory used
in parsing a file gets lost permanently.  Instead, the variable that
you loaded the data holds onto the memory from the largest size it got
to.

> sub parse {
>   my ($class,$file) = @_;
>   my @data;
>   open my $F, $file or die $!;
>   while ( my $line = <$F> ) {
>     my @fields = split /=/, $line;
>     push @data, \@fields;
>   }
>   close $F;
>   return \@data;
> }

If you read enough data into @data to use up 20MB, it will stay that
size.  That's a good thing if you intend to read another file of
similar size on the next request.  This would only be bad if you read
a very large amount of data in but only now and then.

The best way to avoid this kind of problem is to not read the whole
thing into RAM.  You can pass an iterator object to TT instead of
loading all the data at once.

- Perrin
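
The point about a variable holding onto the memory from its largest size can be seen for a single string scalar with the core B module, which exposes the size of the buffer perl has allocated for it. This is a sketch that peeks at interpreter internals, so treat it as illustrative: buffer_len is a hypothetical helper, and the exact numbers depend on the perl build.

```perl
use strict;
use warnings;
use B ();

# Allocated buffer size (not string length) of a scalar, via the core B module.
sub buffer_len {
    my ($ref) = @_;
    return B::svref_2object($ref)->LEN;
}

my $buf = '';
$buf .= 'x' x 1000 for 1 .. 1000;     # grow to about 1MB at run time
my $len_full = buffer_len( \$buf );

$buf = '';                            # string is now empty, buffer is kept
my $len_emptied = buffer_len( \$buf );

undef $buf;                           # buffer is handed back to perl
my $len_undef = buffer_len( \$buf );

printf "full: %d  emptied: %d  after undef: %d\n",
    $len_full, $len_emptied, $len_undef;
```

Emptying the string leaves the large buffer attached to $buf for reuse, while an explicit undef releases it to perl's allocator. Whether the process then shrinks from the operating system's point of view is a separate question.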

Re: few newbie questions..

Posted by Perrin Harkins <pe...@elem.com>.

On 5/29/07, Jay Buffington <ja...@gmail.com> wrote:
> On 5/6/07, Perrin Harkins <pe...@elem.com> wrote:
> > On 5/6/07, James. L <pe...@yahoo.com> wrote:
> > > my question is
> > > once the app produce the html, does the memory
> > > allocated by the parsed data get released to perl?
> > > that memory will be reused by other mod_perl app?
> >
> > No, Perl doesn't work that way.  It will keep that memory allocated
> > for that variable unless you undef the variable explicitly.
>
> It's my understanding that even if you explicitly undef a variable,
> perl doesn't release the memory.

He was asking about it being released back to perl for use within the
same process, not about releasing it back to the operating system for
use in other processes.

- Perrin

Re: few newbie questions..

Posted by Jay Buffington <ja...@gmail.com>.

I'm a little behind here, but this thread caught my eye:

On 5/6/07, Perrin Harkins <pe...@elem.com> wrote:
> On 5/6/07, James. L <pe...@yahoo.com> wrote:
> > my question is
> > once the app produce the html, does the memory
> > allocated by the parsed data get released to perl?
> > that memory will be reused by other mod_perl app?
>
> No, Perl doesn't work that way.  It will keep that memory allocated
> for that variable unless you undef the variable explicitly.

It's my understanding that even if you explicitly undef a variable,
perl doesn't release the memory.

Here's a demonstration:

jay@webdev:~$ cat memusage.pl
#!/usr/bin/perl

use GTop ();

print "Base line:\n";
print_mem();

my %squared;
foreach my $num ( 1..500_000 ) {
        $squared{ $num } = $num**2;
}

print "\nAfter using some memory:\n";
print_mem();

%squared = undef;

print "\nAfter undef'ing variable:\n";
print_mem();

sub print_mem {
        my $proc_mem = GTop->new()->proc_mem($$);
        print "size:     " . $proc_mem->size() . "\n";
        print "vsize:    " . $proc_mem->vsize() . "\n";
        print "resident: " . $proc_mem->resident() . "\n";
        print "share:    " . $proc_mem->share() . "\n";
        print "rss:      " . $proc_mem->rss() . "\n";
}


jay@webdev:~$ perl memusage.pl
Base line:
size:     7180288
vsize:    7180288
resident: 2936832
share:    1916928
rss:      2932736

After using some memory:
size:     58142720
vsize:    58142720
resident: 49774592
share:    1941504
rss:      49774592

After undef'ing variable:
size:     58142720
vsize:    58142720
resident: 49774592
share:    1941504
rss:      49774592
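
A side note on the script above: "%squared = undef;" is not the same operation as "undef %squared;". The assignment replaces the hash contents with a one-element list, while the undef function empties the hash and releases its storage to perl. The sketch below shows the difference on a small hash (the variable names are for illustration only). The old entries are dropped in both cases, so this probably does not change what the numbers above show; whether freed memory ever goes back to the operating system depends on the platform's malloc.

```perl
use strict;
use warnings;

my %assigned = map { $_ => $_**2 } 1 .. 10;
my %undefed  = %assigned;

{
    no warnings;            # the assignment below warns about an odd-sized list
    %assigned = undef;      # replaces the contents with the pair ( '' => undef )
}
undef %undefed;             # empties the hash and frees its storage

printf "after '%%h = undef': %d key(s)\n", scalar keys %assigned;
printf "after 'undef %%h':   %d key(s)\n", scalar keys %undefed;
```

The first hash ends up with one key (the empty string), the second with none.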

Re: few newbie questions..

Posted by "James. L" <pe...@yahoo.com>.

--- Perrin Harkins <pe...@elem.com> wrote:

> On 5/6/07, James. L <pe...@yahoo.com> wrote:
> > my question is
> > once the app produce the html, does the memory
> > allocated by the parsed data get released to perl?
> > that memory will be reused by other mod_perl app?
> 
> No, Perl doesn't work that way.  It will keep that memory allocated
> for that variable unless you undef the variable explicitly.
>

the files i need to parse are usually in size of 2M -
10M. will the mod_perl server(2G Mem) use up the
memory pretty quick after few hundred requests on
different files?

the app currently run under plain cgi. i am using
CGI::Application. the simplified code as following: 

package My::App;
sub table {
  my $data_ref = My::Parser->parse( $file_to_parse );
  return $tt->process('table.tt', { data => $data_ref } );
}

##### 
package My::Parser;
....
sub parse {
  my ($class,$file) = @_;
  my @data;
  open my $F, $file or die $!;
  while ( my $line = <$F> ) { 
    my @fields = split /=/, $line;
    push @data, \@fields;
  }
  close $F;
  return \@data;
}

i think i need to re-read CGI to mod_perl Porting doc.
in some case, i still unsure if the variable is gone
as i think it is.

> > in the config module example from mod_perl doc, it
> > says that declaring a global hash which consists the
> > configs is better than declaring few global variables.
> > why is that? i thought that they takes the same amount
> > of memory..
> 
> Without seeing the documentation you're referring to, we can only
> guess why it says that.  It wouldn't be to save memory.  Maybe it's to
> avoid namespace pollution or to make importing easy.
>

never mind. i read the doc wrong. 

[snip]

> - Perrin
> 

thanks,

James.

Re: few newbie questions..

Posted by Perrin Harkins <pe...@elem.com>.

On 5/6/07, James. L <pe...@yahoo.com> wrote:
> my question is
> once the app produce the html, does the memory
> allocated by the parsed data get released to perl?
> that memory will be reused by other mod_perl app?

No, Perl doesn't work that way.  It will keep that memory allocated
for that variable unless you undef the variable explicitly.

> in the config module example from mod_perl doc, it
> says that declaring a global hash which consists the
> configs is better than declaring few global variables.
> why is that? i thought that they takes the same amount
> of memory..

Without seeing the documentation you're referring to, we can only
guess why it says that.  It wouldn't be to save memory.  Maybe it's to
avoid namespace pollution or to make importing easy.

> is it appropriate to preload all modules used in my app
> in startup.pl so that it won't be loaded in each
> apache child process?

Yes.

> if module A.pm uses B.pm, does preloading A also
> preload B as well?

Yes.

> does the object method way(create the object at app's
> start up and cache it for later use) have less
> overhead than class method?

Neither one has significant overhead.

- Perrin
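
The answer to the "does preloading A also preload B" question can be checked with %INC, which records every file perl has loaded. The sketch below builds two throwaway modules in a temporary directory; My::A and My::B are hypothetical names chosen for the example (a bare "B" would collide with the core B module). It applies to modules pulled in with use or require at file scope. A module that is only loaded later by a require inside a sub would not be preloaded this way.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build two throwaway modules, My::A and My::B (hypothetical names),
# where My::A pulls in My::B with "use".
my $lib = tempdir( CLEANUP => 1 );
make_path("$lib/My");

open my $b_fh, '>', "$lib/My/B.pm" or die "Cannot write My/B.pm: $!";
print {$b_fh} "package My::B;\nsub name { 'B' }\n1;\n";
close $b_fh;

open my $a_fh, '>', "$lib/My/A.pm" or die "Cannot write My/A.pm: $!";
print {$a_fh} "package My::A;\nuse My::B;\nsub name { 'A' }\n1;\n";
close $a_fh;

unshift @INC, $lib;
require My::A;      # the load step that "use My::A ();" performs in startup.pl

# %INC records every file perl has loaded so far.
print "My/A.pm loaded: ", ( exists $INC{'My/A.pm'} ? 'yes' : 'no' ), "\n";
print "My/B.pm loaded: ", ( exists $INC{'My/B.pm'} ? 'yes' : 'no' ), "\n";
```

Both lines report "yes": loading My::A compiles its "use My::B" line, so My::B is in memory before any child process is forked.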