Posted to modperl@perl.apache.org by Jason Czerak <Ja...@Jasnik.net> on 2002/01/01 21:02:23 UTC

Re: Fast template system. Ideas, theories and tools

On Sun, 2001-12-30 at 19:47, Ryan Thompson wrote:
> Mark Maunder wrote to Ryan Thompson:
> 
> > Ryan Thompson wrote:
> >
> > > There must be a faster way. I have thought about pre-compiling each
> > > HTML file into a Perl module, but there would have to be an automated
> > > (and secure) way to suck these in if the original file changes.
> > >
> > > Either that, or maybe someone has written a better parser. My code
> > > looks something like this, to give you an idea of what I need:
> >
> > Sure there are tons of good template systems out there. I think
> > someone made a comment about writing a template system being a
> > rite of passage as a Perl developer. But it's also more fun to do
> > it yourself.
> 
> :-)
> 
> 
> > I guess you've tried compiling your regex with the o modifier?
> 
> Yep, problem is there are several of them. I've done some work
> recently to simplify things, which might have a positive effect.
> 
> 
> > Also, have you tried caching your HTML in global package variables
> > instead of shared memory?  I think it may be a bit faster than
> > shared memory segments like Apache::Cache uses. (The first request
> > for each child will be slower, but after they've each served once,
> > they'll all be fast). Does your engine stat (access) the html file
> > on disk for each request? You mentioned you're caching, but
> > perhaps you're checking for changes to the file. Try to stat as
> 
> My caching algorithm uses 2 levels:
> 
> When an HTML file is requested, the instance of my template class
> checks in its memory cache. If it finds it there, great... everything
> is done within that server process.
> 
> If it's not in the memory cache, it checks in a central MySQL cache
> database on the local machine. These requests are on the order of a
> few ms, thanks to an optimized query and Apache::DBI. NOT a big deal.
> 
> If it's not in either cache, it takes its lumps and goes to disk.
> 
> In each cache, I use a TTL. (time() + $TTL), which is configurable,
> and usually set to something like 5 minutes in production, or 60
> seconds during development/bug fixes. (And, for this kind of data, 5
> minutes is pretty granular, as templates don't change very often.. but
> setting it any higher would, on average, have only a negligible
> improvement in performance at the risk of annoying developers :-).
> 
> And, with debugging in my template module turned on, it has been
> observed that cache misses are VERY infrequent (< 0.1% of all
> requests).
> 
> In fact, if I use this cache system and disable all parsing (i.e.,
> just use it to include straight HTML into mod_perl apps), I can serve
> 150-200 requests/second on the same system.
> 
> With my parsing regexps enabled, it drops to 50-60 requests/second.
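
(For reference, the two-level TTL lookup described above could be sketched
roughly like this. The names and the disk fallback are illustrative, not
Ryan's actual code, and the shared MySQL level is elided to keep it short:)

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sketch: a per-process memory cache in a package hash,
# with entries that expire after $TTL seconds. The real setup falls
# back to a shared MySQL cache before hitting disk; this sketch goes
# straight to disk for the sake of the example.
my %CACHE;       # filename => { body => ..., expires => ... }
my $TTL = 300;   # 5 minutes in production, 60 seconds in development

sub fetch_template {
    my ($file) = @_;

    # Level 1: check the in-process memory cache.
    my $hit = $CACHE{$file};
    return $hit->{body} if $hit && $hit->{expires} > time();

    # Cache miss: take our lumps and go to disk.
    open my $fh, '<', $file or die "can't read $file: $!";
    my $body = do { local $/; <$fh> };
    close $fh;

    $CACHE{$file} = { body => $body, expires => time() + $TTL };
    return $body;
}
```

Under mod_perl each child keeps its own %CACHE, so the first request in
each child is a miss; after that, everything is served from process
memory until the TTL expires.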
> 
First, I would love to know what kind of machine you're using here. :)

Second, I'm creating a template-based shopping cart. Yeah, there are a
ton of carts out there, but they're all either underpowered, under-featured,
and under-coded, or overpowered, over-featured, and over-coded. Plus, most
of the free ones don't support multiple domains, and I'm not in the mood
to duplicate a good single-domain cart 'x' times per client.

Key features were easy integration into web sites and 'pure
integration' for a custom look and feel.

I looked at just about every template system on CPAN and came across
Text::Template. Anyone use this one? I don't require if/thens within
the template. In fact, all I really need are callbacks to an API that I
am currently developing. And in the case that I do need if/thens, I can
use pure Perl within the template, even do SQL queries within the
template, or not use the API at all if I wanted to.

There are, however, a few quirks that I need to work out with
Text::Template. I plan on posting a thread about it later today, once I
make sure it's not my coding.

Anyway, getting closer to my point here: the way my cart works is that
you have something like this.

<html>
<body>
<table><tr><td>
{ # display category view here
	%vars = ( # this modifies the global defaults for HTML output
		'border'      => '1',
		'cellpadding' => '2',
	);
	&JC::API::cat_view(\%vars);
	'';  # return an empty string so the block interpolates nothing
}
</td></tr></table>
</body></html>

That's basically it. You make callbacks to various parts of the API to
create the look and feel of the site based on what is in the database.

I implemented some Google-style timing in the API. It basically gets a
Time::HiRes timestamp at the beginning, does the math at the very end,
and posts it in an HTML comment. On my dev machine, a dual PIII-750
with a gig of RAM and mod_perl running with full caching enabled and a
fully persistent code base for the cart, my average transaction time is
about .08 of a second. That leads me to think that my machine can handle
10 HTML page generations a second (this isn't an exact science, but
close).
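
(The timing wrapper amounts to something like this; the handler coderef
and the comment format are made up for illustration:)

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Google-style timing: grab a high-resolution timestamp up front,
# do the math at the very end, and append the elapsed time to the
# page as an HTML comment.
sub timed_page {
    my ($generate) = @_;          # coderef that builds the page
    my $t0   = [gettimeofday];
    my $html = $generate->();
    my $secs = tv_interval($t0);  # elapsed seconds since $t0
    return $html . sprintf("\n<!-- generated in %.4f s -->\n", $secs);
}

my $page = timed_page(sub { "<html><body>cart</body></html>" });
print $page;
```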

Now, as for the bug I think I am having with Text::Template: using
STDOUT when I compile the template and execute all the callbacks and
SQL queries and such fixes things for now. (The buffering mode it can
use to compile the output into a variable is not working; it processes
the output of the Perl API and prints it in the incorrect spot, at the
beginning of the output.)
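
(For anyone who wants to poke at it, the two fill_in modes I'm comparing
look roughly like this. The template text is made up, but both
parameters are from Text::Template's documented interface:)

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Template;

my $tmpl = Text::Template->new(
    TYPE   => 'STRING',
    SOURCE => "before { \$x * 2 } after\n",
) or die "couldn't construct template: $Text::Template::ERROR";

# Buffered mode: fill_in returns the finished page as one string
# (this is the mode that misplaces my API output).
my $page = $tmpl->fill_in(HASH => { x => 21 });
# $page is "before 42 after\n"

# Workaround: hand fill_in a filehandle via OUTPUT and it prints each
# chunk as it is produced, so nothing gets buffered out of order.
$tmpl->fill_in(HASH => { x => 21 }, OUTPUT => \*STDOUT);
```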

The second feature I'm not using is caching of output. I'm wondering
how to handle that; I would need this Text::Template buffering bit to
work properly first. But I found a few nice mod_perl-aware caching
tools on CPAN that I'll use to do the caching.

Third, my SQL queries are not the most optimized and are mostly tossed
together. But then again, my development tables are not that big at all.

Keep in mind that my times also include the time it takes to transmit
the data from server to client, as well as the time it takes to compile
the template and do SQL queries. I'm not sure if your times are
factoring that in. Locally, times are .08 sec, but remotely, from T1 to
cable, times are .16 to .20 sec depending on network conditions and,
naturally, the size of the HTML file. Some of the HTML files can be
about 50K in size, we found.

Also, pages will do about 3 SQL queries on average, plus queries to the
Apache::Session state information.

Do these numbers sound right? There are places where I could move from
doing SQL to keeping some data in Apache::Session, but it would be a
small speed improvement unless the categories are very deep; I do have
recursive SQL queries in place for some things.

You're talking about 50 to 60/sec as slow, so I dunno what 'fast' is.

I do plan on releasing this cart under the GPL one of these days, once
I know it works. And I do plan on having a web site explaining it in
full detail. If anyone is interested right now in its development, give
me a ring. I'll be glad to get some help with it (especially with the
FedEx and UPS shipping calcs :) ).

If anyone would like to see a working site using the cart that I am
working on, it's over at http://www2.test.jasnik.net.

--
Jason Czerak




Re: Fast template system. Ideas, theories and tools

Posted by Thomas Eibner <th...@stderr.net>.
On Tue, Jan 01, 2002 at 03:02:23PM -0500, Jason Czerak wrote:
> I looked at just about every template system on CPAN and came across
> text::template. Anyone use this one? I don't require if/then's within
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^
A template system should (IMO) do exactly that. The logic shouldn't be
embedded in the templates. That's why I settled on CGI::FastTemplate
(Yes, yet another templating system). 

> the template. Infact all I really need are call backs to an API that I
> am currently developing. And in the case that I do need if/then's. I can
> use pure perl within the template. Even do SQL queries with-in the
> template, or not even use the API at all if I wanted to.

I don't want any code/SQL in my templates, and it works out fine that
way for whatever project I do. And it's certainly easier for the
project's web designer to grasp when there's no code they can break by
using whatever Windows app they want to edit their HTML.

I tried Mason and others of the more featureful template systems out
there, but I always ended up going back to CGI::FastTemplate.

my $cent = 2; # on templates

-- 
  Thomas Eibner <http://thomas.eibner.dk/> DnsZone <http://dnszone.org/>
  mod_pointer <http://stderr.net/mod_pointer> 


Re: Fast template system. Ideas, theories and tools

Posted by Stas Bekman <st...@stason.org>.
Perrin Harkins wrote:

>>What do you suggest as a good benchmark tool to use that would be
>>'smart' when testing a complete site?
>>
> 
> For pounding a bunch of URLs, the best are ab, httperf, and http_load.  If
> you need something fancier that tests a complex series of actions and
> responses, there are several packages on CPAN.  They've been discussed on
> this list, but I haven't tried any of them myself so I can't comment on
> them.  They will not scale as easily as the first three I mentioned though,
> so if you need to test hundreds of requests per second with them you will
> need multiple machines to run the clients on.


Like Perrin says, also see some examples in the guide.

http://perl.apache.org/guide/performance.html


If you really need to write many benchmarks, try my never-released 
Apache::Benchmark; if you find it useful, it should be easy to port 
it to use HTTPD::Bench::ApacheBench, as it currently calls 'ab' and 
parses the output.  Get the package from here:

http://stason.org/works/modules/Apache-Benchmark-0.01.tar.gz


_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:stas@stason.org  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/


Re: Fast template system. Ideas, theories and tools

Posted by Perrin Harkins <pe...@elem.com>.
> What do you suggest as a good benchmark tool to use that would be
> 'smart' when testing a complete site?

For pounding a bunch of URLs, the best are ab, httperf, and http_load.  If
you need something fancier that tests a complex series of actions and
responses, there are several packages on CPAN.  They've been discussed on
this list, but I haven't tried any of them myself so I can't comment on
them.  They will not scale as easily as the first three I mentioned though,
so if you need to test hundreds of requests per second with them you will
need multiple machines to run the clients on.

- Perrin


Re: Fast template system. Ideas, theories and tools

Posted by Jason Czerak <Ja...@Jasnik.net>.
On Thu, 2002-01-03 at 12:20, Perrin Harkins wrote:
<snip> 
> > I implemented some Google-style timing in the API. It basically gets a
> > Time::HiRes timestamp at the beginning and does the math at the very end
> > and posts it in an HTML comment.
> 
> You'd be better off with Devel::DProf (or Apache::DProf under mod_perl).
> 
> > My average transaction time is
> > about .08 of a second. That leads me to think that my machine can handle
> > 10 html page generations a second (this isn't an exact science, but
> > close).
> 
> You are assuming serial execution.  You should be able to push much more
> than that through in a second because of parallel execution.
> 
What do you suggest as a good benchmark tool to use that would be
'smart' when testing a complete site?

--
Jason Czerak


Re: Fast template system. Ideas, theories and tools

Posted by Perrin Harkins <pe...@elem.com>.
> I looked at just about every template system on CPAN and came across
> text::template. Anyone use this one?

I'd suggest you read my overview of templating options.  It summarizes
the top choices for templating tools and talks about the strengths and
weaknesses of Text::Template:
http://perl.apache.org/features/tmpl-cmp.html

> I implemented some Google-style timing in the API. It basically gets a
> Time::HiRes timestamp at the beginning and does the math at the very end
> and posts it in an HTML comment.

You'd be better off with Devel::DProf (or Apache::DProf under mod_perl).

> My average transaction time is
> about .08 of a second. That leads me to think that my machine can handle
> 10 html page generations a second (this isn't an exact science, but
> close).

You are assuming serial execution.  You should be able to push much more
than that through in a second because of parallel execution.

> Third, my SQL queries are not the most optimized and are mostly tossed
> together.

DBIx::Profile can help you identify problems in your queries.  And follow
the optimization advice for DBI in the guide.

- Perrin


Re: Fast template system. Ideas, theories and tools

Posted by Ryan Thompson <ry...@sasknow.com>.
Jason Czerak wrote to modperl@apache.org:

> >
> > In fact, if I use this cache system and disable all parsing (i.e.,
> > just use it to include straight HTML into mod_perl apps), I can serve
> > 150-200 requests/second on the same system.
> >
> > With my parsing regexps enabled, it drops to 50-60 requests/second.
>
> First. I would love to know what kinda machine your using here. :)

It's part of a testing environment. Nothing fancy... Lightly loaded
single CPU PII-400MHz, 512MB RAM, Adaptec Ultra160 SCSI (no RAID),
FreeBSD 4.4-RELEASE, with some in-house kernel patches applied, latest
MySQL, mod_perl, and modules... It's great for testing, because when we
move things to the REAL server (more power, more load), I know they'll
be at least as fast over there and won't crack under pressure :-)

The database this runs off of has several tables. The smallest has
about 15K rows. The largest is pushing 300K rows.


> Third, my SQL queries are not the most optimized and are mostly tossed
> together. But then again, my development tables are not that big at
> all.

This might be one of your largest areas for improvement. Thanks to
practical and theoretical experience with RDBMSs, I write fairly
well-optimized SQL when I "throw things together", but I recently
rewrote a JOIN in one of my applications that was "fast enough" and
doubled the observed performance (oops... should have noticed that the
first time). Check your table design and indexing, too; these areas are
tremendously important, and many people overlook them.

Even a development table that is "not that big" might be a few
thousand rows. If your query is poorly designed and your DBMS has to
read every row in the table, you're going to take a noticeable hit
over a number of queries. Worse yet, if, for example, you're taking
the Cartesian product by mistake (this is easy enough to do), pretty
quickly you've got hundreds of thousands of rows to deal with. :-)

I guess what I'm saying is, never underestimate the importance of
carefully constructed SQL.


> Keep in mind that my times also include the time it takes to
> transmit the data from server to client, as well as the time it
> takes to compile the template and do SQL queries. I'm not sure if
> your times are factoring that in. Locally, times are .08 sec, but
> remotely, from T1 to cable, times are .16 to .20 sec depending on
> network conditions and, naturally, the size of the HTML file. Some
> of the HTML files can be about 50K in size, we found.

Generally when testing application performance, it is wise to
eliminate highly variable things like network performance. Network
importance might be significant in your application deployment, but
you can test that in isolation with the many great network tools out
there.


> Also, pages will do about 3 sql queries on average plus queries to
> the Apache::Session state information.

Take advantage of query caching. Instead of

$cursor = $dbh->prepare("SELECT * FROM yourtable WHERE foo = 'bar'");

use

$cursor = $dbh->prepare('SELECT * FROM yourtable WHERE foo = ?');
$cursor->bind_param(1, 'bar');
$cursor->execute;

This way, your SQL database can cache the query so that it does not
have to be parsed and examined each time. Note that this does NOT
cache the results of the queries. You can do that as well, but it can
lead to nasty surprises.

Also, try to limit the total number of SQL queries you make per
request. Take advantage of joins, group by, etc, if you can.


> Do these numbers sound right? There are places where I could move
> from doing SQL to keeping some data in Apache::Session, but it
> would be a small speed improvement unless the categories are very
> deep; I do have recursive SQL queries in place for some things.
>
> You're talking about 50 to 60/sec as slow, so I dunno what 'fast' is.

I "optimized" a few things wrt. the parsing and am now seeing
110-115/sec pretty consistently. IIRC, this server won't handle much
more than that without SQL.

- Ryan

-- 
  Ryan Thompson <ry...@sasknow.com>
  Network Administrator, Accounts

  SaskNow Technologies - http://www.sasknow.com
  #106-380 3120 8th St E - Saskatoon, SK - S7H 0W2

        Tel: 306-664-3600   Fax: 306-664-1161   Saskatoon
  Toll-Free: 877-727-5669     (877-SASKNOW)     North America