Posted to modperl@perl.apache.org by Stas Bekman <sb...@stason.org> on 2000/06/07 18:45:25 UTC

[performance/benchmark] printing techniques

Following Tim's comments here is the new benchmark. (I'll address the
buffering issue in another post)

  use Benchmark;
  use Symbol;

  my $fh = gensym;
  open $fh, ">/dev/null" or die;
  
  sub multi_print{
    print $fh "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">";
    print $fh "<HTML>";
    print $fh "  <HEAD>";
    print $fh "    <TITLE>";
    print $fh "      Test page";
    print $fh "    </TITLE>";
    print $fh "  </HEAD>";
    print $fh "  <BODY BGCOLOR=\"black\" TEXT=\"white\">";
    print $fh "    <H1> ";
    print $fh "      Test page ";
    print $fh "    </H1>";
    print $fh "    <A HREF=\"foo.html\">foo</A>";
    print $fh "    <HR>";
    print $fh "  </BODY>";
    print $fh "</HTML>";
  }
  
  sub single_print{
    print $fh qq{<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
  <HEAD>
    <TITLE>
      Test page
    </TITLE>
  </HEAD>
  <BODY BGCOLOR="black" TEXT="white">
    <H1> 
      Test page 
    </H1>
    <A HREF="foo.html">foo</A>
    <HR>
  </BODY>
</HTML>
    };
  }
  
  sub here_print{
    print $fh <<__EOT__;
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
  <HEAD>
    <TITLE>
      Test page
    </TITLE>
  </HEAD>
  <BODY BGCOLOR="black" TEXT="white">
    <H1> 
      Test page 
    </H1>
    <A HREF="foo.html">foo</A>
    <HR>
  </BODY>
</HTML>
__EOT__
  }
  
  sub list_print{
    print $fh "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">",
              "<HTML>",
              "  <HEAD>",
              "    <TITLE>",
              "      Test page",
              "    </TITLE>",
              "  </HEAD>",
              "  <BODY BGCOLOR=\"black\" TEXT=\"white\">",
              "    <H1> ",
              "      Test page ",
              "    </H1>",
              "    <A HREF=\"foo.html\">foo</A>",
              "    <HR>",
              "  </BODY>",
              "</HTML>";
  }
  
  timethese
    (500_000, {
          list_print   => \&list_print,
          multi_print  => \&multi_print,
          single_print => \&single_print,
          here_print   => \&here_print,
          });

And the results are:

  single_print:  1 wallclock secs ( 1.74 usr +  0.05 sys =  1.79 CPU)
  here_print:    3 wallclock secs ( 1.79 usr +  0.07 sys =  1.86 CPU)
  list_print:    7 wallclock secs ( 6.57 usr +  0.01 sys =  6.58 CPU)
  multi_print:  10 wallclock secs (10.72 usr +  0.03 sys = 10.75 CPU)

The numbers tell it all: 'single_print' is the fastest, 'here_print' is
almost as fast, 'list_print' is quite slow and 'multi_print' is the
slowest.

Now let's run the same benchmark with unbuffered prints, by changing
the beginning of the code to:

  use Symbol;
  my $fh = gensym;
  open $fh, ">/dev/null" or die;
  
     # make all the calls unbuffered
  my $oldfh = select($fh);
  $| = 1;
  select($oldfh);

And the results are:

  single_print:  4 wallclock secs ( 2.28 usr +  0.47 sys =  2.75 CPU)
  here_print:    2 wallclock secs ( 2.45 usr +  0.45 sys =  2.90 CPU)
  list_print:    7 wallclock secs ( 7.17 usr +  0.45 sys =  7.62 CPU)
  multi_print:  23 wallclock secs (17.52 usr +  5.72 sys = 23.24 CPU)

The results are worse by a factor of roughly 1.5 to 2, except for
'list_print', which changed very little.

So if you want better performance, you know which technique to use.
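
For readers who find the select()/$| idiom hard to read, here is a minimal
sketch of the same autoflush setting written with IO::Handle. The helper name
open_unbuffered is an assumption made for illustration; it is not part of the
benchmark above, and this form has not been timed here.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Spec;

# Hypothetical helper (not from the benchmark): open a file for writing
# and enable autoflush on it, which is what "$| = 1" does for the
# currently selected handle.
sub open_unbuffered {
    my ($path) = @_;
    open my $fh, '>', $path or die "Cannot open $path: $!";
    $fh->autoflush(1);
    return $fh;
}

my $fh = open_unbuffered(File::Spec->devnull);
print {$fh} "<HTML>\n" or die "print failed: $!";
```

With autoflush on, every print is flushed straight away, which is probably
why multi_print (fifteen prints per call) suffers most in the numbers above.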

_____________________________________________________________________
Stas Bekman              JAm_pH     --   Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide  http://perl.apache.org/guide 
mailto:stas@stason.org   http://perl.org     http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org

Re: [performance/benchmark] printing techniques

Posted by Gisle Aas <gi...@ActiveState.com>.

Stas Bekman <sb...@stason.org> writes:

> And the results are:
> 
>   single_print:  1 wallclock secs ( 1.74 usr +  0.05 sys =  1.79 CPU)
>   here_print:    3 wallclock secs ( 1.79 usr +  0.07 sys =  1.86 CPU)
>   list_print:    7 wallclock secs ( 6.57 usr +  0.01 sys =  6.58 CPU)
>   multi_print:  10 wallclock secs (10.72 usr +  0.03 sys = 10.75 CPU)
> 
> Numbers tell it all, I<'single_print'> is the fastest, 'here_print' is
> almost of the same speed,

'single_print' and 'here_print' compile down to exactly the same code,
so there should not be any real difference between them.
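
One way to check this yourself is to deparse both forms and compare the
result. This is only a sketch: the two subs below are shortened stand-ins for
single_print and here_print (they return the string instead of printing it),
and B::Deparse output is just a convenient proxy for "the same code".

```perl
use strict;
use warnings;
use B::Deparse;

# Shortened stand-in for single_print: the text is built with qq{}.
sub single_print_text {
    return qq{<HTML>
  <HEAD>
  </HEAD>
</HTML>
};
}

# Shortened stand-in for here_print: the same text as a here-document.
sub here_print_text {
    return <<__EOT__;
<HTML>
  <HEAD>
  </HEAD>
</HTML>
__EOT__
}

my $deparse = B::Deparse->new;
my $single  = $deparse->coderef2text(\&single_print_text);
my $here    = $deparse->coderef2text(\&here_print_text);

print $single eq $here ? "same code\n" : "different code\n";
```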

-- 
Gisle Aas

Re: Method overhead benchmarks [Was: [performance/benchmark] printing techniques]

Posted by Barrie Slaymaker <ba...@slaysys.com>.

Matt Sergeant wrote:
> 
> You also forgot that print() goes to a tied STDOUT, which is even more of
> an overhead...

Yeah, that'd probably swamp almost all other effects Stas is testing right
there, and it explains Stas's test results when varying $|.

- Barrie

Re: Method overhead benchmarks [Was: [performance/benchmark] printing techniques]

Posted by Matt Sergeant <ma...@sergeant.org>.

On Thu, 8 Jun 2000, Barrie Slaymaker wrote:

> Stephen Zander wrote:
> > 
> > As Matt has already commented, in the handler the method call
> > overheads swamps all the other activities.
> 
> Just to clarify: that's only important if you are doing very few other
> activities, or if those other activities also include a high percentage 
> of method calls:

You also forgot that print() goes to a tied STDOUT, which is even more of
an overhead...
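
To make the tied-handle point concrete, here is a small sketch. The
CountingHandle class is hypothetical and is not how mod_perl implements its
STDOUT; it only shows that every print() on a tied handle turns into a
Perl-level call to the class's PRINT method.

```perl
use strict;
use warnings;
use Symbol;

# Hypothetical tie class, for illustration only: it records how many
# times PRINT is invoked and collects the output in memory.
package CountingHandle;

sub TIEHANDLE {
    my ($class) = @_;
    my $self = { calls => 0, out => '' };
    return bless $self, $class;
}

# Called once for every print() on the tied handle.
sub PRINT {
    my $self = shift;
    $self->{calls}++;
    $self->{out} .= join '', @_;
    return 1;
}

package main;

my $fh = gensym;
tie *$fh, 'CountingHandle';

print $fh "<HTML>\n";
print $fh "  <HEAD>\n";
print $fh "</HTML>\n";

my $calls = tied(*$fh)->{calls};
print STDOUT "PRINT was called $calls times\n";
```

So a routine like multi_print pays the method call once per line, while a
single print pays it once per page.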

-- 
<Matt/>

Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org

Method overhead benchmarks [Was: [performance/benchmark] printing techniques]

Posted by Barrie Slaymaker <ba...@slaysys.com>.

Stephen Zander wrote:
> 
> As Matt has already commented, in the handler the method call
> overheads swamps all the other activities.

Just to clarify: that's only important if you are doing very few other
activities, or if those other activities also include a high percentage 
of method calls:

###### Using an empty A::a() (see below):

Benchmark: running $a->a(), A->a(), A::a( "A" ), A::a( $a ), A::a(), a(), each for at least 3 CPU seconds...
   $a->a():  4 wallclock secs ( 3.24 usr +  0.02 sys =  3.26 CPU) @ 511465.64/s (n=1667378)
    A->a():  2 wallclock secs ( 3.28 usr +  0.00 sys =  3.28 CPU) @ 290696.34/s (n=953484)
A::a( "A" ):  3 wallclock secs ( 3.08 usr +  0.00 sys =  3.08 CPU) @ 610704.55/s (n=1880970)
A::a( $a ):  3 wallclock secs ( 3.07 usr +  0.00 sys =  3.07 CPU) @ 623101.63/s (n=1912922)
    A::a():  3 wallclock secs ( 3.22 usr +  0.00 sys =  3.22 CPU) @ 611970.19/s (n=1970544)
       a():  3 wallclock secs ( 3.14 usr +  0.00 sys =  3.14 CPU) @ 622945.22/s (n=1956048)

                Rate   A->a()  $a->a() A::a( "A" )   A::a()       a() A::a( $a )
A->a()      290696/s       --     -43%        -52%     -52%      -53%       -53%
$a->a()     511466/s      76%       --        -16%     -16%      -18%       -18%
A::a( "A" ) 610705/s     110%      19%          --      -0%       -2%        -2%
A::a()      611970/s     111%      20%          0%       --       -2%        -2%
a()         622945/s     114%      22%          2%       2%        --        -0%
A::a( $a )  623102/s     114%      22%          2%       2%        0%         --

###### And doing some trivial work in A::a():

[barries@jester tmp]$ perl ./cmp2
Name "A::b" used only once: possible typo at ./cmp2 line 7.
Benchmark: running $a->a(), A->a(), A::a( "A" ), A::a( $a ), a(), each for at least 3 CPU seconds...
   $a->a():  5 wallclock secs ( 3.19 usr +  0.00 sys =  3.19 CPU) @ 64346.39/s (n=205265)
    A->a():  4 wallclock secs ( 3.21 usr +  0.00 sys =  3.21 CPU) @ 54105.30/s (n=173678)
A::a( "A" ):  2 wallclock secs ( 3.09 usr +  0.00 sys =  3.09 CPU) @ 70333.66/s (n=217331)
A::a( $a ):  3 wallclock secs ( 3.19 usr +  0.00 sys =  3.19 CPU) @ 68128.84/s (n=217331)
       a():  4 wallclock secs ( 3.10 usr +  0.00 sys =  3.10 CPU) @ 72231.29/s (n=223917)

               Rate      A->a()     $a->a()  A::a( $a ) A::a( "A" )         a()
A->a()      54105/s          --        -16%        -21%        -23%        -25%
$a->a()     64346/s         19%          --         -6%         -9%        -11%
A::a( $a )  68129/s         26%          6%          --         -3%         -6%
A::a( "A" ) 70334/s         30%          9%          3%          --         -3%
a()         72231/s         34%         12%          6%          3%          --



You can see that even doing a few things in A::a() causes the relative slowdown
due to method call overhead to drop significantly.  I suspect that your opcode
count has more to do with it than method overhead, unless I've goofed the
benchmarks somehow.

Hmmm, maybe this shows it best (though I did tweak aggrlist_print to
print STDERR):


[barries@jester tmp]$ perl ./aggr_cmp  2> /dev/null
Benchmark: running $a->aggrlist_print(), A->aggrlist_print(), aggrlist_print( $a ), each for at least 3 CPU seconds...
$a->aggrlist_print():  3 wallclock secs ( 2.69 usr +  0.41 sys =  3.10 CPU) @ 15104.19/s (n=46823)
A->aggrlist_print():  3 wallclock secs ( 2.72 usr +  0.41 sys =  3.13 CPU) @ 14492.01/s (n=45360)
aggrlist_print( $a ):  3 wallclock secs ( 2.81 usr +  0.24 sys =  3.05 CPU) @ 15529.18/s (n=47364)

                        Rate A->aggrlist_print() $a->aggrlist_print() aggrlist_print( $a )
A->aggrlist_print()  14492/s                  --                  -4%                  -7%
$a->aggrlist_print() 15104/s                  4%                   --                  -3%
aggrlist_print( $a ) 15529/s                  7%                   3%                   --


- Barrie

[barries@jester tmp]$ cat cmp
#!/usr/local/bin/perl -w

## cmp

package A ;

use Benchmark qw( cmpthese ) ;

sub a {}

my $a = bless {}, 'A' ;

cmpthese( -3, {
   'a()'         => sub { a() },
   'A::a()'      => sub { A::a() },
   'A::a( "A" )' => sub { A::a( "A" ) },
   'A::a( $a )'  => sub { A::a( $a )  },
   'A->a()'      => sub { A->a()      },
   '$a->a()'     => sub { $a->a()     },
} ) ;


[barries@jester tmp]$ cat cmp2
#!/usr/local/bin/perl -w

## cmp2

package A ;

$b = {} ;

sub a {
   my $self = shift ;
   my ( $a, $b, $c ) = @_ ;
   $b->{FOO} = 'bar' ;
}

use Benchmark qw( cmpthese ) ;

my $a = bless {}, 'A' ;

cmpthese( -3, {
   'a()'         => sub { a()         },
   'A::a( "A" )' => sub { A::a( "A" ) },
   'A::a( $a )'  => sub { A::a( $a )  },
   'A->a()'      => sub { A->a()      },
   '$a->a()'     => sub { $a->a()     },
} ) ;


[barries@jester tmp]$ cat aggr_cmp 
#!/usr/local/bin/perl -w

package A ;

sub aggrlist_print{
   my @buffer = ();
   push @buffer,"<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">\n";
   push @buffer,"<HTML>\n";
   push @buffer,"  <HEAD>\n";
   push @buffer,"    <TITLE>\n";
   push @buffer,"      Test page\n";
   push @buffer,"    </TITLE>\n";
   push @buffer,"  </HEAD>\n";
   push @buffer,"  <BODY BGCOLOR=\"black\" TEXT=\"white\">\n";
   push @buffer,"    <H1> \n";
   push @buffer,"      Test page \n";
   push @buffer,"    </H1>\n";
   push @buffer,"    <A HREF=\"foo.html\">foo</A>\n";
   push @buffer,"    <HR>\n";
   push @buffer,"  </BODY>\n";
   push @buffer,"</HTML>\n";
   print STDERR @buffer;
}

use Benchmark qw( cmpthese ) ;

my $a = bless {}, 'A' ;

cmpthese( -3, {
   'aggrlist_print( $a )' => sub { aggrlist_print( $a ) },
   'A->aggrlist_print()'  => sub { A->aggrlist_print()  },
   '$a->aggrlist_print()' => sub { $a->aggrlist_print() },
} ) ;

Re: [performance/benchmark] printing techniques

Posted by Matt Sergeant <ma...@sergeant.org>.

On 8 Jun 2000, Stephen Zander wrote:

> As Matt has already commented, in the handler the method call
> overheads swamps all the other activities. so concat_print &
> aggrlist_print (yes, method invocation in perl really is that bad).
> When you remove that overhead the extra OPs in aggrlist_print become
> the dominating factor.

Perhaps it would be worth testing the horribly ugly:

Apache::print($r, <output>);

Rather than plain print().

-- 
<Matt/>

Re: [performance/benchmark] printing techniques

Posted by Stephen Zander <gi...@pobox.com>.

>>>>> "Stas" == Stas Bekman <st...@stason.org> writes:
    Stas> Is this a question or a suggestion? but in both cases
    Stas> (mod_perl and perl benchmark) the process doesn't exit, so
    Stas> the allocated datastructure is reused... anyway it should be
    Stas> the same. But it's not.

It was a suggestion.  Examining the optrees produced by aggrlist_print
and the following two routines which should be equivalent to
concat_print and multi_print from your original posting

   sub concat_print{
      my $buffer;
      $buffer .= "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">\n";
      $buffer .= "<HTML>\n";
      $buffer .= "  <HEAD>\n";
      $buffer .= "</HTML>\n";
      print $buffer;
   }

   sub aggrlist_print{
      my @buffer = ();
      push @buffer,"<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">\n";
      push @buffer,"<HTML>\n";
      push @buffer,"  <HEAD>\n";
      push @buffer,"</HTML>\n";
      print @buffer;
   }

   sub multi_print{
      print "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">\n";
      print "<HTML>\n";
      print "  <HEAD>\n";
      print "</HTML>\n";
   }

shows that aggrlist_print performs 25% more OPs than concat_print and 43%
more OPs than multi_print.
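
For anyone who wants to look at the optrees themselves, here is a rough
sketch using the core B module. The count_ops helper and the count_op method
are names made up for this example. It counts the nodes in each sub's
compiled tree (including ops the optimizer has nulled out), which is not
the same as the number of OPs executed, so the figures will not match the
percentages above and will differ between perl versions.

```perl
use strict;
use warnings;
use B ();

my $op_count = 0;

# Hypothetical method added to B::OP so that walkoptree can call it
# on every node it visits.
sub B::OP::count_op { $op_count++ }

# Count the nodes in the optree of a code reference.
sub count_ops {
    my ($code) = @_;
    $op_count = 0;
    B::walkoptree(B::svref_2object($code)->ROOT, 'count_op');
    return $op_count;
}

sub concat_print {
    my $buffer;
    $buffer .= "<HTML>\n";
    $buffer .= "</HTML>\n";
    print $buffer;
}

sub aggrlist_print {
    my @buffer = ();
    push @buffer, "<HTML>\n";
    push @buffer, "</HTML>\n";
    print @buffer;
}

sub multi_print {
    print "<HTML>\n";
    print "</HTML>\n";
}

# The subs are only compiled and inspected here, never called.
my %subs = (
    concat_print   => \&concat_print,
    aggrlist_print => \&aggrlist_print,
    multi_print    => \&multi_print,
);
printf "%-15s %d ops\n", $_, count_ops($subs{$_}) for sort keys %subs;
```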

    Stas> handler:
    Stas> concat_print    |    111      5000      0    876 
    Stas> aggrlist_print  |    113      5000      0    862 
    Stas> multi_print     |    118      5000      0    820 

    Stas> buffered benchmark:

    Stas> concat_print:    8 wallclock secs ( 8.23 usr +  0.05 sys =  8.28 CPU)
    Stas> multi_print:    10 wallclock secs (10.70 usr +  0.01 sys = 10.71 CPU)
    Stas> aggrlist_print: 30 wallclock secs (31.06 usr +  0.04 sys = 31.10 CPU)

    Stas> Watch the aggrlist_print gives such a bad perl benchmark,
    Stas> but very good handler benchmark...

As Matt has already commented, in the handler the method call
overhead swamps all the other activities, so concat_print &
aggrlist_print come out about even (yes, method invocation in perl
really is that bad).  When you remove that overhead the extra OPs in
aggrlist_print become the dominating factor.

-- 
Stephen

"So if she weighs the same as a duck, she's made of wood."... "And
therefore?"... "A witch!"

Re: [performance/benchmark] printing techniques

Posted by Stas Bekman <st...@stason.org>.

Stephen Zander wrote:
> 
> >>>>> "Stas" == Stas Bekman <sb...@stason.org> writes:
>     Stas> Ouch :( Someone to explain this phenomena? and it's just
>     Stas> fine under the handler.... puzzled, what can I say...
> 
> Continuous array growth and copying?

Is this a question or a suggestion? In both cases (mod_perl and perl
benchmark) the process doesn't exit, so the allocated data structure is
reused... so it should be the same. But it's not.

Just a reminder of the context (please quote the relevant parts, or it's
impossible to understand what you are talking about. Thanks!):

handler:

single_print    |    108      5000      0    890 
here_print      |    110      5000      0    887 
concat_print    |    111      5000      0    876 
aggrlist_print  |    113      5000      0    862 
list_print      |    113      5000      0    861 
multi_print     |    118      5000      0    820 

unbuffered benchmark:

single_print:    2 wallclock secs ( 2.29 usr +  0.46 sys =  2.75 CPU)
here_print:      2 wallclock secs ( 2.42 usr +  0.50 sys =  2.92 CPU)
list_print:      7 wallclock secs ( 7.26 usr +  0.53 sys =  7.79 CPU)
concat_print:    9 wallclock secs ( 8.90 usr +  0.60 sys =  9.50 CPU)
aggrlist_print: 32 wallclock secs (32.37 usr +  0.71 sys = 33.08 CPU)
multi_print:    21 wallclock secs (16.47 usr +  5.84 sys = 22.31 CPU)

buffered benchmark:

single_print:    3 wallclock secs ( 1.69 usr +  0.02 sys =  1.71 CPU)
here_print:      3 wallclock secs ( 1.76 usr +  0.01 sys =  1.77 CPU)
list_print:      7 wallclock secs ( 6.41 usr +  0.03 sys =  6.44 CPU)
concat_print:    8 wallclock secs ( 8.23 usr +  0.05 sys =  8.28 CPU)
multi_print:    10 wallclock secs (10.70 usr +  0.01 sys = 10.71 CPU)
aggrlist_print: 30 wallclock secs (31.06 usr +  0.04 sys = 31.10 CPU)

Notice that aggrlist_print gives such a bad perl benchmark result, but a very
good handler benchmark result...

  sub aggrlist_print{
    my @buffer = ();
    push @buffer,"<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">\n";
    push @buffer,"<HTML>\n";
    push @buffer,"  <HEAD>\n";
[snip]
    push @buffer,"</HTML>\n";
    print @buffer;
  }


 
Re: [performance/benchmark] printing techniques

Posted by Stephen Zander <gi...@pobox.com>.

>>>>> "Stas" == Stas Bekman <sb...@stason.org> writes:
    Stas> Ouch :( Someone to explain this phenomena? and it's just
    Stas> fine under the handler.... puzzled, what can I say...

Continuous array growth and copying?

-- 
Stephen

"So if she weighs the same as a duck, she's made of wood."... "And
therefore?"... "A witch!"

Re: [performance/benchmark] printing techniques

Posted by Stas Bekman <sb...@stason.org>.

>     What the other programmer here and I do is setup an array and push()
>     our lines of output onto it throughout all our code, and print it at
>     the very end.  I'd be interested in seeing benchmarks of this vs.
>     the other methods.  I'll try to find the time to run them. 

handler:
query           | avtime completed failed    rps 
-------------------------------------------------
single_print    |    108      5000      0    890 
here_print      |    110      5000      0    887 
concat_print    |    111      5000      0    876 
aggrlist_print  |    113      5000      0    862 
list_print      |    113      5000      0    861 
multi_print     |    118      5000      0    820 
-------------------------------------------------


unbuffered benchmark

single_print:    2 wallclock secs ( 2.29 usr +  0.46 sys =  2.75 CPU)
here_print:      2 wallclock secs ( 2.42 usr +  0.50 sys =  2.92 CPU)
list_print:      7 wallclock secs ( 7.26 usr +  0.53 sys =  7.79 CPU)
concat_print:    9 wallclock secs ( 8.90 usr +  0.60 sys =  9.50 CPU)
aggrlist_print: 32 wallclock secs (32.37 usr +  0.71 sys = 33.08 CPU)
multi_print:    21 wallclock secs (16.47 usr +  5.84 sys = 22.31 CPU)

buffered benchmark

single_print:    3 wallclock secs ( 1.69 usr +  0.02 sys =  1.71 CPU)
here_print:      3 wallclock secs ( 1.76 usr +  0.01 sys =  1.77 CPU)
list_print:      7 wallclock secs ( 6.41 usr +  0.03 sys =  6.44 CPU)
concat_print:    8 wallclock secs ( 8.23 usr +  0.05 sys =  8.28 CPU)
multi_print:    10 wallclock secs (10.70 usr +  0.01 sys = 10.71 CPU)
aggrlist_print: 30 wallclock secs (31.06 usr +  0.04 sys = 31.10 CPU)

Ouch :( Can someone explain this phenomenon? It's just fine under the
handler.... puzzled, what can I say...


here is the code delta...

  sub aggrlist_print{
    my @buffer = ();
    push @buffer,"<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">\n";
    push @buffer,"<HTML>\n";
    push @buffer,"  <HEAD>\n";
    push @buffer,"    <TITLE>\n";
    push @buffer,"      Test page\n";
    push @buffer,"    </TITLE>\n";
    push @buffer,"  </HEAD>\n";
    push @buffer,"  <BODY BGCOLOR=\"black\" TEXT=\"white\">\n";
    push @buffer,"    <H1> \n";
    push @buffer,"      Test page \n";
    push @buffer,"    </H1>\n";
    push @buffer,"    <A HREF=\"foo.html\">foo</A>\n";
    push @buffer,"    <HR>\n";
    push @buffer,"  </BODY>\n";
    push @buffer,"</HTML>\n";
    print @buffer;
  }


Re: [performance/benchmark] printing techniques

Posted by Frank Wiles <fr...@wiles.org>.

 .------[ Jeff Norman wrote (2000/06/07 at 14:27:29) ]------
 | 
 |  Frequently, it's hard to build up an entire output segment without
 |  code in-between the different additions to the output.  I guess you could
 |  call this the "append, append, append... output" technique.
 |  
 |  I think it would be an interesting addition to the benchmark:
 |  
 |     sub gather_print{
 |       my $buffer = '';
 |       $buffer .= "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">";
 |       $buffer .= "<HTML>";     
 |       $buffer .= "  <HEAD>";   
 |       $buffer .= "    <TITLE>";
 |       $buffer .= "      Test page";
 |       $buffer .= "    </TITLE>";
 |       $buffer .= "  </HEAD>";
 |       $buffer .= "  <BODY BGCOLOR=\"black\" TEXT=\"white\">";
 |       $buffer .= "    <H1> ";
 |       $buffer .= "      Test page ";
 |       $buffer .= "    </H1>";
 |       $buffer .= "    <A HREF=\"foo.html\">foo</A>";
 |       $buffer .= "    <HR>"; 
 |       $buffer .= "  </BODY>";
 |       $buffer .= "</HTML>";
 |       print $fh $buffer;
 |     }
 |  
 `-------------------------------------------------

    What the other programmer here and I do is set up an array and push()
    our lines of output onto it throughout all our code, and print it at
    the very end.  I'd be interested in seeing benchmarks of this vs.
    the other methods.  I'll try to find the time to run them. 

 -------------------------------
  Frank Wiles <fr...@wiles.org>
  http://frank.wiles.org
 -------------------------------

Re: [performance/benchmark] printing techniques

Posted by Perrin Harkins <pe...@primenet.com>.

On Wed, 7 Jun 2000, Stas Bekman wrote:
> And the results are:
> 
>   single_print:  1 wallclock secs ( 1.74 usr +  0.05 sys =  1.79 CPU)
>   here_print:    3 wallclock secs ( 1.79 usr +  0.07 sys =  1.86 CPU)
>   list_print:    7 wallclock secs ( 6.57 usr +  0.01 sys =  6.58 CPU)
>   multi_print:  10 wallclock secs (10.72 usr +  0.03 sys = 10.75 CPU)

Never mind the performance, multi_print and list_print are just
U-G-L-Y!  Here doc rules.  Great for SQL too.
- Perrin

Re: Template techniques

Posted by "Randal L. Schwartz" <me...@stonehenge.com>.

>>>>> "Bernhard" == Bernhard Graf <gr...@speedlink.de> writes:

Bernhard> Chris Winters wrote:
>> The newest version of Template Toolkit (currently in alpha) supports
>> compiling templates to perl code. See about 2/3 of the way down the
>> README at www.template-toolkit.org. Why reinvent the wheel? :)

Bernhard> Also the current stable (1.06) can do this.

And Mason was doing this from the beginning. :)

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<me...@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

[OT now] Re: Template techniques

Posted by Drew Taylor <dt...@vialogix.com>.

Andy Wardley wrote:
> 
> On Jun 8,  1:56pm, Perrin Harkins wrote:
> > Not quite.  The current version uses its own system of opcodes (!) which
> > are implemented as closures.  Compiling to perl code gives much better
> > performance, which is why Andy is changing this.
> 
> Yep, Perrin's right.  Version 1 compiled templates to tree form.  Items
> in the tree were scalars (plain text) or references to directive objects
> which performed some processing (like INCLUDE another template, and so
> on).
> 
> This is actually pretty efficient when you have a limited directive set,
> but doesn't scale very well.  For version 1.00 I was more concerned
> about getting it functioning correctly than running fast (it was already
> an order of magnitude or two faster than Text::MetaText, the predecessor,
> so I was happy).  Also it was much easier to develop and evolve the toolkit
> with the tree-form architecture than when compiling to Perl, so it had some
> hidden benefit.

I was wondering if anyone had done comparisons between some of the
major templating engines. I'm thinking specifically of Template Toolkit,
Mason, HTML::Template, and EmbPerl. I currently use HTML::Template, and
am happy with it. But I am always open to suggestions.

I really like the fact that templates can be compiled to perl code &
cached. Any others besides Mason & EmbPerl (and TT in the near future)?


-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/

Re: Template techniques

Posted by Andy Wardley <ab...@cre.canon.co.uk>.

On Jun 8,  1:56pm, Perrin Harkins wrote:
> Not quite.  The current version uses its own system of opcodes (!) which
> are implemented as closures.  Compiling to perl code gives much better
> performance, which is why Andy is changing this.

Yep, Perrin's right.  Version 1 compiled templates to tree form.  Items
in the tree were scalars (plain text) or references to directive objects
which performed some processing (like INCLUDE another template, and so
on).

This is actually pretty efficient when you have a limited directive set,
but doesn't scale very well.  For version 1.00 I was more concerned
about getting it functioning correctly than running fast (it was already
an order of magnitude or two faster than Text::MetaText, the predecessor,
so I was happy).  Also it was much easier to develop and evolve the toolkit
with the tree-form architecture than when compiling to Perl, so it had some
hidden benefit.

But the current alpha version of 2.00 compiles templates to Perl code.  A
template like this:

   [% INCLUDE header
      title = "This is a test"
   %]

    Blah Blah Blah [% foo %]

   [% INCLUDE footer %]

is compiled to something resembling

   sub {
        my $context = shift;
        my $stash   = $context->stash();
        my $output  = '';

        $output .= $context->include('header', { title => 'This is a test' });
        $output .= "\nBlah Blah Blah ";
        $output .= $stash->get('foo');

        return $output;
   }

Apart from the benefits of speed, this also means that you can cache
compiled templates to disk (i.e. write the Perl to a file).  Thus you
can run a web server directly from template components compiled to Perl
and you don't even need to load the template parser.  Ideally, it should
be possible to integrate compiled TT templates with Mason components
and/or any other template form which gets compiled to Perl.

A

-- 
Andy Wardley <ab...@kfs.org>   Signature regenerating.  Please remain seated.
     <ab...@cre.canon.co.uk>   For a good time: http://www.kfs.org/~abw/

Re: Template techniques

Posted by Perrin Harkins <pe...@primenet.com>.

On Thu, 8 Jun 2000, Bernhard Graf wrote:

> Chris Winters wrote:
> 
> > The newest version of Template Toolkit (currently in alpha) supports
> > compiling templates to perl code. See about 2/3 of the way down the
> > README at www.template-toolkit.org. Why reinvent the wheel? :)
> 
> Also the current stable (1.06) can do this.

Not quite.  The current version uses its own system of opcodes (!) which
are implemented as closures.  Compiling to perl code gives much better
performance, which is why Andy is changing this.

Template Toolkit rocks, and will rock even more when it has the extra
speed.

- Perrin

Re: Template techniques

Posted by Bernhard Graf <gr...@speedlink.de>.

Chris Winters wrote:

> The newest version of Template Toolkit (currently in alpha) supports
> compiling templates to perl code. See about 2/3 of the way down the
> README at www.template-toolkit.org. Why reinvent the wheel? :)

Also the current stable (1.06) can do this.

-- 
    Bernhard Graf   -- s p e e d l i n k . d e --   fon +49-30-28000-182
graf@speedlink.de      http://www.speedlink.de      fax +49-30-28000-22
   y a programmer         Dircksenstraße 47         D-10178 Berlin

Re: Template techniques [ newbie alert + long ]

Posted by "Randal L. Schwartz" <me...@stonehenge.com>.

>>>>> "Perrin" == Perrin Harkins <pe...@primenet.com> writes:

Perrin> I think the world's record for most compact implementation
Perrin> goes to Randal for a small post you can find in the archive here:
Perrin> http://forum.swarthmore.edu/epigone/modperl/beldkhinfol/8ciuhsney0.fsf@gadget.cscaper.com

Ahh yes, Apache::Cachet (it's a cache, eh?), mostly proof of concept,
aborted when I started using HTML::Mason in a serious way.


Re: Template techniques [ newbie alert + long ]

Posted by Ged Haywood <ge...@jubileegroup.co.uk>.

Hi there,

On Thu, 8 Jun 2000, Perrin Harkins wrote:

> use references for passing data.

But see "Advanced Perl Programming" pages 9 (Performance Efficiency)
and 44 (Using Typeglob Aliases).

73,
Ged.

Re: Template techniques [ newbie alert + long ]

Posted by Perrin Harkins <pe...@primenet.com>.

On Thu, 8 Jun 2000, Greg Cope wrote:
> > > - the area I was trying to explore was how to read a template (all
> > > HTML with a few <!--TAGS--> in it) and the sub in the new content.
> > 
> > Embperl would work fine for that, but it's overkill.  Your substitution
> > approach is slower than compiling to perl subs, especially since you have
> > to load the file, but saves lots of memory and is fine for something as
> > simple as this.
> 
> Can you enlighten me into the compiling to perl subs ?

It's what Matt was talking about.  Your program parses the template,
generates perl code that produces the correct output, evals the code, and
stores the results in a sub reference which you can call whenever you want
that template.

The first time I ever saw this done was with ePerl, but I don't know if
that was really the first.  All the embedded perl systems popular around
here (Embperl, Apache::ASP, Mason, etc.) use some variation on this
technique.  I think the world's record for most compact implementation
goes to Randal for a small post you can find in the archive here:
http://forum.swarthmore.edu/epigone/modperl/beldkhinfol/8ciuhsney0.fsf@gadget.cscaper.com
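
A minimal sketch of the technique described above, assuming templates made of
HTML with <!--NAME--> markers. The compile_template function and the marker
syntax are assumptions for illustration and are not taken from any of the
systems named here: the template is parsed, Perl source is generated, the
source is eval'ed once, and the resulting sub reference is called for every
request.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a template containing <!--NAME--> markers
# into Perl source, then eval it into a reusable code reference.
sub compile_template {
    my ($template) = @_;

    # With a capturing split, even indexes hold literal text and odd
    # indexes hold marker names.
    my @parts = split /<!--(\w+)-->/, $template;

    my $src = "sub {\n    my (\$vars) = \@_;\n    my \$out = '';\n";
    for my $i (0 .. $#parts) {
        if ($i % 2) {
            # Marker: look the value up at run time.
            $src .= "    \$out .= \$vars->{'$parts[$i]'} // '';\n";
        }
        else {
            # Literal text: escape it for a single-quoted Perl string.
            (my $text = $parts[$i]) =~ s/([\\'])/\\$1/g;
            $src .= "    \$out .= '$text';\n" if length $text;
        }
    }
    $src .= "    return \$out;\n}\n";

    my $code = eval $src or die "template compile failed: $@";
    return $code;
}

# Compile once (e.g. at server startup), call on every request.
my $page = compile_template("<H1><!--TITLE--></H1>\n<P><!--BODY--></P>\n");
print $page->({ TITLE => 'Test page', BODY => 'foo' });
```

The parsing cost is paid once at compile time; each request only runs a
series of string appends.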

> The file gets loaded once into shared memory - most (stripped) HTML
> files are only a few 10's of K.
> 
> Also the file gets loaded once at startup - not during the request
> stage.

You probably won't get much faster than that then, no matter what you do.  
Just make sure your regexps are fast (maybe use "study"?) and use
references for passing data.

- Perrin

Re: Template techniques [ newbie alert + long ]

Posted by Greg Cope <gj...@rubberplant.freeserve.co.uk>.

Perrin Harkins wrote:
> 
> On Thu, 8 Jun 2000, Greg Cope wrote:
> > My original question was not related to templates (I'll use embperl for
> > that)
> 
> Well, I'm confused now.  You'll use Embperl for templates but you're not
> using Embperl for templates?

I use Embperl when I want a templating system - but not when using HTML
templates (wrong use of names on my part) - I am referring to a template
in this case as an HTML file with a few special tags.

> 
> > - the area I was trying to explore was how to read a template (all
> > HTML with a few <!--TAGS--> in it) and the sub in the new content.
> 
> Embperl would work fine for that, but it's overkill.  Your substitution
> approach is slower than compiling to perl subs, especially since you have
> to load the file, but saves lots of memory and is fine for something as
> simple as this.

Can you enlighten me about compiling to perl subs?

The file gets loaded once into shared memory - most (stripped) HTML
files are only a few 10's of K.

Also the file gets loaded once at startup - not during the request
stage.

> > Has anyone any suggestions as to speeding this up - yet keeping it
> > simple - I have played with references to avoid all the variable copying
> > etc . ?
> 
> Caching templates in memory would certainly help, but you'll eat up a
> chunk of RAM.

The HTML is usually a reasonable size, and the code I C&P'ed collapses
the template into one long string - it strips the spaces / tabs (designers
making things all indented etc ..) from each end of every line, and chomps.

Also the templates are modular - in that one template covers the main part
of the page, and other templates cover the rest.  This helps continuity
in HTML design etc .. (i.e. only make one change in one place)

Thanks for the input.

Greg

> 
> - Perrin

Re: Template techniques [ newbie alert + long ]

Posted by Perrin Harkins <pe...@primenet.com>.

On Thu, 8 Jun 2000, Greg Cope wrote:
> My original question was not related to templates (I'll use embperl for
> that)

Well, I'm confused now.  You'll use Embperl for templates but you're not
using Embperl for templates?

> - the area I was trying to explore was how to read a template (all
> HTML with a few <!--TAGS--> in it) and the sub in the new content.

Embperl would work fine for that, but it's overkill.  Your substitution
approach is slower than compiling to perl subs, especially since you have
to load the file, but saves lots of memory and is fine for something as
simple as this.

> Has anyone any suggestions as to speeding this up - yet keeping it
> simple - I have played with references to avoid all the variable copying
> etc . ?

Caching templates in memory would certainly help, but you'll eat up a
chunk of RAM.

- Perrin

Re: Template techniques [ newbie alert + long ]

Posted by Greg Cope <gj...@rubberplant.freeserve.co.uk>.

Chris Winters wrote:
> 
> * shane@isupportlive.com (shane@isupportlive.com) [000608 11:07]:
> > I'm curious Matt, as opposed to what?, reparsing the template each
> > run?  Clearly reparsing would be a big loser in terms of performance.
> >
> > But what other technique could be used..., hrm.., without direct
> > control over the pipe, I really don't think it would get too much
> > better than this.  I mean, you could yank out sections and stuff it
> > into an array that would be like: text, coderef, coderef, text, etc.
> > Like in an ASP template you would parse the file, grab sections
> > between <% %> and eval it as a code ref, and stuff it into your array.
> > But this would probably not work specifically in ASP's case, but you
> > might be able to pull it off in Embperl.  (Unless the array itself
> > could also point to arrays, etc.)  Overall..., I think compiling it
> > directly makes a lot more sense in 99.999% of template languages...,
> > otherwise you'd end up putting too many restrictions on the template
> > language itself.
> >
> > Hmm..., sort of an interesting question, what ways could be utilized
> > in order to maximize speed in template execution.  I thought about
> > this a while ago, but after the fact I have to agree with Matt...,
> > just evaling each template as a package, or a code ref would be a
> > lot quicker, and if you could cook up another scheme, the resulting
> > code complexity might not pan out to be worth it.
> 
> The newest version of Template Toolkit (currently in alpha) supports
> compiling templates to perl code. See about 2/3 of the way down
> the README at www.template-toolkit.org. Why reinvent the wheel? :)

My original question was not related to templates (I'll use embperl for
that) - the area I was trying to explore was how to read a template (all
HTML with a few <!--TAGS--> in it) and then sub in the new content.

My pages usually have several templates - or one larger template with
comments around iterative bits (tables) so that I end up with a main
template which may look something like:

<html>
<head>
<title>
<!--TITLE-->
</title>
</head>
<body <!-- BODY -->>
<table>
<!--TABLE-->
</table>
</body></html>

and a table template:

<tr><td valign='foo' color='bar'><!--CELL--></td></tr>

The Designers / HTML coders I work with can then chop and change these
templates - I have made a template loader that strips the iterative
table content out into a separate template, so that they can code a fake
entry for design reasons - and I then strip it out (looking for some
special HTML comment tags).

In a startup.pl the code then looks something like this (forgive the probably
uncompilable perl etc ... this is for explanation purposes):

use vars qw($MAINTMP $TABLETMP);

$MAINTMP = load('main.tmp');
$TABLETMP = load('table.tmp');


sub load {

        my $file = shift;
        my $html = '';

        open (TEMPLATE, $file) || die ("Can't open $file: $!");
        while (<TEMPLATE>) {
                chomp;
                # strip the designers' indentation from both ends
                s/^\s+//;
                s/\s+$//;
                $html .= $_;
        }
        close (TEMPLATE)  || die ("Can't close $file: $!");
        return $html;
}


then in a handler:

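sub makeTableContents {

	# assumes @_ is a nice list in order of values
	# to insert into the table ..
	my $tableHTML = '';
	foreach my $value (@_) {
		# start from a fresh copy of the row template each time,
		# otherwise the tag is gone after the first substitution
		my $row = $TABLETMP;
		$row =~ s/<!--CELL-->/$value/;
		$tableHTML .= $row;
	}
	return $tableHTML;
}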


sub doStuff {
	
	# assumes that it is passed in the return value from
	# makeTableContents and a title
	my $tableHTML = shift;
	my $title = shift;
	my $template = $MAINTMP;
	
	$template =~ s/<!--TITLE-->/$title/;
	$template =~ s/<!--TABLE-->/$tableHTML/;

	return $template;
}

Then when I want to send it I just print the return value of doStuff

This may not be OO - but it is simple, and although not benchmarked,
should be fast.  I have seen another OO-ish implementation that uses a
hash a bit like Matt's example - this allows more flexibility as all you
need do is add the tag to the HTML, and then add the tag to the hash -
then when you do the regex it gets put in.
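
A rough sketch of that hash-driven variant, assuming the same <!--TAG-->
markers.  The name fill_template is invented for illustration, and the
choice to replace unknown tags with an empty string is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: fill every <!--TAG--> in $$template_ref from the
# hash %$values.  The template is passed by reference, so the cached
# string is copied exactly once, inside the sub.
sub fill_template {
    my ($template_ref, $values) = @_;
    my $html = $$template_ref;   # copy so the cached template stays intact

    # One pass over the string; each tag name is looked up in the hash.
    $html =~ s{<!--\s*(\w+)\s*-->}
              { exists $values->{$1} ? $values->{$1} : '' }ge;
    return $html;
}

my $main = '<html><head><title><!--TITLE--></title></head>'
         . '<body><table><!--TABLE--></table></body></html>';

print fill_template(\$main, {
    TITLE => 'Test page',
    TABLE => '<tr><td>one</td></tr>',
}), "\n";
```

Adding a new tag then means touching only the HTML and the hash; the
substitution code itself does not change.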

Has anyone any suggestions as to speeding this up - yet keeping it
simple - I have played with references to avoid all the variable copying
etc . ?

Thinking about it, Matt's example is far more scalable, and reusable.

I am enjoying this thread ;-)

Greg Cope


> Chris
> 
> --
> Chris Winters
> Internet Developer    INTES Networking
> cwinters@intes.net    http://www.intes.net/
> Integrated hardware/software solutions to make the Internet work for you.

Re: Template techniques

Posted by Chris Winters <cw...@intes.net>.

* shane@isupportlive.com (shane@isupportlive.com) [000608 11:07]:
> I'm curious Matt, as opposed to what?, reparsing the template each
> run?  Clearly reparsing would be a big loser in terms of performance.
> 
> But what other technique could be used..., hrm.., without direct
> control over the pipe, I really don't think it would get too much
> better than this.  I mean, you could yank out sections and stuff it
> into an array that would be like: text, coderef, coderef, text, etc.
> Like in an ASP template you would parse the file, grab sections
> between <% %> and eval it as a code ref, and stuff it into your array.
> But this would probably not work specifically in ASP's case, but you
> might be able to pull it off in Embperl.  (Unless the array itself
> could also point to arrays, etc.)  Overall..., I think compiling it
> directly makes a lot more sense in 99.999% of template languages...,
> otherwise you'd end up putting too many restrictions on the template
> language itself.
> 
> Hmm..., sort of an interesting question, what ways could be utilized
> in order to maximize speed in template execution.  I thought about
> this a while ago, but after the fact I have to agree with Matt...,
> just evaling each template as a package, or a code ref would be a
> lot quicker, and if you could cook up another scheme, the resulting
> code complexity might not pan out to be worth it.

The newest version of Template Toolkit (currently in alpha) supports
compiling templates to perl code. See about 2/3 of the way down
the README at www.template-toolkit.org. Why reinvent the wheel? :)

Chris

-- 
Chris Winters
Internet Developer    INTES Networking
cwinters@intes.net    http://www.intes.net/
Integrated hardware/software solutions to make the Internet work for you.

Re: Template techniques

Posted by Matt Sergeant <ma...@sergeant.org>.

On Thu, 8 Jun 2000 shane@isupportlive.com wrote:

> > As far as I've seen, the fastest template systems are the ones that
> > convert the template to Perl code. So that's what I do. The templates all
> > call a method (in my case $Response->Write()) which appends to a
> > string. If there are no exceptions (see the guide) the string is sent to
> > the browser. If there are exceptions, I parse/send an error template with
> > the error in the template.
> 
> I'm curious Matt, as opposed to what?, reparsing the template each
> run?  Clearly reparsing would be a big loser in terms of performance.

As opposed to parsing into a tree and working from that.

-- 
<Matt/>

Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org

Re: Template techniques

Posted by sh...@isupportlive.com.

On Thu, Jun 08, 2000 at 01:48:40PM +0100, Matt Sergeant wrote:
> On Thu, 8 Jun 2000, Greg Cope wrote:
> 
> > This may be veering off topic - but it's been on my mind for a while now ....
> > 
> > Apart from thanking Stas for his benchmark work, which I find very
> > interesting (does he sleep ;-) - this and a few others (benchmarks) have
> > all touched on the area of including mod_perl output within HTML.  I have
> > always wondered what everyone else is doing on this front.
> > 
> > I usually suck a template into memory (one long line) - usually done at
> > startup.  I then create all the content by either pushing onto an array, or
> > .= string concatenation.  Finally I regex the template - looking for my tags
> > and replace those with output.  Needless to say, one page can consist of
> > many templates (page or inside of table (bits from <tr> </tr>) etc ...).
> > 
> > From Stas' previous benchmarks I've preloaded the mysql driver and now
> > usually use the "push" onto array to prepare content - Thanks Stas.
> > 
> > How does everyone else do it ? Can this type of operation (that everyone
> > must do at some time) be optimised as aggressively as some of the others ?
> > Yet still keep the abstraction between design and content.
> 
> As far as I've seen, the fastest template systems are the ones that
> convert the template to Perl code. So that's what I do. The templates all
> call a method (in my case $Response->Write()) which appends to a
> string. If there are no exceptions (see the guide) the string is sent to
> the browser. If there are exceptions, I parse/send an error template with
> the error in the template.

I'm curious Matt, as opposed to what?, reparsing the template each
run?  Clearly reparsing would be a big loser in terms of performance.

But what other technique could be used..., hrm.., without direct
control over the pipe, I really don't think it would get too much
better than this.  I mean, you could yank out sections and stuff it
into an array that would be like: text, coderef, coderef, text, etc.
Like in an ASP template you would parse the file, grab sections
between <% %> and eval it as a code ref, and stuff it into your array.
But this would probably not work specifically in ASP's case, but you
might be able to pull it off in Embperl.  (Unless the array itself
could also point to arrays, etc.)  Overall..., I think compiling it
directly makes a lot more sense in 99.999% of template languages...,
otherwise you'd end up putting too many restrictions on the template
language itself.
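
For what it's worth, the "text, coderef, coderef, text" array could be
sketched roughly as below.  The <% %> delimiters follow the ASP example
above; parse_segments and run_segments are invented names, and the code
blocks are assumed to come from trusted template authors, since they are
eval'ed as Perl.

```perl
use strict;
use warnings;

# Hypothetical parser: text outside <% %> is kept as a plain string,
# code inside <% %> is eval'ed once into a code ref.
sub parse_segments {
    my ($source) = @_;
    my @segments;
    my @chunks = split /<%(.*?)%>/s, $source;
    for my $i (0 .. $#chunks) {
        if ($i % 2) {
            # odd positions hold the code between the delimiters
            my $code = eval "sub { $chunks[$i] }"
                or die "bad code block: $@";
            push @segments, $code;
        }
        elsif (length $chunks[$i]) {
            push @segments, $chunks[$i];
        }
    }
    return \@segments;
}

# Running the template is one pass over the array: strings are used
# as they are, code refs are called with the request's arguments.
sub run_segments {
    my ($segments, @args) = @_;
    return join '', map { ref($_) ? $_->(@args) : $_ } @$segments;
}

my $tmpl = parse_segments('<p>Hello <% $_[0] %>, 2 + 2 = <% 2 + 2 %></p>');
print run_segments($tmpl, 'world'), "\n";
```

Each code block here has to be a complete expression on its own, which
is exactly the kind of restriction on the template language mentioned
above: a loop opened in one <% %> block and closed in another cannot be
expressed this way.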

Hmm..., sort of an interesting question, what ways could be utilized
in order to maximize speed in template execution.  I thought about
this a while ago, but after the fact I have to agree with Matt...,
just evaling each template as a package, or a code ref would be a
lot quicker, and if you could cook up another scheme, the resulting
code complexity might not pan out to be worth it.

> Of course I don't know if it's the fastest possible method - I prefer to
> code cleanly first and worry about performance later. Much later. Clean
> code tends to lend itself to better performance in the long run anyway,
> because it's easier to optimise serious performance problems away.

Can't really disagree with that.  Clean code is 100x easier to work on
later.

Shane.

Template techniques

Posted by Matt Sergeant <ma...@sergeant.org>.

On Thu, 8 Jun 2000, Greg Cope wrote:

> This may be veering off topic - but it's been on my mind for a while now ....
> 
> Apart from thanking Stas for his benchmark work, which I find very
> interesting (does he sleep ;-) - this and a few others (benchmarks) have
> all touched on the area of including mod_perl output within HTML.  I have
> always wondered what everyone else is doing on this front.
> 
> I usually suck a template into memory (one long line) - usually done at
> startup.  I then create all the content by either pushing onto an array, or
> .= string concatenation.  Finally I regex the template - looking for my tags
> and replace those with output.  Needless to say, one page can consist of
> many templates (page or inside of table (bits from <tr> </tr>) etc ...).
> 
> From Stas' previous benchmarks I've preloaded the mysql driver and now
> usually use the "push" onto array to prepare content - Thanks Stas.
> 
> How does everyone else do it ? Can this type of operation (that everyone
> must do at some time) be optimised as aggressively as some of the others ?
> Yet still keep the abstraction between design and content.

As far as I've seen, the fastest template systems are the ones that
convert the template to Perl code. So that's what I do. The templates all
call a method (in my case $Response->Write()) which appends to a
string. If there are no exceptions (see the guide) the string is sent to
the browser. If there are exceptions, I parse/send an error template with
the error in the template.
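
Roughly, and only as a sketch: My::Response below is a made-up stand-in
for the $Response object, not the API of any real module, and the
templates are assumed to be code refs that have already been compiled.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the $Response object described above.
package My::Response;

sub new   { my $class = shift; return bless { buffer => '' }, $class }
sub Write { my $self = shift; $self->{buffer} .= join '', @_; return }
sub body  { return $_[0]{buffer} }

package main;

# $template and $error_template are code refs from compiled templates.
sub render {
    my ($template, $error_template) = @_;
    my $Response = My::Response->new;

    # Nothing is sent while the template runs, so a failure half way
    # through never leaks a partial page to the browser.
    my $ok = eval { $template->($Response); 1 };
    return $Response->body if $ok;

    # On an exception, throw the partial output away and render the
    # error template into a fresh buffer instead.
    my $error = $@ || 'unknown error';
    my $ErrorResponse = My::Response->new;
    $error_template->($ErrorResponse, $error);
    return $ErrorResponse->body;
}

my $good = sub { $_[0]->Write('<h1>', 'Test page', '</h1>') };
my $bad  = sub { $_[0]->Write('<h1>'); die "database is down\n" };
my $oops = sub { $_[0]->Write("<p>Error: $_[1]</p>") };

print render($good, $oops), "\n";
print render($bad,  $oops), "\n";
```

The caller prints whatever render returns in one go, which is the
"send everything at once" behaviour discussed in the benchmark thread.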

Of course I don't know if it's the fastest possible method - I prefer to
code cleanly first and worry about performance later. Much later. Clean
code tends to lend itself to better performance in the long run anyway,
because it's easier to optimise serious performance problems away.

-- 
<Matt/>

Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org

Re: [performance/benchmark] printing techniques

Posted by Greg Cope <gj...@rubberplant.freeserve.co.uk>.

From: "Matt Sergeant" <ma...@sergeant.org>
To: "Stas Bekman" <sb...@stason.org>
Cc: "___cliff rayman___" <cl...@genwax.com>; <mo...@apache.org>
Sent: 08 June 2000 09:23
Subject: Re: [performance/benchmark] printing techniques

: On Wed, 7 Jun 2000, Stas Bekman wrote:
:
: > On Wed, 7 Jun 2000, ___cliff rayman___ wrote:
: >
: > >
: > >
: > > Stas Bekman wrote:
: > >
: > > >
: > > >
: > > > Per your request:
: > > >
: > > > The handler:
: > > >
: > > > query         | avtime completed failed    rps
: > > > -----------------------------------------------
: > > > single_print  |    110      5000      0    881
: > > > here_print    |    111      5000      0    881
: > > > list_print    |    111      5000      0    880
: > > > concat_print  |    111      5000      0    873
: > > > multi_print   |    119      5000      0    820
: > > > -----------------------------------------------
: > >
: > > not very much difference once stuck in a handler.
: > > obviously multi_print is both ugly and slow, but the rest should be used by the
: > > discretion of the programmer based on the one that is easiest to maintain in
: > > the code.
: >
: > absolutely. I'd also love to know why is it different under the handler.
: > (talking about relative performance!)
:
: Because as I said - the method dispatch and the overhead of the mod_perl
: handler takes over. multi-print is the only one that has to call methods
: several times. The rest are almost equal.
:
: This also demonstrates some of the value in template systems that send all
: their output at once, however often these template systems use method
: calls too, so it all gets messed up.

This may be veering off topic - but it's been on my mind for a while now ....

Apart from thanking Stas for his benchmark work, which I find very
interesting (does he sleep ;-) - this and a few others (benchmarks) have
all touched on the area of including mod_perl output within HTML.  I have
always wondered what everyone else is doing on this front.

I usually suck a template into memory (one long line) - usually done at
startup.  I then create all the content by either pushing onto an array, or
.= string concatenation.  Finally I regex the template - looking for my tags
and replace those with output.  Needless to say, one page can consist of
many templates (page or inside of table (bits from <tr> </tr>) etc ...).

Re: [performance/benchmark] printing techniques

Posted by Matt Sergeant <ma...@sergeant.org>.

On Wed, 7 Jun 2000, Stas Bekman wrote:

> On Wed, 7 Jun 2000, ___cliff rayman___ wrote:
> 
> > 
> > 
> > Stas Bekman wrote:
> > 
> > >
> > >
> > > Per your request:
> > >
> > > The handler:
> > >
> > > query         | avtime completed failed    rps
> > > -----------------------------------------------
> > > single_print  |    110      5000      0    881
> > > here_print    |    111      5000      0    881
> > > list_print    |    111      5000      0    880
> > > concat_print  |    111      5000      0    873
> > > multi_print   |    119      5000      0    820
> > > -----------------------------------------------
> > 
> > not very much difference once stuck in a handler.
> > obviously multi_print is both ugly and slow, but the rest should be used by the
> > discretion of the programmer based on the one that is easiest to maintain in
> > the code.
> 
> absolutely. I'd also love to know why is it different under the handler.
> (talking about relative performance!)

Because as I said - the method dispatch and the overhead of the mod_perl
handler takes over. multi-print is the only one that has to call methods
several times. The rest are almost equal.

This also demonstrates some of the value in template systems that send all
their output at once, however often these template systems use method
calls too, so it all gets messed up.

-- 
<Matt/>

Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org

Re: [performance/benchmark] printing techniques

Posted by Stas Bekman <sb...@stason.org>.

On Wed, 7 Jun 2000, ___cliff rayman___ wrote:

> 
> 
> Stas Bekman wrote:
> 
> >
> >
> > Per your request:
> >
> > The handler:
> >
> > query         | avtime completed failed    rps
> > -----------------------------------------------
> > single_print  |    110      5000      0    881
> > here_print    |    111      5000      0    881
> > list_print    |    111      5000      0    880
> > concat_print  |    111      5000      0    873
> > multi_print   |    119      5000      0    820
> > -----------------------------------------------
> 
> not very much difference once stuck in a handler.
> obviously multi_print is both ugly and slow, but the rest should be used by the
> discretion of the programmer based on the one that is easiest to maintain in
> the code.

absolutely. I'd also love to know why is it different under the handler.
(talking about relative performance!)

> > The benchmark unbuffered:
> > single_print:  2 wallclock secs ( 2.44 usr +  0.31 sys =  2.75 CPU)
> > here_print:    4 wallclock secs ( 2.34 usr +  0.54 sys =  2.88 CPU)
> > list_print:    8 wallclock secs ( 7.06 usr +  0.43 sys =  7.49 CPU)
> > concat_print:  9 wallclock secs ( 8.95 usr +  0.66 sys =  9.61 CPU)
> > multi_print:  22 wallclock secs (16.94 usr +  5.74 sys = 22.68 CPU)
> >
> > The benchmark unbuffered:
> 
> should this say "The benchmark buffered"??

oops, buffered of course (copy-n-paste typo)

> > single_print:  1 wallclock secs ( 1.70 usr +  0.02 sys =  1.72 CPU)
> > here_print:    1 wallclock secs ( 1.78 usr +  0.01 sys =  1.79 CPU)
> > list_print:    7 wallclock secs ( 6.44 usr +  0.05 sys =  6.49 CPU)
> > concat_print:  9 wallclock secs ( 8.04 usr +  0.06 sys =  8.10 CPU)
> > multi_print:  10 wallclock secs (10.56 usr +  0.09 sys = 10.65 CPU)
> >
> > The interesting thing is that list_print and concat_print are quite bad in
> > the benchmark but very good in the handler. The rest holds.
> 
> --
> ___cliff rayman___www.genwax.com___cliff@genwax.com___
> 
> 
> 



_____________________________________________________________________
Stas Bekman              JAm_pH     --   Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide  http://perl.apache.org/guide 
mailto:stas@stason.org   http://perl.org     http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org

Re: [performance/benchmark] printing techniques

Posted by ___cliff rayman___ <cl...@genwax.com>.


Stas Bekman wrote:

>
>
> Per your request:
>
> The handler:
>
> query         | avtime completed failed    rps
> -----------------------------------------------
> single_print  |    110      5000      0    881
> here_print    |    111      5000      0    881
> list_print    |    111      5000      0    880
> concat_print  |    111      5000      0    873
> multi_print   |    119      5000      0    820
> -----------------------------------------------

not very much difference once stuck in a handler.
obviously multi_print is both ugly and slow, but the rest should be used by the
discretion of the programmer based on the one that is easiest to maintain in
the code.

>
>
> The benchmark unbuffered:
> single_print:  2 wallclock secs ( 2.44 usr +  0.31 sys =  2.75 CPU)
> here_print:    4 wallclock secs ( 2.34 usr +  0.54 sys =  2.88 CPU)
> list_print:    8 wallclock secs ( 7.06 usr +  0.43 sys =  7.49 CPU)
> concat_print:  9 wallclock secs ( 8.95 usr +  0.66 sys =  9.61 CPU)
> multi_print:  22 wallclock secs (16.94 usr +  5.74 sys = 22.68 CPU)
>
> The benchmark unbuffered:

should this say "The benchmark buffered"??

>
> single_print:  1 wallclock secs ( 1.70 usr +  0.02 sys =  1.72 CPU)
> here_print:    1 wallclock secs ( 1.78 usr +  0.01 sys =  1.79 CPU)
> list_print:    7 wallclock secs ( 6.44 usr +  0.05 sys =  6.49 CPU)
> concat_print:  9 wallclock secs ( 8.04 usr +  0.06 sys =  8.10 CPU)
> multi_print:  10 wallclock secs (10.56 usr +  0.09 sys = 10.65 CPU)
>
> The interesting thing is that list_print and concat_print are quite bad in
> the benchmark but very good in the handler. The rest holds.

--
___cliff rayman___www.genwax.com___cliff@genwax.com___

Re: [performance/benchmark] printing techniques

Posted by Stas Bekman <sb...@stason.org>.

On Wed, 7 Jun 2000, Jeff Norman wrote:

> 
> 
> Frequently, it's hard to build up an entire output segment without
> code in-between the different additions to the output.  I guess you could
> call this the "append, append, append... output" technique.
> 
> I think it would be an interesting addition to the benchmark:
> 
>    sub gather_print{
>      my $buffer = '';
>      $buffer .= "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">";
>      $buffer .= "<HTML>";     
>      $buffer .= "  <HEAD>";   
>      $buffer .= "    <TITLE>";
>      $buffer .= "      Test page";
>      $buffer .= "    </TITLE>";
>      $buffer .= "  </HEAD>";
>      $buffer .= "  <BODY BGCOLOR=\"black\" TEXT=\"white\">";
>      $buffer .= "    <H1> ";
>      $buffer .= "      Test page ";
>      $buffer .= "    </H1>";
>      $buffer .= "    <A HREF=\"foo.html\">foo</A>";
>      $buffer .= "    <HR>"; 
>      $buffer .= "  </BODY>";
>      $buffer .= "</HTML>";
>      print $fh $buffer;
>    }

Per your request:

The handler:

query         | avtime completed failed    rps 
-----------------------------------------------
single_print  |    110      5000      0    881 
here_print    |    111      5000      0    881 
list_print    |    111      5000      0    880 
concat_print  |    111      5000      0    873 
multi_print   |    119      5000      0    820 
-----------------------------------------------

The benchmark unbuffered:
single_print:  2 wallclock secs ( 2.44 usr +  0.31 sys =  2.75 CPU)
here_print:    4 wallclock secs ( 2.34 usr +  0.54 sys =  2.88 CPU)
list_print:    8 wallclock secs ( 7.06 usr +  0.43 sys =  7.49 CPU)
concat_print:  9 wallclock secs ( 8.95 usr +  0.66 sys =  9.61 CPU)
multi_print:  22 wallclock secs (16.94 usr +  5.74 sys = 22.68 CPU)

The benchmark buffered:
single_print:  1 wallclock secs ( 1.70 usr +  0.02 sys =  1.72 CPU)
here_print:    1 wallclock secs ( 1.78 usr +  0.01 sys =  1.79 CPU)
list_print:    7 wallclock secs ( 6.44 usr +  0.05 sys =  6.49 CPU)
concat_print:  9 wallclock secs ( 8.04 usr +  0.06 sys =  8.10 CPU)
multi_print:  10 wallclock secs (10.56 usr +  0.09 sys = 10.65 CPU)

The interesting thing is that list_print and concat_print are quite bad in
the benchmark but very good in the handler. The rest holds.
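
For anyone who wants to repeat the comparison on their own box, here is
a cut-down sketch of the harness.  It is not the script that produced
the numbers above: the page is shortened, the output goes to an
in-memory filehandle instead of /dev/null, and the names %technique and
run_once are invented, so absolute timings will differ.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

my @lines = ('<HTML>', '  <BODY>', '    <H1>Test page</H1>',
             '  </BODY>', '</HTML>');

# Each technique writes the same bytes; only the number of print calls
# and the amount of string building differ.
my %technique = (
    multi_print => sub { my $fh = shift; print $fh $_ for @lines },
    list_print  => sub { my $fh = shift; print $fh @lines },
    join_print  => sub { my $fh = shift; print $fh join('', @lines) },
);

# Run one technique against a fresh in-memory handle and return what
# it wrote, so the outputs can be compared as well as timed.
sub run_once {
    my ($name) = @_;
    my $out = '';
    open my $fh, '>', \$out or die "cannot open in-memory handle: $!";
    $technique{$name}->($fh);
    close $fh;
    return $out;
}

# A small iteration count keeps the sketch quick; raise it for timings
# that mean something.
timethese(2000, {
    map { my $n = $_; ($n => sub { run_once($n) }) } sort keys %technique
});
```

Checking that every technique produces identical output before timing
it is cheap insurance against benchmarking two different pages.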


> 
> 
> 
> On Wed, 7 Jun 2000, Stas Bekman wrote:
> 
> > Following Tim's comments here is the new benchmark. (I'll address the
> > buffering issue in another post)
> > 
> 
> 



_____________________________________________________________________
Stas Bekman              JAm_pH     --   Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide  http://perl.apache.org/guide 
mailto:stas@stason.org   http://perl.org     http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org

Re: [performance/benchmark] printing techniques

Posted by Jeff Norman <jn...@beckett.x-stream.on.ca>.


Frequently, it's hard to build up an entire output segment without
code in-between the different additions to the output.  I guess you could
call this the "append, append, append... output" technique.

I think it would be an interesting addition to the benchmark:

   sub gather_print{
     my $buffer = '';
     $buffer .= "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">";
     $buffer .= "<HTML>";     
     $buffer .= "  <HEAD>";   
     $buffer .= "    <TITLE>";
     $buffer .= "      Test page";
     $buffer .= "    </TITLE>";
     $buffer .= "  </HEAD>";
     $buffer .= "  <BODY BGCOLOR=\"black\" TEXT=\"white\">";
     $buffer .= "    <H1> ";
     $buffer .= "      Test page ";
     $buffer .= "    </H1>";
     $buffer .= "    <A HREF=\"foo.html\">foo</A>";
     $buffer .= "    <HR>"; 
     $buffer .= "  </BODY>";
     $buffer .= "</HTML>";
     print $fh $buffer;
   }



On Wed, 7 Jun 2000, Stas Bekman wrote:

> Following Tim's comments here is the new benchmark. (I'll address the
> buffering issue in another post)
>

Re: [OT] Re: [performance/benchmark] printing techniques

Posted by Mike Lambert <mi...@home.com>.

> Sometimes it's worse than just ugly.  See the entry in the Perl FAQ:
>
> http://www.perl.com/pub/doc/manual/html/pod/perlfaq4.html#What_s_wrong_with_always_quoting
>
> Not likely that anyone would be using something as a hash key that would
> suffer from being stringified, but possible.  It's definitely a bit slower
> as well, but that's below the noise level.

Actually, when you use a reference as a hash key, it is automatically
stringified anyway.

http://www.perl.com/pub/doc/manual/html/pod/perlfaq4.html#How_can_I_use_a_reference_as_a_h

So that means that $hash{"$key"} and $hash{$key} differ only in the relative
merits of their beauty. :)
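
A small sketch to check this, with made-up names.  The last part goes a
little beyond hash subscripts and shows, as far as I understand the FAQ
entry on always quoting, where the quotes do make a difference: passing
a reference to a function.

```perl
use strict;
use warnings;

my $ref  = [1, 2, 3];
my %hash = ($ref => 'value');

# As a hash subscript, both spellings reach the same entry.
my $same_entry = $hash{$ref} eq $hash{"$ref"};

# The stored key is a plain string such as "ARRAY(0x...)", not a
# reference, so the array cannot be reached through it.
my ($key) = keys %hash;
my $key_is_ref = ref($key) ? 1 : 0;

# Outside a hash subscript the quotes do change things: the callee
# receives a string and can no longer dereference it.
sub count_items {
    my $arg = shift;
    return ref($arg) ? scalar @$arg : undef;
}

print "same entry: ", ($same_entry ? "yes" : "no"), "\n";
print "key is a reference: ", ($key_is_ref ? "yes" : "no"), "\n";
print "unquoted argument: ", count_items($ref), " items\n";
print "quoted argument: ",
      (defined(count_items("$ref")) ? "ok" : "not a reference"), "\n";
```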

Mike Lambert

Re: [OT] Re: [performance/benchmark] printing techniques

Posted by Matt Sergeant <ma...@sergeant.org>.

On Thu, 8 Jun 2000, Perrin Harkins wrote:

> On Thu, 8 Jun 2000, Matt Sergeant wrote:
> 
> > > The one that bugs me is when I see people doing this:
> > > 
> > > $hash{"$key"}
> > > 
> > > instead of this:
> > > 
> > > $hash{$key}
> > 
> > Those two now also result in the same code. ;-)
> > 
> > But the former is just ugly.
> 
> Sometimes it's worse than just ugly.  See the entry in the Perl FAQ:
> http://www.perl.com/pub/doc/manual/html/pod/perlfaq4.html#What_s_wrong_with_always_quoting
> 
> Not likely that anyone would be using something as a hash key that would
> suffer from being stringified, but possible.  It's definitely a bit slower
> as well, but that's below the noise level.

It's not slower in 5.6. "$x and $y" in 5.6 gets turned into $x . ' and '
. $y (in perl bytecode terms).

-- 
<Matt/>

Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org

Re: [OT] Re: [performance/benchmark] printing techniques

Posted by Perrin Harkins <pe...@primenet.com>.

On Thu, 8 Jun 2000, Matt Sergeant wrote:

> > The one that bugs me is when I see people doing this:
> > 
> > $hash{"$key"}
> > 
> > instead of this:
> > 
> > $hash{$key}
> 
> Those two now also result in the same code. ;-)
> 
> But the former is just ugly.

Sometimes it's worse than just ugly.  See the entry in the Perl FAQ:
http://www.perl.com/pub/doc/manual/html/pod/perlfaq4.html#What_s_wrong_with_always_quoting

Not likely that anyone would be using something as a hash key that would
suffer from being stringified, but possible.  It's definitely a bit slower
as well, but that's below the noise level.

- Perrin

Re: [OT] Re: [performance/benchmark] printing techniques

Posted by Matt Sergeant <ma...@sergeant.org>.

On Wed, 7 Jun 2000, Perrin Harkins wrote:

> On Wed, 7 Jun 2000, Matt Sergeant wrote:
> 
> > On Wed, 7 Jun 2000, Eric Cholet wrote:
> > 
> > > This said, i hurry back to s/"constant strings"/'constant strings'/g;
> > 
> > Those two are equal.
> 
> Yes, although it's counter-intuitive there's no real performance hit
> from double-quoting constant strings.
> 
> The one that bugs me is when I see people doing this:
> 
> $hash{"$key"}
> 
> instead of this:
> 
> $hash{$key}

Those two now also result in the same code. ;-)

But the former is just ugly.

-- 
<Matt/>

Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org

[OT] Re: [performance/benchmark] printing techniques

Posted by Perrin Harkins <pe...@primenet.com>.

On Wed, 7 Jun 2000, Matt Sergeant wrote:

> On Wed, 7 Jun 2000, Eric Cholet wrote:
> 
> > This said, i hurry back to s/"constant strings"/'constant strings'/g;
> 
> Those two are equal.

Yes, although it's counter-intuitive there's no real performance hit
from double-quoting constant strings.

The one that bugs me is when I see people doing this:

$hash{"$key"}

instead of this:

$hash{$key}

That one is actually in the perlfaq man page, but I still see it all the
time.  The performance difference is very small but it does exist, and you
can get unintended results from stringifying some things.

- Perrin

Re: [performance/benchmark] printing techniques

Posted by Matt Sergeant <ma...@sergeant.org>.

On Wed, 7 Jun 2000, Eric Cholet wrote:

> This said, i hurry back to s/"constant strings"/'constant strings'/g;

Those two are equal.

-- 
<Matt/>

Re: [performance/benchmark] printing techniques

Posted by Eric Cholet <ch...@logilune.com>.

>From: "Eric Strovink" <st...@acm.org>
> > Of course the slowest stuff should be optimized first...
>
> Right.  Which means the Guide, if it is not already so doing, ought to
> rank-order the optimizations in their order of importance, or better, their
> relative importance.  This one, it appears, should be near the bottom of the
> list.

>From: "Matt Sergeant" <ma...@sergeant.org>
>
> Of course you can optimize forever, but some optimizations aren't going to
> make a whole lot of difference. This is one of those optimizations,
> judging by these benchmarks. Let Stas re-write this benchmark test as a
> handler() and see what kind of difference it makes. I'm willing to
> bet: barely any between averages.
>
> Perhaps I was a little strong: Let's not deprecate this part of the guide,
> just provide some realism in the conclusion.

Agreed, all optimizations should be put in perspective, and the guide
(and book :-) should put forward those that count most.

This said, i hurry back to s/"constant strings"/'constant strings'/g;

--
Eric

Re: [performance/benchmark] printing techniques

Posted by Eric Strovink <st...@acm.org>.

Eric Cholet wrote:

> Of course the slowest stuff should be optimized first...

Right.  Which means the Guide, if it is not already so doing, ought to
rank-order the optimizations in their order of importance, or better, their
relative importance.  This one, it appears, should be near the bottom of the
list.

Re: [performance/benchmark] printing techniques

Posted by Barrie Slaymaker <ba...@slaysys.com>.

[Sorry for the delay: didn't notice this since it was sent only to the list]

Eric Cholet wrote, in part:
> 
> I never advocated optimizing at the expense of the above criteria, we
> were discussing optimizations only. I certainly believe a program is a
> compromise, and have often chosen some of those criteria as being
> more important than performance savings.

Sorry: I took your statement at face value.  I'm well aware that you're 
not that shallow :-).

- Barrie

Re: [performance/benchmark] printing techniques

Posted by Eric Cholet <ch...@logilune.com>.

> > These
> > things add up, so don't you think that whatever can be optimized, should?
>
> Wrong question, IMHO: it's what you optimize for that counts.  Several things
> come to mind that are often more important than performance and often mean not
> optimizing for performance (these are interrelated, of course):
>
>   Stability / reliability
>   Maintainability
>   Development time
>   Memory usage
>   Clarity of design (API, data structures, etc)

I never advocated optimizing at the expense of the above criteria, we were
discussing optimizations only. I certainly believe a program is a compromise,
and have often chosen some of those criteria as being more important than
performance savings.

> There's a related rule of thumb that says don't optimize until you can test it
> to see what the slow parts are.  Humans are pretty bad at predicting where the
> bottlenecks are.

Neither did I say that optimizations should be carried out without first
determining whether they're worth it or not. Run benchmarks, optimize what
the benchmark shows to be slow. The point of the discussion was, is it worth
it to save a few microseconds here when milliseconds are being spent there.
My point was, yes it's worth it, every microsecond counts on a busy site.

> I think of it this way: if your process spends 80% of its time in 20% of your
> code, then you should only be thinking of performance optimizing that 20%, and
> then only if you identify a problem there.  Of course, there are critical
> sections that may need to operate lightning quick, but they're pretty few and
> far between outside of real-time, embedded, or kernel hacking.

I don't see, provided I have the time and the need (ie my server's resources
are strained) why I should not, once I have optimized that 20%, turn to the
other 80% and see what I can do there too.

> - Barrie
>

--
Eric

Re: [performance/benchmark] printing techniques

Posted by Barrie Slaymaker <ba...@slaysys.com>.

Eric Cholet wrote:
> 
> These
> things add up, so don't you think that whatever can be optimized, should ?

Wrong question, IMHO: it's what you optimize for that counts.  Several things
come to mind that are often more important than performance and often mean not
optimizing for performance (these are interrelated, of course):

  Stability / reliability
  Maintainability
  Development time
  Memory usage
  Clarity of design (API, data structures, etc)

There's a related rule of thumb that says don't optimize until you can test it
to see what the slow parts are.  Humans are pretty bad at predicting where the
bottlenecks are.

I think of it this way: if your process spends 80% of its time in 20% of your
code, then you should only be thinking of performance optimizing that 20%, and
then only if you identify a problem there.  Of course, there are critical sections
that may need to operate lightning quick, but they're pretty few and far between
outside of real-time, embedded, or kernel hacking.
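
The "measure before you optimize" rule can be tried with very little code.
What follows is only a sketch using the core Time::HiRes module; time_stages
and the two stages (fetch_data, render_page) are hypothetical stand-ins for
whatever a real request does, with a short sleep playing the part of a slow
database query:

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Run each named stage once and return the seconds each one took.
sub time_stages {
    my (@stages) = @_;                 # list of [ name => coderef ] pairs
    my %elapsed;
    for my $stage (@stages) {
        my ($name, $code) = @$stage;
        my $start = time;
        $code->();
        $elapsed{$name} = time - $start;
    }
    return \%elapsed;
}

# Hypothetical stages: a simulated slow query and some string building.
my $elapsed = time_stages(
    [ fetch_data  => sub { select(undef, undef, undef, 0.05) } ],
    [ render_page => sub { my $out = join '', map { "<LI>$_</LI>\n" } 1 .. 1000 } ],
);

my $total = 0;
$total += $_ for values %$elapsed;

# Slowest stage first, with its share of the total time.
for my $name (sort { $elapsed->{$b} <=> $elapsed->{$a} } keys %$elapsed) {
    printf "%-12s %.4f s (%.0f%%)\n",
        $name, $elapsed->{$name}, 100 * $elapsed->{$name} / $total;
}
```

The stage at the top of the output is the one worth looking at first; the
others can wait until it has been dealt with.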

- Barrie

Re: [performance/benchmark] printing techniques

Posted by Stas Bekman <sb...@stason.org>.

> > I don't understand what you're getting at. Does this mean that something
> > shouldn't be optimized because there's something else in the process that
> > is taking more time? For example I have a database powered site, the slowest
> > part of request processing is fetching data from the database. Should I
> > disregard any optimization not dealing with the database fetches ? These
> > things add up, so don't you think that whatever can be optimized, should ?
> > Of course the slowest stuff should be optimized first, but that doesn't
> > mean that other optimisations are useless.
> 
> Of course you can optimize forever, but some optimizations aren't going to
> make a whole lot of difference. This is one of those optimizations,
> judging by these benchmarks. Let Stas re-write this benchmark test as a
> handler() and see what kind of difference it makes. I'm willing to
> bet: barely any between averages.
> 
> Perhaps I was a little strong: Let's not deprecate this part of the guide,
> just provide some realism in the conclusion.

Here we go: the benchmark holds for all but list_print!

query         | avtime completed failed    rps 
-----------------------------------------------
here_print    |    109      5000      0    894 
single_print  |    110      5000      0    883 
list_print    |    111      5000      0    877 
multi_print   |    118      5000      0    817 
-----------------------------------------------
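
To put the table in proportion, here is a small sketch that turns the rps
column into "percent slower than the fastest technique". The figures are the
ones from the table; slowdown_pct is a helper name made up for this example:

```perl
use strict;
use warnings;

# Requests per second, copied from the table above.
my %rps = (
    here_print   => 894,
    single_print => 883,
    list_print   => 877,
    multi_print  => 817,
);

# How much slower (in percent) one rate is than the best rate.
sub slowdown_pct {
    my ($best, $this) = @_;
    return 100 * ($best - $this) / $best;
}

my ($best) = sort { $b <=> $a } values %rps;

for my $name (sort { $rps{$b} <=> $rps{$a} } keys %rps) {
    printf "%-13s %4d rps  %4.1f%% slower than the best\n",
        $name, $rps{$name}, slowdown_pct($best, $rps{$name});
}
```

By this measure multi_print comes out about 8.6% behind here_print, and the
other two within 2%.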


Here is the module used in benchmarking:

package MyPrint;
use strict;
use Apache::Constants qw(:common);
use Apache::URI ();

# Map the query string of the request to the printing technique to run.
my %callbacks = (
          list_print   => \&list_print,
          multi_print  => \&multi_print,
          single_print => \&single_print,
          here_print   => \&here_print,
                );

sub handler{
  my $r = shift;
  my $uri = Apache::URI->parse($r);
  my $query = $uri->query;

  # Decline before sending any headers if the query names no known technique.
  return DECLINED unless defined $query && $callbacks{$query};

  $r->send_http_header('text/plain');
  $callbacks{$query}->();
  return OK;
}

  sub multi_print{
    print "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">\n";
    print "<HTML>\n";
    print "  <HEAD>\n";
    print "    <TITLE>\n";
    print "      Test page\n";
    print "    </TITLE>\n";
    print "  </HEAD>\n";
    print "  <BODY BGCOLOR=\"black\" TEXT=\"white\">\n";
    print "    <H1> \n";
    print "      Test page \n";
    print "    </H1>\n";
    print "    <A HREF=\"foo.html\">foo</A>\n";
    print "    <HR>\n";
    print "  </BODY>\n";
    print "</HTML>\n";
  }
  
  sub single_print{
    print qq{<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
  <HEAD>
    <TITLE>
      Test page
    </TITLE>
  </HEAD>
  <BODY BGCOLOR="black" TEXT="white">
    <H1> 
      Test page 
    </H1>
    <A HREF="foo.html">foo</A>
    <HR>
  </BODY>
</HTML>
    };
  }
  
  sub here_print{
    print <<__EOT__;
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
  <HEAD>
    <TITLE>
      Test page
    </TITLE>
  </HEAD>
  <BODY BGCOLOR="black" TEXT="white">
    <H1> 
      Test page 
    </H1>
    <A HREF="foo.html">foo</A>
    <HR>
  </BODY>
</HTML>
__EOT__
  }
  
  sub list_print{
    print "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">\n",
              "<HTML>\n",
              "  <HEAD>\n",
              "    <TITLE>\n",
              "      Test page\n",
              "    </TITLE>\n",
              "  </HEAD>\n",
              "  <BODY BGCOLOR=\"black\" TEXT=\"white\">\n",
              "    <H1> \n",
              "      Test page \n",
              "    </H1>\n",
              "    <A HREF=\"foo.html\">foo</A>\n",
              "    <HR>\n",
              "  </BODY>\n",
              "</HTML>\n";
  }

1;


_____________________________________________________________________
Stas Bekman              JAm_pH     --   Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide  http://perl.apache.org/guide 
mailto:stas@stason.org   http://perl.org     http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org

Re: [performance/benchmark] printing techniques

Posted by Matt Sergeant <ma...@sergeant.org>.

On Wed, 7 Jun 2000, Eric Cholet wrote:

> > > So if you want a better performance, you know what technique to use.
> >
> > I think this last line is misleading. The reality is that you're doing
> > 500,000 iterations here. Even for the worst case scenario of multi_print
> > with no buffering you're managing nearly 22,000 outputs a second. Now
> > granted, the output isn't exactly of normal size, but I think what it
> > comes down to is that the way you choose to print is going to make almost
> > zero difference in any real world mod_perl application. The overhead of
> > URL parsing, resource location, and actually running your handler is going
> > to take far more overhead by the looks of things.
> 
> I don't understand what you're getting at. Does this mean that something
> shouldn't be optimized because there's something else in the process that
> is taking more time? For example I have a database powered site, the slowest
> part of request processing is fetching data from the database. Should I
> disregard any optimization not dealing with the database fetches ? These
> things add up, so don't you think that whatever can be optimized, should ?
> Of course the slowest stuff should be optimized first, but that doesn't
> mean that other optimisations are useless.

Of course you can optimize forever, but some optimizations aren't going to
make a whole lot of difference. This is one of those optimizations,
judging by these benchmarks. Let Stas re-write this benchmark test as a
handler() and see what kind of difference it makes. I'm willing to
bet: barely any between averages.

Perhaps I was a little strong: Let's not deprecate this part of the guide,
just provide some realism in the conclusion.

-- 
<Matt/>

Re: [performance/benchmark] printing techniques

Posted by Eric Cholet <ch...@logilune.com>.

> > So if you want a better performance, you know what technique to use.
>
> I think this last line is misleading. The reality is that you're doing
> 500,000 iterations here. Even for the worst case scenario of multi_print
> with no buffering you're managing nearly 22,000 outputs a second. Now
> granted, the output isn't exactly of normal size, but I think what it
> comes down to is that the way you choose to print is going to make almost
> zero difference in any real world mod_perl application. The overhead of
> URL parsing, resource location, and actually running your handler is going
> to take far more overhead by the looks of things.

I don't understand what you're getting at. Does this mean that something
shouldn't be optimized because there's something else in the process that
is taking more time? For example I have a database powered site, the slowest
part of request processing is fetching data from the database. Should I
disregard any optimization not dealing with the database fetches ? These
things add up, so don't you think that whatever can be optimized, should ?
Of course the slowest stuff should be optimized first, but that doesn't
mean that other optimisations are useless.

--
Eric

Re: [performance/benchmark] printing techniques

Posted by Stas Bekman <sb...@stason.org>.

[benchmark code snipped]

> >   single_print:  4 wallclock secs ( 2.28 usr +  0.47 sys =  2.75 CPU)
> >   here_print:    2 wallclock secs ( 2.45 usr +  0.45 sys =  2.90 CPU)
> >   list_print:    7 wallclock secs ( 7.17 usr +  0.45 sys =  7.62 CPU)
> >   multi_print:  23 wallclock secs (17.52 usr +  5.72 sys = 23.24 CPU)
> > 
> > The results are worse by a factor of 1.5 to 2, with only
> > I<'list_print'> changing very little.
> > 
> > So if you want a better performance, you know what technique to use.
> 
> I think this last line is misleading. The reality is that you're doing
> 500,000 iterations here. Even for the worst case scenario of multi_print
> with no buffering you're managing nearly 22,000 outputs a second. Now
> granted, the output isn't exactly of normal size, but I think what it
> comes down to is that the way you choose to print is going to make almost
> zero difference in any real world mod_perl application. The overhead of
> URL parsing, resource location, and actually running your handler is going
> to take far more overhead by the looks of things.
> 
> Perhaps this section should be (re)moved into a posterity section, for it
> seems fairly un-informative to me.

Matt, have you seen all these scripts with hundreds of print statements?
This section is there to open the eyes of programmers who tend to use this
style.

Obviously, if you write normal code the choice doesn't really matter,
unless you do lots of printing.
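
For scripts of the "hundreds of print statements" kind, one possible rewrite
is to collect the output and print it once. This is only a sketch: render_page
and its tiny page are made up for the illustration and are not taken from the
guide.

```perl
use strict;
use warnings;

# Build the whole page as one string instead of printing it piece by piece.
sub render_page {
    my ($title) = @_;
    my @chunks;
    push @chunks, "<HTML>\n";
    push @chunks, "  <HEAD><TITLE>$title</TITLE></HEAD>\n";
    push @chunks, "  <BODY>\n";
    push @chunks, "    <H1>$title</H1>\n";
    push @chunks, "  </BODY>\n";
    push @chunks, "</HTML>\n";
    return join '', @chunks;
}

# One print call for the whole page.
print render_page('Test page');
```

Code that builds its output in many separate places can keep doing so; only
the final print changes.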

But remember that each of the performance sections of the guide could be
deleted following your suggestion. Each section tackles a separate
feature/technique. Only the overall approach matters. My goal is to show
programmers how to squeeze more out of their code; I'm definitely not
talking to people who run guestbook code.

Take for example Ask and Nick from ValueClick.  Let's ask them whether
these techniques matter or not. With 70-80M requests served daily each
saved millisecond counts.  Ask? Nick?

What do you think?
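
A back-of-the-envelope sketch of what "each saved millisecond counts" adds up
to at that volume. The 1 ms saving is a hypothetical figure used only for the
arithmetic, not a measured one, and cpu_hours_per_day is a made-up helper:

```perl
use strict;
use warnings;

# Total processing time saved per day, in hours, summed over all requests.
sub cpu_hours_per_day {
    my ($requests_per_day, $saved_ms) = @_;
    return $requests_per_day * $saved_ms / 1000 / 3600;
}

for my $requests (70e6, 80e6) {
    printf "%d requests/day x 1 ms saved = %.1f hours/day\n",
        $requests, cpu_hours_per_day($requests, 1);
}
```

At 70M requests a day, one millisecond per request comes to roughly 19.4 hours
of processing time a day, spread across however many machines serve the
traffic.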

_____________________________________________________________________
Stas Bekman              JAm_pH     --   Just Another mod_perl Hacker
http://stason.org/       mod_perl Guide  http://perl.apache.org/guide 
mailto:stas@stason.org   http://perl.org     http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org

Re: [performance/benchmark] printing techniques

Posted by Matt Sergeant <ma...@sergeant.org>.

On Wed, 7 Jun 2000, Stas Bekman wrote:

> Following Tim's comments here is the new benchmark. (I'll address the
> buffering issue in another post)
> 
>   use Benchmark;
>   use Symbol;
> 
>   my $fh = gensym;
>   open $fh, ">/dev/null" or die;
>   
>   sub multi_print{
>     print $fh "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">";
>     print $fh "<HTML>";
>     print $fh "  <HEAD>";
>     print $fh "    <TITLE>";
>     print $fh "      Test page";
>     print $fh "    </TITLE>";
>     print $fh "  </HEAD>";
>     print $fh "  <BODY BGCOLOR=\"black\" TEXT=\"white\">";
>     print $fh "    <H1> ";
>     print $fh "      Test page ";
>     print $fh "    </H1>";
>     print $fh "    <A HREF=\"foo.html\">foo</A>";
>     print $fh "    <HR>";
>     print $fh "  </BODY>";
>     print $fh "</HTML>";
>   }
>   
>   sub single_print{
>     print $fh qq{<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
> <HTML>
>   <HEAD>
>     <TITLE>
>       Test page
>     </TITLE>
>   </HEAD>
>   <BODY BGCOLOR="black" TEXT="white">
>     <H1> 
>       Test page 
>     </H1>
>     <A HREF="foo.html">foo</A>
>     <HR>
>   </BODY>
> </HTML>
>     };
>   }
>   
>   sub here_print{
>     print $fh <<__EOT__;
> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
> <HTML>
>   <HEAD>
>     <TITLE>
>       Test page
>     </TITLE>
>   </HEAD>
>   <BODY BGCOLOR="black" TEXT="white">
>     <H1> 
>       Test page 
>     </H1>
>     <A HREF="foo.html">foo</A>
>     <HR>
>   </BODY>
> </HTML>
> __EOT__
>   }
>   
>   sub list_print{
>     print $fh "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML//EN\">",
>               "<HTML>",
>               "  <HEAD>",
>               "    <TITLE>",
>               "      Test page",
>               "    </TITLE>",
>               "  </HEAD>",
>               "  <BODY BGCOLOR=\"black\" TEXT=\"white\">",
>               "    <H1> ",
>               "      Test page ",
>               "    </H1>",
>               "    <A HREF=\"foo.html\">foo</A>",
>               "    <HR>",
>               "  </BODY>",
>               "</HTML>";
>   }
>   
>   timethese
>     (500_000, {
>           list_print   => \&list_print,
>           multi_print  => \&multi_print,
>           single_print => \&single_print,
>           here_print   => \&here_print,
>           });
> 
> And the results are:
> 
>   single_print:  1 wallclock secs ( 1.74 usr +  0.05 sys =  1.79 CPU)
>   here_print:    3 wallclock secs ( 1.79 usr +  0.07 sys =  1.86 CPU)
>   list_print:    7 wallclock secs ( 6.57 usr +  0.01 sys =  6.58 CPU)
>   multi_print:  10 wallclock secs (10.72 usr +  0.03 sys = 10.75 CPU)
> 
> The numbers tell it all: I<'single_print'> is the fastest, I<'here_print'>
> is almost as fast, I<'list_print'> is quite slow and
> I<'multi_print'> is the slowest.
> 
> If we run the same benchmark using the unbuffered prints by changing
> the beginning of the code to:
> 
>   use Symbol;
>   my $fh = gensym;
>   open $fh, ">/dev/null" or die;
>   
>      # make all the calls unbuffered
>   my $oldfh = select($fh);
>   $| = 1;
>   select($oldfh);
> 
> And the results are:
> 
>   single_print:  4 wallclock secs ( 2.28 usr +  0.47 sys =  2.75 CPU)
>   here_print:    2 wallclock secs ( 2.45 usr +  0.45 sys =  2.90 CPU)
>   list_print:    7 wallclock secs ( 7.17 usr +  0.45 sys =  7.62 CPU)
>   multi_print:  23 wallclock secs (17.52 usr +  5.72 sys = 23.24 CPU)
> 
> The results are worse by a factor of 1.5 to 2, with only
> I<'list_print'> changing very little.
> 
> So if you want a better performance, you know what technique to use.

I think this last line is misleading. The reality is that you're doing
500,000 iterations here. Even in the worst-case scenario of multi_print
with no buffering you're managing nearly 22,000 outputs a second. Now
granted, the output isn't exactly of normal size, but I think what it
comes down to is that the way you choose to print is going to make almost
zero difference in any real-world mod_perl application. The overhead of
URL parsing, resource location, and actually running your handler is going
to be far greater, by the looks of things.
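
The arithmetic behind that figure, as a small sketch. The CPU times are the
ones quoted above for 500,000 iterations; the helper names calls_per_sec and
usec_per_call are invented for the example:

```perl
use strict;
use warnings;

my $iterations = 500_000;

# CPU seconds reported by the quoted benchmark runs.
my %cpu_secs = (
    'single_print (buffered)'  => 1.79,
    'multi_print (unbuffered)' => 23.24,
);

# How many calls fit into one second of CPU time.
sub calls_per_sec {
    my ($count, $secs) = @_;
    return $count / $secs;
}

# Cost of a single call in microseconds.
sub usec_per_call {
    my ($count, $secs) = @_;
    return 1e6 * $secs / $count;
}

for my $name (sort keys %cpu_secs) {
    printf "%-25s %8.0f calls/s  %5.1f usec per call\n",
        $name,
        calls_per_sec($iterations, $cpu_secs{$name}),
        usec_per_call($iterations, $cpu_secs{$name});
}
```

So the slowest case works out to about 21,500 calls a second, or roughly 46
microseconds per page, against about 3.6 microseconds for the fastest.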

Perhaps this section should be (re)moved into a posterity section, for it
seems fairly un-informative to me.

-- 
<Matt/>
