Posted to dev@velocity.apache.org by Jose Alberto Fernandez <JF...@viquity.com> on 2000/12/16 04:39:39 UTC

[PATCH] Local contexts and forward macros

Hi,

This patch tries to solve several things:

- Eliminates the multiple parsing of Macros by passing macro parameters by
value.

- It implements local contexts (actually it has two implementations, more
later)

- It solves the problem of forward references for macros, or macros used
before they are defined in the template. This is achieved by a two pass lazy
compilation of macros.

- It is faster than before (in one of the implementations of local
contexts):

I tried implementing the recursive contexts as suggested by several of you
(including me), and the runtime hit blew me away. My large, macro-heavy
template went from 9 secs to 13 secs. For the last two days I have been
trying to squeeze the overhead out, to no avail.

So I tried a different approach: when a macro is called, the old values of
any context variables with the same names as the arguments are copied to an
array on the stack, and at the end of the execution the old values are put
back.

And now the test is faster, only 6.5 secs.
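
For the curious, here is a minimal sketch of that save/restore idea (a
plain Map stands in for Velocity's Context, and all names are
illustrative, not the actual patch code):

import java.util.Map;

class LocalArgsSketch {
  // Bind macro args by value, saving any shadowed context entries and
  // restoring them after the body has rendered.
  static void withLocalArgs(Map context, String[] argNames,
                            Object[] argValues, Runnable renderBody) {
    Object[] saved = new Object[argNames.length];
    for (int i = 0; i < argNames.length; i++) {
      saved[i] = context.get(argNames[i]);     // null if unset
      context.put(argNames[i], argValues[i]);  // bind the argument
    }
    try {
      renderBody.run();  // render the macro body with args bound
    } finally {
      for (int i = 0; i < argNames.length; i++) {
        // note: if the key was never set, this puts back a null entry
        // rather than removing the key outright
        context.put(argNames[i], saved[i]);
      }
    }
  }
}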

I have left the code for both approaches in, for someone to figure out why
the local context is so much more expensive. They are controlled by the
constant VelocimacroProxy.USE_MACRO_CONTEXT.

Here is the patch, as well as the modified files, for your study. I had to
change a couple of method signatures in Context for everything to hold
together, but no big deal.

Jose Alberto


Re: [PATCH] Local contexts and forward macros [LONG BUT IMPORTANT]

Posted by "Geir Magnusson Jr." <ge...@optonline.net>.
Note that the following is long : it was my 'notepad' for a long, long,
long day looking at this stuff.  I started with Jose's patches, and as
you can see, I was motivated to do some analytic performance
investigation. I wanted to resolve once and for all the VM performance
questions/mythology as well as review Jose's patch in real detail. 
There is some good information herein, I think, so it's worth a read
when you have the time.

I didn't take this lightly : I spent much time on this (too much, my
wife is currently observing) trying to avoid the outcome, which is that
except for the ASTReference change, no others were accepted, as they
broke existing features.  I carefully detailed it all below.  I am very
interested in seeing this solution work, though, and will continue to
work on it (and my own parallel attempts...)

I worked with three versions of Velocity, that I will refer to as :

71cvs : the 'current' CVS tree as of 12/15, which I use as a benchmark.
71new : my attempt to approach the performance increase of Jose's patch
'from below', adding things step by step.
patch : this patch applied to 71cvs 

Also, I was given a test program that I use to benchmark VMs,
introspection, and performance.  It has 10KB of VMs (65 of them), and
2MB of XML input to process into 10MB of output.  So it's a killer - it
heavily tests introspection performance / caching, VM parsing and
processing, reference access, etc. 

I spent a fair bit of time making sure that the numbers represent the
performance of Velocity rather than the limitations of my box, or the
overhead of getting things going.  In each case, there was 1 'throwaway'
merge done on the template before starting timing, as there appears to
be quite a bit of setup overhead (such as introspection caching et
al).  In each of the three versions, this overhead was comparable, but
would skew the iteration averages.

To summarize :

0) There is a huge problem with arg passing-by-value for VMs.  I will
explain later.  This is something I too have been working on to solve
some recursion issues using an approach suggested in this patch, and my
work today with Jose's patch as well as my own version shows me
conclusively that unless we wish to limit the functionality of VMs we
need to continue to search for another solution.  (Of course, the
performance numbers beg the question "solution to what?", but more on
that later.)

1) the majority of the performance improvement came simply from
removing the call to containsKey() before get() in
ASTReference.getVariableValue(), so that goes in.  Good one, Jose!  Bad
one, us! :)
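
To make the change concrete, the pattern is roughly this (a sketch with
a Map standing in for the context, not the literal diff) :

import java.util.Map;

class LookupSketch {
  // before : two context lookups per reference
  static Object before(Map context, String name) {
    if (context.containsKey(name)) {
      return context.get(name);
    }
    return null;
  }

  // after : one lookup - get() already returns null for a missing key
  static Object after(Map context, String name) {
    return context.get(name);
  }
}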

More detail inline.

Jose Alberto Fernandez wrote:
> 
> Hi,
> 
> This patch tries to solve several things:
> 
> - Eliminates the multiple parsing of Macros by passing macro parameters by
> value.

<shaking head> :)  Ok.  Consider that strawman dead.

In the interest of honesty, I will point out that the above is
conditionally true : the VM was parsed once per use per template, just
like everything else in the template.  After that, it was cached like
everything else in the template.  So the whole issue of repeated parsing
and patching in a given template is nonexistent if caching was turned
on, and rather minimal if not.

Now, I was interested in proving my assertion, so I changed ASTReference
to not search with containsKey(), but simply return the output of get()
(as suggested by Jose's patch), and this helped quite a bit. 
Performance of the test program, 5 iterations of the killer processing
(the actual numbers reveal the cruddiness of my box compared to Jose's,
but that doesn't matter) :

per iteration, no template caching (template parsed each time) :
71cvs : ave 13.988 sec
71new : ave 13.545 sec
patch : ave 13.529 sec

with template caching :
71new : ave 13.072 sec
patch : ave 12.939 sec

So there is a big improvement from the ASTReference fix.  The full patch
has a slight advantage, but it's very small.

Again, I am testing the patch versus the current CVS tree with a small
change to ASTReference - 71new is still using the 'dreaded'
patch-and-parse approach to VMs.

The conclusion I draw, to spell it out, is that there isn't a
performance problem with the current VM implementation (although I
agree it's a bit yecchy).  But it's necessary to support
pass-by-reference.
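
For those who haven't looked at the VM internals, 'patch-and-parse'
means roughly the following (a sketch with hypothetical names, not the
actual VelocimacroProxy code) : the caller's literal argument text is
substituted into the macro body before parsing, which is exactly what
keeps something like $provider.another() a live reference.

class PatchSketch {
  // Substitute the caller's argument text (e.g. "$provider.another()")
  // for the formal arg names (e.g. "$i") in the macro body; the patched
  // body is then parsed once per use per template and cached.
  static String patchBody(String body, String[] formals, String[] actuals) {
    for (int i = 0; i < formals.length; i++) {
      body = body.replace(formals[i], actuals[i]);
    }
    return body;
  }
}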

> - It implements local contexts (actually it has two implementations, more
> later)
 
One approach implements a local context.  The other is different - I like
it a lot, but it turns out to break 2 (3?) valuable features of VMs.
I will detail them later in this post.

> - It solves the problem of forward references for macros, or macros used
> before they are defined in the template. This is achieved by a two pass lazy
> compilation of macros.

No, actually, it doesn't.  Try the following :

#woogie("foo")

#macro( woogie $i)
  Hello from woogie! >$i<
#end

I have had a fix in mind for a while, and will hopefully get it done
this weekend.  If you have a fix, please let me know so I don't waste
any [more] time.  But I will hint that a totally different
approach must be taken.

> - It is faster than before (in one of the implementations of local
> contexts):

Yes indeed, but I suspect mostly because of the ASTReference change.  After
that, I see the performance difference between 71new and patch to be
about 1%.  It's consistent, so I don't believe it's noise, but it's not
clear where it comes from.
 
> I tried implementing the recursive contexts as suggested by several of you
> (including me), and the runtime hit blew me away. My large, macro-heavy
> template went from 9 secs to 13 secs. For the last two days I have been
> trying to squeeze the overhead out, to no avail.

I tested it, and the timings were consistent with not supporting the
nodal introspection caching.  I guess you might have saved 2 days if we
had corresponded and collaborated a little on this project, since I was
able to guess the problem in about 0.1 sec (before morning coffee
even...) after I saw the timing ratios :) but each to his own, I guess.

Once I threw that in, the performance returned to a touch slower than
not using the MacroContext.

> So I tried a different approach: when a macro is called, the old values of
> any context variables with the same names as the arguments are copied to an
> array on the stack, and at the end of the execution the old values are put
> back.

This was the solution I was tinkering with, but hadn't the time to knock
the bugs out of it.  We have to stop working this way. :( 

There is a big problem here, though.  It's effectively arg passing by
value, and we want to be able to pass by reference (I will have examples
below).  Further, since Velocity references (the $<name> stuff) are a
little strange, you can't get their Node.value() when they are not
backed by something in the context.  But they are still an entity usable
as an arg to a VM, because the parser treats them as such.  It just
means that with pass-by-ref, they will render as their literal rather
than a value; when passed by value, the arg disappears.  I think this is
a valuable thing.  I will give an example below.

> And now the test is faster, only 6.5 secs.

I don't understand why this is so, nor can I reproduce it. And I tried
very, very hard.

> 
> I have left the code for both approaches in, for someone to figure out why
> the local context is so much more expensive. They are controlled by the
> constant VelocimacroProxy.USE_MACRO_CONTEXT.
> 
> Here is the patch, as well as the modified files, for your study. I had to
> change a couple of method signatures in Context for everything to hold
> together, but no big deal.

That actually is a big deal of sorts, as you can't just drop the new
jar in against existing compiled code and have things work, I think
because of the change in signature to Context.put().  However, I think
we should do it anyway for the future, as it's useful.  We piss off
fewer people now.  Comments please.

Anyway, regarding the issue with losing pass-by-reference :

Imagine you have an object with a method that 'does something', rather
than just returning data.  Suppose it increments an internal counter and
returns the value :

class Foo {
  private int i = 0;
  public int another() { return ++i; }
}

and you stuff it into your context : context.put( "provider", new Foo() )

and a template :

#macro( foo $i $count )
  $i
  #set($count = $count - 1 )
  #if( $count > 0)
    #foo( $i $count )
  #end
#end

by reference :
#set($count = 5)
#set( $blargh = "a")
#foo( $provider.another()  $count)

by value : 
#set( $count = 5 )
#set( $val = $provider.another() )
#foo( $val $count )

With the current (correct) way VM args are handled, the output is :

by reference : 
  1
          2
          3
          4
          5
            
by value:
  6
          6
          6
          6
          6


In the first set, you see that the method gets called for each recursion
of the VM, as we are passing by reference.  In the second set, we were
able to pass by value by putting the val into the context, and passing
that by reference.  So we have the freedom to choose the way we need.

Now, if you switch only to pass by value (either the 'stack' or
MacroContext way ) :

by reference : 
  1
          1
          1
          1
          1
            
again by value:
  2
          2
          2
          2
          2

So you can see, we lose something valuable, with little gain
performance-wise.

The second VM feature that breaks is that you can pass a reference
through to the VM even if it doesn't have a value. As a matter of fact,
Velocity with this patch won't pass the testbed because of this.
(Passing the testbed is a good thing for patches :)  Ex :

#macro (woogie $i)
 >$i<
#end

#woogie( $not-a-reference )

Under the current VM implementation, the output would (should) be :

>$not-a-reference<

where the pass-by-value approach results in  :

>$i<

This is valuable, I think, in that in debugging, you can see where and
what is wrong right in the output, rather than divining from the log,
because the output will show you what reference isn't right.

I tried a few things to get around this, but to no avail. 

There is one further problem with pass by value - I think any solution
that solves the above issue must make sure it doesn't break the
following behavior :

#macro( iffy $i )
  #if($i)
    valid!
  #else
    invalid!
  #end
#end

#set($valid = "valid")
#iffy( $valid )
#iffy( $invalid )

If the solution is one that places something into the context such that
it will properly render an invalid reference, we will lose the
functionality of #if() - no longer is that reference invalid (there is
something in the context, after all), so #if returns true, when actually
it should return false.
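
A tiny sketch of the conflict, with a Map standing in for the context
and a deliberately simplified #if() truth test (names illustrative) :

import java.util.HashMap;
import java.util.Map;

class IfSketch {
  public static void main(String[] args) {
    Map context = new HashMap();

    // stuff in a placeholder so that $invalid renders as its literal...
    context.put("invalid", "$invalid");

    // ...and the (simplified) #if() test now sees a non-null value
    boolean iffy = (context.get("invalid") != null);

    System.out.println(iffy);  // prints true - but $invalid was never set
  }
}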

Anyhow, I think that the observation about ASTReference was right on,
and we got some good performance improvement from it.  But the other
stuff simply breaks existing features (the fact that it fails the
testbed should have hinted at that), and I wouldn't want to do that.  I
personally like the pass-by-reference feature, because I can have both
pass-by-ref and pass-by-val.  With only pass-by-val, users lose
something.  Now, we could abandon those features, but I am not sure what
we are getting in return.  Certainly not a tremendous performance boost,
as far as I can tell. It does help with one specific recursion bug, but
there is an easy workaround for that, and I hope to have an internal fix
for that later.

I am hoping that I got something really wrong from the beginning, like I
didn't patch correctly or something, so please, let me know if someone
finds something different.  I wouldn't mind having wasted a day if this
stuff gets solved.

geir




-- 
Geir Magnusson Jr.                               geirm@optonline.com
Velocity : it's not just a good idea. It should be the law.
http://jakarta.apache.org/velocity

Re: [PATCH] Local contexts and forward macros

Posted by "Geir Magnusson Jr." <ge...@optonline.net>.
Jose Alberto Fernandez wrote:
> 
> Hi,
> 
> This patch tries to solve several things:

Cool!  If no one minds, I'll go over this and put them in.

> - Eliminates the multiple parsing of Macros by passing macro parameters by
> value.
> 
> - It implements local contexts (actually it has two implementations, more
> later)
> 
> - It solves the problem of forward references for macros, or macros used
> before they are defined in the template. This is achieved by a two pass lazy
> compilation of macros.
> 
> - It is faster than before (in one of the implementations of local
> contexts):
> 
> I tried implementing the recursive contexts as suggested by several of you
> (including me), and the runtime hit blew me away. My large, macro-heavy
> template went from 9 secs to 13 secs. For the last two days I have been
> trying to squeeze the overhead out, to no avail.
> 
> So I tried a different approach: when a macro is called, the old values of
> any context variables with the same names as the arguments are copied to an
> array on the stack, and at the end of the execution the old values are put
> back.
> 
> And now the test is faster, only 6.5 secs.
> 
> I have left the code for both approaches in, for someone to figure out why
> the local context is so much more expensive. They are controlled by the
> constant VelocimacroProxy.USE_MACRO_CONTEXT.
> 
> Here is the patch, as well as the modified files, for your study. I had to
> change a couple of method signatures in Context for everything to hold
> together, but no big deal.
> 
> Jose Alberto
> 
>   ------------------------------------------------------------------------
>   Attachments :
>     macro_patch.txt          : Plain Text (text/plain), quoted-printable
>     Context.java             : application/octet-stream, quoted-printable
>     Runtime.java             : application/octet-stream, quoted-printable
>     VelocimacroFactory.java  : application/octet-stream, quoted-printable
>     VelocimacroManager.java  : application/octet-stream, quoted-printable
>     VelocimacroProxy.java    : application/octet-stream, quoted-printable
>     ASTReference.java        : application/octet-stream, quoted-printable

-- 
Geir Magnusson Jr.                               geirm@optonline.com
Velocity : it's not just a good idea. It should be the law.
http://jakarta.apache.org/velocity