Posted to user@velocity.apache.org by "Geir Magnusson Jr." <ge...@optonline.net> on 2000/11/06 07:00:30 UTC

Parser Update

I have been promising to give an update on what is going on with the
parser, so here it is.  Forgive any goofy spelling and grammar - I am
beat - and if there is anything useful here, I am sure Minister of
Documentation John Castura will make it spiffy.

The two biggest recent changes, besides bug fixes, are motivated by the
idea that Velocity is a general purpose template engine (not just for
outputting HTML), and therefore we have to be very careful that the
output is both predictable and doesn't damage the non-VTL components of
the input templates (called hereafter, 'schmoo'). 

Escape Handling 
----------------
The escape handling rules are very simple now.

1) When a non-null reference (a reference that refers to actual data in
the Context) is preceded by a '\', it will be rendered as $<reference>.

2) When a VTL directive ( #if, #else, #elseif, #end, #set, #foreach ...)
is preceded by a '\', it will be rendered as #<directive>.

3) In any other case, the '\' has no effect, and is simply output as-is.

This means:

#set $foo = "woogie"
\$foo => $foo
\$bar => $bar

would output as :

$foo =>  woogie
\$bar =>  $bar

because $bar is not a reference (assuming in this case that there is no
data element 'bar' in the Context).

Further

\#foreach( 
\#whumpus

would render as 

#foreach(
\#whumpus

because foreach() is a valid directive, and currently, whumpus isn't.

I hope you get the idea.  See test/templates/test.vm and escape.vm for
more examples.
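
A minimal sketch of the three rules above, written in Python rather than
taken from the real parser. The names render, DIRECTIVES and TOKEN are
made up for illustration, and the sketch only resolves plain $name
references; it does not execute directives.

```python
import re

# Assumed directive list, copied from rule 2 above (illustrative only).
DIRECTIVES = {"if", "else", "elseif", "end", "set", "foreach"}

# An optional backslash, then '$' or '#', then an identifier.
TOKEN = re.compile(r"(\\?)([$#])([A-Za-z_][A-Za-z0-9_]*)")


def render(template, context):
    """Apply the escape rules to a template string (hypothetical helper)."""

    def repl(m):
        backslash, sigil, name = m.groups()
        if sigil == "$":
            known = context.get(name) is not None  # non-null reference
        else:
            known = name in DIRECTIVES             # valid directive
        if backslash:
            # Rules 1 and 2: drop the '\' in front of a known token.
            # Rule 3: otherwise the '\' is output as-is.
            return sigil + name if known else m.group(0)
        if sigil == "$" and known:
            return str(context[name])              # ordinary reference
        return m.group(0)                          # plain schmoo

    return TOKEN.sub(repl, template)


ctx = {"foo": "woogie"}
print(render(r"\$foo => $foo", ctx))
print(render(r"\$bar => $bar", ctx))
print(render(r"\#foreach(", ctx))
print(render(r"\#whumpus", ctx))
```

Under these assumptions the four lines print the same results as the
examples above: the '\' disappears only in front of $foo and #foreach.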


'What You Expect'
-----------------
Don't know what else to call this. Maybe 'What You Want'.  The basic
idea is that the VTL (Velocity Template Language) control structures
such as #if, #end, etc. should not render anything into the output
stream. This allows precise output control for those generating output
where whitespace and newlines matter.  For most of us, our output is
HTML and Java, both being rather tolerant of spurious whitespace, so it
doesn't matter.  But if you want to use Vel for something else, like
automated text output, it does matter.

Below, when I say 'inline' I mean 'in the schmoo stream' -> you don't
have to start on a new line.

The basic rules :

1) #set will eat all preceding whitespace and any following whitespace,
including the newline, which is mandatory. It should not (cannot?) be
used inline. There will be another #set statement (#inlineset() ?) for
inline use offered soon for any masochists who insist on using #set
inline.

2) All other directives leave preceding whitespace alone to respect
'What You Expect' and eat any trailing whitespace if they are followed
by a newline.  If the whitespace is not followed immediately by a
newline, the whitespace is rendered.

I think that's it.

So, let me show with some examples. 

input to show how the directives render to nothing:
---
#if(false)
$foo
#end
---

output:
---
---

input to demonstrate inlining a #foreach() :
---
#set $colors =["red", "blue", "tangerine"]
>#foreach($color in $colors)$color #end<
---

output:
---
>red blue tangerine <
---

input - remove the '<' at the end and see what happens when the #end
eats the newline :
---
#set $colors =["red", "blue", "tangerine"]
>#foreach($color in $colors)$color #end
---

output :
---
>red blue tangerine ---

See?  since there was no newline after $color in the block, the --- was
sucked right up to the space trailing the last color.  This is as
expected because the #end ate the newline.  'Ate' is official parser
jargon. :)
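
Rule 2 can be sketched in a few lines of Python. This is only an
illustration under the assumption that the directive itself has already
been consumed; eat_after_directive is a hypothetical name, not a
function of the parser.

```python
import re

# Optional spaces/tabs followed immediately by a newline.
TRAILING = re.compile(r"[ \t]*\r?\n")


def eat_after_directive(rest):
    """Return the text following a directive after applying rule 2.

    If only whitespace separates the directive from the newline, the
    whitespace and the newline are eaten; otherwise nothing is touched.
    """
    m = TRAILING.match(rest)
    return rest[m.end():] if m else rest


# '#end' followed directly by a newline: the newline is eaten,
# so the next line is pulled up against the loop output.
print(">red blue tangerine " + eat_after_directive("\n---\n"), end="")

# '#end' followed by '<': nothing is eaten.
print(">red blue tangerine " + eat_after_directive("<\n---\n"), end="")
```

The first call mirrors the example where the --- ends up on the same
line as the colors; the second mirrors the one with the closing '<'.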

How about an inline #if() to make nice text output:
----
#set $foo = "bar"
\$foo is a#if($foo) valid#else not defined#end VTL reference.
\$bar is a#if($bar) valid#else not defined#end VTL reference.
---

output:
----
$foo is a valid VTL reference.
\$bar is a not defined VTL reference.
---

The point is, you can now tuck the VTL control structures pretty much
wherever you want, and the output should be What You Expect.  For more
examples, see test/templates/pedantic.vm

Hope this clears things up a little, and hope the fixes are what people
want.  Comments and suggestions not only welcome, but expected.

geir


-- 
Geir Magnusson Jr.                               geirm@optonline.com
Dakota tribal wisdom: "when you discover you are riding a dead horse,
the best strategy is to dismount."

Re: Parser Update

Posted by "Geir Magnusson Jr." <ge...@optonline.net>.
John Castura wrote:

> I was thinking about the discussion re: escape characters and wondered whether
> some redundancy could be useful for designers over the long run. If $foo is defined
> ("Gibreel") and $bar is undefined, we could have:
> 
> $foo -> Gibreel
> $\foo  -> $foo
> $\\foo -> $\foo
> 
> $bar -> $bar
> $\bar -> $bar (rather than $\bar)
> $\\bar -> $\bar (rather than $\\bar)

This is perfect.  Exactly what I was thinking, but forgot about the
$\bar output case, and $\\bar does it.

It means that we have a nice parallel to $!bar, and the conventional
notion of escape disappears.

I was off down the path of \$bar -> $bar,  \\$bar-> \<bar>,  \\\$bar ->
\$bar ,etc and actually have it working for references and pluggable
directives.  When I figured out how the defined directives (#set, #if)
would work, I went to bed....

> (Of course, this idea would be the same regardless of the escape
> characters that are eventually decided upon.)

I think that \ is a great 'escape' character, since that is
conventional.
 
> Anyways, I may not be thinking about this correctly, but I thought I'd throw
> in my 2 cents...

You are thinking (at least to me) 100% correctly, although it may be us
two vs. quite a crowd :0
 (We can take 'em)

Let's hear it, folks.  I want to put this to bed.

geir
-- 
Geir Magnusson Jr.                               geirm@optonline.com

Yes Mr. Bush, Social Security *is* a federal program.

Re: Parser Update

Posted by John Castura <jc...@kw.igs.net>.
On Mon, 06 Nov 2000, Geir Magnusson Jr. wrote:
> I have been promising to give an update on what is going on with the
> parser, so here it is.  Forgive any goofy spelling and grammar - I am
> beat - and if there is anything useful here, I am sure Minister of
> Documentation John Castura will make it spiffy.

:~)

This writeup is fantastic, Geir. I'll keep an eye on this thread to see what
decisions are made.

I was thinking about the discussion re: escape characters and wondered whether
some redundancy could be useful for designers over the long run. If $foo is defined 
("Gibreel") and $bar is undefined, we could have:  

$foo -> Gibreel  
$\foo  -> $foo
$\\foo -> $\foo

$bar -> $bar 
$\bar -> $bar (rather than $\bar)
$\\bar -> $\bar (rather than $\\bar)

Then, if $bar becomes a valid reference ("Saladin") in the future, no changes
would be required to the templates.

$bar -> Saladin 
$\bar -> $bar 
$\\bar ->  $\bar 

(Of course, this idea would be the same regardless of the escape
characters that are eventually decided upon.)
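
A small sketch of the scheme proposed above, in Python. It is an
illustration only: render_proposed and PATTERN are hypothetical names,
and the sketch assumes that a '\' placed after the '$' always removes
one level of escaping, whether or not the reference is defined.

```python
import re

# '$', then any number of backslashes, then an identifier.
PATTERN = re.compile(r"\$(\\*)([A-Za-z_][A-Za-z0-9_]*)")


def render_proposed(template, context):
    """Render a template under the proposed $\\ref escape (sketch)."""

    def repl(m):
        slashes, name = m.groups()
        if slashes:
            # Escaped: drop exactly one backslash, never look at the context.
            return "$" + slashes[1:] + name
        # Unescaped: substitute if defined, otherwise leave as schmoo.
        return str(context[name]) if name in context else m.group(0)

    return PATTERN.sub(repl, template)


for ctx in ({"foo": "Gibreel"}, {"foo": "Gibreel", "bar": "Saladin"}):
    for src in ("$bar", r"$\bar", r"$\\bar"):
        print(src, "->", render_proposed(src, ctx))
```

The escaped forms render the same in both passes of the loop, which is
the point of the proposal: defining $bar later changes only the
unescaped reference.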

Anyways, I may not be thinking about this correctly, but I thought I'd throw
in my 2 cents...

Cheers,
John

> The two biggest recent changes, besides bug fixes, are motivated by the
> idea that Velocity is a general purpose template engine (not just for
> outputting HTML), and therefore we have to be very careful that the
> output is both predictable and doesn't damage the non-VTL components of
> the input templates (called hereafter, 'schmoo'). 
> 
> Escape Handling 
> ----------------
> The escape handling rules are very simple now.
> 
> 1) When a non-null reference (a reference that refers to actual data in
> the Context) is preceded by a '\', it will be rendered as $<reference>.
> 
> 2) When a VTL directive ( #if, #else, #elseif, #end, #set, #foreach ...)
> is preceded by a '\', it will be rendered as #<directive>.
> 
> 3) In any other case, the '\' has no effect, and is simply output as-is.
> 
> This means:
> 
> #set $foo = "woogie"
> \$foo => $foo
> \$bar => $bar
> 
> would output as :
> 
> $foo =>  woogie
> \$bar =>  $bar
> 
> because $bar is not a reference (assuming in this case that there is no
> data element 'bar' in the Context).
> 
> Further
> 
> \#foreach( 
> \#whumpus
> 
> would render as 
> 
> #foreach(
> \#whumpus
> 
> because foreach() is a valid directive, and currently, whumpus isn't.
> 
> I hope you get the idea.  See test/templates/test.vm and escape.vm for
> more examples.
> 
> 
> 'What You Expect'
> -----------------
> Don't know what else to call this. Maybe 'What You Want'.  The basic
> idea is that the VTL (Velocity Template Language) control structures
> such as #if, #end, etc. should not render anything into the output
> stream. This allows precise output control for those generating output
> where whitespace and newlines matter.  For most of us, our output is
> HTML and Java, both being rather tolerant of spurious whitespace, so it
> doesn't matter.  But if you want to use Vel for something else, like
> automated text output, it does matter.
> 
> Below, when I say 'inline' I mean 'in the schmoo stream' -> you don't
> have to start on a new line.
> 
> The basic rules :
> 
> 1) #set will eat all preceding whitespace and any following whitespace,
> including the newline, which is mandatory. It should not (cannot?) be
> used inline. There will be another #set statement (#inlineset() ?) for
> inline use offered soon for any masochists who insist on using #set
> inline.
> 
> 2) All other directives leave preceding whitespace alone to respect
> 'What You Expect' and eat any trailing whitespace if they are followed
> by a newline.  If the whitespace is not followed immediately by a
> newline, the whitespace is rendered.
> 
> I think that's it.
> 
> So, let me show with some examples. 
> 
> input to show how the directives render to nothing:
> ---
> #if(false)
> $foo
> #end
> ---
> 
> output:
> ---
> ---
> 
> input to demonstrate inlining a #foreach() :
> ---
> #set $colors =["red", "blue", "tangerine"]
> >#foreach($color in $colors)$color #end<
> ---
> 
> output:
> ---
> >red blue tangerine <
> ---
> 
> input - remove the '<' at the end and see what happens when the #end
> eats the newline :
> ---
> #set $colors =["red", "blue", "tangerine"]
> >#foreach($color in $colors)$color #end
> ---
> 
> output :
> ---
> >red blue tangerine ---
> 
> See?  since there was no newline after $color in the block, the --- was
> sucked right up to the space trailing the last color.  This is as
> expected because the #end ate the newline.  'Ate' is official parser
> jargon. :)
> 
> How about an inline #if() to make nice text output:
> ----
> #set $foo = "bar"
> \$foo is a#if($foo) valid#else not defined#end VTL reference.
> \$bar is a#if($bar) valid#else not defined#end VTL reference.
> ---
> 
> output:
> ----
> $foo is a valid VTL reference.
> \$bar is a not defined VTL reference.
> ---
> 
> The point is, you can tuck the VTL control structures now pretty much
> wherever you want, the output should be What You Expect.  For more
> examples, see test/templates/pedantic.vm  
> 
> Hope this clears things up a little, and hope the fixes are what people
> want.  Comments and suggestions not only welcome, but expected.
> 
> geir
> 
> 
> -- 
> Geir Magnusson Jr.                               geirm@optonline.com
> Dakota tribal wisdom: "when you discover you are riding a dead horse,
> the best strategy is to dismount."


[PROPOSAL] Parser Update

Posted by Christoph Reck <Ch...@dlr.de>.
I've been thinking about the problem of using the backslash as the
escape character. I believe the approach taken heads in the wrong
direction and thus causes side problems (see Gunnar's message
below).

Looking into the parser source, I see that the escapes are handled in a
rather hardwired way. (Are macros accessed via $macroName or #macroName?
The documentation states the latter.)

The simpler solution is to use special context identifiers
to represent the $ and # symbols. To emit the text '$foo' and
'#if' in schmoo, use '$$foo' and '$#if' in the template,
thus avoiding embedding another syntax (an escape character)
and keeping in line with the special VTL symbols!

The generalization of this would be using ${(#|$)any text and characters}
to be passed literally.

Any votes in favour of this change?

Any objections? Well, I've got one: if the input text has a
string '$$$' in it, Velocity would output schmoo with '$$'.
To emit it correctly it would then need to be '${$$$}'. Anyway,
the current implementation already requires rewriting the input
to the template.
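
A rough Python sketch of the proposed literal markers, for illustration
only. The names unescape and LITERAL are hypothetical, and the sketch
covers just the literal forms ('$$', '$#' and '${...}' whose content
starts with '$' or '#'); it does not resolve ordinary references.

```python
import re

# Either ${...} whose content starts with '$' or '#', or a '$'
# immediately followed by '$' or '#'.
LITERAL = re.compile(r"\$\{([#$][^}]*)\}|\$([$#])")


def unescape(template):
    """Replace the proposed literal markers with the text they stand for."""

    def repl(m):
        # Group 1 is the braced form, group 2 the two-character form.
        return m.group(1) if m.group(1) is not None else m.group(2)

    return LITERAL.sub(repl, template)


for src in ("$$foo", "$#if", "$$$", "${$$$}"):
    print(src, "->", unescape(src))
```

Scanning left to right, '$$$' loses one '$' while '${$$$}' comes out as
'$$$', which is the objection raised above.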

:) Christoph


Gunnar Rønning wrote:
> 
> "Geir Magnusson Jr." <ge...@optonline.net> writes:
> 
> >
> > 1) When a non-null reference (a reference that refers to actual data in
> > the Context) is preceded by a '\', it will be rendered as $<reference>.
> >
> >
> > #set $foo = "woogie"
> > \$foo => $foo
> > \$bar => $bar
> >
> 
> So what about escaping the backslash itself ? Let's say I want the
> following output by accessing $foo :
> 
> "\woogie"
> 
> regards,
> 
>         Gunnar

Re: Parser Update

Posted by Gunnar Rønning <gu...@candleweb.no>.
"Geir Magnusson Jr." <ge...@optonline.net> writes:


> 
> 1) When a non-null reference (a reference that refers to actual data in
> the Context) is preceded by a '\', it will be rendered as $<reference>.
> 
> 
> #set $foo = "woogie"
> \$foo => $foo
> \$bar => $bar
> 

So what about escaping the backslash itself? Let's say I want the
following output by accessing $foo :

"\woogie"


regards, 

	Gunnar


Re: Parser Update

Posted by Jon Stevens <jo...@latchkey.com>.
on 11/7/2000 5:00 AM, "Christoph Reck" <Ch...@dlr.de> wrote:

> The $!<ref> construct is a conditional emitting (if defined)
> and $\<ref> is (post-)escaping the reference, making it a literal.

I have already stated that I am -1 on any more modifiers other than $!.

I don't want this to turn into Perl.

-jon

-- 
http://scarab.tigris.org/    | http://noodle.tigris.org/
http://java.apache.org/      | http://java.apache.org/turbine/
http://www.working-dogs.com/ | http://jakarta.apache.org/velocity/
http://www.collab.net/       | http://www.sourcexchange.com/



Re: Parser Update

Posted by Christoph Reck <Ch...@dlr.de>.
OK, +1 for the current implementation.

Geir Magnusson Jr. wrote:
>[snip]
> The point is that I believe that something is a reference if and 
> only if it has a value in the Context.  Otherwise it is schmoo, 
>[snip].  
> This is the way the current parser behaves.
>[snip]

I can fully agree with this last explanation. Context variables
are identified by a value placed in the context and not by the
prefix '$'. The same applies to directives (which are either
predefined or plugged in): they are defined by the engine and not
by the '#' prefix. To emit these context variables as literals,
they need to be escaped.

The $!<ref> construct is a conditional emitting (if defined)
and $\<ref> is (post-)escaping the reference, making it a literal.

If anybody does not like it, he can use the explicit construct:
#set $d = "$"
${d}foo => $foo
which is independent of availability of 'foo' in the context.
This can avoid different rendering, when a variable may (or may not) 
be in the context (e.g. a #set within an #if).
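
To illustrate the difference, here is a small Python sketch (not the
real parser). It assumes the rule quoted above, that a token is a
reference only if it has a value in the context; render and TOKEN are
hypothetical names.

```python
import re

# Optional backslash, then $name or ${name}.
TOKEN = re.compile(r"(\\?)\$(?:\{(\w+)\}|(\w+))")


def render(template, context):
    """Resolve references; '\\' escapes only tokens that are references."""

    def repl(m):
        backslash, braced, bare = m.groups()
        name = braced or bare
        if name not in context:
            return m.group(0)      # not a reference: plain schmoo
        if backslash:
            return m.group(0)[1:]  # escaped reference: drop the '\'
        return str(context[name])

    return TOKEN.sub(repl, template)


without_foo = {"d": "$"}
with_foo = {"d": "$", "foo": "x"}
for ctx in (without_foo, with_foo):
    print(render("${d}foo", ctx), "|", render(r"\$foo", ctx))
```

In this sketch '${d}foo' gives '$foo' for both contexts, while '\$foo'
keeps its backslash when foo is missing and loses it when foo is set.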

It's three of us ready to put this issue to bed. One is a single,
two is a couple, three is a crowd! :D

:) Christoph

Re: Parser Update

Posted by Christoph Reck <Ch...@dlr.de>.
Wonderful, this is what I would expect - clearly defined
escape and whitespace handling!

One comment (wish) on the whitespace handling.

> 
> 2) All other directives leave preceding whitespace alone to respect
> 'What You Expect' and eat any trailing whitespace if they are followed
> by a newline.  If the whitespace is not followed immediately by a
> newline, the whitespace is rendered.
> 

The rule of hungry directives should be:
1. standalone directives are hungry and eat preceding and
   following whitespace up to and including a newline character.
2. embedded directives (having any non-whitespace before or after them)
   do not touch the whitespace around them.

JTest input examples:

## When the parser is eating trailing whitespace up to and including
## the newline, it should also eat preceding whitespace!
----
#set $foo = "Mud"
#set $bar = "Wendy"
#if ($foo)
  #set $mudslinger = "$foo slinger"
  #if ($bar)
    $mudslinger is #if ($action) $action#else dreaming#end at $bar's!
  #else
    No bar to eat!
  #end
#end
----
## should render as:
----
Mud slinger is dreaming at Wendy's!
----
## having non-embedded directives eat preceding whitespace
## allows nicely indented constructs.


## Embedded directives are not hungry, e.g.:
----
#set $colors =["red", "blue", "tangerine"]
The colors are: #foreach($color in $colors)$color #end
----
## should render to:
---
The colors are: red blue tangerine
---
## Leaving the newline untouched because the directive was not 
## standalone!


NOTE that this previous example contradicts your (Geir's)
previous posting in this thread. To achieve the same output as there,
escape the newline with a comment:


## To eat a newline, munch it with a comment:
----
#set $colors =["red", "blue", "tangerine"]
The colors are: #foreach($color in $colors)$color #end##
----
## should render to:
---
The colors are: red blue tangerine ---
## having the directive line connected to the following one!


Is this conditional eating of preceding/following whitespace possible?
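
As a rough idea of what I mean, here is a line-based sketch in Python.
It is not parser code: STANDALONE and strip_standalone are made-up
names, the directive list is an assumption, nested parentheses and the
'##' comment trick are not handled, and directives are only removed,
never executed.

```python
import re

# A line is "standalone" when it holds exactly one directive, optionally
# surrounded by spaces or tabs, and nothing else.
STANDALONE = re.compile(
    r"[ \t]*#(?:else|end|set\b.*|(?:if|elseif|foreach)[ \t]*\([^()]*\))[ \t]*"
)


def strip_standalone(template):
    """Drop standalone directive lines, leave embedded directives alone."""
    out = []
    for line in template.splitlines(keepends=True):
        if STANDALONE.fullmatch(line.rstrip("\r\n")):
            continue      # hungry: indentation, directive and newline go
        out.append(line)  # embedded directive or plain schmoo: untouched
    return "".join(out)


sample = (
    '#if ($foo)\n'
    '  #set $x = "a"\n'
    '  text #if ($a) b#end c\n'
    '#end\n'
)
print(strip_standalone(sample), end="")
```

Only the third line of the sample survives, with its embedded #if and
#end and the surrounding whitespace left exactly as written.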

:) Christoph
