Posted to dev@velocity.apache.org by Claude Brisson <cl...@savoirweb.com> on 2002/06/09 13:16:51 UTC

Automatic character escaping in reference values (was RE : escaping xml characters)

Here is my proposed implementation for automatic character escaping in reference values.

Only the XML format is taken into account for now, but the object model can easily be extended to handle other formats.
As a reminder, the only prerequisites are that reference values should not contain syntax boundaries (although they can in some cases) and that the document should of course be well-formed.

I guess the lack of feedback for the specs I posted a week ago was a kind of tepid approval, so I went ahead...
Now that the thing is coded, I'd appreciate some comments, even though I know you all have plenty of overdue work to do, as I do... (at least, tell me you'll look at it one of these days ;-) or where this package has a chance to go: in the repository, in the whiteboard, or in the trashcan...).

### Description of this implementation :

A new boolean property, "runtime.format.character.escaping", controls the use of this character escaping mechanism.

A new string member of Template, "contentType", describes the MIME type of the template and has its getter and setter. For now, it will remain null until explicitly set.

All new objects related to this character escaping are stored in the package org.apache.velocity.runtime.format

If template content type is known and there exists a corresponding specialized character escaper, a new "org.apache.velocity.runtime.format.CharEscaper" object member [with getter and setter] in InternalContextAdapter[Impl] is built by Template.merge( ).

If not null, this char escaper is called by ASTText.render( ) (to keep track of the current syntax) and by ASTReference.render( ) (to perform the actual character escaping).
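
To make the two roles above concrete, here is a minimal sketch of how such an escaper could behave. It is not the code from the attached patch: the class name XmlEscaperSketch and the methods track( ) and escape( ) are hypothetical, and the rule that a quote is only escaped inside an attribute delimited by the same quote is an assumption. track( ) stands for what ASTText.render( ) would call on literal template text, and escape( ) for what ASTReference.render( ) would call on a reference value.

```java
// Hypothetical sketch: names and escaping rules are illustrative assumptions,
// not the API of the attached patch.
class XmlEscaperSketch {

    enum State { CONTENT, TAG, ATTR_SINGLE, ATTR_DOUBLE, CDATA, COMMENT }

    private State state = State.CONTENT;

    /** Called for literal template text: only updates the syntax state. */
    void track(String text) {
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            switch (state) {
                case CONTENT:
                    if (text.startsWith("<![CDATA[", i)) { state = State.CDATA; i += 9; }
                    else if (text.startsWith("<!--", i)) { state = State.COMMENT; i += 4; }
                    else { if (c == '<') state = State.TAG; i++; }
                    break;
                case TAG:
                    if (c == '>') state = State.CONTENT;
                    else if (c == '\'') state = State.ATTR_SINGLE;
                    else if (c == '"') state = State.ATTR_DOUBLE;
                    i++;
                    break;
                case ATTR_SINGLE:
                    if (c == '\'') state = State.TAG;
                    i++;
                    break;
                case ATTR_DOUBLE:
                    if (c == '"') state = State.TAG;
                    i++;
                    break;
                case CDATA:
                    if (text.startsWith("]]>", i)) { state = State.CONTENT; i += 3; } else i++;
                    break;
                case COMMENT:
                    if (text.startsWith("-->", i)) { state = State.CONTENT; i += 3; } else i++;
                    break;
            }
        }
    }

    /** Called for a reference value: escapes it according to the current state. */
    String escape(String value) {
        // Inside CDATA sections and comments the value is written untouched.
        if (state == State.CDATA || state == State.COMMENT) return value;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '<') out.append("&lt;");
            else if (c == '>') out.append("&gt;");
            else if (c == '&') out.append("&amp;");
            else if (c == '\'' && state == State.ATTR_SINGLE) out.append("&apos;");
            else if (c == '"' && state == State.ATTR_DOUBLE) out.append("&quot;");
            else out.append(c);
        }
        return out.toString();
    }
}
```

The sketch only looks for syntax markers within a single chunk of literal text, which is why a marker split by a reference (or contained in a reference value) would not be recognized.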

### Three zip files attached : 

 - all modified files
 - the patch
 - an example

### Output of the example : 

<?xml version="1.0" encoding="UTF-8"?>
<document singlequote='-->&apos;<--' doublequote="-->&quot;<--">
        &lt;&amp;&gt;
        <![CDATA['<&>']]>
        &lt;&amp;&gt;
        <!--'<&>'-->
        &lt;&amp;&gt;
</document>

Feel free to make any comment or criticism.

CloD


Re: Automatic character escaping in reference values (was RE : escaping xml characters)

Posted by Claude Brisson <cl...@savoirweb.com>.
Geir wrote :

> Interesting approach.  What worries  me is how deeply it touches things,
> from ICA to Template to nodes.

Hi Geir.

Let's review the things the mechanism touches in velocity :

 - template : what I did here is just to add an *optional* parameter, contentType.
Storing type information (either in the template or directly in the resource) may be useful for other tasks, such as setting the proper
content type in the servlet response. Anyway, no escaping at all can take place without knowing something about the content, and the
template looked like the right place to store this info.

 - ICA : this one may be avoided. I added an optional parameter, the CharEscaper. It could be stored anywhere the nodes have
access to, provided it is in the scope of the current merging process. It could even be stored as a simple context object under the
appropriate key string, but when I saw this empty interface, and also your comment "If anything new comes along, add it here."...I
could not resist... ;-)

 - nodes : here I guess it cannot be avoided, since the nodes are the only ones that know whether what they render should get a chance to
be escaped or not. Note that only ASTText and ASTReference are concerned.

> Why couldn't this be implemented as a filter on the writer?

It would then be a FilterWriter that would wrap the outer writer, and would have two methods : a writer.write (no escaping) and a
writer.filter (escaping).
Then, only two classes would be touched : Template (for the content type and to build the FilterWriter) and ASTReference (to call
writer.filter instead of writer.write).

I can implement it this way if that gives the module a better chance of being committed (and I also guess it's a little better).
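
A rough sketch of what that wrapper could look like, assuming a subclass of java.io.FilterWriter. The class name EscapingWriterSketch is hypothetical and nothing here comes from the attached patch. For brevity this version escapes the same way everywhere; it does not track whether the output is currently inside an attribute, a CDATA section or a comment.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.Writer;

// Hypothetical sketch of the writer-based variant; not part of the patch.
class EscapingWriterSketch extends FilterWriter {

    EscapingWriterSketch(Writer out) {
        super(out);
    }

    // The inherited write(...) methods pass text through unchanged,
    // which is what literal template text needs.

    /** Writes a reference value, escaping XML special characters. */
    void filter(String value) throws IOException {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '<':  out.write("&lt;");   break;
                case '>':  out.write("&gt;");   break;
                case '&':  out.write("&amp;");  break;
                case '\'': out.write("&apos;"); break;
                case '"':  out.write("&quot;"); break;
                default:   out.write(c);
            }
        }
    }
}
```

With this shape, Template would wrap the outer writer once, and ASTReference would call filter( ) where it calls write( ) today; every other node keeps using write( ).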

CloD



--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: Automatic character escaping in reference values (was RE : escaping xml characters)

Posted by "Geir Magnusson Jr." <ge...@adeptra.com>.
Interesting approach.  What worries  me is how deeply it touches things,
from ICA to Template to nodes.

Why couldn't this be implemented as a filter on the writer?



On 6/9/02 7:16 AM, "Claude Brisson" <cl...@savoirweb.com> wrote:

> Here is my proposed implementation for automatic character escaping in
> reference values.
> 
> Only the XML format is taken into account for now, but the object model can
> easily be extended to handle other formats.
> I recall that the only predicate is that reference values should not contain
> syntax boundaries (although they can in some cases), along with the fact that
> the document should of course be well-formed.

[SNIP]

-- 
Geir Magnusson Jr.
Research & Development, Adeptra Inc.
geirm@adeptra.com
+1-203-247-1713


