Posted to dev@cocoon.apache.org by Timothy Larson <ti...@yahoo.com> on 2003/10/14 16:44:37 UTC

EffectTransformer/WoodyTemplateTransformer

I posted the source to two Java classes on my page on the wiki:
  http://wiki.cocoondev.org/Wiki.jsp?page=TimLarson
The first class, EffectTransformer, makes writing transformers in Java
much easier and cleaner, IMHO.  The second class, WoodyTemplateTransformer,
is an example of its namesake rewritten to use the Effect transformer style.

Here is an explanation:

Stylesheets let the structure of your code reflect the structure of the
data you are processing, making them relatively easy to understand.
The downside to stylesheets shows up when you really need the support of a
full programming language or need side effects such as looking up widgets
or talking to a database.  Sure you can do it, but it just does not feel
right and can require too much gymnastics.

To solve this we try writing transformers in Java, but end up with the
logic for handling different elements mixed in the same startElement
method, etc. and have to introduce multiple flag and status variables
to keep track of the current state.  At least we gained a full programming
language and the ability to have side effects.

The (side-) Effect transformer solves all of these issues.

You write a transformer class which extends EffectTransformer and supplies
global data which can be directly accessed by element handler classes
implemented as inner classes.  Support is provided for the handler classes
to implement generic processing, tail recursion, and poor-man's continuations.

For each SAX event the event data is collected and the current handler is
called.  Start and end element events are handled specially.  The current
handler is told there is a nested element (event == EVENT_ELEMENT) and it is
expected to return a handler for that element.  The returned handler will
then be called with the start element event (event == EVENT_START_ELEMENT),
any nested events (EVENT_CHARACTERS, EVENT_COMMENT, etc.), and finally the
end element event (EVENT_END_ELEMENT).  At this point the enclosing handler
will automatically be reinstated from the active handler stack to receive any
further events.

Note that returning a reference to the current handler object, "this", allows
the current object to handle the nested element.

Also, returning an instance of an anonymous class which extends Handler allows
the code that handles the nested element to itself be nested inside the
current handler code.

For each event the handler can run custom logic, use one of the three
generic processing methods, or both:
  Returning without calling any output method discards the event.
  "out.copy()" copies the current input event to the output.
  "out.compile()" compiles the current input event.
To support custom logic, the full set of SAX events can be output using the
pattern out.methodname(parameters).

For every call, the handler is expected to return a reference to the object
that is to handle the next event.  For most events it is normal to return a
reference to the current handler object, "this".  Returning a reference to a
different object acts like tail recursion; control is passed horizontally
to the returned object and will not return to the current handler object.

The two special cases are EVENT_ELEMENT which implements nesting of control
as explained above, and EVENT_END_ELEMENT which discards the returned value
to allow for the return of control that was nested by EVENT_ELEMENT.
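
The dispatch rules above can be modelled in a few lines.  This is only a
sketch, not the EffectTransformer source: the class name DispatchSketch,
the Event enum, the three-argument process() signature and the two sample
handlers are all assumptions made for illustration.  It shows the handler
stack, the EVENT_ELEMENT lookup, and the discarded return value on
EVENT_END_ELEMENT.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical, simplified model of the dispatch rules described in the post.
class DispatchSketch {
    enum Event { ELEMENT, START_ELEMENT, CHARACTERS, END_ELEMENT }

    interface Handler {
        Handler process(Event event, String data, List<String> out);
    }

    // Discards everything, nested elements included (like the post's nullHandler).
    static final Handler DISCARD = new Handler() {
        public Handler process(Event e, String d, List<String> out) { return this; }
    };

    // Copies everything to the output except <skip> subtrees.
    static final Handler COPY_EXCEPT_SKIP = new Handler() {
        public Handler process(Event e, String d, List<String> out) {
            switch (e) {
            case ELEMENT:       return "skip".equals(d) ? DISCARD : this;
            case START_ELEMENT: out.add("<" + d + ">");  return this;
            case END_ELEMENT:   out.add("</" + d + ">"); return this;
            default:            out.add(d);              return this;
            }
        }
    };

    private final Deque<Handler> stack = new ArrayDeque<>();
    private Handler current;
    final List<String> out = new ArrayList<>();

    DispatchSketch(Handler root) { current = root; }

    void startElement(String name) {
        // Ask the enclosing handler which handler should take the nested element.
        Handler nested = current.process(Event.ELEMENT, name, out);
        stack.push(current);
        // The chosen handler then receives the start element event itself.
        current = nested.process(Event.START_ELEMENT, name, out);
    }

    void characters(String text) {
        // Ordinary event: whatever is returned handles the next event.
        current = current.process(Event.CHARACTERS, text, out);
    }

    void endElement(String name) {
        // The return value is discarded; the enclosing handler is reinstated.
        current.process(Event.END_ELEMENT, name, out);
        current = stack.pop();
    }

    // Feeds <a>x<skip>y</skip>z</a> through the sketch.
    static List<String> demo() {
        DispatchSketch d = new DispatchSketch(COPY_EXCEPT_SKIP);
        d.startElement("a");
        d.characters("x");
        d.startElement("skip");
        d.characters("y");
        d.endElement("skip");
        d.characters("z");
        d.endElement("a");
        return d.out;
    }

    public static void main(String[] args) {
        System.out.println(String.join("", demo())); // prints <a>xz</a>
    }
}
```

Because the enclosing handler is popped back on every end element, the
DISCARD handler never has to know when its subtree is finished.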

Example inner handler class that assumes no errors in the input XML:

    protected class RepeaterSizeHandler extends Handler {
        public Handler process() throws SAXException {
            // The variable "event" is supplied by the EffectTransformer class
            switch(event) {
            case EVENT_START_ELEMENT:
                // Method "getRepeaterWidget" is supplied by the enclosing class
                getRepeaterWidget("RepeaterSizeHandler");
                // The variable "widget" is also supplied by the enclosing class
                ((Repeater)widget).generateSize(contentHandler);
                widget = null;
                return this;
            case EVENT_ELEMENT:
                // This is returning a built-in handler to discard nested elements
                return nullHandler;
            case EVENT_END_ELEMENT:
                return this;
            default:
                out.copy();
                return this;
            }
        }
    }

Here is the above example modified to use poor-man's continuations to implement error checking.
The integer "step" acts as the instruction pointer that the switch statement
uses to continue right where it left off the last time the handler was called.
Any additional data that must live through a continuation should be declared
at the class level along with the variable "step".  An example of an anonymous class
is also included.

    protected class RepeaterSizeHandler extends Handler {
        protected int step = 1;  // Start at case 1, the first step below.
        public Handler process() throws SAXException {
            switch(step) {
            case 1:
                if (event != EVENT_START_ELEMENT)
                    throw new SAXException("StartElement expected!");
                getRepeaterWidget("RepeaterSizeHandler");
                ((Repeater)widget).generateSize(contentHandler);
                widget = null;
                step++;
                return this;
            case 2:
                if (event == EVENT_ELEMENT) {
                    // For good measure, here is an example of the use of an anonymous class
                    // to nest the logic for handling nested elements.
                    return new Handler() {
                        public Handler process() throws SAXException {
                            return this; 
                        }
                    };
                } else if (event == EVENT_END_ELEMENT) {
                    step++;
                    return this;
                } else {
                    out.copy();
                    return this;
                }
            default:
                throw new SAXException("I really did not expect to get called again!");
            }
        }
    }
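
The "step" idiom can also be tried outside Cocoon.  The following is a
standalone sketch, assuming plain strings in place of SAX events; the
class StepSketch and its event names are made up for illustration and are
not part of the EffectTransformer.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration of a step counter used as a saved instruction pointer.
class StepSketch {
    private int step = 1;   // where to resume on the next call
    private int seen = 0;   // data that must live through a continuation
    final List<String> out = new ArrayList<>();

    void process(String event) {
        switch (step) {
        case 1:
            if (!event.equals("start"))
                throw new IllegalStateException("start expected");
            out.add("begin");
            step++;         // the next call resumes at case 2
            return;
        case 2:
            if (event.equals("end")) {
                out.add("end after " + seen + " items");
                step++;     // any further call lands in default
            } else {
                seen++;     // no step++, so this step repeats
            }
            return;
        default:
            throw new IllegalStateException("no more events expected");
        }
    }

    public static void main(String[] args) {
        StepSketch s = new StepSketch();
        for (String e : new String[] {"start", "a", "b", "end"}) s.process(e);
        System.out.println(s.out); // prints [begin, end after 2 items]
    }
}
```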

I will give more details and examples if anybody is interested.
WDYT?

--Tim Larson


__________________________________
Do you Yahoo!?
The New Yahoo! Shopping - with improved product search
http://shopping.yahoo.com

Re: EffectTransformer/WoodyTemplateTransformer

Posted by Timothy Larson <ti...@yahoo.com>.
--- Javier del Gesu <de...@thinknola.com> wrote:
> Further consideration (these are examples, not actual code):

>         } else if (localName.equals("command")) {
>             
>             // Create an anonymous inner class to intercept a value.
> 
>             ContentFilter anonymous = new AbstractContentFilter () {
>                 public void characters (FilterContext filterContext,
>                     SAXCharacterEvent event) throws SAXException {
>                     ExampleFilter.this.command =
>                         event.getCharactersAsString();
>                 }
>             };
>             filterContext.startContentFilter(anonymous);

This use of an anonymous class is almost the same as with the Effect
transformer.

>         } else if (localName.equals("item")) {
> 
>             // Use the same filter for a certain set of element
>             // names, preserves state.
> 
>             // Is this what you mean by continuations?
> 
>             filterContext.startContentFilter(collectionFilter);
> 
>         }

This is similar to the following code using the Effect transformer:

private class SomeClass extends Handler {
    private String name = null;
    public Handler process() throws SAXException {
        switch(event) {
        case EVENT_START_ELEMENT:
            if (name == null) {
                // This is the first time this handler was called.
                name = new String(loc);
            } else {
                // We are processing a child element this time.
            }
            return this;
        case EVENT_ELEMENT: // If child element...
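            if ("item".equals(loc)) {
                // ...then process with this same object, preserving state.
                // Immediately after "return this", this same object will be called
                // again for this same element, but with event == EVENT_START_ELEMENT.
                return this;
            }
            // Any other child is also handled here; return explicitly rather than
            // falling through into the EVENT_END_ELEMENT case.
            return this;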
        case EVENT_END_ELEMENT:
            // Do finishing work
            return this;
        default:
            return this;
        }
    }
}

By a continuation I mean something a bit different.  I can preserve both the processing
object's state, as in the example above, and also the position in the processing method,
like saving an instruction pointer.  When the handling method does a "return" it first
remembers where it left off, so the next time it is called it automatically picks up
in the middle of the method right after the return statement that was last executed.
This allows for some sophisticated processing with little effort, like the following:

private class SomeClass extends Handler {
    private int step = 1; // Start at case 1, the first step below.
    private String name = null;
    public Handler process() throws SAXException {
        switch(step) {
        case 1:
            // This is the first time this handler was called.
            name = new String(loc);
            out.copy(); // Copy start element to output.
            step++; return this; // The next call to this method for any event...
        case 2: // ...will automatically pick up right here at step 2.
            if (event == EVENT_ELEMENT) { // Handle first child element.
                // The "step++;" will cause the next call to this method
                // to automatically pick up at step 3 below.
                step++; return new Handler() {
                    // Put code for anonymous handler class here.
                };
            }
            out.copy(); // If not a child element, then just copy to output.
            // No "step++;" is present here, so this step will be repeated
            // until a child element is finally processed.
            return this;
        case 3: // Step 3.
            if (event == EVENT_ELEMENT) { // Handle second child element differently.
                step++; return new Handler() {
                    // Put different code for anonymous handler class here.
                };
            } else if (event == EVENT_CHARACTERS) {
                // Do special processing based on what the text is.
                return this;
            }
            throw new SAXException("Oh, no. Unexpected event!");
        default: // Handle remaining events.
            switch(event) {
            case EVENT_END_ELEMENT:
                out.copy();
                return this;
            default:
                out.copy(); // Copy the rest of the input straight to the output.
                return this;
            }
        }
    }
}

> Did I cover all the bases?

Does this answer your questions?

--Tim Larson



Re: EffectTransformer/WoodyTemplateTransformer

Posted by Javier del Gesu <de...@thinknola.com>.
* Timothy Larson <ti...@yahoo.com> [2003-10-14 19:00]:
> (Note to Javier: I am forwarding my reply to the list, since I saw your
> public reply just moments after I had already sent this reply privately)
> 
> --- Javier del Gesu <de...@thinknola.com> wrote:
> > This is reassuring. I wrote something similar to create a SAX
> > filter, where processing of different nodes is handled by different
> > objects.

> This is where the poor man's continuations come in handy.  I have some
> techniques that allow loops and other code structures to be continued.
> These features do not get used in the WoodyTemplateTransformer, but I
> plan to use them in processing other structured data in other projects.
> 
> I like your put/get interface for communicating data between handlers.
> That concept may get used when the Effect transformer gets extended to
> handle multiple inputs and outputs and a few other things...

Further consideration (these are examples, not actual code):

class ExampleFilter extends AbstractContentFilter {
    private String command;
    private CollectionFilter collectionFilter;

    public ExampleFilter () {
        collectionFilter = new CollectionFilter();
    }

    public void start (FilterContext filterContext,
        SAXStartElementEvent event) throws SAXException {
    }

    public void startElement (FilterContext filterContext,
        SAXStartElementEvent event) throws SAXException {

        String localName = filterContext.getLocalName();

        if (localName.equals("document")) {

            // Reuse a common component.

            // Build a DOM document and put it under the key "doc"
            // in the filter context.

            filterContext.startContentFilter(new DOMFilter("doc"));

        } else if (localName.equals("command")) {
            
            // Create an anonymous inner class to intercept a value.

            ContentFilter anonymous = new AbstractContentFilter () {
                public void characters (FilterContext filterContext,
                    SAXCharacterEvent event) throws SAXException {
                    ExampleFilter.this.command =
                        event.getCharactersAsString();
                }
            };
            filterContext.startContentFilter(anonymous);

        } else if (localName.equals("item")) {

            // Use the same filter for a certain set of element
            // names, preserves state.

            // Is this what you mean by continuations?

            filterContext.startContentFilter(collectionFilter);

        }
    }

    public void end (FilterContext filterContext, SAXEndElementEvent event)
        throws SAXException {

        if (command != null)
            Singleton.doSomethingWithCommand(command);

        Document document = (Document) filterContext.get("doc");
        if (document != null)
            Singleton.doSomethingWithDocument(document);

        Object items[] = collectionFilter.toArray();
        Singleton.doSomethingWithItems(items);
    }
}

Did I cover all the bases?

-- 
Javier del Gesu - delgesu@thinknola.com

Re: EffectTransformer/WoodyTemplateTransformer

Posted by Timothy Larson <ti...@yahoo.com>.
(Note to Javier: I am forwarding my reply to the list, since I saw your
public reply just moments after I had already sent this reply privately)

--- Javier del Gesu <de...@thinknola.com> wrote:
> This is reassuring. I wrote something similar to create a SAX
> filter, where processing of different nodes is handled by different
> objects.

Thanks for your comments.  In my sequence of attempts to modularize
the WoodyTemplateTransformer, I had one solution that was very similar
to what you describe.  I opted instead for the "event" variable solution
with one common "process" method per handler, because it makes complex,
procedural, order-sensitive processing of XML with mixed content easier,
like this:
  <element>
    text <subelement/> text <subelement/> text
  </element>
This is where the poor man's continuations come in handy.  I have some
techniques that allow loops and other code structures to be continued.
These features do not get used in the WoodyTemplateTransformer, but I
plan to use them in processing other structured data in other projects.

I like your put/get interface for communicating data between handlers.
That concept may get used when the Effect transformer gets extended to
handle multiple inputs and outputs and a few other things...

> That's what I got. I think it is good to know that we both hit upon
> the same solution, using an object stack to gather state.
> 
> -- 
> Javier del Gesu - delgesu@thinknola.com

Pulling the state and control flow out of the handlers and into the
framework lightens the load when coding handlers.  It is interesting
comparing our approaches.  Thanks for sharing.

--Tim Larson



Re: EffectTransformer/WoodyTemplateTransformer

Posted by Javier del Gesu <de...@thinknola.com>.
* Timothy Larson <ti...@yahoo.com> [2003-10-14 14:44]:

> I posted the source to two Java classes on my page on the wiki:
>   http://wiki.cocoondev.org/Wiki.jsp?page=TimLarson
> The first class, EffectTransformer, makes writing transformers in Java
> much easier and cleaner, IMHO.  The second class, WoodyTemplateTransformer,
> is an example of its namesake rewritten to use the Effect transformer style.
> 
> Here is an explanation:
> 
> Stylesheets let the structure of your code reflect the structure of the
> data you are processing, making them relatively easy to understand.
> The downside to stylesheets is when you really need the support of a
> full programming language and when you need side effects such as looking
> up widgets or talking to a database.  Sure you can do it, but it just
> does not feel right and can require too much gymnastics.
> 
> To solve this we try writing transformers in Java, but end up with the
> logic for handling different elements mixed in the same startElement
> method, etc. and have to introduce multiple flag and status variables
> to keep track of the current state.  At least we gained a full programming
> language and the ability to have side effects.
> 
> The (side-) Effect transformer solves all of these issues.

> I will give more details and examples if anybody is interested.
> WDYT?

This is reassuring. I wrote something similar to create a SAX
filter, where processing of different nodes is handled by different
objects.

The user implements this interface:

public interface ContentFilter {
    /* Called when this content filter is pushed onto the stack */
    public void start(FilterContext filterContext, SAXStartElementEvent event)
        throws SAXException;

    /* Called when this content filter is popped off the stack */
    public void end(FilterContext filterContext, SAXEndElementEvent event)
        throws SAXException;

    /* Handle SAX events */
    public void startElement(FilterContext filterContext,
        SAXStartElementEvent event) throws SAXException;

    public void endElement(FilterContext filterContext,
        SAXEndElementEvent event) throws SAXException;

    public void characters(FilterContext filterContext,
        SAXCharacterEvent event) throws SAXException;
}

Rather than change content filters after each call, the content
filter handles descendant elements until it decides it wants to
switch filters. The filter switch is performed by calling
FilterContext.startContentFilter().

The FilterContext interface:

public interface FilterContext {
    /* Push a new content filter onto the stack */
    public void startContentFilter(ContentFilter contentFilter)
        throws SAXException;

    /* Make note of something in this context */
    public void put(String key, Object value);

    /* Retrieve a note of something in this context */
    public Object get(String key);

    /* Find the filter context with a non-null value for key */ 
    public FilterContext find(String key);

    /* Retrieve the filter context parent */
    public FilterContext parent();
}

The FilterContext keeps track of state. ContentFilter implementations put
data into the filter context. Descendants can search for values obtained
in the tree traversal with the find method. They can return values
to a parent context with filterContext.parent().put("key", value).
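
One way the put/get/find/parent part of this could be backed is a map per
context with a link to the parent.  This is a hedged sketch only: the class
MapFilterContext is hypothetical, and having find() start its search at the
current context rather than at the parent is an assumption the interface
does not settle.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical backing for the note-keeping methods; only the method names
// put, get, find and parent come from the FilterContext interface.
class MapFilterContext {
    private final MapFilterContext parent;
    private final Map<String, Object> notes = new HashMap<>();

    MapFilterContext(MapFilterContext parent) { this.parent = parent; }

    public void put(String key, Object value) { notes.put(key, value); }

    public Object get(String key) { return notes.get(key); }

    public MapFilterContext parent() { return parent; }

    // Walk outward until some context holds a non-null value for the key.
    // Assumption: the search starts at this context, not at its parent.
    public MapFilterContext find(String key) {
        for (MapFilterContext c = this; c != null; c = c.parent) {
            if (c.notes.get(key) != null) return c;
        }
        return null;
    }

    public static void main(String[] args) {
        MapFilterContext root = new MapFilterContext(null);
        MapFilterContext child = new MapFilterContext(root);
        root.put("lang", "en");
        // A descendant looks a value up through find(), then reads it with get().
        System.out.println(child.find("lang").get("lang"));
        // A descendant returns a value to its parent context.
        child.parent().put("result", 42);
        System.out.println(root.get("result"));
    }
}
```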

When FilterContext.startContentFilter() is called, the
ContentFilter.start() method is called. ContentFilter.end() is
called when the element of the current filter context ends.
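
The start()/end() lifecycle can be sketched with a stack that remembers the
element depth at which each filter was pushed.  Everything here is an
assumption made for illustration: the class FilterStackSketch, its cut-down
Filter interface with string arguments, and the choice that the element
which triggers a switch is reported to the new filter through start() and
end() only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of when start() and end() fire; not the real FilterContext.
class FilterStackSketch {
    interface Filter {
        void start(String element);
        void end(String element);
        void startElement(FilterStackSketch ctx, String element);
    }

    private static final class Entry {
        final Filter filter;
        final int depth;
        Entry(Filter filter, int depth) { this.filter = filter; this.depth = depth; }
    }

    private final Deque<Entry> stack = new ArrayDeque<>();
    private int depth = 0;
    private String currentElement;

    FilterStackSketch(Filter root) { stack.push(new Entry(root, 0)); }

    // Called by a filter from inside startElement() to switch filters.
    void startContentFilter(Filter filter) {
        stack.push(new Entry(filter, depth));
        filter.start(currentElement);
    }

    void startElement(String name) {
        depth++;
        currentElement = name;
        stack.peek().filter.startElement(this, name);
    }

    void endElement(String name) {
        Entry top = stack.peek();
        if (top.depth == depth) {   // the element that started this filter ends
            stack.pop();
            top.filter.end(name);
        }
        depth--;
    }

    // Feeds <doc><item><name/></item></doc> and records the lifecycle calls.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Filter child = new Filter() {
            public void start(String e) { log.add("start " + e); }
            public void end(String e) { log.add("end " + e); }
            public void startElement(FilterStackSketch ctx, String e) {
                log.add("child sees " + e);
            }
        };
        Filter root = new Filter() {
            public void start(String e) { }
            public void end(String e) { }
            public void startElement(FilterStackSketch ctx, String e) {
                if ("item".equals(e)) ctx.startContentFilter(child);
            }
        };
        FilterStackSketch s = new FilterStackSketch(root);
        s.startElement("doc");
        s.startElement("item");
        s.startElement("name");
        s.endElement("name");
        s.endElement("item");
        s.endElement("doc");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [start item, child sees name, end item]
    }
}
```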

The SAXEvent class acts as both an encapsulation of a SAX event, and
as an interface to the next ContentHandler in the filter chain.

public interface SAXEvent {
    /* Forward this event.  */
    public void forward() throws SAXException;

    /* Create a SAXAttributes object with the attributes of this event.  */
    public SAXAttributes createSAXAttributes();

    public SAXAttributes createSAXAttributes(Attributes attributes);

    /* Send a start element event to the next content handler in the
     * chain of content handlers */
    public void forwardStartElement(String uri, String localName,
        Attributes attributes) throws SAXException;

    public void forwardStartElement(String uri, String localName,
        String[] attributes) throws SAXException;

    public void forwardStartElement(String uri, String localName)
        throws SAXException;

    /* Send an end element event to the next content handler in the
     * chain of content handlers */
    public void forwardEndElement(String uri, String localName)
        throws SAXException;

    /* Send characters to the listening content handler */
    public void forwardCharacters(char[] characters, int start,
         int length) throws SAXException;

    public void forwardCharacters(String characters)
        throws SAXException;

    /* Play a DOM document to the next content handler */
    public void forwardNode(Node node) throws SAXException;

    /* Namespace information */
    public Iterator getNamespaceURIs();

    public String getPrefix(String uri);
}

public interface SAXEndElementEvent extends SAXEvent {
    public String getURI();

    public String getLocalName();

    public String getQualifiedName();
}

That's what I got. I think it is good to know that we both hit upon
the same solution, using an object stack to gather state.

-- 
Javier del Gesu - delgesu@thinknola.com

Re: EffectTransformer/WoodyTemplateTransformer

Posted by Bruno Dumon <br...@outerthought.org>.
On Tue, 2003-10-14 at 22:31, Timothy Larson wrote:
> Attn: Bruno, Sylvain, and anyone else working with Woody.
> 
> Would you take a look at the WoodyTemplateTransformer attached to:
>   http://wiki.cocoondev.org/Wiki.jsp?page=TimLarson
> and let me know if it is suitable to replace the current class?
> 
> If it is acceptable, I plan to post my union widget patches based
> on the new transformer design.  Otherwise, I will try to work out
> another solution that we can all like.

Finally found some time to look at it, sorry for the delay.

I think conceptually it is great, basically the handlers only need to
know how to handle the children of the current element, which makes the
code more readable and more OO.

The disadvantage of it all is that a lot of additional objects are
created: for each element a Handler is created,  and an Element object,
containing an AttributesImpl into which the existing attributes are
copied, etc. This will give the garbage collector some more work and
slow down the transformation a bit. I don't know if this is something we
still need to take into account nowadays... (pages containing forms
typically won't be very large anyway I guess)

I'd say that for the current functionality it's not really necessary to
change, but I guess it might become useful for any new features you're
planning to add.

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
bruno@outerthought.org                          bruno@apache.org


Re: EffectTransformer/WoodyTemplateTransformer

Posted by Sylvain Wallez <sy...@anyware-tech.com>.
Timothy Larson wrote:

>Attn: Bruno, Sylvain, and anyone else working with Woody.
>
>Would you take a look at the WoodyTemplateTransformer attached to:
>  http://wiki.cocoondev.org/Wiki.jsp?page=TimLarson
>and let me know if it is suitable to replace the current class?
>
>If it is acceptable, I plan to post my union widget patches based
>on the new transformer design.  Otherwise, I will try to work out
>another solution that we can all like.
>
>PS: Any improvements, changes, comments, etc. are welcome.
>  
>

So far it looks yummy! I'll have a deeper look at it in the coming days.

Sylvain

-- 
Sylvain Wallez                                  Anyware Technologies
http://www.apache.org/~sylvain           http://www.anyware-tech.com
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
Orixo, the opensource XML business alliance  -  http://www.orixo.com



Re: EffectTransformer/WoodyTemplateTransformer

Posted by Timothy Larson <ti...@yahoo.com>.
Attn: Bruno, Sylvain, and anyone else working with Woody.

Would you take a look at the WoodyTemplateTransformer attached to:
  http://wiki.cocoondev.org/Wiki.jsp?page=TimLarson
and let me know if it is suitable to replace the current class?

If it is acceptable, I plan to post my union widget patches based
on the new transformer design.  Otherwise, I will try to work out
another solution that we can all like.

PS: Any improvements, changes, comments, etc. are welcome.

--Tim Larson

