Posted to dev@cocoon.apache.org by Stefano Mazzocchi <st...@apache.org> on 2001/12/03 15:30:51 UTC

Re: Flowmaps: the wrong approach

I've taken the weekend to learn Scheme, now I know what we are talking
about :)

Daniel Fagerstrom wrote:
> 
> Ovidiu Predescu wrote:
> <snip/>
> > I now believe we should have a system centered around logic, not
> > around states and transitions.
> I agree completely. IMHO, writing FSMs is like goto-programming: very
> small systems are easy to understand, but as soon as they grow, they
> easily become a maintenance nightmare.

This is the old tune against GOTOs and I totally agree on that.

On the other hand, I disagree that an FSM equals goto-programming (in
fact, you are describing an FSM down below, using XML syntax :)
 
> > The logic should be expressed in a
> > language that supports continuations. The logic should drive what
> > pages are presented to the user. These pages could be expressed in an
> > XML markup language with appropriate elements for extracting data
> > previously created in the program. These XML pages could be then
> > processed through a pipeline, similarly with how they are processed
> > today in the sitemap. However since incoming URLs are handled directly
> > by the logic, there's no need for matchers in the sitemap.
> >
> >
> >  HTTP request                         transformations
> > --------------> logic -----> XML page -----------------> HTML/WML/... page
> >
> >
> > The generated pages contain URLs that point back to continuation
> > points in the logic.
> >
> > The biggest problem is the fact that the logic needs to be expressed
> > in language that supports continuations. Since most people don't like
> > the Lisp syntax, a language that abstracts out the continuations in
> > higher level abstractions like send-response could be developed. This
> > can probably be done by extending a familiar language, like
> > Javascript, with these concepts.
> <snip/>
> > This is a huge paradigm shift from what we have right now, but I
> > believe leads to easier ways to write Web applications. They become
> > more like usual programs, instead of the complex beasts that we have
> > today, with state scattered all over the place in the code.
> >
> > And yes, please read the papers I pointed to in my previous email, to
> > understand what the heck I'm talking about. Here they are for your
> > convenience:
> >
> >    http://youpou.lip6.fr/queinnec/Papers/webcont.ps.gz
> >    http://www.cs.rice.edu/CS/PLT/Publications/esop2001-gkvf.ps.gz
> 
> I happen to like Lisp as well as its syntax ;)

Gosh, can't say the same, but it's not important at the moment.

> Still I wonder if it
> would not be possible to continue in the great cocoon tradition of
> SoC, and find a convenient description of webapp flow, without going
> all the way to a full high level programming language. 

Bingo! that's the point. The paper wants to "get control back" since
page-directed programming stole the control from the programmer.

The paper is right on many things but forgets about taking SoC into
consideration.

Turning a sitemap into a logic-oriented description would be equally bad
since web programming is a mixed form of "declarative" (page-oriented)
and "procedural" (logic-driven).

My personal opinion is that we should have both at the same time in
order to keep Cocoon elegance.

> Now that you
> have succeeded in exorcizing the mix of programming language
> constructs and tags from JSP etc, it seems like a pity to let this mix
> in again.

Exactly!
 
> Anyhow, inspired by the interesting discussion and the articles that
> you referred to, I started to think about how to use these concepts in
> cocoon. If possible, without having to do a "huge paradigm shift".

Same here!
 
> --------------------------------------------------------------------
> Flowmap
> -------
> 
> First, to make it more concrete, I will try to express the main
> example from http://youpou.lip6.fr/queinnec/Papers/webcont.ps.gz in
> terms of XSLT, Cocoon components, a sitemap and a flowmap. For those
> of you who have not read that article yet, the main example is a
> small webapp:
> 
> 1. On the first page it asks for the conversion rate between French
> Francs and another currency.
> 2. Then it asks for an amount of Francs.
> 3. And on the third page it returns the result.
> 
> One of the coolest things about the implementation in the article is
> that it can take care of multiple questions at once. If you browse
> through the three steps above and then click on the "new window"
> button in your browser, you can then go back to the first or second
> screen and fill in new data without affecting the result in the
> other browser window (even if you use the refresh button in it). This
> behavior is very useful for "what if" kinds of questions, where one can
> evaluate several alternative scenarios in a convenient way.

Yes, this is the really cool thing about their thesis of binding resources
to program continuations.
 
> Ok, here we go!
> We start with a high level description of the application flow:
> 
> <!-- flowMap.xml -->
> <fm:flowMap xmlns:fm="...">
>   <fm:flow url="conversion">
>     <fm:until test="/in/exchange/rate &gt; 0" id="rateTest">
>       <fm:show src="cocoon://readRate.html" id="rate"/>
>     </fm:until>
>     <fm:show src="cocoon://readFrancs.html" id="francs"/>
>     <fm:show src="cocoon://result.html" id="result"/>
>   </fm:flow>
> </fm:flowMap>
> 
> (The "id" attributes are not necessary and are only used for making
> reference easier.) The flow map is either a part of the sitemap or
> mounted from it. It will be executed by a "flowmap engine" on a
> request for "cocoon://conversion". The children of "fm:flow" are
> executed in sequence. Each child works as a pipeline. The flowmap
> engine feeds the pipeline with an XML document that has "in" as root
> element. This document contains two parts: a continuation, that is, a
> URL to the next stage (or stages) in the flowmap, and a description of
> the current state.
> 
> The input to the first stage, "conversion#rateTest" could look like
> this:
> 
> <in/>
> 
> Now, the first stage is an until-statement (a mistake from a
> pedagogical point of view, I realize :) ). The test, an XPath
> expression, will obviously not succeed on the current input
> data. Therefore the body of the until-statement is executed. But
> before we can do that we have to set the continuation.

I like very much what I see and I think you are onto something, but
there is a problem: the iterative step will fail to provide feedback on
the error.

We *must* take into consideration try/fail by providing the ability to
update the form page if a data entry error is made. This is vital
for webapp usability.

> The next stage
> after "conversion#rate" is "conversion#rateTest". We represent this
> situation by creating the new input:
> 
> <in>
>   <flow>
>     <next>conversion?next=rateTest-23454</next>
>   </flow>
> </in>
> 
> Here the URL "conversion?next=rateTest-23454" consists of two parts: one
> that identifies the next stage to go to in the flowmap, and one, "23454",
> that uniquely identifies the current state, which so far happens to
> be empty. The current state is stored in a hash table with the URL as
> a key.
> 
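A minimal sketch of the hash table described above, assuming an in-memory
dictionary. The names ContinuationStore, create and resume are hypothetical
and not part of Cocoon; the id format (stage name plus random digits) only
mimics "rateTest-23454".

```python
import copy
import secrets


class ContinuationStore:
    """In-memory table mapping continuation ids to (stage, state) pairs."""

    def __init__(self):
        self._table = {}

    def create(self, flow, stage, state):
        """Snapshot the state and return a URL that resumes at `stage`."""
        key = f"{stage}-{secrets.randbelow(100000):05d}"
        while key in self._table:  # retry on the rare id collision
            key = f"{stage}-{secrets.randbelow(100000):05d}"
        # Deep copy so later changes do not leak into this continuation.
        self._table[key] = (stage, copy.deepcopy(state))
        return f"{flow}?next={key}"

    def resume(self, key):
        """Return (stage, copy of the stored state) for a continuation id."""
        stage, state = self._table[key]
        return stage, copy.deepcopy(state)


store = ContinuationStore()
url = store.create("conversion", "rateTest", {})  # state is still empty
key = url.split("next=", 1)[1]
stage, state = store.resume(key)
```

Because resume hands out a copy, the same continuation can be resumed from
several browser windows without one window changing what the other sees.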
> We need an implementation of "conversion#rate":
> 
> <!-- readRate.xsl -->
> <html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <head>
>     <title>Conversion</title>
>   </head>
>   <body>
>     <h1>Conversion from Francs</h1>
>     <form action="{/in/flow/next}" method="post">
>       <label for="/exchange/rate">rate</label>
>       <input name="/exchange/rate" type="text" size="10"/>
> 
>       <label for="/exchange/currency">currency</label>
>       <input name="/exchange/currency" type="text" size="10"/>
> 
>       <input type="submit" name="submit" value="Continue"/>
>     </form>
>   </body>
> </html>
> 
> The main things to notice here are that the form/@action will be
> replaced with the current continuation from the input, in our case
> "conversion?next=rateTest-23454", and that the names in the form
> describe positions in an output XML document (this idea is taken from
> the XForms draft).

Hmmm, as a matter of personal taste, I'd rather pass the continuation
hashcode as a hidden parameter of the form, so that it doesn't "pollute"
the URI. Of course, we can't let the user take care of this, so we must
come up with something for this.

What about using XForms directly and provide our own transformations to
HTML forms that take care of everything? (they could even add
client-side javascript validation code)
 
> We also need a sitemap fragment to see how readRate.xsl is supposed to
> be called:
> 
> <!-- sitemap.xmap -->
> <map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
>   <map:pipelines>
> 
>     <map:pipeline>
>       <map:match pattern="**.html">
>         <map:generate type="flowMapGenerator"/>
>         <map:transform src="{1}.xsl"/>
>         <map:serialize/>
>       </map:match>
>     </map:pipeline>
> 
>   </map:pipelines>
> </map:sitemap>

The more I think about it, the more I get the impression that instead
of coming up with something new, we should enhance the sitemap
semantics to consider flows.

But it's something I still can't picture :/

> Here the "flowMapGenerator" feeds the current input to
> e.g. readRate.xsl.
> 
> More interesting things happen when the user has filled in the form
> and hits the submit button. This will create a request for
> "conversion?next=rateTest-23454", and the flowmap engine will respond
> in the following manner:
> 
> 1. Read the request parameters, in our case they might be:
> /exchange/rate=1.4551&/exchange/currency=SEK.
> 
> 2. Create an XML-document from the request parameters:
> <exchange>
>   <rate>1.4551</rate>
>   <currency>SEK</currency>
> </exchange>
> 
> 3. Resume the state that is associated with the
>    url, from the hashtable. It happens to be empty at this moment.
> 
> 4. Combine the restored state with the current input. This can and
>    needs to be done in many different ways, but for our current
>    example, an insert/replace operation, is enough, and results in:
> <in>
>   <exchange>
>     <rate>1.4551</rate>
>     <currency>SEK</currency>
>   </exchange>
> </in>
> 
> 5. And this is the new input to "conversion#rateTest", this time the
>    test will succeed, and as a result, the flowmap engine continues to
>    the next stage "conversion#francs", and sets the continuation to
>    the stage after:
> <in>
>   <exchange>
>     <rate>1.4551</rate>
>     <currency>SEK</currency>
>   </exchange>
>   <flow>
>     <next>conversion?next=result-54328</next>
>   </flow>
> </in>
> 
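Steps 1 to 4 above can be sketched as follows. This is only an illustration
under stated assumptions: the helper names params_to_elements and
merge_into_state are hypothetical, fields whose names do not start with "/"
(such as the submit button) are assumed to be ignored, and "insert/replace"
is taken to mean that a new top-level element replaces an old one with the
same name.

```python
import xml.etree.ElementTree as ET


def params_to_elements(params):
    """Build top-level elements from path-like names such as '/exchange/rate'."""
    holder = ET.Element("holder")  # temporary parent, discarded on return
    for name, value in params.items():
        if not name.startswith("/"):
            continue  # assumption: skip non-path fields like the submit button
        node = holder
        for step in name.strip("/").split("/"):
            child = node.find(step)
            if child is None:
                child = ET.SubElement(node, step)
            node = child
        node.text = value
    return list(holder)


def merge_into_state(state, new_elements):
    """Insert/replace: a new top-level element replaces an old one with the same tag."""
    for element in new_elements:
        old = state.find(element.tag)
        if old is not None:
            state.remove(old)
        state.append(element)
    return state


state = ET.Element("in")  # restored state, empty on the first request
params = {
    "/exchange/rate": "1.4551",
    "/exchange/currency": "SEK",
    "submit": "Continue",
}
merge_into_state(state, params_to_elements(params))
print(ET.tostring(state, encoding="unicode"))
```

Note that replacing whole top-level elements is the crudest possible merge;
as the text says, other situations would need other merge strategies.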
> So, here I will stop boring you with all the details. The last two
> pages look as follows:
> 
> <!-- readFrancs.xsl -->
> <html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <head>
>     <title>How many Francs?</title>
>   </head>
>   <body>
>     <h1>Converting into <xsl:value-of
> select="/in/exchange/currency"/></h1>
>     <form action="{/in/flow/next}" method="post">
>       <label for="/FRF">Francs</label>
>       <input name="/FRF" type="text" size="10"/>
> 
>       <input type="submit" name="submit" value="Continue"/>
>     </form>
>   </body>
> </html>
> 
> <!-- result.xsl -->
> <html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <head>
>     <title>Conversion result</title>
>   </head>
>   <body>
>     <h1>Conversion result</h1>
>     <p>
>       <xsl:value-of
>         select="concat('If 1 FRF corresponds to ',/in/exchange/rate,' ',
>                        /in/exchange/currency,' then ',/in/FRF,
>                        ' FRF correspond to ',/in/exchange/rate * /in/FRF,
>                        ' ',/in/exchange/currency,'.')"/>
>     </p>
>   </body>
> </html>
> 
> result.xsl will be called with input like this:
> 
> <in>
>   <exchange>
>     <rate>1.4551</rate>
>     <currency>SEK</currency>
>   </exchange>
>   <FRF>100</FRF>
>   <flow/>
> </in>

I see value in what you explain, but the use of XSLT as a variable
expansion language is, IMO, a little bit overkill since no
transformation is taking place.

What do you think about Velocity instead?
 
> ------------------------------------------------------------------------
> More constructions
> ------------------
> 
> There is certainly a need for more language constructs to make the
> flowmap usable. Some examples:
> 
> <fm:if test="XPath">
>   Do something
> </fm:if>
> 
> A possibility to have several possible continuations:
> 
> <fm:switch>
>   <fm:case link="l1" id="i1">
>     Do something
>   </fm:case>
>   <fm:case link="l2" id="i2">
>     Do something
>   </fm:case>
>   ...
>   <fm:case link="ln" id="in">
>     Do something
>   </fm:case>
> </fm:switch>
> 
> The switch statement will give the preceding statement the input:
> <in>
>   <flow>
>     <next>flow?l1=i1-76456</next>
>   </flow>
>   <flow>
>     <next>flow?l2=i2-09877</next>
>   </flow>
>   ...
>   <flow>
>     <next>flow?ln=in-65433</next>
>   </flow>
> </in>
> 
> Maybe the switch statement should be nestable with if statements, to
> make it possible to describe that some of the links are only
> available if certain conditions are fulfilled. An important example is
> to only show the links that one is allowed to traverse.
> 
> It is useful to call other flows:
> <fm:call src="cocoon://flow1"/>
> 
> To make flow calls possible, one needs to store a stack of
> continuations from the calling flows in the state.
> 
> Some kind of try/catch statement would probably simplify error
> handling.

I'm not that sure.

> -------------------------------------------------------------------
> State handling
> ---------------
> 
> The state handling described above is too primitive for many
> situations. It allows for the "what if" scenarios mentioned in the
> beginning (I guess that is far from obvious from what I have said,
> but the images in the beginning of the referred article explain the
> situation quite well). This flexibility comes at a high cost: each
> continuation that is created is associated with its own copy of the
> state. As long as the state is read-only, all the copies can share
> references to common parts and thus avoid most of the copying, but
> the approach still requires a lot of resources. Another problem is
> garbage handling: when should an unaccessed continuation be taken
> away? (Some ideas can be found in the referred articles.)
> 
> In situations where one updates a data source with a large state, a
> database for example, a "many worlds" behavior is not desirable at
> all. It would mean that the system has to handle several copies of
> the database, or that the database must be able to take care of
> multiple branches of the stored information.
> 
> This situation can be handled by restricting the creation of new
> continuations, so that one copy of a continuation is allowed for each
> stage in the flow.

Yes. Even if it is cool to use continuations to avoid the need to
check for back and cloning, I see very little value in letting the user
clone the window without having finished the previous flow.

I see no problem in forbidding this by restricting creation to a
single continuation.
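
One way to picture that restriction is sketched below, assuming
continuations are tracked per session and per stage. The class name
SingleContinuationStore and its methods are hypothetical, not an existing
Cocoon API: creating a new continuation for a stage simply invalidates the
earlier one for that stage.

```python
import secrets


class SingleContinuationStore:
    """Keeps at most one live continuation per (session, stage)."""

    def __init__(self):
        self._by_key = {}  # continuation id -> (session, stage, state)
        self._live = {}    # (session, stage) -> continuation id

    def create(self, session, stage, state):
        """Register a continuation, dropping any earlier one for this stage."""
        old = self._live.pop((session, stage), None)
        if old is not None:
            del self._by_key[old]  # the earlier copy can no longer be resumed
        key = secrets.token_hex(8)
        self._by_key[key] = (session, stage, state)
        self._live[(session, stage)] = key
        return key

    def resume(self, key):
        """Return the stored state, or None if the continuation was replaced."""
        entry = self._by_key.get(key)
        return None if entry is None else entry[2]


store = SingleContinuationStore()
first = store.create("session-1", "francs", {"rate": "1.4551"})
second = store.create("session-1", "francs", {"rate": "9.0"})
stale = store.resume(first)     # None: a cloned window cannot continue
current = store.resume(second)  # the only live continuation for this stage
```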

Anyway, very good food for thought, indeed.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------



---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org


Re: Flowmaps: the wrong approach

Posted by Torsten Curdt <tc...@dff.st>.
Let me join this discussion. I've been lurking now for a while
and didn't have the time to give back my RT or opinion on this...

First of all: can someone please enlighten me - Sorry, I don't
see a conceptual difference between a FSM and a Turing machine.
AFAIK they are pretty much the same!?

I always wanted to avoid page-centric logic (as Ovidiu postulates)
in my webapps - but real life has shown that forcing this is not
really useful at all. It often leads to an over-abstraction of the
problem that doesn't help a thing.

Consider a form where you collect data over a couple of pages
(I call this a multi-page form). If you have only an abstract
view of your logic you will not be able to derive a validation/
redirection behaviour. You simply have no idea which page to go
back to if validation fails - the information is missing in the abstract
description of the application flow.

I also doubt that building a webapp as an FSM or Turing machine is
really that hard. We are already using this approach and we are quite
happy about the gain in possibilities. I wrote a Flowmap
action to integrate smoothly with Cocoon. Not having one
big Flowmap but a couple of small ones makes it quite easy
to maintain.

Also take into account: even if the flowmap gets bigger
and becomes a PITA to maintain via emacs (or any of your
favorite editors ;) there is always the option to write
a visual tool!! Let's not limit ourselves because it will
be hard to maintain from the command line. I cannot think of
a much easier, faster and more natural way of developing than
having a visual tool describing a Turing machine! This also reminds
me of a computer science technique - I guess it was "data
flow diagrams".

Although I have to admit I haven't had the time yet
to look into the resources Ovidiu sent, I have quite
a bad feeling about continuations.
Isn't a flow with a lot of different choices/transitions
also a PITA with continuations?!

Well, I'll get back to continuations when I've done my homework ;)

> So, if in your context FSM programming means "page-centric programming",
> I totally agree with both of you that we should "get control back" when
> building webapps.

What do you mean by "get control back", Stefano?
--
Torsten




Re: Flowmaps: the wrong approach

Posted by Stefano Mazzocchi <st...@apache.org>.
Daniel Fagerstrom wrote:

> > On the other hand, I disagree that FSM equals goto-programming (in fact,
> > you are describing FSM down below, using the XML syntax :)
>
> Do you think so? In an FSM I can go from any state to any other state, but in
> the examples below I cannot, for example, go from a stage inside a loop to a
> stage before the loop. Furthermore, if we use a stack (which is only needed if
> we allow for general recursion or flowmaps that are separately compiled), that
> gives more "power" than an FSM has. Or am I missing something?

Well, we probably have a miscontextualization problem, so allow me
to take a step back:

Ovidiu stated that programming a FSM is a pain. You agreed on that and
used the goto-is-bad antipattern as an example of this.

But a FSM is nothing but another way to express a Turing machine.

So, by transitive logical rules, you guys are saying that programming a
Turing machine is a pain. That clearly can't be used as an argument,
given the million different ways that exist to program a Turing
machine.

Back to cocoon: Ovidiu said that we should follow those patterns on
continuations and avoid page-centric programming.

So, if in your context FSM programming means "page-centric programming",
I totally agree with both of you that we should "get control back" when
building webapps.

This goes along with anti-SoC: if a web application logic is scattered
around many files, the concern is separated where it should *not* be!

I fully support this notion of recompacting the webapp logic concern.

At the same time, the papers that Ovidiu showed us aggregate the
web-app logic concern back together but forget to keep the other
concerns separate. So, we are back to the old problem of web-app
frameworks that happen to be so programmer-oriented that only
programmers can work on them.

And this is not what we want.

We want a balanced system where concerns are well separated but a single
concern is not scattered around tens of files.

> > We *must* take into consideration try/fail by providing the ability to
> > update the form page if some data inserting error is made. This is vital
> > for webapp usability.
>
> Yes, I agree completely, error handling is _very_ important. I have been
> thinking a lot about handling of data insertion errors, and was about to
> throw an RT on it. I didn't, however, think about it from a "flowmap
> perspective" before, so I am still not clear about how I would like to
> integrate it.

Neither am I, but let's keep talking about it and maybe we'll get our
collaborative mind-waves to collapse into some solid quantum state. :)
 
> Anyway, here are the main lines:
> 
> * The webapp gets input from form fields, an upload or a SOAP-like call
> (are there more possibilities?). The input data consists of strings in
> parameters, a document containing some (hopefully) structured data (e.g.
> tab separated numbers) that we have decided to accept as input, or if we
> are really lucky, XML data.

It doesn't really matter: we can provide generators that adapt to the
request payload and format the result into a standardized markup that we
define. Along the lines of what you were proposing before.
 
> * This data should, rather sooner than later, be transformed to XML. After
> all we are in the _XML_-webapp business :) This seems to be an obvious task
> for a generator. (In this context I actually prefer the term deserializer,
> or something similar, as you mentioned, although did not advocate, in your
> original post in the "Data goes in data goes out" thread).

Yes, it's a 'deserializer' but it's not any different (at least
behaviorally) from a generator (which is a slightly more general term
anyway).
 
> * Now that the input data are in XML form we could, or IMHO should, have an
> XML-schema (or your favorite schema language) to validate the input data
> against. We don't want to put data in the wrong format in our database or
> in our Java programs, do we?

Is this calling for a ValidatingTransformer, or do we need some specific
sitemap semantics for validation of pipeline stages?

I ask because I can't picture what happens if we have a transformer and
the input is invalid. Sounds like a transformer is not enough as a
behavioral interface for this task, but maybe I'm wrong.
 
> * So what is the result of the validation? There seem to be three main
> cases:
> + The input is valid - let it flow to the next step in the pipe.
> + The input has invalid structure - this means that we have a fatal input
> error. An error that it is hard to recover from or to report anything
> sensible from. If we have designed the client side, a structural error means
> that we have a bug in our system or that someone tries to post data without
> using our client. In both cases we cannot do much more than logging what
> is relevant, and report a fatal error to the user. If we offer a webservice
> or allow for uploads we probably have to work a little bit harder on our
> feedback to the user.

Exactly. It would be dead simple to have the transformer generate some
markup explaining the errors, but this is *far* from being ideal from a
user-friendly point of view.

> + The input has valid structure but invalid data types in the text fields or
> in the attributes. This is the case you asked for above. This case is more
> complicated: we have to give the user detailed feedback on what's wrong and a
> possibility to update the faulty data fields. Two possible ways for the
> validator to inform the rest of the system about what went wrong are:
> a list of (location path, error message) pairs. This can describe all kinds of
> field errors, but it is not obvious to me how to make use of the
> information. Another possibility is to only allow user input within elements
> and not in attributes; in this case the input XML can be annotated with
> error attributes in the faulty elements.

Annotating is not a problem even for attributes since we can use
namespaced attributes for that.
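
A small sketch of how the two representations could be combined: take the
list of (location path, error message) pairs and write each message onto the
faulty element as a namespaced attribute. The namespace URI, the "err"
prefix and the annotate helper are all made up for illustration, and the
paths use the limited XPath subset that Python's ElementTree understands.

```python
import xml.etree.ElementTree as ET

ERR_NS = "http://example.org/form-errors"  # hypothetical namespace URI
ET.register_namespace("err", ERR_NS)


def annotate(root, errors):
    """Attach each (path, message) pair as a namespaced attribute at `path`."""
    for path, message in errors:
        target = root.find(path)  # path is relative to the root element
        if target is not None:
            target.set(f"{{{ERR_NS}}}message", message)
    return root


doc = ET.fromstring(
    "<exchange><rate>abc</rate><currency>SEK</currency></exchange>"
)
annotate(doc, [("rate", "rate must be a positive number")])
print(ET.tostring(doc, encoding="unicode"))
```

Since the annotation lives in its own namespace, a later stylesheet can pick
it up to redisplay the form, while components that do not know about it can
ignore it.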

The problem is controlling the sequence of components in the pipeline
when a mistake is found. This calls for a "conditional" component.

Hmmm, thinking out loud: can't we throw "selectors" into this?

hmmm...
 
> I think that the validator should be a transformer: it takes XML as input
> and, except for fatal errors, emits XML. It could be a part of the
> generators, but that probably leads to an overly complex design of the
> generators. Xerces2 is actually built as a pipeline with pluggable components
> (not Avalon components however), where the pipeline can consist of components
> like a scanner, a DTD validator, an XML-Schema validator etc, and where the
> pipeline components communicate with XNI events, which are like
> SAX events but somewhat more low level. After having browsed the relevant
> parts of the Xerces2 source code I believe that it should be possible to
> reuse some of the components to build an "error annotating" validator, but I
> am not completely certain yet.

I agree, the design of Xerces2 is very elegant and very useful, but I
would point to the Relax NG implementation, which is designed exactly for
that: validating an XML infoset at any processing time.
 
> * More complicated validation that check e.g. dependences between fields
> could be based on the "bind" elements from XForms and be put in another
> transformer.

Ah, double validation, sounds like a cool idea:

 - a generator produces the SAX events from request parameters using
some markup we define (hey, could also be useful for multi-part payloads
and for file uploads!) and using their own namespace.

 - a ??? validates the markup structure and datatypes
 
 - another ??? validates the content and augments the infoset with
any error information

Now let's come up with an idea for ???:

 - it can't be a generator because it has SAX input
 - it can't be a transformer because it must "route" the output
 - it can't be a selector because it doesn't work on the stream

Gosh, this seems to imply that we need another component or that we have
to extend some component's behavior.

The new component should be a hybrid between a selector and a
transformer: a selector that is also capable of transforming what passes
thru.

We could extend selector functionality in that direction, let's make an
example:

 <map:match pattern="form">
  <map:generate type="form"/>
  <map:select type="relax" src="form.relax">
   <map:when test="invalid">
    <map:transform src="structure-errors.xsl"/>
    <map:serialize/>
   </map:when>
   <map:otherwise>
    <map:select type="xform-validator" src="form.xml">
     <map:when test="invalid">
      <map:transform src="form-errors.xsl"/>
      <map:serialize/>
     </map:when>
     <map:otherwise>
      ...
     </map:otherwise>
    </map:select>
   </map:otherwise>
  </map:select>
 </map:match>

where '...' further processes the input (that part is still foggy for me,
unfortunately).
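
The routing that the nested selectors above express can be sketched in a few
lines. This is only a model of the control flow, not of a real Cocoon
component: route, has_rate and rate_is_positive are hypothetical stand-ins
for the Relax NG and XForms validators, and the "document" is a plain dict
instead of a SAX stream.

```python
def route(document, structure_check, content_check):
    """Pick a pipeline branch the way the nested selectors would."""
    structure_errors = structure_check(document)
    if structure_errors:
        return "structure-errors.xsl", structure_errors
    content_errors = content_check(document)
    if content_errors:
        return "form-errors.xsl", content_errors
    return "continue", []  # the '...' branch: process the valid input


def has_rate(document):
    """Structure check: the rate field must be present at all."""
    return [] if "rate" in document else ["missing element: rate"]


def rate_is_positive(document):
    """Content check: the rate must parse as a number greater than zero."""
    try:
        ok = float(document["rate"]) > 0
    except ValueError:
        ok = False
    return [] if ok else ["rate must be a positive number"]


valid = route({"rate": "1.4551"}, has_rate, rate_is_positive)
broken = route({}, has_rate, rate_is_positive)
mistyped = route({"rate": "abc"}, has_rate, rate_is_positive)
```

The content check only runs once the structure check has passed, which is
why rate_is_positive can assume that the field exists.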

What do you think?

> So, now comes the crux: the steps this far seem to be quite naturally
> described in terms of a pipeline. But now we have to make a choice on where
> to pipe the results: if the validation succeeded the results should be piped
> to the "DoIt"-transformer and if we got field errors the results should be
> piped to a "partly filled in form with error messages"-transformer. This can
> definitely not be done in the same pipeline.
> 
> AFAIK, but I may have missed something important, if you want to build
> something like what I outlined above in Cocoon today you have to use a
> number of actions instead of generators and transformers. That obscures the
> pipeline aspect of serialization and validation. As an example, the
> "StreamGenerator" would be useful as a part of an XForms handling pipe,
> but if I decide to validate, I would have to build a "StreamAction" that
> places its result in the model or in the session in some place that the
> "ValidatorAction" has to know about.
> 
> So what I would like to have instead (and I apologize if I have missed
> important aspects of what you can do with what's currently available in
> Cocoon) is something like the following:
> 
> * An "input pipeline", like the one I described above, that is required to be
> side effect free and only dependent on input. The output of the
> input pipeline is stored in a data structure (a DOM tree I guess). Ok, that
> hurts, but I cannot see any choice: we can only know about the result of a
> validation after having seen all the data, and till then we have to store it
> somewhere.
> 
> * Now we can have flow control (a selector maybe?) based on the result of
> the validation, and also things like XPath expressions applied on the input
> and on the global state of the system.

Hey, we're having the same ideas here (sorry, but I normally don't read
the entire message before replying, to avoid being influenced by others)
very cool :)
 
> * Based on the selection in the flow control, the input structure is
> unpacked to SAX events again in the chosen pipeline. And for this step we
> have a "real" pipeline again that is free to perform any side effects it
> wants to.
> 
> I think that the outlined concepts should integrate well with a continuation
> based flowmap engine, but I am starting to get too tired to be able to
> explain how.
> 
> Ok, I seem to have written my RT anyway, it was not my intention :)

Oh, well, I think it didn't hurt :) Quite the opposite, I would say.
 
> > Hmmm, as a matter of personal taste, I'd rather pass the continuation
> > hashcode as a hidden parameter of the form, so that it doesn't "pollute"
> > the URI. Of course, we can't let the user take care of this, so we must
> > come up with something for this.
> 
> Agreed, I thought that there was a need for one hashcode for each
> continuation on a page with multiple links, but they are anyhow
> distinguished by their URI:s, and they are even connected to the same state.
> 
> To clarify:
> * For stateless pages, we don't need any hashcode.
> * For non copyable continuations, the hashcode will, AFAIK, correspond to
> the session id, so we can let the session handling system take care of it
> instead.
> * For copyable continuations we need a new hashcode for each page.
> 
> > What about using XForms directly and provide our own transformations to
> > HTML forms that take care of everything? (they could even add
> > client-side javascript validation code)
>
> Yes, absolutely, I have written such a system together with a colleague. It is
> based on a small subset of an early draft of XForms. From the
> XForms document we create an XSLT document (actually by using another
> XSLT document :) ) that generates a populated HTML form when it gets
> XML instance data as input. It also uses the kind of error
> attributes that I mentioned in my "embedded RT" above.
> 
> I hope that I will find time to refine and update our things so that I can
> submit them.

Uh, that'd be great.

> > I see value in what you explain, but the use of XSLT as a variable
> > expansion language is, IMO, a little bit overkill since no
> > transformation is taking place.
> >
> > What do you think about Velocity instead?
>
> Yes, Velocity is probably better for this kind of thing, at least if it can
> use XPaths (I have just browsed the documentation so I don't know much
> about it). For my own case I happen to be an XSLT fanatic and use it for
> all kinds of stuff where other languages might be a better choice ;)

Well, choice is the key: if you like XSLT, great, but let's keep the
mechanism that performs variable substitution pluggable so that
everybody chooses what they like the best and nobody complains :)

> > > This situation can be handled by restricting the creation of new
> > > continuations, so that one copy of a continuation is allowed for each
> > > stage in the flow.
> >
> > Yes. Even if it is cool to use continuations to avoid the need to
> > check for back and cloning, I see very little value in letting the user
> > clone the window without having finished the previous flow.
> >
> > I see no problem in forbidding this by restricting the creation of a
> > single continuation.
>
> I got carried away by the article, cloned pages are definitely not the main
> use case for flowmaps, although we actually use it (but based on other
> methods) in a data warehouse application at the company that I work for.

Uh, really? What for? I couldn't really figure out a useful use for these
(other than multi-threaded internet browsing, which is what I normally do).
But I avoid having two browser windows on the same webapp like the plague,
knowing the complexity of handling such a thing, especially for
cookie-based applications.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------



---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org


SV: Flowmaps: the wrong approach

Posted by Daniel Fagerstrom <da...@swipnet.se>.
Stefano Mazzocchi wrote:
> I've taken the weekend to learn Scheme, now I know what we are talking
> about :)
>
> Daniel Fagerstrom wrote:
> >
> > Ovidiu Predescu wrote:
> > <snip/>
> > > I now believe we should have a system centered around logic, not
> > > around states and transitions.
> > I agree completely, IMHO, writing FSMs is like goto-programming, very
> > small systems are easy to understand, but as soon as they grow, they
> > easily become a maintenance nightmare.
>
> This is the old tune against GOTOs and I totally agree on that.
>
> On the other hand, I disagree that FSM equals goto-programming (in fact,
> you are describing FSM down below, using the XML syntax :)
Do you think so? In an FSM I can go from any state to any other state, but in
the examples below I cannot, e.g., go from a stage inside a loop to a stage
before the loop. Furthermore, if we use a stack (which is only needed if we
allow for general recursion or flowmaps that are separately compiled), that
gives more "power" than an FSM has. Or am I missing something?

> > > The logic should be expressed in a
> > > language that supports continuations. The logic should drive what
> > > pages are presented to the user. These pages could be expressed in an
> > > XML markup language with appropriate elements for extracting data
> > > previously created in the program. These XML pages could be then
> > > processed through a pipeline, similarly with how they are processed
> > > today in the sitemap. However since incoming URLs are handled directly
> > > by the logic, there's no need for matchers in the sitemap.
> > >
> > >
> > >  HTTP request                         transformations
> > > --------------> logic -----> XML page -----------------> HTML/WML/... page
> > >
> > >
> > > The generated pages contain URLs that point back to continuation
> > > points in the logic.
> > >
> > > The biggest problem is the fact that the logic needs to be expressed
> > > in a language that supports continuations. Since most people don't like
> > > the Lisp syntax, a language that abstracts out the continuations in
> > > higher level abstractions like send-response could be developed. This
> > > can probably be done by extending a familiar language, like
> > > Javascript, with these concepts.
> > <snip/>
> > > This is a huge paradigm shift from what we have right now, but I
> > > believe leads to easier ways to write Web applications. They become
> > > more like usual programs, instead of the complex beasts that we have
> > > today, with state scattered all over the place in the code.
> > >
> > > And yes, please read the papers I pointed to in my previous email, to
> > > understand what the heck I'm talking about. Here they are for your
> > > convenience:
> > >
> > >    http://youpou.lip6.fr/queinnec/Papers/webcont.ps.gz
> > >    http://www.cs.rice.edu/CS/PLT/Publications/esop2001-gkvf.ps.gz
> >
> > I happen to like Lisp as well as its syntax ;)
>
> Gosh, can't say the same, but it's not important at the moment.
>
> > Still I wonder if it
> > would not be possible to continue in the great cocoon tradition of
> > SoC, and find a convenient description of webapp flow, without going
> > all the way to a full high level programming language.
>
> Bingo! that's the point. The paper wants to "get control back" since
> page-directed programming stole the control from the programmer.
>
> The paper is right on many things but forgets about taking SoC into
> consideration.
>
> Turning a sitemap into a logic-oriented description would be equally bad
> since web programming is a mixed form of "declarative" (page-oriented)
> and "procedural" (logic-driven).
>
> My personal opinion is that we should have both at the same time in
> order to keep Cocoon elegance.
>
> > Now that you
> > have succeeded in exorcizing the mix of programming language
> > constructs and tags from JSP etc, it seems like a pity to let this mix
> > in again.
>
> Exactly!
>
> > Anyhow, inspired by the interesting discussion and the articles that
> > you referred to, I started to think about how to use these concepts in
> > cocoon. If possible, without having to do a "huge paradigm shift".
>
> Same here!
>
> > --------------------------------------------------------------------
> > Flowmap
> > -------
> >
> > First, to make it more concrete, I will try to express the main
> > example from, http://youpou.lip6.fr/queinnec/Papers/webcont.ps.gz, in
> > terms of xslt, cocoon components, a sitemap and a flowmap. For those
> > of you who have not read that article yet, the main example is a
> > small webapp:
> >
> > 1. On the first page it asks for the conversion rate between French
> > Francs and another currency.
> > 2. Then it asks for an amount of Francs.
> > 3. And on the third page it returns the result.
> >
> > One of the coolest things about the implementation in the article, is
> > that it can take care of multiple questions at once. If you browse
> > through the three steps above, and then click on the "new window"
> > button in your browser, you can then go back to the first or second
> > screen, and fill in new data, without affecting the result in the
> > other browser, (even if you use the refresh button in it). This
> > behavior is very useful for "what if" kind of questions, where one can
> > evaluate several alternative scenarios in a convenient way.
>
> Yes, this is the really cool thing about their thesis of binding resources
> to program continuations.
>
> > Ok, here we go!
> > We start with a high level description of the application flow:
> >
> > <!-- flowMap.xml -->
> > <fm:flowMap xmlns:fm="...">
> >   <fm:flow url="conversion">
> >     <fm:until test="/in/exchange/rate &gt; 0" id="rateTest">
> >       <fm:show src="cocoon://readRate.html" id="rate"/>
> >     </fm:until>
> >     <fm:show src="cocoon://readFrancs.html" id="francs"/>
> >     <fm:show src="cocoon://result.html" id="result"/>
> >   </fm:flow>
> > </fm:flowMap>
> >
> > (The "id" attributes are not necessary and are only used for making
> > reference easier.) The flow map is either a part of the sitemap or
> > mounted from it. It will be executed by a "flowmap engine" on a
> > request for "cocoon://conversion". The children of "fm:flow" are
> > executed in sequence. Each child works as a pipeline. The flowmap
> > engine feeds the pipeline with an xml-document that has "in" as root
> > element. This document contains two parts: a continuation, that is a
> > url to the next stage (or stages) in the flowmap, and a description of
> > the current state.
> >
> > The input to the first stage, "conversion#rateTest" could look like
> > this:
> >
> > <in/>
> >
> > Now, the first stage is an until-statement (a mistake from a
> > pedagogical point of view, I realize :) ), the test - an XPath
> > expression, will obviously not succeed on the current input
> > data. Therefore the body of the until-statement is executed. But
> > before we can do that we have to set the continuation.
>
> I like very much what I see and I think you are onto something,
Thank you:)

> but
> there is a problem: the iterative step will fail to provide feedback on
> the error.
>
> We *must* take into consideration try/fail by providing the ability to
> update the form page if some data inserting error is made. This is vital
> for webapp usability.
Yes, I agree completely, error handling is _very_ important. I have been
thinking a lot about handling of data insertion errors, and was about to
throw an RT on it. I didn't, however, think about it from a "flowmap
perspective" before, so I am still not clear about how I would like to
integrate it.

Anyway, here are the main lines:

* The webapp gets input from form fields, an upload or a SOAP-like call
(are there more possibilities?). The input data consists of strings in
parameters, a document containing some (hopefully) structured data (e.g.
tab separated numbers) that we have decided to accept as input, or, if we
are really lucky, XML-data.

* This data should, rather sooner than later, be transformed to XML. After
all we are in the _XML_-webapp business :) This seems to be an obvious task
for a generator. (In this context I actually prefer the term deserializer,
or something similar, as you mentioned, although not advocated, in your
original post in the "Data goes in data goes out" thread).

* Now that the input data are in xml-form we could, or IMHO should, have an
XML-schema (or your favorite schema language) to validate the input data
against. We don't want to put data in the wrong format in our database or
in our java programs, do we?

* So what is the result of the validation? There seem to be three main
cases:
+ The input is valid - let it flow to the next step in the pipe.
+ The input has invalid structure - this means that we have a fatal input
error, an error that it is hard to recover from or to report anything
sensible about. If we have designed the client side, a structural error means
that we have a bug in our system or that someone tries to post data without
using our client. In both cases we cannot do much more than log what
is relevant, and report a fatal error to the user. If we offer a webservice
or allow for uploads we probably have to work a little bit harder on our
feedback to the user.
+ The input has valid structure but invalid data types in the text fields or
in the attributes. This is the case you asked for above. This case is more
complicated: we have to give the user detailed feedback on what's wrong and a
possibility to update the faulty data fields. Two possible ways for the
validator to inform the rest of the system about what went wrong are:
  - A list of (location path, error message) pairs. This can describe all
    kinds of field errors, but it is not obvious to me how to make use of
    the information.
  - Only allow user input within elements and not in attributes. In this
    case the input xml can be annotated with error attributes in the faulty
    elements.

I think that the validator should be a transformer: it takes xml as input
and, except for fatal errors, emits xml. It could be a part of the
generators, but that probably leads to an overly complex design of the
generators. Xerces2 is actually built as a pipeline with pluggable components
(not Avalon components however), where the pipeline can consist of components
like a scanner, a DTD validator, an XML-Schema validator etc, and where the
pipeline components communicate with XNI events, that are like
SAX-events but somewhat more low level. After having browsed the relevant
parts of the Xerces2 source code I believe that it should be possible to
reuse some of the components to build an "error annotating" validator, but I
am not completely certain yet.

* More complicated validation that checks e.g. dependencies between fields
could be based on the "bind" elements from XForms and be put in another
transformer.
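
To make the deserialize-then-validate steps above a bit more concrete, here
is a minimal sketch in Python (chosen only for brevity, it is not Cocoon
code). The names params_to_xml, validate and RULES are made up for
illustration, and using an "error" attribute is just one assumed way of
annotating the faulty elements. It builds an XML tree from parameters whose
names are simple paths, and then reports one of the three outcomes listed
above.

```python
import xml.etree.ElementTree as ET


def params_to_xml(params, root_tag="in"):
    """Build an XML tree from request parameters whose names are simple paths."""
    root = ET.Element(root_tag)
    for path, value in params.items():
        node = root
        for step in path.strip("/").split("/"):
            child = node.find(step)
            if child is None:
                child = ET.SubElement(node, step)
            node = child
        node.text = value
    return root


def is_positive_number(text):
    """True if text parses as a number greater than zero."""
    try:
        return float(text) > 0
    except (TypeError, ValueError):
        return False


def is_currency_code(text):
    """True for a three-letter code such as SEK (an assumed rule)."""
    return bool(text) and len(text) == 3 and text.isalpha()


# Hypothetical field rules: location path -> (check, error message).
RULES = {
    "exchange/rate": (is_positive_number, "rate must be a positive number"),
    "exchange/currency": (is_currency_code, "currency must be a three-letter code"),
}


def validate(root, rules=RULES):
    """Return 'valid', 'field-errors' or 'fatal'; annotate faulty elements."""
    status = "valid"
    for path, (check, message) in rules.items():
        element = root.find(path)
        if element is None:
            return "fatal"  # structural error: an expected element is missing
        if not check(element.text):
            element.set("error", message)  # annotate the faulty element in place
            status = "field-errors"
    return status
```

With the parameters /exchange/rate=abc and /exchange/currency=SEK, validate
reports "field-errors" and only the rate element gets an error attribute, so
a form-rendering step could show the message next to that field.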

So, now comes the crux: the steps this far seem to be quite naturally
described in terms of a pipeline. But now we have to make a choice on where
to pipe the results: if the validation succeeded the results should be piped
to the "DoIt"-transformer and if we got field errors the results should be
piped to a "partly filled in form with error messages"-transformer. This can
definitely not be done in the same pipeline.

AFAIK, but I may have missed something important, if you want to build
something like what I outlined above in Cocoon today you have to use a
number of actions instead of generators and transformers. That obscures the
pipeline aspect of serialization and validation. As an example, the
"StreamGenerator" would be useful as a part of an XForms handling pipe,
but if I decide to validate, I would have to build a "StreamAction" that
places its result in the model or in the session in some place that the
"ValidatorAction" has to know about.

So what I would like to have instead (and I apologize if I have missed
important aspects of what you can do with what's currently available in
Cocoon) is something like the following:

* An "input pipeline", like the one I described above, that is required to be
side effect free and only dependent on input. The output of the
input-pipeline is stored in a data structure (a DOM-tree I guess). Ok, that
hurts, but I cannot see any choice: we can only know about the result of a
validation after having seen all the data, and till then we have to store it
somewhere.

* Now we can have flow control (a selector maybe?) based on the result of
the validation, and also things like XPath-expressions applied on the input,
and on the global state of the system.

* Based on the selection in the flow control, the input structure is
unpacked to SAX-events again in the chosen pipeline. And for this step we
have a "real" pipeline again that is free to perform any side effects it
wants to.
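
The three steps above could be sketched roughly like this, again in Python
and only as an illustration under my own assumptions. The function names, the
pipeline labels and the "error" attribute are all hypothetical, and a plain
callback stands in for a SAX content handler. The point is just the shape:
buffer first, select second, replay as events last.

```python
import xml.etree.ElementTree as ET


def run_input_pipeline(source_xml, stages):
    """Run side-effect-free stages and keep the result buffered as a tree."""
    tree = ET.fromstring(source_xml)
    for stage in stages:
        tree = stage(tree)
    return tree


def mark_errors(tree):
    """Example validation stage: flag a rate that is not a positive number."""
    rate = tree.find("exchange/rate")
    if rate is not None:
        try:
            ok = float(rate.text) > 0
        except (TypeError, ValueError):
            ok = False
        if not ok:
            rate.set("error", "rate must be a positive number")
    return tree


def select_pipeline(tree):
    """Flow control: pick the next pipeline from the buffered result."""
    if tree.find("exchange/rate") is None:
        return "fatal"
    has_errors = any(element.get("error") for element in tree.iter())
    return "show-form-again" if has_errors else "do-it"


def replay(tree, handler):
    """Unpack the buffered tree into start/text/end events for the handler."""
    def walk(element):
        handler("start", element.tag)
        if element.text and element.text.strip():
            handler("text", element.text.strip())
        for child in element:
            walk(child)
        handler("end", element.tag)
    walk(tree)
```

The buffering is the price for being able to decide after validation: nothing
is replayed into a pipeline with side effects until select_pipeline has seen
the whole input.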

I think that the outlined concepts should integrate well with a continuation
based flowmap engine, but I am starting to get too tired to be able to
explain how.

Ok, I seem to have written my RT anyway, it was not my intention :)

>
> > The next stage
> > after "conversion#rate" is "conversion#rateTest". We represent this
> > situation by creating the new input:
> >
> > <in>
> >   <flow>
> >     <next>conversion?next=rateTest-23454</next>
> >   </flow>
> > </in>
> >
> > Here the url "conversion?next=rateTest-23454" consists of two parts, one
> > that identifies the next stage to go to in the flowmap and one "23454"
> > that uniquely identifies the current state, which so far happens to
> > be empty. The current state is stored in a hash table with the url as
> > a key.
> >
> > We need an implementation of "conversion#rate":
> >
> > <!-- readRate.xsl -->
> > <html xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> >   <head>
> >     <title>Conversion</title>
> >   </head>
> >   <body>
> >     <h1>Conversion from Francs</h1>
> >     <form action="{/in/flow/next}" method="post">
> >       <label for="/exchange/rate">rate</label>
> >       <input name="/exchange/rate" type="text" size="10"/>
> >
> >       <label for="/exchange/currency">currency</label>
> >       <input name="/exchange/currency" type="text" size="10"/>
> >
> >       <input type="submit" name="submit" value="Continue"/>
> >     </form>
> >   </body>
> > </html>
> >
> > The main things to notice here are that the form/@action will be
> > replaced with the current continuation from the input, in our case
> > "conversion?next=rateTest-23454", and that the names in the form
> > describe positions in an output xml-document (this idea is taken from
> > the XForms draft).
>
> Hmmm, as a personal taste, I'd rather pass the continuation hashcode as
> a hidden parameter of the form, so that it doesn't "pollute" the URI. Of
> course, we can't let the user take care of this so we must come out with
> something for this.
Agreed, I thought that there was a need for one hashcode for each
continuation on a page with multiple links, but they are anyhow
distinguished by their URIs, and they are even connected to the same state.

To clarify:
* For stateless pages, we don't need any hashcode.
* For non-copyable continuations, the hashcode will, AFAIK, correspond to
the session id, so we can let the session handling system take care of it
instead.
* For copyable continuations we need a new hashcode for each page.
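
As a rough illustration of these three cases, here is a small Python sketch.
ContinuationStore and its method names are hypothetical, and deriving the
non-copyable key directly from the session id is only my assumption of how
the session handling could take over. State is copied on save and resume so
that copyable continuations stay independent of each other.

```python
import copy
import secrets


class ContinuationStore:
    """Hypothetical store mapping continuation keys to saved flow state."""

    def __init__(self):
        self._states = {}

    def save(self, kind, stage, state, session_id=None):
        """Save state for a stage and return the key to embed in the page."""
        if kind == "stateless":
            return None  # nothing to remember, so no hashcode is needed
        if kind == "non-copyable":
            key = f"{stage}-{session_id}"  # one slot per session and stage
        elif kind == "copyable":
            key = f"{stage}-{secrets.token_hex(8)}"  # fresh key for each page
        else:
            raise ValueError(f"unknown continuation kind: {kind}")
        self._states[key] = copy.deepcopy(state)
        return key

    def resume(self, key):
        """Return a private copy of the state saved under key."""
        return copy.deepcopy(self._states[key])
```

Two copyable saves for the same stage give two different keys, which is what
makes the "what if" browsing possible, while two non-copyable saves in the
same session overwrite each other.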

> What about using XForms directly and provide our own transformations to
> HTML forms that take care of everything? (they could even add
> client-side javascript validation code)
Yes, absolutely, I have written such a system together with a colleague. It is
based on a small subset of an early draft of XForms. From the
XForms-document we create an XSLT-document (actually by using another
XSLT-document :) ) that generates a populated HTML form when it gets
XML-instance data as input. It also uses the kind of error
attributes that I mentioned in my "embedded RT" above.

I hope that I will find time to refine and update our things so that I can
submit them.

>
> > We also need a sitemap fragment to see how readRate.xsl is supposed to
> > be called:
> >
> > <!-- sitemap.xmap -->
> > <map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
> >   <map:pipelines>
> >
> >     <map:pipeline>
> >       <map:match pattern="**.html">
> >         <map:generate type="flowMapGenerator"/>
> >         <map:transform src="{1}.xsl"/>
> >         <map:serialize/>
> >       </map:match>
> >     </map:pipeline>
> >
> >   </map:pipelines>
> > </map:sitemap>
>
> The more I think about it, the more I get the perception that instead
> of coming up with something anew, we should enhance the sitemap
> semantics to consider flows.
>
> But it's something I still can't picture :/
>
> > Here the "flowMapGenerator" feeds the current input to
> > e.g. readRate.xsl.
> >
> > More interesting things happen when the user has filled in the form
> > and hits the submit button, this will create a request for
> > "conversion?next=rateTest-23454", and the flowmap-engine will respond
> > in the following manner:
> >
> > 1. Read the request parameters, in our case they might be:
> > /exchange/rate=1.4551&/exchange/currency=SEK.
> >
> > 2. Create an XML-document from the request parameters:
> > <exchange>
> >   <rate>1.4551</rate>
> >   <currency>SEK</currency>
> > </exchange>
> >
> > 3. Resume the state that is associated with the
> >    url, from the hashtable. It happens to be empty at this moment.
> >
> > 4. Combine the restored state with the current input. This can and
> >    needs to be done in many different ways, but for our current
> >    example, an insert/replace operation, is enough, and results in:
> > <in>
> >   <exchange>
> >     <rate>1.4551</rate>
> >     <currency>SEK</currency>
> >   </exchange>
> > </in>
> >
> > 5. And this is the new input to "conversion#rateTest", this time the
> >    test will succeed, and as a result, the flowmap engine continues to
> >    the next stage "conversion#francs", and sets the continuation to
> >    the stage after:
> > <in>
> >   <exchange>
> >     <rate>1.4551</rate>
> >     <currency>SEK</currency>
> >   </exchange>
> >   <flow>
> >     <next>conversion?next=result-54328</next>
> >   </flow>
> > </in>
> >
> > So, here I will stop boring you with all the details. The last two
> > pages look as follows:
> >
> > <!-- readFrancs.xsl -->
> > <html xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> >   <head>
> >     <title>How many Francs?</title>
> >   </head>
> >   <body>
> >     <h1>Converting into <xsl:value-of
> > select="/in/exchange/currency"/></h1>
> >     <form action="{/in/flow/next}" method="post">
> >       <label for="/FRF">Francs</label>
> >       <input name="/FRF" type="text" size="10"/>
> >
> >       <input type="submit" name="submit" value="Continue"/>
> >     </form>
> >   </body>
> > </html>
> >
> > <!-- result.xsl -->
> > <html xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> >   <head>
> >     <title>Conversion result</title>
> >   </head>
> >   <body>
> >     <h1>Conversion result</h1>
> >     <p>
> >       <xsl:value-of
> >         select="concat('If 1 FRF corresponds to ',/in/exchange/rate,' ',
> >                        /in/exchange/currency,' then ',/in/FRF,
> >                        ' FRF correspond to ',/in/exchange/rate * /in/FRF,
> >                        ' ',/in/exchange/currency,'.')"/>
> >     </p>
> >   </body>
> > </html>
> >
> > result.xsl will be called with input like this:
> >
> > <in>
> >   <exchange>
> >     <rate>1.4551</rate>
> >     <currency>SEK</currency>
> >   </exchange>
> >   <FRF>100</FRF>
> >   <flow/>
> > </in>
>
> I see value in what you explain, but the use of XSLT as a variable
> expansion language is, IMO, a little bit overkill since no
> transformation is taking place.
>
> What do you think about Velocity instead?
Yes, Velocity is probably better for this kind of thing, at least if it can
use XPaths (I have just browsed the documentation so I don't know much
about it). For my own case I happen to be an XSLT-fanatic and use it for
all kinds of stuff where other languages might be a better choice ;)


>
> > ------------------------------------------------------------------------
> > More constructions
> > ------------------
> >
> > There is certainly a need for more language constructions to make the
> > flowmap usable, some examples:
> >
> > <fm:if test="XPath">
> >   Do something
> > </fm:if>
> >
> > A possibility to have several possible continuations:
> >
> > <fm:switch>
> >   <fm:case link="l1" id="i1">
> >     Do something
> >   </fm:case>
> >   <fm:case link="l2" id="i2">
> >     Do something
> >   </fm:case>
> >   ...
> >   <fm:case link="ln" id="in">
> >     Do something
> >   </fm:case>
> > </fm:switch>
> >
> > The switch statement will give the preceding statement the input:
> > <in>
> >   <flow>
> >     <next>flow?l1=i1-76456</next>
> >   </flow>
> >   <flow>
> >     <next>flow?l2=i2-09877</next>
> >   </flow>
> >   ...
> >   <flow>
> >     <next>flow?ln=in-65433</next>
> >   </flow>
> > </in>
> >
> > Maybe the switch statement should be nestable with if statements, to
> > make it possible to describe that some of the links are only
> > available if certain conditions are fulfilled. An important example is
> > to only show the links that one is allowed to traverse.
> >
> > It is useful to call other flows:
> > <fm:call src="cocoon://flow1"/>
> >
> > To make flow calls possible one needs to store a stack of
> > continuations from the calling flows, in the state.
> >
> > Some kind of try, catch statement would probably simplify error
> > handling.
>
> I'm not that sure.
>
> > -------------------------------------------------------------------
> > State handling
> > ---------------
> >
> > The state handling described above is too primitive for many
> > situations. It allows for the "what if"-scenarios mentioned in the
> > beginning (I guess that is far from obvious from what I have said,
> > but the images in the beginning of the referred article explain the
> > situation quite well). This flexibility comes with a high cost: each
> > continuation that is created is associated with its own copy of the
> > state. As long as the state is read-only all the copies can have
> > references to common parts, and thus take away most of the copying,
> > still the approach requires a lot of resources. Another problem is
> > garbage handling: when should an unaccessed continuation be taken
> > away? (some ideas can be found in the referred articles).
> >
> > In situations where one updates a data source with a large state, a
> > database, for example, a "many world"-behavior is not desirable at
> > all. It would mean that the system has to handle several copies of
> > the database, or that the database must be able to take care of
> > multiple branches of the stored information.
> >
> > This situation can be handled by restricting the creation of new
> > continuations, so that one copy of a continuation is allowed for each
> > stage in the flow.
>
> Yes. Even if it is cool to use continuations to avoid the need to
> check for back and cloning, I see very little value in letting the user
> clone the window without having finished the previous flow.
>
> I see no problem in forbidding this by restricting the creation of a
> single continuation.
I got carried away by the article, cloned pages are definitely not the main
use case for flowmaps, although we actually use it (but based on other
methods) in a data warehouse application at the company that I work for.

>
> Anyway, very good food for thought, indeed.
Thanks for your insightful comments.

/Daniel


