Posted to dev@cocoon.apache.org by Berin Loritsch <bl...@apache.org> on 2001/12/07 22:05:27 UTC

[RT] Managing Flow and Resources

There has been a lot of talk about Finite State Machine approaches, Staged Event
Driven Architecture (a refinement of the same), Scheme, and more.  When I saw the
post regarding Scheme, it jogged my memory (a few days later).  When I first started
working for my company, we were developing a really cool tool that would automagically
create schematics (product-category-to-shelf mappings) that follow an organization's
rules, so that a master schematic can be applied to a group of different stores
based on demographic information.  Since that is a mouthful, I will elaborate.

Retailers want to be able to map a bunch of products (organized by category) to
the shelves of their stores.  They are well aware that different products sell at
different rates depending on the demographics around a store.  In other words, a
store in an area where 90% of the people make $100,000 or more a year will sell
3 times as much Haagen-Dazs ice cream as a store in an area where only 10% of the
people make that much.  This is known as corner-store marketing, i.e. getting the
specialization of a corner store from a major retailer.  The problem is that you
can have as many as 1000 different schematics for _each_ category if a division
has 1000 different stores.  That is nearly impossible for humans to manage.  Our
Automatic Schematic Generation tool handles the specializations with a rules-based
engine.

"So what does this have to do with Cocoon?" I hear you ask.  Good question!  One
of the tools we used to perform these very complex decisions was based on an open
source project called the Java Expert System Shell (JESS) which can be found at
http://herzberg.ca.sandia.gov/jess/.  The concept of JESS is very powerful.  The
reason why Scheme jogged my memory is because the JESS language is based on
Scheme--but only because the expert systems community is more familiar with it.

The over-simplified summary of the approach is that in a Rules-based system, you
have a set of Facts and a set of Rules.  Facts are your knowledge base (i.e. the
files you have in your context, the generators you have to choose from, etc.).
Rules are how to apply logic to that set of facts.  In Cocoon, the Environment
contains one set of facts that change with every request and the sitemap and
configuration files provide a set of facts that are constant.
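
To make the Facts/Rules split concrete, here is a minimal Python sketch of a
naive forward-chaining loop.  It is not JESS and not Cocoon code; the fact
names, the helper functions, and the single rule are all assumptions made up
for illustration.

```python
# Facts are plain (head, value) tuples.  Rules pair a condition with an action.
# Every name below is illustrative; none of it is the JESS or Cocoon API.

def make_facts(request_uri):
    """Combine constant facts (configuration) with per-request facts (environment)."""
    constant = {("generator-type", "file"), ("generator-type", "xsp")}
    per_request = {("uri", request_uri)}
    return constant | per_request

def run(rules, facts):
    """Naive forward chaining: keep firing rules until no rule adds a new fact."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, action in rules:
            if condition(facts):
                new_facts = action(facts) - facts
                if new_facts:
                    facts |= new_facts
                    changed = True
    return facts

rules = [
    # If the request is for an .html page, conclude that the html serializer is needed.
    (lambda f: any(head == "uri" and value.endswith(".html") for head, value in f),
     lambda f: {("serializer", "html")}),
]

result = run(rules, make_facts("/foo/index.html"))
```

The constant facts stay the same for every call to make_facts, while the "uri"
fact changes per request, which mirrors the sitemap/Environment split above.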

The sitemap problem space can be viewed as a very simple rules-based system.
We make decisions based on URI, parameters, session variables, and more.  The set
of rules is a series of simple if-then statements.  You can verify this by viewing
the generated source code for the Sitemap.  What we have run into is that the set of
rules available to us is too simple for some applications: "If we are requesting
this set of information, send this resource."  This is most obvious when we have
multipage forms and complex logic that has to be managed.

So how do we express the sitemap in JESS terms?  It would be something like this:

(defrule select-pipeline
     (environment (uri (concat "/foo/" ?X ".html")))
   =>
     (execute-pipeline (generator (source (concat "/foo/" ?X ".xml")) (type file))
                       (transformer (source "/bar/foo2html.xsl") (type xslt))
                       (serializer (type html))
     )
)
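
For comparison, the same single rule can be sketched in Python as a regular
expression that binds ?X and a function that returns a description of the
pipeline.  The function name and the dictionary layout are assumptions for
illustration only; nothing here calls into Cocoon.

```python
import re

# Hypothetical stand-in for the select-pipeline rule: bind ?X from the URI,
# then describe the pipeline that should run.
URI_PATTERN = re.compile(r"^/foo/(?P<X>.+)\.html$")

def select_pipeline(uri):
    match = URI_PATTERN.match(uri)
    if match is None:
        return None  # the rule does not fire for this URI
    x = match.group("X")
    return {
        "generator": {"source": f"/foo/{x}.xml", "type": "file"},
        "transformer": {"source": "/bar/foo2html.xsl", "type": "xslt"},
        "serializer": {"type": "html"},
    }
```
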

Ok, this is ugly.  But what if we changed this to be more granular?

(defrule select-transformer
     (environment (uri (concat ?X ".html")))
   =>
     (add-transformer (transformer (source "/bar/foo2html.xsl") (type xslt)))
)

(defrule select-serializer
     (environment (uri (concat ?X ".html")))
   =>
     (set-serializer (serializer (type html)))
)

(defrule select-generator
     (environment (uri (concat ?X ".html")))
     (source (concat ?X "." ?Y))
   =>
     (set-generator (source (concat ?X "." ?Y)) (type ?Y))
)

(defrule execute-pipeline
     (generator (set true))
     (serializer (set true))
   =>
     (sitemap execute)
)

What this says is that every ".html" request uses the "html" serializer and the
"xslt" transformer with the "/bar/foo2html.xsl" stylesheet.  For the generator,
there is a one-to-one mapping from the source file's extension to the generator
type (i.e. the "serverpages" generator would be renamed "xsp").
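
The granular version can be sketched in Python as independent functions that
each contribute one piece of the pipeline.  This is only an illustration: the
env dictionary, its "sources" list of available files, and the choice to use
the matched file itself as the generator source are all assumptions.

```python
import posixpath

def select_transformer(env, pipeline):
    if env["uri"].endswith(".html"):
        pipeline["transformers"].append({"source": "/bar/foo2html.xsl", "type": "xslt"})

def select_serializer(env, pipeline):
    if env["uri"].endswith(".html"):
        pipeline["serializer"] = {"type": "html"}

def select_generator(env, pipeline):
    if not env["uri"].endswith(".html"):
        return
    stem = env["uri"][: -len(".html")]            # this plays the role of ?X
    for source in env["sources"]:                 # hypothetical list of known files
        base, ext = posixpath.splitext(source)
        if base == stem and ext:
            # the extension (?Y) doubles as the generator type
            pipeline["generator"] = {"source": source, "type": ext[1:]}

def build_pipeline(env):
    """Apply every rule, then 'execute' only if a generator and serializer are set."""
    pipeline = {"generator": None, "transformers": [], "serializer": None}
    for rule in (select_transformer, select_serializer, select_generator):
        rule(env, pipeline)
    ready = pipeline["generator"] is not None and pipeline["serializer"] is not None
    return pipeline if ready else None
```

Each function can be added or removed without touching the others, which is
the point of splitting the rule up.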

These can all be handled neatly with the current sitemap.  It is important to note,
however, that every rule that matches is applied.
Many times, we want to do something very complex that just can't be matched nicely.

For example, let's say we are in a multipage form whose exit depends on whether the
user is done adding items to their shopping cart, which in turn depends on data
entered by the user.  This absolutely cannot be simply expressed in the sitemap.
Let us say that the two rules regarding transformers and serializers apply across
the board and our only interest is in the next page of the form.  Remember that the
data in the form are facts that the rules engine can base decisions on:

(defrule select-generator2
     (declare (salience 3))
     (session (add-more-items no))
  =>
     (set-generator (source "/foo/bar/baz.xml") (type xml))
)

What is this "salience" thing?  It is a precedence.  If multiple rules apply to
the same facts, then the one that has the highest salience wins.  That is more
of an expert-system-shell type of thing, but it enables some excellent approaches.
I.e. you have a base event that is fired, but it can be overridden by another rule.
That way you always have a default value.
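
A minimal Python sketch of the "default plus override" idea follows.  It
models the "highest salience wins" reading given above; as far as I know, a
real rule engine uses salience to order the rules waiting to fire rather than
to discard the losers, so treat this as a simplification.  The Rule class, the
default rule, and the "/foo/bar/form.xml" source are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    salience: int
    condition: Callable[[dict], bool]
    action: Callable[[dict], dict]

def choose_generator(rules, facts):
    """Among the rules whose condition holds, fire only the one with the highest salience."""
    applicable = [rule for rule in rules if rule.condition(facts)]
    if not applicable:
        return None
    winner = max(applicable, key=lambda rule: rule.salience)
    return winner.action(facts)

rules = [
    # Hypothetical default: always applicable, lowest precedence.
    Rule("default-generator", 0, lambda f: True,
         lambda f: {"source": "/foo/bar/form.xml", "type": "xml"}),
    # Mirrors select-generator2: overrides the default when the user is done.
    Rule("select-generator2", 3, lambda f: f.get("add-more-items") == "no",
         lambda f: {"source": "/foo/bar/baz.xml", "type": "xml"}),
]
```
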

It is also important to note that the patterns in a rule are combined like "and" statements.
There are also some other interesting ways of expressing patterns:

(defrule example
   (not-b-and-c ?n1&~b ?n2&~c)
   (different ?d1 ?d2&~?d1)
   (same ?s ?s)
   (more-than-one-hundred ?m&:(> ?m 100))
   (red-or-blue red|blue)
  =>
   (printout t "Found what I wanted!" crlf)
)

So if there is a fact whose "head" (or type) is "not-b-and-c", with two
attributes where the first is not "b" and the second is not "c", and all
the other patterns also find matching facts, the "printout" action will fire.
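
Read as plain boolean logic, the constraints in that example amount to the
Python predicate below.  This is a sketch under simplifying assumptions: there
is exactly one fact per head, facts are stored in a dictionary of tuples, and
the "different" pattern is taken to require two distinct values.

```python
def example_matches(facts):
    """facts maps each head to a tuple of attribute values (one fact per head)."""
    n1, n2 = facts["not-b-and-c"]
    d1, d2 = facts["different"]
    s1, s2 = facts["same"]
    (m,) = facts["more-than-one-hundred"]
    (colour,) = facts["red-or-blue"]
    return (
        n1 != "b" and n2 != "c"            # first is not b, second is not c
        and d2 != d1                       # the two values differ
        and s1 == s2                       # the same variable is bound twice
        and m > 100                        # predicate constraint
        and colour in ("red", "blue")      # red|blue
    )

facts = {
    "not-b-and-c": ("a", "d"),
    "different": ("x", "y"),
    "same": ("z", "z"),
    "more-than-one-hundred": (150,),
    "red-or-blue": ("red",),
}
```
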

Now, I agree that Scheme is not an intuitive language and is very much LISP-like
(Lots of Imbedded Stinkin' Parentheses), but it is easy to adapt to XML and, more
importantly, to existing Java systems.  In other words, you can programmatically
adjust the facts that the Rete decision engine uses with standard rules, and
watch it go to town obeying what you mean.  This is coupled with a precedence
scheme, so that when two sets of facts conflict, I can decide which rule is more
important.  This is more powerful than simply ordering the rules in the Sitemap.

I don't quite understand all of this either, but the gist is this:  A rules-based
system is a way to declare the rules that act on a known set of facts.  The
flexibility of such a system allows an astoundingly complex system to
be simplified.  The facts are the available resources (i.e. database resources,
file resources, session resources, request resources, etc.).  The rules are
a declaration of what to do with the facts.  The Rete algorithm is a way to
optimize the decision-making process so that its cost is closer to
constant time than to linear time.
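
The Python sketch below is not the Rete algorithm.  It only illustrates one
idea behind that kind of optimization: index the patterns by fact head and
remember which patterns are already satisfied, so that a new fact is tested
only against the patterns that mention its head.  It ignores variable bindings
shared between patterns and the retraction of facts, and the class and method
names are invented for illustration.

```python
from collections import defaultdict

class IncrementalMatcher:
    """Remembers satisfied patterns instead of rescanning every fact for every rule."""

    def __init__(self):
        self.patterns_by_head = defaultdict(list)  # head -> [(rule, index, test)]
        self.satisfied = defaultdict(set)          # rule -> indexes of satisfied patterns
        self.pattern_count = {}                    # rule -> number of patterns in the rule
        self.tests_run = 0                         # counts pattern tests, for comparison

    def add_rule(self, name, patterns):
        """patterns is a list of (head, test) pairs; test receives the fact value."""
        self.pattern_count[name] = len(patterns)
        for index, (head, test) in enumerate(patterns):
            self.patterns_by_head[head].append((name, index, test))

    def assert_fact(self, head, value):
        """Return the rules that became fully matched because of this fact."""
        activated = []
        for name, index, test in self.patterns_by_head[head]:
            self.tests_run += 1
            if not test(value):
                continue
            newly_satisfied = index not in self.satisfied[name]
            self.satisfied[name].add(index)
            if newly_satisfied and len(self.satisfied[name]) == self.pattern_count[name]:
                activated.append(name)
        return activated
```

With two rules loaded, asserting a "uri" fact runs only the tests registered
under "uri"; the patterns of unrelated rules are never touched.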

Such an approach would be interesting to follow....

Have I successfully confused everyone yet?  The JESS link above will be of more help.

-- 

"They that give up essential liberty to obtain a little temporary safety
  deserve neither liberty nor safety."
                 - Benjamin Franklin


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Managing Flow and Resources

Posted by Michael Hartle <mh...@hartle-klug.com>.

Berin Loritsch wrote:

> "So what does this have to do with Cocoon?" I hear you ask.  Good question!  One
> of the tools we used to perform these very complex decisions was based on an open
> source project called the Java Expert System Shell (JESS) which can be found at
> http://herzberg.ca.sandia.gov/jess/.  The concept of JESS is very powerful.  The
> reason why Scheme jogged my memory is because the JESS language is based on
> Scheme--but only because the expert systems community is more familiar with it.

Due to the JESS license, I had a look at Sourceforge and found SISC 
(http://sisc.sourceforge.net/):

> *SISC* is an extensible Java based interpreter of the algorithmic
> language Scheme <http://www.schemers.org>.  SISC uses modern
> interpretation techniques, and handily outperforms all existing Java
> interpreters (often by more than an order of magnitude).
>
> In addition, *SISC* is a complete implementation of the language.  The
> entire R5RS Scheme standard is supported, no exceptions.  This 
> includes a full number tower including complex number support and 
> arbitrary precision integers /and/ floating point numbers, as well as 
> full support for first-class continuations (not just escaping 
> continuations as in many other interpreters).  *SISC* also attempts to 
> implement the standard as correctly as possible, while still providing 
> exceptional performance.
>
> Functionality can be added through the use of Modules, extensions that 
> may add new types and functions to the language.  *SISC* can also 
> interface with Java through a bridge module called J2S.  
>
> SISC is released simultaneously under the terms of the Mozilla Public 
> License v1.1 and the GNU General Public License (GPL) v2.

Best regards,

Michael Hartle



Re: [RT] Managing Flow and Resources

Posted by Michael Hartle <mh...@hartle-klug.com>.

Berin Loritsch wrote:

> Such an approach would be interesting to follow....
>
> Have I successfully confused everyone yet?  The JESS link above will be of more help.

This reminds me of the Prolog I had to endure back in school: you do not
define how to solve the problem, but rather the legal steps that can be
taken towards a solution. Prolog does it by backtracking, as far as I
know, but I assume the Rete algorithm is more sophisticated than that.
Sadly, JESS seems to be licensed free of charge only for academic use.

Best regards,

Michael Hartle


Re: [RT] Managing Flow and Resources

Posted by Jason Foster <ja...@uwaterloo.ca>.

<snip/>

You have just provided a great example of something really close to what I
was trying to express a while back when I proposed Prolog as a mechanism
for "proving" URLs based on some rules, premises, and an inference engine.
I don't think that it's exactly the same, but it's *very* similar.

The one thing that I see in common over the last bit of the discussion on 
flow and resources is the goal of moving away from the "monolithic and 
explicit" model to a "granular and implicit" model where some kind of 
inference engine handles putting everything together.  I think that this 
makes sense for complicated resource generation, but that it might get in 
the way of simple resource generation.

This leads to the question of whether a "one size fits all" approach to
resource and flow management is appropriate.  In other words, are the goals
of Cocoon (whatever those are!) better served by picking one approach to
management or by providing a number of alternate approaches?  On the one
hand you have flexibility; on the other you have the potential for
confusion.

A similar question can be asked about web applications.  For example in 
WebObjects a web application is basically a normal event-driven application 
and the environment deals with translating HTTP requests into programmatic 
events.  I don't know if XSP or JSP have an explicit model of a web 
application but I don't remember seeing one.  A web application could also
be seen as a state machine or (and I have to admit that this is pretty cool)
an application running with continuations.
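
To give a feel for the continuation view, here is a minimal Python sketch in
which a generator function stands in for a continuation: each yield sends a
page out and suspends the flow until the next request arrives.  Python
generators can only be resumed once from each point, so they are a weaker
tool than first-class continuations, and the page names and form fields are
invented for illustration.

```python
def shopping_form():
    """Each yield sends a page to the user and suspends until the next request."""
    cart = []
    while True:
        answer = yield {"page": "add-item", "cart": list(cart)}
        cart.append(answer["item"])
        if answer.get("add-more-items") == "no":
            break
    yield {"page": "checkout", "cart": list(cart)}

flow = shopping_form()
first_page = next(flow)                                                  # first request
second_page = flow.send({"item": "ice cream", "add-more-items": "yes"})  # form posted
last_page = flow.send({"item": "cones", "add-more-items": "no"})         # user is done
```

The whole multipage flow reads as one ordinary loop, and the "where am I in
the form" state lives in local variables rather than in the session.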

Is it right/appropriate/wise for Cocoon to declare "A web application must 
be modelled as X"?  Thus far all Cocoon has declared (I think) is that:

- Avalon is the framework
- resource generation must be modelled as a pipeline
- valid components are generators, transformers, serializers
- components are connected by SAX events

I am a little leery of enforcing a particular model of a web application, 
but that's just me.

Thanks, Berin, for a fun and useful RT.

Jason Foster

