Posted to dev@cocoon.apache.org by Stefano Mazzocchi <st...@apache.org> on 2002/02/21 13:32:38 UTC

[RT] Sources and Drains

[For those of you who didn't study electronics: MOSFET transistors have
sources, drains and gates. Sources generate electrons, drains consume
them and gates control the flow.]

The original idea of a pipeline was

 g -> t* -> s

you all know this very well.

Then I realized that the same pipelines shared the same properties, so I
came up with the notion of resource views, which required a way to 'get
out' of the normal pipeline flow. This was implemented with 'views' and
'labels'

 g -> t* -> s

 [label] -> t* -> s

so that we could ask for a specific 'view' and do something like

 g -> t -> t* -> s [normal view]
      |
      + -> t* -> s [another view]

Then we wanted to merge different pipelines with 'aggregation', so we
needed a way to do something like

 g -> t* -> A -> t* -> s
            ^
 g -> t* ---+

this created the notion of 'internal pipelines': pipelines that
don't need a serializer because their SAX events are aggregated
directly.
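
A minimal sketch of that aggregation, in Python rather than Cocoon's
Java and with plain strings standing in for SAX events. All names here
(internal_pipeline, aggregate, serialize, the <aggregate> wrapper) are
made up for illustration and are not Cocoon APIs:

```python
from typing import Callable, Iterable, Iterator

Event = str  # hypothetical stand-in for a SAX event
Generator = Callable[[], Iterable[Event]]
Transformer = Callable[[Iterable[Event]], Iterable[Event]]


def internal_pipeline(generate: Generator, *transformers: Transformer) -> Iterator[Event]:
    """g -> t*: produces events and never serializes them."""
    events: Iterable[Event] = generate()
    for transform in transformers:
        events = transform(events)
    yield from events


def aggregate(*parts: Iterable[Event]) -> Iterator[Event]:
    """A: emits the events of each part, in order, under one root element."""
    yield "<aggregate>"
    for part in parts:
        yield from part
    yield "</aggregate>"


def serialize(events: Iterable[Event]) -> str:
    """s: the only place where events become output for the client."""
    return "".join(events)


def upper(events: Iterable[Event]) -> Iterator[Event]:
    """A trivial transformer used only for the example."""
    return (event.upper() for event in events)


news = internal_pipeline(lambda: ["<news/>"], upper)
weather = internal_pipeline(lambda: ["<weather/>"])
page = serialize(aggregate(news, weather))
```

Only the outer pipeline ends in a serializer; the two parts stop at
g -> t* and hand their events straight to the aggregator.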

Now, Vadim proposed the notion of having serializers that write on
output streams which are not connected to the requesting agent.

Something like this

  request -> g -> t* -> MX -> t* -> s -> response
                         |
                         t*
                         |
                         s
                         |
                         v
                       stream

this reminded me a lot of transistor 'drains', the place where the
electrons are consumed.

So, ideally, we could have 

 - a multiplexer/expander/exploder/dispatcher (which is the mirror image
of the aggregator, but behaves as a transformer)

 - one or more 'drains', which are generator-less semi-pipelines where
the draining output stream is explicitly written.

So we could have something like this:

 <match pattern="...">
  <generate .../>
  <transform ../> 
  <route>
   <drain element="...." namespace="...." dest="drain://database"/>
   <drain element="...." namespace="...." dest="drain://email"/>
   <drain element="...." namespace="...." dest="drain://sms-gateway"/>
  </route>
  <serialize .../>
 </match>

where beforehand we declared something like this

 <map:drains>
  <map:drain name="database">
   <transform .../>
   <serialize .. dest="..."/>
  </map:drain>
  ...
 </map:drains>

note the 'dest' attribute of serializers!

we could also change the definition of 'internal pipelines' using 

 <map:sources>
  <map:source name="newsfeed">
   <generate .../>
   <transform .../>
  </map:source>
 </map:sources>

and then have 

 <match>
  <aggregate>
   <part src="source://newsfeed"/>
  </aggregate>
  <transform/>
  <serialize/>
 </match>

So, at the end we get:

 1) pipeline: g -> t* -> s
 2) source: g -> t*
 3) drain: t* -> s
 4) aggregator with parts
 5) router with drains (bah, don't really like the name router, but I
can't think of anything better)
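
To make the router/drain split concrete, here is a rough sketch in
Python (not Cocoon code). Events are plain dicts, a list stands in for
the 'dest' output stream, and the names Drain and route are invented
for the example. One assumption that the proposal leaves open: here the
router copies matching events into a drain and still passes every event
on to the main pipeline.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = dict  # hypothetical stand-in for a SAX event: {"element": ..., "text": ...}
Transformer = Callable[[Event], Event]


class Drain:
    """t* -> s, writing to an explicit destination instead of the response."""

    def __init__(self, transformers: List[Transformer], sink: List[str]):
        self.transformers = transformers
        self.sink = sink  # stands in for the 'dest' output stream

    def consume(self, event: Event) -> None:
        for transform in self.transformers:
            event = transform(event)
        name = event["element"]
        self.sink.append(f"<{name}>{event['text']}</{name}>")


def route(events: Iterable[Event], drains: Dict[str, Drain]) -> Iterator[Event]:
    """Acts as a transformer: feeds drains by element name, forwards everything."""
    for event in events:
        drain = drains.get(event["element"])
        if drain is not None:
            drain.consume(dict(event))  # the drain works on its own copy
        yield event


database: List[str] = []
email: List[str] = []
drains = {
    "order": Drain([], database),
    "notice": Drain([lambda e: {**e, "text": e["text"].upper()}], email),
}
events = [
    {"element": "order", "text": "42"},
    {"element": "notice", "text": "shipped"},
    {"element": "title", "text": "Thanks"},
]
response = [event["element"] for event in route(events, drains)]
```

After the run, database holds the serialized order, email holds the
transformed notice, and the main pipeline has still seen all three
events.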

Comments?

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Sources and Drains (long response)

Posted by Berin Loritsch <bl...@apache.org>.

Stefano Mazzocchi wrote:
> Berin Loritsch wrote:
> 
>>Stefano Mazzocchi wrote:
>>
>>>Berin Loritsch wrote:
>>>
>>>
>>>>I understand the requirement.  What we need is the ability to
>>>>initiate asynchronous pipelines that consume information from a
>>>>request.  I say asynchronous because in this case, we are not
>>>>merging the information back into the critical stream.
>>>>
>>>>
>>>Yes. The problem being when the result of one branch is 'synchronous'
>>>with the result of the other branch. (say, saving something into a
>>>database)
>>>
>>Some saves to a database are not necessarily *required* to be
>>synchronous.  The problem is not that big when you consider *when*
>>something needs to be done.  
>>
> 
> Unless you have logical dependencies on it.

Here is the point: Future concept (below) starts the process
immediately.  When you get to the point where you need to know,
you wait for the result to be done.

>>One concurrency tool that has been
>>developed is the "Future".  A Future is an object or value that is
>>declared early but used later.  Typically a Future is an expensive
>>operation that can happen asynchronously while you are handling other
>>tasks.  When you finally need it, the Future will block until the
>>value is done being calculated.  In the meantime you handled a host
>>of other tasks simultaneously.
>>
> 
> Yes, I know the concept.

It is very useful in situations (like above) where you need to start
an expensive operation immediately, but continue to work on other parts
of the problem domain.  When you get to the point where you need the
answer, it will either have it done, or make you wait until it is done.
Either way, we spend less time than if we simply wrote everything inline.
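
A minimal sketch of the idea, in Python rather than Cocoon's Java, using
the standard concurrent.futures module. The functions lookup_customer,
validate and handle are made-up names for illustration only:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def lookup_customer(customer_id: int) -> str:
    """Hypothetical expensive operation; the sleep stands in for database I/O."""
    time.sleep(0.05)
    if customer_id < 0:
        raise ValueError("unknown customer")
    return f"customer-{customer_id}"


def validate(form: dict) -> bool:
    """Hypothetical cheap work that does not depend on the lookup."""
    return bool(form.get("email"))


def handle(form: dict) -> str:
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(lookup_customer, form["customer_id"])  # starts now
        valid = validate(form)  # runs while the lookup is in flight
        try:
            customer = future.result(timeout=5)  # blocks only if not yet done
        except ValueError as error:
            return f"error: {error}"  # a failure in the branch surfaces here
    return f"{customer} valid={valid}"
```

Note that an exception raised inside the background task is re-raised
by result(), so the caller still finds out about a failure at the point
where it asks for the answer.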

>>For instance, validation and database activity can occur simultaneously.
>>You will need to look up values while you are validating other values.
>>In the mean time, once you have an authoritative response, all you
>>have to do is kick off that process.
>>
> 
> I'm not worried about these cases, Berin, but about the cases when you
> *need* to wait for the result of one branch to execute the other.
> 
> For example: I want to send email. One branch serializes to the mail
> server, the other serializes back to me. But if the server is down, or
> misconfigured, or I'm not allowed to post, or.... I want to know it
> right there!

Concurrent programming always has its own share of issues.  In this
case, we have some options.  The simplest from the application
developer's perspective is guaranteed delivery.  If the request must
send an email, there will be something on the machine (most likely
an SMTP server that forwards the mail to any machine in a pool of
SMTP servers) that guarantees the request will be completed.

QoS issues mean that you can't really guarantee the timing of the
delivery--only that it will be delivered.  If you need to handle all
QoS issues within the webapp, you are going to run into a lot of
headaches.

Furthermore, I don't know of any webapp that will send an email and
await a response within one HTTP request.  You cannot guarantee that a
mail server will respond quickly enough.  That is why there are messaging
services.  If you need a request and a response in a specific amount
of time, I highly recommend JMS.

Email should be considered the "poor man's" JMS.  It has no QoS
guarantees on timing--only on delivery.

> Sure, errors could be saved in logs, but the user doesn't have a way to
> access those logs (nor should he/she). He/she just wants immediate feedback.

Log messages are for humans to understand what is going on with a
process.  They were never meant for machine consumption.  It should stay
that way.

> Well, we could implement this with transformers and WriteableSources
> (which is the way I would do it right now) and use branching pipelines
> for 'asynchronous' things.... but my concern is that if we implement
> branching pipelines (even if elegant) users might use them in the wrong
> place because (i've seen this happening myself) people tend to reuse
> components in weird configurations rather than write their own 10 lines
> of java code.
> 
> This is dangerous in the long run.

:)  In a sense, we do that now.  Technically, if we were going to really
have an input pipeline and an output pipeline, all the input generators
would be like the RequestGenerator.  For XML requests, there is little
to be done.  For HTTP requests, the input needs to be converted to XML.

At that point, we can figure out what we are going to do with the input
*directly*.  As it stands now, we use Java code to process the input
pipeline, and XML pipelines to process the output pipeline.  They are
orthogonal to each other in the sense that the content of the page
does not typically rely on the information of the request.  The common
exception is a form that displays results from the last request.

> [don't know if I got all of it, though, anyway...]

That's ok, I spent 35 hours a week in school to learn all that.

>>Just as there are two phases of a cocoon app (development and release),
>>there are two phases of recording music (recording and mixing).  At
>>least in this simplified version (there are also master houses,
>>duplication plants, etc.).
>>
>>In the days before we came up with patch bays, we would physically
>>attach the connectors directly from one place to another.  If we wanted
>>to change an outboard device, it required several acrobatics to get
>>at the back of the machine and make the connections.
>>
>>This is where the sitemap sits now.  Granted, an XML file is a lot
>>easier to deal with than the backs of hardware devices, but you
>>get the idea.
>>
>>The "soft-wiring" I was speaking of is providing a default flow of
>>information that gets overridden by the developer.  The sitemap becomes
>>a "patch-bay" if you will.
>>
> 
> Ok, now I got it. 
> 
> You are asking for a 'base sitemap', something that you should use to
> start with and should be flexible enough, yet useful, to make you work
> right away.
> 
> Did I get it right?

Yep!

>>Just like in the recording industry you don't buy
>>15 point 11.5" reels of tape that are 2" thick.  Those are used during
>>development/recording to keep all the instruments separated on tape.
>>What are distributed to customers are pressed CDs where the mix cannot
>>be altered.
>>
> 
> Perfect parallel! right on, brother!

:) Besides, the tape deck costs more than some people's houses (it can be
upwards of $340,000 USD).  (The 15 point was supposed to read 15 pounds.)
BTW, ties are not allowed in rooms with these beasts.  Imagine getting
a tie caught in the machine that can spin the 15 pound reels at 400 ips!
That has happened, and a bill was sent to the offender for the lost time
in the studio (blood ruins the master).

>>>The more I think about it, the more I think Actions don't belong to the
>>>sitemap... but to be entirely honest with you, I don't have a clear view
>>>of where we should lead the semantics to.
>>>
>>>I see great promises in Ovidiu's work on Schecoon... but we also need a
>>>way to make the two things (flowmaps and sitemaps) interoperate yet
>>>keeping concerns separated and keep the number of contracts the smallest
>>>possible.
>>>
>>>And this is *not* an easy architectural task :/
>>>
>>Keep in mind that the Sitemap (and Cocoon as well) implement the
>>Processor interface.  That means anything that is a signal router would
>>implement that interface.  It can be as simple as that.
>>
> 
> I wasn't talking about implementation but about design. But Ovidiu got
> my point, so I'll continue there.

Keep me posted.

-- 

"They that give up essential liberty to obtain a little temporary safety
  deserve neither liberty nor safety."
                 - Benjamin Franklin

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Sources and Drains (long response)

Posted by Stefano Mazzocchi <st...@apache.org>.

Berin Loritsch wrote:
> 
> Stefano Mazzocchi wrote:
> > Berin Loritsch wrote:
> >
> >>I understand the requirement.  What we need is the ability to
> >>initiate asynchronous pipelines that consume information from a
> >>request.  I say asynchronous because in this case, we are not
> >>merging the information back into the critical stream.
> >>
> >
> > Yes. The problem being when the result of one branch is 'synchronous'
> > with the result of the other branch. (say, saving something into a
> > database)
> 
> Some saves to a database are not necessarily *required* to be
> synchronous.  The problem is not that big when you consider *when*
> something needs to be done.  

Unless you have logical dependencies on it.

> One concurrency tool that has been
> developed is the "Future".  A Future is an object or value that is
> declared early but used later.  Typically a Future is an expensive
> operation that can happen asynchronously while you are handling other
> tasks.  When you finally need it, the Future will block until the
> value is done being calculated.  In the meantime you handled a host
> of other tasks simultaneously.

Yes, I know the concept.
 
> For instance, validation and database activity can occur simultaneously.
> You will need to look up values while you are validating other values.
> In the mean time, once you have an authoritative response, all you
> have to do is kick off that process.

I'm not worried about these cases, Berin, but about the cases when you
*need* to wait for the result of one branch to execute the other.

For example: I want to send email. One branch serializes to the mail
server, the other serializes back to me. But if the server is down, or
misconfigured, or I'm not allowed to post, or.... I want to know it
right there!

Sure, errors could be saved in logs, but the user doesn't have a way to
access those logs (nor should he/she). He/she just wants immediate feedback.

Well, we could implement this with transformers and WriteableSources
(which is the way I would do it right now) and use branching pipelines
for 'asynchronous' things.... but my concern is that if we implement
branching pipelines (even if elegant) users might use them in the wrong
place because (i've seen this happening myself) people tend to reuse
components in weird configurations rather than write their own 10 lines
of java code.

This is dangerous in the long run.

[skipped long description of how complex something as apparently
simple as 'recording variations of air pressure' can be :)]

[don't know if I got all of it, though, anyway...]

> Just as there are two phases of a cocoon app (development and release),
> there are two phases of recording music (recording and mixing).  At
> least in this simplified version (there are also master houses,
> duplication plants, etc.).
> 
> In the days before we came up with patch bays, we would physically
> attach the connectors directly from one place to another.  If we wanted
> to change an outboard device, it required several acrobatics to get
> at the back of the machine and make the connections.
> 
> This is where the sitemap sits now.  Granted, an XML file is a lot
> easier to deal with than the backs of hardware devices, but you
> get the idea.
> 
> The "soft-wiring" I was speaking of is providing a default flow of
> information that gets overridden by the developer.  The sitemap becomes
> a "patch-bay" if you will.

Ok, now I got it. 

You are asking for a 'base sitemap', something that you should use to
start with and should be flexible enough, yet useful, to make you work
right away.

Did I get it right?
 
> > On the other hand I see an interpreted sitemap equivalent to a
> > soft-wiring, sort of a test-board where you connect your components,
> > try them out and see when they do what you want.
> >
> > Then you go and assemble your board (if you ever compiled VHDL or
> > Verilog into circuits you know what I mean, but that's *way* down to the
> > metal for the people on this list, I guess).
> 
> :/  A different point, but I see where you are going.  I.e. the
> difference between Xalan and XSLTC.  It would be cool to provide this
> for Cocoon.

Yes, totally. The awesome thing would be to have a tool that does
this 'assembling' and would

 1) compile all XML into compiledXML to increase performance
 2) prevalidate the XML that needs to be validated 
 3) compile the XSLT stylesheets into translets
 4) compile the XSP into generators
 5) compile the sitemaps into code
 6) package the whole thing
 7) deploy on top of Cocoon

so you get:

1) easy installation
2) best-possible performance
3) non-modifiability of things in production (useful to avoid the
lost-update problem when people fix on production and forget to make
fixes back on the original source)

I think this goes along with my proposed cocoon web apps concept of
modularization of cocoon web applications.

> > I don't see the concept: a soft-wired sitemap would be a sitemap that is
> > able to modify itself at runtime. It has been proposed (a.k.a. dynamic
> > sitemap) but nobody has yet come up with a meaningful example that
> > *required* that and could not be processed with a better separation of
> > concerns between the pipeline components.
> 
> Here is the thing.  A sitemap would be more useful to a developer if we
> could easily rearrange it as necessary.  But once we are done... we
> can generate a class file that optimizes what we came up with.  We would
> use link crawling or some other methodology that would determine what
> sources needed to be compiled and which ones were better left alone.

Exactly.

> A "soft" sitemap is a developer's concern.  A "hard" sitemap is a
> deployment concern.  

Precisely.

> Just like in the recording industry you don't buy
> 15 point 11.5" reels of tape that are 2" thick.  Those are used during
> development/recording to keep all the instruments separated on tape.
> What are distributed to customers are pressed CDs where the mix cannot
> be altered.

Perfect parallel! right on, brother!
 
> >>I think that the addition of Actions can attest to that.  Let's take
> >>Cocoon and instead of a web application platform let's place it in the
> >>light of a web services platform (all XML all the time).
> >>
> >
> > [snipped discussion on web services platform built on top of cocoon]
> >
> > The more I think about it, the more I think Actions don't belong to the
> > sitemap... but to be entirely honest with you, I don't have a clear view
> > of where we should lead the semantics to.
> >
> > I see great promises in Ovidiu's work on Schecoon... but we also need a
> > way to make the two things (flowmaps and sitemaps) interoperate yet
> > keeping concerns separated and keep the number of contracts the smallest
> > possible.
> >
> > And this is *not* an easy architectural task :/
> 
> Keep in mind that the Sitemap (and Cocoon as well) implement the
> Processor interface.  That means anything that is a signal router would
> implement that interface.  It can be as simple as that.

I wasn't talking about implementation but about design. But Ovidiu got
my point, so I'll continue there.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Sources and Drains (long response)

Posted by Berin Loritsch <bl...@apache.org>.

Stefano Mazzocchi wrote:
> Berin Loritsch wrote:
> 
>>I understand the requirement.  What we need is the ability to
>>initiate asynchronous pipelines that consume information from a
>>request.  I say asynchronous because in this case, we are not
>>merging the information back into the critical stream.
>>
> 
> Yes. The problem being when the result of one branch is 'synchronous'
> with the result of the other branch. (say, saving something into a
> database)

Some saves to a database are not necessarily *required* to be
synchronous.  The problem is not that big when you consider *when*
something needs to be done.  One concurrency tool that has been
developed is the "Future".  A Future is an object or value that is
declared early but used later.  Typically a Future is an expensive
operation that can happen asynchronously while you are handling other
tasks.  When you finally need it, the Future will block until the
value is done being calculated.  In the meantime you handled a host
of other tasks simultaneously.

For instance, validation and database activity can occur simultaneously.
You will need to look up values while you are validating other values.
In the mean time, once you have an authoritative response, all you
have to do is kick off that process.

>>There has been some talk about making SAX streams concurrent (i.e.
>>an aggregation of several pipelines all running at the same time
>>to reduce time to the client).  So the Multiplexer/Demultiplexer
>>idea does have some merit.
>>
>>However, let us take the abstraction up from the FET analogy.
>>
>>
> 
> [snipped very interesting parallel on cocoon and audio processing]

:) glad you thought so.

>>The closest thing we have to a signal router in Cocoon is the Sitemap.
>>
> 
> Agreed.
> 
> 
>>The sitemap does not provide a soft-wiring feature that assists in
>>creating a solution.  
>>
> 
> I mildly disagree here: continuing on your parallel, a compiled sitemap
> is nothing different from an electrical circuit that is hardwired with
> special capabilities and then used in production.

I think you missed an important detail which I probably didn't
communicate well.  A patch bay and a console follow a "waterfall"
approach to the default signal processing.  This means that the *visible*
connection points at the top of the patch bay and console
automagically flow down to the bottom.

The full default signal path in a professional studio is this:

1) Patch panels in the studio map to the top row of points on the patch
    bay.  This is so microphones are connected to a point that can be
    reassigned.

2) The top row of points is "half-normalized" to flow to the next row
    down of patch points.  "Half normalized" means that if you stick
    a patch cable into the top row, you split the signal, but if you
    stick it in the bottom row you interrupt the signal.

3) That row of patch points is mapped to flow to the channel inputs.

4) The channel inputs go to the channel preamp within the mixer.

5) The preamp output flows to an insert out (i.e. a way to insert
    processing into the chain).

6) The insert out maps to the next row of patch points on the patch
    bay.

7) That row of patch points is "fully-normalized" to flow to the
    row of patch points below that.  "Fully-normalized" means that if
    you stick a patch cable into either row, you interrupt normal
    processing flow.

8) The row of patch points from above are mapped to the insert in
    connections on the console.

9) The insert in is mapped to a fader that controls the amplitude of the
    signal.

10) The fader output is mapped to the next row of patch points.

11) That row of patch points is "half" normalized to flow to the next
     row of patch points.

12) the previously mentioned patch points are mapped to the tape input.

13) The tape output (i.e. recording media) is mapped to the next row
     of patch points.

14) That row of patch points is mapped back into the console's track
     input.

15) The track input is then mapped to a second set of insert outputs

16) those insert outputs are mapped to the patch bay (next row)

17) we have a "fully normalized" connection to the next row of patch
     points.

18) those patch points are mapped back into the console's second set
     of insert in points.

19) the insert in points are mapped to the track faders and pan controls
     that control how that signal is mixed into the master output.

20) the master output is mapped to the patch bay

21) the patch bay is mapped to the master recording media inputs

22) the master recording media outputs are mapped to the patch bay

23) the patch bay then maps those points to the power amp.
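
The two normalization modes from steps 2 and 7 boil down to a small
rule, sketched here in Python. The names Normal and default_flow_intact
are invented for the example; the behaviour is only what is described
above:

```python
from enum import Enum


class Normal(Enum):
    HALF = "half-normalized"
    FULL = "fully-normalized"


def default_flow_intact(mode: Normal, top_patched: bool, bottom_patched: bool) -> bool:
    """Return True if the signal still flows from the top row to the bottom row."""
    if mode is Normal.HALF:
        # A cable in the top row only splits the signal;
        # only a cable in the bottom row interrupts it.
        return not bottom_patched
    # Fully-normalized: a cable in either row interrupts the default path.
    return not (top_patched or bottom_patched)
```

So default_flow_intact(Normal.HALF, True, False) is True (the signal is
split, not broken), while the same cable on a fully-normalized pair
gives False.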

As you can see this is a very intricate set of defaults.  Basically, by
just making sure the faders are set to unity (no gain, no resistance),
and connecting a mike to the patch panel in the studio, you can start
recording music while listening to what is on tape.

Inevitably during the process of recording you will need to use some
outboard equipment.  Perhaps you have a favorite compressor, or you
want to add some reverberation to the mix.  Well, all the outboard
equipment is also mapped to the patch bay.  All we have to do is take
a patch cable and make the connection from any point in the signal's
life to the outboard equipment and back again.  All you need is two
cables.

A final mix of the recorded material typically has hundreds of patch
cables because the defaults are good for recording but not so much for
mixing.

Just as there are two phases of a cocoon app (development and release),
there are two phases of recording music (recording and mixing).  At
least in this simplified version (there are also master houses,
duplication plants, etc.).

In the days before we came up with patch bays, we would physically
attach the connectors directly from one place to another.  If we wanted
to change an outboard device, it required several acrobatics to get
at the back of the machine and make the connections.

This is where the sitemap sits now.  Granted, an XML file is a lot
easier to deal with than the backs of hardware devices, but you
get the idea.

The "soft-wiring" I was speaking of is providing a default flow of
information that gets overridden by the developer.  The sitemap becomes
a "patch-bay" if you will.
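
As a rough illustration of what I mean by a default flow that can be
overridden, here is a toy patch bay in Python. It is not a proposal for
the sitemap syntax; PatchBay and the stage names are hypothetical:

```python
from typing import Callable, Dict

Stage = Callable[[str], str]


class PatchBay:
    """A default chain of named stages; any stage can be re-patched."""

    def __init__(self, defaults: Dict[str, Stage]):
        self.defaults = dict(defaults)  # insertion order is the signal path
        self.patches: Dict[str, Stage] = {}

    def patch(self, point: str, stage: Stage) -> None:
        if point not in self.defaults:
            raise KeyError(f"no patch point named {point!r}")
        self.patches[point] = stage

    def unpatch(self, point: str) -> None:
        self.patches.pop(point, None)  # pulling the cable restores the default

    def run(self, signal: str) -> str:
        for point, default in self.defaults.items():
            signal = self.patches.get(point, default)(signal)
        return signal


bay = PatchBay({
    "generate": lambda s: f"<doc>{s}</doc>",
    "transform": lambda s: s,
    "serialize": lambda s: s + "\n",
})
plain = bay.run("hi")
bay.patch("transform", str.upper)
patched = bay.run("hi")
bay.unpatch("transform")
restored = bay.run("hi")
```

With nothing patched the defaults already produce a usable result; the
developer only touches the points where the default is not good enough.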

> On the other hand I see an interpreted sitemap equivalent to a
> soft-wiring, sort of a test-board where you connect your components,
> try them out and see when they do what you want.
> 
> Then you go and assemble your board (if you ever compiled VHDL or
> Verilog into circuits you know what I mean, but that's *way* down to the
> metal for the people on this list, I guess).

:/  A different point, but I see where you are going.  I.e. the
difference between Xalan and XSLTC.  It would be cool to provide this
for Cocoon.

>>We only have hard-wiring which means that we have
>>to explicitly state how we handle each invocation of the sitemap.  Many
>>HTTP servers provide a soft-wiring that matches urls to the file system.
>>
> 
> This is nothing different from the sitemap: we could easily have an
> implicit rule such as 
> 
>  <match pattern="**">
>   <read src="{1}"/>
>  </match>
> 
> and that would clone the default HTTP functionality, but there is
> nothing "soft" in the function of this.

The softness comes from the fact that it is easily overridden.

>>This is an easily understood default that is as fragile as your file
>>hierarchy.  I don't have any thoughts on how to provide a middle of the
>>road soft-wiring for the sitemap.
>>
> 
> I don't see the concept: a soft-wired sitemap would be a sitemap that is
> able to modify itself at runtime. It has been proposed (a.k.a. dynamic
> sitemap) but nobody has yet come up with a meaningful example that
> *required* that and could not be processed with a better separation of
> concerns between the pipeline components.

Here is the thing.  A sitemap would be more useful to a developer if we
could easily rearrange it as necessary.  But once we are done... we
can generate a class file that optimizes what we came up with.  We would
use link crawling or some other methodology that would determine what
sources needed to be compiled and which ones were better left alone.

A "soft" sitemap is a developer's concern.  A "hard" sitemap is a
deployment concern.  Just like in the recording industry you don't buy
15 point 11.5" reels of tape that are 2" thick.  Those are used during
development/recording to keep all the instruments separated on tape.
What are distributed to customers are pressed CDs where the mix cannot
be altered.

>>I think that the addition of Actions can attest to that.  Let's take
>>Cocoon and instead of a web application platform let's place it in the
>>light of a web services platform (all XML all the time).
>>
> 
> [snipped discussion on web services platform built on top of cocoon]
> 
> The more I think about it, the more I think Actions don't belong to the
> sitemap... but to be entirely honest with you, I don't have a clear view
> of where we should lead the semantics to.
> 
> I see great promises in Ovidiu's work on Schecoon... but we also need a
> way to make the two things (flowmaps and sitemaps) interoperate yet
> keeping concerns separated and keep the number of contracts the smallest
> possible.
> 
> And this is *not* an easy architectural task :/

Keep in mind that the Sitemap (and Cocoon as well) implement the
Processor interface.  That means anything that is a signal router would
implement that interface.  It can be as simple as that.

-- 

"They that give up essential liberty to obtain a little temporary safety
  deserve neither liberty nor safety."
                 - Benjamin Franklin

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Sources and Drains (long response)

Posted by Stefano Mazzocchi <st...@apache.org>.

Berin Loritsch wrote:

> > The original idea of a pipeline was
> >
> >  g -> t* -> s
> >
> > you all know this very well.
> >
> > Then I realized that the same pipelines shared the same properties, so I
> > came up with the notion of resource views, which required a way to 'get
> > out' of the normal pipeline flow. This was implemented with 'views' and
> > 'labels'
> >
> >  g -> t* -> s
> >
> >  [label] -> t* -> s
> >
> > so that we could ask for a specific 'view' and do something like
> >
> >  g -> t -> t* -> s [normal view]
> >       |
> >       + -> t* -> s [another view]
> >
> > Then we wanted to merge different pipelines with 'aggregation', so we
> > needed a way to do something like
> >
> >  g -> t* -> A -> t* -> s
> >             ^
> >  g -> t* ---+
> >
> > this created the notion of 'internal pipelines': pipelines that
> > don't need a serializer because their SAX events are aggregated
> > directly.
> 
> Now, I want to interject that it is easier to understand the usefulness
> of Aggregations than Views.  For that reason, we need another way of
> viewing the system.

yes, I know, contributions on this are very welcome :)

> > Now, Vadim proposed the notion of having serializers that write on
> > output streams which are not connected to the requesting agent.
> >
> > Something like this
> >
> >   request -> g -> t* -> MX -> t* -> s -> response
> >                          |
> >                          t*
> >                          |
> >                          s
> >                          |
> >                          v
> >                        stream
> >
> > this reminded me a lot of transistor 'drains', the place where the
> > electrons are consumed.
> 
> I understand the requirement.  What we need is the ability to
> initiate asynchronous pipelines that consume information from a
> request.  I say asynchronous because in this case, we are not
> merging the information back into the critical stream.

Yes. The problem being when the result of one branch is 'synchronous'
with the result of the other branch. (say, saving something into a
database)

> There has been some talk about making SAX streams concurrent (i.e.
> an aggregation of several pipelines all running at the same time
> to reduce time to the client).  So the Multiplexer/Demultiplexer
> idea does have some merit.
> 
> However, let us take the abstraction up from the FET analogy.
> 

[snipped very interesting parallel on cocoon and audio processing]

> The closest thing we have to a signal router in Cocoon is the Sitemap.

Agreed.

> The sitemap does not provide a soft-wiring feature that assists in
> creating a solution.  

I mildly disagree here: continuing on your parallel, a compiled sitemap
is nothing different from an electrical circuit that is hardwired with
special capabilities and then used in production.

On the other hand, I see an interpreted sitemap as the equivalent of
soft-wiring: a sort of test board where you connect your components,
try them out and see whether they do what you want.

Then you go and assemble your board (if you ever compiled VHDL or
Verilog into circuits you know what I mean, but that's *way* down to the
metal for the people on this list, I guess).

> We only have hard-wiring which means that we have
> to explicitly state how we handle each invocation of the sitemap.  Many
> HTTP servers provide a soft-wiring that matches urls to the file system.

This is nothing different from the sitemap: we could easily have an
implicit rule such as 

 <match pattern="**">
  <read src="{1}"/>
 </match>

and that would clone the default HTTP functionality, but there is
nothing "soft" in the function of this.
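
To make the catch-all rule concrete, here is a minimal Python sketch of
wildcard matching and {1} substitution. It is not Cocoon's matcher: the
helper names (compile_pattern, match, substitute) are made up for
illustration, and it assumes that '**' matches any characters including
'/' while '*' stops at '/'.

```python
import re


def compile_pattern(pattern):
    """Translate a wildcard pattern into a regex.

    Assumed semantics (not taken from the Cocoon sources):
    '**' matches any characters, '*' matches any characters except '/'.
    """
    regex = ""
    for part in re.split(r"(\*\*|\*)", pattern):
        if part == "**":
            regex += "(.*)"
        elif part == "*":
            regex += "([^/]*)"
        else:
            regex += re.escape(part)
    return re.compile(regex + r"\Z")


def match(pattern, uri):
    """Return the captured wildcard groups, or None if the URI does not match."""
    m = compile_pattern(pattern).match(uri)
    return None if m is None else m.groups()


def substitute(template, groups):
    """Replace {1}, {2}, ... with the corresponding captured group."""
    return re.sub(r"\{(\d+)\}", lambda m: groups[int(m.group(1)) - 1], template)
```

With these helpers, match("**", "docs/index.html") captures the whole
path, and substitute("{1}", ...) hands it back unchanged as the src to
read, which is all the rule above does.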

> This is an easily understood default that is as fragile as your file
> hierarchy.  I don't have any thoughts on how to provide a middle of the
> road soft-wiring for the sitemap.

I don't see the concept: a soft-wired sitemap would be a sitemap that is
able to modify itself at runtime. It has been proposed (a.k.a. dynamic
sitemap) but nobody has yet come up with a meaningful example that
*required* it and could not be handled with a better separation of
concerns between the pipeline components.
 
> One of the areas that we are facing (as the additions of views and
> aggregations attest) is that the sitemap semantics do not provide a
> balanced and flexible approach to signal (request/response) routing.

Yes, I think we all agree here.

> I think that the addition of Actions can attest to that.  Let's take
> Cocoon and, instead of a web application platform, place it in the
> light of a web services platform (all XML all the time).

[snipped discussion on web services platform built on top of cocoon]

The more I think about it, the more I think Actions don't belong to the
sitemap... but to be entirely honest with you, I don't have a clear view
of where we should lead the semantics to.

I see great promise in Ovidiu's work on Schecoon... but we also need a
way to make the two things (flowmaps and sitemaps) interoperate while
keeping concerns separated and keeping the number of contracts as small
as possible.

And this is *not* an easy architectural task :/

> (Stefano, I know I went a completely different way than you started
> out but I think we have something to throw in the mix here that will
> get us closer to where we want to be).

Hey, RTs were invented exactly to do this :) no need to explain yourself.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------



---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] Sources and Drains (long response)

Posted by Berin Loritsch <bl...@apache.org>.

Stefano Mazzocchi wrote:
> [For those of you who didn't study electronics, MOSFET transistors have
> sources, drains and gates. Sources generate electrons, drains consume
> them and gates control the flow]

MOSFETs, aahh listen to the music...

For those who do not know, FETs (Field Effect Transistors) behave more
like tubes than their PNP or NPN bipolar cousins.  As a result, they
tend to have a more pleasing sound when they approach saturation (they
still clip a little hard, but the sound is somewhat softer).  A MOSFET
(Metal Oxide Semiconductor Field Effect Transistor) is a type of FET
with a thin insulating layer on the gate.

Since I studied music recording, and Studio Electronics, I know exactly
what Stefano is talking about.

BTW: there is a slight correction to the definition of Sources and
Drains with regard to FETs.  The source is where electrons enter
(supplied by an external source like a battery or socket) and the
drain is where electrons leave.  The gate controls the resistance of
the FET, and consequently the current flowing through it.

I love FETs. :)

> The original idea of a pipeline was
> 
>  g -> t* -> s
> 
> you all know this very well.
> 
> Then I realized that the same pipelines shared the same properties, so I
> came up with the notion of resource views, which required a way to 'get
> out' of the normal pipeline flow. This was implemented with 'views' and
> 'labels'
> 
>  g -> t* -> s
> 
>  [label] -> t* -> s
> 
> so that we could ask for a specific 'view' and do something like
> 
>  g -> t -> t* -> s [normal view]
>       |
>       + -> t* -> s [another view]
> 
> Then we wanted to merge different pipelines with 'aggregation', so we
> needed a way to do something like
> 
>  g -> t* -> A -> t* -> s
>             ^
>  g -> t* ---+
> 
> this created the notion of 'internal pipelines': pipelines that
> don't need a serializer because their SAX events are aggregated
> directly.

Now, I want to interject that it is easier to understand the usefulness
of Aggregations than Views.  For that reason, we need another way of
viewing the system.

> Now, Vadim proposed the notion of having serializers that write on
> output streams which are not connected to the requesting agent.
> 
> Something like this
> 
>   request -> g -> t* -> MX -> t* -> s -> response
>                          |
>                          t*
>                          |
>                          s
>                          |
>                          v
>                        stream
> 
> this reminded me a lot of transistor 'drains', the place where the
> electrons are consumed.

I understand the requirement.  What we need is the ability to
initiate asynchronous pipelines that consume information from a
request.  I say asynchronous because in this case, we are not
merging the information back into the critical stream.

There has been some talk about making SAX streams concurrent (i.e.
an aggregation of several pipelines all running at the same time
to reduce time to the client).  So the Multiplexer/Demultiplexer
idea does have some merit.

However, let us take the abstraction up from the FET analogy.

Since I come from a recording background I will use the analogy of
a patch bay and console.  The purpose of a patch bay is to route signals 
through what is known as outboard equipment (processors outside the
console), and back into the console.  The Console takes several
signals and combines them into only a few (stereo, 5.1 format, 7.1
format).

In a typical recording session, we will take the input of several
instruments and route them directly to the media.  Because the
musicians all want to hear each other, along with what is already
on tape, the mixer is used to play back what is on tape and provide
a working mix for the musician.  Typically each musician wants to
hear themselves the loudest ;).

During this stage, the console has several output options.  Depending
on the console model, you can have as many as 8 different mixes for
the musicians (more than enough) and still have yet a different one
for the engineer.  There is only a little bit of processing done to
make the artist feel comfortable.

To bring the analogy into the Cocoon world, each "pipeline" represents
a signal.  The signal comes in through the preamp to amplify it, through
the recording media to record it, through EQ and compression to
transform it, and finally through the power amplifier so we can hear it.

In Cocoon, the Generator acts as the preamp.  It takes an arbitrary
source and converts it into a usable format (SAX events).  Typically we
skip the "recording" phase for most applications, but I will get back
to this later.  The EQ and compression are analogous to different types
of transformers.  Lastly, the power amp is equivalent to the serializer.
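
As a rough illustration of that g -> t* -> s chain, here is a minimal
sketch using Python's standard xml.sax module rather than Cocoon's Java
interfaces. The parser plays the generator, XMLFilterBase subclasses
play the transformers, and XMLGenerator plays the serializer.
UpperCaseText and run_pipeline are hypothetical names invented for this
example.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class UpperCaseText(XMLFilterBase):
    """Hypothetical transformer: upper-cases text, passes other events through."""

    def characters(self, content):
        super().characters(content.upper())


def run_pipeline(xml_text, *transformers):
    """Parse xml_text, push the SAX events through each transformer, serialize."""
    out = io.StringIO()
    source = xml.sax.make_parser()        # generator: bytes -> SAX events
    for transformer in transformers:      # t*: each filter wraps the previous stage
        source = transformer(source)
    # serializer: SAX events -> characters
    source.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    source.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()
```

Calling run_pipeline("<a>hi</a>", UpperCaseText) yields a document
whose body is <a>HI</a>; with no transformers the input is simply
re-serialized.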

Stepping back one more abstraction, we have Transducers, Signal Routers,
and Signal Modifiers.  Microphones, Tape Heads, Speakers, A/D/A
converters are all types of transducers.  They convert one form of
energy into another.  Also as you can see by the inclusion of
microphones and speakers in the list of transducers, the concept of a
transducer does not indicate a specific direction of signal flow.  In
fact, a microphone can be used as an inefficient speaker and a speaker
can be used as an inefficient microphone (I know of one engineer who
uses a 6" speaker to record the bass drum).

In Cocoon land we have three transducers:  Generator, Serializer, and
Reader.  This is ok, because it reflects real life.

The Signal Modifiers are EQs, dynamics processors, digital signal
processors, and other specialty devices.  EQs modify the frequency
content of the signal.  Dynamics processors like compressors and
expanders modify the amplitude content of a signal.  Digital signal
processors typically modify the time domain (phase) of a signal,
although they work equally well for all areas of signal modification.
This is where all the fun happens.

In Cocoon our signal modifier is the Transformer.  There are many
types of transformers that are powerful and easy to use.

Moving on to signal routers, we are talking about the combination of the
patch bay and the console.  The console has limited routing capabilities
because its main function is to combine a large number of signals into
a small number of signals.  It delegates the responsibility of complex
routing to the patch bay.  The patch bay is the real signal router in
the system.  We can use the patch bay to split any signal to two or
three outputs to allow different outboard devices to modify the signal
concurrently.  The inputs and outputs of those devices and all the
channels on the console, and all the tracks on recording media are
connected to the patch bay.  The patch bay is usually soft-wired so that
there is a default flow for the signals that can be easily overridden
by inserting patch cables.

The closest thing we have to a signal router in Cocoon is the Sitemap.
The sitemap does not provide a soft-wiring feature that assists in
creating a solution.  We only have hard-wiring which means that we have
to explicitly state how we handle each invocation of the sitemap.  Many
HTTP servers provide a soft-wiring that matches urls to the file system.
This is an easily understood default that is as fragile as your file
hierarchy.  I don't have any thoughts on how to provide a middle of the
road soft-wiring for the sitemap.

One of the areas that we are facing (as the additions of views and
aggregations attest) is that the sitemap semantics do not provide a
balanced and flexible approach to signal (request/response) routing.
I think that the addition of Actions can attest to that.  Let's take
Cocoon and, instead of a web application platform, place it in the
light of a web services platform (all XML all the time).

Let's say we have a complex system that takes a new item submission from
a manufacturer and puts it in the queue for a retailer to review.
Typically there are a set of canned responses that the Web Service has
to provide.  Those responses would be: "Success", "Internal Error",
"Incorrectly Formatted Request".  In the best of worlds, we would
directly process the incoming XML into the database and shove out a
response from a different source.  An example flow would be:

               "Internal Error"->Response
                       +
Request->Validate->Database
             |         +
             |     "Success"->Response
             +
         "Incorrectly Formatted Request"->Response

As you can see, the incoming request's destination is the database--not
the message response.  In this system we have 3 transducers: Request,
Response, and Database.  To ease development, we would want to make
a direct line from the request to the database because the input request
is formatted as XML.  Also, we want to kick off a canned response as
soon as it is safe to do so.  If there is any validation error, we want
to kick out a response while we are cleaning up resources.  If there
is a database error, we want to kick out a response while we are
cleaning up resources.  Once the request is properly processed we can
safely return the canned "Success" response.
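
The routing decision in that flow can be sketched in a few lines of
Python. This only shows which canned response is chosen, not the
separation of input and output streams. The function name
handle_submission, the <item>/<name> request format and the store
callback are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def handle_submission(body, store):
    """Validate the request, hand it to storage, and pick a canned response.

    `store` is a hypothetical callable standing in for the database drain.
    """
    try:
        item = ET.fromstring(body)
    except ET.ParseError:
        return "Incorrectly Formatted Request"
    # Assumed validation rule: an <item> element with a <name> child.
    if item.tag != "item" or item.find("name") is None:
        return "Incorrectly Formatted Request"
    try:
        store(item)
    except Exception:
        return "Internal Error"
    return "Success"
```

The request itself never becomes the response: it either reaches the
store or is rejected, and the caller only ever sees one of the three
canned strings.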

Currently we have a complex way of accomplishing these goals:
We use Actions to perform validation and database entry.  Using clues
from the return values of the actions, we select the response as our
generator and follow it immediately with the XML serializer.

It would be better if we could separate the input and output streams
so that the output stream can be kicked off as soon as we know which
one.  With a separate input stream, we can get rid of the need for
Actions altogether and have a collection of kick-butt transformers and
transducers that take the request and directly place it in whatever
storage device we need.

This allows simple things to remain simple (i.e. regular web sites
only need output streams), but allows things to be flexible.  For
instance, let's take the same example and have it kick off a couple
other pipelines.  One is a pipeline that processes payments from the
SOAP header information, one is a pipeline that informs the retailer
that there is a new product in their queue.  Now we have three
separate destinations that no longer need to keep track of each other.

With the new CommandManager and event processing tools in Excalibur
scratchpad, it is easy to kick off an asynchronous pipeline without
worrying about a multiplicity of threads (we can have a controlled
number of threads that handle the truly asynchronous events).
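
The same idea, a bounded pool of threads serving fire-and-forget
pipelines, can be sketched with Python's concurrent.futures. This is
not the Excalibur CommandManager API; process_payment, notify_retailer
and handle_request are hypothetical names for the two extra pipelines
and the request handler.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def process_payment(item):
    # Hypothetical drain: would read payment data from the SOAP header.
    return ("payment", item)


def notify_retailer(item):
    # Hypothetical drain: would tell the retailer about the queued product.
    return ("notify", item)


def handle_request(item, pool, drains):
    """Start every drain on the shared pool and answer without waiting."""
    futures = [pool.submit(drain, item) for drain in drains]
    return "Success", futures


def demo():
    # A fixed-size pool bounds the number of threads, whatever the load.
    with ThreadPoolExecutor(max_workers=2) as pool:
        response, futures = handle_request(
            "new-item", pool, [process_payment, notify_retailer])
        wait(futures)  # only the demo waits; a real handler would not
        return response, [f.result() for f in futures]
```

The response is decided before either drain has finished, and the
drains do not need to know about each other.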

Now, when we bring this infrastructure into a portal, we have to
worry about recombining pipelines.  For instance, one request will
have several inputs, all needing to be combined into one output.
This is a slightly different problem that is not as friendly to
asynchronous processing.  The main reason for this is that SAX events
are inherently ordered.  You will have to cache parts of the other
pipelines until they are needed, a requirement which adds overhead
in the form of RAM or disk IO.  I doubt we will better our response
times significantly if we allow asynchronous demultiplexing (or
mixing).  I am open to more discussion on the subject though....
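
To show where the caching cost comes from, here is a minimal Python
sketch, assuming each part is a callable that returns its events as an
iterable. The name aggregate is made up. The parts run concurrently,
but each one's events are buffered in a list until every earlier part
has been emitted.

```python
from concurrent.futures import ThreadPoolExecutor


def aggregate(parts, max_workers=4):
    """Run the parts concurrently, but emit their events in declaration order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list(...) buffers a whole part in memory: this is the overhead.
        futures = [pool.submit(lambda p=p: list(p())) for p in parts]
        for future in futures:
            # Blocks until this part is done, even if later parts finished first.
            yield from future.result()
```

A fast part that comes second still has to sit in its buffer until the
slow first part has been written out, so the total time is bounded by
the slowest part that precedes it.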

Comments?

(Stefano, I know I went a completely different way than you started
out but I think we have something to throw in the mix here that will
get us closer to where we want to be).
-- 

"They that give up essential liberty to obtain a little temporary safety
  deserve neither liberty nor safety."
                 - Benjamin Franklin


Re: [RT] Sources and Drains

Posted by Stefano Mazzocchi <st...@apache.org>.

Matt Sergeant wrote:
> 
> On Thu, 21 Feb 2002, Stefano Mazzocchi wrote:
> 
> > [For those of you who didn't study electronics, MOSFET transistors have
> > sources, drains and gates. Sources generate electrons, drains consume
> > them and gates control the flow]
> 
> And in computing terms, a drain has for a long time been called a "Tee".
> See tee(1).

No, a BJT transistor is a tee, a MOSFET is simply a controlled pipe.

> > So, at the end we get:
> >
> >  1) pipeline: g -> t* -> s
> >  2) source: g -> t*
> >  3) drain: t* -> s
> >  4) aggregator with parts
> >  5) router with drains (bah, don't really like the name router, but I
> > can't think of anything better)
> >
> > Comments?
> 
> It's fine and obvious and called a Tee, and part of XML::SAX::Machines
> (and part of everyone's Perl SAX toolkit for a long time before that).
> But then you guys *never* read stuff when I post Perl links, so I don't
> know why I even bother.

I did, Matt. Honest. But I don't know enough Perl to make sense of
it :(

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------




Re: [RT] Sources and Drains

Posted by Matt Sergeant <ma...@sergeant.org>.

On Thu, 21 Feb 2002, Stefano Mazzocchi wrote:

> [For those of you who didn't study electronics, MOSFET transistors have
> sources, drains and gates. Sources generate electrons, drains consume
> them and gates control the flow]

And in computing terms, a drain has for a long time been called a "Tee".
See tee(1).

> So, at the end we get:
>
>  1) pipeline: g -> t* -> s
>  2) source: g -> t*
>  3) drain: t* -> s
>  4) aggregator with parts
>  5) router with drains (bah, don't really like the name router, but I
> can't think of anything better)
>
> Comments?

It's fine and obvious and called a Tee, and part of XML::SAX::Machines
(and part of everyone's Perl SAX toolkit for a long time before that).
But then you guys *never* read stuff when I post Perl links, so I don't
know why I even bother.
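
For readers who don't speak Perl, a tee over SAX events can be sketched
with Python's standard xml.sax module. This is not the
XML::SAX::Machines API; the Tee class and tee_serialize function are
hypothetical names, and only the element and character events are
forwarded.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator


class Tee(ContentHandler):
    """Forward each SAX event to every downstream handler (hypothetical helper)."""

    def __init__(self, *handlers):
        super().__init__()
        self.handlers = handlers

    def startDocument(self):
        for h in self.handlers:
            h.startDocument()

    def endDocument(self):
        for h in self.handlers:
            h.endDocument()

    def startElement(self, name, attrs):
        for h in self.handlers:
            h.startElement(name, attrs)

    def endElement(self, name):
        for h in self.handlers:
            h.endElement(name)

    def characters(self, content):
        for h in self.handlers:
            h.characters(content)


def tee_serialize(xml_text):
    """Parse once, serialize twice: one copy for the response, one for a drain."""
    response, drain = io.StringIO(), io.StringIO()
    tee = Tee(XMLGenerator(response, encoding="utf-8"),
              XMLGenerator(drain, encoding="utf-8"))
    xml.sax.parseString(xml_text.encode("utf-8"), tee)
    return response.getvalue(), drain.getvalue()
```

Each branch could just as well be a chain of filters ending in its own
serializer, which is the 'router with drains' shape from the list
above.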

-- 
<!-- Matt -->
<:->Get a smart net</:->
