Posted to dev@cocoon.apache.org by Allan Erskine <a....@cs.ucl.ac.uk> on 2002/01/21 00:08:23 UTC

RE: [UPDATE] Scheme/Cocoon progress - initial benchmarks show good speed!

(a long-time lurker takes a break from his studies and is BLOWN AWAY by
C2's progress!!!  Congratulations all!)

Ovidiu,

I was trying to get Schecoon to build (great name, BTW!), but the build
came unstuck trying to generate sisc.heap... I'm not a CVS expert; that's
probably why I'm suspicious of it - if sisc.heap is a serialised file,
would it not need to be flagged as text, so CVS could perform all the
appropriate conversions from *nix?

Best,
Allan

-----Original Message-----
From: ovidiu@cup.hp.com [mailto:ovidiu@cup.hp.com] 
Sent: 19 January 2002 07:07
To: cocoon-dev@xml.apache.org
Subject: [UPDATE] Scheme/Cocoon progress - initial benchmarks show good
speed!


Hi everybody,

This past week I've made good progress in integrating the Scheme
engine in the Cocoon system. Thanks to Sylvain and Berin, I was able
to rework in a nicer, and perhaps more extensible way, the hookup
between the Scheme sitemap and Cocoon. This allows me to have XML
sitemaps, for now very simple, which are parsed and interpreted in
Scheme.

Currently the XML sitemap file is parsed using the Java Parser
component and translated into a Scheme representation called SXML
(Scheme XML, see http://okmij.org/ftp/Scheme/xml.html). This
representation is then translated by a Scheme function into Scheme
code, which becomes the Scheme runtime representation of the
sitemap. On each request, this Scheme function, acting as the sitemap,
is executed to process the request, according to the sitemap
definition.
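
As a rough illustration of the flow described above (XML sitemap, then
an SXML-like tree, then a function that serves requests), here is a
minimal sketch. It is written in Python rather than Scheme, it is not
the actual Schecoon code, and the element names, attribute names and
helper functions (to_sxml, attrs, children, compile_sitemap) are all
assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified sitemap; the element and attribute
# names are illustrative, not the real Cocoon sitemap schema.
SITEMAP_XML = """
<pipeline>
  <match pattern="docs/index.html">
    <generate src="index.xml"/>
    <transform src="doc2html.xsl"/>
    <serialize type="html"/>
  </match>
</pipeline>
""".strip()


def to_sxml(elem):
    """Convert an element to an SXML-like nested list:
    [tag, ['@', [name, value], ...], child, ...]. Text nodes are ignored."""
    node = [elem.tag]
    if elem.attrib:
        node.append(['@'] + [[k, v] for k, v in elem.attrib.items()])
    node.extend(to_sxml(child) for child in elem)
    return node


def attrs(node):
    """Return the attributes of an SXML-like node as a dict."""
    for part in node[1:]:
        if isinstance(part, list) and part and part[0] == '@':
            return {k: v for k, v in part[1:]}
    return {}


def children(node):
    """Return the child element nodes, skipping the attribute list."""
    return [p for p in node[1:] if isinstance(p, list) and p and p[0] != '@']


def compile_sitemap(sxml):
    """Walk the tree once and return a function that handles requests."""
    routes = []
    for match in children(sxml):
        steps = [(c[0], attrs(c)) for c in children(match)]
        routes.append((attrs(match)['pattern'], steps))

    def sitemap(uri):
        # Executed on each request: find the first matching pipeline.
        for pattern, steps in routes:
            if uri == pattern:
                return steps
        return None

    return sitemap


sitemap = compile_sitemap(to_sxml(ET.fromstring(SITEMAP_XML)))
```

Calling sitemap("docs/index.html") returns the list of pipeline steps
for that URI; the XML is parsed and translated only once, and each
request just runs the resulting function.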

The only supported constructs right now are simple linear pipelines,
with a generator, one or more transformers, and a serializer, or
pipelines that have only a reader. At some point I want to write some
more generic code to allow for more complex sitemap syntax to be
described. This Scheme code will allow the sitemap syntax to be
described using a BNF like syntax, and allow semantic actions to be
attached to the BNF rules. With this code it should be very easy to
experiment with new syntaxes/semantics for the sitemap (provided
you're willing to describe them in Scheme ;-). If you think about
Cocoon's sitemap syntax as a mini-language, then the natural way of
describing, analyzing and processing it is through the usual
techniques of compiling. That's exactly what I plan to do, as
soon as I have clearer ideas on this.

The current Scheme sitemap implementation tries to do some basic
analysis of the correctness of the XML sitemap, and for now reports
any errors it encounters to standard output. This will change to
be better integrated with Cocoon's error reporting mechanisms.
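
To make the kind of check concrete, here is a small sketch of validating
the two pipeline shapes mentioned earlier (generator, one or more
transformers, serializer; or a reader on its own). It is in Python, not
Scheme, and the function name check_pipeline, the component kind names
and the error messages are assumptions for illustration only; the real
implementation may check different things.

```python
def check_pipeline(components):
    """Return a list of error messages for a pipeline given as a list of
    component kinds, e.g. ["generator", "transformer", "serializer"]."""
    errors = []

    # A pipeline consisting of only a reader is valid as-is.
    if components == ["reader"]:
        return errors
    if "reader" in components:
        errors.append("a reader must be the only component in its pipeline")

    if not components or components[0] != "generator":
        errors.append("pipeline must start with a generator")
    if not components or components[-1] != "serializer":
        errors.append("pipeline must end with a serializer")

    # Everything between the first and last component must be a transformer.
    middle = components[1:-1]
    if not middle:
        errors.append("pipeline needs at least one transformer")
    for kind in middle:
        if kind != "transformer":
            errors.append(
                f"unexpected {kind!r} between generator and serializer")
    return errors


# Report errors to standard output, as the current implementation does.
for message in check_pipeline(["generator", "serializer"]):
    print("sitemap error:", message)
```

Returning a list of messages rather than printing inside the checker
would make it easier to route the errors somewhere other than standard
output later on.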

With the new architecture it should be easy to hook up the Scheme
sitemap implementation with another sitemap implementation, like the
current one, or with Sylvain's TreeProcessor implementation. I will
however defer this for the moment as I'm eager to get to the meat of
the problem, playing with the continuations concept.

-- 

Benchmark

I've done a basic, very rough speed comparison between the Scheme
sitemap implementation and the compiled version. I used the Apache
'ab' program to send requests for a very small stylebook document,
processed through a simple pipeline (generator+XSLT+serializer). The
resulting page is about 2.5 KB.

The results are surprising: it appears the Scheme sitemap
implementation runs at the same speed as the compiled version!

The only explanation I have for this is that the Scheme implementation
uses its own URI matcher, based on Jakarta ORO, which is rumored to be
faster (at least in the simple usage I have) than Jakarta Regexp,
used by the "compiled" Cocoon. A bigger difference perhaps is the way
the parenthesized expressions in regular expression patterns are
interpreted. In the current compiled approach, a substitute() function
is called at runtime to replace them in the match pattern with the
actual values. In the Scheme implementation the parenthesis groups are
statically replaced at sitemap compile time, and they become function
arguments. The final expression is composed by doing a string append
of the pattern components and actual values. This trick saves
processing time at runtime, as the pattern does not have to be
traversed to find out where to place the values.
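
The difference between the two approaches can be sketched as follows.
This is an illustration in Python using its built-in re module, not
Jakarta ORO or the actual Cocoon/Scheme code; the {1} placeholder
syntax, the example patterns and the function names (substitute here is
a stand-in, precompile and expand are made up) are assumptions.

```python
import re

PLACEHOLDER = re.compile(r"\{(\d+)\}")


def substitute(template, groups):
    """Runtime approach: scan the template on every request and replace
    each {n} placeholder with the n-th matched group."""
    return PLACEHOLDER.sub(lambda m: groups[int(m.group(1)) - 1], template)


def precompile(template):
    """Compile-time approach: split the template once into literal
    strings and group indices, and return a function whose arguments are
    the matched groups. At request time it only concatenates strings."""
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(template):
        if m.start() > pos:
            parts.append(template[pos:m.start()])   # literal text
        parts.append(int(m.group(1)) - 1)           # index of a group
        pos = m.end()
    if pos < len(template):
        parts.append(template[pos:])

    def expand(*groups):
        return "".join(p if isinstance(p, str) else groups[p] for p in parts)

    return expand


matcher = re.compile(r"^docs/(.*)\.html$")
expand = precompile("xdocs/{1}.xml")          # done once, at sitemap compile time
groups = matcher.match("docs/index.html").groups()

print(substitute("xdocs/{1}.xml", groups))    # template scanned per request
print(expand(*groups))                        # plain string append per request
```

Both calls produce the same string; the second one does no pattern
traversal at request time because the positions of the placeholders
were worked out in advance.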

Other than this, everything else in the infrastructure is the same
between the two implementations. The difference is, of course, the way
the pipelines are set up: in the compiled case each pipeline is set up
by compiled code, running as fast as possible, while in the Scheme
case the pipeline setup is driven by the Scheme sitemap function,
which should be slower than the compiled version.

--

Future work

So far, I'm quite pleased with how things are progressing. Next week I'm
going to focus on implementing the infrastructure for using the
continuations from Scheme. I hope I'll be able to work out an example
in Scheme to drive the implementation. Once that's complete, I'm going
to focus on implementing a translator for the flow language to
Scheme.

For the flow language, I was thinking of naming it JWebFlow. I
originally thought of JavaFlow, or WebFlow, but they are both taken by
some other projects/commercial companies.

Any better ideas on a good name for the flow language? I'd like to
hear your comments on this one.


Best regards,
-- 
Ovidiu Predescu <ov...@cup.hp.com>
http://orion.nsr.hp.com/ (inside HP's firewall only)
http://sourceforge.net/users/ovidiu/ (my SourceForge page)
http://www.geocities.com/SiliconValley/Monitor/7464/ (GNU, Emacs, other
stuff)

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

