Posted to dev@cocoon.apache.org by Berin Loritsch <bl...@apache.org> on 2003/04/04 00:12:25 UTC
Re: Compiling XML, and its replacement

Stefano Mazzocchi wrote:
> Berin Loritsch wrote:
> 
>> BTW, My Binary XML project (http://d-haven.org/bxml) has the ability
>> to compile an XML document into a Java Class (I know this is nothing
>> extraordinary as XSP has been doing it for years).  However what is
>> very different from XSP is the following:
>>
>> 1) No Java file is ever written (it uses BCEL)
>> 2) No Class file *needs* to be written (although it is an option)
>> 3) The original document name and the line numbers are part of the
>>    generated class.
>>    * That means the Stack trace has debug information you can
>>      use.
> 
> 
> This has nothing to do with my rants about how stupid the XSLT syntax
> is, but I would be interested to see an actual performance comparison
> between your approach to compiled XML and mine (the one that Cocoon
> currently uses for the cache).

I understand.  However, if the language generates SAX events, it works
pretty well to compile the non-XML representation.

> This is in light of the compiled/interpreted question. I think it would 
> be kind of cool to have a serious benchmark between the two approaches, 
> because this would be very helpful on the recent discussion on XSP.

It would be.  However, the real performance gain I am after is in the
CallBack facility, which has only been partially defined and
implemented.

> I mean, the fact that you aren't generating the source code, well, 
> that's nice from an implementation perspective (I thought about making a 
> BCEL java assembler for XSP but then stopped because it was simply too 
> complex and not really needed now that we have the superb eclipse 
> compiler) but it's nothing different from XSP.

The major difference between what I have and XSP is that we actually
have line numbers that correspond to the original XML document :)
That makes debugging 100% easier.
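
To show what that buys you in a stack trace, here is a tiny sketch.  It
does not generate any bytecode; it only builds a stack frame by hand to
show how a frame prints when the class's debug information names the XML
document as its source file.  The class name, method name, file name and
line number are all made up for illustration.

```java
public class XmlFrameSketch {

    /**
     * Builds a frame as it would appear if a compiled document's debug
     * attributes pointed at the original XML file (hypothetical names).
     */
    static StackTraceElement frameFor(String xmlFile, int line) {
        // "CompiledPage" and "toSAX" are placeholder names, not the
        // names Binary XML actually generates.
        return new StackTraceElement("CompiledPage", "toSAX", xmlFile, line);
    }

    public static void main(String[] args) {
        RuntimeException e = new RuntimeException("bad element");
        // Replace the real trace with the illustrative frame.
        e.setStackTrace(new StackTraceElement[] { frameFor("page.xml", 12) });
        e.printStackTrace();
    }
}
```

The frame reads as CompiledPage.toSAX(page.xml:12), so the trace points
straight at the line of the XML document rather than at generated Java.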

> So, you are, in fact, unrolling a big loop of SAX events, while my XML
> compilation approach works by serializing the SAX events in an
> easy-to-parse-later binary format.

For now, yes.  Eventually everything gets called the same way--so it
would be interesting to see authoritatively whether HotSpot would have
any reason to perform some optimizations.
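
For anyone following along, the two approaches can be sketched roughly
as below.  This is only an illustration under my own assumptions: the
opcodes, the byte layout and the method names are invented here, and
neither Binary XML nor the Cocoon cache necessarily looks like this.
unrolled() stands in for a generated class (one straight-line call per
event); replay() is a small loop that re-reads a recorded byte stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class SaxReplaySketch {
    // Hypothetical opcodes for the recorded form.
    static final int START_DOC = 0, START_ELEM = 1, CHARS = 2, END_ELEM = 3, END_DOC = 4;

    /** "Unrolled" form: what a compiled document might boil down to. */
    static void unrolled(ContentHandler h) throws SAXException {
        char[] text = "hello".toCharArray();
        h.startDocument();
        h.startElement("", "greeting", "greeting", new AttributesImpl());
        h.characters(text, 0, text.length);
        h.endElement("", "greeting", "greeting");
        h.endDocument();
    }

    /** Records the same document as opcodes plus UTF payloads. */
    static byte[] record() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(START_DOC);
        out.writeByte(START_ELEM); out.writeUTF("greeting");
        out.writeByte(CHARS);      out.writeUTF("hello");
        out.writeByte(END_ELEM);   out.writeUTF("greeting");
        out.writeByte(END_DOC);
        out.flush();
        return bytes.toByteArray();
    }

    /** Replays the recorded bytes through one tight dispatch loop. */
    static void replay(byte[] compiled, ContentHandler h) throws IOException, SAXException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(compiled));
        AttributesImpl none = new AttributesImpl();
        while (in.available() > 0) {
            int op = in.readByte();
            switch (op) {
                case START_DOC: h.startDocument(); break;
                case START_ELEM: { String n = in.readUTF(); h.startElement("", n, n, none); break; }
                case CHARS: { char[] c = in.readUTF().toCharArray(); h.characters(c, 0, c.length); break; }
                case END_ELEM: { String n = in.readUTF(); h.endElement("", n, n); break; }
                case END_DOC: h.endDocument(); break;
                default: throw new IOException("unknown opcode " + op);
            }
        }
    }

    /** Collects events as strings so the two forms can be compared. */
    static class Trace extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        @Override public void startDocument() { events.add("startDocument"); }
        @Override public void startElement(String uri, String local, String qName, Attributes a) {
            events.add("start:" + qName);
        }
        @Override public void characters(char[] ch, int start, int len) {
            events.add("chars:" + new String(ch, start, len));
        }
        @Override public void endElement(String uri, String local, String qName) {
            events.add("end:" + qName);
        }
        @Override public void endDocument() { events.add("endDocument"); }
    }

    public static void main(String[] args) throws Exception {
        Trace a = new Trace();
        unrolled(a);
        Trace b = new Trace();
        replay(record(), b);
        System.out.println("same events: " + a.events.equals(b.events));
    }
}
```

Both forms push the same events into the handler; the open question is
only which shape the VM runs faster.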

> I would expect my approach to be faster than yours on modern virtual 
> machines because hotspot can optimize the tight binary SAX reparsing 
> code, while your approach will never reveal any hotspots.

There's only one way to find out.  To keep things as similar as
possible, both sources need to be compiled and in memory.  Then we
need to read the documents.
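
A minimal timing harness for that could look something like the sketch
below.  The Source interface, the handler and the iteration counts are
my own placeholders; the point is only that each candidate is already in
memory, gets warmed up first, and is then timed over many runs.  A
proper benchmark framework would be a better choice for real numbers.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.helpers.DefaultHandler;

public class ReplayBenchSketch {

    /** Hypothetical abstraction: anything that pushes SAX events into a handler. */
    interface Source {
        void emit(ContentHandler h) throws Exception;
    }

    /** Counts events so the work has an observable result. */
    static class CountingHandler extends DefaultHandler {
        long count;
        @Override public void startDocument() { count++; }
        @Override public void endDocument() { count++; }
    }

    /** Warms the source up, then returns the average nanoseconds per run. */
    static long nanosPerRun(Source s, int warmup, int runs) throws Exception {
        CountingHandler h = new CountingHandler();
        for (int i = 0; i < warmup; i++) {
            s.emit(h); // give the JIT a chance to compile the hot path
        }
        long t0 = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            s.emit(h);
        }
        long elapsed = System.nanoTime() - t0;
        if (h.count == 0) {
            throw new IllegalStateException("source emitted no events");
        }
        return elapsed / runs;
    }

    public static void main(String[] args) throws Exception {
        // Trivial stand-in; a real run would plug in the compiled class
        // and the binary replay loop as two separate Sources.
        Source trivial = h -> { h.startDocument(); h.endDocument(); };
        System.out.println("ns/run: " + nanosPerRun(trivial, 10_000, 100_000));
    }
}
```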

The interesting benchmarks are server load and scalability.  Just
because something is technically faster doesn't mean it scales well.

> I'll also be interested to see how different the performance gets on 
> hotspot server/client and how much it changes with several subsequent runs

Hmm. Anyone know of an XML parser benchmark out there?

> Too bad I don't have any time for this right now... but if you want to 
> do it, it would be *extremely* useful not only for you but also for the 
> future of compiled/interpreted stuff in java on the server side.

Problem is time.  It took me a year to find time for Binary XML, and
I only got it this far.  Who knows, I will probably get to it
eventually, but if someone has a benchmark that they already know about,
they could do the benchmark too.