Posted to users@cocoon.apache.org by Lars Huttar <la...@sil.org> on 2004/11/18 15:19:48 UTC

strategic importance of Cocoon?

Hi all,
I have an unexpected opportunity to speak to a computer conference of 
some 200-300 attendees about what Cocoon is and its strategic importance 
(or lack thereof) for our organization!
My slot is tomorrow morning at 8:30.

I will be looking in the usual places for material for this 
presentation, such as the wiki page for notes from GT2004. We will also 
demo one of our internal Cocoon apps.
If anybody has suggestions about Cocoon's unique selling points, or its 
current status (considering that some of the material out there about 
Cocoon is or may be obsolete), I would love to hear from you today!
Please email me directly. Feel free to CC this list too.

Thanks,
Lars



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org


Re: strategic importance of Cocoon?

Posted by Lars Huttar <la...@sil.org>.
Brad Kain wrote:

>Lars --
>
>We have been involved with both Cocoon and Lenya for almost two years.
>In addition to the well-understood advantages of the core architecture
>and its support for content transformation, I would make sure to
>emphasize Cocoon's use as an application framework. We have built a
>number of applications that use Cocoon to implement the core
>model-view-controller mechanism in a tiered architecture. For example,
>we are currently working on an ecommerce/travel web site that uses
>Cocoon. In this approach, Cocoon is really an alternative to something
>like the Jakarta Struts framework.
>
>Perhaps this is old news to you, but I see many articles or
>presentations on Cocoon that only describe its advantages for content
>transformation.
>
>I hope that this helps.
>
>Brad Kain
>
Thanks, that's helpful.
Lars




Re: Large XML transformations in Cocoon.

Posted by Upayavira <uv...@upaya.co.uk>.
Tom Bloomfield wrote:

> I'm planning to do xml -> text transformations (for tab-delimited 
> output) and xml -> FOP on large XML datasets.  The XML I will  be 
> processing will be 10-12 MB in size, and will grow from there. Based 
> on planning, the XSL will contain around 50 node traversals and will 
> iterate over my XML dataset around 46,000 times.  Previous to this, my 
> Cocoon transformations haven't been nearly this big.
>
> The amount of JVM memory I have to deal with is limited (<256M).  This 
> transformation will need to run in real-time.
> Does anyone have experience dealing with large datasets like this?

That sounds like quite a challenge. XSLT isn't that appropriate for that 
sort of thing. Firstly, in XSLT, avoid arbitrary wanders around your XML 
tree - stay as close to the context node as you can.
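
The point about not wandering around the tree can be shown outside XSLT. The sketch below is Python, not Cocoon or XSLT code, and the element names (data, dept, row) and both function names are made up for illustration. It contrasts re-scanning the whole document for every row with building an index in one pass. In XSLT 1.0 the comparable tool is xsl:key together with the key() function.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: each <row> refers to a <dept> by id.
DOC = """<data>
  <dept id="d1" name="Sales"/>
  <dept id="d2" name="Support"/>
  <row name="Ann" dept="d2"/>
  <row name="Bob" dept="d1"/>
  <row name="Cy" dept="d2"/>
</data>"""

root = ET.fromstring(DOC)


def slow_lookup(root):
    """Re-scan the tree for every row: cost grows with rows * depts."""
    out = []
    for row in root.iter("row"):
        dept = next(d for d in root.iter("dept")
                    if d.get("id") == row.get("dept"))
        out.append((row.get("name"), dept.get("name")))
    return out


def indexed_lookup(root):
    """Index the depts in one pass, then resolve each row with a dict lookup."""
    depts = {d.get("id"): d.get("name") for d in root.iter("dept")}
    return [(r.get("name"), depts[r.get("dept")]) for r in root.iter("row")]


print(indexed_lookup(root))
```

Both functions return the same pairs; only the amount of tree walking differs, which is what starts to matter at tens of thousands of rows.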

Alternatively, look at STX (there is an STX block). See if you can 
manage your transformations with that. This is "streaming" 
transformations for XML, i.e. it is designed for streaming, and thus 
should be able to handle large datasets.
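
As a rough illustration of what "streaming" buys you, here is a minimal Python sketch. It is not STX and does not use any Cocoon API; the rows/row/id/name element names and the rows_to_tsv function are assumptions made up for the example. It writes one tab-delimited line per row and discards each row once it has been written, so memory use stays roughly constant however large the input grows.

```python
import io
import xml.etree.ElementTree as ET


def rows_to_tsv(xml_stream, out, fields=("id", "name")):
    """Write one tab-delimited line per <row>; return the number of rows."""
    count = 0
    context = ET.iterparse(xml_stream, events=("start", "end"))
    _, root = next(context)          # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "row":
            values = (elem.findtext(f, default="") for f in fields)
            out.write("\t".join(values) + "\n")
            count += 1
            root.clear()             # drop finished rows so the tree never grows
    return count


# Tiny in-memory stand-in for a large XML source.
source = io.BytesIO(
    b"<rows><row><id>1</id><name>Ann</name></row>"
    b"<row><id>2</id><name>Bob</name></row></rows>"
)
sink = io.StringIO()
n = rows_to_tsv(source, sink)
```

The trade-off is the one a streaming approach always makes: each row can be handled only as it passes by, so anything that needs to look backwards or across rows has to be carried along explicitly.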

Regards, Upayavira




Re: Large XML transformations in Cocoon.

Posted by Bertrand Delacretaz <bd...@apache.org>.
Le 19 nov. 04, à 02:45, Tom Bloomfield a écrit :

> ...The XML I will  be processing will be 10-12 MB in size, and will 
> grow from there. Based on planning, the XSL will contain around 50 
> node traversals and will iterate over my XML dataset around 46,000 
> times....

You'll probably have a hard time doing this on a 256-MB system.

In such a case I'd ask myself if my problem is *so* hard as to require 
46'000 iterations over the XML dataset. Of course it depends on the 
kind of data you're processing, but this sounds very unusual.

-Bertrand

Re: Large XML transformations in Cocoon.

Posted by Miles Elam <mi...@pcextremist.com>.
Go right ahead.  Anything I write to this mailing list is fair 
game/public domain.

- Miles Elam


On Nov 20, 2004, at 7:26 AM, Upayavira wrote:

> Miles Elam wrote:
>
> Very useful piece. Would you mind if I put this on the wiki?
>
> Regards, Upayavira
>
>> As someone who has used STX, I can recommend it in this situation 
>> wholeheartedly.  STX looks very much like XSLT but uses a different 
>> namespace and doesn't have as many options for transformation.




Re: Large XML transformations in Cocoon.

Posted by Upayavira <uv...@upaya.co.uk>.
Miles Elam wrote:

Very useful piece. Would you mind if I put this on the wiki?

Regards, Upayavira

> As someone who has used STX, I can recommend it in this situation 
> wholeheartedly.  STX looks very much like XSLT but uses a different 
> namespace and doesn't have as many options for transformation.


Re: Large XML transformations in Cocoon.

Posted by Tom Bloomfield <to...@shopbloomfield.com>.
Miles,

Thanks for the tips.  I'll move forward on coding this using STX and 
post some benchmarking numbers when I finish. 

TB





Re: Large XML transformations in Cocoon.

Posted by Miles Elam <mi...@pcextremist.com>.
As someone who has used STX, I can recommend it in this situation 
wholeheartedly.  STX looks very much like XSLT but uses a different 
namespace and doesn't have as many options for transformation.

Unless something drastic has changed lately in the XSLT used by Cocoon, 
it uses a document table model (like a DOM but tailored toward a 
read-only view and a transformation source).  This is necessary because 
XSLT allows several passes over the same source document and also 
allows arbitrary access to any point in the tree (although this is 
usually quite inefficient).  So while XSLT is the preferred method for 
XML transformation in general, certain circumstances like yours would 
point toward alternatives.

As far as streaming XSLT results is concerned, it's possible to 
configure it this way at the expense of overall processing time.  But 
you don't appear to have the memory for even one full transformation 
let alone many at the same time.  STX is your best bet in my opinion.  
This always streams the output by its very nature.

Also, do NOT put this into a caching pipeline.  With such a large 
source, memory constraints will get worse before they get better.  
Reprocess each time (or pregenerate on intervals a la cron) to shift 
the weight from memory to CPU/disk in this case.
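
The "pregenerate on intervals" idea can be sketched as follows. This is a generic Python illustration, not Cocoon code: the pregenerate and render_report names and the report.txt file are hypothetical, and the scheduling itself (cron or otherwise) is left out. The output is written to a temporary file and then swapped into place, so a reader never sees a half-written result.

```python
import os
import tempfile


def pregenerate(render, target_path):
    """Render to a temp file next to the target, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            for chunk in render():           # render() yields text chunks
                tmp.write(chunk)
        os.replace(tmp_path, target_path)    # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp_path)                  # do not leave partial output behind
        raise


def render_report():
    """Hypothetical stand-in for the expensive transformation."""
    yield "id\tname\n"
    yield "1\tAnn\n"


out_dir = tempfile.mkdtemp()
report = os.path.join(out_dir, "report.txt")
pregenerate(render_report, report)
```

Serving the finished file costs disk reads rather than heap, which is the shift from memory to CPU/disk described above.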

Of course, a final option is to write your own custom Cocoon 
transformer, but I would recommend the STX route as it would likely be 
almost as fast and a whole lot more flexible and maintainable in the 
long run.

- Miles Elam


On Nov 19, 2004, at 7:07 AM, Tom Bloomfield wrote:

> The number of iterations corresponds to the number of rows returned 
> from the database.  There are roughly 46,000 rows present now, so I 
> need at least that many rows in my display.  The XSL design enables me 
> to use SAX which should help.  The easiest thing would be to limit the 
> number of rows returned to something more reasonable like 10,000 (or 
> up the JVM memory :P), but this is the requirement I'm stuck with.
>
> Help me understand this: If I apply a transformation using XSLT, 
> streaming the XML in, does Cocoon "stream" the results out?  I.e., does 
> the entire transformation happen in memory and then get flushed to the 
> client, or does Cocoon flush the buffer to the client as xxx bytes are 
> filled?  I made an assumption that Cocoon does this automatically.




Re: Large XML transformations in Cocoon.

Posted by Tom Bloomfield <to...@shopbloomfield.com>.
Upayavira, thanks for the heads up about STX.  I'll check out Joost 
later today.

The number of iterations corresponds to the number of rows returned from 
the database.  There are roughly 46,000 rows present now, so I need at 
least that many rows in my display.  The XSL design enables me to use 
SAX which should help.  The easiest thing would be to limit the number 
of rows returned to something more reasonable like 10,000 (or up the JVM 
memory :P), but this is the requirement I'm stuck with.

Help me understand this: If I apply a transformation using XSLT, 
streaming the XML in, does Cocoon "stream" the results out?  I.e., does 
the entire transformation happen in memory and then get flushed to the 
client, or does Cocoon flush the buffer to the client as xxx bytes are 
filled?  I made an assumption that Cocoon does this automatically.
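
The two behaviours being asked about can be pictured with a small Python sketch. It only illustrates the difference and says nothing about what Cocoon actually does; the chunked function and the sample rows are invented for the example. Output is handed on as soon as a buffer of a given size fills, instead of being held until the whole result exists.

```python
def chunked(lines, chunk_size):
    """Yield a chunk whenever the buffer reaches chunk_size bytes, then the rest."""
    buf = bytearray()
    for line in lines:
        buf += line
        if len(buf) >= chunk_size:
            yield bytes(buf)         # hand this part to the client now
            buf.clear()
    if buf:
        yield bytes(buf)             # whatever is left at the end


lines = [b"row %d\n" % i for i in range(5)]    # five 6-byte lines
chunks = list(chunked(lines, chunk_size=16))
everything_at_once = b"".join(lines)           # the "all in memory, then flush" case
```

With incremental flushing the peak buffer is about chunk_size bytes; with the all-at-once approach it is the size of the full result.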

If anyone else has any suggestions, please let me know. 
TIA,
Tom



Bertrand Delacretaz wrote:

>
> Le 19 nov. 04, à 02:45, Tom Bloomfield a écrit :
>
>> ...The XML I will  be processing will be 10-12 MB in size, and will 
>> grow from there. Based on planning, the XSL will contain around 50 
>> node traversals and will iterate over my XML dataset around 46,000 
>> times....
>
>
> You'll probably have a hard time doing this on a 256-MB system.
>
> In such a case I'd ask myself if my problem is *so* hard as to require 
> 46'000 iterations over the XML dataset. Of course it depends on the 
> kind of data you're processing, but this sounds very unusual.
>
> -Bertrand
>




Large XML transformations in Cocoon.

Posted by Tom Bloomfield <to...@shopbloomfield.com>.
I'm planning to do xml -> text transformations (for tab-delimited 
output) and xml -> FOP on large XML datasets.  The XML I will be 
processing will be 10-12 MB in size, and will grow from there. Based on 
planning, the XSL will contain around 50 node traversals and will 
iterate over my XML dataset around 46,000 times.  Previous to this, my 
Cocoon transformations haven't been nearly this big.

The amount of JVM memory I have to deal with is limited (<256M).  This 
transformation will need to run in real-time. 

Does anyone have experience dealing with large datasets like this?

TIA,
Tom


Re: strategic importance of Cocoon?

Posted by Brad Kain <br...@quoininc.com>.
Lars --

We have been involved with both Cocoon and Lenya for almost two years.
In addition to the well-understood advantages of the core architecture
and its support for content transformation, I would make sure to
emphasize Cocoon's use as an application framework. We have built a
number of applications that use Cocoon to implement the core
model-view-controller mechanism in a tiered architecture. For example,
we are currently working on an ecommerce/travel web site that uses
Cocoon. In this approach, Cocoon is really an alternative to something
like the Jakarta Struts framework.

Perhaps this is old news to you, but I see many articles or
presentations on Cocoon that only describe its advantages for content
transformation.

I hope that this helps.

Brad Kain

Quoting Lars Huttar <la...@sil.org>:

> Andreas Kuckartz wrote:
> 
> >Do not forget to mention Apache Lenya and Apache Forrest (both are top-level
> >Apache projects which are using Cocoon):
> >http://lenya.apache.org/
> >http://forrest.apache.org/
> >
> >Cheers,
> >Andreas
> >
> >
> >
> Thank you.
> Lars
> 


-------------------------------------------------------------
Brad Kain
President
                       QUOIN Inc.
  "Your Complex Challenges. Our Innovative Solutions."

Boston Solutions Center                tel: 617.357.5233
210 South Street, Suite 1000B          fax: 617.357.5234
Boston, Massachusetts 02111            tel: 866.357.J2EE
                 email: brad.kain@QuoinInc.com
                mobile: 617.429.4516
                   web: http://www.quoininc.com
-------------------------------------------------------------




Re: strategic importance of Cocoon?

Posted by Lars Huttar <la...@sil.org>.
Andreas Kuckartz wrote:

>Do not forget to mention Apache Lenya and Apache Forrest (both are top-level
>Apache projects which are using Cocoon):
>http://lenya.apache.org/
>http://forrest.apache.org/
>
>Cheers,
>Andreas
>
>  
>
Thank you.
Lars
