Posted to users@cocoon.apache.org by Martin Lüthi <an...@tnoo.net> on 2002/09/05 10:37:10 UTC

Efficient aggregation

Dear Cocooners

I wonder what would be the most efficient aggregation of many small xml
documents in a directory. My present approach is very slow:

1. use directory-generator to make an index of documents
2. generate a document containing xincludes with help of a stylesheet
3. perform all xincludes
4. serialize

Thank you for any suggestions

Martin

-- 
Martin Lüthi                 answer@tnoo.net
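
For orientation, the four steps listed above can be sketched outside Cocoon in plain Java to show what the aggregation amounts to. This is only an illustration under stated assumptions, not Cocoon code: the class name DirectoryAggregationSketch, the wrapper element name "aggregate", and the "*.xml" glob are made up for the example, and it uses the JDK's DOM parser instead of the DirectoryGenerator and XInclude.

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper: merges every *.xml file of one directory into one document. */
final class DirectoryAggregationSketch {

    static Document aggregate(Path dir) throws Exception {
        // Step 1: index the directory (here: a sorted list of the *.xml files).
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.xml")) {
            for (Path file : stream) {
                files.add(file);
            }
        }
        Collections.sort(files);

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Step 2: create the wrapper document ("aggregate" is an assumed name).
        Document result = builder.newDocument();
        Element root = result.createElement("aggregate");
        result.appendChild(root);

        // Step 3: pull in each file's root element, as an include would.
        for (Path file : files) {
            Document part = builder.parse(file.toFile());
            root.appendChild(result.importNode(part.getDocumentElement(), true));
        }
        // Step 4 (serialization) is left to the caller.
        return result;
    }
}
```

Every call re-reads and re-parses all files, which is the cost the thread is about: without caching, the work grows with the number of documents on each request.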




---------------------------------------------------------------------
Please check that your question  has not already been answered in the
FAQ before posting.     <http://xml.apache.org/cocoon/faq/index.html>

To unsubscribe, e-mail:     <co...@xml.apache.org>
For additional commands, e-mail:   <co...@xml.apache.org>


Re: Efficient aggregation

Posted by Bertrand Delacretaz <bd...@codeconsult.ch>.
On Thursday 05 September 2002 11:21, Martin Lüthi wrote:
> I
> want to aggregate all files in a directory, which is not possible with
> <map:aggregate> afaik.

You'll have to use CInclude or XInclude for this, along with a 
DirectoryGenerator to get the list of files.

-- 
 Bertrand Delacrétaz (codeconsult.ch, jfor.org)

 buzzwords: XML, java, XSLT, cocoon, mentoring/teaching/coding.
 disclaimer: eternity is very long. mostly towards the end. get ready.








Re: Efficient aggregation

Posted by Martin Lüthi <an...@tnoo.net>.
"u15603" <u1...@hs-harz.de> writes:

> have you ever used <map:aggregate> inside the sitemap? It is easy to
> handle!

Well, with <map:aggregate> you have to know the names of the files, but I
want to aggregate all files in a directory, which is not possible with
<map:aggregate> afaik.

Thanks

Martin

-- 
Martin Lüthi                 answer@tnoo.net





Re: Efficient aggregation

Posted by u15603 <u1...@hs-harz.de>.
Hello Martin,

have you ever used <map:aggregate> inside the sitemap? It is easy
to handle!

dirk

>Dear Cocooners
>
>I wonder what would be the most efficient aggregation of many small xml
>documents in a directory. My present approach is very slow:
>
>1. use directory-generator to make an index of documents
>2. generate a document containing xincludes with help of a stylesheet
>3. perform all xincludes
>4. serialize
>
>Thank you for any suggestions
>
>Martin
>
>  
>






Re: Efficient aggregation

Posted by Bruno Dumon <br...@outerthought.org>.
On Thu, 2002-09-05 at 10:37, Martin Lüthi wrote:
> Dear Cocooners
> 
> I wonder what would be the most efficient aggregation of many small xml
> documents in a directory. My present approach is very slow:
> 
> 1. use directory-generator to make an index of documents
> 2. generate a document containing xincludes with help of a stylesheet
> 3. perform all xincludes
> 4. serialize
> 

You first need to find out why it is taking so long. Cocoon's profiler
can help you here.

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
bruno@outerthought.org




Re: Expiration Attribute in Pipeline & Efficient aggregation

Posted by Vadim Gritsenko <va...@verizon.net>.
Ivelin Ivanov wrote:

>AFAIK,
>If you front Cocoon with Apache (ProxyCache enabled) or if any other proxy
>server is between your browser and Cocoon, then the expires attribute is of
>significant help.
>Only I don't remember the syntax when used in the pipeline.
>Can someone point us to a document?
>

IIRC, the syntax is said to be similar to Apache's mod_expires. Examples:
    now
    now plus 10 minutes
    access
    access plus 2 years

See AbstractProcessingPipeline.java.parseExpires().

Vadim
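
To make the expression format above concrete, here is a minimal Java sketch that turns strings such as "now plus 10 minutes" into a millisecond offset. It is not Cocoon's parseExpires() and may differ from it in detail: the class ExpiresSketch, the method parseOffsetMillis, the optional "plus" keyword, and the fixed 30-day month and 365-day year are all assumptions made for the example.

```java
import java.util.Locale;

/** Hypothetical parser for mod_expires-style expressions; not Cocoon's implementation. */
final class ExpiresSketch {

    /** Returns the offset in milliseconds that the expression adds to its base time. */
    static long parseOffsetMillis(String expression) {
        String[] tokens = expression.trim().toLowerCase(Locale.ROOT).split("\\s+");
        // The expression has to start with one of the two base keywords.
        if (!(tokens[0].equals("now") || tokens[0].equals("access"))) {
            throw new IllegalArgumentException("must start with 'now' or 'access': " + expression);
        }
        int i = 1;
        if (i < tokens.length && tokens[i].equals("plus")) {
            i++; // the keyword "plus" is treated as optional in this sketch
        }
        if ((tokens.length - i) % 2 != 0) {
            throw new IllegalArgumentException("expected <number> <unit> pairs: " + expression);
        }
        long total = 0;
        for (; i < tokens.length; i += 2) {
            total += Long.parseLong(tokens[i]) * unitMillis(tokens[i + 1]);
        }
        return total;
    }

    /** Maps a singular or plural unit name to milliseconds. */
    private static long unitMillis(String unit) {
        String u = unit.endsWith("s") ? unit.substring(0, unit.length() - 1) : unit;
        switch (u) {
            case "second": return 1_000L;
            case "minute": return 60_000L;
            case "hour":   return 3_600_000L;
            case "day":    return 86_400_000L;
            case "week":   return 7 * 86_400_000L;
            case "month":  return 30 * 86_400_000L;  // approximation chosen for the sketch
            case "year":   return 365 * 86_400_000L; // approximation chosen for the sketch
            default: throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }
}
```

With this sketch, ExpiresSketch.parseOffsetMillis("now plus 10 minutes") yields 600000, and a bare "now" or "access" yields 0.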


>However, if the browser is hitting Cocoon directly, then Vadim is right. It
>would make sense to allow a cache timeout parameter in the
>pipeline for the dirGenerator.
>
>
>----- Original Message -----
>From: "Vadim Gritsenko" <va...@verizon.net>
>To: <co...@xml.apache.org>
>Sent: Friday, September 06, 2002 8:01 AM
>Subject: Re: Expiration Attribute in Pipeline & Efficient aggregation
>
>
>Ivelin Ivanov wrote:
>
>  
>
>>Cocoon 2.1 supports an extra attribute in the pipeline,
>>which specifies the expiration header in the http response.
>>
>>    
>>
>
>It serves a different purpose; other clients won't get the cached result, and
>a refresh also won't get the cached result.
>
>Martin, you still need to extend the generator to make it cacheable, with a
>delta timestamp (simplest approach) or any other way.
>
>Vadim
>
>
>
>  
>
>>This should allow temporary caching of the result.
>>I am not sure where this is documented though.
>>
>>
>>----- Original Message -----
>>From: "Martin Lüthi" <an...@tnoo.net>
>>To: <co...@xml.apache.org>
>>Sent: Thursday, September 05, 2002 8:47 AM
>>Subject: Re: Efficient aggregation
>>
>>
>>Thank you for your hint. I just tried out XPathDirectoryGenerator
>>(scratchpad)
>>which essentially does what I need, but is a lot less messy than my initial
>>approach. However, also these results seem not to get cached... Presumably I
>>should save the result with something like a SourceWritingTransformer, and
>>only rebuild the file after an explicit request.
>>
>>
>>Martin
>>
>>Nick Airey <ni...@ihavemoved.com> writes:
>>
>>
>>
>>    
>>
>>>After 1 minute of looking, it seems that the DirectoryGenerator is not
>>>cacheable.
>>>
>>>So it is going to re-read the directory every time you hit the pipeline.
>>>Your Xincluded pieces might be cacheable, however. For instance, the
>>>FileGenerator *is* cacheable (if you are using it).
>>>
>>>
>>>If you can live with refreshing the cached directory every x seconds (or
>>>milliseconds), and you can write some Java, you could extend the
>>>DirectoryGenerator to make a "caching directory generator", by
>>>implementing interface Cacheable and implementing generateKey() and
>>>generateValidity().  The generateValidity() method would return a
>>>DeltaTimeCacheValidity instance set to the caching time.
>>>
>>>
>>>Regs,
>>>Nick.
>>>
>>>
>>>      
>>>
>>--
>>Martin Lüthi                 answer@tnoo.net
>>
>>    
>>






Re: Expiration Attribute in Pipeline & Efficient aggregation

Posted by Ivelin Ivanov <iv...@apache.org>.
AFAIK,
If you front Cocoon with Apache (ProxyCache enabled) or if any other proxy
server is between your browser and Cocoon, then the expires attribute is of
significant help.
Only I don't remember the syntax when used in the pipeline.
Can someone point us to a document?

However, if the browser is hitting Cocoon directly, then Vadim is right. It
would make sense to allow a cache timeout parameter in the
pipeline for the dirGenerator.


----- Original Message -----
From: "Vadim Gritsenko" <va...@verizon.net>
To: <co...@xml.apache.org>
Sent: Friday, September 06, 2002 8:01 AM
Subject: Re: Expiration Attribute in Pipeline & Efficient aggregation


Ivelin Ivanov wrote:

>Cocoon 2.1 supports an extra attribute in the pipeline,
>which specifies the expiration header in the http response.
>

It serves a different purpose; other clients won't get the cached result, and
a refresh also won't get the cached result.

Martin, you still need to extend the generator to make it cacheable, with a
delta timestamp (simplest approach) or any other way.

Vadim



>This should allow temporary caching of the result.
>I am not sure where this is documented though.
>
>
>----- Original Message -----
>From: "Martin Lüthi" <an...@tnoo.net>
>To: <co...@xml.apache.org>
>Sent: Thursday, September 05, 2002 8:47 AM
>Subject: Re: Efficient aggregation
>
>
>Thank you for your hint. I just tried out XPathDirectoryGenerator
>(scratchpad)
>which essentially does what I need, but is a lot less messy than my initial
>approach. However, also these results seem not to get cached... Presumably I
>should save the result with something like a SourceWritingTransformer, and
>only rebuild the file after an explicit request.
>
>
>Martin
>
>Nick Airey <ni...@ihavemoved.com> writes:
>
>
>
>>After 1 minute of looking, it seems that the DirectoryGenerator is not
>>cacheable.
>>
>>So it is going to re-read the directory every time you hit the pipeline.
>>Your Xincluded pieces might be cacheable, however. For instance, the
>>FileGenerator *is* cacheable (if you are using it).
>>
>>
>>If you can live with refreshing the cached directory every x seconds (or
>>milliseconds), and you can write some Java, you could extend the
>>DirectoryGenerator to make a "caching directory generator", by
>>implementing interface Cacheable and implementing generateKey() and
>>generateValidity().  The generateValidity() method would return a
>>DeltaTimeCacheValidity instance set to the caching time.
>>
>>
>>Regs,
>>Nick.
>>
>>
>
>--
>Martin Lüthi                 answer@tnoo.net
>
>
>






Re: Expiration Attribute in Pipeline & Efficient aggregation

Posted by Vadim Gritsenko <va...@verizon.net>.
Ivelin Ivanov wrote:

>Cocoon 2.1 supports an extra attribute in the pipeline,
>which specifies the expiration header in the http response.
>

It serves a different purpose; other clients won't get the cached result, and
a refresh also won't get the cached result.

Martin, you still need to extend the generator to make it cacheable, with a
delta timestamp (simplest approach) or any other way.

Vadim



>This should allow temporary caching of the result.
>I am not sure where this is documented though.
>
>
>----- Original Message -----
>From: "Martin Lüthi" <an...@tnoo.net>
>To: <co...@xml.apache.org>
>Sent: Thursday, September 05, 2002 8:47 AM
>Subject: Re: Efficient aggregation
>
>
>Thank you for your hint. I just tried out XPathDirectoryGenerator
>(scratchpad)
>which essentially does what I need, but is a lot less messy than my initial
>approach. However, also these results seem not to get cached... Presumably I
>should save the result with something like a SourceWritingTransformer, and
>only rebuild the file after an explicit request.
>
>
>Martin
>
>Nick Airey <ni...@ihavemoved.com> writes:
>
>  
>
>>After 1 minute of looking, it seems that the DirectoryGenerator is not
>>cacheable.
>>
>>So it is going to re-read the directory every time you hit the pipeline.
>>Your Xincluded pieces might be cacheable, however. For instance, the
>>FileGenerator *is* cacheable (if you are using it).
>>
>>
>>If you can live with refreshing the cached directory every x seconds (or
>>milliseconds), and you can write some Java, you could extend the
>>DirectoryGenerator to make a "caching directory generator", by
>>implementing interface Cacheable and implementing generateKey() and
>>generateValidity().  The generateValidity() method would return a
>>DeltaTimeCacheValidity instance set to the caching time.
>>
>>
>>Regs,
>>Nick.
>>    
>>
>
>--
>Martin Lüthi                 answer@tnoo.net
>
>  
>






Re: Expiration Attribute in Pipeline & Efficient aggregation

Posted by Martin Lüthi <an...@tnoo.net>.
Ivelin

Thank you for your hint. Do you mean something like

<map:generate type="xpathdirectory" src="docs/myfiles#/Body">
   <map:parameter name="expires" value="1000000"/>
</map:generate>

This seems not to be cached and the directory listing gets created every
time. 

Martin

"Ivelin Ivanov" <iv...@apache.org> writes:

> Cocoon 2.1 supports an extra attribute in the pipeline,
> which specifies the expiration header in the http response.
> This should allow temporary caching of the result.
> I am not sure where this is documented though.

-- 
Martin Lüthi                 answer@tnoo.net





Re: Expiration Attribute in Pipeline & Efficient aggregation

Posted by Ivelin Ivanov <iv...@apache.org>.
Cocoon 2.1 supports an extra attribute in the pipeline,
which specifies the expiration header in the http response.
This should allow temporary caching of the result.
I am not sure where this is documented though.


----- Original Message -----
From: "Martin Lüthi" <an...@tnoo.net>
To: <co...@xml.apache.org>
Sent: Thursday, September 05, 2002 8:47 AM
Subject: Re: Efficient aggregation


Thank you for your hint. I just tried out XPathDirectoryGenerator
(scratchpad)
which essentially does what I need, but is a lot less messy than my initial
approach. However, also these results seem not to get cached... Presumably I
should save the result with something like a SourceWritingTransformer, and
only rebuild the file after an explicit request.


Martin

Nick Airey <ni...@ihavemoved.com> writes:

> After 1 minute of looking, it seems that the DirectoryGenerator is not
> cacheable.
>
> So it is going to re-read the directory every time you hit the pipeline.
> Your Xincluded pieces might be cacheable, however. For instance, the
> FileGenerator *is* cacheable (if you are using it).
>
>
> If you can live with refreshing the cached directory every x seconds (or
> milliseconds), and you can write some Java, you could extend the
> DirectoryGenerator to make a "caching directory generator", by
> implementing interface Cacheable and implementing generateKey() and
> generateValidity().  The generateValidity() method would return a
> DeltaTimeCacheValidity instance set to the caching time.
>
>
> Regs,
> Nick.

--
Martin Lüthi                 answer@tnoo.net





Re: Efficient aggregation

Posted by Martin Lüthi <an...@tnoo.net>.
Thank you for your hint. I just tried out XPathDirectoryGenerator (scratchpad)
which essentially does what I need, but is a lot less messy than my initial
approach. However, also these results seem not to get cached... Presumably I
should save the result with something like a SourceWritingTransformer, and
only rebuild the file after an explicit request.


Martin

Nick Airey <ni...@ihavemoved.com> writes:

> After 1 minute of looking, it seems that the DirectoryGenerator is not
> cacheable. 
> 
> So it is going to re-read the directory every time you hit the pipeline.
> Your Xincluded pieces might be cacheable, however. For instance, the
> FileGenerator *is* cacheable (if you are using it).
> 
> 
> If you can live with refreshing the cached directory every x seconds (or
> milliseconds), and you can write some Java, you could extend the
> DirectoryGenerator to make a "caching directory generator", by
> implementing interface Cacheable and implementing generateKey() and
> generateValidity().  The generateValidity() method would return a
> DeltaTimeCacheValidity instance set to the caching time. 
> 
> 
> Regs,
> Nick.

-- 
Martin Lüthi                 answer@tnoo.net





Re: Efficient aggregation

Posted by Nick Airey <ni...@ihavemoved.com>.
After 1 minute of looking, it seems that the DirectoryGenerator is not
cacheable. 

So it is going to re-read the directory every time you hit the pipeline.
Your Xincluded pieces might be cacheable, however. For instance, the
FileGenerator *is* cacheable (if you are using it).


If you can live with refreshing the cached directory every x seconds (or
milliseconds), and you can write some Java, you could extend the
DirectoryGenerator to make a "caching directory generator", by
implementing interface Cacheable and implementing generateKey() and
generateValidity().  The generateValidity() method would return a
DeltaTimeCacheValidity instance set to the caching time. 


Regs,
Nick.
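
The time-based validity idea described above can be sketched in self-contained Java without touching Cocoon's classes. The sketch below does not implement Cacheable, generateKey(), or generateValidity(), and it is not DeltaTimeCacheValidity; the class DeltaTimeCacheSketch and its constructor parameters are hypothetical stand-ins that only show the rule "reuse the cached listing until it is older than the caching time".

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical stand-in: caches one value and reloads it once it is too old. */
final class DeltaTimeCacheSketch<T> {
    private final long maxAgeMillis;  // the "caching time" from the message above
    private final LongSupplier clock; // injected so the behaviour can be tested
    private final Supplier<T> loader; // e.g. something that reads the directory
    private T cached;
    private long loadedAt;
    private boolean loaded;

    DeltaTimeCacheSketch(long maxAgeMillis, LongSupplier clock, Supplier<T> loader) {
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
        this.loader = loader;
    }

    /** Returns the cached value, reloading it when maxAgeMillis has elapsed. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= maxAgeMillis) {
            cached = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return cached;
    }
}
```

In production code the clock would typically be System::currentTimeMillis. The trade-off is the one stated in the message: files added or removed within the caching time are not seen until the next reload.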



On Thu, 2002-09-05 at 09:37, answer@tnoo.net wrote:
> Dear Cocooners
> 
> I wonder what would be the most efficient aggregation of many small xml
> documents in a directory. My present approach is very slow:
> 
> 1. use directory-generator to make an index of documents
> 2. generate a document containing xincludes with help of a stylesheet
> 3. perform all xincludes
> 4. serialize
> 
> Thank you for any suggestions
> 
> Martin
> 
> -- 
> Martin Lüthi                 answer@tnoo.net
> 
> 
> 
> 