Posted to dev@cocoon.apache.org by Antonio Fiol Bonnín <an...@gmail.com> on 2004/09/09 08:38:07 UTC

Custom extensions - to be made available if possible

Hello,

We have started developing two extensions for Cocoon, and we would
like to know if the core team would be interested in getting them into
the trunk, and optionally in maintaining them in the future.

The extensions are:

- A transformer that connects via HTTP POST and sends its XML input to
the server, and returns the XML returned from the server to the
pipeline.

This is similar to the SOAP thing, but without the envelope, and with
a predefined (configured in the sitemap) URL.


- An extension to the Cocoon Lucene searching system (or something
different, yet pending design), so that non-XML content can also be
indexed. In particular, we are interested in PDF, but we are designing
it to be as generic as possible.

BTW, your opinion may be very valuable for the design. Let me explain
the two approaches we have thought of:

a) Refactoring SimpleLuceneXMLIndexerImpl so that its private method
indexDocument is not private, and taking it to an external component.

b) Creating a PDFGenerator (in the Cocoon sense of generator, of course).

Option (a) seems to be giving us more headaches than pleasure, and
option (b) seems cleaner to a certain point. Option (b) would allow
following links in the PDF file, if developed to that point.

However, option (b) implies choosing a format for its output (which?),
and also poses some problems wrt. the sitemap. Until now, we have a
pipeline using a reader to read pdf files (static, from disk). And we
would need a generator to be invoked instead for the content and links
views. How can we do that? Maybe with a selector? But that does not
seem very clean. Any hints there?
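
To make the output-format question for option (b) concrete, here is a
minimal Java sketch of one possible shape, offered only as an assumption:
a "document" element with one "page" element per page. The element names,
the class name PdfTextToSax and the idea that some PDF text extractor has
already produced one string per page are all hypothetical; nothing here
reads a real PDF. It only shows the SAX events such a generator might
push down the pipeline, with the JDK's own serializer standing in for
Cocoon so the result can be printed.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

class PdfTextToSax {

    /**
     * Emits one <page number="n"> element per extracted page, wrapped in a
     * <document> root. The format is a made-up example, not a Cocoon standard.
     */
    static void emit(List<String> pages, ContentHandler handler) throws SAXException {
        handler.startDocument();
        handler.startElement("", "document", "document", new AttributesImpl());
        int number = 1;
        for (String text : pages) {
            AttributesImpl attrs = new AttributesImpl();
            attrs.addAttribute("", "number", "number", "CDATA", Integer.toString(number++));
            handler.startElement("", "page", "page", attrs);
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endElement("", "page", "page");
        }
        handler.endElement("", "document", "document");
        handler.endDocument();
    }

    /** Serializes the SAX events to a string so the sketch can be inspected. */
    static String toXml(List<String> pages) throws Exception {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));
        emit(pages, serializer);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for text that a PDF extraction library would return.
        System.out.println(toXml(Arrays.asList("First page text", "Second & last")));
    }
}
```

With a format along these lines, the content view could index the text of
the page elements; link elements would have to be added for the links
view.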

Any other options?

Any general comments?

What about getting these into the trunk once they are tested?

Yours sincerely,


Antonio Fiol

Re: Custom extensions - to be made available if possible

Posted by Daniel Fagerstrom <da...@nada.kth.se>.
Antonio Fiol Bonnín wrote:

>>I have implemented something like that, see:
>>http://issues.apache.org/bugzilla/show_bug.cgi?id=24402. It is not yet
>>part of Cocoon as we have differing opinions about the design as you can
>>see in the Bugzilla entry and also in the thread:
>>http://marc.theaimsgroup.com/?t=109085126300004&r=1&w=2.
>>    
>>
>
>Thank you Daniel.
>
>My approach was, in fact, much simpler.
>
>It is a very simple transformer.
>You can configure (in the sitemap, src attribute for the transform
>element) the URL to post to.
>Everything in the input is posted.
>The response is piped back "alone", without anything around.
>
>Of course, your approach is probably much better.
>
I'm not so certain that my approach is better. When I designed the
stuff, my first idea was to write something like what you describe. The
advantage of my approach is that you can embed the web service call in
an XML document that contains other stuff, and that you can make several
web service calls at once. The disadvantage is that you have to embed
your message to the web service in XML tags and strip off XML tags from
the output. For many use cases you don't need the flexibility you get
from my approach; it just complicates things. If your input and/or output
documents from the web service are large, it can also be quite costly to
add and strip away tags.

So, in conclusion, I think that your approach would make a useful
addition to Cocoon. If you use the "postable" http source that I
submitted in the Bugzilla entry, it would be quite easy to write (read
the thread cited above first for other opinions, however).

Another thing that might be worth mentioning is that if you need more
detailed error control or need to make some decisions based on the output,
it is often better to call the web service from a flowscript (you can
find some example code in the Bugzilla entry).

/Daniel



Re: Custom extensions - to be made available if possible

Posted by Antonio Fiol Bonnín <an...@gmail.com>.
> I have implemented something like that, see:
> http://issues.apache.org/bugzilla/show_bug.cgi?id=24402. It is not yet
> part of Cocoon as we have differing opinions about the design as you can
> see in the Bugzilla entry and also in the thread:
> http://marc.theaimsgroup.com/?t=109085126300004&r=1&w=2.

Thank you Daniel.

My approach was, in fact, much simpler.

It is a very simple transformer.
You can configure (in the sitemap, src attribute for the transform
element) the URL to post to.
Everything in the input is posted.
The response is piped back "alone", without anything around.
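
As a rough illustration of the idea (not the actual transformer code),
the core of it could look like the following Java sketch. It assumes
plain java.net.HttpURLConnection; the class name XmlPostSketch, the
helper names and the local echo server standing in for the remote
service are all made up for the example. A real Cocoon transformer
would additionally have to buffer the incoming SAX events into the
request body and parse the response back into SAX events.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class XmlPostSketch {

    /** POSTs the XML to the configured URL and returns the response body "alone". */
    static String postXml(URL url, String xml) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            throw new IOException("Unexpected HTTP status " + status);
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(readAll(in), StandardCharsets.UTF_8);
        }
    }

    /** Reads a stream to the end. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    /** Starts a throwaway local server (a stand-in for the real service) and posts to it. */
    static String demo() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/echo", exchange -> {
            byte[] body = readAll(exchange.getRequestBody());
            byte[] reply = ("<reply>" + new String(body, StandardCharsets.UTF_8) + "</reply>")
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/xml; charset=UTF-8");
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        });
        server.start();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/echo");
            return postXml(url, "<doc>hello</doc>");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

In the sitemap, the URL passed to postXml would come from the src
attribute of the transform element.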

Of course, your approach is probably much better.

Yours,


Antonio Fiol

Re: Custom extensions - to be made available if possible

Posted by Daniel Fagerstrom <da...@nada.kth.se>.
Antonio Fiol Bonnín wrote:

>Hello,
>
>We have started developing two extensions for cocoon, and we would
>like to know if the core team would be interested in getting them into
>the trunk, and optionally in maintaining them in the future.
>
>The extensions are:
>
>- A transformer that connects via HTTP POST and sends its XML input to
>the server, and returns the XML returned from the server to the
>pipeline.
>
>This is similar to the SOAP thing, but without the envelope, and with
>a predefined (configured in the sitemap) URL.
>
<snip/>

I have implemented something like that, see: 
http://issues.apache.org/bugzilla/show_bug.cgi?id=24402. It is not yet 
part of Cocoon as we have differing opinions about the design as you can 
see in the Bugzilla entry and also in the thread: 
http://marc.theaimsgroup.com/?t=109085126300004&r=1&w=2.

/Daniel



Re: Custom extensions - to be made available if possible

Posted by Antonio Fiol Bonnín <an...@gmail.com>.
> How are your PDFs generated? Are they generated by Cocoon? If so, you
> should index the raw data, before you serialize to PDF.
> 
> Just a thought.

Thanks, but the PDFs already exist (and their origins vary). Most of
them are generated from Word documents. But the only thing I have is
the PDF file.

Re: Custom extensions - to be made available if possible

Posted by Upayavira <uv...@upaya.co.uk>.
Antonio Fiol Bonnín wrote:

>Hello,
>
>We have started developing two extensions for cocoon, and we would
>like to know if the core team would be interested in getting them into
>the trunk, and optionally in maintaining them in the future.
>
>The extensions are:
>
>- A transformer that connects via HTTP POST and sends its XML input to
>the server, and returns the XML returned from the server to the
>pipeline.
>
>This is similar to the SOAP thing, but without the envelope, and with
>a predefined (configured in the sitemap) URL.
>
>
>- An extension to the Cocoon Lucene searching system (or something
>different, yet pending design), so that non-XML content can also be
>indexed. In particular, we are interested in PDF, but we are designing
>it to be as generic as possible.
>
>BTW, your opinion may be very valuable for the design. Let me explain
>the two approaches we have thought of:
>
>a) Refactoring SimpleLuceneXMLIndexerImpl so that its private method
>indexDocument is not private, and taking it to an external component.
>
>b) Creating a PDFGenerator (in the cocoon sense of generator, of course).
>
>Option (a) seems to be giving us more headaches than pleasure, and
>option (b) seems cleaner to a certain point. Option (b) would allow
>following links in the PDF file, if developed to that point.
>
>However, option (b) implies choosing a format for its output (which?),
>and also poses some problems wrt. the sitemap. Until now, we have a
>pipeline using a reader to read pdf files (static, from disk). And we
>would need a generator to be invoked instead for the content and links
>views. How can we do that? Maybe with a selector? But that does not
>seem very clean. Any hints there?
>
>Any other options?
>
>Any general comments?
>  
>
How are your PDFs generated? Are they generated by Cocoon? If so, you 
should index the raw data, before you serialize to PDF.

Just a thought.

Upayavira