Posted to users@cocoon.apache.org by Peter Klotz <pe...@blue-elephant-systems.com> on 2003/06/16 16:34:12 UTC

Building lucene index without crawling?

Hi,

I have Cocoon 2.0.4.
Is it possible to generate a Lucene index from XML content without using 
a crawler? The question might sound strange, but I can produce all the 
content and all the links to index from a pipeline, so just indexing and 
searching that would be fine. Furthermore, I'm having problems building 
the link views, because I would need to transform the XML produced by my 
pipelines in order to get href="" link attributes. It would be 
impractical to duplicate each pipeline only to generate the same XML 
with a link transformation added.

Therefore I would like to index without crawling. How can that be done?
I have XML like this:

   <tag name="myname">
     <label>text to index</label>
     <description>text to index on as well </description>
   </tag>

I could apply an XSL stylesheet to this so that I get links:

<tag href="/cocoon/app/get_blah_myname"/>

But of course not in the same pipeline that produces HTML from this XML.
Therefore I want to explicitly produce the XML content and the links to 
index myself, instead of crawling.
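
For illustration, the link-producing stylesheet mentioned above might 
look like the following sketch. It assumes the URL is always the fixed 
prefix /cocoon/app/get_blah_ followed by the name attribute, which is 
only inferred from the single example given here.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Turn each <tag name="..."> into an empty <tag href="..."/> link.
       The URL prefix is an assumption taken from the example above. -->
  <xsl:template match="tag">
    <tag href="/cocoon/app/get_blah_{@name}"/>
  </xsl:template>

  <!-- Copy everything else through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```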


Thanks for any help, Peter


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-users-unsubscribe@xml.apache.org
For additional commands, e-mail: cocoon-users-help@xml.apache.org

RE: Building lucene index without crawling?

Posted by Reinhard Pötz <re...@gmx.net>.

> From: Peter Klotz [mailto:peter.klotz@blue-elephant-systems.com] 
> 
> 
> Hi,
> 
> I have Cocoon 2.0.4.
> Is it possible to generate a Lucene index from XML content 
> without using 
> a crawler? The question might sound strange but the reason is that I 
> could produce all content and all links to index from a 
> pipeline.

I think you have to make your own implementation of
org.apache.cocoon.components.search.LuceneCocoonIndexer which does the
job you want using the SourceResolver. Look at
SimpleLuceneCocoonIndexerImpl to get the idea!

Hope that helps.

Reinhard


RE: Building lucene index without crawling?

Posted by Conal Tuohy <co...@paradise.net.nz>.

Yes, it is possible. See the LuceneIndexTransformer:
http://wiki.cocoondev.org/Wiki.jsp?page=LuceneIndexTransformer

This component is quite new and there's no other documentation for it, but
it sounds exactly like what you want.
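
As a rough sketch of the kind of input the transformer consumes: the
pipeline emits markup in the transformer's namespace, with one
lucene:document per item to index. The element and attribute names below
are an assumption reconstructed from the component's documentation and
should be checked against the wiki page above. The url value, analyzer
and directory settings are illustrative only.

```xml
<lucene:index xmlns:lucene="http://apache.org/cocoon/lucene/1.0"
    analyzer="org.apache.lucene.analysis.standard.StandardAnalyzer"
    directory="index"
    create="false"
    merge-factor="20">
  <!-- One document per item; url is what a search hit would point to. -->
  <lucene:document url="/cocoon/app/get_blah_myname">
    <!-- lucene:store="true" (assumed attribute) asks for the field
         text to be stored in the index as well as indexed. -->
    <label lucene:store="true">text to index</label>
    <description>text to index as well</description>
  </lucene:document>
</lucene:index>
```

With input like this, the pipeline that already produces the XML can be
followed by one stylesheet and the transformer, so no crawler or
link view is involved.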

Con