Posted to dev@cocoon.apache.org by Nicolas Maisonneuve <ni...@free.fr> on 2004/03/02 06:06:01 UTC

A different LuceneIndexTransformer

Hi,
I updated an old, modified LuceneIndexTransformer that I made last year.
This LuceneIndexTransformer is more general (lower level) and can be used to index all kinds of resources (not only HTML pages).

A typical indexing pipeline would be:
resource -> XSLT -> LuceneIndexTransformer -> XML results

A sample:

<lucene:index xmlns:lucene="http://apache.org/cocoon/lucene/1.0" 
create="true" 
analyzer="org.apache.lucene.analysis.standard.StandardAnalyzer"
directory="d:/indexbase"
merge-factor="100">
    <lucene:document >
        <lucene:field name="title" type="keyword">sqdqsdq</lucene:field>
        <lucene:field name="description" type="text"> bla bal blalael balbal </lucene:field>
        <lucene:field name="date" type="date" dateformat="MM/dd/yyyy">10/12/2002</lucene:field> 
        <!-- See the Java API class SimpleDateFormat for details on the dateformat attribute. -->
        <lucene:field name="date" type="unstored" >just indexed information (not stored)</lucene:field>
        <lucene:field name="date" type="unindexed" >just stored information (not indexed)</lucene:field>
    </lucene:document>
    <lucene:document>
        <lucene:field name="author" type="keyword" boost="2">Mr Author</lucene:field> <!-- boosts this field at search time (see the Lucene documentation) -->
        <lucene:field name="language" type="keyword">french</lucene:field>
    </lucene:document>
</lucene:index>
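
The dateformat attribute on the date field takes a java.text.SimpleDateFormat pattern. The following is a minimal, standalone Java sketch of how the pattern from the sample reads the field value. It does not show the transformer's own code; the class name DateFormatDemo, the fixed Locale.US, and the strict (non-lenient) parsing are assumptions made for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;

// Hypothetical demo class, not part of the transformer.
public class DateFormatDemo {

    // Parse a field value with the pattern given in the dateformat attribute.
    static Date parse(String pattern, String value) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
        format.setLenient(false); // assumption: reject values such as 13/45/2002
        try {
            return format.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Value does not match pattern: " + value, e);
        }
    }

    public static void main(String[] args) {
        Date date = parse("MM/dd/yyyy", "10/12/2002");
        Calendar cal = Calendar.getInstance(Locale.US);
        cal.setTime(date);
        // MM comes first in the pattern, so this is October 12, not December 10.
        System.out.println(cal.get(Calendar.YEAR) + "-"
                + (cal.get(Calendar.MONTH) + 1) + "-"
                + cal.get(Calendar.DAY_OF_MONTH)); // prints 2002-10-12
    }
}
```

With a pattern of dd/MM/yyyy the same value would be read as 10 December 2002, so the pattern has to match the way the source data writes its dates.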

To delete documents:
<lucene:delete directory="d:/indexbase" >
    <lucene:document field="author" value="Mr Author"/> <!-- deletes all documents whose author field is "Mr Author" -->
    <lucene:document field="id" value="1E3RFE"/>
</lucene:delete>

Example output:
<lucene:index nbdocuments="2"/>
<lucene:delete nbdocuments="1"/>
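
A later step could read these counts with the standard JDK DOM parser. The sketch below works under stated assumptions: that the output elements use the same namespace URI as the input, and that they arrive inside some enclosing root element (the <results> wrapper here is hypothetical, as is the class name ResultCounts).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of the transformer.
public class ResultCounts {

    // Assumption: the output uses the same namespace as the input sample.
    static final String LUCENE_NS = "http://apache.org/cocoon/lucene/1.0";

    // Returns nbdocuments of the first element with the given local name, or -1 if absent.
    static int count(String xml, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagNameNS(LUCENE_NS, localName);
            if (nodes.getLength() == 0) {
                return -1;
            }
            return Integer.parseInt(((Element) nodes.item(0)).getAttribute("nbdocuments"));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot read transformer output", e);
        }
    }

    public static void main(String[] args) {
        // <results> is a hypothetical wrapper so the two elements form one document.
        String xml = "<results xmlns:lucene=\"" + LUCENE_NS + "\">"
                + "<lucene:index nbdocuments=\"2\"/>"
                + "<lucene:delete nbdocuments=\"1\"/>"
                + "</results>";
        System.out.println("indexed: " + count(xml, "index"));
        System.out.println("deleted: " + count(xml, "delete"));
    }
}
```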

Maybe someone would be interested...

Nicolas Maisonneuve