You are viewing a plain text version of this content. The canonical link for it is here.

Posted to java-user@lucene.apache.org by "Juan A. Polanco" <po...@hotmail.com> on 2004/10/14 19:59:01 UTC

How extract a Field.Text(String, String) field to process it with a Stylesheet?

I have used the Lucene factory method Field.Text(String, String) to index, 
tokenize and store several hundreds short xml files. I stored the entire xml 
content of these files in a field called "content".
Now I want to use the Lucene search results with Cocoon.

For this I'm using a XSP with the following code extract:

<![CDATA[
for(int i = 0; i < hits.length(); i++)
{]]
   <xsp:content>
      <xsp:expr>hits.doc(i).get("content")</xsp:expr>
   </xsp:content>
<![CDATA[
}
searcher.close();

This code works as much as it extracts the xml files safed in the Lucene 
"content" field. The problem is  that the contents appear as normal text 
instead of xml to which I could apply a stylesheet.

Thanks for any help.
Juan

_________________________________________________________________
FREE pop-up blocking with the new MSN Toolbar  get it now! 
http://toolbar.msn.click-url.com/go/onm00200415ave/direct/01/


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Re: How extract a Field.Text(String, String) field to process it with a Stylesheet?

Posted by Otis Gospodnetic <ot...@yahoo.com>.

That's true, sorry for the confusion.  The original text is stored
verbatim.

Otis

--- Morus Walter <mo...@tanto.de> wrote:

> Otis Gospodnetic writes:
> > That's likely because you used an Analyzer that stripped the XML
> (<, >,
> > etc.) from the original text.  If you want to preserve the original
> > text, use an Analyzer that doesn't throw your XML away.  You can
> write
> > your own Analyzer that doesn't discard anything, for instance.
> > 
> An analyzer doesn't change the stored content. Only the indexed
> tokens.
> So if something threw away the tags (or just the spectial characters)
> it
> must have been before Field.Text(String, String) was called.
> This of course wouldn't be surprising, since indexing xml often means
> to extract the text from an xml document and index that text.
> 
> Morus
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Re: How extract a Field.Text(String, String) field to process it with a Stylesheet?

Posted by Morus Walter <mo...@tanto.de>.

Otis Gospodnetic writes:
> That's likely because you used an Analyzer that stripped the XML (<, >,
> etc.) from the original text.  If you want to preserve the original
> text, use an Analyzer that doesn't throw your XML away.  You can write
> your own Analyzer that doesn't discard anything, for instance.
> 
An analyzer doesn't change the stored content. Only the indexed tokens.
So if something threw away the tags (or just the spectial characters) it
must have been before Field.Text(String, String) was called.
This of course wouldn't be surprising, since indexing xml often means
to extract the text from an xml document and index that text.

Morus

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Re: How extract a Field.Text(String, String) field to process it with a Stylesheet?

Posted by Otis Gospodnetic <ot...@yahoo.com>.

That's likely because you used an Analyzer that stripped the XML (<, >,
etc.) from the original text.  If you want to preserve the original
text, use an Analyzer that doesn't throw your XML away.  You can write
your own Analyzer that doesn't discard anything, for instance.

Otis


--- "Juan A. Polanco" <po...@hotmail.com> wrote:

> I have used the Lucene factory method Field.Text(String, String) to
> index, 
> tokenize and store several hundreds short xml files. I stored the
> entire xml 
> content of these files in a field called "content".
> Now I want to use the Lucene search results with Cocoon.
> 
> For this I'm using a XSP with the following code extract:
> 
> <![CDATA[
> for(int i = 0; i < hits.length(); i++)
> {]]
>    <xsp:content>
>       <xsp:expr>hits.doc(i).get("content")</xsp:expr>
>    </xsp:content>
> <![CDATA[
> }
> searcher.close();
> 
> This code works as much as it extracts the xml files safed in the
> Lucene 
> "content" field. The problem is  that the contents appear as normal
> text 
> instead of xml to which I could apply a stylesheet.
> 
> Thanks for any help.
> Juan
> 
> _________________________________________________________________
> FREE pop-up blocking with the new MSN Toolbar � get it now! 
> http://toolbar.msn.click-url.com/go/onm00200415ave/direct/01/
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org