Posted to solr-user@lucene.apache.org by dkropotova <Da...@semantico.com> on 2007/11/15 11:22:43 UTC

How to unescape a text representation of (incomplete) xml, returned in the hit-highlight element in the SOLR response doc

Dear all,

I'm using SOLR to store documents with an xml field (an appropriate
description of the xml field has been added to the SOLR schema). The content
of the xml field in the SOLR document is escaped.

Now I would like to search the xml field content with hit highlighting. I'm
using the following pipeline:

<map:pipeline>
      <map:match pattern="solr-search.xml">
        <map:generate type="request"/>  
        
        <!-- Constructs the SOLR select URL -->
        <!-- It looks like -->
        <!-- http://localhost:8080/solr/select?q=xml%3Aprocedure&rows=15 -->
        <map:transform src="xsl/query-to-cinclude-solr.xsl">
          <map:parameter name="host" value="http://localhost:8080/solr/select"/>
        </map:transform>
        
        <!-- Includes the previously constructed link -->
        <map:transform type="cinclude"/>
        <map:transform type="solr"/>
        
        <!-- XSLT to process the SOLR response doc -->
        <map:transform src="xsl/quick_search.xsl" type="xslt">
          <map:parameter name="contextPath" value="{request:contextPath}"/>
        </map:transform>
        
        <map:serialize type="xhtml"/>
      </map:match>
</map:pipeline>

The sitemap declares the following components:

<map:components>
    <map:generators default="file">
      <!-- generators for SOLR integration -->
      <!-- generates an XML representation of the incoming request -->
      <map:generator name="request" src="org.apache.cocoon.generation.RequestGenerator"/>
      <map:generator name="solr-search" logger="sitemap.generator.searchgenerator"
        src="org.apache.forrest.solr.client.SolrSearchGenerator"/>
    </map:generators>
    
    <map:transformers default="xslt">
      <!-- transformer for SOLR integration -->
      <!-- paginator transformer that calculates the next pages based on start, rows and numFound -->
      <map:transformer logger="sitemap.transformer.solr" name="solr" pool-max="16"
        src="org.apache.forrest.solr.client.SolrQueryTransformer"/>
      
    </map:transformers>
  </map:components>

This works, and the hits are returned in the 'highlight' element of the
SOLR response. The results come back as escaped text that I need to unescape
(to convert it back to xml). However, the results are snippets, so the
returned xml is incomplete: a snippet can start or end in the middle of an
element, or even in the middle of a tag.

I was wondering what would be the easiest way to convert (unescape) it back
to xml, so that I can display the results to the user.
Alternatively, would it be a good idea to strip the xml markup and just keep
the text?
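
To make the second option concrete, here is a minimal Python sketch of
stripping the markup from one escaped snippet. It is only an illustration and
not part of the Cocoon pipeline above: the function name, the `keep`
parameter and the sample snippet are hypothetical, and it assumes the
highlighter wraps hits in <em> tags. It also assumes the snippets contain no
comments, CDATA sections, or literal '>' characters in text or attribute
values, which a regex-based approach like this would mishandle.

```python
import html
import re

# A complete tag, e.g. <para id="1"> or </title>.
_TAG = re.compile(r"<[^<>]*>")
# The element name at the start of a tag, e.g. "em" in </em>.
_NAME = re.compile(r"</?\s*([\w:.-]+)")
# A tag cut off at the start of the snippet, e.g. 'd="1">'.
_CUT_HEAD = re.compile(r"^[^<>]*>")
# A tag cut off at the end of the snippet, e.g. '<para id="1'.
_CUT_TAIL = re.compile(r"<[^<>]*$")


def snippet_to_text(escaped, keep=("em",)):
    """Unescape one highlight snippet and drop its xml markup.

    Tags whose element name is listed in `keep` (assumed here to be the
    highlighter's <em> markers) are left in place; all other tags are
    replaced by a space. Entities inside the text stay escaped, so the
    result can still be embedded in XHTML.
    """
    xml = html.unescape(escaped)      # &lt;para&gt; becomes <para>
    xml = _CUT_HEAD.sub("", xml)      # remove a tag truncated at the start
    xml = _CUT_TAIL.sub("", xml)      # remove a tag truncated at the end

    def replace(match):
        name = _NAME.match(match.group(0))
        if name and name.group(1) in keep:
            return match.group(0)
        return " "

    text = _TAG.sub(replace, xml)
    return " ".join(text.split())     # collapse runs of whitespace


# Hypothetical snippet that begins and ends in the middle of a tag.
sample = ("d=&quot;p1&quot;&gt;Follow the &lt;em&gt;procedure&lt;/em&gt; in "
          "&lt;ref target=&quot;s2&quot;&gt;section 2&lt;/ref&gt; before &lt;para i")
print(snippet_to_text(sample))
# prints: Follow the <em>procedure</em> in section 2 before
```

Dropping the markup sidesteps the well-formedness problem entirely, at the
cost of losing the document structure inside the snippet. Passing keep=()
removes the highlight markers as well.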

Many thanks!
Dasha