Posted to solr-user@lucene.apache.org by "Nemani, Raj" <Ra...@turner.com> on 2011/05/16 20:27:17 UTC

Highlighting issue with Solr 3.1

All,

 

I have just installed Solr 3.1 running on Tomcat 7.  I am noticing a possible issue with highlighting.  I have a field in my index called "story".  In the Solr document I am testing with, the data in the story field starts with the following snippet (the remaining data in the field is not shown, to keep things simple):

 

<p><a idref="0" /></p><p>EN AMÉRICA LATINA, 

 

When I search for "america" with highlighting enabled on the "story" field, here is what I get in the "highlighting" section of the response.  I am using the "ASCIIFoldingFilterFactory" to make my searches accent-insensitive.

 

<lst name="highlighting"><lst name="2011_May_13_ _1c77033a"><arr name="story"><str>&lt;p&gt;&lt;a idref=&quot;0&quot; /&gt;&lt;/p&gt;&lt;p&gt;EN <em>AM&#201;RICA</em> LATINA, SE HAN PRODUCIDO AVANCES, CON RESPECTO A LA PROTECCI&#211;N</str></arr></lst>

The problem is that the HTML tags before the <em> are encoded, so they show up as raw tag text on my search results page instead of being rendered.  To be clear, I do want the HTML to be interpreted as HTML, not as text.  In this particular situation I am not worried about the dangers of allowing such behavior.

 

The same test performed on the same data against a Solr 1.4.1 index does not exhibit this behavior.

 

Any help is appreciated.  Please let me know if I need to post the field type definitions (index and query) from schema.xml for the "story" field.

 

Thanks in advance

 

Raj


RE: Highlighting issue with Solr 3.1

Posted by "Nemani, Raj" <Ra...@turner.com>.
Thank you so much!

That was it.

Thanks again,
Raj


-----Original Message-----
From: Koji Sekiguchi [mailto:koji@r.email.ne.jp] 
Sent: Monday, May 16, 2011 8:45 PM
To: solr-user@lucene.apache.org
Subject: Re: Highlighting issue with Solr 3.1

I bet you have an encoder setting in your solrconfig.xml:

<encoder name="html"
          default="true"
          class="solr.highlight.HtmlEncoder" />

If so, try to comment it out.

Koji
-- 
http://www.rondhuit.com/en/
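
For anyone applying the suggestion above, here is a minimal sketch of what the change might look like in solrconfig.xml. It assumes the encoder entry is the one quoted in Koji's reply; where it sits in your file and what surrounds it may differ, so treat the placement as an assumption rather than a description of any particular install.

```xml
<!-- Disabled so that stored HTML in the "story" field is returned
     unescaped in highlight snippets. Only do this if rendering the
     stored markup as HTML is acceptable for your application.
<encoder name="html"
         default="true"
         class="solr.highlight.HtmlEncoder" />
-->
```

The configuration is read when the core is loaded, so the core (or Tomcat) most likely needs a reload or restart before the highlighting output changes.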
