Posted to solr-user@lucene.apache.org by Massimo Schiavon <ms...@volunia.com> on 2011/06/16 17:15:45 UTC

Encoding of alternate fields in highlighting

I have an index with various fields and I want to highlight query
matches on the "title" and "content" fields.
These fields may contain HTML tags, so I've configured HtmlFormatter
for highlighting. The problem is that if the query doesn't match the
text of the field, Solr returns the value of the configured alternate
field without encoding it.
Is there any way to get an encoded value for alternate fields as well?
And, in general, is there a way to do HTML escaping on values returned
from a response writer?

I'm using Solr 3.1. Here is an excerpt from the requestHandler configuration:

[...]
<str name="wt">json</str>
<str name="hl">true</str>
<str name="hl.fl">title,content</str>
<str name="hl.simple.pre"><![CDATA[<b>]]></str>
<str name="hl.simple.post"><![CDATA[</b>]]></str>
<str name="f.title.hl.fragsize">1024</str>
<str name="f.title.hl.alternateField">title</str>
<str name="f.title.hl.maxAlternateFieldLength">512</str>
<int name="f.title.hl.snippets">1</int>
<str name="f.content.hl.alternateField">content</str>
<str name="f.content.hl.maxAlternateFieldLength">512</str>
<int name="f.content.hl.snippets">2</int>
[...]

and from the highlighting configuration:

[...]
<highlighting>
  <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter"
             default="true">
  </formatter>
  <encoder name="html" class="org.apache.solr.highlight.HtmlEncoder"
           default="true" />
  <fragmentsBuilder name="default"
                    class="org.apache.solr.highlight.ScoreOrderFragmentsBuilder"
                    default="true" />
</highlighting>
[...]

Thanks
Massimo



Re: Encoding of alternate fields in highlighting

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
(11/06/17 0:15), Massimo Schiavon wrote:
> I have an index with various fields and I want to highlight query matchings on "title" and "content"
> fields.
> These fields could contain html tags so I've configured HtmlFormatter for highlighting. The problem
> is that if the query doesn't match the text of the field, solr returns the value of configured
> alternate field without encoding it.
> Is there any way to get encoded value also for alternate fields? And in general there is a way to do
> html escaping on values returned from a response writer?

Massimo,

My first impression is that the requirement is reasonable. As long as we support HtmlEncoder,
we ought to support it with the alternateField option as well. Please open a JIRA issue and, if
possible, suggest an appropriate option and attach a patch (a patch is not required, but it is
very helpful).
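
Until such an option exists, one possible workaround is to escape the
fallback values on the client after the JSON response arrives. The Python
sketch below is only an illustration and rests on assumptions not confirmed
in this thread: that highlighted snippets always contain the configured
hl.simple.pre / hl.simple.post markers and are already encoded by Solr, and
that an alternate field value never contains those markers literally. The
function names escape_fallback and escape_highlighting are hypothetical
helpers, not part of Solr or any client library.

```python
from html import escape

# Assumed to match hl.simple.pre / hl.simple.post in the requestHandler config.
PRE, POST = "<b>", "</b>"


def escape_fallback(snippet, pre=PRE, post=POST):
    """Escape a snippet only when it carries no highlight markers.

    Heuristic (an assumption, not Solr behaviour guaranteed by this thread):
    a snippet containing both markers is a real highlight that is already
    encoded; anything else is treated as a raw alternate field value.
    """
    if pre in snippet and post in snippet:
        return snippet
    return escape(snippet)


def escape_highlighting(highlighting, pre=PRE, post=POST):
    """Apply escape_fallback to every snippet in the 'highlighting' section
    of a parsed JSON response ({doc_id: {field: [snippet, ...]}})."""
    return {
        doc_id: {
            field: [escape_fallback(s, pre, post) for s in snippets]
            for field, snippets in fields.items()
        }
        for doc_id, fields in highlighting.items()
    }


if __name__ == "__main__":
    sample = {
        "doc1": {
            "title": ["a <b>match</b> here"],           # highlighted snippet
            "content": ["<script>x</script> & more"],   # alternate field fallback
        }
    }
    print(escape_highlighting(sample))
```

The heuristic misfires if an unmatched field value itself contains the
marker strings, so choosing markers that cannot occur in the indexed text
would make it more reliable.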

koji
-- 
http://www.rondhuit.com/en/