Posted to solr-user@lucene.apache.org by Oliver Messner <me...@synyx.de> on 2010/12/30 13:36:05 UTC
Highlighter problem when using WordDelimiterFilter and term vectors
Hi,
when I use WordDelimiterFilterFactory in the fieldType definition and
set termVectors="true" termPositions="true" termOffsets="true" on
the field, Solr gives me the following response for the query request
?q=warmwasserspeicher&version=2.2&indent=on&hl=true
<lst name="highlighting">
<lst name="id-1">
<arr name="content">
<str>some text Warm<em>WarmWasserSpeicher</em> here</str>
</arr>
</lst>
</lst>
As you can see, the highlighter does not work as expected (at least
not as I expect): "Warm" is duplicated in front of the highlighted term.
If the term vectors are not stored in the index, I get the
expected result <str>some text <em>WarmWasserSpeicher</em> here</str>.
I'm using Solr version 1.4.1.
BTW, this problem does not occur when using the FastVectorHighlighter
(after applying the patches from https://issues.apache.org/jira/browse/SOLR-1268).
Any ideas?
Uploaded document:
<add>
<doc>
<field name="id">id-1</field>
<field name="content">some text WarmWasserSpeicher here</field>
</doc>
</add>
Field type definition:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
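To make the effect of the two analyzers above more concrete, here is a rough Python sketch of the tokens and character offsets I would expect them to produce. It is only a simplified stand-in, not Solr's implementation: the function name analyze and its parameters are made up for illustration, it only models whitespace tokenizing, splitOnCaseChange, catenateWords and lowercasing, and the order in which the real filter emits the tokens (and their positions) may differ.

```python
import re


def analyze(text, split_on_case_change, catenate_words):
    """Return (term, start_offset, end_offset) tuples for a simplified chain.

    Hypothetical helper: approximates WhitespaceTokenizer,
    WordDelimiterFilter (case-change splitting and word catenation only)
    and LowerCaseFilter. It is not the real Solr code.
    """
    tokens = []
    for match in re.finditer(r"\S+", text):  # whitespace tokenizer
        word, start = match.group(), match.start()
        if split_on_case_change:
            # Split at every lower-case to upper-case transition.
            parts = re.split(r"(?<=[a-z])(?=[A-Z])", word)
        else:
            parts = [word]
        offset = start
        for part in parts:
            tokens.append((part.lower(), offset, offset + len(part)))
            offset += len(part)
        # catenateWords: also emit the joined word, spanning all parts.
        if catenate_words and len(parts) > 1:
            tokens.append((word.lower(), start, start + len(word)))
    return tokens


# Index analyzer: splitOnCaseChange="1", catenateWords="1"
index_tokens = analyze("some text WarmWasserSpeicher here", True, True)
# Query analyzer: splitOnCaseChange="0", catenateWords="0"
query_tokens = analyze("warmwasserspeicher", False, False)

for token in index_tokens:
    print(token)
print(query_tokens)
```

In this sketch the query term only equals the catenated index token, whose offsets (10 to 28) overlap the offsets of the three word parts "warm", "wasser" and "speicher".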
Field definition:
<fields>
...
<field name="content" type="text" indexed="true" stored="true"
termVectors="true" termPositions="true" termOffsets="true"/>
</fields>
solrconfig.xml:
<requestHandler name="dismax" class="solr.SearchHandler" default="true">
<lst name="defaults">
<bool name="tv">true</bool>
<str name="defType">dismax</str>
<str name="qf">content</str>
<str name="mm">1</str>
<str name="hl">true</str>
<str name="fl">score</str>
</lst>
<arr name="last-components">
<str>tvComponent</str>
</arr>
</requestHandler>
...
<searchComponent name="tvComponent"
class="org.apache.solr.handler.component.TermVectorComponent"/>
Thanks,
Oliver