Posted to solr-user@lucene.apache.org by Jamie Johnson <je...@gmail.com> on 2012/03/05 23:42:39 UTC
Highlighting Multivalued Field question
If I have a multivalued field with values as follows
<arr name="clothing"><str>black pants</str><str>white shirt</str></arr>
and I do a query against that field with highlighting enabled as follows
/select?hl.fl=clothing&rows=5&q=clothing:black clothing:shirt&hl=on&indent=true
I thought I would see the following in the highlights
<arr name="clothing"><str><em>black</em> pants</str><str>white <em>shirt</em></str></arr>
but instead I'm seeing the following
<arr name="clothing"><str><em>black</em> pants</str></arr>
Is this expected?
Also, I'm using a custom highlighter that extends SolrHighlighter, but
99.9% of it is a straight copy of DefaultSolrHighlighter, with added
support for pulling unstored fields from an external database. I expect
it to work the same way as DefaultSolrHighlighter, but if this is not
the expected behavior I will try with DefaultSolrHighlighter itself.
Re: Highlighting Multivalued Field question
Posted by Jamie Johnson <je...@gmail.com>.
My mistake on this: I was not setting hl.snippets, so the default
value of 1 was being used. If I change it to 2, I get the expected
result.
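For anyone hitting the same thing, here is a minimal sketch of the corrected request, built in Python. The helper name build_highlight_query and its arguments are made up for illustration; only the request parameters (q, rows, hl, hl.fl, hl.snippets, indent) come from this thread. It only builds the URL path and query string and does not contact Solr.

```python
from urllib.parse import urlencode


def build_highlight_query(field, terms, snippets=2, rows=5):
    """Build a /select path that highlights `field` for the given terms.

    Hypothetical helper for illustration only. `snippets` is sent as
    hl.snippets; leaving that parameter out falls back to Solr's
    default of 1 snippet per field.
    """
    params = {
        # One "field:term" clause per term, separated by spaces.
        "q": " ".join(f"{field}:{term}" for term in terms),
        "rows": rows,
        "hl": "on",
        "hl.fl": field,
        "hl.snippets": snippets,
        "indent": "true",
    }
    # urlencode percent-encodes the values, so the space and colon in q are safe.
    return "/select?" + urlencode(params)


print(build_highlight_query("clothing", ["black", "shirt"]))
```

With hl.snippets=2, both "black pants" and "white shirt" can come back highlighted for the example document above; a field with more matching values would need a larger number.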
On Tue, Mar 6, 2012 at 9:10 AM, Jamie Johnson <je...@gmail.com> wrote:
> as an FYI I tried this with the standard highlighter and got the same
> result. Additionally if it matters this is using the following text
> field definition
>
> <fieldType name="text" class="solr.TextField"
> positionIncrementGap="100" autoGeneratePhraseQueries="true">
> <analyzer type="index">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <!-- in this example, we will only use synonyms at query time
> <filter class="solr.SynonymFilterFactory"
> synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
> -->
> <!-- Case insensitive stop word removal.
> add enablePositionIncrements=true in both the index and query
> analyzers to leave a 'gap' for more accurate phrase queries.
> -->
> <filter class="solr.StopFilterFactory"
> ignoreCase="true"
> words="stopwords.txt"
> enablePositionIncrements="true"
> />
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.KeywordMarkerFilterFactory"
> protected="protwords.txt"/>
> <filter class="solr.PorterStemFilterFactory"/>
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.SynonymFilterFactory"
> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> <filter class="solr.StopFilterFactory"
> ignoreCase="true"
> words="stopwords.txt"
> enablePositionIncrements="true"
> />
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.KeywordMarkerFilterFactory"
> protected="protwords.txt"/>
> <filter class="solr.PorterStemFilterFactory"/>
> </analyzer>
> </fieldType>
Re: Highlighting Multivalued Field question
Posted by Jamie Johnson <je...@gmail.com>.
As an FYI, I tried this with the standard highlighter and got the same
result. In case it matters, the field uses the following text field
definition:
<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>