Posted to solr-user@lucene.apache.org by Jamie Johnson <je...@gmail.com> on 2012/03/05 23:42:39 UTC

Highlighting Multivalued Field question

If I have a multivalued field with values as follows

<arr name="clothing"><str>black pants</str><str>white shirt</str></arr>

and I do a query against that field with highlighting enabled as follows

/select?hl.fl=clothing&rows=5&q=clothing:black clothing:shirt&hl=on&indent=true

I thought I would see the following in the highlights

<arr name="clothing"><str><em>black</em> pants</str><str>white <em>shirt</em></str></arr>

but instead I'm seeing the following

<arr name="clothing"><str><em>black</em> pants</str></arr>

is this expected?

Also, I'm using a custom highlighter that extends SolrHighlighter, but
99.9% of it is a straight copy of DefaultSolrHighlighter, with added
support for pulling unstored fields from an external database. I expect
it to work the same way as DefaultSolrHighlighter, but if this is not
the expected behavior I will try with DefaultSolrHighlighter.

Re: Highlighting Multivalued Field question

Posted by Jamie Johnson <je...@gmail.com>.
My mistake on this: I was not setting hl.snippets, so the default
value of 1 was being used. If I change it to 2, I get the expected
result.
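
For anyone landing on this thread later, here is a minimal sketch of the
corrected request, built in Python so the parameters get URL-encoded
properly. The function name and the "/select" base path are only
illustrative assumptions; the query, field name, and hl.* parameters are
the ones from the messages in this thread.

```python
from urllib.parse import urlencode


def build_highlight_query(base="/select", snippets=2):
    """Build the highlighting request from this thread with hl.snippets set.

    `base` and `snippets` are illustrative parameters, not part of Solr.
    """
    params = [
        ("q", "clothing:black clothing:shirt"),
        ("rows", 5),
        ("hl", "on"),
        ("hl.fl", "clothing"),
        # Without this parameter Solr falls back to 1 snippet per field,
        # which is why only one of the two values was highlighted.
        ("hl.snippets", snippets),
        ("indent", "true"),
    ]
    # urlencode escapes the colons and the space in the q parameter.
    return base + "?" + urlencode(params)


print(build_highlight_query())
```

If only one field needs more snippets, the per-field form
f.clothing.hl.snippets=2 should also work, though I have not tested it
against this particular setup.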

Re: Highlighting Multivalued Field question

Posted by Jamie Johnson <je...@gmail.com>.
As an FYI, I tried this with the standard highlighter and got the same
result. Additionally, in case it matters, the field uses the following
text field type definition:

<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
      add enablePositionIncrements=true in both the index and query
      analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
