Posted to solr-user@lucene.apache.org by Xavier Schepler <xa...@sciences-po.fr> on 2010/01/07 13:47:00 UTC

Field highlighting

Hi,

I'm trying to highlight short text values. The field they come from has 
a type shared with other fields. Highlighting works on those other 
fields but not on this one.
Why?

Re: Field highlighting

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
Jan Høydahl / Cominvent wrote:
> Did you solve this?
> If yes, what was wrong?
> If no, can you specify one concrete example document and a matching query which fails to highlight?
>
> --
> Jan Høydahl  - search architect
> Cominvent AS - www.cominvent.com
>
If you are using the trunk version, then because termVectors,
termPositions and termOffsets are all enabled on those fields,
DefaultSolrHighlighter uses FastVectorHighlighter (FVH) unless you
explicitly set hl.useHighlighter to true.
FVH doesn't support dismax at the moment. This has been fixed in
Lucene trunk:

https://issues.apache.org/jira/browse/LUCENE-2243

If you want to use the fix, download Lucene trunk, run
ant build-contrib, copy lucene-fast-vector-highlighter-3.1-dev.jar
to solr/lib, and delete the old FVH jar from solr/lib.
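
As a rough illustration of the hl.useHighlighter workaround mentioned
above, the parameter could be set as a handler default in solrconfig.xml.
This is only a sketch: the handler name, the dismax default and the
hl.fl field list are assumptions made for illustration, and the
parameter name is the one given in this reply for the trunk build.

```xml
<!-- solrconfig.xml fragment (sketch). The handler name "/select" and the
     dismax default are assumptions, not taken from this thread. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="hl">true</str>
    <!-- the two fields from the schema extract quoted in this thread -->
    <str name="hl.fl">questionsLabelsFr,modalitiesLabelsFr</str>
    <!-- parameter named in this reply: use the standard Highlighter even
         though term vectors, positions and offsets are enabled -->
    <str name="hl.useHighlighter">true</str>
  </lst>
</requestHandler>
```

The same parameter can presumably be passed on a single request instead,
which makes it easier to compare the output with and without it.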

Koji

-- 
http://www.rondhuit.com/en/


Re: Field highlighting

Posted by Jan Høydahl / Cominvent <ja...@cominvent.com>.
Did you solve this?
If yes, what was wrong?
If no, can you specify one concrete example document and a matching query which fails to highlight?

--
Jan Høydahl  - search architect
Cominvent AS - www.cominvent.com



Re: Field highlighting

Posted by Xavier Schepler <xa...@sciences-po.fr>.
Erick Erickson wrote:
> It's really hard to provide any response with so little information,
> could you show us the difference between a field that works
> and one that doesn't? Especially the relevant schema.xml entries
> and the query that fails to highlight....
>
> Erick
>
> On Thu, Jan 7, 2010 at 7:47 AM, Xavier Schepler <
> xavier.schepler@sciences-po.fr> wrote:
>
>   
>> Hi,
>>
>> I'm trying to highlight short text values. The field they come from has a
>> type shared with other fields. Highlighting works on those other fields
>> but not on this one.
>> Why?
>>
>>     
>
>   
Thanks for your response.
Here are some extracts from my schema.xml:

<fieldtype name="textFr" class="solr.TextField">
  <analyzer>
    <!-- remove stop words -->
    <filter class="solr.StopFilterFactory" words="french-stopwords.txt" ignoreCase="true"/>
    <!-- split into tokens -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- remove accents -->
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <!-- remove the dots at the end of acronyms -->
    <filter class="solr.StandardFilterFactory"/>
    <!-- convert to lowercase -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stemming with the Porter filter -->
    <filter class="solr.SnowballPorterFilterFactory" language="French"/>
    <!-- synonyms -->
    <filter class="solr.SynonymFilterFactory" synonyms="test-synonyms.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldtype>

Here's a field on which highlighting works:

<field     name="questionsLabelsFr"
            required="false"
            type="textFr"
            multiValued="true"
            indexed="true"
            stored="true"
            compressed="false"
            omitNorms="false"
            termVectors="true"
            termPositions="true"
            termOffsets="true"
    />

Here's the field on which it doesn't:

   <field     name="modalitiesLabelsFr"
            required="false"
            type="textFr"
            multiValued="true"
            indexed="true"
            stored="true"
            compressed="false"
            omitNorms="false"
            termVectors="true"
            termPositions="true"
            termOffsets="true"
    />

Apart from the name, the two definitions are the same.

But modalitiesLabelsFr contains mostly short strings like:

Côtes-d Armor
Creuse
Dordogne
Doubs
Drôme
Eure
Eure-et-Loir
Finistère

When matches are found in them, I get a list like this, with no text:

<lst name="highlighting">
<lst name="dbbd3642-db1d-4b35-9280-11582523903d"/>
....
<lst name="f1d8be2d-1070-4111-b16e-94d16c8c0bc6"/>
</lst>

The name attribute is the uid of the document.

I tried several values for hl.fragsize (0, 1, 2, ...) with no success at
all.

Re: Field highlighting

Posted by Erick Erickson <er...@gmail.com>.
It's really hard to provide any response with so little information,
could you show us the difference between a field that works
and one that doesn't? Especially the relevant schema.xml entries
and the query that fails to highlight....

Erick

On Thu, Jan 7, 2010 at 7:47 AM, Xavier Schepler <
xavier.schepler@sciences-po.fr> wrote:

> Hi,
>
> I'm trying to highlight short text values. The field they come from has a
> type shared with other fields. Highlighting works on those other fields
> but not on this one.
> Why?
>