Posted to solr-user@lucene.apache.org by Xavier Schepler <xa...@sciences-po.fr> on 2010/01/07 13:47:00 UTC
Field highlighting
Hi,
I'm trying to highlight short text values. The field they come from
shares its type with other fields. Highlighting works on those other
fields but not on this one.
Why?
Re: Field highlighting
Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
If you are using the trunk version, DefaultSolrHighlighter uses
FastVectorHighlighter (FVH) for those fields, because their
termVectors, termPositions and termOffsets are all on, unless you
explicitly set hl.useHighlighter to true.
FVH doesn't support dismax at the moment. This has been fixed in
Lucene trunk:
https://issues.apache.org/jira/browse/LUCENE-2243
If you want to use the fix, download Lucene trunk, run ant
build-contrib, copy lucene-fast-vector-highlighter-3.1-dev.jar to
solr/lib, and delete the old FVH jar from solr/lib.
Koji
--
http://www.rondhuit.com/en/
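As a rough sketch of the first workaround in the reply above, the parameter could be set as a default on the request handler rather than on every query. The handler name "dismax" and the hl.fl value are assumptions made for illustration, and hl.useHighlighter is taken from the message as written, so the name may differ in other Solr versions:

```xml
<!-- solrconfig.xml (fragment). The handler name "dismax" and the hl.fl value
     are assumptions for illustration; adapt them to your own setup. -->
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- turn highlighting on and list the field to highlight -->
    <str name="hl">true</str>
    <str name="hl.fl">modalitiesLabelsFr</str>
    <!-- parameter named in the reply above: ask for the standard Highlighter
         instead of FastVectorHighlighter -->
    <str name="hl.useHighlighter">true</str>
  </lst>
</requestHandler>
```

The same parameter can presumably be passed per request instead (for example by appending hl.useHighlighter=true to the query string), which makes it easy to check whether FVH is the cause before changing any configuration.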
Re: Field highlighting
Posted by Jan Høydahl / Cominvent <ja...@cominvent.com>.
Did you solve this?
If yes, what was wrong?
If no, can you specify one concrete example document and a matching query which fails to highlight?
--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com
Re: Field highlighting
Posted by Xavier Schepler <xa...@sciences-po.fr>.
Thanks for your response.
Here are some extracts from my schema.xml:
<fieldtype name="textFr" class="solr.TextField">
  <analyzer>
    <!-- remove stop words -->
    <filter class="solr.StopFilterFactory" words="french-stopwords.txt" ignoreCase="true"/>
    <!-- split into tokens -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- remove accents -->
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <!-- remove the dots at the end of acronyms -->
    <filter class="solr.StandardFilterFactory"/>
    <!-- convert to lowercase -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stemming with the Porter filter -->
    <filter class="solr.SnowballPorterFilterFactory" language="French"/>
    <!-- synonyms -->
    <filter class="solr.SynonymFilterFactory" synonyms="test-synonyms.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldtype>
Here's a field on which highlighting works:
<field name="questionsLabelsFr"
required="false"
type="textFr"
multiValued="true"
indexed="true"
stored="true"
compressed="false"
omitNorms="false"
termVectors="true"
termPositions="true"
termOffsets="true"
/>
Here's the field on which it doesn't:
<field name="modalitiesLabelsFr"
required="false"
type="textFr"
multiValued="true"
indexed="true"
stored="true"
compressed="false"
omitNorms="false"
termVectors="true"
termPositions="true"
termOffsets="true"
/>
The two definitions are the same apart from the name.
But modalitiesLabelsFr contains mostly short strings like:
Côtes-d Armor
Creuse
Dordogne
Doubs
Drôme
Eure
Eure-et-Loir
Finistère
When matches are found in them, I get a list like this, with no text:
<lst name="highlighting">
<lst name="dbbd3642-db1d-4b35-9280-11582523903d"/>
....
<lst name="f1d8be2d-1070-4111-b16e-94d16c8c0bc6"/>
</lst>
The name attribute is the uid of the document.
I tried several values for hl.fragsize (0, 1, 2, ...) with no success at
all.
Re: Field highlighting
Posted by Erick Erickson <er...@gmail.com>.
It's really hard to provide any response with so little information.
Could you show us the difference between a field that works
and one that doesn't? Especially the relevant schema.xml entries
and the query that fails to highlight.
Erick