
Posted to solr-user@lucene.apache.org by ito hayato <ha...@yahoo.co.jp> on 2010/07/04 16:55:34 UTC

FastVectorHighlighter and SynonymFilter

Hi all,

I am trying to use SynonymFilter together with
FastVectorHighlighter,
but I get an empty highlighting response.
To investigate the cause, I tried every combination of
hl.useFastVectorHighlighter and the expand attribute.

-------------------------------
<filter class="solr.SynonymFilterFactory"
         synonyms="syn.txt"
         ignoreCase="true" expand="false"/>
-------------------------------

Result:

 expand="true" in schema.xml, and
hl.useFastVectorHighlighter=true
 -> The returned highlighting is empty, as below.

-----------
<lst name="highlighting">
<lst name="404"/>
<lst name="401"/>
<lst name="89"/>
<lst name="155"/>
<lst name="411"/>
</lst>
-----------

 expand="false" , and hl.useFastVectorHighlighter=true
 expand="true"  , and hl.useFastVectorHighlighter=false
 expand="false" , and hl.useFastVectorHighlighter=false

 -> In these cases, highlighting returns the correct snippets.


Is using SynonymFilter and FastVectorHighlighter together
not supported?
Are these components incompatible?

In addition:
 - The target field type is defined as shown below.
 - The target field is tokenized by CJKTokenizer.
 - The problem occurs only when I search for a Japanese keyword;
   it does not occur with an English keyword.
   (Is the trouble caused by the n-gram tokenization?)

-----------
 <fieldType name="text_cjk" class="solr.TextField">
   <analyzer type="index">
     <tokenizer class="solr.CJKTokenizerFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.CJKTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory"
             synonyms="synonyms.txt" ignoreCase="true" expand="true"
             tokenizerFactory="solr.CJKTokenizerFactory"/>
   </analyzer>
 </fieldType>
-----------


Re: FastVectorHighlighter and SynonymFilter

Posted by ITO Hayato <ha...@gmail.com>.

> I think the cause of the problem is that combination of query
> time expansion and N-gram tokenizer generates MultiPhraseQuery,
> however, FVH doesn't support MPQ.

Sekiguchi-san,

I tried the following test:

- Apply SynonymFilter at index time, with expand=true.

The query result is as I expected
(the snippet is correct).
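
As a rough sketch of the index-time setup described above (not necessarily the exact configuration used in the test), the SynonymFilter could be moved from the query analyzer of the text_cjk type into the index analyzer. The attribute values are copied from the field type in the original post; the placement of the filter is the only change, and it is an assumption about how the test was configured.

```xml
<!-- Sketch only: synonyms are expanded at index time instead of query time,
     so the query analyzer no longer produces several terms at one position. -->
<fieldType name="text_cjk" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.CJKTokenizerFactory"/>
    <!-- Assumed placement: the same filter definition, moved here from the query analyzer. -->
    <filter class="solr.SynonymFilterFactory"
            synonyms="synonyms.txt" ignoreCase="true" expand="true"
            tokenizerFactory="solr.CJKTokenizerFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Documents have to be reindexed after a change to the index analyzer, because the synonyms are written into the index rather than applied to the query.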

I guess this problem is related to LUCENE-1889.
https://issues.apache.org/jira/browse/LUCENE-1889

Thanks for your reply.

Re: FastVectorHighlighter and SynonymFilter

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.

(10/07/04 23:55), ito hayato wrote:
> Are SynonymFilter and FastVectorHighlighter not supported
> using concurrently?
> These component are not compatible?
>   
Hello Ito-san,

I think the cause of the problem is that the combination of
query-time synonym expansion and an N-gram tokenizer generates
a MultiPhraseQuery, and FVH doesn't support MPQ.

Koji

-- 
http://www.rondhuit.com/en/