Posted to solr-user@lucene.apache.org by ale42 <al...@etu.esisar.grenoble-inp.fr> on 2015/03/06 12:20:17 UTC

Check the return of suggestions

Hello everyone.

I'm working with Solr 4.3. I use the SpellCheck component, which gives me
suggestions as I expect.

I will explain my problem with an example:

I am querying "cartouchhe" instead of "cartouche".

I obtain these suggestions:

array (size=5)
  0 => 
    array (size=2)
      'word' => string 'cartouche' (length=9)
      'freq' => int 1519
  1 => 
    array (size=2)
      'word' => string 'touches' (length=7)
      'freq' => int 55
  2 => 
    array (size=2)
      'word' => string 'cartouches' (length=10)
      'freq' => int 32
  3 => 
    array (size=2)
      'word' => string 'caoutchoucs' (length=11)
      'freq' => int 16
  4 => 
    array (size=2)
      'word' => string 'cartonnees' (length=10)
      'freq' => int 15

This is what I want ==> OK.

The problem is that when I query "cartouche" or "cartouches", I get exactly
the same results, because for both queries the term that is actually
searched in my index is "cartouch".

Is there a way in Solr to handle this kind of "problem", i.e. to check that
two collations will not return exactly the same results?

Thanks for your answers,
Alex.



--
View this message in context: http://lucene.472066.n3.nabble.com/Check-the-return-of-suggestions-tp4191383.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Check the return of suggestions

Posted by ale42 <al...@etu.esisar.grenoble-inp.fr>.
Thanks for your answer Charlie,


Reitzel, Charles wrote
> It looks like your search term and index are both subject to a stem
> filter.  Is that right?

Yes, that is right, and that is what I want!

> To avoid the default query parser for spellcheck purposes, you might try
> spellcheck.q=cartouche.   But that may not be sufficient if the spellcheck
> field is also "aggressively" stemmed.   I.e. try
> solr.EnglishMinimalStemFilterFactory vs. solr.PorterStemFilterFactory.
> Worst case, you may need to copy values to a separate spellcheck field
> with less aggressive stemming.

My spellcheck field is not aggressive. It doesn't use stemming, just
WhitespaceTokenizer, StopFilter, WordDelimiterFilter, LowerCaseFilter and
ASCIIFoldingFilter.
My website is used in France, so I don't think I can use
solr.EnglishMinimalStemFilterFactory.

> It seems unlikely, to me, that "touches" and "cartouche" would have the
> same stem.   But "touches" may or may not be an ok spellcheck correction
> for your app.    You can tweak the accuracy parameter.  Also, if using
> DirectSolrSpellChecker, check maxEdits.

Yes, I managed to get around the problem by using the threshold and accuracy
settings, but it is not a perfect solution for me, because it can miss a few
useful suggestions if they are not well represented in the corpus...
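
One client-side alternative to tightening threshold and accuracy is to drop
suggestions that reduce to the same stem, so that "cartouche" and
"cartouches" are offered only once. The sketch below is an illustration, not
a Solr feature: crude_french_stem and dedupe_by_stem are hypothetical helper
names, and the suffix stripping is a deliberately naive stand-in for whatever
stemmer the search field really uses.

```python
# Suggestions as returned in the example above (word + index frequency).
suggestions = [
    {"word": "cartouche", "freq": 1519},
    {"word": "touches", "freq": 55},
    {"word": "cartouches", "freq": 32},
    {"word": "caoutchoucs", "freq": 16},
    {"word": "cartonnees", "freq": 15},
]


def crude_french_stem(word):
    """Hypothetical stand-in for the search field's real stemmer.

    Strips one trailing "es", "s" or "e" from sufficiently long words.
    """
    for suffix in ("es", "s", "e"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def dedupe_by_stem(items, stem=crude_french_stem):
    """Keep only the most frequent suggestion for each stem."""
    best = {}
    for item in items:
        key = stem(item["word"])
        if key not in best or item["freq"] > best[key]["freq"]:
            best[key] = item
    # Return the survivors ordered by descending frequency.
    return sorted(best.values(), key=lambda s: s["freq"], reverse=True)


unique = dedupe_by_stem(suggestions)
print([s["word"] for s in unique])
```

With this naive stemmer, "cartouches" collapses into "cartouche" and the
other three suggestions are kept. For real use, the grouping key would have
to come from the same analysis chain as the search field.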

Thanks for your help.




RE: Check the return of suggestions

Posted by "Reitzel, Charles" <Ch...@tiaa-cref.org>.
Hi Alex,

It looks like your search term and index are both subject to a stem filter.  Is that right?

To avoid the default query parser for spellcheck purposes, you might try spellcheck.q=cartouche.   But that may not be sufficient if the spellcheck field is also "aggressively" stemmed.   I.e. try solr.EnglishMinimalStemFilterFactory vs. solr.PorterStemFilterFactory.

Worst case, you may need to copy values to a separate spellcheck field with less aggressive stemming.

It seems unlikely, to me, that "touches" and "cartouche" would have the same stem.   But "touches" may or may not be an ok spellcheck correction for your app.    You can tweak the accuracy parameter.  Also, if using DirectSolrSpellChecker, check maxEdits.
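
As a rough sketch of the request-side options mentioned above, the snippet
below only builds a query URL and sends nothing. The host, core name and
/select handler are assumptions (the spellcheck component may be attached to
a different handler), the helper name build_spellcheck_url is made up for
illustration, and 0.7 is an arbitrary example value. maxEdits is not shown
because, as far as I know, it is set on the DirectSolrSpellChecker definition
in solrconfig.xml rather than passed per request.

```python
from urllib.parse import urlencode


def build_spellcheck_url(base_url, user_query, accuracy=0.7, count=5):
    """Build a Solr query URL with explicit spellcheck parameters."""
    params = {
        "q": user_query,
        "spellcheck": "true",
        # Text for the spellchecker, analyzed with the spellcheck
        # field's own analyzer rather than the main query parser.
        "spellcheck.q": user_query,
        "spellcheck.count": count,
        # Higher accuracy means fewer, closer suggestions.
        "spellcheck.accuracy": accuracy,
        "spellcheck.collate": "true",
        # Ask for per-collation details such as hit counts.
        "spellcheck.collateExtendedResults": "true",
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


# Assumed local Solr instance and core name, for illustration only.
url = build_spellcheck_url(
    "http://localhost:8983/solr/collection1/select", "cartouchhe"
)
print(url)
```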

Just a couple thoughts ... 

hth,
Charlie
