Posted to solr-user@lucene.apache.org by Villalba, Raúl <rv...@minsait.com> on 2020/07/14 11:05:54 UTC

SOLR Exact phrase search issue

Hello,

We have an app that uses Solr as its search engine. We have detected behavior that looks incorrect and that we cannot explain. If we search for the phrase "Què t'hi jugues" we get no results, even though we know there is a document that contains this phrase. However, if we search for "Què t'hi" or for "t'hi jugues" we do get results, including the document that contains "Què t'hi jugues". We attach screenshots of the search tool and the XML of the results. We would greatly appreciate any help in finding a solution or identifying the cause of the problem.

Search 1 - "Què t'hi jugues"
[screenshot: image007.jpg]

Search 2 - "Què t'hi"
[screenshots: image008.jpg, image010.jpg]

Search 3 - "t'hi jugues"
[screenshot: image011.jpg]

Best regards,

Raül Villalba Sans
Delivery Centers - Centros de Producción

Parque de Gardeny, Edificio 28
25071 Lleida, España
T +34 973 193 580


Re: SOLR Exact phrase search issue

Posted by Michael Gibney <mi...@michaelgibney.net>.
Raúl, I notice that your test search that's failing is a phrase
search. Going out on a limb here: do you have WordDelimiterGraphFilter
configured in your index-time analysis chain? Could you send the
analysis chains for the affected fields?

Michael
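
As a rough illustration of why the index-time analysis chain matters for phrase searches, here is a toy model of positional phrase matching in Python. It is not how Solr or Lucene implement phrase queries, and it does not claim to reproduce the exact symptom reported above. The token lists are assumptions about what an analysis chain might emit when "t'hi" is split at index time but not at query time.

```python
def positions(tokens):
    """Map each token to the list of positions where it occurs."""
    index = {}
    for pos, tok in enumerate(tokens):
        index.setdefault(tok, []).append(pos)
    return index


def phrase_matches(index, query_tokens):
    """True if the query tokens occur at consecutive positions in the index."""
    starts = index.get(query_tokens[0], [])
    return any(
        all(start + i in index.get(tok, []) for i, tok in enumerate(query_tokens))
        for start in starts
    )


# Assumed index-time output: the filter splits "t'hi" into "t" and "hi".
indexed = positions(["què", "t", "hi", "jugues"])

# Assumed query-time output without splitting: "t'hi" stays one token,
# which never appears in the index, so the phrase cannot match.
print(phrase_matches(indexed, ["què", "t'hi", "jugues"]))

# With the same splitting applied at query time, the positions line up again.
print(phrase_matches(indexed, ["què", "t", "hi", "jugues"]))
```

The point of the sketch is only that a phrase match depends on token positions, so any difference between index-time and query-time tokenization can make an exact phrase miss even when every word is present.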



Re: SOLR Exact phrase search issue

Posted by Erick Erickson <er...@gmail.com>.
Heck, Charlie, it explains 90% of the problems I’ve personally had with
programming in general over my entire career...

> On Jul 15, 2020, at 5:08 AM, Charlie Hull <ch...@flax.co.uk> wrote:
> 
> On 14/07/2020 12:48, Erick Erickson wrote:
>> <snip> This is almost certainly a mismatch between what you think is happening and what you’ve actually told Solr to do ;).
> That's a great one-line explanation of 90% of the issues people face with Solr :-)
> 
> Charlie


Re: SOLR Exact phrase search issue

Posted by Charlie Hull <ch...@flax.co.uk>.
On 14/07/2020 12:48, Erick Erickson wrote:
> <snip> This is almost certainly a mismatch between what you think is happening and what you’ve actually told Solr to do ;).
That's a great one-line explanation of 90% of the issues people face 
with Solr :-)

Charlie


-- 
Charlie Hull
OpenSource Connections, previously Flax

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.o19s.com


Re: SOLR Exact phrase search issue

Posted by Erick Erickson <er...@gmail.com>.
This is usually the result of either indexing or querying not quite doing what you expect. The screenshots don’t help with diagnosis: they show the results, but not why you got them.

So here’s what I do to try to figure out why:

1> Add &debug=query to the query, or check the “debugQuery” checkbox in the admin UI. In particular, look at the “parsed query” in the results. Is it what you expect?

2> Use the Admin/Analysis page to see how the fields you’re searching against are tokenized. Sometimes your analysis chain produces unexpected results. <1> will show you this for querying, but not for indexing.

3> Try turning on highlighting. You have not shown, for instance, that “Què t’hi jugues” appears in its entirety in a single field. It’s conceivable that you’re not searching that field at all and are matching “t’hi jugues” or “Què t’hi” in a different field than you expect.

4> Another thing that fools people is that the analysis chain may break up “t’hi” into “t” and “hi”, which may then match in unexpected places.

5> Are any of these words stopwords? The Admin/Analysis page will show you.

6> Finally, try attaching &debug=true&explainOther=id:id_of_doc_you_expect. That will show you how the document you expect was scored, whether or not it’s included in numFound. It’s intended exactly for answering the question “why didn’t my search return a doc I _know_ it should?”
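
The request URLs for steps 1>, 3> and 6> can be put together with a few lines of Python. This is only a sketch: the host, the collection name "mycollection" and the document id "12345" are placeholders, not values from this thread, and hl.fl=* assumes you want highlighting attempted on every field that supports it.

```python
from urllib.parse import urlencode

# Hypothetical base URL: adjust host, port, and collection to your setup.
SOLR_BASE = "http://localhost:8983/solr/mycollection"


def select_url(phrase, **extra):
    """Build a /select URL for an exact phrase query plus extra parameters."""
    params = {"q": f'"{phrase}"'}
    params.update(extra)
    # urlencode percent-encodes the quotes, accents, and apostrophe as UTF-8.
    return f"{SOLR_BASE}/select?{urlencode(params)}"


phrase = "Què t'hi jugues"

# Step 1>: ask Solr to report how it parsed the query.
parsed_query_url = select_url(phrase, debug="query")

# Step 3>: turn on highlighting to see which field actually matched.
highlight_url = select_url(phrase, hl="true", **{"hl.fl": "*"})

# Step 6>: explain how a specific document scored, even if it was not returned.
# "id:12345" is a placeholder for the document you expect to match.
explain_url = select_url(phrase, debug="true", explainOther="id:12345")

for url in (parsed_query_url, highlight_url, explain_url):
    print(url)
```

Paste each printed URL into a browser or curl and compare the “parsed query” and explain sections with what you expected.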


Seems like a lot of places to look, but <1> is certainly the first place I’d look. This is almost certainly a mismatch between what you think is happening and what you’ve actually told Solr to do ;).

Best,
Erick
