Posted to solr-user@lucene.apache.org by "anil.vadhavane" <an...@procentris.com> on 2015/09/28 09:26:13 UTC

Keyword match distance rule issue

Hello,

I'm using Lucene/Solr 4.10.4 for keyword matching, and I have run into an
issue with the distance rule.
I search for a keyword with distance 2: "Bridgewater~2".
The search does not return "bridwater" in the results, although it should.

If I move the 'ge' to any other position in the query, it works, e.g.
"Bridwgeater~2".

Has anyone faced a similar issue, and are there any possible solutions?

Thanks.




--
View this message in context: http://lucene.472066.n3.nabble.com/Keyword-match-distance-rule-issue-tp4231624.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Keyword match distance rule issue

Posted by "anil.vadhavane" <an...@procentris.com>.
Hi Jack,

Thanks for the quick reply.

I understand your point about the edit distance restriction in Solr. Yes,
the query string does not contain actual quotes. The query should match
within an edit distance of 2. As I mentioned, if we try "Bridffwater~2",
Solr matches it.

We haven't seen the exception. We are using the Solarium (PHP) client to
query Solr. We have also tried querying Solr directly from a web browser.

Can you please check this case on your system and let us know whether it
matches? If it does, we can go ahead and do further analysis on our side.
In that case, please also tell us your Solr version and operating system.

Thanks




Re: Keyword match distance rule issue

Posted by Jack Krupansky <ja...@gmail.com>.
This feature is known as fuzzy query, not keyword match.

Unfortunately, the edit distance is limited to 2; 3 or more is not
supported. Lucene itself still has the old "slow" fuzzy query that supports
larger edit distances, but Solr has no syntax for selecting it.

Actually, this limit of 2 is strictly enforced in Solr 4.x and 5.x, and an
exception will be thrown if it is exceeded. So, are you really not seeing an
exception when you use an edit distance greater than 2?

Also, please confirm that your query string does not contain actual quotes.
If it did, the fuzzy syntax would simply be analyzed as if it were simple
text.
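
To make the quoting point concrete, here is a minimal sketch of the two
query strings as they would be sent to Solr. The host, port and core name
(localhost:8983, collection1) are assumptions for illustration only, and
"title" is the field mentioned elsewhere in this thread; adjust them to
your setup.

```python
from urllib.parse import urlencode

# Hypothetical host and core name; not taken from the thread.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def select_url(q):
    """Build a select URL, URL-encoding the query string q."""
    return SOLR_SELECT + "?" + urlencode({"q": q, "wt": "json"})


# Fuzzy term query: the ~2 suffix is parsed as an edit distance.
fuzzy = select_url("title:Bridgewater~2")

# Same text inside actual double quotes: as described above, the fuzzy
# syntax is then just analyzed as text instead of being treated as fuzzy.
quoted = select_url('title:"Bridgewater~2"')

print(fuzzy)
print(quoted)
```

Comparing the two printed URLs against what the client (or browser) really
sends is a quick way to rule out stray quotes in the query string.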


-- Jack Krupansky

Re: Keyword match distance rule issue

Posted by "anil.vadhavane" <an...@procentris.com>.
Hello,

We have tried the Analysis tool. Below is a screenshot of its output.

<http://lucene.472066.n3.nabble.com/file/n4232246/lucene.png> 






Re: Keyword match distance rule issue

Posted by Alessandro Benedetti <be...@gmail.com>.
Hi Anil,
what makes you think that bridgewater~2 is not matching bridwater?

Are you sure bridwater is in your index?
Are you sure it is in the field where you are searching for bridgewater?
I would verify that, because it does not make sense: bridgewater and
bridffwater both have the same distance to bridwater.
Are you sure bridffwater is not matching something else?
To be sure, I tried it, and of course it works; there is no reason for
discrimination against bridgewater :)


-- 
--------------------------

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: Keyword match distance rule issue

Posted by "anil.vadhavane" <an...@procentris.com>.
Hello,

Could you please try the same search query on your machine to check whether
it matches?

Thanks.




Re: Keyword match distance rule issue

Posted by Alessandro Benedetti <be...@gmail.com>.
Hi, Solr does not support an edit distance greater than 2!
You would need to customise this at code level if you want more.

If the index contains:

bridwater

then the query terms and their edit distances from it are:

Bridgewater (3)
Bridffwater (3)

This is really weird, but please, can you tell me exactly what you have
indexed for that field? Can you check the analysis tool and show me the
tokens produced for that field, at indexing and query time?
The analysis tool is reachable from the core in the admin UI and is really
useful in this kind of situation.
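
If it is easier to share text than a screenshot, the same analysis can
usually be requested over HTTP. The sketch below only builds the request
URL. It assumes the core exposes Solr's field analysis request handler at
/analysis/field (it has to be registered in solrconfig.xml) and uses a
hypothetical host and core name; "title" and the sample values come from
this thread.

```python
from urllib.parse import urlencode

# Hypothetical host and core name; the handler path is an assumption too.
ANALYSIS_URL = "http://localhost:8983/solr/collection1/analysis/field"


def analysis_url(field, indexed_value, query_value):
    """Build a field analysis request for index-time and query-time tokens."""
    params = {
        "analysis.fieldname": field,           # field whose analyzers are used
        "analysis.fieldvalue": indexed_value,  # text run through index analysis
        "analysis.query": query_value,         # text run through query analysis
        "wt": "json",
    }
    return ANALYSIS_URL + "?" + urlencode(params)


url = analysis_url("title", "emma bridwater radios", "Bridgewater")
print(url)
```

Opening the printed URL in a browser should list the tokens produced at
each analysis stage, which can then be pasted into a reply.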

Cheers




Re: Keyword match distance rule issue

Posted by "anil.vadhavane" <an...@procentris.com>.
Hi Benedetti,

Yes, at first it looks like a user error, and I am surprised by this case
as well.

We tested this on two different systems. We also tried lowercase input, but
it still does not match. We are using the standard title column to store the
data. We even tried edit distances of 3, 4 and 5, but this particular query
does not match.

I wonder if anyone could try this on their own system to confirm whether
others see the same behaviour.

Just to clarify -

We want to match "emma bridwater radios", stored in the title column, with
the search query "Bridgewater~2" (you can use an edit distance of 3 if you
want). We observed that Solr does not match it. However, if we try
"Bridffwater~2", Solr matches it.

It might be a silly mistake on our side, but we are not able to find the
solution at present.


Thanks





Re: Keyword match distance rule issue

Posted by Alessandro Benedetti <be...@gmail.com>.
Maybe it's a silly observation...
But are you lowercasing at indexing/querying time?
Can you show us the schema analysis config for the field type you use?
I ask because, strictly speaking, in terms of Levenshtein distance bridwater
is 3 edits from Bridgewater.
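
As a rough check of that arithmetic, here is a small sketch of the
textbook Levenshtein distance (insertions, deletions and substitutions
only, no transpositions). It is not Lucene's implementation, just a way to
count the edits for the terms discussed in this thread.

```python
def levenshtein(a, b):
    """Return the number of single-character edits turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


indexed = "bridwater"
for term in ("Bridgewater", "bridgewater", "bridffwater", "bridwgeater"):
    print(term, levenshtein(term, indexed))
```

With the capital B the distance is 3; once everything is lowercased, all
three variants are 2 edits away from bridwater, which is why lowercasing at
index and query time matters here.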

Cheers
