Posted to solr-user@lucene.apache.org by Ryan Wilson <rp...@gmail.com> on 2018/10/25 21:58:10 UTC

Fuzzy search expansion problem on 6.6.3

Hello all,

I am running a Solr 6.6.3 three-shard cloud with one main collection that
contains 587,371,821 rows of data. One of the fields in this collection is
names. We are currently running into an issue with fuzzy searches on that
field: for a number of different names, the search does not return all
possible matching values, even when querying with an edit distance of
only 1 (~1).

I asked this question in the distant past, and the answer I received at
the time was to modify org.apache.lucene.search.FuzzyQuery to have a
larger defaultMaxExpansions value. For full disclosure, we also set
defaultTranspositions to false, as the customers did not like the query
results they were getting with it on. For a time this worked. However,
within the last 6 months or so we've started seeing signs of this issue
cropping up again.

The two things that have changed since the original email are that we've
migrated from 4.7.1 to 6.6.3 and that we've almost doubled the number of
records in the index. Hoping that the old solution would still work, I've
raised defaultMaxExpansions as high as 10240, with the requisite change to
maxBooleanClauses to match, and the results did not change, to the point
that I suspect the change is not being applied at all. I am in the process
of setting up a much more focused testing environment for just names, but
figured I'd send this out to get some initial advice or suggestions on
what I might have missed or should investigate.
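
To make the symptom concrete, here is a minimal Python sketch of how a cap
on term expansions can hide valid matches. It is a toy model and not
Lucene's code: the term dictionary, the document frequencies, the helper
names (within_one_edit, expand) and the frequency-based ranking are all
made up for illustration. Transpositions are deliberately not counted as a
single edit, mirroring defaultTranspositions=false.

```python
def within_one_edit(a: str, b: str) -> bool:
    """True if b is reachable from a with at most one insertion,
    deletion, or substitution (a transposition is NOT one edit here)."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # exactly one substitution
    return a[i:] == b[i + 1:]  # exactly one extra character in b


def expand(query: str, term_dict: dict, max_expansions: int):
    """Return (kept, dropped) candidate terms for a ~1 fuzzy query.

    term_dict maps an indexed term to a hypothetical document frequency.
    """
    candidates = [t for t in term_dict if within_one_edit(query, t)]
    # Toy ranking: most frequent terms first. Lucene's real ranking is
    # different; only the truncation step matters for this illustration.
    candidates.sort(key=lambda t: (-term_dict[t], t))
    return candidates[:max_expansions], candidates[max_expansions:]


# Hypothetical term dictionary for a names field.
terms = {"smith": 900, "smyth": 40, "smiths": 25, "smit": 12,
         "smitt": 3, "smithe": 2, "simth": 5, "jones": 500}

kept_small, dropped_small = expand("smith", terms, max_expansions=4)
kept_large, dropped_large = expand("smith", terms, max_expansions=50)

print("cap=4  kept:", kept_small, "dropped:", dropped_small)
print("cap=50 kept:", kept_large, "dropped:", dropped_large)
```

In this toy model, doubling the number of distinct names in the dictionary
increases the number of candidates within one edit, so a cap that used to
be large enough can start truncating again. If raising the cap changes
nothing at all, that would be consistent with the modified value not being
the one actually used at query time, which is only a guess on the basis of
this sketch.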

I've reviewed the release notes for versions before and after 6.6.3 to
check for breaking changes since 4.7.1 or fixes in later versions, and
haven't seen anything relevant.

Thanks,
Ryan Wilson