Posted to solr-user@lucene.apache.org by Prasanna R <pl...@gmail.com> on 2011/08/05 20:17:36 UTC

Re: Handling space variations in queries - matching 'thunderbolt' for query 'thunder bolt'

Requesting feedback from the community one more time: does anyone have any
suggestions or comments regarding this?

Thanks in advance,

Prasanna

On Sat, Jul 30, 2011 at 12:04 AM, Prasanna R <pl...@gmail.com> wrote:

>
> We use a dismax handler with mm set to 1 in our Solr installation. I have a
> fieldType defined that creates shingles to handle space variations in the
> input strings and user queries. This fieldType successfully handles cases
> where the query is 'thunderbolt' and the document contains the string
> 'thunder bolt' (the shingle filter creates the token 'thunderbolt' during
> indexing). However, because the Lucene query parser tokenizes the query on
> whitespace before analysis, the reverse case is not handled: a document
> containing 'thunderbolt' is not matched by the query 'thunder bolt'.
>
> I find that in our dismax handler the shingle field records a match and
> scores on 'pf', but the document is not returned because none of the fields
> in 'qf' record a match (mm is 1). I am looking for suggestions on how to
> handle this scenario. Using a synonym will obviously work, but it seems a
> rather hackish solution. Is there a more elegant way of achieving a similar
> effect?
>
>
> Alternatively, is there a way to get the 'mm' parameter to factor in
> matches on 'pf' also?
>
> Kindly help.
>
> Regards,
>
> Prasanna
>
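
For readers of the archive, the asymmetry described in the quoted message can be illustrated with a small sketch. This is plain Python, not Solr or Lucene code; the function names (shingles, query_terms, matches) are hypothetical, and it assumes the shingle fieldType emits unigrams plus word pairs joined with no separator, and that each whitespace-separated chunk of the query is analyzed on its own.

```python
def shingles(text, max_size=2):
    """Return unigrams plus adjacent-word n-grams joined without a separator.

    Assumption: this mimics a shingle filter configured with an empty
    token separator; it is not the actual Solr implementation.
    """
    words = text.lower().split()
    tokens = set(words)
    for size in range(2, max_size + 1):
        for i in range(len(words) - size + 1):
            tokens.add("".join(words[i:i + size]))
    return tokens


def query_terms(query):
    """Model the pre-analysis whitespace split done by the query parser.

    Each chunk is analyzed separately, so the shingle step never sees
    two adjacent query words together.
    """
    terms = set()
    for chunk in query.split():
        terms |= shingles(chunk)
    return terms


def matches(doc, query):
    """True if any analyzed query term equals an indexed token."""
    return bool(shingles(doc) & query_terms(query))


# Document 'thunder bolt' is indexed with the shingle 'thunderbolt',
# so the single-word query finds it.
print(matches("thunder bolt", "thunderbolt"))

# Document 'thunderbolt' has only one token; the query is split into
# 'thunder' and 'bolt' before analysis, so neither term matches.
print(matches("thunderbolt", "thunder bolt"))
```

In this model the first call matches and the second does not, which mirrors the behavior reported for the 'qf' fields. The 'pf' match arises because the phrase query is built from the whole query string rather than from each chunk separately.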