Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2009/09/04 00:36:25 UTC
Re: Clarifications to Synonym Filter Wiki entry? (2 of 2)
: Earlier on the thread repeats the claim that, if you use index side
: expansion, you won't have a problem. But it doesn't explain how/why that
: fixes it, given that the Lucene parser still breaks on white space.
because at query time, nothing knows (or cares) that multiple
variants were indexed ... if your field contains "sea" and
"biscut" and "seabiscut" the query parser doesn't care ... a query string
whose parsed form results in the query (field:seabiscut) is going
to match, ditto for (field:sea field:biscut) ... the only place things
start getting interesting is with phrase queries: because the synonyms
are put at the same term position, things typically work ok, but you
sometimes (i.e. when the synonyms have a different number of tokens) need a
non-zero slop factor to help bridge the gap.
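
To make the position/slop point concrete, here is a rough sketch in Python. It is a toy model, not Lucene's or Solr's actual code: the SYNONYMS rule, the function names, and the simplified slop check (the spread of "document position minus query position" across the phrase terms) are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical synonym rule: the single token on the left is expanded
# into the multi-token alternative on the right at index time.
SYNONYMS = {"seabiscut": ["sea", "biscut"]}


def index_with_synonyms(text):
    """Return {term: [positions]} for one document, with synonyms added."""
    postings = {}
    for pos, token in enumerate(text.lower().split()):
        postings.setdefault(token, []).append(pos)
        # The first synonym token shares the original term's position;
        # later tokens of a multi-token synonym spill into the positions
        # that follow, overlapping whatever the document has there.
        for offset, syn in enumerate(SYNONYMS.get(token, [])):
            postings.setdefault(syn, []).append(pos + offset)
    return postings


def term_matches(postings, term):
    """A single-term query only asks whether the term was indexed."""
    return term.lower() in postings


def phrase_matches(postings, phrase, slop=0):
    """Simplified sloppy phrase check (an assumption, not Lucene's scorer)."""
    terms = phrase.lower().split()
    if any(t not in postings for t in terms):
        return False  # a phrase query needs every one of its words
    for combo in product(*(postings[t] for t in terms)):
        shifted = [p - i for i, p in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


doc = index_with_synonyms("seabiscut won the race")
```

In this toy index "biscut" and "won" both land at position 1, so phrase_matches(doc, "sea biscut won") fails with slop=0 and succeeds with slop=1, while the term queries for "sea", "biscut" and "seabiscut" all match regardless.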
: Later there's a clue, it seems that even single words of a multi-word
: thesaurus entry are matched - so I guess Lucene doesn't need to see both
: words in a multi-word query, it just picks up either word, so it works
: around the multi-word parsing problem, but adds the undesirable side effect
: of false positive matches?
no ... a multi-word (phrase) query needs to match all the words ... what
that's referring to is that if a document originally contained "seabiscut"
and synonyms caused "sea" and "biscut" to be added, then a search for just
the term "sea" will match.
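
As a rough illustration of that distinction, here is a small Python sketch. It is only a toy under stated assumptions: the SYNONYMS rule and the helper names are made up, and matches_all checks just the necessary condition that every word of the query was indexed (it ignores term positions entirely).

```python
# Hypothetical index-time synonym rule, for illustration only.
SYNONYMS = {"seabiscut": ["sea", "biscut"]}


def indexed_terms(text):
    """Return the set of terms indexed for a document, synonyms included."""
    terms = set()
    for token in text.lower().split():
        terms.add(token)
        terms.update(SYNONYMS.get(token, []))
    return terms


def matches_term(terms, word):
    """A single-term query matches if that one term was indexed."""
    return word.lower() in terms


def matches_all(terms, query):
    """Every word of a multi-word query must be present in the document."""
    return all(word in terms for word in query.lower().split())


doc_terms = indexed_terms("seabiscut")
```

Here matches_term(doc_terms, "sea") is true because the synonym was added at index time, but matches_all(doc_terms, "sea horse") is false: one matching word is not enough for a multi-word query.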
-Hoss