Posted to solr-user@lucene.apache.org by SuoNayi <su...@163.com> on 2012/03/07 16:24:33 UTC

How to exactly match fields which are multi-valued?

Hi all, how can I offer exact-match capabilities on multi-valued fields?

Any help is appreciated!

SuoNayi

Re: How to exactly match fields which are multi-valued?

Posted by Jonathan Rochkind <ro...@jhu.edu>.
Well, if you really want EXACT exact, just use a KeywordTokenizer (i.e., 
do not tokenize at all). But then matches will really have to be EXACT, 
including punctuation, whitespace, diacritics, etc. A query will 
only match if it 'exactly' matches one value in your multi-valued field.

You could try a KeywordTokenizer with some normalization too.
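
To make the trade-off concrete, here is a minimal Python sketch of the idea, not Solr code. It models each value of a multi-valued field as a single untokenized token, optionally normalized. The particular normalization steps (lowercasing, stripping diacritics, collapsing whitespace) and the function names are assumptions chosen for illustration.

```python
import unicodedata
from typing import Iterable


def normalize(value: str) -> str:
    """Turn a whole field value into one normalized token."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", value)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Lowercase and collapse runs of whitespace into single spaces.
    return " ".join(without_marks.lower().split())


def exact_match(query: str, values: Iterable[str], normalized: bool = True) -> bool:
    """Return True if the query equals one whole value of the field."""
    if normalized:
        return normalize(query) in {normalize(v) for v in values}
    # Strict mode: case, punctuation, whitespace and diacritics must all agree.
    return query in set(values)


# A hypothetical multi-valued field with two values.
values = ["Large Cat", "Café  Noir"]
```

In this model `exact_match("large cat", values)` succeeds only because of the normalization; with `normalized=False` it fails, and a partial value such as `"large"` never matches in either mode.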

Either way, though, if you're issuing a query to a field tokenized with 
KeywordTokenizer that can include whitespace in its values, you really 
need to issue it as a _phrase query_, to avoid being tripped up by the 
Lucene or dismax query parser's "pre-tokenization" on whitespace. That is 
potentially fine, since that's what you want to do anyway for 'exact match'. 
The exception is if you want to use dismax with multiple qf's, with just a 
BOOST on the 'exact match' field but _not_ a phrase query for the other 
fields. I can't figure out any way to do that with this technique.
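
As a rough illustration of issuing the query as a phrase, the Python helper below builds a quoted query string for one field. It is only a sketch: the field name `title_exact` is hypothetical, and the escaping covers just backslashes and double quotes inside the quoted phrase, which is an assumption about what the query needs rather than a complete escaping routine.

```python
def phrase_query(field: str, value: str) -> str:
    """Build a quoted phrase query such as field:"some value"."""
    # Escape backslashes first, then double quotes, so the value
    # cannot terminate the quoted phrase early.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# "title_exact" is a hypothetical field analyzed with KeywordTokenizer.
query = phrase_query("title_exact", "large cat")
```

Because the whole value sits inside the quotes, the query parser does not split it on whitespace before the field's analyzer sees it.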

It gets tricky; I haven't found a great solution.

On 3/8/2012 7:44 AM, Erick Erickson wrote:
> You haven't really given us much to go on here. Matches
> work just like they do on a single-valued field, with the exception of
> the position increment gap. Say one document indexed the single value
> large cat big dog
> in a multi-valued field. Say the next document
> indexed two values,
> large cat
> big dog
>
> And say the increment gap were 100. The token positions
> for doc 1 would be
> 0, 1, 2, 3
> and for doc 2 would be
> 0, 1, 101, 102
>
> The only effective difference is that phrase queries with "slop"
> less than 100 would NEVER match across multi-values. I.e.
> "cat big"~10 would match doc 1 but not doc 2.
>
> Best
> Erick

Re: How to exactly match fields which are multi-valued?

Posted by Erick Erickson <er...@gmail.com>.
You haven't really given us much to go on here. Matches
work just like they do on a single-valued field, with the exception of
the position increment gap. Say one document indexed the single value
large cat big dog
in a multi-valued field. Say the next document
indexed two values,
large cat
big dog

And say the increment gap were 100. The token positions
for doc 1 would be
0, 1, 2, 3
and for doc 2 would be
0, 1, 101, 102

The only effective difference is that phrase queries with "slop"
less than 100 would NEVER match across multi-values. I.e.
"cat big"~10 would match doc 1 but not doc 2.
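
The example above can be simulated with a small Python sketch. It is a simplified model, not Lucene's implementation: tokens are split on whitespace, the numbering follows the figures in this message (the exact arithmetic in Lucene may differ slightly), and the slop check is an approximation of sloppy phrase matching. The function names are made up for illustration.

```python
from itertools import product


def index_positions(values, gap):
    """Map each token to its positions across the values of one field."""
    positions = {}
    next_pos = 0
    for value in values:
        last = None
        for token in value.lower().split():
            positions.setdefault(token, []).append(next_pos)
            last = next_pos
            next_pos += 1
        if last is not None:
            # Jump ahead by the gap before indexing the next value.
            next_pos = last + gap
    return positions


def sloppy_phrase_match(positions, phrase, slop):
    """Approximate check of a phrase query with the given slop."""
    terms = phrase.lower().split()
    if not terms:
        return False
    candidates = [positions.get(term, []) for term in terms]
    if not all(candidates):
        return False  # at least one term is missing from the field
    for combo in product(*candidates):
        # Shift each position by the term's place in the phrase; an exact
        # phrase occurrence then has every shifted value equal.
        shifted = [pos - i for i, pos in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


doc1 = index_positions(["large cat big dog"], gap=100)
doc2 = index_positions(["large cat", "big dog"], gap=100)
```

With these definitions, `sloppy_phrase_match(doc1, "cat big", 10)` is true and `sloppy_phrase_match(doc2, "cat big", 10)` is false, while a phrase that stays inside one value, such as `"large cat"` with slop 0, still matches doc 2.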

Best
Erick
