Posted to solr-user@lucene.apache.org by Norberto Meijome <nu...@gmail.com> on 2008/10/01 12:53:24 UTC

Re: Dismax , "query phrases"

On Tue, 30 Sep 2008 11:43:57 -0700 (PDT)
Chris Hostetter <ho...@fucit.org> wrote:

> 
> : That's why I was wondering how Dismax breaks it all apart. It makes sense...I
> : suppose what I'd like to have is a way to tell dismax which fields NOT to
> : tokenize the input for. For these fields, it would pass the full q instead of
> : each part of it. Does this make sense? would it be useful at all?
> 
> the *goal* makes sense, but the implementation would be ... problematic.
> 
> you have to remember the DisMax parser's whole way of working is to make 
> each "chunk" of input match against any qf field, and find the highest 
> scoring field for each chunk, with this input...
> 
> 	q = some phrase  & qf = a b c
> 
> ...you get...
> 
> 	( (a:some | b:some | c:some) (a:phrase | b:phrase | c:phrase) )
> 
> ...even if dismax could tell that "c" was a field that should only support 
> exact matches,

thanks Hoss,

it would be a configuration option.

> how would it fit c:"some phrase" into that structure?

does this make sense?

 ( (a:some | b:some ) (a:phrase | b:phrase) ( c:"some phrase") )


> I've already kinda forgotten how this thread started ... 

trying to get *exact* matches to always score higher using dismax - keeping in
mind that I have multiple exact fields, with different boosts...

> but would it make 
> sense to just use your "exact" fields in the pf, and have inexact versions 
> of them in the qf?  then docs that match your input exactly should score 
> at the top, but less exact matches will also still match.

aha! right, i think that makes sense...i obviously haven't got my head properly
around all the different functionality of dismax.
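
For anyone reading along, here is a rough sketch of what that request might look like. The field names (artist, artist_exact, and so on) and the boost values are made up for illustration; only q, qf, pf and defType are real dismax request parameters, and the *_exact fields are assumed to be indexed without normal tokenization.

```python
from urllib.parse import urlencode

# Hypothetical field names and boosts, for illustration only.
# qf fields (normal analysis) decide which documents match;
# pf fields (assumed "exact") only add score when the whole input matches as a phrase.
params = {
    "defType": "dismax",
    "q": "some phrase",
    "qf": "artist^2 song album",
    "pf": "artist_exact^10 song_exact^5",
}

query_string = urlencode(params)
print("/select?" + query_string)
```

Since the exact fields only appear in pf, documents that match the input exactly should rise to the top, while the qf fields still let looser matches through.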

I will try it when I'm back @ work... right now, I seem to have solved the
problem by using shingles. The fields are artist, song & album titles, so a high
match on shingles is a close approximation of exact matching, except that I had
to remove stopwords, which impacts performance.
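
In case "shingles" is unfamiliar: they are word n-grams built from adjacent tokens. In Solr this happens in the field's analysis chain via a shingle filter; the plain Python below only illustrates the idea and is not how Solr implements it. The function name and the sample title are just for the example.

```python
def shingles(text, size=2):
    """Return word n-grams ("shingles") of the given size from text."""
    words = text.lower().split()
    # Slide a window of `size` words across the token list.
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]

print(shingles("Dark Side of the Moon"))
# ['dark side', 'side of', 'of the', 'the moon']
```

A query that matches most of a title's shingles has matched most of its words in order, which is why it behaves a lot like an exact match.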

Thanks again :)
B
_________________________
{Beto|Norberto|Numard} Meijome

Which is worse: ignorance or apathy?
Don't know. Don't care.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.

Re: Dismax , "query phrases"

Posted by Chris Hostetter <ho...@fucit.org>.
: > how would it fit c:"some phrase" into that structure?
: 
: does this make sense?
: 
:  ( (a:some | b:some ) (a:phrase | b:phrase) ( c:"some phrase") )

that's pretty much exactly what pf does, the only distinction is you 
get...

 +( (a:some | b:some ) (a:phrase | b:phrase) )  ( c:"some phrase" ) 

...where the "mm" param only applies to the (mandatory) boolean built 
using the qf.
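
A toy sketch of that structure, to make the shape concrete. This is not the real DisMax parser: it assumes whitespace splitting and ignores boosts, mm, tie and field analysis. The function name dismax_structure is made up for the example.

```python
def dismax_structure(q, qf, pf):
    """Return a Lucene-style string showing the simplified dismax query shape."""
    terms = q.split()
    # One disjunction per input chunk, tried against every qf field.
    per_term = ["(" + " | ".join(f"{field}:{t}" for field in qf) + ")" for t in terms]
    mandatory = "+( " + " ".join(per_term) + " )"
    if not pf:
        return mandatory
    # The whole input as a single phrase against each pf field (optional clause).
    phrase = "( " + " | ".join(f'{field}:"{q}"' for field in pf) + " )"
    return mandatory + " " + phrase

print(dismax_structure("some phrase", ["a", "b"], ["c"]))
# +( (a:some | b:some) (a:phrase | b:phrase) ) ( c:"some phrase" )
```

The leading "+" marks the qf part as mandatory; the pf clause is optional and only contributes to the score.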


-Hoss