Posted to general@lucene.apache.org by stevef-pcbi <st...@pcbi.upenn.edu> on 2008/05/12 18:39:25 UTC
words close together - like google
hi, i am a newbie to text search, but need to evaluate lucene.
my question is this: in a google query such as "prune scotch broom" it has
always seemed to me that the closer together the three words are found the
better the rank of the document.
(1) is that true?
(2) in the FAQ (http://wiki.apache.org/lucene-java/LuceneFAQ) it says this:
Does the position of the matches in the text affect the scoring?
No, the position of matches within a field does not affect ranking.
does that mean that lucene does not support what i imagine google is doing?
(3) the lucene querying language described here
http://lucene.apache.org/java/2_3_2/queryparsersyntax.html seems very fancy.
but, i don't understand why, in the most common use cases, i need it. in
google, i just type some words, and it figures the rest out. for example:
- the more words i hit the better. i don't need to specify AND or OR
- the closer they are together the better. i don't need to specify
distance requirements
thanks very much for explaining this, and please pardon any ignorance on my
part
steve
Re: words close together - like google
Posted by Chris Hostetter <ho...@fucit.org>.
: Does the position of the matches in the text affect the scoring?
:
: No, the position of matches within a field does not affect ranking.
:
: does that mean that lucene does not support what i imagine google is doing?
no .. that comment is in regards to basic term queries. if you want the
proximity of terms (to each other) to affect the scoring this can be done
with a PhraseQuery or a SpanNearQuery.
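to make "closer together scores better" concrete, here is a rough sketch in
plain Java with no Lucene classes involved. it finds the smallest window of
token positions that contains every query word and turns the extra slack in
that window into a score. the names (ProximitySketch, minWindow,
proximityScore) and the 1/(slack+1) decay are assumptions chosen for
illustration, not Lucene's actual scoring code.

```java
import java.util.Arrays;

class ProximitySketch {

    // Width (in tokens) of the smallest window containing every query term,
    // or -1 if some term is missing from the document (or no terms are given).
    static int minWindow(String[] tokens, String[] terms) {
        int best = -1;
        for (int start = 0; start < tokens.length; start++) {
            boolean[] seen = new boolean[terms.length];
            int remaining = terms.length;
            for (int end = start; end < tokens.length && remaining > 0; end++) {
                for (int t = 0; t < terms.length; t++) {
                    if (!seen[t] && terms[t].equals(tokens[end])) {
                        seen[t] = true;
                        remaining--;
                    }
                }
                if (remaining == 0) {
                    int width = end - start + 1;
                    if (best < 0 || width < best) {
                        best = width;
                    }
                }
            }
        }
        return best;
    }

    // Illustrative decay (an assumption): 1.0 when the terms are adjacent,
    // shrinking as more unrelated tokens sit between them, 0 when a term is missing.
    static float proximityScore(String[] tokens, String[] terms) {
        int width = minWindow(tokens, terms);
        if (width < 0) {
            return 0.0f;
        }
        int slack = Math.max(0, width - terms.length);
        return 1.0f / (slack + 1);
    }
}
```

with the query words prune, scotch, broom, a document containing them side by
side gets a higher proximityScore than one where they are spread across a
sentence. in Lucene itself the amount of allowed slack is what the slop
setting on a PhraseQuery controls.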
: (3) the lucene querying language described here
: http://lucene.apache.org/java/2_3_2/queryparsersyntax.html seems very fancy.
: but, i don't understand why, in the most common use cases, i need it. in
: google, i just type some words, and it figures the rest out. for example:
: - the more words i hit the better. i don't need to specify AND or OR
: - the closer they are together the better. i don't need to specify
: distance requirements
you don't have to use the QueryParser ... it's just there for convenience.
you're free to parse your query strings into Query objects any way you
want.
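as one possible way to get the "just type some words" behavior, the sketch
below takes plain user input and rewrites it into two parts: the bare words
(so more matching words means a better score under the default OR operator)
plus a sloppy phrase over the same words (so documents with the words close
together get an extra boost). it emits query parser syntax only to keep the
example self-contained; the same structure could be assembled directly from
BooleanQuery and PhraseQuery objects. the class and method names and the
word-splitting rule are hypothetical, and a real application would run the
words through the same analyzer used at index time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class PlainWordsToQuery {

    // Split on anything that is not a letter or digit and lowercase the result.
    // Lowercasing also keeps user-typed AND/OR/NOT from acting as operators.
    static List<String> words(String input) {
        List<String> out = new ArrayList<String>();
        for (String w : input.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                out.add(w);
            }
        }
        return out;
    }

    // Build: word1 word2 ... "word1 word2 ..."~slop
    // A single word (or empty input) is returned as-is, since a phrase adds nothing.
    static String toQueryString(String input, int slop) {
        List<String> ws = words(input);
        String joined = String.join(" ", ws);
        if (ws.size() < 2) {
            return joined;
        }
        return joined + " \"" + joined + "\"~" + slop;
    }
}
```

the slop value is a tuning knob: a small number only rewards near-exact
phrases, a larger one rewards words that merely share a sentence or paragraph.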
BTW: future questions about the java API will get more/better
responses from the java-user list.
-Hoss