Posted to general@lucene.apache.org by "Balaji.A" <re...@gmail.com> on 2010/10/27 16:35:14 UTC
Scoring Pattern for partial and exact match search results
Hi,
I have 6 fields in a document with respective data types given below.
field name        data type
---------------------------
content           text
title             text
description       text
content_em        text_ws
title_em          text_ws
description_em    text_ws
My requirement is to prioritize search results based on exact and partial
match conditions. Documents that have an exact match should score higher than
documents with a partial match.
To achieve this I have added 3 fields
(content_em, title_em, description_em) which contain the same content as
content, title and description respectively.
My dismax query is something like this:
mm=1&qf=content^100+description^200+title^300&pf=content_em^500000+description_em^700000+title_em^900000&fl=id&start=0&q=London&qt=dismax
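For readability, here are the same parameters written one per line, as a small Python sketch that only builds the query string. The function name build_dismax_query is just for illustration and is not part of Solr; the "+" separators in qf and pf above are URL-encoded spaces, so they are written as plain spaces here.

```python
from urllib.parse import urlencode, parse_qs


def build_dismax_query(q):
    """Build the query string for the dismax request described above.

    Hypothetical helper for illustration only; it does not contact Solr.
    """
    params = [
        ("q", q),
        ("qt", "dismax"),
        ("mm", "1"),
        # Query fields: any term match, with per-field boosts.
        ("qf", "content^100 description^200 title^300"),
        # Phrase fields: the whitespace-tokenized copies, boosted much higher.
        ("pf", "content_em^500000 description_em^700000 title_em^900000"),
        ("fl", "id"),
        ("start", "0"),
    ]
    # urlencode turns spaces into "+" and percent-encodes "^".
    return urlencode(params)


query_string = build_dismax_query("London")
print(query_string)

# Decoding the string again gives back the boosts exactly as listed.
print(parse_qs(query_string)["pf"])
```

The same helper can be called with "Ryder Cup" to get the request for the second scenario below.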
I have 2 problems with this approach:
Problem 1:
For instance, if doc1 has the text "London" appearing once in each of the
description, content and title fields, and doc2 has the same text appearing
once in only the description and content fields, doc2 gets a higher score
than doc1. Can anyone explain why this happens? Since I give more boost to
the title field, I expect a term matching that field to be given a higher
score.
Problem 2:
Another scenario is with the search term "Ryder Cup".
Doc 1 has the text "Cup" appearing 20 or more times in the content field.
Doc 2 has the text "Ryder Cup" appearing once in the title field.
On search I expect Doc 2 to be on top, since I want exact match documents to
be prioritized over partial match documents. But unfortunately Doc 1 comes out
on top with a higher score.
Since I am new to Lucene, can anyone help me solve these problems?
Many Thanks,
Balaji.
--
View this message in context: http://lucene.472066.n3.nabble.com/Scoring-Pattern-for-partial-and-exact-match-search-results-tp1780471p1780471.html