Posted to general@lucene.apache.org by csantos <cl...@gmail.com> on 2008/12/29 15:07:24 UTC
Lucene retrieval model
Hello,
I would like to know more about Lucene's retrieval model, specifically the
Boolean part. Is it a standard model (returning only documents that match
the Boolean expression) or an extended model (returning all documents that
correspond to the given conditions, regardless of the Boolean connectors
AND, OR, NOT)?
On the Apache Lucene Scoring page I did not find much about this:
"Lucene scoring uses a combination of the Vector Space Model (VSM) of
Information Retrieval and the Boolean model to determine how relevant a
given Document is to a User's query. In general, the idea behind the VSM is
the more times a query term appears in a document relative to the number of
times the term appears in all the documents in the collection, the more
relevant that document is to the query. It uses the Boolean model to first
narrow down the documents that need to be scored based on the use of boolean
logic in the Query specification. Lucene also adds some capabilities and
refinements onto this model to support boolean and fuzzy searching, but it
essentially remains a VSM based system at the heart."
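
The two steps in that passage (a Boolean filter first, then a VSM-style
score) can be sketched roughly as follows. This is only a toy illustration:
the function name `rank`, its parameters, and the tf * log(1 + N/df) weight
are assumptions made for the example, not Lucene's API or its actual scoring
formula.

```python
import math
from collections import Counter


def rank(docs, query_terms, required=()):
    """Toy filter-then-score ranking (hypothetical, not Lucene's formula).

    docs: dict mapping a document name to its list of tokens.
    query_terms: terms that contribute to the score.
    required: terms every returned document must contain (Boolean filter).
    """
    n = len(docs)
    counts = {name: Counter(tokens) for name, tokens in docs.items()}

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for c in counts.values():
        df.update(c.keys())

    # Step 1: the Boolean part narrows down the candidate documents.
    candidates = [name for name, c in counts.items()
                  if all(t in c for t in required)]

    # Step 2: the VSM part scores the candidates. A term counts for more
    # the more often it occurs in the document and the rarer it is overall.
    def score(name):
        c = counts[name]
        return sum(c[t] * math.log(1 + n / df[t])
                   for t in query_terms if df[t])

    return sorted(((name, score(name)) for name in candidates),
                  key=lambda pair: pair[1], reverse=True)


docs = {
    "d1": ["lucene", "lucene", "search"],
    "d2": ["lucene", "index"],
    "d3": ["search", "engine"],
}
result = rank(docs, ["lucene"], required=["lucene"])
```

Here d3 never reaches the scoring step because it fails the filter, and d1
ranks above d2 because the term occurs in it more often.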
Thanks in advance for any responses
--
View this message in context: http://www.nabble.com/Lucene-retrieval-model-tp21203662p21203662.html
Sent from the Lucene - General mailing list archive at Nabble.com.
Re: Lucene retrieval model
Posted by Claudia Santos <cl...@gmail.com>.
Hello,
Thanks for the tip.
The idea of the extended Boolean model is that a weight between 0 and 1 is
calculated for every search result that contains at least one of the terms.
The extended model gives a document containing only one of the terms a
lower weight than a document containing both. A NOT B would have value 0.
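
A minimal sketch of the behavior described above, assuming binary term
presence and a simple "fraction of wanted terms matched" weight. The
function name and the weighting are hypothetical and chosen only for
illustration; this is neither the p-norm formula from the extended Boolean
literature nor anything Lucene does. It also assumes that "value 0" for
A NOT B refers to a document that contains the excluded term B.

```python
def extended_boolean_weight(doc_terms, wanted, prohibited=()):
    """Return a weight in [0, 1] for one document (illustrative only)."""
    doc_terms = set(doc_terms)

    # Assumption: a document containing an excluded term gets weight 0.
    if any(t in doc_terms for t in prohibited):
        return 0.0

    wanted = list(wanted)
    if not wanted:
        return 0.0

    # Partial matches get a lower weight than full matches.
    matched = sum(1 for t in wanted if t in doc_terms)
    return matched / len(wanted)


both = extended_boolean_weight({"a", "b"}, ["a", "b"])           # both terms
only_one = extended_boolean_weight({"a"}, ["a", "b"])            # one term
a_not_b = extended_boolean_weight({"a", "b"}, ["a"], ["b"])      # A NOT B
```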
regards,
----- Original Message -----
From: "Steven A Rowe" <sa...@syr.edu>
To: <ge...@lucene.apache.org>
Sent: Monday, December 29, 2008 8:35 PM
Subject: RE: Lucene retrieval model
Hi csantos,
Very few people are subscribed to the general@lucene.apache.org mailing
list - you'll get a much better response if you use the java-user@l.a.o list
instead.
On 12/29/2008 at 9:07 AM, csantos wrote:
> I would like to know more about Lucene's retrieval model,
> more specifically about the boolean model part, is that a
> standard model (just documents that match the boolean
> expression) or an extended model (include in the search
> result all Documents which correspond to the given
> conditions, regardless of the boolean connectors - AND,
> OR, NOT) ?
I'm not familiar with your use of the terms "standard model" and "extended
model", so take my response here with a grain of salt.
There is no way I know of to include documents in the search results that
violate the constraints represented by the connectors you use. But if
you're interested in getting all documents that match a query, can't you
simply use all OR connectors?
Out of curiosity, how useful would it be for the query "A NOT B" to return
documents that match "B"?
Steve
RE: Lucene retrieval model
Posted by Steven A Rowe <sa...@syr.edu>.
Hi csantos,
Very few people are subscribed to the general@lucene.apache.org mailing list - you'll get a much better response if you use the java-user@l.a.o list instead.
On 12/29/2008 at 9:07 AM, csantos wrote:
> I would like to know more about Lucene's retrieval model,
> more specifically about the boolean model part, is that a
> standard model (just documents that match the boolean
> expression) or an extended model (include in the search
> result all Documents which correspond to the given
> conditions, regardless of the boolean connectors - AND,
> OR, NOT) ?
I'm not familiar with your use of the terms "standard model" and "extended model", so take my response here with a grain of salt.
There is no way I know of to include documents in the search results that violate the constraints represented by the connectors you use. But if you're interested in getting all documents that match a query, can't you simply use all OR connectors?
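
As a rough sketch of those constraints, assuming a simplified model in which each clause is required (AND), optional (OR), or prohibited (NOT). The function `matches` and its parameter names are hypothetical, not Lucene API, and scoring is left out entirely.

```python
def matches(doc_terms, must=(), should=(), must_not=()):
    """Simplified match test for required/optional/prohibited terms."""
    doc_terms = set(doc_terms)

    # A prohibited term always excludes the document.
    if any(t in doc_terms for t in must_not):
        return False

    # With required terms, all of them must be present; optional terms
    # then do not affect whether the document matches.
    if must:
        return all(t in doc_terms for t in must)

    # With only optional terms, at least one has to be present.
    return any(t in doc_terms for t in should)


docs = {
    "d1": {"a", "b"},
    "d2": {"a"},
    "d3": {"c"},
}

# All-OR query "a OR b": every document containing at least one term.
or_hits = sorted(n for n, terms in docs.items()
                 if matches(terms, should=["a", "b"]))

# "a NOT b": documents containing b are never returned.
not_hits = sorted(n for n, terms in docs.items()
                  if matches(terms, must=["a"], must_not=["b"]))
```

With these three toy documents, the all-OR query returns d1 and d2, while "a NOT b" returns only d2.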
Out of curiosity, how useful would it be for the query "A NOT B" to return documents that match "B"?
Steve