Posted to dev@lucene.apache.org by Melissa Mifsud <me...@yahoo.com> on 2002/03/07 16:22:30 UTC

What type of indexer is Lucene? Question reworded.

Hi again!

I should really reword my question as follows:

On which criteria are relevant documents chosen given a particular query

and

once retrieved, how are these documents ranked?

The techniques by which this is done will then determine what type of IR model Lucene implements.

Thanks again!

Melissa

Re: What type of indexer is Lucene? Question reworded.

Posted by Dmitry Serebrennikov <dm...@earthlink.net>.
I can't answer all of these questions fully, but since Doug is out, I'll 
make a start. Please check the FAQ for a more detailed explanation; I 
believe you will find enough information there to answer all of your 
questions. The FAQ is linked from the Jakarta page (there are actually 
two FAQs, so you might want to check both).

As far as I understand, Lucene is a probabilistic indexer. It supports 
boolean queries, and it also supports phrase queries, for which it does 
true ranking. The ranking is based on how many of the search words 
appear in a document and on how "important" each word is for that 
document, which is a function of the word's frequency in the document 
and the size of the document.
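
To make the ranking idea concrete, here is a rough sketch of a score that 
grows with the number and frequency of matching search words and shrinks 
with document size. It is only an illustration: the class ToyScorer, the 
method score, and the square-root weighting are my own assumptions, and 
this is not Lucene's actual API or scoring formula.

```java
import java.util.Arrays;
import java.util.List;

public class ToyScorer {
    // Hypothetical scoring function for illustration only (not Lucene's formula).
    // Each query term contributes sqrt(term frequency), scaled by 1/sqrt(document length),
    // so frequent terms help and long documents are penalized.
    public static double score(List<String> queryTerms, List<String> docTokens) {
        if (docTokens.isEmpty()) {
            return 0.0;
        }
        double lengthNorm = 1.0 / Math.sqrt(docTokens.size());
        double total = 0.0;
        for (String term : queryTerms) {
            int tf = 0;
            for (String token : docTokens) {
                if (token.equals(term)) {
                    tf++;
                }
            }
            // A term that does not occur contributes nothing.
            total += Math.sqrt(tf) * lengthNorm;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("lucene", "index");
        List<String> shortDoc = Arrays.asList("lucene", "index", "library");
        List<String> longDoc = Arrays.asList(
                "lucene", "is", "a", "java", "library", "for", "text", "search");
        System.out.println("short doc: " + score(query, shortDoc));
        System.out.println("long doc:  " + score(query, longDoc));
    }
}
```

In this sketch the short document ranks higher, since it contains both 
search words and is smaller than the long one.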

For a given search, the type of result you get depends on the type of 
Query that is used. For example, boolean queries can have "traditional" 
AND terms which are all required for a match, but they can also have 
"optional" terms that rank the document higher if they are found, but do 
not rule out a document if they are not.
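
The difference between required and optional terms can be sketched as 
follows. Again, this is just a toy under my own assumptions: the class 
ToyBooleanMatch and its methods are hypothetical names, documents are 
reduced to sets of words, and the "one point per term" score is not how 
Lucene computes scores.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ToyBooleanMatch {
    // A document matches only if it contains every required ("AND") term.
    public static boolean matches(Set<String> required, Set<String> docTerms) {
        return docTerms.containsAll(required);
    }

    // Hypothetical score: 0 if a required term is missing; otherwise one point
    // per required term plus one point per optional term that is present.
    public static int score(Set<String> required, Set<String> optional, Set<String> docTerms) {
        if (!matches(required, docTerms)) {
            return 0;
        }
        int score = required.size();
        for (String term : optional) {
            if (docTerms.contains(term)) {
                score++; // optional terms only raise the rank
            }
        }
        return score;
    }

    public static void main(String[] args) {
        Set<String> required = new HashSet<>(Arrays.asList("lucene"));
        Set<String> optional = new HashSet<>(Arrays.asList("ranking"));
        Set<String> docA = new HashSet<>(Arrays.asList("lucene", "ranking", "model"));
        Set<String> docB = new HashSet<>(Arrays.asList("lucene", "faq"));
        Set<String> docC = new HashSet<>(Arrays.asList("ranking", "model"));
        System.out.println("docA: " + score(required, optional, docA));
        System.out.println("docB: " + score(required, optional, docB));
        System.out.println("docC: " + score(required, optional, docC));
    }
}
```

Here docB still matches without the optional term, only with a lower 
score than docA, while docC is ruled out because it lacks the required 
term.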

I hope this helps.
Dmitry.

