Posted to java-user@lucene.apache.org by Claes Holmerson <cl...@polopoly.com> on 2003/07/03 14:36:55 UTC
Understanding how indexing works
Hi,
In my job, I have become the new maintainer of a search feature that
uses Lucene. I am trying to understand how it works by examining the
index it produces.
When I list index fields by opening an IndexReader, looping over
documents, then looping over fields with Document.fields(), I see a
number of fields. However, I don't see the fields that are searchable
within this document.
If I do IndexReader.terms() directly and then Term.field() on each term,
I see fields that no longer should exist since the documents containing
them have been deleted. Is this like it should be?
How can I easily tell if a term is no longer searchable (or rather
successfully found, when searching), apart from actually doing the
search? Is there a way to list those terms?
Thanks,
Claes
--
Claes Holmerson
Polopoly - Cultivating the information garden
Kungsgatan 88, SE-112 27 Stockholm, SWEDEN
Direct: +46 8 506 782 59
Mobile: +46 704 47 82 59
Fax: +46 8 506 782 51
claes.holmerson@polopoly.com, http://www.polopoly.com
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: Understanding how indexing works
Posted by Ype Kingma <yk...@xs4all.nl>.
Claes,
On Thursday 03 July 2003 05:36, Claes Holmerson wrote:
> Hi,
>
> In my job, I have become the new maintainer of a search feature that
> uses Lucene. I am trying to understand how it works by examining the
> index it produces.
>
> When I list index fields by opening an IndexReader, looping over
> documents, then looping over fields with Document.fields(), I see a
> number of fields. However, I don't see the fields that are searchable
> within this document.
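You only see the stored fields that way. Fields that are indexed but
not stored are searchable, yet they are not returned with the document.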
> If I do IndexReader.terms() directly and then Term.field() on each term,
> I see fields that no longer should exist since the documents containing
> them have been deleted. Is this like it should be?
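Yes, until you optimize the index. Deleting a document only marks it as
deleted; its terms stay in the index until the segments are merged.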
> How can I easily tell if a term is no longer searchable (or rather
> successfully found, when searching), apart from actually doing the
> search?
Optimize the index before checking it for the term.
> Is there a way to list those terms?
Yes, but it is slow: for each term, check whether
the associated documents are all deleted.
Kind regards,
Ype
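
The slow check described in the reply above (for each term, see whether
all of its documents are deleted) can be sketched with a toy inverted
index in plain Java. This is only an illustration of the idea, not
Lucene code: ToyIndex and every method on it are hypothetical names made
up for this sketch, terms are written as "field:text" strings, and
optimize() here is a simplified stand-in that purges deleted documents
without renumbering them.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names are Lucene classes or methods.
class ToyIndex {
    // term ("field:text") -> ids of the documents that contain it
    private final Map<String, List<Integer>> postings = new TreeMap<>();
    // one bit per document id; a set bit marks the document as deleted
    private final BitSet deleted = new BitSet();
    private int nextDocId = 0;

    int addDocument(String... terms) {
        int id = nextDocId++;
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new ArrayList<>()).add(id);
        }
        return id;
    }

    // Deleting only sets a mark; the postings are left untouched.
    void delete(int docId) {
        deleted.set(docId);
    }

    // Every term still in the dictionary, whether or not it can be found.
    List<String> terms() {
        return new ArrayList<>(postings.keySet());
    }

    // The slow check: a term is dead when all of its documents are deleted.
    List<String> deadTerms() {
        List<String> dead = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> entry : postings.entrySet()) {
            boolean anyLive = false;
            for (int docId : entry.getValue()) {
                if (!deleted.get(docId)) {
                    anyLive = true;
                    break;
                }
            }
            if (!anyLive) {
                dead.add(entry.getKey());
            }
        }
        return dead;
    }

    // Simplified stand-in for optimizing: purge deleted documents from the
    // postings, then drop terms that have no documents left.
    void optimize() {
        for (List<Integer> ids : postings.values()) {
            ids.removeIf(id -> deleted.get(id));
        }
        postings.values().removeIf(List::isEmpty);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        int first = index.addDocument("title:lucene", "body:index");
        index.addDocument("body:index", "body:search");
        index.delete(first);

        System.out.println(index.terms());     // [body:index, body:search, title:lucene]
        System.out.println(index.deadTerms()); // [title:lucene]
        index.optimize();
        System.out.println(index.terms());     // [body:index, body:search]
    }
}
```

In this model "title:lucene" is still listed after the delete because
only a mark was set, which mirrors what Claes saw when enumerating
terms. The check is slow because it walks the postings of every term;
after the purge, the plain term listing is enough.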