Posted to java-user@lucene.apache.org by Xiyang Chen <xi...@gmail.com> on 2011/06/24 18:27:04 UTC

Constructing an IDF table without indexing documents

Hi,

I'm developing a search application with two types of documents:

   1. Documents that need to be indexed and queried against
   2. Documents that will never show up in search results, but their content
   needs to contribute to the global term frequency table

In other words, the application needs Type 2 documents only for the purpose
of constructing an IDF table
<http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/Similarity.html#formula_idf>.
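
To make the goal concrete, here is a minimal sketch of the kind of table
meant here, written in plain Java without any Lucene classes. It assumes
documents are already tokenized, and the class name IdfTable and its methods
are hypothetical, not part of Lucene. The idf formula is the one given in the
linked Similarity documentation, 1 + ln(numDocs / (docFreq + 1)). How such a
table could be fed back into Lucene scoring is exactly the open question.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: accumulates document frequencies without building an index. */
final class IdfTable {
    private final Map<String, Integer> docFreq = new HashMap<>();
    private int numDocs = 0;

    /** Registers one already-tokenized document; each distinct term counts once. */
    void addDocument(List<String> tokens) {
        Set<String> distinct = new HashSet<>(tokens);
        for (String term : distinct) {
            docFreq.merge(term, 1, Integer::sum);
        }
        numDocs++;
    }

    /** Number of registered documents that contain the term (0 if unseen). */
    int docFreq(String term) {
        return docFreq.getOrDefault(term, 0);
    }

    /** Total number of documents registered so far. */
    int numDocs() {
        return numDocs;
    }

    /** idf(t) = 1 + ln(numDocs / (docFreq + 1)), per the linked Similarity docs. */
    double idf(String term) {
        return 1.0 + Math.log(numDocs / (double) (docFreq(term) + 1));
    }
}
```

Both Type 1 and Type 2 documents would be passed to addDocument, while only
Type 1 documents would go through the regular indexing path.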

I understand I can accomplish this by indexing both types of documents and
using a special indicator field to make sure that Type 2 documents never
show up in search results. However, this approach seems wasteful, given the
full indexing overhead it incurs for Type 2 documents. Is there a more
efficient way of doing this?

Thanks,
Xiyang