You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Richard Krenek <ri...@gmail.com> on 2005/09/09 00:03:23 UTC

Weird time results doing wildcard queries

Hello All,
I am getting some weird time results when retrieving documents back from a 
hits object. I am just timing this bit of code:
Hits hits = searcher.search(query);
long startTime = System.currentTimeMillis();
for (int i = 0; i < hits.length(); i++) {
Document doc = hits.doc(i);
String field = doc.get(defaultField);
}
System.out.println("Cycle Time: "+(System.currentTimeMillis()-startTime));

It seems when I have a wilcard query like *abcd* vs weqrew*, the *abcd* 
query will always take longer to retrieve the documents even if they are of 
simular result sizes. We are talking a big difference 1 second vs 16. It is 
consistent no matter what order I run the queries in, terms with multiple 
wildcards always take longer to retrieve the documents. I am not counting 
the time of the query.

The index is 2.18 GB, 9 fields per document, 10,694,190 documents, 
25,538,793 terms and has been optimized.

I am not sure if this is a real or just a percieved issue. We cannot figure 
out why the type of query would affect the speed it takes to retrieve each 
document. We have run this on both Windows XP and Linux. With the same 
results. Also to note we did watch GC and this did not have any significant 
impact that we could se.

We are trying to figure out what could cause this and how we can work around 
it.


Thanks,
Richard