Posted to java-user@lucene.apache.org by parth_n <na...@asu.edu> on 2014/10/02 18:36:38 UTC

Lucene Java Caching Question

Hi everyone,

I am running a Lucene application from a Java IDE. I have the following
scenario:

I run my java application (set of spatial queries) to get the execution time
and results for the queries. The application is terminated. Whenever I
re-run the application with the same set of queries, the execution time is
much lower than in the first run. So I am assuming that there is some
caching going on, but where is this stored? I have checked the index folder
(where I have created the spatial index), and no files have been updated.

I have looked for similar questions on this forum, but it seems no one has
come across this particular problem.



--
View this message in context: http://lucene.472066.n3.nabble.com/Lucene-Java-Caching-Question-tp4162354.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: Lucene Java Caching Question

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
parth_n [nagarkar@asu.edu] wrote:
> I run my java application (set of spatial queries) to get the execution time
> and results for the queries. The application is terminated. Whenever I
> re-run the application with the same set of queries, the execution time is
> much lower than in the first run. So I am assuming that there is some
> caching going on, but where is this stored?

It is the disk cache of your operating system. It is independent of Lucene and lives in memory, which is why nothing changes in your index folder. Most modern operating systems use all otherwise free memory for disk cache.

Lucene uses random access all the time and search speed is largely dictated by how fast it can do such reads. If the data are in your disk cache, they will be fetched _very_ fast.
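
To see the effect outside Lucene, here is a minimal sketch that reads one file several times and prints the elapsed time per pass. The class name ReadTimer and the method timeReads are made up for illustration, and it uses only the standard java.nio.file API. On a file that is not yet cached, the first pass is usually the slowest, though JIT warm-up in the JVM also contributes to that, so treat the numbers as indicative only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadTimer {

    /** Reads the whole file `passes` times and returns the elapsed nanoseconds of each pass. */
    static long[] timeReads(Path file, int passes) throws IOException {
        long[] nanos = new long[passes];
        byte[] buffer = new byte[1 << 16];
        for (int i = 0; i < passes; i++) {
            long start = System.nanoTime();
            try (InputStream in = Files.newInputStream(file)) {
                // Read to end of file and discard the bytes; only the I/O cost matters here.
                while (in.read(buffer) != -1) {
                    // nothing to do
                }
            }
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadTimer <file>");
            return;
        }
        long[] nanos = timeReads(Paths.get(args[0]), 3);
        for (int i = 0; i < nanos.length; i++) {
            System.out.println("pass " + (i + 1) + ": " + (nanos[i] / 1_000_000.0) + " ms");
        }
    }
}
```

Pointing it at one of the larger files in the index folder right after a reboot, and then again a minute later, should show roughly the same pattern as the query timings.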

> I have looked for similar questions on this forum, but it seems no one has
> come across this particular problem.

Problem? You mean for testing? Well, it is quite hard to test Lucene performance. With regard to the disk cache, there are three strategies:

1) Empty the disk cache before you test (how you do that depends on your operating system). This makes the tests fairly repeatable, but says nothing about real world performance, as there is always some amount of caching going on when you're running for real.

2) Fill the disk cache, either by repeating your test a few times and measuring the last result or by reading all your index files into the disk cache before you start (on Linux, running 'cat * > /dev/null' in the index directory should work). Again this ensures test repeatability, but it is only representative of real world performance if your production index size is less than the amount of free memory.

3) Try to simulate a real setup, with some queries from your production system, before you start your test. This is tricky to get right, but it is the only somewhat-sound approximation of real world performance.
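
For strategy 2, a portable stand-in for the 'cat' trick could look roughly like the sketch below. The class name DiskCacheWarmer and the method warm are hypothetical. It assumes the index is a flat directory of files, reads each regular file once, and throws the bytes away, so that the operating system has a chance to keep them in its disk cache. Whether they actually stay cached depends on how much free memory there is.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DiskCacheWarmer {

    /**
     * Reads every regular file directly inside dir (subdirectories are skipped)
     * and discards the content. Returns the total number of bytes read.
     */
    static long warm(Path dir) throws IOException {
        long total = 0;
        byte[] buffer = new byte[1 << 16];
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                try (InputStream in = Files.newInputStream(file)) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        total += n;
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DiskCacheWarmer <index-directory>");
            return;
        }
        Path indexDir = Paths.get(args[0]);
        System.out.println("Read " + warm(indexDir) + " bytes from " + indexDir);
    }
}
```

Run it against the index directory before starting the timed queries. The reported byte count also gives a quick check of whether the index can plausibly fit in free memory.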

- Toke Eskildsen
