Posted to java-user@lucene.apache.org by Marc Sturlese <ma...@gmail.com> on 2011/06/20 11:13:22 UTC
About IndexReader.reopen with very similar indexes
Hey there,
I have a question about the behaviour of IndexReader.reopen.
I have a Tomcat server that serves a Lucene index through an IndexSearcher. If I
move index.folder to index.folder.old, move another index (say
index.folder.2) to index.folder, and then reopen the readers, something odd
happens when the two indexes are very similar in size and were both
built from scratch. When I get the new reader and compare it with
the old one:
IndexReader reader = ...
...
IndexReader newReader = reader.reopen();
if (newReader != reader) {
  ... // reader was reopened
  reader.close();
}
reader = newReader;
...
Lucene does not detect that they are different indexes.
Below you can see both indexes (they have the same number of files and the
same file names, but the sizes differ slightly, as the documents they contain
are not exactly the same).
*This does not happen if the indexes differ much more in size (and so
the file names are not the same, e.g. _4.fdt, etc.)
Index1:
-rw-r--r-- 1 marc admin 269289634 15 Jun 15:52 _3.fdt
-rw-r--r-- 1 marc admin 2066764 15 Jun 15:52 _3.fdx
-rw-r--r-- 1 marc admin 463 15 Jun 15:52 _3.fnm
-rw-r--r-- 1 marc admin 40358787 15 Jun 15:52 _3.frq
-rw-r--r-- 1 marc admin 1033384 15 Jun 15:52 _3.nrm
-rw-r--r-- 1 marc admin 27014923 15 Jun 15:52 _3.prx
-rw-r--r-- 1 marc admin 234797 15 Jun 15:52 _3.tii
-rw-r--r-- 1 marc admin 19322234 15 Jun 15:52 _3.tis
-rw-r--r-- 1 marc admin 20 15 Jun 15:52 segments.gen
-rw-r--r-- 1 marc admin 298 15 Jun 15:52 segments_2
Index2:
-rw-r--r-- 1 marc admin 269044254 15 Jun 15:52 _3.fdt
-rw-r--r-- 1 marc admin 2068116 15 Jun 15:52 _3.fdx
-rw-r--r-- 1 marc admin 463 15 Jun 15:52 _3.fnm
-rw-r--r-- 1 marc admin 40320465 15 Jun 15:52 _3.frq
-rw-r--r-- 1 marc admin 1034060 15 Jun 15:52 _3.nrm
-rw-r--r-- 1 marc admin 26967519 15 Jun 15:52 _3.prx
-rw-r--r-- 1 marc admin 235895 15 Jun 15:52 _3.tii
-rw-r--r-- 1 marc admin 19372446 15 Jun 15:52 _3.tis
-rw-r--r-- 1 marc admin 20 15 Jun 15:52 segments.gen
-rw-r--r-- 1 marc admin 298 15 Jun 15:52 segments_2
Can someone explain the criteria Lucene uses to decide whether a segment has
changed or not?
Thanks in advance.
--
View this message in context: http://lucene.472066.n3.nabble.com/About-IndexReader-reopen-with-very-similar-indexes-tp3085456p3085456.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: About IndexReader.reopen with very similar indexes
Posted by Michael McCandless <lu...@mikemccandless.com>.
Reopening is based entirely on the latest segments_N file present in the index.
Lucene loads that file and checks whether it refers to any new segments that
are not already open; if so, it opens those new ones. Any segments in common
with what the reader already has open (i.e. same segment name) are
simply reused.
Lucene doesn't look at file modification times, etc.
Mike McCandless
http://blog.mikemccandless.com
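To make the name-based reuse described in the reply concrete, here is a minimal sketch in plain Java. It is a toy model under stated assumptions, not Lucene's actual code: the class ReopenSketch, the Plan type and the plan method are hypothetical names invented for illustration, and a "segment" is reduced to just its name (such as _3). It shows why two indexes built from scratch that both end up with a single segment named _3 look unchanged to a reopen, while an index whose segment is named _4 does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of name-based segment reuse on reopen. Hypothetical names; not Lucene code. */
final class ReopenSketch {

    /** Result of comparing the already-open segments with the latest segments_N listing. */
    static final class Plan {
        final List<String> reused = new ArrayList<String>();
        final List<String> opened = new ArrayList<String>();

        /** True when no segment had to be opened, so the reader sees no new data. */
        boolean nothingNew() {
            return opened.isEmpty();
        }
    }

    /**
     * Decides, by segment name only, which segments are reused and which are opened.
     * File sizes, contents and modification times are deliberately not consulted.
     */
    static Plan plan(Set<String> openSegments, List<String> latestSegments) {
        Plan plan = new Plan();
        for (String name : latestSegments) {
            if (openSegments.contains(name)) {
                plan.reused.add(name);   // same name: keep the reader that is already open
            } else {
                plan.opened.add(name);   // unknown name: a new segment must be opened
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        // The reader currently has the first index open; its only segment is _3.
        Set<String> open = new LinkedHashSet<String>(Arrays.asList("_3"));

        // Swapped-in index that also names its only segment _3 (the case in the question).
        Plan swapped = plan(open, Arrays.asList("_3"));
        System.out.println("reused=" + swapped.reused + " opened=" + swapped.opened);

        // Swapped-in index whose segment is named _4 (the case that works as expected).
        Plan different = plan(open, Arrays.asList("_4"));
        System.out.println("reused=" + different.reused + " opened=" + different.opened);
    }
}
```

In this model the first call reuses _3 and opens nothing, which corresponds to the old data staying visible after the directory swap; the second call opens _4, which corresponds to the case where the swap is noticed.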