You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Jean-Baptiste Reure <jb...@weborganic.com> on 2011/04/05 09:49:00 UTC

Question about open files

Hi all,

I have been looking for information about this and found a few things here
and there but nothing very clear on when files are opened and closed by
Lucene.

We have an application that uses Lucene quite heavily in the following
fashion: there are multiple indexes in use at all times. For each index, a
single IndexWriter object is constantly opened (and used to update the
index) and a single IndexReader is also constantly opened. This Reader is
created using the almost real-time reader of the IndexWriter. The Reader is
used to query the index and when the Writer updates an index, the
corresponding reader is re-opened and the previous one is discarded when all
searches using it are finished. Periodically, the index is optimised if it
needs to be (if the writer has been used).

This is all working perfectly fine but we've been running into an issue
recently when we've add a lot of indexes on one server (more than a 1000
indexes). This led to the "Too many open files" exception, so we increased
the number of open files allowed on that server but obviously, it's a
temporary solution.

So we decided to implement a mechanism that would keep only a maximum number
of indexes "open" at all times. When the maximum number is reached, the
least recently used indexes are closed (closing the IndexWriter and the
IndexReader), and if these indexes are needed again in the future, they are
"re-opened".
This is functioning fine as well, as we can see ".lock" files appearing and
disappearing in the index directories, but handles on the segment files are
still in use.

So my question is when are these file handles discarded? I was under the
impression from reading the documentation that closing an IndexReader was
enough to close the files associated with that reader but we went even
further by closing the IndexWriter as well but to no avail. If it should
work that way, it means we have a bug somewhere with readers not being
closed properly but I couldn't find one last time I checked.

Also, do you think it is the right way to go around the "too many open
files" issue? Can you recommend using another method?

Thanks for any help.

JB Reure

Re: Question about open files

Posted by Ian Lea <ia...@gmail.com>.
Yes, to the best of my knowledge and experience, closing readers and
writers releases the file handles.


--
Ian.


On Wed, Apr 6, 2011 at 12:59 AM, Jean-Baptiste Reure
<jb...@weborganic.com> wrote:
> We are using version 3.0.3. So you can confirm that closing the writer (and
> the reader created from that writer) should be enough to release the file
> handles?
> If that is the case, that means our application has a bug somewhere that i
> need to track down.
> Thanks,
> JB.
>
> On 5 April 2011 19:48, Ian Lea <ia...@gmail.com> wrote:
>
>> Which version of lucene are you using?  Something changed in the 3.x
>> release, and maybe 2.9.x, in the way that old file handles are closed.
>>  Previously it wasn't always necessary to explicitly close everything,
>> now it is.
>>
>> Your usage sounds fine to me.  When I've hit too many open files when
>> using 3.0.x it has always been because something, somewhere, wasn't
>> being closed.
>>
>>
>> --
>> Ian.
>>
>>
>> On Tue, Apr 5, 2011 at 8:49 AM, Jean-Baptiste Reure
>> <jb...@weborganic.com> wrote:
>> > Hi all,
>> >
>> > I have been looking for information about this and found a few things
>> here
>> > and there but nothing very clear on when files are opened and closed by
>> > Lucene.
>> >
>> > We have an application that uses Lucene quite heavily in the following
>> > fashion: there are multiple indexes in use at all times. For each index,
>> a
>> > single IndexWriter object is constantly opened (and used to update the
>> > index) and a single IndexReader is also constantly opened. This Reader is
>> > created using the almost real-time reader of the IndexWriter. The Reader
>> is
>> > used to query the index and when the Writer updates an index, the
>> > corresponding reader is re-opened and the previous one is discarded when
>> all
>> > searches using it are finished. Periodically, the index is optimised if
>> it
>> > needs to be (if the writer has been used).
>> >
>> > This is all working perfectly fine but we've been running into an issue
>> > recently when we've add a lot of indexes on one server (more than a 1000
>> > indexes). This led to the "Too many open files" exception, so we
>> increased
>> > the number of open files allowed on that server but obviously, it's a
>> > temporary solution.
>> >
>> > So we decided to implement a mechanism that would keep only a maximum
>> number
>> > of indexes "open" at all times. When the maximum number is reached, the
>> > least recently used indexes are closed (closing the IndexWriter and the
>> > IndexReader), and if these indexes are needed again in the future, they
>> are
>> > "re-opened".
>> > This is functioning fine as well, as we can see ".lock" files appearing
>> and
>> > disappearing in the index directories, but handles on the segment files
>> are
>> > still in use.
>> >
>> > So my question is when are these file handles discarded? I was under the
>> > impression from reading the documentation that closing an IndexReader was
>> > enough to close the files associated with that reader but we went even
>> > further by closing the IndexWriter as well but to no avail. If it should
>> > work that way, it means we have a bug somewhere with readers not being
>> > closed properly but I couldn't find one last time I checked.
>> >
>> > Also, do you think it is the right way to go around the "too many open
>> > files" issue? Can you recommend using another method?
>> >
>> > Thanks for any help.
>> >
>> > JB Reure
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Question about open files

Posted by Jean-Baptiste Reure <jb...@weborganic.com>.
I found what the problem was: we were closing the IndexSearcher but not the
underlying IndexReader (I wrongly assumed that closing one would close the
other).
Everything is working perfectly now, thanks for the help.

JB

On 6 April 2011 22:13, Erick Erickson <er...@gmail.com> wrote:

> I suspect you're already aware of this, but I've
> overlooked the obvious so many times I thought
> I'd mention it...
>
> A classic mistake is to assign a reader with reopen
> and not close the old reader, see:
>
> http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/index/IndexReader.html#reopen()
>
> <
> http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/index/IndexReader.html#reopen()
> >
> Erick.
>
> On Tue, Apr 5, 2011 at 7:59 PM, Jean-Baptiste Reure
> <jb...@weborganic.com>wrote:
>
> > We are using version 3.0.3. So you can confirm that closing the writer
> (and
> > the reader created from that writer) should be enough to release the file
> > handles?
> > If that is the case, that means our application has a bug somewhere that
> i
> > need to track down.
> > Thanks,
> > JB.
> >
> > On 5 April 2011 19:48, Ian Lea <ia...@gmail.com> wrote:
> >
> > > Which version of lucene are you using?  Something changed in the 3.x
> > > release, and maybe 2.9.x, in the way that old file handles are closed.
> > >  Previously it wasn't always necessary to explicitly close everything,
> > > now it is.
> > >
> > > Your usage sounds fine to me.  When I've hit too many open files when
> > > using 3.0.x it has always been because something, somewhere, wasn't
> > > being closed.
> > >
> > >
> > > --
> > > Ian.
> > >
> > >
> > > On Tue, Apr 5, 2011 at 8:49 AM, Jean-Baptiste Reure
> > > <jb...@weborganic.com> wrote:
> > > > Hi all,
> > > >
> > > > I have been looking for information about this and found a few things
> > > here
> > > > and there but nothing very clear on when files are opened and closed
> by
> > > > Lucene.
> > > >
> > > > We have an application that uses Lucene quite heavily in the
> following
> > > > fashion: there are multiple indexes in use at all times. For each
> > index,
> > > a
> > > > single IndexWriter object is constantly opened (and used to update
> the
> > > > index) and a single IndexReader is also constantly opened. This
> Reader
> > is
> > > > created using the almost real-time reader of the IndexWriter. The
> > Reader
> > > is
> > > > used to query the index and when the Writer updates an index, the
> > > > corresponding reader is re-opened and the previous one is discarded
> > when
> > > all
> > > > searches using it are finished. Periodically, the index is optimised
> if
> > > it
> > > > needs to be (if the writer has been used).
> > > >
> > > > This is all working perfectly fine but we've been running into an
> issue
> > > > recently when we've add a lot of indexes on one server (more than a
> > 1000
> > > > indexes). This led to the "Too many open files" exception, so we
> > > increased
> > > > the number of open files allowed on that server but obviously, it's a
> > > > temporary solution.
> > > >
> > > > So we decided to implement a mechanism that would keep only a maximum
> > > number
> > > > of indexes "open" at all times. When the maximum number is reached,
> the
> > > > least recently used indexes are closed (closing the IndexWriter and
> the
> > > > IndexReader), and if these indexes are needed again in the future,
> they
> > > are
> > > > "re-opened".
> > > > This is functioning fine as well, as we can see ".lock" files
> appearing
> > > and
> > > > disappearing in the index directories, but handles on the segment
> files
> > > are
> > > > still in use.
> > > >
> > > > So my question is when are these file handles discarded? I was under
> > the
> > > > impression from reading the documentation that closing an IndexReader
> > was
> > > > enough to close the files associated with that reader but we went
> even
> > > > further by closing the IndexWriter as well but to no avail. If it
> > should
> > > > work that way, it means we have a bug somewhere with readers not
> being
> > > > closed properly but I couldn't find one last time I checked.
> > > >
> > > > Also, do you think it is the right way to go around the "too many
> open
> > > > files" issue? Can you recommend using another method?
> > > >
> > > > Thanks for any help.
> > > >
> > > > JB Reure
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
>

Re: Question about open files

Posted by Erick Erickson <er...@gmail.com>.
I suspect you're already aware of this, but I've
overlooked the obvious so many times I thought
I'd mention it...

A classic mistake is to assign a reader with reopen
and not close the old reader, see:
http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/index/IndexReader.html#reopen()

<http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/index/IndexReader.html#reopen()>
Erick.

On Tue, Apr 5, 2011 at 7:59 PM, Jean-Baptiste Reure
<jb...@weborganic.com>wrote:

> We are using version 3.0.3. So you can confirm that closing the writer (and
> the reader created from that writer) should be enough to release the file
> handles?
> If that is the case, that means our application has a bug somewhere that i
> need to track down.
> Thanks,
> JB.
>
> On 5 April 2011 19:48, Ian Lea <ia...@gmail.com> wrote:
>
> > Which version of lucene are you using?  Something changed in the 3.x
> > release, and maybe 2.9.x, in the way that old file handles are closed.
> >  Previously it wasn't always necessary to explicitly close everything,
> > now it is.
> >
> > Your usage sounds fine to me.  When I've hit too many open files when
> > using 3.0.x it has always been because something, somewhere, wasn't
> > being closed.
> >
> >
> > --
> > Ian.
> >
> >
> > On Tue, Apr 5, 2011 at 8:49 AM, Jean-Baptiste Reure
> > <jb...@weborganic.com> wrote:
> > > Hi all,
> > >
> > > I have been looking for information about this and found a few things
> > here
> > > and there but nothing very clear on when files are opened and closed by
> > > Lucene.
> > >
> > > We have an application that uses Lucene quite heavily in the following
> > > fashion: there are multiple indexes in use at all times. For each
> index,
> > a
> > > single IndexWriter object is constantly opened (and used to update the
> > > index) and a single IndexReader is also constantly opened. This Reader
> is
> > > created using the almost real-time reader of the IndexWriter. The
> Reader
> > is
> > > used to query the index and when the Writer updates an index, the
> > > corresponding reader is re-opened and the previous one is discarded
> when
> > all
> > > searches using it are finished. Periodically, the index is optimised if
> > it
> > > needs to be (if the writer has been used).
> > >
> > > This is all working perfectly fine but we've been running into an issue
> > > recently when we've add a lot of indexes on one server (more than a
> 1000
> > > indexes). This led to the "Too many open files" exception, so we
> > increased
> > > the number of open files allowed on that server but obviously, it's a
> > > temporary solution.
> > >
> > > So we decided to implement a mechanism that would keep only a maximum
> > number
> > > of indexes "open" at all times. When the maximum number is reached, the
> > > least recently used indexes are closed (closing the IndexWriter and the
> > > IndexReader), and if these indexes are needed again in the future, they
> > are
> > > "re-opened".
> > > This is functioning fine as well, as we can see ".lock" files appearing
> > and
> > > disappearing in the index directories, but handles on the segment files
> > are
> > > still in use.
> > >
> > > So my question is when are these file handles discarded? I was under
> the
> > > impression from reading the documentation that closing an IndexReader
> was
> > > enough to close the files associated with that reader but we went even
> > > further by closing the IndexWriter as well but to no avail. If it
> should
> > > work that way, it means we have a bug somewhere with readers not being
> > > closed properly but I couldn't find one last time I checked.
> > >
> > > Also, do you think it is the right way to go around the "too many open
> > > files" issue? Can you recommend using another method?
> > >
> > > Thanks for any help.
> > >
> > > JB Reure
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>

Re: Question about open files

Posted by Jean-Baptiste Reure <jb...@weborganic.com>.
We are using version 3.0.3. So you can confirm that closing the writer (and
the reader created from that writer) should be enough to release the file
handles?
If that is the case, that means our application has a bug somewhere that i
need to track down.
Thanks,
JB.

On 5 April 2011 19:48, Ian Lea <ia...@gmail.com> wrote:

> Which version of lucene are you using?  Something changed in the 3.x
> release, and maybe 2.9.x, in the way that old file handles are closed.
>  Previously it wasn't always necessary to explicitly close everything,
> now it is.
>
> Your usage sounds fine to me.  When I've hit too many open files when
> using 3.0.x it has always been because something, somewhere, wasn't
> being closed.
>
>
> --
> Ian.
>
>
> On Tue, Apr 5, 2011 at 8:49 AM, Jean-Baptiste Reure
> <jb...@weborganic.com> wrote:
> > Hi all,
> >
> > I have been looking for information about this and found a few things
> here
> > and there but nothing very clear on when files are opened and closed by
> > Lucene.
> >
> > We have an application that uses Lucene quite heavily in the following
> > fashion: there are multiple indexes in use at all times. For each index,
> a
> > single IndexWriter object is constantly opened (and used to update the
> > index) and a single IndexReader is also constantly opened. This Reader is
> > created using the almost real-time reader of the IndexWriter. The Reader
> is
> > used to query the index and when the Writer updates an index, the
> > corresponding reader is re-opened and the previous one is discarded when
> all
> > searches using it are finished. Periodically, the index is optimised if
> it
> > needs to be (if the writer has been used).
> >
> > This is all working perfectly fine but we've been running into an issue
> > recently when we've add a lot of indexes on one server (more than a 1000
> > indexes). This led to the "Too many open files" exception, so we
> increased
> > the number of open files allowed on that server but obviously, it's a
> > temporary solution.
> >
> > So we decided to implement a mechanism that would keep only a maximum
> number
> > of indexes "open" at all times. When the maximum number is reached, the
> > least recently used indexes are closed (closing the IndexWriter and the
> > IndexReader), and if these indexes are needed again in the future, they
> are
> > "re-opened".
> > This is functioning fine as well, as we can see ".lock" files appearing
> and
> > disappearing in the index directories, but handles on the segment files
> are
> > still in use.
> >
> > So my question is when are these file handles discarded? I was under the
> > impression from reading the documentation that closing an IndexReader was
> > enough to close the files associated with that reader but we went even
> > further by closing the IndexWriter as well but to no avail. If it should
> > work that way, it means we have a bug somewhere with readers not being
> > closed properly but I couldn't find one last time I checked.
> >
> > Also, do you think it is the right way to go around the "too many open
> > files" issue? Can you recommend using another method?
> >
> > Thanks for any help.
> >
> > JB Reure
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Question about open files

Posted by Ian Lea <ia...@gmail.com>.
Which version of lucene are you using?  Something changed in the 3.x
release, and maybe 2.9.x, in the way that old file handles are closed.
 Previously it wasn't always necessary to explicitly close everything,
now it is.

Your usage sounds fine to me.  When I've hit too many open files when
using 3.0.x it has always been because something, somewhere, wasn't
being closed.


--
Ian.


On Tue, Apr 5, 2011 at 8:49 AM, Jean-Baptiste Reure
<jb...@weborganic.com> wrote:
> Hi all,
>
> I have been looking for information about this and found a few things here
> and there but nothing very clear on when files are opened and closed by
> Lucene.
>
> We have an application that uses Lucene quite heavily in the following
> fashion: there are multiple indexes in use at all times. For each index, a
> single IndexWriter object is constantly opened (and used to update the
> index) and a single IndexReader is also constantly opened. This Reader is
> created using the almost real-time reader of the IndexWriter. The Reader is
> used to query the index and when the Writer updates an index, the
> corresponding reader is re-opened and the previous one is discarded when all
> searches using it are finished. Periodically, the index is optimised if it
> needs to be (if the writer has been used).
>
> This is all working perfectly fine but we've been running into an issue
> recently when we've add a lot of indexes on one server (more than a 1000
> indexes). This led to the "Too many open files" exception, so we increased
> the number of open files allowed on that server but obviously, it's a
> temporary solution.
>
> So we decided to implement a mechanism that would keep only a maximum number
> of indexes "open" at all times. When the maximum number is reached, the
> least recently used indexes are closed (closing the IndexWriter and the
> IndexReader), and if these indexes are needed again in the future, they are
> "re-opened".
> This is functioning fine as well, as we can see ".lock" files appearing and
> disappearing in the index directories, but handles on the segment files are
> still in use.
>
> So my question is when are these file handles discarded? I was under the
> impression from reading the documentation that closing an IndexReader was
> enough to close the files associated with that reader but we went even
> further by closing the IndexWriter as well but to no avail. If it should
> work that way, it means we have a bug somewhere with readers not being
> closed properly but I couldn't find one last time I checked.
>
> Also, do you think it is the right way to go around the "too many open
> files" issue? Can you recommend using another method?
>
> Thanks for any help.
>
> JB Reure
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org