You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@lucenenet.apache.org by "Allan, Brad (Bracknell)" <Br...@Fiserv.com> on 2014/03/24 16:55:03 UTC

IndexReader disposal causing unexpected 'AlreadyClosedException'

I've create some pseudo code to try and describe what I'm seeing. This 'problem' can also be replicated without multiple threads (sessions as I describe them below) if the order of operations is as described below.

How am I misunderstanding the Lucene docs (around Reopen() method)?
I have looked into the source code and the Dispose does appear to not want to close the reader if there are existing refs. What do I need to do in my code to ensure the following scenario does not occur?

A solution I'm mulling over is to keep an additional reference count for the readers I dish out from 'MyCache' and only Dispose the reader when MyCache knows there are no existing refs to the 'old' reader.

Wise words appreciated. Thanks.

// Session 1
var reader1 = GetIndexReaderFromMyCache("MyTestIndex");

// Session 2
....some changes are made to 'MyTestIndex'.....

// Session 3
var reader3 = GetIndexReaderFromMyCache("MyTestIndex");

// Session 1
....perform a search with reader1....
BOOM! get 'AlreadyClosedException'

Where:
-----------
IndexReader GetIndexReaderFromMyCache(string indexName) {
     lock(lockObject) {
           IndexReader reader = MyCache.Get(indexName);
           IndexReader newReader = reader.Reopen();
           if (newReader != reader) {
                reader.Dispose();
           }
           reader = newReader;
           MyCache.Update(indexName, reader);
           return reader;
     }
}


________________________________

CheckFree Solutions Limited (trading as Fiserv)
Registered Office: Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
Registered in England: No. 2694333

Re: IndexReader disposal causing unexpected 'AlreadyClosedException'

Posted by Alex Davidson <de...@hotmail.com>.
Hi all,

I don't recall finding anything in the source to indicate that absence
of a Dispose call would leak anything important, but being a bit
paranoid about that stuff I made sure it got called correctly anyway in
my own Lucene.NET wrapper.

I implemented a shim which returned a disposable 'session' object
wrapping a Reader and Searcher, which would IncRef at construction and
DecRef at disposal. The factory would Dispose the reader and searcher
before replacing them. Each new shim object would atomically take
references to them. There was something about lazily creating a Searcher
too, since the refresh rate was such that not every snapshot would get
searched against.

This was mainly for the purposes of instrumentation, because we had
issues with suspected resource leaks somewhere and needed to guarantee
proper cleanup of absolutely everything. As I recall, Dispose pretty
much just calls DecRef and lets the reference-counting clean stuff up.
So as long as construction is matched with Dispose and every IncRef is
matched with a DecRef, everything should get 'disposed' at the right
time.

Alex

On Mon, 2014-03-24 at 19:37 +0100, Simon Svensson wrote:
> Hi,
> 
> I wrote that answer in 2011 and have continued building code on the 
> assumption that the normally occurring garbage collection solves 
> everything, so far without any issues. The usual logic goes...
> 
>      public void Update(Entity x) {
>          // writer.AddDocument, writer.Commit, ...
>          _reader = _reader.Reopen();
>          _searcher = null;
>      }
> 
>      public IndexSearcher GetSearcher() {
>          var searcher = _searcher;
>          if (searcher == null) {
>              searcher = new IndexSearcher(_reader);
>              _searcher = searcher;
>          }
> 
>          return searcher;
>      }
> 
> While there are some threading issues with this code, I've never had any 
> memory issues with this implementation. Old reader instances (from 
> Reopen) are thrown away once any active searchers are done, same thing 
> for old searcher instances. Duplicate searcher instances (threading 
> issue) are also garbage collected.
> 
> // Simon
> 
> On 24/03/14 19:07, Paul Irwin wrote:
> > Yes, relying on GC is Simon's answer in that SO question and has been my
> > approach as well and it hasn't caused any problems. There isn't any complex
> > or unmanaged state in the
> > IndexSearcher/IndexReader/DirectoryReader/SegmentReader, so GC cleans it up
> > nicely. I would be interested to know if anyone has found an issue with
> > this approach, though.
> >
> >
> > On Mon, Mar 24, 2014 at 1:37 PM, Allan, Brad (Bracknell) <
> > Brad.Allan@fiserv.com> wrote:
> >
> >> Thanks Paul,
> >> Great answer!
> >>
> >> One further query around your suggestion, in the following scenario:
> >> - Thread 1 has a searcher from the manager.
> >> - Some index related changes happen (i.e. reopen will be necessary)
> >> - Thread 2 asks for a searcher - cool, we return a 'new' IndexSearcher
> >> - Thread 1 is done with searching.......nothing would seem to Dispose the
> >> 'old' IndexReader?  So do I just trust .NET GC to clean it up because my
> >> application now has zero refs?
> >>
> >> -----Original Message-----
> >> From: Paul Irwin [mailto:pirwin@feature23.com]
> >> Sent: 24 March 2014 16:43
> >> To: user@lucenenet.apache.org
> >> Subject: Re: IndexReader disposal causing unexpected
> >> 'AlreadyClosedException'
> >>
> >> Brad,
> >>
> >> Yes the technique is similar however you are calling Dispose on an object
> >> that might have references to it elsewhere, hence your exception. So best
> >> practice, unless you have to implement some sort of global disposal
> >> strategy after testing shows that GC isn't effective enough, is to let GC
> >> take care of it. See here: http://stackoverflow.com/a/5491000/44636
> >>
> >> Also static variables would be preferred to storing in a cache as caches
> >> are typically LRU or memory-pressure managed, and may end up being
> >> serialized out of process as your architecture grows. Keeping it in a
> >> static variable makes it clear that you don't want it cleaned up under
> >> memory pressure and that it can't be serialized for out-of-process use.
> >> Maybe try something like this, if you have to constantly make sure it's up
> >> to date. Note the use of IsCurrent() to check to see if you need to
> >> Reopen() it, and that if you're mainly dealing with IndexSearcher and not
> >> IndexReader directly, try to stick to using IndexSearcher and keep the
> >> IndexReader hidden. The IndexSearcher will have a reference to the reader,
> >> and that should be singleton as well for performance reasons.
> >>
> >> public static class IndexSearcherManager
> >>      {
> >>          private static readonly Directory _dir;
> >>          private static IndexReader _reader;
> >>          private static IndexSearcher _searcher;
> >>          private static readonly object _lock = new object();
> >>
> >>          static IndexSearcherManager()
> >>          {
> >>              _dir = new RAMDirectory(); // replace with your dir
> >>              _reader = IndexReader.Open(_dir, true);
> >>              _searcher = new IndexSearcher(_reader);
> >>          }
> >>
> >>          public static Directory Directory
> >>          {
> >>              get { return _dir; }
> >>          }
> >>
> >>          public static IndexReader IndexReader
> >>          {
> >>              get
> >>              {
> >>                  ReopenIfNecessary();
> >>                  return _reader;
> >>              }
> >>          }
> >>
> >>          public static IndexSearcher IndexSearcher
> >>          {
> >>              get
> >>              {
> >>                  ReopenIfNecessary();
> >>                  return _searcher;
> >>              }
> >>          }
> >>
> >>          private static void ReopenIfNecessary()
> >>          {
> >>              if (_reader.IsCurrent())
> >>                  return;
> >>
> >>              lock (_lock)
> >>              {
> >>                  if (_reader.IsCurrent())
> >>                      return;
> >>
> >>                  _reader = _reader.Reopen();
> >>                  _searcher = new IndexSearcher(_reader);
> >>              }
> >>          }
> >>      }
> >>
> >>
> >> On Mon, Mar 24, 2014 at 12:25 PM, Allan, Brad (Bracknell) <
> >> Brad.Allan@fiserv.com> wrote:
> >>
> >>> Thanks Paul for responding.
> >>> The background timer is not an option for the application.
> >>>
> >>> The 'cache' is only administering a single index reader for each index
> >>> in my system, and the cache does perform a reopen before returning the
> >>> reader, so I am using a technique similar to what you are describing.
> >>>
> >>> It seems to me that I can call IncRef() and DecRef(). For the latter
> >>> I'll have to add something to the cache that allows 'getters' to let
> >>> the cache know they are done.
> >>>
> >>> -----Original Message-----
> >>> From: Paul Irwin [mailto:pirwin@feature23.com]
> >>> Sent: 24 March 2014 16:07
> >>> To: user@lucenenet.apache.org
> >>> Cc: lucene-net-user@lucene.apache.org
> >>> Subject: Re: IndexReader disposal causing unexpected
> >>> 'AlreadyClosedException'
> >>>
> >>> I'd recommend trying to keep the IndexReader instance as a singleton
> >>> static variable. It's safe to use across multiple threads, so there's
> >>> no benefit to disposing and creating new instances. Just call
> >>> .Reopen() on it when the data changes.
> >>>
> >>> What's happening here is the "reader" variable is being returned from
> >>> the first call to the method, then on the second call, you're getting
> >>> the same instance back from cache and then calling dispose on it,
> >>> while the "reader1" variable still has a reference to that instance,
> >>> so when you do anything on "reader1" it's already disposed. So yeah, I
> >>> would recommend keeping an instance open singleton, never calling
> >>> Dispose on it, and calling Reopen when needed.
> >>>
> >>> If you need to naively call Reopen periodically because you don't know
> >>> when the index changes, you could do that on a background timer tick
> >>> if you can accept slightly stale data.
> >>>
> >>> Paul
> >>>
> >>>
> >>> On Mon, Mar 24, 2014 at 11:55 AM, Allan, Brad (Bracknell) <
> >>> Brad.Allan@fiserv.com> wrote:
> >>>
> >>>> I've create some pseudo code to try and describe what I'm seeing.
> >>>> This 'problem' can also be replicated without multiple threads
> >>>> (sessions as I describe them below) if the order of operations is as
> >> described below.
> >>>> How am I misunderstanding the Lucene docs (around Reopen() method)?
> >>>> I have looked into the source code and the Dispose does appear to
> >>>> not want to close the reader if there are existing refs. What do I
> >>>> need to do in my code to ensure the following scenario does not occur?
> >>>>
> >>>> A solution I'm mulling over is to keep an additional reference count
> >>>> for the readers I dish out from 'MyCache' and only Dispose the
> >>>> reader when MyCache knows there are no existing refs to the 'old'
> >> reader.
> >>>> Wise words appreciated. Thanks.
> >>>>
> >>>> // Session 1
> >>>> var reader1 = GetIndexReaderFromMyCache("MyTestIndex");
> >>>>
> >>>> // Session 2
> >>>> ....some changes are made to 'MyTestIndex'.....
> >>>>
> >>>> // Session 3
> >>>> var reader3 = GetIndexReaderFromMyCache("MyTestIndex");
> >>>>
> >>>> // Session 1
> >>>> ....perform a search with reader1....
> >>>> BOOM! get 'AlreadyClosedException'
> >>>>
> >>>> Where:
> >>>> -----------
> >>>> IndexReader GetIndexReaderFromMyCache(string indexName) {
> >>>>       lock(lockObject) {
> >>>>             IndexReader reader = MyCache.Get(indexName);
> >>>>             IndexReader newReader = reader.Reopen();
> >>>>             if (newReader != reader) {
> >>>>                  reader.Dispose();
> >>>>             }
> >>>>             reader = newReader;
> >>>>             MyCache.Update(indexName, reader);
> >>>>             return reader;
> >>>>       }
> >>>> }
> >>>>
> >>>>
> >>>> ________________________________
> >>>>
> >>>> CheckFree Solutions Limited (trading as Fiserv) Registered Office:
> >>>> Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
> >>>> Registered in England: No. 2694333
> >>>>
> >>>
> >>>
> >>> --
> >>>
> >>> Paul Irwin
> >>> Lead Software Engineer
> >>> feature[23]
> >>>
> >>> Email: pirwin@feature23.com
> >>> Cell: 863-698-9294
> >>>
> >>> ________________________________
> >>>
> >>> CheckFree Solutions Limited (trading as Fiserv) Registered Office:
> >>> Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
> >>> Registered in England: No. 2694333
> >>>
> >>
> >>
> >> --
> >>
> >> Paul Irwin
> >> Lead Software Engineer
> >> feature[23]
> >>
> >> Email: pirwin@feature23.com
> >> Cell: 863-698-9294
> >>
> >> ________________________________
> >>
> >> CheckFree Solutions Limited (trading as Fiserv)
> >> Registered Office: Eversheds House, 70 Great Bridgewater Street,
> >> Manchester, M15 ES
> >> Registered in England: No. 2694333
> >>
> >
> >
> 

-- 
Sent from a real computer


Re: IndexReader disposal causing unexpected 'AlreadyClosedException'

Posted by Simon Svensson <si...@devhost.se>.
Hi,

I wrote that answer in 2011 and have continued building code on the 
assumption that the normally occurring garbage collection solves 
everything, so far without any issues. The usual logic goes...

     public void Update(Entity x) {
         // writer.AddDocument, writer.Commit, ...
         _reader = _reader.Reopen();
         _searcher = null;
     }

     public IndexSearcher GetSearcher() {
         var searcher = _searcher;
         if (searcher == null) {
             searcher = new IndexSearcher(_reader);
             _searcher = searcher;
         }

         return searcher;
     }

While there are some threading issues with this code, I've never had any 
memory issues with this implementation. Old reader instances (from 
Reopen) are thrown away once any active searchers are done, same thing 
for old searcher instances. Duplicate searcher instances (threading 
issue) are also garbage collected.

// Simon

On 24/03/14 19:07, Paul Irwin wrote:
> Yes, relying on GC is Simon's answer in that SO question and has been my
> approach as well and it hasn't caused any problems. There isn't any complex
> or unmanaged state in the
> IndexSearcher/IndexReader/DirectoryReader/SegmentReader, so GC cleans it up
> nicely. I would be interested to know if anyone has found an issue with
> this approach, though.
>
>
> On Mon, Mar 24, 2014 at 1:37 PM, Allan, Brad (Bracknell) <
> Brad.Allan@fiserv.com> wrote:
>
>> Thanks Paul,
>> Great answer!
>>
>> One further query around your suggestion, in the following scenario:
>> - Thread 1 has a searcher from the manager.
>> - Some index related changes happen (i.e. reopen will be necessary)
>> - Thread 2 asks for a searcher - cool, we return a 'new' IndexSearcher
>> - Thread 1 is done with searching.......nothing would seem to Dispose the
>> 'old' IndexReader?  So do I just trust .NET GC to clean it up because my
>> application now has zero refs?
>>
>> -----Original Message-----
>> From: Paul Irwin [mailto:pirwin@feature23.com]
>> Sent: 24 March 2014 16:43
>> To: user@lucenenet.apache.org
>> Subject: Re: IndexReader disposal causing unexpected
>> 'AlreadyClosedException'
>>
>> Brad,
>>
>> Yes the technique is similar however you are calling Dispose on an object
>> that might have references to it elsewhere, hence your exception. So best
>> practice, unless you have to implement some sort of global disposal
>> strategy after testing shows that GC isn't effective enough, is to let GC
>> take care of it. See here: http://stackoverflow.com/a/5491000/44636
>>
>> Also static variables would be preferred to storing in a cache as caches
>> are typically LRU or memory-pressure managed, and may end up being
>> serialized out of process as your architecture grows. Keeping it in a
>> static variable makes it clear that you don't want it cleaned up under
>> memory pressure and that it can't be serialized for out-of-process use.
>> Maybe try something like this, if you have to constantly make sure it's up
>> to date. Note the use of IsCurrent() to check to see if you need to
>> Reopen() it, and that if you're mainly dealing with IndexSearcher and not
>> IndexReader directly, try to stick to using IndexSearcher and keep the
>> IndexReader hidden. The IndexSearcher will have a reference to the reader,
>> and that should be singleton as well for performance reasons.
>>
>> public static class IndexSearcherManager
>>      {
>>          private static readonly Directory _dir;
>>          private static IndexReader _reader;
>>          private static IndexSearcher _searcher;
>>          private static readonly object _lock = new object();
>>
>>          static IndexSearcherManager()
>>          {
>>              _dir = new RAMDirectory(); // replace with your dir
>>              _reader = IndexReader.Open(_dir, true);
>>              _searcher = new IndexSearcher(_reader);
>>          }
>>
>>          public static Directory Directory
>>          {
>>              get { return _dir; }
>>          }
>>
>>          public static IndexReader IndexReader
>>          {
>>              get
>>              {
>>                  ReopenIfNecessary();
>>                  return _reader;
>>              }
>>          }
>>
>>          public static IndexSearcher IndexSearcher
>>          {
>>              get
>>              {
>>                  ReopenIfNecessary();
>>                  return _searcher;
>>              }
>>          }
>>
>>          private static void ReopenIfNecessary()
>>          {
>>              if (_reader.IsCurrent())
>>                  return;
>>
>>              lock (_lock)
>>              {
>>                  if (_reader.IsCurrent())
>>                      return;
>>
>>                  _reader = _reader.Reopen();
>>                  _searcher = new IndexSearcher(_reader);
>>              }
>>          }
>>      }
>>
>>
>> On Mon, Mar 24, 2014 at 12:25 PM, Allan, Brad (Bracknell) <
>> Brad.Allan@fiserv.com> wrote:
>>
>>> Thanks Paul for responding.
>>> The background timer is not an option for the application.
>>>
>>> The 'cache' is only administering a single index reader for each index
>>> in my system, and the cache does perform a reopen before returning the
>>> reader, so I am using a technique similar to what you are describing.
>>>
>>> It seems to me that I can call IncRef() and DecRef(). For the latter
>>> I'll have to add something to the cache that allows 'getters' to let
>>> the cache know they are done.
>>>
>>> -----Original Message-----
>>> From: Paul Irwin [mailto:pirwin@feature23.com]
>>> Sent: 24 March 2014 16:07
>>> To: user@lucenenet.apache.org
>>> Cc: lucene-net-user@lucene.apache.org
>>> Subject: Re: IndexReader disposal causing unexpected
>>> 'AlreadyClosedException'
>>>
>>> I'd recommend trying to keep the IndexReader instance as a singleton
>>> static variable. It's safe to use across multiple threads, so there's
>>> no benefit to disposing and creating new instances. Just call
>>> .Reopen() on it when the data changes.
>>>
>>> What's happening here is the "reader" variable is being returned from
>>> the first call to the method, then on the second call, you're getting
>>> the same instance back from cache and then calling dispose on it,
>>> while the "reader1" variable still has a reference to that instance,
>>> so when you do anything on "reader1" it's already disposed. So yeah, I
>>> would recommend keeping an instance open singleton, never calling
>>> Dispose on it, and calling Reopen when needed.
>>>
>>> If you need to naively call Reopen periodically because you don't know
>>> when the index changes, you could do that on a background timer tick
>>> if you can accept slightly stale data.
>>>
>>> Paul
>>>
>>>
>>> On Mon, Mar 24, 2014 at 11:55 AM, Allan, Brad (Bracknell) <
>>> Brad.Allan@fiserv.com> wrote:
>>>
>>>> I've create some pseudo code to try and describe what I'm seeing.
>>>> This 'problem' can also be replicated without multiple threads
>>>> (sessions as I describe them below) if the order of operations is as
>> described below.
>>>> How am I misunderstanding the Lucene docs (around Reopen() method)?
>>>> I have looked into the source code and the Dispose does appear to
>>>> not want to close the reader if there are existing refs. What do I
>>>> need to do in my code to ensure the following scenario does not occur?
>>>>
>>>> A solution I'm mulling over is to keep an additional reference count
>>>> for the readers I dish out from 'MyCache' and only Dispose the
>>>> reader when MyCache knows there are no existing refs to the 'old'
>> reader.
>>>> Wise words appreciated. Thanks.
>>>>
>>>> // Session 1
>>>> var reader1 = GetIndexReaderFromMyCache("MyTestIndex");
>>>>
>>>> // Session 2
>>>> ....some changes are made to 'MyTestIndex'.....
>>>>
>>>> // Session 3
>>>> var reader3 = GetIndexReaderFromMyCache("MyTestIndex");
>>>>
>>>> // Session 1
>>>> ....perform a search with reader1....
>>>> BOOM! get 'AlreadyClosedException'
>>>>
>>>> Where:
>>>> -----------
>>>> IndexReader GetIndexReaderFromMyCache(string indexName) {
>>>>       lock(lockObject) {
>>>>             IndexReader reader = MyCache.Get(indexName);
>>>>             IndexReader newReader = reader.Reopen();
>>>>             if (newReader != reader) {
>>>>                  reader.Dispose();
>>>>             }
>>>>             reader = newReader;
>>>>             MyCache.Update(indexName, reader);
>>>>             return reader;
>>>>       }
>>>> }
>>>>
>>>>
>>>> ________________________________
>>>>
>>>> CheckFree Solutions Limited (trading as Fiserv) Registered Office:
>>>> Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
>>>> Registered in England: No. 2694333
>>>>
>>>
>>>
>>> --
>>>
>>> Paul Irwin
>>> Lead Software Engineer
>>> feature[23]
>>>
>>> Email: pirwin@feature23.com
>>> Cell: 863-698-9294
>>>
>>> ________________________________
>>>
>>> CheckFree Solutions Limited (trading as Fiserv) Registered Office:
>>> Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
>>> Registered in England: No. 2694333
>>>
>>
>>
>> --
>>
>> Paul Irwin
>> Lead Software Engineer
>> feature[23]
>>
>> Email: pirwin@feature23.com
>> Cell: 863-698-9294
>>
>> ________________________________
>>
>> CheckFree Solutions Limited (trading as Fiserv)
>> Registered Office: Eversheds House, 70 Great Bridgewater Street,
>> Manchester, M15 ES
>> Registered in England: No. 2694333
>>
>
>


Re: IndexReader disposal causing unexpected 'AlreadyClosedException'

Posted by Paul Irwin <pi...@feature23.com>.
Yes, relying on GC is Simon's answer in that SO question and has been my
approach as well and it hasn't caused any problems. There isn't any complex
or unmanaged state in the
IndexSearcher/IndexReader/DirectoryReader/SegmentReader, so GC cleans it up
nicely. I would be interested to know if anyone has found an issue with
this approach, though.


On Mon, Mar 24, 2014 at 1:37 PM, Allan, Brad (Bracknell) <
Brad.Allan@fiserv.com> wrote:

> Thanks Paul,
> Great answer!
>
> One further query around your suggestion, in the following scenario:
> - Thread 1 has a searcher from the manager.
> - Some index related changes happen (i.e. reopen will be necessary)
> - Thread 2 asks for a searcher - cool, we return a 'new' IndexSearcher
> - Thread 1 is done with searching.......nothing would seem to Dispose the
> 'old' IndexReader?  So do I just trust .NET GC to clean it up because my
> application now has zero refs?
>
> -----Original Message-----
> From: Paul Irwin [mailto:pirwin@feature23.com]
> Sent: 24 March 2014 16:43
> To: user@lucenenet.apache.org
> Subject: Re: IndexReader disposal causing unexpected
> 'AlreadyClosedException'
>
> Brad,
>
> Yes the technique is similar however you are calling Dispose on an object
> that might have references to it elsewhere, hence your exception. So best
> practice, unless you have to implement some sort of global disposal
> strategy after testing shows that GC isn't effective enough, is to let GC
> take care of it. See here: http://stackoverflow.com/a/5491000/44636
>
> Also static variables would be preferred to storing in a cache as caches
> are typically LRU or memory-pressure managed, and may end up being
> serialized out of process as your architecture grows. Keeping it in a
> static variable makes it clear that you don't want it cleaned up under
> memory pressure and that it can't be serialized for out-of-process use.
> Maybe try something like this, if you have to constantly make sure it's up
> to date. Note the use of IsCurrent() to check to see if you need to
> Reopen() it, and that if you're mainly dealing with IndexSearcher and not
> IndexReader directly, try to stick to using IndexSearcher and keep the
> IndexReader hidden. The IndexSearcher will have a reference to the reader,
> and that should be singleton as well for performance reasons.
>
> public static class IndexSearcherManager
>     {
>         private static readonly Directory _dir;
>         private static IndexReader _reader;
>         private static IndexSearcher _searcher;
>         private static readonly object _lock = new object();
>
>         static IndexSearcherManager()
>         {
>             _dir = new RAMDirectory(); // replace with your dir
>             _reader = IndexReader.Open(_dir, true);
>             _searcher = new IndexSearcher(_reader);
>         }
>
>         public static Directory Directory
>         {
>             get { return _dir; }
>         }
>
>         public static IndexReader IndexReader
>         {
>             get
>             {
>                 ReopenIfNecessary();
>                 return _reader;
>             }
>         }
>
>         public static IndexSearcher IndexSearcher
>         {
>             get
>             {
>                 ReopenIfNecessary();
>                 return _searcher;
>             }
>         }
>
>         private static void ReopenIfNecessary()
>         {
>             if (_reader.IsCurrent())
>                 return;
>
>             lock (_lock)
>             {
>                 if (_reader.IsCurrent())
>                     return;
>
>                 _reader = _reader.Reopen();
>                 _searcher = new IndexSearcher(_reader);
>             }
>         }
>     }
>
>
> On Mon, Mar 24, 2014 at 12:25 PM, Allan, Brad (Bracknell) <
> Brad.Allan@fiserv.com> wrote:
>
> > Thanks Paul for responding.
> > The background timer is not an option for the application.
> >
> > The 'cache' is only administering a single index reader for each index
> > in my system, and the cache does perform a reopen before returning the
> > reader, so I am using a technique similar to what you are describing.
> >
> > It seems to me that I can call IncRef() and DecRef(). For the latter
> > I'll have to add something to the cache that allows 'getters' to let
> > the cache know they are done.
> >
> > -----Original Message-----
> > From: Paul Irwin [mailto:pirwin@feature23.com]
> > Sent: 24 March 2014 16:07
> > To: user@lucenenet.apache.org
> > Cc: lucene-net-user@lucene.apache.org
> > Subject: Re: IndexReader disposal causing unexpected
> > 'AlreadyClosedException'
> >
> > I'd recommend trying to keep the IndexReader instance as a singleton
> > static variable. It's safe to use across multiple threads, so there's
> > no benefit to disposing and creating new instances. Just call
> > .Reopen() on it when the data changes.
> >
> > What's happening here is the "reader" variable is being returned from
> > the first call to the method, then on the second call, you're getting
> > the same instance back from cache and then calling dispose on it,
> > while the "reader1" variable still has a reference to that instance,
> > so when you do anything on "reader1" it's already disposed. So yeah, I
> > would recommend keeping an instance open singleton, never calling
> > Dispose on it, and calling Reopen when needed.
> >
> > If you need to naively call Reopen periodically because you don't know
> > when the index changes, you could do that on a background timer tick
> > if you can accept slightly stale data.
> >
> > Paul
> >
> >
> > On Mon, Mar 24, 2014 at 11:55 AM, Allan, Brad (Bracknell) <
> > Brad.Allan@fiserv.com> wrote:
> >
> > > I've create some pseudo code to try and describe what I'm seeing.
> > > This 'problem' can also be replicated without multiple threads
> > > (sessions as I describe them below) if the order of operations is as
> described below.
> > >
> > > How am I misunderstanding the Lucene docs (around Reopen() method)?
> > > I have looked into the source code and the Dispose does appear to
> > > not want to close the reader if there are existing refs. What do I
> > > need to do in my code to ensure the following scenario does not occur?
> > >
> > > A solution I'm mulling over is to keep an additional reference count
> > > for the readers I dish out from 'MyCache' and only Dispose the
> > > reader when MyCache knows there are no existing refs to the 'old'
> reader.
> > >
> > > Wise words appreciated. Thanks.
> > >
> > > // Session 1
> > > var reader1 = GetIndexReaderFromMyCache("MyTestIndex");
> > >
> > > // Session 2
> > > ....some changes are made to 'MyTestIndex'.....
> > >
> > > // Session 3
> > > var reader3 = GetIndexReaderFromMyCache("MyTestIndex");
> > >
> > > // Session 1
> > > ....perform a search with reader1....
> > > BOOM! get 'AlreadyClosedException'
> > >
> > > Where:
> > > -----------
> > > IndexReader GetIndexReaderFromMyCache(string indexName) {
> > >      lock(lockObject) {
> > >            IndexReader reader = MyCache.Get(indexName);
> > >            IndexReader newReader = reader.Reopen();
> > >            if (newReader != reader) {
> > >                 reader.Dispose();
> > >            }
> > >            reader = newReader;
> > >            MyCache.Update(indexName, reader);
> > >            return reader;
> > >      }
> > > }
> > >
> > >
> > > ________________________________
> > >
> > > CheckFree Solutions Limited (trading as Fiserv) Registered Office:
> > > Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
> > > Registered in England: No. 2694333
> > >
> >
> >
> >
> > --
> >
> > Paul Irwin
> > Lead Software Engineer
> > feature[23]
> >
> > Email: pirwin@feature23.com
> > Cell: 863-698-9294
> >
> > ________________________________
> >
> > CheckFree Solutions Limited (trading as Fiserv) Registered Office:
> > Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
> > Registered in England: No. 2694333
> >
>
>
>
> --
>
> Paul Irwin
> Lead Software Engineer
> feature[23]
>
> Email: pirwin@feature23.com
> Cell: 863-698-9294
>
> ________________________________
>
> CheckFree Solutions Limited (trading as Fiserv)
> Registered Office: Eversheds House, 70 Great Bridgewater Street,
> Manchester, M15 ES
> Registered in England: No. 2694333
>



-- 

Paul Irwin
Lead Software Engineer
feature[23]

Email: pirwin@feature23.com
Cell: 863-698-9294

RE: IndexReader disposal causing unexpected 'AlreadyClosedException'

Posted by "Allan, Brad (Bracknell)" <Br...@Fiserv.com>.
Thanks Paul,
Great answer!

One further query around your suggestion, in the following scenario:
- Thread 1 has a searcher from the manager.
- Some index related changes happen (i.e. reopen will be necessary)
- Thread 2 asks for a searcher - cool, we return a 'new' IndexSearcher
- Thread 1 is done with searching.......nothing would seem to Dispose the 'old' IndexReader?  So do I just trust .NET GC to clean it up because my application now has zero refs?

-----Original Message-----
From: Paul Irwin [mailto:pirwin@feature23.com]
Sent: 24 March 2014 16:43
To: user@lucenenet.apache.org
Subject: Re: IndexReader disposal causing unexpected 'AlreadyClosedException'

Brad,

Yes the technique is similar however you are calling Dispose on an object that might have references to it elsewhere, hence your exception. So best practice, unless you have to implement some sort of global disposal strategy after testing shows that GC isn't effective enough, is to let GC take care of it. See here: http://stackoverflow.com/a/5491000/44636

Also static variables would be preferred to storing in a cache as caches are typically LRU or memory-pressure managed, and may end up being serialized out of process as your architecture grows. Keeping it in a static variable makes it clear that you don't want it cleaned up under memory pressure and that it can't be serialized for out-of-process use.
Maybe try something like this, if you have to constantly make sure it's up to date. Note the use of IsCurrent() to check to see if you need to
Reopen() it, and that if you're mainly dealing with IndexSearcher and not IndexReader directly, try to stick to using IndexSearcher and keep the IndexReader hidden. The IndexSearcher will have a reference to the reader, and that should be singleton as well for performance reasons.

public static class IndexSearcherManager
    {
        private static readonly Directory _dir;
        private static IndexReader _reader;
        private static IndexSearcher _searcher;
        private static readonly object _lock = new object();

        static IndexSearcherManager()
        {
            _dir = new RAMDirectory(); // replace with your dir
            _reader = IndexReader.Open(_dir, true);
            _searcher = new IndexSearcher(_reader);
        }

        public static Directory Directory
        {
            get { return _dir; }
        }

        public static IndexReader IndexReader
        {
            get
            {
                ReopenIfNecessary();
                return _reader;
            }
        }

        public static IndexSearcher IndexSearcher
        {
            get
            {
                ReopenIfNecessary();
                return _searcher;
            }
        }

        private static void ReopenIfNecessary()
        {
            if (_reader.IsCurrent())
                return;

            lock (_lock)
            {
                if (_reader.IsCurrent())
                    return;

                _reader = _reader.Reopen();
                _searcher = new IndexSearcher(_reader);
            }
        }
    }


On Mon, Mar 24, 2014 at 12:25 PM, Allan, Brad (Bracknell) < Brad.Allan@fiserv.com> wrote:

> Thanks Paul for responding.
> The background timer is not an option for the application.
>
> The 'cache' is only administering a single index reader for each index
> in my system, and the cache does perform a reopen before returning the
> reader, so I am using a technique similar to what you are describing.
>
> It seems to me that I can call IncRef() and DecRef(). For the latter
> I'll have to add something to the cache that allows 'getters' to let
> the cache know they are done.
>
> -----Original Message-----
> From: Paul Irwin [mailto:pirwin@feature23.com]
> Sent: 24 March 2014 16:07
> To: user@lucenenet.apache.org
> Cc: lucene-net-user@lucene.apache.org
> Subject: Re: IndexReader disposal causing unexpected
> 'AlreadyClosedException'
>
> I'd recommend trying to keep the IndexReader instance as a singleton
> static variable. It's safe to use across multiple threads, so there's
> no benefit to disposing and creating new instances. Just call
> .Reopen() on it when the data changes.
>
> What's happening here is the "reader" variable is being returned from
> the first call to the method, then on the second call, you're getting
> the same instance back from cache and then calling dispose on it,
> while the "reader1" variable still has a reference to that instance,
> so when you do anything on "reader1" it's already disposed. So yeah, I
> would recommend keeping an instance open singleton, never calling
> Dispose on it, and calling Reopen when needed.
>
> If you need to naively call Reopen periodically because you don't know
> when the index changes, you could do that on a background timer tick
> if you can accept slightly stale data.
>
> Paul
>
>
> On Mon, Mar 24, 2014 at 11:55 AM, Allan, Brad (Bracknell) <
> Brad.Allan@fiserv.com> wrote:
>
> > I've create some pseudo code to try and describe what I'm seeing.
> > This 'problem' can also be replicated without multiple threads
> > (sessions as I describe them below) if the order of operations is as described below.
> >
> > How am I misunderstanding the Lucene docs (around Reopen() method)?
> > I have looked into the source code and the Dispose does appear to
> > not want to close the reader if there are existing refs. What do I
> > need to do in my code to ensure the following scenario does not occur?
> >
> > A solution I'm mulling over is to keep an additional reference count
> > for the readers I dish out from 'MyCache' and only Dispose the
> > reader when MyCache knows there are no existing refs to the 'old' reader.
> >
> > Wise words appreciated. Thanks.
> >
> > // Session 1
> > var reader1 = GetIndexReaderFromMyCache("MyTestIndex");
> >
> > // Session 2
> > ....some changes are made to 'MyTestIndex'.....
> >
> > // Session 3
> > var reader3 = GetIndexReaderFromMyCache("MyTestIndex");
> >
> > // Session 1
> > ....perform a search with reader1....
> > BOOM! get 'AlreadyClosedException'
> >
> > Where:
> > -----------
> > IndexReader GetIndexReaderFromMyCache(string indexName) {
> >      lock(lockObject) {
> >            IndexReader reader = MyCache.Get(indexName);
> >            IndexReader newReader = reader.Reopen();
> >            if (newReader != reader) {
> >                 reader.Dispose();
> >            }
> >            reader = newReader;
> >            MyCache.Update(indexName, reader);
> >            return reader;
> >      }
> > }
> >
> >
> > ________________________________
> >
> > CheckFree Solutions Limited (trading as Fiserv) Registered Office:
> > Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
> > Registered in England: No. 2694333
> >
>
>
>
> --
>
> Paul Irwin
> Lead Software Engineer
> feature[23]
>
> Email: pirwin@feature23.com
> Cell: 863-698-9294
>
> ________________________________
>
> CheckFree Solutions Limited (trading as Fiserv) Registered Office:
> Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
> Registered in England: No. 2694333
>



--

Paul Irwin
Lead Software Engineer
feature[23]

Email: pirwin@feature23.com
Cell: 863-698-9294

________________________________

CheckFree Solutions Limited (trading as Fiserv)
Registered Office: Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
Registered in England: No. 2694333

Re: IndexReader disposal causing unexpected 'AlreadyClosedException'

Posted by Paul Irwin <pi...@feature23.com>.
Brad,

Yes the technique is similar however you are calling Dispose on an object
that might have references to it elsewhere, hence your exception. So best
practice, unless you have to implement some sort of global disposal
strategy after testing shows that GC isn't effective enough, is to let GC
take care of it. See here: http://stackoverflow.com/a/5491000/44636

Also static variables would be preferred to storing in a cache as caches
are typically LRU or memory-pressure managed, and may end up being
serialized out of process as your architecture grows. Keeping it in a
static variable makes it clear that you don't want it cleaned up under
memory pressure and that it can't be serialized for out-of-process use.
Maybe try something like this, if you have to constantly make sure it's up
to date. Note the use of IsCurrent() to check to see if you need to
Reopen() it, and that if you're mainly dealing with IndexSearcher and not
IndexReader directly, try to stick to using IndexSearcher and keep the
IndexReader hidden. The IndexSearcher will have a reference to the reader,
and that should be singleton as well for performance reasons.

public static class IndexSearcherManager
    {
        private static readonly Directory _dir;
        private static IndexReader _reader;
        private static IndexSearcher _searcher;
        private static readonly object _lock = new object();

        static IndexSearcherManager()
        {
            _dir = new RAMDirectory(); // replace with your dir
            _reader = IndexReader.Open(_dir, true);
            _searcher = new IndexSearcher(_reader);
        }

        public static Directory Directory
        {
            get { return _dir; }
        }

        public static IndexReader IndexReader
        {
            get
            {
                ReopenIfNecessary();
                return _reader;
            }
        }

        public static IndexSearcher IndexSearcher
        {
            get
            {
                ReopenIfNecessary();
                return _searcher;
            }
        }

        private static void ReopenIfNecessary()
        {
            if (_reader.IsCurrent())
                return;

            lock (_lock)
            {
                if (_reader.IsCurrent())
                    return;

                _reader = _reader.Reopen();
                _searcher = new IndexSearcher(_reader);
            }
        }
    }


On Mon, Mar 24, 2014 at 12:25 PM, Allan, Brad (Bracknell) <
Brad.Allan@fiserv.com> wrote:

> Thanks Paul for responding.
> The background timer is not an option for the application.
>
> The 'cache' is only administering a single index reader for each index in
> my system, and the cache does perform a reopen before returning the reader,
> so I am using a technique similar to what you are describing.
>
> It seems to me that I can call IncRef() and DecRef(). For the latter I'll
> have to add something to the cache that allows 'getters' to let the cache
> know they are done.
>
> -----Original Message-----
> From: Paul Irwin [mailto:pirwin@feature23.com]
> Sent: 24 March 2014 16:07
> To: user@lucenenet.apache.org
> Cc: lucene-net-user@lucene.apache.org
> Subject: Re: IndexReader disposal causing unexpected
> 'AlreadyClosedException'
>
> I'd recommend trying to keep the IndexReader instance as a singleton
> static variable. It's safe to use across multiple threads, so there's no
> benefit to disposing and creating new instances. Just call .Reopen() on it
> when the data changes.
>
> What's happening here is the "reader" variable is being returned from the
> first call to the method, then on the second call, you're getting the same
> instance back from cache and then calling dispose on it, while the
> "reader1" variable still has a reference to that instance, so when you do
> anything on "reader1" it's already disposed. So yeah, I would recommend
> keeping an instance open singleton, never calling Dispose on it, and
> calling Reopen when needed.
>
> If you need to naively call Reopen periodically because you don't know
> when the index changes, you could do that on a background timer tick if you
> can accept slightly stale data.
>
> Paul
>
>
> On Mon, Mar 24, 2014 at 11:55 AM, Allan, Brad (Bracknell) <
> Brad.Allan@fiserv.com> wrote:
>
> > I've create some pseudo code to try and describe what I'm seeing. This
> > 'problem' can also be replicated without multiple threads (sessions as
> > I describe them below) if the order of operations is as described below.
> >
> > How am I misunderstanding the Lucene docs (around Reopen() method)?
> > I have looked into the source code and the Dispose does appear to not
> > want to close the reader if there are existing refs. What do I need to
> > do in my code to ensure the following scenario does not occur?
> >
> > A solution I'm mulling over is to keep an additional reference count
> > for the readers I dish out from 'MyCache' and only Dispose the reader
> > when MyCache knows there are no existing refs to the 'old' reader.
> >
> > Wise words appreciated. Thanks.
> >
> > // Session 1
> > var reader1 = GetIndexReaderFromMyCache("MyTestIndex");
> >
> > // Session 2
> > ....some changes are made to 'MyTestIndex'.....
> >
> > // Session 3
> > var reader3 = GetIndexReaderFromMyCache("MyTestIndex");
> >
> > // Session 1
> > ....perform a search with reader1....
> > BOOM! get 'AlreadyClosedException'
> >
> > Where:
> > -----------
> > IndexReader GetIndexReaderFromMyCache(string indexName) {
> >      lock(lockObject) {
> >            IndexReader reader = MyCache.Get(indexName);
> >            IndexReader newReader = reader.Reopen();
> >            if (newReader != reader) {
> >                 reader.Dispose();
> >            }
> >            reader = newReader;
> >            MyCache.Update(indexName, reader);
> >            return reader;
> >      }
> > }
> >
> >
> > ________________________________
> >
> > CheckFree Solutions Limited (trading as Fiserv) Registered Office:
> > Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
> > Registered in England: No. 2694333
> >
>
>
>
> --
>
> Paul Irwin
> Lead Software Engineer
> feature[23]
>
> Email: pirwin@feature23.com
> Cell: 863-698-9294
>
> ________________________________
>
> CheckFree Solutions Limited (trading as Fiserv)
> Registered Office: Eversheds House, 70 Great Bridgewater Street,
> Manchester, M15 ES
> Registered in England: No. 2694333
>



-- 

Paul Irwin
Lead Software Engineer
feature[23]

Email: pirwin@feature23.com
Cell: 863-698-9294

RE: IndexReader disposal causing unexpected 'AlreadyClosedException'

Posted by "Allan, Brad (Bracknell)" <Br...@Fiserv.com>.
Thanks Paul for responding.
The background timer is not an option for the application.

The 'cache' is only administering a single index reader for each index in my system, and the cache does perform a reopen before returning the reader, so I am using a technique similar to what you are describing.

It seems to me that I can call IncRef() and DecRef(). For the latter I'll have to add something to the cache that allows 'getters' to let the cache know they are done.

-----Original Message-----
From: Paul Irwin [mailto:pirwin@feature23.com]
Sent: 24 March 2014 16:07
To: user@lucenenet.apache.org
Cc: lucene-net-user@lucene.apache.org
Subject: Re: IndexReader disposal causing unexpected 'AlreadyClosedException'

I'd recommend trying to keep the IndexReader instance as a singleton static variable. It's safe to use across multiple threads, so there's no benefit to disposing and creating new instances. Just call .Reopen() on it when the data changes.

What's happening here is the "reader" variable is being returned from the first call to the method, then on the second call, you're getting the same instance back from cache and then calling dispose on it, while the "reader1" variable still has a reference to that instance, so when you do anything on "reader1" it's already disposed. So yeah, I would recommend keeping an instance open singleton, never calling Dispose on it, and calling Reopen when needed.

If you need to naively call Reopen periodically because you don't know when the index changes, you could do that on a background timer tick if you can accept slightly stale data.

Paul


On Mon, Mar 24, 2014 at 11:55 AM, Allan, Brad (Bracknell) < Brad.Allan@fiserv.com> wrote:

> I've create some pseudo code to try and describe what I'm seeing. This
> 'problem' can also be replicated without multiple threads (sessions as
> I describe them below) if the order of operations is as described below.
>
> How am I misunderstanding the Lucene docs (around Reopen() method)?
> I have looked into the source code and the Dispose does appear to not
> want to close the reader if there are existing refs. What do I need to
> do in my code to ensure the following scenario does not occur?
>
> A solution I'm mulling over is to keep an additional reference count
> for the readers I dish out from 'MyCache' and only Dispose the reader
> when MyCache knows there are no existing refs to the 'old' reader.
>
> Wise words appreciated. Thanks.
>
> // Session 1
> var reader1 = GetIndexReaderFromMyCache("MyTestIndex");
>
> // Session 2
> ....some changes are made to 'MyTestIndex'.....
>
> // Session 3
> var reader3 = GetIndexReaderFromMyCache("MyTestIndex");
>
> // Session 1
> ....perform a search with reader1....
> BOOM! get 'AlreadyClosedException'
>
> Where:
> -----------
> IndexReader GetIndexReaderFromMyCache(string indexName) {
>      lock(lockObject) {
>            IndexReader reader = MyCache.Get(indexName);
>            IndexReader newReader = reader.Reopen();
>            if (newReader != reader) {
>                 reader.Dispose();
>            }
>            reader = newReader;
>            MyCache.Update(indexName, reader);
>            return reader;
>      }
> }
>
>
> ________________________________
>
> CheckFree Solutions Limited (trading as Fiserv) Registered Office:
> Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
> Registered in England: No. 2694333
>



--

Paul Irwin
Lead Software Engineer
feature[23]

Email: pirwin@feature23.com
Cell: 863-698-9294

________________________________

CheckFree Solutions Limited (trading as Fiserv)
Registered Office: Eversheds House, 70 Great Bridgewater Street, Manchester, M15 ES
Registered in England: No. 2694333

Re: IndexReader disposal causing unexpected 'AlreadyClosedException'

Posted by Paul Irwin <pi...@feature23.com>.
I'd recommend trying to keep the IndexReader instance as a singleton static
variable. It's safe to use across multiple threads, so there's no benefit
to disposing and creating new instances. Just call .Reopen() on it when the
data changes.

What's happening here is the "reader" variable is being returned from the
first call to the method, then on the second call, you're getting the same
instance back from cache and then calling dispose on it, while the
"reader1" variable still has a reference to that instance, so when you do
anything on "reader1" it's already disposed. So yeah, I would recommend
keeping an instance open singleton, never calling Dispose on it, and
calling Reopen when needed.

If you need to naively call Reopen periodically because you don't know when
the index changes, you could do that on a background timer tick if you can
accept slightly stale data.

Paul


On Mon, Mar 24, 2014 at 11:55 AM, Allan, Brad (Bracknell) <
Brad.Allan@fiserv.com> wrote:

> I've create some pseudo code to try and describe what I'm seeing. This
> 'problem' can also be replicated without multiple threads (sessions as I
> describe them below) if the order of operations is as described below.
>
> How am I misunderstanding the Lucene docs (around Reopen() method)?
> I have looked into the source code and the Dispose does appear to not want
> to close the reader if there are existing refs. What do I need to do in my
> code to ensure the following scenario does not occur?
>
> A solution I'm mulling over is to keep an additional reference count for
> the readers I dish out from 'MyCache' and only Dispose the reader when
> MyCache knows there are no existing refs to the 'old' reader.
>
> Wise words appreciated. Thanks.
>
> // Session 1
> var reader1 = GetIndexReaderFromMyCache("MyTestIndex");
>
> // Session 2
> ....some changes are made to 'MyTestIndex'.....
>
> // Session 3
> var reader3 = GetIndexReaderFromMyCache("MyTestIndex");
>
> // Session 1
> ....perform a search with reader1....
> BOOM! get 'AlreadyClosedException'
>
> Where:
> -----------
> IndexReader GetIndexReaderFromMyCache(string indexName) {
>      lock(lockObject) {
>            IndexReader reader = MyCache.Get(indexName);
>            IndexReader newReader = reader.Reopen();
>            if (newReader != reader) {
>                 reader.Dispose();
>            }
>            reader = newReader;
>            MyCache.Update(indexName, reader);
>            return reader;
>      }
> }
>
>
> ________________________________
>
> CheckFree Solutions Limited (trading as Fiserv)
> Registered Office: Eversheds House, 70 Great Bridgewater Street,
> Manchester, M15 ES
> Registered in England: No. 2694333
>



-- 

Paul Irwin
Lead Software Engineer
feature[23]

Email: pirwin@feature23.com
Cell: 863-698-9294