You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucenenet.apache.org by "Joe Shaw (JIRA)" <ji...@apache.org> on 2007/06/05 21:06:26 UTC

[jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

    [ https://issues.apache.org/jira/browse/LUCENENET-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501661 ] 

Joe Shaw commented on LUCENENET-40:
-----------------------------------

This is an issue that seems to come up a lot.  I'm not 100% on this, but I believe this is the intended behavior.

Loading the TermInfosReader and SegmentReaders off of the disk is an expensive operation, and the thread local storage is an optimization to prevent it from happening repeatedly.  If you have a single search thread, this data is kept in memory and reused on demand; there is no reason to constantly release and then reload that data whenever another search is started.  If you have short-lived multiple threads, this data is fairly quickly cleaned up by the GC when the thread dies.  The only situation in which this should be a real problem is when you have long-lived threads which might rarely do searches but spend most of its time doing other things.

The finalizer stuff was a hold over from the Java version, because some versions of the JVM wouldn't release the thread local data when the thread died.  When that code was migrated into Lucene.Net, it actually *caused* a mem leak on MS.NET 1.1, if I recall correctly, because of a runtime bug.

Maybe the right thing to do is to make this optional behavior, because if you have short-lived searching threads, you probably don't want this data to be cached, and want to release it as soon as possible.

> Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
> -----------------------------------------------------------
>
>                 Key: LUCENENET-40
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-40
>             Project: Lucene.Net
>          Issue Type: Bug
>         Environment: Windows XP, .NET 1.1
>            Reporter: zarquon78
>
> In both TermInfosReader and SegmentReader, the call to System.Threading.Thread.SetData() to release the data is no longer performed.  This leads to the data remaining attached to the thread.  This call is in the now commented-out finalized.  By adding the call to the Close() or DoClose() method, the leak is removed.
> I.e.,
> // TermInfosReader.cs
> public /*internal*/ void Close()
> {
>    // ...
>    System.Threading.Thread.SetData(enumerators, null);  // Added
> }
> // SegmentReader.cs
> protected internal override void DoClose()
> {
>    // ...
>   System.Threading.Thread.SetData(termVectorsLocal, null);  // Added
> }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


RE: [jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by Erich Eichinger <E....@diamonddogs.cc>.
Hi,

> Do you have any idea what changed between 1.9 and 2.0 so that 
> these leaks were fixed? 

Atm I can only roughly tell from my mind: They fixed a couply of missing Close() calls and added some code to remove the instances from the cache.

I can't tell exactly since I only ran some tests testing for memory leaks that failed miserably with 1.4.3 but passed without problems with 2.0. Thus I didn't look into 2.0 sources too much then.

> Ok, then what is the intended behavior of the TLS?

Of course the intended behaviour is per thread caching. But some cache items (I remember SegmentReader or something like that) are specific to a certain Reader/Writer instance and become "orphan" if the corresponding Reader/Writer is closed. Those items remain in the cache until the process shuts down.
I remember that in the original Java code a WeakHashMap was used that didn't prevent those cacheitems from being GCed. But the .NET port used a normal Hashtable instead in version 1.4.3 which caused a couple of those memory leaks.

-Erich

> -----Original Message-----
> From: Joe Shaw [mailto:joe@joeshaw.org] 
> Sent: Tuesday, June 05, 2007 11:21 PM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: [jira] Commented: (LUCENENET-40) Memory leaks in 
> 1.9.1-004-27Nov06 from Thread-Local Storage
> 
> Hi Erich,
> 
> On 6/5/07, Erich Eichinger <E....@diamonddogs.cc> wrote:
> > I posted a patch for 1.4.3 on 7.March'07 to this 
> mailinglist. Some of the leaks fixed by this patch also 
> affect 1.9 asfaik. Lucene 2.0 doesn't have these memory-leaks anymore.
> 
> Do you have any idea what changed between 1.9 and 2.0 so that 
> these leaks were fixed?
> 
> > > > This is an issue that seems to come up a lot.  I'm not 100%
> > > on this, but I believe this is the intended behavior.
> >
> > i know from writing my patches that this is not the 
> intended behaviour. Data stored in the various caches is not 
> reused after closing the reader/writer. Much worse since 
> Reader/Writer instances are used as keys in the Hashtable, 
> any filehandle they hold will never be closed.
> 
> Ok, then what is the intended behavior of the TLS?
> 
> Joe
> 

Re: [jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by Joe Shaw <jo...@joeshaw.org>.
Hi Erich,

On 6/5/07, Erich Eichinger <E....@diamonddogs.cc> wrote:
> I posted a patch for 1.4.3 on 7.March'07 to this mailinglist. Some of the leaks fixed by this patch also affect 1.9 asfaik. Lucene 2.0 doesn't have these memory-leaks anymore.

Do you have any idea what changed between 1.9 and 2.0 so that these
leaks were fixed?

> > > This is an issue that seems to come up a lot.  I'm not 100%
> > on this, but I believe this is the intended behavior.
>
> i know from writing my patches that this is not the intended behaviour. Data stored in the various caches is not reused after closing the reader/writer. Much worse since Reader/Writer instances are used as keys in the Hashtable, any filehandle they hold will never be closed.

Ok, then what is the intended behavior of the TLS?

Joe

RE: [jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by Erich Eichinger <E....@diamonddogs.cc>.
Hi,

I posted a patch for 1.4.3 on 7.March'07 to this mailinglist. Some of the leaks fixed by this patch also affect 1.9 asfaik. Lucene 2.0 doesn't have these memory-leaks anymore.

> Any thought to changing it from ThreadLocalStorage to using 
> the ThreadStatic attribute on the field?  It's supposed to be 

[ThreadStatic] is definitely more efficient.

> > This is an issue that seems to come up a lot.  I'm not 100% 
> on this, but I believe this is the intended behavior.

i know from writing my patches that this is not the intended behaviour. Data stored in the various caches is not reused after closing the reader/writer. Much worse since Reader/Writer instances are used as keys in the Hashtable, any filehandle they hold will never be closed.

cheers,
Erich

> -----Original Message-----
> From: Michael Garski [mailto:mgarski@mac.com] 
> Sent: Tuesday, June 05, 2007 9:24 PM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: [jira] Commented: (LUCENENET-40) Memory leaks in 
> 1.9.1-004-27Nov06 from Thread-Local Storage
> 
> Any thought to changing it from ThreadLocalStorage to using 
> the ThreadStatic attribute on the field?  It's supposed to be 
> more efficient, but I've not confirmed that.  I just find it 
> easier to use.
> 
> I've done that here on my end with Lucene.NET.
> 
> Michael
> 
> Joe Shaw (JIRA) wrote:
> >     [ 
> > 
> https://issues.apache.org/jira/browse/LUCENENET-40?page=com.atlassian.
> > jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501661 ]
> >
> > Joe Shaw commented on LUCENENET-40:
> > -----------------------------------
> >
> > This is an issue that seems to come up a lot.  I'm not 100% 
> on this, but I believe this is the intended behavior.
> >
> > Loading the TermInfosReader and SegmentReaders off of the 
> disk is an expensive operation, and the thread local storage 
> is an optimization to prevent it from happening repeatedly.  
> If you have a single search thread, this data is kept in 
> memory and reused on demand; there is no reason to constantly 
> release and then reload that data whenever another search is 
> started.  If you have short-lived multiple threads, this data 
> is fairly quickly cleaned up by the GC when the thread dies.  
> The only situation in which this should be a real problem is 
> when you have long-lived threads which might rarely do 
> searches but spend most of its time doing other things.
> >
> > The finalizer stuff was a hold over from the Java version, 
> because some versions of the JVM wouldn't release the thread 
> local data when the thread died.  When that code was migrated 
> into Lucene.Net, it actually *caused* a mem leak on MS.NET 
> 1.1, if I recall correctly, because of a runtime bug.
> >
> > Maybe the right thing to do is to make this optional 
> behavior, because if you have short-lived searching threads, 
> you probably don't want this data to be cached, and want to 
> release it as soon as possible.
> >
> >   
> >> Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
> >> -----------------------------------------------------------
> >>
> >>                 Key: LUCENENET-40
> >>                 URL: 
> https://issues.apache.org/jira/browse/LUCENENET-40
> >>             Project: Lucene.Net
> >>          Issue Type: Bug
> >>         Environment: Windows XP, .NET 1.1
> >>            Reporter: zarquon78
> >>
> >> In both TermInfosReader and SegmentReader, the call to 
> System.Threading.Thread.SetData() to release the data is no 
> longer performed.  This leads to the data remaining attached 
> to the thread.  This call is in the now commented-out 
> finalized.  By adding the call to the Close() or DoClose() 
> method, the leak is removed.
> >> I.e.,
> >> // TermInfosReader.cs
> >> public /*internal*/ void Close()
> >> {
> >>    // ...
> >>    System.Threading.Thread.SetData(enumerators, null);  // 
> Added } // 
> >> SegmentReader.cs protected internal override void DoClose() {
> >>    // ...
> >>   System.Threading.Thread.SetData(termVectorsLocal, null); 
>  // Added 
> >> }
> >>     
> >
> >   
> 
> 

Re: [jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by Michael Garski <mg...@mac.com>.
Any thought to changing it from ThreadLocalStorage to using the 
ThreadStatic attribute on the field?  It's supposed to be more 
efficient, but I've not confirmed that.  I just find it easier to use.

I've done that here on my end with Lucene.NET.

Michael

Joe Shaw (JIRA) wrote:
>     [ https://issues.apache.org/jira/browse/LUCENENET-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501661 ] 
>
> Joe Shaw commented on LUCENENET-40:
> -----------------------------------
>
> This is an issue that seems to come up a lot.  I'm not 100% on this, but I believe this is the intended behavior.
>
> Loading the TermInfosReader and SegmentReaders off of the disk is an expensive operation, and the thread local storage is an optimization to prevent it from happening repeatedly.  If you have a single search thread, this data is kept in memory and reused on demand; there is no reason to constantly release and then reload that data whenever another search is started.  If you have short-lived multiple threads, this data is fairly quickly cleaned up by the GC when the thread dies.  The only situation in which this should be a real problem is when you have long-lived threads which might rarely do searches but spend most of its time doing other things.
>
> The finalizer stuff was a hold over from the Java version, because some versions of the JVM wouldn't release the thread local data when the thread died.  When that code was migrated into Lucene.Net, it actually *caused* a mem leak on MS.NET 1.1, if I recall correctly, because of a runtime bug.
>
> Maybe the right thing to do is to make this optional behavior, because if you have short-lived searching threads, you probably don't want this data to be cached, and want to release it as soon as possible.
>
>   
>> Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
>> -----------------------------------------------------------
>>
>>                 Key: LUCENENET-40
>>                 URL: https://issues.apache.org/jira/browse/LUCENENET-40
>>             Project: Lucene.Net
>>          Issue Type: Bug
>>         Environment: Windows XP, .NET 1.1
>>            Reporter: zarquon78
>>
>> In both TermInfosReader and SegmentReader, the call to System.Threading.Thread.SetData() to release the data is no longer performed.  This leads to the data remaining attached to the thread.  This call is in the now commented-out finalized.  By adding the call to the Close() or DoClose() method, the leak is removed.
>> I.e.,
>> // TermInfosReader.cs
>> public /*internal*/ void Close()
>> {
>>    // ...
>>    System.Threading.Thread.SetData(enumerators, null);  // Added
>> }
>> // SegmentReader.cs
>> protected internal override void DoClose()
>> {
>>    // ...
>>   System.Threading.Thread.SetData(termVectorsLocal, null);  // Added
>> }
>>     
>
>