You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucenenet.apache.org by "zarquon78 (JIRA)" <ji...@apache.org> on 2007/06/04 20:23:25 UTC

[jira] Created: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
-----------------------------------------------------------

                 Key: LUCENENET-40
                 URL: https://issues.apache.org/jira/browse/LUCENENET-40
             Project: Lucene.Net
          Issue Type: Bug
         Environment: Windows XP, .NET 1.1
            Reporter: zarquon78


In both TermInfosReader and SegmentReader, the call to System.Threading.Thread.SetData() to release the data is no longer performed.  This leads to the data remaining attached to the thread.  This call is in the now commented-out finalized.  By adding the call to the Close() or DoClose() method, the leak is removed.

I.e.,
// TermInfosReader.cs
public /*internal*/ void Close()
{
   // ...
   System.Threading.Thread.SetData(enumerators, null);  // Added
}

// SegmentReader.cs
protected internal override void DoClose()
{
   // ...
  System.Threading.Thread.SetData(termVectorsLocal, null);  // Added
}


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


RE: [jira] Created: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by Digy <di...@gmail.com>.
Have you tried the version 2.0.0.3 or 2.0.0.4? 
I have been using it(2.0.0.3) in a server-application for months without an application restart and memory leakage.

DIGY

-----Original Message-----
From: zarquon78 (JIRA) [mailto:jira@apache.org] 
Sent: Monday, June 04, 2007 9:23 PM
To: lucene-net-dev@incubator.apache.org
Subject: [jira] Created: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
-----------------------------------------------------------

                 Key: LUCENENET-40
                 URL: https://issues.apache.org/jira/browse/LUCENENET-40
             Project: Lucene.Net
          Issue Type: Bug
         Environment: Windows XP, .NET 1.1
            Reporter: zarquon78


In both TermInfosReader and SegmentReader, the call to System.Threading.Thread.SetData() to release the data is no longer performed.  This leads to the data remaining attached to the thread.  This call is in the now commented-out finalized.  By adding the call to the Close() or DoClose() method, the leak is removed.

I.e.,
// TermInfosReader.cs
public /*internal*/ void Close()
{
   // ...
   System.Threading.Thread.SetData(enumerators, null);  // Added
}

// SegmentReader.cs
protected internal override void DoClose()
{
   // ...
  System.Threading.Thread.SetData(termVectorsLocal, null);  // Added
}


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


RE: [jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by Erich Eichinger <E....@diamonddogs.cc>.
Hi,

> Do you have any idea what changed between 1.9 and 2.0 so that 
> these leaks were fixed? 

Atm I can only roughly tell from my mind: They fixed a couply of missing Close() calls and added some code to remove the instances from the cache.

I can't tell exactly since I only ran some tests testing for memory leaks that failed miserably with 1.4.3 but passed without problems with 2.0. Thus I didn't look into 2.0 sources too much then.

> Ok, then what is the intended behavior of the TLS?

Of course the intended behaviour is per thread caching. But some cache items (I remember SegmentReader or something like that) are specific to a certain Reader/Writer instance and become "orphan" if the corresponding Reader/Writer is closed. Those items remain in the cache until the process shuts down.
I remember that in the original Java code a WeakHashMap was used that didn't prevent those cacheitems from being GCed. But the .NET port used a normal Hashtable instead in version 1.4.3 which caused a couple of those memory leaks.

-Erich

> -----Original Message-----
> From: Joe Shaw [mailto:joe@joeshaw.org] 
> Sent: Tuesday, June 05, 2007 11:21 PM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: [jira] Commented: (LUCENENET-40) Memory leaks in 
> 1.9.1-004-27Nov06 from Thread-Local Storage
> 
> Hi Erich,
> 
> On 6/5/07, Erich Eichinger <E....@diamonddogs.cc> wrote:
> > I posted a patch for 1.4.3 on 7.March'07 to this 
> mailinglist. Some of the leaks fixed by this patch also 
> affect 1.9 asfaik. Lucene 2.0 doesn't have these memory-leaks anymore.
> 
> Do you have any idea what changed between 1.9 and 2.0 so that 
> these leaks were fixed?
> 
> > > > This is an issue that seems to come up a lot.  I'm not 100%
> > > on this, but I believe this is the intended behavior.
> >
> > i know from writing my patches that this is not the 
> intended behaviour. Data stored in the various caches is not 
> reused after closing the reader/writer. Much worse since 
> Reader/Writer instances are used as keys in the Hashtable, 
> any filehandle they hold will never be closed.
> 
> Ok, then what is the intended behavior of the TLS?
> 
> Joe
> 

Re: [jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by Joe Shaw <jo...@joeshaw.org>.
Hi Erich,

On 6/5/07, Erich Eichinger <E....@diamonddogs.cc> wrote:
> I posted a patch for 1.4.3 on 7.March'07 to this mailinglist. Some of the leaks fixed by this patch also affect 1.9 asfaik. Lucene 2.0 doesn't have these memory-leaks anymore.

Do you have any idea what changed between 1.9 and 2.0 so that these
leaks were fixed?

> > > This is an issue that seems to come up a lot.  I'm not 100%
> > on this, but I believe this is the intended behavior.
>
> i know from writing my patches that this is not the intended behaviour. Data stored in the various caches is not reused after closing the reader/writer. Much worse since Reader/Writer instances are used as keys in the Hashtable, any filehandle they hold will never be closed.

Ok, then what is the intended behavior of the TLS?

Joe

RE: [jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by Erich Eichinger <E....@diamonddogs.cc>.
Hi,

I posted a patch for 1.4.3 on 7.March'07 to this mailinglist. Some of the leaks fixed by this patch also affect 1.9 asfaik. Lucene 2.0 doesn't have these memory-leaks anymore.

> Any thought to changing it from ThreadLocalStorage to using 
> the ThreadStatic attribute on the field?  It's supposed to be 

[ThreadStatic] is definitely more efficient.

> > This is an issue that seems to come up a lot.  I'm not 100% 
> on this, but I believe this is the intended behavior.

i know from writing my patches that this is not the intended behaviour. Data stored in the various caches is not reused after closing the reader/writer. Much worse since Reader/Writer instances are used as keys in the Hashtable, any filehandle they hold will never be closed.

cheers,
Erich

> -----Original Message-----
> From: Michael Garski [mailto:mgarski@mac.com] 
> Sent: Tuesday, June 05, 2007 9:24 PM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: [jira] Commented: (LUCENENET-40) Memory leaks in 
> 1.9.1-004-27Nov06 from Thread-Local Storage
> 
> Any thought to changing it from ThreadLocalStorage to using 
> the ThreadStatic attribute on the field?  It's supposed to be 
> more efficient, but I've not confirmed that.  I just find it 
> easier to use.
> 
> I've done that here on my end with Lucene.NET.
> 
> Michael
> 
> Joe Shaw (JIRA) wrote:
> >     [ 
> > 
> https://issues.apache.org/jira/browse/LUCENENET-40?page=com.atlassian.
> > jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501661 ]
> >
> > Joe Shaw commented on LUCENENET-40:
> > -----------------------------------
> >
> > This is an issue that seems to come up a lot.  I'm not 100% 
> on this, but I believe this is the intended behavior.
> >
> > Loading the TermInfosReader and SegmentReaders off of the 
> disk is an expensive operation, and the thread local storage 
> is an optimization to prevent it from happening repeatedly.  
> If you have a single search thread, this data is kept in 
> memory and reused on demand; there is no reason to constantly 
> release and then reload that data whenever another search is 
> started.  If you have short-lived multiple threads, this data 
> is fairly quickly cleaned up by the GC when the thread dies.  
> The only situation in which this should be a real problem is 
> when you have long-lived threads which might rarely do 
> searches but spend most of its time doing other things.
> >
> > The finalizer stuff was a hold over from the Java version, 
> because some versions of the JVM wouldn't release the thread 
> local data when the thread died.  When that code was migrated 
> into Lucene.Net, it actually *caused* a mem leak on MS.NET 
> 1.1, if I recall correctly, because of a runtime bug.
> >
> > Maybe the right thing to do is to make this optional 
> behavior, because if you have short-lived searching threads, 
> you probably don't want this data to be cached, and want to 
> release it as soon as possible.
> >
> >   
> >> Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
> >> -----------------------------------------------------------
> >>
> >>                 Key: LUCENENET-40
> >>                 URL: 
> https://issues.apache.org/jira/browse/LUCENENET-40
> >>             Project: Lucene.Net
> >>          Issue Type: Bug
> >>         Environment: Windows XP, .NET 1.1
> >>            Reporter: zarquon78
> >>
> >> In both TermInfosReader and SegmentReader, the call to 
> System.Threading.Thread.SetData() to release the data is no 
> longer performed.  This leads to the data remaining attached 
> to the thread.  This call is in the now commented-out 
> finalized.  By adding the call to the Close() or DoClose() 
> method, the leak is removed.
> >> I.e.,
> >> // TermInfosReader.cs
> >> public /*internal*/ void Close()
> >> {
> >>    // ...
> >>    System.Threading.Thread.SetData(enumerators, null);  // 
> Added } // 
> >> SegmentReader.cs protected internal override void DoClose() {
> >>    // ...
> >>   System.Threading.Thread.SetData(termVectorsLocal, null); 
>  // Added 
> >> }
> >>     
> >
> >   
> 
> 

Re: [jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by Michael Garski <mg...@mac.com>.
Any thought to changing it from ThreadLocalStorage to using the 
ThreadStatic attribute on the field?  It's supposed to be more 
efficient, but I've not confirmed that.  I just find it easier to use.

I've done that here on my end with Lucene.NET.

Michael

Joe Shaw (JIRA) wrote:
>     [ https://issues.apache.org/jira/browse/LUCENENET-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501661 ] 
>
> Joe Shaw commented on LUCENENET-40:
> -----------------------------------
>
> This is an issue that seems to come up a lot.  I'm not 100% on this, but I believe this is the intended behavior.
>
> Loading the TermInfosReader and SegmentReaders off of the disk is an expensive operation, and the thread local storage is an optimization to prevent it from happening repeatedly.  If you have a single search thread, this data is kept in memory and reused on demand; there is no reason to constantly release and then reload that data whenever another search is started.  If you have short-lived multiple threads, this data is fairly quickly cleaned up by the GC when the thread dies.  The only situation in which this should be a real problem is when you have long-lived threads which might rarely do searches but spend most of its time doing other things.
>
> The finalizer stuff was a hold over from the Java version, because some versions of the JVM wouldn't release the thread local data when the thread died.  When that code was migrated into Lucene.Net, it actually *caused* a mem leak on MS.NET 1.1, if I recall correctly, because of a runtime bug.
>
> Maybe the right thing to do is to make this optional behavior, because if you have short-lived searching threads, you probably don't want this data to be cached, and want to release it as soon as possible.
>
>   
>> Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
>> -----------------------------------------------------------
>>
>>                 Key: LUCENENET-40
>>                 URL: https://issues.apache.org/jira/browse/LUCENENET-40
>>             Project: Lucene.Net
>>          Issue Type: Bug
>>         Environment: Windows XP, .NET 1.1
>>            Reporter: zarquon78
>>
>> In both TermInfosReader and SegmentReader, the call to System.Threading.Thread.SetData() to release the data is no longer performed.  This leads to the data remaining attached to the thread.  This call is in the now commented-out finalized.  By adding the call to the Close() or DoClose() method, the leak is removed.
>> I.e.,
>> // TermInfosReader.cs
>> public /*internal*/ void Close()
>> {
>>    // ...
>>    System.Threading.Thread.SetData(enumerators, null);  // Added
>> }
>> // SegmentReader.cs
>> protected internal override void DoClose()
>> {
>>    // ...
>>   System.Threading.Thread.SetData(termVectorsLocal, null);  // Added
>> }
>>     
>
>   

[jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by "Joe Shaw (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENENET-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501661 ] 

Joe Shaw commented on LUCENENET-40:
-----------------------------------

This is an issue that seems to come up a lot.  I'm not 100% on this, but I believe this is the intended behavior.

Loading the TermInfosReader and SegmentReaders off of the disk is an expensive operation, and the thread local storage is an optimization to prevent it from happening repeatedly.  If you have a single search thread, this data is kept in memory and reused on demand; there is no reason to constantly release and then reload that data whenever another search is started.  If you have short-lived multiple threads, this data is fairly quickly cleaned up by the GC when the thread dies.  The only situation in which this should be a real problem is when you have long-lived threads which might rarely do searches but spend most of its time doing other things.

The finalizer stuff was a hold over from the Java version, because some versions of the JVM wouldn't release the thread local data when the thread died.  When that code was migrated into Lucene.Net, it actually *caused* a mem leak on MS.NET 1.1, if I recall correctly, because of a runtime bug.

Maybe the right thing to do is to make this optional behavior, because if you have short-lived searching threads, you probably don't want this data to be cached, and want to release it as soon as possible.

> Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
> -----------------------------------------------------------
>
>                 Key: LUCENENET-40
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-40
>             Project: Lucene.Net
>          Issue Type: Bug
>         Environment: Windows XP, .NET 1.1
>            Reporter: zarquon78
>
> In both TermInfosReader and SegmentReader, the call to System.Threading.Thread.SetData() to release the data is no longer performed.  This leads to the data remaining attached to the thread.  This call is in the now commented-out finalized.  By adding the call to the Close() or DoClose() method, the leak is removed.
> I.e.,
> // TermInfosReader.cs
> public /*internal*/ void Close()
> {
>    // ...
>    System.Threading.Thread.SetData(enumerators, null);  // Added
> }
> // SegmentReader.cs
> protected internal override void DoClose()
> {
>    // ...
>   System.Threading.Thread.SetData(termVectorsLocal, null);  // Added
> }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Closed: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by "George Aroush (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENENET-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

George Aroush closed LUCENENET-40.
----------------------------------

    Resolution: Won't Fix
      Assignee: George Aroush

There are no plans to fix issues with older releases of Lucene.Net.  Please confirm that you are seeing the same problem with Lucene.Net 2.0 if so, provide an example of how you are indexing to reproduce the leak so we can debug.

Thanks,

-- George

> Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
> -----------------------------------------------------------
>
>                 Key: LUCENENET-40
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-40
>             Project: Lucene.Net
>          Issue Type: Bug
>         Environment: Windows XP, .NET 1.1
>            Reporter: zarquon78
>            Assignee: George Aroush
>
> In both TermInfosReader and SegmentReader, the call to System.Threading.Thread.SetData() to release the data is no longer performed.  This leads to the data remaining attached to the thread.  This call is in the now commented-out finalized.  By adding the call to the Close() or DoClose() method, the leak is removed.
> I.e.,
> // TermInfosReader.cs
> public /*internal*/ void Close()
> {
>    // ...
>    System.Threading.Thread.SetData(enumerators, null);  // Added
> }
> // SegmentReader.cs
> protected internal override void DoClose()
> {
>    // ...
>   System.Threading.Thread.SetData(termVectorsLocal, null);  // Added
> }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (LUCENENET-40) Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage

Posted by "zarquon78 (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENENET-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501685 ] 

zarquon78 commented on LUCENENET-40:
------------------------------------

Sorry, but I should have mentioned that I was creating an index and not searching...  

The problem was that the amount of memory being used (even after forcing a GC) was continually increasing.  Since the same thread was doing the indexing, this memory was never being cleaned up (or so it appears).

While I agree the GC will get the memory when the thread ends, it would seem to be a bad idea to NOT clean up the memory.  Unless I'm completely missing something, allocating a new SegmentReader (for example) wouldn't be guarenteed to return the same LocalDataStoreSlot, right?  If the program continually opens and closes the index in the same thread, won't we get a leak?

If the idea was to store information across instances of SegmentReader and TermInfosReader so they could be 'closed', wouldn't a static cache object using TLS with named slots be better?  That way, you're guaranteed to get the data back.

I must be missing something.  It appears that once you release all references to them, you can never can the data back...  If I'm wrong, just say so.  This isn't the forum for "educating" me on it.




> Memory leaks in 1.9.1-004-27Nov06 from Thread-Local Storage
> -----------------------------------------------------------
>
>                 Key: LUCENENET-40
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-40
>             Project: Lucene.Net
>          Issue Type: Bug
>         Environment: Windows XP, .NET 1.1
>            Reporter: zarquon78
>
> In both TermInfosReader and SegmentReader, the call to System.Threading.Thread.SetData() to release the data is no longer performed.  This leads to the data remaining attached to the thread.  This call is in the now commented-out finalized.  By adding the call to the Close() or DoClose() method, the leak is removed.
> I.e.,
> // TermInfosReader.cs
> public /*internal*/ void Close()
> {
>    // ...
>    System.Threading.Thread.SetData(enumerators, null);  // Added
> }
> // SegmentReader.cs
> protected internal override void DoClose()
> {
>    // ...
>   System.Threading.Thread.SetData(termVectorsLocal, null);  // Added
> }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.