Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2008/09/11 23:05:44 UTC

[jira] Updated: (LUCENE-1383) Workaround ThreadLocal's "leak"

     [ https://issues.apache.org/jira/browse/LUCENE-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1383:
---------------------------------------

    Attachment: LUCENE-1383.patch

Attached patch.  All tests pass.

The patch adds o.a.l.util.CloseableThreadLocal.  It's a wrapper around ThreadLocal that wraps the values inside a WeakReference, but then also holds a strong reference to the value (to ensure GC doesn't reclaim it) until you call the close method.  On calling close, GC is then free to reclaim all values you had stored, regardless of how long it takes ThreadLocal's implementation to actually release its references.
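
To make the idea concrete, here is a minimal sketch of the approach described above. It is not the code in LUCENE-1383.patch: the class name CloseableThreadLocalSketch, the use of generics, and the WeakHashMap keyed by Thread are all assumptions made for illustration.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch only; names and details are assumptions, not the patch.
class CloseableThreadLocalSketch<T> {
    // ThreadLocal only ever holds a WeakReference, so by itself it cannot
    // keep the stored value alive.
    private volatile ThreadLocal<WeakReference<T>> local = new ThreadLocal<>();

    // Strong references that keep values reachable until close() is called.
    // Keyed weakly by thread so entries for dead threads can drop out.
    private volatile Map<Thread, T> hardRefs = new WeakHashMap<>();

    public T get() {
        ThreadLocal<WeakReference<T>> l = local;
        if (l == null) {
            return null; // already closed
        }
        WeakReference<T> ref = l.get();
        return ref == null ? null : ref.get();
    }

    public void set(T value) {
        ThreadLocal<WeakReference<T>> l = local;
        Map<Thread, T> refs = hardRefs;
        if (l == null || refs == null) {
            throw new IllegalStateException("already closed");
        }
        l.set(new WeakReference<>(value));
        synchronized (refs) {
            refs.put(Thread.currentThread(), value);
        }
    }

    // Drop the strong references. Whatever ThreadLocal still holds internally
    // is only a WeakReference, so GC is free to reclaim the values.
    public void close() {
        hardRefs = null;
        local = null;
    }
}
```

In this sketch, get() returns null after close() rather than throwing; that is a design choice of the sketch and not necessarily what the patch does.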

There are a couple places in Lucene where I left the current usage of ThreadLocal.

First, Analyzer.java uses ThreadLocal to hold reusable token streams.  There is no "close" called for Analyzer, so unless we are willing to add a finalizer to call CloseableThreadLocal.close() I think we can leave it.

Second, some of the contrib/benchmark tasks use ThreadLocal to store a per-thread DateFormat, which should use very little memory.

> Workaround ThreadLocal's "leak"
> -------------------------------
>
>                 Key: LUCENE-1383
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1383
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 1.9, 2.0.0, 2.1, 2.2, 2.3, 2.3.1, 2.3.2
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 2.4
>
>         Attachments: LUCENE-1383.patch
>
>
> Java's ThreadLocal is dangerous to use because it can take a
> surprisingly long time to release references to the values you
> store in it.  Even when a ThreadLocal instance itself is GC'd, hard
> references to the values you had stored in it can easily be kept for
> quite some time afterward.
> While this is not technically a "memory leak", because eventually
> (when the underlying Map that stores the values cleans up its "stale"
> references) the hard reference will be cleared, and GC can proceed,
> its end behavior is no different from a memory leak: in the
> right situation you can easily tie up far more memory than you'd
> expect, and then hit an unexpected OOM error despite allocating an
> extremely large heap to your JVM.
> Lucene users have hit this many times.  Here's the most recent thread:
>   http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200809.mbox/%3C6e3ae6310809091157j7a9fe46bxcc31f6e63305fcdc%40mail.gmail.com%3E
> And here's another:
>   http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200807.mbox/%3CF5FC94B2-E5C7-40C0-8B73-E12245B91CEE%40mikemccandless.com%3E
> And then there's LUCENE-436 and LUCENE-529 at least.
> A Google search for "ThreadLocal leak" yields many compelling hits.
> Sun does this for performance reasons, but I think it's a terrible
> trap and we should work around it in Lucene.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Updated: (LUCENE-1383) Workaround ThreadLocal's "leak"

Posted by Michael McCandless <lu...@mikemccandless.com>.
Chris, if possible, could you try out this patch to see if it fixes  
the leak you're seeing?  Thanks!

Mike
