Posted to dev@lucene.apache.org by "Michael McCandless (Created) (JIRA)" <ji...@apache.org> on 2011/11/05 21:54:51 UTC

[jira] [Created] (LUCENE-3562) Stop storing TermsEnum in CloseableThreadLocal inside Terms instance

Stop storing TermsEnum in CloseableThreadLocal inside Terms instance
--------------------------------------------------------------------

                 Key: LUCENE-3562
                 URL: https://issues.apache.org/jira/browse/LUCENE-3562
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Michael McCandless
            Assignee: Michael McCandless
             Fix For: 4.0


We have sugar methods in Terms.java (docFreq, totalTermFreq, docs,
docsAndPositions) that use a saved thread-private TermsEnum to do the
lookups.

But on apps that send many threads through Lucene, and/or have many
segments, this can add up to a lot of RAM (one saved enum per thread,
per Terms instance), especially if the codec's impl holds onto stuff.

Also, Terms has a close method (closes the CloseableThreadLocal) which
must be called, but we fail to do so in some places.

These saved enums are the cause of the recent OOME in TestNRTManager
(TestNRTManager.testNRTManager -seed
2aa27e1aec20c4a2:-4a5a5ecf46837d0e:-7c4f651f1f0b75d7 -mult 3
-nightly).

Really, sharing these enums is a holdover from before Lucene queries
shared state (i.e., saved the TermState from the first pass and used
it later to pull enums, get docFreq, etc.).  It's not helpful anymore,
and it can use gobs of RAM, so I'd like to remove it.
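
To make the RAM concern concrete, here is a minimal sketch that uses only the JDK, not Lucene. FakeTerms and FakeTermsEnum are hypothetical stand-ins for a per-segment Terms instance and a codec's TermsEnum, and a plain ThreadLocal stands in for CloseableThreadLocal. It only counts how many enums get created when every thread touches every segment once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadLocalEnumDemo {
    /** Counts every FakeTermsEnum created, across all threads and segments. */
    static final AtomicInteger CREATED = new AtomicInteger();

    /** Hypothetical stand-in for a codec's TermsEnum that holds onto a buffer. */
    static final class FakeTermsEnum {
        final byte[] buffer = new byte[4096];

        FakeTermsEnum() {
            CREATED.incrementAndGet();
        }

        int docFreq(String term) {
            return term.length(); // dummy value; a real enum would seek the term
        }
    }

    /** Hypothetical stand-in for one segment's Terms with a saved per-thread enum. */
    static final class FakeTerms {
        private final ThreadLocal<FakeTermsEnum> saved =
            ThreadLocal.withInitial(FakeTermsEnum::new);

        /** Sugar method: reuses (or lazily creates) the calling thread's enum. */
        int docFreq(String term) {
            return saved.get().docFreq(term);
        }
    }

    /** Runs `threads` workers over `segments` FakeTerms; returns enums created. */
    static int countEnums(int threads, int segments) {
        CREATED.set(0);
        List<FakeTerms> segmentTerms = new ArrayList<>();
        for (int i = 0; i < segments; i++) {
            segmentTerms.add(new FakeTerms());
        }

        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                for (FakeTerms terms : segmentTerms) {
                    terms.docFreq("lucene"); // first call per thread allocates an enum
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return CREATED.get();
    }

    public static void main(String[] args) {
        System.out.println(countEnums(8, 20)); // prints 160
    }
}
```

With 8 threads and 20 segments, the sketch creates 8 x 20 = 160 enums, each pinning its own buffer for as long as the thread and the Terms instance stay alive. The real per-enum cost depends on the codec.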


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (LUCENE-3562) Stop storing TermsEnum in CloseableThreadLocal inside Terms instance

Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152151#comment-13152151 ] 

Robert Muir commented on LUCENE-3562:
-------------------------------------

+1
                



[jira] [Commented] (LUCENE-3562) Stop storing TermsEnum in CloseableThreadLocal inside Terms instance

Posted by "Simon Willnauer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149531#comment-13149531 ] 

Simon Willnauer commented on LUCENE-3562:
-----------------------------------------

Mike, I think you should commit this; the patch looks good to me.
                



[jira] [Updated] (LUCENE-3562) Stop storing TermsEnum in CloseableThreadLocal inside Terms instance

Posted by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3562:
---------------------------------------

    Attachment: LUCENE-3562.patch

New patch; it also cuts over MultiPhraseQuery to save the TermStates from weight -> scorer, and optimizes BlockTree's TermsEnum to reduce the cost of usages that only init and then call seekExact.

I think it's ready!
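
The "save the TermStates from weight -> scorer" idea can be sketched without Lucene. This is not the patch itself: SavedState, FakeSegmentTerms, firstPass and totalDocFreq are hypothetical names, and SavedState is only a loose analogue of a TermState. The point is that the terms dictionary is consulted once per term in the first pass, and the second pass works purely from the saved state.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TermStateSketch {
    /** Hypothetical per-term data captured at lookup time. */
    static final class SavedState {
        final int docFreq;
        final long postingsOffset;

        SavedState(int docFreq, long postingsOffset) {
            this.docFreq = docFreq;
            this.postingsOffset = postingsOffset;
        }
    }

    /** Hypothetical terms dictionary for one segment; counts every lookup. */
    static final class FakeSegmentTerms {
        private final Map<String, SavedState> dict = new LinkedHashMap<>();
        int seeks = 0;

        void add(String term, int docFreq, long postingsOffset) {
            dict.put(term, new SavedState(docFreq, postingsOffset));
        }

        /** The expensive path: a dictionary lookup; returns null if absent. */
        SavedState seekExact(String term) {
            seeks++;
            return dict.get(term);
        }
    }

    /** First pass ("weight"): look each term up once and keep its state. */
    static Map<String, SavedState> firstPass(FakeSegmentTerms terms, String... queryTerms) {
        Map<String, SavedState> saved = new LinkedHashMap<>();
        for (String term : queryTerms) {
            SavedState state = terms.seekExact(term);
            if (state != null) {
                saved.put(term, state);
            }
        }
        return saved;
    }

    /** Second pass ("scorer"): uses only the saved state, no further lookups. */
    static long totalDocFreq(Map<String, SavedState> saved) {
        long total = 0;
        for (SavedState state : saved.values()) {
            total += state.docFreq;
        }
        return total;
    }

    public static void main(String[] args) {
        FakeSegmentTerms terms = new FakeSegmentTerms();
        terms.add("apache", 3, 0L);
        terms.add("lucene", 5, 128L);

        Map<String, SavedState> saved = firstPass(terms, "apache", "lucene");
        System.out.println(totalDocFreq(saved)); // prints 8
        System.out.println(terms.seeks);         // prints 2
    }
}
```

Because the state travels with the query instead of living in a thread-local on the Terms instance, nothing needs to be cached per thread or closed afterwards.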
                



[jira] [Commented] (LUCENE-3562) Stop storing TermsEnum in CloseableThreadLocal inside Terms instance

Posted by "Uwe Schindler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144852#comment-13144852 ] 

Uwe Schindler commented on LUCENE-3562:
---------------------------------------

+1
                



[jira] [Resolved] (LUCENE-3562) Stop storing TermsEnum in CloseableThreadLocal inside Terms instance

Posted by "Michael McCandless (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-3562.
----------------------------------------

    Resolution: Fixed
    



[jira] [Updated] (LUCENE-3562) Stop storing TermsEnum in CloseableThreadLocal inside Terms instance

Posted by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3562:
---------------------------------------

    Attachment: LUCENE-3562.patch

Patch.
                
