Posted to dev@lucene.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2018/01/30 13:25:00 UTC

[jira] [Reopened] (SOLR-11882) SolrMetric registries retain references to SolrCores when closed

     [ https://issues.apache.org/jira/browse/SOLR-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  reopened SOLR-11882:
--------------------------------------

This fix exposed another existing issue, which is not fatal but looks ugly:
{code}
ERROR - 2018-01-30 12:51:11.480; [c:gettingstarted s:shard1 r:core_node5 x:gettingstarted_shard1_replica_n2] org.apache.solr.common.SolrException; org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed
        at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:268)
        at org.apache.lucene.index.StandardDirectoryReader.getVersion(StandardDirectoryReader.java:338)
        at org.apache.lucene.index.FilterDirectoryReader.getVersion(FilterDirectoryReader.java:119)
        at org.apache.lucene.index.FilterDirectoryReader.getVersion(FilterDirectoryReader.java:119)
        at org.apache.solr.search.SolrIndexSearcher.lambda$initializeMetrics$14(SolrIndexSearcher.java:2262)
        at org.apache.solr.metrics.SolrCoreMetricManager.lambda$close$93(SolrCoreMetricManager.java:156)
        at java.util.TreeMap.forEach(TreeMap.java:1001)
        at java.util.Collections$UnmodifiableMap.forEach(Collections.java:1505)
        at org.apache.solr.metrics.SolrCoreMetricManager.close(SolrCoreMetricManager.java:155)
        at org.apache.solr.core.SolrCore.close(SolrCore.java:1513)
        at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:898)
        at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1292)
        at org.apache.solr.core.SolrCore.lambda$getConfListener$38(SolrCore.java:2969)
        at org.apache.solr.cloud.ZkController.lambda$fireEventListeners$190(ZkController.java:2610)
        at java.lang.Thread.run(Thread.java:745)
{code}

This is caused by several overly simplistic lambda expressions in SolrIndexSearcher, at line 2257 and following. Basically, they should first check whether the reader is still open, and return e.g. -1 if it is not.
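A minimal sketch of the kind of guard suggested above, written against stand-in types so that it runs on its own. Everything named here (GuardedGauges, FakeReader, guarded, CLOSED) is hypothetical and is not part of Solr or Lucene. The real lambdas in SolrIndexSearcher would check the actual reader, and the actual fix may look different.

```java
import java.util.function.LongSupplier;

final class GuardedGauges {

    /** Stand-in for an index reader; NOT the Lucene class, only an illustration. */
    static final class FakeReader {
        private volatile boolean open = true;
        private final long version;

        FakeReader(long version) {
            this.version = version;
        }

        boolean isOpen() {
            return open;
        }

        void close() {
            open = false;
        }

        /** Mimics a reader that throws once it has been closed. */
        long getVersion() {
            if (!open) {
                throw new IllegalStateException("this reader is closed");
            }
            return version;
        }
    }

    /** Value reported by a gauge whose reader is no longer open (assumed convention). */
    static final long CLOSED = -1L;

    /**
     * Wraps a gauge so that it reports CLOSED instead of throwing
     * when the underlying reader has already been closed.
     */
    static LongSupplier guarded(FakeReader reader, LongSupplier delegate) {
        return () -> {
            if (!reader.isOpen()) {
                return CLOSED;
            }
            try {
                return delegate.getAsLong();
            } catch (IllegalStateException e) {
                // The reader was closed between the check and the call.
                return CLOSED;
            }
        };
    }

    public static void main(String[] args) {
        FakeReader reader = new FakeReader(42L);
        LongSupplier gauge = guarded(reader, reader::getVersion);

        System.out.println(gauge.getAsLong()); // reader open: prints 42
        reader.close();
        System.out.println(gauge.getAsLong()); // reader closed: prints -1
    }
}
```

The try/catch is there because the open check alone is racy: another thread can close the reader after the check and before the delegate runs.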

> SolrMetric registries retain references to SolrCores when closed
> ----------------------------------------------------------------
>
>                 Key: SOLR-11882
>                 URL: https://issues.apache.org/jira/browse/SOLR-11882
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: metrics, Server
>    Affects Versions: 7.1
>            Reporter: Eros Taborelli
>            Assignee: Erick Erickson
>            Priority: Major
>             Fix For: 7.3
>
>         Attachments: SOLR-11882.patch, SOLR-11882.patch, SOLR-11882.patch, SOLR-11882.patch, create-cores.zip, solr-dump-full_Leak_Suspects.zip, solr.config.zip
>
>
> *Description:*
> Our setup involves using a lot of small cores (possibly a hundred thousand), but working only on a few of them at any given time.
> We already followed all recommendations in this guide: [https://wiki.apache.org/solr/LotsOfCores]
> We noticed that after creating/loading around 1000-2000 empty cores, with no documents inside, the heap consumption went through the roof despite having set transientCacheSize to only 64 (heap size set to 12G).
> All cores are correctly set to loadOnStartup=false and transient=true, and we have verified via logs that the cores in excess are actually being closed.
> However, a reference remains in org.apache.solr.metrics.SolrMetricManager#registries and is never removed until a core is fully unloaded.
> Restarting the JVM loads all cores in the admin UI, but doesn't populate the ConcurrentHashMap until a core is actually fully loaded.
> I reproduced the issue on a smaller scale (transientCacheSize = 5, heap size = 512m) and made a report (attached) using Eclipse MAT.
> *Desired outcome:*
> When a transient core is closed, the references in the SolrMetricManager should be removed, in the same way that the reporters for the core are closed and removed.
> Alternatively, an unloadOnClose=true|false flag could be implemented to fully unload a transient core when it is closed due to the cache size.
> *Note:*
> The documentation mentions everywhere that the unused cores will be unloaded, but it's misleading as the cores are never fully unloaded.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
