Posted to issues@lucene.apache.org by "Chris M. Hostetter (Jira)" <ji...@apache.org> on 2020/08/01 00:48:00 UTC
[jira] [Updated] (SOLR-14657) spurious ERRORs due to race condition
between SolrIndexSearcher metrics and IndexReader closing
[ https://issues.apache.org/jira/browse/SOLR-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris M. Hostetter updated SOLR-14657:
--------------------------------------
Attachment: SOLR-14657.patch
Status: Open (was: Open)
bq. how about adding this convenience method as SolrMetricsContext.safeGauge(Gauge g, Object defValue) / SolrMetricsContext.safeGauge(Gauge g) ? At least we would have one central place to make these decisions.
I think that would be a bad idea, as it would presume that it is safe to _silently_ ignore *any* exception/throwable from *any* gauge registered that way -- and I don't think that's true. Even for these gauges, if we encounter something truly unexpected then we should let it percolate up and show up in the logs - but in this case it's totally expected that an IndexReader might throw an AlreadyClosedException, so it's ok to explicitly ignore that type of exception.
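To illustrate the distinction, here is a rough sketch (not the code in the attached patch) of a wrapper that swallows only the one expected exception type. It uses java.util.function.Supplier as a stand-in for a metrics Gauge, and both the local AlreadyClosedException class and the ignoringClosedReader helper are hypothetical names made up so the snippet compiles on its own.

```java
import java.util.function.Supplier;

final class ClosedReaderGaugeSketch {

    /** Local stand-in (an assumption) for Lucene's AlreadyClosedException. */
    static final class AlreadyClosedException extends IllegalStateException {
        AlreadyClosedException(String msg) {
            super(msg);
        }
    }

    /**
     * Hypothetical helper: wraps a gauge-like supplier so that only the
     * expected "reader already closed" failure is replaced by a fallback.
     */
    static <T> Supplier<T> ignoringClosedReader(Supplier<T> delegate, T fallback) {
        return () -> {
            try {
                return delegate.get();
            } catch (AlreadyClosedException expected) {
                // Expected while a searcher/reader is being re-opened: not worth an ERROR.
                return fallback;
            }
            // No catch-all here: any other exception propagates to the caller
            // and can still show up in the logs.
        };
    }
}
```

With something like this, a gauge whose reader was closed mid-poll yields the fallback, while an IllegalArgumentException (or any other surprise) thrown by the same gauge still reaches the caller.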
bq. For this particular gauge I would be tempted to return -1 instead of null, as a kind of "impossible value" that we already return from other numeric gauges. ... see SOLR-14683
Sure, I updated the patch to return -1 for the numeric IndexReader gauges, and "" for the one place that an IndexReader-related gauge returns a String ("readerDir"). I also fixed the "indexVersion" gauge that I somehow completely overlooked in my first patch.
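As a rough illustration of those defaults (again a sketch, not the patch itself): the FakeReader class, the orDefault helper, and the directory path below are all hypothetical, and Supplier stands in for a metrics Gauge. Only the gauge name "readerDir" and the reader.numDocs() call come from this issue; the "numDocs" gauge name is assumed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

final class ReaderGaugeDefaultsSketch {

    /** Local stand-in (an assumption) for Lucene's AlreadyClosedException. */
    static final class AlreadyClosedException extends IllegalStateException {
        AlreadyClosedException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical reader: every accessor throws once close() has been called. */
    static final class FakeReader {
        private volatile boolean closed;

        int numDocs() {
            ensureOpen();
            return 42;
        }

        String directory() {
            ensureOpen();
            return "/hypothetical/index/dir";
        }

        void close() {
            closed = true;
        }

        private void ensureOpen() {
            if (closed) {
                throw new AlreadyClosedException("this IndexReader is closed");
            }
        }
    }

    /** Returns defaultValue only when the reader has already been closed. */
    static <T> Supplier<T> orDefault(Supplier<T> read, T defaultValue) {
        return () -> {
            try {
                return read.get();
            } catch (AlreadyClosedException expected) {
                return defaultValue;
            }
        };
    }

    /** Numeric gauges fall back to -1 (an "impossible value"), String gauges to "". */
    static Map<String, Supplier<?>> gauges(FakeReader reader) {
        Map<String, Supplier<?>> gauges = new LinkedHashMap<>();
        gauges.put("numDocs", orDefault(reader::numDocs, -1));
        gauges.put("readerDir", orDefault(reader::directory, ""));
        return gauges;
    }
}
```

A monitoring client polling these gauges during a searcher re-open would then see -1 or "" rather than triggering an ERROR.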
> spurious ERRORs due to race condition between SolrIndexSearcher metrics and IndexReader closing
> -----------------------------------------------------------------------------------------------
>
> Key: SOLR-14657
> URL: https://issues.apache.org/jira/browse/SOLR-14657
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-14657.patch, SOLR-14657.patch
>
>
> I've seen situations in the wild where systems monitoring/polling metrics can trigger scary looking - but otherwise benign - ERRORs due to AlreadyClosedExceptions if/when the searcher/reader is in the process of being re-opened and the Gauge tries to call reader.numDocs(), etc...
> We should tweak the metrics logic to just ignore these exceptions.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)