Posted to issues@lucene.apache.org by "Chris M. Hostetter (Jira)" <ji...@apache.org> on 2020/08/01 00:48:00 UTC

[jira] [Updated] (SOLR-14657) spurious ERRORs due to race condition between SolrIndexSearcher metrics and IndexReader closing

     [ https://issues.apache.org/jira/browse/SOLR-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris M. Hostetter updated SOLR-14657:
--------------------------------------
    Attachment: SOLR-14657.patch
        Status: Open  (was: Open)

bq. how about adding this convenience method as SolrMetricsContext.safeGauge(Gauge g, Object defValue) / SolrMetricsContext.safeGauge(Gauge g) ? At least we would have one central place to make these decisions. 

I think that would be a bad idea, as it would presume that it is safe to _silently_ ignore *any* exception/throwable from *any* gauge registered that way -- and i don't think that's true.  Even for these gauges, if we encounter something truly unexpected then we should let it percolate up and show up in the logs - but in this case it's totally expected that an IndexReader might throw an AlreadyClosedException, so it's ok to explicitly ignore that type of exception.

bq. For this particular gauge I would be tempted to return -1 instead of null, as a kind of "impossible value" that we already return from other numeric gauges. ...  see SOLR-14683

Sure, i updated the patch to return -1 for the numeric IndexReader gauges, and "" for the one place that an IndexReader related gauge returns a String ("readerDir").  I also fixed the "indexVersion" gauge that i somehow completely overlooked in my first patch.
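
For illustration only, here is a minimal sketch of the idea discussed above: swallow just the expected "reader already closed" exception, return a sentinel (-1 for numeric gauges, "" for the String one), and let anything else propagate. It is not the code in the attached patch. The helper name ignoringClosedReader and the class ReaderGaugeSketch are hypothetical, the nested AlreadyClosedException is a stand-in for Lucene's class so the example compiles without Lucene, and a plain java.util.function.Supplier stands in for the metrics Gauge.

```java
import java.util.function.Supplier;

final class ReaderGaugeSketch {

    // Stand-in for Lucene's AlreadyClosedException so the sketch compiles
    // without Lucene on the classpath (an assumption of this example).
    static class AlreadyClosedException extends IllegalStateException {
        AlreadyClosedException(String message) {
            super(message);
        }
    }

    // Hypothetical helper, not part of Solr: evaluates the source and swallows
    // ONLY the expected AlreadyClosedException, returning a sentinel instead.
    // Any other exception propagates to the caller so it can be logged.
    static <T> Supplier<T> ignoringClosedReader(Supplier<T> source, T closedValue) {
        return () -> {
            try {
                return source.get();
            } catch (AlreadyClosedException e) {
                return closedValue;
            }
        };
    }

    public static void main(String[] args) {
        // Simulates a numeric reader gauge being polled while the reader closes.
        Supplier<Integer> numDocs = ignoringClosedReader(() -> {
            throw new AlreadyClosedException("this IndexReader is closed");
        }, -1);
        // Simulates the one String-valued reader gauge ("readerDir").
        Supplier<String> readerDir = ignoringClosedReader(() -> {
            throw new AlreadyClosedException("this IndexReader is closed");
        }, "");

        System.out.println("numDocs=" + numDocs.get());
        System.out.println("readerDir=[" + readerDir.get() + "]");
    }
}
```

Catching the specific exception type at each call site, rather than wrapping every gauge in a catch-all, keeps truly unexpected failures visible in the logs.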

> spurious ERRORs due to race condition between SolrIndexSearcher metrics and IndexReader closing
> -----------------------------------------------------------------------------------------------
>
>                 Key: SOLR-14657
>                 URL: https://issues.apache.org/jira/browse/SOLR-14657
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-14657.patch, SOLR-14657.patch
>
>
> I've seen situations in the wild where systems monitoring/polling metrics can trigger scary looking - but otherwise benign - ERRORs due to AlreadyClosedExceptions if/when the searcher/reader is in the process of being re-opened and the Gauge tries to call reader.numDocs(), etc...
> We should tweak the metrics logic to just ignore these exceptions


