Posted to issues@lucene.apache.org by "Andrzej Bialecki (Jira)" <ji...@apache.org> on 2020/11/05 14:00:00 UTC

[jira] [Commented] (SOLR-14683) Review the metrics API to ensure consistent placeholders for missing values

    [ https://issues.apache.org/jira/browse/SOLR-14683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226729#comment-17226729 ] 

Andrzej Bialecki commented on SOLR-14683:
-----------------------------------------

Prometheus best practices recommend "avoiding missing metrics" (as if that were always possible; what about, e.g., metrics that go missing due to network connectivity problems?) and suggest reporting 0 or NaN for missing numeric metrics:

{quote}
Avoid missing metrics
Time series that are not present until something happens are difficult to deal with, as the usual simple operations are no longer sufficient to correctly handle them. To avoid this, export 0 (or NaN, if 0 would be misleading) for any time series you know may exist in advance.

Most Prometheus client libraries (including Go, Java, and Python) will automatically export a 0 for you for metrics with no labels.
{quote}

For frequently occurring events, where the average value of the metric may be high, reporting 0 will skew the stats more than reporting NaN. Reporting NaN also clearly indicates that the data is not available, whereas 0 may be a legitimate value of the metric.
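
To illustrate the skew described above, here is a minimal sketch. The sample values and the helper name {{mean_ignoring_nan}} are made up for illustration and are not part of Solr or Prometheus:

```python
import math


def mean_ignoring_nan(samples):
    """Average only the samples that carry data; NaN marks a missing value."""
    valid = [s for s in samples if not math.isnan(s)]
    # If every sample is missing, the average is itself unknown.
    return sum(valid) / len(valid) if valid else math.nan


# Hypothetical gauge readings; the third sample was unavailable.
samples_with_zero = [120.0, 130.0, 0.0, 125.0]      # missing value reported as 0
samples_with_nan = [120.0, 130.0, math.nan, 125.0]  # missing value reported as NaN

# A client cannot tell the placeholder 0 from a real 0, so it averages it in.
naive_mean = sum(samples_with_zero) / len(samples_with_zero)

# A client can recognize NaN as "no data" and leave it out.
nan_aware_mean = mean_ignoring_nan(samples_with_nan)

print(naive_mean)      # 93.75
print(nan_aware_mean)  # 125.0
```

With the 0 placeholder the average drops from 125.0 to 93.75, while the NaN placeholder can be detected and excluded.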

The problem is that NaN cannot be serialized in standard JSON; it is supported only in extensions such as JSON5 (http://json5.org). The current JSON standard, ECMA-404, says: "Numeric values that cannot be represented as sequences of digits (such as Infinity and NaN) are not permitted."

So the only standard option left in JSON to indicate that the data is missing is to return {{null}}.
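
As a rough sketch of what this means in practice (not the Solr implementation), the Python snippet below replaces non-finite numbers with {{null}} before serializing. The metric names and the helper {{to_json_safe}} are hypothetical; {{allow_nan=False}} makes the standard {{json}} module reject NaN instead of emitting the non-standard token {{NaN}}:

```python
import json
import math


def to_json_safe(value):
    """Recursively replace NaN and +/-Infinity with None (JSON null)."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: to_json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(item) for item in value]
    return value


# Hypothetical metrics snapshot; the index size is not known yet.
metrics = {"indexSizeBytes": math.nan, "requestCount": 42}

# Strict mode refuses NaN, since standard JSON has no representation for it.
try:
    json.dumps(metrics, allow_nan=False)
    strict_dump_failed = False
except ValueError:
    strict_dump_failed = True

payload = json.dumps(to_json_safe(metrics), allow_nan=False)
print(payload)  # {"indexSizeBytes": null, "requestCount": 42}

# A client then has to treat null as "no data" rather than as a number.
decoded = json.loads(payload)
index_size = decoded["indexSizeBytes"]
print("missing" if index_size is None else index_size)  # missing
```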

> Review the metrics API to ensure consistent placeholders for missing values
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-14683
>                 URL: https://issues.apache.org/jira/browse/SOLR-14683
>             Project: Solr
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>
> Spin-off from SOLR-14657. Some gauges can legitimately be missing or in an unknown state at some points in time, e.g. during SolrCore startup or shutdown.
> Currently the API returns placeholders with either impossible values for numeric gauges (such as index size -1) or empty maps / strings for other non-numeric gauges.
> [~hossman] noticed that the values for these placeholders may be misleading, depending on how the user treats them - if the client has no special logic to treat them as "missing values" it may erroneously treat them as valid data. E.g. numeric values of -1 or 0 may severely skew averages and produce misleading peaks / valleys in metrics histories.
> On the other hand returning a literal {{null}} value instead of the expected number may also cause unexpected client issues - although in this case it's clearer that there's actually no data available, so long-term this may be a better strategy than returning impossible values, even if it means that the client should learn to handle {{null}} values appropriately.


