Posted to issues@hive.apache.org by "Karthik Manamcheri (JIRA)" <ji...@apache.org> on 2019/01/04 00:54:00 UTC

[jira] [Updated] (HIVE-21045) Add HMS total api count stats and connection pool stats to metrics

     [ https://issues.apache.org/jira/browse/HIVE-21045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Manamcheri updated HIVE-21045:
--------------------------------------
    Summary: Add HMS total api count stats and connection pool stats to metrics  (was: Add connection pool info and rolling performance info to the metrics system)

> Add HMS total api count stats and connection pool stats to metrics
> ------------------------------------------------------------------
>
>                 Key: HIVE-21045
>                 URL: https://issues.apache.org/jira/browse/HIVE-21045
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: Karthik Manamcheri
>            Assignee: Karthik Manamcheri
>            Priority: Minor
>
> There are two key metrics that HMS currently lacks and that would greatly improve visibility into how it scales.
> *Average API duration for the past 'n' minutes*
> We already compute and log the duration of API calls in the {{PerfLogger}}, but we have no gauge for the average duration of an API call over a recent window of time. Such a gauge would show whether load on the server is increasing the average API response time.
>  
> *RDBMS Connection wait time*
> HMS can use different connection pooling libraries, such as BoneCP or HikariCP. These pool managers expose statistics such as the average time spent waiting for a connection and the number of active connections. We should expose these as metrics so that we can tell when the configured connection pool size is too small and the pool is saturated.
> These metrics would help catch HMS resource contention problems before they cause jobs to fail.
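
As a rough illustration of the first metric, the sketch below keeps one bucket per minute and averages the durations recorded in the last 'n' minutes. It is not taken from Hive: the class name RollingAverage, the one-minute bucket size, and the injected clock are all assumptions made for the example, and the issue does not say how the gauge would actually be implemented.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

/** Hypothetical helper (not a Hive class): average call duration over the last n minutes. */
public class RollingAverage {
    private static final long BUCKET_MILLIS = 60_000L; // assumption: one bucket per minute

    private final long[] bucketMinute; // minute index currently stored in each slot, -1 if unused
    private final long[] totalMillis;  // sum of durations recorded during that minute
    private final long[] callCount;    // number of calls recorded during that minute
    private final LongSupplier clock;  // injected clock in milliseconds, so tests need not sleep

    public RollingAverage(int windowMinutes, LongSupplier clock) {
        if (windowMinutes <= 0) {
            throw new IllegalArgumentException("windowMinutes must be positive");
        }
        this.bucketMinute = new long[windowMinutes];
        Arrays.fill(this.bucketMinute, -1L);
        this.totalMillis = new long[windowMinutes];
        this.callCount = new long[windowMinutes];
        this.clock = clock;
    }

    /** Records the duration of one API call in the bucket for the current minute. */
    public synchronized void record(long durationMillis) {
        long minute = clock.getAsLong() / BUCKET_MILLIS;
        int slot = (int) Math.floorMod(minute, (long) bucketMinute.length);
        if (bucketMinute[slot] != minute) {
            // The slot still holds an older minute: reset it before reuse.
            bucketMinute[slot] = minute;
            totalMillis[slot] = 0L;
            callCount[slot] = 0L;
        }
        totalMillis[slot] += durationMillis;
        callCount[slot]++;
    }

    /** Returns the mean duration over buckets still inside the window, or 0.0 if there are none. */
    public synchronized double average() {
        long minute = clock.getAsLong() / BUCKET_MILLIS;
        long total = 0L;
        long count = 0L;
        for (int i = 0; i < bucketMinute.length; i++) {
            boolean inWindow = bucketMinute[i] >= 0 && minute - bucketMinute[i] < bucketMinute.length;
            if (inWindow) {
                total += totalMillis[i];
                count += callCount[i];
            }
        }
        return count == 0L ? 0.0 : (double) total / count;
    }
}
```

In a server, the clock would typically be System::currentTimeMillis, and average() would be what a gauge reads each time it is polled. Fixed buckets keep memory constant regardless of call volume, at the cost of the window boundary moving in whole-minute steps.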

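For the second metric, one possible shape is to wrap whatever statistics the pool manager offers behind a small interface and publish each value as a lazily read gauge. The sketch below is only that: PoolStats, PoolGauges, the gauge names, and the saturation rule are hypothetical and chosen for illustration. HikariCP, for instance, exposes comparable counters through its HikariPoolMXBean (getActiveConnections, getTotalConnections, getThreadsAwaitingConnection), which an adapter could forward to.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntSupplier;

/** Hypothetical sketch (not a Hive class): publish connection pool statistics as gauges. */
public class PoolGauges {

    /** Assumed minimal view of what a pool manager reports; an adapter would wrap the real pool. */
    public interface PoolStats {
        int activeConnections();
        int totalConnections();
        int threadsAwaitingConnection();
    }

    /** Builds named gauges that read the live value from the pool each time they are polled. */
    public static Map<String, IntSupplier> gaugesFor(String poolName, PoolStats stats) {
        Map<String, IntSupplier> gauges = new LinkedHashMap<>();
        gauges.put(poolName + ".active", stats::activeConnections);
        gauges.put(poolName + ".total", stats::totalConnections);
        gauges.put(poolName + ".waiting", stats::threadsAwaitingConnection);
        return gauges;
    }

    /**
     * Assumed saturation rule for this example: every connection is in use
     * and at least one thread is waiting for a connection.
     */
    public static boolean isSaturated(PoolStats stats) {
        return stats.activeConnections() >= stats.totalConnections()
                && stats.threadsAwaitingConnection() > 0;
    }
}
```

Because each gauge is a supplier rather than a stored number, a metrics reporter that polls it always sees the current pool state, and a sustained non-zero "waiting" value is the kind of signal the description refers to when it mentions a pool that is too small.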


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)