Posted to issues@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2015/09/30 02:24:04 UTC

[jira] [Updated] (HIVE-11444) ACID Compactor should generate stats/alerts

     [ https://issues.apache.org/jira/browse/HIVE-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-11444:
----------------------------------
    Description: 
Compaction should generate stats about the number of files it reads, their min/max/avg size, etc.  It should also generate alerts if the system appears to be misconfigured.

For example, if there are lots of very small delta files, it's a good sign that the Streaming API is configured with batches that are too small.
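
A minimal sketch of the kind of stats and alert described above, written in Python for brevity rather than in Hive's Java. Everything here is an assumption for illustration: the function names, the 1 MB "small file" threshold, the minimum file count and the 80% fraction are hypothetical and not part of Hive or the Compactor.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class FileStats:
    count: int
    min_size: int
    max_size: int
    avg_size: float


def summarize(sizes):
    """Return count/min/max/avg for a list of file sizes in bytes."""
    if not sizes:
        return FileStats(0, 0, 0, 0.0)
    return FileStats(len(sizes), min(sizes), max(sizes), mean(sizes))


def small_file_alert(sizes, small_bytes=1024 * 1024, min_files=100,
                     small_fraction=0.8):
    """Return a warning string when most delta files are small, else None.

    All three thresholds are hypothetical defaults chosen for this sketch.
    """
    if len(sizes) < min_files:
        return None  # too few files to draw a conclusion
    small = sum(1 for s in sizes if s < small_bytes)
    if small / len(sizes) >= small_fraction:
        return (f"{small} of {len(sizes)} delta files are under "
                f"{small_bytes} bytes; Streaming API batches may be too small")
    return None
```

The compactor already lists the files it reads, so the sizes could be collected at that point and passed to functions like these; where exactly that hook would live is left open here.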

The simplest idea is to add another periodic task to AcidHouseKeeperService that would:
        //periodically run: select count(*), min(txnid), max(txnid), type from txns group by type
        //1. dump the result to the log file at info level
        //2. optionally keep counts for the last 10 min, hour, 6 hours, 24 hours, etc.
        //2.1 if a large increase is detected, issue an alert (at least to the log for now) at warn/error level
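
The steps above could be sketched roughly as follows. This is an illustration in Python against an in-memory SQLite table, not Hive code: the `txns` table with `txnid` and `type` columns simply mirrors the query in this description and may not match the real metastore schema, and `TxnStatsTask`, `window_seconds` and `growth_factor` are hypothetical names. It keeps a single lookback window; the 10 min / hour / 6 hours / 24 hours windows would each need their own instance or history.

```python
import logging
import sqlite3
import time
from collections import deque

log = logging.getLogger("txn_stats")

# Query taken from the issue description; the schema is an assumption.
QUERY = "select count(*), min(txnid), max(txnid), type from txns group by type"


def snapshot(conn):
    """Run the aggregate query and return {type: (count, min_txnid, max_txnid)}."""
    return {t: (c, lo, hi) for c, lo, hi, t in conn.execute(QUERY)}


class TxnStatsTask:
    """Hypothetical periodic task: log stats, warn on a large increase."""

    def __init__(self, conn, window_seconds=600, growth_factor=2.0):
        self.conn = conn
        self.window_seconds = window_seconds
        self.growth_factor = growth_factor
        self.history = deque()  # (timestamp, snapshot), oldest first

    def run_once(self, now=None):
        now = time.time() if now is None else now
        current = snapshot(self.conn)
        log.info("txn stats: %s", current)  # step 1: dump at info

        # Step 2: keep only snapshots that fall inside the lookback window.
        while self.history and now - self.history[0][0] > self.window_seconds:
            self.history.popleft()

        alerts = []
        if self.history:
            _, oldest = self.history[0]
            for txn_type, (count, _, _) in current.items():
                before = oldest.get(txn_type, (0, None, None))[0]
                # Step 2.x: compare against the oldest snapshot in the window.
                if before > 0 and count >= before * self.growth_factor:
                    msg = (f"txn count for type {txn_type!r} grew "
                           f"from {before} to {count}")
                    log.warning(msg)
                    alerts.append(msg)

        self.history.append((now, current))
        return alerts
```

A scheduler would call `run_once()` at a fixed interval; passing `now` explicitly is only there to make the sketch easy to exercise without waiting.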


  was:
Compaction should generate stats about number of files it reads, min/max/avg size etc.  It should also generate alerts if it looks like the system is not configured correctly.

For example, if there are lots of delta files with very small files, it's a good sign that Streaming API is configured with batches that are too small.


> ACID Compactor should generate stats/alerts
> -------------------------------------------
>
>                 Key: HIVE-11444
>                 URL: https://issues.apache.org/jira/browse/HIVE-11444
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman


