Posted to commits@cassandra.apache.org by "Chris Lohfink (JIRA)" <ji...@apache.org> on 2014/09/21 07:35:34 UTC

[jira] [Commented] (CASSANDRA-7247) Provide top ten most frequent keys per column family

    [ https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142343#comment-14142343 ] 

Chris Lohfink commented on CASSANDRA-7247:
------------------------------------------

Updated the patch to always do it, but I think 2 or 3 are equally viable. It still uses an executor to single-thread the updates, which keeps the StreamSummary more performant and provides a 1k backlog cap, especially since I'm not sure about the performance impact of now using the AbstractType. Instead of using DecoratedKey.toString, I changed it to use the human-readable format from the partition's type, which makes it more useful for debugging. If we keep this as an always-on option, I can add a nodetool command to list the keys in a nice format.
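
For readers unfamiliar with the pattern, here is a minimal sketch of single-threading updates through an executor with a capped backlog. It is not the code from the attached patch: the class name BoundedSampler, the use of plain String keys, and the choice to silently drop samples once the backlog is full are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: funnels key samples through one worker thread with a capped backlog. */
class BoundedSampler {
    // A plain HashMap is acceptable only because a single worker thread mutates it.
    private final Map<String, Long> counts = new HashMap<>();
    private final ThreadPoolExecutor executor;

    BoundedSampler(int backlogCap) {
        executor = new ThreadPoolExecutor(
                1, 1,                                     // exactly one worker thread
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(backlogCap),     // bounded backlog of pending samples
                new ThreadPoolExecutor.DiscardPolicy());  // assumption: drop samples when full
    }

    /** Called from request threads; never blocks, may drop the sample under load. */
    void sample(String key) {
        executor.execute(() -> counts.merge(key, 1L, Long::sum));
    }

    /** Stops accepting samples, waits for the backlog to drain, and returns the counts. */
    Map<String, Long> shutdownAndGet() {
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counts;
    }
}
```

With a cap of 1000, a burst of writes costs the request path at most one queue insertion per sample, and anything beyond the cap is discarded rather than queued, which trades a little accuracy for bounded memory.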

> Provide top ten most frequent keys per column family
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7247
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7247
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Lohfink
>            Assignee: Chris Lohfink
>            Priority: Minor
>         Attachments: cassandra-2.1-7247.txt, jconsole.png, patch.txt
>
>
> Since we already have the nice addthis stream library, we can use it to keep track of the most frequent DecoratedKeys that come through the system using StreamSummaries ([nice explanation|http://boundary.com/blog/2013/05/14/approximate-heavy-hitters-the-spacesaving-algorithm/]). Then provide a new metric to access them via JMX.
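
As a rough illustration of the Space-Saving idea described in the linked explanation, the sketch below keeps a fixed number of counters and, when a new key arrives with no free slot, evicts the smallest counter and lets the new key inherit its count plus one. This is a simplified stand-in, not the stream library's StreamSummary: the class name SpaceSavingSketch, the String keys, and the linear scan for the minimum are assumptions chosen for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified Space-Saving counter for approximate heavy hitters. */
class SpaceSavingSketch {
    private final int capacity;
    private final Map<String, Long> counts = new HashMap<>();

    SpaceSavingSketch(int capacity) {
        this.capacity = capacity;
    }

    void offer(String key) {
        Long current = counts.get(key);
        if (current != null) {
            counts.put(key, current + 1);          // already monitored: increment
        } else if (counts.size() < capacity) {
            counts.put(key, 1L);                   // free slot: start monitoring
        } else {
            // No free slot: replace the least frequent key. The newcomer inherits
            // that count + 1, so counts can overestimate but never underestimate.
            String minKey = null;
            long min = Long.MAX_VALUE;
            for (Map.Entry<String, Long> e : counts.entrySet()) {
                if (e.getValue() < min) {
                    min = e.getValue();
                    minKey = e.getKey();
                }
            }
            counts.remove(minKey);
            counts.put(key, min + 1);
        }
    }

    /** Returns up to k monitored keys, most frequent first. */
    List<String> topK(int k) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            keys.add(entries.get(i).getKey());
        }
        return keys;
    }
}
```

The linear scan makes each eviction O(capacity); a production implementation would normally keep counters in a structure that finds the minimum in constant time.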



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)