Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2013/01/10 19:26:13 UTC

[jira] [Commented] (HBASE-4147) StoreFile query usage report

    [ https://issues.apache.org/jira/browse/HBASE-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549869#comment-13549869 ] 

Andrew Purtell commented on HBASE-4147:
---------------------------------------

bq. Meantime, marking this as non-critical and moving out of 0.96 while it is w/o a sponsor.

I might look at this again in the future in the context of HBASE-6572, when deciding which stores to migrate.
                
> StoreFile query usage report
> ----------------------------
>
>                 Key: HBASE-4147
>                 URL: https://issues.apache.org/jira/browse/HBASE-4147
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Doug Meil
>         Attachments: hbase_4147_storefilereport_2011_08_10.pdf, hbase_4147_storefilereport.pdf
>
>
> Detailed information on what HBase is doing in terms of reads is hard to come by.
> What would be useful is to have a periodic StoreFile query report.  Specifically, this could run on a configured interval (e.g., every 30 or 60 seconds) and dump the output to the log files.
> The report would list all StoreFiles accessed during the reporting period (and with the Path we would also know region, CF, and table), the # of times each StoreFile was accessed, the size of the StoreFile, and the total time (ms) spent processing that StoreFile.
> Even this level of summary would be useful to detect which tables & CFs are being accessed the most, and including the StoreFile would provide insight into relative "uncompaction" (i.e., lots of StoreFiles).
> I think the log-output, as opposed to UI, is an important facet with this.  I'm assuming that users will slice and dice this data on their own so I think we should skip any kind of admin view for now (i.e., new JSPs, new APIs to expose this data).  Just getting this to log-file would be a big improvement.
> Will this have a non-zero performance impact?  Yes.  Hopefully small, but yes it will.  However, flying a plane without any instrumentation isn't fun.  :-)  
>  
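
For illustration only, here is a minimal sketch of the per-StoreFile accounting the description above calls for: access count, file size, and total processing time per reporting period, emitted as log-friendly lines. It is not HBase code and does not use any HBase API; the class StoreFileUsageReport, its methods, and the output line format are all hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-StoreFile read accounting; a sketch, not part of HBase. */
class StoreFileUsageReport {

  /** Counters for one StoreFile during the current reporting period. */
  static final class Usage {
    final LongAdder accesses = new LongAdder();
    final LongAdder totalMillis = new LongAdder();
    volatile long sizeBytes;
  }

  private final ConcurrentHashMap<String, Usage> usageByPath = new ConcurrentHashMap<>();

  /** Assumed to be called from the read path after a StoreFile has been consulted. */
  void recordAccess(String storeFilePath, long sizeBytes, long elapsedMillis) {
    Usage u = usageByPath.computeIfAbsent(storeFilePath, p -> new Usage());
    u.accesses.increment();
    u.totalMillis.add(elapsedMillis);
    u.sizeBytes = sizeBytes;
  }

  /**
   * Returns one line per StoreFile touched this period and starts a new period.
   * An access that races with the drain may be dropped; that trade-off keeps
   * the read path free of locks in this sketch.
   */
  List<String> drainReportLines() {
    List<String> lines = new ArrayList<>();
    // Sort by path so the output is stable and easy to grep or diff.
    for (String path : new TreeSet<>(usageByPath.keySet())) {
      Usage u = usageByPath.remove(path);
      if (u == null) {
        continue;
      }
      lines.add("storefile=" + path
          + " accesses=" + u.accesses.sum()
          + " sizeBytes=" + u.sizeBytes
          + " totalMs=" + u.totalMillis.sum());
    }
    return lines;
  }
}
```

A scheduled task (for example one driven by java.util.concurrent.ScheduledExecutorService at the configured interval) could call drainReportLines() and write each line to the log. Keying on the full path is what would let a reader recover table, region, and CF from the report without extra fields.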
