Posted to dev@hbase.apache.org by "Lars George (JIRA)" <ji...@apache.org> on 2010/02/04 07:04:28 UTC

[jira] Resolved: (HBASE-2181) remove or provide config to completely disable all aspects of 'table fragmentation'

     [ https://issues.apache.org/jira/browse/HBASE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George resolved HBASE-2181.
--------------------------------

    Resolution: Duplicate

Duplicates HBASE-2165

> remove or provide config to completely disable all aspects of 'table fragmentation' 
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-2181
>                 URL: https://issues.apache.org/jira/browse/HBASE-2181
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: ryan rawson
>             Fix For: 0.21.0
>
>
> Given the potentially low and misleading value of the metric, and how much effort must be expended to collect it, I would argue that we should at least allow users to disable the feature completely.
> The first problem is that the data the metric delivers is not very useful. On any given busy system, this value is often 100%. On a sample system here, 12% of the tables were at either 0 or 100%. Furthermore, the 100% reading is not particularly informative: a table at 100% 'fragmentation' is not necessarily in dire need of compaction. The HBase compaction code will generally keep at least 2 store files around - it refuses to minor-compact older and larger files, preferring to merge small ones. Thus on a table taking writes on all regions, the expected value of fragmentation is in fact 100%, and this is not a bad thing either. Considering that compacting a 500GB table will take an hour and hammer a cluster, misleading users into striving for 0% is non-ideal.
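To make the behavior concrete, here is a minimal sketch of how such a percentage could be computed. This is an illustration of the semantics described above, not the actual HBase master code, and the storeFilesPerRegion input is a hypothetical stand-in for whatever the master knows about each region:

    import java.util.Map;

    public class FragmentationSketch {
      /**
       * Counts a region as "fragmented" as soon as it holds more than
       * one store file. A table taking writes on all regions therefore
       * reports 100% even when nothing is wrong.
       */
      static int fragmentationPercent(Map<String, Integer> storeFilesPerRegion) {
        if (storeFilesPerRegion.isEmpty()) {
          return 0;
        }
        int fragmented = 0;
        for (int storeFiles : storeFilesPerRegion.values()) {
          if (storeFiles > 1) {
            fragmented++;
          }
        }
        return 100 * fragmented / storeFilesPerRegion.size();
      }
    }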
> The other major problem with this feature is that collecting the data is non-trivial on larger clusters. In a test, I ran a recursive listing (lsr) on a Hadoop cluster; generating 15k lines of output pegged the namenode at over 100% CPU for a few seconds. On a cluster with 7,000 regions we can easily have 14,000 files (2 store files per region is typical), causing spikes of namenode load every time this statistic is generated.
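The cost comes from the directory walk itself. Below is a minimal sketch, using the standard Hadoop FileSystem API with a placeholder table path, of why counting store files translates into one namenode round trip per directory:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class StoreFileWalk {
      // Every listStatus() call below is an RPC answered by the
      // namenode, so walking thousands of region directories turns
      // this count into a burst of namenode load.
      static long countFiles(FileSystem fs, Path dir) throws IOException {
        long files = 0;
        for (FileStatus status : fs.listStatus(dir)) {
          if (status.isDirectory()) {
            files += countFiles(fs, status.getPath());
          } else {
            files++;
          }
        }
        return files;
      }

      public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        // "/hbase/mytable" is a placeholder path for illustration only.
        System.out.println(countFiles(fs, new Path("/hbase/mytable")));
      }
    }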
> I would propose 3 courses of action:
> - allow complete disablement of the feature, including the background thread and the UI display
> - change the metric to mean '# of regions with > 5 store files' (see the sketch after this list)
> - replace the metric with a completely different one that attempts to capture the spirit of the intent with less load.
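For the first two proposals, here is a minimal sketch of what a kill switch plus the replacement count could look like. Both the property name hbase.master.fragmentation.enabled and the threshold of 5 are hypothetical, taken from the proposal above rather than from any shipped HBase configuration:

    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;

    public class CompactionBacklogSketch {
      static final int STORE_FILE_THRESHOLD = 5;

      // Hypothetical property name; not an actual HBase setting.
      static final String ENABLED_KEY = "hbase.master.fragmentation.enabled";

      /**
       * Returns -1 when the feature is disabled, otherwise the number
       * of regions whose store file count exceeds the threshold.
       */
      static int regionsOverThreshold(Configuration conf,
          Map<String, Integer> storeFilesPerRegion) {
        if (!conf.getBoolean(ENABLED_KEY, true)) {
          return -1;
        }
        int backlogged = 0;
        for (int storeFiles : storeFilesPerRegion.values()) {
          if (storeFiles > STORE_FILE_THRESHOLD) {
            backlogged++;
          }
        }
        return backlogged;
      }
    }

A count like this stays at 0 on a healthy write-heavy table and only climbs when compaction genuinely falls behind, which is closer to the intent of the metric.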

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.