Posted to dev@hbase.apache.org by "Lars George (JIRA)" <ji...@apache.org> on 2010/02/04 07:04:28 UTC
[jira] Resolved: (HBASE-2181) remove or provide config to completely disable all aspects of 'table fragmentation'
[ https://issues.apache.org/jira/browse/HBASE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars George resolved HBASE-2181.
--------------------------------
Resolution: Duplicate
Duplicates HBASE-2165
> remove or provide config to completely disable all aspects of 'table fragmentation'
> ------------------------------------------------------------------------------------
>
> Key: HBASE-2181
> URL: https://issues.apache.org/jira/browse/HBASE-2181
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: ryan rawson
> Fix For: 0.21.0
>
>
> Given the potentially low and misleading value of the metric, and how much effort must be expended to collect it, I would argue that we should at least allow users to disable the feature completely.
> The first problem is that the data the metric delivers is not very useful. On any given busy system, this value is often 100%. On a sample system here, 12% of the tables were at either 0 or 100%. Furthermore, the 100% value is not particularly informative. If a table has 100% 'fragmentation', that does not necessarily imply the table is in dire need of compaction. The HBase compaction code will generally keep at least 2 store files around: it refuses to minor compact older and larger files, preferring to merge small files. Thus on a table taking writes on all regions, the expected value of fragmentation is in fact 100%, and this is not a bad thing. Considering that compacting a 500GB table will take an hour and hammer a cluster, misleading users into striving for 0% is not ideal.
> The other major problem with this feature is that collecting the data is non-trivial on larger clusters. I ran a test in which I did an lsr on a Hadoop cluster, and generating 15k lines of output pegged the namenode at over 100% CPU for a few seconds. A cluster with 7000 regions can easily have 14,000 files (2 store files per region is typical), so generating this statistic causes load spikes on the namenode.
> I would propose 3 courses of action:
> - allow the feature to be disabled completely, including the background thread and the UI display
> - change the metric to mean '# of regions with > 5 store files'
> - replace the metric with a completely different one that attempts to capture the spirit of the intent but with less load.
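
To illustrate the second proposal, here is a minimal Python sketch comparing the two metrics on made-up data. It assumes that 'fragmentation' means the percentage of regions holding more than one store file, which is inferred from the description above and not taken from the HBase source. The function names and the sample counts are hypothetical.

```python
def fragmentation_percent(store_file_counts):
    """Assumed definition: percentage of regions with more than one store file."""
    if not store_file_counts:
        return 0
    fragmented = sum(1 for n in store_file_counts if n > 1)
    return round(100 * fragmented / len(store_file_counts))


def regions_over_threshold(store_file_counts, threshold=5):
    """Proposed metric: number of regions with more than `threshold` store files."""
    return sum(1 for n in store_file_counts if n > threshold)


# Hypothetical store-file counts for six regions of a table taking writes everywhere.
counts = [2, 2, 3, 2, 7, 2]

print(fragmentation_percent(counts))   # every region has >1 file, so this saturates
print(regions_over_threshold(counts))  # only the region with 7 files is flagged
```

Under these assumptions the percentage saturates as soon as every region has two store files, while the threshold count singles out only the regions that have accumulated many files.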