Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2011/03/16 21:58:29 UTC

[jira] Created: (HBASE-3658) Alert when heap is over committed

Alert when heap is over committed
---------------------------------

                 Key: HBASE-3658
                 URL: https://issues.apache.org/jira/browse/HBASE-3658
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 0.90.1
            Reporter: Jean-Daniel Cryans
             Fix For: 0.92.0


Something I just witnessed: the block cache setting was at 70% but the max global memstore size was at the default of 40%, meaning that 110% of the heap can potentially be "assigned", and you still need more heap to do stuff like flushing and compacting.

We should run a configuration check that alerts the user when that happens and maybe even refuse to start.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3658) Alert when heap is over committed

Posted by "Subbu M Iyer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008288#comment-13008288 ] 

Subbu M Iyer commented on HBASE-3658:
-------------------------------------

Here is what we would do:

1. At RS startup, we examine the following heap-related config settings:
a) hbase.regionserver.global.memstore.upperLimit
b) hfile.block.cache.size

If the sum of a + b exceeds, say, 80% of the heap (assuming we need at least 0.2 for other housekeeping chores), we abort the RS with an appropriate error message.

c) Do we also need an additional config setting to declare the default minimum we need for housekeeping chores?
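
For illustration, the check described above could be sketched roughly as follows. This is not the attached patch: it uses a plain Map as a stand-in for the HBase configuration object, the class and method names (HeapCommitCheck, committedPercent, check) are hypothetical, and the 0.2 default for hfile.block.cache.size is an assumption (the 0.4 memstore default is the one quoted in the issue description).

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a startup check for an over-committed heap; names are hypothetical. */
final class HeapCommitCheck {
    static final String MEMSTORE_KEY = "hbase.regionserver.global.memstore.upperLimit";
    static final String BLOCK_CACHE_KEY = "hfile.block.cache.size";

    // 0.4 is the memstore default quoted in the issue; 0.2 is an assumed block cache default.
    static final float DEFAULT_MEMSTORE = 0.4f;
    static final float DEFAULT_BLOCK_CACHE = 0.2f;

    // Proposed ceiling: keep at least 20% of the heap free for flushing, compacting, etc.
    static final int MAX_COMMITTED_PERCENT = 80;

    /** Returns memstore + block cache as a whole-number percentage of the heap. */
    static int committedPercent(Map<String, String> conf) {
        float memstore = getFloat(conf, MEMSTORE_KEY, DEFAULT_MEMSTORE);
        float blockCache = getFloat(conf, BLOCK_CACHE_KEY, DEFAULT_BLOCK_CACHE);
        // Round each fraction to a whole percent so float noise cannot tip the comparison.
        return Math.round(memstore * 100) + Math.round(blockCache * 100);
    }

    /** Throws if the two settings together commit more than the allowed share of the heap. */
    static void check(Map<String, String> conf) {
        int committed = committedPercent(conf);
        if (committed > MAX_COMMITTED_PERCENT) {
            throw new IllegalStateException("MemStore + BlockCache commit " + committed
                + "% of the heap; the combined value cannot exceed " + MAX_COMMITTED_PERCENT
                + "%. Check " + MEMSTORE_KEY + " and " + BLOCK_CACHE_KEY + ".");
        }
    }

    private static float getFloat(Map<String, String> conf, String key, float dflt) {
        String value = conf.get(key);
        return value == null ? dflt : Float.parseFloat(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // The situation from the issue: block cache at 0.7, memstore left at its 0.4 default.
        conf.put(BLOCK_CACHE_KEY, "0.7");
        try {
            check(conf);
            System.out.println("Heap settings look fine.");
        } catch (IllegalStateException e) {
            System.out.println("Refusing to start: " + e.getMessage());
        }
    }
}
```

If the answer to c) is yes, the hard-coded ceiling in this sketch would instead be read from a (yet to be named) config setting.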

Please let me know.
 


[jira] [Resolved] (HBASE-3658) Alert when heap is over committed

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3658.
--------------------------

      Resolution: Fixed
        Assignee: Subbu M Iyer
    Hadoop Flags: [Reviewed]

Nice one Subbu.  Thanks for the patch.  Committed to branch and trunk.


[jira] Updated: (HBASE-3658) Alert when heap is over committed

Posted by "Subbu M Iyer (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Subbu M Iyer updated HBASE-3658:
--------------------------------

    Attachment: HBASE-3658_Alert_when_heap_is_over_committed.patch


[jira] Commented: (HBASE-3658) Alert when heap is over committed

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008052#comment-13008052 ] 

Jonathan Gray commented on HBASE-3658:
--------------------------------------

+1 on refusing to start


[jira] Commented: (HBASE-3658) Alert when heap is over committed

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008306#comment-13008306 ] 

stack commented on HBASE-3658:
------------------------------

That sounds right Subbu.  I see it as a utility method on HBaseConfiguration.  You'd add a method named 'check' or 'checkIsWholesome' or something and it would run what you describe above.  Later we might add other checks beyond the two above.
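
One way to leave room for further checks later, sketched under assumptions: a small registry that runs every registered check and reports all problems together. Nothing here is HBase API. ConfigSanity, Check and heapCheck are hypothetical names, a plain Map stands in for the configuration object, and the 0.2 block cache default and the 80% limit are assumed values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical registry of startup configuration checks; not part of HBase. */
final class ConfigSanity {
    /** A check returns a description of the problem, or empty when the config is fine. */
    interface Check extends Function<Map<String, String>, Optional<String>> {}

    private final Map<String, Check> checks = new LinkedHashMap<>();

    ConfigSanity register(String name, Check check) {
        checks.put(name, check);
        return this;
    }

    /** Runs every registered check in registration order and collects all failures. */
    List<String> run(Map<String, String> conf) {
        List<String> problems = new ArrayList<>();
        checks.forEach((name, check) ->
            check.apply(conf).ifPresent(p -> problems.add(name + ": " + p)));
        return problems;
    }

    /** Throws if any check failed, so a server can refuse to start. */
    void check(Map<String, String> conf) {
        List<String> problems = run(conf);
        if (!problems.isEmpty()) {
            throw new IllegalStateException(String.join("; ", problems));
        }
    }

    /** The heap check from this issue, with assumed defaults (0.4 memstore, 0.2 block cache). */
    static Check heapCheck() {
        return conf -> {
            int percent = percent(conf, "hbase.regionserver.global.memstore.upperLimit", 0.4f)
                        + percent(conf, "hfile.block.cache.size", 0.2f);
            if (percent > 80) {
                return Optional.of("memstore + block cache = " + percent + "% of heap (limit 80%)");
            }
            return Optional.empty();
        };
    }

    private static int percent(Map<String, String> conf, String key, float dflt) {
        String value = conf.get(key);
        return Math.round(100 * (value == null ? dflt : Float.parseFloat(value.trim())));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hfile.block.cache.size", "0.7");
        ConfigSanity sanity = new ConfigSanity().register("heap", heapCheck());
        sanity.run(conf).forEach(System.out::println);
    }
}
```

Collecting all failures before throwing means a user with several bad settings sees them in one pass rather than fixing them one restart at a time.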


[jira] Commented: (HBASE-3658) Alert when heap is over committed

Posted by "Subbu M Iyer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008456#comment-13008456 ] 

Subbu M Iyer commented on HBASE-3658:
-------------------------------------

I verified that my cluster does not start when the sum of the MemStore and BlockCache allocations exceeds 0.8f.


$ ./start-hbase.sh 
Exception in thread "main" java.lang.RuntimeException: Current heap configuration for MemStore and BlockCache exceeds the threshold required for successful cluster operation. The combined value cannot exceed 0.8. Please check the settings for hbase.regionserver.global.memstore.upperLimit and hfile.block.cache.size in your configuration.
	at org.apache.hadoop.hbase.HBaseConfiguration.checkForClusterFreeMemoryLimit(HBaseConfiguration.java:78)
	at org.apache.hadoop.hbase.HBaseConfiguration.addHbaseResources(HBaseConfiguration.java:91)
	at org.apache.hadoop.hbase.HBaseConfiguration.create(HBaseConfiguration.java:101)
	at org.apache.hadoop.hbase.util.HBaseConfTool.main(HBaseConfTool.java:38)
Exception in thread "main" java.lang.RuntimeException: Current heap configuration for MemStore and BlockCache exceeds the threshold required for successful cluster operation. The combined value cannot exceed 0.8. Please check the settings for hbase.regionserver.global.memstore.upperLimit and hfile.block.cache.size in your configuration.
	at org.apache.hadoop.hbase.HBaseConfiguration.checkForClusterFreeMemoryLimit(HBaseConfiguration.java:78)
	at org.apache.hadoop.hbase.HBaseConfiguration.addHbaseResources(HBaseConfiguration.java:91)
	at org.apache.hadoop.hbase.HBaseConfiguration.create(HBaseConfiguration.java:101)
	at org.apache.hadoop.hbase.zookeeper.ZKServerTool.main(ZKServerTool.java:39)
starting master, logging to /work/hbase-0.90.1/bin/../logs/hbase-subbu-master-subbumac.local.out
Exception in thread "main" java.lang.RuntimeException: Current heap configuration for MemStore and BlockCache exceeds the threshold required for successful cluster operation. The combined value cannot exceed 0.8. Please check the settings for hbase.regionserver.global.memstore.upperLimit and hfile.block.cache.size in your configuration.
	at org.apache.hadoop.hbase.HBaseConfiguration.checkForClusterFreeMemoryLimit(HBaseConfiguration.java:78)
	at org.apache.hadoop.hbase.HBaseConfiguration.addHbaseResources(HBaseConfiguration.java:91)
	at org.apache.hadoop.hbase.HBaseConfiguration.create(HBaseConfiguration.java:101)
	at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
	at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1205)
localhost: starting regionserver, logging to /work/hbase-0.90.1/bin/../logs/hbase-subbu-regionserver-subbumac.local.out
localhost: Exception in thread "main" java.lang.RuntimeException: Current heap configuration for MemStore and BlockCache exceeds the threshold required for successful cluster operation. The combined value cannot exceed 0.8. Please check the settings for hbase.regionserver.global.memstore.upperLimit and hfile.block.cache.size in your configuration.
localhost: 	at org.apache.hadoop.hbase.HBaseConfiguration.checkForClusterFreeMemoryLimit(HBaseConfiguration.java:78)
localhost: 	at org.apache.hadoop.hbase.HBaseConfiguration.addHbaseResources(HBaseConfiguration.java:91)
localhost: 	at org.apache.hadoop.hbase.HBaseConfiguration.create(HBaseConfiguration.java:101)
localhost: 	at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2815)



[jira] [Commented] (HBASE-3658) Alert when heap is over committed

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011489#comment-13011489 ] 

Hudson commented on HBASE-3658:
-------------------------------

Integrated in HBase-TRUNK #1814 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1814/])
    


[jira] Commented: (HBASE-3658) Alert when heap is over committed

Posted by "Subbu M Iyer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008455#comment-13008455 ] 

Subbu M Iyer commented on HBASE-3658:
-------------------------------------

First draft submitted for review:

1. The following 3 test cases fail in my local environment, and at first glance the failures don't appear to have anything to do with my changes above. I am looking at the test cases now to see why they are failing.
 
TestSplitTransactionOnCluster
TestStoreFile
TestHMsg


