Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (Created) (JIRA)" <ji...@apache.org> on 2012/01/10 20:35:39 UTC

[jira] [Created] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Coalesce aborted tasks in the TaskMonitor
-----------------------------------------

                 Key: HBASE-5174
                 URL: https://issues.apache.org/jira/browse/HBASE-5174
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 0.92.0
            Reporter: Jean-Daniel Cryans
             Fix For: 0.94.0, 0.92.1


Some tasks can get canceled repeatedly, for example flushes while a split is in progress. In the logs it looks like this:

{noformat}
2012-01-10 19:28:29,164 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap pressure
2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, writesEnabled=false
2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=1.6g
2012-01-10 19:28:29,164 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap pressure
2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, writesEnabled=false
2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=1.6g
2012-01-10 19:28:29,164 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap pressure
2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, writesEnabled=false
{noformat}

But in the TaskMonitor UI you'll get MAX_TASKS (1000) entries displayed above the regions, basically 1000 copies of this:

{noformat}
Tue Jan 10 19:28:29 UTC 2012	Flushing test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c.	ABORTED (since 31sec ago)	Not flushing since writes not enabled (since 31sec ago)
{noformat}

It's ugly and I'm sure some users will freak out seeing this, plus you have to scroll down all the way to see your regions. Coalescing consecutive aborted tasks seems like a good solution.
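
For illustration only, here is a minimal sketch of what coalescing consecutive aborted tasks could look like. The Row class and its fields are hypothetical stand-ins for a row of the task display, not the actual MonitoredTask API, and this is not a proposed patch.

```java
import java.util.ArrayList;
import java.util.List;

final class CoalesceSketch {
    /** Hypothetical display row; NOT the real HBase MonitoredTask type. */
    static final class Row {
        final String description;
        final String state;
        final String status;
        int repeats = 1; // how many consecutive identical tasks this row stands for

        Row(String description, String state, String status) {
            this.description = description;
            this.state = state;
            this.status = status;
        }

        /** True only when both rows are ABORTED and carry the same text. */
        boolean sameAbortAs(Row other) {
            return "ABORTED".equals(state)
                && state.equals(other.state)
                && description.equals(other.description)
                && status.equals(other.status);
        }
    }

    /** Collapses each run of consecutive, identical ABORTED rows into one row with a count. */
    static List<Row> coalesce(List<Row> rows) {
        List<Row> out = new ArrayList<>();
        for (Row row : rows) {
            Row last = out.isEmpty() ? null : out.get(out.size() - 1);
            if (last != null && last.sameAbortAs(row)) {
                last.repeats++;
            } else {
                // Copy the row so the input list is left untouched.
                out.add(new Row(row.description, row.state, row.status));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add(new Row("Flushing test1", "ABORTED", "Not flushing since writes not enabled"));
        }
        for (Row r : coalesce(rows)) {
            System.out.println(r.description + " " + r.state + " x" + r.repeats);
        }
    }
}
```

Only consecutive rows are merged, so an aborted flush that follows some other task would still get its own row.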

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Andrew Purtell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185187#comment-13185187 ] 

Andrew Purtell commented on HBASE-5174:
---------------------------------------

bq. Considering that an ABORTED task is cleaned up after 1 minute, I wonder whether the added complexity is worth it.

On the other hand the display would be cleaner with coalescing, so perhaps failed or aborted tasks could remain displayed for a longer period of time.
                

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Jean-Daniel Cryans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185237#comment-13185237 ] 

Jean-Daniel Cryans commented on HBASE-5174:
-------------------------------------------

bq.  so perhaps failed or aborted tasks could remain displayed for a longer period of time.

Agreed, and if you reset the timer each time you coalesce tasks together, the entry could stick around for a while.
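
A rough sketch of the timer reset, under assumptions: CoalescedTask, absorbRepeat and isExpired are made-up names, and the 60-second window is simply the 1 minute cleanup mentioned in this thread. The clock is passed in explicitly so the behaviour is easy to check.

```java
final class ExpirySketch {
    /** Assumed cleanup window: the "1 minute" quoted in this thread. */
    static final long EXPIRY_MILLIS = 60_000L;

    /** Hypothetical coalesced entry; not an HBase class. */
    static final class CoalescedTask {
        int repeats = 1;
        long lastSeenMillis;

        CoalescedTask(long nowMillis) {
            this.lastSeenMillis = nowMillis;
        }

        /** Folding another identical aborted task into this entry restarts its timer. */
        void absorbRepeat(long nowMillis) {
            repeats++;
            lastSeenMillis = nowMillis;
        }

        /** The entry is dropped only once no repeat has arrived for a full window. */
        boolean isExpired(long nowMillis) {
            return nowMillis - lastSeenMillis > EXPIRY_MILLIS;
        }
    }

    public static void main(String[] args) {
        CoalescedTask task = new CoalescedTask(0L);
        task.absorbRepeat(50_000L);
        System.out.println("expired at 100s? " + task.isExpired(100_000L));
        System.out.println("expired at 111s? " + task.isExpired(111_000L));
    }
}
```

With this shape, an entry that keeps absorbing repeats never expires while the aborts continue, and disappears one window after the last one.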
                

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Andrew Purtell (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184571#comment-13184571 ] 

Andrew Purtell commented on HBASE-5174:
---------------------------------------

Render the monitored tasks as a treeview, with something like http://jquery.bassistance.de/treeview/ ? While building the tree, put entries with identical text one level down; as soon as you see something different, move back up to the top level? Render fully collapsed?
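
The tree-building step could be sketched roughly as below. This covers only the server-side grouping, with hypothetical names (Node, build, renderCollapsed); the jQuery treeview rendering is not shown, and the collapsed text format is an invention for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class TaskTreeSketch {
    /** Hypothetical top-level tree node; repeats of its text hang one level down. */
    static final class Node {
        final String text;
        final List<String> children = new ArrayList<>();

        Node(String text) {
            this.text = text;
        }
    }

    /** Identical consecutive entries go under the current node; a different entry starts a new one. */
    static List<Node> build(List<String> entries) {
        List<Node> top = new ArrayList<>();
        for (String entry : entries) {
            Node last = top.isEmpty() ? null : top.get(top.size() - 1);
            if (last != null && last.text.equals(entry)) {
                last.children.add(entry);
            } else {
                top.add(new Node(entry));
            }
        }
        return top;
    }

    /** Fully collapsed view: one line per top-level node, with a count of hidden children. */
    static String renderCollapsed(List<Node> top) {
        StringBuilder sb = new StringBuilder();
        for (Node n : top) {
            if (n.children.isEmpty()) {
                sb.append("  ").append(n.text);
            } else {
                sb.append("+ ").append(n.text)
                  .append(" (").append(n.children.size()).append(" more)");
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> entries = new ArrayList<>();
        entries.add("Flushing test1 ABORTED");
        entries.add("Flushing test1 ABORTED");
        entries.add("Compacting test2 RUNNING");
        System.out.print(renderCollapsed(build(entries)));
    }
}
```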
                

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Jimmy Xiang (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185694#comment-13185694 ] 

Jimmy Xiang commented on HBASE-5174:
------------------------------------

Failed or aborted tasks should not be displayed after the retry has succeeded. Otherwise, won't it cause confusion?
                

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Jimmy Xiang (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185701#comment-13185701 ] 

Jimmy Xiang commented on HBASE-5174:
------------------------------------

I meant that we cannot just show the failed or aborted tasks for longer. We should also show the succeeded or retrying task if it failed before and the failed task is still showing.
                

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184532#comment-13184532 ] 

Todd Lipcon commented on HBASE-5174:
------------------------------------

There's no guarantee that Object.hashCode() is unique - just that it's usually unique. Would rather coalesce by actual identity (WeakIdentityHashMap?) or by some string (eg region id) than use hashcode.
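
To make the concern concrete, here is a small sketch with a hypothetical Task class whose hashCode() is forced to collide, simulating the rare case. WeakIdentityHashMap is not a JDK class, so the sketch uses java.util.IdentityHashMap, which holds strong references, as a stand-in; the region-name key is the string alternative.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class CoalesceKeySketch {
    /** Hypothetical task; hashCode() always collides to simulate the unlucky case. */
    static final class Task {
        final String regionName;

        Task(String regionName) {
            this.regionName = regionName;
        }

        @Override
        public int hashCode() {
            return 42; // simulated collision between distinct tasks
        }
    }

    /** Keying on hashCode() merges distinct tasks whenever their hashes collide. */
    static int groupsByHashCode(Task... tasks) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (Task t : tasks) {
            counts.merge(t.hashCode(), 1, Integer::sum);
        }
        return counts.size();
    }

    /** IdentityHashMap compares keys with ==, so only the very same object is merged. */
    static int groupsByIdentity(Task... tasks) {
        Map<Task, Integer> counts = new IdentityHashMap<>();
        for (Task t : tasks) {
            counts.merge(t, 1, Integer::sum);
        }
        return counts.size();
    }

    /** Keying on a string such as the region name merges by content instead. */
    static int groupsByRegionName(Task... tasks) {
        Map<String, Integer> counts = new HashMap<>();
        for (Task t : tasks) {
            counts.merge(t.regionName, 1, Integer::sum);
        }
        return counts.size();
    }

    public static void main(String[] args) {
        Task a = new Task("region-a");
        Task b = new Task("region-b");
        System.out.println("by hashCode:    " + groupsByHashCode(a, b, a));
        System.out.println("by identity:    " + groupsByIdentity(a, b, a));
        System.out.println("by region name: " + groupsByRegionName(a, b, a));
    }
}
```

Identity and region name give different groupings when two separate task objects refer to the same region, so the choice depends on what should count as "the same task".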
                

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184562#comment-13184562 ] 

Zhihong Yu commented on HBASE-5174:
-----------------------------------

I think the MonitoredTask display should be placed under the region server section.
                

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185175#comment-13185175 ] 

Zhihong Yu commented on HBASE-5174:
-----------------------------------

Considering that an ABORTED task is cleaned up after 1 minute, I wonder whether the added complexity is worth it.
                

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183653#comment-13183653 ] 

Zhihong Yu commented on HBASE-5174:
-----------------------------------

I think this issue is similar to HBASE-5136 in that TaskMonitor.get().createStatus() is called imprudently.
We can store MonitoredTask for flushcache() as a field in HRegion and reuse it.
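
A toy sketch of the reuse idea, to show its effect on the number of registered tasks. Monitor, Status and Region here are simplified, hypothetical stand-ins and do not reproduce the real TaskMonitor, MonitoredTask or HRegion code.

```java
import java.util.ArrayList;
import java.util.List;

final class ReuseStatusSketch {
    /** Hypothetical stand-in for a monitored task. */
    static final class Status {
        final String description;
        String state = "RUNNING";
        String message = "";

        Status(String description) {
            this.description = description;
        }

        void abort(String why) {
            state = "ABORTED";
            message = why;
        }

        void markComplete(String msg) {
            state = "COMPLETE";
            message = msg;
        }
    }

    /** Hypothetical stand-in for the monitor: every createStatus() call adds a displayed row. */
    static final class Monitor {
        final List<Status> tasks = new ArrayList<>();

        Status createStatus(String description) {
            Status s = new Status(description);
            tasks.add(s);
            return s;
        }
    }

    /** Hypothetical region that keeps its flush status in a field and reuses it. */
    static final class Region {
        private final Monitor monitor;
        private final String name;
        private Status flushStatus; // created once, on the first flush attempt
        boolean writesEnabled = false;

        Region(Monitor monitor, String name) {
            this.monitor = monitor;
            this.name = name;
        }

        void flushcache() {
            if (flushStatus == null) {
                flushStatus = monitor.createStatus("Flushing " + name);
            }
            if (!writesEnabled) {
                flushStatus.abort("Not flushing since writes not enabled");
                return;
            }
            flushStatus.markComplete("Flush successful");
        }
    }

    public static void main(String[] args) {
        Monitor monitor = new Monitor();
        Region region = new Region(monitor, "test1");
        for (int i = 0; i < 1000; i++) {
            region.flushcache();
        }
        System.out.println("rows in monitor: " + monitor.tasks.size());
    }
}
```

Note that with a single reused status, a later successful flush overwrites the ABORTED state, so the earlier aborts are no longer visible.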
                

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Jean-Daniel Cryans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184520#comment-13184520 ] 

Jean-Daniel Cryans commented on HBASE-5174:
-------------------------------------------

As in HBASE-5136, I think we need to know that something was aborted. Overwriting the task would make it seem that nothing is going wrong. Coalescing can then be added on top so that you see only one aborted entry rather than a flood.
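
One possible shape for such coalescing, as a rough sketch. CoalescedEntry and CoalescingTaskList are hypothetical names, the repeat counter is an added assumption, and none of this is the TaskMonitor implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical display row: one description/state pair plus a repeat counter.
class CoalescedEntry {
    final String description;
    final String state;
    int count = 1;

    CoalescedEntry(String description, String state) {
        this.description = description;
        this.state = state;
    }
}

// Hypothetical task list that merges consecutive ABORTED rows for the same task.
class CoalescingTaskList {
    private final List<CoalescedEntry> entries = new ArrayList<>();

    void record(String description, String state) {
        if (!entries.isEmpty() && state.equals("ABORTED")) {
            CoalescedEntry last = entries.get(entries.size() - 1);
            if (last.state.equals("ABORTED") && last.description.equals(description)) {
                last.count++; // same task aborted again: bump the counter, add no row
                return;
            }
        }
        entries.add(new CoalescedEntry(description, state));
    }

    List<CoalescedEntry> entries() {
        return entries;
    }
}
```

With this approach the aborted row stays visible, and a later COMPLETE row for the same task is appended after it rather than replacing it.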
                

        

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184547#comment-13184547 ] 

Zhihong Yu commented on HBASE-5174:
-----------------------------------

Looks like I didn't take the State of MonitoredTask into account.
Personally, I think seeing only the latest status for a MonitoredTask is fine. To dig deeper, the log is always the place to check.

Map<Class, Map<String, Map<MonitoredTask.State, MonitoredTask>>> would likely confuse people reading the code :-)
                

        

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Jean-Daniel Cryans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185697#comment-13185697 ] 

Jean-Daniel Cryans commented on HBASE-5174:
-------------------------------------------

bq. Failed or aborted tasks should not be displayed after the retry has succeeded. Otherwise, will it cause confusion?

I'd rather know that something went wrong, and since the list is ordered by time you can see that it eventually succeeds.
                

        

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183851#comment-13183851 ] 

Zhihong Yu commented on HBASE-5174:
-----------------------------------

Maybe we can add the following method to TaskMonitor:
{code}
  public MonitoredTask createStatus(String description, Object obj) {
{code}
TaskMonitor can maintain Map<Class, Map<Integer, MonitoredTask>> where the Class key is the class of obj and the Integer key is obj.hashCode().
This way we keep the current usage pattern while reducing redundancy.
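
A rough sketch of how such a two-level map could behave. KeyedTask and KeyedTaskMonitor are hypothetical stand-ins rather than HBase classes, and updating the description on reuse is an assumption:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for MonitoredTask; not the HBase interface.
class KeyedTask {
    String description;

    KeyedTask(String description) {
        this.description = description;
    }
}

// Hypothetical monitor that hands back the same task for the same owner object.
class KeyedTaskMonitor {
    // Outer key: class of the owner object. Inner key: owner.hashCode().
    private final Map<Class<?>, Map<Integer, KeyedTask>> tasks = new HashMap<>();

    KeyedTask createStatus(String description, Object obj) {
        Map<Integer, KeyedTask> perClass =
            tasks.computeIfAbsent(obj.getClass(), c -> new HashMap<>());
        KeyedTask existing = perClass.get(obj.hashCode());
        if (existing != null) {
            existing.description = description; // reuse the task instead of adding one
            return existing;
        }
        KeyedTask created = new KeyedTask(description);
        perClass.put(obj.hashCode(), created);
        return created;
    }

    int size() {
        int total = 0;
        for (Map<Integer, KeyedTask> perClass : tasks.values()) {
            total += perClass.size();
        }
        return total;
    }
}
```

One caveat of keying on hashCode(): two distinct objects of the same class that happen to share a hash code would share one task in this sketch.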
                

        

[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186527#comment-13186527 ] 

Ted Yu commented on HBASE-5174:
-------------------------------

A slight variation of my previous proposal:
MonitoredTaskImpl can maintain Map<String, MonitoredTask> where the String key is the description passed to TaskMonitor.createStatus(), prefixed with the MonitoredTask.State and a separator string (such as '||').

A task may have two entries in the map, one starting with 'ABORTED', the other starting with 'COMPLETE'. This corresponds to task retries.
Special handling would be added to MonitoredTaskImpl.setState().
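
To illustrate the keying scheme only, here is a small sketch. StateKeyedTask and StateKeyedRegistry are hypothetical names, the occurrence counter is an added assumption, and the real setState() handling is not specified in this thread:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a monitored task; not the HBase class.
class StateKeyedTask {
    final String description;
    final String state;
    int occurrences = 1;

    StateKeyedTask(String description, String state) {
        this.description = description;
        this.state = state;
    }
}

// Hypothetical registry keyed by "<state>||<description>".
class StateKeyedRegistry {
    static final String SEPARATOR = "||";
    private final Map<String, StateKeyedTask> byKey = new LinkedHashMap<>();

    static String key(String state, String description) {
        return state + SEPARATOR + description;
    }

    // Records that the described task reached the given state.
    StateKeyedTask setState(String description, String state) {
        String k = key(state, description);
        StateKeyedTask task = byKey.get(k);
        if (task == null) {
            task = new StateKeyedTask(description, state);
            byKey.put(k, task);
        } else {
            task.occurrences++; // repeated state for the same task: no new entry
        }
        return task;
    }

    Map<String, StateKeyedTask> entries() {
        return byKey;
    }
}
```

Under these assumptions, a task that aborts many times and then succeeds on retry ends up with exactly two entries, one per state.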
                