Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2011/01/27 00:42:46 UTC

[jira] Created: (HBASE-3483) No soft flush trigger on global memstore limit

No soft flush trigger on global memstore limit
----------------------------------------------

                 Key: HBASE-3483
                 URL: https://issues.apache.org/jira/browse/HBASE-3483
             Project: HBase
          Issue Type: Bug
          Components: performance, regionserver
    Affects Versions: 0.90.0
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
            Priority: Critical
             Fix For: 0.90.1


I think this is the reason people see long blocking periods under write load.

Currently when we hit the global memstore limit, we call reclaimMemStoreMemory() which is synchronized - thus everyone has to wait until the memory has flushed down to the low water mark. This causes every writer to block for 10-15 seconds on a large heap.

Instead we should start triggering flushes (in another thread) whenever we're above the low water mark. Then only block writers when we're above the high water mark.
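
The proposal above could be sketched roughly as follows. This is an illustrative Java sketch under stated assumptions, not the attached patch and not HBase code: the class WatermarkFlusher, its fields, and the flushChunk parameter are hypothetical names, and decrementing a counter stands in for actually flushing a region. It only shows the shape of the idea: a write that finds the global size above the low water mark asks a background thread to flush and carries on, and a write blocks only when the size is above the high water mark.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not HBase code: soft (asynchronous) flush trigger above
// the low water mark, hard block for writers only above the high water mark.
final class WatermarkFlusher {
    enum Action { NONE, ASYNC_FLUSH, BLOCK }

    private final long lowMark;
    private final long highMark;
    private final long flushChunk;   // bytes released per simulated flush (assumption)
    private final AtomicLong globalSize = new AtomicLong();
    private final AtomicBoolean flushPending = new AtomicBoolean();
    private final Object belowHigh = new Object();
    private final ExecutorService flusher = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "flusher");
        t.setDaemon(true);           // do not keep the JVM alive
        return t;
    });

    WatermarkFlusher(long lowMark, long highMark, long flushChunk) {
        if (lowMark <= 0 || highMark < lowMark || flushChunk <= 0) {
            throw new IllegalArgumentException("need 0 < low <= high and flushChunk > 0");
        }
        this.lowMark = lowMark;
        this.highMark = highMark;
        this.flushChunk = flushChunk;
    }

    // Pure policy: what a writer should do given the current global size.
    Action decide(long size) {
        if (size > highMark) return Action.BLOCK;
        if (size > lowMark) return Action.ASYNC_FLUSH;
        return Action.NONE;
    }

    long size() {
        return globalSize.get();
    }

    void write(long bytes) {
        Action action = decide(globalSize.get());
        if (action != Action.NONE) {
            requestFlush();          // never blocks the caller
        }
        if (action == Action.BLOCK) {
            waitBelowHighMark();     // only this path makes a writer wait
        }
        globalSize.addAndGet(bytes);
    }

    private void waitBelowHighMark() {
        synchronized (belowHigh) {
            while (globalSize.get() > highMark) {
                requestFlush();      // re-arm in case the flusher just went idle
                try {
                    belowHigh.wait(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while blocked", e);
                }
            }
        }
    }

    // At most one flush task is queued or running at a time.
    private void requestFlush() {
        if (flushPending.compareAndSet(false, true)) {
            flusher.execute(this::flushToLowMark);
        }
    }

    // Runs on the flusher thread; subtracting bytes stands in for a region flush.
    private void flushToLowMark() {
        try {
            while (globalSize.get() > lowMark) {
                globalSize.addAndGet(-Math.min(flushChunk, globalSize.get()));
                synchronized (belowHigh) {
                    belowHigh.notifyAll();
                }
            }
        } finally {
            flushPending.set(false);
        }
    }

    public static void main(String[] args) {
        WatermarkFlusher f = new WatermarkFlusher(100, 200, 50);
        for (int i = 0; i < 20; i++) {
            f.write(30);
        }
        System.out.println("size after 20 writes: " + f.size());
    }
}
```

In this sketch a writer between the two marks never waits, and the compareAndSet on flushPending keeps many concurrent writers from queueing more than one flush task. Whether the real patch coordinates writers and the flusher this way is not stated in this thread.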

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HBASE-3483:
-------------------------------

    Attachment: hbase-3483.txt

Here's a patch which may or may not work (I tested something like this and it fixed a lot of the blocking behavior, but this isn't exactly the same patch). Will keep working on it.

[jira] Commented: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988892#comment-12988892 ] 

Todd Lipcon commented on HBASE-3483:
------------------------------------

I ran this patch under heavy load on a cluster over the weekend, and it seems to work well.

Running unit tests now.

[jira] Commented: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988934#comment-12988934 ] 

Todd Lipcon commented on HBASE-3483:
------------------------------------

FlushQueueEntry -> FlushRegionEntry

[jira] Commented: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989218#comment-12989218 ] 

Hudson commented on HBASE-3483:
-------------------------------

Integrated in HBase-TRUNK #1726 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1726/])
    HBASE-3483 Memstore lower limit should trigger asynchronous flushes


[jira] Commented: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988937#comment-12988937 ] 

Todd Lipcon commented on HBASE-3483:
------------------------------------

Unit tests passed except for TestMasterFailover, which can't be related to this change (and which I've also seen fail on trunk).

Will commit momentarily. Do you think this belongs in 0.90.1 or just trunk?

[jira] Commented: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987308#action_12987308 ] 

Jonathan Gray commented on HBASE-3483:
--------------------------------------

Nice catch!

[jira] Commented: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988984#comment-12988984 ] 

Jonathan Gray commented on HBASE-3483:
--------------------------------------

This is a pretty ugly bug, I say branch and trunk.

[jira] Commented: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988918#comment-12988918 ] 

stack commented on HBASE-3483:
------------------------------

+1 on commit.

What changed here? Tab for spaces?

{code}
-  private final Map<HRegion, FlushQueueEntry> regionsInQueue =
-    new HashMap<HRegion, FlushQueueEntry>();
+  private final Map<HRegion, FlushRegionEntry> regionsInQueue =
+    new HashMap<HRegion, FlushRegionEntry>();
+  private AtomicBoolean wakeupPending = new AtomicBoolean();
{code}

[jira] Updated: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HBASE-3483:
-------------------------------

    Attachment: hbase-3483.txt

Slightly more cleaned up.

[jira] Resolved: (HBASE-3483) No soft flush trigger on global memstore limit

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved HBASE-3483.
--------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

Committed to branch and trunk
