Posted to issues@hbase.apache.org by "Amitanand Aiyer (JIRA)" <ji...@apache.org> on 2013/06/26 17:26:20 UTC

[jira] [Commented] (HBASE-8228) Investigate time taken to snapshot memstore

    [ https://issues.apache.org/jira/browse/HBASE-8228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694028#comment-13694028 ] 

Amitanand Aiyer commented on HBASE-8228:
----------------------------------------

It looks like this happens when we are using multiple memstore-flusher threads and a log roll lands between two flush requests.

To flush a region, we grab the HRegion.updatesLock.writeLock(), and then try to grab the HLog.cacheFlushLock.readLock(). Most of the work done while holding these locks happens in memory, so the hold time should be short ... unless we are stuck waiting to acquire the second lock.

HLog.rollWriter tries to grab the HLog.cacheFlushLock.writeLock(). This means that a log roll cannot proceed while a flush is already in progress.

If a second flush is initiated while a flush is already in progress and a log roll is waiting for the writer's lock, the second flush is able to get the HRegion.updatesLock.writeLock() (presumably for a different region), but it will stall on the HLog.cacheFlushLock.readLock(). This is because the read-write lock implementation, which uses NonfairSync, makes new readers wait behind a writer's request when that writer is at the head of the queue.
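
The reader-behind-queued-writer behavior can be reproduced outside HBase. The following is a minimal sketch, not HBase code: it assumes the lock is a java.util.concurrent.locks.ReentrantReadWriteLock in its default nonfair mode, and the class, method, and thread names are hypothetical stand-ins for the first flush, the log roll, and the second flush.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class QueuedWriterBlocksReaders {

    /** Returns true if a second reader got the read lock while a writer was queued. */
    static boolean secondReaderAcquires() {
        // Stand-in for HLog.cacheFlushLock (assumption: default nonfair mode).
        ReentrantReadWriteLock cacheFlushLock = new ReentrantReadWriteLock();
        AtomicBoolean acquired = new AtomicBoolean(false);

        // "First flush": the calling thread holds the read lock.
        cacheFlushLock.readLock().lock();
        try {
            // "Log roll": queues up for the write lock behind the first flush.
            Thread logRoll = new Thread(() -> {
                cacheFlushLock.writeLock().lock();
                cacheFlushLock.writeLock().unlock();
            });
            logRoll.setDaemon(true);
            logRoll.start();
            while (!cacheFlushLock.hasQueuedThread(logRoll)) {
                Thread.sleep(1);
            }
            Thread.sleep(50); // let the writer settle at the head of the queue

            // "Second flush": a different thread asks for the read lock.
            Thread secondFlush = new Thread(() -> {
                try {
                    // The timed tryLock honors the queue; the untimed tryLock() would barge.
                    if (cacheFlushLock.readLock().tryLock(500, TimeUnit.MILLISECONDS)) {
                        acquired.set(true);
                        cacheFlushLock.readLock().unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            secondFlush.start();
            secondFlush.join();

            cacheFlushLock.readLock().unlock(); // first flush ends, log roll proceeds
            logRoll.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired.get();
    }

    public static void main(String[] args) {
        System.out.println("second reader acquired: " + secondReaderAcquires());
    }
}
```

Even though only a read lock is held, the second reader times out because the queued writer is at the head of the wait queue.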

This interleaving results in the second flush request holding the HRegion.updatesLock.writeLock() for as long as it takes the first thread to finish flushing its region, plus the time for the log roll.

Swapping the order of acquiring the HRegion.updatesLock.writeLock() and calling startCacheFlush should probably fix this issue. Reducing the number of memstore flusher threads to at most 1 would also avoid this behavior.
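
To illustrate why the ordering matters, here is a rough simulation of both orderings. It is only a sketch under assumptions: both locks are modeled as plain ReentrantReadWriteLock instances, startCacheFlush is reduced to taking the cacheFlushLock read lock, and FlushLockOrder and updatesLockHoldMillis are hypothetical names that do not exist in HBase.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FlushLockOrder {

    /**
     * Simulates a second flush that starts while a first flush holds the
     * cacheFlushLock read lock and a log roll is queued for the write lock.
     * Returns how long (ms) the second flush held the updatesLock write lock.
     */
    static long updatesLockHoldMillis(boolean cacheFlushLockFirst, long firstFlushMillis) {
        ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();    // models HRegion.updatesLock
        ReentrantReadWriteLock cacheFlushLock = new ReentrantReadWriteLock(); // models HLog.cacheFlushLock
        long[] heldMillis = new long[1];
        try {
            cacheFlushLock.readLock().lock(); // first flush in progress (this thread)

            Thread logRoll = new Thread(() -> {
                cacheFlushLock.writeLock().lock(); // waits for the first flush
                cacheFlushLock.writeLock().unlock();
            });
            logRoll.setDaemon(true);
            logRoll.start();
            while (!cacheFlushLock.hasQueuedThread(logRoll)) {
                Thread.sleep(1);
            }
            Thread.sleep(50); // let the writer settle at the head of the queue

            Thread secondFlush = new Thread(() -> {
                long start;
                if (cacheFlushLockFirst) {
                    // Proposed order: do the potentially slow wait before blocking updates.
                    cacheFlushLock.readLock().lock();
                    updatesLock.writeLock().lock();
                    start = System.nanoTime();
                } else {
                    // Order described in the comment: updates are blocked during the wait.
                    updatesLock.writeLock().lock();
                    start = System.nanoTime();
                    cacheFlushLock.readLock().lock();
                }
                // The in-memory snapshot would happen here.
                heldMillis[0] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                updatesLock.writeLock().unlock();
                cacheFlushLock.readLock().unlock();
            });
            secondFlush.start();

            Thread.sleep(firstFlushMillis);     // first flush keeps running
            cacheFlushLock.readLock().unlock(); // first flush ends; log roll, then second flush
            secondFlush.join();
            logRoll.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return heldMillis[0];
    }

    public static void main(String[] args) {
        System.out.println("updatesLock first:    " + updatesLockHoldMillis(false, 300) + " ms");
        System.out.println("cacheFlushLock first: " + updatesLockHoldMillis(true, 300) + " ms");
    }
}
```

In this model the second flush still waits for the first flush and the log roll either way; the swap only moves that wait outside the window in which updates to the region are blocked.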
                
> Investigate time taken to snapshot memstore
> -------------------------------------------
>
>                 Key: HBASE-8228
>                 URL: https://issues.apache.org/jira/browse/HBASE-8228
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>            Priority: Minor
>             Fix For: 0.89-fb
>
>
> Snapshotting memstores is normally quick. But, sometimes it seems to take long. This JIRA is to track the investigation and fix to improve the outliers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira