Posted to issues@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2016/03/05 02:34:40 UTC

[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

    [ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181406#comment-15181406 ] 

Lars Hofhansl commented on HBASE-13082:
---------------------------------------

So I found a deceptively simple way of (1) avoiding the locks in every method, (2) avoiding resetting the scanner stack, (3) avoiding an extra cleanup chore:
# At StoreScanner creation time create a CountDownLatch initialized with 1.
# When a StoreScanner is closed, call countDown on the latch.
# When the compaction calls updateReaders in the StoreScanner it will block until the scanner is done.

So we have an easy way to ensure that no compaction can finish while a scanner referring to its store files is running. That also means we no longer need to reset the scanner stack (a compaction finishes only after the scanner is closed), and lastly the compactions will naturally stack up behind the scanners and eventually run.
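The three steps above could be sketched roughly like this (class and method names here are invented for illustration; they mirror StoreScanner/updateReaders but this is not the actual HBase code):

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of the latch scheme; not the real StoreScanner.
class SketchStoreScanner {
    // Step 1: created with count 1 when the scanner is opened.
    private final CountDownLatch closedLatch = new CountDownLatch(1);

    // Step 2: closing the scanner releases any waiting compaction.
    void close() {
        closedLatch.countDown();
    }

    // Step 3: a compaction calling updateReaders blocks here until the
    // scanner is closed, so it can never swap the store files out from
    // under a running scan.
    void updateReaders() throws InterruptedException {
        closedLatch.await();
    }

    boolean isClosed() {
        return closedLatch.getCount() == 0;
    }
}

public class LatchSketch {
    public static void main(String[] args) throws Exception {
        SketchStoreScanner scanner = new SketchStoreScanner();

        // Simulated compaction: stacks up behind the open scanner.
        Thread compaction = new Thread(() -> {
            try {
                scanner.updateReaders();
                System.out.println("compaction proceeded");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        compaction.start();

        Thread.sleep(100);           // compaction is parked on the latch
        System.out.println("scanner closing");
        scanner.close();             // releases the compaction
        compaction.join();
    }
}
```

Note there is no lock on the scanner's read path at all; the only synchronization is the latch, and the compaction side simply waits.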

The risk is that a scanner is not properly closed and we never count down the latch, blocking compactions forever. I have not seen this happen in my tests, and every scanner is guarded by a lease, so eventually it will be closed. The other risk is that on a busy system we might simply never get any compactions through.

Comments? Happy to attach a patch (or open another jira for this).

[~stack], [~ram_krish], [~anoop.hbase], [~apurtell]


> Coarsen StoreScanner locks to RegionScanner
> -------------------------------------------
>
>                 Key: HBASE-13082
>                 URL: https://issues.apache.org/jira/browse/HBASE-13082
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance, Scanners
>            Reporter: Lars Hofhansl
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0
>
>         Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized.
> * Implementors of coprocessors need to be diligent in following the locking contract. For example, Phoenix does not lock RegionScanner.nextRaw(), as the documentation requires (not picking on Phoenix, this one is my fault as I told them it's OK).
> * Possible starvation of flushes and compactions under heavy read load: RegionScanner operations would keep getting the locks, and the flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)