Posted to issues@hbase.apache.org by "Gregory Chanan (JIRA)" <ji...@apache.org> on 2012/12/03 20:33:59 UTC

[jira] [Created] (HBASE-7263) Investigate more fine grain locking for checkAndPut/append/increment

Gregory Chanan created HBASE-7263:
-------------------------------------

             Summary: Investigate more fine grain locking for checkAndPut/append/increment
                 Key: HBASE-7263
                 URL: https://issues.apache.org/jira/browse/HBASE-7263
             Project: HBase
          Issue Type: Improvement
          Components: Transactions/MVCC
            Reporter: Gregory Chanan
            Assignee: Gregory Chanan
            Priority: Minor


HBASE-7051 lists 3 options for fixing an ACID-violation wrt checkAndPut:
{quote}
1) Waiting for the MVCC to advance for read/updates: the downside is that you have to wait for updates on other rows.
2) Have an MVCC per-row (table configuration): this avoids the unnecessary contention of 1)
3) Transform the read/updates to write-only with rollup on read.. E.g. an increment would just have the number of values to increment.
{quote}

HBASE-7051 and HBASE-4583 implement option #1.  The downside, as mentioned, is that you have to wait for updates on other rows, since MVCC is per-region.

Another option occurred to me that I think is worth investigating: rely on a row-level read/write lock rather than MVCC.

Here is pseudo-code for what exists today for read/updates like checkAndPut:
{code}
(1)  Acquire RowLock
(1a) BeginMVCC + Finish MVCC
(2)  Begin MVCC
(3)  Do work
(4)  Release RowLock
(5)  Append to WAL
(6)  Finish MVCC
{code}

Write-only operations (e.g. puts) are the same, just without step 1a.
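
To make the cost of step 1a concrete, here is a toy model of a single region-wide MVCC. It is a sketch under assumptions, not HBase's actual MVCC class: the names RegionMvccSketch, WriteEntry, begin and finish are made up for illustration, and finish reports whether the caller would have to wait instead of actually blocking. It shows that a begin+finish pair issued for one row cannot become visible until an earlier, unrelated transaction on another row has finished.

```java
import java.util.ArrayDeque;

/** Toy model of one region-wide MVCC (hypothetical, not HBase's class). */
final class RegionMvccSketch {
    /** One in-flight transaction, tagged with the row it touches. */
    static final class WriteEntry {
        final long writeNumber;
        final String row;      // kept only to show that the row plays no role
        boolean completed;

        WriteEntry(long writeNumber, String row) {
            this.writeNumber = writeNumber;
            this.row = row;
        }
    }

    private final ArrayDeque<WriteEntry> inFlight = new ArrayDeque<>();
    private long nextWriteNumber = 0;
    private long readPoint = 0;

    /** "Begin MVCC": hand out the next write number, whatever the row. */
    synchronized WriteEntry begin(String row) {
        WriteEntry e = new WriteEntry(++nextWriteNumber, row);
        inFlight.addLast(e);
        return e;
    }

    /**
     * "Finish MVCC": mark e complete and advance the read point over the
     * contiguous prefix of completed entries. Returns true if e is visible
     * now; false means a real implementation would block here.
     */
    synchronized boolean finish(WriteEntry e) {
        e.completed = true;
        while (!inFlight.isEmpty() && inFlight.peekFirst().completed) {
            readPoint = inFlight.removeFirst().writeNumber;
        }
        return readPoint >= e.writeNumber;
    }

    synchronized long readPoint() {
        return readPoint;
    }

    public static void main(String[] args) {
        RegionMvccSketch mvcc = new RegionMvccSketch();
        WriteEntry putOnRowA = mvcc.begin("rowA");   // unrelated put, still in flight
        WriteEntry step1a = mvcc.begin("rowB");      // step 1a of an increment on rowB
        System.out.println("increment may proceed: " + mvcc.finish(step1a));
        mvcc.finish(putOnRowA);                      // only now can rowB's entry become visible
        System.out.println("read point: " + mvcc.readPoint());
    }
}
```

In this model the increment on rowB is held up only by the put on rowA, which is the cross-row wait that the rest of this issue tries to avoid.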

Now, consider the following instead:
{code}
(1)  Acquire RowLock
(1a) Grab+Release RowWriteLock (instead of BeginMVCC + Finish MVCC)
(1b) Grab RowReadLock (new step!)
(2)  Begin MVCC
(3)  Do work
(4)  Release RowLock
(5)  Append to WAL
(6)  Finish MVCC
(7)  Release RowReadLock (new step!)
{code}

As before, write-only operations are the same, just without step 1a.

The difference here is that writes grab a row-level read lock and hold it until the MVCC is completed.  The nice property that this gives you is that read/updates can tell when the MVCC is done on a per-row basis, because they can just try to acquire the write lock, which will block until the MVCC is completed for that row, i.e. until the read lock is released in step 7.

There is overhead for acquiring the read lock that I need to measure, but it should be small, since acquiring the row-level read lock never blocks.  The read lock can only block if someone else holds the write lock, but both locks are only acquired under the row lock, and the write lock is released again (step 1a) before the row lock is.
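
The proposed steps can be sketched with the standard java.util.concurrent.locks classes. This is a minimal illustration under assumptions, not the actual patch: RowLockSketch, mvccGuard, put, increment and apply are hypothetical names, a single long stands in for the row's cells, and the MVCC and WAL steps are left as comments.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongUnaryOperator;

/** Hypothetical per-row lock pair following steps (1) through (7). */
final class RowLockSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final ReentrantReadWriteLock mvccGuard = new ReentrantReadWriteLock();
    private long value;   // stand-in for the row's data, guarded by rowLock

    /** Write-only operation: skips step 1a. */
    void put(long newValue) {
        apply(false, old -> newValue);
    }

    /** Read/update operation: first waits out in-flight writes on this row. */
    long increment(long delta) {
        return apply(true, old -> old + delta);
    }

    private long apply(boolean readsRow, LongUnaryOperator update) {
        boolean readHeld = false;
        rowLock.lock();                                   // (1)  Acquire RowLock
        try {
            long result;
            try {
                if (readsRow) {
                    mvccGuard.writeLock().lock();         // (1a) blocks until every earlier
                    mvccGuard.writeLock().unlock();       //      writer on this row reached (7)
                }
                mvccGuard.readLock().lock();              // (1b) never blocks: no writer can
                readHeld = true;                          //      hold the write lock here
                // (2) Begin MVCC would go here.
                value = update.applyAsLong(value);        // (3)  Do work
                result = value;
            } finally {
                rowLock.unlock();                         // (4)  Release RowLock
            }
            // (5) Append to WAL and (6) Finish MVCC would go here.
            return result;
        } finally {
            if (readHeld) {
                mvccGuard.readLock().unlock();            // (7)  Release RowReadLock
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RowLockSketch row = new RowLockSketch();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    row.increment(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(row.increment(0));   // total after all workers finish
    }
}
```

Because the write lock is only ever taken and released while the row lock is held, a thread at step 1b cannot find it held, which is the argument for the read-lock acquisition being cheap.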

I ran a quick test of this approach over a region (this directly interacts with HRegion, so no client effects):
- 30 threads
- 5000 increments per thread
- 30 columns per increment
- Each increment uniformly distributed over 500,000 rows
- 5 trials

Better-Than-Theoretical-Max: (No locking or MVCC on step 1a): 10362.2 ms
Today: 13950 ms
The locking approach: 10877 ms

So it looks like an improvement, at least wrt increment.  As mentioned, I need to measure the overhead of acquiring the read lock for puts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-7263) Investigate more fine grained locking for checkAndPut/append/increment

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509422#comment-13509422 ] 

Lars Hofhansl commented on HBASE-7263:
--------------------------------------

Interesting!

This looks right to me:
* multiple Puts can be in the section guarded by the row read lock
* the readlock covers the MVCC section
* an increment will wait out all concurrent Puts for the same row by attempting to take the row write lock

Do these new locks time out like original RowLock?
This has the potential that Puts starve out Increments, right? (not a big deal, though)

Also, I'd be a bit skeptical about another mechanism for change visibility.

Do the results even get better if you spread the increments out over fewer rows?
                

[jira] [Updated] (HBASE-7263) Investigate more fine grained locking for checkAndPut/append/increment

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-7263:
--------------------------

    Summary: Investigate more fine grained locking for checkAndPut/append/increment  (was: Investigate more fine grain locking for checkAndPut/append/increment)

I think this approach is worth pursuing.
                

[jira] [Commented] (HBASE-7263) Investigate more fine grained locking for checkAndPut/append/increment

Posted by "Gregory Chanan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509963#comment-13509963 ] 

Gregory Chanan commented on HBASE-7263:
---------------------------------------

I should also mention that, in the above numbers, I'm not doing real memory management: I create the ReadWriteRowLock when needed and never delete it.  I need to do something smarter there, which may have a performance impact.
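
One possible shape for "something smarter" is a reference-counted lock table that drops a row's lock once nobody is using it. This is only a sketch of that idea, not what the prototype does: RowLockTable, acquire, release and size are hypothetical names, and String keys stand in for HBase's byte[] row keys.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical table of per-row read/write locks, removed when unused. */
final class RowLockTable {
    private static final class Entry {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        int refs;   // only touched inside compute(), which is atomic per key
    }

    private final ConcurrentHashMap<String, Entry> locks = new ConcurrentHashMap<>();

    /** Returns the row's lock, creating it if needed, and counts the caller as a user. */
    ReentrantReadWriteLock acquire(String row) {
        return locks.compute(row, (key, entry) -> {
            if (entry == null) {
                entry = new Entry();
            }
            entry.refs++;
            return entry;
        }).lock;
    }

    /** Drops one user; the entry is removed when the last user is gone. */
    void release(String row) {
        locks.compute(row, (key, entry) -> {
            if (entry == null) {
                throw new IllegalStateException("release without acquire: " + key);
            }
            entry.refs--;
            return entry.refs == 0 ? null : entry;   // returning null removes the mapping
        });
    }

    /** Number of rows that currently have a lock object. */
    int size() {
        return locks.size();
    }
}
```

Each operation would call acquire before step 1a/1b and release after step 7, so the extra map update is exactly the kind of cost that would need to be measured.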
                

[jira] [Commented] (HBASE-7263) Investigate more fine grain locking for checkAndPut/append/increment

Posted by "Gregory Chanan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509299#comment-13509299 ] 

Gregory Chanan commented on HBASE-7263:
---------------------------------------

Did a test of puts.

- 30 threads
- 5000 increments per thread
- 30 columns per increment
- Each increment uniformly distributed over 500,000 rows
- 5 trials

With new locks:
9901

Without new locks:
9790

so about a 1% difference.  I haven't tried to optimize this code at all; I suspect I could do better.

But right now it's taking a 1% performance hit on puts for a 22% performance benefit on append/increment/checkAndPut in cases that are heavily contended at the region level but not at the row level.
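
For reference, the percentages follow directly from the timings reported in this issue (13950 ms vs 10877 ms for increments, 9790 ms vs 9901 ms for puts). The small helper below just redoes that arithmetic; the class and method names are made up for illustration.

```java
/** Recomputes the relative differences from the timings quoted in this issue. */
final class LockingBenchmarkMath {
    /** Relative change of measured against baseline, in percent (negative means faster). */
    static double percentChange(double baselineMs, double measuredMs) {
        return (measuredMs - baselineMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        // Increment test: "Today" vs "The locking approach".
        System.out.printf("increment: %.1f%%%n", percentChange(13950, 10877));
        // Put test: "Without new locks" vs "With new locks".
        System.out.printf("put: %.1f%%%n", percentChange(9790, 9901));
    }
}
```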
                

[jira] [Commented] (HBASE-7263) Investigate more fine grain locking for checkAndPut/append/increment

Posted by "Gregory Chanan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509301#comment-13509301 ] 

Gregory Chanan commented on HBASE-7263:
---------------------------------------

Numbers in last comment are in ms.
                

[jira] [Commented] (HBASE-7263) Investigate more fine grained locking for checkAndPut/append/increment

Posted by "Gregory Chanan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509440#comment-13509440 ] 

Gregory Chanan commented on HBASE-7263:
---------------------------------------

bq. Do these new locks time out like original RowLock?
Not yet; the write lock should, though.

bq. This has the potential that Puts starve out Increments, right? (not a big deal, though)
I don't think so.  Consider some set of puts that are holding the readlock, but not the rowlock.  The next operation to go will be whoever grabs the rowlock next (since it is grabbed before the read or write lock), which is exactly the same as today.

bq. Also, I'd be a bit skeptical about another mechanism for change visibility.
Well, I also investigated making MVCC more aware (either making it totally per row or having it track the rows being modified and providing a "waitForRowComplete" method), but those were way too complicated :)  That's not an argument for adding another mechanism, though.  I agree we should only do this if the performance is worth it.

bq. Do the results even get better if you spread the increments out over fewer rows?
I'm not sure.  My thinking was to spread the increments over a large number of rows so that the contention is on MVCC, not on the row lock.  I thought the "Today" number for the increment test would be way worse, actually, because you'd essentially get only one increment at a time actually doing work.  I need to investigate further.  It's possible that multiple increments are doing work at the same time, though; any number can technically finish completeMemstoreInsert(beginMemstoreInsert) before another starts a new transaction with beginMemstoreInsert.
                