Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2018/04/17 21:43:00 UTC

[jira] [Created] (HBASE-20445) Defer work when a row lock is busy

Andrew Purtell created HBASE-20445:
--------------------------------------

             Summary: Defer work when a row lock is busy
                 Key: HBASE-20445
                 URL: https://issues.apache.org/jira/browse/HBASE-20445
             Project: HBase
          Issue Type: Improvement
            Reporter: Andrew Purtell


Instead of blocking on row locks, defer the call and make the call runner available so it can service other activity. Have runners pick up deferred calls in the background after servicing the other request. 

Spin briefly on tryLock() wherever we are now using lock() to acquire a row lock. Introduce two new configuration parameters: one for the amount of time to wait between lock acquisition attempts, and another for the maximum number of attempts before deferring the work. If the lock still cannot be acquired, put the call back into the call queue. Call queues therefore should be priority queues sorted by deadline. Currently they are implemented with LinkedBlockingQueue (which isn't a priority queue), or with AdaptiveLifoCoDelCallQueue (which is) if the CoDel scheduler is enabled. Perhaps we could just require use of AdaptiveLifoCoDelCallQueue. Runners pick up work from the head of the queues as long as the queues are not empty, so deferred calls will be serviced again, or dropped if their deadline has passed.
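
As a rough illustration of the bounded spin described above, here is a minimal sketch using only java.util.concurrent. The class name RowLockSpin, the method tryAcquire, and the parameters waitMillis and maxAttempts are hypothetical stand-ins for the two proposed configuration parameters; none of them are existing HBase names, and the real row lock code path is not shown.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

public class RowLockSpin {
    /**
     * Attempts a non-blocking tryLock() up to maxAttempts times, sleeping
     * waitMillis between attempts. Returns true if the caller now holds the
     * lock, false if the caller should defer the work instead of blocking.
     * Both parameters are illustrative assumptions, not HBase settings.
     */
    static boolean tryAcquire(Lock rowLock, long waitMillis, int maxAttempts)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (rowLock.tryLock()) {
                return true;
            }
            // Wait only if another attempt will follow.
            if (attempt + 1 < maxAttempts) {
                TimeUnit.MILLISECONDS.sleep(waitMillis);
            }
        }
        return false;
    }
}
```

A caller would treat a false return as the signal to put the call back into the call queue rather than block in lock().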

Implementing continuations for simple operations should be straightforward. 

Batch mutations try to acquire as many row locks as they can, then do the partial batch over the successfully locked rows, then loop back to attempt the remaining work. This is a partial implementation of what we need, so we can build on it. Rather than loop around, save the partial batch completion state and put the call, along with a pointer to that state, back into the RPC queue.
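
One way to picture the saved partial batch state is the sketch below. It is a simplification under stated assumptions: DeferredBatch and runOnce are hypothetical names, rows are plain strings, the lock map is assumed to contain a lock for every row, and each row is locked and mutated individually rather than locking a group of rows first as the existing batch code does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.function.Consumer;

public class DeferredBatch {
    private final List<String> pending;                      // rows not yet mutated
    private final List<String> applied = new ArrayList<>();  // rows already mutated

    DeferredBatch(List<String> rows) {
        this.pending = new ArrayList<>(rows);
    }

    /**
     * Makes one pass over the pending rows, applying the mutation to every
     * row whose lock is free right now. Returns true when nothing is left;
     * on false the caller would requeue the call together with this object.
     */
    boolean runOnce(Map<String, Lock> rowLocks, Consumer<String> mutation) {
        List<String> stillBusy = new ArrayList<>();
        for (String row : pending) {
            Lock lock = rowLocks.get(row);
            if (lock.tryLock()) {
                try {
                    mutation.accept(row);
                    applied.add(row);
                } finally {
                    lock.unlock();
                }
            } else {
                stillBusy.add(row);
            }
        }
        pending.clear();
        pending.addAll(stillBusy);
        return pending.isEmpty();
    }

    List<String> pending() { return pending; }
    List<String> applied() { return applied; }
}
```

The object plays the role of the "pointer to the state": a runner that later picks the call up again calls runOnce on the same instance and only touches the rows that were busy last time.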

For scans where allowPartialResults has been set to true we can simply complete the call at the point we fail to acquire a row lock. The client will handle the rest. For scans where allowPartialResults is false we have to save the scanner state and partial results, and put the call, along with a pointer to this state, back into the queue.

We could approach this in phases:

Phase 0 - Sort out the call queuing details. Do we require AdaptiveLifoCoDelCallQueue? Certainly we can make use of it. Can we also have RWQueueRpcExecutor create queues as PriorityBlockingQueue instead of LinkedBlockingQueue? There must be a reason this has not been done already.

Phase 1 - Implement deferral of simple ops only. (Batch mutations and scans will still block on row locks.)

Phase 2 - Implement deferral of batch mutations. (Scans will still block on row locks.)

Phase 3 - Implement deferral of scans where allowPartialResults is false.
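
For the Phase 0 question, a deadline-ordered queue built on java.util.concurrent.PriorityBlockingQueue might look roughly like the sketch below. DeadlineQueue, Call, and pollLive are hypothetical names for illustration only; this is not how RWQueueRpcExecutor or any HBase call queue is implemented.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class DeadlineQueue {
    /** Hypothetical stand-in for an RPC call carrying an absolute deadline. */
    static final class Call {
        final String name;
        final long deadlineMillis;

        Call(String name, long deadlineMillis) {
            this.name = name;
            this.deadlineMillis = deadlineMillis;
        }
    }

    // Earliest deadline sits at the head of the queue.
    private final PriorityBlockingQueue<Call> queue = new PriorityBlockingQueue<>(
            11, Comparator.comparingLong((Call c) -> c.deadlineMillis));

    /** Adds a new call or re-adds a deferred one. */
    void offer(Call call) {
        queue.offer(call);
    }

    /**
     * Returns the next call whose deadline has not passed, silently dropping
     * expired calls on the way. Returns null when the queue is empty.
     */
    Call pollLive(long nowMillis) {
        Call call;
        while ((call = queue.poll()) != null) {
            if (call.deadlineMillis >= nowMillis) {
                return call;
            }
        }
        return null;
    }
}
```

One difference worth keeping in mind when weighing this option: PriorityBlockingQueue is unbounded, while LinkedBlockingQueue can be given a capacity limit, so any queue-length limit would have to be enforced separately.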



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)