Posted to issues@hbase.apache.org by "Jeff Cunningham (JIRA)" <ji...@apache.org> on 2014/06/04 00:21:01 UTC

[jira] [Created] (HBASE-11295) Long running scan produces OutOfOrderScannerNextException

Jeff Cunningham created HBASE-11295:
---------------------------------------

             Summary: Long running scan produces OutOfOrderScannerNextException
                 Key: HBASE-11295
                 URL: https://issues.apache.org/jira/browse/HBASE-11295
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.96.0
            Reporter: Jeff Cunningham
         Attachments: OutOfOrderScannerNextException.tar.gz

Attached Files:

HRegionServer.java - instrumented from 0.96.1.1-cdh5.0.0
HBaseLeaseTimeoutIT.java - reproducing JUnit 4 test
WaitFilter.java - Scan filter (extends FilterBase) that overrides filterRowKey() to sleep during invocation
SpliceFilter.proto - Protobuf definition for WaitFilter.java
OutOfOrderScann_InstramentedServer.log - instrumented server log
Steps.txt - this note

Set up:

In HBaseLeaseTimeoutIT, create a scan, set the given filter on it (the filter sleeps in its overridden filterRowKey() method), and scan the table.
This is done in the test client_0x0_server_150000x10().

Here's what I'm seeing (see also attached log):

A new request comes into the server (ID 1940798815214593802 - RpcServer.handler=96). A RegionScanner is created for it and cached by ID; the cached RegionScannerHolder is then immediately looked up again and its nextCallSeq is incremented (now at 1).
The RegionScanner thread then goes to sleep in WaitFilter#filterRowKey().

A short (variable) period later, another request comes into the server (ID 8946109289649235722 - RpcServer.handler=98) and the same series of events happens for this request.

At this point both RegionScanner threads are sleeping in WaitFilter#filterRowKey(). After another period, the client retries with a scan request that thinks its next_call_seq is 0. However, HRegionServer's cached RegionScannerHolder thinks the matching RegionScanner's nextCallSeq should be 1. This mismatch is what raises the OutOfOrderScannerNextException.
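
To make the sequence-number mismatch concrete, here is a minimal, dependency-free sketch of the bookkeeping as I understand it from the instrumented log. It is not the HRegionServer code: the class and method names (ScannerSeqSketch, Holder, Server, openAndNext, next, replay) and the exception message are hypothetical stand-ins, and it assumes the retried request reaches a scanner whose holder has already been advanced to 1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the call-sequence check; NOT HBase's HRegionServer code.
final class ScannerSeqSketch {

    /** Stand-in for RegionScannerHolder: tracks the sequence number the server expects next. */
    static final class Holder {
        long nextCallSeq = 0L;
    }

    /** Stand-in for OutOfOrderScannerNextException. */
    static final class OutOfOrderException extends RuntimeException {
        OutOfOrderException(String message) {
            super(message);
        }
    }

    static final class Server {
        private final Map<Long, Holder> scanners = new HashMap<>();
        private long nextScannerId = 1L;

        /** First request: create the scanner, cache its holder by ID, then advance the holder. */
        long openAndNext() {
            long id = nextScannerId++;
            scanners.put(id, new Holder());
            scanners.get(id).nextCallSeq++; // now at 1, as in the report
            // In the report the handler thread now sleeps in WaitFilter#filterRowKey(),
            // so the client does not see a reply before it gives up and retries.
            return id;
        }

        /** Later request: verify the client's sequence number before advancing. */
        void next(long scannerId, long clientCallSeq) {
            Holder holder = scanners.get(scannerId);
            if (holder == null) {
                throw new IllegalArgumentException("unknown scanner " + scannerId);
            }
            if (clientCallSeq != holder.nextCallSeq) {
                throw new OutOfOrderException("expected nextCallSeq=" + holder.nextCallSeq
                        + " but client sent " + clientCallSeq);
            }
            holder.nextCallSeq++;
        }
    }

    /** Replays the reported sequence; returns the exception message, or null if none was thrown. */
    static String replay() {
        Server server = new Server();
        long id = server.openAndNext(); // server side: nextCallSeq becomes 1
        long staleClientSeq = 0L;       // client never saw the reply, so it still holds 0
        try {
            server.next(id, staleClientSeq);
            return null;
        } catch (OutOfOrderException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}
```

The sketch only shows that a server-side counter advanced before the client receives a reply will reject a retry that carries the old value. It does not model leases, RPC timeouts, or how the retry is matched to a cached scanner.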



--
This message was sent by Atlassian JIRA
(v6.2#6252)