You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2015/01/15 18:37:36 UTC
[jira] [Resolved] (HBASE-11295) Long running scan produces OutOfOrderScannerNextException

     [ https://issues.apache.org/jira/browse/HBASE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-11295.
------------------------------------
       Resolution: Not a Problem
    Fix Version/s:     (was: 1.1.0)
                       (was: 0.98.10)
                       (was: 2.0.0)
                       (was: 1.0.0)
         Assignee:     (was: Andrew Purtell)

Where we are now with this issue is a suggestion to have OOSNE retry the same number of configured times as other retryable IOExceptions, but without strong opinion. Therefore I'm going to resolve this as Not A Problem as the original report is of a correct and intended response. Reopen if something changes.

> Long running scan produces OutOfOrderScannerNextException
> ---------------------------------------------------------
>
>                 Key: HBASE-11295
>                 URL: https://issues.apache.org/jira/browse/HBASE-11295
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.96.0
>            Reporter: Jeff Cunningham
>            Priority: Critical
>         Attachments: OutOfOrderScannerNextException.tar.gz
>
>
> Attached Files:
> HRegionServer.java - instramented from 0.96.1.1-cdh5.0.0
> HBaseLeaseTimeoutIT.java - reproducing JUnit 4 test
> WaitFilter.java - Scan filter (extends FilterBase) that overrides filterRowKey() to sleep during invocation
> SpliceFilter.proto - Protobuf defintiion for WaitFilter.java
> OutOfOrderScann_InstramentedServer.log - instramented server log
> Steps.txt - this note
> Set up:
> In HBaseLeaseTimeoutIT, create a scan, set the given filter (which sleeps in overridden filterRowKey() method) and set it on the scan, and scan the table.
> This is done in test client_0x0_server_150000x10().
> Here's what I'm seeing (see also attached log):
> A new request comes into server (ID 1940798815214593802 - RpcServer.handler=96) and a RegionScanner is created for it, cached by ID, immediately looked up again and cached RegionScannerHolder's nextCallSeq incremeted (now at 1).
> The RegionScan thread goes to sleep in WaitFilter#filterRowKey().
> A short (variable) period later, another request comes into the server (ID 8946109289649235722 - RpcServer.handler=98) and the same series of events happen to this request.
> At this point both RegionScanner threads are sleeping in WaitFilter.filterRowKey(). After another period, the client retries another scan request which thinks its next_call_seq is 0.  However, HRegionServer's cached RegionScannerHolder thinks the matching RegionScanner's nextCallSeq should be 1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)