
Posted to issues@hbase.apache.org by "Alina GHERMAN (JIRA)" <ji...@apache.org> on 2015/12/30 10:45:49 UTC

[jira] [Commented] (HBASE-11295) Long running scan produces OutOfOrderScannerNextException

    [ https://issues.apache.org/jira/browse/HBASE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074865#comment-15074865 ] 

Alina GHERMAN commented on HBASE-11295:
---------------------------------------

I tried the "increasing the RPC timeout" solution, but now the job just stops (no error).

The only log output: Query 1e4f1be62f3e7791:70e065374c943880: 0% Complete (0 out of 205)

Note: I increased the RPC timeout by setting hbase.rpc.timeout to 30 seconds.


> Long running scan produces OutOfOrderScannerNextException
> ---------------------------------------------------------
>
>                 Key: HBASE-11295
>                 URL: https://issues.apache.org/jira/browse/HBASE-11295
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.96.0
>            Reporter: Jeff Cunningham
>            Priority: Critical
>         Attachments: OutOfOrderScannerNextException.tar.gz
>
>
> Attached Files:
> HRegionServer.java - instrumented from 0.96.1.1-cdh5.0.0
> HBaseLeaseTimeoutIT.java - reproducing JUnit 4 test
> WaitFilter.java - Scan filter (extends FilterBase) that overrides filterRowKey() to sleep during invocation
> SpliceFilter.proto - Protobuf definition for WaitFilter.java
> OutOfOrderScann_InstramentedServer.log - instrumented server log
> Steps.txt - this note
> Set up:
> In HBaseLeaseTimeoutIT, create a scan, set the given filter on it (the filter sleeps in its overridden filterRowKey() method), and scan the table.
> This is done in test client_0x0_server_150000x10().
> Here's what I'm seeing (see also attached log):
> A new request comes into the server (ID 1940798815214593802 - RpcServer.handler=96) and a RegionScanner is created for it, cached by ID, and immediately looked up again, and the cached RegionScannerHolder's nextCallSeq is incremented (now at 1).
> The RegionScan thread goes to sleep in WaitFilter#filterRowKey().
> A short (variable) period later, another request comes into the server (ID 8946109289649235722 - RpcServer.handler=98) and the same series of events happens to this request.
> At this point both RegionScanner threads are sleeping in WaitFilter.filterRowKey(). After another period, the client retries another scan request which thinks its next_call_seq is 0.  However, HRegionServer's cached RegionScannerHolder thinks the matching RegionScanner's nextCallSeq should be 1.
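
The sequence mismatch described above can be reproduced in isolation with a small model. This is only a sketch under stated assumptions, not HBase's actual code: the class ScannerSeqSketch, its Server, and OutOfOrderException are hypothetical stand-ins, and the model assumes the server bumps its expected sequence as soon as it accepts a call, before the slow filter work finishes. A client that times out and retries with its old sequence number is then rejected.

```java
import java.util.HashMap;
import java.util.Map;

public class ScannerSeqSketch {

    /** Stand-in for the server-side error; not HBase's real exception class. */
    static class OutOfOrderException extends RuntimeException {
        OutOfOrderException(String msg) { super(msg); }
    }

    /** Hypothetical, simplified server: tracks the expected call sequence per scanner ID. */
    static class Server {
        private final Map<Long, Long> nextCallSeq = new HashMap<>();

        /** Opens a scanner; the first next() call is expected to carry sequence 0. */
        void open(long scannerId) {
            nextCallSeq.put(scannerId, 0L);
        }

        /** Accepts the call only if the sequences match, then bumps the expected value. */
        void next(long scannerId, long clientSeq) {
            long expected = nextCallSeq.get(scannerId);
            if (clientSeq != expected) {
                throw new OutOfOrderException(
                    "expected nextCallSeq " + expected + " but client sent " + clientSeq);
            }
            // Assumption: the counter is bumped before the (possibly slow) scan work runs.
            nextCallSeq.put(scannerId, expected + 1);
            // A sleeping filter would block here while the client's RPC times out.
        }

        long expectedSeq(long scannerId) {
            return nextCallSeq.get(scannerId);
        }
    }

    /** Returns true if a retry carrying the stale sequence number is rejected. */
    static boolean retryIsRejected() {
        Server server = new Server();
        long scannerId = 42L; // arbitrary illustrative ID
        server.open(scannerId);
        server.next(scannerId, 0L); // accepted; server now expects 1
        try {
            server.next(scannerId, 0L); // client never saw a reply, so it retries with 0
            return false;
        } catch (OutOfOrderException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("retry rejected: " + retryIsRejected());
    }
}
```

In this model the rejection does not depend on how long the server sleeps, only on the client retrying a call the server has already counted.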



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)