Posted to issues@hbase.apache.org by "Victor Xu (JIRA)" <ji...@apache.org> on 2013/10/06 02:33:41 UTC
[jira] [Updated] (HBASE-9645) Regionserver halt because of HLog's
"Logic Error Snapshot seq id from earlier flush still present!"
[ https://issues.apache.org/jira/browse/HBASE-9645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Victor Xu updated HBASE-9645:
-----------------------------
Release Note: We have code in a postBatchMutation coprocessor hook that calls the HLog.appendNoSync method. This causes trouble unless the caller holds the updatesLock of HRegion. However, the HRegion class provides no public methods to lock or unlock updatesLock from outside the class, so I have submitted a small patch that adds two public methods for this.
Status: Patch Available (was: Open)
HBASE-9646 is actually the same problem. Both are caused by the updatesLock of the HRegion class.
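The locking pattern described in the release note can be illustrated with a small standalone sketch. This is not HBase code and not the attached patch: the class UpdatesLockSketch, its methods lockUpdates/unlockUpdates/appendFromHook/flush, and the two counters are all hypothetical names. It assumes that updatesLock behaves like a read/write lock where appends share the read side and a flush takes the write side, so a flush never observes a half-finished append.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; NOT the real HRegion class.
class UpdatesLockSketch {
    // Assumption: appends share the read lock, a flush takes the write lock.
    private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();
    private final AtomicLong walPending = new AtomicLong();      // edits appended to the log
    private final AtomicLong memstorePending = new AtomicLong(); // edits held in memory
    private long flushed = 0;     // written only while holding the write lock
    private long mismatches = 0;  // flushes that saw the two counters disagree

    // Hypothetical public lock/unlock pair, in the spirit of the patch.
    void lockUpdates() { updatesLock.readLock().lock(); }
    void unlockUpdates() { updatesLock.readLock().unlock(); }

    // Stand-in for an extra log append issued from a coprocessor hook.
    void appendFromHook() {
        lockUpdates();
        try {
            walPending.incrementAndGet();
            memstorePending.incrementAndGet();
        } finally {
            unlockUpdates();
        }
    }

    // Stand-in for a flush: takes the write lock, so no append is in flight.
    void flush() {
        updatesLock.writeLock().lock();
        try {
            long wal = walPending.getAndSet(0);
            long mem = memstorePending.getAndSet(0);
            if (wal != mem) {
                mismatches++;
            }
            flushed += mem;
        } finally {
            updatesLock.writeLock().unlock();
        }
    }

    // Runs appenders against a concurrent flusher; returns {flushed, mismatches}.
    static long[] runDemo(int threads, int appendsPerThread) {
        UpdatesLockSketch region = new UpdatesLockSketch();
        AtomicBoolean running = new AtomicBoolean(true);
        Thread flusher = new Thread(() -> {
            while (running.get()) {
                region.flush();
                Thread.yield();
            }
        });
        List<Thread> appenders = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            appenders.add(new Thread(() -> {
                for (int j = 0; j < appendsPerThread; j++) {
                    region.appendFromHook();
                }
            }));
        }
        flusher.start();
        for (Thread t : appenders) {
            t.start();
        }
        try {
            for (Thread t : appenders) {
                t.join();
            }
            running.set(false);
            flusher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            running.set(false);
        }
        region.flush(); // pick up anything appended after the last background flush
        return new long[] { region.flushed, region.mismatches };
    }

    public static void main(String[] args) {
        long[] result = runDemo(4, 1000);
        System.out.println("flushed=" + result[0] + " mismatches=" + result[1]);
    }
}
```

If appendFromHook skipped the lockUpdates/unlockUpdates pair, a flush could run between the two increments and see the counters disagree. That is only a loose analogy for the inconsistency reported in this issue, not a reproduction of it.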
> Regionserver halt because of HLog's "Logic Error Snapshot seq id from earlier flush still present!"
> ---------------------------------------------------------------------------------------------------
>
> Key: HBASE-9645
> URL: https://issues.apache.org/jira/browse/HBASE-9645
> Project: HBase
> Issue Type: Bug
> Components: regionserver, wal
> Affects Versions: 0.94.10
> Environment: Linux 2.6.32-el5.x86_64
> Reporter: Victor Xu
> Priority: Critical
> Attachments: HBASE_9645-0.94.10.patch
>
>
> I upgraded my HBase cluster to 0.94.10 three weeks ago, and this problem first appeared several days later. I set the bug's priority to 'Critical' because every time it happens, a regionserver halts. All affected regionservers show the same log entry:
> {noformat}
> ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Logic Error Snapshot seq id from earlier flush still present! for region c0d88db4ce3606842fbec9d34c38f707 overwritten oldseq=80114270537with new seq=80115066829
> {noformat}
> I checked the code and found that the error is raised in the HLog.startCacheFlush method. 'lastSeqWritten' is locked at that point, so perhaps something outside HLog is changing it by mistake.
--
This message was sent by Atlassian JIRA
(v6.1#6144)