Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2010/03/30 02:47:27 UTC

[jira] Reopened: (HBASE-2338) log recovery: deleted items may be resurrected

     [ https://issues.apache.org/jira/browse/HBASE-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reopened HBASE-2338:
--------------------------

      Assignee: stack

The commit broke the build, so I am reopening.  TestClient.testDeletes fails: the test adds two values at different timestamps and then issues two deletes without a timestamp.  That should purge both values, but only one appears to be removed.  I'm taking a look at it...

> log recovery: deleted items may be resurrected
> ----------------------------------------------
>
>                 Key: HBASE-2338
>                 URL: https://issues.apache.org/jira/browse/HBASE-2338
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.4
>            Reporter: Kannan Muthukkaruppan
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: delete.patch
>
>
> While working on HBASE-2283, I noticed that if you do a put followed by a delete, then crash the RS (region server) and trigger log recovery, the deleted entries may be resurrected.
> Surprisingly, the issue only affected deletes of a specific column. Full row deletes didn't run into this issue.
> ---
> Code inspection revealed that we might have an issue with timestamps and the WAL for deletes that come in with the "LATEST" timestamp. [Note: The "LATEST" timestamp is syntactic sugar, a hint to the RS to convert it to "now".]
> Basically, in:
> {code}
> delete(byte [] family, List<KeyValue> kvs, boolean writeToWAL)
> {code}
> the "kv.updateLatestStamp(byteNow);" timestamp massaging (from LATEST to now) happens *after* the WAL log.append() call. So the KeyValue entries written to the HLog do not have the massaged timestamp. On recovery, when these entries are replayed, we add them back to reconstructionCache but don't do anything with timestamps.
> The above could be the source of the problem, but there could be more to it than my simple analysis. For instance, we still don't know why full row delete worked fine while delete of a specific column did not. Forking this off as a separate issue from HBASE-2283.
> [Note: Aravind is starting to take a look at this issue.]
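
The ordering problem described in the quoted report can be illustrated with a minimal, self-contained Java sketch. It does not use HBase's real classes: Edit, deleteBuggy, deleteFixed and the list standing in for the WAL are all hypothetical names invented for this illustration, and Long.MAX_VALUE is assumed here as the LATEST sentinel. It only shows that an entry appended to the log before the LATEST-to-now conversion keeps the unconverted timestamp, while an entry appended after the conversion records the concrete one.

```java
import java.util.ArrayList;
import java.util.List;

public class WalTimestampSketch {
    // Assumed sentinel standing in for the "LATEST" timestamp hint.
    static final long LATEST = Long.MAX_VALUE;

    // Hypothetical, simplified stand-in for a KeyValue (not the HBase class).
    static final class Edit {
        final String column;
        long timestamp;

        Edit(String column, long timestamp) {
            this.column = column;
            this.timestamp = timestamp;
        }

        // Replace the LATEST placeholder with a concrete time; leave explicit timestamps alone.
        void updateLatestStamp(long now) {
            if (timestamp == LATEST) {
                timestamp = now;
            }
        }

        // Snapshot of the edit as it looks at this moment, like bytes written to a log.
        Edit copy() {
            return new Edit(column, timestamp);
        }
    }

    // Order described in the report: append to the log first, fix the timestamp afterwards.
    static void deleteBuggy(List<Edit> wal, Edit delete, long now) {
        wal.add(delete.copy());
        delete.updateLatestStamp(now);
    }

    // Alternative order: fix the timestamp first, then append to the log.
    static void deleteFixed(List<Edit> wal, Edit delete, long now) {
        delete.updateLatestStamp(now);
        wal.add(delete.copy());
    }

    public static void main(String[] args) {
        long now = 1000L;

        List<Edit> buggyWal = new ArrayList<>();
        deleteBuggy(buggyWal, new Edit("cf:q", LATEST), now);

        List<Edit> fixedWal = new ArrayList<>();
        deleteFixed(fixedWal, new Edit("cf:q", LATEST), now);

        System.out.println("buggy log still holds LATEST: " + (buggyWal.get(0).timestamp == LATEST));
        System.out.println("fixed log holds now: " + (fixedWal.get(0).timestamp == now));
    }
}
```

Whether this reordering is what delete.patch actually does is not stated in this message; the sketch only mirrors the analysis in the report.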

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.