Posted to dev@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2009/07/29 01:47:14 UTC

[jira] Created: (HBASE-1718) Reuse of KeyValue during log replay could cause the wrong data to be used

Reuse of KeyValue during log replay could cause the wrong data to be used
-------------------------------------------------------------------------

                 Key: HBASE-1718
                 URL: https://issues.apache.org/jira/browse/HBASE-1718
             Project: Hadoop HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.20.0
            Reporter: Jonathan Gray
            Priority: Blocker
             Fix For: 0.20.0


Our meta table got a row key of METAROW in it.  It is hard to explain how that happened, but during code inspection stack found that we reuse the same KV instance for each replayed key.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-1718) Reuse of KeyValue during log replay could cause the wrong data to be used

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-1718:
---------------------------------

    Attachment: HBASE-1718-v1.patch

The patch instantiates a new KeyValue at the end of the while loop, so we only re-instantiate once the KV has been passed forward (on iterations where we hit continue, the existing instance can be reused).
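
A minimal sketch of that idea, not the patch itself: LogEdit, ReplayFixSketch and the String[][] log are hypothetical stand-ins for KeyValue, the replay loop and the log reader. A fresh instance is allocated only after the previous one has been handed to the cache, so skipped entries never cost an allocation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable record that a reader fills in place; not HBase's KeyValue.
final class LogEdit {
    String family;
    String row;
    void set(String family, String row) { this.family = family; this.row = row; }
    @Override public String toString() { return family + ":" + row; }
}

final class ReplayFixSketch {
    // Replays (family, row) pairs, keeping only those in wantedFamily.
    static List<LogEdit> replay(String[][] log, String wantedFamily) {
        List<LogEdit> cache = new ArrayList<>();
        LogEdit val = new LogEdit();
        for (String[] entry : log) {
            val.set(entry[0], entry[1]);   // "deserialize" into the current instance
            if (!val.family.equals(wantedFamily)) {
                continue;                  // nothing else references val, so reuse is safe
            }
            cache.add(val);                // the cache now holds this instance
            val = new LogEdit();           // so the next read must get a fresh one
        }
        return cache;
    }

    public static void main(String[] args) {
        String[][] log = {
            {"info", "row1"},
            {"METAFAMILY", "METAROW"},
            {"info", "row2"},
        };
        System.out.println(replay(log, "info")); // prints [info:row1, info:row2]
    }
}
```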



[jira] Commented: (HBASE-1718) Reuse of KeyValue during log replay could cause the wrong data to be used

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736384#action_12736384 ] 

stack commented on HBASE-1718:
------------------------------

+1 on patch.

It's hard to test.

Here is my supposition as to why this might be responsible for HBASE-1638.

A server crashes and its logs are split.  The crashed server's regions are opened anew, and there are logs for them to replay.

The inner loop playing the reconstruction logs is this:

{code}
      while (logReader.next(key, val)) {
        maxSeqIdInLog = Math.max(maxSeqIdInLog, key.getLogSeqNum());
        .....
        // Check this edit is for me. Also, guard against writing the special
        // METACOLUMN info such as HBASE::CACHEFLUSH entries
        if (/* commented out for now - stack via jgray key.isTransactionEntry() || */
            val.matchingFamily(HLog.METAFAMILY) ||
            !Bytes.equals(key.getRegionName(), regioninfo.getRegionName()) ||
            !val.matchingFamily(family.getName())) {
          continue;
        }
        // Add anything as value as long as we use same instance each time.
        reconstructedCache.add(val);
        ....
      }
{code}

So, a value might clear the family checks and get added to the reconstructionCache.

We call next again.  The 'val' instance is up in reconstructionCache.  The next call deserializes a new KV into the same instance.  The deserialized value might not make it past the family checks, but it's already in the reconstructionCache.

This would account for our adding a single value. 
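
The sequence described above can be reproduced in isolation. This is a sketch under stated assumptions, not HBase code: Edit and ReplayAliasingDemo are hypothetical names, a plain list stands in for the cache, and a String[][] stands in for the log reader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable record that a reader fills in place; not HBase's KeyValue.
final class Edit {
    String family;
    String row;
    void set(String family, String row) { this.family = family; this.row = row; }
    @Override public String toString() { return family + ":" + row; }
}

final class ReplayAliasingDemo {
    // Replays (family, row) pairs while reusing ONE Edit instance for every entry.
    static List<Edit> replayReusingInstance(String[][] log, String wantedFamily) {
        List<Edit> cache = new ArrayList<>();
        Edit val = new Edit();                 // allocated once, outside the loop
        for (String[] entry : log) {
            val.set(entry[0], entry[1]);       // "deserialize" into the same object
            if (!val.family.equals(wantedFamily)) {
                continue;                      // filtered out, but val is already overwritten
            }
            cache.add(val);                    // stores a reference, not a copy
        }
        return cache;
    }

    public static void main(String[] args) {
        String[][] log = {
            {"info", "row1"},           // passes the family check and is cached
            {"METAFAMILY", "METAROW"},  // fails the check, yet mutates the cached object
        };
        System.out.println(replayReusingInstance(log, "info")); // prints [METAFAMILY:METAROW]
    }
}
```

The second entry never reaches cache.add, but the cache still ends up holding its contents because the only cached element is the object the reader keeps overwriting.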



[jira] Commented: (HBASE-1718) Reuse of KeyValue during log replay could cause the wrong data to be used

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736369#action_12736369 ] 

Jean-Daniel Cryans commented on HBASE-1718:
-------------------------------------------

+1 seems fine.



[jira] Resolved: (HBASE-1718) Reuse of KeyValue during log replay could cause the wrong data to be used

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray resolved HBASE-1718.
----------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

Committed.  Upon further thought, this _could_ create the cross-family scenario we've seen in HBASE-1638 / HBASE-1715 (but apurtell says there was no replay).  It does, however, seem that this bug completely broke log replay, as you would always end up with only the last seen KV.
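
One way the "only the last seen KV" outcome can arise, sketched under a loud assumption: that the cache behaves like a sorted set ordered by a comparator. Cell and LastEditWinsDemo are hypothetical names, and a java.util.TreeSet stands in for whatever structure the store actually uses. Once the single reused instance is in the set, every later add compares that instance against itself, so nothing new is inserted.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical mutable record; not HBase's KeyValue.
final class Cell {
    String row;
    long timestamp;
}

final class LastEditWinsDemo {
    // Replays `edits` distinct edits into a sorted set while reusing one Cell instance.
    static TreeSet<Cell> replay(int edits) {
        Comparator<Cell> order =
            Comparator.comparing((Cell c) -> c.row).thenComparingLong(c -> c.timestamp);
        TreeSet<Cell> cache = new TreeSet<>(order);
        Cell val = new Cell();              // assumption: one instance for the whole replay
        for (int i = 0; i < edits; i++) {
            val.row = "row" + i;            // overwrite in place, as a reusing reader would
            val.timestamp = i;
            cache.add(val);                 // after the first add, val equals the stored element
        }
        return cache;
    }

    public static void main(String[] args) {
        TreeSet<Cell> cache = replay(17286);
        System.out.println(cache.size() + " entry, row=" + cache.first().row);
        // prints 1 entry, row=row17285
    }
}
```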



[jira] Assigned: (HBASE-1718) Reuse of KeyValue during log replay could cause the wrong data to be used

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray reassigned HBASE-1718:
------------------------------------

    Assignee: Jonathan Gray



[jira] Commented: (HBASE-1718) Reuse of KeyValue during log replay could cause the wrong data to be used

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736408#action_12736408 ] 

stack commented on HBASE-1718:
------------------------------

So, at a minimum, this patch fixes replaying of logs.

I added logging; the output is below.  The replay had 17k edits, but only one was added: the last one.

{code}
....
2009-07-29 01:27:15,274 [regionserver/208.76.44.139:60020.worker] INFO org.apache.hadoop.hbase.regionserver.Store: ADDED TO RECON CACHE: \x00\x00\x00\x03\x00\x01\x07\x02\x08\x05/info:data/1248830494586/Put/vlen=1000
2009-07-29 01:27:15,274 [regionserver/208.76.44.139:60020.worker] DEBUG org.apache.hadoop.hbase.regionserver.Store: Applied 17286, skipped 0 because sequence id <= 22830124
2009-07-29 01:27:15,274 [regionserver/208.76.44.139:60020.worker] DEBUG org.apache.hadoop.hbase.regionserver.Store: flushing reconstructionCache: 1
2009-07-29 01:27:15,274 [regionserver/208.76.44.139:60020.worker] INFO org.apache.hadoop.hbase.regionserver.Store: DUMP \x00\x00\x00\x03\x00\x01\x07\x02\x08\x05/info:data/1248830494586/Put/vlen=1000
{code}

I thought this was HBASE-1483, but that was in splitLog.  This is something else.



[jira] Commented: (HBASE-1718) Reuse of KeyValue during log replay could cause the wrong data to be used

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736385#action_12736385 ] 

stack commented on HBASE-1718:
------------------------------

Single bad value.
