Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2010/11/05 22:06:41 UTC

[jira] Updated: (HBASE-3198) Log rolling archives files prematurely

     [ https://issues.apache.org/jira/browse/HBASE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-3198:
--------------------------------------

    Summary: Log rolling archives files prematurely  (was: HLog periodic roll doesn't seem to care about MemStores)

Some digging revealed interesting things. When we print "whose highest sequenceid is x", the value is always smaller than the real one by 1, since in the code we do:

{code}
this.outputfiles.put(Long.valueOf(this.logSeqNum.get() - 1), oldFile);
{code}

This may have been right at some point in the past, but rolling is now asynchronous from appending, so the current logSeqNum is in fact the last one in the log. Subtracting 1 is wrong. Then there's this:

{code}
    TreeSet<Long> sequenceNumbers =
    new TreeSet<Long>(this.outputfiles.headMap(
      (Long.valueOf(oldestOutstandingSeqNum.longValue() + 1L))).keySet());
{code}

Here we collect the log files that we can delete, on the basis that the sequence number each file is keyed by in outputfiles is smaller than that of the oldest outstanding edit. I don't know why we're adding 1L, since you don't really want to delete log files that contain that edit. It may be a "fix" for my previous finding, but it's still broken: as I showed when creating this jira, rolling does remove logs with unflushed edits.
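
To make the two off-by-ones concrete, here is a minimal standalone sketch using the numbers from the log output quoted in the description. It is not the HLog code and not a proposed patch: the class name LogArchiveSketch, the helper archivable(), and the "rolled-hlog" label are made up for illustration, and it assumes, as argued above, that logSeqNum at roll time really is the last sequence id in the rolled file. The only library behaviour it relies on is that SortedMap.headMap(toKey) returns keys strictly less than toKey.

```java
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; names here do not exist in HBase.
class LogArchiveSketch {

    // Keys of the files selected for archiving.
    // headMap(bound) is exclusive: it returns only keys strictly below bound.
    static SortedSet<Long> archivable(SortedMap<Long, String> outputfiles, long bound) {
        return new TreeSet<Long>(outputfiles.headMap(bound).keySet());
    }

    public static void main(String[] args) {
        long logSeqNum = 270055L;          // assumed last sequence id in the rolled file
        long oldestOutstanding = 270055L;  // oldest unflushed edit, from the quoted log

        // What the code does today: key is logSeqNum - 1, bound is oldest + 1.
        SortedMap<Long, String> current = new TreeMap<Long, String>();
        current.put(logSeqNum - 1, "rolled-hlog");
        System.out.println("current:     " + archivable(current, oldestOutstanding + 1L));

        // Without either adjustment: key is logSeqNum, bound is oldest.
        SortedMap<Long, String> unadjusted = new TreeMap<Long, String>();
        unadjusted.put(logSeqNum, "rolled-hlog");
        System.out.println("no -1 / +1L: " + archivable(unadjusted, oldestOutstanding));
    }
}
```

With these numbers the first call selects the file keyed 270054 even though it holds the unflushed edit 270055, while the second call selects nothing. Note that in this scenario the -1 alone is enough to get the file archived; the +1L only widens the window further.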

I'm changing the title of this jira to give it a broader scope, as any log rolling is at risk of lowering data durability.

> Log rolling archives files prematurely
> --------------------------------------
>
>                 Key: HBASE-3198
>                 URL: https://issues.apache.org/jira/browse/HBASE-3198
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.90.0
>
>
> From the mailing list, Erdem Agaoglu found a case where an HLog gets rolled by the periodic log roller and then gets archived even though the region (ROOT) still has edits in the MemStore. I did an experiment on a local empty machine and it does look broken:
> {noformat}
> org.apache.hadoop.hbase.regionserver.LogRoller: Hlog roll period 6000ms elapsed
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
> org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase-89-su/.logs/hbasedev,60020,1288977933643/10.10.1.177%3A60020.1288977933829, entries=1,
>  filesize=295. New hlog /hbase-89-su/.logs/hbasedev,60020,1288977933643/10.10.1.177%3A60020.1288977943913
> org.apache.hadoop.hbase.regionserver.wal.HLog: Found 1 hlogs to remove  out of total 1; oldest outstanding sequenceid is 270055 from region -ROOT-,,0
> org.apache.hadoop.hbase.regionserver.wal.HLog: moving old hlog file /hbase-89-su/.logs/hbasedev,60020,1288977933643/10.10.1.177%3A60020.1288977933829
>  whose highest sequenceid is 270054 to /hbase-89-su/.oldlogs/10.10.1.177%3A60020.1288977933829
> {noformat}
> Marking as Blocker and taking a deeper look.
