Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2010/01/01 01:22:29 UTC

[jira] Commented: (HBASE-2053) Upper bound of outstanding WALs can be overrun

    [ https://issues.apache.org/jira/browse/HBASE-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795710#action_12795710 ] 

stack commented on HBASE-2053:
------------------------------

So, this is an interesting problem.  This patch is not enough, but it makes the situation better, so I think it should be committed to branch and trunk.  It needs a +1.

The patch makes it so we can queue more than one region flush when there are too many log files.  In that case it makes sure we queue enough region flushes to free up at least the oldest WAL file.  Previously we just flushed the oldest region, though the oldest WAL could hold edits from more regions than just the oldest one.
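
Roughly, the selection looks like the sketch below (hypothetical names -- getRegionsWithEdits, requestFlush -- standing in for the real HLog/MemStoreFlusher plumbing; this is not the actual patch):

{code}
// Sketch only; not the actual patch. The interfaces stand in for the real
// HLog/MemStoreFlusher plumbing.
import java.util.Set;

class OldestWalFlushSketch {
  interface Wal {
    // Regions whose oldest unflushed edits live in this WAL.
    Set<String> getRegionsWithEdits();
  }

  interface Flusher {
    // Enqueue a flush request for the named region.
    void requestFlush(String regionName);
  }

  /** Over the WAL bound: flush every region pinning the oldest WAL, not just the oldest region. */
  static void freeOldestWal(Wal oldestWal, Flusher flusher) {
    for (String region : oldestWal.getRegionsWithEdits()) {
      flusher.requestFlush(region);
    }
  }
}
{code}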

This patch isn't enough, though, because flushing gets held up for long periods of time.

One such reason is HDFS running slow, so each flush takes a long time.  Flushes are queued and then addressed oldest first.  The queue might accumulate many regions to flush just by way of normal operation.  Since it's not a priority queue, the flush needed to free up the WAL may not happen until everything ahead of it in the queue has been cleared.
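
To make the ordering problem concrete, here is a toy sketch (hypothetical FlushRequest type, not the real MemStoreFlusher queue) contrasting the current FIFO ordering with a priority ordering that would let the WAL-freeing flush jump the line:

{code}
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Queue;

class FlushQueueSketch {
  // Toy flush request; 'urgent' marks the flush needed to free the oldest WAL.
  static class FlushRequest {
    final String region;
    final boolean urgent;
    FlushRequest(String region, boolean urgent) { this.region = region; this.urgent = urgent; }
  }

  public static void main(String[] args) {
    // Current behaviour: strict FIFO, so the urgent flush waits behind everything queued earlier.
    Queue<FlushRequest> fifo = new ArrayDeque<>();
    // One alternative: serve urgent requests first so the oldest WAL is released sooner.
    Queue<FlushRequest> prioritized =
        new PriorityQueue<>(Comparator.comparing((FlushRequest r) -> !r.urgent));

    for (Queue<FlushRequest> q : java.util.Arrays.asList(fifo, prioritized)) {
      q.add(new FlushRequest("regionA", false));
      q.add(new FlushRequest("regionB", false));
      q.add(new FlushRequest("regionC", true)); // the flush that frees the oldest WAL
      System.out.println(q.poll().region);      // FIFO prints regionA; the priority queue prints regionC
    }
  }
}
{code}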

A more serious one is the mechanism whereby we hold up the flush of a region if it has too many store files.  The hold-up is meant for a single region, but the wait on compaction happens inline in the MemStoreFlusher#run thread, so no other flushes can happen while we're waiting for a compaction to shrink the store file count.  This is a bug.  I'll open an issue.
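
The problematic shape, very roughly (a simplified sketch, not the actual MemStoreFlusher code):

{code}
// Simplified sketch of the problematic pattern; not the real MemStoreFlusher.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class SingleFlushThreadSketch implements Runnable {
  private final BlockingQueue<String> flushQueue = new LinkedBlockingQueue<>();

  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      try {
        String region = flushQueue.take();
        while (tooManyStoreFiles(region)) {
          // The bug described above: this wait runs on the one and only flush
          // thread, so every other queued flush stalls until the compaction
          // brings the store file count back down for this single region.
          Thread.sleep(1000);
        }
        flush(region);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  // Placeholders for the real store-file accounting and flush work.
  private boolean tooManyStoreFiles(String region) { return false; }
  private void flush(String region) { }
}
{code}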

It doesn't take much for us to accumulate many log files if the upload rate is high and flushing is taking a while or is held up.  Flushes can take 3-10 seconds on slow HDFS, while WALs can be rolling every 4-5 seconds in a PE upload.
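
To put rough numbers on it: with a roll every 4-5 seconds and a flush taking 3-10 seconds, new WALs can appear two or more times as fast as flushes complete, so the count climbs even when every flush frees a WAL; and if the single flush thread stalls behind a compaction for a minute, on the order of a dozen new WALs can pile up in the meantime.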


> Upper bound of outstanding WALs can be overrun
> ----------------------------------------------
>
>                 Key: HBASE-2053
>                 URL: https://issues.apache.org/jira/browse/HBASE-2053
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.3
>
>         Attachments: 2053-v2.patch, 2053.patch, hbase-root-regionserver-server-2.log.2009-12-22.gz
>
>
> Kevin Peterson up on hbase-user posted the following.  Of interest is the link at the end, which is a log of WAL rolls and removals.  In one place we remove 70-plus logs because the outstanding edits have moved past the outstanding sequence numbers -- so our basic WAL removal mechanism is working -- but if you study the log, the tendency is a steady climb in the number of logs.  HLog#cleanOldLogs needs to notice such an upward tendency and work more aggressively at cleaning the old logs in this case.  Here is Kevin's note:
> {code}
> On Tue, Dec 15, 2009 at 3:17 PM, Kevin Peterson <x...@y.com> wrote:
> This makes some sense now. I currently have 2200 regions across 3 tables. My
> largest table accounts for about 1600 of those regions and is mostly active
> at one end of the keyspace -- our key is based on date, but data only
> roughly arrives in order. I also write to two secondary indexes, which have
> no pattern to the key at all. One of these secondary tables has 488 regions
> and the other has 96 regions.
> We write about 10M items per day to the main table (articles). All of these
> get written to one of the secondary indexes (article-ids). About a third get
> written to the other secondary index. Total volume of data is about 10GB /
> day written.
> I think the key is as you say that the regions aren't filled enough to
> flush. The articles table gets mostly written to near one end and I see
> splits happening regularly. The index tables have no pattern so the 10
> million writes get scattered across the different regions. I've looked more
> closely at a log file (linked below), and if I forget about my main table
> (which would tend to get flushed), and look only at the indexes, this seems
> to be what's happening:
> 1. Up to maxLogs HLogs, it doesn't do any flushes.
> 2. Once it gets above maxLogs, it will start flushing one region each time
> it creates a new HLog.
> 3. If the first HLog had edits for say 50 regions, it will need to flush the
> region with the oldest edits 50 times before the HLog can be removed.
> If N is the number of regions getting written to, but not getting enough
> writes to flush on their own, then I think this converges to maxLogs + N
> logs on average. If I think of maxLogs as "number of logs to start flushing
> regions at" this makes sense.
> http://kdpeterson.net/paste/hbase-hadoop-regionserver-mi-prod-app35.ec2.biz360.com.log.2009-12-14
> {code}
