Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2008/06/09 22:54:45 UTC

[jira] Created: (HBASE-674) memcache size unreliable

memcache size unreliable
------------------------

                 Key: HBASE-674
                 URL: https://issues.apache.org/jira/browse/HBASE-674
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.1.2
            Reporter: stack
            Priority: Blocker
             Fix For: 0.2.0


Multiple updates against the same row/column/ts will be seen as increments to the cache size on insert, but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by that one entry's size; the memcache size will be off.
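
To make the failure mode concrete, here is a minimal, self-contained sketch of the accounting drift (this is not the HBase Memcache code; the key format and value sizes are made up):

{code}
// Editor's sketch, not the actual Memcache implementation: a map keyed by
// row/column/timestamp plus a size counter that is incremented on every add
// but can only be decremented by what survives in the map at flush time.
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

public class MemcacheSizeDrift {
  static final TreeMap<String, byte[]> cache = new TreeMap<String, byte[]>();
  static final AtomicLong size = new AtomicLong(0);

  static void add(String rowColumnTs, byte[] value) {
    cache.put(rowColumnTs, value);   // overwrites any earlier entry for this key
    size.addAndGet(value.length);    // but the counter grows on every add
  }

  static void flush() {
    for (byte[] v : cache.values()) {
      size.addAndGet(-v.length);     // only the surviving entry gets subtracted
    }
    cache.clear();
  }

  public static void main(String[] args) {
    for (int i = 0; i < 3; i++) {
      add("row/column/1212345678", new byte[100]);  // same row/column/ts three times
    }
    flush();
    System.out.println(size.get());  // prints 200, not 0; the size is now off by 200
  }
}
{code}

Three adds of the same key increase the counter by 300 bytes, but the flush can only subtract the 100 bytes that survive, leaving 200 bytes of phantom memcache size.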

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "Ning Li (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603704#action_12603704 ] 

Ning Li commented on HBASE-674:
-------------------------------

Would it be better if HStore's memcache computed/maintained its own memory size/usage? When a region needs its memory size, it would sum over all its stores instead of increasing a count during update and subtracting after flush.
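
A rough sketch of that idea follows (hypothetical class and method names, not the actual HStore/HRegion API): each store keeps its own counter and resets it after a successful flush, and the region sums on demand.

{code}
// Hypothetical sketch of per-store size accounting; names are illustrative.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

class StoreSketch {
  private final AtomicLong memcacheSize = new AtomicLong(0);

  long add(byte[] value) {
    return memcacheSize.addAndGet(value.length);  // store tracks its own usage
  }

  void flushed() {
    memcacheSize.set(0);                          // reset once its flush succeeds
  }

  long getMemcacheSize() {
    return memcacheSize.get();
  }
}

class RegionSketch {
  private final List<StoreSketch> stores = new ArrayList<StoreSketch>();

  long memcacheSize() {
    long total = 0;
    for (StoreSketch s : stores) {
      total += s.getMemcacheSize();  // sum on demand, no region-level counter to drift
    }
    return total;
  }
}
{code}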

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.2.0
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610482#action_12610482 ] 

Billy Pearson commented on HBASE-674:
-------------------------------------

I changed my flush size to 16MB from my default of 128MB and ran a large job; here are some lines from the logs.
I also added StringUtils.humanReadableInt(this.memcacheSize.get()) back on the end so I could see whether the size was growing after every flush:

{code}
2008-07-04 03:09:46,684 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214968800601 in 5595ms, sequence id=237700684, 12.9m, 351.2k
2008-07-04 03:15:02,169 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214968800601 in 5741ms, sequence id=239128833, 13.4m, 188.1k
2008-07-04 03:15:04,145 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214968800601 in 1975ms, sequence id=239155758, 167.6k, 222.8k
{code}

So the above looks good. The last number is memcacheSize.get(), and it is moving down and up, which is good to see; I think this patch solved my problem with the flushes.
I ran the job for quite a while and flushes seem to happen normally, instead of with less and less time between them.

The only other thing I see on this issue is that the memcache size reported before the flush is still off, like in the logs stack posted above:

{code}
2008-07-04 03:09:41,089 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region webdata,,1214968800601. Current region memcache size 16.0m
2008-07-04 03:09:42,587 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Added /hbase/webdata/1748955538/anchor/mapfiles/5595070426400799233 with 29142 entries, sequence id 237700684, data size 3.7m, file size 507.2k
2008-07-04 03:09:43,719 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Added /hbase/webdata/1748955538/stime/mapfiles/4943615301545296809 with 2872 entries, sequence id 237700684, data size 154.9k, file size 24.3k
2008-07-04 03:09:45,197 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Added /hbase/webdata/1748955538/in_rank/mapfiles/367761225010760821 with 36409 entries, sequence id 237700684, data size 4.5m, file size 415.8k
2008-07-04 03:09:45,470 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Added /hbase/webdata/1748955538/size/mapfiles/6451240725630689572 with 2872 entries, sequence id 237700684, data size 135.9k, file size 28.4k
2008-07-04 03:09:46,683 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Added /hbase/webdata/1748955538/last_seen/mapfiles/3472123077784461203 with 36410 entries, sequence id 237700684, data size 4.4m, file size 409.3k
2008-07-04 03:09:46,684 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214968800601 in 5595ms, sequence id=237700684, 12.9m, 351.2k
{code}

16.0m is what is reported as the size before the flush, but the total data flushed was only 12.9m.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: 674-v2.patch, 674.patch, patch.txt
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603695#action_12603695 ] 

stack commented on HBASE-674:
-----------------------------

Issue is a regionserver that is stuck with the block gate down.  I can see it flushing over time, but the memcache size continues to crawl until it is at maximum and then never goes down, in spite of the fact that we have been regularly flushing.  Our math is obviously off... See here:

{code}
2008-06-05 23:29:54,872 DEBUG org.apache.hadoop.hbase.HRegion: Started memcache flush for region enwiki_meta,4xm9tOa_JLlpDI7EFNz4OF==,1212282770124. Current region memcache size 10.7m
2008-06-05 23:29:56,246 DEBUG org.apache.hadoop.hbase.HStore: Added /hbase/aa0-005-2.u.powerset.com/enwiki_meta/1966274647/alternate_title/mapfiles/7141575363599707225 with 4 entries, sequence id 1109246475, data size 290.0
2008-06-05 23:29:56,444 DEBUG org.apache.hadoop.hbase.HStore: Added /hbase/aa0-005-2.u.powerset.com/enwiki_meta/1966274647/misc/mapfiles/8160270909949078904 with 39 entries, sequence id 1109246475, data size 2.3k
2008-06-05 23:29:56,661 DEBUG org.apache.hadoop.hbase.HStore: Added /hbase/aa0-005-2.u.powerset.com/enwiki_meta/1966274647/alternate_url/mapfiles/3974349317844214611 with 4 entries, sequence id 1109246475, data size 398.0
2008-06-05 23:29:56,889 DEBUG org.apache.hadoop.hbase.HStore: Added /hbase/aa0-005-2.u.powerset.com/enwiki_meta/1966274647/page/mapfiles/5684117091377129929 with 117 entries, sequence id 1109246475, data size 87.0k
2008-06-05 23:29:56,890 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache flush for region enwiki_meta,4xm9tOa_JLlpDI7EFNz4OF==,1212282770124 in 2019ms, sequence id=1109246475
{code}

See how we are at 10.7m when the flush starts, but if you count up all that was flushed, we flushed only 90k odd.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.2.0
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HBASE-674) memcache size unreliable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman reassigned HBASE-674:
-----------------------------------

    Assignee: Jim Kellerman

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: 674-v2.patch, 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-674) memcache size unreliable

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-674:
------------------------

    Fix Version/s: 0.2.0

Bringing this back into 0.2 because of Billy's comments.  Should be easy enough to reproduce.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>             Fix For: 0.2.0
>
>         Attachments: 674-v2.patch, 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-674) memcache size unreliable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-674:
--------------------------------

    Fix Version/s:     (was: 0.2.0)
         Assignee:     (was: stack)
         Priority: Major  (was: Blocker)

While this is an important issue, it is not a blocker for 0.2.0.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>         Attachments: 674-v2.patch, 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610088#action_12610088 ] 

Jim Kellerman commented on HBASE-674:
-------------------------------------

There are a number of issues here:
- Multiple inserts or deletes for the same row/column/timestamp are counted and can inflate the memcache size somewhat. This may not be a big issue, because it is unlikely that someone is using the same row/column/timestamp, especially if they do not specify a timestamp for puts or deletes.
- Because of the inaccuracies above, subtracting the actual number of flushed bytes from the memcache size means the memcache size can grow over time if fewer bytes are flushed than HRegion thinks are in the memcache. What we really need to do is keep track of both updates and memcache size, so that during a flush we accumulate the size of updates that arrive after the snapshot. When the flush is completed, we can set the size of the memcache to the number of bytes submitted as updates during the flush (see the sketch after this list).
- Why the memcache size seems to be going negative more frequently recently is somewhat of a mystery. It is pretty easy to understand why we might flush less than what we think is in the cache, but how would we flush more than what we think is in the cache?
- Finally, I don't particularly like the finished-memcache-flush message in HRegion. It reports what it thinks is the current memcache size after the flush, but doesn't say so. It leads the casual observer to think that the size reported by HRegion after the flush is the number of bytes flushed from the cache.
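
A hedged sketch of the bookkeeping in the second bullet (hypothetical names; the real fix would need proper locking around the snapshot): updates that arrive during a flush are accumulated separately, and the memcache size is set to exactly that amount when the flush completes.

{code}
// Illustrative only: accumulate the size of updates taken after the snapshot
// and make that the memcache size once the flush finishes.
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

class FlushAwareSizeAccounting {
  private final AtomicLong memcacheSize = new AtomicLong(0);
  private final AtomicLong sizeSinceSnapshot = new AtomicLong(0);
  private final AtomicBoolean flushing = new AtomicBoolean(false);

  void recordUpdate(long bytes) {
    memcacheSize.addAndGet(bytes);
    if (flushing.get()) {
      sizeSinceSnapshot.addAndGet(bytes);  // updates that arrive mid-flush
    }
  }

  void startFlush() {
    sizeSinceSnapshot.set(0);
    flushing.set(true);                    // snapshot would be taken here
  }

  void finishFlush() {
    // Everything in the snapshot is on disk now; what remains in the memcache
    // is exactly what arrived while the flush was running.
    memcacheSize.set(sizeSinceSnapshot.get());
    flushing.set(false);
  }
}
{code}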

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>             Fix For: 0.2.0
>
>         Attachments: 674-v2.patch, 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609707#action_12609707 ] 

Billy Pearson commented on HBASE-674:
-------------------------------------

I currently run jobs in smaller batches and restart HBase after about 20 small jobs, inserting about 500K cells per job on one region server.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>         Attachments: 674-v2.patch, 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HBASE-674) memcache size unreliable

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-674:
---------------------------

    Assignee: stack

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.2.0
>
>         Attachments: 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603699#action_12603699 ] 

Jim Kellerman commented on HBASE-674:
-------------------------------------

Good catch! 

We should probably update the memcache size with each update, and if we are overwriting a previous update we need to take that into account.
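
Something along these lines, as a sketch (not the actual Memcache class): the add computes a delta against whatever entry it replaces, so repeated writes to the same row/column/timestamp do not inflate the counter.

{code}
// Illustrative delta-counting add; simplified key type and names.
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

class DeltaCountingCache {
  private final TreeMap<String, byte[]> map = new TreeMap<String, byte[]>();
  private final AtomicLong cacheSize = new AtomicLong(0);

  /** Returns the change in cache size caused by this add. */
  long add(String key, byte[] value) {
    byte[] previous = map.put(key, value);
    long delta = value.length - (previous == null ? 0 : previous.length);
    cacheSize.addAndGet(delta);   // an overwrite counts only the difference in size
    return delta;
  }

  long size() {
    return cacheSize.get();
  }
}
{code}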

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.2.0
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603691#action_12603691 ] 

stack commented on HBASE-674:
-----------------------------

For example:

{code}
  public void testSizeCount() throws Exception {
    HStoreKey hsk = new HStoreKey(new Text(getName()),
      new Text(getName()), System.currentTimeMillis());
    for (int i = 0; i < 3; i++) {
      // Same row/column/timestamp added three times; each add bumps the size.
      this.hmemcache.add(hsk, HStoreKey.getBytes(hsk));
    }
    this.hmemcache.snapshot();
    // The snapshot holds only one entry even though the size was bumped three times.
    System.out.println(this.hmemcache.getSnapshot().size());
  }
{code}

The out.println above reports only one entry in the memcache snapshot, even though we added three items' worth to the memcache size.

Other issues here are that exceptions while adding or deleting items, etc., can cause the count to be off: we add to the memcache size before the item has been successfully added.
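
A small sketch of that ordering point (hypothetical helper, not HBase code): touch the counter only after the underlying put has succeeded, so an exception cannot leave the counter permanently inflated.

{code}
// Illustrative only: the put may throw before the counter is adjusted.
import java.util.SortedMap;
import java.util.concurrent.atomic.AtomicLong;

class SafeCounting {
  static long addAndCount(SortedMap<String, byte[]> map, AtomicLong size,
      String key, byte[] value) {
    byte[] previous = map.put(key, value);  // if this throws, size is untouched
    long delta = value.length - (previous == null ? 0 : previous.length);
    return size.addAndGet(delta);           // adjusted only after a successful put
  }
}
{code}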

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.2.0
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-674) memcache size unreliable

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-674:
------------------------

    Attachment: 674-v2.patch

I applied to the branch a version with fewer changes, after testing on a cluster to make sure flushing worked as it used to.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.2.0
>
>         Attachments: 674-v2.patch, 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (HBASE-674) memcache size unreliable

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609707#action_12609707 ] 

viper799 edited comment on HBASE-674 at 7/1/08 1:59 PM:
-------------------------------------------------------------

I currently run jobs in smaller batches and restart HBase after about 20 small jobs, inserting about 500K cells per job on one region server.

If we want to get 0.2.0 out, then this should be a blocker for 0.2.1.

      was (Author: viper799):
    I currently run jobs in smaller batches and restart HBase after about 20 small jobs, inserting about 500K cells per job on one region server
  
> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>         Attachments: 674-v2.patch, 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-674) memcache size unreliable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-674:
--------------------------------

    Attachment: patch.txt

Memcache.add now computes the delta size of the memcache (so if multiple updates are made to the same row/column/timestamp, they are correctly accounted for).

HStore.add returns the value from Memcache.add.

HRegion.internalFlushCache now zeros the memcache size while it has updates locked out. Because of this, the memcache size will reflect the size of the updates that happened since the flush started. Additionally, at the end of a cache flush it reports the number of bytes flushed and not the number of bytes currently in the memcache.

The memcache size is now updated based on the value returned from HStore.add (which is computed by Memcache.add).
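
A rough sketch of the flush-side part of that description (hypothetical names and locking; the actual HRegion.internalFlushCache is more involved): the counter is zeroed while updates are locked out, and the number of bytes flushed is what gets reported.

{code}
// Illustrative only; not the real internalFlushCache.
import java.util.concurrent.atomic.AtomicLong;

class RegionFlushSketch {
  private final AtomicLong memcacheSize = new AtomicLong(0);
  private final Object updateLock = new Object();

  long internalFlushCache() {
    long bytesToFlush;
    synchronized (updateLock) {              // updates are locked out here
      bytesToFlush = memcacheSize.getAndSet(0);
      // ... snapshot each store's memcache ...
    }
    // ... write the snapshots out to map files ...
    // Report the bytes flushed, not the current memcache size, which by now
    // only contains updates that arrived after the snapshot was taken.
    return bytesToFlush;
  }
}
{code}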



> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: 674-v2.patch, 674.patch, patch.txt
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603708#action_12603708 ] 

stack commented on HBASE-674:
-----------------------------

Ning: Agree.

I think for the 0.1 branch, I'll just set the memcache size to zero after a successful flush. Will do a better fix for 0.2.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.2.0
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-674) memcache size unreliable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-674:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed patch.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: 674-v2.patch, 674.patch, patch.txt
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603718#action_12603718 ] 

stack commented on HBASE-674:
-----------------------------

Committed patch to branch.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.2.0
>
>         Attachments: 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610642#action_12610642 ] 

Jim Kellerman commented on HBASE-674:
-------------------------------------

Computing the memcache size is an approximation at best. If the memcache size is not growing over time and causing OutOfMemoryExceptions, then I think this is the best we can do for 0.2.0. If you think the accounting should be more accurate, please open an issue for 0.3.0.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: 674-v2.patch, 674.patch, patch.txt
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-674) memcache size unreliable

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HBASE-674:
--------------------------------

    Status: Patch Available  (was: Open)

Patch available. Please review.

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: 674-v2.patch, 674.patch, patch.txt
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-674) memcache size unreliable

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609705#action_12609705 ] 

Billy Pearson commented on HBASE-674:
-------------------------------------

I think this should be a blocker for 0.2.0, as it causes problems on long-running jobs and region servers that have been online for a while.

What I am seeing is that the last line of the flush, like the log lines below, has the memcache size on the end after the flush; this grows with every flush that has data in it.

The problem comes when the region server thinks the memcache is greater than hbase.hregion.memcache.flush.size: it then starts flushing back to back with little or nothing to flush. When it starts doing this on my servers, the region server hangs and has to be restarted after killing it with kill -9.
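
For illustration (not the actual flusher code), the flush decision is essentially a comparison of the reported size against the configured threshold, so an inflated counter that never drops back under the threshold keeps the test passing and produces the back-to-back flushes described above:

{code}
// Illustrative threshold check; the constant stands in for the configured
// hbase.hregion.memcache.flush.size value.
class FlushTriggerSketch {
  static final long FLUSH_THRESHOLD = 16L * 1024 * 1024;

  static boolean needsFlush(long reportedMemcacheSize) {
    return reportedMemcacheSize > FLUSH_THRESHOLD;
  }
}
{code}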

Also, as a side note, I do see back-to-back flushes on region servers that host only one or two regions: I see a memcache flush, then 1-3 more for the same region back to back, and then it is OK. This seems to go away once the region server hosts more than 4 regions.

Logs from a few days ago:
{code}
2008-06-25 21:02:20,632 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 10604ms, sequence id=1973821, 17.0m
2008-06-25 21:02:20,633 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 1ms, sequence id=1973822, 17.0m
2008-06-25 21:02:21,478 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 842ms, sequence id=1973832, 17.4m
2008-06-25 21:02:22,896 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 1408ms, sequence id=1976711, 17.8m
2008-06-25 21:11:20,979 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 16988ms, sequence id=3827578, 52.0m
hbase restarted
2008-06-25 21:16:18,004 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 10336ms, sequence id=4817365, 17.0m
2008-06-25 21:16:18,408 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 404ms, sequence id=4817378, 17.0m
2008-06-25 21:16:19,838 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 1421ms, sequence id=4817952, 17.7m
2008-06-25 21:19:01,512 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 9692ms, sequence id=5661153, 32.8m
2008-06-25 21:19:01,821 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 308ms, sequence id=5661169, 32.8m
2008-06-25 21:19:04,261 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 2439ms, sequence id=5661523, 33.4m
2008-06-25 21:19:05,317 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 1052ms, sequence id=5666391, 33.5m
2008-06-25 21:49:21,616 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 14214ms, sequence id=7415697, 64.7m
2008-06-25 21:49:22,369 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 738ms, sequence id=7415798, 64.7m
2008-06-25 21:49:22,373 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214427652459 in 4ms, sequence id=7415800, 64.7m
hbase restarted
2008-06-25 22:54:33,665 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 11168ms, sequence id=9238125, 19.5m
2008-06-25 22:54:34,248 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 582ms, sequence id=9241589, 19.7m
2008-06-25 22:54:34,989 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 740ms, sequence id=9242882, 19.8m
2008-06-25 22:54:35,749 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 759ms, sequence id=9244773, 19.8m
2008-06-25 22:54:36,619 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 869ms, sequence id=9247214, 19.8m
2008-06-25 22:54:37,657 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 1031ms, sequence id=9249946, 20.0m
2008-06-25 23:14:29,857 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 13549ms, sequence id=12776517, 53.0m
hbase restart
2008-06-25 23:21:25,013 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 13706ms, sequence id=14457037, 20.2m
2008-06-25 23:21:25,508 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 494ms, sequence id=14474826, 20.2m
2008-06-25 23:21:26,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 844ms, sequence id=14475821, 20.3m
2008-06-25 23:21:27,208 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 854ms, sequence id=14478027, 20.3m
2008-06-25 23:21:27,816 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 607ms, sequence id=14481138, 20.3m
2008-06-25 23:21:28,609 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 791ms, sequence id=14483523, 20.3m
2008-06-25 23:21:29,234 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 623ms, sequence id=14485701, 20.3m
2008-06-25 23:21:31,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 2223ms, sequence id=14487619, 20.6m
2008-06-25 23:21:32,546 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 1079ms, sequence id=14496978, 20.8m
2008-06-25 23:24:36,701 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 3432ms, sequence id=15103902, 27.4m
hbase restarted
2008-06-25 23:28:18,288 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214448561806 in 1ms, sequence id=15103936, 0.0
2008-06-25 23:28:19,151 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214454498285 in 0ms, sequence id=15103937, 0.0
2008-06-25 23:45:43,021 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214454498285 in 14039ms, sequence id=19210706, 23.6m
2008-06-25 23:45:43,285 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214454498285 in 264ms, sequence id=19226471, 23.6m
2008-06-25 23:45:44,016 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214454498285 in 730ms, sequence id=19227141, 23.6m
2008-06-25 23:45:45,481 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214454498285 in 1464ms, sequence id=19229633, 23.7m
2008-06-25 23:45:46,474 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214454498285 in 990ms, sequence id=19234660, 23.8m
2008-06-25 23:47:56,208 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,com.wallbuilders.www%2FLIBissuesArticles.asp%3Fid%3D45%3Ahttp467123 in 759ms, sequence id=19961148, 21.7m
2008-06-25 23:55:07,552 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region webdata,,1214454498285 in 5795ms, sequence id=22333011, 41.4m
hbase restarted
{code}

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>         Attachments: 674-v2.patch, 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-674) memcache size unreliable

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-674:
------------------------

    Attachment: 674.patch

Here's a crude patch for 0.1; it just zeros the memcache size on a successful flush (needed internally).  Let's do something better in TRUNK.
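
As a sketch of that crude approach (hypothetical names): rather than subtracting what was flushed, the counter is simply reset once the flush succeeds, accepting that updates made during the flush are briefly undercounted.

{code}
// Illustrative only: reset-to-zero accounting for the 0.1 branch approach.
import java.util.concurrent.atomic.AtomicLong;

class CrudeResetSketch {
  private final AtomicLong memcacheSize = new AtomicLong(0);

  void add(long bytes) {
    memcacheSize.addAndGet(bytes);
  }

  void onSuccessfulFlush() {
    memcacheSize.set(0);   // lose track of mid-flush updates, but never drift upward
  }
}
{code}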

> memcache size unreliable
> ------------------------
>
>                 Key: HBASE-674
>                 URL: https://issues.apache.org/jira/browse/HBASE-674
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.2
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.2.0
>
>         Attachments: 674.patch
>
>
> Multiple updates against same row/column/ts will be seen as increments to cache size on insert but when we then play the memcache at flush time, we'll only see the most recent entry and decrement the memcache size by whatever its size; memcache will be off.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.