Posted to dev@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2009/11/30 23:33:20 UTC
[jira] Created: (HBASE-2018) Updates to .META. blocked under high MemStore load
Updates to .META. blocked under high MemStore load
--------------------------------------------------
Key: HBASE-2018
URL: https://issues.apache.org/jira/browse/HBASE-2018
Project: Hadoop HBase
Issue Type: Bug
Affects Versions: 0.20.2
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
Fix For: 0.20.3, 0.21.0
I discovered this on Lars' cluster. The symptom was the good old:
{code}
09/11/30 08:10:26 INFO mapred.JobClient: Task Id : attempt_200911250121_0011_r_000010_1, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=false, tries=9, numtries=10, i=14, listsize=20, region=prev-docs,de68fb97795ef3d936a3f10ff8790253,1259573366564 for region prev-docs,ccea967e66ccb53d83c48849c3a23f21,1259542138868, row 'ccff8cd4ca871c41f4fa7d44cffed962', but failed after 10 attempts.
Exceptions:
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1120)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1201)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:605)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:470)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordW
{code}
But the load wasn't that heavy, just lots of splitting going on. Looking at the logs, I see a split taking more than 4 minutes, which is explained by this happening on the RS hosting .META.:
{code}
2009-11-30 08:08:39,922 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Forced flushing of prev-docs,2c9d51e57b20decd5c6419d23ede822b,1259542273901 because global memstore limit of 1.6g exceeded; currently 1.6g and flushing till 1021.9m
...
2009-11-30 08:12:33,743 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~22.9m for region prev-docs,c8fea4fbbc41e746d960854ed4d41dd6,1259587143838 in 14160ms, sequence id=13677, compaction requested=false
2009-11-30 08:12:33,744 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Forced flushing of prev-docs,39c2995d955c041d21f4dc4a0d0dbf6c,1259587061295 because global memstore limit of 1.6g exceeded; currently 1.0g and flushing till 1021.9m
{code}
So we should not block updates to .META. for any reason. I'm pretty sure this issue explains other issues we've seen on the mailing list.
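To make the failure mode concrete, here is a minimal, self-contained sketch of the stall (illustration only: the class, constants, and timings are stand-ins inferred from the log lines above, not the actual 0.20 source). Every put first enters the flusher's synchronized reclaim method; once the global limit is hit, all callers, .META. edits included, queue there while forced flushes run one region at a time.
{code}
// Self-contained sketch of the pre-patch stall (illustration only).
import java.util.concurrent.atomic.AtomicLong;

class MemStoreFlusherSketch {
  static final long UPPER = 1717986918L;  // ~1.6g global memstore limit
  static final long LOWER = 1071644672L;  // ~1021.9m "flushing till" mark
  final AtomicLong globalMemStoreSize = new AtomicLong();

  // Synchronized: all writers queue here while forced flushes run.
  synchronized void reclaimMemStoreMemory() throws InterruptedException {
    if (globalMemStoreSize.get() >= UPPER) {
      while (globalMemStoreSize.get() >= LOWER) {
        Thread.sleep(10);                          // stand-in for one forced flush
        globalMemStoreSize.addAndGet(-24000000L);  // ~22.9m freed per flush
      }
    }
  }

  // Pre-patch put path: .META. edits take the same detour as user edits.
  void put(byte[] row) throws InterruptedException {
    reclaimMemStoreMemory();  // can block for minutes under memstore pressure
    globalMemStoreSize.addAndGet(row.length);
  }
}
{code}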
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-2018) Updates to .META. blocked under high MemStore load
Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans updated HBASE-2018:
--------------------------------------
Attachment: HBASE-2018.patch
This patch adds a check before calling cacheFlusher.reclaimMemStoreMemory so that we don't go waiting on the synchronized method if it's a .META. update. I'm currently running the tests.
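In rough form, the guard looks like the following. This is a sketch only: it approximates HRegionServer.put on the 0.20 branch, and the getLockFromId call and exact signatures are assumptions, not the committed diff.
{code}
// Sketch of the patched put path (approximate, not the committed diff).
public void put(final byte[] regionName, final Put put) throws IOException {
  HRegion region = getRegion(regionName);
  // New check: catalog regions (-ROOT-/.META.) skip the synchronized
  // reclaim, so their edits never queue behind forced user-region flushes.
  if (!region.getRegionInfo().isMetaRegion()) {
    this.cacheFlusher.reclaimMemStoreMemory();
  }
  region.put(put, getLockFromId(put.getLockId()));  // assumed lock handling
}
{code}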
[jira] Resolved: (HBASE-2018) Updates to .META. blocked under high MemStore load
Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell resolved HBASE-2018.
-----------------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Committed -v2 patch to trunk and 0.20 branch.
[jira] Commented: (HBASE-2018) Updates to .META. blocked under high MemStore load
Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784033#action_12784033 ]
Jean-Daniel Cryans commented on HBASE-2018:
-------------------------------------------
All the tests pass. I would love it if Lars could try out my patch before committing.
[jira] Commented: (HBASE-2018) Updates to .META. blocked under high MemStore load
Posted by "Lars George (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784866#action_12784866 ]
Lars George commented on HBASE-2018:
------------------------------------
So testing is not going too well for me. Overall the patch is not harming anything, so I say +1. For my testing I am apparently struggling with a too-small cluster :(
[jira] Updated: (HBASE-2018) Updates to .META. blocked under high MemStore load
Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell updated HBASE-2018:
----------------------------------
Attachment: HBASE-2018-v2.patch
Should this check apply to all operations on meta regions, deletes also? See patch -v2.
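If the answer is yes, -v2 presumably mirrors the same guard on the other mutation entry points. A hedged sketch of the delete path follows (the delete signature and getLockFromId are assumptions, not the committed diff):
{code}
// Sketch of the same exemption on the delete path (approximate).
public void delete(final byte[] regionName, final Delete delete)
throws IOException {
  HRegion region = getRegion(regionName);
  // Same rule as put: never block catalog-region edits on the flusher.
  if (!region.getRegionInfo().isMetaRegion()) {
    this.cacheFlusher.reclaimMemStoreMemory();
  }
  region.delete(delete, getLockFromId(delete.getLockId()), true);  // assumed signature
}
{code}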
[jira] Commented: (HBASE-2018) Updates to .META. blocked under high MemStore load
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785486#action_12785486 ]
stack commented on HBASE-2018:
------------------------------
+1 on v2 of patch.
[jira] Commented: (HBASE-2018) Updates to .META. blocked under high MemStore load
Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784004#action_12784004 ]
Andrew Purtell commented on HBASE-2018:
---------------------------------------
+1
Agree with the issue priority also.
[jira] Commented: (HBASE-2018) Updates to .META. blocked under high MemStore load
Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785455#action_12785455 ]
Jean-Daniel Cryans commented on HBASE-2018:
-------------------------------------------
The new patch makes sense. Also, Lars just reported that his job finally succeeded (after tweaking other things). In any case, I think this patch covers a very bad corner case.
[jira] Commented: (HBASE-2018) Updates to .META. blocked under high MemStore load
Posted by "Lars George (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784219#action_12784219 ]
Lars George commented on HBASE-2018:
------------------------------------
Testing now, takes a few hours to ramp up through the map phase. Results forthcoming...