Posted to dev@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2008/08/28 20:22:44 UTC

[jira] Created: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Region is left unassigned after a split/rebalancing, throws NSRE
----------------------------------------------------------------

                 Key: HBASE-851
                 URL: https://issues.apache.org/jira/browse/HBASE-851
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.2.0
            Reporter: Jean-Daniel Cryans
             Fix For: 0.19.0


Master log:
{code}
2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 from 192.168.1.95:60020
2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 from 192.168.1.95:60020
2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 from 192.168.1.95:60020
2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.RegionManager: Server 192.168.1.95:60020 is overloaded. Server load: 8 avg: 7.0
2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.RegionManager: Choosing to reassign 1 regions. mostLoadedRegions has 8 regions in it.
2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.RegionManager: Going to close region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.HMaster: Main processing loop: PendingOpenOperation from 192.168.1.95:60020
2008-08-28 12:12:27,175 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 open on 192.168.1.95:60020
2008-08-28 12:12:27,175 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2008-08-28 12:12:27,175 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 in region .META.,,1 with startcode 1219931259154 and server 192.168.1.95:60020
2008-08-28 12:12:30,352 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 from 192.168.1.95:60020
2008-08-28 12:1
2008-08-28 12:12:32,557 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 103, Num Servers: 15, Avg Load: 7.0
2008-08-28 12:12:34,093 DEBUG org.apache.hadoop.hbase.master.HMaster: Main processing loop: PendingOpenOperation from 192.168.1.95:60020
2008-08-28 12:12:34,093 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 open on 192.168.1.95:60020
2008-08-28 12:12:34,093 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2008-08-28 12:12:34,093 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 in region .META.,,1 with startcode 1219931259154 and server 192.168.1.95:60020
{code}

HRS 192.168.1.95
{code}
2008-08-28 12:12:24,953 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,307 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794: [B@f0a360
2008-08-28 12:12:27,307 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794: [B@f0a360
2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Compactions and cache flushes disabled for region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Scanners disabled for region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more active scanners for region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more row locks outstanding on region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HStore: closed 1860667227/attribute
2008-08-28 12:12:27,308 INFO org.apache.hadoop.hbase.regionserver.HRegion: closed web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:34,246 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 60020, call batchUpdate([B@552a4a, row => http://www.simplewebengines.com/, {column => attribute:traveliness, value => '...', column => attribute:processed_at, value => '...', column => attribute:content, value => '...', column => attribute:refs, value => '...', column => attribute:crawled_at, value => '...', column => attribute:html, value => '...', column => attribute:crawled, value => '...'}) from 192.168.1.96:50102: error: org.apache.hadoop.hbase.NotServingRegionException: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
org.apache.hadoop.hbase.NotServingRegionException: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794

The same NSRE repeats a hundred times
{code}

Restarting the cluster cleared the issue, but this is a nasty bug. A proposed band-aid: if we still get an NSRE after the retries, ask the master to scan the region servers (HRS) to see whether the region is located somewhere else. If it is not, assign it somewhere. Finally, update META.
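
The band-aid could be sketched roughly as follows. This is a toy model in Python, not HBase code: `Cluster`, `recover_after_nsre`, and the "least loaded server" choice are all hypothetical names and assumptions made for illustration, and the sketch assumes no open or close is in flight for the region.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model (hypothetical): what servers really serve vs. what META claims."""
    online: dict = field(default_factory=dict)  # server -> set of region names
    meta: dict = field(default_factory=dict)    # region name -> server


def recover_after_nsre(cluster, region):
    """Run once a client has exhausted its retries on NotServingRegionException."""
    # 1. Scan the region servers: is anyone actually serving the region?
    holder = next(
        (server for server in sorted(cluster.online)
         if region in cluster.online[server]),
        None,
    )
    # 2. Nobody has it: assign it somewhere (assumption: the least loaded server).
    if holder is None:
        holder = min(sorted(cluster.online), key=lambda s: len(cluster.online[s]))
        cluster.online[holder].add(region)
    # 3. Finally, update META so that it agrees with reality.
    cluster.meta[region] = holder
    return holder


# META points at a server that no longer serves the region.
cluster = Cluster(
    online={"192.168.1.95:60020": {"r1", "r2"}, "192.168.1.96:60020": {"r3"}},
    meta={"orphan": "192.168.1.95:60020"},
)
print(recover_after_nsre(cluster, "orphan"))
```

In this model the orphaned region ends up on the server with the fewest regions and META is rewritten to match. A real fix would also have to decide who triggers the scan and how to avoid racing with a pending open or close.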

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626772#action_12626772 ] 

Jean-Daniel Cryans commented on HBASE-851:
------------------------------------------

{code}
2008-08-28 12:11:47,207 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'web_pages,http://www.xtremtours.com/,1219937877462', STARTKEY => 'http://www.xtremtours.com/', ENDKEY => '', ENCODED => 718025308, TABLE => {{NAME => 'web_pages', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'attribute', BLOOMFILTER => 'false', COMPRESSION => 'NONE', VERSIONS => '1000000', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.70:60020', STARTCODE => 1219931259523
2008-08-28 12:11:47,212 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scan of meta region {regionname: .META.,,1, startKey: <>, server: 192.168.1.70:60020} complete
2008-08-28 12:11:47,212 INFO org.apache.hadoop.hbase.master.BaseScanner: all meta regions scanned
2008-08-28 12:11:47,283 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 102, Num Servers: 15, Avg Load: 7.0
2008-08-28 12:12:00,950 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: web_pages,http://www.irvingoil.com/company/privacy.asp,1219938079718: [B@70eab4 from 192.168.1.90:60020
2008-08-28 12:12:00,950 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region web_pages,http://www.jquebec.com/,1219939915297 to server 192.168.1.90:60020
2008-08-28 12:12:00,952 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region web_pages,http://www.irvingoil.com/company/privacy.asp,1219939915297 to server 192.168.1.90:60020
2008-08-28 12:12:02,401 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 101, Num Servers: 15, Avg Load: 7.0
2008-08-28 12:12:05,361 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: web_pages,http://www.irvingoil.com/company/privacy.asp,1219939915297 from 192.168.1.90:60020
2008-08-28 12:12:05,361 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: web_pages,http://www.jquebec.com/,1219939915297 from 192.168.1.90:60020
2008-08-28 12:12:05,362 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: web_pages,http://www.irvingoil.com/company/privacy.asp,1219939915297 from 192.168.1.90:60020
2008-08-28 12:12:05,362 DEBUG org.apache.hadoop.hbase.master.RegionManager: Server 192.168.1.90:60020 is overloaded. Server load: 8 avg: 7.0
2008-08-28 12:12:05,362 DEBUG org.apache.hadoop.hbase.master.RegionManager: Choosing to reassign 1 regions. mostLoadedRegions has 8 regions in it.
2008-08-28 12:12:05,362 DEBUG org.apache.hadoop.hbase.master.RegionManager: Going to close region web_pages,http://www.fizber.com/sale-by-owner-home-services/Michigan-city-crystal-falls-profile.html,1219938031085
2008-08-28 12:12:05,362 DEBUG org.apache.hadoop.hbase.master.HMaster: Main processing loop: PendingOpenOperation from 192.168.1.90:60020
2008-08-28 12:12:05,362 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: web_pages,http://www.jquebec.com/,1219939915297 open on 192.168.1.90:60020
2008-08-28 12:12:05,362 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2008-08-28 12:12:05,362 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row web_pages,http://www.jquebec.com/,1219939915297 in region .META.,,1 with startcode 1219931259216 and server 192.168.1.90:60020
2008-08-28 12:12:05,513 DEBUG org.apache.hadoop.hbase.master.HMaster: Main processing loop: PendingOpenOperation from 192.168.1.90:60020
2008-08-28 12:12:05,513 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: web_pages,http://www.irvingoil.com/company/privacy.asp,1219939915297 open on 192.168.1.90:60020
2008-08-28 12:12:05,513 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2008-08-28 12:12:05,513 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row web_pages,http://www.irvingoil.com/company/privacy.asp,1219939915297 in region .META.,,1 with startcode 1219931259216 and server 192.168.1.90:60020
2008-08-28 12:12:17,524 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 102, Num Servers: 15, Avg Load: 7.0
2008-08-28 12:12:18,406 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: web_pages,http://www.fizber.com/sale-by-owner-home-services/Michigan-city-crystal-falls-profile.html,1219938031085 from 192.168.1.90:60020
2008-08-28 12:12:18,578 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region web_pages,http://www.fizber.com/sale-by-owner-home-services/Michigan-city-crystal-falls-profile.html,1219938031085 to server 192.168.1.96:60020
2008-08-28 12:12:21,585 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: web_pages,http://www.fizber.com/sale-by-owner-home-services/Michigan-city-crystal-falls-profile.html,1219938031085 from 192.168.1.96:60020
2008-08-28 12:12:23,838 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT: web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219937970833: [B@30a4a7 from 192.168.1.95:60020
2008-08-28 12:12:23,838 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 to server 192.168.1.95:60020
2008-08-28 12:12:23,840 INFO org.apache.hadoop.hbase.master.RegionManager: assigning region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 to server 192.168.1.95:60020
2008-08-28 12:12:24,588 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: web_pages,http://www.fizber.com/sale-by-owner-home-services/Michigan-city-crystal-falls-profile.html,1219938031085 from 192.168.1.96:60020
2008-08-28 12:12:24,588 DEBUG org.apache.hadoop.hbase.master.HMaster: Main processing loop: PendingOpenOperation from 192.168.1.96:60020
2008-08-28 12:12:24,589 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: web_pages,http://www.fizber.com/sale-by-owner-home-services/Michigan-city-crystal-falls-profile.html,1219938031085 open on 192.168.1.96:60020
2008-08-28 12:12:24,589 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2008-08-28 12:12:24,589 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row web_pages,http://www.fizber.com/sale-by-owner-home-services/Michigan-city-crystal-falls-profile.html,1219938031085 in region .META.,,1 with startcode 1219931259127 and server 192.168.1.96:60020
2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 from 192.168.1.95:60020
2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 from 192.168.1.95:60020
2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 from 192.168.1.95:60020
2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.RegionManager: Server 192.168.1.95:60020 is overloaded. Server load: 8 avg: 7.0
2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.RegionManager: Choosing to reassign 1 regions. mostLoadedRegions has 8 regions in it.
2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.RegionManager: Going to close region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.HMaster: Main processing loop: PendingOpenOperation from 192.168.1.95:60020
2008-08-28 12:12:27,175 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 open on 192.168.1.95:60020
2008-08-28 12:12:27,175 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2008-08-28 12:12:27,175 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 in region .META.,,1 with startcode 1219931259154 and server 192.168.1.95:60020
2008-08-28 12:12:30,352 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 from 192.168.1.95:60020
2008-08-28 12:12:32,557 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 103, Num Servers: 15, Avg Load: 7.0
2008-08-28 12:12:34,093 DEBUG org.apache.hadoop.hbase.master.HMaster: Main processing loop: PendingOpenOperation from 192.168.1.95:60020
2008-08-28 12:12:34,093 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 open on 192.168.1.95:60020
2008-08-28 12:12:34,093 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2008-08-28 12:12:34,093 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 in region .META.,,1 with startcode 1219931259154 and server 192.168.1.95:60020
2008-08-28 12:12:43,828 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {regionname: -ROOT-,,0, startKey: <>, server: 192.168.1.95:60020}
2008-08-28 12:12:43,848 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner REGION => {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192, TABLE => {{NAME => '.META.', IS_ROOT => 'false', IS_META => 'true', FAMILIES => [{NAME => 'info', BLOOMFILTER => 'false', VERSIONS => '1', COMPRESSION => 'NONE', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'historian', BLOOMFILTER => 'false', VERSIONS => '2147483647', COMPRESSION => 'NONE', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.70:60020', STARTCODE => 1219931259523
2008-08-28 12:12:43,849 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of meta region {regionname: -ROOT-,,0, startKey: <>, server: 192.168.1.95:60020} complete
2008-08-28 12:12:46,977 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {regionname: .META.,,1, startKey: <>, server: 192.168.1.70:60020}
2008-08-28 12:12:47,171 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'entities,,1219165003546', STARTKEY => '', ENDKEY => '7f3f7c20-9234-4255-9c42-ce5814903412', ENCODED => 1253295232, TABLE => {{NAME => 'entities', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'context', BLOOMFILTER => 'false', VERSIONS => '3', COMPRESSION => 'NONE', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'attribute', BLOOMFILTER => 'false', VERSIONS => '1000000', COMPRESSION => 'NONE', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.98:60020', STARTCODE => 1219931259258
2008-08-28 12:12:47,199 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'entities,7f3f7c20-9234-4255-9c42-ce5814903412,1219165003546', STARTKEY => '7f3f7c20-9234-4255-9c42-ce5814903412', ENDKEY => '', ENCODED => 146570614, TABLE => {{NAME => 'entities', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'context', BLOOMFILTER => 'false', VERSIONS => '3', COMPRESSION => 'NONE', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'attribute', BLOOMFILTER => 'false', VERSIONS => '1000000', COMPRESSION => 'NONE', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.87:60020', STARTCODE => 1219931259249
2008-08-28 12:12:47,200 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'entity_candidates,,1219110570649', STARTKEY => '', ENDKEY => '', ENCODED => 303566559, TABLE => {{NAME => 'entity_candidates', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'attribute', BLOOMFILTER => 'false', COMPRESSION => 'NONE', VERSIONS => '1000000', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.99:60020', STARTCODE => 1219931259137
2008-08-28 12:12:47,200 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'guides,,1219110550196', STARTKEY => '', ENDKEY => '', ENCODED => 34054954, TABLE => {{NAME => 'guides', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'attribute', BLOOMFILTER => 'false', COMPRESSION => 'NONE', VERSIONS => '1000000', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.90:60020', STARTCODE => 1219931259216
2008-08-28 12:12:47,202 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'medias,,1219110580836', STARTKEY => '', ENDKEY => '', ENCODED => 216609116, TABLE => {{NAME => 'medias', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'attribute', BLOOMFILTER => 'false', COMPRESSION => 'NONE', VERSIONS => '1000000', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.90:60020', STARTCODE => 1219931259216
2008-08-28 12:12:47,203 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'web_pages,,1219939407687', STARTKEY => '', ENDKEY => 'http://antrim-county-recycling.blogspot.com/2007/12/overview-of-pa-138-campaign-failed-part.html', ENCODED => 751617614, TABLE => {{NAME => 'web_pages', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'attribute', BLOOMFILTER => 'false', COMPRESSION => 'NONE', VERSIONS => '1000000', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.97:60020', STARTCODE => 1219931259191
2008-08-28 12:12:47,205 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'web_pages,http://antrim-county-recycling.blogspot.com/2007/12/overview-of-pa-138-campaign-failed-part.html,1219939407687', STARTKEY => 'http://antrim-county-recycling.blogspot.com/2007/12/overview-of-pa-138-campaign-failed-part.html', ENDKEY => 'http://automotive.autoaubaine.com/occasions-auto/Saguenay-Lac-St-Jean-listing.html', ENCODED => 385826300, TABLE => {{NAME => 'web_pages', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'attribute', BLOOMFILTER => 'false', COMPRESSION => 'NONE', VERSIONS => '1000000', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.93:60020', STARTCODE => 1219931259324
2008-08-28 12:12:47,209 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'web_pages,http://automotive.autoaubaine.com/occasions-auto/Saguenay-Lac-St-Jean-listing.html,1219939021978', STARTKEY => 'http://automotive.autoaubaine.com/occasions-auto/Saguenay-Lac-St-Jean-listing.html', ENDKEY => 'http://buildingpros.com/build/state/MI/Grosse+Ile.html', ENCODED => 1156151379, TABLE => {{NAME => 'web_pages', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'attribute', BLOOMFILTER => 'false', COMPRESSION => 'NONE', VERSIONS => '1000000', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.87:60020', STARTCODE => 1219931259249
2008-08-28 12:12:47,211 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'web_pages,http://buildingpros.com/build/state/MI/Grosse+Ile.html,1219939021978', STARTKEY => 'http://buildingpros.com/build/state/MI/Grosse+Ile.html', ENDKEY => 'http://cancun-hotels.tripadvisor.com/Hotel_Review-g150807-d154413-Reviews-Hyatt_Cancun_Caribe_Resort-Cancun_Yucatan_Peninsula.html', ENCODED => 554004736, TABLE => {{NAME => 'web_pages', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'attribute', BLOOMFILTER => 'false', COMPRESSION => 'NONE', VERSIONS => '1000000', LENGTH => '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}}, SERVER => '192.168.1.97:60020', STARTCODE => 1219931259191
{code}

As you can see, the rest of the log is empty. In the web UI, I saw that the region server didn't have that region (web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794),
but it was still listed on the table.jsp page.

I can replicate it here after ~1 hour of MR jobs so... I think you should first do the bandaid stuff.
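
To follow one region through a master log like the one above, a small filter helps. The sketch below is only a reading aid, assuming the "Received MSG_REPORT_*" line layout seen in these logs; the names `REPORT`, `timeline`, `SAMPLE`, and `REGION` are made up for the example, and the sample lines are copied from the log above.

```python
import re

# Matches the master's "Received MSG_REPORT_*" lines as they appear in this thread.
REPORT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \S+ \S+: "
    r"Received (?P<msg>MSG_REPORT_\w+): (?P<region>\S+?)(?:: \S+)? "
    r"from (?P<server>\S+)$"
)


def timeline(lines, region):
    """Return (timestamp, message) pairs reported for one region, in log order."""
    events = []
    for line in lines:
        m = REPORT.match(line.strip())
        if m and m.group("region") == region:
            events.append((m.group("ts"), m.group("msg")))
    return events


REGION = "web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794"
PREFIX = " INFO org.apache.hadoop.hbase.master.ServerManager: Received "
SAMPLE = [
    "2008-08-28 12:12:27,174" + PREFIX + "MSG_REPORT_PROCESS_OPEN: " + REGION + " from 192.168.1.95:60020",
    "2008-08-28 12:12:27,174" + PREFIX + "MSG_REPORT_OPEN: web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 from 192.168.1.95:60020",
    "2008-08-28 12:12:27,174" + PREFIX + "MSG_REPORT_OPEN: " + REGION + " from 192.168.1.95:60020",
    "2008-08-28 12:12:30,352" + PREFIX + "MSG_REPORT_CLOSE: " + REGION + " from 192.168.1.95:60020",
]

for ts, msg in timeline(SAMPLE, REGION):
    print(ts, msg)
```

For the region in question this lists PROCESS_OPEN, OPEN, and CLOSE in that order. In the full log the master then processes the pending open at 12:12:34,093 and writes the server into .META. after the close was already reported.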




[jira] Commented: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637243#action_12637243 ] 

Jim Kellerman commented on HBASE-851:
-------------------------------------

@Stack

{quote}
We should fix this 'hole' in our protocol?
{quote}

Yes, but the master should not reassign the region if it is still being closed, or
the new region server will corrupt it.

Perhaps something like what we do with open: have a MSG_PROCESS_CLOSE tied
to a progressable so that the master knows that the region has not yet been closed,
and the region server is not dead.
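
One way to picture this idea is a small master-side state tracker. The Python below is a sketch under assumptions, not HBase code: the `CloseTracker` class, its method names, and the timeout policy (treating a silent server as dead) are hypothetical; only the notion of a progress message sent while the close is running comes from the comment above.

```python
import enum


class State(enum.Enum):
    OPEN = "open"
    CLOSING = "closing"
    CLOSED = "closed"


class CloseTracker:
    """Hypothetical master-side view of one region being closed on a server."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.state = State.OPEN
        self.last_progress = None

    def close_requested(self, now):
        # Master asked the region server to close the region.
        self.state = State.CLOSING
        self.last_progress = now

    def progress_reported(self, now):
        # Stands in for the proposed MSG_PROCESS_CLOSE: still closing, still alive.
        if self.state is State.CLOSING:
            self.last_progress = now

    def close_reported(self):
        # Region server confirmed that the close finished.
        self.state = State.CLOSED

    def may_reassign(self, now):
        if self.state is State.CLOSED:
            return True
        if self.state is State.CLOSING:
            # Assumption: a server silent for longer than the timeout is presumed dead.
            return now - self.last_progress > self.timeout_s
        return False


tracker = CloseTracker(timeout_s=30)
tracker.close_requested(now=0)
tracker.progress_reported(now=25)
print(tracker.may_reassign(now=50))  # close still in progress, server alive
tracker.close_reported()
print(tracker.may_reassign(now=50))  # close confirmed
```

The point of the design is that the master never hands the region to a new server while the old one is still reporting progress on the close.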

> Region is left unassigned after a split/rebalancing, throws NSRE
> ----------------------------------------------------------------
>
>                 Key: HBASE-851
>                 URL: https://issues.apache.org/jira/browse/HBASE-851
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.2.0, 0.2.1
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.19.0
>
>
> Master log:
> {code}
> 2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_PROCESS_OPEN: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 from 192.168.1.95:60020
> <jdcryans> 2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 from 192.168.1.95:60020
> <jdcryans> 2008-08-28 12:12:27,174 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 from 192.168.1.95:60020
> <jdcryans> 2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.RegionManager: Server 192.168.1.95:60020 is overloaded. Server load: 8 avg: 7.0
> <jdcryans> 2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.RegionManager: Choosing to reassign 1 regions. mostLoadedRegions has 8 regions in it.
> <jdcryans> 2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.RegionManager: Going to close region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> <jdcryans> 2008-08-28 12:12:27,174 DEBUG org.apache.hadoop.hbase.master.HMaster: Main processing loop: PendingOpenOperation from 192.168.1.95:60020
> <jdcryans> 2008-08-28 12:12:27,175 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 open on 192.168.1.95:60020
> <jdcryans> 2008-08-28 12:12:27,175 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
> <jdcryans> 2008-08-28 12:12:27,175 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row web_pages,http://www.salonskincare.co.uk/product_info.php/products_id/168,1219939934794 in region .META.,,1 with startcode 1219931259154 and server 192.168.1.95:60020
> <jdcryans> 2008-08-28 12:12:30,352 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 from 192.168.1.95:60020
> <jdcryans> 2008-08-28 12:1
> <jdcryans> 2008-08-28 12:12:32,557 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 103, Num Servers: 15, Avg Load: 7.0
> <jdcryans> 2008-08-28 12:12:34,093 DEBUG org.apache.hadoop.hbase.master.HMaster: Main processing loop: PendingOpenOperation from 192.168.1.95:60020
> <jdcryans> 2008-08-28 12:12:34,093 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 open on 192.168.1.95:60020
> <jdcryans> 2008-08-28 12:12:34,093 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
> <jdcryans> 2008-08-28 12:12:34,093 INFO org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794 in region .META.,,1 with startcode 1219931259154 and server 192.168.1.95:60020
> {code}
> HRS 192.168.1.95
> {code}
> <jdcryans> 2008-08-28 12:12:24,953 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> <jdcryans> 2008-08-28 12:12:27,307 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794: [B@f0a360
> <jdcryans> 2008-08-28 12:12:27,307 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794: [B@f0a360
> <jdcryans> 2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Compactions and cache flushes disabled for region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> <jdcryans> 2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Scanners disabled for region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> <jdcryans> 2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more active scanners for region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> <jdcryans> 2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> <jdcryans> 2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more row locks outstanding on region web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> <jdcryans> 2008-08-28 12:12:27,308 DEBUG org.apache.hadoop.hbase.regionserver.HStore: closed 1860667227/attribute
> <jdcryans> 2008-08-28 12:12:27,308 INFO org.apache.hadoop.hbase.regionserver.HRegion: closed web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> <jdcryans> 2008-08-28 12:12:34,246 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 60020, call batchUpdate([B@552a4a, row => http://www.simplewebengines.com/, {column => attribute:traveliness, value => '...', column => attribute:processed_at, value => '...', column => attribute:content, value => '...', column => attribute:refs, value => '...', column => attribute:crawled_at, value => '...', column => attribute:html, value => '...', column => attribute:crawled, value => '...'}) from 192.168.1.96:50102: error: org.apache.hadoop.hbase.NotServingRegionException: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> <jdcryans> org.apache.hadoop.hbase.NotServingRegionException: web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794
> NSRE for a hundred times
> {code}
> Restarting the cluster cleared the issue, but this is a nasty bug. A proposed band-aid: if we still get an NSRE after the retries, ask the master to scan the HRS to see whether the region is located somewhere else. If not, assign it somewhere. Finally, update META.



[jira] Commented: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637228#action_12637228 ] 

stack commented on HBASE-851:
-----------------------------

bq. If the master never receives MSG_REPORT_CLOSE, it is never going to reassign the region.

We should fix this 'hole' in our protocol?




[jira] Commented: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637224#action_12637224 ] 

Jim Kellerman commented on HBASE-851:
-------------------------------------

It appears that the master got the REPORT_OPEN from the server but not
the REPORT_CLOSE.

So when the NSRE happens, the master thinks the region is open, but
has told the region server to close it.

The region server has not yet reported the region as closed. Either it
has removed the region from onlineRegions but has not yet finished the
close and reported it, or, due to a thread scheduling problem, the
heartbeat has not been sent to the master or the master has not polled
the heartbeat message queue. Beyond that, the logs do not show enough
information.

What would be really useful next time would be thread dumps of the master
and region server.

If that is really all that is in the logs except for the NSREs,
something is wedged.

If the master never receives MSG_REPORT_CLOSE, it is never going to
reassign the region.





[jira] Commented: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632733#action_12632733 ] 

Jean-Daniel Cryans commented on HBASE-851:
------------------------------------------

Seeing this issue again. It happened during a big MR job on a table with 300 regions: one of them was assigned in META but not open on the server. Restarting the cluster fixed the problem.




[jira] Issue Comment Edited: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639180#action_12639180 ] 

stack edited comment on HBASE-851 at 10/13/08 12:46 PM:
--------------------------------------------------------

Reading the j-d snippets above, this looks like HBASE-921; i.e. the open of the region 'web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794' is being processed after a close.

      was (Author: stack):
    Reading the j-d snippets above, this looks like HBASE-918; i.e. the open of the region 'web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794' is being processed after a close.
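
The ordering in the master log (MSG_REPORT_OPEN at 12:12:27, MSG_REPORT_CLOSE at 12:12:30, the PendingOpenOperation for the same region processed at 12:12:34) can be reproduced with a toy Python model. This is an illustration only: `ToyMaster`, its `meta` dict and its `pending` queue are hypothetical stand-ins, not the master's real data structures. The open is queued and applied later, the close is applied right away, so the late open writes the server back into META after the region has been closed.

```python
from collections import deque


class ToyMaster:
    def __init__(self):
        self.meta = {}          # region -> server; stands in for .META.
        self.pending = deque()  # deferred operations for the main loop

    def on_report_open(self, region, server):
        # The open is only queued; META is written when the queue is drained.
        self.pending.append((region, server))

    def on_report_close(self, region, server):
        # The close is applied immediately in this model.
        if self.meta.get(region) == server:
            del self.meta[region]

    def run_main_loop(self):
        # Drain queued opens, recording each region's server in META.
        while self.pending:
            region, server = self.pending.popleft()
            self.meta[region] = server
```

Calling `on_report_open`, then `on_report_close`, then `run_main_loop` for the same region and server leaves `meta` pointing at a server that no longer serves the region, which matches the symptom reported here: assigned in META but answering with NSRE.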
  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639180#action_12639180 ] 

stack commented on HBASE-851:
-----------------------------

Reading the j-d snippets above, this looks like HBASE-918; i.e. the open of the region 'web_pages,http://www.senior-community.net/michigan/charlevoix.htm,1219939934794' is being processed after a close.
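
To make the ordering problem concrete, here is a minimal toy model of it in Java. This is a sketch under assumptions, not the master's actual code and not necessarily how HBASE-918 addresses it: the class StaleOpenSketch, its methods, the guardStaleOpens flag and the placeholder region name are all hypothetical. It replays the sequence from the logs (open reported and queued, close reported, queued open processed afterwards) and shows that without a guard the META stand-in ends up pointing at a server that has already closed the region.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Toy model only: every name here is hypothetical, none of it is HBase master code.
public class StaleOpenSketch {
    // region name -> server recorded as hosting it (stands in for .META.)
    final Map<String, String> meta = new HashMap<>();
    // open reports queued for later processing, each entry is {region, server}
    final Queue<String[]> pendingOpens = new ArrayDeque<>();
    // regions reported closed while an open for them was still queued
    final Set<String> closedWhilePending = new HashSet<>();
    // assumption: a switch that lets the sketch show both behaviours
    final boolean guardStaleOpens;

    StaleOpenSketch(boolean guardStaleOpens) {
        this.guardStaleOpens = guardStaleOpens;
    }

    // Region server reports an open; the master only queues it.
    void reportOpen(String region, String server) {
        pendingOpens.add(new String[] {region, server});
    }

    // Region server reports a close; remember it if an open is still queued.
    void reportClose(String region) {
        meta.remove(region);
        for (String[] op : pendingOpens) {
            if (op[0].equals(region)) {
                closedWhilePending.add(region);
            }
        }
    }

    // Main processing loop: apply queued opens to the META stand-in.
    void processPendingOpens() {
        String[] op;
        while ((op = pendingOpens.poll()) != null) {
            if (guardStaleOpens && closedWhilePending.remove(op[0])) {
                continue; // the region was closed after this open was queued
            }
            meta.put(op[0], op[1]);
        }
    }

    // Replays the order seen in the logs and returns what META says afterwards.
    static String replay(boolean guard) {
        StaleOpenSketch master = new StaleOpenSketch(guard);
        String region = "web_pages,rowkey,1"; // placeholder region name
        master.reportOpen(region, "192.168.1.95:60020");
        master.reportClose(region);
        master.processPendingOpens();
        return master.meta.get(region);
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + replay(false));
        System.out.println("guarded:   " + replay(true));
    }
}
```

In the unguarded replay the META stand-in still names 192.168.1.95:60020 for a region that server has closed, which is the state that would produce the repeated NSREs; with the guard the stale open is dropped and the region is left without an entry, so it could be handed out again.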




[jira] Updated: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-851:
-------------------------------------

    Affects Version/s: 0.2.1

Was on a 0.2.1 cluster




[jira] Commented: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "Izaak Rubin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626770#action_12626770 ] 

Izaak Rubin commented on HBASE-851:
-----------------------------------

JD, could you post a bit more of the master log?  Does it keep showing INFO messages for ProcessRegionOpen?  Also, any tips on how to replicate this?




[jira] Resolved: (HBASE-851) Region is left unassigned after a split/rebalancing, throws NSRE

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman resolved HBASE-851.
---------------------------------

       Resolution: Duplicate
    Fix Version/s:     (was: 0.19.0)

