Posted to dev@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2009/10/20 02:06:59 UTC
[jira] Created: (HBASE-1920) Client with cached region location pointing at dead server does not recover
Client with cached region location pointing at dead server does not recover
---------------------------------------------------------------------------
Key: HBASE-1920
URL: https://issues.apache.org/jira/browse/HBASE-1920
Project: Hadoop HBase
Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Jonathan Gray
Priority: Critical
Fix For: 0.20.2, 0.21.0
Testing in HBASE-1908 uncovered an issue where I had a client that had cached a region location. I killed that regionserver, waited for recovery/reassignment, and then tried to scan the table again from the same client w/o restarting it.
A client normally gets a NotServingRegionException when a region is reassigned, but since this server is dead the client just got Connection Refused type exceptions. These didn't seem to trigger the client to ask META for a new region location.
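The behavior the report expects can be illustrated with a minimal sketch: a retry loop that drops the cached location and asks META again on a connection failure, just as it would when told the region has moved. Everything in it is hypothetical and written only for illustration (the class `RegionLocationRetrySketch`, the `MetaLookup` and `RegionCall` interfaces, and `RegionMovedException`, which stands in for NotServingRegionException). It is not the HBase client code and not the fix applied for this issue.

```java
import java.net.ConnectException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these types are part of the HBase client API.
class RegionLocationRetrySketch {

    /** Stand-in for asking META where a region currently lives. */
    interface MetaLookup {
        String locate(String regionName);
    }

    /** Stand-in for an RPC (for example, a scan) sent to one region server. */
    interface RegionCall<T> {
        T call(String serverAddress) throws Exception;
    }

    /** Plays the role that NotServingRegionException plays in the report. */
    static class RegionMovedException extends Exception {
        RegionMovedException(String message) {
            super(message);
        }
    }

    private final Map<String, String> locationCache = new HashMap<>();
    private final MetaLookup meta;

    RegionLocationRetrySketch(MetaLookup meta) {
        this.meta = meta;
    }

    /** Seeds the cache, as an earlier successful scan would have done. */
    void cacheLocation(String regionName, String serverAddress) {
        locationCache.put(regionName, serverAddress);
    }

    String cachedLocation(String regionName) {
        return locationCache.get(regionName);
    }

    <T> T callWithRetries(String regionName, RegionCall<T> call, int maxAttempts)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Use the cached location if present, otherwise look it up.
            String server = locationCache.computeIfAbsent(regionName, meta::locate);
            try {
                return call.call(server);
            } catch (RegionMovedException | ConnectException e) {
                // Treat a refused connection like a moved region: forget the
                // stale location so the next attempt performs a fresh lookup.
                locationCache.remove(regionName);
                last = e;
            }
        }
        throw last;
    }
}
```

In this sketch, a client whose cache points at a dead server fails its first attempt with `ConnectException`, evicts the entry, and reaches the new server on the second attempt, provided the lookup already returns the reassigned location.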
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-1920) Client with cached region location pointing at dead server does not recover
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1920:
-------------------------
Fix Version/s: (was: 0.20.2)
Moving out of 0.20.2 after chatting w/ JGray.
[jira] Commented: (HBASE-1920) Client with cached region location pointing at dead server does not recover
Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773189#action_12773189 ]
Jean-Daniel Cryans commented on HBASE-1920:
-------------------------------------------
Well, it seems I'm no longer able to reproduce it. Since I use Java 7 on that cluster, my master crashed and was unable to reassign, so I was getting retries because .META. was never updated. I have now retried scanning, killing the RS (sometimes waiting for reassignment, sometimes not), then scanning again at least 10 times, and it works fine every time.
[jira] Commented: (HBASE-1920) Client with cached region location pointing at dead server does not recover
Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773154#action_12773154 ]
Jean-Daniel Cryans commented on HBASE-1920:
-------------------------------------------
I was able to reproduce the problem by scanning from the shell, killing the RS, then scanning again. I'll try to capture the exact trace.
[jira] Commented: (HBASE-1920) Client with cached region location pointing at dead server does not recover
Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773287#action_12773287 ]
Jonathan Gray commented on HBASE-1920:
--------------------------------------
Will need to test more. I believe I did have only one regionserver remaining but need to verify.
[jira] Commented: (HBASE-1920) Client with cached region location pointing at dead server does not recover
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773196#action_12773196 ]
stack commented on HBASE-1920:
------------------------------
@jgray, does this require there to be one RS only?
[jira] Commented: (HBASE-1920) Client with cached region location pointing at dead server does not recover
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774430#action_12774430 ]
stack commented on HBASE-1920:
------------------------------
@jgray Will we move out of 0.20.2?