Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2010/06/11 20:06:15 UTC
[jira] Created: (HBASE-2712) Cached region location that went stale
won't recover if asking for first row
Cached region location that went stale won't recover if asking for first row
----------------------------------------------------------------------------
Key: HBASE-2712
URL: https://issues.apache.org/jira/browse/HBASE-2712
Project: HBase
Issue Type: Bug
Affects Versions: 0.20.4
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
Fix For: 0.20.5, 0.21.0
Let's say that:
- A client cached the location of some region, not the first one in the table
- The RS that was holding it fails
- The first thing the client does after the failure is try to reach the first row of that region
The client will never recover, since HCM.deleteCachedLocation doesn't delete the cached entry if the row we asked for is the first row in a region. This looks a lot like HBASE-1920, but there isn't enough information in that jira to say that it's the same thing.
This is a blocker, and it kills 0.20.5 RC2 (sorry).
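To make the scenario above concrete, here is a minimal toy model, not the HBase code. It assumes the cache is a sorted map keyed by region start key and that the delete path searches it with an exclusive headMap(row); that exclusive lookup is an assumption about how such a bug can arise, and every class and method name below is hypothetical.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy model of a client-side region location cache (all names hypothetical). */
public class StaleLocationSketch {

    /** A cached region: [startKey, endKey) hosted on server. Empty endKey = end of table. */
    public static final class Location {
        public final String startKey;
        public final String endKey;
        public final String server;

        public Location(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }

        boolean contains(String row) {
            return startKey.compareTo(row) <= 0
                && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    private final TreeMap<String, Location> cache = new TreeMap<>();

    public void put(Location loc) {
        cache.put(loc.startKey, loc);
    }

    /** Inclusive lookup: greatest start key <= row, then check the end key. */
    public Location getCachedLocation(String row) {
        Map.Entry<String, Location> e = cache.floorEntry(row);
        return (e != null && e.getValue().contains(row)) ? e.getValue() : null;
    }

    /** Assumed shape of the bug: headMap(row) only sees start keys strictly below row. */
    public boolean deleteExclusive(String row) {
        SortedMap<String, Location> head = cache.headMap(row);
        if (head.isEmpty()) {
            return false;
        }
        Location candidate = head.get(head.lastKey());
        if (!candidate.contains(row)) {
            // The candidate is the previous region, which ends at row: nothing is deleted.
            return false;
        }
        cache.remove(candidate.startKey);
        return true;
    }

    public static void main(String[] args) {
        StaleLocationSketch c = new StaleLocationSketch();
        c.put(new Location("", "m", "rs1"));  // first region of the table
        c.put(new Location("m", "", "rs2"));  // second region; pretend rs2 has failed
        System.out.println(c.deleteExclusive("m"));        // asking for the region's first row
        System.out.println(c.getCachedLocation("m").server); // stale entry is still served
        System.out.println(c.deleteExclusive("n"));        // an interior row does evict it
    }
}
```

In this model a retry for row "m" keeps getting the stale location, because the delete never removes it, while a retry for any other row of the same region would evict it and recover.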
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-2712) Cached region location that went
stale won't recover if asking for first row
Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877929#action_12877929 ]
HBase Review Board commented on HBASE-2712:
-------------------------------------------
Message from: "Jean-Daniel Cryans" <jd...@apache.org>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/170/
-----------------------------------------------------------
(Updated 2010-06-11 12:14:26.490410)
Review request for hbase.
Changes
-------
Fixed the potential NPE, and this one is against trunk.
Summary
-------
Patch against branch that does some refactoring and that fixes the bug at the same time. Adds a unit test that doesn't pass without the refactoring.
Urgent to review.
This addresses bug HBASE-2712.
http://issues.apache.org/jira/browse/HBASE-2712
Diffs (updated)
-----
/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 953830
/trunk/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java PRE-CREATION
Diff: http://review.hbase.org/r/170/diff
Testing
-------
Thanks,
Jean-Daniel
[jira] Commented: (HBASE-2712) Cached region location that went
stale won't recover if asking for first row
Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877922#action_12877922 ]
HBase Review Board commented on HBASE-2712:
-------------------------------------------
Message from: "Jonathan Gray" <jg...@apache.org>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/170/#review189
-----------------------------------------------------------
/branches/0.20/src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/170/#comment866>
This looks like a much better approach :)
Isn't there an NPE if this returns null?
- Jonathan
[jira] Updated: (HBASE-2712) Cached region location that went stale
won't recover if asking for first row
Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans updated HBASE-2712:
--------------------------------------
Attachment: HBASE-2712-trunk.patch
Final patch from rb.
[jira] Resolved: (HBASE-2712) Cached region location that went
stale won't recover if asking for first row
Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans resolved HBASE-2712.
---------------------------------------
Hadoop Flags: [Reviewed]
Resolution: Fixed
Committed to branch and trunk.
[jira] Commented: (HBASE-2712) Cached region location that went
stale won't recover if asking for first row
Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877924#action_12877924 ]
HBase Review Board commented on HBASE-2712:
-------------------------------------------
Message from: "Jean-Daniel Cryans" <jd...@apache.org>
> On 2010-06-11 11:52:48, stack wrote:
> > /branches/0.20/src/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 850
> > <http://review.hbase.org/r/170/diff/1/?file=1246#file1246line850>
> >
> > getCachedLocation does the right thing?
Yep, that one did, so that's why we should reuse it instead.
> On 2010-06-11 11:52:48, stack wrote:
> > /branches/0.20/src/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 851
> > <http://review.hbase.org/r/170/diff/1/?file=1246#file1246line851>
> >
> > rl will never be null?
Doh
- Jean-Daniel
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/170/#review188
-----------------------------------------------------------
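The two review points above (reuse the lookup that already behaves correctly, and guard against a null result) can be sketched roughly as follows. This is an illustration under assumptions, not the committed patch: the cache layout, the String keys, and all names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy cache whose delete path reuses the lookup path (all names hypothetical). */
public class SharedLookupSketch {

    /** Region start key -> {end key, server}; an empty end key means "end of table". */
    private final TreeMap<String, String[]> cache = new TreeMap<>();

    public void put(String startKey, String endKey, String server) {
        cache.put(startKey, new String[] {endKey, server});
    }

    /** Returns the start key of the cached region containing row, or null on a miss. */
    public String getCachedStartKey(String row) {
        Map.Entry<String, String[]> e = cache.floorEntry(row); // inclusive: key <= row
        if (e == null) {
            return null;
        }
        String endKey = e.getValue()[0];
        return (endKey.isEmpty() || row.compareTo(endKey) < 0) ? e.getKey() : null;
    }

    /** Delete through the same lookup, so both paths agree on which region owns a row. */
    public boolean deleteCachedLocation(String row) {
        String startKey = getCachedStartKey(row);
        if (startKey == null) {
            return false; // cache miss: nothing to delete, and no NullPointerException
        }
        cache.remove(startKey);
        return true;
    }

    public static void main(String[] args) {
        SharedLookupSketch c = new SharedLookupSketch();
        c.put("", "m", "rs1");
        c.put("m", "", "rs2");
        System.out.println(c.deleteCachedLocation("m")); // first row of the second region
        System.out.println(c.deleteCachedLocation("m")); // second call hits the null guard
    }
}
```

Sharing one lookup means a boundary case such as "row equals the region start key" is decided in a single place, and the null check covers the case where nothing is cached for the row.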
[jira] Commented: (HBASE-2712) Cached region location that went
stale won't recover if asking for first row
Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877921#action_12877921 ]
HBase Review Board commented on HBASE-2712:
-------------------------------------------
Message from: stack@duboce.net
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/170/#review188
-----------------------------------------------------------
/branches/0.20/src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/170/#comment865>
getCachedLocation does the right thing?
/branches/0.20/src/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<http://review.hbase.org/r/170/#comment864>
rl will never be null?
- stack
[jira] Commented: (HBASE-2712) Cached region location that went
stale won't recover if asking for first row
Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877899#action_12877899 ]
Jonathan Gray commented on HBASE-2712:
--------------------------------------
This could very much be what I tripped over in HBASE-1920. Let's keep the other jira open for now. Good catch JD.
[jira] Commented: (HBASE-2712) Cached region location that went
stale won't recover if asking for first row
Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877933#action_12877933 ]
HBase Review Board commented on HBASE-2712:
-------------------------------------------
Message from: "Jonathan Gray" <jg...@apache.org>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/170/#review193
-----------------------------------------------------------
Ship it!
looks good to me
- Jonathan
[jira] Commented: (HBASE-2712) Cached region location that went
stale won't recover if asking for first row
Posted by "HBase Review Board (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877914#action_12877914 ]
HBase Review Board commented on HBASE-2712:
-------------------------------------------
Message from: "Jean-Daniel Cryans" <jd...@apache.org>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/170/
-----------------------------------------------------------
Review request for hbase.
Summary
-------
Patch against branch that does some refactoring and that fixes the bug at the same time. Adds a unit test that doesn't pass without the refactoring.
Urgent to review.
This addresses bug HBASE-2712.
http://issues.apache.org/jira/browse/HBASE-2712
Diffs
-----
/branches/0.20/src/java/org/apache/hadoop/hbase/client/HConnectionManager.java 953796
/branches/0.20/src/test/org/apache/hadoop/hbase/client/TestHCM.java PRE-CREATION
Diff: http://review.hbase.org/r/170/diff
Testing
-------
Thanks,
Jean-Daniel