Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2011/03/17 21:20:30 UTC

[jira] Created: (HBASE-3666) TestScannerTimeout fails occasionally

TestScannerTimeout fails occasionally
-------------------------------------

                 Key: HBASE-3666
                 URL: https://issues.apache.org/jira/browse/HBASE-3666
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.90.1
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
             Fix For: 0.90.2


If I loop TestScannerTimeout, it eventually fails with:

org.apache.hadoop.hbase.regionserver.LeaseException: org.apache.hadoop.hbase.regionserver.LeaseException: lease '-4526340287831625207' does not exist
        at org.apache.hadoop.hbase.regionserver.Leases.cancelLease(Leases.java:209)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1816)
...
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:83)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1003)
        at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1103)
        at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1175)
        at org.apache.hadoop.hbase.client.TestScannerTimeout.test2772(TestScannerTimeout.java:133)

I think the issue is a race in HRegionServer.next: at the top of the function the scanner still exists, but by the time it reaches cancelLease, the lease has timed out.
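
The suspected race is a check-then-act pattern: the lease is confirmed to exist, then expires before it is cancelled. Below is a minimal sketch of that interleaving, run sequentially so it is deterministic. LeaseRaceSketch and all of its methods are hypothetical names made up for illustration; this is not HBase's Leases class and not the attached patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for a lease table; NOT HBase's Leases class. */
class LeaseRaceSketch {
    private final ConcurrentMap<String, Long> leases = new ConcurrentHashMap<>();

    /** Registers a lease under the given name. */
    void createLease(String name) {
        leases.put(name, System.nanoTime());
    }

    /** Simulates the expiry thread dropping a timed-out lease. */
    void expireLease(String name) {
        leases.remove(name);
    }

    /** The "check" half of check-then-act. */
    boolean exists(String name) {
        return leases.containsKey(name);
    }

    /** Strict cancel: throws if the lease is already gone. */
    void cancelLeaseStrict(String name) {
        if (leases.remove(name) == null) {
            throw new IllegalStateException("lease '" + name + "' does not exist");
        }
    }

    /** Tolerant cancel: one atomic remove, reports whether the lease was still present. */
    boolean cancelLeaseIfPresent(String name) {
        return leases.remove(name) != null;
    }

    public static void main(String[] args) {
        LeaseRaceSketch table = new LeaseRaceSketch();
        table.createLease("scanner-1");

        // Top of the function: the scanner's lease is still there.
        boolean seenAtTop = table.exists("scanner-1");

        // The lease times out between the check and the cancel.
        table.expireLease("scanner-1");

        try {
            table.cancelLeaseStrict("scanner-1");
        } catch (IllegalStateException e) {
            System.out.println("seen at top: " + seenAtTop + ", strict cancel failed: " + e.getMessage());
        }

        // The tolerant variant lets the caller decide how to handle a lost lease.
        System.out.println("tolerant cancel removed a lease: " + table.cancelLeaseIfPresent("scanner-1"));
    }
}
```

The tolerant variant is only one way to close the window; whether a missing lease should be ignored, retried, or reported to the client as a scanner timeout is a design decision this sketch does not settle.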

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HBASE-3666) TestScannerTimeout fails occasionally

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008115#comment-13008115 ] 

stack commented on HBASE-3666:
------------------------------

+1, but I would make the message more generic. Can checkOpen fail because the region is closing, or for some reason other than shutdown?


[jira] Commented: (HBASE-3666) TestScannerTimeout fails occasionally

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008219#comment-13008219 ] 

Todd Lipcon commented on HBASE-3666:
------------------------------------

If the fs went away, it will also have aborted, in which case the RS is also shutting down :)


[jira] [Commented] (HBASE-3666) TestScannerTimeout fails occasionally

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010897#comment-13010897 ] 

stack commented on HBASE-3666:
------------------------------

True. I'm committing your patch as is, Todd.


[jira] [Commented] (HBASE-3666) TestScannerTimeout fails occasionally

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011492#comment-13011492 ] 

Hudson commented on HBASE-3666:
-------------------------------

Integrated in HBase-TRUNK #1814 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1814/])
    


[jira] Commented: (HBASE-3666) TestScannerTimeout fails occasionally

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008214#comment-13008214 ] 

stack commented on HBASE-3666:
------------------------------

It looks like it's usually shutdown only, but it seems it could also be because we determined the fs went away.


[jira] [Resolved] (HBASE-3666) TestScannerTimeout fails occasionally

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3666.
--------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

Applied to branch and trunk.


[jira] Updated: (HBASE-3666) TestScannerTimeout fails occasionally

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HBASE-3666:
-------------------------------

    Attachment: hbase-3666.txt

Proposed fix.


[jira] Commented: (HBASE-3666) TestScannerTimeout fails occasionally

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008117#comment-13008117 ] 

Todd Lipcon commented on HBASE-3666:
------------------------------------

checkOpen in this case is on the HRegionServer, not the region, so I think it only fails on shutdown.
