Posted to dev@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2008/09/29 05:10:44 UTC

[jira] Created: (HBASE-907) getScanner hangs with some startRows that are found if scanning entire table

getScanner hangs with some startRows that are found if scanning entire table
----------------------------------------------------------------------------

                 Key: HBASE-907
                 URL: https://issues.apache.org/jira/browse/HBASE-907
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.18.0, 0.18.1, 0.19.0
            Reporter: Jonathan Gray
            Priority: Critical
             Fix For: 0.18.1, 0.19.0


I have a table with 8-byte binary row keys.  There are a few hundred thousand rows, each with two families and between 1k and 50k of total data across about 15 columns.

When attempting to get a scanner using a specified startRow, my client freezes on the HT.getScanner(cols,row) call, with no exception ever thrown and no debug output in any server logs.
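
While a hang like this is being investigated, one way to keep a client from blocking indefinitely is to run the call on a worker thread and wait for it with a deadline.  The following is a minimal sketch using only java.util.concurrent; BoundedCall and callWithTimeout are hypothetical names and are not part of the HBase client.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of HBase: bounds how long a caller waits on a blocking call.
final class BoundedCall {
    static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit) throws Exception {
        // Daemon worker thread, so a stuck call cannot keep the JVM alive on its own.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            // Throws TimeoutException if the task has not finished by the deadline.
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Best effort: this only interrupts the worker; code that ignores
            // interruption may keep running in the background.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller could pass the scanner request as the task, for example a lambda that calls getScanner(cols, row) on its own table object, and treat a TimeoutException as the signal to log the start row that was being requested.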

If I get a scanner with HT.getScanner(cols) and then iterate through, I eventually reach the row I was seeking.
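
The expectation behind this report is that both access paths land on the same row.  The sketch below states that expectation with an in-memory model; it is not HBase client code.  ScanModel and its methods are hypothetical names, and it assumes row keys are ordered as unsigned bytes.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory model of a sorted table keyed by binary row keys (an assumption, not HBase code).
final class ScanModel {
    // Unsigned lexicographic byte order, the ordering assumed here for binary row keys.
    static final Comparator<byte[]> UNSIGNED_BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    };

    // Encode a long as an 8-byte big-endian row key.
    static byte[] key(long id) {
        return ByteBuffer.allocate(8).putLong(id).array();
    }

    // Path 1: seek directly to startRow and return the first row at or after it.
    static byte[] firstRowFrom(NavigableMap<byte[], String> table, byte[] startRow) {
        Map.Entry<byte[], String> entry = table.ceilingEntry(startRow);
        return entry == null ? null : entry.getKey();
    }

    // Path 2: scan from the first row and stop at the first row at or after startRow.
    static byte[] firstRowByFullScan(NavigableMap<byte[], String> table, byte[] startRow) {
        for (byte[] row : table.keySet()) {
            if (UNSIGNED_BYTES.compare(row, startRow) >= 0) return row;
        }
        return null;
    }

    // A small sample table; the ids are arbitrary illustration values.
    static NavigableMap<byte[], String> sampleTable() {
        NavigableMap<byte[], String> table = new TreeMap<>(UNSIGNED_BYTES);
        for (long id : new long[] {1L, 42L, 255L, 256L, 1L << 40}) {
            table.put(key(id), "row-" + id);
        }
        return table;
    }
}
```

In this model the two paths agree for every start row, which is the property the hanging getScanner(cols,row) call appears to violate.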

Some rows can be found, some cannot.  At this point I'm not able to distinguish anything special about the ones that cause the client to hang.

At first I thought this was only a problem with 0.19 trunk, as a downgrade to 0.18 resolved the issue for a particular key.  However, other keys still have this issue on the 0.18 branch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HBASE-907) getScanner hangs with some startRows that are found if scanning entire table

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray resolved HBASE-907.
---------------------------------

    Resolution: Cannot Reproduce


[jira] Commented: (HBASE-907) getScanner hangs with some startRows that are found if scanning entire table

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635472#action_12635472 ] 

stack commented on HBASE-907:
-----------------------------

Our retry mechanism is masking the real issue by throwing an NPE on the last retry.  Need to fix that.  That said, is there anything earlier in the log, Jon?  The exception should at least say what the problematic row was so we could try it in the client.
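
One possible shape for such a message is sketched below: the row is rendered defensively, so a null or binary key cannot make the message itself fail.  RetryMessageSketch, describeRow and buildMessage are hypothetical names and wording chosen for illustration; this is not the actual RetriesExhaustedException code.

```java
import java.util.List;

// Hypothetical helper, not the actual HBase implementation.
final class RetryMessageSketch {
    // Render a row key for a log or exception message.  A null key becomes "<null>";
    // printable ASCII bytes are kept and all other bytes are written as \xNN.
    static String describeRow(byte[] row) {
        if (row == null) {
            return "<null>";
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : row) {
            int v = b & 0xff;
            if (v >= 0x20 && v < 0x7f && v != '\\') {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02X", v));
            }
        }
        return sb.toString();
    }

    // Build a message that names the row and lists the failure seen on each attempt.
    static String buildMessage(String server, byte[] row, int numTries,
                               List<? extends Throwable> failures) {
        StringBuilder sb = new StringBuilder();
        sb.append("Failed after ").append(numTries).append(" attempts contacting ")
          .append(server == null ? "<unknown server>" : server)
          .append(" for row '").append(describeRow(row)).append("'");
        if (failures != null) {
            for (Throwable t : failures) {
                sb.append("\n  caused by: ").append(t);
            }
        }
        return sb.toString();
    }
}
```

Escaping non-printable bytes matters for a table with binary row keys like the one in this issue, since decoding such a key as text would not give something that can be pasted back into a client.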


[jira] Commented: (HBASE-907) getScanner hangs with some startRows that are found if scanning entire table

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636128#action_12636128 ] 

Jonathan Gray commented on HBASE-907:
-------------------------------------

No longer able to recreate issue.

Things have changed in the client code that triggered this.  Still unsure whether there was a bug in the client (not really sure how it would make rows visible with one method and not the other) or I have not gotten the table as big as it was when I experienced this.

Would like to wait another week or two before closing this issue.  Definitely not a blocker at this point though.


[jira] Commented: (HBASE-907) getScanner hangs with some startRows that are found if scanning entire table

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635276#action_12635276 ] 

Jonathan Gray commented on HBASE-907:
-------------------------------------

Just received an exception in my client.  I don't think this ever happened when I was running 0.19, but maybe I didn't leave it long enough.  This took >10 minutes.

Exception in thread "main" java.lang.NullPointerException
        at java.lang.String.<init>(String.java:523)
        at org.apache.hadoop.hbase.util.Bytes.toString(Bytes.java:75)
        at org.apache.hadoop.hbase.client.RetriesExhaustedException.getMessage(RetriesExhaustedException.java:50)
        at org.apache.hadoop.hbase.client.RetriesExhaustedException.<init>(RetriesExhaustedException.java:40)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:863)
        at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1163)
        at org.apache.hadoop.hbase.client.HTable$ClientScanner.initialize(HTable.java:1108)
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:733)
        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:575)
        at ItemTest.getItemSource(ItemTest.java:37)
        at ItemTest.main(ItemTest.java:13)
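
The trace shows the NullPointerException coming from String.<init> while the exception message was being built, which is consistent with a null byte array being decoded.  The sketch below reproduces that pattern in isolation to show how it hides the underlying failure.  MaskedFailureDemo and RowFailure are hypothetical stand-ins, not HBase classes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in classes that show the masking pattern only.
final class MaskedFailureDemo {
    // An exception whose message is built from a row key.
    static final class RowFailure extends RuntimeException {
        RowFailure(byte[] row, Throwable cause) {
            super(message(row), cause);
        }

        private static String message(byte[] row) {
            // Throws NullPointerException when row is null, before RowFailure exists.
            return "failed for row " + new String(row, StandardCharsets.UTF_8);
        }
    }

    // Returns the exception that actually escapes when wrapping cause for row.
    static Throwable whatEscapes(byte[] row, IOException cause) {
        try {
            throw new RowFailure(row, cause);
        } catch (RuntimeException e) {
            return e;
        }
    }
}
```

With a non-null row the caller sees a RowFailure that carries the original cause.  With a null row the caller sees only a NullPointerException, and the original IOException is lost.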


[jira] Commented: (HBASE-907) getScanner hangs with some startRows that are found if scanning entire table

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635277#action_12635277 ] 

Jonathan Gray commented on HBASE-907:
-------------------------------------

Restarting hbase fixes the issue.

Could this be related to the old binary key issues?

The inserts are being done differently than the ones I used in testing the other issues with splitting and binary keys.  These were done via MapReduce, but actually inserted using the API and not TableReduce.
