Posted to dev@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2008/09/04 19:09:44 UTC

[jira] Created: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Incrementing binary rows cause strange behavior once table splits
-----------------------------------------------------------------

                 Key: HBASE-868
                 URL: https://issues.apache.org/jira/browse/HBASE-868
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.2.1
         Environment: 4 nodes: 1 master/namenode, 3 regionserver/datanodes
            Reporter: Jonathan Gray
            Priority: Blocker
             Fix For: 0.2.1


We're now using incrementing binary row keys; up to this point we had only been doing small tests with them and had never actually had a table split.

I'm still working through the logs, but it seems there's a problem somewhere with startKeys and endKeys.

Binary keys in general are not well supported (inconsistent display in the logs, very odd rendering in the web UI, hard to interpret in the shell, etc.).  Once we figure out these serious bugs we will spend some time trying to clean that up, but right now it makes things even harder to debug.
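
To make the rendering problem concrete, here is a minimal sketch (the class and helper names are just for illustration, and this is not what HBase does today): a key containing raw binary bytes prints as invisible control characters when naively turned into a String, while a hex-escaped rendering stays readable.

{code}
public class BinaryKeyDisplay {
  // Hex-escape non-printable bytes so binary row keys stay readable in logs or a UI.
  static String toReadable(byte[] key) {
    StringBuilder sb = new StringBuilder();
    for (byte b : key) {
      int v = b & 0xff;
      if (v >= 0x20 && v < 0x7f) {
        sb.append((char) v);                     // printable ASCII as-is
      } else {
        sb.append(String.format("\\x%02x", v));  // everything else as \xNN
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    byte[] key = new byte[] {0x63, 0x00, 0x01};  // 'c' followed by two binary bytes
    System.out.println(new String(key));   // 'c' plus invisible control characters
    System.out.println(toReadable(key));   // c\x00\x01
  }
}
{code}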

The actual symptoms are that my import eventually started to throw (in the client and on the region server):

org.apache.hadoop.hbase.regionserver.WrongRegionException: org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for HRegion sources,,1220546297947, startKey='', getEndKey()='
�', row='c�'
        at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:1775)
        at org.apache.hadoop.hbase.regionserver.HRegion.obtainRowLock(HRegion.java:1831)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchUpdate(HRegion.java:1387)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1145)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
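
As a rough illustration of the check behind this exception (this is not the actual HRegion.checkRow code, and the key values below are made up): a row is rejected when it falls outside the region's [startKey, endKey) range under unsigned, lexicographic byte comparison.

{code}
public class RangeCheckSketch {
  // Unsigned lexicographic comparison of two byte arrays.
  static int compare(byte[] left, byte[] right) {
    int len = Math.min(left.length, right.length);
    for (int i = 0; i < len; i++) {
      int diff = (left[i] & 0xff) - (right[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return left.length - right.length;
  }

  // Empty startKey/endKey mean the region is unbounded on that side.
  static boolean rowInRange(byte[] row, byte[] startKey, byte[] endKey) {
    boolean afterStart = startKey.length == 0 || compare(row, startKey) >= 0;
    boolean beforeEnd = endKey.length == 0 || compare(row, endKey) < 0;
    return afterStart && beforeEnd;
  }

  public static void main(String[] args) {
    byte[] start = new byte[0];                  // startKey = ''
    byte[] end = new byte[] {0x00, 0x01, 0x0a};  // an end key of raw binary bytes
    byte[] row = new byte[] {0x63, 0x01};        // 'c' followed by a binary byte
    // false: the row sorts after the end key, so a correct region lookup
    // should never have routed it here -- the symptom in the exception above.
    System.out.println(rowInRange(row, start, end));
  }
}
{code}

One way to end up with this error is if region boundaries are stored or compared under a different ordering than the raw-byte one used by the range check; the comparator discussion further down points in that direction.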


There are 3 regionservers, but this error only happens on one of them (while the other two always continue normally, allowing updates to this same table).

The regionserver this happens on is special for two reasons: it is hosting the META table, and it also hosts the first region in the table (startkey = '').  I'm unsure which is the cause; I have a clue pointing to each.

After about 15 minutes, the regionserver sees:

2008-09-04 09:52:57,948 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region .META.,,1. Current region memcache size 24.5k
2008-09-04 09:52:58,003 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Added /hbase/.META./1028785192/historian/mapfiles/8699673838203663799 with 106 entries, sequence id 25341510, data size 8.9k, file size 10.6k
2008-09-04 09:52:58,050 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Added /hbase/.META./1028785192/info/mapfiles/1791564557665476834 with 96 entries, sequence id 25341510, data size 14.2k, file size 15.8k
2008-09-04 09:52:58,050 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region .META.,,1 in 102ms, sequence id=25341510, compaction requested=true
2008-09-04 09:52:58,050 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region: .META.,,1
2008-09-04 09:52:58,051 INFO org.apache.hadoop.hbase.regionserver.HRegion: starting compaction on region .META.,,1
2008-09-04 09:52:58,055 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Compaction size of 1028785192/historian: 41.9k; Skipped 1 files , size: 21957
2008-09-04 09:52:58,088 DEBUG org.apache.hadoop.hbase.regionserver.HStore: started compaction of 2 files into /hbase/.META./compaction.dir/1028785192/historian/mapfiles/6948796056606699674
2008-09-04 09:52:58,128 DEBUG org.apache.hadoop.hbase.regionserver.HStore: moving /hbase/.META./compaction.dir/1028785192/historian/mapfiles/6948796056606699674 to /hbase/.META./1028785192/historian/mapfiles/75733875840914142
2008-09-04 09:52:58,175 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 1028785192/historian store size is 41.1k; time since last major compaction: 5426 seconds
2008-09-04 09:52:58,179 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Compaction size of 1028785192/info: 61.9k; Skipped 0 files , size: 0
2008-09-04 09:52:58,192 DEBUG org.apache.hadoop.hbase.regionserver.HStore: started compaction of 3 files into /hbase/.META./compaction.dir/1028785192/info/mapfiles/7781013568996125923
2008-09-04 09:52:58,260 DEBUG org.apache.hadoop.hbase.regionserver.HStore: moving /hbase/.META./compaction.dir/1028785192/info/mapfiles/7781013568996125923 to /hbase/.META./1028785192/info/mapfiles/2187291308709057119
2008-09-04 09:52:58,290 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 1028785192/info store size is 61.0k; time since last major compaction: 32534 seconds
2008-09-04 09:52:58,296 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region .META.,,1 in 0sec
2008-09-04 09:53:09,620 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -2085474968086468199 lease expired
2008-09-04 09:54:35,449 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 30009


Following this, insertion continues normally.  This leads me to believe there's an issue with the META table memcache, but oddly the other regions of this table on the other regionservers continue fine.

As for this regionserver hosting the first region of the table: it seems consistent that I get the row-out-of-range errors when looking for the region with startkey = '', even though there are 5 other regions on this RS.

Will attach full logs from the master and the three RS, plus a couple of screenshots showing weird behavior when listing the regions in the table.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629050#action_12629050 ] 

Jonathan Gray commented on HBASE-868:
-------------------------------------

Could the issue be in BloomFilterMapFile.Reader and/or HbaseMapWritable (which it extends)?

HMW implements a SortedMap<byte[],V>, which was the culprit before.  We have nothing pointing it to our custom comparator.
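
A minimal sketch of what that would mean in practice (the class below is illustrative, not the HbaseMapWritable code, and RAW_BYTES is a stand-in for our custom row comparator): a SortedMap keyed on byte[] only orders binary keys correctly if it is constructed with an explicit raw-byte comparator; left to defaults it either fails outright (byte[] is not Comparable) or, if someone compares signed bytes, misplaces keys with the high bit set.

{code}
import java.util.Comparator;
import java.util.SortedMap;
import java.util.TreeMap;

public class ByteKeyMapSketch {
  // Unsigned, lexicographic byte[] comparator -- a stand-in for the custom
  // row-key comparator the map would need to be handed.
  static final Comparator<byte[]> RAW_BYTES = new Comparator<byte[]>() {
    public int compare(byte[] a, byte[] b) {
      int len = Math.min(a.length, b.length);
      for (int i = 0; i < len; i++) {
        int diff = (a[i] & 0xff) - (b[i] & 0xff);
        if (diff != 0) return diff;
      }
      return a.length - b.length;
    }
  };

  public static void main(String[] args) {
    // new TreeMap<byte[], String>() with no comparator would throw
    // ClassCastException on insert, because byte[] is not Comparable.
    SortedMap<byte[], String> map = new TreeMap<byte[], String>(RAW_BYTES);
    map.put(new byte[] {(byte) 0x80}, "high byte");
    map.put(new byte[] {0x01}, "low byte");
    // Unsigned order prints 1 then 128; a signed-byte comparison would have
    // reversed them, since (byte) 0x80 == -128.
    for (byte[] k : map.keySet()) {
      System.out.println(k[0] & 0xff);
    }
  }
}
{code}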




[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629004#action_12629004 ] 

Jonathan Gray commented on HBASE-868:
-------------------------------------

Jim, this is actually running off of 0.18.

We're not seeing the dropped snapshot anymore, so I'm not sure that is still relevant.




[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-868:
--------------------------------

        Fix Version/s: 0.18.0
    Affects Version/s: 0.18.0
               Status: Patch Available  (was: Reopened)




[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-868:
--------------------------------

    Attachment: hbase-868-0.2-v2.patch




[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629013#action_12629013 ] 

stack commented on HBASE-868:
-----------------------------

I was able to reproduce the jgray failures by doing this in PerformanceEvaluation (PE):

{code}
Index: src/test/org/apache/hadoop/hbase/PerformanceEvaluation.java
===================================================================
--- src/test/org/apache/hadoop/hbase/PerformanceEvaluation.java (revision 692728)
+++ src/test/org/apache/hadoop/hbase/PerformanceEvaluation.java (working copy)
@@ -481,7 +481,8 @@
     byte [] b = new byte[10];
     int d = Math.abs(number);
     for (int i = b.length - 1; i >= 0; i--) {
-      b[i] = (byte)((d % 10) + '0');
+      // b[i] = (byte)((d % 10) + '0');
+      b[i] = (byte)((d % 10));
       d /= 10;
     }
     return b;
{code}

We need to add a unit test with binary keys... as part of HBASE-859
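
To make concrete what that change produces (the class and method names below are just illustrative; the body mirrors the patched hunk above): row number 123 used to become the ASCII key "0000000123", whereas the patched version emits ten raw bytes in 0x00..0x09, i.e. unprintable binary keys that exercise the same code paths as jgray's incrementing binary rows.

{code}
import java.util.Arrays;

public class BinaryKeyFormat {
  // Same logic as the patched hunk above: each decimal digit is stored as
  // the raw byte 0..9 instead of the ASCII character '0'..'9'.
  static byte[] format(int number) {
    byte[] b = new byte[10];
    int d = Math.abs(number);
    for (int i = b.length - 1; i >= 0; i--) {
      b[i] = (byte) (d % 10);
      d /= 10;
    }
    return b;
  }

  public static void main(String[] args) {
    // Prints [0, 0, 0, 0, 0, 0, 0, 1, 2, 3]; none of these bytes are printable,
    // so logs, the web UI, and the shell all render such keys as garbage.
    System.out.println(Arrays.toString(format(123)));
  }
}
{code}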




[jira] Assigned: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray reassigned HBASE-868:
-----------------------------------

    Assignee: Jonathan Gray




[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-868:
--------------------------------

    Attachment: hbase-868-0.2-v5.patch

Patch appears to fix all issues.  Will confirm.




[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-868:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Already fixed; the remaining issue is tracked in HBASE-877.



[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629047#action_12629047 ] 

Jim Kellerman commented on HBASE-868:
-------------------------------------

Doesn't HSK.compareTo do the "right thing"? If not, why can't it?



[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629026#action_12629026 ] 

Jonathan Gray commented on HBASE-868:
-------------------------------------

Thanks again to JD and stack for their help over the past four days; you guys definitely went above and beyond to help get this bug squashed.  Much appreciated.



[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-868:
--------------------------------

    Attachment: hbase-868-0.2-v4.patch

This patch switches many more calls from Bytes.compareTo and Bytes.equals to HStoreKey.compareTwoRowKeys and HStoreKey.equalsTwoRowKeys.

That fixes a great deal of the issues...

Right now, we can successfully get past the first split.  After the second split, the first region gets lost from the META table.  Often the historian family still exists but the info family is gone completely.

This patch also adds a lot of debug output spread across Memcache, HScanner, SFS, and a few other places, to figure out whether our scans are missing something.

JD and I originally thought that the row for the first region was still in META but that the scanner was missing it for some reason.  Looking at it now with the debug output, it seems the scanner sees old META rows with the splitA/splitB info in them, and then a start-row META row for the split table with only historian info.  That may have been throwing us off.

I'm a bit stumped at this point as to whether the scanner is broken, whether we are deleting too much when we remove the "mother" region, or whether it's something else altogether.
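As a rough illustration of the idea behind these switches (this is a sketch only; the class and helper names are made up, not the actual HBase code): every place that decides whether a row belongs to a region has to route the comparison through the one shared row comparator (HStoreKey.compareTwoRowKeys in the real patch) instead of comparing raw bytes inline, or the range checks disagree with the boundaries the split code computed.

    // Sketch only: made-up class and helper names, not the HBASE-868 patch itself.
    public class RowRangeCheckSketch {

      // Unsigned lexicographic byte[] comparison, roughly what Bytes.compareTo does.
      static int compareRows(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
          int d = (a[i] & 0xff) - (b[i] & 0xff);
          if (d != 0) {
            return d;
          }
        }
        return a.length - b.length;
      }

      // A checkRow-style range test.  The important part is that this test, the
      // split code, and the scanners all call the same comparison routine.
      static boolean rowInRange(byte[] row, byte[] startKey, byte[] endKey) {
        boolean afterStart = startKey.length == 0 || compareRows(row, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0 || compareRows(row, endKey) < 0;
        return afterStart && beforeEnd;
      }

      public static void main(String[] args) {
        byte[] start = new byte[0];               // '' start key of the first region
        byte[] end = new byte[] { 0x00, 0x05 };   // a binary end key
        byte[] row = new byte[] { 0x00, 0x01 };   // an incrementing binary row key
        System.out.println(rowInRange(row, start, end));  // prints true
      }
    }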



[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629031#action_12629031 ] 

Jonathan Gray commented on HBASE-868:
-------------------------------------

There's still something broken here.  It's not related to the splitting and all that; that still appears to be fixed.

Now there is an issue with obtaining an HTable reference to the same table in question.  The exception is:

org.apache.hadoop.hbase.TableNotFoundException: items
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:508)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:460)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:420)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:130)
        at HBaseRef.<init>(HBaseRef.java:29)
        at Import.<init>(Import.java:20)
        at Import.main(Import.java:26)

Will post more info as I discover it.
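For reference, the failing call is just the plain table open; here is a minimal sketch of the client side (this assumes the 0.2-era HTable constructor taking a configuration and a table name, so treat the exact signature as an assumption):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class OpenItems {
      public static void main(String[] args) throws Exception {
        // Opening the table walks HConnectionManager.locateRegion ->
        // locateRegionInMeta, which is where the TableNotFoundException
        // above is thrown once META has lost the table's regions.
        HBaseConfiguration conf = new HBaseConfiguration();
        HTable items = new HTable(conf, "items");
        System.out.println("opened " + items);
      }
    }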



[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-868:
--------------------------------

    Attachment: hbase-868-0.2-v3.patch

Comparator is called now; thanks JD.

The current issue with this patch is that, following the first split, the META table eventually loses all references to the new split regions.



[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629049#action_12629049 ] 

Jonathan Gray commented on HBASE-868:
-------------------------------------

It does do the right thing.  The issue is that it's not always being used.

We have almost certainly changed every Bytes.compareTo/Bytes.equals that needed it to HSK.compareTwoRowKeys/HSK.equalsTwoRowKeys.

The issue now is that many things use a SortedMap/TreeMap to iterate through results.  When those maps are instantiated, if they will be keyed on row keys, they must be passed the HSK comparator that was added with this patch.
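A minimal sketch of that point (the comparators below are illustrative stand-ins, not the actual HStoreKey comparator): a TreeMap keyed on byte[] row keys takes whatever ordering it is handed at construction time, so two maps built with different comparators can disagree about which binary key comes first, which is exactly the kind of inconsistency that makes rows disappear from scans.

    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.TreeMap;

    // Sketch only: illustrative comparators, not the comparator added by the patch.
    public class RowKeyMapSketch {

      // Unsigned lexicographic order over the raw bytes.
      static final Comparator<byte[]> UNSIGNED = new Comparator<byte[]>() {
        public int compare(byte[] a, byte[] b) {
          int len = Math.min(a.length, b.length);
          for (int i = 0; i < len; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
              return d;
            }
          }
          return a.length - b.length;
        }
      };

      // A naive signed-byte comparator; it looks plausible but orders binary row
      // keys (high bit set) differently from the one above.
      static final Comparator<byte[]> SIGNED = new Comparator<byte[]>() {
        public int compare(byte[] a, byte[] b) {
          int len = Math.min(a.length, b.length);
          for (int i = 0; i < len; i++) {
            int d = a[i] - b[i];
            if (d != 0) {
              return d;
            }
          }
          return a.length - b.length;
        }
      };

      public static void main(String[] args) {
        byte[] low = new byte[] { 0x01 };
        byte[] high = new byte[] { (byte) 0x80 };  // binary key with the high bit set

        // byte[] is not Comparable, so every TreeMap keyed on row keys must be
        // handed a comparator at construction time, and it has to be the same
        // comparator the rest of the code uses.
        TreeMap<byte[], String> a = new TreeMap<byte[], String>(UNSIGNED);
        TreeMap<byte[], String> b = new TreeMap<byte[], String>(SIGNED);
        for (TreeMap<byte[], String> m : Arrays.asList(a, b)) {
          m.put(low, "low");
          m.put(high, "high");
        }
        // Prints "low / high": the two orderings disagree on the first key.
        System.out.println(a.firstEntry().getValue() + " / " + b.firstEntry().getValue());
      }
    }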



[jira] Reopened: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray reopened HBASE-868:
---------------------------------


Needs another one-line patch to fix HCM (HConnectionManager).  The patch is actually in HStore.



[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628828#action_12628828 ] 

Jonathan Gray commented on HBASE-868:
-------------------------------------

Using this patch, I don't see any of the added debug output anywhere.

Here is where it breaks.  During the first memcache flush of .META., the Flusher logs "Replay of hlog required" and then a DroppedSnapshotException caused by a "key out of order" IOException:

2008-09-05 19:09:12,202 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 30001
2008-09-05 19:09:12,215 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.110_1220665441919_60020/hlog.dat.1220666931124
2008-09-05 19:09:12,218 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.110_1220665441919_60020/hlog.dat.1220666952215
2008-09-05 19:14:02,219 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 14234
2008-09-05 19:14:02,230 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Closing current log writer /hbase/log_72.34.249.110_1220665441919_60020/hlog.dat.1220666941412
2008-09-05 19:14:02,233 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_72.34.249.110_1220665441919_60020/hlog.dat.1220667242230
2008-09-05 19:14:22,342 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region .META.,,1. Current region memcache size 23.6k
2008-09-05 19:14:22,362 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2008-09-05 19:14:22,363 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2008-09-05 19:14:22,364 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor
2008-09-05 19:14:22,392 FATAL org.apache.hadoop.hbase.regionserver.Flusher: Replay of hlog required. Forcing server restart
org.apache.hadoop.hbase.DroppedSnapshotException: region: .META.,,1
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1081)
        at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:979)
        at org.apache.hadoop.hbase.regionserver.Flusher.flushRegion(Flusher.java:174)
        at org.apache.hadoop.hbase.regionserver.Flusher.run(Flusher.java:91)
Caused by: java.io.IOException: key out of order: items,1220665885921/historian:assignment/1220665888554 after items,,1220665885921/historian:split/1220665886459
        at org.apache.hadoop.io.MapFile$Writer.checkKey(MapFile.java:205)
        at org.apache.hadoop.io.MapFile$Writer.append(MapFile.java:191)
        at org.apache.hadoop.hbase.regionserver.HStoreFile$BloomFilterMapFile$Writer.append(HStoreFile.java:846)
        at org.apache.hadoop.hbase.regionserver.HStore.internalFlushCache(HStore.java:625)
        at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:578)
        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1068)
        ... 3 more
2008-09-05 19:14:22,393 INFO org.apache.hadoop.hbase.regionserver.Flusher: regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher exiting
2008-09-05 19:14:24,583 DEBUG org.apache.hadoop.hbase.RegionHistorian: Offlined
2008-09-05 19:14:24,583 INFO org.apache.hadoop.ipc.Server: Stopping server on 60020
2008-09-05 19:14:24,583 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 60020: exiting
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 60020: exiting
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 60020
2008-09-05 19:14:24,584 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2008-09-05 19:14:24,584 INFO org.mortbay.util.ThreadedServer: Stopping Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 60020: exiting
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 60020: exiting
2008-09-05 19:14:24,584 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 60020: exiting
2008-09-05 19:14:24,585 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2008-09-05 19:14:24,587 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 60020: exiting
2008-09-05 19:14:24,587 INFO org.mortbay.http.SocketListener: Stopped SocketListener on 0.0.0.0:60030
2008-09-05 19:14:25,254 INFO org.mortbay.util.Container: Stopped HttpContext[/static,/static]
2008-09-05 19:14:25,762 INFO org.mortbay.util.Container: Stopped HttpContext[/logs,/logs]
2008-09-05 19:14:25,763 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.servlet.WebApplicationHandler@6b754699
2008-09-05 19:14:26,418 INFO org.mortbay.util.Container: Stopped WebApplicationContext[/,/]
2008-09-05 19:14:26,418 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.Server@177ba38f
2008-09-05 19:14:26,419 DEBUG org.apache.hadoop.hbase.regionserver.HLog: closing log writer in hdfs://web1:9000/hbase/log_72.34.249.110_1220665441919_60020
2008-09-05 19:14:26,419 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
2008-09-05 19:14:26,419 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: regionserver/0:0:0:0:0:0:0:0:60020.compactor exiting
2008-09-05 19:14:26,432 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, closed hlog
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: closing region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Compactions and cache flushes disabled for region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Scanners disabled for region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more active scanners for region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more row locks outstanding on region .META.,,1
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HStore: closed 1028785192/historian
2008-09-05 19:14:26,433 DEBUG org.apache.hadoop.hbase.regionserver.HStore: closed 1028785192/info
2008-09-05 19:14:26,433 INFO org.apache.hadoop.hbase.regionserver.HRegion: closed .META.,,1
...
2008-09-05 19:14:26,435 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at: 72.34.249.110:60020
2008-09-05 19:14:30,719 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
2008-09-05 19:14:30,719 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver/0:0:0:0:0:0:0:0:60020 exiting
2008-09-05 19:14:32,341 INFO org.apache.hadoop.hbase.Leases: regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker closing leases
2008-09-05 19:14:32,341 INFO org.apache.hadoop.hbase.Leases: regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker closed leases
2008-09-05 19:14:32,342 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown thread.
2008-09-05 19:14:32,342 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
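
The enforcement point in that stack trace is MapFile.Writer.append(): it rejects any key that its comparator orders before the previously appended key. If the memcache hands the flush its entries sorted by a different comparator than the one the writer uses (the disagreement J-D points at in the next comment), the flush trips this check and the snapshot is dropped. A tiny standalone demo of the same IOException with plain Text keys (illustration only, not HBase code):

{code}
// Standalone demo (not HBase code): MapFile.Writer.append() rejects any key that
// its comparator orders before the previously appended key, which is the
// "key out of order" IOException seen in the flush stack trace above.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class KeyOrderDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.getLocal(conf);
    MapFile.Writer writer =
        new MapFile.Writer(conf, fs, "/tmp/keyorder-demo", Text.class, Text.class);
    writer.append(new Text("row-2"), new Text("value"));
    // "row-1" sorts before "row-2", so this throws
    // java.io.IOException: key out of order: row-1 after row-2
    writer.append(new Text("row-1"), new Text("value"));
    writer.close();
  }
}
{code}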






[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-868:
-------------------------------------

    Attachment: hbase-868-branch-0.2.patch

I left those three Bytes.compareTo calls in Memcache thinking they were not going to cause problems. Well, maybe they are. Let's see with Jonathan.
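
The disagreement being pointed at here can be reproduced without HBase: a raw lexicographic byte comparison and a comparison that understands the "<table>,<startkey>,<timestamp>" structure of .META. row keys can order the same two rows differently once the start key is binary (any first start-key byte that sorts below ',' flips the result). A rough sketch of that disagreement (illustration only, not the actual HStoreKey comparator):

{code}
// Illustration only (not HBase code): why raw byte order and a comparator that
// understands .META. region names can disagree once start keys are binary.
public class MetaRowOrder {
  // Raw lexicographic comparison, roughly what Bytes.compareTo does.
  static int rawCompare(byte[] a, byte[] b) {
    for (int i = 0; i < Math.min(a.length, b.length); i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) return d;
    }
    return a.length - b.length;
  }

  // Component-wise comparison of "<table>,<startkey>,<timestamp>" region names.
  static int regionNameCompare(String a, String b) {
    String[] pa = split(a), pb = split(b);
    for (int i = 0; i < 3; i++) {
      int d = pa[i].compareTo(pb[i]);
      if (d != 0) return d;
    }
    return 0;
  }

  // Split on the first and last comma so the start key may itself be binary.
  static String[] split(String name) {
    int first = name.indexOf(','), last = name.lastIndexOf(',');
    return new String[] { name.substring(0, first),
                          name.substring(first + 1, last),
                          name.substring(last + 1) };
  }

  public static void main(String[] args) {
    String withBinaryStart = "items,\u0001,1220665885921"; // start key = 0x01
    String emptyStart      = "items,,1220665885921";       // start key = ""
    System.out.println("raw order:       "
        + rawCompare(withBinaryStart.getBytes(), emptyStart.getBytes()));
    System.out.println("component order: "
        + regionNameCompare(withBinaryStart, emptyStart));
    // The two comparisons return opposite signs, so a memcache sorted one way and
    // a MapFile writer checking the other way will see keys "out of order".
  }
}
{code}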




[jira] Resolved: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-868.
-------------------------

    Resolution: Fixed

Committed branch and trunk.

Thanks Jon Gray and J-D for the marathon weekend debugging session that produced this patch.

Passes all unit tests.

I also amended PE (PerformanceEvaluation) with binary keys.  Below is a meta scan of a table created with binary keys.  Notice that there is more than one region.

{code}
2008-09-07 16:58:31,982 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region TestTabl,1220831867463. Current region memcache size 64.0m
2008-09-07 16:58:32,600 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {regionname: .META.,,1, startKey: <>, server: 127.0.0.1:61450}
2008-09-07 16:58:32,604 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'TestTable,,1220831633849', STARTKEY => '', ENDKEY => '          ', ENCODED => 377629314, TABLE => {{NAME => 'TestTable', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY => 'false', VERSIONS => '3', BLOCKCACHE => 'false', LENGTH => '2147483647', TTL => '-1', COMPRESSION => 'NONE'}]}}}, SERVER => '127.0.0.1:61450', STARTCODE => 1220831422560
2008-09-07 16:58:32,605 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'TestTable,              ,1220831723306', STARTKEY => '          ', ENDKEY => '  ', ENCODED => 1215868091, TABLE => {{NAME => 'TestTable', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY => 'false', VERSIONS => '3', BLOCKCACHE => 'false', LENGTH => '2147483647', TTL => '-1', COMPRESSION => 'NONE'}]}}}, SERVER => '127.0.0.1:61450', STARTCODE => 1220831422560
2008-09-07 16:58:32,606 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'TestTable,      ,1220831779573', STARTKEY => '  ', ENDKEY => ', ENCODED => 2091937218, TABLE => {{NAME => 'TestTable', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY => 'false', VERSIONS => '3', BLOCKCACHE => 'false', LENGTH => '2147483647', TTL => '-1', COMPRESSION => 'NONE'}]}}}, SERVER => '127.0.0.1:61450', STARTCODE => 1220831422560
2008-09-07 16:58:32,607 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'TestTable,1220831779573', STARTKEY => ', ENDKEY => '', ENCODED => 1789154425, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 'TestTable', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY => 'false', VERSIONS => '3', BLOCKCACHE => 'false', LENGTH => '2147483647', TTL => '-1', COMPRESSION => 'NONE'}]}}}, SERVER => '127.0.0.1:61450', STARTCODE => 1220831422560
2008-09-07 16:58:32,607 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'TestTable,1220831867463', STARTKEY => ', ENDKEY =>', ENCODED => 1298791821, TABLE => {{NAME => 'TestTable', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY => 'false', VERSIONS => '3', BLOCKCACHE => 'false', LENGTH => '2147483647', TTL => '-1', COMPRESSION => 'NONE'}]}}}, SERVER => '127.0.0.1:61450', STARTCODE => 1220831422560
2008-09-07 16:58:32,608 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner REGION => {NAME => 'TestTabl,1220831867463', STARTKEY =>', ENDKEY => '', ENCODED => 639105844, TABLE => {{NAME => 'TestTable', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'info', BLOOMFILTER => 'false', IN_MEMORY => 'false', VERSIONS => '3', BLOCKCACHE => 'false', LENGTH => '2147483647', TTL => '-1', COMPRESSION => 'NONE'}]}}}, SERVER => '127.0.0.1:61450', STARTCODE => 1220831422560
{code}
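
One note on the rendering above: incrementing binary row keys are typically just the bytes of a counter, and such bytes are mostly non-printable, which would explain why the start and end keys print as blanks and broken quoting in this scan. A small sketch of that idea (an assumption about the approach, not the actual PerformanceEvaluation change):

{code}
// Sketch only -- an assumed way to build incrementing binary row keys (big-endian
// counter bytes), not necessarily what the amended PerformanceEvaluation does.
import java.util.Arrays;

public class BinaryKeys {
  static byte[] binaryKey(int i) {
    return new byte[] {
        (byte) (i >>> 24), (byte) (i >>> 16), (byte) (i >>> 8), (byte) i };
  }

  public static void main(String[] args) {
    byte[] row = binaryKey(3);
    // All four bytes are non-printable control characters, so logs and the web UI
    // render such keys as blanks or mangled quoting, much like the scan above.
    System.out.println("as string: '" + new String(row) + "'");
    System.out.println("as bytes:  " + Arrays.toString(row));
  }
}
{code}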




[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-868:
--------------------------------

    Attachment: hbase-868-0.2-v5-fix.patch




[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-868:
--------------------------------

    Attachment: regionserver_regionlist.png
                master_regionsinsources.png

Shows odd behavior in web ui




[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629020#action_12629020 ] 

stack commented on HBASE-868:
-----------------------------

Reviewed the patch.

I added javadoc here:

{code}
+  // WritableComparator that takes account of meta keys.
+  public static class HStoreKeyWritableComparator extends WritableComparator {
+    private final HRegionInfo hri;
{code}

... and removed the creation of new objects in the hunks below:

{code}
Index: src/java/org/apache/hadoop/hbase/master/MetaRegion.java
===================================================================
--- src/java/org/apache/hadoop/hbase/master/MetaRegion.java (revision 692702)
+++ src/java/org/apache/hadoop/hbase/master/MetaRegion.java (working copy)
@@ -20,7 +20,10 @@
 package org.apache.hadoop.hbase.master;

 import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.HRegionInfo;
+import org.apache.hadoop.hbase.HTableDescriptor;
 import org.apache.hadoop.hbase.HServerAddress;
+import org.apache.hadoop.hbase.HStoreKey;
 import org.apache.hadoop.hbase.util.Bytes;


@@ -90,7 +93,9 @@
   public int compareTo(MetaRegion other) {
     int result = Bytes.compareTo(this.regionName, other.getRegionName());
     if(result == 0) {
-      result = Bytes.compareTo(this.startKey, other.getStartKey());
+      HRegionInfo hri = new HRegionInfo(HTableDescriptor.META_TABLEDESC, 
+          HConstants.EMPTY_END_ROW, HConstants.EMPTY_END_ROW);
+      result = HStoreKey.compareTwoRowKeys(hri, this.startKey, other.getStartKey());
       if (result == 0) {
         // Might be on different host?
         result = this.server.compareTo(other.server);
Index: src/java/org/apache/hadoop/hbase/client/MetaScanner.java
===================================================================
--- src/java/org/apache/hadoop/hbase/client/MetaScanner.java    (revision 692702)
+++ src/java/org/apache/hadoop/hbase/client/MetaScanner.java    (working copy)
@@ -5,6 +5,7 @@
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.HConstants;
 import org.apache.hadoop.hbase.HRegionInfo;
+import org.apache.hadoop.hbase.HStoreKey;
 import org.apache.hadoop.hbase.io.RowResult;
 import org.apache.hadoop.hbase.util.Bytes;

@@ -47,6 +48,7 @@
           HRegionInfo.createRegionName(tableName, null, ZEROES);

     // Scan over each meta region
+    HRegionInfo hri;
     do {
       ScannerCallable callable = new ScannerCallable(connection,
         META_TABLE_NAME, COLUMN_FAMILY_ARRAY, startRow, LATEST_TIMESTAMP, null);
@@ -62,12 +64,13 @@
         } while(visitor.processRow(r));
         // Advance the startRow to the end key of the current region
         startRow = callable.getHRegionInfo().getEndKey();
+        hri = new HRegionInfo(callable.getHRegionInfo());
       } finally {
         // Close scanner
         callable.setClose();
         connection.getRegionServerWithRetries(callable);
       }
-    } while (Bytes.compareTo(startRow, LAST_ROW) != 0);
+    } while (HStoreKey.compareTwoRowKeys(hri, startRow, LAST_ROW) != 0);
{code}
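
Presumably "removed the creation of new objects" means hoisting the per-comparison HRegionInfo into a constant. A sketch of that shape, reusing the constants from the patch (a guess at the cleanup, not necessarily what was committed):

{code}
// Sketch only -- one way to drop the per-call allocation in MetaRegion.compareTo(),
// reusing the constants the patch references; the committed cleanup may differ.
private static final HRegionInfo META_HRI = new HRegionInfo(
    HTableDescriptor.META_TABLEDESC, HConstants.EMPTY_END_ROW, HConstants.EMPTY_END_ROW);

public int compareTo(MetaRegion other) {
  int result = Bytes.compareTo(this.regionName, other.getRegionName());
  if (result == 0) {
    result = HStoreKey.compareTwoRowKeys(META_HRI, this.startKey, other.getStartKey());
    if (result == 0) {
      // Might be on different host?
      result = this.server.compareTo(other.server);
    }
  }
  return result;
}
{code}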

Otherwise, the patch looks fine.

This kind of change looks unnecessary in SFS, but I'm afraid to change it since the patch works:
{code}
-      keys[i] = new HStoreKey();
+      keys[i] = new HStoreKey(HConstants.EMPTY_BYTE_ARRAY, this.store.getHRegionInfo());
{code}

I'm running unit tests now.  I'll then run the amended PE test to ensure binary keys work.




[jira] Updated: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-868:
--------------------------------

    Attachment: hbase-hbase-regionserver-web3.streamy.com.log

This is the log from the regionserver hosting .META. and the first region of the sources table; it is the one throwing the "Requested row out of range" exceptions.

> Incrementing binary rows cause strange behavior once table splits
> -----------------------------------------------------------------
>
>                 Key: HBASE-868
>                 URL: https://issues.apache.org/jira/browse/HBASE-868
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.2.1
>         Environment: 4 nodes: 1 master/namenode, 3 regionserver/datanodes
>            Reporter: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.2.1
>
>         Attachments: hbase-hbase-regionserver-web3.streamy.com.log
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628542#action_12628542 ] 

Jonathan Gray commented on HBASE-868:
-------------------------------------

One note: I accidentally brought up a clean HDFS under HBase 0.2.1 RC1, including all the patches I made with JD today.  The first insert into a table immediately caused:

Exception in thread "main" java.io.IOException: java.io.IOException: java.io.IOException: java.lang.NegativeArraySizeException
        at org.apache.hadoop.hbase.HStoreKey.stripStartKeyMeta(HStoreKey.java:532)
        at org.apache.hadoop.hbase.HStoreKey.compareTwoRowKeys(HStoreKey.java:494)


This does not happen on 0.2.1 RC1 without our patches.  I'm able to do plenty of inserting of whatever until I reach a split.  Once it finishes splitting, the next request immediately starts receiving:

 org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range
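For what it's worth, a rough sketch of the failure shape I'd guess from that stack trace (hypothetical code, not the actual HStoreKey.stripStartKeyMeta implementation; the suffix length is an assumption): computing the stripped length from a meta row key without guarding against short or empty keys gives a negative array size.

public class StripSketch {
  // Hypothetical sketch of the failure mode, NOT the real HStoreKey code: if a
  // fixed-size meta suffix is stripped off a row key without checking the key
  // length, the empty start key ('') produces a negative array size.
  static byte[] stripMetaSuffix(byte[] metaRowKey, int suffixLength) {
    int strippedLength = metaRowKey.length - suffixLength; // negative for short/empty keys
    byte[] result = new byte[strippedLength];              // throws NegativeArraySizeException
    System.arraycopy(metaRowKey, 0, result, 0, strippedLength);
    return result;
  }

  public static void main(String[] args) {
    stripMetaSuffix(new byte[0], 2);  // empty start key: new byte[-2] blows up here
  }
}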

> Incrementing binary rows cause strange behavior once table splits
> -----------------------------------------------------------------
>
>                 Key: HBASE-868
>                 URL: https://issues.apache.org/jira/browse/HBASE-868
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.2.1
>         Environment: 4 nodes: 1 master/namenode, 3 regionserver/datanodes
>            Reporter: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.2.1
>
>         Attachments: hbase-868-branch-0.2.patch, hbase-hbase-regionserver-web3.streamy.com.log, master_regionsinsources.png, regionserver_regionlist.png
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628931#action_12628931 ] 

Jim Kellerman commented on HBASE-868:
-------------------------------------

Jonathan,

WRT the DroppedSnapshotException, can you confirm that you have hadoop-0.17.2.1 in the libs directory and that you are using 0.17.2.1 for Hadoop?

I noted that some native libraries were loaded in your log above. If you just have hadoop-0.17.2, be advised that the native libs are incorrect.

If you are using 0.17.2.1, then this is just a red herring.
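If it helps rule that out, a quick runtime check could print which Hadoop build actually ended up on the regionserver classpath and whether native code loaded. This assumes org.apache.hadoop.util.VersionInfo and NativeCodeLoader are available in the deployed Hadoop release, which I believe they are:

import org.apache.hadoop.util.NativeCodeLoader;
import org.apache.hadoop.util.VersionInfo;

// Sanity check, assuming these Hadoop utility classes exist in the deployed
// release: prints the Hadoop version on the classpath and whether the native
// libraries were actually loaded.
public class HadoopVersionCheck {
  public static void main(String[] args) {
    System.out.println("Hadoop version: " + VersionInfo.getVersion());
    System.out.println("Built from revision: " + VersionInfo.getRevision());
    System.out.println("Native code loaded: " + NativeCodeLoader.isNativeCodeLoaded());
  }
}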

> Incrementing binary rows cause strange behavior once table splits
> -----------------------------------------------------------------
>
>                 Key: HBASE-868
>                 URL: https://issues.apache.org/jira/browse/HBASE-868
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.2.1
>         Environment: 4 nodes: 1 master/namenode, 3 regionserver/datanodes
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.2.1
>
>         Attachments: hbase-868-0.2-v2.patch, hbase-868-0.2-v3.patch, hbase-868-0.2-v4.patch, hbase-868-branch-0.2.patch, hbase-hbase-regionserver-web3.streamy.com.log, master_regionsinsources.png, regionserver_regionlist.png
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-868) Incrementing binary rows cause strange behavior once table splits

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629045#action_12629045 ] 

Jonathan Gray commented on HBASE-868:
-------------------------------------

This does fix it.  But eventually it broke again with the same error.

There are quite a few other places in the code where a synchronizedSortedMap/TreeMap is instantiated that may also need to be passed the HSK comparator.
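A hedged sketch of what that means in practice (hypothetical key type and comparator, not the actual HStoreKey API): every sorted map keyed on row keys has to be constructed with the same comparator, or the maps silently disagree about ordering.

import java.util.Collections;
import java.util.Comparator;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch, not the actual HStoreKey API: a TreeMap built with the
// default ordering sorts binary keys differently from one built with the
// store-key comparator, so every instantiation needs the comparator passed in.
public class KeyedMapSketch {

  // Stand-in for the HSK comparator mentioned above; assumes unsigned byte order.
  static final Comparator<byte[]> ROW_COMPARATOR = (a, b) -> {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int cmp = (a[i] & 0xff) - (b[i] & 0xff);
      if (cmp != 0) {
        return cmp;
      }
    }
    return a.length - b.length;
  };

  // Problematic pattern: natural ordering (byte[] is not even Comparable, and a
  // key class whose compareTo ignores binary order has the same issue).
  // SortedMap<byte[], Long> bad = Collections.synchronizedSortedMap(new TreeMap<byte[], Long>());

  // Pattern that matches the fix: pass the comparator at every instantiation.
  static final SortedMap<byte[], Long> cells =
      Collections.synchronizedSortedMap(new TreeMap<byte[], Long>(ROW_COMPARATOR));
}

Grepping for sorted-map constructors that take no comparator argument would be one way to find the remaining spots.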

> Incrementing binary rows cause strange behavior once table splits
> -----------------------------------------------------------------
>
>                 Key: HBASE-868
>                 URL: https://issues.apache.org/jira/browse/HBASE-868
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.2.1, 0.18.0
>         Environment: 4 nodes: 1 master/namenode, 3 regionserver/datanodes
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.2.1, 0.18.0
>
>         Attachments: hbase-868-0.2-v2.patch, hbase-868-0.2-v3.patch, hbase-868-0.2-v4.patch, hbase-868-0.2-v5-fix.patch, hbase-868-0.2-v5.patch, hbase-868-branch-0.2.patch, hbase-hbase-regionserver-web3.streamy.com.log, master_regionsinsources.png, regionserver_regionlist.png
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.