Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/09/01 06:45:32 UTC

[jira] Created: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update
---------------------------------------------------------------------------------------------------

                 Key: HBASE-1806
                 URL: https://issues.apache.org/jira/browse/HBASE-1806
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack


What I'm seeing is that BaseScanner misses updates made milliseconds before, even hundreds of milliseconds before.  See HBASE-1784, where I'm seeing double-assignment of regions.

Scanners do not respect row locks.  They should; otherwise a scan could return a row with only part of an update committed.  What if a .META. region has tens of storefiles and a scan does a full-row get that takes a long time?  Say an update comes in during this read.  First, it will go in because no row lock is outstanding.  Second, we'll miss the edit given we look at things in order -- memstore, then each storefile down to the oldest.  What if the update is followed by an update of server state, e.g. the region is moved out of the in-transition state?  And what if, inside the same server, say the master, decisions are made based on what it sees when it does a scanner#next, e.g. BaseScanner checking for assignment?
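
The ordering problem described above can be sketched in a few lines of plain Java.  This is only an illustration under stated assumptions, not HBase code: the class SkewedScanSketch, the method scanRow, and the column names info:server and info:state are all hypothetical, and the concurrent writer is simulated deterministically by running it at the point where the reader has finished the memstore but not yet the storefiles.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SkewedScanSketch {

    // Hypothetical reader: visits the memstore first, then each storefile,
    // keeping the first (newest) value it sees for every column.
    public static Map<String, String> scanRow(Map<String, String> memstore,
                                              List<Map<String, String>> storefiles,
                                              Runnable concurrentUpdate) {
        Map<String, String> row = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : memstore.entrySet()) {
            row.putIfAbsent(e.getKey(), e.getValue());
        }
        // The update lands here: the reader holds no row lock, so nothing
        // blocks the writer, and the reader never revisits the memstore.
        concurrentUpdate.run();
        for (Map<String, String> storefile : storefiles) {
            for (Map.Entry<String, String> e : storefile.entrySet()) {
                row.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, String> memstore = new LinkedHashMap<>();
        Map<String, String> oldStorefile = new LinkedHashMap<>();
        oldStorefile.put("info:server", "rs1");
        oldStorefile.put("info:state", "in-transition");
        List<Map<String, String>> storefiles = new ArrayList<>();
        storefiles.add(oldStorefile);

        Map<String, String> seen = scanRow(memstore, storefiles, () -> {
            memstore.put("info:server", "rs2");
            memstore.put("info:state", "open");
        });

        // The scan reports the old assignment although the new one is committed.
        System.out.println("scan saw:  " + seen);
        System.out.println("memstore:  " + memstore);
    }
}
```

In this sketch the scan returns info:server=rs1 and info:state=in-transition even though the memstore already holds rs2 and open by the time the scan finishes, which is the kind of stale view a caller like BaseScanner could then act on.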



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750732#action_12750732 ] 

Andrew Purtell commented on HBASE-1806:
---------------------------------------

Close as WontFix?

[jira] Commented: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766384#action_12766384 ] 

stack commented on HBASE-1806:
------------------------------

Thanks Andrew.  I added a note to javadoc a while back.

[jira] Commented: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749962#action_12749962 ] 

Jonathan Gray commented on HBASE-1806:
--------------------------------------

These kinds of race conditions are another reason why moving stuff into ZK will make life easier.  Do you think we need to do something major for 0.20 or can we wait for 0.21?

[jira] Commented: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749983#action_12749983 ] 

Andrew Purtell commented on HBASE-1806:
---------------------------------------

I've always accepted this fact about scanners. It's debatable if we should take the performance hit to consider transactional semantics of any kind for scanners. Could be enough to just document that TableMappers should test for inconsistent state from the application's perspective and ignore the record as appropriate. Can get another shot at it during the next job. Of course master scanners are a special case. I wonder why this did not occur to me before... Moving META into ZK is the appropriate strategy to use for this and other reasons IMO. 
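
As a rough illustration of the "test for inconsistent state and ignore the record" idea, here is a minimal sketch in plain Java.  It deliberately does not use the HBase or TableMapper API; the class SkipInconsistentRows, the methods isConsistent and consistentOnly, and the column names d:payload and d:checksum are hypothetical, and the invariant (both columns of one logical update must be present) is just an assumed example of what an application might check.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SkipInconsistentRows {

    // Hypothetical application invariant: a row is usable only when both
    // columns written by one logical update are present.
    public static boolean isConsistent(Map<String, String> row) {
        return row.containsKey("d:payload") && row.containsKey("d:checksum");
    }

    // Keep only rows that pass the check; skipped rows can be picked up
    // by a later job once the in-flight update is fully visible.
    public static List<Map<String, String>> consistentOnly(List<Map<String, String>> scanned) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> row : scanned) {
            if (isConsistent(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> whole = new HashMap<>();
        whole.put("d:payload", "abc");
        whole.put("d:checksum", "c1");

        Map<String, String> partial = new HashMap<>();
        partial.put("d:payload", "xyz"); // checksum not yet visible to the scan

        List<Map<String, String>> scanned = new ArrayList<>();
        scanned.add(whole);
        scanned.add(partial);

        System.out.println(consistentOnly(scanned).size()); // prints 1
    }
}
```

In a real job the same check would presumably sit at the top of the map method, returning early for rows that fail it.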

[jira] Commented: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749977#action_12749977 ] 

stack commented on HBASE-1806:
------------------------------

I don't think we can fix this w/ tools available to us in 0.20 hbase other than by doing the ugly stuff we're doing over in hbase-1784.  This is a contrary issue.  ZK will help but will not be enough (IMO).

[jira] Commented: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750734#action_12750734 ] 

stack commented on HBASE-1806:
------------------------------

Ok.  I buy the argument.  I opened https://issues.apache.org/jira/browse/HBASE-1812 for 0.20.1 so we document this agreement, preferably in the Scanner javadoc.

[jira] Resolved: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-1806.
-----------------------------------

    Resolution: Won't Fix

Closed as WontFix.

[jira] Commented: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749994#action_12749994 ] 

Jean-Daniel Cryans commented on HBASE-1806:
-------------------------------------------

I bet this issue is playing a lot against us on small clusters under high load; at least HBASE-1784 can temporarily help fix that.

[jira] Commented: (HBASE-1806) Scanners do not respect row locks; scanner view could return a skewed view on row if ongoing update

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749990#action_12749990 ] 

Jonathan Gray commented on HBASE-1806:
--------------------------------------

Agree with apurtell.  It's an accepted fact about scanners, from the user POV.  But it must be addressed for master/meta, and I think ZK is the right direction.  For now, the ugly stuff in HBASE-1784 is fine with me... this is another one of the issues that only rears its head during high-load, sustained imports.  But dealing with it with a Get in the case it does happen sounds fine for now.
