Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/02/25 07:27:01 UTC

[jira] Created: (HBASE-1219) Scanners can miss values riding the flush transition

Scanners can miss values riding the flush transition
----------------------------------------------------

                 Key: HBASE-1219
                 URL: https://issues.apache.org/jira/browse/HBASE-1219
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack
            Priority: Blocker
             Fix For: 0.19.1, 0.20.0


A scanner is made up of a memcache scanner and a store files scanner.  When a flush happens, the memcache content is written out as a store file, and that file is added to the list the store files scanner already reads from.  Currently the two scanners run autonomously.  Ben Maurer points out that if we were returning values out of the memcache because they sorted lower than the store file content, the position we had reached in the memcache doesn't carry across when we pick up the memcache values in the flushed store file; we just carry on from whatever is lowest among the store files that were in place before the flush, so flushed values below that point are never returned.

It's a hard one to spot, but it should be easy to write a test for it.
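
A toy reproduction of the miss described above, as a rough sketch only. Nothing here is HBase code: the class, the method names, and the string keys are all hypothetical, and readers are modelled as sorted sets of the keys they have not yet passed. The one assumption that matters is that, on flush, the reader over the new store file is positioned at the lowest key the pre-flush store readers were sitting on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Toy model of the flush race. Hypothetical names throughout; not HBase code. */
class FlushRaceSketch {
  /** Lowest key any reader is currently positioned on, or null if all are exhausted. */
  static String lowest(List<NavigableSet<String>> readers) {
    String min = null;
    for (NavigableSet<String> r : readers) {
      if (!r.isEmpty() && (min == null || r.first().compareTo(min) < 0)) {
        min = r.first();
      }
    }
    return min;
  }

  /** Scans everything, flushing the memcache once flushAfter keys have been returned. */
  static List<String> scanWithFlush(int flushAfter) {
    NavigableSet<String> memcacheContent = new TreeSet<>(List.of("a", "b", "c", "d"));
    // Each reader is modelled as the set of keys it has not yet passed.
    NavigableSet<String> memcacheReader = new TreeSet<>(memcacheContent);
    List<NavigableSet<String>> storeReaders = new ArrayList<>();
    storeReaders.add(new TreeSet<>(List.of("m", "n")));
    List<String> returned = new ArrayList<>();
    boolean flushed = false;
    while (true) {
      if (!flushed && returned.size() == flushAfter) {
        // Flush: memcache content becomes a new store file and the memcache empties.
        // Assumption: the new reader is positioned where the pre-flush store
        // readers were, which is the behaviour the report describes as the bug.
        String storePosition = lowest(storeReaders);
        NavigableSet<String> newReader = new TreeSet<>();
        if (storePosition != null) {
          newReader.addAll(memcacheContent.tailSet(storePosition, true));
        }
        storeReaders.add(newReader);
        memcacheReader.clear();
        flushed = true;
      }
      String mem = memcacheReader.isEmpty() ? null : memcacheReader.first();
      String store = lowest(storeReaders);
      if (mem == null && store == null) {
        return returned;
      }
      // Hand out whichever side holds the lower key, then step past it everywhere.
      String next = (store == null || (mem != null && mem.compareTo(store) <= 0)) ? mem : store;
      memcacheReader.remove(next);
      for (NavigableSet<String> r : storeReaders) {
        r.remove(next);
      }
      returned.add(next);
    }
  }

  public static void main(String[] args) {
    System.out.println("no flush mid-scan: " + scanWithFlush(-1));
    System.out.println("flush after 2:     " + scanWithFlush(2));
  }
}
```

With no flush the scan returns a, b, c, d, m, n. With a flush after two keys it returns a, b, m, n: c and d were in the memcache, got flushed, and are never seen.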

Ben Maurer also points out that in StoreFileScanner, we should not register the observer until after the scanners have been set up:

{code}
  public StoreFileScanner(final Store store, final long timestamp,
    final byte [][] targetCols, final byte [] firstRow)
  throws IOException {
    super(timestamp, targetCols);
    this.store = store;
    this.store.addChangedReaderObserver(this);
    try {
      openScanner(firstRow);
...
{code}
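
To illustrate why the ordering matters, here is a single-threaded sketch. Only the method name addChangedReaderObserver comes from the snippet above; the Store, Scanner, and ChangedReadersObserver types, the updateReaders callback, and the registerFirst flag are hypothetical stand-ins, and the flush() call inside the constructor stands in for a flush landing from another thread while the scanner is still being built.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of observer registration order. Hypothetical names; not HBase code. */
class ObserverOrderSketch {
  interface ChangedReadersObserver {
    void updateReaders();
  }

  static class Store {
    final List<String> files = new ArrayList<>(List.of("file-1"));
    private final List<ChangedReadersObserver> observers = new ArrayList<>();

    void addChangedReaderObserver(ChangedReadersObserver o) {
      observers.add(o);
    }

    /** Adds a store file and tells every registered observer about it. */
    void flush() {
      files.add("file-" + (files.size() + 1));
      for (ChangedReadersObserver o : observers) {
        o.updateReaders();
      }
    }
  }

  static class Scanner implements ChangedReadersObserver {
    private final Store store;
    List<String> readers;        // null until openScanner() has run
    boolean notifiedBeforeOpen;  // true if a callback hit a half-built scanner

    Scanner(Store store, boolean registerFirst) {
      this.store = store;
      if (registerFirst) {
        store.addChangedReaderObserver(this);
      }
      store.flush(); // stands in for a flush arriving mid-construction
      openScanner();
      if (!registerFirst) {
        store.addChangedReaderObserver(this);
      }
    }

    private void openScanner() {
      readers = new ArrayList<>(store.files);
    }

    @Override
    public void updateReaders() {
      if (readers == null) {
        // The callback arrived before the scanner had any readers to update.
        notifiedBeforeOpen = true;
        return;
      }
      openScanner();
    }
  }
}
```

Constructing a Scanner with registerFirst set to true leaves notifiedBeforeOpen true; with false, the callback is never delivered to a half-built object and the scanner opens over both files. Registering late has its own window (a flush between openScanner and registration would go unseen), which this sketch does not try to close.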

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1219) Scanners can miss values riding the flush transition

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1219:
-------------------------

    Attachment: 1219.patch

Fix for HBASE-1219 and HBASE-1220.

For 1220, we pass in the sequence id so we know which of the readers is the new one and open only that one, rather than all of them as we used to.

For 1219, we save the key we give out on each call to next on HStoreScanner.  This is the key we use when recalibrating the running readers in a scanner (rather than asking the memcache, which, as it turns out, can be empty during certain transitions).
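
The idea of repositioning from the last key handed out can be sketched roughly as below. This is not the patch: LastKeySketch, updateReaders, and the sorted-set readers are hypothetical, and the only point it makes is that when the scanner's position is the saved key, a changed set of readers can be picked up without losing anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Toy scanner that repositions from the last key it returned. Hypothetical names; not HBase code. */
class LastKeySketch {
  private List<NavigableSet<String>> readers = new ArrayList<>();
  private String lastKeyReturned; // null until the first call to next()

  LastKeySketch(List<NavigableSet<String>> initialReaders) {
    readers.addAll(initialReaders);
  }

  /** Called when a flush changes the set of readers; no per-reader cursor has to survive. */
  void updateReaders(List<NavigableSet<String>> newReaders) {
    readers = new ArrayList<>(newReaders);
  }

  /** Returns the smallest key strictly after the last one returned, or null when done. */
  String next() {
    String best = null;
    for (NavigableSet<String> r : readers) {
      String candidate;
      if (lastKeyReturned == null) {
        candidate = r.isEmpty() ? null : r.first();
      } else {
        candidate = r.higher(lastKeyReturned);
      }
      if (candidate != null && (best == null || candidate.compareTo(best) < 0)) {
        best = candidate;
      }
    }
    if (best != null) {
      lastKeyReturned = best;
    }
    return best;
  }

  public static void main(String[] args) {
    NavigableSet<String> memcache = new TreeSet<>(List.of("a", "b", "c", "d"));
    NavigableSet<String> storeFile = new TreeSet<>(List.of("m", "n"));
    LastKeySketch scanner = new LastKeySketch(List.of(memcache, storeFile));
    System.out.println(scanner.next());
    System.out.println(scanner.next());
    // Flush: the memcache content is now a store file and the memcache itself is empty.
    NavigableSet<String> flushed = new TreeSet<>(memcache);
    scanner.updateReaders(List.of(new TreeSet<String>(), storeFile, flushed));
    for (String k = scanner.next(); k != null; k = scanner.next()) {
      System.out.println(k);
    }
  }
}
```

Run as written, this prints a, b, c, d, m, n, one per line: the flush lands after b, and c and d are picked up from the flushed file because the scan resumes from the saved key, not from the memcache or from the older store file's position.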

[jira] Resolved: (HBASE-1219) Scanners can miss values riding the flush transition

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-1219.
--------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.20.0)

Applied to branch (doesn't make sense on trunk).  Applied the fixes for 1219 and 1220 at the same time.

[jira] Commented: (HBASE-1219) Scanners can miss values riding the flush transition

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680659#action_12680659 ] 

Jim Kellerman commented on HBASE-1219:
--------------------------------------

In TestScanner, the code:
{code}
    } catch (Exception e) {
      LOG.error("Failed test because of ...", e);
{code}

doesn't seem to let the exception propagate and does not call fail(). Should it?
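
For illustration, one way a catch block like that can log and still fail the test. This is a plain-Java sketch, not the TestScanner code: runOrFail is a hypothetical helper, java.util.logging stands in for whatever LOG is in the test, and throwing AssertionError stands in for calling fail().

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch of log-then-fail. Hypothetical helper; not the TestScanner code. */
class FailOnExceptionSketch {
  private static final Logger LOG = Logger.getLogger("FailOnExceptionSketch");

  /** Runs a test body; logs any exception and rethrows so the test cannot pass silently. */
  static void runOrFail(Callable<Void> body) {
    try {
      body.call();
    } catch (Exception e) {
      LOG.log(Level.SEVERE, "Failed test because of ...", e);
      // Stand-in for fail(): surface the problem to the test runner.
      throw new AssertionError("unexpected exception: " + e, e);
    }
  }
}
```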


Spelling:

HStoreScanner:235:
{code}
              LOG.info("RMOEVE CLOSING NEXT " + i);
{code}

HStoreScanner:250:
{code}
            LOG.info("RMOEVE CLOSING ADVANCING " + i);
{code}

HStoreScanner:280:
{code}
          LOG.info("RMOEVE CLOSING " + i);
{code}

Otherwise, +1. (It's a little hard to grasp all of what's going on, but scanners take time to get your head around anyway, so I wouldn't hold that against this patch.)

[jira] Commented: (HBASE-1219) Scanners can miss values riding the flush transition

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680632#action_12680632 ] 

stack commented on HBASE-1219:
------------------------------

All tests pass.  Would like a review before committing.