Posted to issues@hbase.apache.org by "Amitanand Aiyer (JIRA)" <ji...@apache.org> on 2011/09/26 18:19:26 UTC

[jira] [Created] (HBASE-4485) Eliminate window of missing Data

Eliminate window of missing Data
--------------------------------

                 Key: HBASE-4485
                 URL: https://issues.apache.org/jira/browse/HBASE-4485
             Project: HBase
          Issue Type: Sub-task
            Reporter: Amitanand Aiyer
            Assignee: Amitanand Aiyer


After incorporating v11 of the 2856 fix, we discovered that we are still having some ACID violations.

This time, however, the problem is not about including "newer" updates, but about missing older updates
that should be included.

Here is what seems to be happening.


0 - Scanner starts scanning.

0 - MemStore.snapshot is called.

    Scanner has access to kvHeap and snapshot.

1 - Flush takes place.
     1.1 KVs in the snapshot are written to the disk.
     1.2 HFile is ready.

2 - Store.updateStoreFiles() deletes the old snapshot.

     2.1 updateReaders will not be called until the end of the columnFamily seek.

3 - For a brief window of time, the scanner does not have access to certain KeyValues.
   a) The scanner no longer has access to the snapshot because it has been flushed to
disk.
   b) It does not yet have access to the HFile because updateReaders has
not been called yet.
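
The sequence above can be sketched with a toy model. This is an illustration only: FlushWindowDemo, visible() and the string key "row1" are hypothetical stand-ins, not HBase classes, and the flush is simulated in a single thread so that the window shows up deterministically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model only: FlushWindowDemo and its members are hypothetical, not HBase classes.
class FlushWindowDemo {
    /** Returns whether key "row1" is visible to the scanner at three points in time. */
    static boolean[] run() {
        TreeSet<String> snapshot = new TreeSet<>();
        snapshot.add("row1");                         // step 0: the snapshot holds the KV
        List<TreeSet<String>> storeFiles = new ArrayList<>();
        // step 0: the scanner captures the store file list as it is right now (empty)
        List<TreeSet<String>> scannerFiles = new ArrayList<>(storeFiles);

        boolean before = visible("row1", snapshot, scannerFiles);

        // steps 1 and 2: the flush writes the snapshot to a new file, then clears the snapshot
        storeFiles.add(new TreeSet<>(snapshot));
        snapshot.clear();
        // step 3: the scanner still uses its old file list, so the KV is in neither place
        boolean during = visible("row1", snapshot, scannerFiles);

        // the scanner's file list is finally refreshed (stand-in for updateReaders)
        scannerFiles = new ArrayList<>(storeFiles);
        boolean after = visible("row1", snapshot, scannerFiles);

        return new boolean[] { before, during, after };
    }

    /** A key is visible if it is in the live snapshot or in any file the scanner knows about. */
    static boolean visible(String key, TreeSet<String> snapshot, List<TreeSet<String>> files) {
        if (snapshot.contains(key)) {
            return true;
        }
        for (TreeSet<String> file : files) {
            if (file.contains(key)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("before flush: " + r[0] + ", in window: " + r[1] + ", after refresh: " + r[2]);
    }
}
```

In this model the KV is visible before the flush and after the refresh, but not in between, which is the "missing older update" described above.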
        

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127164#comment-13127164 ] 

Jonathan Hsieh commented on HBASE-4485:
---------------------------------------

@Amitanand

I've applied the HBASE-2856 patch from (https://reviews.apache.org/r/2224/diff/#index_header) onto trunk (with a minor tweak) and then applied HBASE-4485, but I get a compile failure.  Specifically, matcher.ignoreNewerKVs() seems to be missing.  Is there another commit that I'm missing?  
                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, repro_bug-4485.diff
>
>
> After incorporating v11 of the 2856 fix, we discovered that we are still having some ACID violations.
> This time, however, the problem is not about including "newer" updates, but about missing older updates
> that should be included.
> Here is what seems to be happening.
> There is a race condition in StoreScanner.getScanners():
>   private List<KeyValueScanner> getScanners(Scan scan,
>       final NavigableSet<byte[]> columns) throws IOException {
>     // First the store file scanners
>     List<StoreFileScanner> sfScanners = StoreFileScanner
>       .getScannersForStoreFiles(store.getStorefiles(), cacheBlocks,
>                                 isGet, false);
>     List<KeyValueScanner> scanners =
>       new ArrayList<KeyValueScanner>(sfScanners.size()+1);
>     // include only those scan files which pass all filters
>     for (StoreFileScanner sfs : sfScanners) {
>       if (sfs.shouldSeek(scan, columns)) {
>         scanners.add(sfs);
>       }
>     }
>     // Then the memstore scanners
>     if (this.store.memstore.shouldSeek(scan)) {
>       scanners.addAll(this.store.memstore.getScanners());
>     }
>     return scanners;
>   }
> If, for example, a call to Store.updateStorefiles() happens between
> store.getStorefiles() and this.store.memstore.getScanners(), then
> a new HFile may have been created that is not seen by the
> StoreScanner, while the data is no longer present in the Memstore.snapshot either.
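
One way to picture closing this race, as a minimal sketch and not the actual patch: capture the store file list and the memstore snapshot under the same lock that the flush takes when it publishes the new file and clears the snapshot. ToyStore, addToSnapshot(), finishFlush() and getScannerSources() are hypothetical names, and a "source" here is just a string label standing in for a scanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a store; this is not the HBase Store class.
class ToyStore {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> storeFiles = new ArrayList<>();
    private final List<String> snapshot = new ArrayList<>();

    void addToSnapshot(String kv) {
        lock.writeLock().lock();
        try {
            snapshot.add(kv);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Publishes the flushed file and clears the snapshot as one step under the write lock. */
    void finishFlush(String fileName) {
        lock.writeLock().lock();
        try {
            storeFiles.add(fileName);
            snapshot.clear();
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Captures the file list and the snapshot contents under the same (read) lock. */
    List<String> getScannerSources() {
        lock.readLock().lock();
        try {
            List<String> sources = new ArrayList<>(storeFiles);
            sources.addAll(snapshot);
            return sources;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.addToSnapshot("kv1");
        System.out.println(store.getScannerSources());
        store.finishFlush("hfile-1");
        System.out.println(store.getScannerSources());
    }
}
```

Because the reader and the flush use the same lock, a reader sees either the snapshot (before the flush finishes) or the new file (after it), never neither. The view can still be slightly stale by the time it is used.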

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114763#comment-13114763 ] 

Amitanand Aiyer commented on HBASE-4485:
----------------------------------------

One way to fix this issue would be to swap the order in which we clear the snapshot in Store.updateStoreFiles.

We might want to notifyReaders() before doing the clearSnapshot(). While this seems like it might fix
the issue, here are some concerns it may raise.

  (a) Say Scanner 1 got updated, but Scanner 2 is yet to be updated (waiting for the lock). Scanner 1 may
now see duplicate data: some KVs exist both in the new file and in the snapshot.
     Not sure if our current KVHeap mechanism is able to handle this, especially in terms of the number of
updates we have.
  (b) Performance concerns about holding on to the snapshot until all the scanners/readers finish scanning
the columnFamily to allow StoreScanner.updateReaders() to get the lock.
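
To make concern (a) concrete, here is a small sketch under stated assumptions. It does not show how KVHeap behaves; OverlapDemo, naiveMerge() and dedupMerge() are hypothetical, and KVs are modelled as plain "row/timestamp" strings. It only shows that, while the new file and the snapshot overlap, a merge that does not collapse identical entries returns each KV twice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration of concern (a); this is not how KeyValueHeap is implemented.
class OverlapDemo {
    /** Naive merge: concatenates and sorts, so a KV present in both sources appears twice. */
    static List<String> naiveMerge(List<String> snapshot, List<String> newFile) {
        List<String> out = new ArrayList<>(snapshot);
        out.addAll(newFile);
        Collections.sort(out);
        return out;
    }

    /** Deduplicating merge: identical KVs collapse into a single entry. */
    static List<String> dedupMerge(List<String> snapshot, List<String> newFile) {
        TreeSet<String> merged = new TreeSet<>(snapshot);
        merged.addAll(newFile);
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        List<String> snapshot = new ArrayList<>();
        snapshot.add("row1/ts5");
        snapshot.add("row2/ts7");
        // During the overlap the flushed file holds the same KVs as the snapshot.
        List<String> newFile = new ArrayList<>(snapshot);
        System.out.println("naive: " + naiveMerge(snapshot, newFile));
        System.out.println("dedup: " + dedupMerge(snapshot, newFile));
    }
}
```

Whether the real scanner heap behaves like the first or the second merge is exactly the open question raised in (a).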


        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152425#comment-13152425 ] 

Ted Yu commented on HBASE-4485:
-------------------------------

@Amit:
Can you highlight the changes in the latest patch?
There're 12 files where the patch doesn't cleanly apply.

The patch is much larger than diff-oct-14.diff:
{code}
-rw-r--r--@ 1 zhihyu  110088321  60773 Nov 17 14:51 4485-nov-17.txt
-rw-r--r--  1 zhihyu  110088321  33693 Oct 14 15:18 4485.v14
{code}
                

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131844#comment-13131844 ] 

Amitanand Aiyer commented on HBASE-4485:
----------------------------------------

I am not clear on whether we need the notifyChangedReaders to be under the lock or not. I guess that depends on the
state that the lock is trying to guard. My understanding/assumption was that:

(a) The lock in Store.java is guarding the list of store files plus the state of the MemStore's (kvset and snapshot) references.

(b) StoreScanner is okay with seeing an older set of files, as long as the memStoreScanner that
it is using is current (i.e. the StoreScanner's view of the world may not be accurate as of "now", but it is guaranteed
to be consistent across the set of storefiles and the memstore at the time getScanners was called to create the scanners).

I don't think that we can totally avoid (b) even if we have notifyChangedReaders under the lock. StoreScanner could already
be processing a next() operation, during which updateReaders will just have to wait to let the StoreScanner complete, however
long it takes.

Please correct me if my assumptions/understanding are wrong.

                

        

[jira] [Updated] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amitanand Aiyer updated HBASE-4485:
-----------------------------------

    Attachment: repro_bug-4485.diff

Here is a way to repro the bug.

It uses an otherwise unnecessary sleep to get things to go bad. Not intended to be included in the final diff/submission.
                

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131833#comment-13131833 ] 

Amitanand Aiyer commented on HBASE-4485:
----------------------------------------

@Stack. The reason that I wanted to move notifyChangedReaders outside the lock is to avoid the following potential race condition.

Thread A holds the lock for Store.java, and wants to do notifyChangedReaders while holding the lock.
  NotifyChangedReaders calls updateReaders on a StoreScanner, say scanner-B.

Thread B is doing a seek on scanner-B, so it holds a lock on the StoreScanner object.
  Thread B may now have to call getScanners() (which is now a synchronized function in Store) if the heap == null.

This could end up in a deadlock, where Thread A has the lock for Store.java but needs the lock for StoreScanner to get into updateReaders,
while Thread B has the lock for StoreScanner.java but needs the lock for Store.java to get into getScanners and finish the seek().
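
The lock ordering problem above, and the idea of notifying outside the lock, can be sketched as follows. This is an assumption-laden illustration, not the patch: ToyObservedStore, ToyScanner, ChangedReadersObserver, getStoreFiles() and lockHeldByCurrentThread() are hypothetical names. The store mutates its state and copies the observer list under its lock, then releases the lock before calling updateReaders, so no thread ever holds the store lock while waiting for a scanner lock.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; a sketch of "mutate under the lock, notify outside it".
class ToyObservedStore {
    interface ChangedReadersObserver {
        void updateReaders();
    }

    private final Object storeLock = new Object();
    private final List<String> storeFiles = new ArrayList<>();
    private final List<ChangedReadersObserver> observers = new ArrayList<>();

    void addObserver(ChangedReadersObserver o) {
        synchronized (storeLock) {
            observers.add(o);
        }
    }

    /** Plays the role of the synchronized getScanners(): it needs the store lock. */
    List<String> getStoreFiles() {
        synchronized (storeLock) {
            return new ArrayList<>(storeFiles);
        }
    }

    boolean lockHeldByCurrentThread() {
        return Thread.holdsLock(storeLock);
    }

    void updateStoreFiles(String newFile) {
        List<ChangedReadersObserver> toNotify;
        synchronized (storeLock) {
            storeFiles.add(newFile);
            toNotify = new ArrayList<>(observers);   // copy while protected by the lock
        }
        // The store lock is released here, so a scanner that holds its own lock
        // and is waiting in getStoreFiles() can make progress.
        for (ChangedReadersObserver o : toNotify) {
            o.updateReaders();
        }
    }

    public static void main(String[] args) {
        ToyObservedStore store = new ToyObservedStore();
        ToyScanner scanner = new ToyScanner(store);
        store.updateStoreFiles("hfile-1");
        System.out.println(scanner.currentFiles());
    }
}

class ToyScanner implements ToyObservedStore.ChangedReadersObserver {
    private final ToyObservedStore store;
    private List<String> files;

    ToyScanner(ToyObservedStore store) {
        this.store = store;
        this.files = store.getStoreFiles();
        store.addObserver(this);
    }

    /** Takes the scanner lock first, then (inside getStoreFiles) the store lock. */
    @Override
    public synchronized void updateReaders() {
        files = store.getStoreFiles();
    }

    synchronized List<String> currentFiles() {
        return new ArrayList<>(files);
    }
}
```

The trade-off is that observers are refreshed one after another without the store lock held, so for a short time different scanners may be working from different file lists.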



                

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152573#comment-13152573 ] 

jiraposter@reviews.apache.org commented on HBASE-4485:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2481/
-----------------------------------------------------------

(Updated 2011-11-18 01:35:33.881853)


Review request for Ted Yu, Michael Stack, Jonathan Gray, Lars Hofhansl, Amitanand Aiyer, Kannan Muthukkaruppan, Karthik Ranganathan, and Nicolas Spiegelberg.


Changes
-------

rebased with the latest.


Summary
-------

Part of the 2856 diff split into 3 parts for easier review

The first part is v6 of the patch submitted to:
https://reviews.apache.org/r/2224/

This is the fix for HBase-4485


This addresses bug hbase-4485.
    https://issues.apache.org/jira/browse/hbase-4485


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 747a90b 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java fec5547 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java 6512a54 

Diff: https://reviews.apache.org/r/2481/diff


Testing
-------

running mvn test with all 3 patches together.


Thanks,

Amitanand


                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, 4485-v5.diff, 4485-v6.diff, repro_bug-4485.diff

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116660#comment-13116660 ] 

stack commented on HBASE-4485:
------------------------------

@Amit Great stuff.  I like the reasoning above, especially the bit where the fix I'd have done, swapping the order, likely has issues.

Looks like a little pollution in this patch from hbase-4344, but no matter since you've merged this into hbase-4344 over in hbase-4344 (getMaxMemstoreTS?).

Why move the notify outside of the lock?  Is it possible that, when done outside of the lock, observers could ever see different lists of readers?



                

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131917#comment-13131917 ] 

jiraposter@reviews.apache.org commented on HBASE-4485:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2481/
-----------------------------------------------------------

Review request for Ted Yu, Michael Stack, Jonathan Gray, Lars Hofhansl, Amitanand Aiyer, Kannan Muthukkaruppan, Karthik Ranganathan, and Nicolas Spiegelberg.


Summary
-------

Part of the 2856 diff split into 3 parts for easier review

The first part is v6 of the patch submitted to:
https://reviews.apache.org/r/2224/

This is the fix for HBase-4485


This addresses bug hbase-4485.
    https://issues.apache.org/jira/browse/hbase-4485


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 7761c42 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java f5b5c4c 
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 34263e4 

Diff: https://reviews.apache.org/r/2481/diff


Testing
-------

running mvn test with all 3 patches together.


Thanks,

Amitanand


                

        

[jira] [Resolved] (HBASE-4485) Eliminate window of missing Data

Posted by "Nicolas Spiegelberg (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg resolved HBASE-4485.
----------------------------------------

    Resolution: Fixed
    
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, 4485-v5.diff, 4485-v6.diff, repro_bug-4485.diff
>
>
> After incorporating v11 of the 2856 fix, we discovered that we are still having some ACID violations.
> This time, however, the problem is not about including "newer" updates; but, about missing older updates
> that should be including. 
> Here is what seems to be happening.
> There is a race condition in the StoreScanner.getScanners()
>   private List<KeyValueScanner> getScanners(Scan scan,
>       final NavigableSet<byte[]> columns) throws IOException {
>     // First the store file scanners
>     List<StoreFileScanner> sfScanners = StoreFileScanner
>       .getScannersForStoreFiles(store.getStorefiles(), cacheBlocks,
>                                 isGet, false);
>     List<KeyValueScanner> scanners =
>       new ArrayList<KeyValueScanner>(sfScanners.size()+1);
>     // include only those scan files which pass all filters
>     for (StoreFileScanner sfs : sfScanners) {
>       if (sfs.shouldSeek(scan, columns)) {
>         scanners.add(sfs);
>       }
>     }
>     // Then the memstore scanners
>     if (this.store.memstore.shouldSeek(scan)) {
>       scanners.addAll(this.store.memstore.getScanners());
>     }
>     return scanners;
>   }
> If, for example, a call to Store.updateStorefiles() happens between
> store.getStorefiles() and this.store.memstore.getScanners(), then
> a new HFile may have been created that is not seen by the
> StoreScanner, while its data is no longer present in the Memstore.snapshot either.
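
The interleaving described above can be reproduced deterministically with a toy model. This is only a sketch: ToyStore, commitFlush and racyView are hypothetical names and not HBase classes, and the hook stands in for a flush commit on another thread landing between the two reads.

```java
import java.util.ArrayList;
import java.util.List;

public class ScannerRaceSketch {
  /** Toy stand-in for a store: flushed "files" plus a memstore snapshot. */
  static class ToyStore {
    List<String> storeFiles = new ArrayList<>();
    List<String> snapshot = new ArrayList<>();

    /** Stand-in for the flush commit: snapshot contents become a store file. */
    void commitFlush() {
      List<String> files = new ArrayList<>(storeFiles);
      files.addAll(snapshot);
      storeFiles = files;
      snapshot = new ArrayList<>();
    }
  }

  /**
   * Collects the scanner's view in two separate steps, like the quoted method.
   * The hook runs between the steps and models a concurrent flush commit.
   */
  static List<String> racyView(ToyStore store, Runnable betweenSteps) {
    List<String> view = new ArrayList<>(store.storeFiles); // step 1: store files
    betweenSteps.run();                                    // a flush may commit here
    view.addAll(store.snapshot);                           // step 2: memstore snapshot
    return view;
  }

  public static void main(String[] args) {
    ToyStore store = new ToyStore();
    store.snapshot.add("row1");
    // The flush lands between the two steps, so "row1" is found in neither place.
    System.out.println(racyView(store, store::commitFlush));
    // With no interleaved flush, the flushed data is visible again.
    System.out.println(racyView(store, () -> { }));
  }
}
```

In this model the first view is empty even though "row1" was written before the scan began, which is the missing-data window the issue describes.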


        

[jira] [Updated] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amitanand Aiyer updated HBASE-4485:
-----------------------------------

    Attachment: 4485-v1.diff

v1 is the initial diff. The change may have other issues/repercussions.

I have not tested it yet; it is posted just for initial feedback.


> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff
>
>

        

[jira] [Updated] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amitanand Aiyer updated HBASE-4485:
-----------------------------------

    Description: 
After incorporating v11 of the 2856 fix, we discovered that we are still having some ACID violations.

This time, however, the problem is not about including "newer" updates, but about missing older updates
that should have been included.

Here is what seems to be happening.

There is a race condition in StoreScanner.getScanners():

  private List<KeyValueScanner> getScanners(Scan scan,
      final NavigableSet<byte[]> columns) throws IOException {
    // First the store file scanners
    List<StoreFileScanner> sfScanners = StoreFileScanner
      .getScannersForStoreFiles(store.getStorefiles(), cacheBlocks,
                                isGet, false);
    List<KeyValueScanner> scanners =
      new ArrayList<KeyValueScanner>(sfScanners.size()+1);

    // include only those scan files which pass all filters
    for (StoreFileScanner sfs : sfScanners) {
      if (sfs.shouldSeek(scan, columns)) {
        scanners.add(sfs);
      }
    }

    // Then the memstore scanners
    if (this.store.memstore.shouldSeek(scan)) {
      scanners.addAll(this.store.memstore.getScanners());
    }
    return scanners;
  }


If, for example, a call to Store.updateStorefiles() happens between
store.getStorefiles() and this.store.memstore.getScanners(), then
a new HFile may have been created that is not seen by the
StoreScanner, while its data is no longer present in the Memstore.snapshot either.


  was:
After incorporating v11 of the 2856 fix, we discovered that we are still having some ACID violations.

This time, however, the problem is not about including "newer" updates, but about missing older updates
that should have been included.

Here is what seems to be happening.


0 - Scanner starts scanning.

0 - MemStore.snapshot is called.

    Scanner has access to kvHeap and snapshot

1-  Flush takes place. 
     1.1 KV's in the snapshot are written to the disk.
     1.2 HFile is ready. 

2   Store.updateStoreFiles() deletes the old snapshot.
    
     2.1 updateReaders will not be called until the end of the columnFamily seek.

3  For a brief window of time, the scanner does not have access to certain KeyValues.
   a) The scanner no longer has access to the snapshot because it has been flushed to
disk.
   b) It does not yet have access to the HFile because updateReaders has
not been called yet.
        

    
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff
>
>

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152700#comment-13152700 ] 

Hudson commented on HBASE-4485:
-------------------------------

Integrated in HBase-TRUNK #2454 (See [https://builds.apache.org/job/HBase-TRUNK/2454/])
    HBASE-4485 Eliminate window of missing Data

nspiegelberg : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java

                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, 4485-v5.diff, 4485-v6.diff, repro_bug-4485.diff
>
>

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132733#comment-13132733 ] 

jiraposter@reviews.apache.org commented on HBASE-4485:
------------------------------------------------------



bq.  On 2011-10-21 06:13:51, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, line 645
bq.  > <https://reviews.apache.org/r/2481/diff/1/?file=51676#file51676line645>
bq.  >
bq.  >     Can these be private?
bq.  >     Volatile because you did not want synchronization in the MemstoreScanner constructor?

Actually, I had made this volatile because the kvset in the Memstore is volatile.

But, on second thought, it seems like we can get rid of that. While the kvset and snapshot in Memstore
can be accessed from different threads (running different MemstoreScanners), the reference in the
MemstoreScanner should only be used in the thread that is performing the scan.

Will update the diff and send it again.
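
A minimal sketch of that reasoning, using toy classes rather than the real MemStore code (ToyMemStore, ToyScanner and their methods are hypothetical): the shared fields stay volatile because several threads read them, while the scanner's own copies can be plain final fields if only the scanning thread ever touches them. Taking both references under the memstore's monitor is an assumption made here so that the pair is consistent; it is not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class ScannerReferenceSketch {
  /** Toy memstore: the fields are volatile because several threads read them. */
  static class ToyMemStore {
    volatile NavigableSet<String> kvset = new ConcurrentSkipListSet<>();
    volatile NavigableSet<String> snapshot = new ConcurrentSkipListSet<>();

    /** Moves the active set aside and starts a fresh one. */
    synchronized void takeSnapshot() {
      snapshot = kvset;
      kvset = new ConcurrentSkipListSet<>();
    }

    /** Stand-in for dropping the snapshot once a flush has committed. */
    synchronized void clearSnapshot() {
      snapshot = new ConcurrentSkipListSet<>();
    }
  }

  /** Toy scanner: copies both references once and is used by one thread only. */
  static class ToyScanner {
    // Plain final fields: only the scanning thread reads them (assumption).
    private final NavigableSet<String> kvsetAtCreation;
    private final NavigableSet<String> snapshotAtCreation;

    ToyScanner(ToyMemStore memstore) {
      // Read both references under the memstore's monitor so they belong together.
      synchronized (memstore) {
        this.kvsetAtCreation = memstore.kvset;
        this.snapshotAtCreation = memstore.snapshot;
      }
    }

    /** Returns everything reachable through the references captured at creation. */
    List<String> scanAll() {
      List<String> out = new ArrayList<>(snapshotAtCreation);
      out.addAll(kvsetAtCreation);
      return out;
    }
  }
}
```

Because the scanner keeps its own references, a later takeSnapshot() or clearSnapshot() on the toy memstore does not change what the scanner sees. The cost is that the old sets stay reachable for as long as the scanner exists.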


bq.  On 2011-10-21 06:13:51, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java, line 712
bq.  > <https://reviews.apache.org/r/2481/diff/1/?file=51676#file51676line712>
bq.  >
bq.  >     Cool.
bq.  >     
bq.  >     Only concern is that while the scanner exists we may need more memory than before.

You are right.

Are we guaranteed to do a seek() only once? If so, we probably can reset kvset/snapshot to null, 
once we have evaluated kvTail and snapshotTail.


bq.  On 2011-10-21 06:13:51, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 1335
bq.  > <https://reviews.apache.org/r/2481/diff/1/?file=51677#file51677line1335>
bq.  >
bq.  >     Why can you move this up?

Here is my understanding/assumption:
https://issues.apache.org/jira/browse/HBASE-4485?focusedCommentId=13131844&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13131844

Let me know if that is wrong :-)


- Amitanand


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2481/#review2736
-----------------------------------------------------------


On 2011-10-20 19:13:56, Amitanand Aiyer wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2481/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-20 19:13:56)
bq.  
bq.  
bq.  Review request for Ted Yu, Michael Stack, Jonathan Gray, Lars Hofhansl, Amitanand Aiyer, Kannan Muthukkaruppan, Karthik Ranganathan, and Nicolas Spiegelberg.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Part of the 2856 diff split into 3 parts for easier review
bq.  
bq.  The first part is v6 of the patch submitted to:
bq.  https://reviews.apache.org/r/2224/
bq.  
bq.  This is the fix for HBase-4485
bq.  
bq.  
bq.  This addresses bug hbase-4485.
bq.      https://issues.apache.org/jira/browse/hbase-4485
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 7761c42 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java f5b5c4c 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 34263e4 
bq.  
bq.  Diff: https://reviews.apache.org/r/2481/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  running mvn test with all 3 patches together.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Amitanand
bq.  
bq.


                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, repro_bug-4485.diff
>
>

        

[jira] [Updated] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amitanand Aiyer updated HBASE-4485:
-----------------------------------

    Attachment: 4485-v3.diff

Move some functions to Store.java so that we do not have to access store.lock from StoreScanner.java

Passes the TestAcidGuarantees on the internal branch (0.89). 

Running the test suite on the open source trunk (in progress)
                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, repro_bug-4485.diff
>
>

        

[jira] [Updated] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amitanand Aiyer updated HBASE-4485:
-----------------------------------

    Attachment: 4485-v6.diff
    
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, 4485-v5.diff, 4485-v6.diff, repro_bug-4485.diff
>
>

        

[jira] [Updated] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amitanand Aiyer updated HBASE-4485:
-----------------------------------

    Attachment: 4485-v2.diff

Fix the issue by using locks to ensure that Store.updateStoreFiles does not get called between the StoreScanner getting the list of store files and its getting the MemStoreScanner.
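
The locking idea can be sketched with a read/write lock on a toy store. This is not the contents of 4485-v2.diff: ToyStore, tryCommitFlush, getView and flushFromOtherThread are hypothetical names, and the hook exists only to show that a flush attempted from another thread cannot commit while a reader is between its two reads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockedScannerViewSketch {
  static class ToyStore {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> storeFiles = new ArrayList<>();
    private List<String> snapshot = new ArrayList<>();

    void addToSnapshot(String kv) {
      lock.writeLock().lock();
      try {
        snapshot.add(kv);
      } finally {
        lock.writeLock().unlock();
      }
    }

    /** Tries to commit a flush; returns false if the write lock is unavailable. */
    boolean tryCommitFlush() {
      if (!lock.writeLock().tryLock()) {
        return false;
      }
      try {
        storeFiles.addAll(snapshot);
        snapshot = new ArrayList<>();
        return true;
      } finally {
        lock.writeLock().unlock();
      }
    }

    /** Reads store files and snapshot under one read lock; the hook runs in between. */
    List<String> getView(Runnable betweenReads) {
      lock.readLock().lock();
      try {
        List<String> view = new ArrayList<>(storeFiles);
        betweenReads.run();
        view.addAll(snapshot);
        return view;
      } finally {
        lock.readLock().unlock();
      }
    }
  }

  /** Attempts a flush on another thread and reports whether it committed. */
  static boolean flushFromOtherThread(ToyStore store) {
    boolean[] committed = new boolean[1];
    Thread flusher = new Thread(() -> committed[0] = store.tryCommitFlush());
    flusher.start();
    try {
      flusher.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return committed[0];
  }
}
```

A real flush would block on the write lock rather than give up; tryLock is used here only so the sketch terminates and the outcome can be observed.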
                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, repro_bug-4485.diff
>
>

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152413#comment-13152413 ] 

jiraposter@reviews.apache.org commented on HBASE-4485:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2481/
-----------------------------------------------------------

(Updated 2011-11-17 22:43:10.296972)


Review request for Ted Yu, Michael Stack, Jonathan Gray, Lars Hofhansl, Amitanand Aiyer, Kannan Muthukkaruppan, Karthik Ranganathan, and Nicolas Spiegelberg.


Changes
-------

Added comments, based on Kannan's and Nicolas's comments on the 0.89 branch.


Summary
-------

Part of the 2856 diff split into 3 parts for easier review

The first part is v6 of the patch submitted to:
https://reviews.apache.org/r/2224/

This is the fix for HBase-4485


This addresses bug hbase-4485.
    https://issues.apache.org/jira/browse/hbase-4485


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 34263e4 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 7761c42 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java f5b5c4c 

Diff: https://reviews.apache.org/r/2481/diff


Testing
-------

running mvn test with all 3 patches together.


Thanks,

Amitanand


                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, repro_bug-4485.diff
>
>

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132740#comment-13132740 ] 

jiraposter@reviews.apache.org commented on HBASE-4485:
------------------------------------------------------



bq.  On 2011-10-21 06:13:51, Lars Hofhansl wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java, line 216
bq.  > <https://reviews.apache.org/r/2481/diff/1/?file=51678#file51678line216>
bq.  >
bq.  >     I'm not usually a fan of instanceof - unless necessary of course.
bq.  >     Why did you need to fold this into one loop?

The reason for having one loop is that we want to get the
scanners for the store files and the scanners for the memstore at
the same time. This is required to ensure that the view of
the StoreScanner is consistent with the state of the store
at some point in the execution history.

I agree, instanceof is kind of ugly. Will try to get rid of that.
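
One possible way to keep a single loop without instanceof is to let every scanner answer the filter question itself. The sketch below only illustrates that idea; the ToyScanner interface, its shouldUse method and the two implementations are hypothetical and are not the HBase API or the change made in the patch.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleLoopFilterSketch {
  /** Hypothetical common interface: every scanner answers the filter itself. */
  interface ToyScanner {
    boolean shouldUse(String wantedRow);
  }

  /** Toy store file scanner: useful only if the file's key range covers the row. */
  static class ToyFileScanner implements ToyScanner {
    private final String firstRow;
    private final String lastRow;

    ToyFileScanner(String firstRow, String lastRow) {
      this.firstRow = firstRow;
      this.lastRow = lastRow;
    }

    @Override
    public boolean shouldUse(String wantedRow) {
      return firstRow.compareTo(wantedRow) <= 0 && wantedRow.compareTo(lastRow) <= 0;
    }
  }

  /** Toy memstore scanner: always consulted in this sketch. */
  static class ToyMemScanner implements ToyScanner {
    @Override
    public boolean shouldUse(String wantedRow) {
      return true;
    }
  }

  /** One pass over a list that was obtained in one step; no type checks needed. */
  static List<ToyScanner> select(List<ToyScanner> candidates, String wantedRow) {
    List<ToyScanner> selected = new ArrayList<>();
    for (ToyScanner scanner : candidates) {
      if (scanner.shouldUse(wantedRow)) {
        selected.add(scanner);
      }
    }
    return selected;
  }
}
```

The candidates list would be the combined list of store file and memstore scanners obtained together, so filtering afterwards does not reopen the window between the two reads.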


- Amitanand


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2481/#review2736
-----------------------------------------------------------




                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, repro_bug-4485.diff
>
>
> After incorporating v11 of the 2856 fix, we discovered that we are still having some ACID violations.
> This time, however, the problem is not about including "newer" updates, but about missing older updates
> that should be included.
> Here is what seems to be happening.
> There is a race condition in StoreScanner.getScanners():
>   private List<KeyValueScanner> getScanners(Scan scan,
>       final NavigableSet<byte[]> columns) throws IOException {
>     // First the store file scanners
>     List<StoreFileScanner> sfScanners = StoreFileScanner
>       .getScannersForStoreFiles(store.getStorefiles(), cacheBlocks,
>                                 isGet, false);
>     List<KeyValueScanner> scanners =
>       new ArrayList<KeyValueScanner>(sfScanners.size()+1);
>     // include only those scan files which pass all filters
>     for (StoreFileScanner sfs : sfScanners) {
>       if (sfs.shouldSeek(scan, columns)) {
>         scanners.add(sfs);
>       }
>     }
>     // Then the memstore scanners
>     if (this.store.memstore.shouldSeek(scan)) {
>       scanners.addAll(this.store.memstore.getScanners());
>     }
>     return scanners;
>   }
> If, for example, a call to Store.updateStorefiles() happens between
> store.getStorefiles() and this.store.memstore.getScanners(), then
> a new HFile may have been created that is not seen by the
> StoreScanner, and the data is not present in the Memstore.snapshot either.
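
A minimal sketch of one way to close the window described above, assuming a single read/write lock guards both the store file list and the memstore snapshot. SketchStore, setSnapshot() and getScannerSources() are hypothetical names used for illustration only; this is not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified stand-in for a store; not the HBase Store class.
class SketchStore {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<String> storeFiles = new ArrayList<String>();
  private List<String> snapshot = new ArrayList<String>();

  // Pretend a memstore snapshot holding these KVs is waiting to be flushed.
  void setSnapshot(List<String> kvs) {
    lock.writeLock().lock();
    try {
      snapshot = new ArrayList<String>(kvs);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Flush completion: publish the new file and drop the snapshot in one step.
  void updateStorefiles(String newFile) {
    lock.writeLock().lock();
    try {
      storeFiles.add(newFile);
      snapshot = new ArrayList<String>();
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Collect both sources under one read lock, so a flush cannot slip in
  // between reading the file list and reading the snapshot.
  List<String> getScannerSources() {
    lock.readLock().lock();
    try {
      List<String> sources = new ArrayList<String>(storeFiles);
      sources.addAll(snapshot);
      return sources;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

In this model a reader sees either the snapshot KeyValues or the file that replaced them, never neither.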

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152450#comment-13152450 ] 

jiraposter@reviews.apache.org commented on HBASE-4485:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2481/
-----------------------------------------------------------

(Updated 2011-11-17 23:28:27.463840)


Review request for Ted Yu, Michael Stack, Jonathan Gray, Lars Hofhansl, Amitanand Aiyer, Kannan Muthukkaruppan, Karthik Ranganathan, and Nicolas Spiegelberg.


Changes
-------

Added comments based on Kannan's and Nicolas' feedback.


Summary
-------

Part of the 2856 diff split into 3 parts for easier review

The first part is v6 of the patch submitted to:
https://reviews.apache.org/r/2224/

This is the fix for HBase-4485


This addresses bug hbase-4485.
    https://issues.apache.org/jira/browse/hbase-4485


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 34263e4 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 7761c42 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java f5b5c4c 

Diff: https://reviews.apache.org/r/2481/diff


Testing
-------

running mvn test with all 3 patches together.


Thanks,

Amitanand


                

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132355#comment-13132355 ] 

jiraposter@reviews.apache.org commented on HBASE-4485:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2481/#review2736
-----------------------------------------------------------


Looks good to me. Would be awesome to get this finally out of the way!
A few nits (like trailing spaces). And some questions below.


src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
<https://reviews.apache.org/r/2481/#comment6164>

    Can this now be a static inner class (if we pass kvSet, etc. to the MemstoreScanner constructor)?



src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
<https://reviews.apache.org/r/2481/#comment6166>

    Can these be private?
    Are they volatile because you did not want synchronization in the MemstoreScanner constructor?



src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
<https://reviews.apache.org/r/2481/#comment6162>

    Cool.
    
    Only concern is that while the scanner exists we may need more memory than before.



src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
<https://reviews.apache.org/r/2481/#comment6160>

    Why can you move this up?



src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
<https://reviews.apache.org/r/2481/#comment6159>

    I'm not usually a fan of instanceof - unless necessary of course.
    Why did you need to fold this into one loop?


- Lars



[jira] [Updated] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amitanand Aiyer updated HBASE-4485:
-----------------------------------

    Attachment: 4485-v4.diff

Moved notifyChangedReadersObservers() outside the lock in all places.

Testing in progress.
                

[jira] [Updated] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amitanand Aiyer updated HBASE-4485:
-----------------------------------

    Attachment: 4485-v5.diff
    

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132809#comment-13132809 ] 

Ted Yu commented on HBASE-4485:
-------------------------------

In order to avoid using instanceof in useRWCC(), we have at least two options:

1. Create a tuple for each KeyValueScanner discovered, with a boolean flag indicating whether the KeyValueScanner is a StoreFileScanner.
Thus scanners would be a List<KeyValueScannerWrapper>

2. Make scanners a Map<KeyValueScanner, Boolean>
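
A rough sketch of option 1, assuming a small wrapper class that records the flag when the scanner is created. SketchScanner, NamedScannerSketch and KeyValueScannerWrapperSketch are hypothetical names for illustration; they are not existing HBase classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal interface standing in for KeyValueScanner.
interface SketchScanner {
  String name();
}

// Trivial scanner used only to make the example runnable.
final class NamedScannerSketch implements SketchScanner {
  private final String name;

  NamedScannerSketch(String name) {
    this.name = name;
  }

  public String name() {
    return name;
  }
}

// Option 1: pair each scanner with a flag recorded when it is created.
final class KeyValueScannerWrapperSketch {
  final SketchScanner scanner;
  final boolean isStoreFileScanner;

  KeyValueScannerWrapperSketch(SketchScanner scanner, boolean isStoreFileScanner) {
    this.scanner = scanner;
    this.isStoreFileScanner = isStoreFileScanner;
  }

  // Pick scanners by the recorded flag instead of an instanceof test.
  static List<SketchScanner> select(List<KeyValueScannerWrapperSketch> all,
                                    boolean storeFileScanners) {
    List<SketchScanner> out = new ArrayList<SketchScanner>();
    for (KeyValueScannerWrapperSketch w : all) {
      if (w.isStoreFileScanner == storeFileScanners) {
        out.add(w.scanner);
      }
    }
    return out;
  }
}
```

Option 2 would carry the same flag as the map value; the wrapper keeps insertion order without depending on the map implementation.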
                

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Amitanand Aiyer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131845#comment-13131845 ] 

Amitanand Aiyer commented on HBASE-4485:
----------------------------------------


Btw, Ted also has a fix for this issue in which we use a reader lock and a writer lock in Store.java. The writer lock
is held while we update the files, and we switch back to the reader lock when we call notifyChangedReadersObservers().

We could definitely use that to be "safe". But I am still trying to understand the case where having
notifyChangedReadersObservers() outside the lock could cause problems.
                

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131988#comment-13131988 ] 

Ted Yu commented on HBASE-4485:
-------------------------------

In Store.completeCompaction(), the following code is outside lock.writeLock:
{code}
      // Tell observers that list of StoreFiles has changed.
      notifyChangedReadersObservers();
      // Finally, delete old store files.
      for (StoreFile hsf: compactedFiles) {
        hsf.deleteReader();
      }
{code}
I think the above should be placed in (downgraded) lock.readLock
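
The downgrade suggested above could look roughly like this with java.util.concurrent.locks.ReentrantReadWriteLock. DowngradeSketch and its members are hypothetical stand-ins for Store, and the real notifyChangedReadersObservers() is replaced here by a simple log entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for Store; only the locking pattern is of interest.
class DowngradeSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<String> storeFiles;
  private final List<String> notifications = new ArrayList<String>();

  DowngradeSketch(List<String> initialFiles) {
    this.storeFiles = new ArrayList<String>(initialFiles);
  }

  // Swap compacted files for the result, then notify under a downgraded lock.
  void replaceFiles(List<String> compacted, String result) {
    lock.writeLock().lock();
    try {
      storeFiles.removeAll(compacted);
      storeFiles.add(result);
      // Downgrade: take the read lock before giving up the write lock,
      // so no other writer can get in between the update and the notify.
      lock.readLock().lock();
    } finally {
      lock.writeLock().unlock();
    }
    try {
      // Stand-in for notifying observers that the file list has changed.
      notifications.add("readers changed: " + storeFiles);
    } finally {
      lock.readLock().unlock();
    }
  }

  List<String> files() {
    lock.readLock().lock();
    try {
      return new ArrayList<String>(storeFiles);
    } finally {
      lock.readLock().unlock();
    }
  }

  List<String> notifications() {
    return notifications;
  }

  int readHoldCount() {
    return lock.getReadHoldCount();
  }

  boolean writeLockedByCurrentThread() {
    return lock.isWriteLockedByCurrentThread();
  }
}
```

ReentrantReadWriteLock permits this write-to-read downgrade; the reverse (upgrading a read lock to a write lock) is not supported.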
                

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131994#comment-13131994 ] 

Ted Yu commented on HBASE-4485:
-------------------------------

In StoreScanner.useRWCC(boolean flag):
{code}
    List<KeyValueScanner> allStoreScanners = 
      this.store.getScanners(cacheBlocks, isGet);
{code}
I am not sure if allStoreScanners is a good name - MemStoreScanner is checked later in the method.
I suggest we keep the original name - scanners.
                

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Nicolas Spiegelberg (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152596#comment-13152596 ] 

Nicolas Spiegelberg commented on HBASE-4485:
--------------------------------------------

+1 lgtm.  went over this with kannan.
                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, 4485-v5.diff, 4485-v6.diff, repro_bug-4485.diff

[jira] [Commented] (HBASE-4485) Eliminate window of missing Data

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127696#comment-13127696 ] 

Ted Yu commented on HBASE-4485:
-------------------------------

Here is the related change w.r.t. ignoreNewerKVs():
{code}
Index: src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
===================================================================
--- src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java    (revision 1176657)
+++ src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java    (working copy)
@@ -59,6 +59,12 @@
   /** Row the query is on */
   protected byte [] row;

+  /** Should we ignore KV's with a newer RWCC timestamp **/
+  private boolean ignoreNewerKVs = false;
+  public void ignoreNewerKVs() {
+    this.ignoreNewerKVs = true;
+  }
+
   /**
    * Constructs a ScanQueryMatcher for a Scan.
    * @param scan
@@ -166,6 +172,12 @@
         return columns.getNextRowOrNextColumn(bytes, offset, qualLength);
     }

+    // The compaction thread has no readPoint set. For other operations, we
+    // will ignore updates that are done after the read operation has started.
+    if (this.ignoreNewerKVs &&
+        kv.getMemstoreTS() > ReadWriteConsistencyControl.getThreadReadPoint())
+        return MatchCode.SKIP;
+
     byte type = kv.getType();
     if (isDelete(type)) {
       if (tr.withinOrAfterTimeRange(timestamp)) {
{code}
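
The effect of the added check can be illustrated with a minimal, self-contained sketch. This is not HBase code: Cell, ReadPointFilterSketch and visible() are hypothetical names, and the read point is passed in as a plain argument instead of being read from ReadWriteConsistencyControl.getThreadReadPoint().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a KeyValue: only the fields this sketch needs.
final class Cell {
    final String row;
    final long memstoreTS; // write number attached to the edit

    Cell(String row, long memstoreTS) {
        this.row = row;
        this.memstoreTS = memstoreTS;
    }
}

final class ReadPointFilterSketch {
    // Same shape as the check in the diff: when ignoreNewerKVs is set, any
    // cell whose memstoreTS is greater than the read point is skipped.
    static List<Cell> visible(List<Cell> cells, long readPoint, boolean ignoreNewerKVs) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : cells) {
            if (ignoreNewerKVs && c.memstoreTS > readPoint) {
                continue; // plays the role of MatchCode.SKIP
            }
            out.add(c);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("r1", 3L), new Cell("r1", 5L), new Cell("r1", 8L));
        // A reader with read point 5 skips the edit with memstoreTS 8.
        System.out.println(visible(cells, 5L, true).size());  // prints 2
        // With the flag left at its default (false), nothing is filtered.
        System.out.println(visible(cells, 5L, false).size()); // prints 3
    }
}
```

Since the flag defaults to false in the diff, a caller that never invokes ignoreNewerKVs() keeps the unfiltered behaviour.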
                
> Eliminate window of missing Data
> --------------------------------
>
>                 Key: HBASE-4485
>                 URL: https://issues.apache.org/jira/browse/HBASE-4485
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Amitanand Aiyer
>            Assignee: Amitanand Aiyer
>             Fix For: 0.94.0
>
>         Attachments: 4485-v1.diff, 4485-v2.diff, 4485-v3.diff, 4485-v4.diff, repro_bug-4485.diff
>
>
> After incorporating v11 of the 2856 fix, we discovered that we still have some ACID violations.
> This time, however, the problem is not that "newer" updates are included, but that older updates
> that should be included are missing.
> Here is what seems to be happening.
> There is a race condition in StoreScanner.getScanners():
>   private List<KeyValueScanner> getScanners(Scan scan,
>       final NavigableSet<byte[]> columns) throws IOException {
>     // First the store file scanners
>     List<StoreFileScanner> sfScanners = StoreFileScanner
>       .getScannersForStoreFiles(store.getStorefiles(), cacheBlocks,
>                                 isGet, false);
>     List<KeyValueScanner> scanners =
>       new ArrayList<KeyValueScanner>(sfScanners.size()+1);
>     // include only those scan files which pass all filters
>     for (StoreFileScanner sfs : sfScanners) {
>       if (sfs.shouldSeek(scan, columns)) {
>         scanners.add(sfs);
>       }
>     }
>     // Then the memstore scanners
>     if (this.store.memstore.shouldSeek(scan)) {
>       scanners.addAll(this.store.memstore.getScanners());
>     }
>     return scanners;
>   }
> If, for example, a call to Store.updateStorefiles() happens between
> store.getStorefiles() and this.store.memstore.getScanners(), then
> a new HFile may have been created that the StoreScanner does not see,
> while the data is no longer present in the Memstore.snapshot either.
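
The interleaving described in the quoted report can be reproduced deterministically with a toy model. This is a sketch under stated assumptions, not the HBase classes and not necessarily the committed patch: ToyStore, racyView() and consistentView() are hypothetical names, store files are modelled as plain lists of strings, and a ReentrantReadWriteLock is assumed as one possible way to make the two reads atomic with respect to the flush commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical toy model of a store: a list of "files" plus a memstore snapshot.
final class ToyStore {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private List<List<String>> storefiles = new ArrayList<>();
    private List<String> snapshot;

    ToyStore(List<String> snapshotContents) {
        this.snapshot = new ArrayList<>(snapshotContents);
    }

    // Flush commit: publish the snapshot as a new file and drop the snapshot.
    void updateStorefiles() {
        lock.writeLock().lock();
        try {
            List<List<String>> newFiles = new ArrayList<>(storefiles);
            newFiles.add(snapshot);
            storefiles = newFiles;
            snapshot = new ArrayList<>();
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Racy: files and memstore are read in two separate steps with no lock.
    // 'between' runs in the gap and stands in for a concurrent flush commit.
    List<String> racyView(Runnable between) {
        List<List<String>> files = new ArrayList<>(storefiles);
        between.run();
        List<String> mem = new ArrayList<>(snapshot);
        return merge(files, mem);
    }

    // Consistent: both reads happen under the read lock, so updateStorefiles()
    // (which needs the write lock) cannot run between them.
    List<String> consistentView() {
        lock.readLock().lock();
        try {
            return merge(new ArrayList<>(storefiles), new ArrayList<>(snapshot));
        } finally {
            lock.readLock().unlock();
        }
    }

    private static List<String> merge(List<List<String>> files, List<String> mem) {
        List<String> all = new ArrayList<>();
        for (List<String> f : files) {
            all.addAll(f);
        }
        all.addAll(mem);
        return all;
    }

    public static void main(String[] args) {
        ToyStore racy = new ToyStore(Arrays.asList("kv1"));
        // The flush lands between the two reads: kv1 is in neither view.
        System.out.println(racy.racyView(racy::updateStorefiles)); // prints []

        ToyStore safe = new ToyStore(Arrays.asList("kv1"));
        System.out.println(safe.consistentView()); // prints [kv1]
        safe.updateStorefiles();
        System.out.println(safe.consistentView()); // prints [kv1]
    }
}
```

In the racy path kv1 disappears because the file list was copied before the flush and the snapshot was read after it. In the locked path a reader sees kv1 either in the snapshot or in the new file, never in neither.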
