Posted to issues@hbase.apache.org by "Prakash Khemani (Commented) (JIRA)" <ji...@apache.org> on 2011/11/20 04:15:52 UTC
[jira] [Commented] (HBASE-4823) long running scans lose benefit of
bloomfilters and timerange hints
[ https://issues.apache.org/jira/browse/HBASE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153655#comment-13153655 ]
Prakash Khemani commented on HBASE-4823:
----------------------------------------
https://issues.apache.org/jira/browse/HBASE-3415 is also related.
> long running scans lose benefit of bloomfilters and timerange hints
> -------------------------------------------------------------------
>
> Key: HBASE-4823
> URL: https://issues.apache.org/jira/browse/HBASE-4823
> Project: HBase
> Issue Type: Bug
> Reporter: Kannan Muthukkaruppan
> Assignee: Kannan Muthukkaruppan
>
> When you have a long-running scan, for example from an MR job, you can lose the benefit of timerange hints and bloom filters midway if your scanner gets reset. [Note: scanners can get reset by, for example, a flush or compaction.]
> In one of our workloads, we periodically want to do rollups on the most recent 15 minutes of data in a column family, but the timerange hint benefit is lost midway when resetScannerStack (shown below) runs. The end result is that we read all the old HFiles rather than just the recent ones.
> {code}
> private void resetScannerStack(KeyValue lastTopKey) throws IOException {
>   if (heap != null) {
>     throw new RuntimeException("StoreScanner.reseek run on an existing heap!");
>   }
>   /* When we have the scan object, should we not pass it to getScanners()
>    * to get a limited set of scanners? We did so in the constructor and we
>    * could have done it now by storing the scan object from the constructor */
>   List<KeyValueScanner> scanners = getScanners();
> {code}
> The comment in the code seems to be aware of this issue and even has the suggested fix!
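
The idea in that code comment can be illustrated with a small, self-contained sketch. This is not HBase code: `FileInfo`, `ScanTimeRange`, and `selectFiles` are hypothetical stand-ins for a store file's timestamp metadata, the scan's timerange hint, and the file selection done when scanners are opened. It assumes a half-open range [min, max) and only shows why keeping the scan's range around for the reset path lets old files stay pruned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScannerResetSketch {
    /** Hypothetical stand-in for the timestamp metadata of one store file. */
    static final class FileInfo {
        final String name;
        final long minTs;
        final long maxTs;

        FileInfo(String name, long minTs, long maxTs) {
            this.name = name;
            this.minTs = minTs;
            this.maxTs = maxTs;
        }
    }

    /** Hypothetical timerange hint, assumed half-open: [min, max). */
    static final class ScanTimeRange {
        final long min;
        final long max;

        ScanTimeRange(long min, long max) {
            this.min = min;
            this.max = max;
        }

        /** True if the file may contain cells inside this range. */
        boolean overlaps(FileInfo f) {
            return f.maxTs >= min && f.minTs < max;
        }
    }

    /**
     * Selects the files a scanner needs to open. A null range models the
     * reset path described above, where the hint is no longer available
     * and every file is opened.
     */
    static List<FileInfo> selectFiles(List<FileInfo> all, ScanTimeRange range) {
        List<FileInfo> kept = new ArrayList<>();
        for (FileInfo f : all) {
            if (range == null || range.overlaps(f)) {
                kept.add(f);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<FileInfo> files = Arrays.asList(
                new FileInfo("old", 0L, 100L),
                new FileInfo("mid", 400L, 500L),
                new FileInfo("recent", 900L, 1000L));
        ScanTimeRange recentOnly = new ScanTimeRange(850L, 1001L);

        // Initial open and a reset that remembers the range both prune old files.
        System.out.println("with range: " + selectFiles(files, recentOnly).size());
        // A reset that forgot the range opens everything.
        System.out.println("without range: " + selectFiles(files, null).size());
    }
}
```

In this toy model, a reset that reuses the stored range opens one file, while a reset without it opens all three.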