Posted to issues@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2012/11/01 05:27:12 UTC

[jira] [Commented] (HBASE-6874) Implement prefetching for scanners

    [ https://issues.apache.org/jira/browse/HBASE-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488448#comment-13488448 ] 

Lars Hofhansl commented on HBASE-6874:
--------------------------------------

[~karthik.ranga] Do you have a patch? We were just discussing something similar here, and I was about to open a similar issue before I found this one. This is even more useful with scanner caching.

One could even go a step further and parallelize the prefetching into N threads (useful if the results are heavily prefiltered at the server).

We do our own parallel scanner fetching (not necessarily on region or buffer boundaries), but it would be nice if that could be generalized and be part of HBase.
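
To make the N-thread idea concrete, here is a rough Java sketch. It is not our actual code and not HBase API: ParallelRangeFetcher, fetchAll and the "range reader" callables are hypothetical names, and each callable is assumed to scan one disjoint slice of the key range and return its rows. Unlike a real scanner, it materializes all rows before returning, so it only illustrates the fan-out and the ordered merge.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper for illustration; not part of HBase. */
final class ParallelRangeFetcher {

    /**
     * Runs every range reader on a pool of the given size and returns
     * the rows concatenated in the order the readers were supplied.
     */
    static <T> List<T> fetchAll(List<Callable<List<T>>> rangeReaders, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> merged = new ArrayList<>();
            // invokeAll blocks until all tasks finish and returns the
            // futures in the same order as the submitted tasks.
            for (Future<List<T>> slice : pool.invokeAll(rangeReaders)) {
                merged.addAll(slice.get());
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }
}
```

If the readers are given in key order, the merged list is in key order as well, regardless of which slice finishes first.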

                
> Implement prefetching for scanners
> ----------------------------------
>
>                 Key: HBASE-6874
>                 URL: https://issues.apache.org/jira/browse/HBASE-6874
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Karthik Ranganathan
>            Assignee: Karthik Ranganathan
>
> I did some quick experiments by scanning data that should be completely in memory and found that adding prefetching increases the throughput by about 50%, from 26 MB/s to 39 MB/s.
> The idea is to perform the next() call in a background thread and keep the result ready. When the scanner's next() comes in, return the pre-computed result and issue another background read.
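
The idea in the description above can be sketched in Java roughly as follows. This is only an illustration, not the patch for this issue and not HBase API: the class name PrefetchingScanner is made up, and the Supplier stands in for the underlying scanner's next(), assumed here to return null once the scan is exhausted.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

/** Hypothetical one-deep prefetching wrapper; not an HBase class. */
final class PrefetchingScanner<T> implements AutoCloseable {
    private final Callable<T> readNext;      // wraps the underlying next()
    private final ExecutorService executor;  // single background reader
    private Future<T> pending;               // the read currently in flight
    private boolean exhausted = false;

    PrefetchingScanner(Supplier<T> source) {
        this.readNext = source::get;
        this.executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "scanner-prefetch");
            t.setDaemon(true); // do not keep the JVM alive
            return t;
        });
        // Start the first read before the caller asks for anything.
        this.pending = executor.submit(readNext);
    }

    /** Returns the prefetched result, or null once the source is exhausted. */
    T next() throws InterruptedException, ExecutionException {
        if (exhausted) {
            return null;
        }
        T result = pending.get(); // waits only if the read is still running
        if (result == null) {
            exhausted = true;
            pending = null;
        } else {
            // Issue the next background read before handing back this result.
            pending = executor.submit(readNext);
        }
        return result;
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

A caller would loop on next() until it returns null and then call close(). Only one read is ever in flight, so the underlying source is never accessed concurrently.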

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira