Posted to dev@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2009/06/14 01:14:07 UTC

[jira] Created: (HBASE-1517) Implement inexpensive seek operations in HFile

Implement inexpensive seek operations in HFile
----------------------------------------------

                 Key: HBASE-1517
                 URL: https://issues.apache.org/jira/browse/HBASE-1517
             Project: Hadoop HBase
          Issue Type: Improvement
    Affects Versions: 0.20.0
            Reporter: Jonathan Gray
             Fix For: 0.20.1


When we early-out of a row because of columns, versions, filters, etc., we currently advance to the end of that row one key at a time.  We should instead do the seek at the HFile level in cases where we would end up skipping blocks in the process.  This will be very common with relatively large rows and regex row filters.

If calls that end up doing nothing are constant time, we could also call this to seek to the next column (or even to a specific column in the ExplicitTracker case).
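
To make the difference concrete, here is a minimal sketch of skipping to the next row by stepping key by key versus consulting a block index first. It is not HFile code: SeekSketch, BlockedFile, linearSkip, seekSkip and the "row/column" string keys are all hypothetical names invented for this illustration, and the real HFile reader and scanner APIs differ.

```java
import java.util.ArrayList;
import java.util.List;

public class SeekSketch {

    /** Hypothetical stand-in for a store file: sorted "row/column" keys cut into fixed-size blocks. */
    public static final class BlockedFile {
        final List<String[]> blocks = new ArrayList<>();
        final List<String> firstKeys = new ArrayList<>(); // in-memory block index: first key of each block
        final int keysPerBlock;
        final int totalKeys;
        public int blocksRead = 0; // number of simulated block reads

        BlockedFile(List<String> sortedKeys, int keysPerBlock) {
            this.keysPerBlock = keysPerBlock;
            this.totalKeys = sortedKeys.size();
            for (int i = 0; i < totalKeys; i += keysPerBlock) {
                List<String> slice = sortedKeys.subList(i, Math.min(i + keysPerBlock, totalKeys));
                blocks.add(slice.toArray(new String[0]));
                firstKeys.add(slice.get(0));
            }
        }

        /** Simulates reading one block from disk and counts the read. */
        String[] load(int blockNo) {
            blocksRead++;
            return blocks.get(blockNo);
        }
    }

    static String rowOf(String key) {
        return key.substring(0, key.indexOf('/'));
    }

    /** Builds a toy file with two rows, "row1" and "row2", with the given column counts. */
    public static BlockedFile build(int row1Cols, int row2Cols, int keysPerBlock) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < row1Cols; i++) keys.add(String.format("row1/c%04d", i));
        for (int i = 0; i < row2Cols; i++) keys.add(String.format("row2/c%04d", i));
        return new BlockedFile(keys, keysPerBlock);
    }

    /** Steps one key at a time until the row changes; reads every block on the way. */
    public static int linearSkip(BlockedFile f, int start) {
        int blockNo = start / f.keysPerBlock;
        String[] block = f.load(blockNo);
        String row = rowOf(block[start % f.keysPerBlock]);
        for (int pos = start + 1; pos < f.totalKeys; pos++) {
            if (pos / f.keysPerBlock != blockNo) {
                blockNo = pos / f.keysPerBlock;
                block = f.load(blockNo);
            }
            if (!rowOf(block[pos % f.keysPerBlock]).equals(row)) return pos;
        }
        return f.totalKeys; // no later row in this file
    }

    /** Uses the block index to jump to the block where the row ends, skipping the blocks in between. */
    public static int seekSkip(BlockedFile f, int start) {
        int startBlock = start / f.keysPerBlock;
        String[] block = f.load(startBlock);
        String row = rowOf(block[start % f.keysPerBlock]);
        // Binary search the index for the first later block whose first key belongs to a later row.
        int lo = startBlock + 1, hi = f.firstKeys.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (rowOf(f.firstKeys.get(mid)).compareTo(row) > 0) hi = mid; else lo = mid + 1;
        }
        // The current row ends inside block lo - 1, or exactly at its end.
        int lastBlock = lo - 1;
        int from = 0;
        if (lastBlock == startBlock) from = start % f.keysPerBlock + 1;
        else block = f.load(lastBlock);
        for (int i = from; i < block.length; i++) {
            if (!rowOf(block[i]).equals(row)) return lastBlock * f.keysPerBlock + i;
        }
        return Math.min(lo * f.keysPerBlock, f.totalKeys);
    }

    public static void main(String[] args) {
        BlockedFile a = build(1000, 3, 10);
        int p1 = linearSkip(a, 0);
        BlockedFile b = build(1000, 3, 10);
        int p2 = seekSkip(b, 0);
        System.out.println("linear: position " + p1 + ", blocks read " + a.blocksRead);
        System.out.println("seek:   position " + p2 + ", blocks read " + b.blocksRead);
    }
}
```

Both methods land on the same key, but the index-based version only reads the starting block and the block where the row ends. When the row ends inside the starting block, it reads nothing extra, which is the "constant time when there is nothing to skip" property the description asks for.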

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-1517) Implement inexpensive seek operations in HFile

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751310#action_12751310 ] 

Jonathan Gray commented on HBASE-1517:
--------------------------------------

Let's leave it in for now.  This may not be that hard, and there are a bunch of simple but useful optimizations that this will unlock.



[jira] Updated: (HBASE-1517) Implement inexpensive seek operations in HFile

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-1517:
---------------------------------

    Fix Version/s:     (was: 0.20.1)
                   0.21.0

I spent some time looking into this. I can see a hack or two that could get us some efficiencies, but this should really be part of a larger set of optimizations: seeking, skipping, and early-outing whenever possible.

Bumping to 0.21.



[jira] Commented: (HBASE-1517) Implement inexpensive seek operations in HFile

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751307#action_12751307 ] 

stack commented on HBASE-1517:
------------------------------

I don't think this belongs in 0.20.1.  Move it out?
