Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2015/09/26 19:30:04 UTC
[jira] [Comment Edited] (HBASE-14283) Reverse scan doesn’t work with HFile inline index/bloom blocks

    [ https://issues.apache.org/jira/browse/HBASE-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909362#comment-14909362 ] 

Andrew Purtell edited comment on HBASE-14283 at 9/26/15 5:30 PM:
-----------------------------------------------------------------

When we were designing tags we accepted some limitations of HFile that later proved problematic. Specifically, we couldn't vary cell encoding on a block-by-block basis, so even if no cells in a file use tags, we'd bloat each cell with a short for the tag length. Later we introduced a whole-file optimization for this issue, but clearly we'd have more opportunities to employ it if we could vary encoding strategy on a block-by-block basis. I thought about introducing an extensible protobuf-encoded block header. It didn't make sense for the tag serialization issue alone (we would trade one type of bloat for another, since those additional header bytes end up in the block cache), but if there are multiple use cases lined up for it, a new extensible protobuf-encoded 'block header' could be worthwhile. It would also make future block-level changes less likely to be incompatible changes.

Does the introduction of something like that require a major or minor version bump? I think so. I'd like to see us be more like semver with HFile versioning, and if we're on the same page about that, this is a major version bump because earlier versioned readers won't be able to handle the change. That is assuming 


was (Author: apurtell):
When we were designing tags we accepted some limitations of HFile that later were problematic, specifically the issue of file and Cell serialized form bloat produced by adding the tag length, even if universally 0 for the file. We introduced a whole file optimization for this issue but clearly we'd have more opportunities to employ it if we could vary encoding strategy on a block-by-block basis. I thought about introducing an extensible pbufed block header. It didn't make sense for the tag serialization issue - we would trade one type of bloat for another, those additional header bytes will end up in the block cache - but if there are multiple use cases for it lined up, a new extensible pbufed 'block header' could be worthwhile. Would make future block level changes less likely to be incompatible changes too. 

Does the introduction of something like that require a major or minor version bump? I think so. I'd like to see us be more like semver with HFile versioning, and if we're on the same page about that, this is a major version bump because earlier versioned readers won't be able to handle the change. That is assuming 

> Reverse scan doesn’t work with HFile inline index/bloom blocks
> --------------------------------------------------------------
>
>                 Key: HBASE-14283
>                 URL: https://issues.apache.org/jira/browse/HBASE-14283
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ben Lau
>            Assignee: Ben Lau
>         Attachments: HBASE-14283-v2.patch, HBASE-14283.patch, hfile-seek-before.patch
>
>
> Reverse scans do not work if an HFile contains inline bloom blocks or leaf-level index blocks.  The reason is that the seekBefore() call calculates the previous data block’s size by assuming data blocks are contiguous, which is not the case in HFile V2 and beyond.
> Attached is a first cut patch (targeting bcef28eefaf192b0ad48c8011f98b8e944340da5 on trunk) which includes:
> (1) a unit test which exposes the bug and demonstrates failures for both inline bloom blocks and inline index blocks
> (2) a proposed fix for inline index blocks that does not require an HFile version change, but is only performant for 1- and 2-level indexes and not 3+.  3+ requires an HFile format update for optimal performance.
> This patch does not fix the bloom filter blocks bug.  But the fix should be similar to the case of inline index blocks.  The reason I haven’t made the change yet is I want to confirm that you guys would be fine with me revising the HFile.Reader interface.
> Specifically, these 2 functions (getGeneralBloomFilterMetadata and getDeleteBloomFilterMetadata) need to return the BloomFilter.  Right now the HFileReader class doesn’t hold a reference to the bloom filters (and hence their indices); it only constructs the IO streams, so it has no way to know where the bloom blocks are in the HFile.  The HFile.Reader bloom method comments state that they “know nothing about how that metadata is structured”, but I do not know if that is a requirement of the abstraction (why?) or just an incidental current property.
> We would like to do 3 things with community approval:
> (1) Update the HFile.Reader interface and implementation to contain and return BloomFilters directly rather than unstructured IO streams
> (2) Merge the fixes for index blocks and bloom blocks into open source
> (3) Create a new Jira ticket for open source HBase to add a ‘prevBlockSize’ field in the block header in the next HFile version, so that seekBefore() calls can not only be correct but performant in all cases.
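To illustrate the bug described above and the proposed 'prevBlockSize' field, here is a minimal sketch in Java. It is not HBase code: the class SeekBeforeSketch, the Block type, the layout() helper and both lookup methods are hypothetical names invented for this example, and the semantics assumed for 'prevBlockSize' (each block header records the size of the block physically preceding it) are one plausible reading of the proposal, not something the ticket confirms.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of an HFile block layout; NOT the real HBase classes.
public class SeekBeforeSketch {
    public enum BlockType { DATA, LEAF_INDEX, BLOOM_CHUNK }

    public static final class Block {
        public final BlockType type;
        public final long offset;
        public final int size;
        // Assumed meaning: size of the block physically preceding this one.
        public final int prevBlockSize;

        Block(BlockType type, long offset, int size, int prevBlockSize) {
            this.type = type;
            this.offset = offset;
            this.size = size;
            this.prevBlockSize = prevBlockSize;
        }
    }

    /** Lays blocks out back to back and indexes them by file offset. */
    public static Map<Long, Block> layout(BlockType[] types, int[] sizes) {
        Map<Long, Block> byOffset = new HashMap<>();
        long offset = 0;
        int prevSize = 0;
        for (int i = 0; i < types.length; i++) {
            byOffset.put(offset, new Block(types[i], offset, sizes[i], prevSize));
            offset += sizes[i];
            prevSize = sizes[i];
        }
        return byOffset;
    }

    /** Flawed estimate: assumes the previous data block ends where the current one starts. */
    public static long naivePrevDataBlockSize(long prevDataOffset, long curDataOffset) {
        return curDataOffset - prevDataOffset;
    }

    /** Walks backwards via prevBlockSize, skipping inline (non-data) blocks. */
    public static Block previousDataBlock(Map<Long, Block> file, long curOffset) {
        Block b = file.get(curOffset);
        while (b != null && b.offset > 0) {
            b = file.get(b.offset - b.prevBlockSize);
            if (b != null && b.type == BlockType.DATA) {
                return b;
            }
        }
        return null; // no earlier data block
    }

    public static void main(String[] args) {
        // DATA block, then an inline leaf index block, then another DATA block.
        Map<Long, Block> file = layout(
            new BlockType[] { BlockType.DATA, BlockType.LEAF_INDEX, BlockType.DATA },
            new int[] { 100, 40, 120 });
        System.out.println("naive size: " + naivePrevDataBlockSize(0L, 140L));
        System.out.println("actual size: " + previousDataBlock(file, 140L).size);
    }
}
```

In this toy layout the naive subtraction counts the inline index block as part of the previous data block, whereas the backward walk reads each header in turn and stops at the first data block it finds. The walk costs one header read per intervening inline block, which is the kind of bounded work a back pointer in the header would allow without consulting the index.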



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)