Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2013/08/06 04:47:48 UTC

[jira] [Comment Edited] (HBASE-9127) HFileContext

    [ https://issues.apache.org/jira/browse/HBASE-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729854#comment-13729854 ] 

Andrew Purtell edited comment on HBASE-9127 at 8/6/13 2:47 AM:
---------------------------------------------------------------

Attached are three patches as a thought experiment. They are based on 0.94, but the same changes would apply similarly to later versions.

0001 - Introduces HFileContext and moves CacheConfig into it. 

0002 - Moves compression algorithm selection into HFileContext.

0003 - Moves in-cache and on-disk data block encoder selection into HFileContext. This one is not tested, so a few unit tests may need more tweaks.

Next steps could be: 

- Move BloomType into HFileContext

- Move whether to persist memstoreTS into HFileContext

- Test if the additional indirection has a performance impact.

- Consider how much more of the HFileContext state can be made final/immutable to facilitate inlining. This matters wherever a reader wants to reconfigure based on the file trailer or other metadata discovered when opening the HFile.

- CacheConfig could be fully rolled into HFileContext.

- In the HFile V3 work we have a patch that, as in the existing memstoreTS case, applies a whole-file optimization when no tags will be stored (just as when memstoreTS would be 0 for all KVs in an HFile). This breaks encapsulation in the Store and Compactor. Moving this decision into HFileContext (at some future time) would fix that.

- HBASE-7544 could be reworked to use this instead of sprinkling additional method parameters throughout Store and HFile.
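
To make the shape of the idea concrete, here is a minimal sketch of what such a context object might look like. It is not the class from the attached patches: the class name HFileContextSketch, the stand-in enums, the field names, and the builder methods are all hypothetical, chosen only to illustrate rolling compression, block encoding, bloom type, and the memstoreTS flag into one immutable object.

```java
// Hypothetical sketch only; not the HFileContext from the attached patches.
final class HFileContextSketch {
    // Stand-in enums so the sketch is self-contained; the real HBase types differ.
    enum Compression { NONE, GZ, LZO, SNAPPY }
    enum BlockEncoding { NONE, PREFIX, DIFF, FAST_DIFF }
    enum Bloom { NONE, ROW, ROWCOL }

    // All state is final, so a context can be shared between Store and HFile code.
    private final Compression compression;
    private final BlockEncoding encodingOnDisk;
    private final BlockEncoding encodingInCache;
    private final Bloom bloomType;
    private final boolean includesMemstoreTS;

    private HFileContextSketch(Builder b) {
        this.compression = b.compression;
        this.encodingOnDisk = b.encodingOnDisk;
        this.encodingInCache = b.encodingInCache;
        this.bloomType = b.bloomType;
        this.includesMemstoreTS = b.includesMemstoreTS;
    }

    Compression getCompression() { return compression; }
    BlockEncoding getEncodingOnDisk() { return encodingOnDisk; }
    BlockEncoding getEncodingInCache() { return encodingInCache; }
    Bloom getBloomType() { return bloomType; }
    boolean includesMemstoreTS() { return includesMemstoreTS; }

    // Copy the current settings into a builder, e.g. so a reader can derive
    // a new context after inspecting the file trailer.
    Builder toBuilder() {
        Builder b = new Builder();
        b.compression = compression;
        b.encodingOnDisk = encodingOnDisk;
        b.encodingInCache = encodingInCache;
        b.bloomType = bloomType;
        b.includesMemstoreTS = includesMemstoreTS;
        return b;
    }

    static final class Builder {
        // Defaults here are arbitrary choices for the sketch.
        private Compression compression = Compression.NONE;
        private BlockEncoding encodingOnDisk = BlockEncoding.NONE;
        private BlockEncoding encodingInCache = BlockEncoding.NONE;
        private Bloom bloomType = Bloom.NONE;
        private boolean includesMemstoreTS = true;

        Builder withCompression(Compression c) { this.compression = c; return this; }
        Builder withEncodingOnDisk(BlockEncoding e) { this.encodingOnDisk = e; return this; }
        Builder withEncodingInCache(BlockEncoding e) { this.encodingInCache = e; return this; }
        Builder withBloomType(Bloom t) { this.bloomType = t; return this; }
        Builder withIncludesMemstoreTS(boolean v) { this.includesMemstoreTS = v; return this; }

        HFileContextSketch build() { return new HFileContextSketch(this); }
    }
}
```

With something like this, a writer or reader factory could take a single context argument instead of separate compression, encoder, and bloom parameters, and a reader that discovers different settings in the trailer could call toBuilder(), override the relevant fields, and build a fresh context rather than mutating shared state.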
                
      was (Author: apurtell):
    Attached are three patches as a thought experiment. They are based on 0.94, but the same changes would apply similarly to later versions.

0001 - Introduces HFileContext and moves CacheConfig into it. 

0002 - Moves compression algorithm selection into HFileContext.

0003 - Moves in-cache and on-disk data block encoder selection into HFileContext. This one is not tested, so a few unit tests may need more tweaks.

Next steps could be: 

- Move BloomType into HFileContext

- Move the MVCC read point into HFileContext

- Test if the additional indirection has a performance impact.

- Consider how much more of the HFileContext state can be made final/immutable to facilitate inlining. This matters wherever a reader wants to reconfigure based on the file trailer or other metadata discovered when opening the HFile.

- CacheConfig could be fully rolled into HFileContext.

- In the HFile V3 work we have a patch that, as in the existing memstoreTS case, applies a whole-file optimization when no tags will be stored (just as when memstoreTS would be 0 for all KVs in an HFile). This breaks encapsulation in the Store and Compactor. Moving this decision into HFileContext (at some future time) would fix that.

- HBASE-7544 could be reworked to use this instead of sprinkling additional method parameters throughout Store and HFile.
                  
> HFileContext
> ------------
>
>                 Key: HBASE-9127
>                 URL: https://issues.apache.org/jira/browse/HBASE-9127
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>         Attachments: 0001-Introduce-HFileContext.patch, 0002-Move-compression-algorithm-into-HFileContext.patch, 0003-Move-data-block-encoder-into-HFileContext.patch
>
>
> We can roll up at least some of the state we are leaking between the Store layer and the HFile layer by introducing an IO context to pass between the two. This idea has come up in other discussions. Here I am calling it 'HFileContext' because the particulars concern how to configure HFile readers and writers. This will be easier to maintain than assorted method parameters and (duplicated) instance variables sprinkled about, and will make adding or modifying persistence features easier and less disruptive.
