Posted to dev@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2009/07/09 06:06:15 UTC

[jira] Updated: (HBASE-68) [hbase] HStoreFiles needlessly store the column family name in every entry

     [ https://issues.apache.org/jira/browse/HBASE-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-68:
-------------------------------

    Fix Version/s:     (was: 0.20.0)
                   0.21.0

Debates ensued on IRC.  Agreed to punt for now.

One new idea that came up, and that we might explore, is storing a short code in place of the entire family-name string.  The client could rebuild the name by looking at the HTD (HTableDescriptor), which would contain the mapping from code -> family name, or we could send along a small header at the beginning of a Result.
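
To make the code idea concrete, here is a rough sketch.  It is not HBase code: the class FamilyCodes and its methods are hypothetical names made up for illustration, and it assumes a one-byte code per family.  In the scheme above, the table it builds is what would be kept in the HTD or sent in the header of a Result.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: maps each column family name to a one-byte code. */
public class FamilyCodes {
    private final Map<String, Byte> codeByName = new HashMap<>();
    private final Map<Byte, String> nameByCode = new HashMap<>();

    /** Assigns the next free code to a family, or returns the code it already has. */
    public byte register(String family) {
        Byte existing = codeByName.get(family);
        if (existing != null) {
            return existing;
        }
        if (codeByName.size() >= 256) {
            throw new IllegalStateException("more families than one byte can encode");
        }
        // Codes above 127 wrap to negative byte values but remain unique.
        byte code = (byte) codeByName.size();
        codeByName.put(family, code);
        nameByCode.put(code, family);
        return code;
    }

    /** What the writer would store in each entry instead of the family name. */
    public byte encode(String family) {
        Byte code = codeByName.get(family);
        if (code == null) {
            throw new IllegalArgumentException("unknown family: " + family);
        }
        return code;
    }

    /** What the client would do to rebuild the family name from the mapping. */
    public String decode(byte code) {
        String family = nameByCode.get(code);
        if (family == null) {
            throw new IllegalArgumentException("unknown code: " + code);
        }
        return family;
    }

    /** Bytes saved per entry compared with storing the UTF-8 family name. */
    public int bytesSavedPerCell(String family) {
        return family.getBytes(StandardCharsets.UTF_8).length - 1;
    }

    public static void main(String[] args) {
        FamilyCodes codes = new FamilyCodes();
        codes.register("info");
        codes.register("anchor");
        byte stored = codes.encode("anchor");
        System.out.println(stored + " -> " + codes.decode(stored));
        System.out.println("bytes saved per cell: " + codes.bytesSavedPerCell("anchor"));
    }
}
```

The saving per entry is the family name length minus the code size, so it only pays off for names longer than the code, and the mapping has to stay stable for as long as any stored entry refers to it.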

> [hbase] HStoreFiles needlessly store the column family name in every entry
> --------------------------------------------------------------------------
>
>                 Key: HBASE-68
>                 URL: https://issues.apache.org/jira/browse/HBASE-68
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Bryan Duxbury
>            Priority: Minor
>             Fix For: 0.21.0
>
>
> Today, HStoreFiles keep the entire serialized HStoreKey object around for every cell in the HStore. Since HStores are 1-1 with column families, this is unnecessary - you can always determine the column family by looking at the HStore the file belongs to. (This information would presumably come from the file name or a header section.) This means that we could remove the column family part of the HStoreKeys we put into the HStoreFile, reducing the size of the data stored. This would save space by removing redundant data, and could also improve speed, since there is less data to scan in memory and less to transfer over the network.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.