Posted to user@accumulo.apache.org by David Medinets <da...@gmail.com> on 2013/12/04 05:57:14 UTC

HBase rowkey design guidelines

http://hbase.apache.org/book/rowkey.design.html - unless I am
misunderstanding, much of the advice given for HBase simply doesn't apply to
Accumulo. For example: "Try to keep the ColumnFamily names as small as
possible, preferably one character (e.g. "d" for data/default)."

Re: HBase rowkey design guidelines

Posted by Ted Yu <yu...@gmail.com>.
For HFile v3, please take a look at:
HBASE-9045 Dictionary based tag compression

Cheers

On Dec 5, 2013, at 12:23 AM, Josh Elser <jo...@gmail.com> wrote:

> They have a couple of different encoding strategies in HFile v2 that are similar.
> 
> https://issues.apache.org/jira/browse/HBASE-4218
> https://issues.apache.org/jira/browse/HBASE-4676
> 
> Not sure if there are any new slated approaches for HFile v3.
> 
> On 12/4/13, 12:28 AM, John Vines wrote:
>> Also, I'm not sure if HBase has the encoding techniques that we utilize
>> in our RFile
>> 
>> On Wed, Dec 4, 2013 at 12:19 AM, Mike Drob <md...@mdrob.com> wrote:
>> 
>>    Well, yes and no.
>> 
>>    Smaller keys still mean less network traffic, potentially less IO,
>>    and maybe faster operations if you're trying to do application
>>    logic. Using data or default or just d probably doesn't matter in
>>    the long term (although there are certainly cases where it might).
>> 
>>    On Dec 3, 2013 11:57 PM, "David Medinets" <da...@gmail.com> wrote:
>> 
>>        http://hbase.apache.org/book/rowkey.design.html - unless I am
>>        misunderstanding much of the advice given for HBase simply
>>        doesn't apply to Accumulo. For example "Try to keep the
>>        ColumnFamily names as small as possible, preferably one
>>        character (e.g. "d" for data/default)."
>> 
>> 

Re: HBase rowkey design guidelines

Posted by Josh Elser <jo...@gmail.com>.
They have a couple of different encoding strategies in HFile v2 that are 
similar.

https://issues.apache.org/jira/browse/HBASE-4218
https://issues.apache.org/jira/browse/HBASE-4676

Not sure if there are any new slated approaches for HFile v3.

On 12/4/13, 12:28 AM, John Vines wrote:
> Also, I'm not sure if HBase has the encoding techniques that we utilize
> in our RFile
>
> On Wed, Dec 4, 2013 at 12:19 AM, Mike Drob <md...@mdrob.com> wrote:
>
>     Well, yes and no.
>
>     Smaller keys still mean less network traffic, potentially less IO,
>     and maybe faster operations if you're trying to do application
>     logic. Using data or default or just d probably doesn't matter in
>     the long term (although there are certainly cases where it might).
>
>     On Dec 3, 2013 11:57 PM, "David Medinets" <da...@gmail.com> wrote:
>
>         http://hbase.apache.org/book/rowkey.design.html - unless I am
>         misunderstanding much of the advice given for HBase simply
>         doesn't apply to Accumulo. For example "Try to keep the
>         ColumnFamily names as small as possible, preferably one
>         character (e.g. "d" for data/default)."
>
>

Re: HBase rowkey design guidelines

Posted by John Vines <vi...@apache.org>.
Also, I'm not sure if HBase has the encoding techniques that we utilize in
our RFile.
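
As a rough illustration of why key encoding changes the picture, here is a
minimal Python sketch. It is a simplified model and not RFile's actual on-disk
format: the one-byte-per-key flag, the helper names (plain_size,
relative_size, make_keys) and the sample keys are all assumptions made up for
this example. It compares the bytes needed for 1000 sorted keys when a field
that equals the same field in the previous key is not written out again.

```python
# Simplified illustration only; this is NOT the real RFile format.
# A key is (row, column_family, column_qualifier), each a bytes value.

def plain_size(keys):
    """Bytes needed if every field of every key is written in full."""
    return sum(len(field) for key in keys for field in key)

def relative_size(keys):
    """Bytes needed if each key stores one flag byte (assumed) plus only
    the fields that differ from the previous key."""
    total = 0
    prev = (None, None, None)
    for key in keys:
        total += 1  # hypothetical flag byte: which fields repeat
        for field, prev_field in zip(key, prev):
            if field != prev_field:
                total += len(field)
        prev = key
    return total

def make_keys(family, n=1000):
    """n sorted keys with distinct rows, one shared family and qualifier."""
    return [(f"row{i:06d}".encode(), family, b"q") for i in range(n)]

short_keys = make_keys(b"d")
long_keys = make_keys(b"default")

# Extra bytes paid for the longer family name under each scheme.
plain_extra = plain_size(long_keys) - plain_size(short_keys)
relative_extra = relative_size(long_keys) - relative_size(short_keys)
print(plain_extra, relative_extra)
```

With these made-up keys, the longer family name costs 6000 extra bytes when
every key is written in full, but only 6 extra bytes in the relative scheme,
because the family is written once and then marked as repeated.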

On Wed, Dec 4, 2013 at 12:19 AM, Mike Drob <md...@mdrob.com> wrote:

> Well, yes and no.
>
> Smaller keys still mean less network traffic, potentially less IO, and
> maybe faster operations if you're trying to do application logic. Using
> data or default or just d probably doesn't matter in the long term
> (although there are certainly cases where it might).
>
> On Dec 3, 2013 11:57 PM, "David Medinets" <da...@gmail.com> wrote:
>
>> http://hbase.apache.org/book/rowkey.design.html - unless I am
>> misunderstanding much of the advice given for HBase simply doesn't apply to
>> Accumulo. For example "Try to keep the ColumnFamily names as small as
>> possible, preferably one character (e.g. "d" for data/default)."
>>
>

Re: HBase rowkey design guidelines

Posted by Mike Drob <md...@mdrob.com>.
Well, yes and no.

Smaller keys still mean less network traffic, potentially less IO, and
maybe faster operations if you're trying to do application logic. Using
"data" or "default" or just "d" probably doesn't matter in the long term
(although there are certainly cases where it might).

On Dec 3, 2013 11:57 PM, "David Medinets" <da...@gmail.com> wrote:

> http://hbase.apache.org/book/rowkey.design.html - unless I am
> misunderstanding much of the advice given for HBase simply doesn't apply to
> Accumulo. For example "Try to keep the ColumnFamily names as small as
> possible, preferably one character (e.g. "d" for data/default)."
>