Posted to user@hbase.apache.org by Eric Czech <er...@nextbigsound.com> on 2012/08/07 15:35:16 UTC

Ideal row size

Hello everyone,

I'm trying to store many small values in indexes created via MR jobs,
and I was hoping to get some advice on how to structure my rows.
Essentially, I have complete control over how large the rows should be
as the values are small, consistent in size, and can be grouped
together in any way I'd like.  My question then is, what's the ideal
size for a row in HBase, in bytes?  I'm trying to determine how to
group my values together into larger values, and I think having a
target size to hit would make that a lot easier.

I know fewer rows are generally better to avoid the repetitive storage
of keys, column families, and qualifiers, provided that those rows still suit a
given application, but I'm not sure at what point the scale will tip
in the other direction and I'll start to see undue memory pressure or
compaction issues with rows that are too large.

Thanks in advance!
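
To make the "repetitive storage of keys" concern concrete, here is a rough sketch of how much of each stored cell is key overhead. It assumes the classic HBase KeyValue layout (key length, value length, row length, family length, timestamp, and key type fields around the row, family, qualifier, and value) and ignores compression, block encoding, and tags. The row key, family, qualifier, and value sizes are made up for illustration.

```python
# Fixed per-cell bytes in the classic KeyValue layout (an approximation):
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + key type (1).
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Approximate stored size of one cell, key fields included."""
    return FIXED_OVERHEAD + len(row) + len(family) + len(qualifier) + len(value)


def overhead_ratio(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> float:
    """Fraction of the stored cell that is not the value itself."""
    total = cell_size(row, family, qualifier, value)
    return (total - len(value)) / total


# Hypothetical names and sizes, chosen only for illustration.
ROW, FAMILY, QUALIFIER = b"index-000001", b"d", b"v"

small = overhead_ratio(ROW, FAMILY, QUALIFIER, b"x" * 16)          # one tiny value per cell
large = overhead_ratio(ROW, FAMILY, QUALIFIER, b"x" * 48 * 1024)   # many values packed together

print(f"16-byte value: {small:.0%} overhead")
print(f"48KB value:    {large:.3%} overhead")
```

Under these assumptions a 16-byte value stored alone is mostly key, while the same key spread over a 48KB value is negligible, which is why grouping small values pays off.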

Re: Ideal row size

Posted by Eric Czech <er...@nextbigsound.com>.
That's exactly the sort of target I was looking for -- thanks for the help!

I'll probably shoot for something close to 48KB so I don't exceed that
block size.

On Tue, Aug 7, 2012 at 2:26 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
> Hi Eric,
>
> An ideal cell size would probably be the size of a block, so 64KB
> including the keys. Having bigger cells would inflate the size of your
> blocks but then you'd be outside of the normal HBase settings.
>
> That, and do some experiments.
>
> J-D

Re: Ideal row size

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Hi Eric,

An ideal cell size would probably be the size of a block, so 64KB
including the keys. Having bigger cells would inflate the size of your
blocks but then you'd be outside of the normal HBase settings.

That, and do some experiments.

J-D
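
One way to act on this advice is to work out how many fixed-size values fit in a single cell without pushing the cell, keys included, past the block size. The sketch below is only an illustration, not anything from HBase itself: the function names, the 16-byte value size, and the 34-byte key size are assumptions, and the 64KB figure is the block size mentioned above.

```python
from typing import Iterator, List

BLOCK_SIZE = 64 * 1024  # block size mentioned in the reply, in bytes


def values_per_cell(value_size: int, key_size: int, target: int = BLOCK_SIZE) -> int:
    """Number of fixed-size values that fit in one cell of at most `target` bytes."""
    if value_size <= 0:
        raise ValueError("value_size must be positive")
    budget = target - key_size  # bytes left for the packed values
    if budget < value_size:
        raise ValueError("target is too small to hold even one value")
    return budget // value_size


def group_values(values: List[bytes], value_size: int, key_size: int,
                 target: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Concatenate consecutive values into blobs, each sized to fit the target."""
    per_cell = values_per_cell(value_size, key_size, target)
    for start in range(0, len(values), per_cell):
        yield b"".join(values[start:start + per_cell])


# Hypothetical workload: 10,000 values of 16 bytes, about 34 bytes of key per cell,
# packed against a more conservative 48KB target.
values = [i.to_bytes(16, "big") for i in range(10_000)]
cells = list(group_values(values, value_size=16, key_size=34, target=48 * 1024))
print(len(cells), [len(c) for c in cells])
```

Picking a target somewhat below the block size leaves headroom for key growth, but the right number still depends on the access pattern, so it is worth measuring with real data.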