Posted to user@hbase.apache.org by Doug Meil <do...@explorysmedical.com> on 2014/04/11 23:14:58 UTC

HFile size writeup in HBase Blog

Hey folks,

Stack published a writeup I did on the HBase blog on the effects of rowkey size, column-name size, CF compression, data block encoding and KV storage approach on HFile size.  For example, we tried large row keys vs. small row keys, Snappy vs. LZO vs. other codecs, prefix vs. fast-diff encoding, and a KV per column vs. a single KV per row.  We tried 'em all... and wrote it up.

http://blogs.apache.org/hbase/
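
For a rough feel of why rowkey size, column-name size and the KV storage approach matter, here is a back-of-the-envelope sketch. It is not taken from the writeup and reproduces none of its measurements. It assumes the plain KeyValue layout (no tags, no MVCC stamp) and ignores block encoding and compression, so it only estimates raw, pre-encoding bytes. The function names, the 40-byte rowkey and the ten 8-byte columns are hypothetical and chosen only for illustration.

```python
# Fixed per-KeyValue overhead in the plain KeyValue layout:
# key length (4) + value length (4) + row length (2)
# + family length (1) + timestamp (8) + key type (1) = 20 bytes.
KV_FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def kv_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Raw size of one KeyValue, before any encoding or compression."""
    return KV_FIXED_OVERHEAD + len(row) + len(family) + len(qualifier) + len(value)


def row_size_kv_per_column(row: bytes, family: bytes, columns: dict) -> int:
    """One KeyValue per column: rowkey and family repeat in every KeyValue."""
    return sum(kv_size(row, family, q, v) for q, v in columns.items())


def row_size_single_kv(row: bytes, family: bytes, qualifier: bytes, columns: dict) -> int:
    """All column values packed into one KeyValue.

    Assumption: values are simply concatenated; a real serialization
    format would add some overhead of its own.
    """
    packed = b"".join(columns.values())
    return kv_size(row, family, qualifier, packed)


if __name__ == "__main__":
    row = b"r" * 40                      # hypothetical 40-byte rowkey
    family = b"d"                        # 1-byte column family name
    columns = {b"column_%02d" % i: b"\x00" * 8 for i in range(10)}

    print(row_size_kv_per_column(row, family, columns))   # 780
    print(row_size_single_kv(row, family, b"x", columns)) # 142
```

In this made-up case the rowkey is repeated ten times in the KV-per-column layout, which accounts for most of the gap between the two numbers. Data block encoding and compression exist to shrink exactly that kind of repetition, so on-disk HFile sizes can look quite different from these raw estimates.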


Doug Meil
Chief Software Architect, Explorys
doug.meil@explorysmedical.com



Re: HFile size writeup in HBase Blog

Posted by Ted Yu <yu...@gmail.com>.
Looking forward to your next blog. 

Cheers

On Apr 12, 2014, at 5:08 AM, Doug Meil <do...@explorysmedical.com> wrote:

> 
> Thanks Ted!
> 
> I can add that to the to-do list.  Also have plans for read/write
> performance numbers too in a follow-up blog.

Re: HFile size writeup in HBase Blog

Posted by Doug Meil <do...@explorysmedical.com>.
Thanks Ted!

I can add that to the to-do list.  I also plan to cover read/write
performance numbers in a follow-up blog post.

On 4/11/14, 6:00 PM, "Ted Yu" <yu...@gmail.com> wrote:

>Nice writeup, Doug.
>
>Do you have plans to profile Prefix Tree data block encoding?
>
>Cheers


Re: HFile size writeup in HBase Blog

Posted by Ted Yu <yu...@gmail.com>.
Nice writeup, Doug.

Do you have plans to profile Prefix Tree data block encoding?

Cheers


On Fri, Apr 11, 2014 at 3:14 PM, Doug Meil <do...@explorysmedical.com> wrote:

> Hey folks,
>
> Stack published a writeup I did on the HBase blog on the effects of rowkey
> size, column-name size, CF compression, data block encoding and KV storage
> approach on HFile size.  For example, had large row keys vs. small row
> keys, used Snappy vs. LZO vs. etc., used prefix vs. fast-diff, used a KV
> per column vs. a single KV per row.  We tried 'em all... and wrote it up.
>
> http://blogs.apache.org/hbase/
>
>
> Doug Meil
> Chief Software Architect, Explorys
> doug.meil@explorysmedical.com