Posted to user@hbase.apache.org by Yves Langisch <yv...@langisch.ch> on 2011/04/14 22:54:07 UTC

Row key stored many times?

Hi,

On the opentsdb website [1] you can read the following:
---
The problem with HBase's implementation is that every single cell also stores the row key and a bunch of other redundant information. In the example above with 2 rows and 16 cells, the 13-byte row key is stored 16 times both on disk and in memory. This leads to several scalability problems, especially due to memory pressure inside Region Servers and the increased number of objects that HBase has to handle.
---

Is that true? Does that mean long row keys and many columns should be avoided?

Yves

[1] http://opentsdb.net/schema.html
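
The arithmetic behind the quoted passage can be sketched as follows. The three figures (13-byte row key, 2 rows, 16 cells) come from the quote; the "stored once per row" baseline is a hypothetical comparison point, not something HBase offers.

```python
# Figures taken from the OpenTSDB passage quoted above.
ROW_KEY_BYTES = 13
ROWS = 2
CELLS = 16  # total cells across both rows

# HBase stores a copy of the row key inside every cell.
stored_per_cell = CELLS * ROW_KEY_BYTES

# Hypothetical baseline: the row key stored only once per row.
stored_per_row = ROWS * ROW_KEY_BYTES

extra_bytes = stored_per_cell - stored_per_row
print(stored_per_cell, stored_per_row, extra_bytes)
```

This counts row key bytes only; every cell also repeats the column family, qualifier and timestamp.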

Re: Row key stored many times?

Posted by Ryan Rawson <ry...@gmail.com>.
Good question. I'd try to keep most row keys under 30 bytes, and
definitely avoid anything over 1000 bytes.
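
To put those thresholds in perspective, here is a rough sketch of how much of each cell the row key takes up. It assumes the classic KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type, value). The 1-byte family, 2-byte qualifier and 8-byte value are made-up defaults, and newer HBase versions add further per-cell fields, so treat the numbers as approximations.

```python
def keyvalue_size(row_len, family_len, qualifier_len, value_len):
    """Approximate serialized size of one cell, assuming the classic KeyValue layout."""
    # Key part: row length (2) + row + family length (1) + family
    #           + qualifier + timestamp (8) + key type (1)
    key_len = 2 + row_len + 1 + family_len + qualifier_len + 8 + 1
    # Each cell is prefixed by a 4-byte key length and a 4-byte value length.
    return 4 + 4 + key_len + value_len


def row_key_share(row_len, family_len=1, qualifier_len=2, value_len=8):
    """Fraction of the cell's bytes taken up by the row key (illustrative defaults)."""
    return row_len / keyvalue_size(row_len, family_len, qualifier_len, value_len)


# Candidate row key lengths from this thread.
for row_len in (10, 30, 100, 1000):
    size = keyvalue_size(row_len, 1, 2, 8)
    print(f"{row_len:>5}-byte row key: {size} bytes per cell, "
          f"{row_key_share(row_len):.0%} of it is row key")
```

With small values like these, the row key quickly dominates the cell as it grows, which is the reason for keeping it short.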

On Thu, Apr 14, 2011 at 2:22 PM, David Schnepper <da...@yahoo-inc.com> wrote:
> On 14/Apr/2011 13:55, Ryan Rawson wrote:
>>
>> Yes, the row key is stored with every column.
>>
>> Avoid ridiculously long row keys :-)  Use compression.
>
> So how long is "ridiculously long"? 10 bytes? 100? 1000? 10**N?
>

Re: Row key stored many times?

Posted by David Schnepper <da...@yahoo-inc.com>.
On 14/Apr/2011 13:55, Ryan Rawson wrote:
> Yes, the row key is stored with every column.
>
> Avoid ridiculously long row keys :-)  Use compression.

So how long is "ridiculously long"? 10 bytes? 100? 1000? 10**N?



Re: Row key stored many times?

Posted by Ryan Rawson <ry...@gmail.com>.
Yes, the row key is stored with every column.

Avoid ridiculously long row keys :-)  Use compression.
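
A small sketch of why compression helps here: keys that repeat from cell to cell are highly redundant, so a block compressor shrinks them well. This uses Python's zlib as a stand-in for an HBase block codec; the row keys, family and qualifiers are made up for illustration, and it only shows the effect on serialized bytes, not on in-memory objects.

```python
import struct
import zlib

# Two made-up 13-byte row keys, 8 cells each (16 cells in total).
row_keys = [b"metric-host-1", b"metric-host-2"]
family = b"t"

# Simplified cell keys: row key + family + 2-byte qualifier.
# Real cells carry more fields; this keeps only the repeated parts.
cells = [
    row + family + struct.pack(">H", qualifier)
    for row in row_keys
    for qualifier in range(8)
]

raw = b"".join(cells)
compressed = zlib.compress(raw)

print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
```

The exact ratio depends on the codec and the data, so take the printed sizes as indicative only.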
