Posted to user@hbase.apache.org by innowireless TaeYun Kim <ta...@innowireless.co.kr> on 2014/08/05 04:24:00 UTC

What is in a HBase block index entry?

Hi,

This is a newbie question.

What is in a HBase block index entry?

My guess is that it's one of these:

1.      all key components: rowkey + column family + column qualifier +
timestamp

2.      all key components except for column family (since the index is in
an HFile that is part of the storage for a column family): rowkey + column
qualifier + timestamp

3.      rowkey only

I've tried to find the information, but the articles only say that it
contains a 'key'. For me as a newbie, this is confusing, since in a KeyValue
all the key components together comprise the 'key', while a rowkey is also a
'key'.


Thanks.

 


RE: What is in a HBase block index entry?

Posted by innowireless TaeYun Kim <ta...@innowireless.co.kr>.
Thank you Anoop.

Though it's a bit strange to include the CF in the index, since the whole block index is contained in an HFile for a specific CF, I'm sure there is a good reason (maybe for the performance of the comparison).
Anyway, it should be almost no issue, since the CF name should be short (mostly one byte).
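
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes one index entry per data block, a 64 KB block size, and that each entry's key carries the family bytes plus a 1-byte family length. It ignores any intermediate levels of a multi-level index. The class name, method names, and the 10 GB input figure are made up for illustration.

```java
// Hypothetical estimate only; none of these names are HBase APIs.
final class CfOverheadSketch {
    // Estimated number of data blocks, assuming one index entry per block.
    static long estimateBlocks(long storeFileBytes, int blockBytes) {
        return (storeFileBytes + blockBytes - 1) / blockBytes; // ceiling division
    }

    // Bytes the column family contributes across all index entries:
    // 1 byte for the family length plus the family name itself.
    static long familyOverheadBytes(long blocks, int familyLength) {
        return blocks * (1L + familyLength);
    }

    public static void main(String[] args) {
        long blocks = estimateBlocks(10L * 1024 * 1024 * 1024, 64 * 1024);
        long overhead = familyOverheadBytes(blocks, 1);
        System.out.println(blocks + " blocks, " + overhead + " bytes spent on the CF");
    }
}
```

Under these assumptions, a one-byte CF name costs two bytes per index entry, which stays small next to the rowkey and qualifier in most schemas.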

Thanks.

-----Original Message-----
From: Anoop John [mailto:anoop.hbase@gmail.com] 
Sent: Thursday, August 07, 2014 1:40 PM
To: user@hbase.apache.org
Subject: Re: What is in a HBase block index entry?

It will be the key of the KeyValue.  The key includes rk + cf + qualifier + ts + type.

So all of these are part of the key.  Your answer #1 is correct (but with the addition of type as well).  Hope this makes it clear for you.

-Anoop-

On Tue, Aug 5, 2014 at 9:43 AM, innowireless TaeYun Kim < taeyun.kim@innowireless.co.kr> wrote:

> So, is it safe to assume that there is no documentation for the exact
> content of the block index?
> I think that reading the source code should be the last resort, since
> one cannot be sure whether a given behavior is an implementation detail
> or a specification that can be relied upon.
> The exact content of the block index is important, since it affects the
> size of the index (let alone the query performance) and therefore the
> schema design.
>
> Or, could you please simply provide the exact answer to my original
> question (with a proper reference)?
>
> Thank you.
>
> -----Original Message-----
> From: Ted Yu [mailto:yuzhihong@gmail.com]
>  Sent: Tuesday, August 05, 2014 12:59 PM
> To: user@hbase.apache.org
> Subject: Re: What is in a HBase block index entry?
>
> I suggest you read the source code of the KeyValue class.
> e.g. you can start with this method:
>
>   public static long getKeyDataStructureSize(int rlength, int flength, int qlength) {
>
> Cheers
>
>
> On Mon, Aug 4, 2014 at 8:13 PM, innowireless TaeYun Kim < 
> taeyun.kim@innowireless.co.kr> wrote:
>
> > So, your answer is 1. in my list?
> > I don’t think so, since the column family information is not necessary.
> > Could you please point me to more direct information?
> >
> >
> > -----Original Message-----
> > From: Ted Yu [mailto:yuzhihong@gmail.com]
> > Sent: Tuesday, August 05, 2014 12:01 PM
> > To: user@hbase.apache.org
> > Subject: Re: What is in a HBase block index entry?
> >
> > Please see:
> > http://hbase.apache.org/book.html#keyvalue
> >
> >
> > On Mon, Aug 4, 2014 at 7:55 PM, innowireless TaeYun Kim < 
> > taeyun.kim@innowireless.co.kr> wrote:
> >
> > > Thank you for your reply.
> > >
> > > It only says 'Key'.
> > > That's what I'm confused about.
> > >
> > >
> > > -----Original Message-----
> > > From: Ted Yu [mailto:yuzhihong@gmail.com]
> > > Sent: Tuesday, August 05, 2014 11:45 AM
> > > To: user@hbase.apache.org
> > > Subject: Re: What is in a HBase block index entry?
> > >
> > > Have you read this?
> > > http://hbase.apache.org/book.html#d3593e20175
> > >
> > > Cheers
> > >
> > >
> > > On Mon, Aug 4, 2014 at 7:24 PM, innowireless TaeYun Kim < 
> > > taeyun.kim@innowireless.co.kr> wrote:
> > >
> > > > Hi,
> > > >
> > > > This is a newbie question.
> > > >
> > > >
> > > >
> > > > What is in a HBase block index entry?
> > > >
> > > >
> > > >
> > > > My guess is that it's one of these:
> > > >
> > > >
> > > >
> > > > 1.      all key components: rowkey + column family + column
> > > > qualifier + timestamp
> > > >
> > > > 2.      all key components except for column family (since the index
> > > > is in an HFile that is part of the storage for a column family):
> > > > rowkey + column qualifier + timestamp
> > > >
> > > > 3.      rowkey only
> > > >
> > > >
> > > >
> > > > I've tried to find the information, but the articles only say
> > > > that it contains a 'key'. For me as a newbie, this is confusing,
> > > > since in a KeyValue all the key components together comprise the
> > > > 'key', while a rowkey is also a 'key'.
> > > >
> > > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>


Re: What is in a HBase block index entry?

Posted by Anoop John <an...@gmail.com>.
It will be the key of the KeyValue.  The key includes
rk + cf + qualifier + ts + type.

So all of these are part of the key.  Your answer #1 is correct (but with the
addition of type as well).  Hope this makes it clear for you.
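
As a rough illustration of the key described above, the sketch below lays the five parts out in one byte array. The field widths (2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte type) and the Put type code of 4 follow one reading of the KeyValue class and should be checked against the HBase version in use. The names KeyLayoutSketch and buildKey are made up for this example and are not HBase APIs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not the HBase KeyValue class.
final class KeyLayoutSketch {
    // Assumed type code for a Put; verify against KeyValue.Type in your version.
    static final byte TYPE_PUT = 4;

    // Layout: rowLen(2) | row | famLen(1) | family | qualifier | ts(8) | type(1)
    static byte[] buildKey(byte[] row, byte[] family, byte[] qualifier,
                           long timestamp, byte type) {
        ByteBuffer buf = ByteBuffer.allocate(
            2 + row.length + 1 + family.length + qualifier.length + 8 + 1);
        buf.putShort((short) row.length);
        buf.put(row);
        buf.put((byte) family.length);
        buf.put(family);
        buf.put(qualifier); // no length prefix: it fills the space before ts + type
        buf.putLong(timestamp);
        buf.put(type);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] key = buildKey(
            "row1".getBytes(StandardCharsets.UTF_8),
            "f".getBytes(StandardCharsets.UTF_8),
            "col".getBytes(StandardCharsets.UTF_8),
            1407212640000L, // sample timestamp in milliseconds
            TYPE_PUT);
        System.out.println("key length = " + key.length);
    }
}
```

The point of the sketch is that the column family travels inside every key, so an index entry built from a key carries it too, even though the HFile belongs to a single family.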

-Anoop-


RE: What is in a HBase block index entry?

Posted by innowireless TaeYun Kim <ta...@innowireless.co.kr>.
So, is it safe to assume that there is no documentation for the exact content of the block index?
I think that reading the source code should be the last resort, since one cannot be sure whether a given behavior is an implementation detail or a specification that can be relied upon.
The exact content of the block index is important, since it affects the size of the index (let alone the query performance) and therefore the schema design.

Or, could you please simply provide the exact answer to my original question (with a proper reference)?

Thank you.



Re: What is in a HBase block index entry?

Posted by Ted Yu <yu...@gmail.com>.
I suggest you read the source code of the KeyValue class.
e.g. you can start with this method:

  public static long getKeyDataStructureSize(int rlength, int flength, int qlength) {
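
As a rough guide to what that method computes, it appears to add a fixed overhead to the three variable lengths. The stand-alone sketch below reproduces the arithmetic. The constant values are taken from one reading of KeyValue and should be verified against the source for your version. KeySizeSketch and keySize are made-up names, not HBase classes or methods.

```java
// Hypothetical stand-alone sketch; it mirrors, but does not call, HBase code.
final class KeySizeSketch {
    static final int ROW_LENGTH_SIZE = 2;    // short holding the row length
    static final int FAMILY_LENGTH_SIZE = 1; // byte holding the family length
    static final int TIMESTAMP_SIZE = 8;     // long
    static final int TYPE_SIZE = 1;          // byte

    // Fixed bytes present in every key, regardless of row/family/qualifier.
    static final int KEY_INFRASTRUCTURE_SIZE =
        ROW_LENGTH_SIZE + FAMILY_LENGTH_SIZE + TIMESTAMP_SIZE + TYPE_SIZE;

    // Size of the key portion for the given row, family and qualifier lengths.
    static long keySize(int rlength, int flength, int qlength) {
        return KEY_INFRASTRUCTURE_SIZE + rlength + flength + qlength;
    }

    public static void main(String[] args) {
        // 16-byte rowkey, 1-byte family, 4-byte qualifier
        System.out.println(keySize(16, 1, 4));
    }
}
```

The method takes a family length (flength) alongside the row and qualifier lengths, which is a hint that the family is counted as part of the key.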

Cheers



RE: What is in a HBase block index entry?

Posted by innowireless TaeYun Kim <ta...@innowireless.co.kr>.
So, your answer is 1. in my list?
I don’t think so, since the column family information is not necessary.
Could you please point me to more direct information?




Re: What is in a HBase block index entry?

Posted by Ted Yu <yu...@gmail.com>.
Please see:
http://hbase.apache.org/book.html#keyvalue



RE: What is in a HBase block index entry?

Posted by innowireless TaeYun Kim <ta...@innowireless.co.kr>.
Thank you for your reply.

It only says 'Key'.
That's what I'm confused about.




Re: What is in a HBase block index entry?

Posted by Ted Yu <yu...@gmail.com>.
Have you read this?
http://hbase.apache.org/book.html#d3593e20175

Cheers

