Posted to user@hbase.apache.org by Varun Sharma <va...@pinterest.com> on 2014/07/14 20:14:41 UTC

Question about HFile indices

Hi folks,

I am wondering why we have a tiered index in the HFile format. Is it
because the root index must fit in memory and hence must be limited in size?
Does the bound on the root index pretty much dictate the index tiers?

Thanks
Varun

Re: Question about HFile indices

Posted by Andrew Purtell <ap...@apache.org>.
We used to hold the entire HFile index in memory, although you could trade
off between the size of the index and the amount of IO required to find the
desired records. Facebook found the aggregate size of the indexes far too
large for their use case(s), so they included the tiered index in the HFile
V2 redesign. The design documents on HBASE-3857 might be worth a look.
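
To illustrate the trade-off described above, here is a minimal sketch of a
two-level lookup. It is not HBase's actual implementation: the function
names, the chunk_size parameter, and the row keys are all hypothetical, and
the "leaf" lists stand in for index blocks that a real reader would fetch
from disk or the block cache. Only the small root list would need to stay
in memory.

```python
from bisect import bisect_right


def build_tiered_index(block_first_keys, chunk_size):
    """Split a flat, sorted list of first keys (one per data block)
    into leaf chunks plus a root that indexes the chunks."""
    leaves = [block_first_keys[i:i + chunk_size]
              for i in range(0, len(block_first_keys), chunk_size)]
    # Hypothetical layout: the root keeps only the first key of each leaf.
    root = [leaf[0] for leaf in leaves]
    return root, leaves


def find_block(root, leaves, key):
    """Return the ordinal of the data block that may hold `key`,
    or None if `key` sorts before the first block."""
    leaf_no = bisect_right(root, key) - 1
    if leaf_no < 0:
        return None
    # In a real file this step would be one extra (cacheable) block read.
    leaf = leaves[leaf_no]
    pos = bisect_right(leaf, key) - 1
    chunk_size = len(leaves[0])  # every leaf except possibly the last is full
    return leaf_no * chunk_size + pos


# 100 data blocks, each starting every 10th row (made-up keys).
first_keys = ["row%04d" % i for i in range(0, 1000, 10)]
root, leaves = build_tiered_index(first_keys, chunk_size=10)
block = find_block(root, leaves, "row0255")
```

With these made-up numbers the resident root has 10 entries instead of 100,
at the cost of one additional index block read per lookup.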


On Mon, Jul 14, 2014 at 11:14 AM, Varun Sharma <va...@pinterest.com> wrote:

> Hi folks,
>
> I am wondering why we have a tiered index in the HFile format. Is it
> because the root index must fit in memory and hence must be limited in size?
> Does the bound on the root index pretty much dictate the index tiers?
>
> Thanks
> Varun
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: Question about HFile indices

Posted by Ted Yu <yu...@gmail.com>.
Varun:
Have you read the first section of
https://issues.apache.org/jira/secure/attachment/12487932/hfile_format_v2_design_draft_0.4.odt
?

Please see this as well:
https://issues.apache.org/jira/browse/HBASE-3857?focusedCommentId=13031489&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13031489

Cheers


On Mon, Jul 14, 2014 at 11:14 AM, Varun Sharma <va...@pinterest.com> wrote:

> Hi folks,
>
> I am wondering why we have a tiered index in the HFile format. Is it
> because the root index must fit in memory and hence must be limited in size?
> Does the bound on the root index pretty much dictate the index tiers?
>
> Thanks
> Varun
>
