Posted to user@hbase.apache.org by Lars Francke <la...@gmail.com> on 2018/07/12 11:30:33 UTC

On the number of column families

I've got a question on the number of column families. I've told everyone
for years that you shouldn't use more than maybe 3-10 column families.

Our book still says the following:
"HBase currently does not do well with anything above two or three column
families so keep the number of column families in your schema low.
Currently, *flushing* and compactions are done on a per Region basis so if
one column family is carrying the bulk of the data bringing on flushes, the
adjacent families will also be flushed even though the amount of data they
carry is small."

I'm wondering what the state of the art _really_ is today.

I know that flushing happens per CF. As far as I can tell though
compactions still happen for all stores in a region after a flush.

Related question there (there's always a good chance that I misread the
code): Wouldn't it make sense to make the compaction decision after a flush
also per Store?

But back to the original question. How many column families do you see
and/or use in production? And what are the remaining reasons against "a
lot"?

My list is the following:
- Splits happen per region, so small CFs will be split to be even smaller
- Each CF takes up a few resources even if they are not in use (no reads or
writes)
- If each CF is used then there is an increased total memory pressure which
will probably lead to early flushes which leads to smaller files which
leads to more compactions etc.
- As far as I can tell (but I'm not sure) when a single Store/CF answers
"yes" to the "needsCompaction()" call after a flush the whole region will
be compacted
- Each CF creates a directory + files per region -> might lead to lots of
small files
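
To make the last point concrete, here is a back-of-the-envelope sketch (my own arithmetic, not anything from the HBase code; the numbers are made up for illustration) of how the HFile count grows multiplicatively, because every CF is a separate Store with its own directory and files in each region:

```java
// Hypothetical illustration of the "lots of small files" concern:
// total HFiles ~= regions x column families x average files per store.
public class FileCountSketch {

    static long hfileCount(int regions, int columnFamilies, int avgFilesPerStore) {
        return (long) regions * columnFamilies * avgFilesPerStore;
    }

    public static void main(String[] args) {
        // 500 regions, 3 files per store on average: 1 CF vs 10 CFs.
        System.out.println(hfileCount(500, 1, 3));  // 1500 files
        System.out.println(hfileCount(500, 10, 3)); // 15000 files
    }
}
```

The same data volume spread over ten families yields ten times the file count, and each file is correspondingly smaller.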

I'd love to update the book when I have some answers.

Thank you!

Cheers,
Lars

Re: On the number of column families

Posted by Lars Francke <la...@gmail.com>.
I just want to bump this once more. Does anyone have any more input for me
before I update the documentation?


Re: On the number of column families

Posted by Lars Francke <la...@gmail.com>.
Thanks Andrew for taking the time to answer in detail!

I have to admit that I didn't check the code for this one but I remember
these JIRAs:
https://issues.apache.org/jira/browse/HBASE-3149: "Make flush decisions per
column family"
https://issues.apache.org/jira/browse/HBASE-10201: "Port 'Make flush
decisions per column family' to trunk" (in 1.1 and 2.0)
So I assume that that's one thing that's been solved.
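
For anyone else following along, here is a rough pure-Java sketch of what a per-CF flush decision along the lines of those two JIRAs looks like. This is my own paraphrase, not the actual HBase code: the method name, the lower-bound parameter, and the flush-everything fallback are all my assumptions about the mechanism.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a per-column-family flush decision (HBASE-3149 /
// HBASE-10201), not HBase source: when the region must flush, only stores
// whose memstore is above a lower bound are flushed; if no store qualifies,
// fall back to flushing every store so the region still frees memory.
public class FlushPolicySketch {

    static List<String> selectStoresToFlush(Map<String, Long> memstoreBytesByFamily,
                                            long lowerBoundBytes) {
        List<String> picked = new ArrayList<>();
        for (Map.Entry<String, Long> e : memstoreBytesByFamily.entrySet()) {
            if (e.getValue() >= lowerBoundBytes) {
                picked.add(e.getKey()); // big enough to be worth its own file
            }
        }
        // Nothing above the bound: flush the whole region as before.
        return picked.isEmpty() ? new ArrayList<>(memstoreBytesByFamily.keySet()) : picked;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("big", 64L * 1024 * 1024); // carries the bulk of the data
        sizes.put("tiny", 512L * 1024);      // nearly empty memstore
        // Only "big" is flushed; "tiny" no longer produces a tiny file.
        System.out.println(selectStoresToFlush(sizes, 16L * 1024 * 1024));
    }
}
```

Under the old behaviour described in the book, "tiny" would have been flushed alongside "big" every time.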

Good point about the open files, thanks! I didn't know about the differences
between "normal" HDFS and other filesystem implementations.

And thanks for the pointer to the Phoenix column encoding feature.



Re: On the number of column families

Posted by Andrew Purtell <ap...@apache.org>.
I think flushes are still done by region in all versions, so this can lead
to a lot of file IO depending on how well compaction can keep up. The CF is
the unit of IO scheduling granularity. For a single-row query that doesn't
select only a subset of CFs, each CF adds IO demand with attendant impact.
The flip side to this is that if you segregate subsets of data
that are separately accessed into a CF for each subset, and use queries
with high CF selectivity, then this optimizes IO to your query. This kind
of "manual" query planning is an intended benefit (and burden) of the
bigtable data model.
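
That CF-selectivity point can be sketched with a toy model (my own illustration, not the HBase client API or internals; with the real client this corresponds to scoping a Get or Scan with addFamily): a row read must consult every file of every selected store, so narrowing the query to one family cuts the IO proportionally.

```java
import java.util.List;
import java.util.Map;

// Toy model of CF selectivity: each selected store contributes all of its
// files to the read, so fewer selected families means fewer files touched.
public class SelectivitySketch {

    static int filesConsulted(Map<String, Integer> filesPerFamily,
                              List<String> selectedFamilies) {
        int total = 0;
        for (String family : selectedFamilies) {
            total += filesPerFamily.getOrDefault(family, 0);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> files = Map.of("meta", 3, "image", 4, "audit", 5);
        // Full-row read across all families vs a read scoped to one family.
        System.out.println(filesConsulted(files, List.of("meta", "image", "audit"))); // 12
        System.out.println(filesConsulted(files, List.of("meta")));                   // 3
    }
}
```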

Because HBase currently holds open a reference to all files in a store,
there is some modest linear increase in heap demand as the number of CFs
grows. HDFS does a good job of multiplexing the notion of open file over a
smaller set of OS level resources. Other filesystem implementations (like
the S3 family) do not, so if you have a root FS on S3 then as the number of
files in the aggregate goes up so does resource demand at the OS layer, and
you might have issues with hitting open file descriptor limits. There are
some JIRAs open that propose changes to this. (I filed them.)

If you use Phoenix, like we do, then if you turn on Phoenix's column
encoding feature, PHOENIX-1598 (
https://blogs.apache.org/phoenix/entry/column-mapping-and-immutable-data)
then no matter how many logical columns you have in your schema they are
mapped to a single CF at the HBase layer, which produces some space and
query time benefits (and has some tradeoffs). So where I work the ideal is
one CF, although because we have legacy tables it is not universally
applied.
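
The core idea of that encoding can be illustrated with a rough sketch (not Phoenix internals; the class and assignment scheme here are my own simplification of what PHOENIX-1598 describes): logical column names are mapped to small numeric qualifiers inside a single column family, so per-cell key overhead stops scaling with the number and length of logical column names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified illustration of Phoenix-style column mapping (PHOENIX-1598):
// each logical column gets a small, stable numeric qualifier in one CF.
public class ColumnEncodingSketch {

    private final Map<String, Integer> qualifierByColumn = new LinkedHashMap<>();

    // Assign the next small integer the first time a column is seen,
    // and return the same qualifier on every later lookup.
    int encode(String logicalColumn) {
        return qualifierByColumn.computeIfAbsent(logicalColumn,
                c -> qualifierByColumn.size() + 1);
    }

    public static void main(String[] args) {
        ColumnEncodingSketch enc = new ColumnEncodingSketch();
        System.out.println(enc.encode("customer_first_name")); // 1
        System.out.println(enc.encode("customer_last_name"));  // 2
        System.out.println(enc.encode("customer_first_name")); // 1 (stable)
    }
}
```

A long column name costs its full byte length in every cell key without encoding; with it, the cost is a couple of bytes regardless of the schema.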



-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: On the number of column families

Posted by Lars Francke <la...@gmail.com>.
Stack, sorry for the late answer. Took me a while to get to this.


On Thu, Aug 2, 2018 at 6:30 PM, Stack <st...@duboce.net> wrote:

> On Thu, Jul 12, 2018 at 4:31 AM Lars Francke <la...@gmail.com>
> wrote:
> >
> > I've got a question on the number of column families. I've told everyone
> > for years that you shouldn't use more than maybe 3-10 column families.
> >
> > Our book still says the following:
> > "HBase currently does not do well with anything above two or three column
> > families so keep the number of column families in your schema low.
> > Currently, *flushing* and compactions are done on a per Region basis so if
> > one column family is carrying the bulk of the data bringing on flushes, the
> > adjacent families will also be flushed even though the amount of data they
> > carry is small."
> >
> > I'm wondering what the state of the art _really_ is today.
> >
> > I know that flushing happens per CF.
>
> Yes.
>
>
> As far as I can tell though
> > compactions still happen for all stores in a region after a flush.
> >
> > Related question there (there's always a good chance that I misread the
> > code): Wouldn't it make sense to make the compaction decision after a flush
> > also per Store?
> >
>
> Yes.
>
> We compact a CF-at-a-time (looking in logs). CompactionRequest is CF
> scoped. You reckon we do the full Region, Lars? (I've not dug in.)


Looking at this
<https://github.com/apache/hbase/blob/rel/2.0.0/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L607-L614>
which calls this
<https://github.com/apache/hbase/blob/rel/2.0.0/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java#L297-L300>
which then gets to this line
<https://github.com/apache/hbase/blob/rel/2.0.0/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java#L350-L351>

So all CFs are added to the queue. I know that the file selection only
happens when the CompactSplit Thread is actually working on the request.
But I have to be honest that I don't know all the implications. To me it
looks like all CFs are added to the queue when any CF is flushed....
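
In rough pseudocode form, what I believe that path does is the following (this is my own model of the linked MemStoreFlusher/CompactSplit code, not the code itself; names and the file-count threshold are made up):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of compaction queueing after a flush: a request is
// queued for every store in the region, but file selection is deferred
// until the request is processed, so only stores that actually need a
// compaction end up doing any work.
public class CompactQueueSketch {

    static List<String> compactedAfterFlush(Map<String, Integer> fileCountByStore,
                                            int minFilesToCompact) {
        // Every store in the region is enqueued, regardless of which CF flushed.
        Queue<String> queue = new ArrayDeque<>(fileCountByStore.keySet());
        List<String> compacted = new ArrayList<>();
        while (!queue.isEmpty()) {
            String store = queue.poll();
            // Selection happens only now, per store, when the request runs.
            if (fileCountByStore.get(store) >= minFilesToCompact) {
                compacted.add(store);
            }
        }
        return compacted;
    }

    public static void main(String[] args) {
        Map<String, Integer> files = new LinkedHashMap<>();
        files.put("big", 7);  // enough files to warrant a compaction
        files.put("tiny", 1); // checked, but nothing to do
        System.out.println(compactedAfterFlush(files, 3)); // only "big"
    }
}
```

If that model is right, enqueueing every CF is cheap bookkeeping and the stores that don't need compacting fall out at selection time; whether that matches the real behaviour is exactly the open question.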


>
> > But back to the original question. How many column families do you see
> > and/or use in production? And what are the remaining reasons against "a
> > lot"?
> >
>
> I think the 3-10 is fine as a general recommendation. Perhaps caveat
> that more is also possible but queries should be CF scoped, outlining
> what happens on full-row fetches, especially if the character of the
> data in each CF varies radically; e.g. one CF has images, while another
> has metadata.
>
>
> > My list is the following:
> > - Splits happen per region, so small CFs will be split to be even smaller
> > - Each CF takes up a few resources even if they are not in use (no reads or
> > writes)
> > - If each CF is used then there is an increased total memory pressure which
> > will probably lead to early flushes which leads to smaller files which
> > leads to more compactions etc.
> > - As far as I can tell (but I'm not sure) when a single Store/CF answers
> > "yes" to the "needsCompaction()" call after a flush the whole region will
> > be compacted
>
> We need to answer this question. I spent five minutes looking in logs
> and compactions look to run per-CF. Looking in the code, I see that we
> generally do it by CF, but there is a top-level method that does all
> CFs... seemingly used from tests.
>
> What are you seeing, Lars?
>
> If we compact all CFs when a compaction runs, that's a bug.
>
> Thanks,
> S
>
>
>
> > - Each CF creates a directory + files per region -> might lead to lots of
> > small files
> >
>
> This is done lazily.
>
> > I'd love to update the book when I have some answers.
> >
>
> Thanks Lars,
> S
>
>
> > Thank you!
> >
> > Cheers,
> > Lars
>

Re: On the number of column families

Posted by Stack <st...@duboce.net>.
On Thu, Jul 12, 2018 at 4:31 AM Lars Francke <la...@gmail.com> wrote:
>
> I've got a question on the number of column families. I've told everyone
> for years that you shouldn't use more than maybe 3-10 column families.
>
> Our book still says the following:
> "HBase currently does not do well with anything above two or three column
> families so keep the number of column families in your schema low.
> Currently, *flushing* and compactions are done on a per Region basis so if
> one column family is carrying the bulk of the data bringing on flushes, the
> adjacent families will also be flushed even though the amount of data they
> carry is small."
>
> I'm wondering what the state of the art _really_ is today.
>
> I know that flushing happens per CF.

Yes.


As far as I can tell though
> compactions still happen for all stores in a region after a flush.
>
> Related question there (there's always a good chance that I misread the
> code): Wouldn't it make sense to make the compaction decision after a flush
> also per Store?
>

Yes.

We compact a CF-at-a-time (looking in logs). CompactionRequest is CF
scoped. You reckon we do the full Region, Lars? (I've not dug in.)


> But back to the original question. How many column families do you see
> and/or use in production? And what are the remaining reasons against "a
> lot"?
>

I think the 3-10 is fine as a general recommendation. Perhaps caveat
that more is also possible but queries should be CF scoped, outlining
what happens on full-row fetches, especially if the character of the
data in each CF varies radically; e.g. one CF has images, while another
has metadata.


> My list is the following:
> - Splits happen per region, so small CFs will be split to be even smaller
> - Each CF takes up a few resources even if they are not in use (no reads or
> writes)
> - If each CF is used then there is an increased total memory pressure which
> will probably lead to early flushes which leads to smaller files which
> leads to more compactions etc.
> - As far as I can tell (but I'm not sure) when a single Store/CF answers
> "yes" to the "needsCompaction()" call after a flush the whole region will
> be compacted

We need to answer this question. I spent five minutes looking in logs
and compactions look to run per-CF. Looking in the code, I see that we
generally do it by CF, but there is a top-level method that does all
CFs... seemingly used from tests.

What are you seeing, Lars?

If we compact all CFs when a compaction runs, that's a bug.

Thanks,
S



> - Each CF creates a directory + files per region -> might lead to lots of
> small files
>

This is done lazily.

> I'd love to update the book when I have some answers.
>

Thanks Lars,
S


> Thank you!
>
> Cheers,
> Lars