Posted to user@hbase.apache.org by Leif Wickland <le...@gmail.com> on 2011/06/02 23:40:35 UTC

Question from HBase book: "HBase currently does not do well with anything about two or three column families"

I was reading through the HBase book and came across the following in *6.2. On
the number of column families* <http://hbase.apache.org/book.html#number.of.cfs>:

*"HBase currently does not do well with anything about two or three column
families so keep the number of column families in your schema low.
Currently, flushing and compactions are done on a per Region basis so if one
column family is carrying the bulk of the data bringing on flushes, the
adjacent families will also be flushed though the amount of data they carry
is small. Compaction is currently triggered by the total number of files
under a column family. Its not size based. When many column families the
flushing and compaction interaction can make for a bunch of needless i/o
loading (To be addressed by changing flushing and compaction to work on a
per column family basis).*

*Try to make do with one column family if you can in your schemas. Only
introduce a second and third column family in the case where data access is
usually column scoped; i.e. you query one column family or the other but
usually not both at the one time."*
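
A rough way to picture the interaction the quoted passage describes, as a toy model rather than HBase code: the class below keeps one in-memory buffer per column family and flushes every family in the region at once. It assumes a flush is triggered when the region's total buffered size reaches a threshold; that trigger, the threshold value, and all names (RegionFlushSketch, write, filesFor) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of per-region flushing; not HBase code. */
class RegionFlushSketch {
    private final long flushThresholdBytes;   // hypothetical trigger for a flush
    private final Map<String, Long> buffered = new LinkedHashMap<>();
    private final Map<String, List<Long>> flushedFiles = new LinkedHashMap<>();

    RegionFlushSketch(long flushThresholdBytes, String... families) {
        this.flushThresholdBytes = flushThresholdBytes;
        for (String cf : families) {
            buffered.put(cf, 0L);
            flushedFiles.put(cf, new ArrayList<>());
        }
    }

    /** Buffers a write, then flushes the whole region if the total reaches the threshold. */
    void write(String family, long bytes) {
        buffered.merge(family, bytes, Long::sum);
        long total = buffered.values().stream().mapToLong(Long::longValue).sum();
        if (total >= flushThresholdBytes) {
            flushAllFamilies();
        }
    }

    /** Every family with buffered data gets a new file, however small. */
    private void flushAllFamilies() {
        for (Map.Entry<String, Long> e : buffered.entrySet()) {
            if (e.getValue() > 0) {
                flushedFiles.get(e.getKey()).add(e.getValue());
            }
            e.setValue(0L);
        }
    }

    /** Sizes of the files flushed so far for one family. */
    List<Long> filesFor(String family) {
        return flushedFiles.get(family);
    }

    public static void main(String[] args) {
        RegionFlushSketch region = new RegionFlushSketch(1000, "big", "small");
        for (int i = 0; i < 30; i++) {
            region.write("big", 100);   // carries the bulk of the data
            region.write("small", 1);   // carries very little
        }
        System.out.println("big files:   " + region.filesFor("big"));
        System.out.println("small files: " + region.filesFor("small"));
    }
}
```

In this model both families end up with the same number of flushed files, although the small family's files hold only a few bytes each. Since the passage says compaction is triggered by the number of files under a family, those tiny files are the kind of needless I/O it warns about.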

Is that still considered current?  Do folks on the list generally agree with
that guideline?

The reason that I ask is that I'm designing a data model which currently has
five column families.  I expect each of those column families to have
divergent read and write patterns.  Do you think I should look for ways to
reduce the number of CFs?

Thanks,

Leif Wickland

RE: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Doug Meil <do...@explorysmedical.com>.
Re:  " monotonically increasing column names."

No problem with that. 


-----Original Message-----
From: Leif Wickland [mailto:leifwickland@gmail.com] 
Sent: Monday, June 13, 2011 5:29 PM
To: user@hbase.apache.org
Subject: Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

>
> Read the part about monotonically increasing keys in the HBase book.  
> There have been lots of other threads in the dist-list about this topic too.


Thanks for mentioning that, Doug.  I did see that in the HBase book.

My wording was poor.  I meant that the column names would be derived from data like [timestamp, action details, session ID].  I've been trying to figure out if I could use the cell's timestamp (and have no garbage collection) so that the key name would be derived from [action details, session ID].  The downside of that approach is I'd need to load all of the cells in memory and sort them in order to do some of the analysis I need.

I don't remember seeing an admonishment against monotonically increasing column names.  Is that also a bad idea?

Thanks for your help,

Leif Wickland


>
> -----Original Message-----
> From: Leif Wickland [mailto:leifwickland@gmail.com]
> Sent: Monday, June 13, 2011 1:29 PM
> To: user@hbase.apache.org
> Subject: Re: Question from HBase book: "HBase currently does not do 
> well with anything about two or three column families"
>
> >
> > If they have divergent read and write patterns why not put them in 
> > separate tables?
> >
>
> That's an entirely fair question.  I'm new to this.  I figured if the 
> data was related to the same thing and could have the same key, then 
> it ought to go into various CFs on that key in a single table.  I got 
> the feeling from reading the BigTable paper that the typical design 
> approach was to dump lots of CFs into a table.  It seems like that's not the HBase-way, though.
>
> For the most part it's not a big deal to store the data in separate tables.
>  However, I'm curious what you'd recommend for one particular part of it.
>  Specifically I'd like to store actions within a web visit.  I've been 
> planning to store individual actions as columns in their own column 
> family, keyed by something like [timestamp, action details, session 
> ID].  In another column family I'd been planning on storing statistics 
> about the actions, such as first time, end time, count, etc.  When 
> writing to the actions CF, I'd need to read from and possibly update 
> the stats CF.  Would your recommendation be to store that kind of data 
> in the same CF, two CFs in the same table, or in two separate tables?
>
> My thought was that I could use row locking to avoid races to update 
> the stats after inserting into actions if I took the two CF approach.
>
> Thanks for your feedback,
>
> Leif Wickland
>

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Leif Wickland <le...@gmail.com>.
>
> Read the part about monotonically increasing keys in the HBase book.  There
> have been lots of other threads in the dist-list about this topic too.


Thanks for mentioning that, Doug.  I did see that in the HBase book.

My wording was poor.  I meant that the column names would be derived from
data like [timestamp, action details, session ID].  I've been trying to
figure out if I could use the cell's timestamp (and have no garbage
collection) so that the key name would be derived from [action details,
session ID].  The downside of that approach is I'd need to load all of the
cells in memory and sort them in order to do some of the analysis I need.
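
To make the first layout concrete, here is a minimal sketch of a column name built from [timestamp, action details, session ID]. The timestamp is written as a fixed-width big-endian long, so that comparing names byte by byte (as unsigned bytes) gives chronological order for non-negative timestamps. It uses only the JDK; the class name, method names, and the ':' separator are assumptions for illustration and not part of any HBase API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeMap;

/** Toy encoding of a composite column name; hypothetical layout. */
class QualifierSketch {
    /** Encodes [timestamp, action details, session ID] into one byte array. */
    static byte[] qualifier(long timestampMillis, String action, String sessionId) {
        byte[] rest = (action + ":" + sessionId).getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Long.BYTES + rest.length)
                .putLong(timestampMillis)   // ByteBuffer writes big-endian by default
                .put(rest)
                .array();
    }

    /** Reads the leading timestamp back out of an encoded name. */
    static long timestampOf(byte[] qualifier) {
        return ByteBuffer.wrap(qualifier).getLong();
    }

    public static void main(String[] args) {
        // Sort names as unsigned bytes, the way a byte-ordered store would.
        Comparator<byte[]> byteOrder = Arrays::compareUnsigned;
        TreeMap<byte[], String> columns = new TreeMap<>(byteOrder);
        columns.put(qualifier(3000L, "checkout", "s1"), "value-c");
        columns.put(qualifier(1000L, "view", "s1"), "value-a");
        columns.put(qualifier(2000L, "click", "s1"), "value-b");
        for (byte[] q : columns.keySet()) {
            System.out.println(timestampOf(q));
        }
    }
}
```

If the timestamp is moved out of the name and into the cell's timestamp, the names sort by action details instead, which is why that variant needs the in-memory sort described above.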

I don't remember seeing an admonishment against monotonically increasing
column names.  Is that also a bad idea?

Thanks for your help,

Leif Wickland


>
> -----Original Message-----
> From: Leif Wickland [mailto:leifwickland@gmail.com]
> Sent: Monday, June 13, 2011 1:29 PM
> To: user@hbase.apache.org
> Subject: Re: Question from HBase book: "HBase currently does not do well
> with anything about two or three column families"
>
> >
> > If they have divergent read and write patterns why not put them in
> > separate tables?
> >
>
> That's an entirely fair question.  I'm new to this.  I figured if the data
> was related to the same thing and could have the same key, then it ought to
> go into various CFs on that key in a single table.  I got the feeling from
> reading the BigTable paper that the typical design approach was to dump lots
> of CFs into a table.  It seems like that's not the HBase-way, though.
>
> For the most part it's not a big deal to store the data in separate tables.
>  However, I'm curious what you'd recommend for one particular part of it.
>  Specifically I'd like to store actions within a web visit.  I've been
> planning to store individual actions as columns in their own column family,
> keyed by something like [timestamp, action details, session ID].  In another
> column family I'd been planning on storing statistics about the actions,
> such as first time, end time, count, etc.  When writing to the actions CF,
> I'd need to read from and possibly update the stats CF.  Would your
> recommendation be to store that kind of data in the same CF, two CFs in the
> same table, or in two separate tables?
>
> My thought was that I could use row locking to avoid races to update the
> stats after inserting into actions if I took the two CF approach.
>
> Thanks for your feedback,
>
> Leif Wickland
>

RE: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Doug Meil <do...@explorysmedical.com>.
Re:  " keyed by something like [timestamp, action details, session ID]"

Read the part about monotonically increasing keys in the HBase book.  There have been lots of other threads in the dist-list about this topic too.


-----Original Message-----
From: Leif Wickland [mailto:leifwickland@gmail.com] 
Sent: Monday, June 13, 2011 1:29 PM
To: user@hbase.apache.org
Subject: Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

>
> If they have divergent read and write patterns why not put them in 
> separate tables?
>

That's an entirely fair question.  I'm new to this.  I figured if the data was related to the same thing and could have the same key, then it ought to go into various CFs on that key in a single table.  I got the feeling from reading the BigTable paper that the typical design approach was to dump lots of CFs into a table.  It seems like that's not the HBase-way, though.

For the most part it's not a big deal to store the data in separate tables.
 However, I'm curious what you'd recommend for one particular part of it.
 Specifically I'd like to store actions within a web visit.  I've been planning to store individual actions as columns in their own column family, keyed by something like [timestamp, action details, session ID].  In another column family I'd been planning on storing statistics about the actions, such as first time, end time, count, etc.  When writing to the actions CF, I'd need to read from and possibly update the stats CF.  Would your recommendation be to store that kind of data in the same CF, two CFs in the same table, or in two separate tables?

My thought was that I could use row locking to avoid races to update the stats after inserting into actions if I took the two CF approach.

Thanks for your feedback,

Leif Wickland

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Stack <st...@duboce.net>.
Jason:

See the BigTable paper.

The HBase architecture is missing a tier from what is described in the
BigTable paper. How LGs would function is how CFs currently work in
HBase.  In BigTable, they have an added functionality where they can
group their notion of CFs into a LG.  Apparently they can change their
schema as they go to change what CFs are in what LGs (Out in their SST
files, I'd imagine, every entry is not from the same CF as it is in
our HFiles).
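
One way to picture the missing tier, as a toy model only: in the sketch below each locality group stands for one separately stored set of files. With a BigTable-style mapping, several column families can share a group; with the 1:1 mapping described for HBase in this thread, every family is its own group. The family and group names are made up for the example, and nothing here is an HBase or BigTable API.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of column family to locality group mapping. */
class LocalityGroupSketch {
    /** Number of separately stored file sets implied by a family-to-group mapping. */
    static int storageUnits(Map<String, String> familyToGroup) {
        Set<String> groups = new LinkedHashSet<>(familyToGroup.values());
        return groups.size();
    }

    /** The 1:1 case: each family is its own group. */
    static Map<String, String> oneToOne(String... families) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (String cf : families) {
            mapping.put(cf, cf);
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Hypothetical schema with five families packed into two groups.
        Map<String, String> grouped = new LinkedHashMap<>();
        grouped.put("contents", "bulk");
        grouped.put("anchors", "bulk");
        grouped.put("language", "metadata");
        grouped.put("checksum", "metadata");
        grouped.put("stats", "metadata");

        System.out.println("grouped: " + storageUnits(grouped));
        System.out.println("1:1:     " + storageUnits(
                oneToOne("contents", "anchors", "language", "checksum", "stats")));
    }
}
```

Under the grouped mapping the five families cost two sets of files; under the 1:1 mapping they cost five, which is the per-family overhead the rest of the thread is about.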

St.Ack

On Wed, Jun 22, 2011 at 10:13 AM, Jason Rutherglen
<ja...@gmail.com> wrote:
> Todd,
>
> Can you define what a Locality Group is for HBase and how it would
> function?  Eg, it sounds like the same thing as a column-family, and
> it's not clear how one would benefit from the usage of an LG.
>
> Jason
>
> On Mon, Jun 13, 2011 at 8:45 PM, Todd Lipcon <to...@cloudera.com> wrote:
>> Keep in mind that BigTable can have a large number of CFs because they also
>> have Locality Groups. HBase has a 1:1 mapping of CF -> Locality Group.
>>
>> I don't know for sure, but I imagine most of these BT tables have a very
>> small number of locality groups, even if they have 20+ CFs.
>>
>> Would be nice to extract CFs from LGs in HBase some day, if anyone has a
>> month free ;-)
>>
>> -Todd
>>
>> On Mon, Jun 13, 2011 at 2:44 PM, Jason Rutherglen <
>> jason.rutherglen@gmail.com> wrote:
>>
>>> > Table 2 provides some actual CF/table numbers.  One of the crawl tables
>>> has
>>> > 16 CFs and one of the Google Base tables had 29 CFs
>>>
>>> What's Google doing in BigTable that enables so many CFs?
>>>
>>> Is the cost in HBase the seek to each individual key in the CFs, or is
>>> it the cost of loading each block into RAM (?), which could be
>>> alleviated through bypassing the block cache and accessing the blocks
>>> as if they're local?
>>>
>>> On Mon, Jun 13, 2011 at 2:35 PM, Leif Wickland <le...@gmail.com>
>>> wrote:
>>> > Thanks for replying, J-D.
>>> >
>>> > My interpretation is that they try to keep that number low, from page 2:
>>> >>
>>> >> "It is our intent that the number of distinct column families in a
>>> >> table be small (in the hundreds at most)"
>>> >>
>>> >
>>> > Table 2 provides some actual CF/table numbers.  One of the crawl tables
>>> has
>>> > 16 CFs and one of the Google Base tables had 29 CFs.
>>> >
>>> >
>>> >> Could you just store that in the same family?
>>> >>
>>> >
>>> > Yup.  I could.  There would be a little weirdness to it, but I think it's
>>> > doable.  It seems like that's the consensus suggestion.
>>> >
>>> >
>>> >> Row locking is rarely a good idea, it doesn't scale and they currently
>>> >> aren't persisted anywhere except the RS memory (so if it dies...).
>>> >> Using a single family might be better for you.
>>> >
>>> >
>>> > Thanks for the pointer.
>>> >
>>> > Leif
>>> >
>>>
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Jason Rutherglen <ja...@gmail.com>.
Todd,

Can you define what a Locality Group is for HBase and how it would
function?  Eg, it sounds like the same thing as a column-family, and
it's not clear how one would benefit from the usage of an LG.

Jason

On Mon, Jun 13, 2011 at 8:45 PM, Todd Lipcon <to...@cloudera.com> wrote:
> Keep in mind that BigTable can have a large number of CFs because they also
> have Locality Groups. HBase has a 1:1 mapping of CF -> Locality Group.
>
> I don't know for sure, but I imagine most of these BT tables have a very
> small number of locality groups, even if they have 20+ CFs.
>
> Would be nice to extract CFs from LGs in HBase some day, if anyone has a
> month free ;-)
>
> -Todd
>
> On Mon, Jun 13, 2011 at 2:44 PM, Jason Rutherglen <
> jason.rutherglen@gmail.com> wrote:
>
>> > Table 2 provides some actual CF/table numbers.  One of the crawl tables
>> has
>> > 16 CFs and one of the Google Base tables had 29 CFs
>>
>> What's Google doing in BigTable that enables so many CFs?
>>
>> Is the cost in HBase the seek to each individual key in the CFs, or is
>> it the cost of loading each block into RAM (?), which could be
>> alleviated through bypassing the block cache and accessing the blocks
>> as if they're local?
>>
>> On Mon, Jun 13, 2011 at 2:35 PM, Leif Wickland <le...@gmail.com>
>> wrote:
>> > Thanks for replying, J-D.
>> >
>> > My interpretation is that they try to keep that number low, from page 2:
>> >>
>> >> "It is our intent that the number of distinct column families in a
>> >> table be small (in the hundreds at most)"
>> >>
>> >
>> > Table 2 provides some actual CF/table numbers.  One of the crawl tables
>> has
>> > 16 CFs and one of the Google Base tables had 29 CFs.
>> >
>> >
>> >> Could you just store that in the same family?
>> >>
>> >
>> > Yup.  I could.  There would be a little weirdness to it, but I think it's
>> > doable.  It seems like that's the consensus suggestion.
>> >
>> >
>> >> Row locking is rarely a good idea, it doesn't scale and they currently
>> >> aren't persisted anywhere except the RS memory (so if it dies...).
>> >> Using a single family might be better for you.
>> >
>> >
>> > Thanks for the pointer.
>> >
>> > Leif
>> >
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Todd Lipcon <to...@cloudera.com>.
Keep in mind that BigTable can have a large number of CFs because they also
have Locality Groups. HBase has a 1:1 mapping of CF -> Locality Group.

I don't know for sure, but I imagine most of these BT tables have a very
small number of locality groups, even if they have 20+ CFs.

Would be nice to extract CFs from LGs in HBase some day, if anyone has a
month free ;-)

-Todd

On Mon, Jun 13, 2011 at 2:44 PM, Jason Rutherglen <
jason.rutherglen@gmail.com> wrote:

> > Table 2 provides some actual CF/table numbers.  One of the crawl tables
> has
> > 16 CFs and one of the Google Base tables had 29 CFs
>
> What's Google doing in BigTable that enables so many CFs?
>
> Is the cost in HBase the seek to each individual key in the CFs, or is
> it the cost of loading each block into RAM (?), which could be
> alleviated through bypassing the block cache and accessing the blocks
> as if they're local?
>
> On Mon, Jun 13, 2011 at 2:35 PM, Leif Wickland <le...@gmail.com>
> wrote:
> > Thanks for replying, J-D.
> >
> > My interpretation is that they try to keep that number low, from page 2:
> >>
> >> "It is our intent that the number of distinct column families in a
> >> table be small (in the hundreds at most)"
> >>
> >
> > Table 2 provides some actual CF/table numbers.  One of the crawl tables
> has
> > 16 CFs and one of the Google Base tables had 29 CFs.
> >
> >
> >> Could you just store that in the same family?
> >>
> >
> > Yup.  I could.  There would be a little weirdness to it, but I think it's
> > doable.  It seems like that's the consensus suggestion.
> >
> >
> >> Row locking is rarely a good idea, it doesn't scale and they currently
> >> aren't persisted anywhere except the RS memory (so if it dies...).
> >> Using a single family might be better for you.
> >
> >
> > Thanks for the pointer.
> >
> > Leif
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Jason Rutherglen <ja...@gmail.com>.
> Table 2 provides some actual CF/table numbers.  One of the crawl tables has
> 16 CFs and one of the Google Base tables had 29 CFs

What's Google doing in BigTable that enables so many CFs?

Is the cost in HBase the seek to each individual key in the CFs, or is
it the cost of loading each block into RAM (?), which could be
alleviated through bypassing the block cache and accessing the blocks
as if they're local?

On Mon, Jun 13, 2011 at 2:35 PM, Leif Wickland <le...@gmail.com> wrote:
> Thanks for replying, J-D.
>
> My interpretation is that they try to keep that number low, from page 2:
>>
>> "It is our intent that the number of distinct column families in a
>> table be small (in the hundreds at most)"
>>
>
> Table 2 provides some actual CF/table numbers.  One of the crawl tables has
> 16 CFs and one of the Google Base tables had 29 CFs.
>
>
>> Could you just store that in the same family?
>>
>
> Yup.  I could.  There would be a little weirdness to it, but I think it's
> doable.  It seems like that's the consensus suggestion.
>
>
>> Row locking is rarely a good idea, it doesn't scale and they currently
>> aren't persisted anywhere except the RS memory (so if it dies...).
>> Using a single family might be better for you.
>
>
> Thanks for the pointer.
>
> Leif
>

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Leif Wickland <le...@gmail.com>.
Thanks for replying, J-D.

My interpretation is that they try to keep that number low, from page 2:
>
> "It is our intent that the number of distinct column families in a
> table be small (in the hundreds at most)"
>

Table 2 provides some actual CF/table numbers.  One of the crawl tables has
16 CFs and one of the Google Base tables had 29 CFs.


> Could you just store that in the same family?
>

Yup.  I could.  There would be a little weirdness to it, but I think it's
doable.  It seems like that's the consensus suggestion.


> Row locking is rarely a good idea, it doesn't scale and they currently
> aren't persisted anywhere except the RS memory (so if it dies...).
> Using a single family might be better for you.


Thanks for the pointer.

Leif

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Jean-Daniel Cryans <jd...@apache.org>.
> That's an entirely fair question.  I'm new to this.  I figured if the data
> was related to the same thing and could have the same key, then it ought to
> go into various CFs on that key in a single table.  I got the feeling from
> reading the BigTable paper that the typical design approach was to dump lots
> of CFs into a table.  It seems like that's not the HBase-way, though.

My interpretation is that they try to keep that number low, from page 2:

"It is our intent that the number of distinct column families in a
table be small (in the hundreds at most)"

>
> For the most part it's not a big deal to store the data in separate tables.
>  However, I'm curious what you'd recommend for one particular part of it.
>  Specifically I'd like to store actions within a web visit.  I've been
> planning to store individual actions as columns in their own column family,
> keyed by something like [timestamp, action details, session ID].  In another
> column family I'd been planning on storing statistics about the actions,
> such as first time, end time, count, etc.  When writing to the actions CF,
> I'd need to read from and possibly update the stats CF.  Would your
> recommendation be to store that kind of data in the same CF, two CFs in the
> same table, or in two separate tables?

Could you just store that in the same family?

>
> My thought was that I could use row locking to avoid races to update the
> stats after inserting into actions if I took the two CF approach.

Row locking is rarely a good idea; it doesn't scale, and the locks currently
aren't persisted anywhere except the RS memory (so if it dies...).
Using a single family might be better for you.
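
For what a single-family layout might look like, here is a small sketch that keeps both the actions and their statistics in one sorted map standing in for one row's cells in one column family. The two kinds of column are told apart by a name prefix. The prefixes "a:" and "s:", the stat names, and the class itself are hypothetical. This is a plain in-memory model that shows the layout only; it does not model concurrency or any HBase API.

```java
import java.util.Locale;
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy model: actions and stats side by side in one family of one row. */
class SingleFamilySketch {
    // One row's cells within a single column family, sorted by column name.
    private final TreeMap<String, String> cells = new TreeMap<>();

    /** Stores one action and updates the stats columns next to it. */
    void recordAction(long timestampMillis, String details) {
        // Zero-padded timestamp so that string order matches time order.
        cells.put(String.format(Locale.ROOT, "a:%013d", timestampMillis), details);

        int count = Integer.parseInt(cells.getOrDefault("s:count", "0"));
        cells.put("s:count", Integer.toString(count + 1));

        String ts = Long.toString(timestampMillis);
        long first = Long.parseLong(cells.getOrDefault("s:first", ts));
        long last = Long.parseLong(cells.getOrDefault("s:last", ts));
        cells.put("s:first", Long.toString(Math.min(first, timestampMillis)));
        cells.put("s:last", Long.toString(Math.max(last, timestampMillis)));
    }

    /** Returns only the columns whose names start with the given prefix. */
    SortedMap<String, String> withPrefix(String prefix) {
        return cells.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        SingleFamilySketch row = new SingleFamilySketch();
        row.recordAction(2000L, "click");
        row.recordAction(1000L, "view");
        System.out.println("actions: " + row.withPrefix("a:"));
        System.out.println("stats:   " + row.withPrefix("s:"));
    }
}
```

Reading the stats then means reading a handful of columns from the same family as the actions rather than a second family or table.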

J-D

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Leif Wickland <le...@gmail.com>.
>
> If they have divergent read and write patterns why not put them in separate
> tables?
>

That's an entirely fair question.  I'm new to this.  I figured if the data
was related to the same thing and could have the same key, then it ought to
go into various CFs on that key in a single table.  I got the feeling from
reading the BigTable paper that the typical design approach was to dump lots
of CFs into a table.  It seems like that's not the HBase-way, though.

For the most part it's not a big deal to store the data in separate tables.
However, I'm curious what you'd recommend for one particular part of it.
Specifically I'd like to store actions within a web visit.  I've been
planning to store individual actions as columns in their own column family,
keyed by something like [timestamp, action details, session ID].  In another
column family I'd been planning on storing statistics about the actions,
such as first time, end time, count, etc.  When writing to the actions CF,
I'd need to read from and possibly update the stats CF.  Would your
recommendation be to store that kind of data in the same CF, two CFs in the
same table, or in two separate tables?

My thought was that I could use row locking to avoid races to update the
stats after inserting into actions if I took the two CF approach.

Thanks for your feedback,

Leif Wickland

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Jeff Whiting <je...@qualtrics.com>.
If they have divergent read and write patterns why not put them in separate tables?

~Jeff

On 6/2/2011 4:06 PM, Doug Meil wrote:
> Re:  " Is that still considered current?  Do folks on the list generally agree with that guideline?"
>
> Yes and yes.  HBase runs better with fewer CFs.
>
>
>
> -----Original Message-----
> From: Leif Wickland [mailto:leifwickland@gmail.com]
> Sent: Thursday, June 02, 2011 5:41 PM
> To: user@hbase.apache.org
> Subject: Question from HBase book: "HBase currently does not do well with anything about two or three column families"
>
> I was reading through the HBase book and came across the following in *6.2. On the number of column families* <http://hbase.apache.org/book.html#number.of.cfs>:
>
> *"HBase currently does not do well with anything about two or three column families so keep the number of column families in your schema low.
> Currently, flushing and compactions are done on a per Region basis so if one column family is carrying the bulk of the data bringing on flushes, the adjacent families will also be flushed though the amount of data they carry is small. Compaction is currently triggered by the total number of files under a column family. Its not size based. When many column families the flushing and compaction interaction can make for a bunch of needless i/o loading (To be addressed by changing flushing and compaction to work on a per column family basis).*
>
> *Try to make do with one column family if you can in your schemas. Only introduce a second and third column family in the case where data access is usually column scoped; i.e. you query one column family or the other but usually not both at the one time."*
>
> Is that still considered current?  Do folks on the list generally agree with that guideline?
>
> The reason that I ask is that I'm designing a data model which currently has five column families.  I expect each of those column families to have divergent read and write patterns.  Do you think I should look for ways to reduce the number of CFs?
>
> Thanks,
>
> Leif Wickland

-- 
Jeff Whiting
Qualtrics Senior Software Engineer
jeffw@qualtrics.com


RE: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Doug Meil <do...@explorysmedical.com>.
Re:  " Is that still considered current?  Do folks on the list generally agree with that guideline?"

Yes and yes.  HBase runs better with fewer CFs.



-----Original Message-----
From: Leif Wickland [mailto:leifwickland@gmail.com] 
Sent: Thursday, June 02, 2011 5:41 PM
To: user@hbase.apache.org
Subject: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

I was reading through the HBase book and came across the following in *6.2. On the number of column families* <http://hbase.apache.org/book.html#number.of.cfs>:

*"HBase currently does not do well with anything about two or three column families so keep the number of column families in your schema low.
Currently, flushing and compactions are done on a per Region basis so if one column family is carrying the bulk of the data bringing on flushes, the adjacent families will also be flushed though the amount of data they carry is small. Compaction is currently triggered by the total number of files under a column family. Its not size based. When many column families the flushing and compaction interaction can make for a bunch of needless i/o loading (To be addressed by changing flushing and compaction to work on a per column family basis).*

*Try to make do with one column family if you can in your schemas. Only introduce a second and third column family in the case where data access is usually column scoped; i.e. you query one column family or the other but usually not both at the one time."*

Is that still considered current?  Do folks on the list generally agree with that guideline?

The reason that I ask is that I'm designing a data model which currently has five column families.  I expect each of those column families to have divergent read and write patterns.  Do you think I should look for ways to reduce the number of CFs?

Thanks,

Leif Wickland

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Jean-Daniel Cryans <jd...@apache.org>.
https://issues.apache.org/jira/browse/HBASE-3149 for flushes, not sure
about compactions.

J-D

On Thu, Jun 2, 2011 at 2:57 PM, Vidhyashankar Venkataraman
<vi...@yahoo-inc.com> wrote:
> Is there a JIRA for issuing flushes and compactions on a per column family basis?
>
>
> On 6/2/11 2:48 PM, "Stack" <st...@duboce.net> wrote:
>
> On Thu, Jun 2, 2011 at 2:40 PM, Leif Wickland <le...@gmail.com> wrote:
>> Do you think I should look for ways to reduce the number of CFs?
>>
>
> If you can, yes (The book is current -- the work on making hbase do
> better with more CFs is yet to be done).
>
> Good luck,
> St.Ack
>
>

Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Vidhyashankar Venkataraman <vi...@yahoo-inc.com>.
Is there a JIRA for issuing flushes and compactions on a per column family basis?


On 6/2/11 2:48 PM, "Stack" <st...@duboce.net> wrote:

On Thu, Jun 2, 2011 at 2:40 PM, Leif Wickland <le...@gmail.com> wrote:
> Do you think I should look for ways to reduce the number of CFs?
>

If you can, yes (The book is current -- the work on making hbase do
better with more CFs is yet to be done).

Good luck,
St.Ack


Re: Question from HBase book: "HBase currently does not do well with anything about two or three column families"

Posted by Stack <st...@duboce.net>.
On Thu, Jun 2, 2011 at 2:40 PM, Leif Wickland <le...@gmail.com> wrote:
> Do you think I should look for ways to reduce the number of CFs?
>

If you can, yes (The book is current -- the work on making hbase do
better with more CFs is yet to be done).

Good luck,
St.Ack