Posted to user@hbase.apache.org by Mohit Anchlia <mo...@gmail.com> on 2012/09/10 19:26:27 UTC

More rows or less rows and more columns

Is there any recommendation on how many columns one should have per row? My
columns are < 200 bytes. This will help me decide whether I should shard my
rows with id + <some date/time value>.
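
For illustration, the "id + <some date/time value>" idea can be sketched in plain Python with no HBase client involved. The `#` separator, the day-sized bucket, the `sensor42` id and the function name `sharded_row_key` are all assumptions made for this sketch, not anything prescribed by HBase or by this thread.

```python
from datetime import datetime, timezone


def sharded_row_key(entity_id: str, ts: datetime, bucket_format: str = "%Y%m%d") -> bytes:
    """Build a row key of the form <id>#<time bucket> (hypothetical layout)."""
    bucket = ts.astimezone(timezone.utc).strftime(bucket_format)
    return f"{entity_id}#{bucket}".encode("utf-8")


# Three example data points for one id, two on one day and one on the next.
readings = [
    ("sensor42", datetime(2012, 9, 10, 17, 26, tzinfo=timezone.utc)),
    ("sensor42", datetime(2012, 9, 10, 23, 59, tzinfo=timezone.utc)),
    ("sensor42", datetime(2012, 9, 11, 0, 1, tzinfo=timezone.utc)),
]

# A dict of dicts stands in for the table: row key -> {qualifier: value}.
rows = {}
for entity_id, ts in readings:
    key = sharded_row_key(entity_id, ts)
    qualifier = ts.strftime("%H%M%S").encode("utf-8")
    rows.setdefault(key, {})[qualifier] = b"value"
```

With a day-sized bucket, each row holds at most one day of columns for an id; a coarser or finer bucket trades the number of rows against the number of columns per row.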

Re: More rows or less rows and more columns

Posted by Doug Meil <do...@explorysmedical.com>.
re:  "You may want to update this section"

Good point.  I will add.





On 9/11/12 6:59 AM, "Michel Segel" <mi...@hotmail.com> wrote:

>Option c, depending on the use case: add a structure to your columns to
>store the data.
>You may want to update this section....



Re: More rows or less rows and more columns

Posted by Michel Segel <mi...@hotmail.com>.
Option c, depending on the use case: add a structure to your columns to store the data.
You may want to update this section....
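
One possible reading of "add a structure to your columns" is to pack several small fields into a single column value instead of spreading them over many columns. The sketch below is plain Python; the fixed (timestamp, reading) record layout and the helper names `pack_points` and `unpack_points` are assumptions for illustration only.

```python
import struct

# Hypothetical record layout: 8-byte big-endian timestamp (ms) + 8-byte float reading.
RECORD = struct.Struct(">qd")


def pack_points(points):
    """Concatenate (timestamp, reading) pairs into one byte string for a single cell."""
    return b"".join(RECORD.pack(ts, val) for ts, val in points)


def unpack_points(blob):
    """Split a packed cell value back into (timestamp, reading) tuples."""
    return [RECORD.unpack_from(blob, off) for off in range(0, len(blob), RECORD.size)]


points = [(1347297987000, 21.5), (1347298047000, 21.7)]
cell_value = pack_points(points)
```

The trade-off in this sketch is that the whole value has to be read and rewritten to change one point, in exchange for far fewer columns per row.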


Mike Segel

On Sep 10, 2012, at 12:30 PM, Harsh J <ha...@cloudera.com> wrote:

> Hey Mohit,
> 
> See http://hbase.apache.org/book.html#schema.smackdown.rowscols
> 
> On Mon, Sep 10, 2012 at 10:56 PM, Mohit Anchlia <mo...@gmail.com> wrote:
>> Is there any recommendation on how many columns one should have per row. My
>> columns are < 200 bytes. This will help me to decide if I should shard my
>> rows with id + <some date/time value>.
> 
> 
> 
> -- 
> Harsh J
> 

Re: More rows or less rows and more columns

Posted by Harsh J <ha...@cloudera.com>.
Ah, sorry for assuming that then. I don't know of a way to sort
qualifiers. I haven't seen anyone do that or require it for
unstructured data (e.g. a query like "fetch me the latest qualifier
added to this row"). I suppose you can compare the last two versions
to see what was changed, but I still don't see why you need this.

For timeseries, I'd recommend looking at what OpenTSDB already provides though.

On Mon, Sep 10, 2012 at 11:32 PM, Mohit Anchlia <mo...@gmail.com> wrote:
> On Mon, Sep 10, 2012 at 10:59 AM, Harsh J <ha...@cloudera.com> wrote:
>
>> Versions is what you're talking about, and by default all queries
>> return the latest version of updated values.
>>
>
> No, actually I was asking: if I have columns with qualifiers
>
> d,b,c,e can I store them sorted such that the order is e,d,c,b? This way I
> can just get the most recent qualifier, e.g. for timeseries.
>



-- 
Harsh J

Re: More rows or less rows and more columns

Posted by Mohit Anchlia <mo...@gmail.com>.
On Mon, Sep 10, 2012 at 10:59 AM, Harsh J <ha...@cloudera.com> wrote:

> Versions is what you're talking about, and by default all queries
> return the latest version of updated values.
>

No, actually I was asking: if I have columns with qualifiers

d,b,c,e can I store them sorted such that the order is e,d,c,b? This way I
can just get the most recent qualifier, e.g. for timeseries.
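
HBase keeps the qualifiers within a row in ascending byte order, so one commonly used workaround (not something proposed in this thread) is to encode the qualifier so that newer entries compare lower, for example as a reversed timestamp. A minimal sketch in plain Python, assuming millisecond timestamps and hypothetical helper names; it only imitates the byte-wise sort and does not talk to HBase:

```python
import struct

MAX_LONG = 2**63 - 1  # largest signed 64-bit value


def reversed_ts_qualifier(ts_millis: int) -> bytes:
    """Encode (MAX_LONG - ts) as 8 big-endian bytes so newer timestamps sort first."""
    return struct.pack(">q", MAX_LONG - ts_millis)


def decode_qualifier(qualifier: bytes) -> int:
    """Recover the original timestamp from a reversed-timestamp qualifier."""
    return MAX_LONG - struct.unpack(">q", qualifier)[0]


# Example timestamps in milliseconds, inserted oldest first.
timestamps = [1347297987000, 1347298047000, 1347298107000]

# Sorting the byte strings ascending imitates the order a row would be stored in.
qualifiers = sorted(reversed_ts_qualifier(t) for t in timestamps)

# The first qualifier in stored order now belongs to the newest data point.
latest = decode_qualifier(qualifiers[0])
```

Under these assumptions, a read limited to the first column of the row would see the newest data point without fetching the rest.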


Re: More rows or less rows and more columns

Posted by Harsh J <ha...@cloudera.com>.
Versions is what you're talking about, and by default all queries
return the latest version of updated values.
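
As a toy model of what "latest version by default" means, the following plain-Python sketch keeps several timestamped versions of one cell and returns only the newest unless more are requested. The class name `VersionedCell` and the limit of 3 retained versions are assumptions for this sketch, not HBase API.

```python
class VersionedCell:
    """Toy stand-in for one cell that retains a few timestamped versions."""

    def __init__(self, max_versions: int = 3):
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        self.max_versions = max_versions
        self._versions = {}  # timestamp -> value

    def put(self, ts: int, value: bytes) -> None:
        self._versions[ts] = value
        # Drop the oldest entries beyond the retention limit.
        for old_ts in sorted(self._versions)[:-self.max_versions]:
            del self._versions[old_ts]

    def get(self, versions: int = 1):
        """Return up to `versions` (timestamp, value) pairs, newest first."""
        newest_first = sorted(self._versions.items(), reverse=True)
        return newest_first[:versions]


cell = VersionedCell()
for ts, value in [(100, b"a"), (200, b"b"), (300, b"c"), (400, b"d")]:
    cell.put(ts, value)

default_read = cell.get()          # only the newest version
two_newest = cell.get(versions=2)  # explicitly ask for more
```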

On Mon, Sep 10, 2012 at 11:04 PM, Mohit Anchlia <mo...@gmail.com> wrote:
> On Mon, Sep 10, 2012 at 10:30 AM, Harsh J <ha...@cloudera.com> wrote:
>
>> Hey Mohit,
>>
>> See http://hbase.apache.org/book.html#schema.smackdown.rowscols
>
>
> Thanks! Is there a way in HBase to get the most recently inserted column? Or
> a way to sort columns such that I can manage how many columns I want to
> read? In timeseries we might be interested in only the most recent data point.
>



-- 
Harsh J

Re: More rows or less rows and more columns

Posted by Mohit Anchlia <mo...@gmail.com>.
On Mon, Sep 10, 2012 at 10:30 AM, Harsh J <ha...@cloudera.com> wrote:

> Hey Mohit,
>
> See http://hbase.apache.org/book.html#schema.smackdown.rowscols


Thanks! Is there a way in HBase to get the most recently inserted column? Or
a way to sort columns such that I can manage how many columns I want to
read? In timeseries we might be interested in only the most recent data point.


Re: More rows or less rows and more columns

Posted by Harsh J <ha...@cloudera.com>.
Hey Mohit,

See http://hbase.apache.org/book.html#schema.smackdown.rowscols

On Mon, Sep 10, 2012 at 10:56 PM, Mohit Anchlia <mo...@gmail.com> wrote:
> Is there any recommendation on how many columns one should have per row. My
> columns are < 200 bytes. This will help me to decide if I should shard my
> rows with id + <some date/time value>.



-- 
Harsh J