Posted to user@hbase.apache.org by Bui Ngoc Son <ge...@gmail.com> on 2011/02/07 12:25:18 UTC
Select first n columns of a column family
Hi everyone,
I am going to implement a Facebook-like comment system (with main
comments and their sub-comments) using HBase 0.90. I designed the
storage of comments as follows:
- The row key is composite: each row key is composed of the user id
and the timestamp (in reverse order). With this row key, I can easily get
the top n newest main comments of a specific user.
- Each row has two column families: "data" and "sub_comment".
+ The "data" family stores information about the main comment, such as the
comment content, the user id of the poster, the time of the post...
+ The "sub_comment" family, as its name suggests, stores all the sub-comments.
The number of columns in this family equals the number of sub-comments,
column names follow the format "sub-comment:timestamp" (timestamp in
reverse order), and each value is the content of one sub-comment.
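To make the layout above concrete, here is a minimal in-memory sketch of the "timestamp in reverse order" idea. It is not HBase client code: a TreeMap stands in for the sorted columns of one row, and the class name SubCommentColumns, the method names, and the choice of Long.MAX_VALUE minus the timestamp as the reversal are all assumptions made for illustration. It also assumes non-negative millisecond timestamps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory model only: a TreeMap stands in for the sorted columns of one row.
// None of these names come from the HBase client API.
final class SubCommentColumns {
    // Qualifier -> comment text, kept in ascending lexicographic order,
    // which mirrors how column qualifiers are ordered within a family.
    private final TreeMap<String, String> columns = new TreeMap<>();

    // Reverse the timestamp so that newer comments get smaller qualifiers.
    // Zero-padding to 19 digits keeps string order equal to numeric order.
    // Assumes timestampMillis >= 0.
    static String reverseQualifier(long timestampMillis) {
        return String.format("%019d", Long.MAX_VALUE - timestampMillis);
    }

    void put(long timestampMillis, String text) {
        columns.put(reverseQualifier(timestampMillis), text);
    }

    // Values of the first n columns, i.e. the n newest sub-comments.
    List<String> firstN(int n) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : columns.entrySet()) {
            if (result.size() >= n) {
                break;
            }
            result.add(e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        SubCommentColumns row = new SubCommentColumns();
        row.put(1297077918000L, "first reply");
        row.put(1297077919000L, "second reply");
        row.put(1297077920000L, "third reply");
        // Newest entries come first because their reversed qualifiers sort lowest.
        System.out.println(row.firstN(2));
    }
}
```

Because the reversed qualifiers sort newest-first, "the first n columns" and "the n newest sub-comments" are the same thing in this model, which is the property the question relies on.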
So, my question is: is there any way to get the first n columns (ordered
by timestamp) of the "sub_comment" family? I searched around Google and
this mailing list, but could not find this information.
I know that I can use different versions of a column to solve this
problem, but for some reasons I do not want to use multiple versions of
a column.
Thanks in advance
Eddie Bui,
Re: Select first n columns of a column family
Posted by Stack <st...@duboce.net>.
The timestamp as the column qualifier in sub_comment seems
redundant, since HBase already sorts the versions of a cell in reverse
timestamp order (newest first). Why not instead name your column
sub_comment:data_comment_id and then just do a Get with maximum versions
set to the number of results you want? (You can even supply a TimeRange
to the Get and it should do the right thing.)
St.Ack
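The versions-based approach suggested above can be sketched with a small in-memory model. This is not the HBase client API: VersionedCell and its methods are hypothetical names, and the model only imitates the behaviour being described (versions of one cell kept newest first, a cap on how many are returned, and an optional time window). In the real client the corresponding knobs are, as far as the reply indicates, the maximum-versions and time-range settings on a Get. The half-open window [minStamp, maxStamp) used here is how I understand TimeRange to behave, so treat it as an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory model of one cell holding several timestamped versions.
// Hypothetical names; not part of the HBase client API.
final class VersionedCell {
    // timestamp -> value, iterated newest first.
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Collections.reverseOrder());

    // A second write with the same timestamp replaces the earlier value,
    // so two sub-comments must not share a timestamp in this scheme.
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Up to maxVersions values with minStamp <= timestamp < maxStamp,
    // newest first.
    List<String> get(int maxVersions, long minStamp, long maxStamp) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> e : versions.entrySet()) {
            if (out.size() >= maxVersions) {
                break;
            }
            long ts = e.getKey();
            if (ts >= minStamp && ts < maxStamp) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell();
        cell.put(100L, "reply A");
        cell.put(200L, "reply B");
        cell.put(300L, "reply C");
        // Two newest versions, no time restriction.
        System.out.println(cell.get(2, 0L, Long.MAX_VALUE));
    }
}
```

One design consequence worth keeping in mind: with this layout the column family has to be configured to retain at least as many versions as there can be sub-comments, otherwise older ones are dropped.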
RE: How long can I have a table open?
Posted by Jonathan Gray <jg...@fb.com>.
What's the nature of the stability problems? Are these issues seen server-side or just client-side?
Does restarting your long-running clients fix the issues?
RE: How long can I have a table open?
Posted by Peter Haidinyak <ph...@local.com>.
Thanks, I have been having some real stability problems with my cluster and I'm trying to narrow down the possible problems.
-Pete
RE: How long can I have a table open?
Posted by Jonathan Gray <jg...@fb.com>.
There is not really a limit on that. Underneath, the client will deal with following regions around if they move, finding a new master if there is a failover, etc.
The only thing that cannot be left open indefinitely is a scanner (scanners have a server-side lease and expire if left idle).
JG
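As a rough illustration of the lease behaviour described in the reply above, here is a minimal model of an idle-timeout lease. It is not how HBase implements scanner leases: IdleLease is a hypothetical class, the clock is passed in explicitly so the logic is easy to follow, and the 60-second timeout in the example is an arbitrary placeholder rather than a claim about any cluster's setting.

```java
// Minimal model of a lease that expires when it is left idle.
// Hypothetical class for illustration; not HBase code.
final class IdleLease {
    private final long timeoutMillis;
    private long lastTouchedMillis;

    IdleLease(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastTouchedMillis = nowMillis;
    }

    // Each use of the resource (for a scanner, fetching the next batch)
    // resets the idle clock.
    void renew(long nowMillis) {
        lastTouchedMillis = nowMillis;
    }

    // The lease has expired once the idle time exceeds the timeout.
    boolean isExpired(long nowMillis) {
        return nowMillis - lastTouchedMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        IdleLease lease = new IdleLease(60_000L, 0L); // placeholder timeout
        lease.renew(30_000L);                          // used at t = 30 s
        System.out.println(lease.isExpired(80_000L));  // idle for 50 s
        System.out.println(lease.isExpired(95_000L));  // idle for 65 s
    }
}
```

The practical point for a long-running import is that an open table handle has no such idle clock, while a scanner that sits unused between batches for longer than the lease period has to be reopened.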
How long can I have a table open?
Posted by Peter Haidinyak <ph...@local.com>.
During my import process I create a connection to a table. How long can I leave that connection open without using it before HBase will try to close it?
Hadoop 0.20.2+737
HBase 0.89.20100924+28
Thanks
-Pete
Re: Select first n columns of a column family
Posted by Ted Yu <yu...@gmail.com>.
We use multiple versions of a column in our application. It is quite
convenient.
In your current approach, how many columns do you specify when you create
the table?