You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Bui Ngoc Son <ge...@gmail.com> on 2011/04/20 08:53:31 UTC
HBase Schema: how to sort row by last update?
Hi everybody,
I am desiging a two-tiers comment system like facebook: the system
inlcuded main comments and each main comment has a various number of
sub-comments. My schema is as follow:
table comments
family "data":
"data:content" - content of main comment
"data:uid" - uid of user who posted this comment
"data:sub_count" - the number of its sub-comment.
family "sub_content"
"sub_content:<cid>": content of the sub-comment which has id=cid
family "sub_uid"
"sub_uid:<cid>": uid of user who posted the sub-comment which
has id=cid
So, each sub-comment has two columns in 2 families with an unique cid.
As well, each main comment has an unique id that is decreased by time
(to make sure that newer main comments are selected first)
However, my boss requires that main comments have to be ordered by last
update first. It is mean that, each time a main comment have a new
sub-comment, it becomes "newer" than the others, and i have to sort the
main comments by the time of their last sub-comments.
How can i design the comment schema to meet my boss's requirement? can i
keep the current schema and use some extra-indexing table? or i have to
re-design the schema?
Thanks in advance
Regards,
Eddie Bui
Re: HBase Schema: how to sort row by last update?
Posted by Bui Ngoc Son <ge...@gmail.com>.
Hi Stack,
Thanks for your reply!
Two major queries in my system are "select" queries:
1. Select main comments by page, n items per page, ordered by last
update DESC. The last update of a main comment is the timestamp of its
newest sub-comment (if it doesn't have any sub-comment, last update is
the timestamp of main comment).
2. For each main comment, select 2 newest sub-comments, and select all
sub-comments.
With my current schema, the second query can be done easily, but the
first cannot.
I'm thinking about the second index table with this schema:
table "lastpost_index"
row key: <reversed_timestamp_of_last_post>_<id_of_main_comment>
no need any family.
Each time a main comment has new sub comment, i insert a new row into
"lastpost_index" table.
Scanning "lastpost_index" table and parsing row keys i could have a list
of <id_of_main_comment> order by lastpost DESC. However, there is one
problem:
Some main comments are "busy" than others, it means that they have more
sub-comment and are updated more regularly than others. For example:
table "lastpost_index":
key: 123456789_10
key: 123456799_11
key: 123456899_11
key: 123456999_9
the main comment with id 11 has two newer sub-comments than main comment
9. When i scan the table, the main comment 11 appear twice.
So, it's difficult for me to get exactly n newest distinct main comment
using lastpost_index.
Regards,
Eddie Bui
On 4/20/2011 11:30 PM, Stack wrote:
> It looks like you keep up a second index table ordered by last post
> (row key will have a reversed timestamp of the post's date).
>
> What are the queries you are going to make against the table?
>
> St.Ack
>
> On Tue, Apr 19, 2011 at 11:53 PM, Bui Ngoc Son<ge...@gmail.com> wrote:
>> Hi everybody,
>>
>> I am desiging a two-tiers comment system like facebook: the system inlcuded
>> main comments and each main comment has a various number of sub-comments. My
>> schema is as follow:
>>
>> table comments
>> family "data":
>> "data:content" - content of main comment
>> "data:uid" - uid of user who posted this comment
>> "data:sub_count" - the number of its sub-comment.
>> family "sub_content"
>> "sub_content:<cid>": content of the sub-comment which has id=cid
>> family "sub_uid"
>> "sub_uid:<cid>": uid of user who posted the sub-comment which has
>> id=cid
>>
>> So, each sub-comment has two columns in 2 families with an unique cid. As
>> well, each main comment has an unique id that is decreased by time (to make
>> sure that newer main comments are selected first)
>>
>> However, my boss requires that main comments have to be ordered by last
>> update first. It is mean that, each time a main comment have a new
>> sub-comment, it becomes "newer" than the others, and i have to sort the main
>> comments by the time of their last sub-comments.
>>
>> How can i design the comment schema to meet my boss's requirement? can i
>> keep the current schema and use some extra-indexing table? or i have to
>> re-design the schema?
>>
>> Thanks in advance
>>
>> Regards,
>>
>> Eddie Bui
>>
Re: HBase Schema: how to sort row by last update?
Posted by Stack <st...@duboce.net>.
It looks like you keep up a second index table ordered by last post
(row key will have a reversed timestamp of the post's date).
What are the queries you are going to make against the table?
St.Ack
On Tue, Apr 19, 2011 at 11:53 PM, Bui Ngoc Son <ge...@gmail.com> wrote:
> Hi everybody,
>
> I am desiging a two-tiers comment system like facebook: the system inlcuded
> main comments and each main comment has a various number of sub-comments. My
> schema is as follow:
>
> table comments
> family "data":
> "data:content" - content of main comment
> "data:uid" - uid of user who posted this comment
> "data:sub_count" - the number of its sub-comment.
> family "sub_content"
> "sub_content:<cid>": content of the sub-comment which has id=cid
> family "sub_uid"
> "sub_uid:<cid>": uid of user who posted the sub-comment which has
> id=cid
>
> So, each sub-comment has two columns in 2 families with an unique cid. As
> well, each main comment has an unique id that is decreased by time (to make
> sure that newer main comments are selected first)
>
> However, my boss requires that main comments have to be ordered by last
> update first. It is mean that, each time a main comment have a new
> sub-comment, it becomes "newer" than the others, and i have to sort the main
> comments by the time of their last sub-comments.
>
> How can i design the comment schema to meet my boss's requirement? can i
> keep the current schema and use some extra-indexing table? or i have to
> re-design the schema?
>
> Thanks in advance
>
> Regards,
>
> Eddie Bui
>