Posted to user@cassandra.apache.org by Aditya Narayan <ad...@gmail.com> on 2011/03/06 18:35:16 UTC

What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

What would be a good strategy to store large text content (blog posts
of around 1500-3000 characters) in Cassandra? I need to store these
blog posts along with their metadata, like bloggerId and blogTags. I am
planning to store this data in a single row, giving each
attribute a single column. So one blog per row. Is using a single
column for a large blog post like this a good strategy?
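
A minimal sketch of the one-blog-per-row layout described above, using a plain Python dict as a stand-in for a column family. The names `blog_posts`, `store_post`, and the `body` column are hypothetical, and no real Cassandra client API is used. It only illustrates that a post of this length amounts to a few kilobytes in a single column value.

```python
# In-memory stand-in for a column family: row key -> {column name -> value}.
# All names here are illustrative assumptions, not a Cassandra client API.
blog_posts = {}

def store_post(blog_id, blogger_id, tags, body):
    """Store one blog post per row, one column per attribute."""
    blog_posts[blog_id] = {
        "bloggerId": blogger_id,
        "blogTags": ",".join(tags),
        "body": body,  # the whole post lives in a single column
    }

def body_size_bytes(blog_id):
    """Size of the post body once encoded as UTF-8."""
    return len(blog_posts[blog_id]["body"].encode("utf-8"))

# A 3000-character ASCII post encodes to 3000 bytes.
store_post("blog-1", "blogger-42", ["cassandra", "modeling"], "x" * 3000)
print(body_size_bytes("blog-1"))
```

Non-ASCII text takes more than one byte per character in UTF-8, so the byte size can be a small multiple of the character count.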

Next, I also need to store the blogComments, which I am planning to
store all together in another single row, one comment per column. Thus the
entire information about a single comment, like commentBody and
commentor, would be serialized (using Google Protocol Buffers) and
stored in a single column.
For storing the number of likes of each comment, I am planning to
keep a counter_column in the same row for each comment, which will
hold the number of 'likes' of that comment.
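
A rough sketch of the comments row described above, again with a plain dict standing in for the row. To keep it self-contained, JSON is swapped in for Protocol Buffers, and the like counter is a plain in-memory integer rather than a Cassandra counter column. The column-name prefixes `comment:` and `likes:` and all function names are assumptions made for illustration.

```python
import json

# One row of comments per blog: row key -> {column name -> value}.
# Hypothetical layout; JSON stands in for Protocol Buffers serialization.
comment_rows = {}

def add_comment(blog_id, comment_id, commentor, body):
    """Serialize the whole comment into one column and start its like count at 0."""
    row = comment_rows.setdefault(blog_id, {})
    payload = json.dumps({"commentor": commentor, "commentBody": body})
    row["comment:" + comment_id] = payload.encode("utf-8")
    row.setdefault("likes:" + comment_id, 0)

def like_comment(blog_id, comment_id):
    """Increment the per-comment like count (in memory only)."""
    comment_rows[blog_id]["likes:" + comment_id] += 1

def read_comment(blog_id, comment_id):
    """Deserialize one comment and attach its like count."""
    row = comment_rows[blog_id]
    record = json.loads(row["comment:" + comment_id].decode("utf-8"))
    record["likes"] = row["likes:" + comment_id]
    return record

add_comment("blog-1", "c1", "alice", "Nice post")
like_comment("blog-1", "c1")
like_comment("blog-1", "c1")
print(read_comment("blog-1", "c1"))
```

With this layout each comment costs two columns in the row: one for the serialized comment and one for its like count.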

Any suggestions on the above design are highly appreciated. Thanks.

Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

Posted by Aditya Narayan <ad...@gmail.com>.
Thanks Aaron!!

I didn't know about the upcoming built-in counters. This
sounds really great for my use case! Could you let me know where
I can read more about this, if it has been blogged about somewhere?

I'll go forward with the one (entire) blog post per column design.

Thanks



On Mon, Mar 7, 2011 at 5:10 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
> Sounds reasonable: one CF for the blog posts and one CF for the comments. You could also use a single CF if you will often read the blog and the comments at the same time. The best design is the one that suits how your app works; try one and be prepared to change.
>
> Note that counters are only in the 0.8 trunk and are still under development. They are not going to be released for a couple of months.
>
> Your per-column data size is nothing to be concerned about.
>
> Hope that helps.
> Aaron

Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

Posted by Aaron Morton <aa...@thelastpickle.com>.
Sounds reasonable: one CF for the blog posts and one CF for the comments. You could also use a single CF if you will often read the blog and the comments at the same time. The best design is the one that suits how your app works; try one and be prepared to change.

Note that counters are only in the 0.8 trunk and are still under development. They are not going to be released for a couple of months.

Your per-column data size is nothing to be concerned about.

Hope that helps.
Aaron 


Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

Posted by Jean-Christophe Sirot <je...@cryptolog.com>.
On 03/07/2011 10:08 PM, Aaron Morton wrote:
> You can fill your boots.
>
> So long as your boots have a capacity of 2 billion.
>
> Background ...
> http://wiki.apache.org/cassandra/LargeDataSetConsiderations
>
> http://wiki.apache.org/cassandra/CassandraLimitations
>
> http://www.pcworld.idg.com.au/article/373483/new_cassandra_can_pack_two_billion_columns_into_row/
>

Thanks, I hadn't seen these wiki pages.

-- 
Jean-Christophe Sirot

Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

Posted by Aaron Morton <aa...@thelastpickle.com>.
You can fill your boots.

So long as your boots have a capacity of 2 billion.

Background ...
http://wiki.apache.org/cassandra/LargeDataSetConsiderations

http://wiki.apache.org/cassandra/CassandraLimitations

http://www.pcworld.idg.com.au/article/373483/new_cassandra_can_pack_two_billion_columns_into_row/
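
As a back-of-the-envelope check of the figure above, the sketch below compares a planned number of comments against the 2 billion columns-per-row capacity mentioned in this reply. The constant is taken from the thread as a round number, the function names are hypothetical, and the default of two columns per comment (one for the serialized comment, one for its like count) is an assumption based on the design in the original post.

```python
# Per-row column capacity as quoted in the thread (round figure).
MAX_COLUMNS_PER_ROW = 2_000_000_000

def columns_needed(num_comments, columns_per_comment=2):
    """Columns used by a comments row: by default, one for the comment and one for likes."""
    return num_comments * columns_per_comment

def fits_in_one_row(num_comments, columns_per_comment=2):
    """True if the comments row stays within the quoted column capacity."""
    return columns_needed(num_comments, columns_per_comment) <= MAX_COLUMNS_PER_ROW

print(fits_in_one_row(5_000_000))      # 10 million columns
print(fits_in_one_row(1_500_000_000))  # 3 billion columns
```

Staying under the column count is only one consideration; the wiki pages linked above cover other aspects of very large rows.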

aaron

On 8/03/2011, at 4:57 AM, Jean-Christophe Sirot <je...@cryptolog.com> wrote:

> Hello,
> 
> On 03/06/2011 06:35 PM, Aditya Narayan wrote:
>> Next, I also need to store the blogComments which I am planning to
>> store all, in another single row. 1 comment per column. Thus the
>> entire information about the a single comment like  commentBody,
>> commentor would be serialized(using google Protocol buffers) and
>> stored in a single column,
> 
> Is there any limitation/issue in having a single row with a lot of columns? For instance, can I have millions of columns in a single row?
> 
> -- 
> Jean-Christophe Sirot
> 

Re: What would be a good strategy for Storing the large text contents like blog posts in Cassandra.

Posted by Jean-Christophe Sirot <je...@cryptolog.com>.
Hello,

On 03/06/2011 06:35 PM, Aditya Narayan wrote:
> Next, I also need to store the blogComments which I am planning to
> store all, in another single row. 1 comment per column. Thus the
> entire information about the a single comment like  commentBody,
> commentor would be serialized(using google Protocol buffers) and
> stored in a single column,

Is there any limitation/issue in having a single row with a lot of
columns? For instance, can I have millions of columns in a single row?

-- 
Jean-Christophe Sirot