Posted to user@cassandra.apache.org by Narendra Sharma <na...@gmail.com> on 2011/01/27 08:37:26 UTC

Using Cassandra for storing large objects

Is anyone using Cassandra to store a large number (millions) of large, mostly
immutable objects (200 KB to 5 MB each)? I would like to hear about your
experience in general, given that Cassandra is not considered a good fit for
large objects: https://issues.apache.org/jira/browse/CASSANDRA-265


Thanks,
Naren

Re: Using Cassandra for storing large objects

Posted by Narendra Sharma <na...@gmail.com>.
Thanks Anand. Let's keep exchanging our experiences.

-Naren

On Thu, Jan 27, 2011 at 8:50 PM, Anand Somani <me...@gmail.com> wrote:

> At this point we are not in production, only in the lab. The longest test
> so far has run for about 2-3 days. The data size is currently about 2-3 TB
> per node, and we have 2 nodes. We do see spikes in response time (and
> timeouts), which seem to coincide with GC kicking in. We were pushing the
> system as hard as we could. Given our application, we can run major
> compactions at night, but we have not tried that on a data set this big
> yet. We still have minor compactions turned on.

Re: Using Cassandra for storing large objects

Posted by Anand Somani <me...@gmail.com>.
At this point we are not in production, only in the lab. The longest test so
far has run for about 2-3 days. The data size is currently about 2-3 TB per
node, and we have 2 nodes. We do see spikes in response time (and timeouts),
which seem to coincide with GC kicking in. We were pushing the system as hard
as we could. Given our application, we can run major compactions at night,
but we have not tried that on a data set this big yet. We still have minor
compactions turned on.

On Thu, Jan 27, 2011 at 12:56 PM, Narendra Sharma <narendra.sharma@gmail.com> wrote:

> Thanks Anand. A few questions:
> - What is the size of the nodes (in terms of data)?
> - How long have you been running?
> - How is compaction treating you?
>
> Thanks,
> Naren

Re: Using Cassandra for storing large objects

Posted by Narendra Sharma <na...@gmail.com>.
Thanks Anand. A few questions:
- What is the size of the nodes (in terms of data)?
- How long have you been running?
- How is compaction treating you?

Thanks,
Naren

On Thu, Jan 27, 2011 at 12:13 PM, Anand Somani <me...@gmail.com> wrote:

> We are using it to store large immutable objects. As Aaron suggested, we
> split each blob across multiple columns. We also read it back a few columns
> at a time (for memory considerations). So far we have only gone up to
> objects of about 300-400 KB.
>
> Our machines have 32 GB of memory, with 8 GB allocated to Java. Row cache
> is disabled. There is some latency that needs to be sorted out, but overall
> I am positive. This is with 0.6.6; I am in the process of moving to 0.7.

Re: Using Cassandra for storing large objects

Posted by Anand Somani <me...@gmail.com>.
We are using it to store large immutable objects. As Aaron suggested, we
split each blob across multiple columns. We also read it back a few columns
at a time (for memory considerations). So far we have only gone up to objects
of about 300-400 KB.
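
A minimal sketch of the split-and-read-in-slices idea described above. It is
not the code used in this setup: a plain Python dict stands in for one
Cassandra row, and the 64 KB chunk size, the column naming scheme, and the
function names are all assumptions made for illustration.

```python
from typing import Dict, Iterator

CHUNK_SIZE = 64 * 1024  # assumed chunk size; the thread does not specify one


def split_blob(blob: bytes, chunk_size: int = CHUNK_SIZE) -> Dict[str, bytes]:
    """Split a blob into columns whose names sort in chunk order."""
    columns = {}
    for index, offset in enumerate(range(0, len(blob), chunk_size)):
        # Zero-padded names keep lexical order equal to numeric order.
        columns[f"chunk-{index:08d}"] = blob[offset:offset + chunk_size]
    return columns


def read_in_slices(row: Dict[str, bytes], columns_per_read: int = 4) -> Iterator[bytes]:
    """Yield the blob a few columns at a time to bound memory use."""
    names = sorted(row)
    for start in range(0, len(names), columns_per_read):
        batch = names[start:start + columns_per_read]
        # With a real client, each batch would be one column-slice query.
        yield b"".join(row[name] for name in batch)


blob = bytes(range(256)) * 1400  # 358,400 bytes, roughly 350 KB
row = split_blob(blob)           # the dict is a stand-in for one stored row
restored = b"".join(read_in_slices(row))
assert restored == blob
```

Only `columns_per_read` chunks are held in memory per step, so a caller can
stream each slice onward instead of joining them as the last lines do.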

Our machines have 32 GB of memory, with 8 GB allocated to Java. Row cache is
disabled. There is some latency that needs to be sorted out, but overall I
am positive. This is with 0.6.6; I am in the process of moving to 0.7.

On Wed, Jan 26, 2011 at 11:37 PM, Narendra Sharma <narendra.sharma@gmail.com> wrote:

> Is anyone using Cassandra to store a large number (millions) of large,
> mostly immutable objects (200 KB to 5 MB each)? I would like to hear about
> your experience in general, given that Cassandra is not considered a good
> fit for large objects: https://issues.apache.org/jira/browse/CASSANDRA-265
>
>
> Thanks,
> Naren
>

Re: Using Cassandra for storing large objects

Posted by buddhasystem <po...@bnl.gov>.
Will it work for a billion rows? That is where I will eventually end up.


Re: Using Cassandra for storing large objects

Posted by aaron morton <aa...@thelastpickle.com>.
Millions of rows/items is no problem, megabytes per item is doable. Generally people have talked about chunking blobs and storing them across multiple columns. 
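
As a rough sizing aid for the chunking approach, the sketch below computes
how many columns one object would occupy. The 64 KB chunk size and the helper
name `chunks_needed` are assumptions for illustration; the object sizes are
the 200 KB and 5 MB bounds from the original question.

```python
import math

KB = 1024
MB = 1024 * KB
CHUNK_SIZE = 64 * KB  # assumed chunk size; tune for your own workload


def chunks_needed(object_size: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of columns required to hold one object of the given size."""
    return math.ceil(object_size / chunk_size)


for label, size in (("200 KB", 200 * KB), ("5 MB", 5 * MB)):
    print(f"{label} object -> {chunks_needed(size)} columns of {CHUNK_SIZE // KB} KB")
```

With these assumptions a 200 KB object needs 4 columns and a 5 MB object
needs 80, so each row stays small in column count even at the upper bound.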

See 
http://wiki.apache.org/cassandra/LargeDataSetConsiderations
http://wiki.apache.org/cassandra/CassandraLimitations

Hope that helps. 
Aaron

On 27 Jan 2011, at 20:37, Narendra Sharma wrote:

> Is anyone using Cassandra to store a large number (millions) of large, mostly immutable objects (200 KB to 5 MB each)? I would like to hear about your experience in general, given that Cassandra is not considered a good fit for large objects: https://issues.apache.org/jira/browse/CASSANDRA-265
> 
> 
> Thanks,
> Naren


Re: Using Cassandra for storing large objects

Posted by buddhasystem <po...@bnl.gov>.
I would ask a different question: what do media-hosting sites (YouTube and
the like) use? Cassandra may still be useful here as a mapper between a
logical id and a physical file location.
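
A minimal sketch of that mapper idea, under stated assumptions: the blobs
live as ordinary files, and a plain dict (`index`) stands in for the
Cassandra column family that would hold the id-to-path mapping. The class
name `BlobStore`, the hash-based file naming, and the directory fan-out are
illustrative choices, not anything described in this thread.

```python
import hashlib
import os
import tempfile
from typing import Dict


class BlobStore:
    """Keeps blobs as files; `index` stands in for a Cassandra column family."""

    def __init__(self, root: str) -> None:
        self.root = root
        self.index: Dict[str, str] = {}  # logical id -> physical file path

    def put(self, logical_id: str, data: bytes) -> str:
        digest = hashlib.sha1(data).hexdigest()
        # Fan files out over subdirectories so no single directory grows huge.
        directory = os.path.join(self.root, digest[:2])
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, digest)
        with open(path, "wb") as handle:
            handle.write(data)
        self.index[logical_id] = path  # a real setup would write this to Cassandra
        return path

    def get(self, logical_id: str) -> bytes:
        with open(self.index[logical_id], "rb") as handle:
            return handle.read()


with tempfile.TemporaryDirectory() as root:
    store = BlobStore(root)
    store.put("video-42", b"example payload")
    assert store.get("video-42") == b"example payload"
```

In this arrangement the database only ever holds short path strings, so the
large-object concerns raised earlier in the thread would not apply to it.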